stop_cpus_hard when multiple CPUs are panicking from an NMI
Andriy Gapon
avg at FreeBSD.org
Fri Nov 16 14:47:39 UTC 2012
on 16/11/2012 16:41 Attilio Rao said the following:
> On Fri, Nov 16, 2012 at 1:18 PM, Andriy Gapon <avg at freebsd.org> wrote:
>> on 16/11/2012 14:30 Attilio Rao said the following:
>>> On Fri, Nov 16, 2012 at 7:54 AM, Andriy Gapon <avg at freebsd.org> wrote:
>>>> on 16/11/2012 00:58 Ryan Stone said the following:
>>>>> At work we have some custom watchdog hardware that sends an NMI upon
>>>>> expiry. We've modified the kernel to panic when it receives the watchdog
>>>>> NMI. I've been trying the "stop_scheduler_on_panic" mode, and I've
>>>>> discovered that when my watchdog expires, the system gets completely
>>>>> wedged. After some digging, I've discovered is that I have multiple CPUs
>>>>> getting the watchdog NMI and trying to panic concurrently. One of the CPUs
>>>>> wins, and the rest spin forever in this code:
>>>>>
>>>>> /*
>>>>> * We don't want multiple CPU's to panic at the same time, so we
>>>>> * use panic_cpu as a simple spinlock. We have to keep checking
>>>>> * panic_cpu if we are spinning in case the panic on the first
>>>>> * CPU is canceled.
>>>>> */
>>>>> if (panic_cpu != PCPU_GET(cpuid))
>>>>> while (atomic_cmpset_int(&panic_cpu, NOCPU,
>>>>> PCPU_GET(cpuid)) == 0)
>>>>> while (panic_cpu != NOCPU)
>>>>> ; /* nothing */
>>>>>
>>>>> The system wedges when stop_cpus_hard() is called, which sends NMIs to all
>>>>> of the other CPUs and waits for them to acknowledge that they are stopped
>>>>> before returning. However the CPU will not deliver an NMI to a CPU that is
>>>>> already handling an NMI, so the other CPUs that got a watchdog NMI and are
>>>>> spinning will never go into the NMI handler and acknowledge that they are
>>>>> stopped.
>>>>
>>>> I thought about this issue and fixed (in my tree) in a different way:
>>>> http://people.freebsd.org/~avg/cpu_stop-race.diff
>>>>
>>>> The need for spinlock_enter in the patch in not entirely clear.
>>>> The main idea is that a CPU which calls cpu_stop and loses a race should
>>>> voluntary enter cpustop_handler.
>>>> I am also not sure about MI-cleanness of this patch.
>>>
>>> It is similar to what I propose but with some differences:
>>> - It is not clean from MI perspective
>>
>> OK.
>>
>>> - I don't think we need to treact it specially, I would just
>>> unconditionally stop all the CPUs entering in the "spinlock zone",
>>> making the patch simpler.
>>
>> I couldn't understand this part.
>
> I'm sorry, I think I misread your patch.
>
> I was basically suggesting to fix Ryan case by calling
> cpustop_handler() (or the new MI interface) into the panic() function,
> in the case the CPU don't win the race for panic_cpu.
> Basically doing:
> Index: sys/kern/kern_shutdown.c
> ===================================================================
> --- sys/kern/kern_shutdown.c (revision 243154)
> +++ sys/kern/kern_shutdown.c (working copy)
> @@ -568,15 +568,11 @@ panic(const char *fmt, ...)
> #ifdef SMP
> /*
> * We don't want multiple CPU's to panic at the same time, so we
> - * use panic_cpu as a simple spinlock. We have to keep checking
> - * panic_cpu if we are spinning in case the panic on the first
> - * CPU is canceled.
> + * use panic_cpu as a simple lock.
> */
> if (panic_cpu != PCPU_GET(cpuid))
> - while (atomic_cmpset_int(&panic_cpu, NOCPU,
> - PCPU_GET(cpuid)) == 0)
> - while (panic_cpu != NOCPU)
> - ; /* nothing */
> + if (atomic_cmpset_int(&panic_cpu, NOCPU, PCPU_GET(cpuid)) == 0)
> + cpustop_handler();
>
> if (stop_scheduler_on_panic) {
> if (panicstr == NULL && !kdb_active) {
>
>
> Infact it seems to me that the comment is outdated and no longer
> represent truth.
Ah, I see. Thank you.
My older plan was to get rid of stop_scheduler_on_panic, that is to make the
behavior unconditional. And then to use stop_cpus_hard instead of the hand-rolled
'panic_cpu' spinlock. This way the whichever CPU wins stop_cpus_hard would be the
only CPU to enter panic.
Sorry, if I was not fully clear when I posted my patch.
--
Andriy Gapon
More information about the freebsd-hackers
mailing list