stop_cpus_hard when multiple CPUs are panicking from an NMI
Attilio Rao
attilio at freebsd.org
Sun Nov 25 12:31:44 UTC 2012
On Sun, Nov 25, 2012 at 12:29 PM, Attilio Rao <attilio at freebsd.org> wrote:
> On Fri, Nov 16, 2012 at 2:47 PM, Andriy Gapon <avg at freebsd.org> wrote:
>> on 16/11/2012 16:41 Attilio Rao said the following:
>>> On Fri, Nov 16, 2012 at 1:18 PM, Andriy Gapon <avg at freebsd.org> wrote:
>>>> on 16/11/2012 14:30 Attilio Rao said the following:
>>>>> On Fri, Nov 16, 2012 at 7:54 AM, Andriy Gapon <avg at freebsd.org> wrote:
>>>>>> on 16/11/2012 00:58 Ryan Stone said the following:
>>>>>>> At work we have some custom watchdog hardware that sends an NMI upon
>>>>>>> expiry. We've modified the kernel to panic when it receives the watchdog
>>>>>>> NMI. I've been trying the "stop_scheduler_on_panic" mode, and I've
>>>>>>> discovered that when my watchdog expires, the system gets completely
>>>>>>> wedged. After some digging, I've discovered that I have multiple CPUs
>>>>>>> getting the watchdog NMI and trying to panic concurrently. One of the CPUs
>>>>>>> wins, and the rest spin forever in this code:
>>>>>>>
>>>>>>>         /*
>>>>>>>          * We don't want multiple CPU's to panic at the same time, so we
>>>>>>>          * use panic_cpu as a simple spinlock.  We have to keep checking
>>>>>>>          * panic_cpu if we are spinning in case the panic on the first
>>>>>>>          * CPU is canceled.
>>>>>>>          */
>>>>>>>         if (panic_cpu != PCPU_GET(cpuid))
>>>>>>>                 while (atomic_cmpset_int(&panic_cpu, NOCPU,
>>>>>>>                     PCPU_GET(cpuid)) == 0)
>>>>>>>                         while (panic_cpu != NOCPU)
>>>>>>>                                 ; /* nothing */
>>>>>>>
>>>>>>> The system wedges when stop_cpus_hard() is called, which sends NMIs to all
>>>>>>> of the other CPUs and waits for them to acknowledge that they are stopped
>>>>>>> before returning. However the CPU will not deliver an NMI to a CPU that is
>>>>>>> already handling an NMI, so the other CPUs that got a watchdog NMI and are
>>>>>>> spinning will never go into the NMI handler and acknowledge that they are
>>>>>>> stopped.
>>>>>>
>>>>>> I thought about this issue and fixed it (in my tree) in a different way:
>>>>>> http://people.freebsd.org/~avg/cpu_stop-race.diff
>>>>>>
>>>>>> The need for spinlock_enter in the patch is not entirely clear.
>>>>>> The main idea is that a CPU which calls cpu_stop and loses a race should
>>>>>> voluntarily enter cpustop_handler.
>>>>>> I am also not sure about MI-cleanness of this patch.
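A rough illustration of that idea (not the actual patch at the URL above,
just a sketch): stopping_cpu below is the arbiter variable
generic_stop_cpus() already uses, and the race loser simply behaves as if
it had received the stop IPI, so the winner sees its acknowledgment in
stopped_cpus instead of waiting forever:

/* Kernel context; same headers and globals as the sketch earlier. */
static volatile u_int stopping_cpu = NOCPU;

int
generic_stop_cpus_sketch(cpuset_t map, u_int type)
{
        u_int cpu = PCPU_GET(cpuid);

        if (!atomic_cmpset_int(&stopping_cpu, NOCPU, cpu)) {
                /* Race lost: stop ourselves instead of spinning. */
                cpustop_handler();
                return (0);
        }
        ipi_selected(map, type);        /* IPI_STOP or IPI_STOP_HARD */
        while (!CPU_SUBSET(&stopped_cpus, &map))
                cpu_spinwait();
        atomic_store_rel_int(&stopping_cpu, NOCPU);
        return (1);
}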
>>>>>
>>>>> It is similar to what I propose but with some differences:
>>>>> - It is not clean from an MI perspective
>>>>
>>>> OK.
>>>>
>>>>> - I don't think we need to treat it specially; I would just
>>>>> unconditionally stop all the CPUs entering the "spinlock zone",
>>>>> making the patch simpler.
>>>>
>>>> I couldn't understand this part.
>>>
>>> I'm sorry, I think I misread your patch.
>>>
>>> I was basically suggesting to fix Ryan's case by calling
>>> cpustop_handler() (or the new MI interface) in the panic() function,
>>> in the case where the CPU doesn't win the race for panic_cpu.
>>> Basically doing:
>>> Index: sys/kern/kern_shutdown.c
>>> ===================================================================
>>> --- sys/kern/kern_shutdown.c   (revision 243154)
>>> +++ sys/kern/kern_shutdown.c   (working copy)
>>> @@ -568,15 +568,11 @@ panic(const char *fmt, ...)
>>>  #ifdef SMP
>>>         /*
>>>          * We don't want multiple CPU's to panic at the same time, so we
>>> -        * use panic_cpu as a simple spinlock.  We have to keep checking
>>> -        * panic_cpu if we are spinning in case the panic on the first
>>> -        * CPU is canceled.
>>> +        * use panic_cpu as a simple lock.
>>>          */
>>>         if (panic_cpu != PCPU_GET(cpuid))
>>> -               while (atomic_cmpset_int(&panic_cpu, NOCPU,
>>> -                   PCPU_GET(cpuid)) == 0)
>>> -                       while (panic_cpu != NOCPU)
>>> -                               ; /* nothing */
>>> +               if (atomic_cmpset_int(&panic_cpu, NOCPU, PCPU_GET(cpuid)) == 0)
>>> +                       cpustop_handler();
>>> 
>>>         if (stop_scheduler_on_panic) {
>>>                 if (panicstr == NULL && !kdb_active) {
>>>
>>> In fact, it seems to me that the comment is outdated and no longer
>>> represents the truth.
>>
>> Ah, I see. Thank you.
>>
>> My older plan was to get rid of stop_scheduler_on_panic, that is, to make the
>> behavior unconditional, and then to use stop_cpus_hard instead of the hand-rolled
>> 'panic_cpu' spinlock. This way, whichever CPU wins stop_cpus_hard would be the
>> only CPU to enter panic.
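A hedged sketch of what that plan might look like in panic() (hypothetical,
not a committed change; it relies on stop_cpus_hard() handling the
losing-CPU case internally, as in the earlier sketch, and uses the usual
idiom for building an "all but self" cpuset):

void
panic(const char *fmt, ...)
{
#ifdef SMP
        cpuset_t other_cpus;

        /*
         * Sketch: let stop_cpus_hard() itself arbitrate.  Whichever CPU
         * wins stops all the others; a concurrent panicking CPU loses
         * the race inside stop_cpus_hard() and parks in
         * cpustop_handler(), so exactly one CPU proceeds with panic.
         */
        other_cpus = all_cpus;
        CPU_CLR(PCPU_GET(cpuid), &other_cpus);
        stop_cpus_hard(other_cpus);
#endif
        /* ... the rest of panic() is unchanged ... */
}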
>
> So, assuming we are not yet defaulting to the stop-CPUs behaviour for
> panic, I think we have 2 things to take care of:
> 1) Handling simultaneous panics
> 2) Handling deadlocks/starvation among simultaneous cpu_stops
>
> In particular, case 2 is trickier than it seems: it is not only
> related to NMIs but also to the case where you send an IPI_STOP via a
> normal IPI and interrupts are disabled on one of the target CPUs. In
> fact, I think the patch you propose makes such effects even worse,
> because it disables interrupts in generic_stop_cpus().
> What I suggest to do is the following (see the sketch below):
> - The CPU which wins the race for generic_stop_cpus() also signals the
> CPUs it wants to stop on a global mask
> - Another CPU entering generic_stop_cpus() and losing the race
> checks the mask of CPUs which should be stopped and stops itself if
> necessary (i.e., if it has not done so yet). We must be careful with
> races here, but I'm confident this can be done easily enough.
BTW, I'm aware such a case is not fatal because of the safety belt, but
I think it would be too easy to get wrong code if we let simultaneous
generic_stop_cpus() calls race in the "wrong way" (race loser with
interrupts/NMIs disabled).
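A minimal sketch of the scheme described above; to_stop_cpus is a
hypothetical name invented here for the global mask, and the
publish/observe ordering around it is exactly the kind of race that would
need auditing:

/* Kernel context; stopped_cpus and cpustop_handler() as sketched earlier. */
static volatile u_int stopping_cpu = NOCPU;
static volatile cpuset_t to_stop_cpus;  /* hypothetical: winner's target set */

int
generic_stop_cpus_sketch(cpuset_t map, u_int type)
{
        u_int cpu = PCPU_GET(cpuid);

        while (!atomic_cmpset_int(&stopping_cpu, NOCPU, cpu)) {
                /*
                 * Race lost.  If the winner wants us stopped, stop
                 * ourselves; otherwise it could wait forever while we
                 * spin here with interrupts (or an in-progress NMI)
                 * blocking its IPI_STOP.
                 */
                while (stopping_cpu != NOCPU) {
                        if (CPU_ISSET(cpu, &to_stop_cpus))
                                cpustop_handler();
                        cpu_spinwait();
                }
        }

        /* Publish the target set before sending the IPIs. */
        to_stop_cpus = map;
        ipi_selected(map, type);
        while (!CPU_SUBSET(&stopped_cpus, &map))
                cpu_spinwait();

        /* Clear the mask before releasing the arbiter; ordering matters. */
        CPU_ZERO(__DEVOLATILE(cpuset_t *, &to_stop_cpus));
        atomic_store_rel_int(&stopping_cpu, NOCPU);
        return (1);
}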
Thanks,
Attilio
--
Peace can only be achieved by understanding - A. Einstein