please review, patch for lost camisr
Alfred Perlstein
bright at mu.org
Wed May 29 07:27:48 UTC 2013
On 5/29/13 12:16 AM, Konstantin Belousov wrote:
> On Tue, May 28, 2013 at 10:31:40PM -0700, Alfred Perlstein wrote:
>> On 5/28/13 10:08 PM, Konstantin Belousov wrote:
>>> On Tue, May 28, 2013 at 01:35:01PM -0700, Alfred Perlstein wrote:
>>>> [[ moved to hackers@ from private mail. ]]
>>>>
>>>> On 5/28/13 1:13 PM, John Baldwin wrote:
>>>>> On Tuesday, May 28, 2013 3:29:41 pm Alfred Perlstein wrote:
>>>>>> On 5/28/13 9:04 AM, John Baldwin wrote:
>>>>>>> On Tuesday, May 28, 2013 2:13:32 am Alfred Perlstein wrote:
>>>>>>>> Hey folks,
>>>>>>>>
>>>>>>>> I had a talk with Nathan Whitehorn about the camisr issue. We
>>>>>>>> mostly understand the issue we are seeing, but to summarize: we
>>>>>>>> are losing the camisr signal and the camisr is not being run.
>>>>>>>>
>>>>>>>> I gave him a summary of what we have been seeing and pointed him to the
>>>>>>>> code I am concerned about here:
>>>>>>>> http://pastebin.com/tLKr7mCV (this is inside of kern_intr.c).
>>>>>>>>
>>>>>>>> What I think is happening is that the store of 0 to it_need
>>>>>>>> inside of sys/kern/kern_intr.c:ithread_loop() is not being
>>>>>>>> ordered correctly, and is being delayed until AFTER the call to
>>>>>>>> ithread_execute_handlers() right below the atomic_store_rel_int().
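>>>>>>>>
>>>>>>>> A simplified sketch of the window I am worried about (an
>>>>>>>> illustration of the shape of the code, not the exact kern_intr.c
>>>>>>>> source):
>>>>>>>>
>>>>>>>>     /* ithread_loop(), consumer side: */
>>>>>>>>     while (ithd->it_need) {
>>>>>>>>         atomic_store_rel_int(&ithd->it_need, 0);
>>>>>>>>         ithread_execute_handlers(p, ie);
>>>>>>>>     }
>>>>>>>>
>>>>>>>>     /* intr_event_schedule_thread(), producer side: */
>>>>>>>>     it->it_need = 1;    /* plain store, no explicit barrier */
>>>>>>>>     thread_lock(td);
>>>>>>>>
>>>>>>>> If the clearing of it_need can drift past the handler execution,
>>>>>>>> a schedule request that lands in between is erased.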
>>>>>>> This seems highly unlikely, to the extent that if this were true all our
>>>>>>> locking primitives would be broken. The store_rel is actually a release
>>>>>>> barrier which acts more like a read/write fence. No memory accesses (read or
>>>>>>> write) from before the release can be scheduled after the associated store,
>>>>>>> either by the compiler or the CPU. That is what Konstantin is referring to in
>>>>>>> his commit when he says "release" semantics.
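>>>>>>>
>>>>>>> For example (illustrative only; data, flag, x, and other are
>>>>>>> placeholder variables):
>>>>>>>
>>>>>>>     data = 42;                        /* (A) earlier write */
>>>>>>>     atomic_store_rel_int(&flag, 0);   /* release store     */
>>>>>>>     x = other;                        /* (B) later read    */
>>>>>>>
>>>>>>> The release guarantees that (A) cannot be reordered after the
>>>>>>> store to flag; it says nothing about when that store becomes
>>>>>>> globally visible, and it does not keep (B) from moving earlier.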
>>>>>> Yes, that makes sense, but does it specify that the writes *must*
>>>>>> occur at that *point*? If it only enforces ordering, then we may have
>>>>>> an issue, specifically because the setting of it_need to '1' inside of
>>>>>> intr_event_schedule_thread has no barrier other than the acq semantics
>>>>>> of the thread lock. I am wondering what is forcing out the '1' there.
>>>>> Nothing ever forces writes. You would have to invalidate the cache to do
>>>>> that, and that is horribly expensive. It is always only about ordering, and
>>>>> knowing that if you can complete another operation on the same "cookie"
>>>>> variable with acquire semantics, the earlier writes will be visible.
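>>>>>
>>>>> Concretely (sketch only; data and cookie are placeholder names):
>>>>>
>>>>>     /* CPU 1 */
>>>>>     data = 1;
>>>>>     atomic_store_rel_int(&cookie, 1);
>>>>>
>>>>>     /* CPU 2 */
>>>>>     if (atomic_load_acq_int(&cookie) == 1) {
>>>>>         /* guaranteed: the store to data above is visible here */
>>>>>         KASSERT(data == 1, ("lost write"));
>>>>>     }
>>>>>
>>>>> Only by completing the acquire load on the same cookie does CPU 2
>>>>> get the guarantee that the writes before the release are visible.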
>>>> By cookie, you mean a specific memory address, basically a lock? This is
>>>> starting to reinforce my suspicions, since the setting of it_need is done
>>>> with release semantics while the acquire on the other CPU is done on the
>>>> thread lock. Maybe that is irrelevant. We will find out shortly.
>>>>
>>>>>> See below as I think we have proof that this is somehow happening.
>>>>> Having ih_need of 1 and it_need of 0 is certainly busted. The simplest fix
>>>>> is probably to stop using atomics on it_need and just grab the thread lock
>>>>> in the main ithread loop and hold it while checking and clearing it_need.
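>>>>>
>>>>> That is, something of this shape (untested sketch):
>>>>>
>>>>>     thread_lock(td);
>>>>>     while (ithd->it_need) {
>>>>>         ithd->it_need = 0;          /* protected by the lock */
>>>>>         thread_unlock(td);
>>>>>         ithread_execute_handlers(p, ie);
>>>>>         thread_lock(td);
>>>>>     }
>>>>>     /* ... go to sleep with the thread lock held ... */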
>>>>>
>>>> OK, we have some code that will either prove this, or perturb the memory
>>>> ordering enough to make the bug go away, or prove this assertion wrong.
>>>>
>>>> We will update on our findings hopefully in the next few days.
>>> IMO the read of it_need in the 'while (ithd->it_need)' should
>>> have acquire semantics; otherwise the later reads in
>>> ithread_execute_handlers(), in particular of ih_need, could pass
>>> the read of it_need and cause the situation you reported. I do not
>>> see any acquire barrier between the condition in the while() statement
>>> and the read of ih_need in the execute_handlers().
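>>>
>>> Schematically, with the ithread's ih_need load hoisted above its
>>> it_need load (one possible interleaving, shown for illustration):
>>>
>>>     ithread: tmp = ih->ih_need;                  /* sees stale 0 */
>>>     intr:    ih->ih_need = 1;
>>>     intr:    it->it_need = 1;
>>>     ithread: while (ithd->it_need)               /* sees 1       */
>>>     ithread: atomic_store_rel_int(&ithd->it_need, 0);
>>>
>>> The handler looks idle and is skipped, yet it_need is cleared,
>>> leaving ih_need == 1 and it_need == 0 with nothing left to run.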
>>>
>>> It is probably true that the issue you see was caused by r236456, in the
>>> sense that the implicitly locked xchgl instruction on x86 has full barrier
>>> semantics. As a result, the store_rel() was actually an acquire too, making
>>> this reordering impossible. I argue that this is not a bug in r236456,
>>> but an issue in kern_intr.c.
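>>>
>>> From memory, the x86 atomic_store_rel_int() went from roughly
>>>
>>>     __asm __volatile("xchgl %0,%1" ...);  /* locked: full barrier */
>>>
>>> to a compiler barrier plus a plain store
>>>
>>>     __compiler_membar();
>>>     *p = v;     /* a plain store is already a release on x86 */
>>>
>>> so the accidental acquire ordering that kern_intr.c relied on is gone.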
>> If I remember the code correctly, that would probably explain why we see
>> it only on 9.1 systems.
>>> On the other hand, John's suggestion to move the manipulation of
>>> it_need under the lock is probably the best anyway.
>>>
>> I was wondering if it would be lower latency to keep maintaining it_need
>> as-is, but to keep another variable, it_needlocked, under the thread lock.
>> This would result in potentially superfluous interrupt passes, but under
>> load it would allow the ithread to loop some number of times without
>> taking the thread lock, along the lines of the sketch below.
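>>
>> Roughly (untested sketch; it_needlocked is just the name used above):
>>
>>     /* signalling side: */
>>     it->it_need = 1;              /* lock-free fast path       */
>>     thread_lock(td);
>>     it->it_needlocked = 1;        /* authoritative, under lock */
>>     ...
>>
>>     /* ithread, before going to sleep: */
>>     thread_lock(td);
>>     if (ithd->it_needlocked) {
>>         ithd->it_needlocked = 0;
>>         thread_unlock(td);
>>         /* make one more, possibly superfluous, pass */
>>     }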
>>
>> I am not really sure the optimization is worth it (especially since it
>> can result in superfluous interrupts), but it may reduce latency, and
>> that might be important.
>>
>> Are there some people I can pass the patch on to for help with
>> performance testing once we confirm that this is the actual bug? We can
>> do internal testing, but I am worried about regressing the performance
>> of any form of IO in the kernel.
>>
>> I'll show the patch soon.
>>
>> Thank you for the information. This is promising.
> Well, if you and I are right, the minimal patch should be
>
> diff --git a/sys/kern/kern_intr.c b/sys/kern/kern_intr.c
> index 8d63c9b..7c21015 100644
> --- a/sys/kern/kern_intr.c
> +++ b/sys/kern/kern_intr.c
> @@ -1349,7 +1349,7 @@ ithread_loop(void *arg)
> * we are running, it will set it_need to note that we
> * should make another pass.
> */
> - while (ithd->it_need) {
> + while (atomic_load_acq_int(&ithd->it_need)) {
> /*
> * This might need a full read and write barrier
> * to make sure that this write posts before any
>
OK, we can try this.
I've been pretty good with locking when using mutexes, but when we get
into atomic ops like this it gets a little tough for me to follow
without extensive research. I know that the signalling thread (the
swi_sched caller) does not use any atomic ops... is this OK?
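
If the two sides must pair acquire and release on the same variable,
then presumably the signalling side needs a matching release store,
something like (sketch only, not what the tree does today):

    /* intr_event_schedule_thread(): */
    atomic_store_rel_int(&it->it_need, 1);

    /* ithread_loop(): */
    while (atomic_load_acq_int(&ithd->it_need)) {
        atomic_store_rel_int(&ithd->it_need, 0);
        ithread_execute_handlers(p, ie);
    }
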
-Alfred