adaptive rwlock deadlock
Philippe Jalaber
pjalaber at gmail.com
Thu Jul 16 14:13:29 UTC 2015
2015-07-07 12:10 GMT+02:00 Philippe Jalaber <pjalaber at gmail.com>:
> Hi,
>
> I am facing a strange problem using the network stack and adaptive rwlocks
> on FreeBSD 9.3.
> Basically I can reproduce the problem with 3 threads:
>
> 1) thread 1 has taken the rwlock of structure inpcb in exclusive mode in
> tcp_input.c. This thread also runs my own code and repeatedly takes a
> rwlock (called g_rwlock) in shared mode and releases it, until a shared
> object is marked not "busy" any more:
>
> wlock(inp_lock);            // inpcb lock, held exclusively
> ....
> do {                        // thread busy-waits in this loop
>     rlock(g_rwlock);
>     o = find();
>     if (o == NULL)
>         break;              // leave the loop with g_rwlock still held
>     busy = o->busy;
>     if (busy)
>         runlock(g_rwlock);
> } while (busy);
>
> if (o != NULL) {
>     // do something with o
>     ....
> }
> runlock(g_rwlock);
> ....
>
> 2) thread 2 wants to set the shared object as "ready". So it tries to take
> g_rwlock in exclusive mode and is blocked in _rw_wlock_hard at kern_rwlock.c:815
> "turnstile_wait(ts, rw_owner(rw), TS_EXCLUSIVE_QUEUE)" because thread 1 has
> already taken it in shared mode:
>
> wlock(g_rwlock);
> o = find();
> if (o != NULL)
>     o->busy = 1;
> wunlock(g_rwlock);
>
> // o is busy, so work on it without holding any lock
> ....
>
> wlock(g_rwlock);            // thread is blocked here
> o->busy = 0;
> maybe_delete(o);
> wunlock(g_rwlock);
>
> 3) thread 3 spins on the same inpcb rwlock as thread 1, in
> _rw_wlock_hard at kern_rwlock.c:721:
> "while ((struct thread*)RW_OWNER(rw->rw_lock) == owner && TD_IS_RUNNING(owner))"
>
>
> My target machine has two CPUs.
> Thread 1 is pinned to CPU 0.
> Thread 2 and thread 3 are pinned to CPU 1.
> Thread 1 and thread 2 have a priority of 28.
> Thread 3 has a priority of 127.
>
> Now what seems to happen is that when thread 1 calls runlock(g_rwlock), it
> calls turnstile_broadcast at kern_rwlock.c:650, but thread 2 never regains
> control because thread 3 is spinning on the inpcb rwlock. Also the
> condition TD_IS_RUNNING(owner) is always true because thread 1 is
> busy-waiting in a loop. So the 3 threads deadlock.
> Note that if I compile the kernel without adaptive rwlocks it works
> without any problem.
> A workaround is to add a call to "sched_relinquish(curthread)" in thread 1
> in the loop just after the call to runlock.
>
> I am also wondering about the code in _rw_runlock after
> "turnstile_broadcast(ts, queue)". Isn't the flag RW_LOCK_WRITE_WAITERS
> definitely lost if the other thread which is blocked in turnstile_wait
> never regains control?
>
> Thank you for your time,
> Regards,
> Philippe
>
>
The sched_relinquish workaround does not seem to work every time.
One possible solution (which seems to work) is for thread 1 to take the lock
in shared mode on its first pass and, if the busy flag is set, to take it in
exclusive mode on every later pass, like this:
shared_count = 0;               // number of passes that found o busy
wlock(inp_lock);                // inpcb lock, held exclusively
....
do {                            // thread busy-waits in this loop
    if (shared_count == 0)
        rlock(g_rwlock);        // first pass: shared
    else
        wlock(g_rwlock);        // later passes: exclusive
    o = find();
    if (o == NULL)
        break;
    busy = o->busy;
    if (busy) {
        if (shared_count == 0)
            runlock(g_rwlock);
        else
            wunlock(g_rwlock);
        shared_count++;
    }
} while (busy);
if (o != NULL) {
    // do something with o
    ....
}
if (shared_count == 0)
    runlock(g_rwlock);
else
    wunlock(g_rwlock);
With this code the deadlock no longer happens, but I don't really see
why. Any ideas?
Thanks,
Philippe