duplicate read/write locks in net/pfil.c and netinet/ip_fw2.c
Luigi Rizzo
rizzo at icir.org
Thu Aug 18 00:02:50 GMT 2005
On Wed, Aug 17, 2005 at 04:35:19AM +0200, Max Laier wrote:
> On Wednesday 17 August 2005 02:05, Luigi Rizzo wrote:
> > [apologies for the cross post but it belongs both to arch and net.]
> >
> > I notice that net/pfil.c and netinet/ip_fw2.c have two copies of
> > aisimilar but slightly different implementation of
> > multiple-reader/single-writer locks, which brings up the question(s):
> >
> > 1. should we rather put this code in the generic kernel code so that other
> > subsystems could make use of it ? E.g. the routing table is certainly
> > a candidate,
>
> I have asked this several time on -arch and IRC, but never found anyone
> willing to pursue it. However, the problem is ...
>
> > and especially
> >
> > 2. should we implement it right ?
> >
> > Both implementations are subject to starvation for the writers
> > (which is indeed a problem here, because we might want to modify
> > a ruleset and be prevented from doing it because of incoming traffic
> > that keeps readers active).
> > Also the PFIL_TRY_WLOCK will in fact be blocking if a writer
> > is already in - i have no idea how problematic is this in the
> > way it is actually used.
>
>... really this. I didn't find a clean way out of the starvation issue. What
> I do for pfil is that I set a flag and simply stop serving[2] shared requests
> once a writer waits for the lock. If a writer can't sleep[1] then we return
> EBUSY and don't. However, for pfil it's almost ever safe to assume that a
> write may sleep (as it is for most instances of this kind of sx-lock where
> you have BIGNUMxreads:1xwrite).
could you guys look at the following code and see if it makes sense,
or tell me where i am wrong ?
It should solve the starvation and blocking trylock problems,
because the active reader does not hold the mutex in the critical
section anymore. The lock could well be a spinlock.
cheers
luigi
/*
* Implementation of multiple reader-single writer that prevents starvation.
* Luigi Rizzo 2005.08.19
*
* The mutex m only protects accesses to the struct rwlock.
* We can have the following states:
* IDLE: readers = 0, writers = 0, br = 0;
* any request will be granted immediately.
*
* READ: readers > 0, writers = 0, br = 0. Read in progress.
* Grant read requests immediately, queue write requests and
* move to READ1.
* When last reader terminates, move to IDLE.
*
* READ1: readers > 0, writers > 0, br >= 0.
* Read in progress, but writers are queued.
* Queue read and write requests to qr and wr, respectively.
* When the last reader terminates, wakeup the next queued writer
* and move to WRITE
*
* WRITE: readers = 0, writers > 0, br >= 0.
* Write in progress, possibly queued readers/writers.
* Queue read and write requests to qr and wr, respectively.
* When the writer terminates, wake up all readers if any,
* otherwise wake up the next writer if any.
* Move to READ, READ1, IDLE accordingly.
*/
struct rwlock {
mtx m; /* protects access to the rwlock */
int readers; /* active readers */
int br; /* blocked readers */
int writers; /* active + blocked writers */
cv qr; /* queued readers */
cv qw; /* queued writers */
}
int
RLOCK(struct rwlock *rwl, int try)
{
if (!try)
mtx_lock(&rwl->m);
else if (!mtx_trylock(&rwl->m))
return EBUSY;
if (rwl->writers == 0) /* no writer, pass */
rwl->readers++;
else {
rwl->br++;
cv_wait(&rwl->qr, &rwl->m);
}
mtx_unlock(&rwl->m);
return 0;
}
int
WLOCK(struct rwlock *rwl, int try)
{
if (!try)
mtx_lock(&rwl->m);
else if (!mtx_trylock(&rwl->m))
return EBUSY;
rwl->writers++;
if (rwl->readers > 0) /* have readers, must wait */
cv_wait(&rwl->qw, &rwl->m);
mtx_unlock(&rwl->m);
return 0;
}
void
RUNLOCK(struct rwlock *rwl)
{
mtx_lock(&rwl->m);
rwl->readers--;
if (rwl->readers == 0 && rwl->writers > 0)
cv_signal(&rwl->qw);
mtx_unlock(&rwl->m);
}
void
WUNLOCK(struct rwlock *rwl)
{
mtx_lock(&rwl->m);
rwl->writers--;
if (rwl->br > 0) { /* priority to readers */
rwl->readers = rwl->br;
rwl->br = 0;
cv_broadcast(&rwl->qr);
} else if (rwl->writers > 0)
cv_signal(&rwl->qw);
mtx_unlock(&rwl->m);
}
More information about the freebsd-net
mailing list