Hung kernel from sysv semaphore semu_list corruption
Divacky Roman
xdivac02 at stud.fit.vutbr.cz
Thu Mar 8 10:11:36 UTC 2007
On Wed, Mar 07, 2007 at 06:07:31PM -0500, Ed Maste wrote:
> Nightly tests on our 6.1-based installation using pgsql have resulted in
> a number of kernel hangs, due to a corrupt semu_list (the list ended up
> with a loop).
>
> It seems there are a few holes in the locking in the semaphore code. The
> issue we've encountered comes from semexit_myhook. It obtains a pointer
> to a list element after acquiring SEMUNDO_LOCK, and after dropping the
> lock manipulates the next pointer to remove the element from the list.
>
> The fix below solves our current problem. Any comments?
>
> --- RELENG_6/src/sys/kern/sysv_sem.c Tue Jun 7 01:03:27 2005
> +++ swbuild_plt_boson/src/sys/kern/sysv_sem.c Tue Mar 6 16:13:45 2007
> @@ -1259,16 +1259,17 @@
> struct proc *p;
> {
> struct sem_undo *suptr;
> - struct sem_undo **supptr;
>
> /*
> * Go through the chain of undo vectors looking for one
> * associated with this process.
> */
> SEMUNDO_LOCK();
> - SLIST_FOREACH_PREVPTR(suptr, supptr, &semu_list, un_next) {
> - if (suptr->un_proc == p)
> + SLIST_FOREACH(suptr, &semu_list, un_next) {
> + if (suptr->un_proc == p) {
> + SLIST_REMOVE(&semu_list, suptr, sem_undo, un_next);
This is wrong: you cannot remove an element from a *LIST while it is being
iterated with *LIST_FOREACH, because the loop still follows the removed
element's next pointer to advance. Use *LIST_FOREACH_SAFE instead.
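
To illustrate the point, here is a minimal userland sketch of removal during
iteration with SLIST_FOREACH_SAFE. It is not the sysv_sem.c code: struct undo,
undo_add, undo_remove_owner and undo_count are hypothetical names made up for
the example, the owner field merely stands in for un_proc, and there is no
locking. The fallback macro is only there for systems whose <sys/queue.h> does
not ship the _SAFE variants.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/queue.h>

/*
 * Fallback for <sys/queue.h> versions without the _SAFE variants
 * (assumption: same shape as the BSD macro). The next pointer is saved
 * in tvar before the loop body runs, so the body may unlink and free var.
 */
#ifndef SLIST_FOREACH_SAFE
#define SLIST_FOREACH_SAFE(var, head, field, tvar)			\
	for ((var) = SLIST_FIRST((head));				\
	    (var) && ((tvar) = SLIST_NEXT((var), field), 1);		\
	    (var) = (tvar))
#endif

/* Hypothetical stand-in for struct sem_undo: an owner id and a link. */
struct undo {
	int owner;			/* stands in for un_proc */
	SLIST_ENTRY(undo) link;		/* stands in for un_next */
};
SLIST_HEAD(undo_list, undo);

/* Push a new entry for `owner` onto the head of the list. */
static void
undo_add(struct undo_list *head, int owner)
{
	struct undo *u = malloc(sizeof(*u));

	assert(u != NULL);
	u->owner = owner;
	SLIST_INSERT_HEAD(head, u, link);
}

/* Unlink and free every entry owned by `owner`; return the number removed. */
static int
undo_remove_owner(struct undo_list *head, int owner)
{
	struct undo *u, *tmp;
	int removed = 0;

	SLIST_FOREACH_SAFE(u, head, link, tmp) {
		if (u->owner == owner) {
			SLIST_REMOVE(head, u, undo, link);
			free(u);	/* safe: the loop advances via tmp, not u */
			removed++;
		}
	}
	return (removed);
}

/* Count the entries left on the list (read-only, so plain FOREACH is fine). */
static int
undo_count(struct undo_list *head)
{
	struct undo *u;
	int n = 0;

	SLIST_FOREACH(u, head, link)
		n++;
	return (n);
}
```

With plain SLIST_FOREACH the same loop body would read u's link field after
free(u) in order to advance, which is the problem described above.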
Thanks for the patch!
roman