Unkillable KSE threaded proc
Julian Elischer
julian at elischer.org
Wed Sep 15 10:55:57 PDT 2004
Andrew Gallatin wrote:
>Julian Elischer writes:
> > either of :
> > http://www.freebsd.org/~julian/q.diff
> >
> > or
> >
> > http://www.freebsd.org/~julian/r.diff
> >
> > Might make some difference.
> >
> > today's q.diff has a fix that was missing yesterday.
>
>Both seem the same as unpatched head -- app starts, runs normally,
>then skill -9 -u gallatin leaves threads stuck on the cpu, seemingly
>deadlocking the system.
>
>But -- I think I now have a clue as to what's going on. I started a
>ktrace of the problematic process just before doing the skill -9, and
>afterwards it kept on tracing.
>
>I noticed it was stuck doing this:
>
> 569 mx_pingpong RET ioctl -1 errno 4 Interrupted system call
> 569 mx_pingpong Events dropped.
> 569 mx_pingpong RET ioctl -1 errno 4 Interrupted system call
> 569 mx_pingpong Events dropped.
> 569 mx_pingpong RET ioctl -1 errno 4 Interrupted system call
>
>It turns out that the userspace code is basically doing:
>
>  do {
>      MUTEX_LOCK(&lock);
>      should_exit = work();
>      MUTEX_UNLOCK(&lock);
>      ioctl(fd, DRIVER_WAIT);
>  } while (!should_exit);
>  return NULL;
>
>Changing it to
>
><...>
>      rv = ioctl(fd, DRIVER_WAIT);
>  } while ((rv == 0 || rv == EWOULDBLOCK) && !should_exit);
>  return NULL;
>
>Seems like it works around the problem with your r.diff patch applied
>to head. The ioctl in the driver boils down to a cv_timedwait_sig(),
>which is where the EINTR is coming from.
>
>Even if this is our bug, I think that a user-level bug like this should
>not be able to deadlock the system...
>
I agree. The rule is that userland should not be able to crash the system,
so this is a bug either way.
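
For reference, here is a rough, self-contained sketch of the userland side of
the workaround described above. It is not the actual mx_pingpong code:
progress_loop, struct sim, sim_work and sim_wait are made-up names, the wait
callback only stands in for ioctl(fd, DRIVER_WAIT), and the mutex around
work() is left out. It assumes the driver hands the cv_timedwait_sig() result
back unchanged, so a timeout shows up as EWOULDBLOCK and a signal as EINTR.
From userland, ioctl(2) reports failure as -1 with the code in errno, so the
sketch tests errno rather than the return value.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for ioctl(fd, DRIVER_WAIT): returns 0 on success,
 * or -1 with errno set, following the usual syscall convention. */
typedef int (*wait_fn)(void *ctx);

/* Hypothetical stand-in for the work() step; nonzero means "exit now".
 * The MUTEX_LOCK/MUTEX_UNLOCK pair from the original loop is omitted. */
typedef int (*work_fn)(void *ctx);

/*
 * Keep polling while the wait succeeds or merely times out (EWOULDBLOCK).
 * Any other error, including EINTR from a pending signal, ends the loop
 * instead of retrying forever.  Returns 0 if work() asked to exit,
 * otherwise the errno value that stopped the loop.
 */
int
progress_loop(work_fn work, wait_fn wait, void *ctx)
{
    int should_exit, err;

    assert(work != NULL && wait != NULL);
    do {
        should_exit = work(ctx);
        err = (wait(ctx) == -1) ? errno : 0;
    } while ((err == 0 || err == EWOULDBLOCK) && !should_exit);

    if (err == 0 || err == EWOULDBLOCK)
        return 0;       /* left the loop because work() asked to exit */
    return err;         /* left the loop because the wait failed */
}

/* Tiny simulation so the loop can be exercised without a driver. */
struct sim {
    int waits;          /* number of simulated waits so far */
    int intr_at;        /* wait number that is "interrupted by a signal" */
};

/* Simulated work step: never asks to exit on its own. */
int
sim_work(void *ctx)
{
    (void)ctx;
    return 0;
}

/* Simulated wait: times out until wait number intr_at, then fails with EINTR. */
int
sim_wait(void *ctx)
{
    struct sim *s = ctx;

    s->waits++;
    errno = (s->waits == s->intr_at) ? EINTR : EWOULDBLOCK;
    return -1;
}
```

Calling progress_loop(sim_work, sim_wait, &s) with s.intr_at set to 3 returns
EINTR after the third wait. The original loop, which ignored the ioctl result,
would keep spinning in the same situation.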
>
>FWIW, even with the fix to the user-level code, we still have the
>original problem (one lingering thread using no CPU) in RELENG_5.
>
>Drew