poll() POLLHUP behaviour inconsistency
Jérémie Galarneau
jeremie.galarneau at efficios.com
Wed Nov 4 22:01:58 UTC 2020
On Tue, 3 Nov 2020 at 19:12, Mateusz Guzik <mjguzik at gmail.com> wrote:
>
> On 11/4/20, Jérémie Galarneau <jeremie.galarneau at efficios.com> wrote:
> > Hi,
> >
> > Michael and myself are porting code from Linux to FreeBSD and we have
> > noticed a
> > peculiar difference in the way poll() events are handled.
> >
> > In short, we have a process that monitors the lifetime of other processes.
> > It
> > does so by sharing a pipe between the parent and the child on every fork:
> > read-end in the parent, write-end in the child. The pipe is not used to
> > communicate; it's only used to poll() on the death of the child process.
> >
> > On Linux, poll() is called with a POLLHUP event and nothing else. When
> > the child process dies, the poll() call returns with 'revents == POLLHUP'.
> >
> > After some head scratching, we figured that on FreeBSD (12.1 and 12.2) if
> > the
> > child process died while the parent was not waiting in poll(), we would get
> > 'revents == POLLHUP' on the next invocation of poll(), like we do on Linux.
> > However, if the parent is in poll() when the child dies, the call to poll()
> > never unblocks. This resulted in occasional hangs in the application.
> >
> > I am joining a reproducer [1].
> >
> >
> > As indicated, changing the 'events' to 'POLLIN | POLLHUP' causes both events
> > to
> > be delivered in both cases (child dies before/during calling poll()).
> >
> > The following excerpts of the FreeBSD, Linux, and Open Specification seem
> > in agreement that passing POLLHUP is unnecessary as it is checked
> > implicitly.
> >
> > FreeBSD (POLL(2))
> > This flag is always checked, even if not present in the events bitmask
> > [...]
> >
> > Open Group:
> > This flag is only valid in the revents bitmask; it shall be ignored in the
> > events member.
> >
> > Linux (poll(2)):
> > Hang up (only returned in revents; ignored in events).
> >
> >
> > I am surprised by the behaviour being different depending on the moment the
> > child process' death occurs.
> >
> > This is not a big deal for us to work-around, but I would like to know if I
> > should open a bug and try to fix it or if this is intentional (and perhaps
> > documented?) behaviour.
> >
> > Thanks!
> > Jérémie Galarneau
> >
> > [1] https://gist.github.com/jgalar/5c3c2673b69fa0df652bda80a88f860c
> >
>
> Thanks for the detailed report with a reproducer.
>
> pipe_poll checks for POLLIN | POLLRDNORM and POLLOUT | POLLWRNORM in
> order to decide whether to add itself to the list of waiters. Since
> you don't specify any of it and POLLHUP condition is not met, it
> neglects to do anything but at the same time does not return any
> events to poll itself. Then poll blocks waiting for wakeups which
> never come since pipe_poll did not add us anywhere.
>
> A trivial hack looks like this:
> diff --git a/sys/kern/sys_pipe.c b/sys/kern/sys_pipe.c
> index 239cf3bbdfb..59bc03e032a 100644
> --- a/sys/kern/sys_pipe.c
> +++ b/sys/kern/sys_pipe.c
> @@ -1458,13 +1458,13 @@ pipe_poll(struct file *fp, int events, struct
> ucred *active_cred,
> }
>
> if (revents == 0) {
> - if (fp->f_flag & FREAD && events & (POLLIN | POLLRDNORM)) {
> + if (fp->f_flag & FREAD) {
> selrecord(td, &rpipe->pipe_sel);
> if (SEL_WAITING(&rpipe->pipe_sel))
> rpipe->pipe_state |= PIPE_SEL;
> }
>
> - if (fp->f_flag & FWRITE && events & (POLLOUT | POLLWRNORM)) {
> + if (fp->f_flag & FWRITE) {
> selrecord(td, &wpipe->pipe_sel);
> if (SEL_WAITING(&wpipe->pipe_sel))
> wpipe->pipe_state |= PIPE_SEL;
>
> With this in place the reproducer passes. I don't know yet if this is
> just a pipe or general poll problem.
>
> I don't know what the right fix is right now, may take few days. This
> may or may not be a candidate for errata for the 12.2 release,
> depending on how the extensive the real fix will turn out to be.
>
> That said you may need to implement a workaround regardless of the
> issue getting fixed.
Hi,
Thanks for the great (and quick!) reply.
Let me know if I can help with the fix in any way.
Jérémie
>
> Thanks,
> --
> Mateusz Guzik <mjguzik gmail.com>
--
Jérémie Galarneau
EfficiOS Inc.
http://www.efficios.com
More information about the freebsd-hackers
mailing list