poll() POLLHUP behaviour inconsistency

Mateusz Guzik mjguzik at gmail.com
Wed Nov 4 00:12:13 UTC 2020


On 11/4/20, Jérémie Galarneau <jeremie.galarneau at efficios.com> wrote:
> Hi,
>
> Michael and myself are porting code from Linux to FreeBSD and we have
> noticed a
> peculiar difference in the way poll() events are handled.
>
> In short, we have a process that monitors the lifetime of other processes.
> It
> does so by sharing a pipe between the parent and the child on every fork:
> read-end in the parent, write-end in the child. The pipe is not used to
> communicate; it's only used to poll() on the death of the child process.
>
> On Linux, poll() is called with a POLLHUP event and nothing else. When
> the child process dies, the poll() call returns with 'revents == POLLHUP'.
>
> After some head scratching, we figured that on FreeBSD (12.1 and 12.2) if
> the
> child process died while the parent was not waiting in poll(), we would get
> 'revents == POLLHUP' on the next invocation of poll(), like we do on Linux.
> However, if the parent is in poll() when the child dies, the call to poll()
> never unblocks. This resulted in occasional hangs in the application.
>
> I am joining a reproducer [1].
>
>
> As indicated, changing the 'events' to 'POLLIN | POLLHUP' causes both events
> to
> be delivered in both cases (child dies before/during calling poll()).
>
> The following excerpts of the FreeBSD, Linux, and Open Specification seem
> in agreement that passing POLLHUP is unnecessary as it is checked
> implicitly.
>
> FreeBSD (POLL(2))
>   This flag is always checked, even if not present in the events bitmask
> [...]
>
> Open Group:
>   This flag is only valid in the revents bitmask; it shall be ignored in the
>   events member.
>
> Linux (poll(2)):
>   Hang up (only  returned  in revents; ignored in events).
>
>
> I am surprised by the behaviour being different depending on the moment the
> child process' death occurs.
>
> This is not a big deal for us to work-around, but I would like to know if I
> should open a bug and try to fix it or if this is intentional (and perhaps
> documented?) behaviour.
>
> Thanks!
> Jérémie Galarneau
>
> [1] https://gist.github.com/jgalar/5c3c2673b69fa0df652bda80a88f860c
>

Thanks for the detailed report with a reproducer.

pipe_poll checks for POLLIN | POLLRDNORM and POLLOUT | POLLWRNORM in
order to decide whether to add itself to the list of waiters. Since
you don't specify any of it and POLLHUP condition is not met, it
neglects to do anything but at the same time does not return any
events to poll itself. Then poll blocks waiting for wakeups which
never come since pipe_poll did not add us anywhere.

A trivial hack looks like this:
diff --git a/sys/kern/sys_pipe.c b/sys/kern/sys_pipe.c
index 239cf3bbdfb..59bc03e032a 100644
--- a/sys/kern/sys_pipe.c
+++ b/sys/kern/sys_pipe.c
@@ -1458,13 +1458,13 @@ pipe_poll(struct file *fp, int events, struct
ucred *active_cred,
        }

        if (revents == 0) {
-               if (fp->f_flag & FREAD && events & (POLLIN | POLLRDNORM)) {
+               if (fp->f_flag & FREAD) {
                        selrecord(td, &rpipe->pipe_sel);
                        if (SEL_WAITING(&rpipe->pipe_sel))
                                rpipe->pipe_state |= PIPE_SEL;
                }

-               if (fp->f_flag & FWRITE && events & (POLLOUT | POLLWRNORM)) {
+               if (fp->f_flag & FWRITE) {
                        selrecord(td, &wpipe->pipe_sel);
                        if (SEL_WAITING(&wpipe->pipe_sel))
                                wpipe->pipe_state |= PIPE_SEL;

With this in place the reproducer passes. I don't know yet if this is
just a pipe or general poll problem.

I don't know what the right fix is right now, may take few days. This
may or may not be a candidate for errata for the 12.2 release,
depending on how the extensive the real fix will turn out to be.

That said you may need to implement a workaround regardless of the
issue getting fixed.

Thanks,
-- 
Mateusz Guzik <mjguzik gmail.com>


More information about the freebsd-hackers mailing list