Re: Automount + NFS hang issues (follow-up to FreeBSD 12.3/13.1 NFS client hang thread)
Date: Mon, 15 Apr 2024 19:48:33 UTC

On Sun, Apr 14, 2024 at 12:20:36PM -0700, Rick Macklem wrote:
> On Sun, Apr 14, 2024 at 8:48 AM Andreas Kempe <kempe@lysator.liu.se> wrote:
> >
> > Hello,
> >
> > I'm doing a follow-up on the thread FreeBSD 12.3/13.1 NFS client hang,
> > message ID YpEwxdGCouUUFHiE@shipon.lysator.liu.se.
> >
> > After having had recurring issues ever since that thread and not
> > managing a good tcpdump of a hang, I decided to simply get rid of
> > automount and instead mount the NFS shares via the fstab. With the
> > mounts being done via the fstab instead of automount, the NFS server
> > restarting causes processes using the mount to hang, but when the
> > server comes back things recover.
> >
> > When using automount as the NFS server becomes unresponsive, the
> > system log is filled with lines like
> >
> > 7 Apr 10 13:00:14 shipon kernel: WARNING: autofs_trigger_one: request for /home/ completed with error 60, pid 68836 (fish)
> > 8 Apr 10 13:00:14 shipon kernel: WARNING: autofs_trigger_one: request for /home/ completed with error 60, pid 69248 (sshd)
> > 9 Apr 10 13:00:14 shipon kernel: WARNING: autofs_trigger_one: request for /home/ completed with error 60, pid 2221 (weechat)
> >
> > and it seems like automount is repeatedly trying to perform mounts
> > until the system eventually hangs. When the system has hung, all
> > automount processes are stuck in the kernel in uninterruptable sleep
> > in the NFS code. You can find some stack traces in the old thread.
> > Sometimes a umount -N on all the mounts would solve the issue, but
> > often a system reboot was the only way to recover.
> I might take another look, but I have never been involved in the automount code
> and never use the automounter, so I doubt I'll figure out how to fix it.
>

No pressure to have a look for our sake. Mounting from the fstab works
for us for the foreseeable future. I mostly thought I'd report it for
posterity and anyone else encountering the same issue.

// Andreas Kempe