Re: Kernel panics with vfs.nfsd.enable_locallocks=1 and nfs clients doing hdf5 file operations
Date: Wed, 21 Aug 2024 15:02:28 UTC
Hi Rick,

Done - https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=280978

Thanks!

-Matt

On 8/21/24 10:45 AM, Rick Macklem wrote:
> Please create a PR for this and include at least
> one backtrace. I will try and figure out how
> locallocks could cause it.
>
> I suspect few use locallocks=1.
>
> rick
>
> On Wed, Aug 21, 2024 at 7:29 AM Matthew L. Dailey
> <Matthew.L.Dailey@dartmouth.edu> wrote:
>
> Hi all,
>
> I posted messages to this list back in February and March
> (https://lists.freebsd.org/archives/freebsd-current/2024-February/005546.html)
> regarding kernel panics we were having with nfs clients doing hdf5 file
> operations. After a hiatus in troubleshooting, I had more time this
> summer and have found the cause - the vfs.nfsd.enable_locallocks sysctl.
>
> When this is set to 1, we can induce either a panic or (more rarely) a
> hung nfs server, usually within a few hours, but sometimes only after
> several days to a week. We have replicated this on 13.0 through 15.0-CURRENT
> (20240725-82283cad12a4-271360). With this set to 0 (default), we are
> unable to replicate the issue, even after several weeks of 24/7 hdf5
> file operations.
>
> One other side-effect of these panics is that on a few occasions it has
> corrupted the root zpool beyond repair. This makes sense since kernel
> memory is getting corrupted, but obviously makes this issue more
> impactful.
>
> I'm hoping this is enough information to start narrowing down this
> issue. We are specifically using this sysctl because we are also serving
> files via samba and want to ensure consistent locking.
>
> I have provided some core dumps and backtraces previously, but am happy
> to provide more as needed. I also have a writeup of exactly how to
> reproduce this that I can send directly to anyone who is interested.
>
> Thanks so much for any and all help with this tricky problem. I'm happy
> to do whatever I can to help get this squashed.
>
> Best,
> Matt
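
For anyone wanting to check whether a server is running in the configuration described above, here is a minimal sketch. It assumes a FreeBSD host where `sysctl -n vfs.nfsd.enable_locallocks` prints the bare integer value; the function names are hypothetical and not part of any FreeBSD tooling, and this is not the reproduction procedure mentioned in the thread.

```python
import subprocess

# Sysctl name taken from the thread; per the report, 0 is the default.
SYSCTL_NAME = "vfs.nfsd.enable_locallocks"


def locallocks_enabled(raw: str) -> bool:
    """Interpret the text printed by `sysctl -n <name>` (hypothetical helper)."""
    return int(raw.strip()) != 0


def read_locallocks() -> bool:
    """Query the running kernel; assumes the sysctl binary is on PATH."""
    result = subprocess.run(
        ["sysctl", "-n", SYSCTL_NAME],
        check=True,           # raise if the sysctl does not exist on this host
        capture_output=True,
        text=True,
    )
    return locallocks_enabled(result.stdout)
```

Calling `read_locallocks()` on the NFS server would return `True` when the sysctl is set to 1, the setting under which the panics were reported, and `False` for the default of 0.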