Re: 13-stable NFS server hang
Date: Fri, 01 Mar 2024 08:00:20 UTC
Interesting read.

Would it be possible to separate the locking for admin actions, like a client
mounting a filesystem, from the traffic of ongoing file operations? For example,
ongoing file operations could work from a read-only view/copy of the mount
table. Only new operations would have to wait, but the mount would never need
to wait for ongoing operations before locking the structure. (A rough sketch of
this idea appears at the end of this message.)

Just a thought in the morning.

Regards,
Ronald.


From: Rick Macklem <rick.macklem@gmail.com>
Date: 1 March 2024 00:31
To: Garrett Wollman <wollman@bimajority.org>
CC: stable@freebsd.org, rmacklem@freebsd.org
Subject: Re: 13-stable NFS server hang

>
> On Wed, Feb 28, 2024 at 4:04PM Rick Macklem wrote:
> >
> > On Tue, Feb 27, 2024 at 9:30PM Garrett Wollman wrote:
> > >
> > > Hi, all,
> > >
> > > We've had some complaints of NFS hanging at unpredictable intervals.
> > > Our NFS servers are running a 13-stable from last December, and
> > > tonight I sat in front of the monitor watching `nfsstat -dW`. I was
> > > able to clearly see that there were periods when NFS activity would
> > > drop *instantly* from 30,000 ops/s to flat zero, which would last
> > > for about 25 seconds before resuming exactly as it was before.
> > >
> > > I wrote a little awk script to watch for this happening and run
> > > `procstat -k` on the nfsd process, and I saw that all but two of the
> > > service threads were idle. The three nfsd threads that had non-idle
> > > kstacks were:
> > >
> > > PID TID COMM TDNAME KSTACK
> > > 997 108481 nfsd nfsd: master mi_switch sleepq_timedwait _sleep nfsv4_lock nfsrvd_dorpc nfssvc_program svc_run_internal svc_run nfsrvd_nfsd nfssvc_nfsd sys_nfssvc amd64_syscall fast_syscall_common
> > > 997 960918 nfsd nfsd: service mi_switch sleepq_timedwait _sleep nfsv4_lock nfsrv_setclient nfsrvd_exchangeid nfsrvd_dorpc nfssvc_program svc_run_internal svc_thread_start fork_exit fork_trampoline
> > > 997 962232 nfsd nfsd: service mi_switch _cv_wait txg_wait_synced_impl txg_wait_synced dmu_offset_next zfs_holey zfs_freebsd_ioctl vn_generic_copy_file_range vop_stdcopy_file_range VOP_COPY_FILE_RANGE vn_copy_file_range nfsrvd_copy_file_range nfsrvd_dorpc nfssvc_program svc_run_internal svc_thread_start fork_exit fork_trampoline
> > >
> > > I'm suspicious of two things: first, the copy_file_range RPC; second,
> > > the "master" nfsd thread is actually servicing an RPC which requires
> > > obtaining a lock. The "master" getting stuck while performing client
> > > RPCs is, I believe, the reason NFS service grinds to a halt when a
> > > client tries to write into a near-full filesystem, so this problem
> > > would be more evidence that the dispatching function should not be
> > > mixed with actual operations. I don't know what the clients are
> > > doing, but is it possible that nfsrvd_copy_file_range is holding a
> > > lock that is needed by one or both of the other two threads?
> > >
> > > Near-term I could change nfsrvd_copy_file_range to just
> > > unconditionally return NFSERR_NOTSUP and force the clients to fall
> > > back, but I figured I would ask if anyone else has seen this.
> > I have attached a little patch that should limit the server's Copy size
> > to vfs.nfsd.maxcopyrange (default of 10Mbytes).
> > Hopefully this makes sure that the Copy does not take too long.
> >
> > You could try this instead of disabling Copy. It would be nice to know if
> > this is sufficient? (If not, I'll probably add a sysctl to disable Copy.)
> I did a quick test without/with this patch, where I copied a 1Gbyte file.
>
> Without this patch, the Copy RPCs mostly replied in just under 1sec
> (which is what the flag requests), but took over 4sec for one of the Copy
> operations. This implies that one Read/Write of 1Mbyte on the server
> took over 3 seconds.
> I noticed the first Copy did over 600Mbytes, but the rest did about 100Mbytes
> each and it was one of these 100Mbyte Copy operations that took over 4sec.
>
> With the patch, there were a lot more Copy RPCs (as expected) of 10Mbytes
> each and they took a consistent 0.25-0.3sec to reply. (This is a test of a local
> mount on an old laptop, so nowhere near a server hardware config.)
>
> So, the patch might be sufficient?
>
> It would be nice to avoid disabling Copy, since it avoids reading the data
> into the client and then writing it back to the server.
>
> I will probably commit both patches (10Mbyte clip of Copy size and
> disabling Copy) to main soon, since I cannot say if clipping the size
> of the Copy will always be sufficient.
>
> Please let us know how trying these patches goes, rick
> >
> > rick
> >
> > >
> > > -GAWollman
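
As a rough illustration of the read-only view/copy idea above (a plain userland
sketch with made-up names, not the actual nfsd data structures or locking):
file operations pin an immutable, refcounted snapshot of the state table, while
an admin action such as a new mount builds a replacement and swaps it in, so
in-flight operations never wait on the admin path.

    #include <pthread.h>
    #include <stdatomic.h>
    #include <stdio.h>
    #include <stdlib.h>

    /* Hypothetical stand-in for the mount/state table. */
    struct snap {
            atomic_int refs;     /* operations still using this view */
            int        nmounts;  /* toy payload */
    };

    static struct snap *cur;                                      /* current view */
    static pthread_mutex_t cur_lock = PTHREAD_MUTEX_INITIALIZER;  /* guards 'cur' only */

    /* Called at the start of a file operation: pin the current view. */
    static struct snap *
    snap_acquire(void)
    {
            pthread_mutex_lock(&cur_lock);      /* held only for the pointer copy */
            struct snap *s = cur;
            atomic_fetch_add(&s->refs, 1);
            pthread_mutex_unlock(&cur_lock);
            return (s);
    }

    /* Called when the operation finishes; the last user frees the old view. */
    static void
    snap_release(struct snap *s)
    {
            if (atomic_fetch_sub(&s->refs, 1) == 1)
                    free(s);
    }

    /* Admin action (e.g. a new mount): build a new view and swap it in. */
    static void
    admin_add_mount(void)
    {
            struct snap *ns = malloc(sizeof(*ns));
            atomic_init(&ns->refs, 1);          /* the table's own reference */
            pthread_mutex_lock(&cur_lock);
            ns->nmounts = cur->nmounts + 1;     /* copy + modify */
            struct snap *old = cur;
            cur = ns;
            pthread_mutex_unlock(&cur_lock);
            snap_release(old);                  /* drop the table's reference */
    }

    int
    main(void)
    {
            cur = malloc(sizeof(*cur));
            atomic_init(&cur->refs, 1);
            cur->nmounts = 0;

            struct snap *view = snap_acquire(); /* an in-flight file operation */
            admin_add_mount();                  /* does not wait for 'view' */
            printf("op sees %d mounts, table now has %d\n",
                view->nmounts, cur->nmounts);
            snap_release(view);
            snap_release(cur);                  /* final cleanup for the demo */
            return (0);
    }

The point of the sketch is only that the admin path never blocks behind
operations already in flight; it pays for that by copying the table.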
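And to make the effect of the Copy-size clip concrete, a small sketch of the
arithmetic (clamp_copy_len and the loop are made-up names, not the actual
nfsrvd_copy_file_range() code): an NFSv4.2 Copy may reply with a short byte
count, so the server can clip each Copy to the vfs.nfsd.maxcopyrange limit and
the client simply issues follow-up Copies for the remainder, which bounds how
long any single RPC can run.

    #include <stdint.h>
    #include <stdio.h>

    /* Stand-in for the vfs.nfsd.maxcopyrange sysctl (default 10 Mbytes). */
    static uint64_t maxcopyrange = 10ULL * 1024 * 1024;

    /* The server clips each Copy request to the tunable maximum. */
    static uint64_t
    clamp_copy_len(uint64_t requested)
    {
            return (requested > maxcopyrange ? maxcopyrange : requested);
    }

    int
    main(void)
    {
            uint64_t remaining = 1024ULL * 1024 * 1024;  /* copying a 1 Gbyte file */
            unsigned rpcs = 0;

            /* The client keeps issuing Copy RPCs until the short replies add up. */
            while (remaining > 0) {
                    remaining -= clamp_copy_len(remaining);
                    rpcs++;
            }
            printf("1 Gbyte copied in %u Copy RPCs of at most %ju bytes each\n",
                rpcs, (uintmax_t)maxcopyrange);
            return (0);
    }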