Re: NFS, intermittent 'RPC struct is bad' errors
- In reply to: Lexi Winter : "NFS, intermittent 'RPC struct is bad' errors"
Date: Wed, 19 Jun 2024 14:21:25 UTC
On Tue, Jun 18, 2024 at 11:32 PM Lexi Winter <lexi@le-fay.org> wrote:
>
> hi,
>
> i have a few systems running NFSv4 on FreeBSD, using Kerberos (MIT
> Kerberos KDC), with the server exporting ZFS filesystems.
>
> recently i've noticed intermittent errors of 'RPC struct is bad' when
> writing to the NFS server, which usually resolves itself after retrying.
> for example:
>
> % rsync -iavP /scratch/Star.Trek.Prodigy.S01E* .
> sending incremental file list
> >f++++++++++ Star.Trek.Prodigy.S01E01E02.1080p.WEBRip.x265-KONTRAST.mkv
> 32,768 0% 0.00kB/s 0:00:00 rsync: [receiver] write failed on "/data/public/TV/Star Trek Prodigy/Season 01/Star.Trek.Prodigy.S01E01E02.1080p.WEBRip.x265-KONTRAST.mkv": RPC struct is bad (72)
> rsync error: error in file IO (code 11) at receiver.c(380) [receiver=3.3.0]
>
> rsync: [sender] write error: Broken pipe (32)
> % rsync -iavP /scratch/Star.Trek.Prodigy.S01E* .
> sending incremental file list
> >f.st....... Star.Trek.Prodigy.S01E01E02.1080p.WEBRip.x265-KONTRAST.mkv
> 912,704,431 100% 96.51MB/s 0:00:09 (xfr#1, to-chk=18/19)
> >f++++++++++ Star.Trek.Prodigy.S01E03.1080p.WEBRip.x265-KONTRAST.mkv
> 477,408,567 100% 100.06MB/s 0:00:04 (xfr#2, to-chk=17/19)
> [...]
>
> the client is running FreeBSD 15.0-CURRENT from around May 24, and the
> server is running a slightly older 15.0-CURRENT from around May 23.
>
> /etc/exports on the server is pretty standard:
>
> /data/public -sec=krb5:krb5i:krb5p -network 2001:8b0:aab5::/48
> /data/public/Books -sec=krb5:krb5i:krb5p -network 2001:8b0:aab5::/48
> /data/public/CalibreLibrary -sec=krb5:krb5i:krb5p -network 2001:8b0:aab5::/48
> /data/public/Comics -sec=krb5:krb5i:krb5p -network 2001:8b0:aab5::/48
> /data/public/Films -sec=krb5:krb5i:krb5p -network 2001:8b0:aab5::/48
> /data/public/Miscellaneous -sec=krb5:krb5i:krb5p -network 2001:8b0:aab5::/48
> V4: /data -sec=sys:krb5:krb5i:krb5p -network 2001:8b0:aab5::/48
>
> client mount options:
>
> hemlock.eden.le-fay.org:/public /data/public nfs rw,nfsv4,minorversion=2,sec=krb5p,gssname=host,bgnow,proto=tcp6,nconnect=4,rsize=1048576,wsize=1048576,noncontigwr 0 0
>
> is there anything more i can do investigate this? would a tcpdump
> capture of the error be useful (considering all the RPC traffic is
> Kerberos-encrypted)?

If you could do a run that causes these failures safely without on the wire
encryption, you could switch the mount to "krb5i".
Then a tcpdump done via something like:
# tcpdump -s 0 -w out.pcap host <other-system>
followed by pulling out.pcap into wireshark, you could maybe see where the
failure is occurring. (Unlike tcpdump, wireshark decodes NFS traffic quite
nicely.)

rick
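
A rough sketch of the steps suggested above, in case it helps. It only prints the commands rather than running them, since mounting and capturing need root. Assumptions not stated in the thread: the capture is taken on the client (so the "other system" is the server, hemlock.eden.le-fay.org), the remount is done by hand with mount -o instead of editing fstab, and to_krb5i is a made-up helper name. The server's exports already list krb5i, so only the client side should need changing.

```shell
#!/bin/sh
# Sketch only: derive a krb5i variant of the client mount options quoted
# in the original post and print the commands one might run as root.

# Options copied from the client fstab line quoted above.
OPTS='rw,nfsv4,minorversion=2,sec=krb5p,gssname=host,bgnow,proto=tcp6,nconnect=4,rsize=1048576,wsize=1048576,noncontigwr'

# Hypothetical helper: swap sec=krb5p for sec=krb5i, so the RPC payload is
# integrity-protected but no longer encrypted on the wire.
to_krb5i() {
    printf '%s\n' "$1" | sed -e 's/sec=krb5p/sec=krb5i/'
}

KRB5I_OPTS=$(to_krb5i "$OPTS")

# Print rather than execute; review before running anything.
echo "umount /data/public"
echo "mount -t nfs -o $KRB5I_OPTS hemlock.eden.le-fay.org:/public /data/public"
# Assumed: capture on the client, filtering on the server's hostname.
echo "tcpdump -s 0 -w out.pcap host hemlock.eden.le-fay.org"
```

With the capture running, repeating the rsync until the error shows up and then opening out.pcap in wireshark should make it possible to look for the failing write around the time of the error.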