nfsrvstats.srvrpc_errs rapidly increasing
Mohan Srinivasan
mohan_srinivasan at yahoo.com
Tue May 10 13:12:04 PDT 2005
Hi,
The srvrpc_errs are very likely unrelated to the hangs.
nfs_rephead() is called (via the contorted macros nfsm_reply() and
friends) from the NFS server routines in nfs_serv.c. The error
that was returned by the vnode op called is passed into
nfs_rephead(), whence it gets into the NFS reply. The fact that
you see these errors go up is not abnormal. In your case, over
90% of these errors are ENOENT.
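To make the numbers in the quoted log output easier to read, here is a
minimal sketch that maps them to their NFSv3 status names as defined in
RFC 1813. It is not taken from the FreeBSD sources: the table and the
helper name nfs_status_str() are hypothetical, and only the values
mentioned in this thread are covered. On FreeBSD these values coincide
with the errno values of the same meaning (2 is ENOENT, 70 is ESTALE).

```c
#include <stddef.h>

/* One NFSv3 status value (RFC 1813) with its name and a short meaning. */
struct nfs_status_entry {
    int code;
    const char *name;
    const char *meaning;
};

/* Hypothetical lookup table: only the values reported in this thread. */
static const struct nfs_status_entry nfs_status_table[] = {
    {  1, "NFS3ERR_PERM",     "not owner" },
    {  2, "NFS3ERR_NOENT",    "no such file or directory" },
    { 13, "NFS3ERR_ACCES",    "permission denied" },
    { 17, "NFS3ERR_EXIST",    "file exists" },
    { 66, "NFS3ERR_NOTEMPTY", "directory not empty" },
    { 70, "NFS3ERR_STALE",    "stale file handle" },
};

/* Return the NFSv3 name for a status value, or "unknown" if not listed. */
const char *nfs_status_str(int code)
{
    size_t n = sizeof(nfs_status_table) / sizeof(nfs_status_table[0]);
    size_t i;

    for (i = 0; i < n; i++) {
        if (nfs_status_table[i].code == code)
            return nfs_status_table[i].name;
    }
    return "unknown";
}
```

With this, nfs_status_str(2) yields "NFS3ERR_NOENT", which is what a
Lookup on a nonexistent name returns, so a steadily growing count is
expected on a busy server.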
Are you using NFS/TCP? Can you force the mount to NFS/UDP?
I have seen a bug in the FreeBSD 5.x NFS server where, in the
NFS/TCP case, the stream gets out of sync. This results in the
RPC record markers being completely wrong, which confuses clients.
I don't know whether this bug can cause the Linux client to hang,
but it is definitely worth eliminating as a factor.
The FreeBSD NFS client recovers from this by tearing down the
connection and reconnecting; other clients may behave strangely.
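As background on why an out-of-sync stream is so confusing: over TCP,
each RPC fragment is preceded by a 4-byte big-endian record marker
whose top bit flags the last fragment and whose low 31 bits give the
fragment length (RFC 1831). The sketch below decodes such a marker and
applies a simple plausibility check. It is an illustration only, not
the FreeBSD client or server code; the names rpc_decode_mark() and
rpc_mark_plausible() and the max_len limit are assumptions.

```c
#include <stdint.h>

#define RPC_RM_LAST_FRAG 0x80000000u  /* top bit: last fragment of record */
#define RPC_RM_LEN_MASK  0x7fffffffu  /* low 31 bits: fragment length */

struct rpc_record_mark {
    int last_fragment;   /* nonzero if this fragment ends the record */
    uint32_t length;     /* number of payload bytes that follow */
};

/* Decode the 4-byte, big-endian record marker that precedes a fragment. */
struct rpc_record_mark rpc_decode_mark(const unsigned char hdr[4])
{
    struct rpc_record_mark m;
    uint32_t word = ((uint32_t)hdr[0] << 24) | ((uint32_t)hdr[1] << 16) |
                    ((uint32_t)hdr[2] << 8)  |  (uint32_t)hdr[3];

    m.last_fragment = (word & RPC_RM_LAST_FRAG) != 0;
    m.length = word & RPC_RM_LEN_MASK;
    return m;
}

/*
 * Hypothetical sanity check: a marker read from the wrong stream offset
 * usually decodes to a zero or absurdly large length. max_len is a limit
 * chosen by the caller, not a protocol constant.
 */
int rpc_mark_plausible(struct rpc_record_mark m, uint32_t max_len)
{
    return m.length > 0 && m.length <= max_len;
}
```

If the receiver starts reading a marker in the middle of a payload, it
interprets four arbitrary data bytes as a length, which is why a client
that does not validate the marker and reconnect can end up waiting for
data that never arrives.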
mohan
> In order to find the cause of the problems with our Linux NFS clients, i
> took a look at 'nfsstat -s' on our FreeBSD server (RELENG_5_3).
> I noticed that "Server Ret-Failed" was rapidly increasing. After 1 day
> of uptime, it is already at 643936:
>
> #######################################################################
> root@antsrv1 [~] # nfsstat -s
>
> Server Info:
>   Getattr  Setattr   Lookup Readlink     Read    Write   Create   Remove
>   2501670   234193  1051157    12421   365378   185952    61166    74050
>    Rename     Link  Symlink    Mkdir    Rmdir  Readdir RdirPlus   Access
>     60646    19767      246     1494      354     2265    50548  4465364
>     Mknod   Fsstat   Fsinfo PathConf   Commit
>        12      588      141        0   103946
> Server Ret-Failed
>            643936
> Server Faults
>             0
> Server Cache Stats:
> Inprog Idem Non-idem Misses
> 3 5 0 162819
> Server Write Gathering:
> WriteOps WriteRPC Opsaved
> 185952 185952 0
> root@antsrv1 [~] # uptime
> 4:24PM up 1 day, 17 mins, 4 users, load averages: 0.02, 0.03, 0.00
> ######################################################################
>
> Looking into nfsstat's source, i found that "nfsrvstats.srvrpc_errs" is
> the counter shown. Grep-ing the kernel sources showed that it is
> increased by /usr/src/sys/nfsserver/nfs_srvsock.c.
> It seems to be a catch-all for unexpected rpc errors.
> The procedure, nfs_rephead(), is called by nfs_srvcache.c, where
> rp->rc_status is supplied as value for the error.
> At this point i am unable to track things any further, i am not familiar
> with kernel sources.
>
> Question: is there a list of error codes somewhere?
>
> I hacked a log output into nfs_srvsock.c:
>
> --- nfs_srvsock.c Sat Jul 24 04:07:09 2004
> +++ nfs_srvsock.ANT.c Tue May 10 16:30:52 2005
> @@ -213,8 +213,10 @@
>      }
>      *mbp = mb;
>      *bposp = bpos;
> -    if (err != 0 && err != NFSERR_RETVOID)
> +    if (err != 0 && err != NFSERR_RETVOID){
>          nfsrvstats.srvrpc_errs++;
> +        log(LOG_WARNING, "ANT: unknown RPC error %d\n", err);
> +    }
>      return mreq;
>  }
>
> Most errors (>90%) are "2", but i also see 1, 13, 17, 66, 70
>
> Any thoughts on this? We do have annoying problems with Linux clients
> (2.6.8) occasionally hanging when mounting from the FBSD machine. I
> don't know if this is related, but at least it's a point to start.
>
> Thanks for any help,
>
> Heinrich Rebehn
> --
>
> Heinrich Rebehn
>
> University of Bremen
> Physics / Electrical and Electronics Engineering
> - Department of Telecommunications -
>
> Phone : +49/421/218-4664
> Fax : -3341
> _______________________________________________
> freebsd-net at freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe at freebsd.org"
>