Re: HEADS UP: NFS changes coming into CURRENT early February
Date: Wed, 29 Jan 2025 05:13:57 UTC

On Mon, Jan 27, 2025 at 06:10:42PM -0800, Rick Macklem wrote:
R> I think I've found a memory leak, but it shouldn't be a show stopper.
R> 
R> What I did on the NFS client side is:
R> # vmstat -m | fgrep -i rpc
R> # mount -t nfs -o nfsv4,tls nfsv4-server:/ /mnt
R> # ls -lR /mnt
R> --> Then I network partitioned it from the server a few times, until
R>     the TCP connection closed.
R>     (My client is in bhyve and the server on the system the bhyve
R>     instance is running in. I just "ifconfig bridge0 down", waited for
R>     the TCP connection to close ("netstat -a"), then "ifconfig bridge0 up".)
R> Once done, I
R> # umount /mnt
R> # vmstat -m | fgrep -i rpc
R> and saw a somewhat larger allocation count.
R> 
R> The allocation count only goes up if I do the network partitioning,
R> and only on the NFS client side.
R> 
R> Since the leak is slow and only happens when the TCP connection
R> breaks, I do not think it is a show stopper, and one of us can track it
R> down someday.

I reproduced the recipe and found two problems, but don't have a final
solution for either.

First, when we create the backchannel in sys/fs/nfs/nfs_commonkrpc.c
newnfs_connect():

		if (nfs_numnfscbd > 0) {
			nfs_numnfscbd++;
			NFSD_UNLOCK();
			xprt = svc_vc_create_backchannel(nfscbd_pool);
			CLNT_CONTROL(client, CLSET_BACKCHANNEL, xprt);
			NFSD_LOCK();
			nfs_numnfscbd--;
			if (nfs_numnfscbd == 0)
				wakeup(&nfs_numnfscbd);
		}

svc_vc_create_backchannel() creates the xprt with refcount=1. Then we
link it to our client, and CLNT_CONTROL() takes the refcount to 2. Then
this function forgets the pointer (but it still owns a reference).
Whenever the client is destroyed, it will do SVC_RELEASE() on the
backchannel, the refcount goes from 2 to 1, and the xprt is leaked.

I made a patch against that and it does reduce the amount of M_RPC leaked
after your recipe, but once I got a panic where the backchannel xprt is
actually used after free. So it looks like there is a case that my patch
doesn't cover. What drives me crazy is that I can't reproduce it a second
time and can't get any clue from the single core. Patch attached.

Second, with the patch the M_RPC leak count for me is 2. And I found that
these two items basically are a clnt_vc that belongs to a closed
connection:

fffff80029614a80 tcp4       0      0  10.6.6.9.772        10.6.6.9.2049       CLOSED

There is no peer connection, as the server received a timeout trying to
send. But rpc.tlsclntd doesn't try to send anything on the socket; it
just keeps it in its select(2) fd set and doesn't garbage collect it. So
it is a bigger resource leak than just two pieces of M_RPC. I don't
think this is related to my changes.

-- 
Gleb Smirnoff
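
For illustration, the minimal ownership-transfer fix for the first problem
could look roughly like the sketch below. This is just the idea described
above, not necessarily the attached patch, and as noted it may not cover
every teardown path (hence the use-after-free panic I saw once):

		if (nfs_numnfscbd > 0) {
			nfs_numnfscbd++;
			NFSD_UNLOCK();
			xprt = svc_vc_create_backchannel(nfscbd_pool);
			/* refcount == 1: reference owned by the creator. */
			CLNT_CONTROL(client, CLSET_BACKCHANNEL, xprt);
			/* refcount == 2: the client now holds its own reference. */
			SVC_RELEASE(xprt);
			/*
			 * Drop the creator's reference, since newnfs_connect()
			 * forgets the pointer here.  The client's SVC_RELEASE()
			 * on destruction can then take the refcount to 0.
			 */
			NFSD_LOCK();
			nfs_numnfscbd--;
			if (nfs_numnfscbd == 0)
				wakeup(&nfs_numnfscbd);
		}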
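
As for the second problem, here is a generic illustration (not rpc.tlsclntd
code) of the kind of garbage collection that is missing: a dead TCP
connection makes the socket readable, and a MSG_PEEK recv(2) returning 0
(or a hard error such as ECONNRESET) tells a select(2)-based daemon it can
close the descriptor and drop it from its fd set:

	#include <sys/types.h>
	#include <sys/select.h>
	#include <sys/socket.h>
	#include <errno.h>
	#include <stdbool.h>
	#include <unistd.h>

	static bool
	fd_is_dead(int fd)
	{
		char b;
		ssize_t n;

		/* Peek without consuming data and without blocking. */
		n = recv(fd, &b, sizeof(b), MSG_PEEK | MSG_DONTWAIT);
		if (n == 0)
			return (true);	/* Peer closed the connection (EOF). */
		if (n < 0 && errno != EAGAIN && errno != EWOULDBLOCK)
			return (true);	/* Hard error, e.g. ECONNRESET. */
		return (false);
	}

	/*
	 * Called when select(2) reports the fd readable but no request is
	 * actually pending on it: reclaim the descriptor if it is dead.
	 */
	static void
	maybe_collect(int fd, fd_set *set)
	{
		if (fd_is_dead(fd)) {
			FD_CLR(fd, set);
			close(fd);
		}
	}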