amd + NFS reconnect = ICMP storm + unkillable process.

Rick Macklem rmacklem at uoguelph.ca
Fri Aug 26 19:04:44 UTC 2011


Artem Belevich wrote:
> On Thu, Aug 25, 2011 at 6:24 PM, Rick Macklem <rmacklem at uoguelph.ca>
> wrote:
> > Btw, I fixed exactly the same issue for the TCP code (clnt_vc.c) in
> > r221127, so I wouldn't be surprised if the UDP code suffers the same
> 
> The code in clnt_vc.c was exactly what made me wonder about treatment
> of ERESTART.
> 
> > problem. I'll take a look at your patch tomorrow. You could also try
> > a TCP mount and see if the problem goes away. (For TCP on a
> > pre-r221127 system, the symptom would be a client thread looping in
> > the kernel in "R" state.)
> 
> In my case the process was also stuck in unkillable running state
> because the process never returns from the syscall.
> 
> Unfortunately amd itself seems to handle NFS requests for its own
> top-level mountpoints only via UDP. At least I haven't found a way to
> do so without hacking rather convoluted amd code.
> 
> > I'll look tomorrow, but it sounds like you've figured it out. Looks
> > like a good catch to me at this point, rick
> 
> Let me know if you're OK with the patch and I'll commit to head and
> MFC it to stable/8.
> 
The patch looks good to me. The only thing is that *maybe* it should
also do the same for the other msleep() higher up in clnt_dg_call()?
(It seems to me that if this msleep() were to return ERESTART, the same
 kernel loop would occur.)

Here's this variant of the patch (I'll let you decide which to commit).

Good work tracking this down, rick

--- rpc/clnt_dg.c.sav	2011-08-26 14:44:27.000000000 -0400
+++ rpc/clnt_dg.c	2011-08-26 14:48:07.000000000 -0400
@@ -467,7 +467,10 @@ send_again:
 		    cu->cu_waitflag, "rpccwnd", 0);
 		if (error) {
 			errp->re_errno = error;
-			errp->re_status = stat = RPC_CANTSEND;
+			if (error == EINTR || error == ERESTART)
+				errp->re_status = stat = RPC_INTR;
+			else
+				errp->re_status = stat = RPC_CANTSEND;
 			goto out;
 		}
 	}
@@ -636,7 +639,7 @@ get_reply:
 		 */
 		if (error != EWOULDBLOCK) {
 			errp->re_errno = error;
-			if (error == EINTR)
+			if (error == EINTR || error == ERESTART)
 				errp->re_status = stat = RPC_INTR;
 			else
 				errp->re_status = stat = RPC_CANTRECV;
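
For anyone following along, here is a small userland sketch of why the
ERESTART case matters. It is not kernel code and not part of the patch.
Every sk_* name is a hypothetical stand-in, SK_ERESTART's value is
assumed, and the retry loop only models a caller that keeps retrying
on send/receive failures until it sees an interrupted status (which is
what the looping symptom suggests, but is an assumption here).

```c
#include <assert.h>
#include <errno.h>

/* Stand-in for the kernel-internal ERESTART; the value is assumed. */
#define SK_ERESTART (-1)

/* Stand-ins for the RPC status codes touched by the patch. */
enum sk_stat { SK_RPC_CANTSEND, SK_RPC_CANTRECV, SK_RPC_INTR };

/* Old behaviour of the second hunk: only EINTR counts as interrupted. */
static enum sk_stat
sk_map_old(int error)
{
	if (error == EINTR)
		return (SK_RPC_INTR);
	return (SK_RPC_CANTRECV);
}

/* Patched behaviour: ERESTART is treated the same as EINTR. */
static enum sk_stat
sk_map_new(int error)
{
	if (error == EINTR || error == SK_ERESTART)
		return (SK_RPC_INTR);
	return (SK_RPC_CANTRECV);
}

/*
 * Hypothetical caller: retries until the status is SK_RPC_INTR.
 * sleep_error models a sleep that fails the same way on every
 * attempt because the signal stays pending.  max_tries bounds the
 * loop so the sketch terminates; returns the number of attempts.
 */
static int
sk_retry(enum sk_stat (*map)(int), int sleep_error, int max_tries)
{
	int tries = 0;

	assert(max_tries > 0);
	do {
		tries++;
	} while (map(sleep_error) != SK_RPC_INTR && tries < max_tries);
	return (tries);
}
```

With sleep_error set to SK_ERESTART, sk_retry(sk_map_old, ...) runs
until max_tries is exhausted, while sk_retry(sk_map_new, ...) stops
after the first attempt.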


