kern/130628: [nfs] NFS / rpc.lockd deadlock on 7.1-R
Burt Rosenberg
burt at cs.miami.edu
Wed Oct 14 14:40:05 UTC 2009
The following reply was made to PR kern/130628; it has been noted by GNATS.
From: Burt Rosenberg <burt at cs.miami.edu>
To: bug-followup at freebsd.org, Joe Marcus Clarke <marcus at marcuscom.com>
Cc:
Subject: Re: kern/130628: [nfs] NFS / rpc.lockd deadlock on 7.1-R
Date: Wed, 14 Oct 2009 10:31:45 -0400
The patch, which helped but did not entirely fix the deadlock, is not in
7.2-p4, i386.
Furthermore, we now have a deadlock on an NFS mount between a FreeBSD
7.2-p3 machine and a Linux 2.6.18-164.el5 SMP i686 athlon i386 machine.
In this situation there is a Cisco ASA 5220 between the Linux and FreeBSD
boxes, and we run NFS over TCP.
On Thu, Sep 3, 2009 at 2:40 PM, Burt Rosenberg <burt at cs.miami.edu> wrote:
> It seems that :
>
> http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/130628
>
> appears in 7.2-R-p3; With this kernel, against Fedora 8 distros:
>
> Linux prism09.cs.miami.edu 2.6.26.8-57.fc8 #1 SMP Thu Dec 18 18:59:49 EST
> 2008 x86_64 x86_64 x86_64 GNU/Linux
>
> which are using NFS (tcp) to mount homedirs from the FreeBSD server to the
> Fedora client,
> the server will become unresponsive from the network during graphical login of
> a client.
>
> Applying the patch given in the article
> http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/130628 seems at present to
> fix the problem. Under a 7.2-R-p3, we can manifest the problem in a few
> minutes, and under said kernel with patches as described in the article, and
> as provided by diffs against the current source, we have not yet seen the
> problem.
>
> When the problem appears, the server cannot be pinged, and other network
> connections are halted.
>
> On the server, for instance, top shows:
>
> Proc, state, pri
> --------------------
> pc.lockd *tcpin -68
> nfsd - 4
> rpcbind select 44
> ntpd select 44
> nfsd select 44
> ... etc...
>
>
> Also,
>
> ./lockd restart
> Stopping lockd.
> Waiting for PIDS: 1114, 1114, 1114, 1114,....
>
> kill -9 1114 is also ineffective.
>
> So it seems to be something spinning in lockd.
>
> I think this is a serious issue and would like to see it resolved. Our
> setup is available if you would like to send instrumented code. I attach
> diffs.
>
>
>
>