kern/130628: [nfs] NFS / rpc.lockd deadlock on 7.1-R
Burt Rosenberg
burt at cs.miami.edu
Thu Sep 3 19:10:08 UTC 2009
The following reply was made to PR kern/130628; it has been noted by GNATS.
From: Burt Rosenberg <burt at cs.miami.edu>
To: bug-followup at FreeBSD.org, Joe Marcus Clarke <marcus at marcuscom.com>
Cc: bvowk at math.ualberta.ca
Subject: Re: kern/130628: [nfs] NFS / rpc.lockd deadlock on 7.1-R
Date: Thu, 3 Sep 2009 14:40:24 -0400
--000e0cd518fc7a0bcd0472b0b7ce
Content-Type: multipart/alternative; boundary=000e0cd518fc7a0bbf0472b0b7cc
--000e0cd518fc7a0bbf0472b0b7cc
Content-Type: text/plain; charset=ISO-8859-1
It seems that the problem reported in
http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/130628
also appears in 7.2-R-p3. With this kernel, against Fedora 8 clients:
Linux prism09.cs.miami.edu 2.6.26.8-57.fc8 #1 SMP Thu Dec 18 18:59:49 EST
2008 x86_64 x86_64 x86_64 GNU/Linux
which use NFS over TCP to mount home directories from the FreeBSD server,
the server becomes unresponsive on the network during a graphical login on a
client.
Applying the patch given in the PR
http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/130628 seems at present to
fix the problem. Under an unpatched 7.2-R-p3 kernel we can reproduce the
problem within a few minutes. Under the same kernel with the patches
described in the PR, attached here as diffs against the current source, we
have not yet seen the problem.
When the problem appears, the server cannot be pinged, and other network
connections are halted.
On the server, for instance, top shows:
Proc       state    pri
------------------------
rpc.lockd  *tcpin   -68
nfsd       -          4
rpcbind    select    44
ntpd       select    44
nfsd       select    44
... etc...
Also,
./lockd restart
Stopping lockd.
Waiting for PIDS: 1114, 1114, 1114, 1114,....
kill -9 1114 is also ineffective.
So something seems to be spinning in lockd.
I think this is a serious issue and would like to see it resolved. Our setup
is available if you would like to send instrumented code. I attach diffs.
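
The attached diffs are base64-encoded. Decoded, they appear to do three
things: change xp_lock from a mutex to an sx lock, split xprt_inactive()
into a wrapper plus xprt_inactive_locked(), and re-check soreadable() under
the pool lock before marking a transport inactive. The following is only a
userspace sketch of the last two ideas using pthreads, not the kernel code.
The struct layouts, the "pending" counter (a stand-in for soreadable()), and
the helper xprt_idle_if_empty() are hypothetical names chosen for
illustration.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's SVCPOOL. */
struct pool {
	pthread_mutex_t sp_lock;	/* protects xp_active of each transport */
};

/* Hypothetical stand-in for the kernel's SVCXPRT. */
struct xprt {
	struct pool *xp_pool;
	bool xp_active;		/* true while queued for service */
	int pending;		/* stand-in for "socket is readable" */
};

/* Mark the transport inactive. Caller must hold xp_pool->sp_lock. */
static void
xprt_inactive_locked(struct xprt *x)
{
	x->xp_active = false;
}

/* Wrapper for callers that do not already hold the pool lock. */
static void
xprt_inactive(struct xprt *x)
{
	pthread_mutex_lock(&x->xp_pool->sp_lock);
	xprt_inactive_locked(x);
	pthread_mutex_unlock(&x->xp_pool->sp_lock);
}

/*
 * Deactivate only if nothing is pending, with the check and the state
 * change made under the same lock. Returns true if the transport was
 * marked inactive.
 */
static bool
xprt_idle_if_empty(struct xprt *x)
{
	bool idled = false;

	pthread_mutex_lock(&x->xp_pool->sp_lock);
	if (x->pending == 0) {
		xprt_inactive_locked(x);
		idled = true;
	}
	pthread_mutex_unlock(&x->xp_pool->sp_lock);
	return (idled);
}
```

The point of the locked variant is that the readiness check and the
deactivation happen under one lock, so data arriving between the two cannot
leave a readable transport marked inactive. Whether that window is the
actual cause of the hang reported here is an assumption, not something this
report establishes.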
--000e0cd518fc7a0bbf0472b0b7cc--
--000e0cd518fc7a0bcd0472b0b7ce
Content-Type: application/octet-stream; name="svc.c.diff"
Content-Disposition: attachment; filename="svc.c.diff"
Content-Transfer-Encoding: base64
X-Attachment-Id: f_fz5u5alz0
MTgxYzE4MQo8IHhwcnRfaW5hY3RpdmUoU1ZDWFBSVCAqeHBydCkKLS0tCj4geHBydF9pbmFjdGl2
ZV9sb2NrZWQoU1ZDWFBSVCAqeHBydCkKMTg1LDE4NmQxODQKPCAJbXR4X2xvY2soJnBvb2wtPnNw
X2xvY2spOwo8IAoxOTFjMTg5CjwgCXdha2V1cCgmcG9vbC0+c3BfYWN0aXZlKTsKLS0tCj4gfQox
OTJhMTkxLDE5Nwo+IHZvaWQKPiB4cHJ0X2luYWN0aXZlKFNWQ1hQUlQgKnhwcnQpCj4gewo+IAlT
VkNQT09MICpwb29sID0geHBydC0+eHBfcG9vbDsKPiAKPiAJbXR4X2xvY2soJnBvb2wtPnNwX2xv
Y2spOwo+IAl4cHJ0X2luYWN0aXZlX2xvY2tlZCh4cHJ0KTsK
--000e0cd518fc7a0bcd0472b0b7ce
Content-Type: application/octet-stream; name="svc.h.diff"
Content-Disposition: attachment; filename="svc.h.diff"
Content-Transfer-Encoding: base64
X-Attachment-Id: f_fz5u5am51
NDlhNTAKPiAjaW5jbHVkZSA8c3lzL19zeC5oPgoxMzFjMTMyCjwgCXN0cnVjdCBtdHgJeHBfbG9j
azsKLS0tCj4gCXN0cnVjdCBzeCAgICAgICB4cF9sb2NrOwozMzRhMzM2Cj4gZXh0ZXJuIHZvaWQg
ICAgIHhwcnRfaW5hY3RpdmVfbG9ja2VkKFNWQ1hQUlQgKik7Cg==
--000e0cd518fc7a0bcd0472b0b7ce
Content-Type: application/octet-stream; name="svc_dg.c.diff"
Content-Disposition: attachment; filename="svc_dg.c.diff"
Content-Transfer-Encoding: base64
X-Attachment-Id: f_fz5u5am92
NTVhNTYKPiAjaW5jbHVkZSA8c3lzL3N4Lmg+CjEyMWMxMjIKPCAJbXR4X2luaXQoJnhwcnQtPnhw
X2xvY2ssICJ4cHJ0LT54cF9sb2NrIiwgTlVMTCwgTVRYX0RFRik7Ci0tLQo+IAlzeF9pbml0KCZ4
cHJ0LT54cF9sb2NrLCAieHBydC0+eHBfbG9jayIpOwoxNjNhMTY1LDE2Nwo+IAlpZiAoc29yZWFk
YWJsZSh4cHJ0LT54cF9zb2NrZXQpKQo+IAkJZXR1cm4gKFhQUlRfTU9SRVJFUVMpOwo+IAoxNzRh
MTc5LDE4Mwo+IAkvKiAKPiAJICogU2VyaWFsaXNlIGFjY2VzcyB0byB0aGUgc29ja2V0Lgo+IAkg
Ki8KPiAJc3hfeGxvY2soJnhwcnQtPnhwX2xvY2spOwo+IAoxOTAsMTkxZDE5OAo8IAltdHhfbG9j
aygmeHBydC0+eHBfbG9jayk7CjwgCjE5OSwyMDBjMjA2LDIxMAo8IAkJeHBydF9pbmFjdGl2ZSh4
cHJ0KTsKPCAJCW10eF91bmxvY2soJnhwcnQtPnhwX2xvY2spOwotLS0KPiAJCW10eF9sb2NrKCZ4
cHJ0LT54cF9wb29sLT5zcF9sb2NrKTsKPiAJCWlmICghc29yZWFkYWJsZSh4cHJ0LT54cF9zb2Nr
ZXQpKQo+IAkJCXhwcnRfaW5hY3RpdmVfbG9ja2VkKHhwcnQpOwo+IAkJbXR4X3VubG9jaygmeHBy
dC0+eHBfcG9vbC0+c3BfbG9jayk7Cj4gCQlzeF94dW5sb2NrKCZ4cHJ0LT54cF9sb2NrKTsKMjEx
YzIyMQo8IAkJbXR4X3VubG9jaygmeHBydC0+eHBfbG9jayk7Ci0tLQo+IAkJc3hfeHVubG9jaygm
eHBydC0+eHBfbG9jayk7CjIxNWMyMjUKPCAJbXR4X3VubG9jaygmeHBydC0+eHBfbG9jayk7Ci0t
LQo+IAlzeF94dW5sb2NrKCZ4cHJ0LT54cF9sb2NrKTsKMzA0YzMxNAo8IAltdHhfZGVzdHJveSgm
eHBydC0+eHBfbG9jayk7Ci0tLQo+IAlzeF9kZXN0cm95KCZ4cHJ0LT54cF9sb2NrKTsKMzMxZDM0
MAo8IAltdHhfbG9jaygmeHBydC0+eHBfbG9jayk7CjMzM2QzNDEKPCAJbXR4X3VubG9jaygmeHBy
dC0+eHBfbG9jayk7Cg==
--000e0cd518fc7a0bcd0472b0b7ce
Content-Type: application/octet-stream; name="svc_vc.c.diff"
Content-Disposition: attachment; filename="svc_vc.c.diff"
Content-Transfer-Encoding: base64
X-Attachment-Id: f_fz5u5amc3
NTZhNTcKPiAjaW5jbHVkZSA8c3lzL3N4Lmg+CjE0NWMxNDYKPCAJbXR4X2luaXQoJnhwcnQtPnhw
X2xvY2ssICJ4cHJ0LT54cF9sb2NrIiwgTlVMTCwgTVRYX0RFRik7Ci0tLQo+IAlzeF9pbml0KCZ4
cHJ0LT54cF9sb2NrLCAieHBydC0+eHBfbG9jayIpOwoyMjJjMjIzLDIyNAo8IAltdHhfaW5pdCgm
eHBydC0+eHBfbG9jaywgInhwcnQtPnhwX2xvY2siLCBOVUxMLCBNVFhfREVGKTsKLS0tCj4gCXN4
X2luaXQoJnhwcnQtPnhwX2xvY2ssICJ4cHJ0LT54cF9sb2NrIik7Cj4gCjI1OGMyNjAKPCAJbXR4
X2xvY2soJnhwcnQtPnhwX2xvY2spOwotLS0KPiAJc3hfeGxvY2soJnhwcnQtPnhwX2xvY2spOwoy
NjBjMjYyLDI2Mwo8IAltdHhfdW5sb2NrKCZ4cHJ0LT54cF9sb2NrKTsKLS0tCj4gCXN4X3h1bmxv
Y2soJnhwcnQtPnhwX2xvY2spOwo+IAozNTljMzYyCjwgCW10eF9sb2NrKCZ4cHJ0LT54cF9sb2Nr
KTsKLS0tCj4gCXN4X3hsb2NrKCZ4cHJ0LT54cF9sb2NrKTsKMzY0LDM2NWMzNjcsMzczCjwgCQl4
cHJ0X2luYWN0aXZlKHhwcnQpOwo8IAkJbXR4X3VubG9jaygmeHBydC0+eHBfbG9jayk7Ci0tLQo+
IAkJQUNDRVBUX0xPQ0soKTsKPiAJCW10eF9sb2NrKCZ4cHJ0LT54cF9wb29sLT5zcF9sb2NrKTsK
PiAJCWlmIChUQUlMUV9FTVBUWSgmeHBydC0+eHBfc29ja2V0LT5zb19jb21wKSkKPiAJCQl4cHJ0
X2luYWN0aXZlX2xvY2tlZCh4cHJ0KTsKPiAJCW10eF91bmxvY2soJnhwcnQtPnhwX3Bvb2wtPnNw
X2xvY2spOwo+IAkJQUNDRVBUX1VOTE9DSygpOwo+IAkJc3hfeHVubG9jaygmeHBydC0+eHBfbG9j
ayk7CjM3NmMzODQKPCAJCW10eF91bmxvY2soJnhwcnQtPnhwX2xvY2spOwotLS0KPiAJCXN4X3h1
bmxvY2soJnhwcnQtPnhwX2xvY2spOwozODBjMzg4CjwgCW10eF91bmxvY2soJnhwcnQtPnhwX2xv
Y2spOwotLS0KPiAJc3hfeHVubG9jaygmeHBydC0+eHBfbG9jayk7CjQyNWM0MzMKPCAJbXR4X2Rl
c3Ryb3koJnhwcnQtPnhwX2xvY2spOwotLS0KPiAJc3hfZGVzdHJveSgmeHBydC0+eHBfbG9jayk7
CjQ5MWE1MDAKPiAJCXN4X3hsb2NrKCZ4cHJ0LT54cF9sb2NrKTsKNDk2YTUwNgo+IAkJc3hfeHVu
bG9jaygmeHBydC0+eHBfbG9jayk7CjUwMGE1MTEsNTEzCj4gCWlmIChzb3JlYWRhYmxlKHhwcnQt
PnhwX3NvY2tldCkpCj4gCQlyZXR1cm4gKFhQUlRfTU9SRVJFUVMpOwo+IAo1MTFhNTI1LDUyNgo+
IAlzeF94bG9jaygmeHBydC0+eHBfbG9jayk7Cj4gCjU4NmE2MDIKPiAJCQkJc3hfeHVubG9jaygm
eHBydC0+eHBfbG9jayk7CjYxNGQ2MjkKPCAJCW10eF9sb2NrKCZ4cHJ0LT54cF9sb2NrKTsKNjI0
LDYyNWM2MzksNjQzCjwgCQkJeHBydF9pbmFjdGl2ZSh4cHJ0KTsKPCAJCQltdHhfdW5sb2NrKCZ4
cHJ0LT54cF9sb2NrKTsKLS0tCj4gCQkJbXR4X2xvY2soJnhwcnQtPnhwX3Bvb2wtPnNwX2xvY2sp
Owo+IAkJCWlmICghc29yZWFkYWJsZSh4cHJ0LT54cF9zb2NrZXQpKQo+IAkJCQl4cHJ0X2luYWN0
aXZlX2xvY2tlZCh4cHJ0KTsKPiAJCQltdHhfdW5sb2NrKCZ4cHJ0LT54cF9wb29sLT5zcF9sb2Nr
KTsKPiAJCQlzeF94dW5sb2NrKCZ4cHJ0LT54cF9sb2NrKTsKNjM3YzY1NQo8IAkJCW10eF91bmxv
Y2soJnhwcnQtPnhwX2xvY2spOwotLS0KPiAJCQlzeF94dW5sb2NrKCZ4cHJ0LT54cF9sb2NrKTsK
NjQ0YTY2Mwo+IAkJCXhwcnRfaW5hY3RpdmUoeHBydCk7CjY0NmM2NjUKPCAJCQltdHhfdW5sb2Nr
KCZ4cHJ0LT54cF9sb2NrKTsKLS0tCj4gCQkJc3hfeHVubG9jaygmeHBydC0+eHBfbG9jayk7CjY1
NCw2NTVkNjcyCjwgCjwgCQltdHhfdW5sb2NrKCZ4cHJ0LT54cF9sb2NrKTsKNzQyZDc1OAo8IAlt
dHhfbG9jaygmeHBydC0+eHBfbG9jayk7Cjc0NGQ3NTkKPCAJbXR4X3VubG9jaygmeHBydC0+eHBf
bG9jayk7Cg==
--000e0cd518fc7a0bcd0472b0b7ce--