[PATCH] L2ARC deadlock
Xin LI
delphij at delphij.net
Sun Aug 7 08:52:55 UTC 2011
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256
Hi,
I have observed this on a test system which has both L2ARC and ZIL
devices installed, and several vdev groups. It can be triggered by
removing the L2ARC and ZIL devices (detaching the L2ARC device would
fail), with the command blocking on 'spa_namespace_lock'.
After a reboot, the pool cannot be imported: 'zpool import' hangs, with
the zpool command stuck on 'spa_namespace_lock'.
I have a theory about the issue, but the patch has not been tested.
Maybe someone who is more familiar with the code can shed some light?
Thread 1 (zpool)                        Thread 2 (l2arc_feed_thread)

Calls vdev_open()
vdev_open() can now assert RW_WRITER
  on spa_config; [*1]
vdev_open() calls vdev_geom_open()
vdev_geom_open() called with
  spa_namespace_lock held, sets
  lock=1 and drops
  spa_namespace_lock,
  DROP_GIANT(),
  g_topology_lock()
                                        Races in, proceeds to
                                          l2arc_dev_get_next()
                                        Acquires &spa_namespace_lock;
                                          [*2]
                                        Finds a device and then calls
                                          spa_config_enter(RW_READER)
                                        spa_config_enter() blocks
                                          waiting for the writer to finish
Tries to obtain spa_namespace_lock [*3]
This creates a deadlock situation.
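
To make the cycle at [*3] explicit, here is a small user-space sketch
that models the two threads as a wait-for graph. It is only an
illustration under simplifying assumptions: each thread is modelled as
holding at most one lock, reader/writer modes are ignored, and every
name in it (thread_state, lock_owner, in_deadlock, the enum values) is
made up for the example and is not ZFS code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model, not ZFS code: two threads, two locks. */
enum { T_ZPOOL, T_L2ARC_FEED, NTHREADS };
enum { L_NONE = -1, L_SPA_NAMESPACE, L_SPA_CONFIG };

struct thread_state {
	int holds;	/* lock currently held, or L_NONE */
	int waits_for;	/* lock the thread is blocked on, or L_NONE */
};

/* Return the index of the thread holding 'lock', or -1 if nobody does. */
static int
lock_owner(const struct thread_state *ts, int lock)
{
	for (int i = 0; i < NTHREADS; i++)
		if (ts[i].holds == lock)
			return (i);
	return (-1);
}

/*
 * Follow "waits for lock -> owner of that lock" edges from 'start'.
 * Returns 1 if the chain leads back to 'start' (a cycle), else 0.
 */
static int
in_deadlock(const struct thread_state *ts, int start)
{
	int cur = start;

	for (int steps = 0; steps < NTHREADS; steps++) {
		if (ts[cur].waits_for == L_NONE)
			return (0);
		cur = lock_owner(ts, ts[cur].waits_for);
		if (cur < 0)
			return (0);
		if (cur == start)
			return (1);
	}
	return (0);
}

/* State at [*3] when the feed thread uses a blocking enter. */
static int
deadlock_with_blocking_enter(void)
{
	struct thread_state ts[NTHREADS] = {
		[T_ZPOOL]      = { L_SPA_CONFIG,    L_SPA_NAMESPACE },
		[T_L2ARC_FEED] = { L_SPA_NAMESPACE, L_SPA_CONFIG },
	};

	return (in_deadlock(ts, T_ZPOOL));
}

/* Same moment if the feed thread only tries the lock and never waits. */
static int
deadlock_with_tryenter(void)
{
	struct thread_state ts[NTHREADS] = {
		[T_ZPOOL]      = { L_SPA_CONFIG,    L_SPA_NAMESPACE },
		[T_L2ARC_FEED] = { L_SPA_NAMESPACE, L_NONE },
	};

	return (in_deadlock(ts, T_ZPOOL));
}
```

In the first state each thread waits on a lock the other one holds, so
the chain returns to its starting point. In the second state the feed
thread has no outgoing edge, so it will go on to release
spa_namespace_lock and the zpool thread can make progress.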
My proposed solution is to use spa_config_tryenter() instead of
spa_config_enter() in arc.c (see attachment).
Comments?
Cheers,
- --
Xin LI <delphij at delphij.net> https://www.delphij.net/
FreeBSD - The Power to Serve! Live free or die
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.18 (FreeBSD)
iQEcBAEBCAAGBQJOPlJcAAoJEATO+BI/yjfB4UoH/j3pK5rQ4iF52HmUwzMzKKol
UVXHmoUPbV1Ibhpajcz6LprSL7b/SI50a2yelIr+1ZjgC22Lrw6fFird90JfeXF7
YtQ1LyVVuDbMA5KfM6wD8linm3HYri88DQ2CDtUqZesZ1w7PH0XNYnRKRYcEGRem
1JG09BMl0EqGuvKSy+69UnE5dPTRnzrAQOe3xBB4LntZBeapwMy5F8gcBY5XP5Lh
W2lBJFcdFX3Zh390fUi1dxY5uoXQLfO5gu+YCXF7Zie2PLl596ZUjAEA3CsE/9yw
qWBSBGufMP4gK24j1EulxhmoIU993vGtqc3iZhs1TugfjGbHAlXiHQEcmJnaEk8=
=zkTh
-----END PGP SIGNATURE-----
-------------- next part --------------
Index: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c
===================================================================
--- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c (revision 224652)
+++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c (working copy)
@@ -4224,8 +4224,10 @@ out:
 	 * Grab the config lock to prevent the 'next' device from being
 	 * removed while we are writing to it.
 	 */
-	if (next != NULL)
-		spa_config_enter(next->l2ad_spa, SCL_L2ARC, next, RW_READER);
+	if (next != NULL) {
+		if (!spa_config_tryenter(next->l2ad_spa, SCL_L2ARC, next, RW_READER))
+			next = NULL;
+	}
 	mutex_exit(&spa_namespace_lock);
 
 	return (next);