amd64/161968: renaming snapshot with -r including a zvol snapshot
causes total ZFS freeze/lockup
Peter Maloney
peter.maloney at brockmann-consult.de
Mon Oct 24 15:20:01 UTC 2011
>Number: 161968
>Category: amd64
>Synopsis: renaming snapshot with -r including a zvol snapshot causes total ZFS freeze/lockup
>Confidential: no
>Severity: critical
>Priority: high
>Responsible: freebsd-amd64
>State: open
>Quarter:
>Keywords:
>Date-Required:
>Class: sw-bug
>Submitter-Id: current-users
>Arrival-Date: Mon Oct 24 15:20:00 UTC 2011
>Closed-Date:
>Last-Modified:
>Originator: Peter Maloney
>Release: 8.2-STABLE FreeBSD 8.2-STABLE #0: Tue Sep 27 16:27:57 CEST 2011 root@bcnastest2.bc.local:/usr/obj/usr/src/sys/GENERIC amd64
>Organization:
Brockmann Consult
>Environment:
FreeBSD bcnas1.bc.local 8.2-STABLE FreeBSD 8.2-STABLE #0: Thu Sep 29 15:06:03 CEST 2011 root@bcnas1.bc.local:/usr/obj/usr/src/sys/GENERIC amd64
>Description:
Renaming a snapshot with "zfs rename -r" when the recursion includes a zvol snapshot causes a total ZFS freeze/lockup/deadlock.
Once it is locked up, any "zfs" or "zpool" command, "sysctl -a", and NFS exports all freeze. "shutdown -r" does not restart the system either: it shuts down only as far as the message that the disks are all synced, and then stops.
Pressing CTRL+T on a hung zfs or zpool command shows state "spa_namespace_lock"; on a hung "sysctl -a" it shows state "g_waitfor_event".
Most of the time, a single "zfs rename" does not cause a lockup. However, on one system, renaming one specific snapshot always causes a lockup, and on every other 8-STABLE system I have, my script (see How-To-Repeat) always causes a lockup after a few loops.
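The wait states reported by CTRL+T can also be collected for all stuck processes at once. The following is a minimal sketch, not something that was run for this report: blocked_procs is a hypothetical helper name, and it assumes that ps itself still responds on the locked-up system. It reads ps output on stdin, so it can be applied to saved output as well. The wchan column may be truncated by ps.

```shell
#!/bin/sh
# blocked_procs (hypothetical helper): read `ps` output on stdin and print
# PID, wait channel and command name for every process whose state starts
# with "D" (uninterruptible wait).
# Expected columns: PID STATE WCHAN COMMAND...
blocked_procs() {
    # NR > 1 skips the header line printed by ps.
    awk 'NR > 1 && $2 ~ /^D/ { print $1, $3, $4 }'
}

# Assumed invocation on the affected machine:
#   ps -ax -o pid,state,wchan,command | blocked_procs
```

Processes listed with a wait channel such as the "spa_namespace_lock" state mentioned above would be the ones caught in the lockup.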
My FreeBSD 8-STABLE installation started as 8.2-RELEASE plus the mps driver and was then updated with cvsup using this supfile (comments removed):
*default host=cvsup.de.FreeBSD.org
*default base=/var/db
*default prefix=/usr
*default release=cvs tag=RELENG_8
*default delete use-rel-suffix
*default date=2011.09.27.00.00.00
*default compress
src-all
(The same freeze occurs with the date changed to today, Oct. 24th.)
# zpool get all big
NAME PROPERTY VALUE SOURCE
big size 39.8G -
big capacity 24% -
big altroot - default
big health ONLINE -
big guid 14576708073682355899 default
big version 28 default
big bootfs - default
big delegation on default
big autoreplace on local
big cachefile - default
big failmode continue local
big listsnapshots on local
big autoexpand off default
big dedupditto 0 default
big dedupratio 1.00x -
big free 30.1G -
big allocated 9.64G -
big readonly off -
# zfs get all big
NAME PROPERTY VALUE SOURCE
big type filesystem -
big creation Thu Jul 21 11:48 2011 -
big used 4.80G -
big available 14.7G -
big referenced 4.80G -
big compressratio 1.00x -
big mounted yes -
big quota none default
big reservation none default
big recordsize 128K default
big mountpoint /big default
big sharenfs off default
big checksum on default
big compression off default
big atime on default
big devices on default
big exec on default
big setuid on default
big readonly off default
big jailed off default
big snapdir visible local
big aclmode discard default
big aclinherit restricted default
big canmount on default
big xattr off temporary
big copies 1 default
big version 4 -
big utf8only off -
big normalization none -
big casesensitivity sensitive -
big vscan off default
big nbmand off default
big sharesmb off default
big refquota none default
big refreservation none default
big primarycache all default
big secondarycache all default
big usedbysnapshots 0 -
big usedbydataset 4.80G -
big usedbychildren 6.70M -
big usedbyrefreservation 0 -
big logbias latency default
big dedup off default
big mlslabel -
big sync standard default
big refcompressratio 1.00x -
# zfs list
NAME USED AVAIL REFER MOUNTPOINT
big 4.80G 14.7G 4.80G /big
big@testcrashsnap4 0 - 4.80G -
zroot 5.64G 109G 894M legacy
zroot/tmp 2.14M 109G 2.14M /tmp
zroot/usr 4.72G 109G 2.45G /usr
zroot/usr/home 53.5K 109G 53.5K /usr/home
zroot/usr/obj 922M 109G 922M /usr/objtmp
zroot/usr/ports 1.07G 109G 941M /usr/ports
zroot/usr/ports/distfiles 150M 109G 150M /usr/ports/distfiles
zroot/usr/ports/packages 21K 109G 21K /usr/ports/packages
zroot/usr/src 314M 109G 314M /usr/src
zroot/var 17.6M 109G 904K /var
zroot/var/crash 22.5K 109G 22.5K /var/crash
zroot/var/db 16.2M 109G 15.1M /var/db
zroot/var/db/pkg 1.10M 109G 1.10M /var/db/pkg
zroot/var/empty 21K 109G 21K /var/empty
zroot/var/log 272K 109G 272K /var/log
zroot/var/mail 48K 109G 48K /var/mail
zroot/var/run 50K 109G 50K /var/run
zroot/var/tmp 23K 109G 23K /var/tmp
# cat /boot/loader.conf
zfs_load="YES"
vfs.root.mountfrom="zfs:zroot"
/etc/sysctl.conf contains nothing but comments.
On a virtual machine running 8.2-RELEASE (not -STABLE), I have not found a way to reproduce the problem.
I also tested the latest 8-STABLE source fetched with cvsup today, which freezes the same way.
All my ZFS systems are amd64.
I was hoping to use a zvol for iSCSI together with snapshots, so simply avoiding snapshots on zvols is not an acceptable workaround.
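Since the lockup is tied to recursive operations that include a zvol snapshot, it can help to check which datasets under a pool are volumes before running "zfs snapshot -r" or "zfs rename -r". This is a small sketch under assumptions: volumes_only is a hypothetical helper name, and it expects the tab-separated, header-less output of "zfs list -H -o name,type" on stdin.

```shell
#!/bin/sh
# volumes_only (hypothetical helper): read `zfs list -H -o name,type`
# output (tab-separated, no header) on stdin and print only the names
# of datasets whose type is "volume".
volumes_only() {
    awk -F '\t' '$2 == "volume" { print $1 }'
}

# Assumed invocation for the pool used in this report:
#   zfs list -H -r -o name,type big | volumes_only
```

An empty result means a recursive snapshot of that dataset would not create any zvol snapshots.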
>How-To-Repeat:
Prerequisite:
A system running 8.2-STABLE (more specifically using *default date=2011.09.27.00.00.00 in cvsup).
(1) Create a zpool.
[root@bcnastest2 ~]# zpool status big
pool: big
state: ONLINE
scan: none requested
config:
NAME STATE READ WRITE CKSUM
big ONLINE 0 0 0
raidz2-0 ONLINE 0 0 0
ad8 ONLINE 0 0 0
ad10 ONLINE 0 0 0
ad12 ONLINE 0 0 0
ad16 ONLINE 0 0 0
cache
gpt/cache0 ONLINE 0 0 0
errors: No known data errors
(2) Create a zvol in the above zpool.
[root@bcnastest2 ~]# zfs create -V 100m big/testzvol
(3) Run this script as root (written for bash; it also works in sh except for the count printout; make sure to set the dataset variable).
#-------begin script-------
dataset=big
count=0
while true; do
echo Snapshot
zfs destroy -r ${dataset}@testcrashsnap >/dev/null 2>&1
zfs snapshot -r ${dataset}@testcrashsnap || break
current=""
for next in 1 2 3 4 5; do
echo Renaming from ${current} to ${next}
zfs destroy -r ${dataset}@testcrashsnap${next} >/dev/null 2>&1
zfs rename -r ${dataset}@testcrashsnap${current} ${dataset}@testcrashsnap${next} || break
current=${next}
done
echo Destroy
zfs destroy -r ${dataset}@testcrashsnap${current} || break
let count++
echo $count
done
#-------end script-------
Result: After an arbitrary number of loops, the output stops. Below is the output, including the result of hitting CTRL+C, CTRL+Z and CTRL+T. The script was run on a Friday; the last CTRL+T status line was taken on the following Monday (230096.99 seconds of real time, roughly 2.7 days).
============================================
Snapshot
Renaming from to 1
Renaming from 1 to 2
Renaming from 2 to 3
Renaming from 3 to 4
Renaming from 4 to 5
Destroy
1
Snapshot
Renaming from to 1
Renaming from 1 to 2
Renaming from 2 to 3
Renaming from 3 to 4
Renaming from 4 to 5
Destroy
2
Snapshot
Renaming from to 1
Renaming from 1 to 2
Renaming from 2 to 3
Renaming from 3 to 4
Renaming from 4 to 5
Destroy
3
Snapshot
Renaming from to 1
Renaming from 1 to 2
Renaming from 2 to 3
Renaming from 3 to 4
^C
load: 1.32 cmd: zfs 2363 [tx->tx_sync_done_cv)] 5.56r 0.00u 0.00s 0% 1696k
load: 1.32 cmd: zfs 2363 [tx->tx_sync_done_cv)] 6.07r 0.00u 0.00s 0% 1696k
load: 1.32 cmd: zfs 2363 [tx->tx_sync_done_cv)] 6.26r 0.00u 0.00s 0% 1696k
load: 1.46 cmd: zfs 2363 [tx->tx_sync_done_cv)] 13.42r 0.00u 0.00s 0% 1696k
^C^C^C
load: 1.89 cmd: zfs 2363 [tx->tx_sync_done_cv)] 36.59r 0.00u 0.00s 0% 1696k
^C^D
load: 0.01 cmd: zfs 2363 [tx->tx_sync_done_cv)] 230096.99r 0.00u 0.00s 0% 1696k
============================================
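When repeating the test unattended, the hang can be detected automatically instead of by watching the output stop. The following is a minimal sketch and was not part of the original test run: run_with_deadline is a hypothetical helper name, and the exit code 124 for "still running" is an arbitrary choice. It uses only POSIX sh features. The stuck process is deliberately left alone, since in the output above it did not react to CTRL+C anyway.

```shell
#!/bin/sh
# run_with_deadline SECONDS COMMAND [ARGS...]   (hypothetical helper)
# Runs COMMAND in the background and polls it once per second.
# Returns COMMAND's exit status if it finishes in time, or 124 if it is
# still running after SECONDS. The PID is left in $pid for inspection.
run_with_deadline() {
    limit=$1
    shift
    "$@" &
    pid=$!
    waited=0
    # kill -0 sends no signal; it only tests whether the process exists.
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$limit" ]; then
            echo "pid $pid still running after ${limit}s: $*" >&2
            return 124
        fi
        sleep 1
        waited=$((waited + 1))
    done
    # The command has exited; collect and return its exit status.
    wait "$pid"
}

# Assumed use inside the loop of the script above:
#   run_with_deadline 60 zfs rename -r "${dataset}@testcrashsnap${current}" \
#       "${dataset}@testcrashsnap${next}" || break
```

A return value of 124 then marks the loop iteration in which the rename stopped making progress, and $pid identifies the process to examine with CTRL+T or ps.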
>Fix:
>Release-Note:
>Audit-Trail:
>Unformatted: