ZFS performance
Nick Gustas
freebsd-fs at tychl.net
Mon May 14 14:54:58 UTC 2007
Sorry for the formatting.
> Ståle Kristoffersen <staalebk at ifi.uio.no>
> Wed May 2 01:58:27 UTC 2007
>
> On 2007-04-28 at 02:31, Ståle Kristoffersen wrote:
> > cvsup'ed and built a new kernel, and now the problem is gone; sorry about
> > the noise.
>
> Not entirely true. I still experience very strange (to me) issues.
>
> I'm trying to stream something off the server at about 100 KB/s. It's a
> video file located on ZFS. I can see that the data pushed across the
> network is about the correct speed (fluctuating around 100 KB/s). Now, if I
> start up 'zpool iostat -v 1', things start to look strange: it first reads
> a couple of hundred KB, then about 1 MB the next second, then 3 MB, then
> 10 MB, then 20 MB, before the cycle restarts at around 300-400 KB again. The
> numbers differ from run to run, but the overall pattern is the same.
>
> This makes serving several streams impossible, as it is not able to
> deliver enough data. (It should easily push 5 streams of 100 KB/s, right?)
>
> I'm using Samba, but the problem is also there when transferring using
> proftpd (limiting the transfer on the receiver side to 100 KB/s); although
> the "pattern" is not as clear, it still reads _way_ more data than is
> transferred.
>
> I'm not sure how to debug this; I'm willing to try anything!
> Below is a snippet from 'zpool iostat -v 1'; notice that the file being
> streamed is located on the last disk of the pool (the pool was full, so I
> added one disk and therefore all new data ends up on that disk). It
> completes two "cycles":
>
>               capacity     operations    bandwidth
> pool         used  avail   read  write   read  write
> ----------  -----  -----  -----  -----  -----  -----
> stash       1.42T  76.3G      0      0      0      0
> ad14         298G  87.4M      0      0      0      0
> ad15         298G   174M      0      0      0      0
> ad8          298G   497M      0      0      0      0
> ad10s1d      340G   441M      0      0      0      0
> ad16         223G  75.1G      0      0      0      0
> ----------  -----  -----  -----  -----  -----  -----
>
>               capacity     operations    bandwidth
> pool         used  avail   read  write   read  write
> ----------  -----  -----  -----  -----  -----  -----
> stash       1.42T  76.3G      4      0   639K      0
> ad14         298G  87.4M      0      0      0      0
> ad15         298G   174M      0      0      0      0
> ad8          298G   497M      0      0      0      0
> ad10s1d      340G   441M      0      0      0      0
> ad16         223G  75.1G      4      0   639K      0
> ----------  -----  -----  -----  -----  -----  -----
>
>               capacity     operations    bandwidth
> pool         used  avail   read  write   read  write
> ----------  -----  -----  -----  -----  -----  -----
> stash       1.42T  76.3G      2      0   384K      0
> ad14         298G  87.4M      0      0      0      0
> ad15         298G   174M      0      0      0      0
> ad8          298G   497M      0      0      0      0
> ad10s1d      340G   441M      0      0      0      0
> ad16         223G  75.1G      2      0   384K      0
> ----------  -----  -----  -----  -----  -----  -----
>
>               capacity     operations    bandwidth
> pool         used  avail   read  write   read  write
> ----------  -----  -----  -----  -----  -----  -----
> stash       1.42T  76.3G      7      0  1023K      0
> ad14         298G  87.4M      0      0      0      0
> ad15         298G   174M      0      0      0      0
> ad8          298G   497M      0      0      0      0
> ad10s1d      340G   441M      0      0      0      0
> ad16         223G  75.1G      7      0  1023K      0
> ----------  -----  -----  -----  -----  -----  -----
>
>               capacity     operations    bandwidth
> pool         used  avail   read  write   read  write
> ----------  -----  -----  -----  -----  -----  -----
> stash       1.42T  76.3G     27      0  3.50M      0
> ad14         298G  87.4M      0      0      0      0
> ad15         298G   174M      0      0      0      0
> ad8          298G   497M      0      0      0      0
> ad10s1d      340G   441M      0      0      0      0
> ad16         223G  75.1G     27      0  3.50M      0
> ----------  -----  -----  -----  -----  -----  -----
>
>               capacity     operations    bandwidth
> pool         used  avail   read  write   read  write
> ----------  -----  -----  -----  -----  -----  -----
> stash       1.42T  76.3G    101      0  12.6M      0
> ad14         298G  87.4M      0      0  5.99K      0
> ad15         298G   174M      0      0      0      0
> ad8          298G   497M      0      0      0      0
> ad10s1d      340G   441M      0      0      0      0
> ad16         223G  75.1G    100      0  12.6M      0
> ----------  -----  -----  -----  -----  -----  -----
>
>               capacity     operations    bandwidth
> pool         used  avail   read  write   read  write
> ----------  -----  -----  -----  -----  -----  -----
> stash       1.42T  76.3G    127      0  16.0M      0
> ad14         298G  87.4M      0      0      0      0
> ad15         298G   174M      0      0      0      0
> ad8          298G   497M      0      0      0      0
> ad10s1d      340G   441M      0      0      0      0
> ad16         223G  75.1G    127      0  16.0M      0
> ----------  -----  -----  -----  -----  -----  -----
>
>               capacity     operations    bandwidth
> pool         used  avail   read  write   read  write
> ----------  -----  -----  -----  -----  -----  -----
> stash       1.42T  76.3G      2      0   384K      0
> ad14         298G  87.4M      0      0      0      0
> ad15         298G   174M      0      0      0      0
> ad8          298G   497M      0      0      0      0
> ad10s1d      340G   441M      0      0      0      0
> ad16         223G  75.1G      2      0   384K      0
> ----------  -----  -----  -----  -----  -----  -----
>
>
snip!
I see the same behavior that Ståle is seeing. I can "fix" it by setting
vfs.zfs.prefetch_disable="1" in loader.conf, so I'm assuming something in
the prefetch code isn't quite right. I believe I saw similar behavior on
Solaris 10 when playing with ZFS a few weeks ago, but I need to revisit
that machine or install OpenSolaris on this one before I can be sure.
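For anyone who wants to try the same thing, the workaround is a single loader
tunable; as far as I can tell it is only picked up at boot, so it goes in
loader.conf (set it back to "0", or remove the line, to get prefetch back):

  # /boot/loader.conf
  vfs.zfs.prefetch_disable="1"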
# uname -a
FreeBSD 7.0-CURRENT-200705 FreeBSD 7.0-CURRENT-200705 #0: Fri May 11
14:41:37 UTC 2007 root@:/usr/src/sys/amd64/compile/ZFS amd64
zfs_load="YES" in loader.conf
CPU: AMD Athlon(tm) 64 X2 Dual Core Processor 3800+ (2008.94-MHz
K8-class CPU)
usable memory = 1026355200 (978 MB)
I'm doing basic speed testing, so I have WITNESS and friends turned off,
as well as malloc.conf symlinked to 'aj'.
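To be explicit about what that means (the exact kernel option list is from
memory, so treat this as a sketch): the debugging options are left out of the
kernel config and the malloc flags are relaxed, roughly like this:

  # kernel config: debugging options commented out or removed, e.g.
  #   options WITNESS
  #   options INVARIANTS
  #   options INVARIANT_SUPPORT
  # userland malloc: turn off junk filling and abort-on-warning
  ln -s aj /etc/malloc.conf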
The zpool is a GPT partition residing on a single Seagate 400 SATA drive.
# zpool status
  pool: big
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        big         ONLINE       0     0     0
          ad6p1     ONLINE       0     0     0
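(For reference, recreating a single-vdev pool like this is just one command on
the existing GPT partition; the device name below is the one from the status
output, everything else is left at the defaults:)

  zpool create big ad6p1
  zpool status big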
Results with FTP to a 6.2 client over a 100 Mbit LAN:
vfs.zfs.prefetch_disable="0" (default)
netstat -w1
input (Total) output
packets errs bytes packets errs bytes colls
17 0 1856 3 0 382 0
8 0 689 3 0 382 0
1232 0 81719 2422 0 3657607 0
3994 0 263727 7924 0 11995892 0
4108 0 271701 8130 0 12304874 0
2966 0 196035 5856 0 8861816 0
4101 0 271084 8129 0 12304594 0
3988 0 263471 7894 0 11943838 0
2702 0 179088 5340 0 8083084 0
4105 0 271326 8129 0 12304594 0
4101 0 270825 8129 0 12304594 0
2752 0 182146 5428 0 8213760 0
4102 0 271243 8130 0 12305076 0
4099 0 270849 8129 0 12307622 0
2729 0 180453 5406 0 8181484 0
4096 0 270583 8129 0 12304594 0
4109 0 271980 8130 0 12304594 0
2742 0 181391 5417 0 8192630 0
4106 0 271583 8129 0 12304594 0
4098 0 270617 8129 0 12309136 0
3128 0 207274 6189 0 9354050 0
3718 0 245621 7368 0 11159762 0
4102 0 271383 8129 0 12304594 0
3879 0 256447 7683 0 11620508 0
2977 0 197005 5892 0 8929640 0
4103 0 271242 8129 0 12304808 0
4109 0 271337 8130 0 12303080 0
1596 0 105573 3133 0 4721738 0
15 0 1668 4 0 448 0
15 0 1253 3 0 382 0
zpool iostat 1
              capacity     operations    bandwidth
pool         used  avail   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
big         10.5G   276G      0      0      0      0
big         10.5G   276G      0      0      0      0
big         10.5G   276G      0      0      0      0
big         10.5G   276G    285      0  35.5M      0
big         10.5G   276G    106      0  13.3M      0
big         10.5G   276G     89      0  11.2M      0
big         10.5G   276G    225      0  28.1M      0
big         10.5G   276G      0      0      0      0
big         10.5G   276G      0      0      0      0
big         10.5G   276G    257      0  32.1M      0
big         10.5G   276G      0      0      0      0
big         10.5G   276G      0      0      0      0
big         10.5G   276G    257      0  32.1M      0
big         10.5G   276G      0      0      0      0
big         10.5G   276G      0      0      0      0
big         10.5G   276G    257      0  32.1M      0
big         10.5G   276G      0      0      0      0
big         10.5G   276G      0      0      0      0
big         10.5G   276G    235      0  29.3M      0
big         10.5G   276G     21      0  2.75M      0
big         10.5G   276G      0      0      0      0
big         10.5G   276G    133      0  16.7M      0
big         10.5G   276G    123      0  15.4M      0
big         10.5G   276G      0      0      0      0
big         10.5G   276G     35      0  4.50M      0
big         10.5G   276G    221      0  27.6M      0
big         10.5G   276G      0      0      0      0
big         10.5G   276G      0      0      0      0
big         10.5G   276G    257      0  32.1M      0
You can see how the network speeds fluctuate.
Now I reboot with vfs.zfs.prefetch_disable="1" set in loader.conf:
# netstat -w1
input (Total) output
packets errs bytes packets errs bytes colls
10 0 969 2 0 316 0
13 0 983 2 0 316 0
3890 0 257204 7672 0 11589854 0
4058 0 268449 8023 0 12125162 0
4063 0 268600 8029 0 12122918 0
4048 0 267395 8013 0 12095290 0
4062 0 268385 8020 0 12117482 0
4070 0 269165 8030 0 12121856 0
4062 0 268440 8021 0 12133816 0
4062 0 268387 8027 0 12125850 0
4059 0 268379 8028 0 12128878 0
4074 0 269529 8038 0 12151286 0
4064 0 268883 8032 0 12131174 0
4069 0 268763 8042 0 12155236 0
4061 0 268229 8033 0 12137728 0
4057 0 268353 8028 0 12133948 0
3953 0 261159 7814 0 11804028 0
4063 0 268331 8026 0 12122004 0
4072 0 269673 8032 0 12138958 0
4059 0 268214 8036 0 12137860 0
4065 0 268491 8043 0 12156648 0
4069 0 269000 8042 0 12146426 0
4056 0 267963 8033 0 12138546 0
4092 0 270247 8115 0 12282992 0
4101 0 271279 8129 0 12304594 0
758 0 50289 1465 0 2198524 0
8 0 657 3 0 382 0
14 0 1225 3 0 382 0
# zpool iostat 1
              capacity     operations    bandwidth
pool         used  avail   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
big         10.7G   275G      0      0      0      0
big         10.7G   275G      0      0      0      0
big         10.7G   275G     88      0  11.1M      0
big         10.7G   275G     87      0  11.0M      0
big         10.7G   275G     88      0  11.1M      0
big         10.7G   275G     87      0  11.0M      0
big         10.7G   275G     88      0  11.1M      0
big         10.7G   275G     87      0  11.0M      0
big         10.7G   275G     88      0  11.1M      0
big         10.7G   275G     88      0  11.1M      0
big         10.7G   275G     87      0  11.0M      0
big         10.7G   275G     88      0  11.1M      0
big         10.7G   275G     88      0  11.1M      0
big         10.7G   275G     87      0  11.0M      0
big         10.7G   275G     88      0  11.1M      0
big         10.7G   275G     88      0  11.1M      0
big         10.7G   275G     85      0  10.7M      0
big         10.7G   275G     87      0  11.0M      0
big         10.7G   275G     88      0  11.1M      0
big         10.7G   275G     88      0  11.1M      0
big         10.7G   275G     87      0  11.0M      0
big         10.7G   275G     88      0  11.1M      0
big         10.7G   275G     88      0  11.1M      0
big         10.7G   275G      7      0  1023K      0
big         10.7G   275G      0      0      0      0
The total amount of data may not be the same in each test; I was using an 8GB
test file and stopping the transfer after a bit, but the behavior is consistent.
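If anyone wants to repeat the test, something along these lines should do; the
file name and the dd parameters are just an illustration, only the 8GB figure
is real:

  # create an 8GB test file on the pool
  dd if=/dev/urandom of=/big/testfile bs=1m count=8192
  # then from the 6.2 client, fetch it over ftp and watch the server:
  #   client% fetch ftp://server/testfile
  #   server# zpool iostat 1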
PS: Pawel, thanks for all your work on ZFS for FreeBSD; I really
appreciate it!