ZFS performance

Nick Gustas freebsd-fs at tychl.net
Mon May 14 14:54:58 UTC 2007


Sorry for the formatting,

> Ståle Kristoffersen staalebk at ifi.uio.no
> Wed May 2 01:58:27 UTC 2007
>
> On 2007-04-28 at 02:31, Ståle Kristoffersen wrote:
> > cvsup'ed, and built a new kernel and now the problem is gone, sorry about
> > the noise.
>
> Not entirely true. I still experience very strange (to me) issues.
>
> I'm trying to stream something off the server at about 100 KB/s. It's a
> video-file located on zfs. I can see that the data pushed across the
> network is about the correct speed (fluctuating around 100 KB/s). Now if I
> start up 'zpool iostat -v 1' things start to look strange. It first reads
> a couple of hundred KB, then about 1 MB the next second, then 3 MB, then
> 10 MB, then 20 MB, before the cycle restarts at around 300-400 KB again. The
> numbers differ from run to run but the overall pattern is the same.
>
> This makes streaming several streams impossible as it is not able to
> deliver enough data. (It should easily push 5 streams of 100 KB/s, right?)
>
> I'm using samba, but the problem is also there when transferring using proftpd
> (limiting the transfer on the receiver side to 100 KB/s).
> Although the "pattern" is not as clear, it still reads _way_ more data than
> is transferred.
>
> I'm not sure how to debug this, I'm willing to try anything!
> Below is a snip from 'zpool iostat -v 1'. Notice that the file being
> streamed is located on the last disc of the pool (the pool was full, I
> added one disc, and therefore all new data ends up on that disc). It
> completes two "cycles":
>
>                capacity     operations    bandwidth
> pool         used  avail   read  write   read  write
> ----------  -----  -----  -----  -----  -----  -----
> stash       1.42T  76.3G      0      0      0      0
>   ad14       298G  87.4M      0      0      0      0
>   ad15       298G   174M      0      0      0      0
>   ad8        298G   497M      0      0      0      0
>   ad10s1d    340G   441M      0      0      0      0
>   ad16       223G  75.1G      0      0      0      0
> ----------  -----  -----  -----  -----  -----  -----
>
>                capacity     operations    bandwidth
> pool         used  avail   read  write   read  write
> ----------  -----  -----  -----  -----  -----  -----
> stash       1.42T  76.3G      4      0   639K      0
>   ad14       298G  87.4M      0      0      0      0
>   ad15       298G   174M      0      0      0      0
>   ad8        298G   497M      0      0      0      0
>   ad10s1d    340G   441M      0      0      0      0
>   ad16       223G  75.1G      4      0   639K      0
> ----------  -----  -----  -----  -----  -----  -----
>
>                capacity     operations    bandwidth
> pool         used  avail   read  write   read  write
> ----------  -----  -----  -----  -----  -----  -----
> stash       1.42T  76.3G      2      0   384K      0
>   ad14       298G  87.4M      0      0      0      0
>   ad15       298G   174M      0      0      0      0
>   ad8        298G   497M      0      0      0      0
>   ad10s1d    340G   441M      0      0      0      0
>   ad16       223G  75.1G      2      0   384K      0
> ----------  -----  -----  -----  -----  -----  -----
>
>                capacity     operations    bandwidth
> pool         used  avail   read  write   read  write
> ----------  -----  -----  -----  -----  -----  -----
> stash       1.42T  76.3G      7      0  1023K      0
>   ad14       298G  87.4M      0      0      0      0
>   ad15       298G   174M      0      0      0      0
>   ad8        298G   497M      0      0      0      0
>   ad10s1d    340G   441M      0      0      0      0
>   ad16       223G  75.1G      7      0  1023K      0
> ----------  -----  -----  -----  -----  -----  -----
>
>                capacity     operations    bandwidth
> pool         used  avail   read  write   read  write
> ----------  -----  -----  -----  -----  -----  -----
> stash       1.42T  76.3G     27      0  3.50M      0
>   ad14       298G  87.4M      0      0      0      0
>   ad15       298G   174M      0      0      0      0
>   ad8        298G   497M      0      0      0      0
>   ad10s1d    340G   441M      0      0      0      0
>   ad16       223G  75.1G     27      0  3.50M      0
> ----------  -----  -----  -----  -----  -----  -----
>
>                capacity     operations    bandwidth
> pool         used  avail   read  write   read  write
> ----------  -----  -----  -----  -----  -----  -----
> stash       1.42T  76.3G    101      0  12.6M      0
>   ad14       298G  87.4M      0      0  5.99K      0
>   ad15       298G   174M      0      0      0      0
>   ad8        298G   497M      0      0      0      0
>   ad10s1d    340G   441M      0      0      0      0
>   ad16       223G  75.1G    100      0  12.6M      0
> ----------  -----  -----  -----  -----  -----  -----
>
>                capacity     operations    bandwidth
> pool         used  avail   read  write   read  write
> ----------  -----  -----  -----  -----  -----  -----
> stash       1.42T  76.3G    127      0  16.0M      0
>   ad14       298G  87.4M      0      0      0      0
>   ad15       298G   174M      0      0      0      0
>   ad8        298G   497M      0      0      0      0
>   ad10s1d    340G   441M      0      0      0      0
>   ad16       223G  75.1G    127      0  16.0M      0
> ----------  -----  -----  -----  -----  -----  -----
>
>                capacity     operations    bandwidth
> pool         used  avail   read  write   read  write
> ----------  -----  -----  -----  -----  -----  -----
> stash       1.42T  76.3G      2      0   384K      0
>   ad14       298G  87.4M      0      0      0      0
>   ad15       298G   174M      0      0      0      0
>   ad8        298G   497M      0      0      0      0
>   ad10s1d    340G   441M      0      0      0      0
>   ad16       223G  75.1G      2      0   384K      0
> ----------  -----  -----  -----  -----  -----  -----
>
>   
snip!

I see the same behavior that Ståle is seeing. I can "fix" it by setting 
vfs.zfs.prefetch_disable="1" in loader.conf, so I'm assuming something in 
the prefetch code isn't quite right. I believe I saw similar behavior in 
Solaris 10 when playing with ZFS a few weeks ago, but I need to revisit 
that machine or install OpenSolaris on this one before I can be sure.
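
To put a rough number on "way more data than is transferred", here is a small sketch that adds up one ramp-up cycle from the figures Ståle quoted and compares it with a 100 KB/s stream. It assumes the K and M suffixes printed by zpool iostat are binary multiples and that 100 KB/s means 100 * 1024 bytes per second; the helper name to_bytes is made up for this example.

```python
# Sketch only: the unit handling is an assumption (K/M/G read as powers of 1024).
UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def to_bytes(token: str) -> float:
    """Convert a zpool iostat bandwidth token such as '639K' or '3.50M' to bytes."""
    if token[-1] in UNITS:
        return float(token[:-1]) * UNITS[token[-1]]
    return float(token)


# Pool read bandwidth for one ramp-up cycle in the quoted output,
# one sample per second.
cycle = ["384K", "1023K", "3.50M", "12.6M", "16.0M"]

read_bytes = sum(to_bytes(t) for t in cycle)

# The client pulls about 100 KB/s; taken here as 100 * 1024 bytes per second.
sent_bytes = 100 * 1024 * len(cycle)

print(f"read from disk: {read_bytes / 1024 ** 2:.1f} MiB in {len(cycle)} s")
print(f"sent to client: {sent_bytes / 1024 ** 2:.2f} MiB")
print(f"read/sent ratio: {read_bytes / sent_bytes:.0f}x")
```

Under those assumptions the pool reads several tens of times more than the client receives during that cycle, which fits the idea that prefetch is pulling in far more than a slow reader needs.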

# uname -a
FreeBSD  7.0-CURRENT-200705 FreeBSD 7.0-CURRENT-200705 #0: Fri May 11 14:41:37 UTC 2007     root@:/usr/src/sys/amd64/compile/ZFS  amd64

load_zfs="YES" in loader.conf

CPU: AMD Athlon(tm) 64 X2 Dual Core Processor 3800+ (2008.94-MHz K8-class CPU)
usable memory = 1026355200 (978 MB)


I'm doing basic speed testing, so I have WITNESS and friends turned off, 
as well as malloc.conf symlinked to 'aj'.




The zpool is a GPT partition residing on a single SATA Seagate 400.

# zpool status
  pool: big
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        big         ONLINE       0     0     0
          ad6p1     ONLINE       0     0     0




Results with FTP to a 6.2 client over a 100 Mbit LAN:



vfs.zfs.prefetch_disable="0" (default)

netstat -w1
            input        (Total)           output
   packets  errs      bytes    packets  errs      bytes colls
        17     0       1856          3     0        382     0
         8     0        689          3     0        382     0
      1232     0      81719       2422     0    3657607     0
      3994     0     263727       7924     0   11995892     0
      4108     0     271701       8130     0   12304874     0
      2966     0     196035       5856     0    8861816     0
      4101     0     271084       8129     0   12304594     0
      3988     0     263471       7894     0   11943838     0
      2702     0     179088       5340     0    8083084     0
      4105     0     271326       8129     0   12304594     0
      4101     0     270825       8129     0   12304594     0
      2752     0     182146       5428     0    8213760     0
      4102     0     271243       8130     0   12305076     0
      4099     0     270849       8129     0   12307622     0
      2729     0     180453       5406     0    8181484     0
      4096     0     270583       8129     0   12304594     0
      4109     0     271980       8130     0   12304594     0
      2742     0     181391       5417     0    8192630     0
      4106     0     271583       8129     0   12304594     0
      4098     0     270617       8129     0   12309136     0
      3128     0     207274       6189     0    9354050     0
      3718     0     245621       7368     0   11159762     0
      4102     0     271383       8129     0   12304594     0
      3879     0     256447       7683     0   11620508     0
      2977     0     197005       5892     0    8929640     0
      4103     0     271242       8129     0   12304808     0
      4109     0     271337       8130     0   12303080     0
      1596     0     105573       3133     0    4721738     0
        15     0       1668          4     0        448     0
        15     0       1253          3     0        382     0


zpool iostat 1
               capacity     operations    bandwidth
pool         used  avail   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
big         10.5G   276G      0      0      0      0
big         10.5G   276G      0      0      0      0
big         10.5G   276G      0      0      0      0
big         10.5G   276G    285      0  35.5M      0
big         10.5G   276G    106      0  13.3M      0
big         10.5G   276G     89      0  11.2M      0
big         10.5G   276G    225      0  28.1M      0
big         10.5G   276G      0      0      0      0
big         10.5G   276G      0      0      0      0
big         10.5G   276G    257      0  32.1M      0
big         10.5G   276G      0      0      0      0
big         10.5G   276G      0      0      0      0
big         10.5G   276G    257      0  32.1M      0
big         10.5G   276G      0      0      0      0
big         10.5G   276G      0      0      0      0
big         10.5G   276G    257      0  32.1M      0
big         10.5G   276G      0      0      0      0
big         10.5G   276G      0      0      0      0
big         10.5G   276G    235      0  29.3M      0
big         10.5G   276G     21      0  2.75M      0
big         10.5G   276G      0      0      0      0
big         10.5G   276G    133      0  16.7M      0
big         10.5G   276G    123      0  15.4M      0
big         10.5G   276G      0      0      0      0
big         10.5G   276G     35      0  4.50M      0
big         10.5G   276G    221      0  27.6M      0
big         10.5G   276G      0      0      0      0
big         10.5G   276G      0      0      0      0
big         10.5G   276G    257      0  32.1M      0


You can see how the network speeds fluctuate.
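
For anyone who wants the fluctuation as numbers rather than eyeballing the columns, a quick sketch over the output bytes column of the netstat run above. The ramp-up and ramp-down seconds are left out, and the 80% threshold used to count a "dip" is an arbitrary choice for this example.

```python
from statistics import mean

# Output bytes per second from the netstat -w1 run with prefetch enabled,
# without the first and last seconds of the transfer.
out_bytes = [
    11995892, 12304874, 8861816, 12304594, 11943838, 8083084,
    12304594, 12304594, 8213760, 12305076, 12307622, 8181484,
    12304594, 12304594, 8192630, 12304594, 12309136, 9354050,
    11159762, 12304594, 11620508, 8929640, 12304808, 12303080,
]

lo, hi, avg = min(out_bytes), max(out_bytes), mean(out_bytes)

# A "dip" here is any second below 80% of the best second (arbitrary cutoff).
dips = [b for b in out_bytes if b < 0.8 * hi]

print(f"min {lo / 1e6:.2f} MB/s, max {hi / 1e6:.2f} MB/s, mean {avg / 1e6:.2f} MB/s")
print(f"{len(dips)} of {len(out_bytes)} seconds fell below 80% of the peak")
```

The dips show up every few seconds, which lines up with the idle seconds in the zpool iostat output.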




Now, I reboot with vfs.zfs.prefetch_disable="1" set in loader.conf

# netstat -w1
           input        (Total)           output
   packets  errs      bytes    packets  errs      bytes colls
        10     0        969          2     0        316     0
        13     0        983          2     0        316     0
      3890     0     257204       7672     0   11589854     0
      4058     0     268449       8023     0   12125162     0
      4063     0     268600       8029     0   12122918     0
      4048     0     267395       8013     0   12095290     0
      4062     0     268385       8020     0   12117482     0
      4070     0     269165       8030     0   12121856     0
      4062     0     268440       8021     0   12133816     0
      4062     0     268387       8027     0   12125850     0
      4059     0     268379       8028     0   12128878     0
      4074     0     269529       8038     0   12151286     0
      4064     0     268883       8032     0   12131174     0
      4069     0     268763       8042     0   12155236     0
      4061     0     268229       8033     0   12137728     0
      4057     0     268353       8028     0   12133948     0
      3953     0     261159       7814     0   11804028     0
      4063     0     268331       8026     0   12122004     0
      4072     0     269673       8032     0   12138958     0
      4059     0     268214       8036     0   12137860     0
      4065     0     268491       8043     0   12156648     0
      4069     0     269000       8042     0   12146426     0
      4056     0     267963       8033     0   12138546     0
      4092     0     270247       8115     0   12282992     0
      4101     0     271279       8129     0   12304594     0
       758     0      50289       1465     0    2198524     0
         8     0        657          3     0        382     0
        14     0       1225          3     0        382     0



# zpool iostat 1
               capacity     operations    bandwidth
pool         used  avail   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
big         10.7G   275G      0      0      0      0
big         10.7G   275G      0      0      0      0
big         10.7G   275G     88      0  11.1M      0
big         10.7G   275G     87      0  11.0M      0
big         10.7G   275G     88      0  11.1M      0
big         10.7G   275G     87      0  11.0M      0
big         10.7G   275G     88      0  11.1M      0
big         10.7G   275G     87      0  11.0M      0
big         10.7G   275G     88      0  11.1M      0
big         10.7G   275G     88      0  11.1M      0
big         10.7G   275G     87      0  11.0M      0
big         10.7G   275G     88      0  11.1M      0
big         10.7G   275G     88      0  11.1M      0
big         10.7G   275G     87      0  11.0M      0
big         10.7G   275G     88      0  11.1M      0
big         10.7G   275G     88      0  11.1M      0
big         10.7G   275G     85      0  10.7M      0
big         10.7G   275G     87      0  11.0M      0
big         10.7G   275G     88      0  11.1M      0
big         10.7G   275G     88      0  11.1M      0
big         10.7G   275G     87      0  11.0M      0
big         10.7G   275G     88      0  11.1M      0
big         10.7G   275G     88      0  11.1M      0
big         10.7G   275G      7      0  1023K      0
big         10.7G   275G      0      0      0      0


The total data may not be the same in each test: I was using an 8 GB test 
file and stopping the transfer after a bit. The behavior is consistent, though.
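
As a rough comparison of the two zpool iostat runs, the sketch below takes the read bandwidth column from each (only the seconds while the transfer was active) and prints the mean, the peak, and the number of idle seconds. It reads the M suffix as MiB, which is an assumption, and the summarize helper is just a name made up for this example.

```python
from statistics import mean

# Pool read bandwidth in MiB per second, copied from the two runs above.
prefetch_on = [
    35.5, 13.3, 11.2, 28.1, 0, 0, 32.1, 0, 0, 32.1, 0, 0, 32.1,
    0, 0, 29.3, 2.75, 0, 16.7, 15.4, 0, 4.50, 27.6, 0, 0, 32.1,
]
prefetch_off = [
    11.1, 11.0, 11.1, 11.0, 11.1, 11.0, 11.1, 11.1, 11.0, 11.1, 11.1,
    11.0, 11.1, 11.1, 10.7, 11.0, 11.1, 11.1, 11.0, 11.1, 11.1,
]


def summarize(name, samples):
    """Print and return the mean, the peak, and the count of idle seconds."""
    avg = mean(samples)
    peak = max(samples)
    idle = sum(1 for s in samples if s == 0)
    print(f"{name}: mean {avg:.1f} MiB/s, peak {peak:.1f} MiB/s, "
          f"{idle} idle of {len(samples)} s")
    return avg, peak, idle


on_avg, on_peak, on_idle = summarize("prefetch on ", prefetch_on)
off_avg, off_peak, off_idle = summarize("prefetch off", prefetch_off)
```

In these samples the average read rate is close in both runs. What changes is the shape: with prefetch enabled the reads come in large bursts separated by idle seconds, and with it disabled the disk reads at a steady rate.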







PS: Pawel, thanks for all your work on ZFS for FreeBSD. I really 
appreciate it!




