Increase mps sequential read performance with ZFS/zvol

John jwd at FreeBSD.org
Tue Feb 5 02:36:09 UTC 2013


Hi Folks,

   I'm in the process of putting together another ZFS server and
after running some sequential read performance tests I'm thinking
things could be better. It's running 9.1-stable from late January:

FreeBSD vprzfs30p.unx.sas.com 9.1-STABLE FreeBSD 9.1-STABLE #1 r246079M

   I have two HP D2700 shelves populated with 600GB drives, connected
to a pair of LSI 9207-8e HBA cards installed in a Dell R620 with 128GB
of RAM; the OS is installed on an internal RAID volume. The shelves
are dual-channel, with each LSI card running a channel through both
shelves.

   Gmultipath is used to bind the disks so that each disk can be
addressed through either controller and the I/O load balanced.
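
   For reference, each disk's two paths (one per HBA) are bound with
gmultipath in active/active mode, along these lines (the da numbers
here are placeholders; the real ones depend on probe order):

# gmultipath label -A Z0 /dev/da0 /dev/da24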

   The zfs pool consists of 24 mirrors, each mirror pairing one disk
from each shelf. The multipath labels are rotated so that I/O is
balanced between shelves and controllers.
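
   The pool was created along these lines (abbreviated; the full
command lists all 24 mirror pairs, matching the zpool status output
at the end of this message):

# zpool create pool0 \
    mirror multipath/Z0 multipath/Z1 \
    mirror multipath/Z2 multipath/Z3 \
    ...
    mirror multipath/Z46 multipath/Z47 \
    spare multipath/Z48 multipath/Z49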

   For testing, two 300GB zvols are created, each almost full:

NAME               USED  AVAIL  REFER  MOUNTPOINT
pool0             1.46T  11.4T    31K  /pool0
pool0/lun000004    301G  11.4T   261G  -
pool0/lun000005    301G  11.4T   300G  -
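
  For completeness, the zvols were created the usual way and then
filled with test data:

# zfs create -V 300g pool0/lun000004
# zfs create -V 300g pool0/lun000005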

  Running a simple dd test:

# dd if=/dev/zvol/pool0/lun000005 of=/dev/null bs=512k
614400+0 records in
614400+0 records out
322122547200 bytes transferred in 278.554656 secs (1156406975 bytes/sec)

  The drives are spread and balanced across four 6Gb/s channels, so
1.1GB/s seems a bit slow: a single 6Gb/s link is good for roughly
600MB/s of payload after encoding overhead, so four channels should
comfortably exceed this in aggregate. Note that changing the bs=
option makes no real difference.

   Now, if I run two 'dd' operations against the two zvols in parallel:

# dd if=/dev/zvol/pool0/lun000005 of=/dev/null bs=512k
614400+0 records in
614400+0 records out
322122547200 bytes transferred in 278.605380 secs (1156196435 bytes/sec)

# dd if=/dev/zvol/pool0/lun000004 of=/dev/null bs=512k
614400+0 records in
614400+0 records out
322122547200 bytes transferred in 282.065008 secs (1142015274 bytes/sec)

  The two streams together sustain roughly 2.3GB/s. This tells me the
I/O subsystem has plenty of headroom, and that the first 'dd'
operation alone should be able to run faster.
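
  The parallel case is nothing fancier than two concurrent jobs:

# dd if=/dev/zvol/pool0/lun000005 of=/dev/null bs=512k &
# dd if=/dev/zvol/pool0/lun000004 of=/dev/null bs=512k &
# wait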

  I've included some basic config information below. There are no kmem
values set in /boot/loader.conf. I did play around with block_cap but
it made no difference. It seems like something is holding the system
back.
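
  For anyone who wants to suggest tunables: the knobs I know of that
plausibly affect per-vdev queue depth and prefetch on 9.1 are these,
in case they matter here:

# sysctl vfs.zfs.vdev.max_pending
# sysctl vfs.zfs.prefetch_disable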

  Thanks for any ideas.

-John

Output from top during a single dd run:

    5 root         11  -8    -     0K   208K zvol:i  1   5:11 41.65% zfskern
    0 root        350  -8    0     0K  5600K -       5   3:59 15.23% kernel
 1784 root          1  26    0  9944K  2072K CPU1    1   0:31 13.87% dd

The zvol:io state appears to be a simple wait loop, waiting for
outstanding I/O requests to complete. How can I get more I/O requests
in flight?

Highest number of outstanding I/O commands observed per controller:

dev.mps.0.io_cmds_highwater: 207
dev.mps.1.io_cmds_highwater: 126
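
For reference, the instantaneous count can be polled during a run; I
believe the driver exports io_cmds_active alongside the highwater
mark:

# while :; do sysctl dev.mps.0.io_cmds_active; sleep 1; done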


   IOCFacts (identical for both controllers):

mps0: <LSI SAS2308> port 0xec00-0xecff mem 0xdaff0000-0xdaffffff,0xdaf80000-0xdafbffff irq 48 at device 0.0 on pci5
mps0: Doorbell= 0x22000000
mps0: mps_wait_db_ack: successfull count(2), timeout(5)
mps0: Doorbell= 0x12000000
mps0: mps_wait_db_ack: successfull count(1), timeout(5)
mps0: mps_wait_db_ack: successfull count(1), timeout(5)
mps0: mps_wait_db_ack: successfull count(1), timeout(5)
mps0: mps_wait_db_ack: successfull count(1), timeout(5)
mps0: IOCFacts  :
        MsgVersion: 0x200
        HeaderVersion: 0x1b00
        IOCNumber: 0
        IOCExceptions: 0x0
        MaxChainDepth: 128
        WhoInit: ROM BIOS
        NumberOfPorts: 1
        RequestCredit: 10240
        ProductID: 0x2214
        IOCCapabilities: 1285c<ScsiTaskFull,DiagTrace,SnapBuf,EEDP,TransRetry,EventReplay,HostDisc>
        FWVersion= 15-0-0-0
        IOCRequestFrameSize: 32
        MaxInitiators: 32
        MaxTargets: 1024
        MaxSasExpanders: 64
        MaxEnclosures: 65
        ProtocolFlags: 3<ScsiTarg,ScsiInit>
        HighPriorityCredit: 128
        MaxReplyDescriptorPostQueueDepth: 65504
        ReplyFrameSize: 32
        MaxVolumes: 0
        MaxDevHandle: 1128
        MaxPersistentEntries: 128
mps0: Firmware: 15.00.00.00, Driver: 14.00.00.01-fbsd
mps0: IOCCapabilities: 1285c<ScsiTaskFull,DiagTrace,SnapBuf,EEDP,TransRetry,EventReplay,HostDisc>



And some output from 'gstat -f Z -I 300ms':

dT: 0.302s  w: 0.300s  filter: Z
 L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w   %busy Name
    0    202    202  25450    2.6      0      0    0.0   25.5| multipath/Z0
    1    202    202  25046    6.2      0      0    0.0   36.6| multipath/Z2
    7    185    185  23735    6.3      0      0    0.0   33.1| multipath/Z4
    0    212    212  27125    5.4      0      0    0.0   30.4| multipath/Z6
    0    169    169  21616    5.0      0      0    0.0   28.1| multipath/Z8
    0    162    162  20768    5.0      0      0    0.0   25.7| multipath/Z10
    0    175    175  22463    6.0      0      0    0.0   30.4| multipath/Z12
    0    192    192  24582    4.4      0      0    0.0   32.1| multipath/Z14
    2    169    169  21616    3.3      0      0    0.0   18.8| multipath/Z16
    4    169    169  20808    4.1      0      0    0.0   23.0| multipath/Z18
    2    195    195  24602    4.5      0      0    0.0   28.5| multipath/Z20
    5    172    172  22039    4.4      0      0    0.0   22.7| multipath/Z22
    0    166    166  21192    3.7      0      0    0.0   20.2| multipath/Z24
    7    179    179  22887    5.4      0      0    0.0   27.8| multipath/Z26
    7    172    172  22039    3.5      0      0    0.0   23.1| multipath/Z28
    0    192    192  24582    3.8      0      0    0.0   25.5| multipath/Z30
    1    175    175  22463    6.0      0      0    0.0   30.5| multipath/Z32
    1    182    182  22907    3.9      0      0    0.0   25.6| multipath/Z34
    0    212    212  27125    6.3      0      0    0.0   32.7| multipath/Z36
    0    179    179  22483    4.8      0      0    0.0   27.5| multipath/Z38
    2    185    185  23735    4.6      0      0    0.0   30.0| multipath/Z40
    0    179    179  22887    4.5      0      0    0.0   28.2| multipath/Z42
    3    195    195  25006    4.4      0      0    0.0   32.3| multipath/Z44
    3    192    192  24582    4.0      0      0    0.0   30.5| multipath/Z46
    0      0      0      0    0.0      0      0    0.0    0.0| multipath/Z48
    0    179    179  22887    4.7      0      0    0.0   31.0| multipath/Z1
    0    185    185  23331    4.1      0      0    0.0   24.8| multipath/Z3
    0    175    175  21639    5.3      0      0    0.0   28.2| multipath/Z5
    4    162    162  20768    5.1      0      0    0.0   26.6| multipath/Z7
    0    195    195  25006    3.5      0      0    0.0   23.4| multipath/Z9
    3    179    179  22887    5.0      0      0    0.0   25.7| multipath/Z11
    4    159    159  20344    4.9      0      0    0.0   23.7| multipath/Z13
    4    166    166  21192    4.3      0      0    0.0   25.1| multipath/Z15
    0    169    169  21616    3.9      0      0    0.0   24.7| multipath/Z17
    7    189    189  23334    4.2      0      0    0.0   25.7| multipath/Z19
    4    169    169  21212    4.3      0      0    0.0   28.1| multipath/Z21
    0    159    159  20344    5.3      0      0    0.0   25.8| multipath/Z23
    5    185    185  23316    4.1      0      0    0.0   26.0| multipath/Z25
    0    192    192  24582    4.9      0      0    0.0   30.6| multipath/Z27
    0    172    172  22039    5.5      0      0    0.0   27.4| multipath/Z29
    4    166    166  21192    4.2      0      0    0.0   23.7| multipath/Z31
    0    169    169  20778    3.5      0      0    0.0   22.2| multipath/Z33
    2    172    172  21232    5.1      0      0    0.0   29.4| multipath/Z35
    3    169    169  21616    2.9      0      0    0.0   20.1| multipath/Z37
    0    179    179  22887    5.2      0      0    0.0   32.0| multipath/Z39
    0    212    212  26721    5.4      0      0    0.0   31.7| multipath/Z41
    2    175    175  22463    4.4      0      0    0.0   28.0| multipath/Z43
    0    179    179  22887    3.6      0      0    0.0   18.2| multipath/Z45
    0    179    179  22887    4.3      0      0    0.0   28.3| multipath/Z47
    0      0      0      0    0.0      0      0    0.0    0.0| multipath/Z49
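
  Note the per-disk numbers above: roughly 160-210 reads/s at
20-27MB/s per disk, with %busy mostly in the 20-35% range. 48 active
spindles at ~22MB/s each accounts for the ~1.1GB/s total, and none of
the disks are anywhere near saturated.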

Each individual disk on the system shows a capability of 255 tags:

# camcontrol tags da0 -v
(pass2:mps0:0:10:0): dev_openings  255
(pass2:mps0:0:10:0): dev_active    0
(pass2:mps0:0:10:0): devq_openings 255
(pass2:mps0:0:10:0): devq_queued   0
(pass2:mps0:0:10:0): held          0
(pass2:mps0:0:10:0): mintags       2
(pass2:mps0:0:10:0): maxtags       255
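
The tag openings can also be adjusted at runtime with camcontrol,
e.g.:

# camcontrol tags da0 -N 255 -v

But with dev_active sitting at 0 above, the per-device queues are
clearly not the limiting factor.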


zpool:

# zpool status
  pool: pool0
 state: ONLINE
  scan: none requested
config:

	NAME               STATE     READ WRITE CKSUM
	pool0              ONLINE       0     0     0
	  mirror-0         ONLINE       0     0     0
	    multipath/Z0   ONLINE       0     0     0
	    multipath/Z1   ONLINE       0     0     0
	  mirror-1         ONLINE       0     0     0
	    multipath/Z2   ONLINE       0     0     0
	    multipath/Z3   ONLINE       0     0     0
	  mirror-2         ONLINE       0     0     0
	    multipath/Z4   ONLINE       0     0     0
	    multipath/Z5   ONLINE       0     0     0
	  mirror-3         ONLINE       0     0     0
	    multipath/Z6   ONLINE       0     0     0
	    multipath/Z7   ONLINE       0     0     0
	  mirror-4         ONLINE       0     0     0
	    multipath/Z8   ONLINE       0     0     0
	    multipath/Z9   ONLINE       0     0     0
	  mirror-5         ONLINE       0     0     0
	    multipath/Z10  ONLINE       0     0     0
	    multipath/Z11  ONLINE       0     0     0
...
	  mirror-21        ONLINE       0     0     0
	    multipath/Z42  ONLINE       0     0     0
	    multipath/Z43  ONLINE       0     0     0
	  mirror-22        ONLINE       0     0     0
	    multipath/Z44  ONLINE       0     0     0
	    multipath/Z45  ONLINE       0     0     0
	  mirror-23        ONLINE       0     0     0
	    multipath/Z46  ONLINE       0     0     0
	    multipath/Z47  ONLINE       0     0     0
	spares
	  multipath/Z48    AVAIL   
	  multipath/Z49    AVAIL   

errors: No known data errors


