performance tuning of iSCSI and Dell MD3000i / gjournal problem

Miroslav Lachman 000.fbsd at quip.cz
Tue Jan 26 16:16:33 UTC 2010


I am CCing freebsd-performance at freebsd.org as somebody may be interested 
and not subscribed to freebsd-scsi.

Just a note: I am still having performance problems with iSCSI and the 
Dell MD3000i.
I tried it with ZFS, but write performance was even worse, about 
3-5MB/s! Copying 33GB from a local UFS partition to the iSCSI ZFS 
partition took almost 5 hours:

~/# cp -a /vol1/data /tank/vol2/data
Terminated

Usr: 0.224s  Krnl: 131.862s  Totl: 4:47:21.80s  CPU: 0.7%  swppd: 0  
I/O: 271251+0

~/# df -h /tank/vol2
Filesystem    Size    Used   Avail Capacity  Mounted on
tank/vol2      49G     33G     15G    68%    /tank/vol2
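
As a rough sanity check, the average rate of that copy can be worked out 
from the df and time output above. This is only a sketch: it assumes the 
33G shown by df is all data written by this cp run and that "G" means 
GiB; the function names are made up for illustration.

```python
def parse_elapsed(hms: str) -> float:
    """Convert an elapsed time written as 'H:MM:SS.ss' into seconds."""
    hours, minutes, seconds = hms.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def average_mib_per_s(size_gib: float, elapsed: str) -> float:
    """Average throughput in MiB/s for size_gib GiB moved in the elapsed time."""
    return size_gib * 1024 / parse_elapsed(elapsed)


if __name__ == "__main__":
    # Assumption: 33 GiB (df "Used" column) written in 4:47:21.80 (time "Totl").
    rate = average_mib_per_s(33, "4:47:21.80")
    print(f"average write rate: {rate:.2f} MiB/s")
```

Under these assumptions the whole-run average comes out near 2 MiB/s, 
somewhat below the 3-5MB/s quoted above, so the input figures should be 
treated as approximate.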


But ZFS read performance was impressive (thanks to prefetch being 
enabled): about 88MB/s (up to 850Mbit/s on the iSCSI interface bce1)

# ifstat -i bce1 -b
        bce1
  Kbps in  Kbps out
     0.00      0.00
726154.2   5020.94
721338.3   4000.65
674180.6   3658.37
693954.6   4227.52
679745.0   3619.76
693572.9   4170.39
544614.8   2812.59
579602.1   3021.56
752622.0   3921.31
854647.0   4654.90


da1 is the ZFS partition over iSCSI
# iostat -x -w 20 da0 da1
                         extended device statistics
device     r/s   w/s    kr/s    kw/s wait svc_t  %b
da0        0.0   0.0     0.0     0.0    0   0.0   0
da1      1529.1   7.7 97860.8    56.7    9  16.4  99
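
The ifstat and iostat figures above use different units, so a small 
conversion makes them easier to compare. A minimal sketch, assuming 
ifstat's "Kbps" means 1000 bits per second and iostat's kr/s means 
1024-byte kilobytes per second; the leading 0.00 sample (taken before 
the read started) is left out.

```python
from statistics import mean

# "Kbps in" samples on bce1 copied from the ifstat output above.
IFSTAT_KBPS_IN = [
    726154.2, 721338.3, 674180.6, 693954.6, 679745.0,
    693572.9, 544614.8, 579602.1, 752622.0, 854647.0,
]

# kr/s for da1 from the iostat -x output above.
IOSTAT_KR_S = 97860.8


def kbps_to_mb_s(kbps: float) -> float:
    """Kilobits per second (assumed 1000-based) to megabytes per second."""
    return kbps * 1000 / 8 / 1_000_000


def kib_s_to_mib_s(kib_s: float) -> float:
    """Kilobytes per second (assumed 1024-based) to MiB per second."""
    return kib_s / 1024


if __name__ == "__main__":
    print(f"ifstat mean: {kbps_to_mb_s(mean(IFSTAT_KBPS_IN)):.1f} MB/s")
    print(f"ifstat peak: {kbps_to_mb_s(max(IFSTAT_KBPS_IN)):.1f} MB/s")
    print(f"iostat da1 : {kib_s_to_mib_s(IOSTAT_KR_S):.1f} MiB/s")
```

With these assumptions the ifstat mean is about 86.5 MB/s, which is in 
line with the roughly 88MB/s quoted above.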

Unfortunately there is another problem with ZFS performance and 
Lighttpd.
http://lists.freebsd.org/pipermail/freebsd-stable/2010-January/054384.html
http://lists.freebsd.org/pipermail/freebsd-stable/2010-January/054385.html


Then I reverted to UFS + gjournal, but as I posted to the 
freebsd-geom@ list, there is a problem with a gjournal timeout at boot 
time. So UFS with gjournal seems to be usable only with the first / 
manual setup, but not in the usual way with rc.d scripts.
http://lists.freebsd.org/pipermail/freebsd-geom/2010-January/003872.html

Then I chose to drop gjournal from my plan (although I am scared of a 
really long fsck after a crash / power failure). I had some problems 
with the combination of GPT + gjournal + glabel, so I destroyed the 
filesystem and started recreating it and restoring data from backup.

*There were several system freezes caused by unkillable iSCSI sessions 
interacting with gjournal.*
The machine had to be rebooted through remote management (cold or warm 
reboot in Dell DRAC), but there was no kernel panic message.

I was disappointed by terrible write performance again. UFS + 
SoftUpdates managed only 4-7MB/s of write speed with rsync.

# iostat -x -w 10 da0
                         extended device statistics
device     r/s   w/s    kr/s    kw/s wait svc_t  %b
da0        0.4 107.4     6.4  6791.5   17 131.0 104
                         extended device statistics
device     r/s   w/s    kr/s    kw/s wait svc_t  %b
da0        0.0  67.0     0.0  4285.8   18 215.6 100


# ifstat -i bce0,bce1 -b 10
        bce0                bce1
  Kbps in  Kbps out   Kbps in  Kbps out
31542.66   1518.12    285.53  31415.25
34309.73   1652.56    306.90  34281.15
47235.61   2270.39    429.60  47887.96
30743.04   1483.37    267.57  30524.36
50871.20   2458.34    457.92  50449.77
63548.43   3023.20    581.71  64230.07

# iostat -w 5
       tty           mfid0              da0             cpu
  tin tout  KB/t tps  MB/s   KB/t tps  MB/s  us ni sy in id
    1  319 15.47   2  0.04  62.90  71  4.35   1  0  0  0 98
    0  100  0.00   0  0.00  60.22  62  3.66   1  0  0  0 99
    0   63 16.00   1  0.02  63.86  66  4.15   1  0  0  0 98
    0   39 22.65  35  0.78  57.97  73  4.13   2  0  1  0 96
    0   63 19.18  21  0.40  63.90  92  5.74   1  0  0  0 98

That means 2-3 days to copy all the data back to the storage!


I stopped it, added gjournal again and started rsync again.

Much better results (28MB/s of write speed), but not as high as a month 
ago when I started to play with iSCSI (it was about 60MB/s).

This time it was:
                         extended device statistics
device     r/s   w/s    kr/s    kw/s wait svc_t  %b
da0        0.7 447.9    11.1 28486.3   32  68.8 100
                         extended device statistics
device     r/s   w/s    kr/s    kw/s wait svc_t  %b
da0        0.9 444.6    14.3 28367.0   23  69.4  99

# ifstat -i bce0,bce1 -b 10
        bce0                bce1
  Kbps in  Kbps out   Kbps in  Kbps out
213487.4   8139.54   2201.14  237459.0
268473.5  10233.86   2354.66  250007.1
222936.1   8499.59   2179.59  235984.8
259017.9   9835.16   2094.63  221011.9
215187.9   8188.47   2205.16  237813.7
175749.2   6637.50   1888.68  207940.3
291358.4  11056.45   2353.09  251959.6

The journal is on the local RAID1 (mfid0), which writes at about 
150MB/s, so the data is fetched over the network to the journal and then 
committed to the iSCSI storage (da0) at a higher speed than with 
SoftUpdates.
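
To put the two sets of iostat -x samples for da0 side by side, the 
device rows can be parsed and summarised. This is a sketch only: the 
rows are copied from the output earlier in this message, kw/s is assumed 
to be in 1024-byte kilobytes, and the helper names are hypothetical.

```python
from statistics import mean

# da0 rows from "iostat -x" during the UFS + SoftUpdates rsync.
SOFTUPDATES = """\
da0        0.4 107.4     6.4  6791.5   17 131.0 104
da0        0.0  67.0     0.0  4285.8   18 215.6 100
"""

# da0 rows from "iostat -x" during the UFS + gjournal rsync.
GJOURNAL = """\
da0        0.7 447.9    11.1 28486.3   32  68.8 100
da0        0.9 444.6    14.3 28367.0   23  69.4  99
"""


def parse_rows(text):
    """Parse device rows: device r/s w/s kr/s kw/s wait svc_t %b."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines, the banner line and the column header line.
        if len(fields) != 8 or fields[0] == "device":
            continue
        rows.append({
            "w_s": float(fields[2]),
            "kw_s": float(fields[4]),
            "svc_t": float(fields[6]),
        })
    return rows


def summarize(rows):
    """Mean write rate, mean size per write and mean service time."""
    return {
        "write_mib_s": mean(r["kw_s"] for r in rows) / 1024,
        "kib_per_write": mean(r["kw_s"] / r["w_s"] for r in rows),
        "svc_t_ms": mean(r["svc_t"] for r in rows),
    }


if __name__ == "__main__":
    for name, text in (("SoftUpdates", SOFTUPDATES), ("gjournal", GJOURNAL)):
        s = summarize(parse_rows(text))
        print(f"{name:12s} {s['write_mib_s']:6.2f} MiB/s  "
              f"{s['kib_per_write']:5.1f} KiB/write  {s['svc_t_ms']:6.1f} ms")
```

In both runs the average write works out to roughly 64 KiB, so in these 
samples the difference shows up in the number of writes per second and 
in svc_t rather than in the request size.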

But the write speed dropped from 28MB/s to 8MB/s after 5 hours of 
running rsync (and it is still running).

Now I am too tired and need some more hours of sleep...

Miroslav Lachman

PS: Thank you for your work, Danny. I know it is not possible for you to 
help me if you do not have exactly the same HW setup.
I will post to you (and to freebsd-rc@) a highly modified (and I hope 
improved) rc.d script for iSCSI with better handling of module loading, 
mounting, stopping etc.

