NVMe performance 4x slower than expected
Tobias Oberstein
tobias.oberstein at gmail.com
Thu Apr 16 20:31:48 UTC 2015
> It's already awesome that Intel has senior engineers working on
> FreeBSD driver code! And it would underline Intel's Open-source
> commitment and tech leadership if they donated a couple of these
> beefy NVMes.
>
> Intel has agreed to send DC P3700 samples to the FreeBSD Foundation to
> put in the cluster for this kind of work - we are working on getting
> these through the internal sample distribution process at the moment.
This is awesome! Intel rocks. I am spreading the word about this.
==
OT (slightly): I've done more measurements under Linux, now with all
eight cards (md RAID-0 over all 8) under fire from FIO.
The results are very good. Here is 1 number (block device level):
2647.4K random 4kB read IOPS @ <800us (95% quantile)
The Intel datasheet says 450k IOPS per card, which would be 3600K, so
the scaleup is quite good. The random _write_ IOPS I've measured is
almost too high to believe (2390.3K) where Intel datasheet suggests
1400k - I am running a longer test overnight now. Let's see if it can be
sustained. Anyway: these Intel NVMe's are great tech!
We'll now move on testing at filesystem level (XFS) and database level
(PostgreSQL).
Cheers,
/Tobias
FIO Ergebnisse 1 NVMe Karte
===========================
bvr-sql18:~ # fio --filename /dev/nvme7n1 control.fio
random-read-4k: (g=0): rw=randread, bs=4K-4K/4K-4K/4K-4K, ioengine=sync,
iodepth=1
...
random-write-4k: (g=1): rw=randwrite, bs=4K-4K/4K-4K/4K-4K,
ioengine=sync, iodepth=1
...
sequential-read-128k: (g=2): rw=read, bs=128K-128K/128K-128K/128K-128K,
ioengine=sync, iodepth=1
...
sequential-write-128k: (g=3): rw=write,
bs=128K-128K/128K-128K/128K-128K, ioengine=sync, iodepth=1
...
fio-2.2.6
Starting 256 threads
Jobs: 64 (f=64): [_(192),W(64)] [2.4% done] [0KB/1902MB/0KB /s]
[0/15.3K/0 iops] [eta 09h:36m:00s] ]
random-read-4k: (groupid=0, jobs=64): err= 0: pid=21898: Thu Apr 16
18:56:01 2015
read : io=381205MB, bw=2117.9MB/s, iops=542157, runt=180000msec
clat (usec): min=6, max=5568, avg=116.92, stdev=103.46
lat (usec): min=6, max=5568, avg=117.00, stdev=103.47
clat percentiles (usec):
| 1.00th=[ 10], 5.00th=[ 14], 10.00th=[ 18], 20.00th=[ 26],
| 30.00th=[ 36], 40.00th=[ 54], 50.00th=[ 127], 60.00th=[ 151],
| 70.00th=[ 171], 80.00th=[ 193], 90.00th=[ 223], 95.00th=[ 249],
| 99.00th=[ 310], 99.50th=[ 346], 99.90th=[ 828], 99.95th=[ 1880],
| 99.99th=[ 2864]
bw (KB /s): min= 0, max=36016, per=1.56%, avg=33798.82,
stdev=1884.10
lat (usec) : 10=0.57%, 20=11.28%, 50=26.41%, 100=5.80%, 250=51.15%
lat (usec) : 500=4.64%, 750=0.04%, 1000=0.02%
lat (msec) : 2=0.04%, 4=0.05%, 10=0.01%
cpu : usr=1.57%, sys=3.76%, ctx=114070395, majf=0, minf=3150
IO depths : 1=116.9%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%,
>=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%,
>=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%,
>=64=0.0%
issued : total=r=97588427/w=0/d=0, short=r=0/w=0/d=0,
drop=r=0/w=0/d=0
latency : target=0, window=0, percentile=100.00%, depth=1
random-write-4k: (groupid=1, jobs=64): err= 0: pid=21994: Thu Apr 16
18:56:01 2015
write: io=301353MB, bw=1674.2MB/s, iops=428591, runt=180000msec
clat (usec): min=7, max=18212, avg=147.90, stdev=202.69
lat (usec): min=8, max=18212, avg=147.99, stdev=202.69
clat percentiles (usec):
| 1.00th=[ 14], 5.00th=[ 24], 10.00th=[ 33], 20.00th=[ 52],
| 30.00th=[ 76], 40.00th=[ 109], 50.00th=[ 145], 60.00th=[ 173],
| 70.00th=[ 195], 80.00th=[ 213], 90.00th=[ 247], 95.00th=[ 290],
| 99.00th=[ 454], 99.50th=[ 564], 99.90th=[ 1112], 99.95th=[ 1928],
| 99.99th=[13120]
bw (KB /s): min= 0, max=29584, per=1.56%, avg=26719.94,
stdev=1839.10
lat (usec) : 10=0.03%, 20=3.11%, 50=15.64%, 100=18.69%, 250=53.11%
lat (usec) : 500=8.69%, 750=0.52%, 1000=0.10%
lat (msec) : 2=0.07%, 4=0.03%, 10=0.01%, 20=0.01%
cpu : usr=1.52%, sys=3.20%, ctx=89437032, majf=0, minf=2657
IO depths : 1=115.9%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%,
>=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%,
>=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%,
>=64=0.0%
issued : total=r=0/w=77146473/d=0, short=r=0/w=0/d=0,
drop=r=0/w=0/d=0
latency : target=0, window=0, percentile=100.00%, depth=1
sequential-read-128k: (groupid=2, jobs=64): err= 0: pid=22088: Thu Apr
16 18:56:01 2015
read : io=424549MB, bw=2358.6MB/s, iops=18868, runt=180006msec
clat (usec): min=213, max=9712, avg=3390.92, stdev=1888.31
lat (usec): min=213, max=9713, avg=3391.02, stdev=1888.31
clat percentiles (usec):
| 1.00th=[ 548], 5.00th=[ 764], 10.00th=[ 924], 20.00th=[ 1272],
| 30.00th=[ 1832], 40.00th=[ 2608], 50.00th=[ 3376], 60.00th=[ 4128],
| 70.00th=[ 4896], 80.00th=[ 5408], 90.00th=[ 5856], 95.00th=[ 6176],
| 99.00th=[ 6752], 99.50th=[ 6944], 99.90th=[ 7584], 99.95th=[ 7968],
| 99.99th=[ 8640]
bw (KB /s): min= 4, max=40031, per=1.56%, avg=37670.97,
stdev=2609.28
lat (usec) : 250=0.01%, 500=0.59%, 750=4.10%, 1000=7.60%
lat (msec) : 2=20.12%, 4=25.78%, 10=41.81%
cpu : usr=0.07%, sys=0.29%, ctx=3980959, majf=0, minf=1920
IO depths : 1=117.2%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%,
>=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%,
>=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%,
>=64=0.0%
issued : total=r=3396390/w=0/d=0, short=r=0/w=0/d=0,
drop=r=0/w=0/d=0
latency : target=0, window=0, percentile=100.00%, depth=1
sequential-write-128k: (groupid=3, jobs=64): err= 0: pid=22153: Thu Apr
16 18:56:01 2015
write: io=337530MB, bw=1875.9MB/s, iops=15000, runt=180008msec
clat (usec): min=52, max=25625, avg=4258.89, stdev=2628.32
lat (usec): min=52, max=25625, avg=4259.05, stdev=2628.31
clat percentiles (usec):
| 1.00th=[ 76], 5.00th=[ 314], 10.00th=[ 780], 20.00th=[ 1608],
| 30.00th=[ 2576], 40.00th=[ 3440], 50.00th=[ 4256], 60.00th=[ 5024],
| 70.00th=[ 5792], 80.00th=[ 6624], 90.00th=[ 7584], 95.00th=[ 8256],
| 99.00th=[ 9792], 99.50th=[12352], 99.90th=[18304], 99.95th=[19584],
| 99.99th=[21888]
bw (KB /s): min= 4, max=32572, per=1.56%, avg=29945.45,
stdev=1770.29
lat (usec) : 100=2.65%, 250=1.83%, 500=2.21%, 750=2.89%, 1000=3.12%
lat (msec) : 2=11.60%, 4=22.73%, 10=52.03%, 20=0.89%, 50=0.04%
cpu : usr=0.25%, sys=0.22%, ctx=3151153, majf=0, minf=0
IO depths : 1=116.7%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%,
>=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%,
>=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%,
>=64=0.0%
issued : total=r=0/w=2700240/d=0, short=r=0/w=0/d=0,
drop=r=0/w=0/d=0
latency : target=0, window=0, percentile=100.00%, depth=1
Run status group 0 (all jobs):
READ: io=381205MB, aggrb=2117.9MB/s, minb=2117.9MB/s,
maxb=2117.9MB/s, mint=180000msec, maxt=180000msec
Run status group 1 (all jobs):
WRITE: io=301353MB, aggrb=1674.2MB/s, minb=1674.2MB/s,
maxb=1674.2MB/s, mint=180000msec, maxt=180000msec
Run status group 2 (all jobs):
READ: io=424549MB, aggrb=2358.6MB/s, minb=2358.6MB/s,
maxb=2358.6MB/s, mint=180006msec, maxt=180006msec
Run status group 3 (all jobs):
WRITE: io=337530MB, aggrb=1875.9MB/s, minb=1875.9MB/s,
maxb=1875.9MB/s, mint=180008msec, maxt=180008msec
Disk stats (read/write):
nvme7n1: ios=118062872/92590109, merge=0/0, ticks=26066308/26109972,
in_queue=55971668, util=100.00%
FIO Ergebnisse für RAID-0 über all 8 NVMe Karten
================================================
bvr-sql18:/home/oberstet/scm/scratchbox/freebsd/cruncher # mdadm
--detail /dev/md0
/dev/md0:
Version : 1.2
Creation Time : Thu Apr 16 19:31:34 2015
Raid Level : raid0
Array Size : 15627067392 (14903.13 GiB 16002.12 GB)
Raid Devices : 8
Total Devices : 8
Persistence : Superblock is persistent
Update Time : Thu Apr 16 19:31:34 2015
State : clean
Active Devices : 8
Working Devices : 8
Failed Devices : 0
Spare Devices : 0
Chunk Size : 256K
Name : bvr-sql18:0 (local to host bvr-sql18)
UUID : 7280eb72:929d263f:4604a091:3fe38c91
Events : 0
Number Major Minor RaidDevice State
0 259 0 0 active sync /dev/nvme0n1
1 259 1 1 active sync /dev/nvme1n1
2 259 2 2 active sync /dev/nvme2n1
3 259 3 3 active sync /dev/nvme3n1
4 259 4 4 active sync /dev/nvme4n1
5 259 5 5 active sync /dev/nvme5n1
6 259 6 6 active sync /dev/nvme6n1
7 259 7 7 active sync /dev/nvme7n1
bvr-sql18:~ # fio --filename /dev/md0 control2.fio
random-read-4k: (g=0): rw=randread, bs=4K-4K/4K-4K/4K-4K,
ioengine=libaio, iodepth=32
...
random-write-4k: (g=1): rw=randwrite, bs=4K-4K/4K-4K/4K-4K,
ioengine=libaio, iodepth=32
...
sequential-read-128k: (g=2): rw=read, bs=128K-128K/128K-128K/128K-128K,
ioengine=libaio, iodepth=32
...
sequential-write-128k: (g=3): rw=write,
bs=128K-128K/128K-128K/128K-128K, ioengine=libaio, iodepth=32
...
fio-2.2.6
Starting 256 threads
random-read-4k: (groupid=0, jobs=64): err= 0: pid=31860: Thu Apr 16
20:22:41 2015
read : io=620514MB, bw=10341MB/s, iops=2647.4K, runt= 60005msec
slat (usec): min=1, max=29604K, avg=22.55, stdev=6631.04
clat (usec): min=0, max=37611K, avg=717.16, stdev=34318.22
lat (usec): min=8, max=37611K, avg=739.82, stdev=35080.01
clat percentiles (usec):
| 1.00th=[ 318], 5.00th=[ 374], 10.00th=[ 406], 20.00th=[ 442],
| 30.00th=[ 466], 40.00th=[ 486], 50.00th=[ 510], 60.00th=[ 532],
| 70.00th=[ 556], 80.00th=[ 588], 90.00th=[ 636], 95.00th=[ 692],
| 99.00th=[ 1048], 99.50th=[16512], 99.90th=[16768], 99.95th=[24448],
| 99.99th=[28544]
bw (KB /s): min= 0, max=341456, per=1.78%, avg=188940.35,
stdev=43167.55
lat (usec) : 2=0.01%, 4=0.01%, 10=0.01%, 20=0.01%, 50=0.01%
lat (usec) : 100=0.01%, 250=0.05%, 500=45.93%, 750=51.16%, 1000=1.81%
lat (msec) : 2=0.18%, 4=0.02%, 10=0.04%, 20=0.72%, 50=0.09%
lat (msec) : 100=0.01%, 250=0.01%, 500=0.01%, 2000=0.01%, >=2000=0.01%
cpu : usr=3.42%, sys=68.44%, ctx=63458, majf=0, minf=18647
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=100.0%,
>=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%,
>=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%,
>=64=0.0%
issued : total=r=158851602/w=0/d=0, short=r=0/w=0/d=0,
drop=r=0/w=0/d=0
latency : target=0, window=0, percentile=100.00%, depth=32
random-write-4k: (groupid=1, jobs=64): err= 0: pid=31935: Thu Apr 16
20:22:41 2015
write: io=560381MB, bw=9336.8MB/s, iops=2390.3K, runt= 60019msec
slat (usec): min=1, max=867805, avg=21.12, stdev=515.87
clat (usec): min=0, max=892513, avg=833.67, stdev=5252.13
lat (usec): min=8, max=892527, avg=854.96, stdev=5280.37
clat percentiles (usec):
| 1.00th=[ 34], 5.00th=[ 251], 10.00th=[ 414], 20.00th=[ 462],
| 30.00th=[ 494], 40.00th=[ 516], 50.00th=[ 532], 60.00th=[ 556],
| 70.00th=[ 572], 80.00th=[ 596], 90.00th=[ 660], 95.00th=[ 756],
| 99.00th=[16512], 99.50th=[16512], 99.90th=[24448], 99.95th=[29568],
| 99.99th=[171008]
bw (KB /s): min= 5, max=275816, per=1.59%, avg=151755.38,
stdev=43952.51
lat (usec) : 2=0.01%, 4=0.01%, 10=0.03%, 20=0.43%, 50=0.99%
lat (usec) : 100=1.12%, 250=2.41%, 500=27.32%, 750=62.61%, 1000=2.59%
lat (msec) : 2=0.56%, 4=0.24%, 10=0.18%, 20=1.34%, 50=0.14%
lat (msec) : 100=0.01%, 250=0.01%, 500=0.01%, 750=0.01%, 1000=0.01%
cpu : usr=5.60%, sys=61.80%, ctx=1940986, majf=0, minf=29148
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=100.0%,
>=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%,
>=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%,
>=64=0.0%
issued : total=r=0/w=143457578/d=0, short=r=0/w=0/d=0,
drop=r=0/w=0/d=0
latency : target=0, window=0, percentile=100.00%, depth=32
sequential-read-128k: (groupid=2, jobs=64): err= 0: pid=32020: Thu Apr
16 20:22:41 2015
read : io=758042MB, bw=12611MB/s, iops=100885, runt= 60111msec
slat (usec): min=4, max=9475, avg=25.05, stdev=34.23
clat (usec): min=0, max=1609.2K, avg=20264.21, stdev=88933.36
lat (usec): min=264, max=1609.2K, avg=20289.38, stdev=88933.66
clat percentiles (usec):
| 1.00th=[ 442], 5.00th=[ 596], 10.00th=[ 716], 20.00th=[ 924],
| 30.00th=[ 1144], 40.00th=[ 1416], 50.00th=[ 1976], 60.00th=[ 4512],
| 70.00th=[11328], 80.00th=[21120], 90.00th=[34048], 95.00th=[60672],
| 99.00th=[288768], 99.50th=[602112], 99.90th=[1351680],
99.95th=[1449984],
| 99.99th=[1548288]
bw (KB /s): min=11636, max=1197056, per=1.57%, avg=202645.32,
stdev=164421.79
lat (usec) : 2=0.01%, 4=0.01%, 20=0.01%, 50=0.01%, 100=0.01%
lat (usec) : 250=0.01%, 500=2.18%, 750=9.42%, 1000=12.10%
lat (msec) : 2=26.54%, 4=8.62%, 10=9.60%, 20=10.52%, 50=14.83%
lat (msec) : 100=3.39%, 250=1.66%, 500=0.50%, 750=0.16%, 1000=0.15%
lat (msec) : 2000=0.32%
cpu : usr=0.33%, sys=5.35%, ctx=5444833, majf=0, minf=699080
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=100.0%,
>=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%,
>=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%,
>=64=0.0%
issued : total=r=6064339/w=0/d=0, short=r=0/w=0/d=0,
drop=r=0/w=0/d=0
latency : target=0, window=0, percentile=100.00%, depth=32
sequential-write-128k: (groupid=3, jobs=64): err= 0: pid=32084: Thu Apr
16 20:22:41 2015
write: io=841679MB, bw=13993MB/s, iops=111944, runt= 60150msec
slat (usec): min=4, max=6855, avg=23.85, stdev=25.17
clat (usec): min=0, max=901229, avg=18237.98, stdev=63454.35
lat (usec): min=53, max=901246, avg=18262.01, stdev=63454.86
clat percentiles (usec):
| 1.00th=[ 55], 5.00th=[ 75], 10.00th=[ 98], 20.00th=[ 155],
| 30.00th=[ 251], 40.00th=[ 446], 50.00th=[ 852], 60.00th=[ 1656],
| 70.00th=[ 4192], 80.00th=[13504], 90.00th=[34560], 95.00th=[86528],
| 99.00th=[350208], 99.50th=[432128], 99.90th=[749568],
99.95th=[790528],
| 99.99th=[856064]
bw (KB /s): min= 2981, max=971776, per=1.57%, avg=224887.01,
stdev=155980.09
lat (usec) : 2=0.01%, 4=0.01%, 10=0.01%, 20=0.01%, 50=0.09%
lat (usec) : 100=10.23%, 250=19.57%, 500=11.90%, 750=6.27%, 1000=4.41%
lat (msec) : 2=9.95%, 4=7.15%, 10=7.44%, 20=7.82%, 50=7.51%
lat (msec) : 100=3.25%, 250=2.55%, 500=1.52%, 750=0.25%, 1000=0.09%
cpu : usr=3.89%, sys=5.37%, ctx=5690696, majf=0, minf=336258
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=100.0%,
>=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%,
>=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%,
>=64=0.0%
issued : total=r=0/w=6733432/d=0, short=r=0/w=0/d=0,
drop=r=0/w=0/d=0
latency : target=0, window=0, percentile=100.00%, depth=32
Run status group 0 (all jobs):
READ: io=620514MB, aggrb=10341MB/s, minb=10341MB/s, maxb=10341MB/s,
mint=60005msec, maxt=60005msec
Run status group 1 (all jobs):
WRITE: io=560381MB, aggrb=9336.8MB/s, minb=9336.8MB/s,
maxb=9336.8MB/s, mint=60019msec, maxt=60019msec
Run status group 2 (all jobs):
READ: io=758042MB, aggrb=12611MB/s, minb=12611MB/s, maxb=12611MB/s,
mint=60111msec, maxt=60111msec
Run status group 3 (all jobs):
WRITE: io=841679MB, aggrb=13993MB/s, minb=13993MB/s, maxb=13993MB/s,
mint=60150msec, maxt=60150msec
Disk stats (read/write):
md0: ios=164916853/150191138, merge=0/0, ticks=0/0, in_queue=0,
util=0.00%, aggrios=20614611/18773876, aggrmerge=0/0,
aggrticks=16975295/18070966, aggrin_queue=35215176, aggrutil=99.55%
nvme0n1: ios=20613912/18773078, merge=0/0, ticks=85132920/4244536,
in_queue=89694924, util=96.14%
nvme1n1: ios=20610214/18771214, merge=0/0, ticks=11432236/7672636,
in_queue=19248200, util=98.60%
nvme2n1: ios=20618968/18778997, merge=0/0, ticks=2527348/3999968,
in_queue=6608776, util=96.34%
nvme3n1: ios=20616191/18777118, merge=0/0, ticks=8099032/85948268,
in_queue=94354244, util=99.55%
nvme4n1: ios=20610527/18772284, merge=0/0, ticks=11455980/5902592,
in_queue=17468636, util=98.45%
nvme5n1: ios=20620340/18777158, merge=0/0, ticks=2224840/3858944,
in_queue=6176544, util=98.01%
nvme6n1: ios=20615107/18768738, merge=0/0, ticks=5476296/7946824,
in_queue=13518332, util=97.72%
nvme7n1: ios=20611629/18772423, merge=0/0, ticks=9453712/24993964,
in_queue=34651756, util=98.89%
bvr-sql18:~ #
More information about the freebsd-hackers
mailing list