slow writes on nfs with bge devices
Max Laier
max at love2party.net
Sun Jan 21 07:09:45 UTC 2007
On Sunday 21 January 2007 07:25, Bruce Evans wrote:
> nfs writes much less well with bge NICs than with other NICs (sk, fxp,
Do you use hardware checksumming on the bge? There is an XXX in
bge_start_locked() that looks a bit suspicious to me.
> xl, even rl). Sometimes writing a 20K source file from vi seems to
> take about 2 seconds instead of seeming to be instantaneous (this gets
> faster as the system warms up). Iozone shows the problem more
> reproducibly. E.g.:
>
> 100Mbps fxp server -> 1Gbps bge 5701 client, udp:
> %%%
> IOZONE: Performance Test of Sequential File I/O -- V1.16 (10/28/92)
> By Bill Norcott
>
> Operating System: FreeBSD -- using fsync()
>
> IOZONE: auto-test mode
>
> MB reclen bytes/sec written bytes/sec read
> 1 512 1516885 291918639
> 1 1024 1158783 491354263
> 1 2048 1573651 715694105
> 1 4096 1223692 917431957
> 1 8192 729513 1097929467
> 2 512 1694809 281196631
> 2 1024 1379228 507917189
> 2 2048 1659521 789608264
> 2 4096 4606056 1064567574
> 2 8192 1142288 1318131028
> 4 512 1242214 298269971
> 4 1024 1853545 492110628
> 4 2048 2120136 742888430
> 4 4096 1896792 1121799065
> 4 8192 850210 1441812403
> 8 512 1563847 281422325
> 8 1024 1480844 492749552
> 8 2048 1658649 850165954
> 8 4096 2105283 1211348180
> 8 8192 2098425 1554875506
> 16 512 1508821 296842294
> 16 1024 1966239 527850530
> 16 2048 2036609 842656736
> 16 4096 1666138 1200594889
> 16 8192 2293378 1620824908
> Completed series of tests
> %%%
>
> Here bge barely reaches 10Mbps speeds (~1.2 MB/S) for writing. Reading
> is cached well and fast. 100Mbps xl on the same client with the same
> server goes at full 100Mbps speed (11.77 MB/S for all file sizes
> including larger ones since the disk is not the limit at 100Mbps).
> 1Gbps sk on a different client with the same server goes at full
> 100Mbps speed.
>
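For reference, a rough conversion of the iozone write figures above into link-rate terms. This is only a sketch: it treats iozone's bytes/sec as raw payload throughput, ignores all protocol overhead, and takes 1 Mbps as 10**6 bits/sec. The helper name to_mbps is made up for illustration.

```python
def to_mbps(bytes_per_sec):
    """Convert a bytes/sec figure to megabits per second (10**6 bits)."""
    return bytes_per_sec * 8 / 1e6

# Write rates (bytes/sec) for the 1 MB runs in the first iozone table.
writes_1mb = [1516885, 1158783, 1573651, 1223692, 729513]

for rate in writes_1mb:
    print(f"{rate:>9} bytes/sec = {to_mbps(rate):5.2f} Mbps")

# The xl figure quoted above (11.77 MB/S), taken as 11.77 * 10**6 bytes/sec.
xl_mbps = to_mbps(11.77e6)
print(f"xl: {xl_mbps:.2f} Mbps")
```

Under these assumptions the bge write rates land in the region of a 10Mbps link, while the xl figure sits close to the 100Mbps line rate.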
> Switching to tcp gives full 100 Mbps speed. However, when the bge link
> speed is reduced to 100Mbps, udp becomes about 10 times slower than the
> above and tcp becomes about as slow as the above (maybe a bit faster,
> but far below 11.77 MB/S).
>
> bge is also slow at nfs serving:
>
> 1Gbps bge 5701 server -> 1Gbps sk client:
> %%%
>
> IOZONE: Performance Test of Sequential File I/O -- V1.16 (10/28/92)
> By Bill Norcott
>
> Operating System: FreeBSD -- using fsync()
>
> IOZONE: auto-test mode
>
> MB reclen bytes/sec written bytes/sec read
> 1 512 36255350 242114472
> 1 1024 3051699 413319147
> 1 2048 22406458 632021710
> 1 4096 22447700 851162198
> 1 8192 3522493 1047562648
> 2 512 3270779 48125247
> 2 1024 28992179 46693718
> 2 2048 5956380 753318255
> 2 4096 27616650 1053311658
> 2 8192 5573338 48290208
> 4 512 9004770 47435659
> 4 1024 9576276 45601645
> 4 2048 30348874 85116667
> 4 4096 8635673 86150049
> 4 8192 9356773 47100031
> 8 512 9762446 46424146
> 8 1024 10054027 58344604
> 8 2048 9197430 60253061
> 8 4096 15934077 59476759
> 8 8192 8765470 47647937
> 16 512 5670225 46239891
> 16 1024 9425169 45950990
> 16 2048 9833515 46242945
> 16 4096 14812057 51313693
> 16 8192 9203742 47648722
> Completed series of tests
> %%%
>
> Now the available bandwidth is 10 times larger and about 9/10 of it is
> still not used, with a high variance. For larger files, the variance
> is lower and the average speed is about 10MB/S. The disk can only do
> about 40MB/S and the slowest of the 1Gbps NICS (sk) can only sustain
> 80MB/S through udp and about 50MB/S through tcp (it is limited by the
> 33 MHz 32-bit PCI bus and by being less smart than the bge interface).
> When the bge NIC was on the system which is now the server with the fxp
> NIC, bge and nfs worked unsurprisingly, just slower than I would have
> liked. The write speed was 20-30MB/S for large files and 30-40MB/S for
> medium-sized files, with low variance. This is the only configuration
> in which nfs/bge worked as expected.
>
> The problem is very old and not very hardware dependent. Similar
> behaviour happens when some of the following are changed:
>
> OS -> FreeBSD-~5.2 or FreeBSD-6
> hardware -> newer amd64 CPU (Turion X2) with 5705 (iozone output for
> this below) instead of old amd64 CPU with 5701. The newer amd64
> normally runs an i386-SMP current kernel while the old amd64 was
> running an amd64-UP current kernel in the above tests, but normally
> runs ~5.2 amd64-UP and behaves similarly with that. The combination
> that seemed to work right was an AthlonXP for the server with the same
> 5701 and any kernel. The only strangeness with that was that current
> kernels gave a 5-10% slower nfs server despite giving a 30-90% larger
> packet rate for small packets.
>
> IOZONE: Performance Test of Sequential File I/O -- V1.16 (10/28/92)
> By Bill Norcott
>
> Operating System: FreeBSD -- using fsync()
>
> 100Mbps fxp server -> 1Gbps bge 5705 client:
> %%%
> IOZONE: auto-test mode
>
> MB reclen bytes/sec written bytes/sec read
> 1 512 2994400 185462027
> 1 1024 3074084 337817536
> 1 2048 2991691 576792985
> 1 4096 3074759 884740798
> 1 8192 3078019 1176892296
> 2 512 4262096 186709962
> 2 1024 2994468 339893080
> 2 2048 5112176 584846610
> 2 4096 4754187 909815165
> 2 8192 5100574 1212919611
> 4 512 5298715 187129017
> 4 1024 5302620 344445041
> 4 2048 4985597 590579630
> 4 4096 3703618 927711124
> 4 8192 5236177 1240896243
> 8 512 5142274 186899396
> 8 1024 6207933 345564808
> 8 2048 6162773 593088329
> 8 4096 6031445 936751120
> 8 8192 6072523 1224102288
> 16 512 5427113 186797193
> 16 1024 5065901 345544445
> 16 2048 5462338 595487384
> 16 4096 5256552 937013065
> 16 8192 5097101 1226320870
> Completed series of tests
> %%%
>
> rl on a system with 1/20 as much CPU is faster than this.
>
> The problem doesn't seem to affect much besides writes on nfs. The
> bge 5701 works very well for most things. It has a much better bus
> interface than the 5705 and works even better after moving it to the
> old amd64 system (it can now saturate 1Gbps where on the AthlonXP it
> only got 3/4 of the way, while the 5705 only gets 1/4 of the way).
> I've been working on minimising network latency and maximising packet
> rate, and normally have very low network latency (60-80 uS for ping)
> and fairly high packet rates. The changes for this are not the cause
> of the bug :-), since the behaviour is not affected by running kernels
> without these changes or by sysctl'ing the changes to be null.
> However, the problem looks like ones caused by large latencies combined
> with non-streaming protocols. To write at just 11.77 MB/S, at least
> 8000 packets/second must be sent from the client to the server. Working
> clients sustain this rate, but for broken clients the rate is much lower
> and not sustained:
>
> Output from netstat -s 1 on server while writing a ~1GB file via
> 5701/udp:
> %%%
> input (Total) output
> packets errs bytes packets errs bytes colls
> 900 0 1513334 142 0 33532 0
> 1509 0 2564836 236 0 57368 0
> 1647 0 2295802 259 0 51106 0
> 1603 0 1502736 252 0 32926 0
> 1055 0 637014 163 0 13938 0
> 558 0 1542510 86 0 34340 0
> 984 0 989854 155 0 21816 0
> 864 0 1320786 135 0 38152 0
> 883 0 1558060 165 0 34340 0
> 1177 0 3780102 203 0 85850 0
> 2087 0 954212 331 0 21210 0
> 1187 0 1413568 190 0 31310 0
> 650 0 3320604 101 0 75346 0
> 1565 0 1706542 246 0 37976 0
> 2055 0 2360620 329 0 52318 0
> 1554 0 2416996 244 0 54226 0
> 1402 0 2579894 220 0 58176 0
> 1690 0 774488 267 0 16968 0
> 1323 0 3690650 209 0 83830 0
> 591 0 4519858 92 0 103110 0
> %%%
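A back-of-the-envelope check of the packet-rate figure against the 5701/udp sample above. This is a sketch under stated assumptions: every packet is a full 1500-byte frame carrying 1480 bytes of IP payload (IPv4 header without options), UDP and RPC headers are ignored, and 11.77 MB/S is taken as 11.77 * 10**6 bytes/sec. The function name min_packets_per_sec is hypothetical.

```python
import math

MTU = 1500       # bytes, as stated in the mail
IP_HEADER = 20   # bytes, IPv4 header without options (assumption)

def min_packets_per_sec(bytes_per_sec, mtu=MTU, ip_header=IP_HEADER):
    """Lower bound on packets/sec needed to carry bytes_per_sec of payload."""
    return math.ceil(bytes_per_sec / (mtu - ip_header))

target = min_packets_per_sec(11.77e6)

# Input packets/sec seen on the server for 5701/udp (first column above).
observed = [900, 1509, 1647, 1603, 1055, 558, 984, 864, 883, 1177,
            2087, 1187, 650, 1565, 2055, 1554, 1402, 1690, 1323, 591]
mean_observed = sum(observed) / len(observed)

print(f"needed: at least {target} packets/sec")
print(f"observed mean: {mean_observed:.0f} packets/sec "
      f"({100 * mean_observed / target:.0f}% of the minimum)")
```

With these assumptions the minimum comes out just under 8000 packets/second, and the 5701 client averages well under a fifth of that.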
>
> There is no sign of any packet loss or switch problems. Forcing
> 1000baseTX full-duplex has no effect. Forcing 100baseTX full-duplex
> makes the problem more obvious. The mtu is 1500 throughout since
> only bge-5701 and sk support jumbo frames and I want to use udp for
> nfs.
>
> 5705/udp is better:
> %%%
> input (Total) output
> packets errs bytes packets errs bytes colls
> 5209 0 6607758 846 0 151702 0
> 4763 0 6684546 773 0 153520 0
> 4758 0 6618498 769 0 151298 0
> 3582 0 7057568 576 0 162498 0
> 4935 0 5115068 800 0 116756 0
> 4924 0 6622026 798 0 152802 0
> 4095 0 6018462 657 0 137450 0
> 4647 0 5270442 751 0 120594 0
> 4673 0 5451948 758 0 123624 0
> 2340 0 6001986 372 0 138168 0
> 3750 0 6150610 604 0 140996 0
> %%%
>
> sk/udp works right:
> %%%
> input (Total) output
> packets errs bytes packets errs bytes colls
> 8638 0 12384676 1440 0 293062 0
> 8636 0 12415646 1439 0 293708 0
> 8637 0 12415646 1441 0 293708 0
> 8637 0 12415646 1439 0 293708 0
> 8637 0 12417160 1440 0 293708 0
> 8636 0 12413162 1439 0 293506 0
> 8637 0 12414132 1439 0 293708 0
> 8636 0 12417160 1440 0 293708 0
> 8637 0 12415646 1439 0 293708 0
> 8636 0 12417160 1440 0 293708 0
> 8637 0 12414676 1439 0 293506 0
> %%%
>
> sk is under ~5.2 with latency/throughput/efficiency optimizations
> that don't have much effect here.
>
> Bruce
--
/"\ Best regards, | mlaier at freebsd.org
\ / Max Laier | ICQ #67774661
X http://pf4freebsd.love2party.net/ | mlaier at EFnet
/ \ ASCII Ribbon Campaign | Against HTML Mail and News