Qlogic fibre channel support questions
Bruce Evans
bde at zeta.org.au
Tue Feb 28 03:56:04 PST 2006
On Tue, 28 Feb 2006, Danny Braniss wrote:
>> On Mon, 27 Feb 2006, Matthew Jacob wrote:
>>
>>> Okay- let me ask why diskless booting doesn't work for you?
>>
>> Because NFS is slow. A local disk (or a SAN-attached disk, which is
>> essentially the same to FreeBSD) is going to be faster than NFS, no matter what.
>
> don't be too hasty with conclusions :-)
> as to speed, it all depends, especially on how deep your pockets are.
> i've been running several 'benchmarks' lately and disk speed is not
> everything.
>
> sample:
> host is a Sun Fire X4200 (dual dual-core Opteron) with SAS disks
> OS is FreeBSD 6.1-PRERELEASE amd64.
>
> make buildworld:
> diskless: 40m16.71s real 54m18.55s user 17m54.69s sys
> (using only 1 server*)
> nondiskless: 20m51.58s real 51m13.19s user 12m59.84s sys
> " but /usr/obj is iSCSI:
> 28m23.29s real 52m17.27s user 14m23.06s sys
> " but /usr/src and /usr/obj is iSCSI:
> 20m38.20s real 52m10.19s user 14m48.74s sys
> diskless but /usr/src and /usr/obj is iSCSI:
> 20m22.66s real 50m56.14s user 13m8.20s sys
>
> *: the server in this case is a Xeon running in 64-bit mode, but with
> not-very-fast ethernet: em0 at 1Gbps running at about 50% efficiency.
> this server will 'make buildworld' in about 40 min. using the onboard
> LSILogic v3 MegaRAID RAID0.
I recently tried to use 1Gbps ethernet more (instead of 100Mbps) and
hoped to get better makeworld performance, but actually got less. The
problem seems to be just that nfs3 does too many attribute cache
refreshes, so although all the data fits in the VMIO cache there is a
lot of network activity, and 1Gbps ends up slower because my 1Gbps
NICs have slightly higher latency than my 100Mbps NICs. The 100Mbps
ones are fxp's with a ping latency of about 100us; the 1Gbps ones are
a bge and an sk with a ping latency of 140us. I think these latencies
are lower than average, but they are still too large for good
makeworld-over-nfs performance. makeworld generates about 2000 (or is
it 5000?) packets/second, and waiting just 40us longer for each of
2000 replies per second wastes 8% of the wall clock, or about 120
seconds of the total buildworld time.
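To sanity-check that arithmetic, here is a throwaway program; the
2000/s, 40us and ~1500s figures are taken straight from the paragraph
above, nothing in it is measured:

/* Back-of-the-envelope check of the latency penalty claimed above. */
#include <stdio.h>

int
main(void)
{
	double replies_per_sec = 2000.0;   /* nfs replies awaited per second */
	double extra_latency = 40e-6;      /* 140us - 100us, per reply */
	double build_time = 1500.0;        /* rough makeworld real time (s) */

	/* Fraction of each wall-clock second spent in the extra waits. */
	double stall = replies_per_sec * extra_latency;

	printf("time lost waiting: %.0f%% of wall clock, ~%.0f s of a %.0f s build\n",
	    stall * 100.0, stall * build_time, build_time);
	return (0);
}

It prints 8% and ~120 s, matching the claim.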
The faster NICs are better for bandwidth: I get a max of 40MB/s
for read/write using tcp and about 25MB/s using udp. tcp is apparently
faster because the latency is so bad that streaming in tcp reduces its
effects significantly. However, using tcp for makeworld is a pessimization.
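A crude stop-and-wait model shows why streaming wins for bulk I/O.
This is only a sketch: the 8k block size is an assumed rsize for
illustration, not something from the measurements, and both ceilings
ignore all protocol overhead:

/* Why one-request-at-a-time rpc loses to streaming at high latency. */
#include <stdio.h>

int
main(void)
{
	double rtt = 140e-6;       /* measured ping latency (s) */
	double wire = 1e9 / 8.0;   /* 1Gbps in bytes/s */
	double block = 8192.0;     /* assumed nfs rsize in bytes */

	/* Stop-and-wait pays one rtt per block on top of the wire time. */
	double per_block = block / wire + rtt;

	printf("stop-and-wait ceiling: %.1f MB/s\n", block / per_block / 1e6);
	printf("streaming ceiling:     %.1f MB/s\n", wire / 1e6);
	return (0);
}

Even the ideal stop-and-wait ceiling (~40MB/s with these assumptions)
is far below the wire rate, so a transport that keeps requests in
flight has a lot of room to win on bulk transfers.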
All systems are 2-3GHz AthlonXPs with only 33MHz PCI buses, running a
2-year-old version of FreeBSD-current with local optimizations, with
/usr (including /usr/src) nfs3-mounted and with local object and root
trees (initially empty). "world" here is actually only about 95% of
the world.
100Mbps:
--------
31532 maximum resident set size
2626 average shared memory size
1762 average unshared data size
128 average unshared stack size
15521898 page reclaims
14904 page faults
0 swaps
1932 block input operations <--- few of these, since bins and srcs come over nfs
11822 block output operations <--- it's not disk-bound
1883576 messages sent
1883480 messages received
33448 signals received
2104163 voluntary context switches
472277 involuntary context switches
1Gbps/tcp:
-----------
1930.89 real 1222.87 user 184.10 sys <--- way slower (real)
1Gbps/udp:
-----------
1909.86 real 1225.25 user 181.22 sys
mostly local disks (except /usr, not including /usr/src):
---------------------------------------------------------
1476.58 real 1224.70 user 161.30 sys <---
This is almost a properly configured system, with disks fast enough
for real = user + sys + epsilon (epsilon here is about 90 seconds, or
6% of real).
1Gbps/udp + the best tuning/hacking I could find (a userland sketch
for the first knob follows the numbers below):
nfs access cache timeout 2 -> 60 (probably wrong for general use)
sk interrupt moderation 100 -> 10 (reduces latency)
don't zap the attribute cache on open in nfs (probably a bug for
general use; a PR says that the zap should always be skipped for ro
mounts)
----------------------------------------------------------------------------
1630.86 real 1227.86 user 175.09 sys
...
1342791 messages sent <--- tuning seems to work mainly by reducing
1343111 messages received <--- these; they are still large
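For reference, the first of those knobs can be poked from userland.
This is a minimal sketch, assuming the nfs client exposes the
vfs.nfs.access_cache_timeout sysctl; the sk moderation change and the
open() zap change were source hacks, so they have no equivalent here:

/* Minimal sketch: bump the nfs access cache timeout (needs root).
 * Assumes the vfs.nfs.access_cache_timeout sysctl is present. */
#include <sys/types.h>
#include <sys/sysctl.h>

#include <err.h>
#include <stdio.h>

int
main(void)
{
	int oval, nval = 60;	/* 2 -> 60, as in the list above */
	size_t olen = sizeof(oval);

	if (sysctlbyname("vfs.nfs.access_cache_timeout",
	    &oval, &olen, NULL, 0) == -1)
		err(1, "get vfs.nfs.access_cache_timeout");
	if (sysctlbyname("vfs.nfs.access_cache_timeout",
	    NULL, NULL, &nval, sizeof(nval)) == -1)
		err(1, "set vfs.nfs.access_cache_timeout");
	printf("access cache timeout: %d -> %d\n", oval, nval);
	return (0);
}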
1Gbps/udp + the best tuning I could find:
nfs access timeout 2 -> 60
sk interrupt moderation 100 -> 10
no zapping of attribute cache on open in nfs
-j4
-----------------------------------------------------------
1599.74 real 1276.18 user 262.04 sys
...
1727832 messages sent
1726818 messages received
-j<ANY> is normally bad for UP (uniprocessor) systems, but here it
helps by using cycles that would otherwise be idle waiting for nfs
replies: sys time goes up (175 -> 262 seconds), yet real time still
drops by about 31 seconds.
Bruce