Just joined the infiniband club
John Fleming
john at spikefishsolutions.com
Sun Sep 8 00:00:55 UTC 2019
Hi all, I've recently joined the club. I have two Dell R720s connected
directly to each other. The cards are ConnectX-4. I was having a lot
of problems with network drops. Where I'm at now: I'm running
FreeBSD 12-STABLE as of a week ago, the cards have been cross-flashed
with OEM firmware (these are Lenovo, I think), and I'm no longer getting
network drops. This box is basically my storage server. It's exporting
a RAID 10 ZFS volume to a Linux (compute 19.04 5.0.0-27-generic) box
which is running GNS3 for a lab.
So many questions.. sorry if this is a bit rambly!
From what I understand, this card is really 4 x 25 gig lanes. If I
understand that correctly, then a single data transfer should be able to
do at most 25 gig (best case), correct?
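A rough sketch of the lane arithmetic behind this question, assuming the link is 4x EDR (which the "Rate: 100" in the ibstat output further down suggests). The per-lane signalling rate and 64b/66b encoding are the standard EDR figures; the variable names are only for illustration.

```python
# Per-lane figures for InfiniBand EDR (assumed link type).
LANES = 4
SIGNALLING_GBAUD = 25.78125   # raw signalling rate per lane
ENCODING = 64 / 66            # 64b/66b line encoding

per_lane_gbit = SIGNALLING_GBAUD * ENCODING
link_gbit = per_lane_gbit * LANES

print(f"per lane: {per_lane_gbit:.2f} Gbit/s")
print(f"4x link:  {link_gbit:.2f} Gbit/s")
```

As far as I understand it, the lanes are striped at the physical layer, so a single transfer should not be pinned to one 25 gig lane.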
I'm not getting what the difference between connected mode and
datagram mode is. Does this have anything to do with the card
operating in InfiniBand mode vs Ethernet mode? FreeBSD is using the
modules compiled in connected mode with the shell script (which is really
a bash script, not a sh script) from the freebsd-infiniband page.
The Linux box complains if the MTU is over 2044, with a message about
expecting multicast drops or something like that, so the MTU on both
boxes is set to 2044.
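For what it's worth, 2044 looks like the usual IPoIB datagram-mode limit: a 2048-byte InfiniBand MTU minus the 4-byte IPoIB encapsulation header. A small sketch of that arithmetic, plus the packet rate it implies at a given throughput. The 2048-byte path MTU is an assumption, the 5.33 Gbit/s figure is taken from the first iperf run below, and the function name is made up for the example.

```python
IB_MTU = 2048        # bytes, a common InfiniBand path MTU (assumed here)
IPOIB_HEADER = 4     # bytes of IPoIB encapsulation header

ipoib_mtu = IB_MTU - IPOIB_HEADER


def packets_per_second(gbit_per_s, mtu_bytes):
    """Full-size packets per second needed to carry gbit_per_s of traffic."""
    return gbit_per_s * 1e9 / (mtu_bytes * 8)


print(ipoib_mtu)
print(round(packets_per_second(5.33, ipoib_mtu)))
```

At this MTU, even the current throughput already means a few hundred thousand packets per second, so per-packet overhead could be a factor.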
Everything I'm reading makes it sound like there is no RDMA support in
FreeBSD, or maybe that was no NFS RDMA support. Is that correct?
So far it seems like these cards struggle to fill a 10 gig pipe. Using
iperf (2) the best I'm getting is around 6 Gbit/s. Interfaces
aren't showing drops on either end. It doesn't seem to matter if I do 1,
2 or 4 threads on iperf.
Here is the card
mlx5_core0@pci0:66:0:0: class=0x020700 card=0x001415b3 chip=0x101315b3 rev=0x00 hdr=0x00
    vendor     = 'Mellanox Technologies'
    device     = 'MT27700 Family [ConnectX-4]'
    class      = network
This is an MCX456A (dual-port ConnectX-4 InfiniBand/Ethernet).
It should be in an x16 slot, but is it? Looking at pciconf I can't tell.
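The slot width matters because of PCIe bandwidth. A back-of-the-envelope sketch, assuming the R720 slots are PCIe 3.0 (8 GT/s per lane, 128b/130b encoding) and ignoring protocol overhead, so real numbers will be lower. The function name is only for illustration.

```python
GT_PER_S = 8.0         # PCIe 3.0 transfer rate per lane (assumed generation)
ENCODING = 128 / 130   # 128b/130b line encoding


def pcie_gbit(lanes):
    """Raw one-direction bandwidth in Gbit/s, before protocol overhead."""
    return GT_PER_S * ENCODING * lanes


for lanes in (4, 8, 16):
    print(f"x{lanes}: {pcie_gbit(lanes):.1f} Gbit/s")
```

If those assumptions hold, an x8 slot could not carry a full 100 Gbit/s port, but even x4 is well above the roughly 6 Gbit/s measured here.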
Dell R720 -
CPU E5-2670
ECC DDR-1600 128GB (16GB sticks in white slots)
Compute box (this one is definitely in a PCIe x16 slot):
Dell R720
CPU E5-2697
ECC DDR-1600 128GB (16GB sticks in white slots)
root@R720-Storage:/var/log # ibstat
CA 'mlx5_0'
        CA type: MT4115
        Number of ports: 1
        Firmware version: 12.25.1020
        Hardware version: 0
        Node GUID: 0x248a07030049f308
        System image GUID: 0x248a07030049f308
        Port 1:
                State: Active
                Physical state: LinkUp
                Rate: 100
                Base lid: 1
                LMC: 0
                SM lid: 1
                Capability mask: 0x2651e84a
                Port GUID: 0x248a07030049f308
                Link layer: InfiniBand
root@R720-Storage:/var/log # netstat -inb | egrep 'ib0|Name'
Name   Mtu Network       Address               Ipkts Ierrs Idrop       Ibytes     Opkts Oerrs       Obytes  Coll
ib0   2044 <Link#6>      00:00:00:85:fe:80 287483828     0     0 531774120120 330632289     1 401889930592     0
ib0      - 10.255.255.0/ 10.255.255.22     287483710     -     - 519124822036 330632186     - 393954749268     -
root@R720-Storage:/var/log #
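The counters above give a rough average packet size, which shows how close the traffic gets to the 2044-byte MTU. A small sketch using the Link#6 row; the variable names are only for illustration, and the averages cover everything since the counters were last reset, not just the iperf runs.

```python
# Link-level counters for ib0, copied from the netstat output above.
ipkts, ibytes = 287483828, 531774120120
opkts, obytes = 330632289, 401889930592

avg_in = ibytes / ipkts
avg_out = obytes / opkts

print(f"avg in:  {avg_in:.0f} bytes/packet")
print(f"avg out: {avg_out:.0f} bytes/packet")
```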
This is with nothing going on right now.
root@R720-Storage:/var/log # iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 64.0 KByte (default)
------------------------------------------------------------
[ 4] local 10.255.255.22 port 5001 connected with 10.255.255.55 port 56238
[ ID] Interval Transfer Bandwidth
[ 4] 0.0-10.0 sec 6.21 GBytes 5.33 Gbits/sec
root@compute720:~# iperf -c 10.255.255.22
------------------------------------------------------------
Client connecting to 10.255.255.22, TCP port 5001
TCP window size: 85.0 KByte (default)
------------------------------------------------------------
[ 3] local 10.255.255.55 port 56238 connected with 10.255.255.22 port 5001
[ ID] Interval Transfer Bandwidth
[ 3] 0.0-10.0 sec 6.21 GBytes 5.33 Gbits/sec
root@compute720:~#
Swapped (storage box as client, compute box as server):
root@R720-Storage:/var/log # iperf -c 10.255.255.55
------------------------------------------------------------
Client connecting to 10.255.255.55, TCP port 5001
TCP window size: 209 KByte (default)
------------------------------------------------------------
[ 3] local 10.255.255.22 port 46814 connected with 10.255.255.55 port 5001
[ ID] Interval Transfer Bandwidth
[ 3] 0.0-10.1 sec 3.77 GBytes 3.22 Gbits/sec
root@R720-Storage:/var/log #
root@compute720:~# iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 128 KByte (default)
------------------------------------------------------------
[ 4] local 10.255.255.55 port 5001 connected with 10.255.255.22 port 46814
[ ID] Interval Transfer Bandwidth
[ 4] 0.0-10.1 sec 3.77 GBytes 3.22 Gbits/sec
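As a sanity check on the units: iperf 2 appears to report transfer in binary GBytes (2^30 bytes) and bandwidth in decimal Gbits/sec. A sketch that recomputes the bandwidth from the transfer and interval of the two runs above. The function name and direction labels are mine, and small differences from the printed figures come from the rounded values iperf shows.

```python
def gbit_per_s(gbytes, seconds):
    """Convert an iperf transfer in GBytes (2**30 bytes) to Gbit/s (10**9 bits)."""
    return gbytes * 2**30 * 8 / seconds / 1e9


# (transfer in GBytes, interval in seconds) from the iperf output above.
runs = {
    "compute -> storage": (6.21, 10.0),
    "storage -> compute": (3.77, 10.1),
}

for name, (gbytes, seconds) in runs.items():
    print(f"{name}: {gbit_per_s(gbytes, seconds):.2f} Gbit/s")
```

Either way, sending from the storage box is noticeably slower than receiving on it.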