From: Rick Macklem <rmacklem@uoguelph.ca>
To: Adam Stylinski
CC: John, "freebsd-fs@freebsd.org"
Subject: Re: zfs/nfsd performance limiter
Date: Sun, 22 May 2022 22:26:07 +0000
List-Id: Filesystems
List-Archive: https://lists.freebsd.org/archives/freebsd-fs

Adam Stylinski wrote:
[stuff snipped]
>
> However, in general, RPC RTT will define how well NFS performs and not
> the I/O rate for a bulk file read/write.
Let's take this RPC RTT thing a step further...
- If I got the math right, at 40Gbps, 1Mbyte takes about 200usec on the wire.
Without readahead, the protocol looks like this:
Client                              Server (time going down the screen)
        small Read request --->
        <-- 1Mbyte reply
        small Read request -->
        <-- 1Mbyte reply
The 1Mbyte replies take 200usec on the wire.

Then suppose your ping time is 400usec (I see about 350usec on my little lan).
- The wire is only transferring data about half of the time, because the small
  request message takes almost as long as the 1Mbyte reply.

As you can see, readahead (where multiple reads are done concurrently)
is critical for this case. I have no idea how Linux decides to do readahead.
(FreeBSD defaults to 1 readahead, with a mount option that can increase
that.)

Now, net interfaces normally do interrupt moderation. This is done to
avoid an interrupt storm during bulk data transfer. However, interrupt
moderation results in interrupt delay for handling the small Read request
message.
--> Interrupt moderation can increase RPC RTT. Turning it off, if possible,
    might help.

So, ping the server from the client to see what your RTT roughly is.
Also, you could look at some traffic in wireshark, to see what readahead
is happening and what the RPC RTT is.
(You can capture with "tcpdump", but wireshark knows how to decode
NFS properly.)

As you can see, RPC traffic is very different from bulk data transfer.

rick

> Btw, writing is a very different story than reading, largely due to the need
> to commit data/metadata to stable storage while writing.
>
> I can't help w.r.t. ZFS nor high performance nets (my fastest is 1Gbps), rick
>
> > You mention iperf. Please post the options you used when invoking iperf and its output.
>
> Setting up the NFS client as a "server", since it seems that the
> terminology is a little bit flipped with iperf, here's the output:
>
> -----------------------------------------------------------
> Server listening on 5201 (test #1)
> -----------------------------------------------------------
> Accepted connection from 10.5.5.1, port 11534
> [  5] local 10.5.5.4 port 5201 connected to 10.5.5.1 port 43931
> [ ID] Interval           Transfer     Bitrate
> [  5]   0.00-1.00   sec  3.81 GBytes  32.7 Gbits/sec
> [  5]   1.00-2.00   sec  4.20 GBytes  36.1 Gbits/sec
> [  5]   2.00-3.00   sec  4.18 GBytes  35.9 Gbits/sec
> [  5]   3.00-4.00   sec  4.21 GBytes  36.1 Gbits/sec
> [  5]   4.00-5.00   sec  4.20 GBytes  36.1 Gbits/sec
> [  5]   5.00-6.00   sec  4.21 GBytes  36.2 Gbits/sec
> [  5]   6.00-7.00   sec  4.10 GBytes  35.2 Gbits/sec
> [  5]   7.00-8.00   sec  4.20 GBytes  36.1 Gbits/sec
> [  5]   8.00-9.00   sec  4.21 GBytes  36.1 Gbits/sec
> [  5]   9.00-10.00  sec  4.20 GBytes  36.1 Gbits/sec
> [  5]  10.00-10.00  sec  7.76 MBytes  35.3 Gbits/sec
> - - - - - - - - - - - - - - - - - - - - - - - - -
> [ ID] Interval           Transfer     Bitrate
> [  5]   0.00-10.00  sec  41.5 GBytes  35.7 Gbits/sec                  receiver
> -----------------------------------------------------------
> Server listening on 5201 (test #2)
> -----------------------------------------------------------
>
> On Sun, May 22, 2022 at 3:45 AM John wrote:
> >
> > ----- Adam Stylinski's Original Message -----
> > > Hello,
> > >
> > > I have two systems connected via ConnectX-3 Mellanox cards in ethernet
> > > mode. They have their MTUs maxed at 9000, their ring buffers maxed
> > > at 8192, and I can hit around 36 gbps with iperf.
> > >
> > > When using an NFS client (client = linux, server = freebsd), I see a
> > > maximum rate of around 20gbps. The test file is fully in ARC. The
> > > test is performed with an NFS mount nconnect=4 and an rsize/wsize of
> > > 1MB.
> > >
> > > Here's the flame graph of the kernel of the system in question, with
> > > idle stacks removed:
> > >
> > > https://gist.github.com/KungFuJesus/918c6dcf40ae07767d5382deafab3a52#file-nfs_fg-svg
> > >
> > > The longest function seems like maybe it's the ERMS-aware memcpy
> > > happening from the ARC? Is there maybe a missing fast path that could
> > > take fewer copies into the socket buffer?
> >
> > Hi Adam -
> >
> > Some items to look at and possibly include for more responses....
> >
> > - What is your server system? Make/model/ram/etc. What is your
> >   overall 'top' cpu utilization 'top -aH' ...
> >
> > - It looks like you're using a 40gb/s card. Posting the output of
> >   'ifconfig -vm' would provide additional information.
> >
> > - Are the interfaces running cleanly? 'netstat -i' is helpful.
> >
> > - Inspect 'netstat -s'. Duplicate pkts? Resends? Out-of-order?
> >
> > - Inspect 'netstat -m'. Denied? Delayed?
> >
> >
> > - You mention iperf. Please post the options you used when
> >   invoking iperf and its output.
> >
> > - You appear to be looking for throughput vs low-latency. Have
> >   you looked at window-size vs the amount of memory allocated to the
> >   streams? These values vary based on the bit-rate of the connection.
> >   TCP connections require outstanding un-ack'd data to be held.
> >   This affects the values below.
> >
> >
> > - What are your values for:
> >
> >   -- kern.ipc.maxsockbuf
> >   -- net.inet.tcp.sendbuf_max
> >   -- net.inet.tcp.recvbuf_max
> >
> >   -- net.inet.tcp.sendspace
> >   -- net.inet.tcp.recvspace
> >
> >   -- net.inet.tcp.delayed_ack
> >
> > - What threads/irq are allocated to your NIC? 'vmstat -i'
> >
> > - Are the above threads floating or mapped? 'cpuset -g ...'
> >
> > - Determine best settings for LRO/TSO for your card.
> >
> > - Disable nfs tcp drc
> >
> > - What is your atime setting?
> >
> >
> > If you really think you have a ZFS/Kernel issue, and your
> > data fits in cache, dump ZFS, create a memory backed file system
> > and repeat your tests. This will purge a large portion of your
> > graph. LRO/TSO changes may do so also.
> >
> > You also state you are using a Linux client. Are you using
> > the MLX affinity scripts, buffer sizing suggestions, etc, etc.?
> > Have you swapped the Linux system for a fbsd system?
> >
> > And as a final note, I regularly use Chelsio T62100 cards
> > in dual home and/or LACP environments in Supermicro boxes with 100's
> > of nfs boot (Bhyve, QEMU, and physical system) clients per server
> > with no network starvation or cpu bottlenecks. Clients boot, perform
> > their work, and then remotely request image rollback.
> >
> >
> > Hopefully the above will help and provide pointers.
> >
> > Cheers
> >
>
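
For anyone who wants to plug their own numbers into the RPC RTT arithmetic
near the top of this message, here is a rough sketch in Python. It is only a
back-of-the-envelope model, not a description of how any NFS client actually
schedules reads. The function names are made up for illustration, and the
model assumes the wire carries no reply data for a fixed "idle" time per RPC
(taken here as half of a 400usec ping, matching the "almost as long as the
1Mbyte reply" estimate above) and that concurrent reads overlap perfectly.

```python
def wire_time_s(nbytes: int, link_bps: float) -> float:
    """Seconds needed to put nbytes on a link running at link_bps bits/sec."""
    return nbytes * 8 / link_bps


def read_throughput_bps(rsize_bytes: int, link_bps: float,
                        idle_s: float, in_flight: int = 1) -> float:
    """Estimated read throughput in bits/sec.

    idle_s    -- assumed time per RPC during which no reply data is on the
                 wire (request latency, interrupt delay, etc.)
    in_flight -- number of reads outstanding at once (1 = no readahead)

    Assumption: outstanding reads overlap perfectly, so throughput scales
    with in_flight until the link rate caps it.
    """
    cycle_s = idle_s + wire_time_s(rsize_bytes, link_bps)
    return min(link_bps, in_flight * rsize_bytes * 8 / cycle_s)


if __name__ == "__main__":
    RSIZE = 1 << 20        # 1 Mbyte read size
    LINK = 40e9            # 40Gbps link
    IDLE = 400e-6 / 2      # assumed: half of a 400usec ping per request

    print(f"1Mbyte on the wire: {wire_time_s(RSIZE, LINK) * 1e6:.0f} usec")
    for n in (1, 2, 4):
        gbps = read_throughput_bps(RSIZE, LINK, IDLE, n) / 1e9
        print(f"{n} read(s) in flight: about {gbps:.1f} Gbps")
```

Replace IDLE with whatever your own ping or wireshark trace suggests. If the
idle time per RPC is closer to the full ping time, the single-read estimate
drops further, which is another way of seeing why readahead matters so much.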