From: bugzilla-noreply@freebsd.org
To: fs@FreeBSD.org
Subject: [Bug 276299] Write performance to NFS share is ~4x slower than on 13.2
Date: Sun, 14 Jan 2024 03:30:30 +0000
X-Bugzilla-Reason: AssignedTo
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: Base System
X-Bugzilla-Component: kern
X-Bugzilla-Version: 14.0-RELEASE
X-Bugzilla-Keywords: regression
X-Bugzilla-Severity: Affects Only Me
X-Bugzilla-Who: rmacklem@FreeBSD.org
X-Bugzilla-Status: New
X-Bugzilla-Priority: ---
X-Bugzilla-Assigned-To: fs@FreeBSD.org
X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/
List-Id: Filesystems
List-Archive: https://lists.freebsd.org/archives/freebsd-fs
Sender: owner-freebsd-fs@freebsd.org

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=276299

--- Comment #10 from Rick Macklem ---
By network fabric I mean everything from the TCP stack down, at both ends.

A problem can easily manifest itself only during writing. Writing to an NFS server generates very different traffic than reading from an NFS server. I am not saying that it is a network fabric problem, just that good read performance does not imply it is not a network fabric problem.

I once saw a case (where I worked as a sysadmin) where everything worked fine over NFS until one specific NFS RPC was done. That NFS RPC (and only that NFS RPC) would fail. It turned out to be a hardware bug in a network switch. Move the machine to a port on another switch and the problem went away. Move it back onto the problem switch and the issue showed up again. There were no other detectable problems with this switch and the manufacturer returned it after a maintenance cycle claiming it was fixed. It still had the problem, so it went in the trash. (It probably had a memory problem that flipped a bit for this specific case or some such.)

Two examples of how a network problem might affect NFS write performance, but not read performance:
Write requests are the only large RPC messages sent from client->server. With a 1Mbyte write size, each write results in about 700 1500byte TCP segments (for an ordinary ethernet packet size).
-> If the burst of 700 packets causes one to be dropped on the server (receive) end sometimes... (Found by seeing an improvement with a smaller wsize.)
-> If the client/sender has a TSO bug (the most common problem is mishandling a TSO segment that is slightly less than 64Kbytes). (Found by disabling TSO in the client. Disabling TSO also changes the timing of the TCP segments and this can sometimes avoid bugs.)

Have you tried a smaller rsize/wsize yet, as I suggested?

NFS traffic is also very different than typical TCP traffic.
For example, both 13.0 and 13.1 shipped with bugs in the TCP stack that affected the NFS server (intermittent hangs in these cases).

If it isn't a network fabric problem, it is probably something related to ZFS. I know nothing about ZFS, so I can't even suggest anything beyond "sync=disabled".

Since an NFS server uses both storage (hardware + ZFS) and networking, any breakage anywhere in these can cause a big performance hit. NFS itself just translates between the NFS RPC messages and VFS/VOP calls.

It is conceivable that some change in the NFS server is causing this, but these changes are few and others have not reported similar write performance problems for 14.0, so it seems unlikely.
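
As a rough check of the "about 700 segments" figure for a 1Mbyte write, here is a minimal Python sketch. It is only an estimate under stated assumptions: IPv4, a 1500 byte MTU, 20 byte IP and TCP headers, and 12 bytes of TCP timestamp options. It ignores the RPC/NFS header bytes, and the function name is made up for illustration.

```python
import math

def segments_per_write(wsize_bytes, mtu=1500, ip_header=20,
                       tcp_header=20, tcp_options=12):
    """Estimate how many full-size TCP segments one NFS write needs.

    Assumed values (not from the bug report): IPv4 header of 20 bytes,
    TCP header of 20 bytes, 12 bytes of TCP timestamp options.
    RPC/NFS header overhead is ignored.
    """
    # Payload bytes carried by each segment under the assumptions above.
    mss = mtu - ip_header - tcp_header - tcp_options
    return math.ceil(wsize_bytes / mss)

if __name__ == "__main__":
    # Compare the burst size for a 1Mbyte wsize against smaller ones.
    for wsize in (1024 * 1024, 256 * 1024, 64 * 1024, 32 * 1024):
        print(f"wsize={wsize:>8} bytes -> ~{segments_per_write(wsize)} segments")
```

A smaller wsize shrinks the back-to-back burst the server has to absorb, which is why trying one is a cheap test for the dropped-packet case.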