From: Rick Macklem
Date: Mon, 6 Jan 2025 05:53:38 -0800
Subject: Re: system stalled, no I/O but 100% CPU from nfs
To: "Peter 'PMc' Much"
Cc: freebsd-fs@freebsd.org
List-Id: Filesystems
List-Archive: https://lists.freebsd.org/archives/freebsd-fs

On Sun, Jan 5, 2025 at 8:45 PM Peter 'PMc' Much wrote:
>
> Cheers,
>
> This doesn't look good. It goes on for hours. What can be done about it?
> (13.4 client & server)
>
>
> 44 processes:  4 running, 39 sleeping, 1 waiting
> CPU:  0.4% user,  0.0% nice, 99.6% system,  0.0% interrupt,  0.0% idle
> Mem: 21M Active, 198M Inact, 1190M Wired, 278M Buf, 3356M Free
> ARC: 418M Total, 39M MFU, 327M MRU, 128K Anon, 7462K Header, 43M Other
>      332M Compressed, 804M Uncompressed, 2.42:1 Ratio
> Swap: 15G Total, 15G Free
>
>   PID USERNAME   THR  PRI  NICE   SIZE     RES STATE     TIME     WCPU COMMAND
>   417 root         4   52     0    12M   2148K RUN      20:55   99.12% nfscbd
Do you have delegations enabled on your server
(vfs.nfsd.issue_delegations not 0)?
(If you do not, I have no idea why the server would be doing callbacks,
which is what nfscbd handles.)

Also, "nfsstat -m" on the client shows you/us what your mount options are.

>     0 root        65  -16     -     0B   1040K swapin    0:17    0.64% kernel
> 11054 root         1   52     0    18M   7664K RUN       0:04    0.10% bsdtar
>    11 root        15  -56     -     0B    240K WAIT      0:15    0.05% intr
>    16 root         1  -16     -     0B     16K -         0:01    0.03% racctd
> 11062 root         1   20     0    14M   3804K RUN       0:00    0.03% top
>     7 root         3  -16     -     0B     48K psleep    0:00    0.01% pagedaemon
> 11056 root         1   20     0    21M     10M select    0:00    0.01% sshd
>     6 root         1  -16     -     0B     16K -         0:00    0.01% rand_harvest
>
>
> Interface           Traffic               Peak                Total
>    vtnet0  in      5.380 KB/s          9.113 KB/s          781.439 MB
>            out     4.012 KB/s          8.002 KB/s          674.294 MB
>
>
> # nfsstat -zc > /dev/null ; sleep 1 ; nfsstat -c
Adding -E makes it show all RPC counts. (Without -E you just get the
"old Sun compatible" output.)

> Rpc Counts:
>   Getattr   Setattr    Lookup  Readlink      Read     Write    Create    Remove
>         1         2         5         0         0         0         0         0
>    Rename      Link   Symlink     Mkdir     Rmdir   Readdir  RdirPlus    Access
>         0         0         0         0         0         1         0         1
>     Mknod    Fsstat    Fsinfo  PathConf    Commit
>         0         0         0         0         0
> Rpc Info:
>  TimedOut   Invalid X Replies   Retries  Requests
>         0         0         0         0        11
> Cache Info:
>    Attr Hits  Attr Misses    Lkup Hits  Lkup Misses    BioR Hits  BioR Misses    BioW Hits  BioW Misses
>           11            1            2            5            0            0            0            0
>   BioRL Hits BioRL Misses    BioD Hits  BioD Misses    DirE Hits  DirE Misses    Accs Hits  Accs Misses
>            0            0            1            1            1            0            8            1
>

The above suggests that there is still some activity on the client,
but the info. is limited.

If the client is still in this state, you can collect more info via:
# tcpdump -s 0 -w out.pcap host
run for a little while. The out.pcap file needs to be looked at in
wireshark (tcpdump is useless at decoding NFS). If there is nothing
secret in it, you can email it to me as an attachment, so I can take
a look.
# ps axHl
done repeatedly gets a lot more info about the NFS related threads.
(I'll admit I doubt the info is useful for this case?)
# nfsstat -E -c -z
repeatedly as above.

If you just want to get rid of the mount
# umount -N
should work, although it can take a couple of minutes.

Either not running "nfscbd" on the client or disabling delegations
by setting vfs.nfsd.issue_delegations=0 on the server (assuming you
have them enabled), might/should avoid the problem.

rick
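
For the "done repeatedly" steps above, a small sh helper can do the
looping. This is only a sketch: the function name "sample" and its
COUNT/INTERVAL arguments are made up for illustration and are not part
of any FreeBSD tool. It just runs whatever command it is given a fixed
number of times with a timestamp before each run.

```shell
#!/bin/sh
# sample COUNT INTERVAL CMD [ARGS...]
# Hypothetical helper (not a FreeBSD utility): run CMD COUNT times,
# sleeping INTERVAL seconds between runs, and print a timestamped
# separator before each run so the samples can be told apart later.
# Note: plain POSIX sh has no "local", so these variables are global.
sample() {
    count=$1
    interval=$2
    shift 2
    i=1
    while [ "$i" -le "$count" ]; do
        echo "=== run $i of $count: $(date '+%H:%M:%S') ==="
        "$@"
        # No need to sleep after the last run.
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
        i=$((i + 1))
    done
}
```

Something like "sample 10 5 ps axHl > ps.out" or
"sample 10 5 nfsstat -E -c -z > nfsstat.out" would then collect ten
samples five seconds apart (the counts, intervals and file names are
arbitrary). As I understand nfsstat(8), -z zeroes the counters after
printing them, so each nfsstat sample should show roughly the activity
since the previous one.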