From nobody Sun Mar 10 00:27:03 2024 X-Original-To: freebsd-virtualization@mlmmj.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id 4Tsghj3lfVz5CnY5 for ; Sun, 10 Mar 2024 00:27:17 +0000 (UTC) (envelope-from brad@facefault.org) Received: from wfhigh4-smtp.messagingengine.com (wfhigh4-smtp.messagingengine.com [64.147.123.155]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 4Tsghh4mGqz4qLt for ; Sun, 10 Mar 2024 00:27:16 +0000 (UTC) (envelope-from brad@facefault.org) Authentication-Results: mx1.freebsd.org; dkim=pass header.d=facefault.org header.s=fm3 header.b=DfbaKEPV; dkim=pass header.d=messagingengine.com header.s=fm1 header.b=rBfAGG++; dmarc=none; spf=pass (mx1.freebsd.org: domain of brad@facefault.org designates 64.147.123.155 as permitted sender) smtp.mailfrom=brad@facefault.org Received: from compute5.internal (compute5.nyi.internal [10.202.2.45]) by mailfhigh.west.internal (Postfix) with ESMTP id 774CB180009A for ; Sat, 9 Mar 2024 19:27:15 -0500 (EST) Received: from mailfrontend1 ([10.202.2.162]) by compute5.internal (MEProxy); Sat, 09 Mar 2024 19:27:15 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=facefault.org; h=cc:content-transfer-encoding:content-type:content-type:date :date:from:from:in-reply-to:message-id:mime-version:reply-to :subject:subject:to:to; s=fm3; t=1710030434; x=1710116834; bh=CX 2uqAB6RYr7WtbSyP+k7o53JJMfFMvbLY0qDgUG6vY=; b=DfbaKEPVUN/b7XmHPc LH/ioBSOmnD7OjJbD9gYtmveu3zt/veOuu0OtqmzFwrjnFyz34hjSDtpfhbwraCu XvNAE9dQKn3i6DXVUip0QgA9R6OO20jYw6yxmWUBxELyUjh6eT3Q/Ah51TKfiM6r 8VFY2tz0fO0lMvhaGZIMlPjqaS8NwSUuvTBkDqd4yRpa5d9C/pJzRrD04/yv98bi yOgCO3tOCytIuyEUF+epGIPslCOE9MW9YAMISVcAdvueWzZDE2LFWtUtVgAtPhjo v9xKFXMRPq4sKPDrFwy0BQilF2Jk0dU9Fq1MiPScWsqJlr6QI4xaDb+QfzoSogox PnrQ== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-transfer-encoding:content-type :content-type:date:date:feedback-id:feedback-id:from:from :in-reply-to:message-id:mime-version:reply-to:subject:subject:to :to:x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm1; t=1710030434; x=1710116834; bh=CX2uqAB6RYr7WtbSyP+k7o53JJMf FMvbLY0qDgUG6vY=; b=rBfAGG++399L7wkn9AZInp1cGdcAiLx+Ojs90IU0l9XV xI2iUqOrE2v+a/wSQ5GV9eVEDVDfntUMRB2ynCdkZaZ+61i7D6drVXE/XyJrcZqB B8Yv9nyg2CA+j6Otu+LOcdhnD00OBuuIaCn9dcQxQJ2oNufFaGN5lbqb/vfeFPqp h+vTsiFJE6auM5R392Jx8yN95+XwbEzo+149nFdZPP+LeQYEzs6+diy9lR8WdR8c yGSfhd9Ly0vwWrExJ24G2KVpRJ4NYK2w+o1ikUjZpdbEKLYnAuWnuKTe/HMNzVIl vEzmj0Y7vHuXsifKDOzc4gfaHQULdYHgR4DqO+LHHQ== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvledrieekgddvvdcutefuodetggdotefrodftvf curfhrohhfihhlvgemucfhrghsthforghilhdpqfgfvfdpuffrtefokffrpgfnqfghnecu uegrihhlohhuthemuceftddtnecunecujfgurhephfgtgfgguffkfffvofesthhqmhdthh dtjeenucfhrhhomhepuehrrgguucettghkvghrmhgrnhcuoegsrhgrugesfhgrtggvfhgr uhhlthdrohhrgheqnecuggftrfgrthhtvghrnhepudfffefhjeeludekvdejffegjeekle euieejffetjedvueejvedufffhtdelhfevnecuvehluhhsthgvrhfuihiivgeptdenucfr rghrrghmpehmrghilhhfrhhomhepsghrrggusehfrggtvghfrghulhhtrdhorhhg X-ME-Proxy: Feedback-ID: ib09641b5:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA for ; Sat, 9 Mar 2024 19:27:14 -0500 (EST) From: Brad Ackerman Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable List-Id: Discussion List-Archive: https://lists.freebsd.org/archives/freebsd-virtualization List-Help: List-Post: List-Subscribe: List-Unsubscribe: Sender: owner-freebsd-virtualization@freebsd.org X-BeenThere: freebsd-virtualization@freebsd.org Mime-Version: 1.0 (Mac OS X Mail 16.0 \(3774.500.171.1.1\)) Subject: 14.0-RELEASE and 15-CURRENT have high interrupt count/latency in Xen domu (xcp-ng host) Message-Id: <92898E90-B3BA-4B89-8929-34F397BDE9BF@facefault.org> Date: Sat, 9 Mar 2024 16:27:03 -0800 To: freebsd-virtualization@freebsd.org X-Mailer: Apple Mail (2.3774.500.171.1.1) X-Spamd-Bar: --- X-Spamd-Result: default: False [-3.10 / 15.00]; NEURAL_HAM_MEDIUM(-1.00)[-1.000]; NEURAL_HAM_LONG(-1.00)[-1.000]; NEURAL_HAM_SHORT(-1.00)[-1.000]; MV_CASE(0.50)[]; R_SPF_ALLOW(-0.20)[+ip4:64.147.123.128/27]; R_DKIM_ALLOW(-0.20)[facefault.org:s=fm3,messagingengine.com:s=fm1]; RCVD_IN_DNSWL_LOW(-0.10)[64.147.123.155:from]; MIME_GOOD(-0.10)[text/plain]; RCVD_VIA_SMTP_AUTH(0.00)[]; RCPT_COUNT_ONE(0.00)[1]; ASN(0.00)[asn:29838, ipnet:64.147.123.0/24, country:US]; ARC_NA(0.00)[]; MIME_TRACE(0.00)[0:+]; FROM_HAS_DN(0.00)[]; MID_RHS_MATCH_FROM(0.00)[]; DWL_DNSWL_NONE(0.00)[messagingengine.com:dkim]; TO_MATCH_ENVRCPT_ALL(0.00)[]; TO_DN_NONE(0.00)[]; FROM_EQ_ENVFROM(0.00)[]; DMARC_NA(0.00)[facefault.org]; PREVIOUSLY_DELIVERED(0.00)[freebsd-virtualization@freebsd.org]; MLMMJ_DEST(0.00)[freebsd-virtualization@freebsd.org]; RCVD_TLS_LAST(0.00)[]; RCVD_COUNT_THREE(0.00)[3]; DKIM_TRACE(0.00)[facefault.org:+,messagingengine.com:+] X-Rspamd-Queue-Id: 4Tsghh4mGqz4qLt On an xcp-ng VM running 14.0-RELEASE (also tested on p5 and 15-CURRENT), = top shows a lot of CPU time in interrupts (always a double-digit = percentage, usually around 70% but it varies all over the place). Based = on dtrace, the problem appears to be in ata_interrupt and I=E2=80=99d = guess it=E2=80=99s mutex-related. What else do I want to do to investigate this issue? All of below is on an instance with 1GB RAM and 2 vCPUs; the host is = xcp-ng 8.2 with all available patches. dtrace -n '::*interrupt*:entry { self->ts =3D timestamp; } = ::*interrupt*:return /self->ts/ { @[probefunc] =3D quantize(timestamp - = self->ts); self->ts =3D 0; }' run for one minute on an otherwise idle VM produces: zio_delay_interrupt value ------------- Distribution ------------- count 512 | 0 1024 |@@@@@@@ 2 2048 |@@@ 1 4096 | 0 8192 |@@@@@@@@@@@@@@@@@@@@ 6 16384 |@@@@@@@@@@ 3 32768 | 0 ata_interrupt value ------------- Distribution ------------- count 65536 | 0 131072 |@@@ 3 262144 |@@@@@@@@@@@@@@@@@ 19 524288 |@@@@@@@@ 9 1048576 |@@@@@@@@@@@@ 13 2097152 | 0 On CURRENT, that dtrace invocation returns nothing at all; dtrace -l = shows an entry but not return for ata_interrupt so that appears to be an = fbt issue.=20 The disk does appear to be using the PV driver: xbd0: 32768MB at device/vbd/768 on xenbusb_front0 xbd0: attaching as ada0 xbd0: features: write_barrier xbd0: synchronize cache commands enabled. There are two mutex warnings that show up on CURRENT, but they seem = unrelated: Invoking IPv6 network device address event may sleep with the following = non-sleepable locks held: exclusive sleep mutex xnrx_1 (netfront receive lock) r =3D 0 = (0xfffffe00496d2d48) locked @ = /usr/src/sys/dev/xen/netfront/netfront.c:679 stack backtrace: #0 0xffffffff80bc9a35 at witness_debugger+0x65 #1 0xffffffff80bcab99 at witness_warn+0x3e9 #2 0xffffffff80d793c4 at in6_update_ifa+0xcb4 #3 0xffffffff80da4f80 at in6_ifadd+0x1e0 #4 0xffffffff80da16d9 at nd6_ra_input+0x1029 #5 0xffffffff80d73de0 at icmp6_input+0x870 #6 0xffffffff80d8c4ae at ip6_input+0xc8e #7 0xffffffff80cadbbd at netisr_dispatch_src+0xad #8 0xffffffff80c9065a at ether_demux+0x16a #9 0xffffffff80c91d73 at ether_nh_input+0x403 #10 0xffffffff80cadbbd at netisr_dispatch_src+0xad #11 0xffffffff80c90ab9 at ether_input+0xd9 #12 0xffffffff809c6ebc at xn_rxeof+0x5dc #13 0xffffffff809c8238 at xn_intr+0x48 #14 0xffffffff80b0d556 at ithread_loop+0x266 #15 0xffffffff80b09972 at fork_exit+0x82 #16 0xffffffff8102e06e at fork_trampoline+0xe lock order reversal: (sleepable after non-sleepable) 1st 0xfffffe00496d2d48 xnrx_1 (netfront receive lock, sleep mutex) @ = /usr/src/sys/dev/xen/netfront/netfront.c:679 2nd 0xffffffff81abe830 in6_multi_sx (in6_multi_sx, sx) @ = /usr/src/sys/netinet6/in6_mcast.c:1217 lock order netfront receive lock -> in6_multi_sx attempted at: #0 0xffffffff80bc95f3 at witness_checkorder+0xb63 #1 0xffffffff80b5f0fd at _sx_xlock+0x5d #2 0xffffffff80d81b11 at in6_joingroup+0x31 #3 0xffffffff80d7977d at in6_update_ifa+0x106d #4 0xffffffff80da4f80 at in6_ifadd+0x1e0 #5 0xffffffff80da16d9 at nd6_ra_input+0x1029 #6 0xffffffff80d73de0 at icmp6_input+0x870 #7 0xffffffff80d8c4ae at ip6_input+0xc8e #8 0xffffffff80cadbbd at netisr_dispatch_src+0xad #9 0xffffffff80c9065a at ether_demux+0x16a #10 0xffffffff80c91d73 at ether_nh_input+0x403 #11 0xffffffff80cadbbd at netisr_dispatch_src+0xad #12 0xffffffff80c90ab9 at ether_input+0xd9 #13 0xffffffff809c6ebc at xn_rxeof+0x5dc #14 0xffffffff809c8238 at xn_intr+0x48 #15 0xffffffff80b0d556 at ithread_loop+0x266 #16 0xffffffff80b09972 at fork_exit+0x82 #17 0xffffffff8102e06e at fork_trampoline+0xe (The "non-sleepable locks held=E2=80=9D backtrace shows up a second time = but without a lock reversal.) systat -vmstat on RELEASE: 3 users Load 0.13 0.19 0.22 Mar 10 00:21:55 Mem usage: 89%Phy 78%Kmem VN PAGER SWAP = PAGER Mem: REAL VIRTUAL in out in = out Tot Share Tot Share Free count Act 47944K 20684K 406M 24480K 107M pages All 50736K 23368K 446M 57372K ioflt = Interrupts Proc: cow 69 = total r p d s w Csw Trp Sys Int Sof Flt zfod = atkbd0 1 31 91 2 67 36 1 ozfod = ata1 15 %ozfod = uhci0 23 0.0%Sys 78.9%Intr 0.0%User 0.0%Nice 21.1%Idle daefr 17 = cpu0:xen | | | | | | | | | | | prcfr 17 = cpu1:xen +++++++++++++++++++++++++++++++++++++++ totfr = cpu0:r dtbuf react = cpu0:itlb Namei Name-cache Dir-cache 62482 maxvn pdwak 4 = cpu0:b Calls hits % hits % 17168 numvn 5 pdpgs = cpu1:r 7 7 100 16002 frevn intrn = cpu1:itlb 772M wire 6 = cpu1:b Disks ada0 cd0 pass0 13M act 13 = xen_et0:c0 KB/t 0.00 0.00 0.00 60M inact 11 = xen_et0:c1 tps 0 0 0 0 laund = xenstore0 MB/s 0.00 0.00 0.00 107M free = xbd0 2113 %busy 0 0 0 0 buf 1 = xn0 2114 = xn0 2115 = xn0 2116 = xn0 2117 and on CURRENT: 4 users Load 0.15 0.12 0.16 Mar 10 00:23:11 Mem usage: 35%Phy 8%Kmem VN PAGER SWAP = PAGER Mem: REAL VIRTUAL in out in = out Tot Share Tot Share Free count Act 103M 14608K 454M 16848K 626M pages All 105M 16908K 486M 49620K ioflt = Interrupts Proc: cow 274 = total r p d s w Csw Trp Sys Int Sof Flt zfod = atkbd0 1 29 272 1 49 137 1 ozfod = uhci0 23 %ozfod 114 = cpu0:xen 0.0%Sys 56.0%Intr 0.0%User 0.0%Nice 44.0%Idle daefr 23 = cpu1:xen | | | | | | | | | | | prcfr 2 = cpu0:r ++++++++++++++++++++++++++++ totfr = cpu0:itlb 6 dtbuf react 12 = cpu0:b Namei Name-cache Dir-cache 62833 maxvn pdwak 1 = cpu1:r Calls hits % hits % 920 numvn 10 pdpgs = cpu1:itlb 7 7 100 126 frevn intrn 17 = cpu1:b 217M wire 100 = xen_et0:c0 Disks ada0 31M act 4 = xen_et0:c1 KB/t 0.00 84M inact = xenstore0 tps 0 0 laund = xbd0 2113 MB/s 0.00 626M free 1 = xn0 2114 %busy 0 80M buf = xn0 2115 = xn0 2116 = xn0 2117 =E2=80=94=20 Brad Ackerman brad@facefault.org