14.0-RELEASE and 15-CURRENT have high interrupt count/latency in Xen domu (xcp-ng host)

From: Brad Ackerman <brad_at_facefault.org>
Date: Sun, 10 Mar 2024 00:27:03 UTC
On an xcp-ng VM running 14.0-RELEASE (also tested on p5 and 15-CURRENT), top shows a lot of CPU time in interrupts (always a double-digit percentage, usually around 70% but it varies all over the place). Based on dtrace, the problem appears to be in ata_interrupt and I’d guess it’s mutex-related.

What else do I want to do to investigate this issue?

All of below is on an instance with 1GB RAM and 2 vCPUs; the host is xcp-ng 8.2 with all available patches.

dtrace -n '::*interrupt*:entry { self->ts = timestamp; } ::*interrupt*:return /self->ts/ { @[probefunc] = quantize(timestamp - self->ts); self->ts = 0; }'

run for one minute on an otherwise idle VM produces:

  zio_delay_interrupt
           value  ------------- Distribution ------------- count
             512 |                                         0
            1024 |@@@@@@@                                  2
            2048 |@@@                                      1
            4096 |                                         0
            8192 |@@@@@@@@@@@@@@@@@@@@                     6
           16384 |@@@@@@@@@@                               3
           32768 |                                         0

  ata_interrupt
           value  ------------- Distribution ------------- count
           65536 |                                         0
          131072 |@@@                                      3
          262144 |@@@@@@@@@@@@@@@@@                        19
          524288 |@@@@@@@@                                 9
         1048576 |@@@@@@@@@@@@                             13
         2097152 |                                         0

On CURRENT, that dtrace invocation returns nothing at all; dtrace -l shows an entry but not return for ata_interrupt so that appears to be an fbt issue. 

The disk does appear to be using the PV driver:

xbd0: 32768MB <Virtual Block Device> at device/vbd/768 on xenbusb_front0
xbd0: attaching as ada0
xbd0: features: write_barrier
xbd0: synchronize cache commands enabled.

There are two mutex warnings that show up on CURRENT, but they seem unrelated:

Invoking IPv6 network device address event may sleep with the following non-sleepable locks held:
exclusive sleep mutex xnrx_1 (netfront receive lock) r = 0 (0xfffffe00496d2d48) locked @ /usr/src/sys/dev/xen/netfront/netfront.c:679
stack backtrace:
#0 0xffffffff80bc9a35 at witness_debugger+0x65
#1 0xffffffff80bcab99 at witness_warn+0x3e9
#2 0xffffffff80d793c4 at in6_update_ifa+0xcb4
#3 0xffffffff80da4f80 at in6_ifadd+0x1e0
#4 0xffffffff80da16d9 at nd6_ra_input+0x1029
#5 0xffffffff80d73de0 at icmp6_input+0x870
#6 0xffffffff80d8c4ae at ip6_input+0xc8e
#7 0xffffffff80cadbbd at netisr_dispatch_src+0xad
#8 0xffffffff80c9065a at ether_demux+0x16a
#9 0xffffffff80c91d73 at ether_nh_input+0x403
#10 0xffffffff80cadbbd at netisr_dispatch_src+0xad
#11 0xffffffff80c90ab9 at ether_input+0xd9
#12 0xffffffff809c6ebc at xn_rxeof+0x5dc
#13 0xffffffff809c8238 at xn_intr+0x48
#14 0xffffffff80b0d556 at ithread_loop+0x266
#15 0xffffffff80b09972 at fork_exit+0x82
#16 0xffffffff8102e06e at fork_trampoline+0xe
lock order reversal: (sleepable after non-sleepable)
 1st 0xfffffe00496d2d48 xnrx_1 (netfront receive lock, sleep mutex) @ /usr/src/sys/dev/xen/netfront/netfront.c:679
 2nd 0xffffffff81abe830 in6_multi_sx (in6_multi_sx, sx) @ /usr/src/sys/netinet6/in6_mcast.c:1217
lock order netfront receive lock -> in6_multi_sx attempted at:
#0 0xffffffff80bc95f3 at witness_checkorder+0xb63
#1 0xffffffff80b5f0fd at _sx_xlock+0x5d
#2 0xffffffff80d81b11 at in6_joingroup+0x31
#3 0xffffffff80d7977d at in6_update_ifa+0x106d
#4 0xffffffff80da4f80 at in6_ifadd+0x1e0
#5 0xffffffff80da16d9 at nd6_ra_input+0x1029
#6 0xffffffff80d73de0 at icmp6_input+0x870
#7 0xffffffff80d8c4ae at ip6_input+0xc8e
#8 0xffffffff80cadbbd at netisr_dispatch_src+0xad
#9 0xffffffff80c9065a at ether_demux+0x16a
#10 0xffffffff80c91d73 at ether_nh_input+0x403
#11 0xffffffff80cadbbd at netisr_dispatch_src+0xad
#12 0xffffffff80c90ab9 at ether_input+0xd9
#13 0xffffffff809c6ebc at xn_rxeof+0x5dc
#14 0xffffffff809c8238 at xn_intr+0x48
#15 0xffffffff80b0d556 at ithread_loop+0x266
#16 0xffffffff80b09972 at fork_exit+0x82
#17 0xffffffff8102e06e at fork_trampoline+0xe

(The "non-sleepable locks held” backtrace shows up a second time but without a lock reversal.)

systat -vmstat on RELEASE:

    3 users    Load  0.13  0.19  0.22                  Mar 10 00:21:55
   Mem usage:  89%Phy 78%Kmem                           VN PAGER   SWAP PAGER
Mem:      REAL           VIRTUAL                        in   out     in   out
       Tot   Share     Tot    Share     Free   count
Act 47944K  20684K    406M   24480K     107M   pages
All 50736K  23368K    446M   57372K                       ioflt  Interrupts
Proc:                                                     cow      69 total
  r   p   d    s   w   Csw  Trp  Sys  Int  Sof  Flt       zfod        atkbd0 1
              31        91    2   67   36         1       ozfod       ata1 15
                                                         %ozfod       uhci0 23
 0.0%Sys  78.9%Intr  0.0%User  0.0%Nice 21.1%Idle         daefr    17 cpu0:xen
|    |    |    |    |    |    |    |    |    |    |       prcfr    17 cpu1:xen
+++++++++++++++++++++++++++++++++++++++                   totfr       cpu0:r
                                           dtbuf          react       cpu0:itlb
Namei     Name-cache   Dir-cache     62482 maxvn          pdwak     4 cpu0:b
   Calls    hits   %    hits   %     17168 numvn        5 pdpgs       cpu1:r
       7       7 100                 16002 frevn          intrn       cpu1:itlb
                                                     772M wire      6 cpu1:b
Disks  ada0   cd0 pass0                               13M act      13 xen_et0:c0
KB/t   0.00  0.00  0.00                               60M inact    11 xen_et0:c1
tps       0     0     0                                 0 laund       xenstore0
MB/s   0.00  0.00  0.00                              107M free        xbd0 2113
%busy     0     0     0                                 0 buf       1 xn0 2114
                                                                      xn0 2115
                                                                      xn0 2116
                                                                      xn0 2117

and on CURRENT:

    4 users    Load  0.15  0.12  0.16                  Mar 10 00:23:11
   Mem usage:  35%Phy  8%Kmem                           VN PAGER   SWAP PAGER
Mem:      REAL           VIRTUAL                        in   out     in   out
       Tot   Share     Tot    Share     Free   count
Act   103M  14608K    454M   16848K     626M   pages
All   105M  16908K    486M   49620K                       ioflt  Interrupts
Proc:                                                     cow     274 total
  r   p   d    s   w   Csw  Trp  Sys  Int  Sof  Flt       zfod        atkbd0 1
              29       272    1   49  137         1       ozfod       uhci0 23
                                                         %ozfod   114 cpu0:xen
 0.0%Sys  56.0%Intr  0.0%User  0.0%Nice 44.0%Idle         daefr    23 cpu1:xen
|    |    |    |    |    |    |    |    |    |    |       prcfr     2 cpu0:r
++++++++++++++++++++++++++++                              totfr       cpu0:itlb
                                         6 dtbuf          react    12 cpu0:b
Namei     Name-cache   Dir-cache     62833 maxvn          pdwak     1 cpu1:r
   Calls    hits   %    hits   %       920 numvn       10 pdpgs       cpu1:itlb
       7       7 100                   126 frevn          intrn    17 cpu1:b
                                                     217M wire    100 xen_et0:c0
Disks  ada0                                           31M act       4 xen_et0:c1
KB/t   0.00                                           84M inact       xenstore0
tps       0                                             0 laund       xbd0 2113
MB/s   0.00                                          626M free      1 xn0 2114
%busy     0                                           80M buf         xn0 2115
                                                                      xn0 2116
                                                                      xn0 2117

— 
Brad Ackerman
brad@facefault.org