[Bug 281560] gve(4) uma deadlock during high tcp throughput
Date: Wed, 02 Oct 2024 23:17:37 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=281560

--- Comment #18 from shailend@google.com ---

(In reply to Konstantin Belousov from comment #14)

Although I do not have access to the VMs to do `show pcpu`, I checked my notes to find this `ps` entry:

100438 Run CPU 11 [gve0 txq 4 xmit]

The packet-transmitting thread is hogging the CPU and preventing iperf from ever running to release the UMA lock. The "gve0 txq 4 xmit" thread runs forever because it is waiting on the tx cleanup thread to make room on the ring, and that thread is not doing anything because it is waiting on the UMA zone lock.

I did another repro, and the situation is similar:

```
db> show lockchain 100416
thread 100416 (pid 0, gve0 rxq 0) is blocked on lock 0xfffffe00df57a3d0 (sleep mutex) "mbuf"
thread 100708 (pid 860, iperf) is on a run queue
db> show lockchain 100423
thread 100423 (pid 0, gve0 rxq 7) is blocked on lock 0xfffff8010447daa0 (rw) "tcpinp"
thread 100736 (pid 860, iperf) is blocked on lock 0xfffffe00df57a3d0 (sleep mutex) "mbuf"
thread 100708 (pid 860, iperf) is on a run queue
db> show lockchain 100452
thread 100452 (pid 0, gve0 txq 10) is blocked on lock 0xfffffe00df57a3d0 (sleep mutex) "mbuf"
thread 100708 (pid 860, iperf) is on a run queue
```

Here 100708 is the offending iperf thread. Let's see its state:

```
db> show thread 100708
Thread 100708 at 0xfffff800a86bd000:
 proc (pid 860): 0xfffffe01a439bac0
 name: iperf
 pcb: 0xfffff800a86bd520
 stack: 0xfffffe01a4dc1000-0xfffffe01a4dc4fff
 flags: 0x5 pflags: 0x100
 state: RUNQ
 priority: 4
 container lock: sched lock 31 (0xfffffe001bee8440)
 last voluntary switch: 11510.470 s ago
 last involuntary switch: 11510.470 s ago
```

And now let's see what's happening on CPU 31:

```
db> show pcpu 31
cpuid        = 31
dynamic pcpu = 0xfffffe009a579d80
curthread    = 0xfffff800a8501740: pid 0 tid 100453 critnest 0 "gve0 txq 10 xmit"
curpcb       = 0xfffff800a8501c60
fpcurthread  = none
idlethread   = 0xfffff80003b04000: tid 100034 "idle: cpu31"
self         = 0xffffffff8242f000
curpmap      = 0xffffffff81b79c50
tssp         = 0xffffffff8242f384
rsp0         = 0xfffffe01a4ca8000
kcr3         = 0xffffffffffffffff
ucr3         = 0xffffffffffffffff
scr3         = 0x0
gs32p        = 0xffffffff8242f404
ldt          = 0xffffffff8242f444
tss          = 0xffffffff8242f434
curvnet      = 0
spin locks held:
```

Sure enough, a driver transmit thread is hogging the CPU. And to close the loop, let's see what this queue's cleanup thread is doing:

```
db> show lockchain 100452
thread 100452 (pid 0, gve0 txq 10) is blocked on lock 0xfffffe00df57a3d0 (sleep mutex) "mbuf"
thread 100708 (pid 860, iperf) is on a run queue
```

In summary, this is the usual loop:

iperf thread (holding the UMA zone lock)
  ---sched---> gve tx xmit thread
  ---for room---> gve tx cleanup thread
  ---UMA zone lock---> iperf thread

There is clearly problematic behavior in the driver transmit thread (gve_xmit_br): this taskqueue should not re-enqueue itself; it should instead let the cleanup taskqueue wake it up when room is made on the ring, so I'll work on that. But I also want to confirm that it is not problematic for an iperf thread to be knocked off the CPU with the zone lock held: is that not a critical enough lock to disallow preemption? (I am not familiar enough with schedulers to know whether this is a naive question.)
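To make the intended fix concrete, here is a minimal sketch (not the actual gve(4) code) of a park/wake handshake between the xmit and cleanup tasks. All names in it are hypothetical: the struct gve_tx_ring fields (stopped, free_descs, xmit_tq, xmit_task) and the helper gve_tx_has_room() are stand-ins for whatever the driver actually uses.

```
#include <sys/param.h>
#include <sys/systm.h>
#include <sys/taskqueue.h>
#include <machine/atomic.h>

struct gve_tx_ring {				/* hypothetical, trimmed to essentials */
	struct taskqueue	*xmit_tq;	/* runs gve_xmit_br */
	struct task		 xmit_task;
	uint32_t		 stopped;	/* xmit parked, waiting for room */
	uint32_t		 free_descs;	/* descriptors available */
};

/* Hypothetical helper: true when the ring can take another packet. */
static bool
gve_tx_has_room(struct gve_tx_ring *tx)
{
	return (atomic_load_acq_32(&tx->free_descs) > 0);
}

static void
gve_xmit_br(void *arg, int pending __unused)
{
	struct gve_tx_ring *tx = arg;

	while (gve_tx_has_room(tx)) {
		/*
		 * ... dequeue the next mbuf and post it to the NIC;
		 * break out when the software queue is empty ...
		 */
	}

	/*
	 * Ring is full.  The buggy pattern is to taskqueue_enqueue()
	 * ourselves here and retry, which keeps this thread on the CPU
	 * and starves the preempted holder of the UMA zone lock.
	 * Instead, park and let the cleanup task requeue us.
	 */
	atomic_store_rel_32(&tx->stopped, 1);

	/* Recheck: cleanup may have freed space before we parked. */
	if (gve_tx_has_room(tx) && atomic_cmpset_32(&tx->stopped, 1, 0))
		taskqueue_enqueue(tx->xmit_tq, &tx->xmit_task);
}

static void
gve_tx_cleanup(void *arg, int pending __unused)
{
	struct gve_tx_ring *tx = arg;

	/* ... reap completions, free mbufs, increase free_descs ... */

	/* Requeue the xmit task only if it parked itself for lack of room. */
	if (gve_tx_has_room(tx) && atomic_cmpset_32(&tx->stopped, 1, 0))
		taskqueue_enqueue(tx->xmit_tq, &tx->xmit_task);
}
```

The cmpset on `stopped` lets exactly one side win when both observe room at the same time, so the task is requeued once rather than twice. With this handshake the xmit task goes idle instead of spinning on the CPU, the iperf thread can run and release the zone lock, and the lockchain shown above cannot form.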