From: bugzilla-noreply@freebsd.org
To: threads@FreeBSD.org
Subject: [Bug 261338] [PATCH] kernel panic "bad pte" on heavy CPU load on 12.2 and 12.3 (i386)
Date: Wed, 19 Jan 2022 16:15:23 +0000
List-Id: Threading
List-Archive: https://lists.freebsd.org/archives/freebsd-threads
X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=261338

            Bug ID: 261338
           Summary: [PATCH] kernel panic "bad pte" on heavy CPU load on 12.2
                    and 12.3 (i386)
           Product: Base System
           Version: 12.3-RELEASE
          Hardware: i386
                OS: Any
            Status: New
          Severity: Affects Some People
          Priority: ---
         Component: threads
          Assignee: threads@FreeBSD.org
          Reporter: thedix@yandex.ru

Created attachment 231160
  --> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=231160&action=edit
Panic screenshot

After updating to 12.2p12 and 12.3p1 I noticed a kernel panic under heavy
multi-core CPU load. One example of such a load is building the kernel with
multiple parallel jobs.

Affected systems:
- 12.2p12 i386
- 12.3p1 i386

12.X amd64 is not affected, and 13.0 is not affected at all.

Tested hardware:
- Virtual machine, 8 vCPU, 4 GB vRAM, under VMWare ESXi 6.7
- HP MicroServer Gen8, Intel Xeon E3-1265Lv2, 16 GB RAM
- PC, Intel Core i5-4690, 16 GB RAM

Steps to reproduce:

# cd /usr/src
# make -s -j`sysctl -n hw.ncpu` KERNCONF=GENERIC buildkernel

After some time the system hangs with a panic like:

TPTE at 0x2857f14  IS ZERO @ VA 247c5000
panic: bad pte
cpuid = 7
time = 1642334372
KDB: stack backtrace:
#0 0x10438ee at kdb_backtrace+0x4e
#1 0xffdb68 at vpanic+0x118
#2 0xffda44 at panic+0x14
#3 0x155b6d5 at pmap_remove_pages+0x5a5
#4 0x12fceb4 at vmspace_exit+0x94
#5 0xfbe0f3 at exit1+0x593
#6 0xfbdb52 at sys_sys_exit+0x12
#7 0x1561b79 at syscall+0x3e9
#8 0xffc033e7 at PTDpde+0x43ef

Additional stack info:

#0  0x00ffd9f6 in doadump () at /usr/src/sys/kern/kern_shutdown.c:370
370             savectx(&dumppcb);
(kgdb) #0  0x00ffd9f6 in doadump () at /usr/src/sys/kern/kern_shutdown.c:370
#1  0x00ffd831 in kern_reboot (howto=260)
    at /usr/src/sys/kern/kern_shutdown.c:452
#2  0x00ffdbbf in vpanic (fmt=0x15d448a "bad pte", ap=0x1ff80a10 "")
    at /usr/src/sys/kern/kern_shutdown.c:881
#3  0x00ffda44 in panic (fmt=0x15d448a "bad pte")
    at /usr/src/sys/kern/kern_shutdown.c:808
#4  0x0155b6d5 in pmap_remove_pages (pmap=0x22a0354c)
    at /usr/src/sys/i386/i386/pmap.c:4845
#5  0x012fceb4 in vmspace_exit (td=0x1bb57380) at /usr/src/sys/vm/vm_map.c:411
#6  0x00fbe0f3 in exit1 (td=0x1bb57380, rval=0, signo=0)
    at /usr/src/sys/kern/kern_exit.c:399
#7  0x00fbdb52 in sys_sys_exit (td=0x1bb57380, uap=0x1bb57604)
    at /usr/src/sys/kern/kern_exit.c:176
#8  0x01561b79 in syscall (frame=0x1ff80ba8)
    at src/sys/i386/i386/../../kern/subr_syscall.c:144
#9  0xffc033e7 in ?? ()
#10 0x00000033 in ?? ()

I did some research on the kernel code and found that the problem was
introduced by the recent changes to SMP processing in mp_x86.c:

https://github.com/freebsd/freebsd-src/commit/1820ca2154611d6f27ce5a5fdd561a16ac54fdd8#diff-b34ee41e14f87fb2b18fdf77337237f336830ae88aac2a02e1c32aa45e43b4de
https://reviews.freebsd.org/D33413

The problem is in the function smp_targeted_tlb_shootdown():

-       sched_pin();
+       KASSERT(curthread->td_pinned > 0, ("curthread not pinned"));

Under some circumstances the calling thread is not pinned when this function
runs, which later causes the PTE panic.

I recompiled the GENERIC kernel with the INVARIANTS option, added the function
name to the assertion text for additional information, and got an immediate
panic during boot (see the attached image panic_not_pinned.png).

So the fix is to revert this change:

-       KASSERT(curthread->td_pinned > 0, ("curthread not pinned"));
+       sched_pin();

I attached the patch mp_x86.c.patch to fix the problem. After recompiling the
kernel with this patch, I no longer see panics on either 12.2 or 12.3 during
subsequent kernel builds.
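
To illustrate why replacing sched_pin() with a KASSERT changes behavior on a kernel built without INVARIANTS, here is a minimal userspace sketch. It is a model only, not kernel code: struct model_thread, the model_* helpers, the two shootdown_* functions and the MODEL_INVARIANTS macro are all hypothetical names made up for this example. The only kernel facts it relies on are that sched_pin() raises the thread's td_pinned count and that KASSERT compiles to nothing without INVARIANTS.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel thread; only the pin count is modeled. */
struct model_thread {
	int td_pinned;
};

/* Model of sched_pin()/sched_unpin(): adjust the pin count. */
void model_sched_pin(struct model_thread *td) { td->td_pinned++; }
void model_sched_unpin(struct model_thread *td) { td->td_pinned--; }

/* In this model the scheduler may move a thread to another CPU only when unpinned. */
bool model_may_migrate(const struct model_thread *td)
{
	return td->td_pinned == 0;
}

/*
 * Variant that pins itself, as before the change.
 * Returns true if the thread could not migrate during the critical part.
 */
bool shootdown_self_pinning(struct model_thread *td)
{
	model_sched_pin(td);
	bool stayed_on_cpu = !model_may_migrate(td);
	model_sched_unpin(td);
	return stayed_on_cpu;
}

/*
 * Variant that only asserts, as after the change. Without MODEL_INVARIANTS
 * the check disappears and an unpinned caller goes unnoticed.
 */
bool shootdown_caller_pinned(struct model_thread *td)
{
#ifdef MODEL_INVARIANTS
	assert(td->td_pinned > 0);	/* stands in for the KASSERT */
#endif
	return !model_may_migrate(td);
}
```

For a thread with td_pinned == 0, shootdown_self_pinning() returns true while shootdown_caller_pinned() returns false, meaning the thread could have migrated in the middle of the operation. Compiling with -DMODEL_INVARIANTS makes the second call abort instead, which is analogous to the immediate boot panic seen with an INVARIANTS kernel.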