From nobody Wed Aug 21 15:00:46 2024
From: bugzilla-noreply@freebsd.org
To: bugs@FreeBSD.org
Subject: [Bug 280978] Kernel panics with vfs.nfsd.enable_locallocks=1 and nfs clients doing hdf5 file operations
Date: Wed, 21 Aug 2024 15:00:46 +0000
List-Id: Bug reports
List-Archive: https://lists.freebsd.org/archives/freebsd-bugs
Sender: owner-freebsd-bugs@FreeBSD.org

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=280978

            Bug ID: 280978
           Summary: Kernel panics with vfs.nfsd.enable_locallocks=1 and nfs
                    clients doing hdf5 file operations
           Product: Base System
           Version: 15.0-CURRENT
          Hardware: amd64
                OS: Any
            Status: New
          Severity: Affects Some People
          Priority: ---
         Component: kern
          Assignee: bugs@FreeBSD.org
          Reporter: matthew.l.dailey@dartmouth.edu

With vfs.nfsd.enable_locallocks=1, kernel panics or hung nfs server (more
rarely) can be induced from Linux nfs clients doing hdf5 file operations. In
testing, this has also sometimes resulted in irrecoverable zpool corruption
due to (I assume) memory corruption.

In testing on hardware and VMs, we have induced these failures usually within
a few hours, but sometimes within several days to a week. We have replicated
this on 13.0 through 15.0-CURRENT (20240725-82283cad12a4-271360). With this
sysctl set to 0 (default), we are unable to replicate the issue, even after
several weeks of 24/7 hdf5 file operations.

Below is a backtrace from a recent panic on a test VM. Based on a suggestion
from a colleague, we are currently running a test with a VM on 15.0-CURRENT
(20240725-82283cad12a4-271360) with a single CPU, just to see if this has any
effect.

Thanks and please let me know what other information I can provide.

#0  __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:57
#1  doadump (textdump=textdump@entry=0)
    at /usr/src/sys/kern/kern_shutdown.c:404
#2  0xffffffff8049e0fa in db_dump (dummy=, dummy2=, dummy3=, dummy4=)
    at /usr/src/sys/ddb/db_command.c:596
#3  0xffffffff8049deed in db_command (last_cmdp=, cmd_table=, dopager=true)
    at /usr/src/sys/ddb/db_command.c:508
#4  0xffffffff8049dbad in db_command_loop ()
    at /usr/src/sys/ddb/db_command.c:555
#5  0xffffffff804a15f6 in db_trap (type=, code=)
    at /usr/src/sys/ddb/db_main.c:267
#6  0xffffffff80b9c49f in kdb_trap (type=type@entry=3, code=code@entry=0,
    tf=tf@entry=0xfffffe00da2dfce0) at /usr/src/sys/kern/subr_kdb.c:790
#7  0xffffffff81068479 in trap (frame=0xfffffe00da2dfce0)
    at /usr/src/sys/amd64/amd64/trap.c:606
#8
#9  kdb_enter (why=, msg=) at /usr/src/sys/kern/subr_kdb.c:556
#10 0xffffffff80b4cd40 in vpanic (fmt=0xffffffff811db2b0 "%s",
    ap=ap@entry=0xfffffe00da2dff10) at /usr/src/sys/kern/kern_shutdown.c:967
#11 0xffffffff80b4cbc3 in panic (
    fmt=0xffffffff81b98380 "N\235\024\201\377\377\377\377")
    at /usr/src/sys/kern/kern_shutdown.c:892
#12 0xffffffff81068edb in trap_fatal (frame=0xfffffe00da2e0010, eva=7)
    at /usr/src/sys/amd64/amd64/trap.c:950
#13 0xffffffff81068f80 in trap_pfault (frame=, usermode=false,
    signo=0x0, ucode=) at /usr/src/sys/amd64/amd64/trap.c:758
#14
#15 0xffffffff80a42cc7 in nfsrv_freelockowner (
    stp=stp@entry=0xfffff8000459a600, vp=vp@entry=0x0,
    cansleep=cansleep@entry=0, p=p@entry=0xfffff8002716d740)
    at /usr/src/sys/fs/nfsserver/nfs_nfsdstate.c:1637
#16 0xffffffff80a45865 in nfsrv_freestateid (nd=nd@entry=0xfffffe00da2e0428,
    stateidp=stateidp@entry=0xfffffe00da2e0180, p=p@entry=0xfffff8002716d740)
    at /usr/src/sys/fs/nfsserver/nfs_nfsdstate.c:6651
#17 0xffffffff80a558b1 in nfsrvd_freestateid (nd=0xfffffe00da2e0428,
    isdgram=, vp=, exp=)
    at /usr/src/sys/fs/nfsserver/nfs_nfsdserv.c:4775
#18 0xffffffff80a35ecf in nfsrvd_compound (nd=0xfffffe00da2e0428, isdgram=0,
    tag=, taglen=0, minorvers=)
    at /usr/src/sys/fs/nfsserver/nfs_nfsdsocket.c:1338
#19 nfsrvd_dorpc (nd=nd@entry=0xfffffe00da2e0428, isdgram=isdgram@entry=0,
    tag=, taglen=0, minorvers=)
    at /usr/src/sys/fs/nfsserver/nfs_nfsdsocket.c:633
#20 0xffffffff80a4b608 in nfs_proc (nd=0xfffffe00da2e0428, xid=,
    xprt=0xfffff8000a8a6400, rpp=)
    at /usr/src/sys/fs/nfsserver/nfs_nfsdkrpc.c:474
#21 nfssvc_program (rqst=0xfffff80027e51800, xprt=0xfffff8000a8a6400)
    at /usr/src/sys/fs/nfsserver/nfs_nfsdkrpc.c:358
#22 0xffffffff80e62dd0 in svc_executereq (rqstp=0xfffff80027e51800)
    at /usr/src/sys/rpc/svc.c:1053
#23 svc_run_internal (grp=grp@entry=0xfffff80003201100,
    ismaster=ismaster@entry=1) at /usr/src/sys/rpc/svc.c:1329
#24 0xffffffff80e6220f in svc_run (pool=0xfffff80003201000)
    at /usr/src/sys/rpc/svc.c:1408
#25 0xffffffff80a4bd73 in nfsrvd_nfsd (td=td@entry=0xfffff8002716d740,
    args=args@entry=0xfffffe00da2e09a0)
    at /usr/src/sys/fs/nfsserver/nfs_nfsdkrpc.c:641
#26 0xffffffff80a66ada in nfssvc_nfsd (td=0xfffff8002716d740, uap=)
    at /usr/src/sys/fs/nfsserver/nfs_nfsdport.c:3877
#27 0xffffffff80dcc11c in sys_nfssvc (td=, uap=)
    at /usr/src/sys/nfs/nfs_nfssvc.c:107
#28 0xffffffff81069888 in syscallenter (td=0xfffff8002716d740)
    at /usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:189
#29 amd64_syscall (td=0xfffff8002716d740, traced=0)
    at /usr/src/sys/amd64/amd64/trap.c:1192
#30
#31 0x00002687a11d1eda in ?? ()
Backtrace stopped: Cannot access memory at address 0x26879f30f8f8

-- 
You are receiving this mail because:
You are the assignee for the bug.