Date: Tue, 5 Apr 2022 18:35:13 GMT
Message-Id: <202204051835.235IZDqG044169@gitrepo.freebsd.org>
To: src-committers@FreeBSD.org, dev-commits-src-all@FreeBSD.org, dev-commits-src-main@FreeBSD.org
From: Mitchell Horne
Subject: git: c9114f9f86f9 - main - Add new vnode dumper to support live minidumps
List-Id: Commit messages for the main branch of the src repository
List-Archive: https://lists.freebsd.org/archives/dev-commits-src-main
X-Git-Committer: mhorne
X-Git-Repository: src
X-Git-Refname: refs/heads/main
X-Git-Reftype: branch
X-Git-Commit: c9114f9f86f92742eacd1d802c34009a57e81055

The branch main has been updated by mhorne:

URL: https://cgit.FreeBSD.org/src/commit/?id=c9114f9f86f92742eacd1d802c34009a57e81055

commit c9114f9f86f92742eacd1d802c34009a57e81055
Author:     Mitchell Horne
AuthorDate: 2021-03-23 20:47:14 +0000
Commit:     Mitchell Horne
CommitDate: 2022-04-05 18:35:05 +0000

Add new vnode dumper to support live minidumps

This dumper can instantiate and write the dump's contents to a
file-backed vnode.

Unlike existing disk or network dumpers, the vnode dumper should not be
invoked during a system panic, and therefore is not added to the global
dumper_configs list. Instead, the vnode dumper is constructed ad-hoc
when a live dump is requested using the new ioctl on /dev/mem. This is
similar in spirit to a kgdb session against the live system via
/dev/mem.

As described briefly in the mem(4) man page, live dumps are not
guaranteed to result in a usable output file, but offer some debugging
value where forcefully panicking a system to dump its memory is not
desirable/feasible.

A future change to savecore(8) will add an option to save a live dump.

Reviewed by: markj, Pau Amma (manpages)
Discussed with: kib
MFC after: 3 weeks
Sponsored by: Juniper Networks, Inc.
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D33813
---
 share/man/man4/mem.4        |  62 ++++++++++++++
 sys/conf/files              |   1 +
 sys/dev/mem/memdev.c        |   6 ++
 sys/kern/kern_shutdown.c    |  14 ++-
 sys/kern/kern_vnodedumper.c | 202 ++++++++++++++++++++++++++++++++++++++++++++
 sys/sys/conf.h              |   1 +
 sys/sys/kerneldump.h        |   2 +
 sys/sys/memrange.h          |  10 +++
 8 files changed, 296 insertions(+), 2 deletions(-)

diff --git a/share/man/man4/mem.4 b/share/man/man4/mem.4
index f860df036428..6370d2a95525 100644
--- a/share/man/man4/mem.4
+++ b/share/man/man4/mem.4
@@ -202,6 +202,50 @@ to update an existing or establish a new range, or to
 .Dv MEMRANGE_SET_REMOVE
 to remove a range.
 .El
+.Ss Live Kernel Dumps
+.Pp
+The
+.Dv MEM_KERNELDUMP
+ioctl will initiate a kernel dump against the running system, the contents of
+which will be written to a process-owned file descriptor.
+The resulting dump output will be in minidump format.
+The request is described by
+.Bd -literal
+struct mem_livedump_arg {
+	int	fd;		/* input */
+	int	flags;		/* input */
+	uint8_t	compression;	/* input */
+};
+.Ed
+.Pp
+The
+.Va fd
+field is used to pass the file descriptor.
+.Pp
+The
+.Va flags
+field is currently unused and must be set to zero.
+.Pp
+The
+.Va compression
+field can be used to specify the desired compression to
+be applied to the dump output.
+The supported values are defined in
+.In sys/kerneldump.h ;
+that is,
+.Dv KERNELDUMP_COMP_NONE ,
+.Dv KERNELDUMP_COMP_GZIP ,
+or
+.Dv KERNELDUMP_COMP_ZSTD .
+.Pp
+Kernel dumps taken against the running system may have inconsistent kernel data
+structures due to allocation, deallocation, or modification of memory
+concurrent to the dump procedure.
+Thus, the resulting core dump is not guaranteed to be usable.
+A system under load is more likely to produce an inconsistent result.
+Despite this, live kernel dumps can be useful for offline debugging of certain
+types of kernel bugs, such as deadlocks, or in inspecting a particular part of
+the system's state.
 .Sh RETURN VALUES
 .Ss MEM_EXTRACT_PADDR
 The
@@ -229,6 +273,24 @@ base/length supplied.
 An attempt to remove a range failed because the range is permanently
 enabled.
 .El
+.Ss MEM_KERNELDUMP
+.Bl -tag -width Er
+.It Bq Er EOPNOTSUPP
+Kernel minidumps are not supported on this architecture.
+.It Bq Er EPERM
+An attempt to begin the kernel dump failed because the calling thread lacks the
+.Dv PRIV_KMEM_READ
+privilege.
+.It Bq Er EBADF
+The supplied file descriptor was invalid, or does not have write permission.
+.It Bq Er EBUSY
+An attempt to begin the kernel dump failed because one is already in progress.
+.It Bq Er EINVAL
+An invalid or unsupported value was specified in
+.Va flags .
+.It Bq Er EINVAL
+An invalid or unsupported compression type was specified.
+.El
 .Sh FILES
 .Bl -tag -width /dev/kmem -compact
 .It Pa /dev/mem
diff --git a/sys/conf/files b/sys/conf/files
index 57bd2693f532..9b907da0dd4b 100644
--- a/sys/conf/files
+++ b/sys/conf/files
@@ -3839,6 +3839,7 @@ kern/kern_tslog.c		optional tslog
 kern/kern_ubsan.c		optional kubsan
 kern/kern_umtx.c		standard
 kern/kern_uuid.c		standard
+kern/kern_vnodedumper.c		standard
 kern/kern_xxx.c			standard
 kern/link_elf.c			standard
 kern/linker_if.m		standard
diff --git a/sys/dev/mem/memdev.c b/sys/dev/mem/memdev.c
index f03550aaa495..7d33066f5678 100644
--- a/sys/dev/mem/memdev.c
+++ b/sys/dev/mem/memdev.c
@@ -35,6 +35,7 @@ __FBSDID("$FreeBSD$");
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -96,6 +97,7 @@ memioctl(struct cdev *dev, u_long cmd, caddr_t data, int flags,
 {
 	vm_map_t map;
 	vm_map_entry_t entry;
+	const struct mem_livedump_arg *marg;
 	struct mem_extract *me;
 	int error;
 
@@ -120,6 +122,10 @@ memioctl(struct cdev *dev, u_long cmd, caddr_t data, int flags,
 		}
 		vm_map_unlock_read(map);
 		break;
+	case MEM_KERNELDUMP:
+		marg = (const struct mem_livedump_arg *)data;
+		error = livedump_start(marg->fd, marg->flags, marg->compression);
+		break;
 	default:
 		error = memioctl_md(dev, cmd, data, flags, td);
 		break;
diff --git a/sys/kern/kern_shutdown.c b/sys/kern/kern_shutdown.c
index 7d0f913961cb..f7e72d53a566 100644
--- a/sys/kern/kern_shutdown.c
+++ b/sys/kern/kern_shutdown.c
@@ -390,6 +390,17 @@ print_uptime(void)
 	printf("%lds\n", (long)ts.tv_sec);
 }
 
+/*
+ * Set up a context that can be extracted from the dump.
+ */
+void
+dump_savectx(void)
+{
+
+	savectx(&dumppcb);
+	dumptid = curthread->td_tid;
+}
+
 int
 doadump(boolean_t textdump)
 {
@@ -402,8 +413,7 @@ doadump(boolean_t textdump)
 	if (TAILQ_EMPTY(&dumper_configs))
 		return (ENXIO);
 
-	savectx(&dumppcb);
-	dumptid = curthread->td_tid;
+	dump_savectx();
 	dumping++;
 
 	coredump = TRUE;
diff --git a/sys/kern/kern_vnodedumper.c b/sys/kern/kern_vnodedumper.c
new file mode 100644
index 000000000000..c8fdce5e550a
--- /dev/null
+++ b/sys/kern/kern_vnodedumper.c
@@ -0,0 +1,202 @@
+/*-
+ * Copyright (c) 2021-2022 Juniper Networks
+ *
+ * This software was developed by Mitchell Horne
+ * under sponsorship from Juniper Networks and Klara Systems.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
+ * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
+ * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
+ * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
+ * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
+ * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
+ * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+#include
+
+static dumper_start_t vnode_dumper_start;
+static dumper_t vnode_dump;
+static dumper_hdr_t vnode_write_headers;
+
+static struct sx livedump_sx;
+SX_SYSINIT(livedump, &livedump_sx, "Livedump sx");
+
+/*
+ * Invoke a live minidump on the system.
+ */
+int
+livedump_start(int fd, int flags, uint8_t compression)
+{
+#if MINIDUMP_PAGE_TRACKING == 1
+	struct dumperinfo di, *livedi;
+	struct diocskerneldump_arg kda;
+	struct vnode *vp;
+	struct file *fp;
+	void *rl_cookie;
+	int error;
+
+	error = priv_check(curthread, PRIV_KMEM_READ);
+	if (error != 0)
+		return (error);
+
+	if (flags != 0)
+		return (EINVAL);
+
+	error = getvnode(curthread, fd, &cap_write_rights, &fp);
+	if (error != 0)
+		return (error);
+	vp = fp->f_vnode;
+
+	if ((fp->f_flag & FWRITE) == 0) {
+		error = EBADF;
+		goto drop;
+	}
+
+	/* Set up a new dumper. */
+	bzero(&di, sizeof(di));
+	di.dumper_start = vnode_dumper_start;
+	di.dumper = vnode_dump;
+	di.dumper_hdr = vnode_write_headers;
+	di.blocksize = PAGE_SIZE; /* Arbitrary. */
+	di.maxiosize = MAXDUMPPGS * PAGE_SIZE;
+
+	bzero(&kda, sizeof(kda));
+	kda.kda_compression = compression;
+	error = dumper_create(&di, "livedump", &kda, &livedi);
+	if (error != 0)
+		goto drop;
+
+	/* Only allow one livedump to proceed at a time. */
+	if (sx_try_xlock(&livedump_sx) == 0) {
+		dumper_destroy(livedi);
+		error = EBUSY;
+		goto drop;
+	}
+
+	/* To be used by the callback functions. */
+	livedi->priv = vp;
+
+	/* Lock the entire file range and vnode. */
+	rl_cookie = vn_rangelock_wlock(vp, 0, OFF_MAX);
+	vn_lock(vp, LK_EXCLUSIVE | LK_RETRY);
+
+	dump_savectx();
+	error = minidumpsys(livedi, true);
+
+	VOP_UNLOCK(vp);
+	vn_rangelock_unlock(vp, rl_cookie);
+	sx_xunlock(&livedump_sx);
+	dumper_destroy(livedi);
+drop:
+	fdrop(fp, curthread);
+	return (error);
+#else
+	return (EOPNOTSUPP);
+#endif /* MINIDUMP_PAGE_TRACKING == 1 */
+}
+
+int
+vnode_dumper_start(struct dumperinfo *di, void *key, uint32_t keysize)
+{
+
+	/* Always begin with an offset of zero. */
+	di->dumpoff = 0;
+
+	KASSERT(keysize == 0, ("encryption not supported for livedumps"));
+	return (0);
+}
+
+/*
+ * Callback from dumpsys() to dump a chunk of memory.
+ *
+ * Parameters:
+ *	arg	 Opaque private pointer to vnode
+ *	virtual  Virtual address (where to read the data from)
+ *	physical Physical memory address (unused)
+ *	offset	 Offset from start of core file
+ *	length	 Data length
+ *
+ * Return value:
+ *	0 on success
+ *	errno on error
+ */
+int
+vnode_dump(void *arg, void *virtual, vm_offset_t physical __unused,
+    off_t offset, size_t length)
+{
+	struct vnode *vp;
+	int error = 0;
+
+	vp = arg;
+	MPASS(vp != NULL);
+	ASSERT_VOP_LOCKED(vp, __func__);
+
+	/* Done? */
+	if (virtual == NULL)
+		return (0);
+
+	error = vn_rdwr(UIO_WRITE, vp, virtual, length, offset, UIO_SYSSPACE,
+	    IO_NODELOCKED, curthread->td_ucred, NOCRED, NULL, curthread);
+	if (error != 0)
+		uprintf("%s: error writing livedump block at offset %jx: %d\n",
+		    __func__, (uintmax_t)offset, error);
+	return (error);
+}
+
+/*
+ * Callback from dumpsys() to write out the dump header, placed at the end.
+ */
+int
+vnode_write_headers(struct dumperinfo *di, struct kerneldumpheader *kdh)
+{
+	struct vnode *vp;
+	int error;
+	off_t offset;
+
+	vp = di->priv;
+	MPASS(vp != NULL);
+	ASSERT_VOP_LOCKED(vp, __func__);
+
+	/* Compensate for compression/encryption adjustment of dumpoff. */
+	offset = roundup2(di->dumpoff, di->blocksize);
+
+	/* Write the kernel dump header to the end of the file. */
+	error = vn_rdwr(UIO_WRITE, vp, kdh, sizeof(*kdh), offset,
+	    UIO_SYSSPACE, IO_NODELOCKED, curthread->td_ucred, NOCRED, NULL,
+	    curthread);
+	if (error != 0)
+		uprintf("%s: error writing livedump header: %d\n", __func__,
+		    error);
+	return (error);
+}
diff --git a/sys/sys/conf.h b/sys/sys/conf.h
index 6f84a3f03dbc..4808de511d6b 100644
--- a/sys/sys/conf.h
+++ b/sys/sys/conf.h
@@ -362,6 +362,7 @@ struct dumperinfo {
 
 extern int dumping;		/* system is dumping */
 
+void dump_savectx(void);
 int doadump(boolean_t);
 struct diocskerneldump_arg;
 int dumper_create(const struct dumperinfo *di_template, const char *devname,
diff --git a/sys/sys/kerneldump.h b/sys/sys/kerneldump.h
index c293491eadc9..2c73790bc81d 100644
--- a/sys/sys/kerneldump.h
+++ b/sys/sys/kerneldump.h
@@ -162,6 +162,8 @@ void dumpsys_pb_progress(size_t);
 
 extern int do_minidump;
 
+int livedump_start(int, int, uint8_t);
+
 #endif
 
 #endif /* _SYS_KERNELDUMP_H */
diff --git a/sys/sys/memrange.h b/sys/sys/memrange.h
index 454b033775f4..d3eeeb79b664 100644
--- a/sys/sys/memrange.h
+++ b/sys/sys/memrange.h
@@ -59,6 +59,16 @@ struct mem_extract {
 
 #define	MEM_EXTRACT_PADDR	_IOWR('m', 52, struct mem_extract)
 
+struct mem_livedump_arg {
+	int	fd;
+	int	flags;
+	uint8_t	compression;
+	uint8_t	pad1[7];
+	uint64_t	pad2[2];
+};
+
+#define	MEM_KERNELDUMP	_IOW('m', 53, struct mem_livedump_arg)
+
 #ifdef _KERNEL
 
 MALLOC_DECLARE(M_MEMDESC);
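
For illustration only, here is a rough sketch of how a userland program might
drive the new ioctl. It is not part of this commit and is not the planned
savecore(8) change. The helper names livedump_arg_init() and
livedump_request(), the LIVEDUMP_EXAMPLE_MAIN guard, and the choice to open
/dev/mem read-only are all assumptions made for the example. The fallback
struct mirrors the layout added to sys/memrange.h above so that the sketch
also builds on systems without MEM_KERNELDUMP, where the request simply fails
with ENOTSUP.

```c
/* Hypothetical userland caller for MEM_KERNELDUMP; a sketch, not savecore(8). */
#include <sys/types.h>
#include <sys/ioctl.h>
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#ifdef __FreeBSD__
#include <sys/memrange.h>	/* provides MEM_KERNELDUMP once this commit is present */
#endif

#ifndef MEM_KERNELDUMP
/* Fallback copy of the layout from the diff, so the sketch builds anywhere. */
struct mem_livedump_arg {
	int		fd;
	int		flags;
	uint8_t		compression;
	uint8_t		pad1[7];
	uint64_t	pad2[2];
};
#endif

/* Fill the request; flags and padding are zeroed (mem(4) requires flags == 0). */
void
livedump_arg_init(struct mem_livedump_arg *arg, int fd, uint8_t compression)
{
	assert(arg != NULL);
	memset(arg, 0, sizeof(*arg));
	arg->fd = fd;
	arg->compression = compression;
}

/* Returns 0 on success, -1 with errno set on failure. */
int
livedump_request(const char *path, uint8_t compression)
{
#ifdef MEM_KERNELDUMP
	struct mem_livedump_arg arg;
	int memfd, outfd, ret, saved;

	/* Assumption: read-only access to /dev/mem is enough for this ioctl. */
	memfd = open("/dev/mem", O_RDONLY);
	if (memfd == -1)
		return (-1);
	/* The kernel side rejects descriptors that are not open for writing. */
	outfd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
	if (outfd == -1) {
		saved = errno;
		close(memfd);
		errno = saved;
		return (-1);
	}
	livedump_arg_init(&arg, outfd, compression);
	ret = ioctl(memfd, MEM_KERNELDUMP, &arg);
	saved = errno;
	close(outfd);
	close(memfd);
	errno = saved;
	return (ret);
#else
	(void)path;
	(void)compression;
	errno = ENOTSUP;
	return (-1);
#endif
}

#ifdef LIVEDUMP_EXAMPLE_MAIN
int
main(int argc, char **argv)
{
	if (argc != 2) {
		fprintf(stderr, "usage: %s output-file\n", argv[0]);
		return (2);
	}
	/* 0 is assumed here to equal KERNELDUMP_COMP_NONE. */
	if (livedump_request(argv[1], 0) != 0) {
		perror("MEM_KERNELDUMP");
		return (1);
	}
	return (0);
}
#endif
```

Building with -DLIVEDUMP_EXAMPLE_MAIN yields a small command that takes the
output file as its only argument. Per the mem(4) text above, the caller needs
the PRIV_KMEM_READ privilege, and a second request issued while one is already
in progress is expected to fail with EBUSY.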