A new boot-time trace framework
Date: Wed, 10 Nov 2021 16:26:50 UTC
Hello,

For some time I have been working towards upstreaming the 'boottrace' feature, which captures boot-time and shutdown-time information as trace entries. This feature was developed by NetApp, and they have been using it internally to great success for several years. Its purpose is to aid in logging and identifying slow portions of boot or shutdown, spanning from kernel initialization through execution of rc scripts. It is driven by a simple sysctl interface, described in the main review (D30184).

Boottrace has some functional overlap with the existing TSLOG framework, in that it captures timestamped entries of notable boot events in a buffer, for later inspection. This overlap has grown somewhat recently, as cperciva@'s work on reducing the overall system boot time extended TSLOG to capture trace data from userspace (see commit 46dd801acb23).

Boottrace differs from TSLOG in the following ways:
- Separate trace buffers for collecting run-time (post multiuser) and shutdown-time events
- Does not cover the bootloader or early kernel initialization (tracing begins at SI_SUB_CPU)
- Output log is human-readable, but not as suitable for generating flamegraphs (see the attached sample log)
- Trace entries also record some resource usage of the invoking process (total CPU time, # of blocks in/out from disk)

Given these differences, I believe the two can peacefully coexist. I would roughly characterize them as follows: boottrace may be most useful to a sysadmin/QA team, whereas TSLOG is a tool more suited to the needs of a developer and will provide more fine-grained and machine-readable information.

Unlike TSLOG, I intend for this work to be compiled in to the kernel by default, but disabled behind a tunable (kern.boottrace.enabled). The cost of doing so should be minimal, only a couple of syscalls added to init(8) at most.

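For anyone wanting to try this, here is a minimal sketch of turning the tunable on at boot. It assumes the tunable is set through loader.conf(5) in the usual way for kernel tunables; the exact value syntax is an assumption, and D30184 remains the authoritative description of the interface.

```
# /boot/loader.conf
# Assumption: kern.boottrace.enabled is read as a boolean loader tunable.
kern.boottrace.enabled="1"
```

After the next boot, the collected entries could then be read back with `sysctl kern.boottrace.log`.
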
The changes span several areas, including the following:
- Core boottrace module
- Kernel initialization and shutdown paths
- init(8), shutdown(8), and reboot(8) utilities
- Addition of a new boottrace(1) wrapper utility
- RC framework (rc.subr)

If you have an interest in any of the above areas, please add yourself to the review. If you have other feedback, I would be interested in hearing it here.

The reviews:
- https://reviews.freebsd.org/D30184
- https://reviews.freebsd.org/D30187
- https://reviews.freebsd.org/D31928
- https://reviews.freebsd.org/D31929
- https://reviews.freebsd.org/D31930

If you are interested in trying this feature, I have made it available on the following git branch:
https://github.com/mhorne/freebsd/tree/netapp_boottrace

Finally, I've attached a sample output from `sysctl kern.boottrace.log`, for a small bhyve VM.

Cheers,
Mitchell