[Bug 257641] hwpmc/libpmc needs to gain a notion of big.LITTLE
Date: Thu, 05 Aug 2021 17:35:35 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=257641

            Bug ID: 257641
           Summary: hwpmc/libpmc needs to gain a notion of big.LITTLE
           Product: Base System
           Version: Unspecified
          Hardware: Any
                OS: Any
            Status: New
          Severity: Affects Only Me
          Priority: ---
         Component: kern
          Assignee: bugs@FreeBSD.org
          Reporter: mhorne@freebsd.org

Some systems that FreeBSD supports contain a heterogeneous collection of CPUs. This is the case for ARM's big.LITTLE chips, such as the one in the rockpro64, and will be a feature of some next-generation x86 chips as well [1][2].

The PMC stack was written before such heterogeneous systems existed, and so the assumption that all cores in the system have identical performance monitoring capabilities is ingrained. This is stated explicitly in the hwpmc(4) man page under IMPLEMENTATION NOTES.

The rockpro64's RK3399 contains four Cortex-A53 cores and two larger Cortex-A72 cores. There is some overlap in the performance events supported by the two core types, but some events are unique to each. This poses problems that hwpmc is not currently equipped to deal with.

The first problem to solve is CPU reporting. The CPU type is communicated from the kernel to libpmc in two ways: via the kern.hwpmc.cpuid sysctl, and via the PMC_OP_GETCPUINFO operation on the hwpmc syscall. Neither of these methods makes a distinction between different CPUs in the system, so the value received by userspace depends on which CPU happens to initialize the hwpmc module. This needs to become a per-CPU value in order to properly detect which events are supported on a given core.

Assuming this is solved, the basic high-level behaviour will depend on the type of PMC being allocated:

System-scope PMCs:
Allocating a system-scope counter with e.g. pmcstat -s <event> will attempt to allocate the event on every CPU in the system. If the allocation fails for any CPU, the command will not proceed with any measurement.
This is reasonable behaviour on a heterogeneous system: the user must either pick an event that is supported on all CPUs, or use the -c flag to restrict the counter to selected CPUs.

Process-scope PMCs:
Allocating a process-scope counter is more problematic. Suppose a PMC is allocated on CPU A, where the target process is running and the requested event is supported. If the process then migrates to CPU B, which is of a different type than A, resuming the hardware counter could start measuring an entirely different event, if the programmed value is valid at all.

I see two possible ways to solve this: don't allow PMC-enabled processes (curproc->p_flag & P_HWPMC) to migrate outside of their PMC-compatible cluster, or have libpmc call cpuset(3) for the process and bind it to compatible CPUs for the duration of the measurement.

I have not thought through either of these approaches in detail, but both require building some list of "PMC-compatible" CPU groups/clusters in the kernel.

[1] https://www.cnx-software.com/2021/07/10/intel-alder-lake-hybrid-mobile-processor-family-to-range-from-5w-to-55w-tdp/
[2] https://www.tomshardware.com/news/amd-patent-hybrid-cpu-rival-intel-raptor-lake-cpu
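
As a rough illustration of the "PMC-compatible" grouping that both approaches would need, here is a minimal sketch in plain C. It assumes a per-CPU identifier string is already available, which is exactly what the CPU reporting change above would have to provide. The function names pmc_group_cpus() and pmc_compatible_mask(), and the identifier strings used with them, are hypothetical and are not part of hwpmc or libpmc.

```c
#include <string.h>

/*
 * Hypothetical helper: give each CPU a group index so that CPUs reporting
 * the same PMC identifier string share a group. Returns the group count.
 * cpuid[i] stands in for a per-CPU value that hwpmc does not export today.
 */
int
pmc_group_cpus(const char *const cpuid[], int ncpu, int group[])
{
	int ngroups = 0;

	for (int i = 0; i < ncpu; i++) {
		int g = -1;

		/* Reuse the group of an earlier CPU with the same identifier. */
		for (int j = 0; j < i; j++) {
			if (strcmp(cpuid[i], cpuid[j]) == 0) {
				g = group[j];
				break;
			}
		}
		group[i] = (g >= 0) ? g : ngroups++;
	}
	return (ngroups);
}

/*
 * Hypothetical helper: bitmask of the CPUs in the same group as 'cpu'.
 * Limited to the first 64 CPUs to keep the sketch simple.
 */
unsigned long long
pmc_compatible_mask(const int group[], int ncpu, int cpu)
{
	unsigned long long mask = 0;

	for (int i = 0; i < ncpu && i < 64; i++) {
		if (group[i] == group[cpu])
			mask |= 1ULL << i;
	}
	return (mask);
}
```

On an RK3399-like layout (four identical small cores followed by two identical large cores) this yields two groups. The resulting mask is the set a process-scope PMC would have to stay within, whether that is enforced by the scheduler or by binding the process from userspace.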