Re: cpufreq & hwpstate_amd & Zen 2
- In reply to: Johannes Totz : "cpufreq & hwpstate_amd & Zen 2"
Date: Fri, 19 May 2023 00:29:29 UTC
On 15/05/2023 22:16, Johannes Totz wrote:
> Hi all,
>
> I'm poking cpufreq's hwpstate_amd to see what I can tune re performance
> vs power vs heat trade-off.

Here are some patches, if anyone is interested:

https://reviews.freebsd.org/D40139
Adds a tunable for cpufreq/hwpstate to get the P-state info from the
CPU's MSR instead of acpi_perf.

https://reviews.freebsd.org/D40158
Adds another tunable that allows overriding the default (or
BIOS-configured?) P-state configuration. Stuff like over- or
underclocking and -volting.

https://reviews.freebsd.org/D40140
Adds power calculation if P-state info comes from MSR. This was missing
until now but is really just cosmetic.

These do not solve the mystery below though :(

And fwiw, C-state power saving is really effective. Messing with the
P-states does not do much while idle, it's measurable only when the CPU
is busy.

> I'm struggling with the P-state behaviour though.
> The code looks really straight-forward:
> https://github.com/freebsd/freebsd-src/blob/main/sys/x86/cpufreq/hwpstate_amd.c#L172
>
> But enabling hwpstate_verify, it looks like P-state transitions never go
> as requested.
> For this, I'm not running powerd.
> In addition to the existing verify code, I've sprinkled in a few more
> printfs.
>
> PStateCurLim (aka MSR_AMD_10H_11H_LIMIT = 0x20) and PStateDef (aka
> MSR_AMD_10H_11H_CONFIG = eg 0x8000000049120890) look all reasonable.
>
>
> $ sysctl dev.cpu.0
> dev.cpu.0.freq_levels: 3600/3960 2800/2800 2200/1980
> dev.cpu.0.freq: 2800
>
> $ sysctl dev.cpu.0.freq=3600
> dev.cpu.0.freq: 2800 -> 3600
>
> $ cat /var/log/messages
> [...extra printf debugging...]
> kernel: hwpstate0: setting P0-state on cpu0
> kernel: hwpstate0: setting P1(2) -> P0 on cpu1
> [...same for all the other cpus...]
> kernel: hwpstate0: setting P1(2) -> P0 on cpu15
>
>
> This shows that cpufreq thought we were at P1 and wanted to transition
> to P0. But actually, the CPU was in P2 (the 2 in brackets).
>
> We want to go from P0 to P2...
>
>
> $ sysctl dev.cpu.0.freq=2200
> dev.cpu.0.freq: 3600 -> 2200
>
> $ cat /var/log/messages
> kernel: hwpstate0: setting P2-state on cpu0
> kernel: hwpstate0: setting P0(1) -> P2 on cpu1
>
>
> ...but CPU was in P1 at that time.
>
> Wanting to go from P2 back to P1...
>
>
> $ sysctl dev.cpu.0.freq=2800
> dev.cpu.0.freq: 2200 -> 2800
>
> $ cat /var/log/messages
> kernel: hwpstate0: setting P1-state on cpu0
> kernel: hwpstate0: setting P2(2) -> P1 on cpu1
>
>
> ...shows that this time the CPU really was in P2 (yeay). But it did not
> transition to P1, it stayed in P2 (not shown in the log).
>
>
> So question is: what else could be interfering with P-state?
>
>
> thanks,
> Johannes
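
For anyone who wants to sanity-check the two MSR values quoted above by hand, here is a rough sketch of the decoding in Python. It is not the driver's code. The bit layout is an assumption taken from AMD's family 17h register documentation (CpuFid in bits 7:0, CpuDfsId in 13:8, CpuVid in 21:14, IddValue in 29:22, IddDiv in 31:30, PstateEn in bit 63), the voltage formula 1.55 V - 0.00625 V * VID is the usual SVI2 convention, and the function names are made up for illustration.

```python
# Sketch only: decode AMD family 17h P-state MSR values by hand.
# Bit positions and formulas are assumptions based on AMD's family 17h
# documentation; the function names are hypothetical, not from hwpstate_amd.c.

def decode_pstate_def(msr):
    """Split a PStateDef value into frequency, voltage and power."""
    fid = msr & 0xFF                 # CpuFid, bits 7:0
    did = (msr >> 8) & 0x3F          # CpuDfsId, bits 13:8
    vid = (msr >> 14) & 0xFF         # CpuVid, bits 21:14
    idd_value = (msr >> 22) & 0xFF   # IddValue, bits 29:22
    idd_div = (msr >> 30) & 0x3      # IddDiv, bits 31:30
    enabled = bool((msr >> 63) & 1)  # PstateEn, bit 63

    # Core frequency = (FID / DID) * 200 MHz; a DID of 0 means "off".
    freq_mhz = (200 * fid) // did if did else 0

    # SVI2 voltage in microvolts, kept as an integer to avoid float noise.
    voltage_uv = 1_550_000 - 6_250 * vid

    # Current is IddValue / 10^IddDiv amps; IddDiv == 3 is reserved.
    if idd_div == 3:
        power_mw = None
    else:
        power_mw = (voltage_uv * idd_value) // (10 ** idd_div * 1000)

    return {
        "enabled": enabled,
        "freq_mhz": freq_mhz,
        "voltage_uv": voltage_uv,
        "power_mw": power_mw,
    }


def decode_pstate_limit(msr):
    """Return (CurPstateLimit, PstateMaxVal) from a PStateCurLim value."""
    cur_limit = msr & 0x7        # bits 2:0, highest-performance P-state allowed
    max_val = (msr >> 4) & 0x7   # bits 6:4, lowest-performance P-state
    return cur_limit, max_val


if __name__ == "__main__":
    print(decode_pstate_def(0x8000000049120890))
    print(decode_pstate_limit(0x20))
```

Under those assumptions, 0x8000000049120890 decodes to an enabled P-state at 3600 MHz, 1.10 V and 3960 mW, which lines up with the first entry of dev.cpu.0.freq_levels (3600/3960). The limit value 0x20 decodes to a current limit of P0 and a maximum of P2, i.e. three P-states, matching the three frequency levels.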