Further testing of power management

Sun Apr 10 14:04:39 PDT 2005

Kevin Oberman wrote:
> Nate,
> 
> I finally had time to do some careful testing of power management on
> -current. All testing was done on my IBM T30 with a 1.8 GHz P4-M
> processor. CPU load was generated by the use of md5 on a long batch of
> zeros. (As you suggested.)
> 
> First, on power dissipation: while TCC and adjusting the actual CPU
> frequency both give very predictable compute performance, they do not
> produce the expected matching power dissipation.
> 
> Here is a chart of the CPU temperature against the value of
> dev.cpu.0.freq. The third column lists the actual clock frequency that
> the CPU is using. The T30 supports only 2 frequencies, 1.8 GHz and 1.2
> GHz.
> 
> dev.cpu.0.freq	Temperature	CPU Clock
> 1800		>_PSV		1800
> 1575		>_PSV		1800
> 1350		85		1800
> 1200		73		1200
> 1125		82		1800
> 1050		69		1200
> 900		77		1800
> 750		64		1200
> 675		72		1800
> 600		62		1200
> 450		66		1800
> 300		56		1200
> 225		61		1800
> 150		54		1200
> 
> As you can see, lowering the CPU clock speed is much more effective in
> reducing CPU heat (and battery drain) than doing it with TCC. I can get
> much better performance with lower battery consumption at 1200 MHz than
> at 900 MHz. Clearly, if both clock and TCC can provide identical
> performance, you want the slower clock. This is backwards from how it is
> now running, as both 900 MHz and 450 MHz can be achieved at either 1800
> MHz or 1200 MHz clocking, but are clocked at 1800 MHz.
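
The chart above is consistent with every dev.cpu.0.freq value being a real clock (1800 or 1200) multiplied by a TCC duty cycle in eighths. That reading is an assumption inferred from the numbers, not something stated in the report. A rough sketch that enumerates the combinations and, where two give the same effective frequency, keeps the one on the slower real clock (function names are made up for illustration):

```shell
# list_levels CLOCK...: print "effective_mhz real_clock duty" for each real
# clock and each TCC duty step (assumed here to be n/8, n = 8..1), sorted by
# effective frequency, highest first; ties put the slower real clock first.
list_levels() {
    for clock in "$@"; do
        duty=8
        while [ "$duty" -ge 1 ]; do
            echo "$((clock * duty / 8)) $clock $duty/8"
            duty=$((duty - 1))
        done
    done | sort -k1,1nr -k2,2n
}

# prefer_slow_clock: keep only the first line per effective frequency, which
# after the sort above is the combination using the slower real clock.
prefer_slow_clock() {
    awk '!seen[$1]++'
}
```

Running `list_levels 1800 1200 | prefer_slow_clock` yields the same 14 effective frequencies as the chart, but with 900 and 450 assigned to the 1200 MHz clock.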

Thanks for your testing.  I agree that settings like the 900 MHz value 
don't make sense to use when the 1050 value has lower heat.  Do you have 
known values for power consumption (sysctl dev.cpu.0.freq_levels, look 
for the second number after the /)?  Unknown values are marked -1.  Is 
the power consumption for 900 higher than 1050?  If so, we could add a 
test that compares power consumption and discards levels that have lower 
frequencies but higher power consumption than their neighbors.
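
A minimal sketch of such a test, done in userland with awk rather than in the driver. It assumes the list is in the freq/power form that dev.cpu.0.freq_levels prints, sorted from highest to lowest frequency, and it keeps any level whose power is unknown (-1) since there is nothing to compare. The function name and the sample numbers in the usage note are hypothetical.

```shell
# filter_levels "FREQ/POWER FREQ/POWER ...": walk the levels from highest to
# lowest frequency and drop any level that draws more power than the last
# kept level with a known power value. Levels with power -1 are kept.
filter_levels() {
    printf '%s\n' "$1" | awk '
    {
        last = -1; out = ""; sep = ""
        for (i = 1; i <= NF; i++) {
            split($i, lv, "/")
            power = lv[2] + 0
            # Lower frequency but higher power than a faster level: discard.
            if (power >= 0 && last >= 0 && power > last)
                continue
            if (power >= 0)
                last = power
            out = out sep $i
            sep = " "
        }
        print out
    }'
}
```

For example, `filter_levels "1200/20000 1125/24000 1050/17500"` drops the 1125 entry because it is slower than 1200 yet costs more power. On a live system the argument could come from `sysctl -n dev.cpu.0.freq_levels`.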

If you check today's CVS email, you'll see that I committed support to 
powerd to add up all power usage when in verbose mode.  This lets you 
test a fixed workload with different algorithms like this:

sysctl dev.cpu.0.freq=[your highest]
powerd -v > powerd_output &
time -o load_output ./load_generator
killall powerd

I've found with my testing that the current algorithm saves much more 
power and has no worse performance loss than any other algorithm I 
tried.  I tried various forms of linear and exponential growth/decay as 
well as adding the "stickiness" patch.   Two workloads I tried were a 
random IO with pauses and a random CPU burner with pauses.

On a P4-M laptop I was able to borrow, I ran the test at 1700 and 600 
MHz (fixed) to get a baseline.  The 600 MHz test ran 37% slower (wall 
time) than the full speed test.  The current adaptive algorithm ran only
2.3% slower than full speed and saved a whopping 75% on power (energy 
consumed, really).  The best performing algorithm (lazy linear 
step-up/step-down similar to your patch) gave performance only 0.7% 
slower than full speed but used 2.6X more power than the current 
adaptive algorithm.
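
To put those figures on one scale, here is a small helper for comparing a run against a baseline. It is only a sketch: the function name is made up, and it assumes the 75% saving is measured against the fixed full-speed run, which the numbers above suggest but do not spell out.

```shell
# pct_diff BASELINE VALUE: print how far VALUE is from BASELINE in percent.
# Positive means larger (slower, or more energy); negative means smaller.
pct_diff() {
    awk -v base="$1" -v val="$2" \
        'BEGIN { printf "%.1f\n", (val - base) / base * 100 }'
}
```

With full-speed energy normalized to 100, the adaptive run is 25 and the lazy linear run is 2.6 x 25 = 65, so `pct_diff 100 65` prints -35.0: under that assumption the best-performing algorithm saves about 35% against full speed, compared with 75% for the adaptive one. The same helper works for wall times from time(1).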

I did find that interrupt latency was not significantly affected by the 
slower clock frequencies so I've dropped the default running level to 
65%.  This should keep us from jumping to full speed prematurely.

I welcome further testing with different workloads.  Be sure to 
benchmark your performance with time(1) as well as power consumed (via 
powerd -v) and compare to full speed and slow speed.  You can get the 
power consumed for constant speeds by using powerd's fixed mode (powerd 
-v -a max -b max).

-- 
Nate