Re: nvme(4): some non-operational power states are broken

From: Warner Losh <imp_at_bsdimp.com>
Date: Mon, 23 Sep 2024 10:10:15 UTC
On Mon, Sep 23, 2024, 10:47 AM Warner Losh <imp@bsdimp.com> wrote:

>
>
> On Mon, Sep 23, 2024, 10:41 AM Alexey Sukhoguzov <mail@eseipi.net> wrote:
>
>> Hi,
>>
>> My NVMe controller is a Toshiba XG5, and it has 6 power states: the
>> first three (0-2) are operational and the last three (3-5) are
>> non-operational (NOPS). Here is the 'nvmecontrol power -l nvme0' output:
>>
>>  #   Max pwr  Enter Lat  Exit Lat RT RL WT WL Idle Pwr  Act Pwr Workloadd
>> --  --------  --------- --------- -- -- -- -- -------- -------- --
>>  0:  8.0000W    0.000ms   0.000ms  0  0  0  0  0.0000W  0.0000W 0
>>  1:  3.9000W    0.000ms   0.000ms  1  1  1  1  0.0000W  0.0000W 0
>>  2:  2.0000W    0.000ms   0.000ms  2  2  2  2  0.0000W  0.0000W 0
>>  3:  0.0500W*   1.500ms   1.500ms  3  3  3  3  0.0000W  0.0000W 0
>>  4:  0.0050W*   6.000ms  14.000ms  4  4  4  4  0.0000W  0.0000W 0
>>  5:  0.0030W*  50.000ms  80.000ms  5  5  5  5  0.0000W  0.0000W 0
>>
>> The problem is that only one of the NOPS works as expected
>> (state 3). The other two (states 4-5) push the controller's power
>> consumption far above that of the operational states (0-2), and far
>> beyond reasonable. For example, when the controller is in state 3, my
>> system consumes about 3-3.5 W at idle (according to acpiconf with
>> laptop power cable unplugged), in states 0-2 - about 4 W, and in
>> states 4-5 consumption is approaching 6 W. Thus, the NVMe becomes
>> the hottest part of the system (>50C, still idle), and it eats up
>> almost half of the battery alone.
>>
>> Linux doesn't have this issue, so it seems to be nvme(4) related.
>> All the above data was collected on a 14.1-RELEASE Live USB with no
>> filesystem mounted. 15-CURRENT has the same problem.
>>
>> Any ideas what it might be?
>>
>
> Does Linux have active power state management?
>

And what's the workload? What performance are you seeing? And what's the
reported model number? What form factor? I have an M.2 XG5 hanging around.
We used these at work years ago, but in only one hardware spin. We measured
no power differences between the states in the system with our streaming
workload. They also reached higher latency more quickly than other vendors'
drives, but not enough to matter so far (the machines they were in are
nearing EOL), and it was only during high write loads, IIRC. I never looked
at the temperature since they never got above our limits.

Warner


> Warner
>
>
>> Regards,
>> Alexey
>>
>>
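
For anyone comparing listings from several drives, the non-operational
states in the quoted 'nvmecontrol power -l' output can be picked out
mechanically: they are the rows with a '*' after the max power figure.
What follows is only a rough sketch, not part of nvmecontrol. The
function name parse_power_states and the dictionary keys are made up for
illustration, and the regular expression assumes the column layout shown
in the listing above (with the '>>' quote prefixes stripped).

```python
import re

# Listing copied from the report above, quote prefixes removed.
LISTING = """\
 #   Max pwr  Enter Lat  Exit Lat RT RL WT WL Idle Pwr  Act Pwr Workloadd
--  --------  --------- --------- -- -- -- -- -------- -------- --
 0:  8.0000W    0.000ms   0.000ms  0  0  0  0  0.0000W  0.0000W 0
 1:  3.9000W    0.000ms   0.000ms  1  1  1  1  0.0000W  0.0000W 0
 2:  2.0000W    0.000ms   0.000ms  2  2  2  2  0.0000W  0.0000W 0
 3:  0.0500W*   1.500ms   1.500ms  3  3  3  3  0.0000W  0.0000W 0
 4:  0.0050W*   6.000ms  14.000ms  4  4  4  4  0.0000W  0.0000W 0
 5:  0.0030W*  50.000ms  80.000ms  5  5  5  5  0.0000W  0.0000W 0
"""

# Assumed row layout: "<n>: <max>W[*] <enter>ms <exit>ms ...".
# Only the first four columns are parsed; the rest are ignored.
ROW = re.compile(
    r"^\s*(?P<state>\d+):\s+(?P<max_w>\d+\.\d+)W(?P<nops>\*?)\s+"
    r"(?P<enter_ms>\d+\.\d+)ms\s+(?P<exit_ms>\d+\.\d+)ms"
)


def parse_power_states(text):
    """Return one dict per power-state row; header lines are skipped."""
    states = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m is None:
            continue  # header, separator, or blank line
        states.append({
            "state": int(m.group("state")),
            "max_w": float(m.group("max_w")),
            "non_operational": m.group("nops") == "*",
            "enter_ms": float(m.group("enter_ms")),
            "exit_ms": float(m.group("exit_ms")),
        })
    return states


states = parse_power_states(LISTING)
nops = [s["state"] for s in states if s["non_operational"]]
print("non-operational states:", nops)
```

The same function could be fed the live output of the command instead of
the embedded sample, provided the layout matches the assumption above.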