RFC: setting performance_cx_lowest=C2 in -HEAD to avoid lock contention on many-CPU boxes
Adrian Chadd
adrian at freebsd.org
Sat Apr 25 18:45:11 UTC 2015
On 25 April 2015 at 11:18, Davide Italiano <davide at freebsd.org> wrote:
> On Sat, Apr 25, 2015 at 9:31 AM, Adrian Chadd <adrian at freebsd.org> wrote:
>> Hi!
>>
>> I've been doing some NUMA testing on large boxes and I've found that
>> there's lock contention in the ACPI path. It's due to my change a
>> while ago to start using sleep states above ACPI C1 by default. The
>> ACPI C3 state involves a bunch of register fiddling in the ACPI sleep
>> path that grabs a serialiser lock, and on an 80 thread box this is
>> costly.
>>
>> I'd like to drop performance_cx_lowest to C2 in -HEAD. ACPI C2 state
>> doesn't require the same register fiddling (to disable bus mastering,
>> if I'm reading it right) and so it doesn't enter that particular
>> serialised path. I've verified on Westmere-EX, Sandybridge, Ivybridge
>> and Haswell boxes that ACPI C2 does let one drop down into a deeper
>> CPU sleep state (C6 on each of these). I think it's still a good
>> default for both servers and desktops.
>>
>> If no-one has a problem with this then I'll do it after the weekend.
>>
>
> This sounds to me like just a way to hide a problem.
> Very few people nowadays run on NUMA and they can tune the machine as
> they like when they do testing.
> If there's a lock contention problem, it needs to be fixed and not
> hidden under another default.
The lock contention problem is inside ACPI and how it's designed/implemented.
We're not going to easily be able to make ACPI lock "better" as we're
constrained by how ACPI implements things in the shared ACPICA code.
> Also, as already noted this is a problem on 80-core machines but
> probably not on a 2-core Atom. I think you need to understand factors
> better and come up with a more sensible relation. In other words, your
> bet needs to be proven before changing a default that is useful for a
> few but can impact many.
I've just described the differences in behaviour. I've checked the C
states on all the Intel servers too - with power plugged in, ACPI C2
and ACPI C3 both still result in entering CPU C6 state, not CPU C7
state - so it's not going to result in worse behaviour.
For reference, "all" being the following list:
* westmere-EX
* nehalem
* sandybridge
* sandybridge mobile
* sandybridge xeon
* ivybridge mobile
* ivybridge xeon
* haswell mobile
* haswell
* haswell xeon
* haswell xeon v3
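
For anyone who wants to try this on their own hardware before any default
changes, here is a minimal sketch. It assumes the stock rc.conf(5) knobs and
the ACPI CPU sysctls; the economy_cx_lowest line is an optional extra and not
part of the proposal above.

```sh
# /etc/rc.conf fragment (sketch): cap the lowest ACPI Cx state at C2.
# performance_cx_lowest is applied while on AC power; economy_cx_lowest is
# applied on battery and is shown here only as an optional assumption.
performance_cx_lowest="C2"
economy_cx_lowest="C2"

# To inspect or change the setting at runtime (as root, on FreeBSD):
#   sysctl dev.cpu.0.cx_supported    # Cx states the firmware exposes
#   sysctl dev.cpu.0.cx_usage        # share of idle time spent in each Cx state
#   sysctl hw.acpi.cpu.cx_lowest=C2  # global lowest Cx state
```

Comparing dev.cpu.N.cx_usage before and after the change should show whether
the CPUs are still spending their idle time in C2 rather than falling back
to C1.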
-adrian