Permit init(8) use its own cpuset group.
Alexander V. Chernikov
melifaro at FreeBSD.org
Thu Jun 12 09:25:33 UTC 2014
On 12.06.2014 07:57, Julian Elischer wrote:
> On 6/12/14, 12:52 AM, Alexander V. Chernikov wrote:
>> On 09.06.2014 21:43, John Baldwin wrote:
>>> On Friday, June 06, 2014 6:09:13 pm Alexander V. Chernikov wrote:
>>>> On 06.06.2014 01:04, Alexander V. Chernikov wrote:
>>>>> On 05.06.2014 23:59, John Baldwin wrote:
>>>>>> On Thursday, June 05, 2014 3:46:15 pm Alexander V. Chernikov wrote:
>>>>>>> On 05.06.2014 18:09, John Baldwin wrote:
>>>>>>>> On Wednesday, June 04, 2014 3:16:59 pm Alexander V. Chernikov
>>>>>>>> wrote:
>>>>>>>>> On 04.06.2014 19:06, John Baldwin wrote:
>>>>>>>>>> On Monday, June 02, 2014 12:48:50 pm Konstantin Belousov wrote:
>>>>>>>>>>> On Mon, Jun 02, 2014 at 06:52:10PM +0400, Alexander V.
>>>>>>>>>>> Chernikov wrote:
>>>>>>>>>>>> Hello list!
>>>>>>>>>>>>
>>>>>>>>>>>> Currently init(8) uses group 1 which is root group.
>>>>>>>>>>>> Modifications of this group affects both kernel and
>>>>>>>>>>>> userland threads.
>>>>>>>>>>>> Additionally, such modifications are impossible, for
>>>>>>>>>>>> example, in
>>>>>>>> presence
>>>>>>>>>>>> of multi-queue NIC drivers (like igb or ixgbe) which binds
>>>>>>>>>>>> their
>>>>>> threads
>>>>>>>>>> to
>>>>>>>>>>>> particular cpus.
>>>>>>>>>>>>
>>>>>>>>>>>> Proposed change ("init_cpuset" loader tunable) permits
>>>>>>>>>>>> changing cpu
>>>>>>>>>>>> masks for
>>>>>>>>>>>> userland more easily. Restricting user processes to migrate
>>>>>>>>>>>> to/from
>>>>>> CPU
>>>>>>>>>>>> cores
>>>>>>>>>>>> used for network traffic processing is one of the cases.
>>>>>>>>>>>>
>>>>>>>>>>>> Phabricator: https://phabric.freebsd.org/D141 (the same
>>>>>>>>>>>> version
>>>>>> attached
>>>>>>>>>>>> inline)
>>>>>>>>>>>>
>>>>>>>>>>>> If there are no objections, I'll commit this next week.
>>>>>>>>>>> Why is the tunable needed ?
>>>>>>>>>> Because some people already depend on doing 'cpuset -l 0 -s
>>>>>>>>>> 1'. It is
>>>>>>>> also
>>>>>>>>>> documented in our manpages that processes start in cpuset 1
>>>>>>>>>> by default
>>>>>> so
>>>>>>>>>> that you can use 'cpuset -l 0 -s 1' to move all processes, etc.
>>>>>>>>>>
>>>>>>>>>> For the stated problem (bound ithreads in drivers), I would
>>>>>>>>>> actually
>>>>>> like
>>>>>>>> to
>>>>>>>>>> fix ithreads that are bound to a specific CPU to create a
>>>>>>>>>> different
>>>>>> cpuset
>>>>>>>>>> instead so they don't conflict with set 1.
>>>>>>>>> Yes, this seem to be much better approach.
>>>>>>>>> Please take a look on the new patch (here or in the phabricator).
>>>>>>>>> Comments:
>>>>>>>>>
>>>>>>>>> Use different approach for modifyable root set:
>>>>>>>>>
>>>>>>>>> * Make sets in 0..15 internal
>>>>>>>>> * Define CPUSET_SET1 & CPUSET_ITHREAD in that range
>>>>>>>>> * Add cpuset_lookup_builtin() to retrieve such cpu sets by id
>>>>>>>>> * Create additional root set for ithreads
>>>>>>>>> * Use this set in ithread_create()
>>>>>>>>> * Change intr_setaffinity() to use cpuset_iroot (do we
>>>>>>>>> really need
>>>>>> this)?
>>>>>>>>> We can probably do the same for kprocs, but I'm unsure if we
>>>>>>>>> really need
>>>>>> it.
>>>>>>>> I imagined something a bit simpler. Just create a new set in
>>>>>> intr_event_bind
>>>>>>>> and bind the ithread to the new set. No need to have more
>>>>>>>> magic set ids,
>>>>>> etc.
>>>>>>> Well, we also have userland which can modify given changes via
>>>>>>> `cpuset
>>>>>>> -x', so we need to be able to add some more logic on set
>>>>>>> allocation/keeping. Additionally, we can try to do the same via
>>>>>>> `cpuset
>>>>>>> -t', so introducing something like cpuset_setIthread() and
>>>>>>> hooking into
>>>>>>> intr_event_bind() won't probably be enough. At least I can't
>>>>>>> think out a
>>>>>>> quick and easy way to do this.
>>>>>> cpuset -x calls intr_event_bind(). If you just do it there you
>>>>>> fix both
>>>>>> places.
>>>> Yup. I was wrong. This version seems to be simpler while doing the
>>>> right
>>>> thing.
>>> Hmm, I do think this patch is doing the right thing, but I'm worried
>>> you
>>> might have weird effects, e.g. I worry that because the 'intr' proc
>>> is in
>>> set 1 still, the thread masks will be (incorrectly) checked when
>>> changing
>> Well, as far as I understand we don't have any cpu sets bound to
>> program ("pid" search returns set of first thread).
>> Anyway, each original ithread mask is released via cpuset_rel() so
>> I'm a bit unsure about remaining thread masks.
>> Playing with cpuset and ixgbe driver, does not reveal given problem:
>>
>> 12:59 [0] test15# cpuset -g -s 1
>> cpuset 1 mask: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15,
>> 16, 17, 18, 19, 20, 21, 22, 23
>> 13:00 [0] test15# kldload ixgbe
>> 13:00 [0] test15# vmstat -i | grep ix
>> irq263: ix0:que 0 116 0
>> irq264: ix0:que 1 108 0
>> irq265: ix0:que 2 108 0
>> irq266: ix0:que 3 108 0
>> irq267: ix0:que 4 108 0
>> irq268: ix0:que 5 108 0
>> irq269: ix0:que 6 108 0
>> irq270: ix0:que 7 108 0
>> ...
>> 13:00 [0] test15# cpuset -g -x 263
>> irq 263 mask: 0
>> 13:00 [0] test15# cpuset -g -x 264
>> irq 264 mask: 1
>> 13:00 [0] test15# cpuset -l 21,22,23 -s 1
>> 13:01 [0] test15# cpuset -g -s 1
>> cpuset 1 mask: 21, 22, 23
>> 13:01 [0] test15# cpuset -g -x 263
>> irq 263 mask: 0
>> 13:01 [0] test15# cpuset -g -x 264
>> irq 264 mask: 1
>> 13:01 [0] test15# cpuset -l 12 -x 264
>> 13:01 [0] test15# cpuset -g -x 264
>> irq 264 mask: 12
>> 13:01 [0] test15# cpuset -l 22,23 -s 1
>> 13:01 [0] test15# echo $?
>> 0
>> 13:01 [0] test15# cpuset -g -x 264
>> irq 264 mask: 12
>> 13:01 [0] test15# cpuset -g -s 1
>> cpuset 1 mask: 22, 23
>>
>>
>>> set 1 and you still won't be able to 'cpuset -l 0 -s 1'.
>>>
>>> Also, strictly speaking, it would probably be best to return threads to
>>> set 1 when NOCPU is passed to intr_event_bind() to unbind an interrupt.
>> Yup.
>
> in a quick look I didn't see anywhere that checked if the old set and
> new set are the same set..
Well, checking if old cpu _mask_ is the same as old one hasn't been
implemented for any other object type inside cpuset.
Additionally, intr_event_bind() is usually called once per irq, so the
benefit for that particular object type is questionable.
>>
>>
>> _______________________________________________
>> freebsd-hackers at freebsd.org mailing list
>> http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
>> To unsubscribe, send any mail to"freebsd-hackers-unsubscribe at freebsd.org"
>
>
>
More information about the freebsd-hackers
mailing list