Accessing struct pthread from kernel

Davide Italiano davide at freebsd.org
Sun Jul 7 22:22:19 UTC 2013


On Sun, Jul 7, 2013 at 2:34 PM, Konstantin Belousov <kostikbel at gmail.com> wrote:
> On Sat, Jul 06, 2013 at 01:22:05AM +0200, Davide Italiano wrote:
>> Hi,
>> as a preliminary step in the implementation of adaptive spinning for
>> umtx, I'm switching the pthread/umtx code so that a thread that
>> acquires a pthread_mutex writes the address of struct pthread in the
>> owner field of the lock instead of the thread id (tid). This is
>> because having the struct pthread pointer makes it easy to access
>> information about the thread, and to get the state of the thread
>> exported from the kernel (once this is implemented).
>>
>> As for the libthr side, the internal function
>> _get_curthread() goes into the TLS to obtain the struct field of
>> curthread, so I'm done.
>> OTOH, I'm quite unsure how to get the address of struct
>> pthread for curthread from the kernel side (for example, in
>> do_lock_umutex() in sys/kern/kern_umtx.c).
> You should not, see below.
>
>>
>> I guess I need to write some MD code because the TLS is different on
>> the various architectures supported by FreeBSD, and as a first step I
>> focused on amd64.
>> It looks like from the SDM that the %fs register points to the base of
>> the TLS, so I think that accessing it using curthread->td_pcb->pcb_fsbase
>> (and then adding the proper offset to access the right field) is a
>> viable solution to do this. Am I correct?
>> In particular, what worries me is whether the read of 'struct pthread' for
>> curthread from the TLS register is atomic with respect to preemption.
>>
>> Alternatively, is there an easier way to accomplish this task?
>
> Coupling the libthr thread structure and kernel makes the ABI cast in
> stone and avoids most possibilities of changing the libthr internals.
> The same is true for kernel accessing the userspace TLS area of the thread.
>
> If you want kernel<->usermode communication of the thread run state,
> and possibly also an advisory way to prevent preemption of the
> spinning usermode thread, you should create a dedicated parameter block
> communicated from usermode to the kernel on thread creation. For the main
> thread, the block could be allocated in the kernel by the image activator,
> placed on the stack, and its address passed to usermode via auxv.
>
> Note that you cannot access the usermode from the context switch
> code. Wiring the corresponding page is very wasteful (think about a
> process with 10,000 threads) and still does not provide any guarantees
> since usermode can unmap or remap the range. You might look at my 'fast
> sigprocmask' patches, where a somewhat similar idea was implemented.
>

I think the technique you used for sigprocmask is neat and could be
reused, with some modifications, for sharing thread state between
kernel and userland. Thanks.

That said, the problem I faced before was slightly different.
In order to implement adaptive spinning 'efficiently'[1], threads
waiting for a lock held by some other thread should be able to access
the owner's state easily. If I understand the kernel locking code
properly, to accomplish this, once a thread acquires a lock it writes
the address of its struct thread in the owner field of the lock, so
that other threads can easily access its state.
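
To make the owner-pointer scheme concrete, here is a minimal user-level
sketch in C11. It is not the kernel's code: struct owner_state, struct
spin_lock and the lock_*() functions are hypothetical names, and the
sketch assumes that an owner_state object is never freed while a lock
may still point at it.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-thread state; stands in for struct thread. */
struct owner_state {
	atomic_bool on_cpu;		/* true while the thread is running */
};

struct spin_lock {
	_Atomic(uintptr_t) owner;	/* 0 when free, else owner's address */
};

/* Try once to take the lock by publishing our own address. */
static bool
lock_try(struct spin_lock *l, struct owner_state *self)
{
	uintptr_t expected = 0;

	return (atomic_compare_exchange_strong_explicit(&l->owner, &expected,
	    (uintptr_t)self, memory_order_acquire, memory_order_relaxed));
}

/*
 * Spin only while the current owner is running.  Returns true if the
 * lock was acquired, false if the owner is off CPU and the caller
 * should go to sleep instead of burning cycles.
 */
static bool
lock_adaptive(struct spin_lock *l, struct owner_state *self)
{
	uintptr_t o;

	for (;;) {
		if (lock_try(l, self))
			return (true);
		o = atomic_load_explicit(&l->owner, memory_order_acquire);
		if (o == 0)
			continue;	/* released meanwhile, retry */
		if (!atomic_load_explicit(
		    &((struct owner_state *)o)->on_cpu, memory_order_relaxed))
			return (false);
	}
}

/* Release the lock; only the owner may call this. */
static void
lock_release(struct spin_lock *l, struct owner_state *self)
{
	assert(atomic_load(&l->owner) == (uintptr_t)self);
	atomic_store_explicit(&l->owner, 0, memory_order_release);
}
```

The point is that a waiter reaches the owner's state with a single
load of the lock word followed by a dereference, which is exactly
what becomes unavailable once the owner field holds a tid.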

Now, it looks like applying such a technique in userspace is just
impossible, for the reasons you mentioned in your previous mail.
The two alternatives that came to mind are:
1) maintain a hash table that keeps track of the mapping
tid->curthread, so that other threads perform a lookup on the hash
table to get access to curthread.
2) have an array indexed by tid where the state is stored, so that
other threads can check the state of the lock owner.

I do see problems with both of my ideas: 1) is clearly time inefficient,
since having to perform a hash table lookup might cancel out the speedup
obtained from the adaptive spinning implementation. 2) is clearly
space inefficient, because in most situations the array would be
'sparse'.
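
For the sake of discussion, alternative 1) could look roughly like the
sketch below: a fixed-size, open-addressing table keyed by tid. All
names (TABLE_SLOTS, tid_set_state(), tid_get_state(), enum run_state)
are hypothetical, and the sketch deliberately ignores concurrent
updates and removal of exited threads, both of which a real
implementation would have to handle.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define	TABLE_SLOTS	1024	/* hypothetical size, must be a power of two */

enum run_state { ST_UNKNOWN = 0, ST_RUNNING, ST_BLOCKED };

struct tid_slot {
	int32_t		tid;	/* 0 marks an empty slot */
	enum run_state	state;
};

static struct tid_slot table[TABLE_SLOTS];

/* Multiplicative hash of the tid, reduced to a slot index. */
static size_t
tid_hash(int32_t tid)
{
	return (((uint32_t)tid * 2654435761u) & (TABLE_SLOTS - 1));
}

/* Insert or update the state for tid; returns -1 if the table is full. */
static int
tid_set_state(int32_t tid, enum run_state st)
{
	size_t h = tid_hash(tid);

	assert(tid != 0);	/* 0 is reserved as the empty marker */
	for (size_t i = 0; i < TABLE_SLOTS; i++) {
		struct tid_slot *s = &table[(h + i) & (TABLE_SLOTS - 1)];

		if (s->tid == tid || s->tid == 0) {
			s->tid = tid;
			s->state = st;
			return (0);
		}
	}
	return (-1);
}

/* Look up the state of the lock owner; ST_UNKNOWN if tid is absent. */
static enum run_state
tid_get_state(int32_t tid)
{
	size_t h = tid_hash(tid);

	for (size_t i = 0; i < TABLE_SLOTS; i++) {
		const struct tid_slot *s = &table[(h + i) & (TABLE_SLOTS - 1)];

		if (s->tid == tid)
			return (s->state);
		if (s->tid == 0)
			break;	/* hit an empty slot: not present */
	}
	return (ST_UNKNOWN);
}
```

Every spin iteration would pay for the hash plus at least one probe,
which is the time cost mentioned in 1). Alternative 2) replaces the
probe loop with a single array index, at the price of sizing the array
for the whole tid range.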

Which do you prefer? If neither, is there an alternative I'm missing?

Thanks,

> I also recommend you to look at the Solaris schedctl(2).

I'll take a look at it in the next few hours.

[1] for some definition of 'efficient'

-- 
Davide

"There are no solved problems; there are only problems that are more
or less solved" -- Henri Poincare

