ULE locking mechanism
John Baldwin
jhb at freebsd.org
Tue Feb 11 19:49:47 UTC 2014
On Tuesday, January 28, 2014 8:07:08 am Jens Krieg wrote:
> Hello,
>
> We are currently working on a project for our university. Our goal is to
> implement a simple round-robin scheduler for FreeBSD 9.2 on a single-core
> machine.
> So far we have removed most of the functionality of the ULE scheduler except
> the functions that are called from outside. The system successfully boots to
> userland with our RR scheduler managing threads in a list-based run queue.
> Further, it is possible to interact with the system using the shell.
>
> The next step is to replace the locking mechanism of the ULE scheduler.
> Therefore, we replaced the scheduler-dependent thread_lock/thread_unlock
> functions by simply disabling/enabling interrupts. With this modification
> the kernel works fine until we reach userland; then the system crashes.
> The error occurs in the init user process (init_main.c:start_init:685). We
> found that the page fault is triggered when the subyte function is executed
> for the first time. See the error description below (unfortunately not shown
> in the backtrace).
> We compared the ULE scheduler with our RR implementation, and it appears
> that the parameters passed to subyte as well as the register values are
> identical. We assume that whatever caused the error is related to the thread
> locking replacement.
>
> Every time the kernel wants to modify thread data, the corresponding thread
> is locked to prevent any interference by other threads. Since we are using a
> single-core machine, why isn’t it sufficient to simply disable interrupts
> while modifying thread data? Could you provide us with detailed information
> about the locking mechanism in FreeBSD and also answer the following
> questions, please?
>
> What is the purpose of thread_lock/thread_unlock besides protecting thread
> data?
> How does the TDQ LOCK work and how is it related to a thread LOCK?
> - all thread LOCKs of the threads located in the run queue point to the
>   TDQ LOCK, and
> - the TDQ LOCK points to the currently running thread
> - on context switching the current thread passes the TDQ LOCK to the newly
>   chosen thread
> - Could you explain the idea behind that locking concept, please?
> Any suggestions on what we should take care of in our own lock
> implementation?

thread_lock is quite intertwined with other locks. E.g. when a thread is
blocked on a turnstile, thread_lock() for that thread locks the 'ts_lock'
spin mutex for that turnstile. If you want to replace thread lock, you need
to change all the locks that td_lock can point to so that they use your new
primitive. You'd probably have an easier time just changing how
mtx_lock_spin() works. (In fact, if you just disable 'options SMP', the stock
kernel turns mtx_lock_spin() into a function that just disables interrupts.)
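
To illustrate the point about td_lock, here is a rough userspace sketch of the
idea, not the kernel code. All of the toy_* names are made up for this example,
and a plain boolean stands in for the CPU interrupt flag. It only shows that
locking a thread means locking whichever spin mutex td_lock currently points
at, and that on a uniprocessor such a spin lock can reduce to disabling
interrupts with a nesting count.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: these names are hypothetical, not FreeBSD's API. */

static bool toy_intr_enabled = true;  /* stands in for the CPU interrupt flag */
static bool toy_saved_intr;           /* flag value saved by the outermost lock */
static int  toy_spin_depth;           /* nesting depth of held spin locks */

struct toy_mtx {
	bool held;
};

struct toy_thread {
	struct toy_mtx *td_lock;  /* whichever lock currently protects the thread */
};

/* Uniprocessor spin lock: disable "interrupts" and record ownership. */
static void
toy_mtx_lock_spin(struct toy_mtx *m)
{
	if (toy_spin_depth++ == 0) {
		toy_saved_intr = toy_intr_enabled;
		toy_intr_enabled = false;
	}
	assert(!m->held);
	m->held = true;
}

static void
toy_mtx_unlock_spin(struct toy_mtx *m)
{
	assert(m->held);
	m->held = false;
	if (--toy_spin_depth == 0)
		toy_intr_enabled = toy_saved_intr;
}

/* Lock a thread by locking whatever td_lock points at right now. */
static void
toy_thread_lock(struct toy_thread *td)
{
	toy_mtx_lock_spin(td->td_lock);
}

static void
toy_thread_unlock(struct toy_thread *td)
{
	toy_mtx_unlock_spin(td->td_lock);
}

/* Hand a locked thread over to another, already held, lock. */
static void
toy_thread_lock_set(struct toy_thread *td, struct toy_mtx *new_lock)
{
	struct toy_mtx *old = td->td_lock;

	assert(old->held && new_lock->held);
	td->td_lock = new_lock;
	toy_mtx_unlock_spin(old);
}
```

In this model a thread sitting on a run queue would have td_lock pointing at
the queue's lock, and blocking on a turnstile would call toy_thread_lock_set()
to switch it to the turnstile's lock. Code that changes only thread_lock() and
leaves the other lock users alone no longer agrees with them about who holds
what.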

For your core dump, the first step would be to use gdb to map that address to
a source file and line. For example, you can just do 'l *fork_exit+0x9d', or
you can do 'l *<instruction pointer>' where you use the value from the trap
message. Looking at that can probably tell you why you panic'd.

--
John Baldwin