CURRENT as gateway on not-so-fast hardware: where is a
bottleneck?
Alexander Motin
mav at FreeBSD.org
Wed Aug 15 10:18:11 UTC 2012
On 15.08.2012 03:09, Doug Barton wrote:
> On 08/14/2012 12:20 PM, Adrian Chadd wrote:
>> Would you be willing to compile a kernel with KTR so you can capture
>> some KTR scheduler dumps?
>>
>> That way the scheduler peeps can feed this into schedgraph.py (and you
>> can too!) to figure out what's going on.
>>
>> Maybe things aren't being scheduled correctly and the added latency is
>> killing performance?
>
> You might also try switching to SCHED_ULE to see if it helps.
>
> Although, in the last few months as mav has been converging the 2 I've
> started to see the same problems I saw on my desktop systems previously
> re-appear even using ULE. For example, if I'm watching an AVI with VLC
> and start doing anything that generates a lot of interrupts (like moving
> large quantities of data from one disk to another) the video and sound
> start to skip. Also, various other desktop features (like menus, window
> switching, etc.) start to take measurable time to happen, sometimes
> seconds.
>
> ... and lest you think this is just a desktop problem, I've seen the
> same scenario on 8.x systems used as web servers. With ULE they were
> frequently getting into peak load situations that created what I called
> "mini thundering herd" problems where they could never quite get caught
> up. Whereas switching to 4BSD the same servers got into high-load
> situations less often, and they recovered on their own in minutes.
It is quite pointless to speculate without real info like the KTR_SCHED
traces mentioned above. The main thing I've learned about schedulers is
that things there never work as you expect. There are too many factors
and relations to predict behavior in every case.
About Soekris and idle CPU measurement, let's start with what kind of
eventtimer is used there. Since it is a UP machine, I guess it uses the
i8254 timer in periodic mode. That means it by definition can't properly
measure load from threads running from hardclock, such as dummynet,
polling netisr threads, etc.
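
To illustrate why tick-aligned sampling misses such threads, here is a
toy model in Python. It is only a sketch under my own assumptions, not
the kernel's accounting code: a thread is woken by every tick and runs
for work_us microseconds, and a sampler charges whatever was running
just before each sample instant. The function name and all numbers are
made up for illustration.

```python
def sampled_load(sample_period_us, tick_us=1000, work_us=200, samples=1000):
    """Fraction of samples that catch the tick-driven thread running.

    Toy model: the thread starts right after each tick and runs for
    work_us, so it occupies the phase window (0, work_us] of every tick.
    """
    busy = 0
    for n in range(1, samples + 1):
        # Position of this sample inside the current tick period.
        phase = (n * sample_period_us) % tick_us
        # A sample taken exactly on the tick (phase 0) sees the CPU idle,
        # because the thread from the previous tick has already finished.
        if 0 < phase <= work_us:
            busy += 1
    return busy / samples


true_load = 200 / 1000             # the thread really uses 20% of the CPU
aligned = sampled_load(1000)       # sampler driven by the same periodic tick
unaligned = sampled_load(7873)     # hypothetical sampler with an unrelated period

print(true_load, aligned, unaligned)
```

In this model the aligned sampler reports no load at all, while the
sampler with an unrelated period recovers the real 20%.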
As for playing AVIs and using other GUIs, the key word here, and for
ULE in general, is interactivity. ULE gives a huge boost to threads it
counts as interactive. Disk I/O is a good candidate for that, as by
definition it does many voluntary sleeps while waiting for data. If it
were not counted as interactive, it would suffer heavily from latency
while waiting for other threads. Modern heavy GUIs and video codecs, at
the same time, may consume CPU time continuously for long periods. On
busy machines they may never sleep at all, trying to catch up with the
incoming data rate. That can make ULE count them as batch and so less
preferred than I/O. As I've said above, let's try to collect some real
data first.
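
A rough sketch of the idea in Python, to make the effect concrete. It is
a simplified model loosely following the shape of the interactivity
score in sched_ule.c, not the kernel code itself; the constant names,
the threshold of 30 (what I assume kern.sched.interact defaults to) and
the time units are assumptions for illustration only.

```python
INTERACT_MAX = 100
INTERACT_HALF = INTERACT_MAX // 2
INTERACT_THRESH = 30  # assumed default of kern.sched.interact


def interact_score(runtime, sleeptime):
    """Score from 0 (sleeps almost always) to 100 (runs almost always)."""
    if runtime > sleeptime:
        # Mostly running: the score lands in the upper half of the range.
        div = max(1, runtime // INTERACT_HALF)
        return INTERACT_HALF + (INTERACT_HALF - sleeptime // div)
    if sleeptime > runtime:
        # Mostly sleeping voluntarily: the score lands in the lower half.
        div = max(1, sleeptime // INTERACT_HALF)
        return runtime // div
    # Equal run and sleep time sits in the middle; no history scores 0.
    return INTERACT_HALF if runtime else 0


def is_interactive(runtime, sleeptime, thresh=INTERACT_THRESH):
    """A thread counts as interactive while its score is below thresh."""
    return interact_score(runtime, sleeptime) < thresh


# A disk-copy-like thread: short bursts of work, long waits for data.
print(interact_score(1000, 9000), is_interactive(1000, 9000))
# A busy video decoder: runs nearly all the time, hardly ever sleeps.
print(interact_score(9000, 1000), is_interactive(9000, 1000))
```

With these made-up numbers the I/O-like thread is treated as interactive
and the decoder as batch, which is the ordering described above.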
If somebody still wants an area for experiments, there is always some:
 - if you want the video player not to lag, set a negative nice value
   for it (ULE is not a magician to guess user wishes);
 - the same, I guess, goes for the Xorg process;
 - there are a number of sysctls ULE provides:
    - kern.sched.interact -- a value in percent specifying how much run
      time a thread may have and still be counted as interactive;
    - kern.sched.slice or the new kern.sched.quantum -- specifying the
      interval between context switches for non-interactive threads,
      historically set to 100ms. That may be too long now. Reducing it
      may make the system run more smoothly, while the price of those
      switches is probably not so significant now.
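
Some back-of-envelope arithmetic on the slice length, as a Python
sketch. It assumes plain round-robin among equal non-interactive threads
on one CPU, which is a simplification of what ULE really does; the
function names and the thread count of 4 are hypothetical.

```python
def worst_case_wait_ms(runnable_threads, slice_ms):
    """Longest time a thread waits for its turn under plain round-robin."""
    # Every other runnable thread may use a full slice before we run again.
    return (runnable_threads - 1) * slice_ms


def switches_per_second(slice_ms):
    """Upper bound on slice-driven context switches per second on one CPU."""
    return 1000 / slice_ms


for slice_ms in (100, 25):
    print(slice_ms, worst_case_wait_ms(4, slice_ms), switches_per_second(slice_ms))
```

Under these assumptions, four competing batch threads with a 100ms
slice can each wait up to 300ms for the CPU, enough for video to skip.
A 25ms slice cuts that to 75ms at the cost of 40 instead of 10
slice-driven switches per second.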
--
Alexander Motin