powerd broken
Dominic Fandrey
kamikaze at bsdforen.de
Sun Mar 29 06:25:03 PDT 2009
Dominic Fandrey wrote:
> Alexander Motin wrote:
>> Dominic Fandrey wrote:
>>> Alexander Motin wrote:
>>>> It means that one of your CPUs spent most of it's time in interrupt
>>>> processing and so far from idle. What does `top -P` shows you? Where
>>>> have you seen that ~6% CPU load?
>>> That is the load shown by the e17 CPU module. It's display has always
>>> been in sync with top in the past, no longer though, it appears.
>>>
>>> # top -PIS
>>> last pid: 68235; load averages: 0.09, 0.16,
>>> 0.17
>>> up 0+05:17:29 13:05:10
>>> 137 processes: 4 running, 117 sleeping, 16 waiting
>>> CPU 0: 1.1% user, 0.0% nice, 1.1% system, 61.4% interrupt, 36.3% idle
>>> CPU 1: 9.0% user, 0.0% nice, 3.4% system, 0.0% interrupt, 87.6% idle
>>> Mem: 419M Active, 415M Inact, 416M Wired, 3752K Cache, 183M Buf, 716M
>>> Free
>>> Swap: 4096M Total, 4096M Free
>>>
>>> PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU
>>> COMMAND
>>> 11 root 1 171 ki31 0K 16K RUN 1 286:42 93.46%
>>> idle: cpu1
>>> 23 root 1 -80 - 0K 16K RUN 0 126:54 59.96%
>>> irq16: hdac0 uhci+
>>> 12 root 1 171 ki31 0K 16K RUN 0 179:27 40.09%
>>> idle: cpu0
>>> 1318 root 1 46 0 439M 312M select 1 12:55 1.76% Xorg
>>> 4361 musicpd 4 44 0 91164K 14412K ucond 1 1:57 0.78% mpd
>>>
>>> Some things strike me as odd. The difference between the load reported
>>> by powerd and top is still very significant and of course the high
>>> interrupt load.
>> powerd now reports/uses summary load of all CPUs (it can be bigger then
>> 100%), while top shows average.
>>
>>> I've got a mouse with a 1khz report rate (the only connected USB
>>> device), but unplugging it doesn't change the load. Neither does
>>> stopping moused (I'm running the system without HAL). There also
>>> is a fingerprint reader, but it is only detected by ugen.
>> I would start from identifying all devices sharing that IRQ and trying
>> to disable them (or unload their drivers) one by one. `systat -vm 1`
>> will show you how much interrupts actually happens there per second.
>>
>> On most of modern systems you can make hdac0 to not share that IRQ by
>> enabling MSI there with hint.hdac.0.msi=1 in loader.conf.
>>
>
> I already did unplug all devices to no avail. Afterwards I tried
> to 'kldunload -f' all usb devices, but most refused. However during
> this I recognized that /boot/modules holds an old u3g.ko, back from
> the time when it was not in stable. Apparently modules in
> /boot/modules have preference over those in /boot/kernels. I deleted
> the old u3g.ko and rebooted (because I couldn't get the system to
> unload it). Now everything is fine, the interrupt load is gone.
>
> I'll monitor this and come back here if it turns out this has just
> been coincidence.
>
> Thank you for all that help.
>
> Regards
This has turned out to be an early call. I just happened to start
skype_devel and the whole thing started over, so I think it's
pretty certain, now, that this is a hdac issue.
However after a reboot I tried to reproduce this and now skype
and mpd are both running without causing an irq race.
If the irq race reappears I will use the device hint you suggested
to rule out that this is an interrupt sharing issue.
More information about the freebsd-stable
mailing list