Constant load of 1 on a recent 12-STABLE
Gordon Bergling
gbergling at googlemail.com
Thu Jun 4 12:37:27 UTC 2020
Hi Allan,
On Wed, Jun 03, 2020 at 05:33:37PM -0400, Allan Jude wrote:
> On 2020-06-03 16:29, Gordon Bergling wrote:
> > Hi Allan,
> >
> > On Wed, Jun 03, 2020 at 03:13:47PM -0400, Allan Jude wrote:
> >> On 2020-06-03 06:16, Gordon Bergling via freebsd-hackers wrote:
> >>> for a while now I have been seeing a constant load of 1.00 on 12-STABLE,
> >>> but all CPUs are shown as 100% idle in top.
> >>>
> >>> Does anyone have an idea what could have caused this?
> >>>
> >>> The load seems to be somewhat real, since the build times on this
> >>> machine for -CURRENT increased from about 2 hours to 3 hours.
> >>>
> >>> This is a virtualized system running on Hyper-V, if that matters.
> >>>
> >>> Any hints are more than appreciated.
> >>>
> >> Try running 'top -SP' and see if that shows a specific CPU being busy,
> >> or a specific process using CPU time
> >
> > Below is the output of 'top -SP'. The only relevant process / thread that
> > consumes CPU time relatively constantly seems to be 'zfskern'.
> >
> > -----------------------------------------------------------------------------
> > last pid: 68549; load averages: 1.10, 1.19, 1.16 up 0+14:59:45 22:17:24
> > 67 processes: 2 running, 64 sleeping, 1 waiting
> > CPU 0: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 100% idle
> > CPU 1: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 100% idle
> > CPU 2: 0.0% user, 0.0% nice, 0.4% system, 0.0% interrupt, 99.6% idle
> > CPU 3: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 100% idle
> > Mem: 108M Active, 4160M Inact, 33M Laundry, 3196M Wired, 444M Free
> > ARC: 1858M Total, 855M MFU, 138M MRU, 96K Anon, 24M Header, 840M Other
> > 461M Compressed, 1039M Uncompressed, 2.25:1 Ratio
> > Swap: 2048M Total, 2048M Free
> >
> > PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND
> > 11 root 4 155 ki31 0B 64K RUN 0 47.3H 386.10% idle
> > 8 root 65 -8 - 0B 1040K t->zth 0 115:39 12.61% zfskern
> > -------------------------------------------------------------------------------
> >
> > The only key performance indicator that is relatively high IMHO, for a
> > non-busy system, is the number of context switches that vmstat reports.
> >
> > -------------------------------------------------------------------------------
> > procs memory page disks faults cpu
> > r b w avm fre flt re pi po fr sr da0 da1 in sy cs us sy id
> > 0 0 0 514G 444M 7877 2 7 0 9595 171 0 0 0 4347 43322 17 2 81
> > 0 0 0 514G 444M 1 0 0 0 0 44 0 0 0 121 40876 0 0 100
> > 0 0 0 514G 444M 0 0 0 0 0 40 0 0 0 133 42520 0 0 100
> > 0 0 0 514G 444M 0 0 0 0 0 40 0 0 0 120 43830 0 0 100
> > 0 0 0 514G 444M 0 0 0 0 0 40 0 0 0 132 42917 0 0 100
> > --------------------------------------------------------------------------------
> >
> > Any other ideas what could generate that load?
>
> I agree that load average looks out of place here when you look at the %
> cpu idle, but I wonder if it is caused by a lot of short lived processes
> or threads.
>
> How quickly is the 'last pid' number going up?
>
> You might also look at `zpool iostat 1` or `gstat -p` to see how busy
> your disks are

In the idle state the last pid does not change for at least 60 seconds.
During 'buildworld' it of course changes much faster, but a "-j 4" build
results in a load average of about 5.0, so the underlying problem still
persists. 'zpool iostat 1' and 'gstat -p' don't show anything
suspicious.

I received a private mail that made me aware of PR173541, where this problem
is documented. I'll add my hardware information and performance measurements
to it when I find some time.

I am currently thinking about how to measure the number of threads spawned
per second. Do you have an idea how to do it?
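
One possible starting point, as a rough sketch only: sample the kernel's fork
counters twice and divide the difference by the elapsed time. The sysctl names
below (vm.stats.vm.v_forks, v_vforks, v_rforks, v_kthreads) are assumed to be
present on this system, and the helper names read_counters, rates and sample
are made up for illustration. These counters track fork-style creations, so
threads created inside an already running process would not show up here.

```python
import subprocess
import time

# FreeBSD VM-meter fork counters; assumed to exist on this 12-STABLE system.
COUNTERS = (
    "vm.stats.vm.v_forks",     # fork() calls
    "vm.stats.vm.v_vforks",    # vfork() calls
    "vm.stats.vm.v_rforks",    # rfork() calls
    "vm.stats.vm.v_kthreads",  # fork-style creations done by the kernel
)


def read_counters(names=COUNTERS):
    """Return {name: value} by calling `sysctl -n` once for all names."""
    out = subprocess.run(
        ["sysctl", "-n", *names],
        capture_output=True, text=True, check=True,
    ).stdout
    # `sysctl -n` prints one value per line, in the order requested.
    return dict(zip(names, (int(v) for v in out.split())))


def rates(before, after, seconds):
    """Per-second increase of each counter between two samples.

    Counter wrap-around is not handled in this sketch.
    """
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return {name: (after[name] - before[name]) / seconds for name in before}


def sample(interval=10.0):
    """Take two samples `interval` seconds apart and return the rates."""
    start = time.monotonic()
    before = read_counters()
    time.sleep(interval)
    after = read_counters()
    return rates(before, after, time.monotonic() - start)
```

Calling sample(10) on the idle machine and again during 'buildworld' would
give two sets of creations per second to compare. If the idle numbers stay
near zero, short-lived processes are probably not the source of the load.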
Best regards,
Gordon
--
Gordon Bergling
Mobile: +49 170 23 10 948
Web: https://www.gordons-perspective.com/
Mail: gbergling at gmail.com
Think before you print!