Heavy I/O blocks FreeBSD box for several seconds

Mon Jul 11 16:16:54 UTC 2011

On Mon, Jul 11, 2011 at 06:07:04PM +0300, Andriy Gapon wrote:
> on 11/07/2011 17:41 Ivan Voras said the following:
> > On 07/07/2011 22:08, Steve Kargl wrote:
> > 
> >> 4BSD kernel gives for N = Ncpu + 1.
> >>
> >> 34 processes:  6 running, 28 sleeping
> >>
> >>    PID USERNAME  THR PRI NICE   SIZE    RES STATE   C   TIME    CPU COMMAND
> >>   1417 kargl       1  71    0   370M   294M RUN     0   1:30 79.39% sasmp
> >>   1416 kargl       1  71    0   370M   294M RUN     0   1:30 79.20% sasmp
> >>   1418 kargl       1  71    0   370M   294M CPU2    0   1:29 78.81% sasmp
> >>   1420 kargl       1  71    0   370M   294M CPU1    2   1:30 78.27% sasmp
> >>   1419 kargl       1  70    0   370M   294M CPU3    0   1:30 77.59% sasmp
> > 
> >> ULE kernel gives for N = Ncpu + 1.
> >>
> >> 34 processes:  6 running, 28 sleeping
> >>
> >>    PID USERNAME  THR PRI NICE   SIZE    RES STATE   C   TIME    CPU COMMAND
> >>   1318 kargl       1 103    0   370M   294M CPU0    0   1:31 100.00% sasmp
> >>   1319 kargl       1 103    0   370M   294M RUN     1   1:29 100.00% sasmp
> >>   1322 kargl       1  99    0   370M   294M CPU2    2   1:03 87.26% sasmp
> >>   1320 kargl       1  91    0   370M   294M RUN     3   1:07 60.79% sasmp
> >>   1321 kargl       1  89    0   370M   294M CPU3    3   1:06 55.18% sasmp
> > 
> > I can confirm this. Look at the priorities column for the two cases. For some
> > reason (CPU affinity?) the loads get asymmetrical on ULE.
> 
> Yeah, but what problem is demonstrated here?

That ULE cannot balance numerically intensive work, leading
to poor performance.

> Are we confident that non-even workload is inherently bad?
> E.g.:
> 79.39 + .. + 77.59 < 5 * 80 = 400
> 100.00 + ... + 55.18 ~~ 402 which is more than theoretically possible :-)
> So it would _appear_ that with ULE we get more work out of available CPUs.
> 
> But it's not clear which of the processes are slaves and which is master.
> It's also not clear why the master takes so much CPU (on par with the
> slaves) - from my reading of its description (by Steve) it should be doing
> only light periodic work.

These are all slave processes.  The master process was on a different
node in the cluster.  Each process is doing the exact same computation
with only a small change in a coordinate from (x,y,z) to (x,y+n*dy,z)
with n = 1, 2, 3, 4.  The small change does not cause a different
code path, so all should complete in nearly identical times.
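
Since the jobs are identical, one rough way to read the two top(1)
snapshots quoted above is to compare how evenly CPU time and %CPU are
spread across the five processes. The sketch below only re-does that
arithmetic on the quoted numbers; the names SNAPSHOTS, to_seconds and
summarize are made up for illustration, and the two snapshots were not
necessarily taken at the same wall-clock moment, so the totals are not
directly comparable.

```python
# TIME and %CPU columns copied from the two top(1) snapshots quoted above.
# All names in this sketch are illustrative, not part of any tool.
SNAPSHOTS = {
    "4BSD": [("1:30", 79.39), ("1:30", 79.20), ("1:29", 78.81),
             ("1:30", 78.27), ("1:30", 77.59)],
    "ULE": [("1:31", 100.00), ("1:29", 100.00), ("1:03", 87.26),
            ("1:07", 60.79), ("1:06", 55.18)],
}


def to_seconds(mmss):
    """Convert a top(1) TIME field such as '1:30' to seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def summarize(rows):
    """Return the spread of accumulated CPU time and the sum/spread of %CPU."""
    secs = [to_seconds(t) for t, _ in rows]
    pct = [p for _, p in rows]
    return {
        "time_spread_s": max(secs) - min(secs),
        "cpu_sum": round(sum(pct), 2),
        "cpu_spread": round(max(pct) - min(pct), 2),
    }


for name, rows in SNAPSHOTS.items():
    print(name, summarize(rows))
```

On these numbers the accumulated CPU time differs by 1 second between
processes under 4BSD and by 28 seconds under ULE, which is the imbalance
being discussed.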

> If it does have to do CPU-heavy work, then I'd imagine that it should
> spawn only Ncpus - 1 slaves.

And if you have M users on the system?  Also note, you can get the
exact same loading problem by launching Ncpu+1 completely independent
cpu-bound processes.  Ncpu-1 processes will be bound to specific cpus
and 2 processes will ping-pong on one cpu.  This ping-ponging will
simply kill performance.
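
A minimal sketch of that Ncpu+1 experiment, for anyone who wants to try
it without the original application: it starts N independent copies of
a busy loop and reports how long each one took. The worker loop, the
name run_workers and the iteration count are assumptions for
illustration; this is not the sasmp code, and a Python loop is only a
stand-in for real numerical work.

```python
import subprocess
import sys

# Hypothetical CPU-bound worker: a busy arithmetic loop that times itself
# and prints its own elapsed wall-clock time in seconds.
WORKER = """
import sys, time
n = int(sys.argv[1])
start = time.perf_counter()
acc = 0.0
for i in range(n):
    acc += i * 0.5
print(time.perf_counter() - start)
"""


def run_workers(n_procs, iterations):
    """Start n_procs independent copies of WORKER at once.

    Returns a list with each worker's self-measured wall time in seconds.
    """
    procs = [
        subprocess.Popen(
            [sys.executable, "-c", WORKER, str(iterations)],
            stdout=subprocess.PIPE,
            text=True,
        )
        for _ in range(n_procs)
    ]
    # All workers are already running; collect their reports one by one.
    return [float(p.communicate()[0]) for p in procs]
```

Calling run_workers(os.cpu_count() + 1, 20_000_000) (after import os) and
comparing max(times) with min(times) gives a crude measure of how evenly
the scheduler shared the CPUs among identical jobs; the iteration count
would need tuning so that each run lasts long enough to matter.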

> Also, if with ULE we get less jumping around between CPUs than with
> 4BSD, that would mean less cache misses and more useful work done.

Well, yes, fewer cache misses for the pinned processes; but no, not
more useful work done.

> Still not convinced that there is a problem with ULE here.

It's ULE.  See the last 3 years of my posts on the topic.

> I'd start with the app.

I'd switch to 4BSD ;-).  

-- 
Steve