Re: Periodic rant about SCHED_ULE
- Reply: Peter : "Re: Periodic rant about SCHED_ULE"
- In reply to: Peter : "Re: Periodic rant about SCHED_ULE"
Date: Sat, 25 Mar 2023 22:35:36 UTC
> On Mar 25, 2023, at 14:51, Peter <pmc@citylink.dinoex.sub.org> wrote:
> 
> On Sat, Mar 25, 2023 at 01:41:16PM -0700, Mark Millard wrote:
> ! On Mar 25, 2023, at 11:58, Peter <pmc@citylink.dinoex.sub.org> wrote:
> 
> ! > ! 
> ! > ! At which point I get the likes of:
> ! > ! 
> ! > ! 17129 root  1  68  0  14192Ki   3628Ki RUN     13  0:20   3.95% gzip -9
> ! > ! 17128 root  1  20  0  58300Ki  13880Ki pipdwt  18  0:00   0.27% tar cvf - / (bsdtar)
> ! > ! 17097 root  1 133  0  13364Ki   3060Ki CPU13   13  8:05  95.93% sh -c while true; do :; done
> ! > ! 
> ! > ! up front.
> ! > 
> ! > Ah. So? To me this doesn't look good. If both jobs are runnable, they
> ! > should each get ~50%.
> ! > 
> ! > ! For reference, I also see the likes of the following from
> ! > ! "gstat -spod" (it is a root on ZFS context with PCIe Optane media):
> ! > 
> ! > So we might assume that indeed both jobs are runable, and the only
> ! > significant difference is that one does system calls while the other
> ! > doesn't.
> ! > 
> ! > The point of this all is: identify the malfunction with the most
> ! > simple usecase. (And for me here is a malfunction.)
> ! > And then, obviousely, fix it.
> ! 
> ! I tried the following that still involves pipe-io but avoids
> ! file system I/O (so: simplifying even more):
> ! 
> ! cat /dev/random | cpuset -l 13 gzip -9 >/dev/null 2>&1
> ! 
> ! mixed with:
> ! 
> ! cpuset -l 13 sh -c "while true; do :; done" &
> ! 
> ! So far what I've observed is just the likes of:
> ! 
> ! 17736 root  1 112  0  13364Ki   3048Ki RUN     13  2:03  53.15% sh -c while true; do :; done
> ! 17735 root  1 111  0  14192Ki   3676Ki CPU13   13  2:20  46.84% gzip -9
> ! 17734 root  1  23  0  12704Ki   2364Ki pipewr  24  0:14   4.81% cat /dev/random
> ! 
> ! Simplifying this much seems to get a different result.
> 
> Okay, then you have simplified too much and the malfunction is not
> visible anymore.
> 
> ! Pipe I/O of itself does not appear to lead to the
> ! behavior you are worried about.
> 
> How many bytes does /dev/random deliver in a single read() ?
> 
> ! Trying cat /dev/zero instead ends up similar:
> ! 
> ! 17778 root  1 111  0  14192Ki   3672Ki CPU13   13  0:20  51.11% gzip -9
> ! 17777 root  1  24  0  12704Ki   2364Ki pipewr  30  0:02   5.77% cat /dev/zero
> ! 17736 root  1 112  0  13364Ki   3048Ki RUN     13  6:36  48.89% sh -c while true; do :; done
> ! 
> ! It seems that, compared to using tar and a file system, there
> ! is some significant difference in context that leads to the
> ! behavioral difference. It would probably be of interest to know
> ! what the distinction(s) are in order to have a clue how to
> ! interpret the results.
> 
> I can tell you:
> With tar, tar can likely not output data from more than one input
> file in a single output write(). So, when reading big files, we
> get probably 16k or more per system call over the pipe. But if the
> files are significantly smaller than that (e.g. in /usr/include),
> then we get gzip doing more system calls per time unit. And that
> makes a difference, because a system call goes into the scheduler
> and reschedules the thread.
> 
> This 95% vs. 5% imbalance is the actual problem that has to be
> addressed, because this is not suitable for me, I cannot wait for my
> tasks starving along at a tenth of the expected compute only because
> some number crunching does also run on the core.
> 
> Now, reading from /dev/random cannot reproduce it. Reading from
> tar can reproduce it under certain conditions - and that is all that
> is needed.
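(As an aside on the earlier question of how many bytes /dev/random
delivers in a single read(): I have not measured that here, so the
following is only a sketch of one way to check, via truss, not
something I have validated. The /tmp file name and the roughly 1 MB
cutoff are arbitrary choices for the example.)

# Trace cat's system calls; head -c ends the run after about 1 MB,
# which makes cat exit via SIGPIPE and so ends the truss run too.
truss -o /tmp/cat_random.truss cat /dev/random | head -c 1000000 > /dev/null

# Count the traced read() calls and show a few of them; the "= N"
# at the end of each truss line is the byte count the call returned.
grep -c '^read('  /tmp/cat_random.truss
grep    '^read('  /tmp/cat_random.truss | head -3

# The write() lines similarly show how much goes into the pipe per call.
grep    '^write(' /tmp/cat_random.truss | head -3

(An analogous truss run against the tar-based pipeline should show the
per-file write() sizes that the explanation above is about.)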
The suggestion that the size of the transfers into the first pipe
matters is backed up by experiments with the likes of:

dd if=/dev/zero bs=128 | cpuset -l 13 gzip -9 >/dev/null 2>&1 &
vs.
dd if=/dev/zero bs=132 | cpuset -l 13 gzip -9 >/dev/null 2>&1 &
vs.
dd if=/dev/zero bs=133 | cpuset -l 13 gzip -9 >/dev/null 2>&1 &
vs.
dd if=/dev/zero bs=192 | cpuset -l 13 gzip -9 >/dev/null 2>&1 &
vs.
dd if=/dev/zero bs=1k | cpuset -l 13 gzip -9 >/dev/null 2>&1 &
vs.
dd if=/dev/zero bs=4k | cpuset -l 13 gzip -9 >/dev/null 2>&1 &
vs.
dd if=/dev/zero bs=16k | cpuset -l 13 gzip -9 >/dev/null 2>&1 &

(just examples), each paired up with:

cpuset -l 13 sh -c "while true; do :; done" &

Such avoids the uncontrolled variability of using tar against a
file system.

But an interesting comparison/contrast results from, for example:

dd if=/dev/zero bs=128 | cpuset -l 13 gzip -9 >/dev/null 2>&1 &
vs.
dd if=/dev/random bs=128 | cpuset -l 13 gzip -9 >/dev/null 2>&1 &

each paired with the:

cpuset -l 13 sh -c "while true; do :; done" &

At least in my context, the /dev/zero one ends up with:

18251 root  1  68  0  14192Ki   3676Ki RUN     13  0:02   1.07% gzip -9
18250 root  1  20  0  12820Ki   2484Ki pipewr  29  0:02   1.00% dd if=/dev/zero bs=128
18177 root  1 135  0  13364Ki   3048Ki CPU13   13 14:47  98.93% sh -c while true; do :; done

but the /dev/random one ends up with:

18253 root  1 108  0  14192Ki   3676Ki CPU13   13  0:09  50.74% gzip -9
18252 root  1  36  0  12820Ki   2488Ki pipewr  30  0:03  16.96% dd if=/dev/random bs=128
18177 root  1 115  0  13364Ki   3048Ki RUN     13 15:45  49.26% sh -c while true; do :; done

It appears that the CPU time (or more) for the dd feeding the first
pipe matters for the overall result, not just the bs= value used.

===
Mark Millard
marklmi at yahoo.com
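P.S.: To make that kind of bs= sweep less manual, something like the
following sh loop could run each pairing for a fixed interval and then
sample the CPU shares with ps. It is only a sketch under assumptions I
have not validated here: the 30 second settle time and the single
one-shot ps sample are just choices for the example (the bs= list and
CPU 13 are from the runs above).

#!/bin/sh
# Sketch: for each dd block size, pin a spin loop and a dd|gzip -9
# pipeline to the same CPU, let them compete for a while, then take
# one ps sample of their CPU shares.
for bs in 128 132 133 192 1k 4k 16k; do
    # The pure compute competitor, pinned to CPU 13.
    cpuset -l 13 sh -c 'while true; do :; done' &
    spin=$!

    # The pipe-fed gzip, also pinned to CPU 13; dd feeds the pipe.
    # ($! here is the PID of the last command in the pipeline, i.e. gzip.)
    dd if=/dev/zero bs="$bs" 2>/dev/null | cpuset -l 13 gzip -9 >/dev/null &
    gz=$!

    sleep 30
    echo "bs=$bs :"
    # One-shot sample; ps's %cpu is a decaying average, not an interval
    # measurement, so treat the numbers as rough.
    ps -ax -o pid,%cpu,command | grep -E 'gzip -9|while true|dd if=' | grep -v grep

    # Killing gzip makes dd exit on its next pipe write; the spin loop
    # has to be killed explicitly.
    kill "$spin" "$gz"
    wait
done

(top -P or repeated ps samples over the interval would likely give a
better picture than the single sample above.)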