Re: top vs. array sorted_state's out of bounds indexing (and related issues)

From: Konstantin Belousov <kostikbel_at_gmail.com>
Date: Tue, 04 Jul 2023 04:34:35 UTC
On Sun, Jul 02, 2023 at 09:52:46AM -1000, Clifton Royston wrote:
> On 6/19/2023 6:05 AM, Konstantin Belousov wrote:
> > On Sun, Jun 18, 2023 at 09:23:06PM -0700, Mark Millard wrote:
> > > usr.bin/top/machine.c (from main) has:
> > > 
> > > /*
> > > . . . The process states are ordered as follows (from least
> > >   *      to most important):  WAIT, zombie, sleep, stop, start, run.  The
> > >   *      array declaration below maps a process state index into a number
> > >   *      that reflects this ordering.
> > >   */
> > > 
> > > static int sorted_state[] = {
> > >          0,      /* not used             */
> > >          3,      /* sleep                */
> > >          1,      /* ABANDONED (WAIT)     */
> > >          6,      /* run                  */
> > >          5,      /* start                */
> > >          2,      /* zombie               */
> > >          4       /* stop                 */
> > > };
> > > 
> > > Note the elements are 0..6, so 7 elements.
> > > 
> > > It is accessed via code like:
> > > 
> > > sorted_state[(unsigned char)(b)->ki_stat]
> [...]
> > > /*
> > >   * These were process status values (p_stat), now they are only used in
> > >   * legacy conversion code.
> > >   */
> > > #define SIDL    1               /* Process being created by fork. */
> > > #define SRUN    2               /* Currently runnable. */
> > > #define SSLEEP  3               /* Sleeping on an address. */
> > > #define SSTOP   4               /* Process debugging or suspension. */
> > > #define SZOMB   5               /* Awaiting collection by parent. */
> > > #define SWAIT   6               /* Waiting for interrupt. */
> > > #define SLOCK   7               /* Blocked on a lock. */
> > > 
> [...]
> > > There is also the issue that:
> > > 
> > > SIDL   is misidentified as: sleep
> > > SRUN   is misidentified as: ABANDONED (WAIT)
> > > SSLEEP is misidentified as: run
> > > SZOMB  is misidentified as: start
> > > SWAIT  is misidentified as: zombie
> > > SLOCK  is out of bounds (as indicated above).
> > > 
> > > So the sort order in top is not as documented.
> > > 
> > > 
> > > For reference, from sys/kern/kern_proc.c :
> > > 
> > >          if (p->p_state == PRS_NORMAL) { /* approximate. */
> > >                  if (TD_ON_RUNQ(td) ||
> > >                      TD_CAN_RUN(td) ||
> > >                      TD_IS_RUNNING(td)) {
> > >                          kp->ki_stat = SRUN;
> > >                  } else if (P_SHOULDSTOP(p)) {
> > >                          kp->ki_stat = SSTOP;
> > >                  } else if (TD_IS_SLEEPING(td)) {
> > >                          kp->ki_stat = SSLEEP;
> > >                  } else if (TD_ON_LOCK(td)) {
> > >                          kp->ki_stat = SLOCK;
> > >                  } else {
> > >                          kp->ki_stat = SWAIT;
> > >                  }
> > >          } else if (p->p_state == PRS_ZOMBIE) {
> > >                  kp->ki_stat = SZOMB;
> > >          } else {
> > >                  kp->ki_stat = SIDL;
> > >          }
> > https://reviews.freebsd.org/D40607
> 
>   I rarely comment here and hesitate to now, but out of curiosity I looked
> at the original report and at the chain of commits.
> 
>   It appears to me that in making the code more readable the final result
> may have accidentally inverted the sort order from the intended values
> (mostly.)  A casual test wouldn't show this, as unless the sort order in top
> were changed, normally it would only come into play for two processes with
> equal CPU % and ticks which seems unlikely.
> 
>   Perhaps I'm simply misreading the original which was itself confused on
> the index values, but it appears from the original *comments*, as opposed to
> the effects, that higher values are intended to sort as higher list
> priority. For example, run was originally supposed to be mapped to 6
> (largest value), start to 5, and so on down to zombie mapped to 2 and WAIT
> to 1.
IMO no, the lower values put the process closer to the start of the sorted
list.  Could you provide an argument that this is not true?

The chain is
	ORDERKEY_STATE
	compare_XXX
	compares
	get_process_info
	qsort(compares[idx])
The result is that lowest weight process appears at the beginning of the
sorted process list.  This is exactly what we want, IMHO: running first,
sleeping last, and other states ordered somewhat arbitrary in between.

> 
>   The rewrite with designated initializers in  rGbc2ac2585aa8 now maps - in
> descending value order -  [SSLEEP] = 6, [SSTOP] = 5, [SWAIT] = 4 ... [SRUN]
> = 1.
> 
>   As long as this is being fixed, shouldn't it read more like this:
> 
> /*
> 
>  *  proc_compare - comparison function for "qsort"
> 
>  *    Compares the resource consumption of two processes using five
> 
>  *    distinct keys.  The keys (in descending order of importance) are:
> 
>  *    percent cpu, cpu ticks, state, resident set size, total virtual
> 
>  *    memory usage.  The process states are ordered as follows (from most
> 
>  *    to least important):  run, zombie, idle, interrupt wait, stop, sleep.
> 
>  *    The array declaration below maps a process state index into a
> 
>  *    number that reflects this ordering.
> 
>  */
> 
> static const int sorted_state[] = {
> 
>     [SRUN] =    6,    /* running/runnable    */
> 
>     [SZOMB] =    5,    /* zombie        */
> 
>     [SIDL] =    4,    /* being created    */
> 
>     [SWAIT] =    3,    /* intr            */
> 
>     [SSTOP] =    2,    /* stopped/suspended    */
> 
>     [SSLEEP] =    1,    /* sleeping        */
> 
> 
>   I haven't read the entire context of how the values get used in the
> comparison, so it's possible I have this wrong.
> 
>   I'm assuming where SIDL and SWAIT should fall based on the revised
> comments in rGd636fc5bd1e2 and the assumption that the original comment
> "(from least to most important)" was misread as "(from most to least
> important)" as I think one would normally want to see the "run" processes
> ahead of the "sleep" processes etc.
> 
>   Best regards,
> 
>   -- Clifton
> 
> (My apologies if this comes out garbled, as seemingly I have to fight
> Thunderbird to edit for plain text output.)
> 
> -- 
> 
> Clifton Royston <cliftonr@volcano.org>
> 
>