Call for testers: New PID allocator patch for -CURRENT

Jun Su junsu at delphij.net
Sun Feb 1 07:59:06 PST 2004


----- Original Message ----- 
From: "Peter Jeremy" <PeterJeremy at optushome.com.au>
To: "Jun Su" <junsu at delphij.net>
Cc: "David Schultz" <das at freebsd.org>; "Xin LI" <delphij at frontfree.net>;
<hackers at freebsd.org>; "Olivier Houchard" <cognet at ci0.org>;
<current at freebsd.org>; "John Baldwin" <jhb at freebsd.org>
Sent: Sunday, February 01, 2004 5:50 AM
Subject: Re: Call for testers: New PID allocator patch for -CURRENT


> On Fri, Jan 30, 2004 at 11:02:08PM +0800, Jun Su wrote:
> >I think this algorithms for proc_alloc, pfind and zpfind are all O(1).
The
> >worst situation is that
> >it reaches PID_MAX. In this situation, we need expand our pidtbl. This
may
> >bring some delay. However, this situation will only occurs few times. If
a
> >machine need 1000 concurrent process, it will only need a 2 ^ 10 slots
> >process table. From the original 2 ^ 5 table, we need only 5 times to
expend
> >the table.
>
> What was the reason for picking an initial value of 32?  A typical
> -CURRENT system will have more than this many processes running by the
> time rc.d is finished.  A more reasonable initial value would seem to
> be 2^7.  Compaq/HP Tru64 uses a similar pid allocation strategy and it
> appears to start with a table of around 2^10.
>
The value is from netbsd. I didn't give much thinking here. Thank you.

> The memory requirements for embedded systems could be reduced by making
> the initial value tunable - either at boot time or config time.
>
Good point. I will try to add this as a config value.
> >> [2] Many systems have a high enough fork rate that pids recycle
> >>     every few minutes or hours with the present algorithm.  These
> >>     systems don't necessarily have lots of processes running at any
> >>     given time, so the table (and thus the cycle length) in your
> >>     patch could remain relatively small if I'm interpreting the
> >>     code correctly.  I think the code would have to be changed to
> >>     prevent reuse from happening too quickly in wall time.
> >Reusing the proc slot doesn't mean reusing the pid. Everytime we
> >reuse a proc slot, we will add pidtbl_mask to the pid. We reuse
> >the pid when the pid reach the max_pid. Therefore if a user wants, he can
> >increase
> >the max_pid to a larger number to increase the period that the pid is not
be
> >reused. I will add a sysctl to allow user to adjust max_pid.
>
> I don't believe it's reasonable to just create a max_pid sysctl and
> expect users to tune this to avoid obscure system misbehaviour.
>
> If the maximum number of simultaneous processes is just below a power
> of two then there are very few free slots.  These slots will therefore
> be reused very rapidly.  Even taking into account the pid_tbl_mask,
> the pid's could be reused quite rapidly - especially since pid_max
> may only be twice pid_tbl_mask.
Yes. You are right. In this scenerio, the pid reuse will rapidly.

>
> The code does include a comment "ensure pids cycle through 2000+
> values" but it's very unclear how this actually works - pid_alloc_cnt
That comment is from netbsd... I also am confused about that.

> is just a count of the used slots in pid_table and pid_alloc_lim
> starts off as the size of pid_table and either doubles or triples
> whenever the pid_table size doubles.  I can't see any mechanism to
> ensure any minimum pid cycle length longer than 2.
>
> Note that many system utilities "know" that pids can be represented
> in 5 digits and having max_pid exceed 99999 will disrupt output from
> top, ps, lsof, pstat etc.  This places an upper limit on how high
> max_pid can be realistically tuned.
I did this check in my new patch. I will post my new patch later.

>
> Rather than doubling the size of pid_table only if the existing table
> is full, you need a mechanism to also double the pid_table size and/or
> increase max_pid if pids are reused too quickly.  A simple check would
> be ((pid_tbl_mask - pid_alloc_cnt) * pid_max / pid_table_mask < N)
> [for some suitable N].  It would be nice if N depended on the fork()
> rate but this may make to code sensitive to fork bombs.
Good suggestion. I will try to implement this.

Appreciate your feedback.

Jun Su

>
> Peter
>
>



More information about the freebsd-hackers mailing list