freebsd-hackers Digest, Vol 1, Issue 1
SkiEr
dm_list at mail.ru
Wed Mar 26 17:08:54 PST 2003
Hello, freebsd-hackers-request.
I'm sick of you!!!
Go check that you're sending your messages correctly.
You wrote on 26 March 2003 at 12:35:32:
fhrfo> Send freebsd-hackers mailing list submissions to
fhrfo> freebsd-hackers at freebsd.org
fhrfo> To subscribe or unsubscribe via the World Wide Web, visit
fhrfo> http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
fhrfo> or, via email, send a message with subject or body 'help' to
fhrfo> freebsd-hackers-request at freebsd.org
fhrfo> You can reach the person managing the list at
fhrfo> freebsd-hackers-owner at freebsd.org
fhrfo> When replying, please edit your Subject line so it is more specific
fhrfo> than "Re: Contents of freebsd-hackers digest..."
fhrfo> Today's Topics:
fhrfo> 1. Re: [PATCH2] PPP in -direct mode does not execute any chat
fhrfo> scripts (Maksim Yevmenkin)
fhrfo> 2. Re: shared mem and panics when out of PV Entries (Andrew Kinney)
fhrfo> 3. Re: shared mem and panics when out of PV Entries (Terry Lambert)
fhrfo> 4. Some specific questions about 5.x (Alex)
fhrfo> 5. Re: Some specific questions about 5.x (Miguel Mendez)
fhrfo> 6. Re: Some specific questions about 5.x (Terry Lambert)
fhrfo> 7. Re: Some specific questions about 5.x (Lev Walkin)
fhrfo> ----------------------------------------------------------------------
fhrfo> Message: 1
fhrfo> Date: Tue, 25 Mar 2003 17:25:35 -0800
fhrfo> From: Maksim Yevmenkin <myevmenk at exodus.net>
fhrfo> Subject: Re: [PATCH2] PPP in -direct mode does not execute any chat
fhrfo> scripts
fhrfo> To: Brian Somers <brian at Awfulhak.org>
fhrfo> Cc: hackers at FreeBSD.ORG
fhrfo> Message-ID: <3E81018F.70805 at exodus.net>
fhrfo> Content-Type: text/plain; charset=us-ascii; format=flowed
fhrfo> Hello Brian,
>> Yes, this looks fine, although I think this shows that the -direct
>> description is wrong. Perhaps this is more appropriate:
>>
>> -direct
>> This is used for communicating over an already established connection,
>> usually when receiving incoming connections accepted by getty(8). ppp
>> ignores the ``set device'' line and uses descriptor 0 as the link. ppp
>> will ignore any configured chat scripts unless the ``force-scripts''
>> option has been enabled.
>>
>> If callback....
>>
>> Do you agree with this description ? If so, I'll go ahead and commit the
fhrfo> yes, this is a more accurate description. i missed it.
>> changes. Just to be picky, I'll re-sort the OPT_ variables too :*P
fhrfo> no problem :)
>> And thanks for the patches.
fhrfo> thank you for reviewing them :)
fhrfo> max
>> On Mon, 03 Feb 2003 14:45:37 -0800, Maksim Yevmenkin wrote:
>>
>>>Dear Brian and Hackers,
>>>
>>>Please find the updated proposed version of the patch. As suggested by
>>>Warner, the option has been renamed to 'force-scripts' and now works for
>>>both 'direct' and 'dedicated' modes. Also, as suggested by Terry, the
>>>man page has been updated to document the side effect of 'direct'.
>>>
>>>-direct
>>> This is used for receiving incoming connections. ppp ignores the
>>> ``set device'' line and uses descriptor 0 as the link. ppp will
>>> never use any configured chat scripts unless ``force-scripts''
>>> option has been enabled.
>>>
>>> If callback is configured, ppp will use the ``set device'' infor-
>>> mation when dialing back.
>>>
>>>-dedicated
>>> This option is designed for machines connected with a dedicated
>>> wire. ppp will always keep the device open and will never use
>>> any configured chat scripts unless ``force-scripts'' option has
>>> been enabled.
>>>
>>>force-scripts
>>> Default: Disabled. Forces execution of the configured chat
>>> scripts in direct and dedicated modes.
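For illustration, a ppp.conf profile using the new option might look like this. This is only a sketch: the profile label is made up, and the exact "enable force-scripts" syntax is my assumption from the man page wording above ("option has been enabled"):

```text
# /etc/ppp/ppp.conf -- hypothetical profile for an incoming connection
incoming:
 set device /dev/cuaa1                 # ignored in -direct mode
 set dial "ABORT BUSY \"\" ATZ OK"     # chat script, normally skipped
 enable force-scripts                  # run the script even in -direct mode
```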
>>>
>>>
>>>>>Please find attached a patch that adds a new option to PPP.
>>>>>
>>>>>run-scripts-in-direct-mode
>>>>> Default: Disabled. This allows chat scripts to be run
>>>>> in direct mode.
>>>>>
>>>>>did i miss anything? objections? comments? reviews?
>>>>
>>>>
>>>>First comment: run it past Brian Somers <brian at Awfulhak.org>; it's
>>>>his baby, and he's the active maintainer.
>>>
>>>I have sent him e-mail.
>>>
>>>
>>>>Rest of comments:
>>>>
>>>>Actually, why doesn't "-direct" allow a chat script by default?
>>>>The man page doesn't document that as a side-effect of "-direct",
>>>>only of "-dedicated", but it's been there since the import.
>>>>
>>>>Should this really be a "negotiate" section command, rather than
>>>>just a command or a "set" command?
>>>>
>>>>Also, there are only two other commands that even have a "-" in them,
>>>>and both of them only have one (it just seems a little long, compared
>>>>to, say, "rsid" or "direct-with-script", or even "force-script").
>>>>
>>>>Personal preference: don't make it conditional on "-direct", let
>>>>it also work with "-dedicated", and call it "force-script" or
>>>>something, instead.
>>>
>>>done
>>>
>>>
>>>>The man page should be updated -- including the undocumented
>>>>side-effect of "-direct" disabling scripts.
>>>
>>>done
>>>
>>>thanks
>>>max
>>>
>>
>>
>>
fhrfo> ------------------------------
fhrfo> Message: 2
fhrfo> Date: Tue, 25 Mar 2003 17:54:28 -0800
fhrfo> From: "Andrew Kinney" <andykinney at advantagecom.net>
fhrfo> Subject: Re: shared mem and panics when out of PV Entries
fhrfo> To: Igor Sysoev <is at rambler-co.ru>
fhrfo> Cc: freebsd-hackers at FreeBSD.ORG
fhrfo> Message-ID: <3E8097D4.22759.A3FC35 at localhost>
fhrfo> Content-Type: text/plain; charset=US-ASCII
fhrfo> On 25 Mar 2003, at 17:56, Igor Sysoev wrote:
>> > So, what's the best approach to limiting memory shared via fork() or
>> > reducing PV Entry usage by that memory? Is there something I can do
>> > with the kernel config or sysctl to accomplish this?
>>
>> No, as far as I know there's no way to do it.
>> The irony is that you do not need most of these PV entries because
>> you are not swapping.
>>
fhrfo> My thoughts exactly. I suppose not all that many people run heavily
fhrfo> used web servers with 4GB of RAM, so there wouldn't be any
fhrfo> reason for this issue to come up on a regular basis.
fhrfo> I'm going to expose my newbness here with respect to BSD
fhrfo> memory management, but could the number of files served and
fhrfo> filesystem caching have something to do with the PV Entry usage
fhrfo> by Apache? We've got around 1.2 million files served by this
fhrfo> Apache. Could it be that the extensive PV Entry usage has
fhrfo> something to do with that? Obviously, not all are accessed all the
fhrfo> time, but it wouldn't take a very large percentage of them being
fhrfo> accessed to cause issues if filesystem caching is in any way
fhrfo> related to PV Entry usage by Apache.
fhrfo> I remember reading somewhere (sorry, didn't keep track of the link)
fhrfo> that someone running a heavily used Squid proxy had a very
fhrfo> similar issue with running out of PV Entries as they got more and
fhrfo> more files in the cache. Squid is basically a modified Apache with
fhrfo> proxy caching turned on.
>> I think you should try to decrease the memory shared between Apache
>> processes. If you cannot change the scripts, then the only method is
>> to decrease the number of Apache processes while still handling the
>> current workload: 1) disable keepalive if it is enabled; 2) put Apache
>> behind a reverse-proxy server that frees Apache processes
>> as soon as the proxy gets the whole response.
fhrfo> We had keepalive set to the default of "on" (at least default for this
fhrfo> install) with the default keepalive timeout of 15 seconds.
fhrfo> Dropping the keepalive timeout down to 3 seconds has
fhrfo> dramatically reduced the number of Apache processes required to
fhrfo> serve the load. With the new settings, we're averaging 30 to 80
fhrfo> Apache processes, which is much more manageable in terms of
fhrfo> memory usage, though we weren't anywhere near running out of
fhrfo> physical RAM prior to this. We're servicing a little over 1000
fhrfo> requests per minute, which by some standards isn't a huge amount.
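The keepalive change described here comes down to a couple of Apache 1.3 httpd.conf directives. A sketch, using the values from the discussion (the MaxClients figure is illustrative, not from the thread):

```text
# httpd.conf -- stop holding idle children open so long
KeepAlive On
KeepAliveTimeout 3    # was the default of 15 seconds
MaxClients 150        # upper bound on concurrent Apache processes
```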
fhrfo> We're still seeing quite heavy PV Entry usage, though. The
fhrfo> reduced number of Apache processes (by more than half) doesn't
fhrfo> seem to have appreciably reduced PV Entry usage versus the
fhrfo> previous settings, so I suspect I may have been wrong about
fhrfo> memory sharing as the culprit for the PV Entry usage. This
fhrfo> observation may just be coincidence, but the average PV Entry
fhrfo> usage seems to have gone up by a couple million entries since the
fhrfo> changes to the Apache config.
fhrfo> Time will tell if the PV Entries are still getting hit hard enough to
fhrfo> cause panics due to running out of them. They're supposed to get
fhrfo> forcibly recycled at 90% utilization from what I see in the kernel
fhrfo> code, so if we never get above 90% utilization I guess I could
fhrfo> consider the issue resolved.
fhrfo> What other things in Apache (besides memory sharing via PHP
fhrfo> and/or mod_perl) could generate PV Entry usage on a massive
fhrfo> scale?
fhrfo> Sincerely,
fhrfo> Andrew Kinney
fhrfo> President and
fhrfo> Chief Technology Officer
fhrfo> Advantagecom Networks, Inc.
fhrfo> http://www.advantagecom.net
fhrfo> ------------------------------
fhrfo> Message: 3
fhrfo> Date: Tue, 25 Mar 2003 19:28:18 -0800
fhrfo> From: Terry Lambert <tlambert2 at mindspring.com>
fhrfo> Subject: Re: shared mem and panics when out of PV Entries
fhrfo> To: andykinney at advantagecom.net
fhrfo> Cc: freebsd-hackers at FreeBSD.ORG
fhrfo> Message-ID: <3E811E52.198972EB at mindspring.com>
fhrfo> Content-Type: text/plain; charset=us-ascii
fhrfo> Andrew Kinney wrote:
>> On 25 Mar 2003, at 17:56, Igor Sysoev wrote:
>> > > So, what's the best approach to limiting memory shared via fork() or
>> > > reducing PV Entry usage by that memory? Is there something I can do
>> > > with the kernel config or sysctl to accomplish this?
>> >
>> > No, as far as I know there's no way to do it.
>> > The irony is that you do not need most of these PV entries because
>> > you are not swapping.
>>
>> My thoughts exactly. I suppose not all that many people run heavily
>> used web servers with 4GB of RAM, so there wouldn't be any
>> reason for this issue to come up on a regular basis.
fhrfo> You need the pv_entry_t's because there is one on each vm_page_t
fhrfo> for each virtual mapping for the page.
fhrfo> This is necessary to correctly mark things clean or dirty,
fhrfo> and to deal with copy-on-write.
fhrfo> What is *actually* ironic is that, for the most part, these
fhrfo> things *may* be able to be shared, if they were made slightly
fhrfo> more complex and reference counted, and you were willing to
fhrfo> split some of the copy-on-write code a little bit further
fhrfo> between machine-dependent and machine-independent.
fhrfo> Matt Dillon would be the person to talk to about this; I
fhrfo> could do it, but he'd do it faster.
>> I'm going to expose my newbness here with respect to BSD
>> memory management, but could the number of files served and
>> filesystem caching have something to do with the PV Entry usage
>> by Apache? We've got around 1.2 million files served by this
>> Apache. Could it be that the extensive PV Entry usage has
>> something to do with that? Obviously, not all are accessed all the
>> time, but it wouldn't take a very large percentage of them being
>> accessed to cause issues if filesystem caching is in any way
>> related to PV Entry usage by Apache.
fhrfo> When you fork, you copy the address space, which means you copy
fhrfo> the pv_entry_t's, so the answer is a tentative "yes". But files
fhrfo> which are not open are not mapped, so unless you have a lot of
fhrfo> mmap's hanging around, this shouldn't be an issue with System V
fhrfo> shared memory.
>> We had keepalive set to the default of "on" (at least default for this
>> install) with the default keepalive timeout of 15 seconds.
>>
>> Dropping the keepalive timeout down to 3 seconds has
>> dramatically reduced the number of Apache processes required to
>> serve the load. With the new settings, we're averaging 30 to 80
>> Apache processes, which is much more manageable in terms of
>> memory usage, though we weren't anywhere near running out of
>> physical RAM prior to this. We're servicing a little over 1000
>> requests per minute, which by some standards isn't a huge amount.
>>
>> We're still seeing quite heavy PV Entry usage, though. The
>> reduced number of Apache processes (by more than half) doesn't
>> seem to have appreciably reduced PV Entry usage versus the
>> previous settings, so I suspect I may have been wrong about
>> memory sharing as the culprit for the PV Entry usage. This
>> observation may just be coincidence, but the average PV Entry
>> usage seems to have gone up by a couple million entries since the
>> changes to the Apache config.
>>
>> Time will tell if the PV Entries are still getting hit hard enough to
>> cause panics due to running out of them. They're supposed to get
>> forcibly recycled at 90% utilization from what I see in the kernel
>> code, so if we never get above 90% utilization I guess I could
>> consider the issue resolved.
>>
>> What other things in Apache (besides memory sharing via PHP
>> and/or mod_perl) could generate PV Entry usage on a massive
>> scale?
fhrfo> Basically, you don't really care about pv_entry_t's, you care
fhrfo> about KVA space, and running out of it.
fhrfo> In a previous posting, you suggested increasing KVA_PAGES fixed
fhrfo> the problem, but caused a pthreads problem.
fhrfo> What you meant to say is that it caused a Linux threads kernel
fhrfo> module mailbox location problem for the user space Linux threads
fhrfo> library. In other words, it's because you are using the Linux
fhrfo> threads implementation, that you have this problem, not FreeBSD's
fhrfo> pthreads.
fhrfo> Probably, the Linux threads kernel module should be modified to
fhrfo> provide the mailbox location, and then the user space Linux
fhrfo> threads library should be modified to utilize sysctl to talk
fhrfo> to the kernel module, and establish the locations, so that they
fhrfo> don't have to be agreed upon at compile time for programs using
fhrfo> the code.
fhrfo> In any case, the problem you are having is because the uma_zalloc()
fhrfo> (UMA) allocator is feeling KVA space pressure.
fhrfo> One way to move this pressure somewhere else, rather than dealing
fhrfo> with it in an area which results in a panic because the code was
fhrfo> not properly retrofitted for the limitations of UMA, is to
fhrfo> preallocate the UMA region used for the "PV ENTRY" zone.
fhrfo> The way to do this is to modify /usr/src/sys/i386/i386/pmap.c
fhrfo> at about line 122, where it says:
fhrfo> #define MINPV 2048
fhrfo> to say instead:
fhrfo> #ifndef MINPV
fhrfo> #define MINPV 2048 /* default, if not specified in config */
fhrfo> #endif
fhrfo> Also, note that there is an "opt_pmap.h". To activate this option,
fhrfo> you will need to add this line to /usr/src/sys/conf/options.i386:
fhrfo> MINPV opt_pmap.h
fhrfo> With this in place, you will be able to adjust the initial minimum
fhrfo> allocations upward by saying:
fhrfo> options MINPV=4096
fhrfo> (or whatever) in your kernel config file.
fhrfo> Note: you may want to "#if 0" out the #define in pmap.c altogether,
fhrfo> to reassure yourself that this is working; it's easy to make a mistake
fhrfo> in this part of the kernel.
fhrfo> -- Terry
fhrfo> ------------------------------
fhrfo> Message: 4
fhrfo> Date: Wed, 26 Mar 2003 10:57:07 +0300
fhrfo> From: Alex <alex at dynaweb.ru>
fhrfo> Subject: Some specific questions about 5.x
fhrfo> To: FreeBSD hackers list <freebsd-hackers at freebsd.org>
fhrfo> Message-ID: <3E815D53.6010404 at dynaweb.ru>
fhrfo> Content-Type: text/plain; charset=windows-1251; format=flowed
fhrfo> Hi everybody!
fhrfo> I was very enthusiastic about the kernel threads implemented in 5.x,
fhrfo> but some ugly rumors spoiled my dreams :0)
fhrfo> So I want to find out whether these rumors are myths or not.
fhrfo> 1. Is it true that kernel threads are more "heavy" than userspace
fhrfo> ones (pthreads), and hence an application with hundreds of threads
fhrfo> will work noticeably slower than one using pthreads due to higher
fhrfo> switching penalties?
fhrfo> 2. Is it true that even 5.x has no implementation of inter-process
fhrfo> semaphores that block only the calling thread, not the whole
fhrfo> process, as is usual in FreeBSD?
fhrfo> Alex
fhrfo> ------------------------------
fhrfo> Message: 5
fhrfo> Date: Wed, 26 Mar 2003 09:18:45 +0100
fhrfo> From: Miguel Mendez <flynn at energyhq.homeip.net>
fhrfo> Subject: Re: Some specific questions about 5.x
fhrfo> To: alex at dynaweb.ru
fhrfo> Cc: FreeBSD hackers list <freebsd-hackers at freebsd.org>
fhrfo> Message-ID: <20030326091845.36425fad.flynn at energyhq.homeip.net>
fhrfo> Content-Type: text/plain; charset="us-ascii"
fhrfo> On Wed, 26 Mar 2003 10:57:07 +0300
fhrfo> Alex <alex at dynaweb.ru> wrote:
fhrfo> Howdy.
>> 1. Is it true that kernel threads are more "heavy" than userspace
>> ones (pthreads), and hence an application with hundreds of threads
>> will work noticeably slower than one using pthreads due to higher
>> switching penalties?
fhrfo> AFAIK, not in a hybrid model. Systems that do 1:1 thread mapping
fhrfo> (like Gah! Nu/Linux) will suffer from this kind of situation and
fhrfo> will also use more kernel memory. In hybrid implementations based on Scheduler
fhrfo> Activations, like FreeBSD's KSE, and NetBSD's SA, there's a balance
fhrfo> between the number of kernel virtual processors available and the number
fhrfo> of userland threads; it's an N:M model. Nathan Williams' paper on the
fhrfo> subject suggests that context switching is not much slower than in a
fhrfo> pure userland implementation. Also, keep in mind that pure userland has other
fhrfo> problems, like when one thread blocks on I/O. In pure userland threading
fhrfo> systems this means the whole process is blocked, whereas in KSE and SA
fhrfo> only that thread is stopped.
>> 2. Is it true that even 5.x has no implementation of inter-process
>> semaphores that block only the calling thread, not the whole process,
>> as is usual in FreeBSD?
fhrfo> That I don't know; perhaps the local KSE guru, Julian, might have
fhrfo> an answer for this.
fhrfo> Cheers,
fhrfo> --
fhrfo> Miguel Mendez - flynn at energyhq.homeip.net
fhrfo> GPG Public Key :: http://energyhq.homeip.net/files/pubkey.txt
fhrfo> EnergyHQ :: http://www.energyhq.tk
fhrfo> Tired of Spam? -> http://www.trustic.com
fhrfo> -------------- next part --------------
fhrfo> A non-text attachment was scrubbed...
fhrfo> Name: not available
fhrfo> Type: application/pgp-signature
fhrfo> Size: 186 bytes
fhrfo> Desc: not available
fhrfo> Url : http://lists.freebsd.org/pipermail/freebsd-hackers/attachments/20030326/6b59980f/attachment-0001.bin
fhrfo> ------------------------------
fhrfo> Message: 6
fhrfo> Date: Wed, 26 Mar 2003 00:53:59 -0800
fhrfo> From: Terry Lambert <tlambert2 at mindspring.com>
fhrfo> Subject: Re: Some specific questions about 5.x
fhrfo> To: alex at dynaweb.ru
fhrfo> Cc: FreeBSD hackers list <freebsd-hackers at freebsd.org>
fhrfo> Message-ID: <3E816AA7.1B6A02C at mindspring.com>
fhrfo> Content-Type: text/plain; charset=us-ascii
fhrfo> Alex wrote:
>> I was very enthusiastic about the kernel threads implemented in 5.x,
>> but some ugly rumors spoiled my dreams :0)
>> So I want to find out whether these rumors are myths or not.
fhrfo> 5.x does not implement traditional "kernel threads" as you appear
fhrfo> to be thinking of them. Instead, it implements a variation of
fhrfo> scheduler activations. Traditional "kernel threads" have a lot of
fhrfo> unnecessary overhead, including the CPU affinity and thread-group
fhrfo> negaffinity handling necessary for increased single-application
fhrfo> concurrency.
fhrfo> See the KSE documentation for more information.
>> 1. Is it true that kernel threads are more "heavy" than userspace
>> ones (pthreads), and hence an application with hundreds of threads will
>> work noticeably slower than one using pthreads due to higher switching penalties?
fhrfo> Yes and No.
fhrfo> See the KSE documentation for more information.
>> 2. Is it true that even 5.x has no implementation of inter-process
>> semaphores that block only the calling thread, not the whole process,
>> as is usual in FreeBSD?
fhrfo> No, for values of x > 0.
fhrfo> See the KSE documentation for more information.
fhrfo> -- Terry
fhrfo> ------------------------------
fhrfo> Message: 7
fhrfo> Date: Wed, 26 Mar 2003 01:35:37 -0800
fhrfo> From: Lev Walkin <vlm at netli.com>
fhrfo> Subject: Re: Some specific questions about 5.x
fhrfo> To: Miguel Mendez <flynn at energyhq.homeip.net>
fhrfo> Cc: FreeBSD hackers list <freebsd-hackers at freebsd.org>
fhrfo> Message-ID: <3E817469.4030403 at netli.com>
fhrfo> Content-Type: text/plain; charset=us-ascii; format=flowed
fhrfo> Miguel Mendez wrote:
>> On Wed, 26 Mar 2003 10:57:07 +0300
>> Alex <alex at dynaweb.ru> wrote:
>>
>> Howdy.
>>
>>
>>>1. Is it true that kernel threads are more "heavy" than userspace
>>>ones (pthreads), and hence an application with hundreds of threads
>>>will work noticeably slower than one using pthreads due to higher
>>>switching penalties?
>>
>>
>> AFAIK, not in a hybrid model. Systems that do 1:1 thread mapping
>> (like Gah! Nu/Linux) will suffer from this kind of situation and
>> will also use more kernel memory. In hybrid implementations based on Scheduler
>> Activations, like FreeBSD's KSE, and NetBSD's SA, there's a balance
>> between the number of kernel virtual processors available and the number
>> of userland threads; it's an N:M model. Nathan Williams' paper on the
>> subject suggests that context switching is not much slower than in a
>> pure userland implementation. Also, keep in mind that pure userland has other
>> problems, like when one thread blocks on I/O. In pure userland threading
>> systems this means the whole process is blocked, whereas in KSE and SA
>> only that thread is stopped.
fhrfo> What about Solaris' migration towards a 1:1 model from the N:M one
fhrfo> they had supported for years? Who is insane: the Solaris folks
fhrfo> (moving towards Linux) or the Free/NetBSD ones (migrating to the
fhrfo> old Solaris behavior)?
--
Best regards,
SkiEr mailto:dm_list at mail.ru