Some days, it doesn't pay to upgrade ...
Marc G. Fournier
scrappy at freebsd.org
Sat Mar 3 03:13:29 UTC 2007
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Based on a suggestion from someone on this list, I set up a screen session with
top running, to watch things ... again, after 3 days, the server goes 'out of
process' ... this time, of course, I could get in to look around and kill off
processes ...
From what I can tell, a cron job that does nothing but run:
ping -c 1 <host>
once a minute, with a 300 sec timeout, started to 'run over top of' itself ...
the host that it is pinging is on the same switch and has been running fine for
20 days now, and it wasn't until the last upgrade on the affected server that
these problems started ...
Coincidence? :)
I'm going to fix the script so that it doesn't try to run over itself ...
Anyone know of a problem with the fxp driver in 6-STABLE that might cause the
ping to hang?
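
One way to keep a cron job from running over itself is a non-blocking lock
file: each run tries to take an exclusive lock and simply exits if the previous
run still holds it. What follows is only a minimal sketch in Python, not the
actual script; the lock path, the host address, the 50 sec timeout and the
names try_lock and run_once are all assumptions made for illustration.

```python
import fcntl
import os
import subprocess

# Everything below is hypothetical and chosen only for this sketch.
LOCK_PATH = "/tmp/pingcheck.lock"   # assumed lock file location
HOST = "192.0.2.1"                  # placeholder address for <host>
TIMEOUT_SECS = 50                   # kept below the one-minute cron interval


def try_lock(path):
    """Return an fd holding an exclusive lock on path, or None if it is taken."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        # LOCK_NB makes flock fail immediately instead of waiting for the lock.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        os.close(fd)
        return None
    return fd


def run_once(host, lock_path=LOCK_PATH, timeout=TIMEOUT_SECS):
    """Ping host once unless a previous run still holds the lock."""
    fd = try_lock(lock_path)
    if fd is None:
        return 0  # previous run has not finished; skip this minute
    try:
        result = subprocess.run(
            ["ping", "-c", "1", host],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,  # the child is killed if it exceeds the timeout
        )
        return result.returncode
    except subprocess.TimeoutExpired:
        return 1
    finally:
        os.close(fd)  # closing the descriptor releases the lock
```

Called from cron as run_once(HOST), at most one ping is in flight at a time, so
a hanging ping can no longer pile up processes until maxproc is hit. It does
not explain why the ping hangs in the first place.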
- --On Thursday, March 01, 2007 09:51:13 +1100 Antony Mawer
<fbsd-stable at mawer.org> wrote:
> On 27/02/2007 11:59 PM, Marc G. Fournier wrote:
>> After 155 days of problem free uptime, I upgraded my 6-STABLE system the
>> other day to the latest cvsup ... 3 days later, the whole thing hung solid
>> with:
>>
>>
>> Feb 27 04:32:49 mars uptimec: The server requested that we do a new login
>> Feb 27 04:33:00 mars kernel: maxproc limit exceeded by uid 0, please see
>> tuning(7) and login.conf(5).
>> Feb 27 04:33:10 mars kernel: maxproc limit exceeded by uid 60, please see
>> tuning(7) and login.conf(5).
>>
>> Stupid question: why isn't there some mechanism that prevents new processes
>> from starting up, instead of locking up the whole server? I'm not asking
>> for the evilness of Linux, where it arbitrarily kills off existing
>> processes, but if maxproc is hit, why continue to try and start up new ones?
>
> What do you define as 'hung solid'? You are unable to get in via SSH? Or at a
> console via iLO/etc?
>
> I've seen this on some of our 6.0-RELEASE machines (along with maxpipekva
> exhausted errors), and you can't SSH in from that point... because sshd forks
> to handle the connection, and all available process slots are used up.
>
> I've thought about writing a background daemon to monitor the logs for signs
> of this (or even to just try and create a short-lived child process by
> fork()ing every 5 minutes or so), and dump information to disk then reboot
> the system when this occurs... it's a work-around for something that
> "shouldn't happen", but it does anyway... once I'm able to identify _what_ is
> causing the build-up of processes, then I might be able to do something about
> killing them...!!!
>
>
> It's quite deceptive from an end-user point of view, because things like
> Apache that are already running keep running, so all they see are strange
> bits and pieces that don't work... and as always, it's one of those things
> that only happens on some clients' machines, but never on any of our test
> machines...
>
> --Antony
>
>
> PS. I haven't disappeared off the face of the earth.. though close.. my
> fiance and I have been busy planning the wedding, and wound up buying a house
> at the same time..!! Will catch up shortly once I get a chance to come up for
> air!!
- ----
Marc G. Fournier Hub.Org Networking Services (http://www.hub.org)
Email . scrappy at hub.org MSN . scrappy at hub.org
Yahoo . yscrappy Skype: hub.org ICQ . 7615664
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (FreeBSD)
iD8DBQFF6Ofd4QvfyHIvDvMRAmoqAJ9ka8ZQxq0Ciidyy4R60bTmYfxeggCeLz7i
/De9C0Hmdqb22nErxhyUaZA=
=Seo0
-----END PGP SIGNATURE-----