Intermittent system hangs on 7.2-RELEASE-p1
Linda Messerschmidt
linda.messerschmidt at gmail.com
Sat Sep 12 07:52:24 UTC 2009
OK, first, I figured out the seven second thing. I actually had
already found that particular issue earlier in the troubleshooting
process, but forgot all about it when I pulled in a second machine to
test with. It was simply a case of setting Apache's
MaxRequestsPerChild to a very low value (128) in combination with only
allowing 1 access at a time. 128 requests * (50ms sleep + 2ms request
+ overhead) ~= 7s. So that was just noise masking the real problem,
which is less frequent and less predictable. Sorry for the red
herring. :(
On Sat, Sep 12, 2009 at 2:52 AM, Linda Messerschmidt
<linda.messerschmidt at gmail.com> wrote:
> If you're asking could the check script be modified to time out after,
> say, 1 second, and if so, would it return during the hang or after it?
> I don't know. My guess based on the earlier ktrace output is that it
> would time out, but not return until the hang ended. I'll see if I
> the curl lib exposes a configurable timeout and try it.
This proved to be quite easy to do. I ran the script twice, once with
the timeout and once without.
Without timeout:
1252741492: request 910 101ms
1252741567: request 2133 1429ms
1252741603: request 2722 146ms
With 1s timeout:
1252741492: request 1078 106ms
1252741567: request 2302 1010ms (<--- Timeout)
1252741567: request 2303 273ms (<--- after 50ms sleep, goes back to
end of stall)
1252741603: request 2892 136ms
As you can see, the two scripts experience stalls in pretty much
lockstep, but the script itself does not appear affected, so it's just
on the Apache side.
More information about the freebsd-hackers
mailing list