How to debug rc hangs?
John Marshall
John.Marshall at riverwillow.com.au
Tue Oct 23 09:57:23 PDT 2007
Brooks Davis wrote:
> On Wed, Oct 24, 2007 at 12:06:08AM +1000, John Marshall wrote:
>> Mike Telahun Makonnen wrote:
>>> On 10/23/07, John Marshall <John.Marshall at riverwillow.com.au> wrote:
>>>> I have tried setting rc_debug="YES" in rc.conf but that doesn't show me
>>>> any more than I already know (e.g. the last line before the mountd hang is:
>>>> "/etc/rc: DEBUG: run_rc_command: doit: /usr/sbin/mountd -l").
>>> It seems to me that if it's getting this far, the problem probably is
>>> not in rc.d. The next thing it does after that debug message is eval the
>>> $doit line you saw, so either the eval command is misbehaving or the
>>> problem is with the daemon and not rc.d. What does Ctrl-T say when it
>>> hangs? Also, I noticed all three programs you listed are network daemons.
>>> My guess is they are not actually hung; they only *appear* to hang because
>>> they're waiting on some sort of network resource (DNS maybe?).
>> Thanks Mike,
>>
>> The Ctrl-T tip is the kind of information I'm looking for. My primary reason
>> for posting is to find out what tools/switches/hooks are available to help
>> troubleshoot this kind of problem, rather than asking somebody else to solve
>> it.
>>
>> Having said that, Ctrl-T shows:
>> load: 0.74 cmd: mountd 576 [nanslp] 0.00u 0.00s 0% 1428k
>> load: 0.25 cmd: mountd 576 [nanslp] 0.00u 0.00s 0% 1432k
>> load: 0.12 cmd: mountd 576 [nanslp] 0.00u 0.00s 0% 1432k
>>
>> ...which lends weight to my suspicion that a prerequisite resource is not
>> yet available - and, perhaps, hasn't yet started due to a circular
>> dependency? As I hinted, my plan is to drill down into the PROVIDE/REQUIRE
>> labyrinth and work by trial and error (with a reboot in between each error).
>> I'm happy to do that but I'm hoping that I might be able to use this
>> situation to learn of more elegant ways to diagnose the problem.
>>
>> ...and to reiterate, this is on 7.0-BETA1 (built Saturday morning) and all
>> this was working without any intervention on 6.2-RELEASE.
>
> When I see processes stalled on nanslp at boot it's usually when my network is
> messed up in some way. I think it's stuck in the resolver trying to look things
> up.
[blush] I actually fixed this 12 months ago on 6.n and forgot all about
it. I let the 7.0 mergemaster overwrite the rc.d/ypset because I didn't
think I had touched it.
Here is the fix. All happy now - but not much the wiser as to rc
troubleshooting techniques.
-----------------------------------------------
--- /usr/src/etc/rc.d/ypset 2007-10-12 12:38:42.000000000 +1000
+++ /etc/rc.d/ypset 2007-10-24 02:31:32.000000000 +1000
@@ -5,6 +5,7 @@
# PROVIDE: ypset
# REQUIRE: ypbind
+# BEFORE: mountd
. /etc/rc.subr
-----------------------------------------------
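For anyone chasing a similar ordering problem, here is a rough sketch for listing the ordering keywords that rcorder(8) reads from a set of rc.d scripts, so you can see at a glance whether something like "# BEFORE: mountd" is missing. The helper name rc_keywords is my own invention and is not part of rc.subr; the keyword list is an assumption based on the header lines shown in the diff above.

```sh
#!/bin/sh
# Hypothetical helper (not part of rc.subr): print the rcorder(8)
# keyword lines of each rc.d script given as an argument.
rc_keywords() {
    for script in "$@"; do
        # Skip anything that is not a regular file.
        [ -f "$script" ] || continue
        printf '%s\n' "== $script"
        # Keyword lines look like "# PROVIDE: name"; a script with
        # none of them is not an error, hence the "|| :".
        grep -E '^# *(PROVIDE|REQUIRE|BEFORE|KEYWORD):' "$script" || :
    done
}
```

Something like "rc_keywords /etc/rc.d/ypbind /etc/rc.d/ypset /etc/rc.d/mountd" would then show the three headers side by side. To see the order actually computed from them, "rcorder /etc/rc.d/*" prints the scripts in the sequence /etc/rc will run them.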
Thank you for your help.
--
John Marshall