How to debug rc hangs?

John Marshall John.Marshall at riverwillow.com.au
Tue Oct 23 09:57:23 PDT 2007


Brooks Davis wrote:
> On Wed, Oct 24, 2007 at 12:06:08AM +1000, John Marshall wrote:
>>  Mike Telahun Makonnen wrote:
>>> On 10/23/07, John Marshall <John.Marshall at riverwillow.com.au> wrote:
>>>> I have tried setting rc_debug="YES" in rc.conf but that doesn't show me
>>>> any more than I already know (e.g. the last line before the mountd hang is:
>>>> "/etc/rc: DEBUG: run_rc_command: doit: /usr/sbin/mountd -l").
>>> It seems to me that if it's getting this far, the problem probably is
>>> not in rc.d. The next thing it does after that debug message is eval the
>>> $doit line you saw, so either the eval command is misbehaving or the
>>> problem is with the daemon and not rc.d. What does Ctrl-T say when it
>>> hangs? Also, I noticed all three programs you listed are network daemons.
>>> My guess is they are not actually hung; they only *appear* to hang because
>>> they're waiting on some sort of network resource (DNS maybe?).
>>  Thanks Mike,
>>
>>  The ctrl-T tip is the kind of information I'm looking for. My primary reason 
>>  for posting is to find out what tools/switches/hooks are available to help 
>>  troubleshoot this kind of problem, rather than asking somebody else to solve 
>>  it.
>>
>>  Having said that, ctrl-T shows:
>>   load: 0.74 cmd: mountd 576 [nanslp] 0.00u 0.00s 0% 1428k
>>   load: 0.25 cmd: mountd 576 [nanslp] 0.00u 0.00s 0% 1432k
>>   load: 0.12 cmd: mountd 576 [nanslp] 0.00u 0.00s 0% 1432k
>>
>>  ...which lends weight to my suspicion that a prerequisite resource is not 
>>  yet available - and, perhaps, hasn't yet started due to a circular 
>>  dependency? As I hinted, my plan is to drill down into the PROVIDE/REQUIRE 
>>  labyrinth and work by trial and error (with a reboot in between each error). 
>>  I'm happy to do that but I'm hoping that I might be able to use this 
>>  situation to learn of more elegant ways to diagnose the problem.
>>
>>  ...and to reiterate, this is on 7.0-BETA1 (built Saturday morning) and all 
>>  this was working without any intervention on 6.2-RELEASE.
> 
> When I see processes stalled on nanslp at boot it's usually when my network is
> messed up in some way.  I think it's stuck in the resolver trying to look things
> up.
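
On the question of tools and switches: one generic option, beyond rc_debug
and Ctrl-T, is to re-run the suspect script by hand once the system is up,
with the shell's own -x tracing enabled. The sketch below is only an
illustration. The helper name trace_rc is hypothetical (it is not part of
rc.subr), and a manual run may not reproduce the conditions that exist during
boot.

```sh
#!/bin/sh
# trace_rc SCRIPT [ACTION]
# Run one rc.d script by hand with sh's -x tracing switched on, so every
# command is echoed to stderr (prefixed with "+ ") before it executes.
# Hypothetical helper, not part of rc.subr. ACTION defaults to "start".
trace_rc() {
    script=$1
    action=${2:-start}
    sh -x "$script" "$action"
}
```

For example, "trace_rc /etc/rc.d/mountd start 2>/tmp/mountd.trace" (as root)
leaves the trace in a file; if the command stalls, the last line of that file
is the last command the shell began to execute.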

[blush] I actually fixed this 12 months ago on 6.n and forgot all about 
it. I let the 7.0 mergemaster overwrite the rc.d/ypset because I didn't 
think I had touched it.

Here is the fix. All happy now - but not much the wiser as to rc 
troubleshooting techniques.

-----------------------------------------------
--- /usr/src/etc/rc.d/ypset     2007-10-12 12:38:42.000000000 +1000
+++ /etc/rc.d/ypset  2007-10-24 02:31:32.000000000 +1000
@@ -5,6 +5,7 @@

 # PROVIDE: ypset
 # REQUIRE: ypbind
+# BEFORE:  mountd

 . /etc/rc.subr

-----------------------------------------------
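
The effect of a keyword change like this can be checked without a reboot,
because /etc/rc obtains its start order from rcorder(8). The following is a
rough sketch, not a definitive tool: rc_deps and rc_order_of are hypothetical
helper names, and rc_order_of assumes a FreeBSD system with the stock scripts
in /etc/rc.d.

```sh
#!/bin/sh
# rc_deps FILE...
# Print the dependency keyword lines of the given rc.d scripts, each
# prefixed with its file name. /dev/null is passed as an extra file so that
# grep prints the file name even when only one script is given.
rc_deps() {
    grep -E '^#[[:space:]]*(PROVIDE|REQUIRE|BEFORE|KEYWORD):' /dev/null "$@"
}

# rc_order_of PATTERN
# Number the scripts in the order rcorder(8) computes and keep only those
# whose file name matches the extended regex PATTERN. Warnings from rcorder
# (for example about circular dependencies) stay visible on stderr.
rc_order_of() {
    rcorder /etc/rc.d/* | grep -n -E "/($1)\$"
}
```

Running "rc_deps /etc/rc.d/ypbind /etc/rc.d/ypset /etc/rc.d/mountd" lists
what each script declares, and "rc_order_of 'ypbind|ypset|mountd'" shows
their relative positions. With the BEFORE line in place, ypset should appear
with a lower line number than mountd.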

Thank you for your help.


-- 
John Marshall


More information about the freebsd-rc mailing list