FreeBSD 10.x + LiquidSoap + NFS == Server Hang
Marc Fournier
scrappy at hub.org
Fri Jul 4 04:30:48 UTC 2014
Hi all …
I have a jail running on FreeBSD 10-STABLE (svn update as of July 2nd @ ~05:30 UTC):
==
Working Copy Root Path: /usr/src
URL: https://svn0.us-east.freebsd.org/base/stable/10
Relative URL: ^/stable/10
Repository Root: https://svn0.us-east.freebsd.org/base
Repository UUID: ccf9f872-aa2e-dd11-9fc8-001c23d0bc1f
Revision: 268135
Node Kind: directory
Schedule: normal
Last Changed Author: pfg
Last Changed Rev: 268132
Last Changed Date: 2014-07-02 01:28:38 +0000 (Wed, 02 Jul 2014)
==
Currently the server has 3 jail’d environments running off it, with the files for them NFS mounted from a NetApp filer … and right now, the NFS mount that these jails are running from is “locked” … a ‘df’ hangs … trying to do a ‘jexec # /bin/tcsh’ into one of the jails hangs … etc.
The same NFS file system is mounted and running on a half dozen other servers, and they are all operating just fine, so the NetApp is operating properly.
If I move the jail that is running liquidsoap to a different server, the hang follows it to the new server, and the old server once more becomes rock solid …
I’m not 100% certain it is liquidsoap, but the hang appears to always coincide with reloading a new playlist … and although it happens frequently (more with recent upgrades), it doesn’t happen *every* night …
This is on a remote server … so doing things at the physical console isn’t possible. I do have a remote console on it, but I’ve never figured out how to break to the debugger through it; I’m going to work on that to see if I can get it to work …
Barring breaking to the debugger (is there a way, from the command line, to force it to break to the debugger?), is there anything else I can use to provide some sort of useful information?
ps aux for the process shows:
# ps aux | grep liq
1002 2957 0.0 0.7 226888 112792 - TLJ 4:45AM 370:27.23 /usr/local/bin/liquidsoap -q -d /usr/local/etc/liquidsoap/liquidsoap.liq
and:
# ps auxxwl | grep 2957
1002 2957 0.0 0.7 226888 112792 - TLJ 4:45AM 370:27.23 /usr/local/bin/l 1002 1 0 20 0 -
1002 96280 0.0 0.0 12316 0 - IWJ - 0:00.00 pwait 2957 1002 96274 0 52 0 kqread
root 96508 0.0 0.0 18788 1828 4 S+ 4:19AM 0:00.00 grep 2957 0 96505 0 20 0 piperd
Are there other commands I can / should run the next time it happens … ? It won’t be long before it does ...
Thanks …