Unresponsive jails issues

Grzegorz Junka list1 at gjunka.com
Mon May 16 12:55:05 UTC 2016


I have a server running 13 jails for various system services. Recently I 
added two jails to run simple Go applications for testing. Each 
application opens a network socket, and nginx, which runs in another 
jail, balances requests between them round-robin. I mention this because 
it may be related, though not necessarily, since the problem was already 
happening before I added them.

The problem is that every 2-3 days the jails on my server stop 
responding. "jexec jailname tcsh" hangs forever, and "service jail stop 
jailname" hangs forever as well. "top" doesn't show anything suspicious. 
I can log in to the main server through SSH fine. I don't log in to the 
jails through SSH, so I can't check that way, but they appear to stop 
responding entirely, because the services running in them (e.g. web 
server, IMAP) stop too. I tried to "kill -9" the hanging "jexec" 
process, but that doesn't work.

My first question is what evidence should I gather when that happens so 
that I can investigate the issue later on after the server is restarted?

My second question: any idea why that might be happening in the first 
place?

I am running FreeBSD 10.3 amd64, updated from 10.2 a couple of weeks ago.

Grzegorz


