Unable to stop a jail

Robert Watson rwatson at FreeBSD.org
Fri Dec 1 03:15:47 PST 2006


On Fri, 1 Dec 2006, Bjoern A. Zeeb wrote:

> On Fri, 1 Dec 2006, Steven Hartland wrote:
>
>> We've got a jail here which we can't stop with either killall jexec or jkill; 
>> all return success but jls still reports the jail as running.
>> 
>> The machine's running several other jails which I can't restart at this time, 
>> so I ended up starting the jail again. jls now reports:
>>  JID  IP Address      Hostname   Path
>>    9  10.10.0.5     jail6        /usr/local/jails/jail6
>>    7  10.10.0.5     jail6        /usr/local/jails/jail6
>>    6  10.10.0.4     jail5        /usr/local/jails/jail5
>>    5  10.10.0.39    jail4        /usr/local/jails/jail4
>>    3  10.10.0.6     jail3        /usr/local/jails/jail3
>>    2  10.10.0.8     jail2        /usr/local/jails/jail2
>>    1  10.10.0.7     jail1        /usr/local/jails/jail1
>> 
>> Host machine is running FreeBSD-6.1-P10
>> 
>> Any ideas? Some sort of kernel data corruption?
>
> No, the jails should really be gone (you should not find any sockets or 
> processes for them after some seconds) - at least it should be that way...
>
> See http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/89528

Not all cases of straggling jails are leaks -- does netstat -n show that all 
the TIME_WAIT TCP connections in the jail have been GC'd?  Because security 
state may be used in the network stack for TCP packet transmission/reception, 
the ucred remains referenced until the last socket/pcb associated with it is 
freed, and the jail stays visible in jls for as long as that ucred references 
its prison.  I've been wondering if we should add a jail process counter, and 
hide jails in jls if the counter is zero (with a -a argument or such to show 
them). 
One idea I've been kicking around is adding a zombie state for jails, in which 
some straggling references exist, but (a) there are no processes in the jail, 
and (b) no new processes are allowed to enter the jail.  The significance of 
(b) is that we could vrele() the vnode reference hung off the jail; there's 
been at least one report that this vnode reference causes issues, as the file 
system it's from can't be unmounted until the last jail reference evaporates.
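
To make the zombie idea concrete, here is a rough userland sketch in C.  It is 
not kernel code: struct jail_sketch, the JAIL_ALIVE/JAIL_ZOMBIE states and the 
jail_sketch_* functions are all hypothetical names invented for illustration, 
and the choice of EINVAL for a refused attach is an assumption.  It only 
models conditions (a) and (b) and the proposed jls filtering.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical states; nothing like this exists in the 6.x struct prison. */
enum jail_state { JAIL_ALIVE, JAIL_ZOMBIE };

struct jail_sketch {
	int		jid;
	int		nprocs;		/* proposed per-jail process counter */
	enum jail_state	state;
};

/* A process tries to enter the jail; a zombie refuses new members (b). */
static int
jail_sketch_attach(struct jail_sketch *j)
{
	if (j->state == JAIL_ZOMBIE)
		return (EINVAL);	/* assumed error code */
	j->nprocs++;
	return (0);
}

/* A member process exits; the last exit turns the jail into a zombie (a). */
static void
jail_sketch_proc_exit(struct jail_sketch *j)
{
	assert(j->nprocs > 0);
	if (--j->nprocs == 0)
		j->state = JAIL_ZOMBIE;
}

/* Would a jls-like listing print this jail?  show_all models "-a". */
static bool
jail_sketch_visible(const struct jail_sketch *j, bool show_all)
{
	return (show_all || j->state == JAIL_ALIVE);
}
```

With this model, a jail whose last process has exited drops out of the default 
listing and rejects further attach attempts, even though sockets in TIME_WAIT 
may still hold references to it.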

In essence, this would move to having two reference counts on the prison: a 
"strong" reference that has to do with having process members, and a "weak" 
reference that has to do with ucreds pointing at the prison.
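
A minimal sketch of the two-count scheme, again as plain userland C rather 
than kernel code.  struct prison_sketch and every prison_sketch_* function are 
hypothetical; the root_held flag merely stands in for the vnode reference that 
a real implementation would drop with vrele(), and the freed flag stands in 
for releasing the structure.  Locking is ignored entirely.

```c
#include <assert.h>
#include <stdbool.h>

struct prison_sketch {
	int	strong;		/* member processes */
	int	weak;		/* ucreds pointing at the prison */
	bool	root_held;	/* stands in for the root vnode reference */
	bool	freed;		/* stands in for freeing the structure */
};

static void
prison_sketch_init(struct prison_sketch *pr)
{
	pr->strong = 0;
	pr->weak = 0;
	pr->root_held = true;
	pr->freed = false;
}

/* Free the prison only once neither kind of reference remains. */
static void
prison_sketch_maybe_free(struct prison_sketch *pr)
{
	if (pr->strong == 0 && pr->weak == 0)
		pr->freed = true;
}

/* A ucred (process, socket, pcb, ...) starts pointing at the prison. */
static void
prison_sketch_cred_hold(struct prison_sketch *pr)
{
	assert(!pr->freed);
	pr->weak++;
}

static void
prison_sketch_cred_drop(struct prison_sketch *pr)
{
	assert(pr->weak > 0);
	pr->weak--;
	prison_sketch_maybe_free(pr);
}

/* A process becomes a member; only possible while the root is still held. */
static void
prison_sketch_proc_enter(struct prison_sketch *pr)
{
	assert(pr->root_held);
	pr->strong++;
}

/* Last member leaves: drop the root vnode so its file system can unmount. */
static void
prison_sketch_proc_leave(struct prison_sketch *pr)
{
	assert(pr->strong > 0);
	if (--pr->strong == 0)
		pr->root_held = false;	/* vrele() in a real kernel */
	prison_sketch_maybe_free(pr);
}
```

The point of the split is that the vnode reference follows the strong count 
alone, so a lingering TIME_WAIT socket would delay only the final free and not 
the unmount.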

Robert N M Watson
Computer Laboratory
University of Cambridge


More information about the freebsd-hackers mailing list