Strange ZFS problem, filesystem claims to be full when clearly not full
Danny Carroll
fbsd at dannysplace.net
Thu Sep 30 09:11:48 UTC 2010
On 30/09/2010 6:36 PM, Alexander Leidinger wrote:
>
> Quoting Jeremy Chadwick <freebsd at jdc.parodius.com> (from Wed, 29 Sep
> 2010 15:15:49 -0700):
>
>> On Thu, Sep 30, 2010 at 12:11:09AM +0200, Torbjorn Kristoffersen wrote:
>>> I'm at a complete loss here. I shut down the jail completely, and I am
>>> watching the jail's ZFS filesystem grow as we speak. No process is using
>>> it. It only grows in "df" and "zfs list", I can't find any files that are
>>> growing. I have to re-set the quota to be higher and higher to accommodate
>>> the space.
>>>
>>> On Wed, Sep 29, 2010 at 10:46 PM, Torbjorn Kristoffersen <
>>> torbjoern at gmail.com> wrote:
>>>
>>> > Hi Jeremy.
>>> >
>>> > 1) I checked now, and found nothing extraordinary. Just processes that have
>>> > been running for a long while, such as screen, cron, sshd, bash, irssi,
>>> > syslogd, etc.
>>> >
>>> > 2) No compression used on this zfs filesystem (or any of the others).
>>> >
>>> > I completely stopped the jail now, and removed some of the directories
>>> > with the most data in them, but to no avail.
>>> >
>>> >
>>> > On Wed, Sep 29, 2010 at 9:25 PM, Jeremy Chadwick
>>> > <freebsd at jdc.parodius.com> wrote:
>>> >
>>> >> On Wed, Sep 29, 2010 at 08:46:38PM +0200, Torbjorn Kristoffersen wrote:
>>> >> > I have a ZFS "tank" called tpool, the server runs a couple of jails (each
>>> >> > with a zfs filesystem). There is a problem with one of these filesystems.
>>> >> > First, its disk usage as shown in ``df -h'':
>>> >> > ...
>>> >> > tpool/rb.org 100G 95G 4.6G 95% /jails/rb.org
>>> >> > ...
>>> >> >
>>> >> > The command ``zfs list'' shows the same:
>>> >> > ..
>>> >> > tpool/rb.org 95.4G 4.56G 95.4G /jails/rb.org
>>> >> > ..
>>> >> >
>>> >> > However, there is a very mysterious problem somewhere.
>>> >> > Something inside this jail is eating diskspace, but we can't find any
>>> >> > directory that is actually taking the diskspace. We first suspected either
>>> >> > fetchmail or spamassassin of causing a lot of space to be used, since some
>>> >> > of their directories were huge. (These were later deleted, which is why
>>> >> > you see that 4.6GB is now available; before that 0GB was available.)
>>> >> >
>>> >> > However, we can't find *any trace* of an actual directory or file that is
>>> >> > taking all the space.
>>> >> >
>>> >> > Take this for instance:
>>> >> >
>>> >> > outsidejail# du -sh rb.org
>>> >> > 43G rb.org
>>> >> >
>>> >> > How can this be? df and zfs are showing that the entire drive is nearly
>>> >> > full, yet I can't find any directory that is actually taking all this space.
>>> >> > I've carefully looked through every single directory within the jail trying
>>> >> > to find something that's taking all that space, but to no avail.
>>> >> >
>>> >> > ----
>>> >> > My system stats:
>>> >> > # uname -a
>>> >> > FreeBSD grim 8.1-RELEASE FreeBSD 8.1-RELEASE #0: Mon Jul 19 02:36:49 UTC 2010
>>> >> > root at mason.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC amd64
>>> >> > # zpool get version tpool
>>> >> > NAME PROPERTY VALUE SOURCE
>>> >> > tpool version 14 default
>>> >> > # zpool status
>>> >> >   pool: tpool
>>> >> >  state: ONLINE
>>> >> >  scrub: none requested
>>> >> > config:
>>> >> >
>>> >> >         NAME        STATE     READ WRITE CKSUM
>>> >> >         tpool       ONLINE       0     0     0
>>> >> >           mirror    ONLINE       0     0     0
>>> >> >             ad4s1d  ONLINE       0     0     0
>>> >> >             ad6s1d  ONLINE       0     0     0
>>> >> >
>>> >> > errors: No known data errors
>>> >> >
>>> >> > [ Note that I've also done a scrub recently ]
>>> >>
>>> >> 1) Have you checked using fstat to ensure that no file descriptors
>>> >> remain open on any of your ZFS filesystems (not pools)?
>>> >>
>>> >> 2) Are you using compression on any of your ZFS filesystems?
>>
>> Andriy and Pawel,
>>
>> Do either of you have ideas as to what could cause the issue Torbjorn's
>> experiencing? I swear I remember some bug or quirk that got fixed with
>> regards to free space on ZFS, but as has been proven time and time again
>> my memory is horrible. His kernel's 8.1-RELEASE dated July 19th.
>
> IIRC the commit you talk about was by Martin (CCed). I do not know if
> it is (already) MFCed.
>
> I'm not sure the bug you talk about is related to what Torbjorn is
> talking about. The fact that the free space is going down while the
> jail is shut down (and I assume jls does not show his JID anymore, so
> all of its processes are really gone) points more to some other
> process (outside of the jail) which is filling some (maybe already
> deleted, so not visible anymore with du) file.
>
It certainly smells like a process still writing to a file that has been unlinked.
I wonder if it would show up with lsof.
If dtrace is enabled on that machine then I think it should be easy to
see which process is performing write operations.
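
A rough sketch of those checks, to be run as root on the host (outside the
jail). The function names and the JAIL_FS variable are made up for
illustration. It assumes lsof is installed from ports and that the kernel has
DTrace support; whether lsof reports ZFS files properly on 8.1 is not
something this sketch can promise, which is why fstat from base is included
as a fallback.

```shell
# Sketch only. JAIL_FS is a placeholder for the affected mountpoint.
JAIL_FS=${JAIL_FS:-/jails/rb.org}

# Open files whose link count is below 1, i.e. unlinked but still held open.
unlinked_open_files() {
    lsof +L1
}

# Everything currently open on the affected filesystem (fstat is in base).
open_files_on_fs() {
    fstat -f "$JAIL_FS"
}

# Sum the bytes requested via write(2) per process until interrupted.
# This probe alone does not see writev/pwrite or writes through mmap.
writers_by_process() {
    dtrace -n 'syscall::write:entry { @[execname, pid] = sum(arg2); }'
}
```

Calling unlinked_open_files first and then writers_by_process (left running
for a minute, then stopped with Ctrl-C) should narrow it down to a process
name and PID, if the cause really is a process writing to a deleted file.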
-D