skipping fsck with soft-updates enabled
Scott Oertel
freebsd at scottevil.com
Thu Jan 11 18:17:49 UTC 2007
Eric Anderson wrote:
> On 01/10/07 10:18, Scott Oertel wrote:
>> Eric Anderson wrote:
>>> On 01/10/07 00:20, Scott Oertel wrote:
>>>> Victor Loureiro Lima wrote:
>>>>> From rc.conf man page:
>>>>> ---
>>>>> background_fsck_delay
>>>>>     (int) The amount of time in seconds to sleep before starting
>>>>>     a background fsck(8). It defaults to sixty seconds to allow
>>>>>     large applications such as the X server to start before disk
>>>>>     I/O bandwidth is monopolized by fsck(8).
>>>>> ---
>>>>>
>>>>> You can set the delay as long as you want, so it won't have to start
>>>>> right away; in fact, it can start as late as a year from now (if
>>>>> that's really what you want ;))
>>>>>
>>>>> att,
>>>>> victor loureiro lima
>>>>>
>>>>> 2007/1/10, Oliver Fromme <olli at lurza.secnetix.de>:
>>>>>> Scott Oertel wrote:
>>>>>> > I am wondering what kind of problems would occur, besides lost
>>>>>> > space, if after a system crash a fsck is skipped. According to the
>>>>>> > documentation, with soft-updates enabled, the file system would be
>>>>>> > consistent; there would just be lost resources to be recovered,
>>>>>> > which I am assuming can be safely done at a later time to avoid
>>>>>> > long periods of downtime during peak hours.
>>>>>>
>>>>>> I think that's exactly what the background fsck feature
>>>>>> does. If you enable it (which is even the default), the
>>>>>> fsck process doesn't start right away, so the system comes
>>>>>> up in multi-user mode immediately. Then a snapshot is
>>>>>> created on the file system, and fsck runs on the snapshot,
>>>>>> freeing the lost space in the file system.
>>>>>>
>>>>>> Of course, it only works reliably with soft-updates enabled,
>>>>>> _and_ there must not be any unexpected inconsistencies.
>>>>>> However, with some common setups (e.g. cheap disks lying
>>>>>> about completed write operations) it is difficult to
>>>>>> guarantee the consistency. Soft-updates is rather fragile
>>>>>> when the hardware doesn't work exactly as it's supposed to.
>>>>>> I've witnessed breakage in the past, and for that reason
>>>>>> I always disable the background fsck feature. And it's the
>>>>>> reason I'm looking forward to gjournal becoming stable,
>>>>>> because it seems to be less fragile in the presence of
>>>>>> imperfect hardware.
>>>>>>
>>>>>> Best regards
>>>>>> Oliver
>>>>>>
>>>>>> --
>>>>>> Oliver Fromme, secnetix GmbH & Co. KG, Marktplatz 29, 85567 Grafing
>>>>>> Dienstleistungen mit Schwerpunkt FreeBSD: http://www.secnetix.de/bsd
>>>>>> Any opinions expressed in this message may be personal to the author
>>>>>> and may not necessarily reflect the opinions of secnetix in any way.
>>>>>>
>>>>>> "C++ is to C as Lung Cancer is to Lung."
>>>>>> -- Thomas Funke
>>>> The problem with background fsck is that on my machines, it doesn't
>>>> work well. These machines have 8x750GB SATA drives and they are
>>>> under extreme stress all the time. When you run fsck in the
>>>> background, each drive takes 10+ minutes to create the snapshot
>>>> file, during which time the machine is completely unresponsive and
>>>> unstable.
>>> What version of FreeBSD are you running? You might try gjournal,
>>> which I've had great luck with, and Pawel (pjd@) is incredibly
>>> responsive to bug reports, etc.
>>>
>>>> That is why I am wondering if it is OK to skip both the background
>>>> and foreground fscks and reschedule them for a later time,
>>>> during non-peak hours.
>>> I think most people would be nervous to tell you 'sure, skip it
>>> until later', but I can tell you from experience that I myself have
>>> delayed fscking for weeks on end, to do exactly what you want.
>>>
>>> Eric
>>>
>>>
>>>
>> I'm running 6.2-RC2. For fun I tried to create a snapshot on one
>> of our newest machines, which has the same drive config as the
>> previous ones but is less active than the others. It's also running
>> 6.2-RC2, and it completely locked up. Anyway, thanks for the
>> suggestion about running gjournal, but I'm not sure running
>> non-official patches to the file system code on production machines
>> is such a great idea. Have you had any problems with gjournal? If so,
>> of what nature were they?
>>
>
>
> Honestly, I haven't had many issues with snapshots since 6.1-ish.
> Before that, there were lots of deadlocks, livelocks, etc. I think Kris@
> has done a bang-up job at finding bugs and getting them fixed. If you
> still see snapshot issues like this, it would be great if you could
> start sending some info like a ps -auxl, and if it's a deadlock, drop
> to the debugger and get a crash dump.
What size are the hard drives you're creating snapshots of? Are they
larger than 750GB? If so, I would be happy to help resolve the
snapshot issue by providing debug info and such.
>
> As far as gjournal, I now have it running on several systems, all very
> high usage NFS servers (~1000 high end machines pounding them very
> hard, 24x7). I've only seen a few little issues on one of my systems
> that is running an older 6-STABLE (it's a little difficult for me to
> update it right now), but all my other systems have been very solid.
> PJD has done a great job getting it stable and ready for production
> use. As far as I have experienced, I have had no data loss, and no
> file system corruption using it. The worst that's happened is a
> livelock, followed by a reboot. Since it is indeed journaled, the
> reboot takes a few minutes, and the fsck takes a few *seconds* (on a
> 10TB volume). I would say that using gjournal is more reliable over
> time than relying on background fscks. Gjournal is, however, still
> in a beta test mode, so you should do your own testing to
> evaluate it. You can always disable it very easily, without losing
> your data.
>
> Eric
>
>
>
I'll go ahead and give gjournal a test run on a test machine, and see
how I like it. Thank you for the information based on your experiences
with it.
-Scott
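
For reference, the rc.conf knobs discussed in this thread could be set
roughly as follows. This is only a sketch: background_fsck and
background_fsck_delay are the rc.conf(5) variables mentioned above, while
the 21600-second (six-hour) delay is an arbitrary value picked for
illustration, not a recommendation.

```shell
# /etc/rc.conf fragment (sketch; the delay value is an assumed example)

# Keep background fsck enabled (the default, per Oliver's note), so the
# system comes up in multi-user mode right after a crash.
background_fsck="YES"

# Seconds to sleep before the background fsck(8) starts. The man page
# default is sixty; 21600 (6 hours) is a hypothetical value meant to
# push the snapshot and the check past peak hours.
background_fsck_delay="21600"
```

Setting background_fsck="NO" instead, as Oliver describes doing, should
make the check run in the foreground at boot rather than on a snapshot.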