GSoC proposition: multiplatform UFS2 driver

Edward Tomasz Napierała trasz at FreeBSD.org
Fri Mar 14 19:18:55 UTC 2014


Message written by Richard Yao on 14 Mar 2014, at 19:53:
> On 03/14/2014 02:36 PM, Edward Tomasz Napierała wrote:
>> Message written by Ian Lepore on 14 Mar 2014, at 16:39:
>>> On Fri, 2014-03-14 at 15:27 +0000, RW wrote:
>>>> On Thu, 13 Mar 2014 18:22:10 -0800
>>>> Dieter BSD wrote:
>>>> 
>>>>> Julio writes,
>>>>>> That being said, I do not like the idea of using NetBSD's UFS2
>>>>>> code. It lacks Soft-Updates, which I consider to make FreeBSD UFS2
>>>>>> second only to ZFS in desirability.
>>>>> 
>>>>> FFS has been in production use for decades.  ZFS is still wet behind
>>>>> the ears. Older versions of NetBSD have soft updates, and they work
>>>>> fine for me. I believe that NetBSD 6.0 is the first release without
>>>>> soft updates.  They claimed that soft updates was "too difficult" to
>>>>> maintain.  I find that soft updates are *essential* for data
>>>>> integrity (I don't know *why*, I'm not a FFS guru). 
>>>> 
>>>> NetBSD didn't simply drop soft-updates, they replaced it with
>>>> journalling, which is the approach used by practically all modern
>>>> filesystems. 
>>>> 
>>>> A number of people on the questions list have said that they find
>>>> UFS+SU to be considerably less robust than the journalled filesystems
>>>> of other OS's.  
>> 
>> Let me remind you that some other OSes had problems such as truncation
>> of files which were _not_ written (XFS), silently corrupting metadata when
>> there were too many files in a single directory (ext3), and panicking instead
>> of returning ENOSPC (btrfs).  ;->
> 
> Let's be clear that such problems live between the VFS and the block
> layer and are therefore isolated to specific filesystems. Such problems
> disappear when using ZFS.

Such problems disappear after fixing bugs that caused them.  Just like
with ZFS - some people _have_ lost zpools in the past.

>>> What I've seen claimed is that UFS+SUJ is less robust.  That's a very
>>> different thing than UFS+SU.  Journaling was nailed onto the side of UFS
>>> +SU as an afterthought, and it shows.
>> 
>> Not really - it was developed rather recently, and with filesystems it usually
>> shows, but it's not "nailed onto the side": it complements SU operation
>> by journalling the few things which SU doesn't really handle and which
>> used to require background fsck.
>> 
>> One problem with SU is that it depends on hardware not lying about
>> write completion.  Journalling filesystems usually just issue flushes
>> instead.
> 
> This point, that write completion is reported for unflushed data and no
> flushes are issued, could explain the disconnect between RW's statements
> and what Soft Updates should accomplish. However, it does not change my
> assertion that placing UFS SU on a ZFS zvol will avoid such failure
> modes.

Assuming everything between UFS and ZFS below behaves correctly.

> In ZFS, we have a two stage transaction commit that issues a
> flush at each stage to ensure that data goes to disk, no matter what the
> drive reported. Unless the hardware disobeys flushes, the second stage
> cannot happen if the first stage does not complete and if the second
> stage does not complete, all changes are ignored.
> 
> What keeps soft updates from issuing a flush following write completion?
> If there are no pending writes, it is a noop. If the hardware lies, then
> this will force the write. The internal dependency tracking mechanisms
> in Soft Updates should make it rather simple to figure out when a flush
> needs to be issued in case the hardware has lied about completion. At a
> high level, what needs to be done is to batch the things that can be
> done simultaneously and separate those that cannot by flushes. If such
> behavior is implemented, it should have a mount option for toggling it.
> It simply is not needed on well behaved devices, such as ZFS zvols.

As you say, it's not needed on well-behaved devices.  While it could
help with crappy hardware, I think it would be either very complicated
(batching, as described), or would perform very poorly.

To be honest, I wonder how many problems could be avoided by
disabling write cache by default.  With NCQ it shouldn't cause
performance problems, right?
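For reference, on FreeBSD the on-disk write cache can be turned off
globally for SATA disks. A hedged sketch, assuming disks attached via
the ada(4) driver and its documented write_cache tunable:

```shell
# Assumption: FreeBSD with SATA disks handled by ada(4).
# Disable the drive write cache for all ada devices at boot by adding
# this line to /boot/loader.conf:
#   kern.cam.ada.write_cache=0
# Inspect the current setting at runtime:
sysctl kern.cam.ada.write_cache
```

With the write cache off, reported write completion means the data is on
stable storage, which is exactly the assumption soft updates depends on.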



More information about the freebsd-hackers mailing list