September OpenZFS Leadership Meeting
Matthew Ahrens
mahrens at delphix.com
Mon Sep 23 16:22:25 UTC 2019
At this month's meeting we discussed:
- ZoL EOL of RHEL 6
- Xattr cross-platform compatibility
- Relaxed quota semantics for improved performance
- zpool replace of log vdev
- temporal dedup
Video is now up on YouTube: https://youtu.be/kjBWhEE8tZ8
Full notes below (thanks Serapheim):
- EOL ZoL on RHEL 6 (Brian Behlendorf)
  - RHEL 6 may be old enough that we can drop support for it on master
    (it would still be supported on 0.8).
  - Technically, it will be EOL'd by Red Hat in November 2020.
  - Feedback from the community: given enough advance notice, people
    should be fine.
  - Actual change needed in ZoL:
    - Go through the build system code and remove any references to
      v3.10 kernels and older (the new oldest supported kernel would be
      3.11). The process should be similar to what ZoL did when
      deprecating RHEL 5.
  - Action items:
    - Brian/Matt will give a heads-up on the mailing list and in the
      release notes of each version until then, and open a PR for this.
    - We need volunteers for the build system changes.
- Xattr cross-platform compatibility (Andrew Walker)
  - Problem:
    - iXsystems works with services that receive alternate data streams
      written as xattrs on FreeBSD in the user namespace, which is
      implemented slightly differently on Linux (Linux uses a "user."
      prefix, FreeBSD uses a "freebsd." prefix(?), and Solaris uses an
      "smb." prefix). Their application (Samba) does the same thing on
      Linux and FreeBSD, but ZFS represents the xattrs differently on
      disk on each platform. As a result, xattrs written on FreeBSD are
      visible on other OSes, except on ZoL, where the metadata
      disappears.
  - Potential solutions:
    - Brian: ZoL has around 4 prefixes, so one solution would be to use
      user as the fallback (i.e. an xattr that is not part of any
      namespace is treated as part of the user namespace).
    - Andrew Walker: Have a zfs dataset property that tells which format
      is used.
    - Andriy Gapon: Add some OS info to the attribute itself and have
      ZFS interpret attributes differently based on it.
    - Sef: Some form of feature flag that would fix the prefixes.
    - Matt Ahrens: First make it possible to read xattrs from all
      platforms, even if the names show up differently. A potential
      long-term solution: new xattrs are written in a new format that is
      portable across platforms (e.g. in the zfs.* namespace), and each
      platform translates the ZFS prefixes to the local platform’s
      prefixes.
  - Question: Is this an incompatibility between different OSes, or
    between different implementations of ZFS? Should we have a
    translation layer outside of ZFS?
    - A bit of both, but mostly the VFS layer (outside of the ZFS code).
      Even assuming it is only in the VFS layer, it would be reasonable
      to still have some way of accessing these attributes. One argument
      for this is that ZoL has little flexibility to change the VFS
      code.
  - Action items:
    - Proposal & next steps: Andrew can start a writeup and coordinate
      with Alexander from iXsystems.
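As a rough illustration of the fallback Brian described, the sketch below
maps an on-disk xattr name to the name Linux would show: names that
already carry one of the four Linux namespace prefixes pass through
unchanged, and anything else is exposed under "user.". The function name
and the sample attribute names are hypothetical, and this is not a
description of how ZoL currently behaves.

```python
# The four xattr namespaces that Linux defines (see xattr(7)).
LINUX_NAMESPACES = ("user.", "trusted.", "security.", "system.")


def to_linux_xattr_name(on_disk_name: str) -> str:
    """Hypothetical helper: choose the Linux-visible name for an on-disk xattr.

    Names with a known Linux namespace prefix are returned as-is.
    Anything else falls back to the user namespace.
    """
    if on_disk_name.startswith(LINUX_NAMESPACES):
        return on_disk_name
    return "user." + on_disk_name


if __name__ == "__main__":
    # Sample names are made up for illustration.
    for name in ("user.comment", "security.selinux", "DosStream.notes"):
        print(name, "->", to_linux_xattr_name(name))
```

One caveat with a pure fallback rule: an attribute written on another
platform without a prefix, but whose name happens to start with "user.",
cannot be told apart from a native Linux one. That ambiguity is part of
what the property, feature flag, and zfs.* proposals try to address.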
- Relax quota semantics for improved performance (Allan Jude)
  - Problem: As you approach a quota, ZFS performance degrades.
  - Proposal: Can we have a property like quota-policy=strict or loose,
    where we can optionally allow ZFS to run over the quota so that
    performance does not degrade?
  - People's feedback/questions:
    - Richard Elling: Isn't this the same problem as when the pool is
      almost full (SLOP space)? Answer: It is slightly different, but
      the mechanism is the same, and we don't want to break that (e.g.
      by running past the SLOP space just like that).
    - Tangent: Should we scale the SLOP space appropriately? The SLOP
      space can take a big chunk of space in big pools.
      - Feedback: That seems reasonable, though there may not be many
        use cases (fragmentation issues will probably arise in such big
        pools before the SLOP space issue is encountered). See the
        discussion
        <https://github.com/zfsonlinux/zfs/pull/8106#issuecomment-437499997>
        on a previous PR.
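To put the SLOP space tangent in numbers: assuming the reservation is
1/32 of the pool (the spa_slop_shift tunable at its default of 5) and
ignoring the minimum that applies to small pools, the reserved space
grows linearly with pool size. The helper name below is made up for
illustration.

```python
TIB = 1 << 40  # bytes in one tebibyte


def slop_bytes(pool_bytes: int, slop_shift: int = 5) -> int:
    """Approximate reserved (SLOP) space as pool size / 2**slop_shift.

    Assumption: only the proportional part is modelled. The fixed
    minimum used for very small pools is deliberately left out.
    """
    return pool_bytes >> slop_shift


if __name__ == "__main__":
    for size_tib in (1, 100, 1024):
        reserved = slop_bytes(size_tib * TIB)
        print(f"{size_tib} TiB pool -> {reserved / TIB:.3f} TiB reserved")
```

Under these assumptions a 1 PiB pool sets aside 32 TiB, which is the
"big chunk" the tangent refers to.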
- zpool replace of a log (and maybe a cache) vdev – does this work well?
  Can it be improved? (Andriy Gapon)
  - Problem: A user had to replace a log device using the replace
    command, and it took a long time (dozens of gigabytes were scanned).
    Can we do better? There seems to be no special logic for devices
    like that; do we want to do something different for log vdevs? Maybe
    even prohibit replace for these devices and advise the remove & add
    workflow instead.
  - Feedback: The above sounds reasonable except for one thing: log
    devices can have actual data on them. If you crash and you have
    blocks in the log device and you've removed the device, and you
    don't mount the specific filesystems, these blocks will stay there.
    Encryption should also make this more common. We need to retain the
    ability to do the scrub-based replace/attach. We could improve
    performance by looking at all the blocks of all the logs instead of
    looking at all the blocks in the pool.
  - Action item: Andriy will look into this and create a doc.
- Renaming bookmarks – are there any pitfalls? Seems like a useful feature
  that’s not been implemented in a long time (Andriy Gapon)
  - Feedback: It should just work. It is one more thing to plumb through
    the CLI, libzfs, etc. Internally, removing the ZAP entry and
    re-adding it with the new name should do the trick.
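The remove-and-re-add idea can be sketched with a plain dictionary
standing in for the ZAP object that maps bookmark names to their
records. Everything here (the function, the record contents, the
bookmark names) is hypothetical; the real change would live in the ZFS
code and libzfs, and would presumably perform both steps in a single
transaction.

```python
def rename_bookmark(zap: dict, old: str, new: str) -> None:
    """Hypothetical sketch: rename by removing the old entry and re-adding it.

    `zap` is a dict standing in for the bookmark ZAP. The record stored
    under the old name is moved, unchanged, to the new name.
    """
    if old not in zap:
        raise KeyError(f"no such bookmark: {old}")
    if new in zap:
        raise ValueError(f"bookmark already exists: {new}")
    zap[new] = zap.pop(old)


if __name__ == "__main__":
    bookmarks = {"before-upgrade": {"creation_txg": 100}}
    rename_bookmark(bookmarks, "before-upgrade", "pre-upgrade")
    print(bookmarks)
```

Checking for the new name first keeps a failed rename from dropping the
old entry, which is the main pitfall of a two-step remove and add.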
- Panzura to open source their temporal dedup implementation (Josh P)
  - Panzura will be open-sourcing some parts of their self-contained ZFS
    implementation of temporal dedup on GitHub. Panzura hopes that this
    will be integrated into OpenZFS, but at least for now there are no
    concrete plans to get this code upstreamed without volunteers.
  - Question: What is temporal dedup?
    - A dedup scheme that groups blocks by the time they are created,
      modified, etc. Grouping blocks this way should allow faster access
      to the data due to caching based on temporal locality.
On Tue, Sep 17, 2019 at 8:47 AM Matthew Ahrens <mahrens at delphix.com> wrote:
> The next OpenZFS Leadership meeting will be held today, September 17,
> 1pm-2pm Pacific time.
>
> Everyone is welcome to attend and participate, and we will try to keep the
> meeting on agenda and on time. The meetings will be held online via Zoom,
> and recorded and posted to the website and YouTube after the meeting.
>
> The agenda for the meeting will be a discussion of the projects listed in
> the agenda doc.
>
> For more information and details on how to attend, as well as notes and
> video from the previous meeting, please see the agenda document:
>
>
> https://docs.google.com/document/d/1w2jv2XVYFmBVvG1EGf-9A5HBVsjAYoLIFZAnWHhV-BM/edit
>
> --matt
>