CEPH + FreeBSD

Rick Macklem rmacklem at uoguelph.ca
Wed Sep 9 12:20:33 UTC 2015


Outback Dingo wrote:
> On Wed, Sep 9, 2015 at 11:31 AM, Mark Saad <nonesuch at longcount.org> wrote:
> 
> > All
> >  What about LeoFS? It's in ports and has an S3 object store and NFS support out
> > of the box.
> >
> >
> > http://www.freshports.org/databases/leofs/
> > http://leo-project.net
> 
> 
> LeoFS supports NFSv3 and does not have a lock manager....
> 
I doubt lack of a lock manager is an issue for what I want to do, since the NFSv4.1
metadata server (just a regular NFSv4.1 server that can give out layouts for reading/writing
the data directly on the data servers) handles the locking. It is actually much easier to keep
track of the locking in the NFSv4.1 server and not have to worry about locking on the
underlying cluster FS. All I intend to do with an NFSv3 server on the data server(s) is do
Read/Write RPCs. Everything else is handled via the NFSv4.1 metadata server.
(The original RFC required use of NFSv4.1 read/write ops on the data servers,
 but a new layout type called flex files supports NFSv3 Read/Write for the data servers.)
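To make that split concrete, here is a toy Python model of the division of labour described above. It is only a sketch under stated assumptions: the class and method names (MetadataServer, DataServer, get_layout and so on) and the placement rule are made up for illustration, and none of this is the FreeBSD nfsd code or the pNFS wire protocol.

```python
class DataServer:
    """Stands in for a data server that services only Read/Write."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}  # file handle -> bytearray holding the file data

    def write(self, fh, offset, data):
        buf = self.blocks.setdefault(fh, bytearray())
        end = offset + len(data)
        if len(buf) < end:
            buf.extend(b"\0" * (end - len(buf)))  # zero-fill any hole
        buf[offset:end] = data
        return len(data)

    def read(self, fh, offset, count):
        return bytes(self.blocks.get(fh, b"")[offset:offset + count])


class MetadataServer:
    """Stands in for the NFSv4.1 server: it owns locking and hands out layouts."""

    def __init__(self, data_servers):
        self.data_servers = data_servers
        self.locks = {}  # file handle -> lock owner, kept only here

    def lock(self, fh, owner):
        # The data servers never see this; the lock state lives in one place.
        return self.locks.setdefault(fh, owner) == owner

    def unlock(self, fh, owner):
        if self.locks.get(fh) == owner:
            del self.locks[fh]

    def get_layout(self, fh):
        # Hypothetical placement rule: pick a data server from the handle bytes.
        return self.data_servers[sum(fh.encode()) % len(self.data_servers)]
```

A client in this model would call lock() and get_layout() on the metadata server, then call read()/write() directly on the DataServer it was given.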

The key issue for me is whether or not it has a VFS interface to a POSIX-like
file system (via FUSE or ???). At a quick glance at the web page, I don't see
any mention of this?
Why? Well, simply because I am looking at extending the current kernel-based
     NFSv4.1 server to support pNFS. Obviously, there are other ways an NFSv4.1/pNFS
     server can be built (the userland NFS-Ganesha that runs on Linux, for example), but
     that isn't what I'm interested in doing.

Btw, I took a quick look at MooseFS and it does seem to have this and could be an
     alternative to glusterFS. It isn't an object store and only appears to have a
     single metadata server, which might be a limitation for the long term?
     It sounds like MooseFS uses a custom protocol for the chunk/data
     servers and I don't feel like trying to define yet another layout type, so I
     think I would need to add a partial NFSv3 server to the chunk/data servers.
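As a rough idea of what a "partial NFSv3 server" could look like, here is a Python sketch of a dispatcher that services only the data path and rejects everything else. The procedure numbers and status codes are the ones from RFC 1813; the class name, the in-memory chunk store and the decision to also accept NULL and COMMIT are my assumptions for illustration, not anything MooseFS or FreeBSD provides.

```python
# NFSv3 procedure numbers and status codes as defined in RFC 1813.
NFSPROC3_NULL = 0
NFSPROC3_READ = 6
NFSPROC3_WRITE = 7
NFSPROC3_COMMIT = 21
NFS3_OK = 0
NFS3ERR_NOTSUPP = 10004


class PartialNfsv3Server:
    """Hypothetical data-server front end: Read/Write only, no metadata ops."""

    def __init__(self):
        self.chunks = {}  # file handle -> bytearray (stand-in for chunk storage)

    def dispatch(self, proc, fh=None, offset=0, count=0, data=b""):
        """Return a (status, payload) pair for one RPC."""
        if proc == NFSPROC3_NULL:
            return (NFS3_OK, None)
        if proc == NFSPROC3_READ:
            buf = self.chunks.get(fh, b"")
            return (NFS3_OK, bytes(buf[offset:offset + count]))
        if proc == NFSPROC3_WRITE:
            buf = self.chunks.setdefault(fh, bytearray())
            end = offset + len(data)
            if len(buf) < end:
                buf.extend(b"\0" * (end - len(buf)))
            buf[offset:end] = data
            return (NFS3_OK, len(data))
        if proc == NFSPROC3_COMMIT:
            # Nothing to flush in this in-memory model.
            return (NFS3_OK, None)
        # LOOKUP, GETATTR, CREATE, etc. belong to the metadata server.
        return (NFS3ERR_NOTSUPP, None)
```

For example, dispatch(NFSPROC3_WRITE, fh="c1", offset=0, data=b"abc") stores the bytes, while a LOOKUP (procedure 3) comes back with NFS3ERR_NOTSUPP.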

     I will be looking more closely at both glusterFS and MooseFS soon.

If there are yet more of these cluster object stores that you think might be worth
considering, feel free to mention them. (I thought I had looked at most of them, but
hadn't noticed MooseFS, so...)

Thanks for all the comments, rick

> 
> 
> >
> >
> >
> > ---
> > Mark Saad | nonesuch at longcount.org
> >
> > > On Sep 6, 2015, at 6:18 PM, Rick Macklem <rmacklem at uoguelph.ca> wrote:
> > >
> > > Jordan Hubbard wrote:
> > >>
> > >>> On Sep 3, 2015, at 4:17 AM, Rick Macklem <rmacklem at uoguelph.ca> wrote:
> > >>>
> > >>> Slightly off topic but, btw, there is a port of GlusterFS and those
> > folks
> > >>> do seem
> > >>> interested in seeing it brought "up to speed". I am not sure how
> > mature it
> > >>> is at
> > >>> this point, but it has been known to build on amd64. (I don't have an
> > amd64
> > >>> machine,
> > >>> so I haven't gotten around to building/testing it, but I do plan to
> > try and
> > >>> use
> > >>> it as a basis for a pNFS server, if I can figure out how to get the FH
> > info
> > >>> out of it.
> > >>> I'm working on that;-)
> > >>
> > >> There are at least two distributed (multi-node) object stores for
> > FreeBSD
> > >> that I know of.
> > >>
> > >> One is glusterfs, for which I’m not even really clear on the status of
> > the
> > >> ports for.  I don’t see any glusterfs port in the master branch of
> > >> https://github.com/freebsd/freebsd-ports (or
> > >> https://github.com/freebsd/freebsd-ports/tree/branches/2015Q3 for that
> > >> matter).
> > >>
> > >> Our FreeNAS ports tree (https://github.com/freenas/ports), in which we
> > have a
> > >> bit more latitude to add and curate our own ports, has both a
> > net/glusterfs
> > >> and sysutils/glusterfs, from separate sources (looks like we need to
> > clean
> > >> things up) - net/glusterfs lists craig001 at lerwick.hopto.org as the
> > >> MAINTAINER and is at version 3.6.2.  The sysutils/glusterfs port lists
> > >> bapt at FreeBSD.org as the MAINTAINER and is at version 20140811.
> > >>
> > >> I’m not really sure about the provenance since we were simply evaluating
> > >> glusterfs for awhile and may have pulled in interim versions from those
> > >> sources, but obviously it would be best to have an official maintainer
> > and
> > >> someone in the FreeBSD project actually curating a glusterfs port so
> > that
> > >> all users of FreeBSD can use it.  It would also be fairly key to your
> > own
> > >> efforts, assuming you decide to pursue glusterfs as a foundation
> > technology
> > >> for pNFS.
> > >>
> > >> The other object store, which is pretty mature and is currently leading
> > the
> > >> pack (of two :) ) for inclusion into FreeNAS is RiakCS from Basho.
> > There is
> > >> a port currently in databases/riak but it’s pretty out of date at
> > version
> > >> 1.4.12 (the current version is 2.0.1, with 2.0 being a major upgrade of
> > >> RiakCS).
> > >>
> > >> We are very interested in investigating various ways of shimming RiakCS
> > to
> > >> NFS, using RiakCS as a back-end store.   Is that something you’d be
> > amenable to
> > >> discussing?   I’d be happy to send you an amd64 architecture machine to
> > >> develop on. :)
> > > Hmm. From a quick look at their web page (I looked once before as well),
> > I don't
> > > think RiakCS has what I need to do pNFS in a reasonable (for me) amount
> > of effort.
> > > Two things that glusterFS has that I am hoping to use (and I don't think
> > RiakCS has
> > > either of these) are:
> > > - A Fuse file system interface which allows the kernel nfsd to access
> > the store as
> > >  a file system, so that it can provide the metadata services (NFS
> > without the reads/writes).
> > > - A userland NFSv3 server in each node which will allow the node to act
> > as a data server.
> > >
> > > If I am wrong and RiakCS does support a VFS file system interface (via
> > Fuse or ???), then
> > > please correct me. With that, it might be a reasonable alternative.
> > > I'll admit I've spent a little time looking at the glusterFS sources and
> > haven't yet
> > > solved the problem of how to generate the file handles I need, but that
> > sounds trivial
> > > compared with an entire Fuse and/or VFS file system interface, I think?
> > >
> > > In general, using a cloud object store to implement a pNFS server is a
> > *mis*use of
> > > the technology, imho. I think it may be possible with glusterFS, since
> > that technology
> > > seems to be based on a cluster file system, which is what a pNFS server
> > can also use.
> > >
> > > I think there would be a lot of work involved in mapping a POSIX file
> > system onto the
> > > Riak database and then exporting that via NFS, etc. It might also be
> > more practical to
> > > do this via a userland NFS service than the kernel based one currently
> > in FreeBSD.
> > > (glusterFS is starting to use the NFS-ganesha server, but I believe it
> > is pretty Linux specific,
> > > so I doubt it would be useful for Riak running on FreeBSD?)
> > >
> > > rick
> > >
> > >> - Jordan
> > > _______________________________________________
> > > freebsd-fs at freebsd.org mailing list
> > > https://lists.freebsd.org/mailman/listinfo/freebsd-fs
> > > To unsubscribe, send any mail to "freebsd-fs-unsubscribe at freebsd.org"
> >
> 

