RFC: GEOM MULTIPATH rewrite

Gary Palmer gpalmer at freebsd.org
Mon Nov 14 21:09:59 UTC 2011


On Tue, Nov 01, 2011 at 10:24:06PM +0200, Alexander Motin wrote:
> On 01.11.2011 19:50, Dennis Kögel wrote:
> > Not sure if replying on-list or off-list makes more sense...
> 
> Replying on-list could share experience with other users.
> 
> > Anyway, some first impressions, on stable/9:
> > 
> > The lab environment here is an EMC VNX / Clariion SAN, which has two Storage Processors, connected to different switches, connected to two isp(4)s on the test machine. So at any time, the machine sees four paths, but only two are available (depending on which SP owns the LUN).
> > 
> > 580# camcontrol devlist
> > <DGC VRAID 0531>                   at scbus0 target 0 lun 0 (da0,pass0)
> > <DGC VRAID 0531>                   at scbus0 target 1 lun 0 (da1,pass1)
> > <DGC VRAID 0531>                   at scbus1 target 0 lun 0 (da2,pass2)
> > <DGC VRAID 0531>                   at scbus1 target 1 lun 0 (da3,pass3)
> > <COMPAQ RAID 1(1VOLUME OK>         at scbus2 target 0 lun 0 (da4,pass4)
> > <COMPAQ RAID 0  VOLUME OK>         at scbus2 target 1 lun 0 (da5,pass5)
> > <hp DVD D  DS8D3SH HHE7>           at scbus4 target 0 lun 0 (cd0,pass6)
> > 
> > I miss the ability to "add" disks to automatic mode multipaths, but I (just now) realized this only makes sense when gmultipath has some kind of path checking facility (like periodically trying to read sector 0 of each configured device; this is what Linux's devicemapper-multipathd does).
> 
> In automatic mode other paths are supposed to be detected via metadata
> reading. If in your case some paths are not readable, automatic mode
> can't work as expected. By the way, could you describe how your
> configuration is supposed to work, like when other paths will start
> working? 

I don't know the particular Clariion SAN Dennis is working with, but
I've seen some so-called active/active RAID controllers force a LUN
failover from one controller to the other (taking it offline for 3 seconds
in the process) because the LUN received an I/O down a path to the
controller that was then in the standby role for that LUN.  Ownership was
per-LUN, so some LUNs would be owned by one controller and some by the
other.  During the controller switch, all I/O to the LUN would fail.
Thankfully the particular RAID model where I observed this behaviour
hasn't been sold in several years, but I would expect such behaviour at
the lower end of the storage market, with the higher-end units doing true
active/active configurations.  (And no, I won't name the manufacturer on
a public list.)

This is exactly why Linux ships with a multipath configuration file: it
can describe exactly what form of brain damage the controller in
question implements so that the host can work around it, and maybe even
document some vendor-specific extensions so that the host can detect
which controller is taking which role for a particular path.
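
To make the idea concrete, here is a rough sketch of such a quirk table,
keyed on the SCSI vendor/product strings that camcontrol devlist reports.
This is not the Linux multipath.conf format and not anything gmultipath
does today; the Quirk fields, the policy labels and the table entries are
all invented for illustration, and the "DGC" entry is a guess based on
Dennis's description rather than verified vendor behaviour.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class Quirk:
    vendor: str          # glob matched against the INQUIRY vendor string
    product: str         # glob matched against the INQUIRY product string
    policy: str          # hypothetical label: "active-active" or "active-passive"
    probe_standby: bool  # hypothetical: may we send test I/O down a standby path?


# Illustrative entries only. First match wins, so the catch-all goes last.
QUIRKS = [
    Quirk("DGC", "*", "active-passive", False),
    Quirk("*", "*", "active-active", True),
]


def lookup_quirk(vendor, product, table=QUIRKS):
    """Return the first quirk whose vendor/product globs match the device."""
    for quirk in table:
        if fnmatchcase(vendor.strip(), quirk.vendor) and fnmatchcase(
            product.strip(), quirk.product
        ):
            return quirk
    raise LookupError("no quirk entry matches %r %r" % (vendor, product))
```

With this table, lookup_quirk("DGC", "VRAID") picks the active-passive
entry, and anything unrecognised falls through to the catch-all default.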

Even some controllers that don't have pathological behaviour when
they receive I/O down the wrong path have sub-optimal behaviour unless
you choose the right path.  NetApp SANs in particular typically have two
independent controllers with a high-speed internal interconnect, but
there is a measurable and not-insignificant penalty for sending the I/O
to the "partner" controller for a LUN, which then has to pass it across
the internal interconnect (called a "VTIC" I believe) to the "owner"
controller.  I've been told, although I have not measured this myself,
that it can add several ms to a transaction, which when talking about SAN
storage is potentially several times what it takes to do the same I/O
directly to the controller that owns it.  There's probably a way to make
the "partner" controller not advertise the LUN until it takes over in a
failover scenario, but every NetApp I've worked with is set (by default I
believe) to advertise the LUN on both controllers.
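
The path choice I have in mind looks roughly like the sketch below: prefer
a healthy path to the controller that owns the LUN, and only use a path to
the partner when no direct path is left.  The Path class, the controller
labels and pick_path() are made up for this example, and how the host
learns which controller owns the LUN is exactly the vendor-specific part
that is assumed away here.

```python
from dataclasses import dataclass


@dataclass
class Path:
    name: str            # e.g. a da(4) device name
    controller: str      # hypothetical label for the controller behind this path
    healthy: bool = True


def pick_path(paths, owner):
    """Prefer a healthy path to the owning controller, else any healthy path."""
    healthy = [p for p in paths if p.healthy]
    if not healthy:
        raise RuntimeError("no healthy path to the LUN")
    direct = [p for p in healthy if p.controller == owner]
    # Fall back to a partner path only when no direct path survives.
    return (direct or healthy)[0]


paths = [
    Path("da0", "A"),
    Path("da1", "B"),
    Path("da2", "A"),
    Path("da3", "B"),
]
```

With controller "A" owning the LUN, pick_path(paths, "A") returns da0; if
da0 and da2 are both marked unhealthy it falls back to da1 and pays the
interconnect penalty instead of failing the I/O.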

Gary


More information about the freebsd-geom mailing list