gmultipath HA over iscsi/iser
Alexander Motin
mav at FreeBSD.org
Wed Jun 24 05:32:07 UTC 2015
Hi, Max.
On 24.06.2015 03:14, John-Mark Gurney wrote:
> Max Gurtovoy wrote this message on Tue, Jun 23, 2015 at 18:42 +0300:
>> On 6/19/2015 7:20 PM, John-Mark Gurney wrote:
>>> Max Gurtovoy wrote this message on Sun, Jun 14, 2015 at 19:16 +0300:
>>>> lately I was testing HA using gmultipath utility over iSCSI/iSER devices.
>>>> I'm working on 11-current code base.
>>>> I created 1 LUN on the target side and connected via 2 different
>>>> physical ports from the initiator side.
>>>> On the initiator side I see /dev/da0 and /dev/da1.
>>>> I created multipath device using:
>>>> gmultipath label dm0 /dev/da0 /dev/da1.
>>>> Now I have new device /dev/multipath/dm0.
>>>> I set kern.iscsi.fail_on_disconnection=1 (to fail IO fast).
>>>>
>>>> Issue 1:
>>>> -------------
>>>> I can't run simple fio/dd traffic over /dev/da0 or /dev/da1.
>>>> The only traffic that is possible is through the multipath device dm0.
>>>> Is this by design ?
Yes, that is by design. Otherwise the GEOM tasting process could be
confused in some situations. For example, it may not be obvious whether
partition tables should be handled over the multipath device or over the
raw devices. And if the user mounts a file system via partition labels or
GPT IDs, it is not predictable which of the three (or more) instances
will be mounted.
There is a sysctl, kern.geom.multipath.exclusive, documented in the man
page, that controls this behavior, but I would not recommend disabling it.
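A minimal sketch of checking that sysctl before changing anything. The
helper name describe_exclusive and its messages are made up for this
example; the only real interface used is sysctl(8) with the -n flag. On a
system without geom_multipath the lookup fails and the script reports
"unknown".

```sh
#!/bin/sh
# Sketch: report the current value of kern.geom.multipath.exclusive.
# describe_exclusive is a hypothetical helper, not part of FreeBSD.

describe_exclusive() {
    case "$1" in
        1) echo "exclusive=1: exclusive mode enabled" ;;
        0) echo "exclusive=0: exclusive mode disabled (not recommended in this thread)" ;;
        *) echo "exclusive=unknown: sysctl not available on this system" ;;
    esac
}

# -n prints only the value; fall back to "unknown" if the sysctl is missing.
value=$(sysctl -n kern.geom.multipath.exclusive 2>/dev/null || echo unknown)
describe_exclusive "$value"
```

The script only reads the value; it never writes to the sysctl.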
>>>> In the linux implementation we can run traffic on both block devices and
>>>> multipath devices.
There is an Easter egg in GEOM that allows the same thing:
kern.geom.debugflags=16. It allows access to raw GEOM devices in any
situation. But it is a hack and should be treated as such, with all
due caution.
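A small sketch of how one might inspect that flag without clobbering other
debug bits. The value 16 is the 0x10 bit of kern.geom.debugflags; the
helper has_flag16 is a made-up name for this example. The script only
prints the command it would run and does not change the sysctl itself.

```sh
#!/bin/sh
# Sketch: test whether bit 0x10 (decimal 16) is set in kern.geom.debugflags.
# has_flag16 is a hypothetical helper, not part of FreeBSD.

has_flag16() {
    [ $(( $1 & 16 )) -ne 0 ]
}

# Read the current value; fall back to 0 where the sysctl does not exist.
current=$(sysctl -n kern.geom.debugflags 2>/dev/null || echo 0)

if has_flag16 "$current"; then
    echo "debugflags=$current: bit 16 is set"
else
    echo "debugflags=$current: bit 16 is not set"
    # Setting it (as root) while keeping any other bits would look like this:
    echo "would run: sysctl kern.geom.debugflags=$(( current | 16 ))"
fi
```

Restoring the previous value after the experiment keeps the hack from
staying enabled longer than needed.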
>>>> Issue 2:
>>>> --------------
>>>> I run some fio traffic utility over multipath device dm0 on initiator
>>>> side with port toggling in a loop
>>>>
>>>> Port 1 down --> sleep 2 mins (iSCSI/ISER device reconnecting meanwhile
>>>> with no success) --> port 1 up --> sleep 5 mins (iSCSI/ISER device
>>>> reconnecting successfully)
>>>> Port 2 down --> sleep 2 mins (iSCSI/ISER device reconnecting meanwhile
>>>> with no success) --> port 2 up --> sleep 5 mins (iSCSI/ISER device
>>>> reconnecting successfully)
>>>>
>>>> The expected result is that when port N is down, the traffic
>>>> moves to the available port and continues successfully.
>>>> I run this test for many hours and traffic FAILED (even though there was
>>>> at least 1 suitable path between initiator and target).
I am not aware of such a bug. We may need some deeper debugging to
diagnose it.
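As a starting point for reproducing it, the toggling schedule from the
report could be scripted roughly as follows. This is only a sketch under
assumptions: it assumes the ports are toggled with ifconfig on the
initiator (the report does not say how), the interface names are
placeholders, and run/toggle_once are made-up helper names. By default it
runs dry and only prints the commands.

```sh
#!/bin/sh
# Sketch of the port-toggling schedule from the report, dry-run by default.
# Assumption: ports are toggled with ifconfig; interface names are placeholders.
PORTS="${PORTS:-port1_if port2_if}"
DOWN_SECS="${DOWN_SECS:-120}"   # 2 minutes with the port down
UP_SECS="${UP_SECS:-300}"       # 5 minutes after the port is back up
DRY_RUN="${DRY_RUN:-1}"         # 1 = only print the commands

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# One pass over all ports: down, wait, record state, up, wait, record state.
toggle_once() {
    for ifname in $PORTS; do
        run ifconfig "$ifname" down
        run sleep "$DOWN_SECS"
        run gmultipath status dm0
        run ifconfig "$ifname" up
        run sleep "$UP_SECS"
        run gmultipath status dm0
    done
}

toggle_once
```

Wrapping toggle_once in an endless loop with DRY_RUN=0, while fio runs
against /dev/multipath/dm0, would give a log of path states to compare
against the time of the first I/O failure.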
> Though I realize it's difficult, it's easiest to look at the source to
> see who's been touching it last:
> https://svnweb.freebsd.org/base/head/sys/geom/multipath/g_multipath.c?view=log
>
> Looks like mav has been somewhat active recently... I've cc'd him...
>
>> Maybe we can discuss testing the gmultipath driver over iscsi/iser
>> devices and fix some bugs together ?
>
> I can help out some, but engaging mav, or some of the others that
> are more active in storage would be better...
>
>> We are planning to add it to our test plan and HA is in high priority
>> for us.
>
> You should definitely talk/coordinate w/ the people at iXsystems (mav
> works there), as they work in storage w/ iSCSI, etc.
I was not the original gmultipath author, but I have rewritten a
significant part of it to make it work, and now it works fine for us. I am
open to bug reports and proposals for improving it.
--
Alexander Motin