ZFS stalled after some mirror disks were lost
Ben RUBSON
ben.rubson at gmail.com
Fri Oct 6 10:09:00 UTC 2017
> On 02 Oct 2017, at 20:12, Ben RUBSON <ben.rubson at gmail.com> wrote:
>
> Hi,
>
> On a FreeBSD 11 server, the following online/healthy zpool :
>
> home
>   mirror-0
>     label/local1
>     label/local2
>     label/iscsi1
>     label/iscsi2
>   mirror-1
>     label/local3
>     label/local4
>     label/iscsi3
>     label/iscsi4
>   cache
>     label/local5
>     label/local6
>
> A sustained read throughput of 180 MB/s, 45 MB/s on each iSCSI disk
> according to "zpool iostat", and nothing on the local disks.
> No write IOs.
>
> Let's disconnect all iSCSI disks :
> iscsictl -Ra
>
> Expected behavior:
> IO activity continues flawlessly on the local disks.
>
> What happened:
> All IOs stalled; the server only answers IOs made to its zroot pool.
> All commands related to the iSCSI disks (iscsictl) or to ZFS (zfs/zpool)
> hang and never return.
>
> Questions:
> Why this behavior?
> How can I find out what is happening? (/var/log/messages says almost nothing)
>
> I have already disconnected the iSCSI disks several times in the past
> without any issue, but there were almost no IOs running at the time.
>
> Thank you for your help !
>
> Ben
Hello,
So first, many thanks again to Andriy; we spent almost 3 hours debugging the
stalled server to find the root cause of the issue.
It sounds like I will need help from the iSCSI dev team (Edward perhaps?), as
the issue seems to be on that side.
Here is Andriy's conclusion after the debug session; I quote him:
> So, it seems that the root cause of all evil is this outstanding zio (it might
> be not the only one).
> In other words, it looks like iscsi stack bailed out without completing all
> outstanding i/o requests that it had.
> It should either return success or error for every request; it cannot simply
> drop a request.
> And that appears to be what happened here.
> It looks like ZFS is fragile in the face of this type of errors.
> Essentially, each logical i/o request obtains a configuration lock of type 'zio'
> in shared mode to prevent certain configuration changes from happening while
> there are any outstanding zio-s.
> If a zio is lost, then this lock is leaked.
> Then, the code that deals with vdev failures tries to take this lock in
> exclusive mode while holding a few other configuration locks also in exclusive
> mode, so any other thread needing those locks would block.
> And there are code paths where a configuration lock is taken while
> spa_namespace_lock is held.
> And when spa_namespace_lock is never dropped then the system is close to toast,
> because all pool lookups would get stuck.
> I don't see how this can be fixed in ZFS.
> It seems that when the initiator is being removed it doesn't properly terminate
> in-flight requests.
> It would be interesting to see what happens if you test other scenarios.
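To convince myself I understood this lock-leak scenario, I put together a tiny
user-space model of it. This is only my own illustration, not OpenZFS code:
the real primitives are spa_config_enter()/spa_config_exit() on SCL_ZIO and
spa_namespace_lock, which I stand in for here with a plain pthread rwlock and
mutex.

/* Illustration only, not OpenZFS code: models a leaked shared hold. */
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

/* Stand-ins for SCL_ZIO and spa_namespace_lock. */
static pthread_rwlock_t scl_zio = PTHREAD_RWLOCK_INITIALIZER;
static pthread_mutex_t  spa_namespace = PTHREAD_MUTEX_INITIALIZER;

/* A logical i/o: takes the config lock shared, releases it on completion. */
static void *
lost_zio(void *arg)
{
	(void)arg;
	pthread_rwlock_rdlock(&scl_zio);	/* shared hold taken at zio creation */
	/* The iSCSI stack "drops" the request: completion never runs,
	 * so the matching unlock (the zio_done step) never happens. */
	pause();
	return (NULL);
}

/* The vdev-failure path: namespace lock, then the config lock exclusive. */
static void *
vdev_failure_path(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&spa_namespace);
	printf("failure handler: waiting for exclusive config lock...\n");
	pthread_rwlock_wrlock(&scl_zio);	/* blocks forever on the leaked hold */
	printf("never reached\n");
	return (NULL);
}

int
main(void)
{
	pthread_t t1, t2;

	pthread_create(&t1, NULL, lost_zio, NULL);
	sleep(1);
	pthread_create(&t2, NULL, vdev_failure_path, NULL);
	sleep(1);

	/* Anything else needing the namespace lock (zpool/zfs commands,
	 * pool lookups) now queues up behind the stuck failure handler. */
	printf("main: taking the namespace lock (think 'zpool status')...\n");
	pthread_mutex_lock(&spa_namespace);
	printf("never reached either\n");
	return (0);
}

The model hangs exactly like the pool did: the exclusive taker waits forever
for the leaked shared hold, and everything that then needs the namespace lock
queues up behind it.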
So I tested the following other scenarios:
1 - drop all iSCSI traffic using ipfw on the target
2 - ifdown the iSCSI NIC on the target
3 - ifdown the iSCSI NIC on the initiator
4 - stop ctld (on the target, of course)
I tested all of them several times, 5 or 6 times each I think.
I managed to trigger a kernel panic (!) twice:
the first time in case 2, the second time in case 4.
Though I am not sure the other test cases could not have triggered a panic as well.
Stack traces :
https://s1.postimg.org/2hfdpsvban/panic_case2.png
https://s1.postimg.org/2ac5ud9t0f/panic_case4.png
(kgdb) list *g_io_request+0x4a7
0xffffffff80a14dc7 is in g_io_request (/usr/src/sys/geom/geom_io.c:638).
633 g_bioq_unlock(&g_bio_run_down);
634 /* Pass it on down. */
635 if (first)
636 wakeup(&g_wait_down);
637 }
638 }
639
640 void
641 g_io_deliver(struct bio *bp, int error)
642 {
I had some kernel panics on the same servers a few months ago,
losing iSCSI targets which were used in a gmirror together with local disks.
gmirror should have continued to work flawlessly (just like ZFS should have)
using the local disks, but the server crashed.
Stack traces :
https://s1.postimg.org/14v4sabhv3/panic_g_destroy1.png
https://s1.postimg.org/437evsk6rz/panic_g_destroy2.png
https://s1.postimg.org/8pt1whiy5b/panic_g_destroy3.png
(kgdb) list *g_destroy_consumer+0x53
0xffffffff80a18563 is in g_destroy_consumer (geom.h:369).
364 KASSERT(g_valid_obj(ptr) == 0,
365 ("g_free(%p) of live object, type %d", ptr,
366 g_valid_obj(ptr)));
367 }
368 #endif
369 free(ptr, M_GEOM);
370 }
371
372 #define g_topology_lock() \
373 do { \
> I think that all problems that you have seen are different sides of the same
> underlying issue. It looks like iscsi does not properly depart from geom and
> leaves behind some dangling pointers...
>
> The panics you got today most likely occurred here:
> bp->bio_to->geom->start(bp);
>
> And the most likely reason is that bio_to points to a destroyed geom provider.
>
> I wonder if you'd be able to get into direct contact with a developer
> responsible for iscsi in FreeBSD. I think that it is a relatively recent
> addition and it was under a FreeBSD Foundation project. So, I'd expect that the
> developer should be responsive.
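For what it's worth, here is a tiny user-space model of what such a dangling
bio_to would mean. Again this is only my own illustration, not the FreeBSD
sources; the structs below are simplified stand-ins for GEOM's g_provider,
g_geom and struct bio.

/* Illustration only, not FreeBSD code: use-after-free via a stale bio_to. */
#include <stdio.h>
#include <stdlib.h>

struct bio;

struct geom {
	void (*start)(struct bio *);
};

struct provider {		/* plays the role of GEOM's g_provider */
	struct geom *geom;
};

struct bio {
	struct provider *bio_to;	/* destination provider, as in GEOM */
};

static void
disk_start(struct bio *bp)
{
	printf("start() called for bio %p\n", (void *)bp);
}

int
main(void)
{
	struct geom *gp = malloc(sizeof(*gp));
	struct provider *pp = malloc(sizeof(*pp));
	struct bio *bp = malloc(sizeof(*bp));

	gp->start = disk_start;
	pp->geom = gp;
	bp->bio_to = pp;		/* bio queued, still pointing at pp */

	/* The iSCSI device goes away: provider and geom are torn down
	 * while the bio is still sitting on the down queue. */
	free(pp);
	free(gp);

	/* Later the down thread dispatches the queued bio: */
	bp->bio_to->geom->start(bp);	/* use-after-free */

	free(bp);
	return (0);
}

In user space this may or may not crash immediately (it is plain
use-after-free), but in the kernel, dereferencing a freed provider from the
g_down thread would give exactly the kind of trap shown in the screenshots
above.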
Feel free then to contact me if needed, so that we can dig further into this!
Thank you very much for your help,
Ben