[Bug 219760] ZFS iSCSI w/ Win10 Initiator Causes pool corruption

bugzilla-noreply at freebsd.org bugzilla-noreply at freebsd.org
Fri Aug 11 08:09:29 UTC 2017


https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=219760

emz at norma.perm.ru changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |emz at norma.perm.ru

--- Comment #6 from emz at norma.perm.ru ---
I observed similar behaviour on one of my SAN systems.

In my opinion, iSCSI + zfs got broken somewhere between r310734 and r320056.

Symptoms:

- random fatal trap 12 panics.
- random general protection fault panics.
- random "Solaris(panic): zfs: allocating allocated segment" panics.
- zfs pool corruption that happens ONLY on pools that serve zvols as iSCSI
target devices.
- zfs pool corruption happening _on the fly_, without the system panicking.
- no zfs corruption on the pools that do not serve devices for the iSCSI
targets.

I have 7 SAN systems running this setup. No system at r310734 or older is
showing that behaviour. The only system more recent than r310734 (at least
r320056, and up to 11.1-RELEASE) was affected by this, and became healthy when
downgraded to r310734 (r310734 was chosen simply because it's the most recent
revision on all of the 7).

At first I had the strong impression that we had a hardware problem. Memtest86+
found no problems. We found multiple SMART ATA errors that were caused by bad
cabling, and that seemed to be the root cause for the moment, but after
switching to a new cable (and also to a new HBA, new server and new enclosure)
the problem did not go away. It was solved only after the downgrade to
r310734. The SAN system has now been up and running for 48 hours without pool
corruption (which usually happened within the first 12 hours of running) and
without panics (which usually happened within the first 24 hours).

Unfortunately, I have no crash dumps, because mpr(4) prevents crash dump
collection (see the discussion on freebsd-scsi@). I have only the
backtraces from serial-over-ethernet IPMI, which I will attach here.

Initial description of the problem:

https://lists.freebsd.org/pipermail/freebsd-fs/2017-August/025099.html
