A strange SCSI-disk-detached issue for FreeBSD VM on Hyper-V/Azure

Dexuan-BSD Cui dexuan.bsd at gmail.com
Tue Mar 13 22:46:05 UTC 2018


I reported a bug for this issue:
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=226583

I'm working on a fix.

-- Dexuan

On Mon, Jan 29, 2018 at 7:44 PM, Dexuan-BSD Cui <dexuan.bsd at gmail.com> wrote:
> With the the recent Azure host upgrade, some FreeBSD VMs can't
> properly detect HDD-backed disks.
> The issue happens to the latest 12-CURRENT too.
>
> da0 (0:0:0:0)is also backed by HDD, but somehow is not affected.
> da1 (1:0:1:0) it not affected either, only the extra disks, e,g. da2
> (3:0:0:0), are affected.
>
> These are the logs of my VM, which has da0, da1 and da2. These 3 disks
> are on 3 different disk controllers.
>
>
> [root at decui-d4-repro ~]# dmesg -c > /dev/null
> [root at decui-d4-repro ~]# camcontrol rescan all
> Re-scan of bus 0 was successful
> Re-scan of bus 1 was successful
> Re-scan of bus 2 was successful
> Re-scan of bus 3 was successful
> Re-scan of bus 4 was successful
> Re-scan of bus 5 was successful
>
> [root at decui-d4-repro ~]# dmesg
> (aprobe0:ata0:0:0:0): Disable ATA disks on simulated ATA controller (0x71118086)
> (aprobe0:ata0:0:1:0): Disable ATA disks on simulated ATA controller (0x71118086)
> (probe1:(probe0:blkvsc0:0:1:0): Down reving Protocol Version from 4 to 2?
> blkvsc0:0:0:0): storvsc inquiry (36) [0 0 4 2 1f ... ]
> (probe1:blkvsc0:0:1:0): storvsc inquiry (36) [7f 0 5 2 1f ... ]
> (probe0:blkvsc0:0:0:0): storvsc scsi_status = 2, srb_status = 6
> (probe0:(probe1:blkvsc0:0:0:0): storvsc inquiry (5) [0 0 0 5 0 ... ]
> (probe0:blkvsc0:0:1:1): Down reving Protocol Version from 4 to 2?
> blkvsc0:0:0:0): storvsc inquiry (6) [0 83 0 1c 1 ... ]
> (probe1:blkvsc0:0:1:1): invalid LUN 1 for op: INQUIRY
> (probe0:blkvsc0:0:0:1): invalid LUN 1 for op: INQUIRY
> (probe0:blkvsc1:0:0:0): Down reving Protocol Version from 4 to 2?
> (probe0:blkvsc1:0:0:0): storvsc inquiry (36) [7f 0 5 2 1f ... ]
> (probe1:(probe0:blkvsc1:0:1:0): storvsc inquiry (36) [0 0 5 2 1f ... ]
> (probe1:blkvsc1:0:0:1): Down reving Protocol Version from 4 to 2?
> blkvsc1:0:1:0): storvsc scsi_status = 2, srb_status = 6
> (probe1:blkvsc1:0:1:0): storvsc inquiry (5) [0 0 0 8 0 ... ]
> (probe0:blkvsc1:0:0:1): invalid LUN 1 for op: INQUIRY
> (probe1:blkvsc1:0:1:0): storvsc inquiry (6) [0 83 0 30 1 ... ]
> (probe0:storvsc2:0:0:0): Down reving Protocol Version from 4 to 2?
> (probe1:(probe0:storvsc2:0:1:0): Down reving Protocol Version from 4 to 2?
> storvsc2:0:0:0): storvsc inquiry (36) [7f 0 5 2 1f ... ]
> (probe1:(probe0:storvsc2:0:1:0): storvsc inquiry (36) [7f 0 5 2 1f ... ]
> (probe1:storvsc2:0:0:1): Down reving Protocol Version from 4 to 2?
> (probe0:storvsc2:0:1:1): Down reving Protocol Version from 4 to 2?
> (probe1:storvsc2:0:0:1): invalid LUN 1 for op: INQUIRY
> storvsc2:0:1:1): invalid LUN 1 for op: INQUIRY
> (probe0:storvsc3:0:0:0): Down reving Protocol Version from 4 to 2?
> (probe1:(probe0:storvsc3:0:1:0): Down reving Protocol Version from 4 to 2?
> storvsc3:0:0:0): storvsc inquiry (36) [0 0 4 2 1f ... ]
> (probe1:(probe0:storvsc3:0:1:0): storvsc inquiry (36) [7f 0 5 2 1f ... ]
> storvsc3:0:0:0): storvsc scsi_status = 2, srb_status = 6
> (probe0:(probe1:storvsc3:0:0:0): storvsc inquiry (5) [0 0 0 5 0 ... ]
> (probe0:storvsc3:0:1:1): Down reving Protocol Version from 4 to 2?
> storvsc3:0:0:0): storvsc inquiry (6) [0 83 0 1c 1 ... ]
> (probe1:storvsc3:0:1:1): invalid LUN 1 for op: INQUIRY
> GEOM: new disk da2
> (probe0:pass3 at storvsc3 bus 0 scbus5 target 0 lun 0
> storvsc3:0:0:1): invalid LUN 1 for op: INQUIRY
> pass3: <Msft Virtual Disk 1.0> Fixed Direct Access SPC-2 SCSI device
> pass3: 300.000MB/s transfers
> pass3: Command Queueing enabled
> pass3 at storvsc3 bus 0 scbus5 target 0 lun 0
> pass3: <Msft Virtual Disk 1.0> detached
> (da2:storvsc3:0:0:0): fatal error, could not acquire reference count
> (pass3:(da2:storvsc3:0:storvsc3:0:0:0): storvsc inquiry (5) [0 b1 0 3c 0 ... ]
> 0:0): Periph destroyed
> da2: Delete methods: <NONE(*),ZERO>
> Opened disk da2 -> 6
> g_access(918): provider da2 has error
> g_access(918): provider da2 has error
> g_access(918): provider da2 has error
> g_access(918): provider da2 has error
>
>
> According to the log, first it looks da2 is indeed detected, but then
> it's detached immediately:
>
> pass3: <Msft Virtual Disk 1.0> detached
>
> The above line comes from xpt_denounce_periph_sbuf(), which is called by:
>
> cam_periph_invalidate() at cam_periph_invalidate+0xe7/frame 0xfffffe00004eb7f0
> xpt_async_process_dev() at xpt_async_process_dev+0x162/frame 0xfffffe00004eb840
> xptdevicetraverse() at xptdevicetraverse+0xa3/frame 0xfffffe00004eb890
> xpttargettraverse() at xpttargettraverse+0x6a/frame 0xfffffe00004eb8d0
> xpt_async_process() at xpt_async_process+0x114/frame 0xfffffe00004eb9e0
> xpt_done_process() at xpt_done_process+0x38a/frame 0xfffffe00004eba20
> xpt_done_td() at xpt_done_td+0x136/frame 0xfffffe00004eba70
> fork_exit() at fork_exit+0x84/frame 0xfffffe00004ebab0
> fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe00004ebab
>
>
> I'm trying to figure out why CAM detaches da2 -- it looks something is wrong?
>
> Currently, with the below hack, da2 can be recognized and
> reading/writing da2 works fine:
>
> --- a/sys/cam/cam_periph.c
> +++ b/sys/cam/cam_periph.c
> @@ -1942,8 +1942,10 @@ cam_periph_error(union ccb *ccb, cam_flags camflags,
>                          * Let peripheral drivers know that this
>                          * device has gone away.
>                          */
> +#if 0
>                         xpt_async(AC_LOST_DEVICE, newpath, NULL);
>                         xpt_free_path(newpath);
> +#endif
>                 }
>         }
>
> And I'm further debugging. I appreciate any suggestion.
>
> Thanks!
>
> -- Dexuan


More information about the freebsd-scsi mailing list