Re: USB Disk Stalls on -current

From: Sean Bruno <sbruno_at_freebsd.org>
Date: Sun, 06 Feb 2022 19:02:19 UTC

> 
> 
> So there's some tools you can use. For usb, there's usbdump that can
> get you the USB transactions. I've not used it enough to give more details
> here. This will let you know what's going on, and when, on the USB endpoint.
> 
> You can also enable the CAM_IOSCHED stuff. This will allow you to get latency
> measurements for 'requests in the sim' which basically will tell you what your
> latency spread is for the drives. This will tell you if things are getting caught
> up in the USB layer, or after CAM's da driver completes the I/O request
> (granted, that's almost certainly not happening, but it will help you figure out
> what's going on and put numbers to the oddities you are seeing).
> 
> Also, make sure you have good cables. I've had lots of hiccups over the
> years from dodgy USB cables. Also make sure you have good, high quality
> enclosures. Many from the USB2 time-period are sketchy at best and I
> went through several at one point trying to find a good one. I'd be tempted to
> get USB 3 enclosures. I've had better luck with USB3 gear than USB2 gear
> here, but you need a USB-3 controller to get USB-3 speeds which might not
> be compatible with the NUC's built-in stuff (though my NUC has one USB3
> port, there's lots of different models).
> 
> Usually, though, I see weirdness associated with dmesg messages from
> usb, cam, etc. when the hardware is on the sketchy end.
> 
> Warner

I'm assuming that I have a fairly dodgy USB device, as the pauses seem
to coincide with CAM emitting this:

Feb  6 11:56:43 alice kernel: (da0:umass-sim1:1:0:0): READ(10). CDB: 28 00 36 69 02 6e 00 00 80 00
Feb  6 11:56:43 alice kernel: (da0:umass-sim1:1:0:0): CAM status: CCB request completed with an error
Feb  6 11:56:43 alice kernel: (da0:umass-sim1:1:0:0): Retrying command, 2 more tries remain
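
For reference, the CDB in that message can be decoded to see which
blocks the failing read was aimed at. The following is only a rough
sketch in Python, based on the standard READ(10) layout (opcode 0x28,
big-endian LBA in bytes 2-5, transfer length in bytes 7-8). The names
Read10 and decode_read10 are made up for illustration, and the
512-byte sector size is an assumption about this drive, not something
the log states.

```python
from dataclasses import dataclass


@dataclass
class Read10:
    lba: int      # starting logical block address
    blocks: int   # number of blocks requested


def decode_read10(cdb_hex: str) -> Read10:
    """Decode a READ(10) CDB given as space-separated hex bytes."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a READ(10) CDB")
    lba = int.from_bytes(cdb[2:6], "big")      # bytes 2-5: LBA
    blocks = int.from_bytes(cdb[7:9], "big")   # bytes 7-8: transfer length
    return Read10(lba, blocks)


if __name__ == "__main__":
    cmd = decode_read10("28 00 36 69 02 6e 00 00 80 00")
    sector_size = 512  # assumption: not reported in the log above
    print(f"LBA {cmd.lba}, {cmd.blocks} blocks, "
          f"{cmd.blocks * sector_size} bytes")
```

If the same LBA range keeps showing up across retries, that would
point more toward the media or the enclosure than toward the USB
stack in general.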


Things resume after this is emitted, but there is a substantial
(multiple minutes) pause here. I would have expected the timeouts to
fire much more quickly.

sean