Re: USB Disk Stalls on -current
- Reply: Mehmet Erol Sanliturk : "Re: USB Disk Stalls on -current"
- In reply to: Sean Bruno : "Re: USB Disk Stalls on -current"
Date: Sun, 06 Feb 2022 19:11:01 UTC
On Sun, Feb 6, 2022 at 12:02 PM Sean Bruno <sbruno@freebsd.org> wrote:

> > So there's some tools you can use. For usb, there's usbdump that can
> > get you the USB transactions. I've not used it enough to give more
> > details here. This will let you know what's going on, and when, on
> > the USB endpoint.
> >
> > You can also enable the CAM_IOSCHED stuff. This will allow you to get
> > latency measurements for 'requests in the sim' which basically will
> > tell you what your latency spread is for the drives. This will tell
> > you if things are getting caught up in the USB layer, or after CAM's
> > da driver completes the I/O request (granted, that's almost certainly
> > not happening, but it will help you figure out what's going on and
> > put numbers to the oddities you are seeing).
> >
> > Also, make sure you have good cables. I've had lots of hiccups over
> > the years from dodgy USB cables. Also make sure you have good, high
> > quality enclosures. Many from the USB2 time-period are sketchy at
> > best and I went through several at one point trying to find a good
> > one. I'd be tempted to get USB 3 enclosures. I've had better luck
> > with USB3 gear than USB2 gear here, but you need a USB-3 controller
> > to get USB-3 speeds which might not be compatible with the NUC's
> > built-in stuff (though my NUC has one USB3 port, there's lots of
> > different models).
> >
> > Usually, though, I see weirdness associated with dmesg messages from
> > usb, cam, etc when the hardware is on the sketch end.
> >
> > Warner
>
> I'm assuming that I have a fairly dodgy USB device, as the pauses seem
> to correspond to this from CAM being emitted:
>
> Feb 6 11:56:43 alice kernel: (da0:umass-sim1:1:0:0): READ(10). CDB: 28 00 36 69 02 6e 00 00 80 00
> Feb 6 11:56:43 alice kernel: (da0:umass-sim1:1:0:0): CAM status: CCB request completed with an error
> Feb 6 11:56:43 alice kernel: (da0:umass-sim1:1:0:0): Retrying command, 2 more tries remain
>
> Things resume after this is emitted, but there is a substantial
> (multiple minutes) pause here. I would assume that timeouts would fire
> much quicker.

The default timeout is 60s. You can reduce that substantially by setting
kern.cam.da.default_timeout to a smaller value. Disk operations complete
within 5s these days, except spin-ups. Heck, nearly all complete within
500ms. You might try setting this value to maybe 3 or 5 or 10 seconds to
see if that helps the hiccups without introducing extra retries when the
load is heavy. The smaller values give a faster recovery, but too small a
number may result in timeouts and errors under load. I think you need to
set this as a tunable.

Warner
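
To put numbers on the failing request quoted above, the CDB bytes from the log can be decoded by hand or with a short script. The following is only a sketch: it assumes the standard 10-byte READ(10) layout (opcode 0x28, a 4-byte big-endian LBA, a 2-byte big-endian transfer length), and the helper name decode_read10 and the 512-byte default sector size are assumptions made for illustration, not something stated in this thread.

```python
import struct


def decode_read10(cdb_hex, sector_size=512):
    """Decode a SCSI READ(10) CDB given as space-separated hex bytes.

    sector_size is an assumption (512 bytes); the real value depends
    on the drive and is not shown in the log.
    """
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10:
        raise ValueError("READ(10) CDB must be exactly 10 bytes")

    # Byte layout: opcode, flags, 4-byte LBA, group number,
    # 2-byte transfer length (in blocks), control. All big-endian.
    opcode, flags, lba, group, blocks, control = struct.unpack(">BBIBHB", cdb)
    if opcode != 0x28:
        raise ValueError("not a READ(10) command: opcode 0x%02x" % opcode)

    return {
        "lba": lba,
        "blocks": blocks,
        "byte_offset": lba * sector_size,
        "byte_length": blocks * sector_size,
    }


# CDB copied from the log line in the quoted message.
info = decode_read10("28 00 36 69 02 6e 00 00 80 00")
print(info["lba"], info["blocks"])  # 912851566 128
```

Under those assumptions the stalled command is a 128-block read starting at LBA 912851566, which would be 64 KiB on a drive with 512-byte sectors. Comparing the LBAs of successive failures may hint at whether the errors cluster in one region of the disk or are spread out, as one would expect from a flaky cable or enclosure.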