LTO-3 / scsi woes
Robin Blanchard
robin.blanchard at itos.uga.edu
Tue Jan 15 12:24:44 PST 2008
By "bypass" I mean I plugged the LTO drive directly to the controller
rather than in-line through the library. I am only using sym0 (the LVD
channel).
These are the devices in question:
<QUALSTAR RLS-8204-20 006D> at scbus0 target 0 lun 0 (ch0,pass0)
<IBM ULTRIUM-TD3 73P5> at scbus0 target 1 lun 0 (sa0,pass1)
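For anyone comparing device lists across several cabling attempts, here is a minimal sketch that parses lines in the format shown above. It assumes the lines come from something like FreeBSD's `camcontrol devlist` (that is an assumption, not stated in this message); the names `parse_devlist` and `duplicate_addresses` are made up for illustration.

```python
import re

# Matches lines such as:
#   <IBM ULTRIUM-TD3 73P5> at scbus0 target 1 lun 0 (sa0,pass1)
DEVLIST_LINE = re.compile(
    r"<(?P<ident>[^>]+)>\s+at\s+scbus(?P<bus>\d+)\s+"
    r"target\s+(?P<target>\d+)\s+lun\s+(?P<lun>\d+)\s+"
    r"\((?P<periphs>[^)]+)\)"
)

def parse_devlist(text):
    """Return one dict per device line found in text; other lines are skipped."""
    devices = []
    for line in text.splitlines():
        m = DEVLIST_LINE.search(line)
        if m is None:
            continue
        devices.append({
            "ident": m.group("ident").strip(),
            "bus": int(m.group("bus")),
            "target": int(m.group("target")),
            "lun": int(m.group("lun")),
            "periphs": m.group("periphs").split(","),
        })
    return devices

def duplicate_addresses(devices):
    """Group devices by (bus, target, lun) and keep only addresses used more than once."""
    seen = {}
    for dev in devices:
        key = (dev["bus"], dev["target"], dev["lun"])
        seen.setdefault(key, []).append(dev["ident"])
    return {key: idents for key, idents in seen.items() if len(idents) > 1}

# The two device lines quoted in this message.
SAMPLE = """\
<QUALSTAR RLS-8204-20 006D> at scbus0 target 0 lun 0 (ch0,pass0)
<IBM ULTRIUM-TD3 73P5> at scbus0 target 1 lun 0 (sa0,pass1)
"""

if __name__ == "__main__":
    for dev in parse_devlist(SAMPLE):
        print(dev)
    print("address clashes:", duplicate_addresses(parse_devlist(SAMPLE)))
```

This only checks that no two devices report the same bus/target/lun; it says nothing about termination or negotiated speed.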
Thanks for your help....
> -----Original Message-----
> From: Bob Hetzel [mailto:beh at case.edu]
> Sent: Tuesday, January 15, 2008 2:56 PM
> To: Robin Blanchard
> Subject: Re: [Bacula-users] LTO-3 / scsi woes
>
> Robin,
>
> The FreeBSD log says the controller is operating at Fast-80... that
> doesn't sound good. I also noted that it's a dual channel with the
> other channel operating SE (single ended, which is "high voltage" scsi
> as opposed to LVD or low voltage).
>
> Also, when you say "bypassed" can you clarify?
>
> Bob
>
> Robin Blanchard wrote:
> > Bob,
> >
> > Thanks for the lengthy reply and suggestions. I've swapped terminators
> > (I've got a U160 and a U320, both with indicator lights -- both
> > indicate 'green'), as well as cables, cards, and even the drive itself
> > (we have two of the same library, each with a single drive). The
> > "closest" I've come thus far is to bypass the library/exchanger and
> > connect only the LTO-3 drive, and even then I have to set the speed to
> > 20 MB/s. Using an Adaptec card, I get absolutely nowhere at all, hence
> > the current use of the LSI card (which, yes, is LVD/SE). I just
> > installed FreeBSD (as opposed to RHEL5) to see if I could glean
> > anything else useful. The attached dmesg is from FreeBSD 6.2-STABLE
> > with the LSI (sym) card, and the library/drive both set in the BIOS to
> > defaults/auto.
> >
> >
> > With an Adaptec 2940U2W, I get nothing but garbage:
> >
> > <<<<<<<<<<<<<<<<< Dump Card State Ends >>>>>>>>>>>>>>>>>>
> > (probe0:ahc0:0:0:3): SCB 0xe - timed out
> > sg[0] - Addr 0x37d084 : Length 36
> > (probe0:ahc0:0:0:3): Other SCB Timeout
> > ahc0: Timedout SCBs already complete. Interrupts may not be functioning.
> > ahc0: Recovery Initiated
> >>>>>>>>>>>>>>>>>>> Dump Card State Begins <<<<<<<<<<<<<<<<<
> > ahc0: Dumping Card State in Data-in phase, at SEQADDR 0xa0
> > Card was paused
> > ACCUM = 0x40, SINDEX = 0xa, DINDEX = 0xe4, ARG_2 = 0x0
> > HCNT = 0x0 SCBPTR = 0x0
> > SCSISIGI[0x54]:(BSYI|ATNI|IOI) ERROR[0x0] SCSIBUSL[0x0]
> > LASTPHASE[0x40]:(IOI) SCSISEQ[0x12]:(ENAUTOATNP|ENRSELI)
> > SBLKCTL[0xa]:(SELWIDE|SELBUSB) SCSIRATE[0x93]:(SINGLE_EDGE|WIDEXFER)
> > SEQCTL[0x10]:(FASTMODE) SEQ_FLAGS[0x20]:(DPHASE)
> > SSTAT0[0x5]:(DMADONE|SDONE)
> > SSTAT1[0x2]:(PHASECHG) SSTAT2[0x0] SSTAT3[0x0] SIMODE0[0x8]:(ENSWRAP)
> > SIMODE1[0xac]:(ENSCSIPERR|ENBUSFREE|ENSCSIRST|ENSELTIMO)
> > SXFRCTL0[0x88]:(SPIOEN|DFON) DFCNTRL[0x0]
> > DFSTATUS[0x89]:(FIFOEMP|HDONE|PRELOAD_AVAIL)
> > STACK: 0x0 0x167 0x17d 0x84
> > SCB count = 20
> > Kernel NEXTQSCB = 7
> > Card NEXTQSCB = 14
> > QINFIFO entries: 14
> > Waiting Queue entries:
> > Disconnected Queue entries:
> > QOUTFIFO entries:
> > Sequencer Free SCB List: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
> > Sequencer SCB Info:
> > 0 SCB_CONTROL[0x40]:(DISCENB) SCB_SCSIID[0x17] SCB_LUN[0x1]
> > SCB_TAG[0x1]
> > 1 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
> > SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
> > 2 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
> > SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
> > 3 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
> > SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
> > 4 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
> > SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
> > 5 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
> > SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
> > 6 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
> > SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
> > 7 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
> > SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
> > 8 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
> > SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
> > 9 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
> > SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
> > 10 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
> > SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
> > 11 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
> > SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
> > 12 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
> > SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
> > 13 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
> > SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
> > 14 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
> > SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
> > 15 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
> > SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
> > 16 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
> > SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
> > 17 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
> > SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
> > 18 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
> > SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
> > 19 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
> > SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
> > 20 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
> > SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
> > 21 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
> > SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
> > 22 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
> > SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
> > 23 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
> > SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
> > 24 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
> > SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
> > 25 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
> > SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
> > 26 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
> > SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
> > 27 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
> > SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
> > 28 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
> > SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
> > 29 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
> > SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
> > 30 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
> > SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
> > 31 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
> > SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
> > Pending list:
> > 14 SCB_CONTROL[0x40]:(DISCENB) SCB_SCSIID[0x7] SCB_LUN[0x3]
> > 1 SCB_CONTROL[0x40]:(DISCENB) SCB_SCSIID[0x17] SCB_LUN[0x1]
> > Kernel Free SCB list: 15 16 17 18 19 0 2 3 4 5 6 8 9 13 12 11 10
> > Untagged Q(0): 14
> > Untagged Q(1): 1
> >
> >
> >
> >> -----Original Message-----
> >> From: Bob Hetzel [mailto:beh at case.edu]
> >> Sent: Tuesday, January 15, 2008 2:08 PM
> >> To: Robin Blanchard
> >> Cc: Allan Black
> >> Subject: Re: [Bacula-users] LTO-3 / scsi woes
> >>
> >> Robin,
> >>
> >> Just a few additions to what Allan said... termination is not simply
> >> one size fits all. On most controllers the internal and external
> >> connectors are considered part of the same bus. You always need to
> >> terminate both ends of the bus. If you leave a connector unused, the
> >> controller generally terminates that "stub" for you. If both
> >> connectors are in use, the controller is considered to be in the
> >> middle of the bus, if memory serves, and you'd therefore need to
> >> terminate both ends, either with an actual physical terminator or with
> >> the terminator built into the device at the end, if it has one (many
> >> devices no longer come with that option and I don't think any
> >> terminate automatically).
> >>
> >> In any case, you need to read the manual for whatever controller
> >> you're using and do what it says to do. You also need to look into the
> >> instructions about SCSI target IDs. If you're using cheap internal
> >> connectors that aren't keyed, there's also a chance you've got one on
> >> backward. You also need to make sure your terminator is good for LVD
> >> (Ultra 160) devices. I suspect you're only using external devices.
> >> Most external terminators have a light, and perhaps some info printed
> >> on them. Some change the light color depending on what they detect on
> >> the bus. If the light isn't lit, the terminator may not be working
> >> properly or you may have a goofy termpower setting somewhere.
> >>
> >> I can't seem to google the LSI controller, are you sure it's LVD?
> >>
> >> Also, there are cable length restrictions of around 25 feet total.
> >> Additionally, many times the more devices you use, the shorter the
> >> cable you can use (each device has wiring inside it, and with every
> >> change in cabling you add noise).
> >>
> >> If it's an IBM drive you can download IBM diags and extensively test
> >> it as well as communications to it. Likewise for HP and probably other
> >> drives.
> >>
> >> You may also need to check whether the firmware on the controller is
> >> current, as peripherals are sometimes shipped with bugs that get fixed
> >> in software later (blame the marketing guys for pushing production
> >> schedules up). It's also likely there's a firmware update available
> >> for the drive, but be careful about applying it while you have
> >> communications problems, as you could render the drive useless if
> >> corrupted firmware gets loaded into it.
> >>
> >> Bob
> >>
> >>> Message: 2
> >>> Date: Tue, 15 Jan 2008 00:10:22 +0000
> >>> From: Allan Black <Allan.Black at btconnect.com>
> >>> Subject: Re: [Bacula-users] LTO-3 / scsi woes
> >>> To: Robin Blanchard <robin.blanchard at itos.uga.edu>
> >>> Cc: uganet at listserv.uga.edu, bacula
> >>> <bacula-users at lists.sourceforge.net>
> >>> Message-ID: <478BF9EE.2020803 at btconnect.com>
> >>> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
> >>>
> >>> Robin Blanchard wrote:
> >>>>> I've been around the block with LSI and with adaptec: tried an LSI
> >>>>> SYMC101, an adaptec 2940U2W and a 39160. I've removed the
> >>>>> library/exchange from the equation, using only the LTO-3 drive (and
> >>>>> have actually swapped that drive out for another as well), swapped
> >>>>> SCSI cables, and terminators, and still am getting scsi errors.
> >>>>> Anyone got any tips/ideas here ?
> >>> To be honest, this looks like a termination problem (or, at least, a
> >>> bus problem of some sort). However, since you appear to have swapped
> >>> out every piece of hardware, the only thing left seems to be the
> >>> configuration ....
> >>>
> >>> Both the external and the internal segments of the SCSI bus need to be
> >>> properly terminated. If not, errors will occur on the bus and the HBA
> >>> will step back the clock speed in an attempt to make the bus work
> >>> reliably. So far so good - you are seeing parity errors and the HBA is
> >>> stepping the speed back from 160 MB/s to 40 MB/s.
> >>>
> >>> It looks as if there is either insufficient termination, or too much
> >>> termination, on the bus - too much termination can be as bad as no
> >>> termination.
> >>>
> >>> There should be configuration options in the HBA's BIOS to set
> >>> termination of the internal and external segments at the HBA. Usually
> >>> termination of the external and internal segments is configured
> >>> independently. Each can usually be set to auto (which is usually the
> >>> default and the manufacturer's recommended setting), or explicitly on
> >>> or off. Check you do not have termination switched on at the HBA *and*
> >>> a terminator at the end of the cable.
> >>>
> >>> Is the LTO3 drive internal or external, BTW? The internal segment of
> >>> the bus needs to follow the same rules as the external segment, but is
> >>> usually more difficult to get right - termination can occur at the
> >>> HBA, the drive(s) or the end of the cable. Normally for an Ultra 160
> >>> SCSI bus, the internal segment should have termination switched off
> >>> (or set to auto) at the HBA, the devices on the bus should be
> >>> unterminated and the ribbon cable should be terminated.
> >>>
> >>> Similarly (but less likely), check there is not an option in the LTO3
> >>> drive to terminate the bus. If the drive is terminating the bus, *and*
> >>> there is a terminator screwed onto the back, this will add up to too
> >>> much termination. Like I said, this is unlikely. If anything, if the
> >>> drive can terminate the bus, it will most probably be automatic.
> >>>
> >>> If you have a mixture of wide and narrow devices on the bus
> >>> (particularly the internal segment) termination can get a bit tricky :-)
> >>>
> >>> I have never used any of the 3 HBAs you mention at the start of the
> >>> email, but I have used a 2940N and a 29160N. A couple of points - the
> >>> 2940N did not (as far as I can remember) have an "auto" setting for
> >>> bus termination at the HBA; it could only be set to "on" or "off". The
> >>> 2940UW may be the same. Also, the 29160 BIOS tests the bus segments
> >>> and reports if it detects a termination problem. The 39160 is, I
> >>> believe, similar to the 29000 series, being mainly a dual-channel
> >>> version. It may tell you of a termination problem if you look
> >>> carefully at the BIOS output (and try not to blink at the wrong time
> >>> in case you miss it :-)
> >>>
> >>> Depending on how the HBA works, it is possible that a termination
> >>> error on one segment will affect both segments. If, for example, you
> >>> have an external device attached, but no internal devices, then the
> >>> HBA should have termination *on* for the internal segment, and *off*
> >>> for the external segment. [Or "auto", of course.]
> >>>
> >>> Allan
> >>>
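
The general rule from the quoted messages (terminate both ends of the bus, and nothing in between, since too much termination can be as bad as none) can be written down as a small check. This is only a sketch of that rule of thumb: the bus is modelled as an ordered list of positions from one physical end to the other, and the function name and data layout are made up for illustration. It does not model auto-termination, termpower, or wide/narrow mixes.

```python
def check_termination(chain):
    """Check a bus layout against the 'both ends, nothing in between' rule.

    chain is an ordered list of (name, terminated) tuples, from one
    physical end of the bus to the other. A screw-on terminator counts
    as its own position with terminated=True.
    Returns a list of problem descriptions; an empty list means the
    layout satisfies the rule.
    """
    if len(chain) < 2:
        return ["a bus needs at least two positions"]

    problems = []
    # Both physical ends must be terminated.
    for name, terminated in (chain[0], chain[-1]):
        if not terminated:
            problems.append(f"end of bus not terminated: {name}")
    # Nothing between the ends may be terminated.
    for name, terminated in chain[1:-1]:
        if terminated:
            problems.append(f"termination in the middle of the bus: {name}")
    return problems


if __name__ == "__main__":
    # Hypothetical external-only layout: HBA terminates its own end,
    # library and drive are daisy-chained, terminator on the last device.
    layout = [
        ("HBA", True),
        ("library", False),
        ("LTO-3 drive", False),
        ("external terminator", True),
    ]
    print(check_termination(layout) or "layout follows the rule")
```

A drive that terminates the bus and also has a terminator screwed onto the back would show up here as termination in the middle of the bus.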