Confirmation Of Drive Failure
Drew Tomlinson
drew at mykitchentable.net
Fri Nov 8 23:08:38 UTC 2013
I've been running FBSD on this home server for about 13 years. Now,
after a power outage, it will no longer boot.
The main drives were two SCSI drives striped using gstripe. I am fairly
certain da0 is dead and the reason it won't boot.
I know there was a working IDE drive before the crash, either internal
or in the FireWire enclosure. I had Bacula backups on that drive and
I'm hoping to be able to recover them.
If I'm interpreting the dmesg output below correctly, I think I should
have an ad1 drive (now named ada1) that I can mount. I'm hoping someone
can confirm or deny that and help me get it mounted if ada1 should be there.
My version of FBSD on the box was 6.4. I am now booted from the 9.2
Live CD.
I'm checking dmesg to see what devices are seen:
da0 at ahc0 bus 0 scbus2 target 0 lun 0
da0: <SEAGATE SX19171W 9D32> Fixed Direct Access SCSI-2 device
da0: 11.626MB/s transfers (5.813MHz, offset 8, 16bit)
da0: Command Queueing enabled
da0: 8683MB (17783112 512 byte sectors: 64H 32S/T 8683C)
da1 at ahc0 bus 0 scbus2 target 2 lun 0
da1: <SEAGATE SX19171W 9D32> Fixed Direct Access SCSI-2 device
da1: 11.626MB/s transfers (5.813MHz, offset 8, 16bit)
da1: Command Queueing enabled
da1: 8683MB (17783112 512 byte sectors: 64H 32S/T 8683C)
cd0 at ata1 bus 0 scbus1 target 0 lun 0
cd0: <HITACHI CDR-8435 0010> Removable CD-ROM SCSI-0 device
cd0: 16.700MB/s transfers (WDMA2, ATAPI 12bytes, PIO 65534bytes)
cd0: cd present [274042 x 2048 byte records]
ada0 at ata0 bus 0 scbus0 target 0 lun 0
ada0: <GENERIC GENERIC A08.1500> ATA-5 device
ada0: 33.300MB/s transfers (UDMA2, PIO 8192bytes)
ada0: 76319MB (156301488 512 byte sectors: 15H 63S/T 16383C)
ada0: Previously was known as ad0
ada1 at ata0 bus 0 scbus0 target 1 lun 0
ada1: <WDC WD800AB 03.06A> ATA-0 device
ada1: 3.300MB/s transfers (PIO0, PIO 8192bytes)
ada1: 0MB (0 512 byte sectors: 16H 63S/T 16383C)
ada1: Previously was known as ad1
SMP: AP CPU #1 Launched!
Thus if I'm reading this right, it's seeing two internal IDE drives, a
CD drive, and two SCSI drives?
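To sanity-check my reading of the size lines, here is a rough Python sketch (the helper names are my own, nothing from FreeBSD itself) that pulls the sector counts out of the capacity lines above and flags any device that attached but reports zero sectors. It assumes only the line format shown in this dmesg.

```python
import re

# Capacity lines copied from the dmesg output above.
DMESG = """\
da0: 8683MB (17783112 512 byte sectors: 64H 32S/T 8683C)
da1: 8683MB (17783112 512 byte sectors: 64H 32S/T 8683C)
ada0: 76319MB (156301488 512 byte sectors: 15H 63S/T 16383C)
ada1: 0MB (0 512 byte sectors: 16H 63S/T 16383C)
"""

# Matches lines such as "ada1: 0MB (0 512 byte sectors: ...".
CAPACITY_RE = re.compile(r"^(\w+): (\d+)MB \((\d+) (\d+) byte sectors")


def parse_capacities(text):
    """Return {device: (sector_count, sector_size)} for each capacity line."""
    disks = {}
    for line in text.splitlines():
        match = CAPACITY_RE.match(line)
        if match:
            name, _mb, sectors, size = match.groups()
            disks[name] = (int(sectors), int(size))
    return disks


def zero_size_disks(disks):
    """Names of devices that attached but reported no sectors."""
    return sorted(name for name, (sectors, _size) in disks.items() if sectors == 0)


disks = parse_capacities(DMESG)
for name, (sectors, size) in sorted(disks.items()):
    # Recompute the size in MB from sectors * sector size.
    print(f"{name}: {sectors * size // (1024 * 1024)} MB")
print("zero-size:", zero_size_disks(disks))
```

On the lines above this flags only ada1, which probed as a WD800AB but reported 0 sectors.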
Although I'm booted from the CD drive, it is apparently having problems,
judging by the many messages like these:
(cd0:ata1:0:0:0): READ(10). CDB: 28 00 00 04 2e 78 00 00 01 00
(cd0:ata1:0:0:0): CAM status: SCSI Status Error
(cd0:ata1:0:0:0): SCSI status: Check Condition
(cd0:ata1:0:0:0): SCSI sense: ILLEGAL REQUEST asc:64,0 (Illegal mode for this track)
(cd0:ata1:0:0:0): Info: 0x42e78
(cd0:ata1:0:0:0): Error 6, Unretryable error
(cd0:ata1:0:0:0): cddone: got error 0x6 back
(cd0:ata1:0:0:0): READ(10). CDB: 28 00 00 04 2e 78 00 00 01 00
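For what it's worth, the failing CD read can be decoded by hand. A minimal sketch, assuming the standard READ(10) layout (opcode in byte 0, big-endian LBA in bytes 2-5, transfer length in bytes 7-8); the function name is my own:

```python
def decode_read10(cdb_hex):
    """Decode a SCSI READ(10) CDB given as space-separated hex bytes."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a 10-byte READ(10) CDB")
    lba = int.from_bytes(cdb[2:6], "big")     # bytes 2-5: logical block address
    length = int.from_bytes(cdb[7:9], "big")  # bytes 7-8: transfer length in blocks
    return lba, length


# CDB copied from the cd0 error messages above.
lba, length = decode_read10("28 00 00 04 2e 78 00 00 01 00")

# cd0 reported 274042 records of 2048 bytes.
total_records = 274042
print(f"LBA {lba} (0x{lba:x}), {length} block(s), "
      f"{total_records - lba} block(s) before the end")
```

That works out to a one-block read at LBA 274040 (0x42e78, matching the Info: line), two blocks short of the 274042 records cd0 reported, so the errors seem to be at the very end of the disc.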
Then there are lots of these messages, which tell me the ada0 drive is dead?
(ada0:ata0:0:0:0): READ_DMA. ACB: c8 00 80 00 00 40 00 00 00 00 10 00
(ada0:ata0:0:0:0): CAM status: ATA Status Error
(ada0:ata0:0:0:0): ATA status: 51 (DRDY SERV ERR), error: 40 (UNC )
(ada0:ata0:0:0:0): RES: 51 40 80 00 00 00 00 00 00 10 00
(ada0:ata0:0:0:0): Retrying command
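The status and error bytes in those lines can be unpacked bit by bit. This is a sketch assuming the usual ATA status and error register bit names, as the kernel appears to print them; the tables and function are my own illustration:

```python
# Bit masks for the ATA status register (assumed naming, as printed in dmesg).
STATUS_BITS = [
    (0x80, "BSY"), (0x40, "DRDY"), (0x20, "DF"), (0x10, "SERV"),
    (0x08, "DRQ"), (0x04, "CORR"), (0x02, "IDX"), (0x01, "ERR"),
]

# Bit masks for the ATA error register (assumed naming).
ERROR_BITS = [
    (0x80, "ICRC"), (0x40, "UNC"), (0x20, "MC"), (0x10, "IDNF"),
    (0x08, "MCR"), (0x04, "ABRT"), (0x02, "NM"), (0x01, "ILI"),
]


def decode_bits(value, table):
    """Return the names of every flag in table that is set in value."""
    return [name for mask, name in table if value & mask]


# Values taken from "ATA status: 51 (DRDY SERV ERR), error: 40 (UNC )".
status = decode_bits(0x51, STATUS_BITS)
error = decode_bits(0x40, ERROR_BITS)
print("status 0x51:", " ".join(status))
print("error  0x40:", " ".join(error))
```

Status 0x51 gives DRDY SERV ERR and error 0x40 gives UNC, matching the text in parentheses. As far as I understand, UNC means an uncorrectable data error on the requested sectors.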
Then it looks like there is hope for the stripe:
GEOM_STRIPE: Device data created (id=2880277341).
GEOM_STRIPE: Disk da0s1d attached to data.
GEOM_STRIPE: Disk da1s1d attached to data.
GEOM_STRIPE: Device stripe/data activated.
But then no hope as this sequence repeats itself:
GEOM_STRIPE: Disk da0s1d removed from data.
GEOM_STRIPE: Device stripe/data deactivated.
GEOM_STRIPE: Disk da0s1d attached to data.
GEOM_STRIPE: Device stripe/data activated.
GEOM_STRIPE: Disk da0s1d removed from data.
GEOM_STRIPE: Device stripe/data deactivated
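To make the pattern easier to see, here is a small sketch that tallies attach and remove events per disk from the GEOM_STRIPE lines above. The regular expression and helper name are my own and assume exactly the message wording shown here:

```python
import re
from collections import Counter

# GEOM_STRIPE lines copied from the dmesg output above.
LOG = """\
GEOM_STRIPE: Device data created (id=2880277341).
GEOM_STRIPE: Disk da0s1d attached to data.
GEOM_STRIPE: Disk da1s1d attached to data.
GEOM_STRIPE: Device stripe/data activated.
GEOM_STRIPE: Disk da0s1d removed from data.
GEOM_STRIPE: Device stripe/data deactivated.
GEOM_STRIPE: Disk da0s1d attached to data.
GEOM_STRIPE: Device stripe/data activated.
GEOM_STRIPE: Disk da0s1d removed from data.
GEOM_STRIPE: Device stripe/data deactivated
"""

# Captures the disk name, the action, and the stripe device name.
EVENT_RE = re.compile(r"^GEOM_STRIPE: Disk (\S+) (attached to|removed from) (\S+)\.$")


def count_events(text):
    """Return (attached, removed) Counters keyed by disk name."""
    attached, removed = Counter(), Counter()
    for line in text.splitlines():
        match = EVENT_RE.match(line.strip())
        if not match:
            continue  # skip the "Device ..." lines
        disk, action, _stripe = match.groups()
        if action == "attached to":
            attached[disk] += 1
        else:
            removed[disk] += 1
    return attached, removed


attached, removed = count_events(LOG)
for disk in sorted(attached):
    print(f"{disk}: attached {attached[disk]}x, removed {removed[disk]}x")
```

Only da0s1d is ever removed (twice in this excerpt), while da1s1d attaches once and stays, which fits my suspicion about da0.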
Then I load the sbp module to see if I have any drives in the FireWire
enclosure:
fwohci0: <VIA Fire II (VT6306)> port 0x1c00-0x1c7f mem 0xfc104000-0xfc1047ff irq 16 at device 10.0 on pci0
fwohci0: OHCI version 1.0 (ROM=1)
fwohci0: No. of Isochronous channels is 8.
fwohci0: EUI64 00:40:63:00:00:00:07:ff
fwohci0: Phy 1394a available S400, 3 ports.
fwohci0: Link S400, max_rec 2048 bytes.
firewire0: <IEEE1394(FireWire) bus> on fwohci0
fwohci0: Initiate bus reset
fwohci0: fwohci_intr_core: BUS reset
fwohci0: fwohci_intr_core: node_id=0x00000001, SelfID Count=1, CYCLEMASTER mode
firewire0: 2 nodes, maxhop <= 1 cable IRM irm(1) (me)
firewire0: bus manager 1
firewire0: fw_explore_node: Pre 1394a-2000 detected
firewire0: New S400 device ID:0030e002ee4000a6
sbp0: <SBP-2/SCSI over FireWire> on firewire0
sbp0: sbp_show_sdev_info: sbp0:0:0: ordered:1 type:14 EUI:0030e002ee4000a6 node:0 speed:2 maxrec:8
sbp0: sbp_show_sdev_info: sbp0:0:0 'Oxford ' '911 ' '000037'
sbp0: sbp_show_sdev_info: sbp0:0:1: ordered:1 type:14 EUI:0030e002ee4000a6 node:0 speed:2 maxrec:8
sbp0: sbp_show_sdev_info: sbp0:0:1 'Oxford ' '911 ' '000037'
sbp0: sbp_timeout:sbp0:0:0 request timeout(cmd orb:0x282fa154) ... agent reset
(probe0:sbp0:0:0:0): INQUIRY. CDB: 12 00 00 00 24 00
(probe0:sbp0:0:0:0): CAM status: Command timeout
(probe0:sbp0:0:0:0): Retrying command
I suspect this means no working drives were found in the firewire enclosure?
So I check /dev to see whether ad1 shows up:
root@:~ # ll /dev/ad*
lrwxr-xr-x 1 root wheel 4 Nov 2 18:07 /dev/ad0@ -> ada0
lrwxr-xr-x 1 root wheel 5 Nov 8 13:38 /dev/ad0d@ -> ada0d
lrwxr-xr-x 1 root wheel 4 Nov 2 18:07 /dev/ad1@ -> ada1
crw-r----- 1 root operator 0x59 Nov 8 13:36 /dev/ada0
crw-r----- 1 root operator 0x75 Nov 8 13:37 /dev/ada0d
crw-r----- 1 root operator 0x5b Nov 2 18:07 /dev/ada1
root@:~ #
So I created /mnt/data and tried to mount ada1:
root@:~ # mount /dev/ada1 /mnt/data
mount: /dev/ada1: Device not configured
So then I try:
root@:~ # bsdlabel /dev/ada1
bsdlabel: cannot get disk geometry: No such file or directory
So does this mean I really don't have an ada1 drive? Or is there some
step I'm missing to make it accessible?
I really appreciate you reading this far and any help you might give.
Cheers,
Drew