Replacing a failed disk in raidz2 zfs (and gpt)
Philip M. Gollucci
pgollucci at p6m7g8.com
Thu Feb 3 06:21:42 UTC 2011
All,
I have a zroot(mirror)+zmysql(raidz2) setup on a MySQL db box.
One drive failed (mfid3). We've since replaced it.
I can't for the life of me get zpool to replace it. I can't remember why
I used gpt instead of direct disks for the zmysql pool (but that's how it
is). I've tried all of the following commands, which fail with various
errors, and I must say I'm stumped. I've done this several times before
for the ASF (but no gpt at play there).
$ zpool scrub zmysql
just runs, and completes, no error
$ zpool replace zmysql gpt/disk3
cannot replace gpt/disk3 with gpt/disk3: one or more devices is
currently unavailable
$ zpool remove zmysql gpt/disk3
cannot remove gpt/disk3: only inactive hot spares or cache devices can
be removed
$ zpool offline zmysql gpt/disk3
cannot offline gpt/disk3: no valid replicas
$ zpool add zmysql gpt/disk3
invalid vdev specification
use '-f' to override the following errors:
mismatched replication level: pool uses raidz and new vdev is disk
I would say that's because I didn't run the gpt commands on it, but see
below. I think it either got copied over via RAID card pass-through, or
the system just hasn't rescanned the disk yet.
$ zpool online zmysql gpt/disk3
warning: device 'gpt/disk3' onlined, but remains in faulted state
use 'zpool replace' to replace devices that are no longer present
$ zpool add zmysql spare gpt/disk3
cannot add to 'zmysql': one or more devices is currently unavailable
$ zpool replace zmysql gpt/disk3 gpt/disk3
cannot replace gpt/disk3 with gpt/disk3: one or more devices is
currently unavailable
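The "gpt commands" mentioned above would, as a rough sketch, look like the following for a blank replacement disk. The start offsets and sizes are the ones gpart show reports for the other disks in this message. The device name mfid3, the label names swap3 and disk3, and the mapping of those labels to the second and third partitions are assumptions. The function only prints the commands (a dry run), since they rewrite the partition table, and nothing here has been verified on this box.

```shell
#!/bin/sh
# Dry run: print the commands that would recreate the GPT layout on a
# replacement disk and hand the labeled partition back to ZFS.
# ASSUMPTIONS (not confirmed on this system): the new disk appears as
# $disk with no partition table, swapN labels partition 2, and diskN
# labels partition 3.
print_replace_plan() {
    disk=$1    # e.g. mfid3
    n=$2       # label suffix, e.g. 3
    pool=$3    # e.g. zmysql

    echo "gpart create -s gpt $disk"
    # Offsets and sizes (in 512-byte sectors) copied from 'gpart show'.
    echo "gpart add -b 34 -s 128 -t freebsd-boot $disk"
    echo "gpart add -b 162 -s 50331648 -t freebsd-swap -l swap$n $disk"
    echo "gpart add -b 50331810 -s 90177536 -t freebsd-zfs -l disk$n $disk"
    # With the label back under /dev/gpt, the single-argument form of
    # 'zpool replace' asks ZFS to rebuild onto the device at the same path.
    echo "zpool replace $pool gpt/disk$n"
}

print_replace_plan mfid3 3 zmysql
```

Printing instead of executing keeps the sketch harmless; each printed line can be reviewed and run by hand once the disk is actually readable.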
Below is some system information. More details on request.
No, I cannot export/import the pool or reboot the box.
Thanks in advance!
$ zpool status -v zmysql
pool: zmysql
state: DEGRADED
status: One or more devices could not be used because the label is missing or
        invalid. Sufficient replicas exist for the pool to continue
        functioning in a degraded state.
action: Replace the device using 'zpool replace'.
see: http://www.sun.com/msg/ZFS-8000-4J
scrub: scrub completed after 0h16m with 0 errors on Tue Feb 1 21:13:41 2011
config:
        NAME           STATE     READ WRITE CKSUM
        zmysql         DEGRADED     0     0     0
          raidz2       DEGRADED     0     0     0
            gpt/disk2  ONLINE       0     0     0
            gpt/disk3  UNAVAIL     15 6.96M     0  experienced I/O failures
            gpt/disk4  ONLINE       0     0     0
            gpt/disk5  ONLINE       0     0     0
            gpt/disk6  ONLINE       0     0     0
            gpt/disk7  ONLINE       0     0     0
errors: No known data errors
$ zpool upgrade
This system is currently running ZFS pool version 13.
All pools are formatted using this version.
$ zfs upgrade
This system is currently running ZFS filesystem version 3.
All filesystems are formatted with the current version.
$ hd -v /dev/mfid3p1 | head
hd: /dev/mfid3p1: Input/output error
$ hd -v /dev/gpt/disk3 | head
hd: /dev/gpt/disk3: Input/output error
$ ls /dev/mfid3*
crw-r----- 1 root operator - 0, 97 Nov 17 08:03:12 2010 mfid3
crw-r----- 1 root operator - 0, 107 Nov 17 08:03:12 2010 mfid3p1
crw-r----- 1 root operator - 0, 108 Nov 17 08:03:12 2010 mfid3p2
crw-r----- 1 root operator - 0, 109 Nov 17 08:03:12 2010 mfid3p3
$ ls /dev/gpt
total 1
dr-xr-xr-x 2 root wheel - 512 Nov 17 08:03:12 2010 ./
dr-xr-xr-x 7 root wheel - 512 Nov 17 08:03:12 2010 ../
crw-r----- 1 root operator - 0, 117 Nov 17 08:03:12 2010 disk0
crw-r----- 1 root operator - 0, 122 Nov 17 08:03:12 2010 disk1
crw-r----- 1 root operator - 0, 127 Nov 17 08:03:12 2010 disk2
crw-r----- 1 root operator - 0, 132 Nov 17 08:03:12 2010 disk3
crw-r----- 1 root operator - 0, 149 Nov 17 08:03:12 2010 disk4
crw-r----- 1 root operator - 0, 154 Nov 17 08:03:12 2010 disk5
crw-r----- 1 root operator - 0, 159 Nov 17 08:03:12 2010 disk6
crw-r----- 1 root operator - 0, 164 Nov 17 08:03:12 2010 disk7
crw-r----- 1 root operator - 0, 115 Nov 17 08:03:12 2010 swap0
crw-r----- 1 root operator - 0, 120 Nov 17 08:03:12 2010 swap1
crw-r----- 1 root operator - 0, 125 Nov 17 08:03:12 2010 swap2
crw-r----- 1 root operator - 0, 130 Nov 17 08:03:12 2010 swap3
crw-r----- 1 root operator - 0, 147 Nov 17 08:03:12 2010 swap4
crw-r----- 1 root operator - 0, 152 Nov 17 08:03:12 2010 swap5
crw-r----- 1 root operator - 0, 157 Nov 17 08:03:12 2010 swap6
crw-r----- 1 root operator - 0, 162 Nov 17 08:03:12 2010 swap7
(yes, I know it's time to update; I'm waiting on 8.2)
$ uname -a
FreeBSD x 8.0-RELEASE-p2 FreeBSD 8.0-RELEASE-p2 #1 r203057: Wed Jan 27 06:42:10 UTC 2010 root at Z:/usr/obj/usr/src/sys/X amd64
$ gpart show
=> 34 142081981 mfid0 GPT (68G)
34 128 1 freebsd-boot (64K)
162 50331648 2 freebsd-swap (24G)
50331810 90177536 3 freebsd-zfs (43G)
140509346 1572669 - free - (768M)
=> 34 142081981 mfid1 GPT (68G)
34 128 1 freebsd-boot (64K)
162 50331648 2 freebsd-swap (24G)
50331810 90177536 3 freebsd-zfs (43G)
140509346 1572669 - free - (768M)
=> 34 142081981 mfid2 GPT (68G)
34 128 1 freebsd-boot (64K)
162 50331648 2 freebsd-swap (24G)
50331810 90177536 3 freebsd-zfs (43G)
140509346 1572669 - free - (768M)
=> 34 142081981 mfid3 GPT (68G)
34 128 1 freebsd-boot (64K)
162 50331648 2 freebsd-swap (24G)
50331810 90177536 3 freebsd-zfs (43G)
140509346 1572669 - free - (768M)
=> 34 142081981 mfid4 GPT (68G)
34 128 1 freebsd-boot (64K)
162 50331648 2 freebsd-swap (24G)
50331810 90177536 3 freebsd-zfs (43G)
140509346 1572669 - free - (768M)
=> 34 142081981 mfid5 GPT (68G)
34 128 1 freebsd-boot (64K)
162 50331648 2 freebsd-swap (24G)
50331810 90177536 3 freebsd-zfs (43G)
140509346 1572669 - free - (768M)
=> 34 142081981 mfid6 GPT (68G)
34 128 1 freebsd-boot (64K)
162 50331648 2 freebsd-swap (24G)
50331810 90177536 3 freebsd-zfs (43G)
140509346 1572669 - free - (768M)
=> 34 142081981 mfid7 GPT (68G)
34 128 1 freebsd-boot (64K)
162 50331648 2 freebsd-swap (24G)
50331810 90177536 3 freebsd-zfs (43G)
140509346 1572669 - free - (768M)
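As a sanity check on the layout above, the sector figures can be verified with plain shell arithmetic: each partition should begin where the previous one ends, and the sector counts should match the rounded sizes in the last column. This is only a sketch; it assumes 512-byte sectors (consistent with 128 sectors being shown as 64K), and the helper names are made up for illustration.

```shell
#!/bin/sh
# Sanity-check the 'gpart show' layout using integer arithmetic only.
# ASSUMPTION: 512-byte sectors, so 2048 sectors = 1 MiB.

# Convert a sector count to whole MiB (rounds down).
sectors_to_mib() {
    echo $(( $1 / 2048 ))
}

# First sector after a region that begins at $1 and spans $2 sectors.
next_start() {
    echo $(( $1 + $2 ))
}

echo "swap starts at:  $(next_start 34 128)"              # 162
echo "zfs starts at:   $(next_start 162 50331648)"        # 50331810
echo "free starts at:  $(next_start 50331810 90177536)"   # 140509346
echo "swap size (MiB): $(sectors_to_mib 50331648)"        # 24576, i.e. 24G
echo "zfs size (MiB):  $(sectors_to_mib 90177536)"        # 44032, i.e. 43G
```

Dividing by 2048 rather than multiplying by 512 first keeps the intermediate values small, which avoids overflow on shells with 32-bit arithmetic.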
$ pciconf -lv |grep ....
mfi0 at pci0:2:14:0: class=0x010400 card=0x1f031028 chip=0x00151028
rev=0x00 hdr=0x00
vendor = 'Dell Computer Corporation'
device = 'Integrated RAID controller (PERC 5/i RAID Controller)'
console/dmesg during hot swap:
mfi0: sense error 0, sense_key 0, asc 0, ascq 0
mfid3: hard error cmd=read fsbn 50331810
mfi0: 17960 (349585200s/0x0020/info) - Patrol Read started
mfi0: 18038 (349586341s/0x0020/info) - Patrol Read complete
mfi0: 18039 (349891840s/0x0002/WARN) - Removed: PD 03(e1/s3)
mfi0: 18040 (349891840s/0x0002/info) - Removed: PD 03(e1/s3) Info:
enclPd=08, scsiType=0, portMap=08, sasAddr=5000c50001439195,0000000000000000
mfi0: 18041 (349891840s/0x0002/info) - State change on PD 03(e1/s3) from
UNCONFIGURED_BAD(1) to FAILED(11)
mfi0: 18042 (349891840s/0x0002/info) - State change on PD 03(e1/s3) from
FAILED(11) to UNCONFIGURED_BAD(1)
mfi0: 18043 (349891857s/0x0002/info) - Inserted: PD 03(e1/s3)
mfi0: 18044 (349891857s/0x0002/info) - Inserted: PD 03(e1/s3) Info:
enclPd=08, scsiType=0, portMap=08, sasAddr=5000c5001ce0e065,0000000000000000
mfi0: 18045 (349891857s/0x0002/info) - State change on PD 03(e1/s3) from
UNCONFIGURED_BAD(1) to UNCONFIGURED_GOOD(0)
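The counters in parentheses in the mfi0 lines above make the timing of the swap easy to read off. The sketch below computes the elapsed seconds between events, which needs no assumptions, and also converts a counter to Unix time. That conversion assumes the counter is seconds since 2000-01-01 00:00:00 and that the controller clock is UTC; neither is confirmed by anything in this message, and the function names are invented for illustration.

```shell
#!/bin/sh
# Elapsed seconds between two mfi event counters (the "NNNs" field).
elapsed() {
    echo $(( $2 - $1 ))
}

# ASSUMPTION: the counter is seconds since 2000-01-01 00:00:00 UTC,
# which is 946684800 in Unix time.
to_unix() {
    echo $(( $1 + 946684800 ))
}

echo "patrol read took:     $(elapsed 349585200 349586341)s"   # 1141
echo "removed -> inserted:  $(elapsed 349891840 349891857)s"   # 17
echo "removal as Unix time: $(to_unix 349891840)"              # 1296576640
```

Under that assumption the removal falls on 1 February 2011, and on FreeBSD `date -u -r 1296576640` should render it as a calendar date.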
--
------------------------------------------------------------------------
1024D/DB9B8C1C B90B FBC3 A3A1 C71A 8E70 3F8C 75B8 8FFB DB9B 8C1C
Philip M. Gollucci (pgollucci at p6m7g8.com) c: 703.336.9354
VP Apache Infrastructure; Member, Apache Software Foundation
Committer, FreeBSD Foundation
Consultant, P6M7G8 Inc.
Sr. System Admin, Ridecharge Inc.
Work like you don't need the money,
love like you'll never get hurt,
and dance like nobody's watching.