misc/119435: ZFS: replacing bad drive without offlining old one
causes in-use "unavail" drive
Weldon Godfrey
weldon at excelsus.com
Mon Jan 7 12:40:03 PST 2008
>Number: 119435
>Category: misc
>Synopsis: ZFS: replacing bad drive without offlining old one causes in-use "unavail" drive
>Confidential: no
>Severity: non-critical
>Priority: low
>Responsible: freebsd-bugs
>State: open
>Quarter:
>Keywords:
>Date-Required:
>Class: sw-bug
>Submitter-Id: current-users
>Arrival-Date: Mon Jan 07 20:40:03 UTC 2008
>Closed-Date:
>Last-Modified:
>Originator: Weldon Godfrey
>Release: FreeBSD 7.0-Beta2
>Organization:
>Environment:
FreeBSD netflow-dc1.corp.ena.net 7.0-BETA2 FreeBSD 7.0-BETA2 #0: Fri Nov 2 14:54:38 UTC 2007 root at myers.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC amd64
>Description:
I searched for a FreeBSD-ZFS group where I could post or ask about this. It appears to be a bug, since ZFS should know the drive is faulted. Please forgive me and direct me to the right place if I am wrong.
Granted, this is a suboptimal configuration. Since the new PERC controllers will not allow non-virtual disks, I have a system with 4 PERC virtual drives (each drive in its own RAID0). The 1st virtual disk has a small partition for the OS; the remainder of that drive and the raw disks of the next 3 virtual drives are in zpool tank under raidz1. I replaced the 1st full raw disk (2nd drive) and reinitialized it. When the system came back online, zpool status showed the other three disks as online and the replaced disk as "unavail". When I try to offline the drive, zpool refuses, complaining that the drive is in use (which it isn't). Therefore, I can't zpool replace the drive, because ZFS thinks it is in use. Here is the zpool status:
netflow-dc1# zpool status
  pool: tank
 state: ONLINE
status: One or more devices could not be used because the label is missing or
        invalid. Sufficient replicas exist for the pool to continue
        functioning in a degraded state.
action: Replace the device using 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-4J
 scrub: resilver completed with 0 errors on Mon Jan 7 07:51:21 2008
config:

        NAME          STATE     READ WRITE CKSUM
        tank          ONLINE       0     0     0
          raidz1      ONLINE       0     0     0
            mfid0s1d  ONLINE       0     0     0
            mfid1     UNAVAIL      0     0     0  corrupted data
            mfid2     ONLINE       0     0     0
            mfid3     ONLINE       0     0     0
>How-To-Repeat:
With a PERC controller on a Dell 2950:
Create 4 virtual disks, one per drive, each RAID0
on the 1st disk, create a 10G / and 8G swap
use fdisk to create an unformatted partition on the rest of the drive
create zpool tank with the partition above and the raw drives of the remaining disks
bring the system down
replace the 2nd drive (1st raw disk) with a new drive and recreate the virtual disks in the PERC BIOS screen (making sure the drive order is correct and re-initializing only the new disk)
reboot; the filesystem is up, the state is ONLINE, and the status complains that one or more devices could not be used.
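For reference, the ZFS side of these steps can be sketched roughly as below. This is a dry-run sketch, not the exact commands that were typed: the device names are taken from the zpool status output above, and the `run` wrapper and the `DRY_RUN` variable are illustrative helpers that only print each command unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands instead of running them unless DRY_RUN=0.
# Pool and device names come from the zpool status output in this report.
DRY_RUN=${DRY_RUN:-1}

# Illustrative helper: echo the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Build the raidz1 pool from the leftover partition on the first virtual disk
# plus the three whole virtual disks.
create_pool() {
    run zpool create tank raidz1 mfid0s1d mfid1 mfid2 mfid3
}

# After swapping the second drive and rebooting, inspect the pool.
check_pool() {
    run zpool status tank
}

create_pool
check_pool
```

The physical drive swap and the PERC BIOS steps happen between the two commands and are not scripted here.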
>Fix:
I am sure I could re-install the old drive, offline it, and then replace the drive, and it would be okay.
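As a rough sketch of that sequence (untested on this system, and only a guess at what would work), the script below prints the commands rather than running them, since a physical drive swap sits in the middle. The function name `plan_offline_replace` is made up for illustration; the pool and device names are the ones from the zpool status output.

```shell
#!/bin/sh
# Prints (does not run) the hypothesised offline-then-replace sequence.
# Untested on the affected system; names are taken from this report.
plan_offline_replace() {
    pool=$1
    dev=$2
    printf '%s\n' \
        "zpool offline $pool $dev" \
        "# power down, swap the physical drive, recreate the PERC virtual disk, boot" \
        "zpool replace $pool $dev" \
        "zpool status $pool"
}

plan_offline_replace tank mfid1
```

Here `zpool replace` is given only the old device name, on the assumption that the new disk shows up under the same name once the virtual disk has been recreated.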
The system is still in this state. If you need info from the system, please let me know and I will be happy to run any diagnostics you need.
thanks,
Weldon
>Release-Note:
>Audit-Trail:
>Unformatted: