Errors on a file on a zpool: How to remove?

jhell jhell at DataIX.net
Sun Jan 24 05:51:58 UTC 2010


On Sun, 24 Jan 2010 00:28, rincebrain@ wrote:
> On Sun, Jan 24, 2010 at 12:15 AM, jhell <jhell at dataix.net> wrote:
>> From what I see, and as was already mentioned earlier in this thread, this
>> is metadata corruption, but the checksum errors do not span the whole
>> pool of vdevs. These are, correct me if I am wrong, USB mass storage
>> devices? SSDs?
>
> 1.5T Seagate 7200RPM drives.
>
>> In the arrangement of the devices on the system, are da2,4,5 on the same
>> hub and da6,7 on another? If this is the case, you may have narrowed your
>> errors down to a USB problem, and to the specific place where the devices
>> are connected.
>
> ...no.
>
> All five are on the same SATA controller. These behaviors persist
> independent of which SATA controller they are plugged into, and I've
> tried all seven in the machine.
>
>> What happened to da1,3? Were these once connected to the system? If so,
>> did you start noticing this problem at roughly the same time they
>> were removed?
>
> da1,3 are being used in another disk pool, and were never a part of this pool.
>
> This is not an issue of a faulty SATA controller or SATA drives.
>
> This is an issue of "there was a single faulty stick of RAM in the machine".
>

Yeah, I read this earlier. My apologies, it slipped my mind while I was
writing (my mind went into "multi-write, single-read" mode).

> I have sixteen disks in this machine. These three are having issues
> only on these particular files, not on random
> portions of the disk. The disks never report read errors - the ZFS
> layer is what reports them. SMART is not reporting any difficulties in
> reading any sectors of these disks.
>
>
> I could be mistaken, but I do not believe there to be a faulty
> controller in play at this time. I've rotated the drives among the
> spares of the 24 ports on the SATA controller in question, as well as
> the on-motherboard controller, and this behavior has persisted.
>
> - Rich
>

As I was thinking earlier: you mentioned that you scrubbed multiple times
with no difference. When I suggested the remove/replace attempt, my thinking
was that it would cause a resilver of the drives, possibly fixing the
metadata for the affected disks if good metadata still exists somewhere.

It might be worth a shot. I would start by replacing the devices that are
showing the errors, and keep going until you can clear the errors
successfully without them showing up again and/or until you have replaced
all of the disks.
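
For what it's worth, here is a rough sketch of that sequence for a single
device, written as a small Python helper that only prints the commands. The
pool name "tank" and the spare device "da8" are made up for illustration;
substitute your own. Nothing is executed, because each step needs the previous
one (the resilver in particular) to finish before it makes sense to continue.

```python
def replace_plan(pool, old_dev, new_dev):
    """Return the zpool commands for swapping one device, in order.

    The names passed in are placeholders chosen by the caller; this
    function only builds the command lines and does not run anything.
    """
    return [
        # Swap the suspect device for the new one; this starts a resilver.
        ["zpool", "replace", pool, old_dev, new_dev],
        # Watch resilver progress and the list of files with errors.
        ["zpool", "status", "-v", pool],
        # Once the resilver has finished, reset the pool's error counters.
        ["zpool", "clear", pool],
        # Re-verify every checksum in the pool.
        ["zpool", "scrub", pool],
        # After the scrub completes, check whether the errors came back.
        ["zpool", "status", "-v", pool],
    ]


if __name__ == "__main__":
    # Hypothetical names: pool "tank", failing da2, spare da8.
    for cmd in replace_plan("tank", "da2", "da8"):
        print(" ".join(cmd))
```

If the errors are still listed after the final status check, the same plan
can be repeated for the next device that is showing errors.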

Best of luck.

-- 

  jhell



More information about the freebsd-fs mailing list