zfs on FreeBSD 8.2 64bit stuck in "One or more devices is currently being resilvered"
Mehmet Erol Sanliturk
m.e.sanliturk at gmail.com
Sat Mar 21 16:57:52 UTC 2015
On Sat, Mar 21, 2015 at 9:01 AM, motty cruz <motty.cruz at gmail.com> wrote:
> Hi Mehmet, are you thinking of a bad HDD bay? If I run the gstat command, I
> see it is writing to the disk:
> dT: 1.002s  w: 1.000s
>  L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w  %busy Name
>     0      0      0      0    0.0      0      0    0.0    0.0| acd0
>     0      9      0      0    0.0      9    144   22.1    3.1| mfid0
>     0      9      0      0    0.0      9    144   22.6    3.1| mfid0s1
>     0      9      0      0    0.0      9    144   22.9    3.2| mfid0s1a
>     0      0      0      0    0.0      0      0    0.0    0.0| mfid0s1b
>     0      0      0      0    0.0      0      0    0.0    0.0| mfid0s1d
>     0      0      0      0    0.0      0      0    0.0    0.0| mfid0s1e
>     0      0      0      0    0.0      0      0    0.0    0.0| mfid0s1f
>     2   4631   4631  13270    0.4      0      0    0.0   73.0| da0
>     0      0      0      0    0.0      0      0    0.0    0.0| da1
>     3   3979   3979  13345    0.7      0      0    0.0   78.0| da2
>     0      0      0      0    0.0      0      0    0.0    0.0| da3
>     5   4503   4503  13263    0.5      0      0    0.0   76.0| da4
>     5   4245   4245  13254    0.6      0      0    0.0   77.5| da5
>     4   4741      0      0    0.0   4741  11626    1.2   86.7| da6
>
> The disk being replaced is da6; as you can see, w/s 11626, unless I am not
> reading this right. So I don't think it is the cable or port. I really
> don't know what is causing this issue.
>
> Today is the 3rd day of resilvering:
> # zpool status
>   pool: tank
>  state: ONLINE
> status: One or more devices is currently being resilvered. The pool will
>         continue to function, possibly in a degraded state.
> action: Wait for the resilver to complete.
>  scrub: resilver in progress for 47h47m, 100.00% done, 0h0m to go
> config:
>
>         NAME            STATE     READ WRITE CKSUM
>         tank            ONLINE       0     0     0
>           raidz2        ONLINE       0     0     0
>             label/019   ONLINE       0     0     0
>             label/001b  ONLINE       0     0     0
>             label/003   ONLINE       0     0     0
>             label/007b  ONLINE       0     0     0  1.79T resilvered
>             label/005   ONLINE       0     0     0
>             label/006   ONLINE       0     0     0
>             label/0171  ONLINE       0     0     0
> Any suggestions on what my next step should be?
>
> Thanks in advance!
> -Motty
>
>
Yes, it may be.
If you can, attach the HDD to a bay that is known to work and see whether
the problem is in the HDD or in the HDD bay.
Another step may be to take the HDD out of the troublesome bay and use only
the group of HDD bays that work correctly.
Then put an HDD that you know works correctly into the suspected bay and
see whether it causes trouble or not.
Continue in that way until you have identified the status of the bays and
their related components.
One important risk is corruption of your data. My suggestion is to back up
your data and, until this issue is resolved, not to use this computer for
production work.
Sometimes a part fails gradually, step by step, and in the end fails
completely.
I am saying this to emphasize the importance of saving your data as soon as
possible.
If you have the means, another step may be to replace the HDD bay
controller with a new, good-quality controller.
Version 8.2 is very old.
Switching to a newer version, either 9.3 or 10.1, may be useful; you could
use a spare system and transfer your data to the newly installed system.
I think you know very well how to migrate to a new system when ZFS is used.
I am not using ZFS; therefore, my knowledge of it is very weak.
I have encountered a possibly similar problem in an NFS server-client group.
On the server, program sources were corrupted, either by truncated lines,
by invalid characters injected into lines, or by characters randomly
changed to invalid ones.
I replaced the server, the switch and the cables, and, in the suspected
client computer (suspected because of "Access Violation" messages), the
memory chips. In the end it turned out that the motherboard chips or other
parts of the suspected client computer were faulty, not the memory chips.
When no sufficiently capable testing equipment is available, the only thing
that can be done is to replace the suspected parts with others that are, as
far as possible, known to work.
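
As an aside on reading the gstat output quoted above: going by the header
order shown there, the 11626 on the da6 row sits under kBps and the 4741
under w/s. Below is a minimal Python sketch that splits one such row by
that header order and works out the average rate implied by "1.79T
resilvered" in 47h47m. The function names are made up for illustration,
and it assumes that "T" means TiB and that kBps means KiB per second.

```python
# Column order copied from the gstat header quoted in this thread.
# The two kBps columns are renamed here (assumption) to tell them apart.
COLUMNS = ["L(q)", "ops/s", "r/s", "r_kBps", "ms/r",
           "w/s", "w_kBps", "ms/w", "%busy"]


def parse_gstat_row(line):
    """Split one gstat data row into a dict keyed by column name."""
    stats, _, name = line.partition("|")
    values = [float(v) for v in stats.split()]
    if len(values) != len(COLUMNS):
        raise ValueError("unexpected number of columns: %r" % line)
    row = dict(zip(COLUMNS, values))
    row["name"] = name.strip()
    return row


def average_rate_mib_s(tib_done, hours, minutes):
    """Average throughput in MiB/s for tib_done TiB written in the given time."""
    seconds = hours * 3600 + minutes * 60
    return tib_done * 1024 * 1024 / seconds


# The da6 row from the quoted output, without the "> " quote prefix.
da6 = parse_gstat_row("4 4741 0 0 0.0 4741 11626 1.2 86.7| da6")
current_mib_s = da6["w_kBps"] / 1024            # write rate in that sample
average_mib_s = average_rate_mib_s(1.79, 47, 47)  # from zpool status

print("da6: %.0f writes/s, %.1f MiB/s now, %.1f MiB/s average"
      % (da6["w/s"], current_mib_s, average_mib_s))
```

If those assumptions hold, the write rate in the gstat sample and the
average over the whole resilver come out in the same range, which would
suggest da6 has been written at a steady but low rate rather than
stalling.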
> On Fri, Mar 20, 2015 at 10:23 PM, Mehmet Erol Sanliturk <
> m.e.sanliturk at gmail.com> wrote:
>
>>
>>
>> On Fri, Mar 20, 2015 at 3:44 PM, Motty Cruz <motty.cruz at gmail.com> wrote:
>>
>>> Can you describe what you did to replace the disk?
>>>
>>> I sure can. I had a spare HDD in the pool.
>>> # zpool replace tank label/004 label/007b
>>>
>>>     label/003               ONLINE       0     0     0
>>>     replacing               DEGRADED     0     0     0
>>>       433419809408607751    UNAVAIL      0     0     0  was /dev/label/007
>>>       label/004             ONLINE       0     0     0  2.47T resilvered
>>>     label/005               ONLINE       0     0     0
>>>
>>> After two days of resilvering, the server became unresponsive. I
>>> rebooted, and the server started to resilver again. After that I also
>>> detached the bad disk.
>>> # zpool detach tank 433419809408607751
>>>
>>>
Since the newly attached HDD is also causing trouble, this may show that
the problem is not in the HDD but in the HDD bay or its related parts.
My suggestion is: do not discard your disk before verifying that it is
really defective.
>>> I have tried zpool clear tank, but with no success.
>>>
>>> Thanks,
>>> Motty
>>> On 03/20/2015 03:32 PM, Rainer Duffner wrote:
>>>
>>>> Am 20.03.2015 um 23:25 schrieb Motty Cruz <motty.cruz at gmail.com>:
>>>>>
>>>>> Hello Rainer,
>>>>>
>>>>> A disk went bad and I had to replace it. Soon after I replaced the bad
>>>>> HDD, the "resilver" process started. It went on and on for hours;
>>>>> unfortunately the server stopped responding and I was forced to
>>>>> reboot. After rebooting, the "resilver" process started again from
>>>>> zero. I put the HDD offline and replaced it, thinking it was an HDD
>>>>> that was bad from the factory, and the "resilver" process started
>>>>> again.
>>>>>
>>>>>
>>>> I would assume that the ZFS still thinks it’s the old disk somehow.
>>>> This is what usually happens then.
>>>>
>>>>
>>>> I’m not sure if an upgraded FreeBSD will help you with your
>>>> resilver-problem.
>>>>
>>>> Can you describe what you did to replace the disk?
>>>>
>>>>
>>>>
>>>>
>>
>>
>>
>> Is it possible that the hardware involved in the resilver (port, cable,
>> etc.) is failing, so that the OS cannot complete the resilver or keeps
>> seeing that device as needing to be resilvered?
>>
>>
>>
>> Mehmet Erol Sanliturk
>>
>>
>>
>>
>>
>>
>
>
> --
> Thanks for your support,
> Motty
>