SMART: disk problems on RAIDZ1 pool: (ada6:ahcich6:0:0:0): CAMstatus: ATA Status Error
Daniel Kalchev
daniel at digsys.bg
Wed Dec 13 22:07:37 UTC 2017
> On 13 Dec 2017, at 21:39, O. Hartmann <o.hartmann at walstatt.org> wrote:
>
> Am Wed, 13 Dec 2017 08:47:53 -0800 (PST)
> "Rodney W. Grimes" <freebsd-rwg at pdx.rh.CN85.dnsmgr.net> schrieb:
>
>>> On Tue, 12 Dec 2017 14:58:28 -0800
>>> Cy Schubert <Cy.Schubert at komquats.com> wrote:
>>>
>>>> There are a couple of ways you can address this. You'll need to
>>>> offline the vdev first. If you've done a smartctl -t long and if the
>>>> test failed, smartctl -a will tell you which block it had an issue
>>>> with. You can use dd, ddrescue or dd_rescue to dd the block over
>>>> itself. The drive may rewrite the (weak) block or if it fails to it
>>>> will remap it (subsequently showing as reallocated).
>>>>
>>>> Of course there is a risk. If the sector is any of the boot blocks
>>>> there is a good chance the server will hang.
>>>
>>> The drive is part of a dedicated storage-only pool. The boot drive is a
>>> fast SSD. So I do not care about this - well, to say it more politely:
>>> I do not have to take care of that aspect.
>>>
>>>>
>>>> You have to be *absolutely* sure which the bad sector is. And, there
>>>> may be more. There is a risk of data loss.
>>>>
>>>> I've used this technique many times. Most times it works perfectly.
>>>> Other times the affected file is lost but the rest of the file system
>>>> is recovered. And again there is always the risk.
>>>>
>>>> Replace the disk immediately if you experience a growing succession
>>>> of pending sectors. Otherwise replace the disk at your earliest
>>>> convenience.
>>>
>>> The ZFS scrubbing of the volume ended this morning, leaving the pool in
>>> a healthy state. After reboot, there was no sign of CAM errors again.
>>>
>>> But there is something else I'm worried about. The mainboard I use is a
>>>
>>> ASRock Z77 Pro4-M.
>>> The board has a crippled Intel MCP with 6 SATA ports from the chipset,
>>> two of them SATA 6 Gb/s, 4 SATA II, and one additional chip with two
>>> SATA 6 Gb/s ports:
>>>
>>> [...]
>>> ahci0 at pci0:2:0:0: class=0x010601 card=0x06121849 chip=0x06121b21
>>> rev=0x01 hdr=0x00 vendor = 'ASMedia Technology Inc.'
>>> device = 'ASM1062 Serial ATA Controller'
>>> class = mass storage
>>> subclass = SATA
>>> bar [10] = type I/O Port, range 32, base 0xe050, size 8, enabled
>>> bar [14] = type I/O Port, range 32, base 0xe040, size 4, enabled
>>> bar [18] = type I/O Port, range 32, base 0xe030, size 8, enabled
>>> bar [1c] = type I/O Port, range 32, base 0xe020, size 4, enabled
>>> bar [20] = type I/O Port, range 32, base 0xe000, size 32, enabled
>>> bar [24] = type Memory, range 32, base 0xf7b00000, size 512,
>>> enabled
>>> [...]
>>>
>>> Attached to that ASM1062 SATA chip is a backup drive via an eSATA
>>> connector, a WD 4 TB RED drive. It seems that whenever I attach this
>>> drive and it is online, I experience problems on the ZFS pool, which
>>> is attached to the MCP SATA ports.
>>
>> How does this external drive get its power? Are the earth grounds of
>> both the system and the external drive power supply closely tied
>> together? A plug/unplug event with a slight ground creep can
>> wreak havoc with device operation.
>
> The external drive is housed in an external casing. Its PSU has de facto the same
> "grounding" (earth ground) as the server's PSU; they share the same power plug at the
> point where the plug comes out of the wall - so to speak.
Most external drive power supplies are not grounded. At least, none I have ever seen had a grounded plug on the mains cable. Yours might have one...
Worth checking anyway.
Daniel
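
For the "dd the block over itself" technique quoted above, here is a minimal sketch in Python of how the failing block could be picked out of the smartctl self-test log and turned into a dd command line. It only builds and returns the command as a string; it does not run anything or touch a disk. The function names, the sample log text and the 512-byte sector size are assumptions made for illustration, and the log layout is the usual ATA self-test table with LBA_of_first_error as the last column, which may look different for your drive or smartctl version. The device name /dev/ada6 is taken from the subject line.

```python
import re
import shlex

# Hypothetical excerpt of a smartctl self-test log (illustrative values only).
SAMPLE_LOG = """\
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed: read failure       90%     12345         123456789
# 2  Short offline       Completed without error       00%     12000         -
"""


def first_error_lbas(selftest_log):
    """Return the LBA_of_first_error of every self-test entry that hit a read failure."""
    lbas = []
    for line in selftest_log.splitlines():
        # Entries start with "# <n>"; a failed read ends in a numeric LBA,
        # while a passing test shows "-" in that column and is skipped.
        match = re.match(r"#\s*\d+\s+.*read failure.*\s(\d+)\s*$", line)
        if match:
            lbas.append(int(match.group(1)))
    return lbas


def dd_rewrite_command(device, lba, sector_size=512):
    """Build (but do not run) a dd command that copies one sector onto itself.

    sector_size=512 is an assumption; the LBA must be counted in the same unit.
    conv=noerror,sync makes dd continue after a read error and pad the block
    with zeros, so an unreadable sector is overwritten with zeros.
    """
    args = [
        "dd",
        f"if={device}",
        f"of={device}",
        f"bs={sector_size}",
        f"skip={lba}",   # input offset, in blocks
        f"seek={lba}",   # output offset, in blocks
        "count=1",
        "conv=noerror,sync",
    ]
    return " ".join(shlex.quote(a) for a in args)


if __name__ == "__main__":
    for lba in first_error_lbas(SAMPLE_LOG):
        print(dd_rewrite_command("/dev/ada6", lba))
```

As the quoted advice says, the vdev has to be offlined first and you have to be absolutely sure about the block number, so treat the printed command as something to review by hand, not to pipe into a shell.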
More information about the freebsd-current
mailing list