geom_mirror/UFS weirdness with 7.2-STABLE
Boris Kochergin
spawk at acm.poly.edu
Wed Aug 26 00:00:04 UTC 2009
Ivan Voras wrote:
> Boris Kochergin wrote:
>
>> Ahoy. I noticed some very odd things in my file server's kernel buffer
>> this morning (there were actually a ton of these--this is a snippet):
>>
>> Jul 20 05:54:10 exodus smartd[763]: Device: /dev/ad1, FAILED SMART
>> self-check. BACK UP DATA NOW!
>> Jul 20 05:57:57 exodus kernel:
>> g_vfs_done():mirror/boots1[READ(offset=-4569735194538825728,
>> length=16384)]error = 5
>> Jul 20 05:57:57 exodus kernel: bad block 8806809555123731765, ino 4430620
>> Jul 20 05:57:57 exodus kernel: pid 35 (softdepflush), uid 0 inumber
>> 4430620 on /: bad block
>>
>
>
>> # df /
>> Filesystem          1K-blocks                 Used               Avail         Capacity  Mounted on
>> /dev/mirror/boots1   37846636 -4058799239201906816 4058799239236725722  -11656883301279%  /
>>
>> The system is a:
>>
>> # uname -a
>> FreeBSD exodus.poly.edu 7.2-STABLE FreeBSD 7.2-STABLE #3: Sat Jul 11
>> 16:22:02 EDT 2009 root at exodus.poly.edu:/usr/obj/usr/src/sys/EXODUS
>> amd64
>>
>> Regarding smartd yelling at me about /dev/ad1, it's been doing that for
>> a long while before this. There is one sector on the drive that cannot be
>> read, but the disk has otherwise been fine for months. My experience
>> with geom_mirror has been that it disconnects members from an array if
>> they experience I/O errors, so this seems to be something different. Any
>> clues?
>>
>
> It looks like the drive returned corrupted data without returning an
> error - which is strange, but not impossible. You are probably seeing
> numbers like -4058799239201906816 because some metadata is corrupted. If
> so, you should immediately disconnect the problematic drive so that the
> erroneous data isn't picked up and written to the good drive.
>
>
>
In retrospect, it appears to have been bad RAM. The symptoms were just
subtler back then.
-Boris
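
For anyone reading this in the archives, here is a rough sanity check of the
df figures quoted above. It is only a sketch: the helper names are made up for
illustration, the 8% reserve is an assumption (the usual UFS minfree default,
not something confirmed for this filesystem), and the capacity formula
used / (used + avail) is assumed rather than taken from the df source. The
point is that Used and Avail are each absurd but their sum is sane, which fits
a single corrupted free-space counter rather than damage across the whole
filesystem.

```python
# Figures copied from the quoted df output (units: 1K-blocks).
TOTAL_KB = 37846636
USED_KB = -4058799239201906816
AVAIL_KB = 4058799239236725722


def as_unsigned64(value):
    """Return the 64-bit two's-complement bit pattern of a signed value."""
    return value & 0xFFFFFFFFFFFFFFFF


def as_signed64(pattern):
    """Interpret a 64-bit bit pattern as a signed integer."""
    return pattern - (1 << 64) if pattern & (1 << 63) else pattern


# The garbage cancels out: Used + Avail is an ordinary number.
used_plus_avail = USED_KB + AVAIL_KB
print(used_plus_avail)  # 34818906

# Roughly 0.92 of the total, as expected if 8% is held in reserve
# (assumption: default minfree).
print(used_plus_avail / TOTAL_KB)

# Assumed capacity formula; it reproduces the nonsensical percentage.
capacity = USED_KB / used_plus_avail * 100
print(round(capacity))  # -11656883301279

# The raw bit pattern behind the negative "Used" value, for inspection.
print(hex(as_unsigned64(USED_KB)))
```

Whether the bad counter came from the disk or from RAM cannot be told from
these numbers alone; they only show which value went wrong.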