drive failure during rebuild causes page fault
Doug White
dwhite at gumbysoft.com
Mon Dec 13 10:28:53 PST 2004
On Sun, 12 Dec 2004, Joe Rhett wrote:
> On Sun, Dec 12, 2004 at 09:59:16PM -0800, Doug White wrote:
> > That's a nice shotgun you have there.
>
> Yessir. And that's what testing is designed to uncover. The question is
> why this works, and how do we prevent it?
I'm sure Soren appreciates you donating your feet to the cause :)
Why it works: the system assumes the administrator is competent enough not
to yank a disk that is being rebuilt onto.
> Is there a proper way to handle these sorts of events? If so, where is it
> documented?
>
> And FYI, just pulling the drives causes the same failure, which means that
> RAID1 buys you nothing because your system will also crash.
This is why I don't trust ATA RAID for fault tolerance -- it'll save your
data, but the system will tank. Since the disk state is maintained by
the OS and not abstracted by a separate processor, if a disk dies in a
particularly bad way the system may not be able to cope.
--
Doug White | FreeBSD: The Power to Serve
dwhite at gumbysoft.com | www.FreeBSD.org