Bad sectors: how bad can it be
Michael Powell
nightrecon at hotmail.com
Wed Oct 28 15:52:46 UTC 2009
Michaël Grünewald wrote:
[snip]
>
> I have backups of the data contained on the broken disc, so the data on this
> disc are not a concern. I have however a question: How do I verify that
> a hard-drive is accurately working if its firmware will hide the bad
> sectors as long as possible?
>
[snip]
As Polytropon indicated, the smartctl utility from the smartmontools port
can run the drive's built-in self-tests and extract the error logs kept by
the drive's firmware. There are two basic self-test modes, a short one and a
long one, and either can be started "now" from a command prompt. The package
also includes a daemon, smartd, for continual monitoring. The data returned
is somewhat arcane and can be difficult to interpret.
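As a rough illustration of the two self-test modes and the log queries, here
is a minimal Python sketch that only builds the smartctl command lines. The
helper names (selftest_command, log_command, run) are made up for this
example, and the device name used below is an assumption; substitute your own
disk. The smartctl options themselves (-t short, -t long, -l error,
-l selftest) are the standard ones.

```python
import subprocess

# The two self-test modes mentioned above.
SELF_TEST_MODES = ("short", "long")
# Logs kept by the drive: the error log and the self-test result log.
LOG_TYPES = ("error", "selftest")


def selftest_command(device, mode="short"):
    """Build the smartctl command that starts a self-test immediately."""
    if mode not in SELF_TEST_MODES:
        raise ValueError("mode must be one of %r" % (SELF_TEST_MODES,))
    return ["smartctl", "-t", mode, device]


def log_command(device, log="error"):
    """Build the smartctl command that prints one of the drive's logs."""
    if log not in LOG_TYPES:
        raise ValueError("log must be one of %r" % (LOG_TYPES,))
    return ["smartctl", "-l", log, device]


def run(cmd):
    """Run a command (normally as root) and return whatever it printed."""
    result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    return result.stdout
```

Usage would look something like run(selftest_command("/dev/ada0", "long")),
followed later by run(log_command("/dev/ada0", "selftest")) once the test has
had time to finish. The self-test runs inside the drive, so the first command
returns right away and the result only shows up in the log afterwards.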
How usable this is varies by hardware. Some RAID controllers get in the way
of direct communication with the hard drives behind them. Other controllers,
as you move up the price ladder, often do built-in SMART monitoring and will
beep and/or send email when they detect an error condition on a drive. Some
even include, or come with an external utility that provides, a web-based
view in real time. The purpose is to detect a drive that is about to fail.
At the most basic level, you would look for numbers which indicate that the
bad sector remap area has filled. Once this space is full, any new bad
sectors that develop can no longer be mapped out. This usually shows up in
the operating system as some generic form of "unrecoverable read/write
error" message, and Bad Things begin to happen.
I have not used SpinRite in a very long time, but it may buy some life on
such a drive. If it can clear the bad sector remap area after adjusting the
remap table, it can give new life to a drive. The same thing used to be
possible on SCSI drives by running the low-level format utility usually
contained within the controller firmware.
Such "fixes" should only be viewed as extremely temporary in nature, as the
general pattern with magnetic media failure is that once a drive starts to
get bad spots, it will keep on getting bad spots on a fairly regular basis
afterwards.
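Since the warning sign is a remap count that is non-zero and keeps growing,
one way to watch for it is to pick the relevant rows out of the attribute
table printed by "smartctl -A". The following Python sketch assumes the usual
ten-column ATA attribute table and the common attribute names
Reallocated_Sector_Ct, Current_Pending_Sector and Offline_Uncorrectable;
names and raw-value meanings vary by vendor, so treat this as a starting
point rather than a definitive check. SAMPLE is made-up text for
illustration, not output from a real drive, and the function names are
hypothetical.

```python
# Illustrative sample only, laid out like "smartctl -A" output.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       12
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       3
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
"""

# Attributes commonly tied to bad-sector remapping (assumed names).
WATCHED = {
    "Reallocated_Sector_Ct",
    "Current_Pending_Sector",
    "Offline_Uncorrectable",
}


def parse_attributes(text):
    """Return {name: {"value", "thresh", "raw"}} for each attribute row."""
    attrs = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have ten columns.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        raw = int(fields[9]) if fields[9].isdigit() else None
        attrs[fields[1]] = {
            "value": int(fields[3]),
            "thresh": int(fields[5]),
            "raw": raw,
        }
    return attrs


def remap_warnings(attrs):
    """List the watched attributes that look suspicious."""
    warnings = []
    for name in sorted(WATCHED & attrs.keys()):
        a = attrs[name]
        if a["thresh"] > 0 and a["value"] <= a["thresh"]:
            warnings.append(name + ": normalized value at or below threshold")
        elif a["raw"]:
            warnings.append(name + ": raw count is %d" % a["raw"])
    return warnings
```

Calling remap_warnings(parse_attributes(SAMPLE)) flags the two attributes
with a non-zero raw count. Saving the raw counts from successive runs and
comparing them is a simple way to see whether the drive keeps developing new
bad spots.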
Interesting reading:
http://www.usenix.org/publications/login/2008-06/openpdfs/bairavasundaram.pdf
-Mike
More information about the freebsd-questions
mailing list