UFS_DIRHASH panics on a dozen servers within 30 hours
Andreas Longwitz
longwitz at incore.de
Sun Sep 11 22:24:02 UTC 2011
Hi,
thank you very much for your answer. I think you pointed me in the right
direction.
> Hmm, the patch in that PR should still apply to newer versions. Also, you
> could just change the malloc() call to always allocate the maximum size
> (instead of using a static buffer) for a smaller diff. It seems though that a
> specific command is overrunning its buffer.
Yes. I found that megarc often requests a buffer of 12868 bytes, but the
controller always sends back 25412 bytes. Because this seems to be an
error in megarc, I have submitted a patch to the existing PR ports/137938.
Furthermore, I saw some sporadic replies from the controller to megarc
ioctls carrying much more data than the buffer size stated by megarc.
Therefore I still use the maximum size in my updated patch in kern/155658.
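
To illustrate the idea of always allocating the maximum size, here is a
minimal userland sketch in C. It is not the patch from kern/155658. The
names sketch_alloc_size and sketch_copy_reply and the 64 KB limit
SKETCH_MAX_REPLY are assumptions made up for this example. The sketch only
shows that the length stated by the caller is treated as a lower bound,
and that a reply is never copied past the end of the allocated buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical upper bound on a controller reply; NOT taken from any driver. */
#define SKETCH_MAX_REPLY ((size_t)64 * 1024)

/*
 * Size to allocate for an ioctl data buffer.  The length stated by the
 * caller is only a lower bound: never allocate less than the maximum.
 */
static size_t
sketch_alloc_size(size_t requested)
{
	return (requested > SKETCH_MAX_REPLY ? requested : SKETCH_MAX_REPLY);
}

/*
 * Copy a reply into a buffer of dst_len bytes, truncating instead of
 * overrunning.  Returns the number of bytes actually copied.
 */
static size_t
sketch_copy_reply(void *dst, size_t dst_len, const void *src, size_t reply_len)
{
	size_t n = reply_len < dst_len ? reply_len : dst_len;

	assert(dst != NULL && src != NULL);
	memcpy(dst, src, n);
	return (n);
}
```

With these assumed values, a request for 12868 bytes gets a 65536 byte
buffer, so a 25412 byte reply fits. A reply larger than the buffer would be
truncated rather than written over neighbouring memory.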
>> Now I have a dozen core dumps and try to understand what happened.
>> All dumps looks very similar and the panic is always "page fault"
>> in _mtx_lock_sleep called from ufsdirhash_recycle or ufsdirhash_free
>> because the used mtx_object is overwritten with zeros by someone
>> before _mtx_lock_sleep is called.
>
> I don't know of anything in particular that would explain this, esp. as to
> why you would see them all occur at the same time.
In the meantime I had three more crashes on FreeBSD 6. I assume it is
the same problem as on FreeBSD 8, because the memory corruption
caused by megarc and the controller has nothing to do with the version
of FreeBSD. I have verified that the overruns occur on FreeBSD 6 too,
but I cannot explain why FreeBSD did not crash for years, although
I have been using megarc every day the whole time.
--
Dr. Andreas Longwitz
Data Service GmbH
Beethovenstr. 2A
23617 Stockelsdorf
Amtsgericht Lübeck, HRB 318 BS
Geschäftsführer: Wilfried Paepcke, Dr. Andreas Longwitz, Josef Flatau
More information about the freebsd-stable
mailing list