Delete a directory, crash the system
Da Rock
freebsd-fs at herveybayaustralia.com.au
Wed Mar 25 09:25:07 UTC 2015
On 03/25/15 14:25, Benjamin Kaduk wrote:
> On Tue, 24 Mar 2015, Da Rock wrote:
>
>> On 03/25/15 00:16, Benjamin Kaduk wrote:
>>> On Mon, 23 Mar 2015, Da Rock wrote:
>>>
>>>> Unfortunately, fsck isn't helping - foreground or otherwise. All it shows on
>>>> every single fs is inode 4 recovery which doesn't sound quite right. And
>>> Have you posted the exact output in a previous message (could you send a
>>> link)?
>> Not precisely, but the message is just a flash and there is no copying of it.
>> Anyway, inode 4 is the .sujournal file as expected; this means there is an
>> issue with the softupdates. Could this be narrowing it down (the OP to this
>> was also in this age of enlightenment, SU came in with 8.x didn't it?)?
> Ah, SU+J could be quite relevant. Soft-update journalling was enabled by
> default for a period of time, but I believe it was disabled because there
> were some scenarios where it was destabilizing. CC-ing Kirk to improve on
> my lousy memory.
Hmmm... not sure about that. This was set by a fresh install at the time
and I haven't fiddled with that - I have set trim though (I think). To
verify, I just checked my fresh 10.1 and it has the same settings, so I
don't think they're disabled yet...
>
> Do you remember what version was used to install the system in question
> (i.e., create the filesystem in question)?
Version of what exactly? Do you mean the OS or the utilities for
filesystem ops? The filesystem was originally set up at install (I start
with a clean system when I install FreeBSD - exceptions happen of
course, but that's the rule. Makes it easier... they are just
workstations after all) so I wouldn't remember or be able to discover
exactly what utils were used. Install was using bsdinstall as per FBSD10 disk.
> Please show the output of
> 'tunefs -p <filesystem>'
root:
tunefs: POSIX.1e ACLs: (-a) disabled
tunefs: NFSv4 ACLs: (-N) disabled
tunefs: MAC multilabel: (-l) disabled
tunefs: soft updates: (-n) enabled
tunefs: soft update journaling: (-j) enabled
tunefs: gjournal: (-J) disabled
tunefs: trim: (-t) enabled
tunefs: maximum blocks per file in a cylinder group: (-e) 4096
tunefs: average file size: (-f) 16384
tunefs: average number of files in a directory: (-s) 64
tunefs: minimum percentage of free space: (-m) 8%
tunefs: space to hold for metadata blocks: (-k) 5240
tunefs: optimization preference: (-o) time
tunefs: volume label: (-L)
All the others are about the same - variations mainly in space variables
due to size.
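
For comparing these settings across several filesystems, a minimal
sketch in Python, assuming the output always has the
"tunefs: <name>: (<flag>) <value>" shape shown above. The names
parse_tunefs and uses_suj are made up for illustration; they are not
part of any FreeBSD tool.

```python
import re

# Sample lines copied from the 'tunefs -p' output quoted above.
SAMPLE = """\
tunefs: soft updates: (-n) enabled
tunefs: soft update journaling: (-j) enabled
tunefs: gjournal: (-J) disabled
tunefs: trim: (-t) enabled
tunefs: volume label: (-L)
"""

# Assumed line shape: "tunefs: <name>: (<flag>) <value>"; value may be empty.
LINE_RE = re.compile(r"^tunefs: (?P<name>.+?): \((?P<flag>-\w)\)\s*(?P<value>.*)$")


def parse_tunefs(text):
    """Map each flag (e.g. '-j') to a (description, value) pair."""
    settings = {}
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            settings[m.group("flag")] = (m.group("name"), m.group("value").strip())
    return settings


def uses_suj(settings):
    """True when both soft updates (-n) and SU journaling (-j) are enabled."""
    empty = ("", "")
    return (settings.get("-n", empty)[1] == "enabled"
            and settings.get("-j", empty)[1] == "enabled")


if __name__ == "__main__":
    print(uses_suj(parse_tunefs(SAMPLE)))
```

The "tunefs:" prefix on every line suggests the report is written to
stderr, so capturing it from a script would probably need 2>&1.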
>
>>>> again, it is only showing during updates to ports being built. I'm
>>> Er, what is only showing up? The panics?
>>> Surely you are not only running fsck while building ports...
>> Yes, the panics.
>>
>> Sorry, I thought that was obvious seeing as the alternative is impossible :)
>>>> investigating further, but it may be just a corrupt file in pkg system.
>>>>
>>>> Incidentally, I'm not suggesting an absolute fix for the issue as such, but a
>>>> better means of handling it rather than crashing the system. The posts on this
>>> Understood. But, there will always be some types of error which are truly
>>> unrecoverable, and there is no real option other than to panic. (Which is
>>> not to say that your situation is necessarily one of them.)
>> That I get, and given this may be an issue with SU it may well be warranted.
>> What can we do to narrow this down, as obviously one cannot be sitting
>> watching exactly what happens for the hours required while building ports.
>> You're bound to look away for just a second and miss it even if you did try! :D
>>>> If I discover anything more I'll keep everyone posted :)
>> So I did some fiddling with fsck, fsdb, find and stat; and got nowhere. I ran
>> fsck again and it gave me not much again. It did hint at some files in the
>> ports tree, so I cleaned up the ports tree to fresh install point, ran fsck
>> again and rebooted. So far so good, but I'm keeping my fingers crossed still.
> It is probably important to note that 'fsck -F' and saying 'no' to "USE
> JOURNAL?" is the most relevant fsck invocation.
Ok. I only use fsck in single user mode, as it's only really of use to me
there and something is usually broken if I'm using it :) so -F is
usually implied there. No to use journal - good to know, I'll use that
next time when it happens.
>
>> This doesn't help the panics - they're still a pita when they happen. It does
>> help me resolve the issue this time though. But initiating this error in
>> testing is damn near impossible. What can we document here as a way to gather
>> data to determine how to resolve this issue? Given my luck with this, it's
>> bound to happen again at some point :)
> I think actual diagnosis is beyond my expertise/time commitment at the
> moment. I suspect that using tunefs to disable softupdate journalling
> will be a workaround, if that is what you are really interested in.
Don't know. Might be SU+J or maybe a pkgng fault in managing ports.
Might just wing it - might be helpful to the project after all :) (could
irk some of my users though :P)
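
If the tunefs workaround does get tried, a minimal sketch of scripting
it in Python, assuming the -j flag from the tunefs output above takes
"disable" as its argument and that the filesystem is unmounted or
mounted read-only at the time (tunefs won't change an active one). The
function names and the device path in the example are hypothetical.

```python
import subprocess


def suj_disable_cmd(filesystem):
    """Build the argv that turns soft update journaling off for one filesystem."""
    return ["tunefs", "-j", "disable", filesystem]


def disable_suj(filesystem, run=False):
    """Return the command; only execute it when run=True (dry run by default)."""
    cmd = suj_disable_cmd(filesystem)
    if run:
        # check=True raises CalledProcessError if tunefs exits non-zero.
        subprocess.run(cmd, check=True)
    return cmd


if __name__ == "__main__":
    # "/dev/ada0p2" is a placeholder device, not one taken from this thread.
    print(" ".join(disable_suj("/dev/ada0p2")))
```

Leaving run=False just prints the command so it can be checked before
being typed in single user mode.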
>
> I'll let Kirk decide if he wants to debug more, but the answer may well be
> "no" if you're not running the latest ufs from -current.
>
> -Ben