Non-responsive 8.0-RC1
Peter Jeremy
peterjeremy at acm.org
Mon Nov 30 08:13:45 UTC 2009
On 2009-Nov-29 08:56:55 +0100, Thomas Backman <serenity at exscape.org> wrote:
>
>On Nov 28, 2009, at 10:22 PM, Peter Jeremy wrote:
>
>> My main server is running 8.0/amd64 from between RC1 and RC2 and I've
>> recently had a couple of long-duration hangs on it during which time
>> processes doing I/O will stop responding.
I forgot to mention that I checked SMART state on the disks and also
did a 'zpool scrub' after the first occurrence - no problems showed up.
It actually "hung" again just after I sent the original mail. This
time I managed to get console access and could check the kernel state.
This showed that a number of processes were blocked on ZFS locks.
The most commonly reported state was 'tx->tx_quiesce_done_cv)'.
The server had been up for about 30 days before I noticed any problems
and the hangs seem to have been getting more obvious, so it is also
possible that they are related to uptime - either a resource leak
somewhere (though there was nothing obvious) or memory fragmentation.
>Hmm, I know there was some fix to the scheduler re: thread priority,
>and it wouldn't surprise me if it was after your revision.
After looking around in the kernel, I'm now confident that it's not
a priority-inversion issue as the BOINC processes all appeared to be
running normally and not holding locks.
>My advice would be to upgrade to -RELEASE if possible. If not, at
>least check whether your build should be affected.
I have updated to a recent 8-stable and will see what happens.
--
Peter Jeremy