cvs commit: src/sys/kern kern_sig.c
Robert Watson
rwatson at freebsd.org
Sat Oct 25 22:46:09 PDT 2003
On Sun, 26 Oct 2003, Bruce Evans wrote:
> On Sat, 25 Oct 2003, Alfred Perlstein top posted:
>
> > This is bad, it's time to add a flag to the vnode to do this
> > properly instead of relying upon the underlying FS to implement
> > the locking.
>
> How would a mere flag help fix the real complexities for nfs?
Well, the point of the locking originally introduced in the core dump code
was presumably to help avoid a common scenario: corrupted core dumps
due to parallel dumping. Given the difficulty of addressing the problem
in any thorough way (distributed locking, etc.), I think I'd almost rather
go for the simplest possible mechanism. Setting a flag on the vnode while a
core dump to it is in progress, and aborting any other core dump attempts
while it's set, presents a pretty clean solution to the single-host
case.
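A minimal userspace sketch of the flag idea described above, not kernel
code. The names fake_vnode, coredump_begin and coredump_end are
hypothetical and exist only for illustration; a real implementation would
put the flag in struct vnode and protect it with the vnode's own locking
rather than C11 atomics.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a vnode; this is NOT FreeBSD's struct vnode. */
struct fake_vnode {
	atomic_bool coredumping;	/* true while a core dump is in progress */
};

/*
 * Try to claim the vnode for a core dump.  Returns true if we set the
 * flag, false if another dump already holds it (the caller should then
 * abort its own dump instead of writing in parallel).
 */
static bool
coredump_begin(struct fake_vnode *vp)
{
	bool expected = false;

	return (atomic_compare_exchange_strong(&vp->coredumping,
	    &expected, true));
}

/* Clear the flag once the dump has been written. */
static void
coredump_end(struct fake_vnode *vp)
{
	assert(atomic_load(&vp->coredumping));
	atomic_store(&vp->coredumping, false);
}
```

A second caller of coredump_begin() on the same object gets false until
the first caller has run coredump_end(). This only covers a single host:
two NFS clients dumping to the same file would each see their own flag.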
> > * Robert Watson <rwatson at FreeBSD.org> [031025 09:14] wrote:
> > > rwatson 2003/10/25 09:14:09 PDT
> > >
> > > FreeBSD src repository
> > >
> > > Modified files:
> > > sys/kern kern_sig.c
> > > Log:
> > > When generating a core dump, use advisory locking in an advisory way:
> > > if we do acquire an advisory lock, great! We'll release it later.
> > > However, if we fail to acquire a lock, we perform the coredump
> > > anyway.
>
> Er, advisory locking means that honoring the lock is not enforced, not
> that it is good to not honor it.
The comment was a bit flippant and inaccurate: if it's possible for the
locking request to succeed, we wait for it to succeed. However, if we get
a fatal error (rather than blocking), then we plow on ahead. By "advisory
locking", I mean that we're using the advisory locking facility. By
"advisory way", I mean "if it works, use it, and if it's not available,
don't".
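To make "if it works, use it, and if it's not available, don't" concrete,
here is a rough userspace analogue using fcntl(2) record locks. It is a
sketch under assumptions, not the code in kern_sig.c: the helper name
dump_with_optional_lock and its locked_out parameter are made up for
illustration, and the kernel goes through the vnode locking interface
rather than fcntl().

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Write len bytes from buf to fd, taking a whole-file write lock if one
 * can be had.  F_SETLKW blocks while another process holds the lock, so
 * a lock that can succeed is waited for; a hard error (for example, no
 * lock manager for this file system) is reported and then ignored.
 * Returns the result of write(2); *locked_out says whether the lock was
 * actually held during the write.
 */
static ssize_t
dump_with_optional_lock(int fd, const void *buf, size_t len, bool *locked_out)
{
	struct flock lf = { 0 };
	bool locked;
	ssize_t n;

	lf.l_type = F_WRLCK;
	lf.l_whence = SEEK_SET;
	lf.l_start = 0;
	lf.l_len = 0;			/* 0 means "to end of file" */

	locked = (fcntl(fd, F_SETLKW, &lf) == 0);
	if (!locked)
		perror("advisory lock unavailable, dumping anyway");

	n = write(fd, buf, len);

	if (locked) {
		lf.l_type = F_UNLCK;
		(void)fcntl(fd, F_SETLKW, &lf);
	}
	if (locked_out != NULL)
		*locked_out = locked;
	return (n);
}
```

The point of the sketch is only the control flow: the lock result decides
whether an unlock is needed later, never whether the write happens.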
> > > This problem became particularly visible with NFS after
> > > the introduction of rpc.lockd: if the lock manager isn't running,
> > > then locking calls will fail, aborting the core dump (resulting in
> > > a zero-byte dump file).
> > >
> > > Reported by: Yogeshwar Shenoy <ynshenoy at alumni.cs.ucsb.edu>
>
> There is only a problem if the lock manager is supposed to be running
> but is not. That is a configuration error, or perhaps a transient
> error, so it should not be "fixed" by ignoring the failure. If ignoring
> nfs locks is what is wanted in all cases, then it should be configured
> by mounting the file system with -L (= nolockd). Maybe the lock request
> should hang for transient failures.
>
> Support for correct configuration of this is still mostly nonexistent in
> /etc/defaults/rc.conf and rc.conf(5). The default for nfs mounts is
> lockd, but the default for rpc_lockd_enable is "NO". Setting
> rpc_lockd_enable to "YES" is not sufficient to configure this. The
> setting of at least rpc_statd_enable must also be changed.
>
> This stuff is misconfigured on all of the freebsd machines that I
> checked. Some run 4.9, so nfs locking is not available. beast and
> builder demonstrate the bug by giving empty core dumps. bento avoids
> the bug by dumping cores in a non-nfs directory.
Agreed. The current condition of NFS locking is pretty pessimal: we still
have substantial bugs in the implementation of the lock manager, and
configuring locking correctly is difficult. The default configuration is
particularly poor. We should address most of these. :-)
Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
robert at fledge.watson.org Network Associates Laboratories