cvs commit: src/sys/kern kern_sig.c

Robert Watson rwatson at freebsd.org
Sat Oct 25 22:46:09 PDT 2003


On Sun, 26 Oct 2003, Bruce Evans wrote:

> On Sat, 25 Oct 2003, Alfred Perlstein top posted:
> 
> > This is bad, it's time to add a flag to the vnode to do this
> > properly instead of relying upon the underlying FS to implement
> > the locking.
> 
> How would a mere flag help fix the real complexities for nfs?

Well, the point of the locking originally introduced in the core dump code
was presumably to help avoid a common case scenario: corrupted core dumps
due to parallel dumping.  Given the difficulty in addressing the problem
in any thorough way (distributed locking, etc), I think I'd almost rather
go for the simplest possible mechanism.  Setting a vnode flag during a
coredump to the vnode, and then causing any other core dump attempts to be
aborted while it's set, presents a pretty clean solution to the single-host
case.
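
To make the flag idea concrete, here is a minimal user-space sketch in C.
Everything in it is hypothetical: struct toy_vnode, the
coredump_in_progress field, and the two helper functions are illustrative
names, not existing kernel interfaces. A real kernel version would also
have to do the test-and-set under the vnode interlock.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a vnode; NOT the kernel's struct vnode. */
struct toy_vnode {
	bool coredump_in_progress;	/* hypothetical "dump active" flag */
};

/*
 * Claim the vnode for a core dump.  Returns 0 if the caller may dump,
 * or EBUSY if another dump to the same vnode is already under way.
 * In a kernel, this check-and-set would need to be atomic.
 */
int
coredump_begin(struct toy_vnode *vp)
{
	if (vp->coredump_in_progress)
		return (EBUSY);		/* second dumper aborts */
	vp->coredump_in_progress = true;
	return (0);
}

/* Release the claim once the dump has been written. */
void
coredump_end(struct toy_vnode *vp)
{
	vp->coredump_in_progress = false;
}
```

A second dumper that gets EBUSY simply gives up, which avoids interleaved
writes on one host but does nothing for two NFS clients dumping to the
same file.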

> > * Robert Watson <rwatson at FreeBSD.org> [031025 09:14] wrote:
> > > rwatson     2003/10/25 09:14:09 PDT
> > >
> > >   FreeBSD src repository
> > >
> > >   Modified files:
> > >     sys/kern             kern_sig.c
> > >   Log:
> > >   When generating a core dump, use advisory locking in an advisory way:
> > >   if we do acquire an advisory lock, great!  We'll release it later.
> > >   However, if we fail to acquire a lock, we perform the coredump
> > >   anyway.
> 
> Er, advisory locking means that honoring the lock is not enforced, not
> that it is good to not honor it.

The comment was a bit flippant and inaccurate: if it's possible for the
locking request to succeed, we wait for it to succeed.  However, if we get
a fatal error (rather than blocking), then we plow on ahead.  By "advisory
locking", I mean that we're using the advisory locking facility.  By
"advisory way", I mean "if it works, use it, and if it's not available,
don't".
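
As a rough user-space analogue of that policy (not the kern_sig.c code
itself), the sketch below waits for an fcntl(2) advisory lock if one can
be had, and writes the data regardless if the lock request fails
outright. The function names are hypothetical and chosen only for
illustration.

```c
#include <fcntl.h>
#include <stdbool.h>
#include <string.h>
#include <unistd.h>

/*
 * Request a whole-file write lock, blocking while another process
 * holds it.  Returns true if we now hold the lock, false if the
 * request failed with an error (for example, locking unavailable).
 */
bool
lock_best_effort(int fd)
{
	struct flock lf;

	memset(&lf, 0, sizeof(lf));
	lf.l_type = F_WRLCK;
	lf.l_whence = SEEK_SET;
	lf.l_start = 0;
	lf.l_len = 0;			/* 0 means "to end of file" */
	return (fcntl(fd, F_SETLKW, &lf) == 0);
}

/* Drop the lock, but only if lock_best_effort() actually got it. */
void
unlock_if_held(int fd, bool locked)
{
	struct flock lf;

	if (!locked)
		return;
	memset(&lf, 0, sizeof(lf));
	lf.l_type = F_UNLCK;
	lf.l_whence = SEEK_SET;
	(void)fcntl(fd, F_SETLK, &lf);
}

/*
 * Write the dump whether or not the lock was obtained.
 * Returns 0 on a complete write, -1 otherwise.
 */
int
dump_with_best_effort_lock(int fd, const void *buf, size_t len)
{
	bool locked = lock_best_effort(fd);
	ssize_t n = write(fd, buf, len);

	unlock_if_held(fd, locked);
	return (n == (ssize_t)len ? 0 : -1);
}
```

The key point is that a failed lock request changes only whether we
unlock afterwards; it never prevents the write.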

> > >   This problem became particularly visible with NFS after
> > >   the introduction of rpc.lockd: if the lock manager isn't running,
> > >   then locking calls will fail, aborting the core dump (resulting in
> > >   a zero-byte dump file).
> > >
> > >   Reported by:    Yogeshwar Shenoy <ynshenoy at alumni.cs.ucsb.edu>
> 
> There is only a problem if the lock manager is supposed to be running
> but is not.  That is a configuration error, or perhaps a transient
> error, so it should not be "fixed" by ignoring the failure.  If ignoring
> nfs locks is what is wanted in all cases, then it should be configured
> by mounting the file system with -L (= nolockd).  Maybe the lock request
> should hang for transient failures. 
> 
> Support for correct configuration of this is still mostly nonexistent in
> /etc/defaults/rc.conf and rc.conf(5).  The default for nfs mounts is
> lockd, but the default for rpc_lockd_enable is "NO".  Setting
> rpc_lockd_enable to "YES" is not sufficient to configure this.  The
> setting of at least rpc_statd_enable must also be changed.
> 
> This stuff is misconfigured on all of the freebsd machines that I
> checked.  Some run 4.9, so nfs locking is not available.  beast and
> builder demonstrate the bug by giving empty core dumps.  bento avoids
> the bug by dumping cores in a non-nfs directory. 

Agreed.  The current condition of NFS locking is pretty pessimal: we still
have substantial bugs in the implementation of the lock manager, and
configuring locking correctly is difficult.  The default configuration is
particularly poor.  We should address most of these.  :-)
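
For reference, the client-side configuration Bruce describes would look
roughly like the rc.conf fragment below. This is a sketch, not a complete
recipe: it covers only the two knobs discussed in this thread, and other
services may also need to be enabled depending on the release.

```sh
# /etc/rc.conf (fragment): run the NFS lock manager and its status
# monitor so that locking requests on NFS mounts can succeed.
rpc_lockd_enable="YES"
rpc_statd_enable="YES"
```

The alternative mentioned above is to leave both daemons off and mount
the file system with -L (nolockd), so that lock requests are not sent to
the server at all.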

Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
robert at fledge.watson.org      Network Associates Laboratories
