BUG: possible NULL pointer dereference in nfs server
Rick Macklem
rmacklem at uoguelph.ca
Sun Jan 26 02:06:09 UTC 2014
Bruce Evans wrote:
> On Sat, 25 Jan 2014, Dimitry Andric wrote:
>
> > On 25 Jan 2014, at 01:38, Rick Macklem <rmacklem at uoguelph.ca>
> > wrote:
> > ...
> > If it inlines this, the result looks approximately like:
> >
> > 1 {
> > 2 fhandle_t *fhp = NULL;
> > 3 struct nfslockfile *new_lfp;
> > 4 int error;
> > 5
> > 6 if (new_stp->ls_flags & NFSLCK_OPEN) {
> > 7 new_lfp = *NULL;
> > 8 fhp = &new_lfp->lf_fh;
> > 9 } else if (&nfh) {
> > 10 fhp = &nfh;
> > 11 } else {
> > 12 panic("nfsrv_getlockfh");
> > 13 }
> > 14 error = nfsvno_getfh(vp, fhp, p);
> > 15 NFSEXITCODE(error);
> > 16 getlckret = error;
> > 17 }
> >
> > The code in line 7 is the problematic part. Since this is
> > undefined,
> > the compiler inserts a trap instruction here. I think the problem
> > Roman
> > encountered is that on sparc64, there is no equivalent to x86's ud2
> > instruction, so it inserts a call to abort() instead, and that
> > function
> > is not available in kernel-land.
>
> Compiler bug. abort() is not available in freestanding
> implementations.
> The behaviour is only undefined if the null pointer is dereferenced
> at
> runtime, so it doesn't include failing to link to abort() at compile
> time.
>
> > ...
> >> Sorry, I'm not a compiler guy, so I don't know why a compiler
> >> would
> >> generate a trap instruction, but since new_lfpp is never NULL when
> >> this is executed, I don't see a problem.
> >>
> >> If others feel that this needs to be re-coded, please let me know
> >> what
> >> you think the code should look like? (A test for non-NULL with a
> >> panic()
> >> before it is used?)
> >>
> >> Is a trap instruction that never gets executed a problem?
> >
> > It's better to avoid undefined behavior in any case. Just add a
> > NULL
> > check, that should be sufficient.
>
> That might only add bloat and unimprove debugging. Since the null
> pointer
> case cannot happen, it cannot be handed properly. It can be
> mishandled in
> the following ways:
> - return an error, so that all callers have to handle the null
> pointer case
> that can't happen. If the compiler is too smart, it will notice
> more
> undefined behaviour (that can't happen) in callers and "force" you
> to
> handle it there too
> - KASSERT() that the pointer cannot be null. Then:
> - on production systems where KASSERT() is null, this won't work
> around
> the compiler bug. Use a panic() instead. To maximize source
> code
> bloat, ifdef all of this.
> - when KASSERT() is not null, it will work around the compiler
> bug.
> If the case that can't happen actually happens, then this
> unimproves
> the debugging by messing up stack traces and turning restartable
> null pointer or SIGILL traps to non-restartable panics.
> Optimizations that replace a large block of code ending with a
> null pointer trap by a single unimplemented instruction would
> probably break restarting anyway.
>
> Bruce
>
So Roman, all I can suggest is to try adding something like:
if (new_lfpp == NULL)
panic("new_lfpp NULL");
after line#6. If that makes the compiler happy, I can commit it in
April. (Can't do commits before then.)
I agree with Bruce, but the check might be a good idea, in case a
future code change introduces a bug where the function is called with
new_lfpp NULL and NFSLCK_OPEN set.
If this doesn't make the compiler happy, all I can suggest is to
play around until you come up with something that works.
Have fun with it, rick
ps: I haven't seen this reported by tinderbox. Is the problem
specific to your setup?
More information about the freebsd-fs
mailing list