Re: Review D38047 ... and then there was one....

From: Mark Johnston <markj_at_freebsd.org>
Date: Sat, 12 Oct 2024 21:09:07 UTC
On Sat, Oct 12, 2024 at 01:22:57PM -0400, David E. Cross wrote:
> 
> On 10/12/24 11:46, Mark Johnston wrote:
> > And if all of the commits are pushed together, which they would be, then
> > users will only see the end result anyway, so there is nothing
> > inherently dangerous about splitting the change into easier-to-review
> > commits.
> > 
> > Even if the whole patch is applied atomically, it affects multiple
> > binaries in the system (at least libc.so and nscd), and there's nothing
> > guaranteeing that the update is somehow applied atomically across the
> > whole system.  What if some applications using an older libc end up
> > still running with an updated nscd?  That's an untested configuration,
> > but it might still arise if the whole patch is committed as one unit.
> > 
> > I would split it up like this:
> > 1. Rename the cycle prevent symbol.  This is a cosmetic change which is
> >     irrelevant to the correctness of the rest of the patch.
> > 2. In nscd, fix the problem with negative lookups.  I can't easily see
> >     if anything other than the query.c change is required there, which is
> >     part of the reason I asked for the patch to be split up.
> > 3. Use thread-safe lookups in nscd's passwd.c and group.c.  For the
> >     latter, a bit of work is needed to tease out the bits that add
> >     getgrouplist() handling.  I believe that consists of
> >     wrap_getgrouplist() and the handling of nss_lt_pivot in
> >     group_lookup_func().
> > 4. Add getgrouplist() support to nscd.  At this point, libc doesn't
> >     dispatch getgrouplist() to nscd, so this change has no effect.
> > 5. Teach libc to marshall and unmarshall getgrouplist() requests.  This
> >     could reasonably be combined with 4 since the logic is coupled.
> > 6. Fix nscd to export the cycle prevention symbol.
> > 
> > Just the exercise of reading through the patch again with an eye to
> > splitting it up helped me understand it better and find a couple of
> > buglets.  It can be split up mechanically using git add -p for the most
> > part (aside from the fact that parts 3 and 4 touch nscd's group.c).
> > 
> >  From my point of view, your patch hasn't gotten a lot of attention
> > because
> > 1. It's big and affects multiple binaries.
> > 2. There are few committers familiar with our NSS implementation.
> > 
> > The point of splitting it is to help with 1.  It requires more work from
> > developers, but then reviewers are much more likely to read and actually
> > understand it, which helps with 2 in the long run.
> > 
> Ok, if this is just part of the review, and it all goes in at once my
> concerns are ameliorated.  I was afraid this was going to get piecemeal
> applied and then the bulk of the work, and the only real reason I did this
> was going to sit on a shelf; that is not terribly motivating to do a bunch
> of *additional* work

I will be sure to push all of the commits at once. :)

> To address your points individually.
> 
> 1. yes, and I see you correctly place the final step at 6 for this.
> 
> 2. Of note on this fix, this is actually only required once the symbol is
> exported, it has no real effect prior to this, (you can see above at line
> 755 this is gated by the perform actual lookup check (which won't even work
> now).  So that is to say this doesn't break things any more than they are
> already broken (ie, not high priority)
> 
> 3. This I do NOT believe is required.  I spent a lot of time going over the
> threading here; the various getpw and getgr entries already use thread local
> storage, so within a thread you are safe as long as you preserve values
> across calls as needed (and each thread within nscd always works to
> completion, even the lt_all ones).  I remember starting a bigger refactor to
> address that (since it was implemented inconsistently across different
> types). But I decided I wanted no part of that change.   Or are you
> referring to my code here; upon re-reading I am not sure if you are talking
> about pulling my agent code here, or if you think more thread hardening is
> required.

I'm referring to your changes to passwd.c and group.c (excluding
getgrouplist() bits) in nscd.  It's been a while since I looked at how
the nsswitch code manages thread-local storage, so I don't have an
opinion yet on whether the change is actually required.  If you don't
think it is, then it'd be easier to drop that change for now.

> 4/5: agreed
> 
> 6: agreed as above.
> 
> If you could point me at the buglets I will work on addressing them (or
> discussing them, since I believe I am thread safe here), and I will also
> split this up.

I reported them in some comments on https://reviews.freebsd.org/D38047

> Additionally I plan to tackle the other bug raised in -hackers about
> 'getent' ALWAYS hitting ALL databses.. but I need to wrap my head around
> that more; it has been multiple years since I have been in this code and
> dusting off the cobwebs is taking time (another reason I wasn't big on
> splitting this up, I hadn't touched it in forever).
> 
> 
> How do you recommend I post this, are these individual PRs on
> reviews.freebsd.org; do I post the diff differently?

Individual PRs on reviews.freebsd.org is fine.  We also use github pull
requests these days, which may be easier for you.  Either is ok with me.

To submit a series of git commits through phabricator, you can try
installing the freebsd-git-devtools package.  With that, given a range
of commits <start>..<end>, you can upload them to phabricator with a
single command:

$ git arc -r markj <start>~..<end>

(This will add me as a reviewer.)  "git arc --help" will give more
complete usage instructions.  It's not necessary to use this tool to
upload patches, I just find it easier.