svn commit: r240822 - head/sys/geom

Kenneth D. Merry ken at FreeBSD.org
Wed Sep 26 17:29:23 UTC 2012


On Wed, Sep 26, 2012 at 09:20:06 +0200, Pawel Jakub Dawidek wrote:
> On Tue, Sep 25, 2012 at 05:37:12PM -0600, Kenneth D. Merry wrote:
> > On Sat, Sep 22, 2012 at 12:41:49 +0000, Pawel Jakub Dawidek wrote:
> > > Log:
> > >   Use the topology lock to protect list of providers while withering them.
> > >   It is possible that provider is destroyed while we are iterating over the
> > >   list.
> [...]
> > > --- head/sys/geom/geom_disk.c	Sat Sep 22 12:40:52 2012	(r240821)
> > > +++ head/sys/geom/geom_disk.c	Sat Sep 22 12:41:49 2012	(r240822)
> > > @@ -635,10 +635,13 @@ disk_gone(struct disk *dp)
> > >  	struct g_geom *gp;
> > >  	struct g_provider *pp;
> > >  
> > > +	g_topology_lock();
> > >  	gp = dp->d_geom;
> > > -	if (gp != NULL)
> > > +	if (gp != NULL) {
> > >  		LIST_FOREACH(pp, &gp->provider, provider)
> > >  			g_wither_provider(pp, ENXIO);
> > > +	}
> > > +	g_topology_unlock();
> > >  }
> [...]
> > This breaks devices going away in CAM.
> > 
> > When the da(4) driver calls disk_gone(), it is necessarily holding the SIM
> > lock, which is a regular MTX_DEF mutex.  The GEOM topology lock is an sx
> > lock, and of WITNESS blows up because of that:
> [...]
> > disk_gone() needs to be callable from an interrupt context.  So it cannot
> > acquire the topology lock.
> 
> I can reword what you said above a bit: disk_gone() modifies GEOM
> topology, so it has to hold the topology lock, thus it cannot be called
> from an interrupt context.

Okay, but either way we need some way to inform GEOM that the device has
gone away.

> We might be able to change the topology lock to LIST_FOREACH_SAFE(), as
> g_wither_provider() can only destroy current provider. This is because
> CAM own the geom and nobody should be able to mess with its provider's
> list, apart from CAM itself. So I'd need to know how CAM ensures that
> two disk_gone() cannot be called for one geom. The answer might be that
> those geoms have always only one provider, but then I cannot explain why
> we have loop there. Maybe jdp@ will remember why.

Well, disk_gone() will only be called once for each da(4) instance that
goes away.  There is a check in cam_periph_invalidate()
(sys/cam_cam_periph.c) that insures that the peripheral driver's
invalidate routine only gets called once.

I don't know whether there is more than one provider or not.  It may be
that LIST_FOREACH_SAFE() is sufficient.  g_wither_provider() does set the
G_PF_WITHER flag, and that wouldn't be protected by anything other than the
CAM SIM lock in this case.

> I also read original commit message and I don't fully understand.
> 
> It was 7 years ago so I'll quote entire commit message:
> 
> 	Fix a bug that caused some /dev entries to continue to exist after
> 	the underlying drive had been hot-unplugged from the system.  Here
> 	is a specific example.  Filesystem code had opened /dev/da1s1e.
> 	Subsequently, the drive was hot-unplugged.  This (correctly) caused
> 	all of the associated /dev/da1* entries to be deleted.  When the
> 	filesystem later realized that the drive was gone it closed the
> 	device, reducing the write-access counts to 0 on the geom providers
> 	for da1s1e, da1s1, and da1.  This caused geom to re-taste the
> 	providers, resulting in the devices being created again.  When the
> 	drive was hot-plugged back in, it resulted in duplicate /dev entries
> 	for da1s1e, da1s1, and da1.
> 
> 	This fix adds a new disk_gone() function which is called by CAM when a
> 	drive goes away.  It orphans all of the providers associated with the
> 	drive, setting an error condition of ENXIO in each one.  In addition,
> 	we prevent a re-taste on last close for writing if an error condition
> 	has been set in the provider.
> 
> If disk is gone it should orphan its provider. This should cause
> orphaning depended providers, including da1s1 and then da1s1e.
> Orphaning sets provider's error and providers with the error set are not
> retasted. This is from g_access():
> 
> 		else if (pp->acw != 0 && pp->acw == -dcw && pp->error == 0 &&
> 		    !(pp->geom->flags & G_GEOM_WITHER))
> 			g_post_event(g_new_provider_event, pp, M_WAITOK,
> 			    pp, NULL);
> 
> My question is: is disk_geom() really needed? Maybe there is some small
> bug somewhere, but I think GEOM should handle all this without disk_gone().

CAM needs a three step process when a device goes away.  It is currently
represented with disk_gone(), the d_gone() callback inside the disk class,
and disk_destroy().

Here is what CAM needs at each step:

1.  When a device goes away, we need a method to call from daoninvalidate()
    (or any other peripheral driver invalidate routine) with these
    properties:
    - It tells GEOM that the device has gone away, and starts the process
      of shutting down the device.  (i.e. withers/orphans the provider)
    - It is callable from an interrupt context, with the SIM (MTX_DEF) lock
      held, so it can't sleep.

2.  When GEOM has finished cleaning up the device (initiated in step 1),
    it should call the d_gone() method (dadiskgonecb() in the da(4)
    driver) to tell the CAM peripheral driver that the following is true:
    - All I/O to the device has been completed.
    - No more I/O will be sent to the device (e.g. via the strategy routine)
    - No more open/close requests will be sent to the device.  (The final
      close should have been sent to the peripheral driver at this point.)

    CAM will release a reference on the device at that point, and in the
    typical case (where that is the last reference) go to step 3.

3.  When CAM has finished cleaning up all of its state, it will tell GEOM
    that it is okay to finalize the destruction of the device.  So it needs
    a method to call with these properties:

    - It is callable from an interrupt context, with the SIM (MTX_DEF) lock
      held.  So it can't sleep.
    - When it is called, the da(4) driver is indicating that it no longer
      holds any references to that particular GEOM disk instance.

So whether it is disk_gone() or something else, we do need a way to orphan
the provider.  Otherwise, how does GEOM know that the disk is gone?

Ken
-- 
Kenneth Merry
ken at FreeBSD.ORG


More information about the svn-src-head mailing list