svn commit: r310013 - head/sys/dev/xen/blkfront
Roger Pau Monné
royger at freebsd.org
Wed Dec 14 16:47:14 UTC 2016
On Tue, Dec 13, 2016 at 09:28:21PM +0100, Dimitry Andric wrote:
> On 13 Dec 2016, at 07:54, Colin Percival <cperciva at FreeBSD.org> wrote:
> >
> > Author: cperciva
> > Date: Tue Dec 13 06:54:13 2016
> > New Revision: 310013
> > URL: https://svnweb.freebsd.org/changeset/base/310013
> >
> > Log:
> > Check that blkfront devices have a non-zero number of sectors and a
> > non-zero sector size. Such a device would be a virtual disk of zero
> > bytes; clearly not useful, and not something we should try to attach.
> >
> > As a fortuitous side effect, checking that these values are non-zero
> > here results in them not *becoming* zero later on the function. This
> > odd behaviour began with r309124 (clang 3.9.0) but is challenging to
> > debug; making any changes to this function whatsoever seems to affect
> > the llvm optimizer behaviour enough to make the unexpected zeroing of
> > the sector_size variable cease.
>
> I've been having some fun debugging a kernel under Xen, and what I found
> is that sector_size changes from 512 to 0 because of an xs_gather() call
> in xbd_connect(). The function starts as follows:
>
> 1222 static void
> 1223 xbd_connect(struct xbd_softc *sc)
> 1224 {
> 1225 device_t dev = sc->xbd_dev;
> 1226 unsigned long sectors, sector_size, phys_sector_size;
> 1227 unsigned int binfo;
> 1228 int err, feature_barrier, feature_flush;
> ...
> 1237 err = xs_gather(XST_NIL, xenbus_get_otherend_path(dev),
> 1238 "sectors", "%lu", §ors,
> 1239 "info", "%u", &binfo,
> 1240 "sector-size", "%lu", §or_size,
> 1241 NULL);
>
> After this first call to xs_gather(), sectors is 44040322 (in my case of
> a ~21G disk), binfo is zero, and sector_size is 512. During execution
> of the following few lines, sector_size stays 512, until just after the
> fourth call to xs_gather():
>
> 1259 err = xs_gather(XST_NIL, xenbus_get_otherend_path(dev),
> 1260 "feature-flush-cache", "%lu", &feature_flush,
> 1261 NULL);
>
> and then it becomes zero! This is because the %lu format scans a 64 bit
> unsigned integer, while the feature_flush variable is a 32 bit signed
> integer, thus overwriting an adjacent location on the stack, which
> happens to contain sector_size: in the assembly output, sector_size is
> at -88(%rbp), while feature_flush is at -92(%rbp).
>
> There is another instance of such an incorrect format, a few lines
> before this, but it doesn't seem to scribble over important data (or we
> didn't panic because of it, yet):
>
> 1253 err = xs_gather(XST_NIL, xenbus_get_otherend_path(dev),
> 1254 "feature-barrier", "%lu", &feature_barrier,
> 1255 NULL);
>
> Here, feature_barrier is a 32 bit signed integer, while %lu again scans
> a 64 bit unsigned integer.
>
> In short, I think the following change would also fix the problem:
>
> Index: sys/dev/xen/blkfront/blkfront.c
> ===================================================================
> --- sys/dev/xen/blkfront/blkfront.c (revision 309817)
> +++ sys/dev/xen/blkfront/blkfront.c (working copy)
> @@ -1251,13 +1251,13 @@ xbd_connect(struct xbd_softc *sc)
> if (err || phys_sector_size <= sector_size)
> phys_sector_size = 0;
> err = xs_gather(XST_NIL, xenbus_get_otherend_path(dev),
> - "feature-barrier", "%lu", &feature_barrier,
> + "feature-barrier", "%d", &feature_barrier,
> NULL);
> if (err == 0 && feature_barrier != 0)
> sc->xbd_flags |= XBDF_BARRIER;
>
> err = xs_gather(XST_NIL, xenbus_get_otherend_path(dev),
> - "feature-flush-cache", "%lu", &feature_flush,
> + "feature-flush-cache", "%d", &feature_flush,
> NULL);
> if (err == 0 && feature_flush != 0)
> sc->xbd_flags |= XBDF_FLUSH;
>
> However, I have some difficulty getting a custom-built kernel booting on
> my very rudimentary Xen setup. I am using a snapshot raw image, with no
> idea how to insert a kernel into it.
>
> E.g, can you please try this out, instead of the zero-check fix? I am
> very curious whether that fixes the panics. :)
Yes, this indeed fixes the panic, I've reverted Colin's patch and applied yours,
and everything seems fine. Please go ahead and commit it.
xs_gather should really go away and we should use something that we can sanitize
at compile time using __scanflike & friends.
Thanks, Roger.
More information about the svn-src-all
mailing list