svn commit: r304966 - head/sys/boot/i386/libi386

Peter Wemm peter at FreeBSD.org
Sun Aug 28 20:39:35 UTC 2016


Author: peter
Date: Sun Aug 28 20:39:33 2016
New Revision: 304966
URL: https://svnweb.freebsd.org/changeset/base/304966

Log:
  The read-ahead code from r298230 made it likely the boot code would read
  beyond the end of disk. r298900 added code to prevent this.  Some BIOSes
  cause significant delays if asked to read past end-of-disk.
  
  We never trusted the BIOS to accurately report the sectorsize of disks
  before this set of changes.  Unfortunately, the new checks interact badly
  with the infamous >2TB wraparound bugs.  We have a number of
  relatively-recent machines in the FreeBSD.org cluster where the BIOS
  reports 3TB disks as 1TB.
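
  As a rough illustration of the wraparound (a sketch, not code from the
  loader): assuming 512-byte sectors and a BIOS that keeps only the low
  32 bits of the sector count, a typical 3TB disk with 5860533168 sectors
  would come back as 1565565872 sectors, i.e. roughly 0.8TB.  The helper
  name reported_sectors() is hypothetical.

```c
#include <stdint.h>

/*
 * Hypothetical model of a BIOS that truncates the sector count to
 * 32 bits.  Real firmware may misreport sizes in other ways; this
 * only illustrates the 2^32-sector (2TB at 512 bytes/sector) wrap.
 */
static uint64_t
reported_sectors(uint64_t real_sectors)
{
    return (real_sectors & 0xffffffffULL);
}
```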
  
  With a pre-r298900 loader they work just fine.  After r298900 they stop
  working if the boot environment attempts to access anything outside the
  first 1TB on the disk, failing with errors such as 'ZFS: I/O error, all
  block copies unavailable'.  It affects both UFS and ZFS if they try to
  boot from large volumes.
  
  This change replaces the blind trust of the BIOS end-of-disk reporting
  with a read-ahead clip to prevent reads crossing the end-of-disk
  boundary.  Since 2^32 (2TB) size reporting truncation is not uncommon,
  the clipping is done on 2TB aliases of the reported end-of-disk,
  i.e. a 3TB disk reported as 1TB has read-ahead clipped at 1TB, 3TB, 5TB, ...
  as one of them is likely to be the real end-of-disk.
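
  The clipping rule can be sketched in isolation as follows.  This is a
  minimal stand-alone illustration, not the loader's code: clip_blks()
  and its parameters are hypothetical names, and the conversion to a
  signed 32-bit value is assumed to wrap in the usual two's complement
  way (implementation-defined in C, but what common compilers do).

```c
#include <stdint.h>

/*
 * Hypothetical stand-alone version of the read-ahead clip.
 * reported_sectors: end-of-disk as reported by the BIOS (may be truncated)
 * dblk:             first sector of the request
 * blks:             number of sectors requested, including read-ahead
 * Returns the possibly shortened sector count.
 */
static int
clip_blks(uint64_t reported_sectors, uint64_t dblk, int blks)
{
    /*
     * Keep only the low 32 bits of the distance to the reported end
     * and interpret them as signed.  Every 2^32-sector alias of the
     * reported end therefore yields the same small positive distance.
     */
    int32_t remaining = (int32_t)(uint32_t)(reported_sectors - dblk);

    /* Clip only when the request would cross an end-of-disk alias. */
    if (remaining > 0 && remaining < blks)
        blks = remaining;
    return (blks);
}
```

  With a reported end of 1565565872 sectors, a 64-sector request that
  starts 8 sectors before either that boundary or its next alias at
  5860533168 is shortened to 8 sectors.  A request that starts just past
  the reported end is left unchanged rather than rejected.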
  
  This should make the loader on these broken machines behave the same as
  traditional pre-r298900 loader behavior, without disabling read-ahead.
  
  PR:		212139
  Discussed with:	tsoome, allanjude

Modified:
  head/sys/boot/i386/libi386/biosdisk.c

Modified: head/sys/boot/i386/libi386/biosdisk.c
==============================================================================
--- head/sys/boot/i386/libi386/biosdisk.c	Sun Aug 28 19:48:08 2016	(r304965)
+++ head/sys/boot/i386/libi386/biosdisk.c	Sun Aug 28 20:39:33 2016	(r304966)
@@ -497,7 +497,7 @@ bd_realstrategy(void *devdata, int rw, d
     char *buf, size_t *rsize)
 {
     struct disk_devdesc *dev = (struct disk_devdesc *)devdata;
-    int			blks;
+    int			blks, remaining;
 #ifdef BD_SUPPORT_FRAGS /* XXX: sector size */
     char		fragbuf[BIOSDISK_SECSIZE];
     size_t		fragsize;
@@ -513,14 +513,15 @@ bd_realstrategy(void *devdata, int rw, d
     if (rsize)
 	*rsize = 0;
 
-    if (dblk >= BD(dev).bd_sectors) {
-	DEBUG("IO past disk end %llu", (unsigned long long)dblk);
-	return (EIO);
-    }
-
-    if (dblk + blks > BD(dev).bd_sectors) {
-	/* perform partial read */
-	blks = BD(dev).bd_sectors - dblk;
+    /*
+     * Perform partial read to prevent read-ahead crossing
+     * the end of disk - or any 32 bit aliases of the end.
+     * Signed arithmetic is used to handle wrap-around cases
+     * like we do for TCP sequence numbers.
+     */
+    remaining = (int)(BD(dev).bd_sectors - dblk);	/* truncate */
+    if (remaining > 0 && remaining < blks) {
+	blks = remaining;
 	size = blks * BD(dev).bd_sectorsize;
 	DEBUG("short read %d", blks);
     }

