GEOM probes fail on aac with EARLY_AP_STARTUP

sthaug at nethelp.no
Fri Sep 8 13:15:16 UTC 2017


> I've got a devel machine here which was failing to boot on our vendored
> FreeBSD 11.1, because GEOM was unable to find the partitions on the boot
> drive and so the root mount failed.  This started happening on many but
> not all boots after I upgraded the machine from 9.3.
> 
> The machine is an Intel S25520UR motherboard with 2x Xeon E5620 CPUs
> (Hyperthreading enabled, so hw.ncpu=16) and an Adaptec 5805, and 2 RAID
> volumes configured on 6 SATA drives.
> 
> When booting, it sees the aac0 controller and aacd0
> volume but GEOM does not find any of the partitions on that volume, and the
> initial mount of root on /dev/aacd0p2 fails.  aacd0 is available and
> readable, but the expected aacd0p{1,2,3} devices do not exist.
> (However, aacd1 and its partitions/devices are configured normally.)
> 
> I think it's a race condition between the aac driver and GEOM probing,
> probably newly triggered/exposed by EARLY_AP_STARTUP.  I've reproduced
> the problem on upstream FreeBSD 11.1 and -current.  Disabling
> EARLY_AP_STARTUP, or setting kern.smp.disabled=1, causes the kernel to
> start correctly. 'boot -v' also causes the kernel to start correctly.

Is there any reason to believe this is limited to aac? I'm asking
because your description is quite similar to boot problems I'm seeing
with 11.1-STABLE on a server with an mps (Avago) SCSI/SATA controller
and SATA disks. I'm getting the dreaded "mounting from ... failed
with error 19". 11.1-RELEASE seems to work okay but 11.1-STABLE does
not.

Steinar Haug, Nethelp consulting, sthaug at nethelp.no

