crash of 32-bit powerpc -r347549 kernel built via system-clang-8 (crash is while trying to mount the root file system)

Mark Millard marklmi at yahoo.com
Wed Jun 5 08:35:42 UTC 2019


On 2019-Jun-3, at 19:40, Mark Millard <marklmi at yahoo.com> wrote:

> On 2019-Jun-3, at 17:24, Mark Millard <marklmi at yahoo.com> wrote:
> 
>> I tried (cross) building a 32-bit powerpc kernel and world (non-debug) 
>> with system-clang (on amd64) and use of devel/powerpc64-binutils . The
>> installed kernel panics trying to mount the root file system.
>> 
>> FYI: Typed from picture of screen . . .
>> 
>> Trying to mount root from ufs:/dev/ufs/FBSDG4Srootfs [rw,noatime]...
>> panic: getnewbuf_empty: Locked buf 0xd2800000 on free queue.
>> . . .
>> 0xd6919080: at kdb_backtrace+0x64
>> 0xd69190e0: at vpanic+0x200
>> 0xd6919150: at panic+0x50
>> 0xd6919190: at getnewbuf+0x594
>> 0xd69191f0: at getblkx+0x540
>> 0xd69192a0: at breadn_flags+0x90
>> 0xd69192f0: at ffs_use_bread+0x9c
>> 0xd6919330: at readsuper+0x68
>> 0xd6919370: at ffs_sbget+0xcc
>> 0xd69193c0: at ffs_mount+0x18b8
>> 0xd69194f0: at vfs_domount+0xa74
>> 0xd69196a0: at vfs_donmount+0x944
>> 0xd6919700: at kernel_mount+0x64
>> 0xd6919740: at parse_mount+0x52c
>> 0xd6919840: at vfs_mountroot+0x71c
>> 0xd69199b0: at start_init+0x44
>> 0xd6919a10: at fork_exit+0xcc
>> 0xd6919a40: at fork_trampoline+0xc
>> KDB: enter panic
>> [ thread pid 1 tid 100002 ]
>> Stopped at kdb_enter+0x74: addi r3,r0,0x0
>> 
>> This reproduces with each boot attempt.
>> 
>> Replacing the kernel with one built via gcc 4.2.1 and booting
>> the result does not panic.
>> 
>> 
>> FYI for the context of the panic call:
>> 
>> /usr/src/sys/kern/vfs_bio.c :
>> 
>> static struct buf *
>> buf_alloc(struct bufdomain *bd)
>> {
>>       struct buf *bp;
>>       int freebufs;
>> 
>>       /*
>>        * We can only run out of bufs in the buf zone if the average buf
>>        * is less than BKVASIZE.  In this case the actual wait/block will
>>        * come from buf_reycle() failing to flush one of these small bufs.
>>        */
>>       bp = NULL;
>>       freebufs = atomic_fetchadd_int(&bd->bd_freebuffers, -1);
>>       if (freebufs > 0)
>>               bp = uma_zalloc(buf_zone, M_NOWAIT);
>>       if (bp == NULL) {
>>               atomic_add_int(&bd->bd_freebuffers, 1);
>>               bufspace_daemon_wakeup(bd);
>>               counter_u64_add(numbufallocfails, 1);
>>               return (NULL);
>>       }
>>       /*
>>        * Wake-up the bufspace daemon on transition below threshold.
>>        */
>>       if (freebufs == bd->bd_lofreebuffers)
>>               bufspace_daemon_wakeup(bd);
>> 
>>       if (BUF_LOCK(bp, LK_EXCLUSIVE | LK_NOWAIT, NULL) != 0)
>>               panic("getnewbuf_empty: Locked buf %p on free queue.", bp);
> 
> 
> I tried making a debug kernel build via system-clang-8. It
> reports differently but still during getnewbuf being active
> on the stack (again typed from a picture):
> 
> Trying to mount root from ufs:/dev/ufs/FBSDG4Srootfs [rw,noatime]...
> . . . (ignore witness/diagnostic warnings) . . .
> panic: bq_remove: Locked buf 0xd2a00000 not on a queue.
> . . .
> 0xd6b7bfd0: at kdb_backtrace+0x64
> 0xd6b7c030: at vpanic+0x200
> 0xd6b7c0a0: at panic+0x50
> 0xd6b7c0e0: at bq_remove+0x1e0
> 0xd6b7c100: at buf_import+0x8c
> 0xd6b7c130: at uma_zalloc_arg+0x544
> 0xd6b7c190: at getnewbuf+0x380
> 0xd6b7c1f0: at getblkx+0x620
> 0xd6b7c290: at breadn_flags+0x90
> 0xd6b7c2e0: at ffs_use_bread+0xa8
> 0xd6b7c320: at readsuper+0x68
> 0xd6b7c360: at ffs_sbget+0xcc
> 0xd6b7c3b0: at ffs_mount+0xefc
> 0xd6b7c4e0: at vfs_domount+0xa754
> 0xd6b7c690: at vfs_donmount+0x78c
> 0xd6b7c6f0: at kernel_mount+0x7c
> 0xd6b7c730: at parse_mount+0x52c
> 0xd6b7c830: at vfs_mountroot+0x660
> 0xd6b7c9a0: at start_init+0x4c
> 0xd6b7ca10: at fork_exit+0xb0
> 0xd6b7ca40: at fork_trampoline+0xc
> 
> /usr/src/sys/kern/vfs_bio.c :
> 
> static void
> bq_remove(struct bufqueue *bq, struct buf *bp)
> {
> 
>        CTR3(KTR_BUF, "bq_remove(%p) vp %p flags %X",
>            bp, bp->b_vp, bp->b_flags);
>        KASSERT(bp->b_qindex != QUEUE_NONE,
>            ("bq_remove: buffer %p not on a queue.", bp));
> . . .
> 
> For reference:
> 
> static int
> buf_import(void *arg, void **store, int cnt, int domain, int flags)
> {
>        struct buf *bp;
>        int i;
> 
>        BQ_LOCK(&bqempty);
>        for (i = 0; i < cnt; i++) {
>                bp = TAILQ_FIRST(&bqempty.bq_queue);
>                if (bp == NULL)
>                        break;
>                bq_remove(&bqempty, bp);
>                store[i] = bp;
>        }
>        BQ_UNLOCK(&bqempty);
> 
>        return (i);
> }
> 
> 

I tried building the debug kernel with KTR enabled for KTR_BUF.
Installing and booting the result did not panic. Manually
breaking to the ddb> prompt soon enough and running "show ktr"
did show a bq_remove for 0xd2a00000 (and later activity).
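
For reference, a kernel configuration fragment along the following
lines enables that kind of tracing. This is only a sketch based on
ktr(4), not necessarily the exact options used for the build above;
the KTR_ENTRIES value in particular is an assumed example.

```
# Sketch only: assumed options, per ktr(4).
options 	KTR                     # enable kernel tracing
options 	KTR_COMPILE=(KTR_BUF)   # compile in the KTR_BUF trace points
options 	KTR_MASK=(KTR_BUF)      # record KTR_BUF events from boot
options 	KTR_ENTRIES=8192        # trace buffer size (assumed value, power of 2)
```

With such a kernel, "show ktr" at the ddb> prompt lists the recorded
entries.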

From the looks of the KTR_BUF CTRn records, this suggests to me
that the access to bp->b_qindex in bq_remove is racy in
some way vs. updates to the value.
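
For illustration only, here is a minimal userland sketch of the
invariant that bq_remove checks. It assumes hypothetical model_*
names and a simplified singly linked list in place of the kernel's
TAILQ and locking; none of it is the kernel's code. It does not
reproduce a race. It only shows how a b_qindex value that disagrees
with actual queue membership makes an import loop like buf_import's
hit the check.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's queue indexes. */
enum { QUEUE_NONE = 0, QUEUE_EMPTY = 1 };

/* Simplified model of a buffer: only the fields the check needs. */
struct model_buf {
	int b_qindex;            /* queue the buffer claims to be on */
	struct model_buf *next;  /* simplified link (kernel uses TAILQ) */
};

struct model_queue {
	struct model_buf *head;
	int index;               /* value stored into b_qindex on insert */
};

/* Link bp at the head of q and record the membership in b_qindex. */
static void
model_insert(struct model_queue *q, struct model_buf *bp)
{
	assert(q != NULL && bp != NULL);
	bp->next = q->head;
	q->head = bp;
	bp->b_qindex = q->index;
}

/*
 * Unlink bp from q.  Returns -1 where the kernel's KASSERT would
 * fire (b_qindex says "not on a queue"), 0 otherwise.
 */
static int
model_remove(struct model_queue *q, struct model_buf *bp)
{
	struct model_buf **pp;

	if (bp->b_qindex == QUEUE_NONE)
		return (-1);
	for (pp = &q->head; *pp != NULL; pp = &(*pp)->next) {
		if (*pp == bp) {
			*pp = bp->next;
			bp->next = NULL;
			break;
		}
	}
	bp->b_qindex = QUEUE_NONE;
	return (0);
}

/*
 * Same loop shape as buf_import: take the first buffer, remove it,
 * store it.  Returns the number stored, or -1 if the check failed.
 */
static int
model_import(struct model_queue *q, struct model_buf **store, int cnt)
{
	struct model_buf *bp;
	int i;

	for (i = 0; i < cnt; i++) {
		bp = q->head;
		if (bp == NULL)
			break;
		if (model_remove(q, bp) != 0)
			return (-1);
		store[i] = bp;
	}
	return (i);
}
```

In this model, inserting a buffer and then storing QUEUE_NONE into
its b_qindex without unlinking it (standing in for an update that
the reader does not see consistently) makes model_import return -1,
which is the analog of the bq_remove panic.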

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)


