After more than 59 hr 20 min of poudriere-based port building, the Rock64 (4 GiByte) got a data_abort with a panic message that mentioned "vm_fault failed" (cmd in dma_done was NULL, making cmd->data fail).
Mark Millard
marklmi at yahoo.com
Wed Nov 27 20:46:07 UTC 2019
On 2019-Nov-27, at 09:31, Mark Millard <marklmi at yahoo.com> wrote:
> The failure was while dwmmc_intr was active on the bus. It looks
> like the vm_fault failed address matches the elr value, which is
> near the lr value and near the "pc =" value listed for dwmmc_intr.
> (Back trace shown later.)
I should have mentioned that the system was running a non-debug
build (with symbols).
Looks like "cmd" was zero (NULL) in:
766 static int
767 dma_done(struct dwmmc_softc *sc, struct mmc_command *cmd)
768 {
769         struct mmc_data *data;
771         data = cmd->data;

   0xffff00000078e51c <+648>:   ldr     x8, [x23, #40]
for this use of dma_done in dwmmc_intr, shown below:
. . .
        cmd = sc->curcmd;
        . . .
        /* Ack interrupts */
        WRITE4(sc, SDMMC_RINTSTS, reg);
        if (sc->use_pio) {
                if (reg & (SDMMC_INTMASK_RXDR|SDMMC_INTMASK_DTO)) {
                        pio_read(sc, cmd);
                }
                if (reg & (SDMMC_INTMASK_TXDR|SDMMC_INTMASK_DTO)) {
                        pio_write(sc, cmd);
                }
        } else {
                /* Now handle DMA interrupts */
                reg = READ4(sc, SDMMC_IDSTS);
                if (reg) {
                        dprintf("dma intr 0x%08x\n", reg);
                        if (reg & (SDMMC_IDINTEN_TI | SDMMC_IDINTEN_RI)) {
                                WRITE4(sc, SDMMC_IDSTS, (SDMMC_IDINTEN_TI |
                                    SDMMC_IDINTEN_RI));
                                WRITE4(sc, SDMMC_IDSTS, SDMMC_IDINTEN_NI);
                                dma_done(sc, cmd);
                        }
                }
        }
        . . .
Unfortunately, I did not get a dump.
> This is a head -r355027 based context.
>
> This does not look easy to reproduce.
>
> I had poudriere running 4 jobs, each allowed to use 4 processes,
> so the bulk of the time the load average was between 8 and 17.
>
> The last top update (of my extended top) showed top never saw
> significant swap usage:
>
> Swap: 4608M Total, 22M Used, 4586M Free, 32M MaxObsUsed
>
> ("MaxObs" is short for "Maximum Observed".)
>
> It also showed (line wrapped by me):
>
> Mem: 196M Active, 1078M Inact, 4272K Laundry, 650M Wired, 264M Buf,
> 2035M Free, 2517M MaxObsActive, 805M MaxObsWired, 3219M MaxObs(Act+Wir)
>
> It showed as running:
>
> /usr/local/sbin/pkg-static create -r /wrkdirs/usr/ports/devel/llvm90/work/stage . . .
> (earlier llvm80 had completed fine)
>
> and 3 processes of the form:
>
> cpdup -i0 -x ref0?
>
> Those 3 seem to be for the 3 "Building"s listed below:
>
> [59:20:56] [02] [00:14:53] Finished devel/qt5-linguist | qt5-linguist-5.13.2: Success
> [59:20:57] [02] [00:00:00] Building deskutils/lumina-archiver | lumina-archiver-1.5.0
> [59:20:57] [03] [00:00:00] Building deskutils/lumina-calculator | lumina-calculator-1.5.0
> [59:20:57] [04] [00:00:00] Building x11/lumina-core | lumina-core-1.5.0
>
>
> The serial console's report was:
>
> Fatal data abort:
> x0: fffffd0000b45b00
> x1: ffff000040588000
> x2: 8c
> x3: 100
> x4: ffff00004035caa0
> x5: ffff00004035c7b0
> x6: 0
> x7: 1
> x8: ffff000000758ebc
> x9: ffff000000a33100
> x10: fffffd0000a28678
> x11: 0
> x12: 9633b10b
> x13: 2af8
> x14: 2777
> x15: 2af8
> x16: 38
> x17: 38
> x18: ffff00004035c870
> x19: fffffd0000a28600
> x20: 8c
> x21: fffffd0000b45e58
> x22: ffff000000a4b000
> x23: 0
> x24: fffffd0000b45e10
> x25: fffffd0000b89514
> x26: fffffd0000b8f180
> x27: fffffd0000b45e00
> x28: ffff000000a4bd98
> x29: ffff00004035c8b0
> sp: ffff00004035c870
> lr: ffff00000078e518
> elr: ffff00000078e51c
> spsr: 145
> far: 28
> esr: 96000005
> panic: vm_fault failed: ffff00000078e51c
> cpuid = 2
> time = 1574872496
> KDB: stack backtrace:
> db_trace_self() at db_trace_self_wrapper+0x28
> pc = 0xffff00000075ba9c lr = 0xffff0000001066a8
> sp = 0xffff00004035c270 fp = 0xffff00004035c480
>
> db_trace_self_wrapper() at vpanic+0x18c
> pc = 0xffff0000001066a8 lr = 0xffff00000041903c
> sp = 0xffff00004035c490 fp = 0xffff00004035c530
>
> vpanic() at panic+0x44
> pc = 0xffff00000041903c lr = 0xffff000000418eac
> sp = 0xffff00004035c540 fp = 0xffff00004035c5c0
>
> panic() at data_abort+0x1e0
> pc = 0xffff000000418eac lr = 0xffff000000777d94
> sp = 0xffff00004035c5d0 fp = 0xffff00004035c680
>
> data_abort() at do_el1h_sync+0x144
> pc = 0xffff000000777d94 lr = 0xffff000000776fb0
> sp = 0xffff00004035c690 fp = 0xffff00004035c6c0
>
> do_el1h_sync() at handle_el1h_sync+0x78
> pc = 0xffff000000776fb0 lr = 0xffff00000075e078
> sp = 0xffff00004035c6d0 fp = 0xffff00004035c7e0
>
> handle_el1h_sync() at dwmmc_intr+0x280
> pc = 0xffff00000075e078 lr = 0xffff00000078e514
> sp = 0xffff00004035c7f0 fp = 0xffff00004035c8b0
>
> dwmmc_intr() at ithread_loop+0x1f4
> pc = 0xffff00000078e514 lr = 0xffff0000003db604
> sp = 0xffff00004035c8c0 fp = 0xffff00004035c940
>
> ithread_loop() at fork_exit+0x90
> pc = 0xffff0000003db604 lr = 0xffff0000003d7be4
> sp = 0xffff00004035c950 fp = 0xffff00004035c980
>
> fork_exit() at fork_trampoline+0x10
> pc = 0xffff0000003d7be4 lr = 0xffff000000776cec
> sp = 0xffff00004035c990 fp = 0x0000000000000000
>
> KDB: enter: panic
> [ thread pid 12 tid 100038 ]
> Stopped at dwmmc_intr+0x288: ldr x8, [x23, #40]
> db>
===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)