head -r331499 amd64/threadripper panic in vm_page_free_prep during "poudriere bulk -a", after 14h 22m or so.
Mark Millard
marklmi26-fbsd at yahoo.com
Sun Mar 25 20:15:21 UTC 2018
[Just an added note about where in the sequence panic
messages are sent to the console vs. could potentially
be sent to the console.]
> On 2018-Mar-25, at 12:32 PM, Mark Millard <marklmi26-fbsd at yahoo.com> wrote:
>
> On 2018-Mar-25, at 11:34 AM, Mark Johnston <markj at FreeBSD.org> wrote:
>
>> On Sun, Mar 25, 2018 at 10:41:38AM -0700, Mark Millard wrote:
>>> FreeBSD panic'd while attempting to see if a "poudriere bulk -w -a"
>>> would get the "unnecessary swapping" problem in my UFS-only context,
>>> -r331499 (non-debug but with symbols), under Hyper-V. This is a
>>> Ryzen Threadripper context, but I've no clue if that is important
>>> to the problem. This was after 14 hours or so of building:
>>>
>>> . . .
>>> [14:22:05] [18] [00:01:16] Finished devel/p5-Test-HTML-Tidy | p5-Test-HTML-Tidy-1.00_1: Success
>>> [14:22:08] [18] [00:00:00] Building devel/ocaml-camlp5 | ocaml-camlp5-6.16
>>>
>>> So I've no clue if or how to repeat this.
>>>
>>> Unfortunately dump was unsuccessful.
>>
>> What happened?
>
> It reported:
>
> (da1:storvsc1:0:0:0) WRITE(10). CDB 2a 00 35 24 37 c7 00 00 0 00
> (da1:storvsc1:0:0:0) CAM status Command timeout
> (da1:storvsc1:0:0:0) Error 5, Retries exhausted
> Aborting dump due to I/O error.
>
> ** DUMP FAILED (ERROR 5) **
> = 0x5
>
>>> So all I have is the
>>> backtrace. Hand typed from a screen shot of the console
>>> window:
>>
>> Do you know what the panic message was? There are multiple calls to
>> panic() in vm_page_free_prep().
>
> No. I listed what I could see. The console screen does not have many
> lines or rows and I was sleeping when the panic happened.
I sometimes wonder if panic should repeat the panic message at the
end of the backtrace in order to deal with keeping it visible in
row-restricted console contexts.
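As a rough illustration of that idea only (this is a user-space sketch,
not how the kernel's panic() path is implemented), the report could be
laid out so the message appears both before and after the frames. The
function name format_panic_report, the "panic (repeated):" prefix, and
the frame strings are all hypothetical:

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Append formatted text to out; returns 0 on success, -1 if it would not fit. */
static int append(char *out, size_t outsz, size_t *used, const char *fmt, ...)
{
    va_list ap;
    int n;

    va_start(ap, fmt);
    n = vsnprintf(out + *used, outsz - *used, fmt, ap);
    va_end(ap);
    if (n < 0 || (size_t)n >= outsz - *used)
        return -1;
    *used += (size_t)n;
    return 0;
}

/*
 * Hypothetical layout: message, then the backtrace frames, then the
 * message again so it is still on screen when the frames have scrolled
 * the first copy off a short console.
 * Returns the number of bytes written, or -1 if out is too small.
 */
static int format_panic_report(char *out, size_t outsz, const char *msg,
                               const char *const frames[], size_t nframes)
{
    size_t used = 0;
    size_t i;

    if (append(out, outsz, &used, "panic: %s\n", msg) != 0)
        return -1;
    for (i = 0; i < nframes; i++) {
        if (append(out, outsz, &used, "#%zu %s\n", i, frames[i]) != 0)
            return -1;
    }
    if (append(out, outsz, &used, "panic (repeated): %s\n", msg) != 0)
        return -1;
    return (int)used;
}
```

With a layout like this, the last line left on a row-restricted console
would be the message itself rather than the tail of the backtrace.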
> I redid a buildworld buildkernel installkernel installworld sequence
> since then and it looks like the detailed addresses changed (as seen
> in objdump now vs. what was on the console). But the relative offset
> in vm_page_free_prep seems to be a match, at least for the instruction
> after the "callq panic".
>
> Looking at the kernel code I see:
>
> . . .
> <vm_page_free_prep+0x10> mov 0xffffffff81843690,%rax
> <vm_page_free_prep+0x18> mov $0xffffffff81d6d880,%rcx
> <vm_page_free_prep+0x1f> sub %rcx,%rax
> <vm_page_free_prep+0x22> addq $0x1,%gs:(%rax)
> <vm_page_free_prep+0x27> mov 0x54(%rbx),%eax
> <vm_page_free_prep+0x2f> and $0x1,%eax
> <vm_page_free_prep+0x32> jne <vm_page_free_prep+0x15a>
> . . .
> (several paths reach +0x106)
> <vm_page_free_prep+0x106> movw $0x0,0x64(%rbx)
> <vm_page_free_prep+0x10c> cmpl $0x0,0x50(%rbx)
> <vm_page_free_prep+0x110> jne <vm_page_free_prep+0x163>
> . . .
> <vm_page_free_prep+0x15a> mov $0xffffffff8116628b,%rdi
> <vm_page_free_prep+0x161> jmp <vm_page_free_prep+0x16a>
> <vm_page_free_prep+0x163> mov $0xffffffff8120ca97,%rdi
> <vm_page_free_prep+0x16a> xor %eax,%eax
> <vm_page_free_prep+0x16c> mov %rbx,%rsi
> <vm_page_free_prep+0x16f> callq <panic>
> <vm_page_free_prep+0x174> nopw %cs:0x0(%rax,%rax,1)
>
> No KASSERTS present (a non-debug build). That leaves:
>
>     if (vm_page_sbusied(m))
>         panic("vm_page_free: freeing busy page %p", m);
>
> and:
>
>     if (m->wire_count != 0)
>         panic("vm_page_free: freeing wired page %p", m);
>
> I do not have anything that lets me differentiate which
> occurred based on the above detail. Sorry.
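For reference, the two surviving checks can be lined up against the
quoted disassembly with a small stand-alone sketch. It assumes that the
field tested at offset 0x54 with "and $0x1" is the busy-lock word (the
vm_page_sbusied() test) and that the field compared at offset 0x50 is
wire_count; the struct, enum, and function names are hypothetical and
only the two offsets come from the listing. Both paths end at the same
"callq panic", so this does not show which of the two actually fired:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the page structure; only the two offsets
 * seen in the listing are modeled, everything else is padding. */
struct page_sketch {
    uint8_t  pad[0x50];
    uint32_t wire_count;   /* cmpl $0x0,0x50(%rbx)                 */
    uint32_t busy_lock;    /* mov 0x54(%rbx),%eax ; and $0x1,%eax  */
};

static_assert(offsetof(struct page_sketch, wire_count) == 0x50,
              "wire_count offset should match the listing");
static_assert(offsetof(struct page_sketch, busy_lock) == 0x54,
              "busy_lock offset should match the listing");

enum free_prep_outcome { PREP_OK, PANIC_BUSY, PANIC_WIRED };

/* Mirrors the order of the checks in the listing: the low-bit test at
 * +0x2f/+0x32 comes first (branch to +0x15a), the wire_count compare at
 * +0x10c/+0x110 comes later (branch to +0x163). */
static enum free_prep_outcome classify_free_prep(const struct page_sketch *m)
{
    if ((m->busy_lock & 0x1) != 0)
        return PANIC_BUSY;    /* "freeing busy page" path  */
    if (m->wire_count != 0)
        return PANIC_WIRED;   /* "freeing wired page" path */
    return PREP_OK;
}
```

The two paths load different format-string addresses into %rdi before
the shared call, so the printed message would be the distinguishing
detail; the return address in the backtrace is the same for both.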
===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)