Re: What kind of code might generate amd64 addresses like 0xFFFFF80000000007 or be based on 0xFFFFF80000000000 ?

From: Philipp <satanist+freebsd_at_bureaucracy.de>
Date: Mon, 16 Dec 2024 07:01:59 UTC
Hi Mark

[2024-12-15 16:03] Mark Millard <marklmi@yahoo.com>
> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=267028 is for a crash problem
> someone has been having for more than 2 years. There are boot time crashes
> involved.
>
> It appears that 0xFFFFF80000000007 is showing up in use and stored in data
> structures as a pointer value in fields/arguments that are pointers, where such
> a special value would not be expected. Later dereferencing does not go well, at
> least when the dereferenced data is then in turn put to use.
>
> The small offset from 0xFFFFF80000000000 suggests to me that the special value likely
> is inappropriately left around and somehow picked up and used. 0xFFFFF80000000000 (or
> near it) might be odd enough to have only a few known likely possible usages. Such
> notes in the bugzilla report would be good if such is the case. Thus my question.

With a simple grep through sys/ I found the following comment in sys/amd64/include/vmparam.h:

> /*
>  * Virtual addresses of things.  Derived from the page directory and
>  * page table indexes from pmap.h for precision.
> [...]
>  * 0xfffff80000000000 - 0xfffffbffffffffff   4TB direct map

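The direct map is 4TB of virtual address space that maps the physical
address space 1:1 (the virtual address minus the base gives the physical
address). So I would guess this is caused by a NULL pointer converted by
PHYS_TO_DMAP. By the same arithmetic, 0xFFFFF80000000007 would correspond
to physical address 7.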
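
To illustrate the arithmetic behind that guess, here is a minimal userland
sketch in C. It is not the kernel's code: the sketch_* names are
hypothetical, the bounds are copied from the vmparam.h comment above, and
I am assuming the conversion behaves like adding the direct map base to
the physical address.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bounds taken from the vmparam.h comment quoted above (4TB direct map). */
#define SKETCH_DMAP_MIN 0xfffff80000000000ULL
#define SKETCH_DMAP_MAX 0xfffffbffffffffffULL

/*
 * Hypothetical stand-in for PHYS_TO_DMAP. Assumption: the conversion is
 * equivalent to adding the direct map base to the physical address.
 */
static uint64_t
sketch_phys_to_dmap(uint64_t pa)
{
	return (SKETCH_DMAP_MIN + pa);
}

/* Inverse direction: recover the physical address from a direct map VA. */
static uint64_t
sketch_dmap_to_phys(uint64_t va)
{
	return (va - SKETCH_DMAP_MIN);
}

/* True if va lies inside the direct map range from the comment. */
static bool
sketch_in_dmap(uint64_t va)
{
	return (va >= SKETCH_DMAP_MIN && va <= SKETCH_DMAP_MAX);
}
```

With these helpers, sketch_phys_to_dmap(0) gives 0xfffff80000000000 and
sketch_phys_to_dmap(7) gives 0xfffff80000000007, the value from the bug
report. So a zero or near-zero "physical address" passed through such a
conversion would look like exactly this kind of pointer.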

Philipp

> The context has amdgpu raven support in use normally. Reportedly the problem has
> never been seen with that disabled. (However, I'm not aware of experiments with
> alternate card types, for example.)
>
> Where, when, and if a boot crash occurs is variable, not stable. But use of the
> list found_modules->tqh_first->. . . tends to be involved.
>
>
>
> Some other modern 13.4-RELEASE related context notes
> ( comments #231 and #233 ):
>
> The person with the problem reports . . .
>
> I am not using a stock distribution of the kernel:
>
> diff -u sys/amd64/conf/{GENERIC,M5P}
> --- sys/amd64/conf/GENERIC 2024-07-03 16:23:56.252550000 -0400
> +++ sys/amd64/conf/M5P 2024-07-03 16:25:05.287604000 -0400
> @@ -18,12 +18,13 @@
>  #
>  
>  cpu HAMMER
> -ident GENERIC
> +ident M5P
>  
>  makeoptions DEBUG=-g # Build kernel with gdb(1) debug symbols
>  makeoptions WITH_CTF=1 # Run ctfconvert(1) for DTrace support
>  
> -options SCHED_ULE # ULE scheduler
> +#options SCHED_ULE # ULE scheduler
> +options SCHED_4BSD # 4BSD scheduler
>  options NUMA # Non-Uniform Memory Architecture support
>  options PREEMPTION # Enable kernel thread preemption
>  options VIMAGE # Subsystem virtualization, e.g. VNET
>
>
> I also noted (for modern 13.4-RELEASE times):
>
> Also: the build is based on the -p2 source code (hash 3f40d5821):
>
> # strings boot/kernel/kernel | grep "\-RELEASE"
> @(#)FreeBSD 13.4-RELEASE-p2 3f40d5821 M5P
> FreeBSD 13.4-RELEASE-p2 3f40d5821 M5P
> 13.4-RELEASE-p2
>
> Because it is a rebuild, the kernel ends up with -p2 instead
> of the official -p1 ( from -p2 not updating boot/kernel/kernel
> in the official distributions ).
>
>
>
> ===
> Mark Millard
> marklmi at yahoo.com
>
>