[Bug 267028] kernel panics when booting with both (zfs,ko or vboxnetflt,ko or acpi_wmi.ko) and amdgpu.ko


From: <bugzilla-noreply_at_freebsd.org>
Date: Sat, 21 Dec 2024 09:06:39 UTC

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=267028

satanist+freebsd@bureaucracy.de changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |satanist+freebsd@bureaucrac
                   |                            |y.de

--- Comment #259 from satanist+freebsd@bureaucracy.de ---
I think you are looking in the wrong direction. The question is where the NULL
pointer comes from.

So let's look at the
'found_modules->tqh_first->link.tqe_next->...->link.tqe_next' instance. This
list is managed only by sys/kern/kern_linker.c, and there is only one place
where an entry is inserted:

```
static modlist_t
modlist_newmodule(const char *modname, int version, linker_file_t container)
{
        modlist_t mod;

        mod = malloc(sizeof(struct modlist), M_LINKER, M_NOWAIT | M_ZERO);
        if (mod == NULL)
                panic("no memory for module list");
        mod->container = container;
        mod->name = modname;
        mod->version = version;
        TAILQ_INSERT_TAIL(&found_modules, mod, link);
        return (mod);
}
```

So I would guess the +7 comes from the TAILQ list and the fake NULL pointer
comes directly from malloc(9). So a build with MALLOC_DEBUG might help.
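
To make the pointer chain concrete, here is a minimal userland sketch of how
such a list is built and walked with the queue(3) macros. The struct and
function names (modlist_sketch, sketch_insert, sketch_count) are made up for
illustration and the struct is a simplified stand-in, not the kernel's
struct modlist.

```c
#include <stddef.h>
#include <sys/queue.h>

/* Simplified, hypothetical stand-in for the kernel's struct modlist. */
struct modlist_sketch {
        TAILQ_ENTRY(modlist_sketch) link;   /* holds tqe_next / tqe_prev */
        const char *name;
        int version;
};

TAILQ_HEAD(modlisthead_sketch, modlist_sketch);

/* Append one entry, as modlist_newmodule() does with found_modules. */
static void
sketch_insert(struct modlisthead_sketch *head, struct modlist_sketch *mod,
    const char *name, int version)
{
        mod->name = name;
        mod->version = version;
        TAILQ_INSERT_TAIL(head, mod, link);
}

/*
 * Follow head->tqh_first->link.tqe_next->... and count the entries.
 * Every step dereferences the previous entry's tqe_next, so a single
 * corrupted link is enough to fault somewhere further down the chain.
 */
static int
sketch_count(struct modlisthead_sketch *head)
{
        struct modlist_sketch *mod;
        int n = 0;

        TAILQ_FOREACH(mod, head, link)
                n++;
        return (n);
}
```

The walk only terminates cleanly if the last entry's tqe_next is really NULL,
so an almost-NULL value in one of the links would be followed like any other
pointer.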

I have also looked a bit for PHYS_TO_DMAP in sys/compat/linuxkpi and found
arch_io_reserve_memtype_wc(). This function is used in
drm-kmod/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c:

```
                int r = arch_io_reserve_memtype_wc(adev->gmc.aper_base,
                                adev->gmc.aper_size);

                if (r) {
                        DRM_ERROR("Unable to set WC memtype for the aperture base\n");
#ifdef __linux__
                        /*
                         * BSDFIXME: On recent AMD GPU requested area crosses
                         * DMAP boundries resulting in error. Ignore it for now
                         */
                        return r;
#endif
                }
```

This could also sneak in a fake NULL pointer and cause UB.
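
The control flow of that snippet on a non-Linux build can be sketched as
follows. This is only an illustration: reserve_wc_sketch() and
aperture_init_sketch() are hypothetical names, the error value is arbitrary,
and PROPAGATE_WC_ERROR is a made-up macro standing in for __linux__.

```c
#include <stdio.h>

/* Hypothetical stand-in for arch_io_reserve_memtype_wc(); always fails. */
static int
reserve_wc_sketch(unsigned long base, unsigned long size)
{
        (void)base;
        (void)size;
        return (-22);   /* arbitrary non-zero error code for the sketch */
}

/*
 * Mirrors the control flow of the quoted amdgpu_object.c snippet.
 * PROPAGATE_WC_ERROR is a hypothetical macro playing the role of __linux__.
 */
static int
aperture_init_sketch(unsigned long base, unsigned long size)
{
        int r = reserve_wc_sketch(base, size);

        if (r) {
                fprintf(stderr,
                    "Unable to set WC memtype for the aperture base\n");
#ifdef PROPAGATE_WC_ERROR
                return (r);
#endif
        }
        /* Without the macro, execution continues as if the call succeeded. */
        return (0);
}
```

With the macro undefined, the failure is logged but the caller never sees it,
so whatever state the failed call left behind is used as if it were valid.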
