drm-kmod-20220907_2 not supported for this configuration / NVIDIA : Failed to initialize the NVIDIA kernel module

From: Mario Marietto <marietto2008_at_gmail.com>
Date: Sat, 24 Feb 2024 14:56:28 UTC
Hello everyone.

I need to run some tests on FreeBSD 13.1, which I'm going to explain
below:

I have lost the ability to pass an NVIDIA GPU through from FreeBSD 14.0 to
any Linux VM. The same procedure that worked until "yesterday" no longer
works (for me).

Corvin (a competent bhyve developer) no longer replies to my messages.
I would like to be sure that the code really is bugged, as it seems, and
that I'm not making some mistake.

So, I will explain what I do to enable this functionality. I hope that you
will want to try it too, or that he can suggest a different procedure that
works. The most important thing is that we manage to enable the feature.

Some time ago Corvin gave me 3 scripts to run in sequence. They are the
following :

a) setup_git_140.sh


    git clone https://github.com/beckhoff/freebsd-src /usr/corvin-src-140


b) build_branch_140.sh


    #!/bin/sh

    usage() {
        cat >&2 << EOF
    Usage: ${0} <branch> [<build_options>]
        Checks out <branch> and builds it with <build_options>
        (see build_140.sh for more information).
    EOF
        exit 1
    }

    set -e
    set -u

    readonly script_path="$(cd "$(dirname "${0}")" && pwd)"
    readonly branch="${1?Missing <branch>$(usage)}"
    shift
    echo "${branch}"

    cd /usr/corvin-src-140
    git fetch --all --prune
    git checkout -f "${branch}"

    "${script_path}/build_140.sh" "$@"


c) build_140.sh


    #!/bin/sh

    usage() {
        cat >&2 << EOF
    Usage: ${0} [--clean] [--reboot] [--src-dir=<dir>] [--verbose]
                [--without-bhf] [--without-kernel]
        Builds bhyve
    EOF
        exit 1
    }

    build_module() {
        local _path
        _path="${1}"

        # change to module path
        cd "${_path}"

        # clean module
        if test "${clean}" = "true"; then
            make clean > "${cmd_redirect}" 2>&1
        fi

        # build module
        make > "${cmd_redirect}" 2>&1

        # install module
        make install > "${cmd_redirect}" 2>&1
    }

    build() {
        build_module "${src_dir}/include"
        build_module "${src_dir}/lib/libvmmapi"
        build_module "${src_dir}/sys/modules/vmm"

        # build kernel
        if test "${with_kernel}" = "true"; then
            cd "${src_dir}"
            local kern_opts
            kern_opts="-j$(sysctl -n hw.ncpu)"
            if test "${with_bhf}" = "true"; then
                kern_opts="${kern_opts} KERNCONF=BHF"
            fi
            if ! test "${clean}" = "true"; then
                kern_opts="${kern_opts} NO_CLEAN=YES"
            fi
            make kernel ${kern_opts} > "${cmd_redirect}" 2>&1
        fi

        build_module "${src_dir}/usr.sbin/bhyve"
        build_module "${src_dir}/usr.sbin/bhyvectl"
        build_module "${src_dir}/usr.sbin/bhyveload"

        if test "${with_reboot}" = "true"; then
            reboot
        fi
    }

    set -e
    set -u

    while test $# -gt 0; do
        case "${1-}" in
            --clean)
                clean="true"
                shift
                ;;
            --reboot)
                with_reboot="true"
                shift
                ;;
            --src-dir=*)
                src_dir="${1#*=}"
                shift
                ;;
            --verbose)
                cmd_redirect="/dev/stdout"
                shift
                ;;
            --without-bhf)
                with_bhf="false"
                shift
                ;;
            --without-kernel)
                with_kernel="false"
                shift
                ;;
            *)
                usage
                ;;
        esac
    done

    readonly clean="${clean-"false"}"
    readonly cmd_redirect="${cmd_redirect-"/dev/null"}"
    readonly src_dir="${src_dir-"/usr/corvin-src-140"}"
    echo "${src_dir}"
    readonly with_bhf="${with_bhf-"true"}"
    readonly with_kernel="${with_kernel-"true"}"
    readonly with_reboot="${with_reboot-"false"}"

    build


Here we go. This is what I do to start the build, which should produce
the working bhyve binaries that allow passthrough of an NVIDIA GPU on
FreeBSD 14.0:


a) ./setup_git_140.sh

b) ./build_branch_140.sh origin/phab/corvink/14.0/nvidia-wip --without-bhf --verbose


OK. The code compiled without errors up to a certain point, and then the
build failed as you can see below. I want to understand whether the code
is bugged. Please help me:


/usr/corvin-src-140/usr.sbin/bhyve/pci_passthru.c:1174:21: error: use of undeclared identifier 'ctx'
                passthru_cfgwrite(ctx, vcpu, pi, offset - 0x88000, size, value);
                                  ^
/usr/corvin-src-140/usr.sbin/bhyve/pci_passthru.c:1174:26: error: use of undeclared identifier 'vcpu'
                passthru_cfgwrite(ctx, vcpu, pi, offset - 0x88000, size, value);
                                       ^
/usr/corvin-src-140/usr.sbin/bhyve/pci_passthru.c:1209:20: error: use of undeclared identifier 'ctx'
                passthru_cfgread(ctx, vcpu, pi, offset - 0x88000, size, (uint32_t *)&val);
                                 ^
/usr/corvin-src-140/usr.sbin/bhyve/pci_passthru.c:1209:25: error: use of undeclared identifier 'vcpu'
                passthru_cfgread(ctx, vcpu, pi, offset - 0x88000, size, (uint32_t *)&val);
                                      ^
/usr/corvin-src-140/usr.sbin/bhyve/pci_passthru.c:1302:29: error: use of undeclared identifier 'ctx'
                        if (vm_unmap_pptdev_mmio(ctx, sc->psc_sel.pc_bus,
                                                 ^
/usr/corvin-src-140/usr.sbin/bhyve/pci_passthru.c:1309:27: error: use of undeclared identifier 'ctx'
                        if (vm_map_pptdev_mmio(ctx, sc->psc_sel.pc_bus,
                                               ^
/usr/corvin-src-140/usr.sbin/bhyve/pci_passthru.c:1327:29: error: use of undeclared identifier 'ctx'
                        if (vm_unmap_pptdev_mmio(ctx, sc->psc_sel.pc_bus,
                                                 ^
/usr/corvin-src-140/usr.sbin/bhyve/pci_passthru.c:1334:27: error: use of undeclared identifier 'ctx'
                        if (vm_map_pptdev_mmio(ctx, sc->psc_sel.pc_bus,
                                               ^
8 errors generated.
*** Error code 1
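
All eight errors above involve only two identifiers. A minimal sketch for
summarizing a saved build log, assuming the --verbose output was captured
to a file (the file name build.log and the function name
summarize_undeclared are hypothetical, they are not part of Corvin's
scripts):

```shell
#!/bin/sh
# Hypothetical helper, not part of the original scripts.
# Reads a clang build log on stdin and prints one line per undeclared
# identifier, in the form "<identifier> <number of errors>".
summarize_undeclared() {
    sed -n "s/.*error: use of undeclared identifier '\([^']*\)'.*/\1/p" |
        sort |
        uniq -c |
        awk '{ print $2, $1 }'
}
```

It could be used by redirecting the build output to build.log (for example
by appending "> build.log 2>&1" to the build_branch_140.sh command) and
then running "summarize_undeclared < build.log". A result that lists only
ctx and vcpu would suggest that the enclosing functions no longer have
those names in scope, rather than eight unrelated bugs; that is an
assumption to verify against pci_passthru.c on that branch.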


Looking through his GitHub repository, it seems that his code should work
on both FreeBSD 13.1 and FreeBSD 14.0. Since I have found that it does not
build on the latter, maybe it works on FreeBSD 13.1.

To find out, I installed FreeBSD 13.1 on one of my disks. I installed
Xorg, xfce4 and KDE, and from ports (after updating the ports tree) I
installed nvidia-driver version 535.146.02; I didn't have any problems.

The problem arose when I tried to install drm-kmod from ports.

Unfortunately I'm not able to build it. This is what happened:

root@marietto:/usr/ports/graphics/drm-kmod # make
====> drm-kmod-20220907_2 not supported for this configuration.

OK. At this point I tried to install it from packages instead:

root@marietto:/usr/ports/graphics/drm-kmod # make clean
====> cleaning for drm-kmod-20220907_2

so :

root@marietto:/usr/ports/graphics/drm-kmod # pkg install drm-kmod

New packages to be INSTALLED :
drm-kmod: 20220907_2
OK
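
One thing worth checking for the "not supported for this configuration"
message is which FreeBSD version the ports tree sees on this machine. A
minimal sketch, assuming the usual __FreeBSD_version layout (major * 100000
+ minor * 1000 + a patch counter); decode_osreldate is a hypothetical
helper name:

```shell
#!/bin/sh
# Hypothetical helper: turn a __FreeBSD_version value, such as the one
# printed by `sysctl -n kern.osreldate`, into "major.minor".
# Assumes the layout major * 100000 + minor * 1000 + patch counter.
decode_osreldate() {
    v="$1"
    echo "$(( v / 100000 )).$(( v / 1000 % 100 ))"
}
```

On the affected machine this could be run as
decode_osreldate "$(sysctl -n kern.osreldate)". If the message comes from
the ports framework's IGNORE mechanism (an assumption on my part), then
"make -C /usr/ports/graphics/drm-kmod -V IGNORE" should print the reason
the port gives.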

Next, I made the following additional configuration changes:

nano /home/marietto/.xinitrc :

exec ck-launch-session dbus-launch --exit-with-session startxfce4

nano /etc/rc.conf :

kdm5_enable="YES"
dbus_enable="YES"
hald_enable="YES"
kld_list="nvidia nvidia-modeset"
rpcbind_enable="YES"
dtcms_enable="YES"
inetd_enable="YES"

nano /boot/loader.conf

vmm_load="YES"
nmdm_load="YES"
tmpfs_load="YES"
cryptodev_load="YES"
zfs_load="YES"
kern.racct.enable="1"
kern.vty=vt
kern.cam.scsi_delay="10000"

nano /etc/X11/xorg.conf

Section "Device"
               Identifier   "Card0"
               Driver         "nvidia"
               BusID          "PCI:1:0:0"
EndSection


root@marietto:/home/marietto # lspci

01:00.0 NVIDIA GP106
01:00.1 NVIDIA GP106 High Definition Audio Controller
02:00.0 NVIDIA TU102
02:00.1 NVIDIA TU102 High Definition Audio Controller
02:00.2 NVIDIA TU102 USB Controller
02:00.3 NVIDIA TU102 Serial BUS Controller
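
Since lspci prints bus addresses in hexadecimal while the BusID in
xorg.conf is written in decimal, a small conversion can help when
double-checking the "Device" section. A minimal sketch (lspci_to_busid is
a hypothetical helper name):

```shell
#!/bin/sh
# Hypothetical helper: convert an lspci address "bus:device.function"
# (hexadecimal) into an Xorg BusID string "PCI:bus:device:function"
# (decimal).
lspci_to_busid() {
    bus=${1%%:*}      # text before the first ":"
    rest=${1#*:}      # "device.function"
    dev=${rest%%.*}   # text before the "."
    fn=${rest#*.}     # text after the "."
    printf 'PCI:%d:%d:%d\n' "0x${bus}" "0x${dev}" "0x${fn}"
}
```

For the GP106 at 01:00.0 this gives PCI:1:0:0, which matches the BusID
used above; the TU102 at 02:00.0 would be PCI:2:0:0.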

Now I run startx to start xfce4:

marietto@marietto: $ startx

Fatal server error : no screens found (EE)
Check the log file at "/var/log/Xorg.0.log"

nano /var/log/Xorg.0.log :

NVIDIA dlloader X driver 535.146.02
NVIDIA Unified Driver for all Supported NVIDIA gpus
NVIDIA : Failed to initialize the NVIDIA kernel module
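
To narrow this down, it may help to pull only the error and warning lines
out of the full log. A minimal sketch, assuming the standard Xorg log
markers "(EE)" and "(WW)" at the start of a line, optionally preceded by a
timestamp in square brackets (xorg_problems is a hypothetical helper
name):

```shell
#!/bin/sh
# Hypothetical helper: print only the error "(EE)" and warning "(WW)"
# lines from an Xorg log read on stdin. The marker must be at the start
# of the line, optionally after a "[ timestamp] " prefix.
xorg_problems() {
    grep -E '^(\[[ 0-9.]+\] )?\((EE|WW)\) '
}
```

It would be run as "xorg_problems < /var/log/Xorg.0.log". Checking whether
the kernel side loaded at all, for instance with "kldstat | grep nvidia"
and "dmesg | grep -i nvidia", would be a natural next step.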

How can I fix this?

-- 
Mario.