[Bug 273982] x11/nvidia-driver-390: patches for user pages handling

From: <bugzilla-noreply_at_freebsd.org>
Date: Wed, 20 Sep 2023 18:04:00 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=273982

            Bug ID: 273982
           Summary: x11/nvidia-driver-390: patches for user pages handling
           Product: Ports & Packages
           Version: Latest
          Hardware: amd64
                OS: Any
            Status: New
          Severity: Affects Only Me
          Priority: ---
         Component: Individual Port(s)
          Assignee: danfe@FreeBSD.org
          Reporter: jinxiaoyong@gmail.com
             Flags: maintainer-feedback?(danfe@FreeBSD.org)

Created attachment 245073
  --> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=245073&action=edit
patches for x11/nvidia-driver

I backported some of the user pages handling code from the 535.98 driver to this
390.154 driver. With these changes, clpeak runs under libc6-shim:

$ NVIDIA_LIB64_DIR=/compat/bookworm/lib/x86_64-linux-gnu \
    /usr/local/bin/nv-sglrun /usr/local/bin/clpeak
shim init

Platform: NVIDIA CUDA
  Device: Quadro 600
    Driver version  : 390.154 (FreeBSD)
    Compute units   : 2
    Clock frequency : 1280 MHz

    Global memory bandwidth (GBPS)
      float   : 19.77
      float2  : 19.99
      float4  : 20.15
      float8  : 17.74
      float16 : 10.88

    Single-precision compute (GFLOPS)
      float   : 161.39
      float2  : 239.89
      float4  : 232.48
      float8  : 223.77
      float16 : 229.75

    No half precision support! Skipped

    Double-precision compute (GFLOPS)
      double   : 20.49
      double2  : 20.47
      double4  : 20.43
      double8  : 20.35
      double16 : 19.06

    Integer compute (GIOPS)
      int   : 81.53
      int2  : 81.46
      int4  : 81.55
      int8  : 81.64
      int16 : 81.51

    Integer compute Fast 24bit (GIOPS)
      int   : 81.68
      int2  : 81.66
      int4  : 81.52
      int8  : 81.68
      int16 : 81.67

    Transfer bandwidth (GBPS)
      enqueueWriteBuffer              : 6.46
      enqueueReadBuffer               : 6.33
      enqueueWriteBuffer non-blocking : 0.02
      enqueueReadBuffer non-blocking  : 0.02
      enqueueMapBuffer(for read)      : 6.28
        memcpy from mapped ptr        : 9.66
      enqueueUnmap(after write)       : 6.65
        memcpy to mapped ptr          : 9.69

    Kernel launch latency : 5.39 us
