Re: Windows Dev Kit 2023 usage notes for main [so: 14], including about oddities

From: Mark Millard <marklmi_at_yahoo.com>
Date: Wed, 26 Apr 2023 12:22:30 UTC
[I'm inserting some updates as in-line comments.]

On Apr 22, 2023, at 20:50, Mark Millard <marklmi@yahoo.com> wrote:

> Up front odd requirement just for the USB-C ports:
> 
> At least my examples of USB3.2 storage media plugged
> into the USB-C ports are not detected by the FreeBSD
> kernel but are detected and handled by UEFI ( and,
> so, by FreeBSD's loader.efi ). Thus the combination
> may need to be avoided for now.
> 
> Plugged into the USB-A ports, I had no storage media
> problems. I had no problems with my example USB3.0
> media in USB-A ports.
> 
> But I do not have a way to plug a USB3.0 storage
> media example into a USB-C port so I've no information
> on that combination.
> 
> I've submitted:
> 
> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=271012
> 
> about the USB-C non-detection issue.
> 
> I'll note that:
> 
> https://learn.microsoft.com/en-us/windows/arm/dev-kit/
> 
> reports:
> 
> QUOTE
> When connecting an external keyboard or mouse, use the USB-A ports,
> not USB-C. Using USB-C to connect a keyboard or mouse will only work
> intermittently.
> END QUOTE
> 
> (It is unclear if that is a Windows specific issue, UEFI issue,
> both, or even more general.)
> 
> 
> My notes below will ignore that I had steps that were
> discovery of the issues above: a simplified history.
> 
> 
> These notes are based on leaving Windows 11 Pro in place
> on the internal storage media: So just USB storage media
> for FreeBSD.
> 
> First time Powered On:
> 
> I used the buttons to boot into UEFI and set the secure
> keys to none, disabling secure boot.
> 
> The (small round) UEFI button is not made to be easy
> to press/hold. The USB-boot button is bigger and
> easier to press but is still not made to be easy.
> (I've set up to avoid its use.) As I understand, these
> two buttons are to be operated when also pressing the
> power/reset button.
> 
> An FYI about the UI in UEFI is that, despite lack of visual
> feedback during a drag, one can:
> 
> A) Drag left somewhat and release in order to "swipe left".
> B) Drag up/down and release in order to change the boot
>   order for finding EFI boot media and such.
> 
> I moved USB to the top for the boot order and disabled
> all the other enabled selections. But, if it does not
> find EFI media on USB, it still eventually boots
> Windows 11 Pro.
> 
> It appears that once powered on, the power switch being
> held down will eventually force a restart, not a power
> off. It looks like the power cord is the way to force a
> power off.
> 
> The UEFI has no command shell access presented.
> 
> After setting up UEFI as I wanted, I made sure that
> /boot/loader.conf on the USB boot media contained:
> 
> #
> # To allow the Cortex-A78C/Cortex-X1C based
> # Windows Dev Kit 2023 to boot:
> hw.pac.enable=0

Recent enough kernels now automatically deal with the
above for the Windows Dev Kit 2023: no need to be
explicit once having progressed to such.

> (I'd read other material indicating the need. I've
> never seen what happens without it.)
> 
> I next rebooted with the FreeBSD media plugged in.
> (Windows 11 Pro never having a boot even started yet.)
> 
> Note:
> My boot media here is main with enabling of tuning
> for cortex-a72 (media I use with other machines as
> well). The build of main is a non-debug build (but
> with symbols not stripped).
> 
> The absence of feature-register value reports by the
> kernel after CPU 0 is an implicit indication of "same
> as for prior CPU". Only CPU 0 gets such feature
> register value reports.
> 
> Note:
> It is known that the cortex-a78c and cortex-x1c
> feature register reports are not exact matches
> to the clang or gcc13+ defaults for targeting
> those parts.

Even more: There are committed changes for LLVM
defaults for the A78C and X1C. They still do not
seem to be a full match to the reported
feature-register values for WDK23. (Operating systems
normally do not report about various feature
fields that they always ignore.)

FYI, FreeBSD indicates (HoneyComb then WDK23):

CPU  0: ARM Cortex-A72 r0p3 affinity:  0  0
                   Cache Type = <64 byte D-cacheline,64 byte I-cacheline,PIPT ICache,64 byte ERG,64 byte CWG>
 Instruction Set Attributes 0 = <CRC32,SHA2,SHA1,AES+PMULL>
 Instruction Set Attributes 1 = <>
 Instruction Set Attributes 2 = <>
         Processor Features 0 = <GIC,AdvSIMD,FP,EL3 32,EL2 32,EL1 32,EL0 32>
         Processor Features 1 = <>
      Memory Model Features 0 = <TGran4,TGran64,SNSMem,BigEnd,16bit ASID,16TB PA>
      Memory Model Features 1 = <8bit VMID>
      Memory Model Features 2 = <32bit CCIDX,48bit VA>
             Debug Features 0 = <DoubleLock,2 CTX BKPTs,4 Watchpoints,6 Breakpoints,PMUv3,Debugv8>
             Debug Features 1 = <>
         Auxiliary Features 0 = <>
         Auxiliary Features 1 = <>
AArch32 Instruction Set Attributes 5 = <CRC32,SHA2,SHA1,AES+VMULL,SEVL>
AArch32 Media and VFP Features 0 = <FPRound,FPSqrt,FPDivide,DP VFPv3+v4,SP VFPv3+v4,AdvSIMD>
AArch32 Media and VFP Features 1 = <SIMDFMAC,FPHP DP Conv,SIMDHP SP Conv,SIMDSP,SIMDInt,SIMDLS,FPDNaN,FPFtZ>
CPU  1: ARM Cortex-A72 r0p3 affinity:  0  1
CPU  2: ARM Cortex-A72 r0p3 affinity:  1  0
CPU  3: ARM Cortex-A72 r0p3 affinity:  1  1
CPU  4: ARM Cortex-A72 r0p3 affinity:  2  0
CPU  5: ARM Cortex-A72 r0p3 affinity:  2  1
CPU  6: ARM Cortex-A72 r0p3 affinity:  3  0
CPU  7: ARM Cortex-A72 r0p3 affinity:  3  1
CPU  8: ARM Cortex-A72 r0p3 affinity:  4  0
CPU  9: ARM Cortex-A72 r0p3 affinity:  4  1
CPU 10: ARM Cortex-A72 r0p3 affinity:  5  0
CPU 11: ARM Cortex-A72 r0p3 affinity:  5  1
CPU 12: ARM Cortex-A72 r0p3 affinity:  6  0
CPU 13: ARM Cortex-A72 r0p3 affinity:  6  1
CPU 14: ARM Cortex-A72 r0p3 affinity:  7  0
CPU 15: ARM Cortex-A72 r0p3 affinity:  7  1

No HoneyComb cpu report indicated differences in register values.

Windows Dev Kit 2023 (with some notes/warnings added):

CPU  0: ARM Cortex-A78C r0p0 affinity:  0  0
                   Cache Type = <64 byte D-cacheline,64 byte I-cacheline,PIPT ICache,64 byte ERG,64 byte CWG,IDC>

 Instruction Set Attributes 0 = <CondM-8.4,DP,RDM,Atomic,CRC32,SHA2,SHA1,AES+PMULL>
NOTE: CondM-8.4 indicates: FEAT_FlagM;   LLVM: FeatureFlagM,                   so: +flagm
WARNING: gcc13 has FLAGM missing from the cortex-x1c default features
NOTE: DP        indicates: FEAT_DotProd; LLVM: FeatureDotProd,                 so: +dotprod
NOTE: Atomic    indicates: FEAT_LSE;     LLVM: FeatureLSE                      so: +lse
WARNING: gcc13 does not seem to have a LSE
NOTE: Lack of FreeBSD FHM_IMPL indicates no FEAT_FHM; LLVM: no FeatureFP16FML, so: -fp16fml
WARNING: LLVM has FeatureFP16FML for its -mcpu=cortex-a78c default, at least in LLVM15
NOTE:    gcc13 does not indicate its F16FML.

  Instruction Set Attributes 1 = <GPA,RCPC-8.4,APA EPAC2,DCPoP>
NOTE: RCPC-8.4  indicates FEAT_LRCPC2;                               LLVM: FeatureRCPC_IMMO,      so: +rcpc2
WARNING: LLVM has just FeatureRCPC for its -mcpu=cortex-a78c default
WARNING: gcc13 seems to only have a RCPC
NOTE: APA EPAC2 indicates FEAT_PAUTH2 (QARMA/Architected algorithm)
NOTE: LLVM only has: FeaturePAuth, so: +pauth; gcc13 only has PAUTH as well
NOTE: E.g., -mbranch-protection=pac-ret for armv8.2 (avoiding RETAA that requires armv8.3+)
NOTE: not -mbranch-protection=standard (implicit +BTI): BTI requires ARMv8.5
NOTE: Lack of API variant indicates lack of support for IMPLEMENTATION DEFINED algorithm
NOTE: DCPoP (old ARMv8.2-DCPoP) indicates FEAT_DPB;                  LLVM: FeatureCCPP,           so: +ccpp
WARNING: gcc13 does not seem to have a DCPoP/DPB/CCPP

 Instruction Set Attributes 2 = <>
         Processor Features 0 = <CSV3,CSV2,RAS,GIC,AdvSIMD+HP,FP+HP,EL3,EL2,EL1,EL0 32>
         Processor Features 1 = <PSTATE.SSBS>
      Memory Model Features 0 = <TGran4,TGran64,TGran16,SNSMem,BigEnd,16bit ASID,1TB PA>
      Memory Model Features 1 = <XNX,PAN+ATS1E1,LO,HPD+TTPBHA,VH,16bit VMID,HAF+DS>
      Memory Model Features 2 = <AT,32bit CCIDX,48bit VA,IESB,UAO,CnP>
NOTE: AT          indicates: FEAT_LSE2;     LLVM: FeatureLSE2, but no command line notation.
WARNING: LLVM has FeatureLSE2 missing for its -mcpu=cortex-a78c default
WARNING: gcc13 does not seem to have any of: LSE2, IESB, CCIDX, UAO, CnP
WARNING: LLVM does not seem to have any of: IESB, CnP

             Debug Features 0 = <DoubleLock,SPE,2 CTX BKPTs,4 Watchpoints,6 Breakpoints,PMUv3 v8.1,Debugv8.2>
             Debug Features 1 = <>
         Auxiliary Features 0 = <>
         Auxiliary Features 1 = <>
AArch32 Instruction Set Attributes 5 = <RDM,CRC32,SHA2,SHA1,AES+VMULL,SEVL>
AArch32 Media and VFP Features 0 = <FPRound,FPSqrt,FPDivide,DP VFPv3+v4,SP VFPv3+v4,AdvSIMD>
AArch32 Media and VFP Features 1 = <SIMDFMAC,FPHP Arith,SIMDHP Arith,SIMDSP,SIMDInt,SIMDLS,FPDNaN,FPFtZ>
CPU  1: ARM Cortex-A78C r0p0 affinity:  1  0
CPU  2: ARM Cortex-A78C r0p0 affinity:  2  0
CPU  3: ARM Cortex-A78C r0p0 affinity:  3  0
CPU  4: ARM Cortex-X1C r0p0 affinity:  4  0
CPU  5: ARM Cortex-X1C r0p0 affinity:  5  0
CPU  6: ARM Cortex-X1C r0p0 affinity:  6  0
CPU  7: ARM Cortex-X1C r0p0 affinity:  7  0

No WDK23 cpu report indicated differences in register values.

Checking the match to documentation and defaults for
LLVM via:

arm_cortex_a78c_core_trm_102226_0002_03_en.pdf (the "a78c .pdf")
arm_cortex_x1c_core_trm_101968_0002_04_en.pdf  (the "x1c .pdf")
DDI0487_I_a_a-profile_architecture_reference_manual.pdf

and:

https://github.com/llvm/llvm-project/blob/main/llvm/include/llvm/TargetParser/AArch64TargetParser.h
https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/AArch64/AArch64.td

(The first two .pdf's indicate specific field values to look
at in the 3rd .pdf .)

Considering only cortex-a78c and cortex-x1c, I see that the
two should be the same for the relevant features I was looking
at but at least one of the 2 has the wrong default feature status
for 4 examples of FEAT_??? .

ID_AA64ISAR0_EL1 TS, bits [55:52] = 0b0001 (FEAT_FlagM)
but LLVM git main still has cortex-x1c with AArch64::AEK_FLAGM
missing in AArch64TargetParser.h --yet correctly has FeatureFlagM
in AArch64.td .
It seems -mcpu=cortex-x1c+flagm notation is best used
explicitly as things are.

ID_AA64ISAR0_EL1 FHM, bits [51:48] = 0b0000 (no FEAT_FHM/no fp16fml)
but LLVM git main still has cortex-a78c with AArch64::AEK_FP16FML in
AArch64TargetParser.h and FeatureFP16FML in AArch64.td .
It seems -mcpu=cortex-a78c+nofp16fml notation is best used
explicitly as things are.

ID_AA64ISAR1_EL1 LRCPC, bits [23:20] = 0b0010 (FEAT_LRCPC2)
but LLVM git main still has cortex-a78c with FeatureRCPC (FEAT_LRCPC)
in AArch64.td instead of FeatureRCPC_IMMO (FEAT_LRCPC2).
No notation in AArch64TargetParser.h refers to FEAT_LRCPC2
so no -mcpu=cortex-a78c+??? can cause the FEAT_LRCPC2 status.

ID_AA64MMFR2_EL1 AT, bits [35:32] = 0b0001 (FEAT_LSE2)
but LLVM git main still has cortex-a78c with FeatureLSE2 (FEAT_LSE2)
missing in AArch64.td . Nothing in AArch64TargetParser.h refers to
FEAT_LSE2 so no -mcpu=cortex-a78c+??? can cause the FEAT_LSE2
status.

So it seems that:

-mcpu=cortex-x1c+flagm also gets FEAT_LRCPC2 and FEAT_LSE2
status but avoids getting FEAT_FHM (matching the combination of
the 3 .pdf files for the 4 FEAT_??? in question).

-mcpu=cortex-a78c+nofp16fml also gets FEAT_FLAGM
but does not get either of FEAT_LRCPC2 or FEAT_LSE2.
(Not fully matching the 3 .pdf files for 2 FEAT_??? .) It does
avoid getting FEAT_FHM status.
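
As an illustration only: the four ID register fields listed above
are each 4 bits wide, so checking them amounts to a shift and a
mask. The sketch below is a minimal Python example of that. The
register values in it are synthetic: they are composed from just
the field values quoted above, with all other fields left as zero,
and are not values read from the WDK23. The names field, compose,
FIELDS and REGS are made up for the example.

```python
# Sketch only: extract ID register fields by bit position.
# The register values used here are SYNTHETIC: built from the field
# values quoted in the notes, all other bits zero. They are not
# values read from real hardware.

def field(value: int, hi: int, lo: int) -> int:
    """Return bits [hi:lo] of value (inclusive on both ends)."""
    width = hi - lo + 1
    return (value >> lo) & ((1 << width) - 1)

# (register, field, hi bit, lo bit, field value from the notes, meaning)
FIELDS = [
    ("ID_AA64ISAR0_EL1", "TS",    55, 52, 0b0001, "FEAT_FlagM"),
    ("ID_AA64ISAR0_EL1", "FHM",   51, 48, 0b0000, "no FEAT_FHM"),
    ("ID_AA64ISAR1_EL1", "LRCPC", 23, 20, 0b0010, "FEAT_LRCPC2"),
    ("ID_AA64MMFR2_EL1", "AT",    35, 32, 0b0001, "FEAT_LSE2"),
]

def compose(fields):
    """Build one synthetic value per register from the listed fields."""
    regs = {}
    for reg, _name, _hi, lo, val, _meaning in fields:
        regs[reg] = regs.get(reg, 0) | (val << lo)
    return regs

REGS = compose(FIELDS)

if __name__ == "__main__":
    for reg, name, hi, lo, _val, meaning in FIELDS:
        got = field(REGS[reg], hi, lo)
        print(f"{reg} {name}, bits [{hi}:{lo}] = 0b{got:04b} ({meaning})")
```

With a real register dump, only REGS would need replacing; the
bit positions are the ones given in the text above.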



> A mix of differences for (a line
> with alternate synonyms from differing places
> referencing the feature sets):
> 
> FeatureFlagM/AArch64::AEK_FLAGM/FLAGM/ARMv8.4-CondM/FEAT_FlagM
> and:
> FeatureFP16FML/AArch64::AEK_FP16FML/?gcc?/ARMv8.2-FHM/FEAT_FHM
> 
> clang and gcc13 are not in full agreement about those.
> 
> But I've not done any tuning variation experiments.

I now have. More later below.

> The boot shows ACPI related errors/warnings:
> 
> ACPI Error: AE_NOT_FOUND, While resolving a named reference package element -\_SB_.UBF0.PRT0 (20221020/dspkginit-605)
> ACPI Error: AE_NOT_FOUND, While resolving a named reference package element -\_SB_.UBF0.PRT1 (20221020/dspkginit-605)
> 
> ACPI Warning: \_SB.GPU._CLS: Return Package is too small - found 1 element, expected 3 (20221020/nsprepkg-511)
> 
> can't fetch resource for \_SB|.ADC1 - AE_AML_INVALID_RESOURCE_TYPE
> 
> It also has all 32 temperature sensor accesses as failing,
> spamming the console with messages about each on a regular
> basis. It contributes to /var/log/messages* turnovers:
> 
> acpi_tz0: error fetching current temperature -- AE_NOT_FOUND
> . . .
> acpi_tz31: error fetching current temperature -- AE_NOT_FOUND
> 
> 
> Just FYI: A problem that I've noticed is (e.g.):
> 
> # date
> Wed Dec 31 16:50:41 PST 1969
> 
> despite /etc/rc.conf having:
> 
> ntpd_enable="YES"
> ntpd_sync_on_start="YES"
> 
> and it working to set the time when booting other
> machines (RPi*'s) that have no RTC. I've explicitly
> set the date when I've cared.

Turns out the ntpd problem was a kernel issue and
recent enough kernels have ntpd working again.
The failure of ntpd to set the time was not specific
to the Windows Dev Kit 2023 context.

(I'm also setting up the zfs contexts to have
sysutils/fakertc installed and enabled so,
hopefully, fewer early timestamps will be
wildly messed up for systems without a RTC.)
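
As a side illustration of the "date" output quoted above: the
1969 timestamp is simply a small number of seconds after the Unix
epoch when shown in PST, which would be consistent with a clock
that started near zero and was never set. A minimal Python sketch
of the arithmetic, assuming PST is a fixed UTC-8 offset:

```python
from datetime import datetime, timedelta, timezone

# Assumption for the sketch: PST as a fixed UTC-8 offset, so no
# time zone database is needed.
PST = timezone(timedelta(hours=-8), "PST")

# The timestamp from the quoted "date" output.
shown = datetime(1969, 12, 31, 16, 50, 41, tzinfo=PST)

# The Unix epoch: 1970-01-01 00:00:00 UTC.
epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)

seconds_since_epoch = int((shown - epoch).total_seconds())

if __name__ == "__main__":
    print(seconds_since_epoch)
```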

> I was unable to replicate the report:
> 
> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=270805
> 
> in my main [so: 14] context. My earlier notes there
> are a mess from an operator error not noticed for some
> time and the process of exploration being visible. I'll
> not get into the details here but it is another USB
> storage device related oddity.
> 
> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=270935
> (not  mine) is about (shown in truss output form):
> 
> # truss efivar
> . . .
> openat(AT_FDCWD,"/dev/efi",O_RDWR,00) = 3 (0x3)
> ioctl(3,EFIIOC_VAR_NEXT,0x2b0510b84740) ERR#78 'Function not implemented'
> . . .
> 
> # truss efibootmgr
> . . .
> openat(AT_FDCWD,"/dev/efi",O_RDWR,00) = 3 (0x3)
> ioctl(3,EFIIOC_VAR_NEXT,0x6c2fa42cdd90) ERR#78 'Function not implemented'
> ioctl(3,EFIIOC_VAR_GET,0x6c2fa42cdda0) ERR#78 'Function not implemented'
> ioctl(3,EFIIOC_VAR_GET,0x6c2fa42cdda0) ERR#78 'Function not implemented'
> ioctl(3,EFIIOC_VAR_GET,0x6c2fa42cdda0) ERR#78 'Function not implemented'
> fstat(1,{ mode=crw--w---- ,inode=152,size=0,blksize=4096 }) = 0 (0x0)
> ioctl(1,TIOCGETA,0x6c2fa42cd6c8) = 0 (0x0)
> BootCurrent: 0000
> write(1,"BootCurrent: 0000\n",18) = 18 (0x12)
> ioctl(3,EFIIOC_VAR_GET,0x6c2fa42cdda0) ERR#78 'Function not implemented'
> ioctl(3,EFIIOC_VAR_GET,0x6c2fa42cdda0) ERR#78 'Function not implemented'
> 
> 
> Performance examples for buildworld buildkernel (the
> way I normally build such, not defaults):
> 
> Some idea of performance for building things come from
> doing "rm -fr" then rebuilding the world and kernel and
> doing the same on another type of machine, here a
> HoneyComb. It is the exact same storage media drive
> and rebuilding the exact same build directory tree:
> 
> HoneyComb: World built in 3463 seconds, ncpu: 16, make -j16
> WDK23:     World built in 6601 seconds, ncpu: 8, make -j8
> 
> HoneyComb: Kernel(s)  GENERIC-NODBG-CA72 built in 318 seconds, ncpu: 16, make -j16
> WDK23:     Kernel(s)  GENERIC-NODBG-CA72 built in 597 seconds, ncpu: 8, make -j8
> 
> Note for both contexts:
> 
> make[1]: "/usr/main-src/Makefile.inc1" line 326: SYSTEM_COMPILER: Determined that CC=cc matches the source tree.  Not bootstrapping a cross-compiler.
> make[1]: "/usr/main-src/Makefile.inc1" line 331: SYSTEM_LINKER: Determined that LD=ld matches the source tree.  Not bootstrapping a cross-linker.
> 

My normal aarch64 systems are cortex-a72 based and are
running a world and kernel that were built based on
using -mcpu=cortex-a72 for instruction set selection and
other tuning.

Well, I've now used a system that was built based on
-mcpu=cortex-a78c+flagm for world and kernel and
I did the "rm -fr" and from scratch buildworld buildkernel
on the WDK23. I reshow the prior results and add the new
"optimized context" ones:

HoneyComb:       World built in 3463 seconds, ncpu: 16, make -j16
WDK23:           World built in 6601 seconds, ncpu: 8, make -j8
WDK23 optimized: World built in 4690 seconds, ncpu: 8, make -j8

HoneyComb:        Kernel(s)  GENERIC-NODBG-CA72 built in 318 seconds, ncpu: 16, make -j16
WDK23:            Kernel(s)  GENERIC-NODBG-CA72 built in 597 seconds, ncpu: 8, make -j8
WDK23 optimized:  Kernel(s)  GENERIC-NODBG-CA78C built in 422 seconds, ncpu: 8, make -j8

Nice decreases in time used for the WDK23.
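
For a rough sense of scale, the figures above work out to a
decrease of roughly 29% for both the world and the kernel build
on the WDK23. The Python sketch below just redoes that arithmetic
from the numbers listed above; the helper names are made up for
the example. Note that the kernel comparison is not strictly
like-for-like, since the optimized run built GENERIC-NODBG-CA78C
rather than GENERIC-NODBG-CA72.

```python
# Build times in seconds, copied from the figures listed above.
world = {"HoneyComb": 3463, "WDK23": 6601, "WDK23 optimized": 4690}
kernel = {"HoneyComb": 318, "WDK23": 597, "WDK23 optimized": 422}

def pct_decrease(before: float, after: float) -> float:
    """Percentage by which 'after' is smaller than 'before'."""
    return 100.0 * (before - after) / before

def ratio(a: float, b: float) -> float:
    """How many times longer a took than b."""
    return a / b

if __name__ == "__main__":
    for label, t in (("World", world), ("Kernel", kernel)):
        dec = pct_decrease(t["WDK23"], t["WDK23 optimized"])
        rel = ratio(t["WDK23 optimized"], t["HoneyComb"])
        print(f"{label}: WDK23 time decrease {dec:.1f}%, "
              f"WDK23 optimized / HoneyComb = {rel:.2f}")
```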

===
Mark Millard
marklmi at yahoo.com