Re: [Intel AlderLake] Read&Write files to FAT32 or UFS partition cause data corrupt due to P-Core&E-Core
- Reply: Tomoaki AOKI : "Re: [Intel AlderLake] Read&Write files to FAT32 or UFS partition cause data corrupt due to P-Core&E-Core"
- In reply to: Tomoaki AOKI : "Re: [Intel AlderLake] Read&Write files to FAT32 or UFS partition cause data corrupt due to P-Core&E-Core"
Date: Tue, 08 Aug 2023 14:02:32 UTC
On Tue, Aug 08, 2023 at 10:46:12PM +0900, Tomoaki AOKI wrote:
> On Tue, 8 Aug 2023 15:38:46 +0300
> Konstantin Belousov <kostikbel@gmail.com> wrote:
>
> > On Tue, Aug 08, 2023 at 06:37:35AM +0900, Tomoaki AOKI wrote:
> > > On Sun, 6 Aug 2023 12:55:07 +0300
> > > Konstantin Belousov <kostikbel@gmail.com> wrote:
> > >
> > > > On Sun, Aug 06, 2023 at 06:12:38PM +0900, Tomoaki AOKI wrote:
> > > > > On Wed, 23 Feb 2022 01:30:28 +0200
> > > > > Konstantin Belousov <kostikbel@gmail.com> wrote:
> > > > >
> > > > > > On Tue, Feb 22, 2022 at 06:23:17PM -0500, Alexander Motin wrote:
> > > > > > > On 22.02.2022 17:46, Konstantin Belousov wrote:
> > > > > > > > Ok, the next step is to get the CPU feature reports from P- vs. E- cores.
> > > > > > > > Patch below should work, with verbose boot.
> > > > > > >
> > > > > > > Not much difference on that level:
> > > > > > >
> > > > > > > --- zzzp 2022-02-22 18:18:24.531704000 -0500
> > > > > > > +++ zzze 2022-02-22 18:18:18.631236000 -0500
> > > > > > > @@ -1,22 +1,21 @@
> > > > > > > -CPU 2: 12th Gen Intel(R) Core(TM) i7-12700K (3609.60-MHz K8-class CPU)
> > > > > > > +CPU 16: 12th Gen Intel(R) Core(TM) i7-12700K (3609.60-MHz K8-class CPU)
> > > > > > > Origin="GenuineIntel" Id=0x90672 Family=0x6 Model=0x97 Stepping=2
> > > > > > > Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
> > > > > > > Features2=0x7ffafbff<SSE3,PCLMULQDQ,DTES64,MON,DS_CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,TSCDLT,AESNI,XSAVE,OSXSAVE,AVX,F16C,RDRAND>
> > > > > > > AMD Features=0x2c100800<SYSCALL,NX,Page1GB,RDTSCP,LM>
> > > > > > > AMD Features2=0x121<LAHF,ABM,Prefetch>
> > > > > > > Structured Extended Features=0x239ca7eb<FSGSBASE,TSCADJ,BMI1,AVX2,FDPEXC,SMEP,BMI2,ERMS,INVPCID,NFPUSG,PQE,RDSEED,ADX,SMAP,CLFLUSHOPT,CLWB,PROCTRACE,SHA>
> > > > > > > Structured Extended Features2=0x98c027ac<UMIP,PKU,WAITPKG,GFNI,VAES,VPCLMULQDQ,TME,RDPID,MOVDIRI,MOVDIR64B>
> > > > > > > Structured Extended Features3=0xfc1cc410<FSRM,MD_CLEAR,PCONFIG,IBT,IBPB,STIBP,L1DFL,ARCH_CAP,CORE_CAP,SSBD>
> > > > > > > XSAVE Features=0xf<XSAVEOPT,XSAVEC,XINUSE,XSAVES>
> > > > > > > IA32_ARCH_CAPS=0xd6b<RDCL_NO,IBRS_ALL,SKIP_L1DFL_VME,MDS_NO,TAA_NO>
> > > > > > > VT-x: Basic Features=0x3da0500<SMM,INS/OUTS,TRUE>
> > > > > > > Pin-Based Controls=0xff<ExtINT,NMI,VNMI,PreTmr,PostIntr>
> > > > > > > Primary Processor Controls=0xfffbfffe<INTWIN,TSCOff,HLT,INVLPG,MWAIT,RDPMC,RDTSC,CR3-LD,CR3-ST,CR8-LD,CR8-ST,TPR,NMIWIN,MOV-DR,IO,IOmap,MTF,MSRmap,MONITOR,PAUSE>
> > > > > > > Secondary Processor Controls=0xf5d7fff<APIC,EPT,DT,RDTSCP,x2APIC,VPID,WBINVD,UG,APIC-reg,VID,PAUSE-loop,RDRAND,INVPCID,VMFUNC,VMCS,XSAVES>
> > > > > > > Exit Controls=0x3da0500<PAT-LD,EFER-SV,PTMR-SV>
> > > > > > > Entry Controls=0x3da0500
> > > > > > > EPT Features=0x6f34141<XO,PW4,UC,WB,2M,1G,INVEPT,AD,single,all>
> > > > > > > VPID Features=0x10f01<INVVPID,individual,single,all,single-globals>
> > > > > > > TSC: P-state invariant, performance statistics
> > > > > > > -64-Byte prefetching
> > > > > > > -L2 cache: 1280 kbytes, 8-way associative, 64 bytes/line
> > > > > > > +L2 cache: 2048 kbytes, 16-way associative, 64 bytes/line
> > > > > > >
> > > > > >
> > > > > > Show me the full verbose dmesg of the boot then.
> > > > > >
> > > > > > As another blind guess, try to disable pcid, vm.pmap.pcid_enabled=0.
> > > > > >
> > > > >
> > > > > Hi.
> > > > >
> > > > > Intel N100 is reported to crash without this tunable on 13.2 at
> > > > > freebsd-users-jp ML (as this is a ML in Japanese, reported in
> > > > > Japanese). [1]
> > > > > Crashes with UFS, but ZFS is claimed to be OK.
> > > > >
> > > > > N100 is an Alder Lake-N processor WITHOUT P-CORE. [2] [3]
> > > > > So check logics on workarouund codes (IIRC, all are MFC'ed before 13.2)
> > > > > wouldn't be working?
> > > >
> > > > Show me the output from x86info -r on the machine, I do not care which
> > > > specific core it is, they should be all the same. x86info is available
> > > > as sysutils/x86info.
> > >
> > > Requested to original reporter and got the result below.
> > > HTH.
> > >
> > > -----------------------
> > > root@eq12:~ # x86info -r
> > > x86info v1.31pre
> > > /dev/cpuctl0: No such file or directory
> > > Found 4 identical CPUs
> > > Extended Family: 0 Extended Model: 11 Family: 6 Model: 190 Stepping: 0
> > > Type: 0 (Original OEM)
> > > CPU Model (x86info's best guess): Unknown model.
> > ...
> > > eax in: 0x0000001a, eax = 20000001 ebx = 00000000 ecx = 00000000 edx = 00000000
> >
> > The CPU is reported as small core/atom, so the workaround is turned on.
> > I do not think that the issue reported is related to the TLB/PG_G errata.
> >
> > Why do you think that this is hw issue at all, and not some software bug
> > in the build etc ?
>
> Because the issue looks similar (crashes on UFS but not ZFS, and as far
> as the original reporter tested, vm.pmap.pcid_enabled=0
> in /boot/loader.conf helped).
>
> Moreover, N100 CPU is Alder Lake-N. So potentially includes the same
> design issue (common circuits, firmwares, ...).
>
> So I suspected the same problem persists even without P-core and
> adviced the original reporter to add the workaround
> in /boot/loader.conf.
> It seems to help until now.

The workaround is switched on automatically when the kernel detects
'small cores' reported by CPUID.
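
For reference, a minimal sketch of how the "small core" report can be read
from the CPUID leaf 0x1a value quoted above (eax = 20000001). It assumes
the layout documented in the Intel SDM: EAX[31:24] is the core type
(0x20 = Atom, 0x40 = Core) and EAX[23:0] is the native model ID. The
function and macro names are made up for illustration; this is not the
kernel's actual detection code.

```c
#include <stdint.h>

/* Core-type values in CPUID.1AH:EAX[31:24] (Intel SDM). */
#define CORE_TYPE_ATOM 0x20u /* "small" / E-core */
#define CORE_TYPE_CORE 0x40u /* "big" / P-core */

/* Hypothetical helper: extract the core type byte from leaf 0x1a EAX. */
static inline uint8_t
cpuid_1a_core_type(uint32_t eax)
{
	return ((uint8_t)(eax >> 24));
}

/* Hypothetical helper: extract the native model ID, EAX[23:0]. */
static inline uint32_t
cpuid_1a_native_model_id(uint32_t eax)
{
	return (eax & 0x00ffffffu);
}

/* Hypothetical helper: human-readable name for the core type. */
static inline const char *
cpuid_1a_core_type_name(uint32_t eax)
{
	switch (cpuid_1a_core_type(eax)) {
	case CORE_TYPE_ATOM:
		return ("small core (Atom)");
	case CORE_TYPE_CORE:
		return ("big core (Core)");
	default:
		return ("unknown");
	}
}
```

Passing 0x20000001 from the x86info output to cpuid_1a_core_type_name()
selects the Atom case, which matches the "small core/atom" reading above.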