i386/185777: Using non-generic -march for optimization on non-Intel family 5/6 processors results in invalid instruction

Matthew Rezny matthew at reztek.cz
Tue Jan 14 20:20:00 UTC 2014


>Number:         185777
>Category:       i386
>Synopsis:       Using non-generic -march for optimization on non-Intel family 5/6 processors results in invalid instruction
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    freebsd-i386
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Tue Jan 14 20:20:00 UTC 2014
>Closed-Date:
>Last-Modified:
>Originator:     Matthew Rezny
>Release:        10.0-RC4
>Organization:
RezTek, s.r.o.
>Environment:
FreeBSD service2.reztek 10.0-PRERELEASE FreeBSD 10.0-PRERELEASE #1: Sun Jan 12 08:55:24 CET 2014 root at service2.reztek:/usr/obj/usr/src/sys/CUSTOM i386

>Description:
The version of Clang we have in 9.2 and the upcoming 10.0 mistakenly uses the nopl instruction on CPUs that do not support it. nopl was introduced in the PentiumPro and included in all following Intel processors, but was not documented. Most non-Intel processors did not add the instruction until several models later.

Thus, Clang avoids using nopl with -march=i686 but will use it with more specific CPU types. However, the exclusion list in Clang 3.3 is incomplete. Notably missing are the AMD K6 and VIA C3 processors, both of which I have in use. This is corrected in Clang/LLVM 3.4, but it's a little late to change compiler versions. Fortunately, the change is a simple patch that can be applied to Clang 3.3.
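
As a rough illustration of the gap, here is a minimal standalone sketch of the check as it stands in Clang 3.3. The CPU names are copied from the unpatched condition in the attached diff; the function names clang33AvoidsLongNops and singleByteNopPadding are hypothetical and do not exist in LLVM.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper (not an LLVM function): returns true for the CPU names
// that the unpatched Clang 3.3 condition pads with single-byte nops only.
bool clang33AvoidsLongNops(const std::string &cpu) {
  static const std::array<const char *, 7> kNoLongNop = {
      "generic", "i386", "i486", "i586", "pentium", "pentium-mmx", "geode"};
  return std::any_of(kNoLongNop.begin(), kNoLongNop.end(),
                     [&](const char *name) { return cpu == name; });
}

// Hypothetical helper: the safe fallback padding, Count copies of the
// one-byte nop (0x90), mirroring the loop in the patched function.
std::vector<std::uint8_t> singleByteNopPadding(std::size_t count) {
  return std::vector<std::uint8_t>(count, 0x90);
}
```

With this list, "k6" and "c3" are not matched, so padding for those targets falls through to the long-nop path, which is the failure described here.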
>How-To-Repeat:
Run buildworld with CPUTYPE?=k6 or CPUTYPE?=c3 and CFLAGS= -O2, install the result on a system with a K6 or C3 family processor, and attempt to boot. An invalid opcode in init prevents going multiuser. An invalid opcode in /bin/sh prevents a single-user shell.

>Fix:
Apply r195679 (attached for convenience) from the LLVM repo and rebuild Clang. I have manually applied the patch and gone through two buildworld/installworld cycles (once with -march=pentium-mmx to get a Clang with the patch applied, and once with -march=c3 to get a world with the patch in effect) on a C3-800. I have not yet tested on a K6, but the patch is simple enough that I am almost certain it will work. The patch adds all k6, c3 and winchip models to the exclusion list, which already included geode, pentium(-mmx) and i586/i686.

Too bad I didn't catch this sooner, which might have allowed the patch to get into the 10.0 release. Of course, I got around to trying 10 on the slowest boxes last, so I hit this problem on my K6 and C3 machines just before release time. Hopefully this patch can go to HEAD and 9/10-STABLE to take care of this prior to the eventual upgrade to LLVM 3.4.



Patch attached with submission follows:

--- llvm.orig/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp 2013-11-03 00:24:20.000000000 +0100
+++ llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp 2013-11-03 20:57:14.000000000 +0100
@@ -309,7 +309,10 @@ bool X86AsmBackend::writeNopData(uint64_
   // This CPU doesnt support long nops. If needed add more.
   // FIXME: Can we get this from the subtarget somehow?
   if (CPU == "generic" || CPU == "i386" || CPU == "i486" || CPU == "i586" ||
-      CPU == "pentium" || CPU == "pentium-mmx" || CPU == "geode") {
+      CPU == "pentium" || CPU == "pentium-mmx" || CPU == "i686" ||
+      CPU == "k6" || CPU == "k6-2" || CPU == "k6-3" || CPU == "geode" ||
+      CPU == "winchip-c6" || CPU == "winchip2" || CPU == "c3" ||
+      CPU == "c3-2") {
     for (uint64_t i = 0; i < Count; ++i)
       OW->Write8(0x90);
     return true;


>Release-Note:
>Audit-Trail:
>Unformatted:

