svn commit: r286008 - stable/10/contrib/llvm/patches
Dimitry Andric
dim at FreeBSD.org
Wed Jul 29 13:07:20 UTC 2015
Author: dim
Date: Wed Jul 29 13:07:18 2015
New Revision: 286008
URL: https://svnweb.freebsd.org/changeset/base/286008
Log:
Add llvm patch corresponding to r286007.
Added:
stable/10/contrib/llvm/patches/patch-r286007-llvm-r219009-x86-codegen-crash.diff
Added: stable/10/contrib/llvm/patches/patch-r286007-llvm-r219009-x86-codegen-crash.diff
==============================================================================
--- /dev/null 00:00:00 1970 (empty, because file is newly added)
+++ stable/10/contrib/llvm/patches/patch-r286007-llvm-r219009-x86-codegen-crash.diff Wed Jul 29 13:07:18 2015 (r286008)
@@ -0,0 +1,214 @@
+Pull in r219009 from upstream llvm trunk (by Adam Nemet):
+
+ [ISel] Keep matching state consistent when folding during X86 address match
+
+ In the X86 backend, matching an address is initiated by the 'addr' complex
+ pattern and its friends. During this process we may reassociate and-of-shift
+ into shift-of-and (FoldMaskedShiftToScaledMask) to allow folding of the
+ shift into the scale of the address.
+
+ However, as demonstrated by the testcase, this can trigger CSE not only of the
+ shift and the AND, which the code is prepared for, but also of the underlying
+ load node. In the testcase this node is sitting in the RecordedNodes and
+ MatchScope data structures of the matcher and becomes a deleted node upon CSE.
+ Returning from the complex pattern function, we try to access it again, hitting
+ an assert because the node is no longer a load even though this was checked before.
+
+ Now obviously changing the DAG this late is bending the rules but I think it
+ makes sense somewhat. Outside of addresses we prefer and-of-shift because it
+ may lead to smaller immediates (FoldMaskAndShiftToScale is an even better
+ example because it creates a non-canonical node). We currently don't recognize
+ addresses during DAGCombiner where arguably this canonicalization should be
+ performed. On the other hand, having this in the matcher allows us to cover
+ all the cases where an address can be used in an instruction.
+
+ I've also talked a little bit to Dan Gohman on llvm-dev who added the RAUW for
+ the new shift node in FoldMaskedShiftToScaledMask. This RAUW is responsible
+ for initiating the recursive CSE on users
+ (http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-September/076903.html) but it
+ is not strictly necessary since the shift is hooked into the visited user. Of
+ course it's safer to keep the DAG consistent at all times (e.g. for accurate
+ number of uses, etc.).
+
+ So rather than changing the fundamentals, I've decided to continue along the lines
+ of the previous patches and detect the CSE. This patch installs a very targeted
+ DAGUpdateListener for the duration of a complex-pattern match and updates the
+ matching state accordingly. (Previous patches used HandleSDNode to detect the
+ CSE but that's not practical here). The listener is only installed on X86.
+
+ I tested that there is no measurable overhead due to this while running
+ through the spec2k BC files with llc. The only thing we pay for is the
+ creation of the listener. The callback never ever triggers in spec2k since
+ this is a corner case.
+
+ Fixes rdar://problem/18206171
+
+This fixes a possible crash in x86 code generation when compiling recent
+llvm/clang trunk sources.
+
+Introduced here: http://svnweb.freebsd.org/changeset/base/286007
+
+Index: include/llvm/CodeGen/SelectionDAGISel.h
+===================================================================
+--- include/llvm/CodeGen/SelectionDAGISel.h
++++ include/llvm/CodeGen/SelectionDAGISel.h
+@@ -238,6 +238,12 @@ class SelectionDAGISel : public MachineFunctionPas
+ const unsigned char *MatcherTable,
+ unsigned TableSize);
+
++ /// \brief Return true if complex patterns for this target can mutate the
++ /// DAG.
++ virtual bool ComplexPatternFuncMutatesDAG() const {
++ return false;
++ }
++
+ private:
+
+ // Calls to these functions are generated by tblgen.
+Index: lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
+===================================================================
+--- lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
++++ lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
+@@ -2345,6 +2345,42 @@ struct MatchScope {
+ bool HasChainNodesMatched, HasGlueResultNodesMatched;
+ };
+
++/// \brief A DAG update listener to keep the matching state
++/// (i.e. RecordedNodes and MatchScope) up to date if the target is allowed
++/// to change the DAG while matching. The X86 addressing-mode matcher is an
++/// example of this.
++class MatchStateUpdater : public SelectionDAG::DAGUpdateListener
++{
++ SmallVectorImpl<std::pair<SDValue, SDNode*> > &RecordedNodes;
++ SmallVectorImpl<MatchScope> &MatchScopes;
++public:
++ MatchStateUpdater(SelectionDAG &DAG,
++ SmallVectorImpl<std::pair<SDValue, SDNode*> > &RN,
++ SmallVectorImpl<MatchScope> &MS) :
++ SelectionDAG::DAGUpdateListener(DAG),
++ RecordedNodes(RN), MatchScopes(MS) { }
++
++ void NodeDeleted(SDNode *N, SDNode *E) {
++ // Some early-returns here to avoid the search if we deleted the node or
++ // if the update comes from MorphNodeTo (MorphNodeTo is the last thing we
++ // do, so it's unnecessary to update matching state at that point).
++ // Neither of these can occur currently because we only install this
++ // update listener while matching a complex pattern.
++ if (!E || E->isMachineOpcode())
++ return;
++ // Performing a linear search here does not matter because we almost never
++ // run this code. You'd have to have a CSE during complex pattern
++ // matching.
++ for (auto &I : RecordedNodes)
++ if (I.first.getNode() == N)
++ I.first.setNode(E);
++
++ for (auto &I : MatchScopes)
++ for (auto &J : I.NodeStack)
++ if (J.getNode() == N)
++ J.setNode(E);
++ }
++};
+ }
+
+ SDNode *SelectionDAGISel::
+@@ -2599,6 +2635,14 @@ SelectCodeCommon(SDNode *NodeToMatch, const unsign
+ unsigned CPNum = MatcherTable[MatcherIndex++];
+ unsigned RecNo = MatcherTable[MatcherIndex++];
+ assert(RecNo < RecordedNodes.size() && "Invalid CheckComplexPat");
++
++ // If target can modify DAG during matching, keep the matching state
++ // consistent.
++ std::unique_ptr<MatchStateUpdater> MSU;
++ if (ComplexPatternFuncMutatesDAG())
++ MSU.reset(new MatchStateUpdater(*CurDAG, RecordedNodes,
++ MatchScopes));
++
+ if (!CheckComplexPattern(NodeToMatch, RecordedNodes[RecNo].second,
+ RecordedNodes[RecNo].first, CPNum,
+ RecordedNodes))
+Index: lib/Target/X86/X86ISelDAGToDAG.cpp
+===================================================================
+--- lib/Target/X86/X86ISelDAGToDAG.cpp
++++ lib/Target/X86/X86ISelDAGToDAG.cpp
+@@ -290,6 +290,13 @@ namespace {
+ const X86InstrInfo *getInstrInfo() const {
+ return getTargetMachine().getInstrInfo();
+ }
++
++ /// \brief Address-mode matching performs and-of-shift to shift-of-and
++ /// reassociation in order to expose more scaled addressing
++ /// opportunities.
++ bool ComplexPatternFuncMutatesDAG() const override {
++ return true;
++ }
+ };
+ }
+
+Index: test/CodeGen/X86/addr-mode-matcher.ll
+===================================================================
+--- test/CodeGen/X86/addr-mode-matcher.ll
++++ test/CodeGen/X86/addr-mode-matcher.ll
+@@ -0,0 +1,62 @@
++; RUN: llc < %s | FileCheck %s
++
++; This testcase used to hit an assert during ISel. For details, see the big
++; comment inside the function.
++
++; CHECK-LABEL: foo:
++; The AND should be turned into a subreg access.
++; CHECK-NOT: and
++; The shift (leal) should be folded into the scale of the address in the load.
++; CHECK-NOT: leal
++; CHECK: movl {{.*}},4),
++
++target datalayout = "e-m:o-p:32:32-f64:32:64-f80:128-n8:16:32-S128"
++target triple = "i386-apple-macosx10.6.0"
++
++define void @foo(i32 %a) {
++bb:
++ br label %bb1692
++
++bb1692:
++ %tmp1694 = phi i32 [ 0, %bb ], [ %tmp1745, %bb1692 ]
++ %xor = xor i32 0, %tmp1694
++
++; %load1 = (load (and (shl %xor, 2), 1020))
++ %tmp1701 = shl i32 %xor, 2
++ %tmp1702 = and i32 %tmp1701, 1020
++ %tmp1703 = getelementptr inbounds [1028 x i8]* null, i32 0, i32 %tmp1702
++ %tmp1704 = bitcast i8* %tmp1703 to i32*
++ %load1 = load i32* %tmp1704, align 4
++
++; %load2 = (load (shl (and %xor, 255), 2))
++ %tmp1698 = and i32 %xor, 255
++ %tmp1706 = shl i32 %tmp1698, 2
++ %tmp1707 = getelementptr inbounds [1028 x i8]* null, i32 0, i32 %tmp1706
++ %tmp1708 = bitcast i8* %tmp1707 to i32*
++ %load2 = load i32* %tmp1708, align 4
++
++ %tmp1710 = or i32 %load2, %a
++
++; While matching the xor we address-match %load1. The and-of-shift
++; reassociation in address matching transforms this into a shift-of-and and the
++; resulting node becomes identical to %load2. CSE replaces %load1, which leaves
++; its references in MatchScope and RecordedNodes stale.
++ %tmp1711 = xor i32 %load1, %tmp1710
++
++ %tmp1744 = getelementptr inbounds [256 x i32]* null, i32 0, i32 %tmp1711
++ store i32 0, i32* %tmp1744, align 4
++ %tmp1745 = add i32 %tmp1694, 1
++ indirectbr i8* undef, [label %bb1756, label %bb1692]
++
++bb1756:
++ br label %bb2705
++
++bb2705:
++ indirectbr i8* undef, [label %bb5721, label %bb5736]
++
++bb5721:
++ br label %bb2705
++
++bb5736:
++ ret void
++}
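
To make the reassociation described in the upstream log concrete, here is a
small standalone C++ sketch (illustrative only, not part of the committed diff;
the helper names are invented). It shows that the and-of-shift address used by
%load1 and the shift-of-and address used by %load2 in the testcase compute the
same value, since 1020 == 255 << 2, and why only the shift-of-and form lets the
shift be folded into the address scale:

#include <cassert>
#include <cstdint>

// and-of-shift form, as written for %load1: (and (shl x, 2), 1020)
static uint32_t load1_index(uint32_t x) { return (x << 2) & 1020; }

// shift-of-and form, as written for %load2: (shl (and x, 255), 2)
// Here the "<< 2" sits outermost, so the address matcher can fold it into the
// scale of a scaled-index address such as (%base,%index,4).
static uint32_t load2_index(uint32_t x) { return (x & 255) << 2; }

int main() {
  // The two forms are equivalent because 1020 == 255 << 2, which is why the
  // rewrite performed during address matching is value-preserving and why CSE
  // can then merge the two load addresses in the testcase.
  for (uint32_t x = 0; x < 4096; ++x)
    assert(load1_index(x) == load2_index(x));
  return 0;
}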
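
The core of the fix is the scoped DAGUpdateListener in the SelectionDAGISel.cpp
hunk above. The following is a minimal, self-contained C++ sketch of that
general pattern (hypothetical types, not LLVM's API): while a transformation
that may CSE nodes is in flight, install a listener that rewrites any
externally recorded pointers from the deleted node to its replacement, so
nothing is left pointing at a stale node.

#include <functional>
#include <vector>

struct Node { int Opcode; };

struct Graph {
  // Invoked as (Deleted, Replacement) whenever CSE merges a node away.
  std::function<void(Node *, Node *)> NodeDeleted;

  void cseReplace(Node *Old, Node *New) {
    if (NodeDeleted)
      NodeDeleted(Old, New); // notify listeners before the old node goes away
    delete Old;
  }
};

// RAII listener: for its lifetime, keep a vector of recorded node pointers
// consistent with whatever CSE does to the graph (the role MatchStateUpdater
// plays for RecordedNodes and MatchScopes in the patch above).
class RecordedNodesUpdater {
  Graph &G;
  std::function<void(Node *, Node *)> Saved;

public:
  RecordedNodesUpdater(Graph &Gr, std::vector<Node *> &Recorded)
      : G(Gr), Saved(Gr.NodeDeleted) {
    Gr.NodeDeleted = [&Recorded](Node *Old, Node *New) {
      // A linear scan is fine here; node deletion during a match is rare.
      for (Node *&P : Recorded)
        if (P == Old)
          P = New;
    };
  }
  ~RecordedNodesUpdater() { G.NodeDeleted = Saved; } // uninstall on scope exit
};

int main() {
  Graph G;
  Node *A = new Node{1};
  Node *B = new Node{1}; // same opcode: B will be CSE'd into A
  std::vector<Node *> Recorded{A, B};
  {
    RecordedNodesUpdater U(G, Recorded); // installed only around the "match"
    G.cseReplace(B, A);                  // Recorded[1] is patched from B to A
  }
  // Both recorded entries now point at the surviving node; nothing is stale.
  delete A;
  return 0;
}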