Date: Thu, 20 Jul 2023 19:55:49 GMT
Message-Id: <202307201955.36KJtnsu053086@gitrepo.freebsd.org>
To: ports-committers@FreeBSD.org, dev-commits-ports-all@FreeBSD.org, dev-commits-ports-branches@FreeBSD.org
From: Brooks Davis
Subject: git: fd40ab7afbb7 - 2023Q3 - devel/llvm16: backport upstream powerpc patch
List-Id: Commit messages for all branches of the ports repository
List-Archive: https://lists.freebsd.org/archives/dev-commits-ports-all
Sender: owner-dev-commits-ports-all@freebsd.org
X-BeenThere: dev-commits-ports-all@freebsd.org
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
X-Git-Committer: brooks
X-Git-Repository: ports
X-Git-Refname: refs/heads/2023Q3
X-Git-Reftype: branch
X-Git-Commit: fd40ab7afbb79da00a7a8cfa1b695a344eb4b7f0
Auto-Submitted: auto-generated

The branch 2023Q3 has been updated by brooks:

URL: https://cgit.FreeBSD.org/ports/commit/?id=fd40ab7afbb79da00a7a8cfa1b695a344eb4b7f0

commit fd40ab7afbb79da00a7a8cfa1b695a344eb4b7f0
Author:     Brooks Davis
AuthorDate: 2023-07-17 17:56:59 +0000
Commit:     Brooks Davis
CommitDate: 2023-07-20 18:34:55 +0000

    devel/llvm16: backport upstream powerpc patch

    Backport commit 8757ce490130 from llvm-project (by Simon Pilgrim):

    [PowerPC] Replace PPCISD::VABSD cases with generic ISD::ABDU(X,Y) node

    (cherry picked from commit 70528428672cb3c386ae3dcfd36694c5dfde11fc)
---
 devel/llvm16/Makefile                          |   2 +-
 devel/llvm16/files/patch-backport-8757ce490130 | 243 +++++++++++++++++++++++++
 2 files changed, 244 insertions(+), 1 deletion(-)

diff --git a/devel/llvm16/Makefile b/devel/llvm16/Makefile
index c261324da428..45de60ec0a2a 100644
--- a/devel/llvm16/Makefile
+++ b/devel/llvm16/Makefile
@@ -1,6 +1,6 @@
 PORTNAME=	llvm
 DISTVERSION=	16.0.6
-PORTREVISION=	4
+PORTREVISION=	5
 CATEGORIES=	devel lang
 MASTER_SITES=	https://github.com/llvm/llvm-project/releases/download/llvmorg-${DISTVERSION:S/rc/-rc/}/ \
 		https://${PRE_}releases.llvm.org/${LLVM_RELEASE}${RCDIR}/
diff --git a/devel/llvm16/files/patch-backport-8757ce490130 b/devel/llvm16/files/patch-backport-8757ce490130
new file mode 100644
index 000000000000..557b7a6c89ee
--- /dev/null
+++ b/devel/llvm16/files/patch-backport-8757ce490130
@@ -0,0 +1,243 @@
+commit 8757ce490130c2b2862017ab705a9ff24b10033b
+Author: Simon Pilgrim
+Date:   Sat Feb 25 20:06:19 2023 +0000
+
+    [PowerPC] Replace PPCISD::VABSD cases with generic ISD::ABDU(X,Y) node
+
+    A move towards using the generic ISD::ABDU nodes on more backends
+
+    Also support ISD::ABDS for v4i32 types using the existing signbit flip trick
+
+    PowerPC has a select(icmp_ugt(x,y),sub(x,y),sub(y,x)) -> abdu(x,y) combine that I intend to move to DAGCombiner in a future patch.
+
+    The ABS(SUB(X,Y)) -> PPCISD::VABSD(X,Y,1) v4i32 combine wasn't legal (https://alive2.llvm.org/ce/z/jc2hLU) - so I've removed it, having already added the legal sub nsw tests equivalent.
+
+    Differential Revision: https://reviews.llvm.org/D142313
+
+diff --git llvm/lib/Target/PowerPC/PPCISelLowering.cpp llvm/lib/Target/PowerPC/PPCISelLowering.cpp
+index 482af0f41ce9..cf72e379a69e 100644
+--- llvm/lib/Target/PowerPC/PPCISelLowering.cpp
++++ llvm/lib/Target/PowerPC/PPCISelLowering.cpp
+@@ -1299,6 +1299,11 @@ PPCTargetLowering::PPCTargetLowering(const PPCTargetMachine &TM,
+       setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::v2i16, Legal);
+       setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::v2i32, Legal);
+       setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::v2i64, Legal);
++
++      setOperationAction(ISD::ABDU, MVT::v16i8, Legal);
++      setOperationAction(ISD::ABDU, MVT::v8i16, Legal);
++      setOperationAction(ISD::ABDU, MVT::v4i32, Legal);
++      setOperationAction(ISD::ABDS, MVT::v4i32, Legal);
+     }
+ 
+     if (Subtarget.hasP10Vector()) {
+@@ -1386,7 +1391,7 @@ PPCTargetLowering::PPCTargetLowering(const PPCTargetMachine &TM,
+   }
+ 
+   if (Subtarget.hasP9Altivec()) {
+-    setTargetDAGCombine({ISD::ABS, ISD::VSELECT});
++    setTargetDAGCombine({ISD::VSELECT});
+   }
+ 
+   setLibcallName(RTLIB::LOG_F128, "logf128");
+@@ -1750,7 +1755,6 @@ const char *PPCTargetLowering::getTargetNodeName(unsigned Opcode) const {
+   case PPCISD::RFEBB: return "PPCISD::RFEBB";
+   case PPCISD::XXSWAPD: return "PPCISD::XXSWAPD";
+   case PPCISD::SWAP_NO_CHAIN: return "PPCISD::SWAP_NO_CHAIN";
+-  case PPCISD::VABSD: return "PPCISD::VABSD";
+   case PPCISD::BUILD_FP128: return "PPCISD::BUILD_FP128";
+   case PPCISD::BUILD_SPE64: return "PPCISD::BUILD_SPE64";
+   case PPCISD::EXTRACT_SPE: return "PPCISD::EXTRACT_SPE";
+@@ -16034,8 +16038,6 @@ SDValue PPCTargetLowering::PerformDAGCombine(SDNode *N,
+   }
+   case ISD::BUILD_VECTOR:
+     return DAGCombineBuildVector(N, DCI);
+-  case ISD::ABS:
+-    return combineABS(N, DCI);
+   case ISD::VSELECT:
+     return combineVSelect(N, DCI);
+   }
+@@ -17463,24 +17465,6 @@ SDValue PPCTargetLowering::combineTRUNCATE(SDNode *N,
+   SDLoc dl(N);
+   SDValue Op0 = N->getOperand(0);
+ 
+-  // fold (truncate (abs (sub (zext a), (zext b)))) -> (vabsd a, b)
+-  if (Subtarget.hasP9Altivec() && Op0.getOpcode() == ISD::ABS) {
+-    EVT VT = N->getValueType(0);
+-    if (VT != MVT::v4i32 && VT != MVT::v8i16 && VT != MVT::v16i8)
+-      return SDValue();
+-    SDValue Sub = Op0.getOperand(0);
+-    if (Sub.getOpcode() == ISD::SUB) {
+-      SDValue SubOp0 = Sub.getOperand(0);
+-      SDValue SubOp1 = Sub.getOperand(1);
+-      if ((SubOp0.getOpcode() == ISD::ZERO_EXTEND) &&
+-          (SubOp1.getOpcode() == ISD::ZERO_EXTEND)) {
+-        return DCI.DAG.getNode(PPCISD::VABSD, dl, VT, SubOp0.getOperand(0),
+-                               SubOp1.getOperand(0),
+-                               DCI.DAG.getTargetConstant(0, dl, MVT::i32));
+-      }
+-    }
+-  }
+-
+   // Looking for a truncate of i128 to i64.
+   if (Op0.getValueType() != MVT::i128 || N->getValueType(0) != MVT::i64)
+     return SDValue();
+@@ -17681,54 +17665,12 @@ isMaskAndCmp0FoldingBeneficial(const Instruction &AndI) const {
+   return true;
+ }
+ 
+-// Transform (abs (sub (zext a), (zext b))) to (vabsd a b 0)
+-// Transform (abs (sub (zext a), (zext_invec b))) to (vabsd a b 0)
+-// Transform (abs (sub (zext_invec a), (zext_invec b))) to (vabsd a b 0)
+-// Transform (abs (sub (zext_invec a), (zext b))) to (vabsd a b 0)
+-// Transform (abs (sub a, b) to (vabsd a b 1)) if a & b of type v4i32
+-SDValue PPCTargetLowering::combineABS(SDNode *N, DAGCombinerInfo &DCI) const {
+-  assert((N->getOpcode() == ISD::ABS) && "Need ABS node here");
+-  assert(Subtarget.hasP9Altivec() &&
+-         "Only combine this when P9 altivec supported!");
+-  EVT VT = N->getValueType(0);
+-  if (VT != MVT::v4i32 && VT != MVT::v8i16 && VT != MVT::v16i8)
+-    return SDValue();
+-
+-  SelectionDAG &DAG = DCI.DAG;
+-  SDLoc dl(N);
+-  if (N->getOperand(0).getOpcode() == ISD::SUB) {
+-    // Even for signed integers, if it's known to be positive (as signed
+-    // integer) due to zero-extended inputs.
+-    unsigned SubOpcd0 = N->getOperand(0)->getOperand(0).getOpcode();
+-    unsigned SubOpcd1 = N->getOperand(0)->getOperand(1).getOpcode();
+-    if ((SubOpcd0 == ISD::ZERO_EXTEND ||
+-         SubOpcd0 == ISD::ZERO_EXTEND_VECTOR_INREG) &&
+-        (SubOpcd1 == ISD::ZERO_EXTEND ||
+-         SubOpcd1 == ISD::ZERO_EXTEND_VECTOR_INREG)) {
+-      return DAG.getNode(PPCISD::VABSD, dl, N->getOperand(0).getValueType(),
+-                         N->getOperand(0)->getOperand(0),
+-                         N->getOperand(0)->getOperand(1),
+-                         DAG.getTargetConstant(0, dl, MVT::i32));
+-    }
+-
+-    // For type v4i32, it can be optimized with xvnegsp + vabsduw
+-    if (N->getOperand(0).getValueType() == MVT::v4i32 &&
+-        N->getOperand(0).hasOneUse()) {
+-      return DAG.getNode(PPCISD::VABSD, dl, N->getOperand(0).getValueType(),
+-                         N->getOperand(0)->getOperand(0),
+-                         N->getOperand(0)->getOperand(1),
+-                         DAG.getTargetConstant(1, dl, MVT::i32));
+-    }
+-  }
+-
+-  return SDValue();
+-}
+-
+ // For type v4i32/v8ii16/v16i8, transform
+-// from (vselect (setcc a, b, setugt), (sub a, b), (sub b, a)) to (vabsd a, b)
+-// from (vselect (setcc a, b, setuge), (sub a, b), (sub b, a)) to (vabsd a, b)
+-// from (vselect (setcc a, b, setult), (sub b, a), (sub a, b)) to (vabsd a, b)
+-// from (vselect (setcc a, b, setule), (sub b, a), (sub a, b)) to (vabsd a, b)
++// from (vselect (setcc a, b, setugt), (sub a, b), (sub b, a)) to (abdu a, b)
++// from (vselect (setcc a, b, setuge), (sub a, b), (sub b, a)) to (abdu a, b)
++// from (vselect (setcc a, b, setult), (sub b, a), (sub a, b)) to (abdu a, b)
++// from (vselect (setcc a, b, setule), (sub b, a), (sub a, b)) to (abdu a, b)
++// TODO: Move this to DAGCombiner?
+ SDValue PPCTargetLowering::combineVSelect(SDNode *N,
+                                           DAGCombinerInfo &DCI) const {
+   assert((N->getOpcode() == ISD::VSELECT) && "Need VSELECT node here");
+@@ -17779,9 +17721,8 @@ SDValue PPCTargetLowering::combineVSelect(SDNode *N,
+       TrueOpnd.getOperand(1) == CmpOpnd2 &&
+       FalseOpnd.getOperand(0) == CmpOpnd2 &&
+       FalseOpnd.getOperand(1) == CmpOpnd1) {
+-    return DAG.getNode(PPCISD::VABSD, dl, N->getOperand(1).getValueType(),
+-                       CmpOpnd1, CmpOpnd2,
+-                       DAG.getTargetConstant(0, dl, MVT::i32));
++    return DAG.getNode(ISD::ABDU, dl, N->getOperand(1).getValueType(), CmpOpnd1,
++                       CmpOpnd2, DAG.getTargetConstant(0, dl, MVT::i32));
+   }
+ 
+   return SDValue();
+diff --git llvm/lib/Target/PowerPC/PPCISelLowering.h llvm/lib/Target/PowerPC/PPCISelLowering.h
+index 6ed52f540b02..302bd1b91ecc 100644
+--- llvm/lib/Target/PowerPC/PPCISelLowering.h
++++ llvm/lib/Target/PowerPC/PPCISelLowering.h
+@@ -440,21 +440,6 @@ namespace llvm {
+     /// and thereby have no chain.
+     SWAP_NO_CHAIN,
+ 
+-    /// An SDNode for Power9 vector absolute value difference.
+-    /// operand #0 vector
+-    /// operand #1 vector
+-    /// operand #2 constant i32 0 or 1, to indicate whether needs to patch
+-    /// the most significant bit for signed i32
+-    ///
+-    /// Power9 VABSD* instructions are designed to support unsigned integer
+-    /// vectors (byte/halfword/word), if we want to make use of them for signed
+-    /// integer vectors, we have to flip their sign bits first. To flip sign bit
+-    /// for byte/halfword integer vector would become inefficient, but for word
+-    /// integer vector, we can leverage XVNEGSP to make it efficiently. eg:
+-    /// abs(sub(a,b)) => VABSDUW(a+0x80000000, b+0x80000000)
+-    /// => VABSDUW((XVNEGSP a), (XVNEGSP b))
+-    VABSD,
+-
+     /// FP_EXTEND_HALF(VECTOR, IDX) - Custom extend upper (IDX=0) half or
+     /// lower (IDX=1) half of v4f32 to v2f64.
+     FP_EXTEND_HALF,
+@@ -1430,7 +1415,6 @@ namespace llvm {
+     SDValue combineFMALike(SDNode *N, DAGCombinerInfo &DCI) const;
+     SDValue combineTRUNCATE(SDNode *N, DAGCombinerInfo &DCI) const;
+     SDValue combineSetCC(SDNode *N, DAGCombinerInfo &DCI) const;
+-    SDValue combineABS(SDNode *N, DAGCombinerInfo &DCI) const;
+     SDValue combineVSelect(SDNode *N, DAGCombinerInfo &DCI) const;
+     SDValue combineVectorShuffle(ShuffleVectorSDNode *SVN,
+                                  SelectionDAG &DAG) const;
+diff --git llvm/lib/Target/PowerPC/PPCInstrVSX.td llvm/lib/Target/PowerPC/PPCInstrVSX.td
+index 584750488ddd..ed9dbb431441 100644
+--- llvm/lib/Target/PowerPC/PPCInstrVSX.td
++++ llvm/lib/Target/PowerPC/PPCInstrVSX.td
+@@ -76,9 +76,6 @@ def SDT_PPCxxswapd : SDTypeProfile<1, 1, [
+ def SDTVecConv : SDTypeProfile<1, 2, [
+   SDTCisVec<0>, SDTCisVec<1>, SDTCisPtrTy<2>
+ ]>;
+-def SDTVabsd : SDTypeProfile<1, 3, [
+-  SDTCisVec<0>, SDTCisSameAs<0, 1>, SDTCisSameAs<0, 2>, SDTCisVT<3, i32>
+-]>;
+ def SDT_PPCld_vec_be : SDTypeProfile<1, 1, [
+   SDTCisVec<0>, SDTCisPtrTy<1>
+ ]>;
+@@ -105,7 +102,6 @@ def PPCmtvsrz : SDNode<"PPCISD::MTVSRZ", SDTUnaryOp, []>;
+ def PPCsvec2fp : SDNode<"PPCISD::SINT_VEC_TO_FP", SDTVecConv, []>;
+ def PPCuvec2fp: SDNode<"PPCISD::UINT_VEC_TO_FP", SDTVecConv, []>;
+ def PPCswapNoChain : SDNode<"PPCISD::SWAP_NO_CHAIN", SDT_PPCxxswapd>;
+-def PPCvabsd : SDNode<"PPCISD::VABSD", SDTVabsd, []>;
+ 
+ def PPCfpexth : SDNode<"PPCISD::FP_EXTEND_HALF", SDT_PPCfpexth, []>;
+ def PPCldvsxlh : SDNode<"PPCISD::LD_VSX_LH", SDT_PPCldvsxlh,
+@@ -4821,20 +4817,23 @@ def : Pat<(f128 (uint_to_fp (i32 (PPCmfvsr f64:$src)))),
+ 
+ // Any Power9 VSX subtarget that supports Power9 Altivec.
+ let Predicates = [HasVSX, HasP9Altivec] in {
+-// Put this P9Altivec related definition here since it's possible to be
+-// selected to VSX instruction xvnegsp, avoid possible undef.
+-def : Pat<(v4i32 (PPCvabsd v4i32:$A, v4i32:$B, (i32 0))),
++// Unsigned absolute-difference.
++def : Pat<(v4i32 (abdu v4i32:$A, v4i32:$B)),
+           (v4i32 (VABSDUW $A, $B))>;
+ 
+-def : Pat<(v8i16 (PPCvabsd v8i16:$A, v8i16:$B, (i32 0))),
++def : Pat<(v8i16 (abdu v8i16:$A, v8i16:$B)),
+           (v8i16 (VABSDUH $A, $B))>;
+ 
+-def : Pat<(v16i8 (PPCvabsd v16i8:$A, v16i8:$B, (i32 0))),
++def : Pat<(v16i8 (abdu v16i8:$A, v16i8:$B)),
+           (v16i8 (VABSDUB $A, $B))>;
+ 
+-// As PPCVABSD description, the last operand indicates whether do the
+-// sign bit flip.
+-def : Pat<(v4i32 (PPCvabsd v4i32:$A, v4i32:$B, (i32 1))),
++// Signed absolute-difference.
++// Power9 VABSD* instructions are designed to support unsigned integer
++// vectors (byte/halfword/word), if we want to make use of them for signed
++// integer vectors, we have to flip their sign bits first. To flip sign bit
++// for byte/halfword integer vector would become inefficient, but for word
++// integer vector, we can leverage XVNEGSP to make it efficiently.
++def : Pat<(v4i32 (abds v4i32:$A, v4i32:$B)),
+           (v4i32 (VABSDUW (XVNEGSP $A), (XVNEGSP $B)))>;
+ } // HasVSX, HasP9Altivec
+ 
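
Note on the sign-bit flip used for the abds pattern above: XVNEGSP toggles the top bit of each 32-bit word, which maps signed ordering onto unsigned ordering, so a signed absolute difference can be computed as an unsigned one on the flipped inputs. The following is a minimal scalar sketch of a single v4i32 lane in C++, not code from LLVM. All function names here (abdu32, abds32, flip_sign, abds32_via_abdu, abs_of_wrapping_sub) are hypothetical. The last helper models the removed ABS(SUB(X,Y)) combine under the assumption that the subtraction wraps; the INT32_MAX/INT32_MIN inputs are an illustrative case chosen here, not one taken from the alive2 link.

```cpp
#include <cassert>
#include <cstdint>

// Unsigned absolute difference of one 32-bit lane. This is also the
// select(icmp_ugt(x,y), sub(x,y), sub(y,x)) form named in the commit message.
constexpr uint32_t abdu32(uint32_t a, uint32_t b) {
  return a > b ? a - b : b - a;
}

// Reference signed absolute difference: computed in 64 bits so the
// subtraction cannot overflow, then truncated to the 32-bit lane.
constexpr uint32_t abds32(int32_t a, int32_t b) {
  int64_t d = static_cast<int64_t>(a) - static_cast<int64_t>(b);
  return static_cast<uint32_t>(d < 0 ? -d : d);
}

// Toggle the sign bit, as XVNEGSP does to each word (hypothetical helper).
constexpr uint32_t flip_sign(int32_t x) {
  return static_cast<uint32_t>(x) ^ 0x80000000u;
}

// Signed absolute difference via the unsigned operation on flipped inputs.
constexpr uint32_t abds32_via_abdu(int32_t a, int32_t b) {
  return abdu32(flip_sign(a), flip_sign(b));
}

// Model of abs(sub(a, b)) where the 32-bit subtraction wraps (assumption).
constexpr uint32_t abs_of_wrapping_sub(int32_t a, int32_t b) {
  uint32_t d = static_cast<uint32_t>(a) - static_cast<uint32_t>(b);
  return (d & 0x80000000u) ? (0u - d) : d;
}
```

For ordinary inputs the three signed variants agree. When the true difference does not fit in a signed 32-bit value, the wrapping form can differ from the flipped-input form, which is consistent with the commit message's statement that the old v4i32 combine was not legal without nsw.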