From nobody Sat Apr 20 10:32:48 2024 X-Original-To: dev-commits-src-all@mlmmj.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id 4VM7BS4yMrz5HW1Q; Sat, 20 Apr 2024 10:32:48 +0000 (UTC) (envelope-from git@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256 client-signature RSA-PSS (4096 bits) client-digest SHA256) (Client CN "mxrelay.nyi.freebsd.org", Issuer "R3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4VM7BS32yHz4QJg; Sat, 20 Apr 2024 10:32:48 +0000 (UTC) (envelope-from git@FreeBSD.org) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=freebsd.org; s=dkim; t=1713609168; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding; bh=2zUg5GSXMlAAfuY/fjuNiuqg+o4sXnNrHLUY09KXtpo=; b=mnhpW06oi8burzueIBdozXxueuTu+UJlu7z/KtMbuvrzM+8I+Ncfg1CeWeqW5q6WSKjj1I MW6PYcX2mQT0tMB3wSryiPaidcMeJ0Ji/RJDjz2rnTF+6NSK8EoymgMTXX0b8fsayNPOY5 oI5bqnQNW0OSAAhX7xkkVX/MsVOI5GSGfPMBsgKuYoo8n5VLmMlakaL3mcWLmD88/Tagjt ADJINDzvSjemO/fm3h+5Qm9qqJ8YBN4Z/JrjvTw+dTSK6c4zUd4UNqTct8xbcikwOJCNMi KYbxmUnaaXrv00cpjbZNgLCo0JKuD6dox1sRNozWlOPy53pRlsRHACnNlgwymA== ARC-Seal: i=1; s=dkim; d=freebsd.org; t=1713609168; a=rsa-sha256; cv=none; b=YR78bHsMfUPoCwTCTiMopC5TuQJ8H4VsRy4Fj6h/+tW3nPPVLtM9DlzUFipRZxJPnBQieB Os/g+u+JkVVG3RgU94Nmp1IV9ifxvZj1IOyaCcDsvWcVsD1dwlfXvZWRt7h3ISVoBkNatv hYUIQMtOBCbS8QxW6Q4zm6cMwtekhObnylYqCIEy1HOkA6Blrt/GJANIWeBif9kmN8wZZx UmDvRdizi2YZZfyLrKFCrvHZxnWZR+zQ7qcnlF+IuY5DQKK6FSv8RkFBXri8ZA5Cen2MvB dZEbiEo+tQDtkvWPVXbRu4j6wEYkJefSMuFL6oABXF+oBGGqJtQQ3h1395wJsw== ARC-Authentication-Results: i=1; mx1.freebsd.org; none ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=freebsd.org; s=dkim; t=1713609168; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding; bh=2zUg5GSXMlAAfuY/fjuNiuqg+o4sXnNrHLUY09KXtpo=; b=KtZ5pmiKyrVCxHj6MnJGkmG1FAcQBci+v4oFF9pFi5o4FHMGmRE9Bz8DVgVB6UKPrynMps Fl8HxB2QrHl8dqQkCtQkLNTBPD+Qeih3AWNqJ73gKR0MYNI2MDvPkHtamucpW198BTqn+4 ngl5KETX1cN5Wk93h2kvteH4ltZoGPgNy4saXm7dVwMoj/blEOyaogXaeuHhkznPfj4s78 pv28xyNgoOvDCB60Zh3mXcAyieJ6Z2WAFm/SsbgJKPPQuEX/UqV0+NKw2bFdkm7KLjM59g 4Webxr3wUOyTIfs7u6cVDg621H+yIKYMmZfuc5q1t6z1xVPUYh199m4MD+iH7g== Received: from gitrepo.freebsd.org (gitrepo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:5]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 4VM7BS2gTBzYmk; Sat, 20 Apr 2024 10:32:48 +0000 (UTC) (envelope-from git@FreeBSD.org) Received: from gitrepo.freebsd.org ([127.0.1.44]) by gitrepo.freebsd.org (8.17.1/8.17.1) with ESMTP id 43KAWmS5011945; Sat, 20 Apr 2024 10:32:48 GMT (envelope-from git@gitrepo.freebsd.org) Received: (from git@localhost) by gitrepo.freebsd.org (8.17.1/8.17.1/Submit) id 43KAWmk1011942; Sat, 20 Apr 2024 10:32:48 GMT (envelope-from git) Date: Sat, 20 Apr 2024 10:32:48 GMT Message-Id: <202404201032.43KAWmk1011942@gitrepo.freebsd.org> To: src-committers@FreeBSD.org, dev-commits-src-all@FreeBSD.org, dev-commits-src-branches@FreeBSD.org From: Dimitry Andric Subject: git: a3549d8d4dbf - stable/14 - Merge commit 55c466da2f2f from llvm-project (by Benjamin Kramer): List-Id: Commit messages for all branches of the src repository List-Archive: https://lists.freebsd.org/archives/dev-commits-src-all List-Help: List-Post: List-Subscribe: List-Unsubscribe: X-BeenThere: dev-commits-src-all@freebsd.org Sender: owner-dev-commits-src-all@FreeBSD.org MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit X-Git-Committer: dim X-Git-Repository: src X-Git-Refname: refs/heads/stable/14 X-Git-Reftype: branch X-Git-Commit: a3549d8d4dbf7f6e30bcbdc18cf6dda2bd09d39b Auto-Submitted: auto-generated The branch stable/14 has been updated by dim: URL: https://cgit.FreeBSD.org/src/commit/?id=a3549d8d4dbf7f6e30bcbdc18cf6dda2bd09d39b commit a3549d8d4dbf7f6e30bcbdc18cf6dda2bd09d39b Author: Dimitry Andric AuthorDate: 2024-04-11 21:12:42 +0000 Commit: Dimitry Andric CommitDate: 2024-04-20 10:03:25 +0000 Merge commit 55c466da2f2f from llvm-project (by Benjamin Kramer): [X86][AVX512BF16] Add a few missing insert/extract patterns These are really the same as the f16 (and i16) instructions, but we need them for any type that can occur. Merge commit 2e4e04c59043 from llvm-project (by Phoebe Wang): [X86][BF16] Do not lower to VCVTNEPS2BF16 without AVX512VL (#86395) Fixes: #86305 These should fix "fatal error: error in backend: Cannot select: t71: v32bf16 = insert_subvector t67, t64, Constant:i32<16>" when building the misc/ncnn port. PR: 278305 Reported by: yuri MFC after: 1 month (cherry picked from commit 78d3648e73d11c5a4dbcc0392907f0723bf1df1c) --- contrib/llvm-project/llvm/lib/Target/X86/X86ISelLowering.cpp | 7 +++++-- contrib/llvm-project/llvm/lib/Target/X86/X86InstrAVX512.td | 12 ++++++++++++ 2 files changed, 17 insertions(+), 2 deletions(-) diff --git a/contrib/llvm-project/llvm/lib/Target/X86/X86ISelLowering.cpp b/contrib/llvm-project/llvm/lib/Target/X86/X86ISelLowering.cpp index 9e64726fb6ff..96bbd981ff24 100644 --- a/contrib/llvm-project/llvm/lib/Target/X86/X86ISelLowering.cpp +++ b/contrib/llvm-project/llvm/lib/Target/X86/X86ISelLowering.cpp @@ -21420,7 +21420,9 @@ SDValue X86TargetLowering::LowerFP_ROUND(SDValue Op, SelectionDAG &DAG) const { } if (VT.getScalarType() == MVT::bf16) { - if (SVT.getScalarType() == MVT::f32 && isTypeLegal(VT)) + if (SVT.getScalarType() == MVT::f32 && + ((Subtarget.hasBF16() && Subtarget.hasVLX()) || + Subtarget.hasAVXNECONVERT())) return Op; return SDValue(); } @@ -21527,7 +21529,8 @@ SDValue X86TargetLowering::LowerFP_TO_BF16(SDValue Op, SDLoc DL(Op); MVT SVT = Op.getOperand(0).getSimpleValueType(); - if (SVT == MVT::f32 && (Subtarget.hasBF16() || Subtarget.hasAVXNECONVERT())) { + if (SVT == MVT::f32 && ((Subtarget.hasBF16() && Subtarget.hasVLX()) || + Subtarget.hasAVXNECONVERT())) { SDValue Res; Res = DAG.getNode(ISD::SCALAR_TO_VECTOR, DL, MVT::v4f32, Op.getOperand(0)); Res = DAG.getNode(X86ISD::CVTNEPS2BF16, DL, MVT::v8bf16, Res); diff --git a/contrib/llvm-project/llvm/lib/Target/X86/X86InstrAVX512.td b/contrib/llvm-project/llvm/lib/Target/X86/X86InstrAVX512.td index bb5e22c71427..fdca58141f0f 100644 --- a/contrib/llvm-project/llvm/lib/Target/X86/X86InstrAVX512.td +++ b/contrib/llvm-project/llvm/lib/Target/X86/X86InstrAVX512.td @@ -494,6 +494,8 @@ defm : vinsert_for_size_lowering<"VINSERTI32x4Z256", v16i8x_info, v32i8x_info, vinsert128_insert, INSERT_get_vinsert128_imm, [HasVLX]>; defm : vinsert_for_size_lowering<"VINSERTF32x4Z256", v8f16x_info, v16f16x_info, vinsert128_insert, INSERT_get_vinsert128_imm, [HasVLX]>; +defm : vinsert_for_size_lowering<"VINSERTF32x4Z256", v8bf16x_info, v16bf16x_info, + vinsert128_insert, INSERT_get_vinsert128_imm, [HasVLX]>; // Codegen pattern with the alternative types insert VEC128 into VEC512 defm : vinsert_for_size_lowering<"VINSERTI32x4Z", v8i16x_info, v32i16_info, vinsert128_insert, INSERT_get_vinsert128_imm, [HasAVX512]>; @@ -501,6 +503,8 @@ defm : vinsert_for_size_lowering<"VINSERTI32x4Z", v16i8x_info, v64i8_info, vinsert128_insert, INSERT_get_vinsert128_imm, [HasAVX512]>; defm : vinsert_for_size_lowering<"VINSERTF32x4Z", v8f16x_info, v32f16_info, vinsert128_insert, INSERT_get_vinsert128_imm, [HasAVX512]>; +defm : vinsert_for_size_lowering<"VINSERTF32x4Z", v8bf16x_info, v32bf16_info, + vinsert128_insert, INSERT_get_vinsert128_imm, [HasAVX512]>; // Codegen pattern with the alternative types insert VEC256 into VEC512 defm : vinsert_for_size_lowering<"VINSERTI64x4Z", v16i16x_info, v32i16_info, vinsert256_insert, INSERT_get_vinsert256_imm, [HasAVX512]>; @@ -508,6 +512,8 @@ defm : vinsert_for_size_lowering<"VINSERTI64x4Z", v32i8x_info, v64i8_info, vinsert256_insert, INSERT_get_vinsert256_imm, [HasAVX512]>; defm : vinsert_for_size_lowering<"VINSERTF64x4Z", v16f16x_info, v32f16_info, vinsert256_insert, INSERT_get_vinsert256_imm, [HasAVX512]>; +defm : vinsert_for_size_lowering<"VINSERTF64x4Z", v16bf16x_info, v32bf16_info, + vinsert256_insert, INSERT_get_vinsert256_imm, [HasAVX512]>; multiclass vinsert_for_mask_cast; defm : vextract_for_size_lowering<"VEXTRACTF32x4Z256", v16f16x_info, v8f16x_info, vextract128_extract, EXTRACT_get_vextract128_imm, [HasVLX]>; +defm : vextract_for_size_lowering<"VEXTRACTF32x4Z256", v16bf16x_info, v8bf16x_info, + vextract128_extract, EXTRACT_get_vextract128_imm, [HasVLX]>; // Codegen pattern with the alternative types extract VEC128 from VEC512 defm : vextract_for_size_lowering<"VEXTRACTI32x4Z", v32i16_info, v8i16x_info, @@ -803,6 +811,8 @@ defm : vextract_for_size_lowering<"VEXTRACTI32x4Z", v64i8_info, v16i8x_info, vextract128_extract, EXTRACT_get_vextract128_imm, [HasAVX512]>; defm : vextract_for_size_lowering<"VEXTRACTF32x4Z", v32f16_info, v8f16x_info, vextract128_extract, EXTRACT_get_vextract128_imm, [HasAVX512]>; +defm : vextract_for_size_lowering<"VEXTRACTF32x4Z", v32bf16_info, v8bf16x_info, + vextract128_extract, EXTRACT_get_vextract128_imm, [HasAVX512]>; // Codegen pattern with the alternative types extract VEC256 from VEC512 defm : vextract_for_size_lowering<"VEXTRACTI64x4Z", v32i16_info, v16i16x_info, vextract256_extract, EXTRACT_get_vextract256_imm, [HasAVX512]>; @@ -810,6 +820,8 @@ defm : vextract_for_size_lowering<"VEXTRACTI64x4Z", v64i8_info, v32i8x_info, vextract256_extract, EXTRACT_get_vextract256_imm, [HasAVX512]>; defm : vextract_for_size_lowering<"VEXTRACTF64x4Z", v32f16_info, v16f16x_info, vextract256_extract, EXTRACT_get_vextract256_imm, [HasAVX512]>; +defm : vextract_for_size_lowering<"VEXTRACTF64x4Z", v32bf16_info, v16bf16x_info, + vextract256_extract, EXTRACT_get_vextract256_imm, [HasAVX512]>; // A 128-bit extract from bits [255:128] of a 512-bit vector should use a