From: Jason Bacon
Date: Sat, 14 Sep 2024 07:19:07 -0500
To: freebsd-hackers@freebsd.org
Subject: Re: Software performance complexity (was The Case for Rust (in any system))
Message-ID: <89490e9e-740e-4444-ab23-40727af6efa2@gmail.com>
List-Id: Technical discussions relating to FreeBSD
List-Archive: https://lists.freebsd.org/archives/freebsd-hackers

On 9/13/24 21:24, Gavin D. Howard wrote:
>> Try and explain this for example:
>>
>> Sorting int array with clang++18 and subscripts...
>> User time = 4.74 seconds (.07900 minutes) (.00131 hours).
>> RSS = 4204 KB
>>
>> Sorting long array with clang++18 and subscripts...
>> User time = 2.22 seconds (.03700 minutes) (.00061 hours).
>> RSS = 4608 KB
>
> A new, curious participant here.
>
> My guess is that the ints are being extended to longs inside the loop,
> which would require an extra sign extension instruction.

According to the C standards, an int should never be promoted unless
necessary to perform an operation with a higher type, e.g. int + long
or int * float, or to pass it as an argument of a higher type. The
purpose of int is to provide the fastest data type on any platform when
you don't care if it uses 16, 32, or 64 bits.

>
> I don't think that explains the time doubling, but simply running that
> one instruction may not be the only cause of performance loss from an
> extra instruction.
>
> That one instruction may actually be the straw that broke the L1 camel's
> back; without it, the L1 instruction cache may not overflow, but with
> it, the L1 instruction cache may overflow, causing cache misses into L2
> on every iteration of the loop. It would also occupy one of the
> arithmetic units, which could lead to less instruction level
> parallelism or give the compiler less room for unrolling the loop.
>
> Just a theory; I have no clue. If you have code to share, I'd love to
> see it and try to reproduce the effect.

I suspect the parameters for triggering certain optimizations are
different for C++ and long int than for other cases. See the link to
the LLVM GitHub issue earlier in the message you replied to here. The
link to the code is also there.

Also, clang is slightly faster than gcc on an old AMD Phenom. It's
running FreeBSD 14.0 + latest packages, just like the i5, where gcc is
much faster. MD5s of the binaries are the same, regardless of where
they were compiled. I'd expect that when just using -O2 (and no
-march=native).

Bottom line: There is no reliably predicting software performance in
the real world. Measuring it empirically is the only way to be sure.

Cheers,

    J

-- 
Life is a game. Play hard. Play fair. Have fun.
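
For anyone who wants to try the int-versus-long comparison without
digging up the link, here is a minimal sketch of that kind of empirical
measurement. It is not the benchmark linked earlier in the thread. The
selection sort, the names selection_sort and time_sort, the fixed seed,
and the value range are all assumptions made for illustration. It also
reports wall-clock time from std::chrono::steady_clock rather than the
user CPU time shown in the quoted output.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <random>
#include <utility>
#include <vector>

// Selection sort written with subscripts, templated on the element type.
// Assumption: a simple O(n^2) sort chosen for illustration only.
template <typename T>
void selection_sort(std::vector<T>& a)
{
    const std::size_t n = a.size();
    for (std::size_t i = 0; i + 1 < n; ++i) {
        std::size_t low = i;
        for (std::size_t j = i + 1; j < n; ++j) {
            if (a[j] < a[low])
                low = j;
        }
        std::swap(a[i], a[low]);
    }
}

// Fill a vector of T with the same pseudo-random values for every T,
// sort it, and return the elapsed wall-clock time in seconds.
template <typename T>
double time_sort(std::size_t n)
{
    std::mt19937 gen(42);  // fixed seed: identical input for int and long
    std::uniform_int_distribution<int> dist(0, 1000000);
    std::vector<T> a(n);
    for (auto& x : a)
        x = static_cast<T>(dist(gen));

    const auto start = std::chrono::steady_clock::now();
    selection_sort(a);
    const auto stop = std::chrono::steady_clock::now();

    // Sanity check, and a volatile read so the sorted data is observed
    // even when assert() is compiled out with NDEBUG.
    assert(std::is_sorted(a.begin(), a.end()));
    if (!a.empty()) {
        volatile T sink = a[a.size() / 2];
        (void)sink;
    }
    return std::chrono::duration<double>(stop - start).count();
}
```

Calling time_sort<int>(n) and time_sort<long>(n) from main with the
same n, printing both results, and building with each compiler at -O2
gives a rough side-by-side comparison. Run it several times on each
machine, since a single timing can be noisy.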