Date: Thu, 19 Dec 2024 18:45:41 +0000
From: Brooks Davis
To: John Baldwin
Cc: Gleb Smirnoff, Ed Maste, src-committers@freebsd.org, dev-commits-src-all@freebsd.org, dev-commits-src-main@freebsd.org
Subject: Re: git: a1097094c4c5 - main - newvers: Set explicit git revision length
References: <202412131306.4BDD6bxu011253@gitrepo.freebsd.org> <9afbf270-0cc0-4fd0-8975-6b88aadd3903@FreeBSD.org>
List-Id: Commit messages for all branches of the src repository
List-Archive: https://lists.freebsd.org/archives/dev-commits-src-all
In-Reply-To: <9afbf270-0cc0-4fd0-8975-6b88aadd3903@FreeBSD.org>

On Thu, Dec 19, 2024 at 10:03:05AM -0500, John Baldwin wrote:
> On 12/18/24 12:12, Gleb Smirnoff wrote:
> > On Wed, Dec 18, 2024 at 10:22:24AM -0500, Ed Maste wrote:
> > E> That said, it doesn't matter what Git's algorithm chooses as the short
> > E> hash length; specifying --short bypasses that algorithm. `git
> > E> rev-parse --verify --short=12 HEAD` will give us a 12-character short
> > E> hash as long as that hash is unique. The reproducibility concern is
> > E> thus: what is the probability that the 12-character short hash is
> > E> unique at the time and in a repo from which an image is built, but is
> > E> not unique for the attempt to reproduce it, or vice-versa. This
> > E> probability is rather small.
> > E>
> > E> If you look at arbitrary commits 6 or 7 characters are usually
> > E> sufficient for a unique hash today. For instance, some latest -pX from
> > E> recent releng/ branches:
> > E>
> > E> 13.3: 72aa3d
> > E> 13.4: 3f40d5
> > E> 14.0: f10e32
> > E> 14.1: 74b6c98
> > E> 14.2: c8918d6
> > E>
> > E> The status quo of --short=12 should be fine for quite some time.
> >
> > AFAIU John's concern is that you can't guarantee a reproducible build from a
> > "dirty" repository. A repository that has more branches than just the official
> > ones. I just make a quick check on Netflix repo, that has both the current
> > FreeBSD history and the before-the-official-git history together, as well as
> > splitted ports subdirectories and of course our own stuff. For short hashes
> > there are roughly 2x more ambiguities than for a "clean" repo. Apparently
> > chance of collision on a long hash is also doubled.
> >
> > We can of course say that we don't provide reproducible builds from a "dirty"
> > repo. But would be a real limitation. That would cancel a legitimate
> > scenario:
> >
> > git subtree add FreeBSD && cd FreeBSD && make a reproducible build
> In particular, the dirty repository scenario I imagine is FreeBSD's official
> repository at some point in the future. A question though is how far in the
> future would it have to be to matter. If we would need 100+ years at our
> current commit rate to matter, then this is probably moot. The other point
> I guess is that how many other user git settings can affect the build? Should
> we not require an empty global git config as a prereq for someone who wants a
> reproducible build (and use the same setup for our official builds) and say
> that if you adjust your user config to impact the build that's kind of your
> problem?

I'm not super concerned about rollover here. If it becomes an issue, and
someone wants to reproduce the build in the future (e.g., a decade from
now), they can always produce a custom repo with future history removed to
avoid having git add extra digits. IMO that's going to be the least of
their problems given they will need to bootstrap the correct LLVM in order
to make sure binaries are the same.

For FreeBSD itself, I think we're a very long way away. FreeBSD main from
about a week ago has 296268 commits per `git rev-list --count HEAD` and
CheriBSD has more than twice as many at 662027[0] (more than LLVM's
521761). All default to 12 digits for short. If we wanted to add some
margin, going to 13 should last until SHA1 is completely untenable as a
hash.

-- 
Brooks

[0] For those following along, this has two causes: 1) we have both the
current history and uqs's git export history in our history, 2) we merge
each upstream commit individually so we've added a merge commit for each
first-parent commit to src/main since 2015.
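
For a rough sense of scale, here is a minimal sketch of the arithmetic
behind "this probability is rather small". It estimates the chance that a
given 12- or 13-digit prefix is shared by at least one other object,
assuming object IDs behave like uniformly random values. The function and
variable names are hypothetical, and the sizes are the commit counts quoted
above used purely as illustrative inputs; abbreviations are, as far as I
know, disambiguated against every object in the repository, so real object
counts (and hence real probabilities) would be some multiple of these.

```python
import math


def prefix_match_probability(n_other_objects: int, hex_digits: int) -> float:
    """Chance that at least one of n_other_objects shares a given prefix.

    Assumes object IDs are uniformly random, so a single other object
    matches a hex_digits-long prefix with probability 16**-hex_digits.
    """
    p_single = 16.0 ** -hex_digits
    # 1 - (1 - p)^n, computed via log1p/expm1 to stay accurate for tiny p.
    return -math.expm1(n_other_objects * math.log1p(-p_single))


# Commit counts quoted in this thread, used only as illustrative sizes
# (assumption: a real repository holds more objects than commits).
REPO_SIZES = {"FreeBSD main": 296268, "LLVM": 521761, "CheriBSD": 662027}


def report(hex_digits: int) -> dict:
    """Per-repository probability for one abbreviation length."""
    return {
        name: prefix_match_probability(size, hex_digits)
        for name, size in REPO_SIZES.items()
    }


if __name__ == "__main__":
    for digits in (12, 13):
        for name, prob in report(digits).items():
            print(f"{digits} digits, {name}: {prob:.3e}")
```

Each extra hex digit divides the estimate by about 16, which is the margin
that moving from 12 to 13 digits would buy under these assumptions.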