From nobody Wed Aug 30 15:09:30 2023
List-Id: Packaging the FreeBSD base system
List-Archive: https://lists.freebsd.org/archives/freebsd-pkgbase
Sender: owner-freebsd-pkgbase@freebsd.org
MIME-Version: 1.0
From: Doug Rabson <dfr@rabson.org>
Date: Wed, 30 Aug 2023 16:09:30 +0100
Subject: Re: Repeatable builds using pkgbase
To: Baptiste Daroussin <bapt@freebsd.org>
Cc: freebsd-pkgbase@freebsd.org
Content-Type: multipart/alternative; boundary="00000000000017e62d06042551fa"

--00000000000017e62d06042551fa
Content-Type: text/plain; charset="UTF-8"

On Wed, 30 Aug 2023 at 15:59, Doug Rabson <dfr@rabson.org> wrote:

> On Mon, 21 Aug 2023 at 17:26, Doug Rabson <dfr@rabson.org> wrote:
>
>> On Mon, 21 Aug 2023 at 17:23, Baptiste Daroussin <bapt@freebsd.org> wrote:
>>
>>> On Mon, Aug 21, 2023 at 02:33:24PM +0100, Doug Rabson wrote:
>>> > While working on build scripts for FreeBSD container images, I wanted to
>>> > get to the point where my builds are repeatable, i.e. if I create two
>>> > images with the same set of packages installed in the same order, they
>>> > should be identical.
>>> >
>>> > The main stumbling block is timestamps. I can force all the file timestamps
>>> > to a fixed value with buildah using the '--timestamp' argument to either
>>> > 'buildah commit' or 'buildah build' but even then, the two images have
>>> > different hashes. Looking deeper, the difference is in
>>> > /var/db/pkg/local.sqlite.
>>> > If I compare SQL dumps of the databases from each
>>> > image, I can see a timestamp embedded in the sqlite file:
>>> >
>>> > diff dump1 dump2
>>> >
>>> > 4c4
>>> > < INSERT INTO packages VALUES(1,'base','FreeBSD-zoneinfo','13.2p2','zoneinfo package','zoneinfo package',NULL,NULL,'FreeBSD:13:amd64','re@FreeBSD.org','https://www.FreeBSD.org','/',731014,0,0,1,1692446701,'2$2$c9w95oqai9bwhny1k4pcg8mji77xgk43zjxxb69j1duzq5jao18wak4deer85epmfpc8ngyysyt9wu74pg7sczkqc3ekyawkfgwzi8d',NULL,NULL,0);
>>> > ---
>>> > > INSERT INTO packages VALUES(1,'base','FreeBSD-zoneinfo','13.2p2','zoneinfo package','zoneinfo package',NULL,NULL,'FreeBSD:13:amd64','re@FreeBSD.org','https://www.FreeBSD.org','/',731014,0,0,1,1692622924,'2$2$c9w95oqai9bwhny1k4pcg8mji77xgk43zjxxb69j1duzq5jao18wak4deer85epmfpc8ngyysyt9wu74pg7sczkqc3ekyawkfgwzi8d',NULL,NULL,0);
>>> >
>>> > Looking at the pkg source, I can see that the prepared statement for
>>> > inserting into the packages table explicitly uses NOW() for this column.
>>> > Would it be reasonable to allow changing this, e.g. by adding a command
>>> > line argument to pkg to override the default? I haven't tried this to see
>>> > if that makes the two databases identical - if not, I guess I'll just
>>> > remove pkg metadata altogether.
>>>
>>> Yes, this would be reasonable. If you use an env var, please respect
>>> SOURCE_DATE_EPOCH.
>>
>> I'll try this out, probably using an env var as you suggest. Hopefully
>> there is nothing non-deterministic in sqlite which would stop this from
>> being reproducible.
>
> Sadly, even if I override the timestamp written to the packages table, the
> resulting local.sqlite files on two consecutive runs are still different.
> If I compare the two using 'sqlite3 local.sqlite .dump', the sql dumps are
> identical so there is something else in sqlite which is making things
> non-reproducible.
> I guess I'll have to fall back to plan B and remove the
> package metadata from my images.

Weirdly, if I regenerate the local.sqlite file using sqlite3's .dump and
.read commands, the resulting DB file does have a consistent hash, so that
might be a plan C.

--00000000000017e62d06042551fa
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable


--00000000000017e62d06042551fa--
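
A minimal sketch of the "plan C" idea above, i.e. rebuilding local.sqlite from its SQL dump so that the file bytes depend only on the logical contents. It assumes Python's standard sqlite3 module as a stand-in for the sqlite3 shell's .dump and .read commands. The names rebuild_sqlite and sha256_of are made up for illustration, and copying PRAGMA user_version is my own assumption about what a plain dump may leave behind. None of this has been tried against a real pkg database.

```python
import hashlib
import sqlite3
from contextlib import closing
from pathlib import Path


def rebuild_sqlite(src: Path, dst: Path) -> None:
    """Recreate the database at src as a fresh file at dst from its SQL dump."""
    if dst.exists():
        dst.unlink()  # always start from an empty file
    with closing(sqlite3.connect(src)) as old, closing(sqlite3.connect(dst)) as new:
        # iterdump() yields the database as SQL text, similar to the shell's .dump
        script = "\n".join(old.iterdump())
        # Replaying the script plays the role of the shell's .read
        new.executescript(script)
        # Assumption: the schema version pragma may not be part of the dump,
        # so copy it across explicitly.
        version = old.execute("PRAGMA user_version").fetchone()[0]
        new.execute(f"PRAGMA user_version = {int(version)}")
        new.commit()


def sha256_of(path: Path) -> str:
    """Hex SHA-256 of a file's bytes, for comparing two rebuilt databases."""
    return hashlib.sha256(path.read_bytes()).hexdigest()
```

The intended use would be something like rebuild_sqlite(Path("local.sqlite"), Path("local.sqlite.new")) as the last step of the image build, then moving the new file into place. Two databases with identical dumps should then hash the same, provided the same SQLite version writes both files, since the file header records the library version.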