Date: Fri, 29 Jul 2022 13:41:53 +0200
From: Mathieu Arnold
To: freebsd-git
Subject: Git new feature when cloning
Message-ID: <20220729114153.cl2p3kpap5qcspz2@aching.in.mat.cc>
List-Id: Discussion of git use in the FreeBSD project
List-Archive: https://lists.freebsd.org/archives/freebsd-git

Hi,

A while back, Git grew a way to filter the objects it asks the server
for when cloning. It can speed up the download because less data is
transferred. It also stores less information locally, which is a bonus.

The only drawback is that when you ask for information it does not have
locally, it has to download the missing data. That data is then stored
locally, so you never download the same thing twice. (It is done under
the hood and you don't see it happening; the only thing you'll notice is
the command taking a bit longer to return.)

It all happens in the --filter argument to git clone; see
git-rev-list(1) for the whole explanation and the range of things you
can do.

It can filter a few things, but in decreasing order of data downloaded,
the most common values I can see for our usage are:

--filter=blob:none

  This will download all the commits and all the trees (a tree is the
  file list of a directory), and only the blobs needed to check out the
  branch you asked for.

--filter=tree:0

  This will download all the commits, and only the trees and blobs
  needed to check out the branch you asked for.

Both of those can be used with --sparse, which enables sparse checkout.
That basically only checks out the files in the root directory, and you
then use git sparse-checkout to add files to or remove files from the
checkout. That can be useful if you don't have a lot of disk space and
need multiple checkouts to work on.

Note that you can't really use --sparse on the ports tree if you want to
build things out of it, because you would need to add all the
dependencies, and the framework, to build a port. A kernel developer,
though, can probably live with only having the kernel sources and not
the whole world.

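To make the above a bit more concrete, here is a minimal sketch of a filtered, sparse clone. It is an illustration under assumptions, not the project's recommended workflow: the "server" is a throwaway local repository created on the spot, the directory names (sys/kern, bin/ls) are made up, and the two uploadpack.* settings are the stock Git options I would expect a server to need (I have not checked which exact settings were enabled on our servers).

```shell
#!/bin/sh
# Sketch only: a throwaway local repository stands in for the real server.
set -eu

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Build a tiny "upstream" with a kernel-ish and a userland-ish directory
# (hypothetical layout, for illustration only).
git init -q "$work/upstream"
cd "$work/upstream"
git config user.name "Example User"
git config user.email "user@example.org"
mkdir -p sys/kern bin/ls
echo "kernel source" > sys/kern/main.c
echo "userland source" > bin/ls/ls.c
echo "top-level file" > README
git add .
git commit -q -m "initial import"

# Server-side knobs: accept --filter, and serve single objects on demand.
git config uploadpack.allowFilter true
git config uploadpack.allowAnySHA1InWant true

# Partial, sparse clone. The file:// URL forces the normal transfer
# protocol; a plain local path would ignore --filter.
git clone -q --filter=blob:none --sparse "file://$work/upstream" "$work/clone"
cd "$work/clone"

echo "checked out right after the clone:"
ls

# Count objects the clone knows about but has not downloaded yet.
missing_before=$(git rev-list --objects --all --missing=print | grep -c '^?' || true)

# Add the kernel sources; the blobs they need are fetched on the fly.
git sparse-checkout set sys

missing_after=$(git rev-list --objects --all --missing=print | grep -c '^?' || true)
echo "missing objects: $missing_before before, $missing_after after"
```

Swapping blob:none for tree:0 in the clone line gives the second variant. Against a real server you would only need the git clone and git sparse-checkout lines, with the repository URL in place of the file:// one.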
And for numbers, because we all love numbers:

| filter           | SRC   | PORTS | DOC  |
|------------------|-------|-------|------|
| blob:none        | 605M  | 576M  | 119M |
| blob:none sparse | 314M  | 498M  | 37M  |
| tree:0           | 407M  | 238M  | 97M  |
| tree:0 sparse    | 115M  | 115M  | 15M  |
| no filtering     | 1461M | 1010M | 321M |

This is the size of .git/objects, for a checkout done this morning. So
it is basically the amount of data downloaded from the server.

Note the contrast with --depth=X, which limits the number of commits you
get from the server. That gives you a repository that is OK for testing,
but not great for development because of some limitations. The
repository you get when running --filter is fully usable; the only
drawback is that if you need bits of history you filtered out, they will
be downloaded on the fly, so internet access may be required.

PS: as filtering is done on the server, a knob needed to be enabled on
our servers; gitlab and github already supported the feature.
gitrepo.f.o and gitrepo-dev.f.o have it enabled. I am unsure about the
status of the mirrors, but they should be OK too.

-- 
Mathieu Arnold