Re: Porting question related to modifying original source code

From: Edward Sanford Sutton, III <mirror176_at_hotmail.com>
Date: Wed, 10 Apr 2024 21:29:37 UTC
On 4/10/24 07:12, Gleb Popov wrote:
> On Wed, Apr 10, 2024 at 5:09 PM Brad D <social@brandongrows.me> wrote:
>>
>> Is it uncalled for replacing problematic embedded libraries with equivalent ones in a port as a dependency if the library is in the repo and well maintained?
> 
> It is the other way around, we're usually striving to use
> system-provided dependencies as much as possible.

   Correct. If the library has not been customized within that port then
using a copy of the library from the ports tree has benefits: it is
built once even when multiple things depend on it, and updates and
vulnerabilities that affect just that library can be tracked and applied
to just that library. If FreeBSD requires modifications to the library
for compatibility then that work is done in one common place, with all
dependent ports benefiting from it.
   If this port is the only user of that library then it may not seem
to make sense to extract it into a port that then has to be maintained,
but once there is a second consumer of the library the same benefits
apply, and it is harder to tell that there is one when things are
bundled.
   If this port bundles a modified copy of the library then it may not
be straightforward, or even possible, to use a copy from the ports tree.
If the bundled copy does not build properly on FreeBSD then it may need
to be altered; if there is a port of the unmodified library then that
port may be useful for learning what changes to consider, though they
may no longer apply. Watching for security issues in that library means
checking whether the modified copy is affected by them, both in how it
was changed and in how it is (or could be) used.
   Many projects I have looked at bundle libraries to make their
software easier to build: they have a smaller dependency list to give
out, they depend on an older/newer version of a library than is found
on their commonly supported Linux distributions, or the library is not
available as a package in a commonly used repo. Some projects stay on
top of security issues while others may be oblivious to them for
various reasons.

   If your port is pretty straightforward, bonus points for adding
EXTRACT_AFTER_ARGS+= --exclude path_to_a_folder_or_file_to_ignore
lines to skip extracting the shared/unneeded files. Any file not needed
to build the port and not included in a package of the port is a file
that took disk I/O and "maybe" CPU time to extract and, on cleanup, to
delete; builds on filesystems in RAM need less RAM, SSD users save
unnecessary write cycles on their drive, and fewer things being
written/deleted can mean less fragmentation to deal with.
   I do not yet have a good workflow for determining what is not needed
short of trial and error. I presume that on a filesystem where atime is
present/enabled, one could identify all files read after
extraction/patching completed to determine it.
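
   A rough, untested sketch of that atime idea as two helper targets in
the port Makefile, placed after the bsd.port.mk include. The target name
list-unread and the stamp file .atime-stamp are made up for
illustration. It assumes the port does not already define post-patch
and that the filesystem holding ${WRKDIR} really updates access times
(not mounted noatime, atime enabled on ZFS).

```make
# Hypothetical helper targets; names are illustrative only.
# Assumes access times are updated on the filesystem under ${WRKDIR}.

# Record a reference time once extraction and patching are done.
post-patch:
	@${TOUCH} ${WRKDIR}/.atime-stamp

# List files whose access time is not newer than the stamp,
# i.e. files nothing has read since patching finished.
list-unread:
	@${FIND} ${WRKSRC} -type f ! -anewer ${WRKDIR}/.atime-stamp
```

Running "make stage" and then "make list-unread" would print candidates
for --exclude. The list is only a starting point: a file that is unread
with one set of options may be needed with another.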
   I've been meaning to do some basic testing of how many things could
be skipped, at least from extraction, by extracting all ports and
looking through duplicates and the 3rdparty folder I now see more
commonly in GitHub projects. This feels like the kind of thing that Mark
Millard has probably already done somewhere, and in a better way.
   A port I had done some work on, using a .tar.gz, takes 10.3s to
extract and about 35-38s to clean all files. With the extracted content
reduced it takes 11.4s to extract and about 39s to clean (both timed on
a magnetic hard drive over multiple runs, ignoring the first, on a
non-idle system), so no time was saved in this case. Not all archives,
archivers and archive formats benefit from knowing what can be skipped.
Decreasing 756M across 69609 files to 671M across 69555 files still
sounded useful to me. I should try to see if I can put it in a loop or
use longer lines, but it was a little bit of clutter, as follows:
EXTRACT_AFTER_ARGS+=   --exclude '*.ods' \
	--exclude '*.pdf' \
	--exclude '*.doc' \
	--exclude '*.xls*' \
	--exclude '*.xcf' \
	--exclude '*.blend' \
	--exclude '*.blend1' \
	--exclude '*.yml' \
	--exclude '*.vcproj'
In my case nothing in the port was dynamic, but conditionally adding
more EXTRACT_AFTER_ARGS arguments would likely help some ports further
when port options change.
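
   The loop form and an option-dependent exclude could look roughly like
the sketch below. EXCLUDE_EXTS is a made-up variable name, the
extensions are the ones from my list, and the DOCS example assumes the
port lists DOCS in OPTIONS_DEFINE and that the distfile has a docs
directory under ${DISTNAME}, which is a hypothetical layout.

```make
# Extension list copied from the lines above; EXCLUDE_EXTS is a made-up name.
EXCLUDE_EXTS=	ods pdf doc xls* xcf blend blend1 yml vcproj
.for ext in ${EXCLUDE_EXTS}
EXTRACT_AFTER_ARGS+=	--exclude '*.${ext}'
.endfor

# Option-dependent exclude: skip the docs directory when DOCS is off.
# Assumes DOCS is in OPTIONS_DEFINE; the docs path is hypothetical.
.include <bsd.port.options.mk>

.if empty(PORT_OPTIONS:MDOCS)
EXTRACT_AFTER_ARGS+=	--exclude '${DISTNAME}/docs'
.endif
```

One thing I have not verified: if the framework only supplies its
default EXTRACT_AFTER_ARGS when the variable is unset, then setting it
in the port replaces those defaults, so it is worth comparing the final
tar command line with and without the excludes.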
   I am sure porters can come up with a better interface so we don't
need to put "--exclude=..." (an archiver-specific parameter format) in
front of each file/folder in every Makefile. It might be interesting to
keep the list of files to ignore in an external file, but I don't know
whether we would prefer more files making up the ports tree or a
cleaner Makefile; each has advantages and disadvantages, so someone
better than myself will be making such "best practices" calls.