Re: www/chromium will not build on a host w/ 8 CPU and 16G mem [RPi4B 8 GiByte example]
Date: Fri, 25 Aug 2023 09:21:33 UTC
On Aug 24, 2023, at 21:57, bob prohaska <fbsd@www.zefox.net> wrote:

> On Thu, Aug 24, 2023 at 03:20:50PM -0700, Mark Millard wrote:
>> bob prohaska <fbsd_at_www.zefox.net> wrote on
>> Date: Thu, 24 Aug 2023 19:44:17 UTC :
>>
>>> On Fri, Aug 18, 2023 at 08:05:41AM +0200, Matthias Apitz wrote:
>>>>
>>>> sysctl vfs.read_max=128
>>>> sysctl vfs.aio.max_buf_aio=8192
>>>> sysctl vfs.aio.max_aio_queue_per_proc=65536
>>>> sysctl vfs.aio.max_aio_per_proc=8192
>>>> sysctl vfs.aio.max_aio_queue=65536
>>>> sysctl vm.pageout_oom_seq=120
>>>> sysctl vm.pfault_oom_attempts=-1
>>>>
>>>
>>> Just tried these settings on a Pi4, 8GB. Seemingly no help,
>>> build of www/chromium failed again, saying only:
>>>
>>> ===> Compilation failed unexpectedly.
>>> Try to set MAKE_JOBS_UNSAFE=yes and rebuild before reporting the failure to
>>> the maintainer.
>>> *** Error code 1
>>>
>>> No messages on the console at all, no indication of any swap use at all.
>>> If somebody can tell me how to invoke MAKE_JOBS_UNSAFE=yes, either
>>> locally or globally, I'll give it a try. But, if it's a system problem
>>> I'd expect at least a peep on the console....
>>
>> Are you going to post the log file someplace?
>
> http://nemesis.zefox.com/~bob/data/logs/bulk/main-default/2023-08-20_16h11m59s/logs/errors/chromium-115.0.5790.170_1.log
>
>> You may have missed an earlier message.
>
> Yes, I did. Some (very long) lines above, there is:
>
> [ 96% 53691/55361] "python3" "../../build/toolchain/gcc_link_wrapper.py" --output="./v8_context_snapshot_generator" -- c++ -fuse-ld=lld -Wl,--build-id=sha1 -fPIC -Wl,-z,noexecstack -Wl,-z,relro -Wl,-z,now -Wl,--icf=all -Wl,--color-diagnostics -Wl,--undefined-version -Wl,-mllvm,-enable-machine-outliner=never -no-canonical-prefixes -Wl,-O2 -Wl,--gc-sections -rdynamic -pie -Wl,--disable-new-dtags -Wl,--icf=none -L/usr/local/lib -fstack-protector-strong -L/usr/local/lib -o "./v8_context_snapshot_generator" -Wl,--start-group @"./v8_context_snapshot_generator.rsp" -Wl,--end-group -lpthread -lgmodule-2.0 -lglib-2.0 -lgobject-2.0 -lgthread-2.0 -lintl -licui18n -licuuc -licudata -lnss3 -lsmime3 -lnssutil3 -lplds4 -lplc4 -lnspr4 -ldl -lkvm -lexecinfo -lutil -levent -lgio-2.0 -ljpeg -lpng16 -lxml2 -lxslt -lexpat -lwebp -lwebpdemux -lwebpmux -lharfbuzz-subset -lharfbuzz -lfontconfig -lopus -lopenh264 -lm -lz -ldav1d -lX11 -lXcomposite -lXdamage -lXext -lXfixes -lXrender -lXrandr -lXtst -lepoll-shim -ldrm -lxcb -lxkbcommon -lgbm -lXi -lGL -lpci -lffi -ldbus-1 -lpangocairo-1.0 -lpango-1.0 -lcairo -latk-1.0 -latk-bridge-2.0 -lsndio -lFLAC -lsnappy -latspi
> FAILED: v8_context_snapshot_generator

That FAILED line is 64637.

> Then, a bit further down in the file a series of
> ld.lld: error: relocation R_AARCH64_ABS64 cannot be used against local symbol; recompile with -fPIC
> complaints.

The first R_AARCH64_ABS64 line is 64339. After that are the next 2 lines, with:

defined in obj/third_party/ffmpeg/libffmpeg_internal.a(ffmpeg_internal/autorename_libavcodec_aarch64_fft_neon.o)

and:

referenced by ffmpeg_internal/autorename_libavcodec_aarch64_fft_neon.o:(fft_tab_neon) in archive obj/third_party/ffmpeg/libffmpeg_internal.a
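In case it helps to poke at the failure without re-running the whole
build, a rough sketch. (Just a sketch, with assumptions: that the error
log from your URL is on hand locally under that file name, that the
port's build directory with its obj/ tree still exists, and that
readelf, or llvm-readelf, is available; the paths would need adjusting
to your setup.)

# grep -n 'R_AARCH64_ABS64' chromium-115.0.5790.170_1.log | head
# readelf -r obj/third_party/ffmpeg/libffmpeg_internal.a | grep -B 2 R_AARCH64_ABS64

The first command just locates the lld complaints by line number in the
log. The second should list which members of that archive actually carry
R_AARCH64_ABS64 relocations, showing whether the fft_neon object is the
only one involved.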
> Unclear if the two kinds of complaints are related, nor whether they're the first..
>
>> How long had it run before stopping?
>
> 95 hours, give or take. Nothing about a timeout was reported.

>> How does that match up with the MAX_EXECUTION_TIME
>> and NOHANG_TIME and the like that you have poudriere set
>> up to use ( /usr/local/etc/poudriere.conf )?
>
> NOHANG_TIME=44400
> MAX_EXECUTION_TIME=1728000
> MAX_EXECUTION_TIME_EXTRACT=144000
> MAX_EXECUTION_TIME_INSTALL=144000
> MAX_EXECUTION_TIME_PACKAGE=11728000
>
> Admittedly some are plain silly; I just started
> tacking on zeros after getting timeouts and being
> unable to match the error message and variable name..
>
> I checked for duplicates this time, however.

Not stopped for time.

>> Something relevant for the question is what you have for:
>>
>> # Grep build logs to determine a possible build failure reason. This is
>> # only shown on the web interface.
>> # Default: yes
>> DETERMINE_BUILD_FAILURE_REASON=no
>>
>> Using DETERMINE_BUILD_FAILURE_REASON leads to large builds
>> running for a long time after it starts the process of
>> stopping from a timeout: the grep activity takes a long
>> time and the build activity is not stopped during the
>> grep.
>>
>> vm.pageout_oom_seq=120 and vm.pfault_oom_attempts=-1 make
>> sense to me for certain kinds of issues involved in large
>> builds, presuming sufficient RAM+SWAP for how it is set
>> up to operate. vm.pageout_oom_seq is associated with
>> console/log messages. If one runs out of RAM+SWAP,
>> vm.pfault_oom_attempts=-1 tends to lead to deadlock. But
>> it allows slow I/O to have the time to complete and so
>> can be useful.
>>
>> I'm not sure that any vfs.aio.* is actually involved: special
>> system calls are involved, splitting requests vs. retrieving
>> the status of completed requests later. Use of aio has to be
>> explicit in the running software from what I can tell. I've
>> no information about which software builds might be using aio
>> during the build activity.
>>
>> # sysctl -d vfs.aio
>> vfs.aio: Async IO management
>> vfs.aio.max_buf_aio: Maximum buf aio requests per process
>> vfs.aio.max_aio_queue_per_proc: Maximum queued aio requests per process
>> vfs.aio.max_aio_per_proc: Maximum active aio requests per process
>> vfs.aio.aiod_lifetime: Maximum lifetime for idle aiod
>> vfs.aio.num_unmapped_aio: Number of aio requests presently handled by unmapped I/O buffers
>> vfs.aio.num_buf_aio: Number of aio requests presently handled by the buf subsystem
>> vfs.aio.num_queue_count: Number of queued aio requests
>> vfs.aio.max_aio_queue: Maximum number of aio requests to queue, globally
>> vfs.aio.target_aio_procs: Preferred number of ready kernel processes for async IO
>> vfs.aio.num_aio_procs: Number of presently active kernel processes for async IO
>> vfs.aio.max_aio_procs: Maximum number of kernel processes to use for handling async IO
>> vfs.aio.unsafe_warningcnt: Warnings that will be triggered upon failed IO requests on unsafe files
>> vfs.aio.enable_unsafe: Permit asynchronous IO on all file types, not just known-safe types
>>
>> vfs.read_max may well change the disk access sequences:
>>
>> # sysctl -d vfs.read_max
>> vfs.read_max: Cluster read-ahead max block count
>>
>> That might well help some spinning rust or other types of
>> I/O.
>
> There don't seem to be any indications of disk speed being
> a problem, despite using "spinning rust" 8-)

Nope: R_AARCH64_ABS64 misuse is not a disk speed issue.

>> MAKE_JOBS_UNSAFE=yes is, for example, put in makefiles of
>> ports that have problems with parallel build activity. It
>> basically disables having parallel activity in the build
>> context involved. I've no clue if you use the likes of,
>> say,
>>
>> /usr/local/etc/poudriere.d/make.conf
>>
>> with conditional logic inside such as use of notation
>> like:
>>
>> .if ${.CURDIR:M*/www/chromium}
>> STUFF HERE
>> .endif
>>
>> but you could.
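As for how to invoke MAKE_JOBS_UNSAFE=yes by hand: it is just a make
variable that the ports framework tests, so outside of poudriere a
one-off attempt might look like the sketch below. (Just a sketch, with
assumptions: a normal ports tree under /usr/ports, and I have not
verified the behaviour against the chromium port specifically.)

# cd /usr/ports/www/chromium
# make MAKE_JOBS_UNSAFE=yes

A bare MAKE_JOBS_UNSAFE=yes line in /etc/make.conf would be the "global"
variant, at the cost of serializing every port build; for poudriere the
conditional make.conf notation above is the analogous mechanism. That
said, I doubt it will change this particular failure, given what
follows.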
The actual R_AARCH64_ABS64 use is in:

obj/third_party/ffmpeg/libffmpeg_internal.a(ffmpeg_internal/autorename_libavcodec_aarch64_fft_neon.o)

not directly in chromium. The solution is not clear to me.

> That wasn't needed when the Pi4 last compiled www/chromium.
> A Pi3 did benefit from tuning of that sort.
>
> It sounds like the sysctl settings were unlikely to be
> a source of the trouble seen, if not actively helpful.

Yep, the sysctls were not relevant.

> For the moment the machine is updating world and kernel.
> That should finish by tomorrow, at which point I'll try
> to add something like
>
> .if ${.CURDIR:M*/www/chromium}
> MAKE_JOBS_UNSAFE=yes
> .endif
>
> to /usr/local/etc/poudriere.d/make.conf

That will not help avoid the R_AARCH64_ABS64 abuse, unfortunately.

===
Mark Millard
marklmi at yahoo.com