From: Eugene Grosbein
To: Attila Nagy , freebsd-stable@freebsd.org
Subject: Re: Fwd: Kernel DHCP unpredictable/fails (PXE boot), userspace DHCP works just fine
Date: Sun, 19 Mar 2023 03:08:02 +0700
List-Id: Production branch of FreeBSD source code
List-Archive: https://lists.freebsd.org/archives/freebsd-stable

17.03.2023 3:44, Attila Nagy wrote:

> Hi,
>
> As this is super annoying, I'm willing to pay a $500 bounty for solving this issue (whomever is first, however I don't anticipate a big competition :) Having an invoice would be best, but I'm willing to accept individuals as well).
> I can't give remote access, but can run debug builds with serial console. stable/13 branch.
>
> I have a bunch of netbooted machines, one set in a cluster is older (HP DL80 G9, 2x8C, Intel I350 -igb- NICs), the other set is newer (HP XL225n G10, AMD EPYC2x16C, BCM57412 -bnxt- NICs).
> All of these boot from the network, which is basically:
> - get IP and options with DHCP with the help of the NIC's PXE stack
> - get the loader and kernel, start it
> - do another round of DHCP from the kernel (bootp_subr.c)
> - mount the root via NFS and let everything work as usual
>
> The problem is that the newer machines take an indefinite time to boot. The older ones (with igb NIC) work reliably, they always boot fast.
> The process of getting an IP address via DHCP (bootpc_call from bootp_subr.c) either succeeds normally (in a few seconds), or takes a lot of time.
> Common (measured) times to boot range from 10s of minutes to anywhere between a few hours (1-6).
> Sometimes it just gets stuck and couldn't get past bootpc_call (getting the DHCP lease).
>
> What I've already tried:
> - we have a redundant set of DHCP servers which offer static leases (so there are two DHCPOFFERs), so I tried to turn off one of them, nothing has changed
> - tried to disable SMP, the effect is the same
> - tried to see whether it's a network issue. The NIC's PXE stack always gets the lease quickly and booting FreeBSD from an ISO and issuing dhclient on the same interface is also fast. After the machines have booted, there are no network issues, they work reliably (since more than a year for 20+ machines, so not just a few hours)
>
> This issue wasn't so bad previously (only a few mins to tens of minutes delay), but recently it got pretty unbearable, even making some machines unbootable for days...
>
> First I thought it might be a packet loss (or more exactly packet delivery from the DHCP server to the receiving socket), either in the network or in the NIC/kernel itself, so I placed a few random printfs into bootp_subr.c and udp_usrreq.c.
>
> After spending some time trying to understand the problem it feels like a race condition in bootpc_call, but I don't know the code well enough to effectively verify that.

To me, this looks like a timekeeping problem. Please show the output of:

sysctl kern.timecounter kern.eventtimer

after the machine has booted to single- or multi-user mode. Also, show a verbose boot log (bootverbose).

Sometimes the UEFI/BIOS setup has settings for the ACPI/HPET timers (enable/disable); did you try "playing" with such options?

Note that there is a loader tunable, kern.timecounter.hardware="HPET", that can be used to force a particular timecounter source for the kernel. It can be set via loader.conf or device.hints, or any other way that puts it into kenv; the kenv/device.hints contents can even be compiled into a custom kernel binary.
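
For convenience, the requests above could be collected with something like the following. This is only a rough sketch: the function name collect_timer_info is made up for illustration, the non-FreeBSD guard is just there so the script is harmless elsewhere, and the loader.conf lines in the comments assume the netboot root's /boot/loader.conf is the file the loader actually reads on these machines.

```shell
#!/bin/sh
# Sketch: gather the timer information requested above.
# Run it after the machine has reached single- or multi-user mode.

# Hypothetical helper name; it simply wraps the sysctl query.
collect_timer_info() {
    if [ "$(uname -s)" != "FreeBSD" ]; then
        # Guard so the script does nothing harmful on other systems.
        echo "not FreeBSD, skipping sysctl query"
        return 0
    fi
    # Dumps both subtrees, including kern.timecounter.choice and
    # kern.timecounter.hardware (the currently selected source).
    sysctl kern.timecounter kern.eventtimer
}

collect_timer_info

# Assumed loader.conf lines for the follow-up experiments:
#   boot_verbose="YES"                 # verbose boot log
#   kern.timecounter.hardware="HPET"   # force a timecounter source
```

Whether "HPET" is a valid choice on the bnxt machines depends on what kern.timecounter.choice lists there, so it is worth checking that output before forcing a value.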