From: Attila Nagy <nagy.attila@gmail.com>
Date: Thu, 16 Mar 2023 22:26:40 +0100
Subject: Re: Kernel DHCP unpredictable/fails (PXE boot), userspace DHCP works just fine
To: Yves Guérin
Cc: "freebsd-stable@freebsd.org"
References: <132303943.191443.1679001265318@mail.yahoo.com>
In-Reply-To: <132303943.191443.1679001265318@mail.yahoo.com>
List-Id: Production branch of FreeBSD source code
List-Archive: https://lists.freebsd.org/archives/freebsd-stable
Sender: owner-freebsd-stable@freebsd.org
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="00000000000080e78305f70b1ed9"

--00000000000080e78305f70b1ed9
Content-Type: text/plain; charset="UTF-8"

Hey,

Sure. We're talking about 30 machines, and they all behave the same (either
bad or good). I'm pretty sure it's not a cabling issue. :)

Yves Guérin wrote (on Thu, 16 Mar 2023, 22:14):

> Dear Attila,
>
> Maybe I will add some noise to your thread, sorry in advance. I am just a
> sysadmin, and I faced the same problem with one of my old HP G7 machines:
> the network card was broken (malfunctioning). Sometimes it worked and
> sometimes it didn't when I used PXE and dhcpd (it took too much time to
> answer the DHCP request, so the motherboard decided to reboot, and so on,
> in an infinite loop). The card works perfectly when it's set up by an OS.
>
> Maybe it's a stupid question or two: did you check the network cable? (I
> have faced some defective cables and it ruined my day...) In the same way,
> what about the hub/router attached to this server (configuration, etc.)?
> Did you swap between a good one and a bad one (same network cable,
> hub/router, etc.)?
>
> I spend too many nights in the lab...
>
> Regards,
>
> Yves Guerin
>
>
> On Thursday, 16 March 2023 at 16:44:49 UTC−4, Attila Nagy wrote:
>
>
> Hi,
>
> As this is super annoying, I'm willing to pay a $500 bounty for solving
> this issue (whoever is first, although I don't anticipate a big
> competition :) Having an invoice would be best, but I'm willing to accept
> individuals as well).
> I can't give remote access, but I can run debug builds with a serial
> console. stable/13 branch.
>
> I have a bunch of netbooted machines. One set in a cluster is older (HP
> DL80 G9, 2x8C, Intel I350 -igb- NICs), the other set is newer (HP XL225n
> G10, AMD EPYC 2x16C, BCM57412 -bnxt- NICs).
> All of these boot from the network, which is basically:
> - get IP and options with DHCP with the help of the NIC's PXE stack
> - get the loader and kernel, start it
> - do another round of DHCP from the kernel (bootp_subr.c)
> - mount the root via NFS and let everything work as usual
>
> The problem is that the newer machines take an indefinite time to boot.
> The older ones (with igb NIC) work reliably; they always boot fast.
> The process of getting an IP address via DHCP (bootpc_call from
> bootp_subr.c) either succeeds normally (in a few seconds) or takes a very
> long time.
> Measured boot times commonly range from tens of minutes to a few hours
> (1-6).
> Sometimes it just gets stuck and can't get past bootpc_call (getting
> the DHCP lease).
>
> What I've already tried:
> - we have a redundant set of DHCP servers which offer static leases (so
>   there are two DHCPOFFERs), so I tried turning off one of them; nothing
>   changed
> - tried to disable SMP; the effect is the same
> - tried to see whether it's a network issue. The NIC's PXE stack always
>   gets the lease quickly, and booting FreeBSD from an ISO and issuing
>   dhclient on the same interface is also fast. After the machines have
>   booted, there are no network issues; they work reliably (for more than
>   a year on 20+ machines, so not just a few hours)
>
> This issue wasn't so bad previously (only a few minutes to tens of minutes
> of delay), but recently it has become pretty unbearable, even making some
> machines unbootable for days...
>
> First I thought it might be packet loss (or, more exactly, a failure to
> deliver packets from the DHCP server to the receiving socket), either in
> the network or in the NIC/kernel itself, so I placed a few random printfs
> into bootp_subr.c and udp_usrreq.c.
>
> After spending some time trying to understand the problem, it feels like a
> race condition in bootpc_call, but I don't know the code well enough to
> verify that.
>
> Here are the modified bootp_subr.c and udp_usrreq.c:
> https://gist.githubusercontent.com/bra-fsn/128ae9a3bbc0dbdbb2f6f4b3e2c5157a/raw/a8ade8af252f618c84a46da2452d557ebc5078ac/bootp_subr.c
> https://gist.github.com/bra-fsn/128ae9a3bbc0dbdbb2f6f4b3e2c5157a/raw/a8ade8af252f618c84a46da2452d557ebc5078ac/udp_usrreq.c
> (modified from the stable/13 branch as of a few weeks ago)
>
> This is the output with the always working DL80 (igb) machine:
> https://gist.github.com/bra-fsn/128ae9a3bbc0dbdbb2f6f4b3e2c5157a/raw/a8ade8af252f618c84a46da2452d557ebc5078ac/DL80%2520igb%2520good.txt
>
> This is the console output from a working boot for the XL225n (bnxt)
> machine:
> https://gist.github.com/bra-fsn/128ae9a3bbc0dbdbb2f6f4b3e2c5157a/raw/a8ade8af252f618c84a46da2452d557ebc5078ac/XL225n%2520bnxt%2520good.txt
> As you can see, it's much slower than the DL80 (which also isn't that
> fast...)
>
> And this one is a longer output, without success up to that point (2
> minutes without completing the DHCP flow):
> https://gist.github.com/bra-fsn/128ae9a3bbc0dbdbb2f6f4b3e2c5157a/raw/a8ade8af252f618c84a46da2452d557ebc5078ac/XL225n%2520bnxt%2520long.txt
>
> For the latter, here's an excerpt from the DHCP log:
> https://gist.githubusercontent.com/bra-fsn/128ae9a3bbc0dbdbb2f6f4b3e2c5157a/raw/a8ade8af252f618c84a46da2452d557ebc5078ac/dhcp_log.txt
>
> It seems the DHCP state always gets reset to IF_DHCP_UNRESOLVED even if
> there are answers from the DHCP server.
>
> Here's another, longer console log, which succeeded after spending 236
> seconds in the loop:
> https://gist.github.com/bra-fsn/128ae9a3bbc0dbdbb2f6f4b3e2c5157a/raw/a77f52f5e83c699b38a7c2d3acdc52d26ceeba71/XL225n%2520bnxt%2520long%2520good.txt
>
> Any ideas about this?
>

--00000000000080e78305f70b1ed9
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
--00000000000080e78305f70b1ed9--