Re: 60+% ping packet loss on Pi3 under -current and stable-13

From: Mark Millard <marklmi_at_yahoo.com>
Date: Mon, 02 May 2022 00:10:59 UTC
On 2022-May-1, at 16:27, bob prohaska <fbsd@www.zefox.net> wrote:

> On Sun, May 01, 2022 at 12:58:45PM -0700, Mark Millard wrote:
>> 
>> Looks like there is some problem getting past
>> gig1-1-1.gw.davsca11.sonic.net .
>> 
> 
> That seems independent of my own internal connection problems,
> but worth taking up with my ISP on Monday. Meanwhile, can you
> ping any other hosts in the 50.1.20.31-24 range? All are up
> at the moment. Hosts 28 and 24 are the troublemakers. 
> 
> If anybody cares there's an ascii-art network diagram at
> http://www.zefox.net/~fbsd/netmap
> 
> Not sure it'll survive the mailing list, but here goes:
> dsl_modem-----switch---------router-----lan-------wifi-----pi4_workstation
>                      |                  |             | 
>                      |                  |             |---Mac workstation
>                      |                  |
>                      |                  |------printer
>    ------------------|
>    |
>    |------50.1.20.30 ns1.zefox.net Pi2 12.3 usb-serial----50.1.20.27
>    |------50.1.20.29 ns2.zefox.net Pi2 12.3 usb-serial----50.1.20.30
>    |------50.1.20.27 www.zefox.net Pi2 12.3 usb-serial----50.1.20.26
>    |------50.1.20.26 www.zefox.com Pi2 -current usb-serial---50.1.20.24
>    |------50.1.20.24 pelorus.zefox.org Pi3 13.1 usb-serial---50.1.20.28
> switch
>    |------50.1.20.25 nemesis.zefox.com Pi4 -current usb-serial---50.1.20.29
>    |------50.1.20.28 www.zefox.org Pi3 -current usb-serial----50.1.20.25


For ns1.zefox.net there is no problem and
it looks like:

                                     My traceroute  [v0.95]
amd64_ZFS (192.168.1.120) -> ns1.zefox.net (50.1.20.29)                2022-05-01T16:52:27-0700
Keys:  Help   Display mode   Restart statistics   Order of fields   quit
                                                       Packets               Pings
 Host                                                Loss%   Snt   Last   Avg  Best  Wrst StDev
 1. 192.168.1.1                                       0.0%    53    1.2   0.8   0.1   1.4   0.4
 2. 172.30.26.67                                      0.0%    53   11.8  25.0  11.8  61.0  11.4
 3. 68.85.243.125                                     0.0%    53   10.0  10.0   7.7  46.9   5.3
 4. 96.216.60.165                                     0.0%    53    8.8   9.3   7.8  12.1   0.9
 5. 68.85.243.197                                     0.0%    53    8.6  13.2   8.6  28.3   4.2
 6. be-36231-cs03.seattle.wa.ibone.comcast.net        0.0%    53   15.3  14.8  13.0  16.9   1.0
 7. be-2312-pe12.seattle.wa.ibone.comcast.net         0.0%    53   16.2  15.9  12.9  59.8   6.5
 8. (waiting for reply)
 9. be3717.ccr22.sfo01.atlas.cogentco.com             0.0%    53   29.8  30.9  26.5  97.9  10.1
10. be2430.ccr31.sjc04.atlas.cogentco.com             0.0%    53   29.0  29.0  26.6  39.3   1.8
11. 38.104.141.82                                     0.0%    53   28.9  33.8  26.1 115.0  17.0
12. 0.xe-0-3-0.scrm-gw1.scrmca01.sonic.net            0.0%    53   32.1  31.3  29.2  33.9   1.0
13. 0.xe-0-0-0.cr1.scrmca13.sonic.net                 0.0%    53   30.5  32.1  29.2  57.6   4.3
14. gig1-1-1.gw.wscrca11.sonic.net                    0.0%    53   31.8  32.0  28.8  43.7   2.0
15. gig1-1-1.gw.davsca11.sonic.net                    0.0%    52   31.0  32.4  30.2  38.4   1.4
16. ns1.zefox.net                                     0.0%    52   51.4  51.1  49.8  53.4   0.8

ns2.zefox.net and the others show the destination
as hop 17. instead of hop 16. An example is:

                                     My traceroute  [v0.95]
amd64_ZFS (192.168.1.120) -> ns2.zefox.net (50.1.20.30)                2022-05-01T16:58:45-0700
Keys:  Help   Display mode   Restart statistics   Order of fields   quit
                                                       Packets               Pings
 Host                                                Loss%   Snt   Last   Avg  Best  Wrst StDev
 1. 192.168.1.1                                       0.0%    55    0.3   0.9   0.1   1.4   0.4
 2. 172.30.26.66                                      0.0%    55   13.5  26.4  10.4  54.7  10.1
 3. 68.85.243.77                                      0.0%    55   10.5   9.1   7.9  10.5   0.6
 4. 24.124.129.106                                    0.0%    54    8.3   9.5   8.2  13.4   1.0
 5. 96.216.60.165                                     0.0%    54    8.8   9.8   7.8  22.8   2.2
 6. 68.85.243.197                                     0.0%    54   17.1  15.1   9.0  37.3   5.9
 7. be-36241-cs04.seattle.wa.ibone.comcast.net        0.0%    54   15.2  15.0  13.2  17.8   0.9
 8. be-2412-pe12.seattle.wa.ibone.comcast.net         0.0%    54   15.0  14.8  13.2  17.1   1.0
 9. (waiting for reply)
10. be2075.ccr21.sfo01.atlas.cogentco.com             0.0%    54   28.4  29.2  26.9  36.8   1.4
11. be2379.ccr31.sjc04.atlas.cogentco.com             0.0%    54   29.8  30.0  27.3  84.2   7.6
12. 38.104.141.82                                     0.0%    54   28.6  33.7  27.5 105.5  16.2
13. 0.xe-0-3-0.scrm-gw1.scrmca01.sonic.net            0.0%    54   31.6  31.4  29.5  33.8   0.9
14. 0.xe-0-0-0.cr1.scrmca13.sonic.net                 0.0%    54   31.1  32.1  29.1  52.9   3.4
15. gig1-1-1.gw.wscrca11.sonic.net                    0.0%    54   31.2  31.9  30.0  34.1   0.9
16. gig1-1-1.gw.davsca11.sonic.net                    0.0%    54   33.3  32.6  30.8  45.8   2.1
17. ns2.zefox.net                                     0.0%    54   52.5  51.4  49.1  54.9   1.2

The routing need not be the same from one
try to the next.
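
A rough way to see which hops differ between two runs, assuming the
hop lines have been saved as text, is sketched below. The names
parse_hops and route_diff are made up for this illustration and are
not part of mtr. RUN_A and RUN_B hold only the first few hop lines
from the two displays above, so the result covers just that excerpt.

```python
import re

# Matches an mtr hop line such as
#   " 2. 172.30.26.67      0.0%    53   11.8 ..."
# or a silent hop such as
#   " 8. (waiting for reply)"
HOP_RE = re.compile(r"^\s*(\d+)\.\s+(.+?)(?:\s+(\d+(?:\.\d+)?)%.*)?$")

SILENT = "(waiting for reply)"


def parse_hops(text):
    """Return {hop_number: host} for every hop line found in text."""
    hops = {}
    for line in text.splitlines():
        m = HOP_RE.match(line)
        if m:
            hops[int(m.group(1))] = m.group(2).strip()
    return hops


def route_diff(run_a, run_b):
    """Return (hosts only in run_a, hosts only in run_b), ignoring silent hops."""
    a = set(parse_hops(run_a).values()) - {SILENT}
    b = set(parse_hops(run_b).values()) - {SILENT}
    return sorted(a - b), sorted(b - a)


# Excerpts (first hops only) copied from the two displays above.
RUN_A = """
 1. 192.168.1.1                                       0.0%    53    1.2   0.8   0.1   1.4   0.4
 2. 172.30.26.67                                      0.0%    53   11.8  25.0  11.8  61.0  11.4
 3. 68.85.243.125                                     0.0%    53   10.0  10.0   7.7  46.9   5.3
"""

RUN_B = """
 1. 192.168.1.1                                       0.0%    55    0.3   0.9   0.1   1.4   0.4
 2. 172.30.26.66                                      0.0%    55   13.5  26.4  10.4  54.7  10.1
 3. 68.85.243.77                                      0.0%    55   10.5   9.1   7.9  10.5   0.6
 4. 24.124.129.106                                    0.0%    54    8.3   9.5   8.2  13.4   1.0
"""

only_a, only_b = route_diff(RUN_A, RUN_B)
print("only in first run: ", only_a)
print("only in second run:", only_b)
```

The comparison is by host name or address only, so a hop that merely
moves to a different hop number does not count as a difference.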

www.zefox.net     is similar.
www.zefox.com     is similar.
pelorus.zefox.org is similar.
nemesis.zefox.com is similar.
www.zefox.org     is similar.

Notably, www.zefox.org is the host I tried and
reported on earlier, when it had the failures.

I observed an initial connection sequence once
for pelorus.zefox.org where it briefly displayed
something like (not captured, just from memory):

16. gig1-1-1.gw.davsca11.sonic.net
17. (waiting for reply)
18. (waiting for reply)
19. pelorus.zefox.org

before changing to

16. gig1-1-1.gw.davsca11.sonic.net
17. ns2.zefox.net

That may be normal, just timed such that I
would not usually see it.

But it might actually be evidence of a stage that
leads to the overall failure, with the display
never getting past:

16. gig1-1-1.gw.davsca11.sonic.net
17. (waiting for reply)
18. (waiting for reply)
19. WHATEVER

in some cases.
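
One way to check saved displays for that pattern is sketched below.
It is only an illustration: hosts_in_order and silent_hops_after are
hypothetical helper names, and BRIEF and SETTLED are the two
from-memory listings quoted above, not captured output.

```python
SILENT = "(waiting for reply)"


def hosts_in_order(text):
    """Return the host column of each 'N. host' line, ordered by hop number."""
    hops = []
    for line in text.splitlines():
        num, sep, rest = line.strip().partition(". ")
        if sep and num.isdigit():
            # Two or more spaces separate the host from any statistics columns.
            hops.append((int(num), rest.split("  ")[0].strip()))
    return [host for _, host in sorted(hops)]


def silent_hops_after(text, gateway):
    """Count consecutive silent hops directly after the last hop naming gateway.

    Returns None when the gateway does not appear in the listing at all.
    """
    hosts = hosts_in_order(text)
    if gateway not in hosts:
        return None
    last = len(hosts) - 1 - hosts[::-1].index(gateway)
    count = 0
    for host in hosts[last + 1:]:
        if host != SILENT:
            break
        count += 1
    return count


GATEWAY = "gig1-1-1.gw.davsca11.sonic.net"

# The two listings quoted above (recalled from memory in the original text).
BRIEF = """
16. gig1-1-1.gw.davsca11.sonic.net
17. (waiting for reply)
18. (waiting for reply)
19. pelorus.zefox.org
"""

SETTLED = """
16. gig1-1-1.gw.davsca11.sonic.net
17. ns2.zefox.net
"""

print(silent_hops_after(BRIEF, GATEWAY))
print(silent_hops_after(SETTLED, GATEWAY))
```

A count that stays above zero across repeated samples would fit the
stuck case; a count that drops to zero would fit the brief startup
display.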

However, in the tests above, both of the hosts below worked fine:

50.1.20.24 pelorus.zefox.org Pi3 13.1 usb-serial---50.1.20.28
50.1.20.28 www.zefox.org Pi3 -current usb-serial----50.1.20.25

What changed?


===
Mark Millard
marklmi at yahoo.com