Serial ports intermittently not getting data

From: Mark Kane <mark_at_kane.mn>
Date: Sat, 06 Nov 2021 01:20:47 UTC
Hi,

I'm running into an issue with serial ports randomly not seeing data on 
certain boots where nothing has changed, except rebooting. For debugging 
purposes I have simplified the setup down to just 2 serial ports and 
seeing if cu(1) in base is able to see data. The actual application is 4 
separate Java processes accessing 4 serial ports via the comms/rxtx 
serial library.

- /dev/ttyu1 is connected to a temperature sensor that constantly 
outputs data. /dev/ttyu4 is connected to a GPS receiver that constantly 
outputs data.
- On any given boot of FreeBSD 13.0 or 12.2 cu(1) may see data from one, 
both, or neither of the ports. In the case where a port isn't receiving 
data, cu(1) prints "Connected" but there is no data from the serial 
port.
- Without making any changes to the system, rebooting via `shutdown -r 
now`, `shutdown -p now`, or `shutdown -c now` may allow one or both of 
the serial ports to start seeing data again.
- On the next reboot with no changes the behavior seems random whether 
one, both, or neither port will have data.
- After 200 reboot tests with no changes made, 85 boots have at least 
one of the two serial ports in the broken state with no data and the 
other 115 reboots have both ports working correctly.
- No amount of fiddling with stty options has changed this behavior. The 
default lines for these ports in /etc/ttys being present or commented 
out also seems to have no effect.
- I have not found any way to "reset" the ports back to a working state 
with data while FreeBSD is booted short of trying a reboot, which may or 
may not allow either/both ports to work again.
- Bugs 152310 [1] and 147082 [2] sound like they could be related. 
Unfortunately the mentioned workaround of switching uart from kernel to 
a module and kldunload && kldload does not bring the ports back to life.

Does anyone have any ideas what might be going on and any possible way 
to force ports that have no data back to a working state without 
rebooting?

Thanks in advance,

-Mark

[1] https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=152310
[2] https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=147082

dmesg:
Copyright (c) 1992-2021 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
     The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 13.0-RELEASE-p4 #0: Thu Oct 14 19:05:57 UTC 2021
     root@test:/usr/obj/usr/src/amd64.amd64/sys/GENERIC amd64
FreeBSD clang version 11.0.1 (git@github.com:llvm/llvm-project.git 
llvmorg-11.0.1-0-g43ff75f2c3fe)
VT(vga): resolution 640x480
CPU: Intel(R) Atom(TM) CPU  E3845  @ 1.91GHz (1916.71-MHz K8-class CPU)
   Origin="GenuineIntel"  Id=0x30679  Family=0x6  Model=0x37  Stepping=9
   
Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
   
Features2=0x43d8e3bf<SSE3,PCLMULQDQ,DTES64,MON,DS_CPL,VMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,SSE4.1,SSE4.2,MOVBE,POPCNT,TSCDLT,AESNI,RDRAND>
   AMD Features=0x28100800<SYSCALL,NX,RDTSCP,LM>
   AMD Features2=0x101<LAHF,Prefetch>
   Structured Extended Features=0x2282<TSCADJ,SMEP,ERMS,NFPUSG>
   VT-x: (disabled in BIOS) PAT,HLT,MTF,PAUSE,EPT,UG,VPID
   TSC: P-state invariant, performance statistics
real memory  = 2147483648 (2048 MB)
avail memory = 1907335168 (1818 MB)
Event timer "LAPIC" quality 600
ACPI APIC Table: <ALASKA A M I >
WARNING: L1 data cache covers fewer APIC IDs than a core (0 < 1)
FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
FreeBSD/SMP: 1 package(s) x 4 core(s)
random: registering fast source Intel Secure Key RNG
random: fast provider: "Intel Secure Key RNG"
random: unblocking device.
Firmware Warning (ACPI): 32/64X length mismatch in FADT/Gpe0Block: 
128/32 (20201113/tbfadt-748)
ioapic0 <Version 2.0> irqs 0-86
Launching APs: 2 3 1
Timecounter "TSC" frequency 1916712912 Hz quality 1000
KTLS: Initialized 4 threads
random: entropy device external interface
000.000019 [4354] netmap_init               netmap: loaded module
[ath_hal] loaded
WARNING: Device "kbd" is Giant locked and may be deleted before FreeBSD 
14.0.
kbd0 at kbdmux0
mlx5en: Mellanox Ethernet driver 3.6.0 (December 2020)
nexus0
vtvga0: <VT VGA driver>
cryptosoft0: <software crypto>
aesni0: <AES-CBC,AES-CCM,AES-GCM,AES-ICM,AES-XTS>
acpi0: <ALASKA A M I >
acpi0: Power Button (fixed)
unknown: I/O range not supported
cpu0: <ACPI CPU> on acpi0
atrtc0: <AT realtime clock> port 0x70-0x77 on acpi0
atrtc0: Warning: Couldn't map I/O.
atrtc0: registered as a time-of-day clock, resolution 1.000000s
Event timer "RTC" frequency 32768 Hz quality 0
hpet0: <High Precision Event Timer> iomem 0xfed00000-0xfed003ff irq 8 on 
acpi0
Timecounter "HPET" frequency 14318180 Hz quality 950
Event timer "HPET" frequency 14318180 Hz quality 450
Event timer "HPET1" frequency 14318180 Hz quality 440
Event timer "HPET2" frequency 14318180 Hz quality 440
attimer0: <AT timer> port 0x40-0x43,0x50-0x53 irq 0 on acpi0
Timecounter "i8254" frequency 1193182 Hz quality 0
Event timer "i8254" frequency 1193182 Hz quality 100
Timecounter "ACPI-safe" frequency 3579545 Hz quality 850
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x408-0x40b on acpi0
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pcib0: Length mismatch for 3 range: 50d06fff vs 50d07000
pci0: <ACPI PCI bus> on pcib0
vgapci0: <VGA-compatible display> port 0xe080-0xe087 mem 
0xd0000000-0xd03fffff,0xc0000000-0xcfffffff irq 16 at device 2.0 on pci0
vgapci0: Boot video device
ahci0: <AHCI SATA controller> port 
0xe070-0xe077,0xe060-0xe063,0xe050-0xe057,0xe040-0xe043,0xe020-0xe03f 
mem 0xd0d06000-0xd0d067ff irq 19 at device 19.0 on pci0
ahci0: AHCI v1.30 with 2 3Gbps ports, Port Multiplier not supported
ahcich0: <AHCI channel> at channel 0 on ahci0
pci0: <encrypt/decrypt> at device 26.0 (no driver attached)
hdac0: <Intel BayTrail HDA Controller> mem 0xd0d00000-0xd0d03fff irq 22 
at device 27.0 on pci0
pcib1: <ACPI PCI-PCI bridge> irq 16 at device 28.0 on pci0
pci1: <ACPI PCI bus> on pcib1
igb0: <Intel(R) PRO/1000 PCI-Express Network Driver> port 0xd000-0xd01f 
mem 0xd0a00000-0xd0afffff,0xd0b00000-0xd0b03fff irq 16 at device 0.0 on 
pci1
igb0: Using 1024 TX descriptors and 1024 RX descriptors
igb0: Using 4 RX queues 4 TX queues
igb0: Using MSI-X interrupts with 5 vectors
igb0: Ethernet address: 00:10:f3:93:cb:86
igb0: netmap queues/slots: TX 4/1024, RX 4/1024
pcib2: <ACPI PCI-PCI bridge> irq 17 at device 28.1 on pci0
pcib3: <ACPI PCI-PCI bridge> irq 18 at device 28.2 on pci0
pci2: <ACPI PCI bus> on pcib3
pci2: <simple comms, UART> at device 0.0 (no driver attached)
pcib4: <ACPI PCI-PCI bridge> irq 19 at device 28.3 on pci0
pci3: <ACPI PCI bus> on pcib4
igb1: <Intel(R) PRO/1000 PCI-Express Network Driver> port 0xc000-0xc01f 
mem 0xd0700000-0xd07fffff,0xd0800000-0xd0803fff irq 19 at device 0.0 on 
pci3
igb1: Using 1024 TX descriptors and 1024 RX descriptors
igb1: Using 4 RX queues 4 TX queues
igb1: Using MSI-X interrupts with 5 vectors
igb1: Ethernet address: 00:10:f3:93:cb:87
igb1: netmap queues/slots: TX 4/1024, RX 4/1024
ehci0: <Intel BayTrail USB 2.0 controller> mem 0xd0d05000-0xd0d053ff irq 
23 at device 29.0 on pci0
usbus0: EHCI version 1.0
usbus0 on ehci0
usbus0: 480Mbps High Speed USB v2.0
isab0: <PCI-ISA bridge> at device 31.0 on pci0
isa0: <ISA bus> on isab0
acpi_button0: <Power Button> on acpi0
acpi_button1: <Sleep Button> on acpi0
acpi_tz0: <Thermal Zone> on acpi0
ppc1: <Parallel port> port 0x378-0x37f irq 15 on acpi0
ppc1: Generic chipset (NIBBLE-only) in COMPATIBLE mode
ppbus0: <Parallel port bus> on ppc1
lpt0: <Printer> on ppbus0
lpt0: Interrupt-driven port
ppi0: <Parallel I/O> on ppbus0
uart2: <16950 or compatible> port 0x248-0x24f irq 5 on acpi0
uart1: <16950 or compatible> port 0x2f8-0x2ff irq 3 on acpi0
uart3: <16950 or compatible> port 0x3e8-0x3ef irq 11 on acpi0
uart4: <16950 or compatible> port 0x2e8-0x2ef irq 10 on acpi0
uart5: <16950 or compatible> port 0x228-0x22f irq 6 on acpi0
uart6: <16950 or compatible> port 0x238-0x23f irq 7 on acpi0
est0: <Enhanced SpeedStep Frequency Control> on cpu0
Timecounters tick every 1.000 msec
hdacc0: <Realtek ALC888 HDA CODEC> at cad 0 on hdac0
hdaa0: <Realtek ALC888 Audio Function Group> at nid 1 on hdacc0
pcm0: <Realtek ALC888 (Rear Analog)> at nid 20 and 24 on hdaa0
pcm1: <Realtek ALC888 (Analog)> at nid 21 and 26 on hdaa0
hdacc1: <Intel Valleyview2 HDA CODEC> at cad 2 on hdac0
hdaa1: <Intel Valleyview2 Audio Function Group> at nid 1 on hdacc1
hdaa1: hdaa_audio_as_parse: Duplicate pin 0 (5) in association 1! 
Disabling association.
hdaa1: hdaa_audio_as_parse: Duplicate pin 0 (6) in association 1! 
Disabling association.
ada0 at ahcich0 bus 0 scbus0 target 0 lun 0
ada0: <KINGSTON SA400S37120G SBFKB1E2> ACS-3 ATA SATA 3.x device
ada0: Serial Number 50026B778358B910
ada0: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 512bytes)
ada0: Command Queueing enabled
ada0: 114473MB (234441648 512 byte sectors)
Trying to mount root from ufs:/dev/ada0p3 [rw,noatime]...
ugen0.1: <Intel EHCI root HUB> at usbus0
uhub0 on usbus0
uhub0: <Intel EHCI root HUB, class 9/0, rev 2.00/1.00, addr 1> on usbus0
warning: total configured swap (2097152 pages) exceeds maximum 
recommended amount (1871544 pages).
warning: increase kern.maxswzone or reduce amount of swap.
ichsmb0: <Intel Baytrail SMBus controller> port 0xe000-0xe01f mem 
0xd0d04000-0xd0d0401f irq 18 at device 31.3 on pci0
smbus0: <System Management Bus> on ichsmb0
drmn0: <drmn> on vgapci0
vgapci0: child drmn0 requested pci_enable_io
vgapci0: child drmn0 requested pci_enable_io
[drm] Unable to create a private tmpfs mount, hugepage support will be 
disabled(-19).
Successfully added WC MTRR for [0xc0000000-0xcfffffff]: 0;
[drm] Got stolen memory base 0x7b000000, size 0x4000000
[drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
[drm] Driver supports precise vblank timestamp query.
[drm] Connector VGA-1: get mode from tunables:
[drm]   - kern.vt.fb.modes.VGA-1
[drm]   - kern.vt.fb.default_mode
[drm] Connector DP-1: get mode from tunables:
[drm]   - kern.vt.fb.modes.DP-1
[drm]   - kern.vt.fb.default_mode
[drm] Connector HDMI-A-1: get mode from tunables:
[drm]   - kern.vt.fb.modes.HDMI-A-1
[drm]   - kern.vt.fb.default_mode
sysctl_warn_reuse: can't re-use a leaf (hw.dri.debug)!
[drm] Initialized i915 1.6.0 20190822 for drmn0 on minor 0
WARNING: Device "fb" is Giant locked and may be deleted before FreeBSD 
14.0.
VT: Replacing driver "vga" with new "fb".
start FB_INFO:
type=11 height=900 width=1600 depth=32
cmsize=16 size=5763072
pbase=0xc0020000 vbase=0xfffff800c0020000
name=drmn0 flags=0x0 stride=6400 bpp=32
cmap[0]=0 cmap[1]=7f0000 cmap[2]=7f00 cmap[3]=c4a000
end FB_INFO
drmn0: fb0: i915drmfb frame buffer device
bridge0: Ethernet address: 58:9c:fc:10:35:24
lo0: link state changed to UP
bridge0: link state changed to UP
igb0: promiscuous mode enabled
uhub0: 8 ports with 8 removable, self powered
ugen0.2: <vendor 0x8087 product 0x07e6> at usbus0
uhub1 on uhub0
uhub1: <vendor 0x8087 product 0x07e6, class 9/0, rev 2.00/0.17, addr 2> 
on usbus0
uhub1: 4 ports with 4 removable, self powered
ugen0.3: <Lite-On Technology Corp. USB Multimedia Keyboard> at usbus0
ukbd0 on uhub1
ukbd0: <Lite-On Technology Corp. USB Multimedia Keyboard, class 0/0, rev 
1.10/1.09, addr 3> on usbus0
kbd1 at ukbd0
usbhid0 on uhub1
usbhid0: <Lite-On Technology Corp. USB Multimedia Keyboard, class 0/0, 
rev 1.10/1.09, addr 3> on usbus0
hidbus0: <HID bus> on usbhid0
hidraw0: <Lite-On Technology Corp. USB Multimedia Keyboard Raw HID 
Device> on hidbus0
ugen0.4: <Logitech USB Optical Mouse> at usbus0
usbhid1 on uhub1
usbhid1: <Logitech USB Optical Mouse, class 0/0, rev 2.00/72.00, addr 4> 
on usbus0
hidbus1: <HID bus> on usbhid1
hidraw1: <Logitech USB Optical Mouse Raw HID Device> on hidbus1
ugen0.5: <vendor 0x05e3 USB2.0 Hub> at usbus0
uhub2 on uhub1
uhub2: <vendor 0x05e3 USB2.0 Hub, class 9/0, rev 2.00/32.98, addr 5> on 
usbus0
hcons0: <Lite-On Technology Corp. USB Multimedia Keyboard Consumer 
Control> on hidbus0
hsctrl0: <Lite-On Technology Corp. USB Multimedia Keyboard System 
Control> on hidbus0
hms0: <Logitech USB Optical Mouse> on hidbus1
hms0: 8 buttons and [XYW] coordinates ID=0
igb1: link state changed to UP
uhub2: 4 ports with 4 removable, self powered