ns8250: UART FCR is broken, message might be misleading
Date: Sat, 06 Jul 2024 18:36:09 UTC
Hi,

I have 3 machines running FreeBSD 14.0 (I upgraded one to 14.1 recently). All 3 have a uart with a GPS behind it, but no program reading from the uart. I see the "ns8250: UART FCR is broken" message on all of them, on average a little less than once per day.

For example, one machine has an uptime of 48 days. In that time the message was printed 44 times, but there were 12846 overruns according to the sysctl below:

dev.uart.2.rx_overruns: 12846

If the FCR were really broken, I would have expected the message to be printed for every overrun.

The 16550D documentation that I could find on the internet has this about Bit 1 of the FIFO Control Register (FCR):

    Bit 1: Writing a 1 to FCR1 clears all bytes in the RCVR FIFO and
    resets its counter logic to 0. The shift register is not cleared.
    The 1 that is written to this bit position is self-clearing.

So what I think is happening is that occasionally, when the RCVR FIFO is cleared, a character has almost been received. Between the moment the RCVR FIFO is cleared and the moment LSR bit LSR_RXRDY is checked, that new character arrives in the FIFO.

The piece of code in ns8250_flush() looks like this:

<snip>
	uart_setreg(bas, REG_FCR, fcr);
	uart_barrier(bas);

	/*
	 * Detect and work around emulated UARTs which don't implement the
	 * FCR register; on these systems we need to drain the FIFO since
	 * the flush we request doesn't happen.  One such system is the
	 * Firecracker VMM, aka. the rust-vmm/vm-superio emulation code:
	 * https://github.com/rust-vmm/vm-superio/issues/83
	 */
	lsr = uart_getreg(bas, REG_LSR);
	if (((lsr & LSR_TEMT) == 0) && (what & UART_FLUSH_TRANSMITTER))
		drain |= UART_DRAIN_TRANSMITTER;
	if ((lsr & LSR_RXRDY) && (what & UART_FLUSH_RECEIVER))
		drain |= UART_DRAIN_RECEIVER;
	if (drain != 0) {
		printf("ns8250: UART FCR is broken\n");
		ns8250_drain(bas, drain);
	}
</snip>

So how can a real FCR error be distinguished from this case? Maybe ns8250_drain() could return the number of bytes it drained instead; if it returned one, then it isn't an FCR error.
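To make the idea concrete, here is a rough userland sketch. It is not a patch against the driver and has not been tried on real hardware. Everything in it (struct mock_uart, mock_fcr_reset(), mock_drain_rx(), mock_flush_rx(), DRAIN_LIMIT and the flush_result enum) is made up for illustration, and the negative-errno return on failure is an assumed convention, not what ns8250_drain() does today.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical model of the receive side of a UART; not the real uart_bas. */
struct mock_uart {
	int fifo_bytes;	/* bytes currently waiting in the receiver FIFO */
	int fcr_works;	/* nonzero if the FCR receiver reset really clears the FIFO */
	int late_bytes;	/* bytes that land between the reset and the LSR check */
};

#define DRAIN_LIMIT 4096	/* arbitrary cap on the drain loop, assumed here */

enum flush_result { FLUSH_OK, FLUSH_RACE, FLUSH_FCR_BROKEN, FLUSH_HW_ERROR };

/* Models writing the receiver-reset bit (FCR bit 1). */
static void
mock_fcr_reset(struct mock_uart *u)
{
	if (u->fcr_works)
		u->fifo_bytes = 0;
	/* The shift register is not cleared, so a byte in flight still arrives. */
	u->fifo_bytes += u->late_bytes;
	u->late_bytes = 0;
}

/* Returns the number of bytes discarded, or -EIO if the FIFO never empties. */
static int
mock_drain_rx(struct mock_uart *u)
{
	int n = 0;

	while (u->fifo_bytes > 0) {
		if (n >= DRAIN_LIMIT)
			return (-EIO);
		u->fifo_bytes--;
		n++;
	}
	return (n);
}

/* Flush the receiver and classify what was left behind after the reset. */
static enum flush_result
mock_flush_rx(struct mock_uart *u)
{
	int n;

	mock_fcr_reset(u);
	if (u->fifo_bytes == 0)		/* stands in for LSR_RXRDY being clear */
		return (FLUSH_OK);

	n = mock_drain_rx(u);
	if (n < 0)
		return (FLUSH_HW_ERROR);
	if (n == 1)			/* one straggler: most likely the race */
		return (FLUSH_RACE);

	printf("ns8250: UART FCR is broken\n");
	return (FLUSH_FCR_BROKEN);
}
```

With this model, a working FCR plus one late byte comes back as FLUSH_RACE and prints nothing, while a non-working FCR with several bytes queued still prints the message. One limitation: a genuinely broken FCR that happens to have exactly one byte queued looks the same as the race, so a byte count alone cannot separate those two cases.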
Currently ns8250_drain() returns 0 on success or EIO if there is a hardware problem. Maybe that could be changed to return -EIO, with the callers that use its return value updated to handle it properly?

Note that these uarts are implemented on Xilinx/AMD FPGAs using the v2.0 IP described at the link below, but I think the same thing can probably happen on other 16x50 uarts too.

https://docs.amd.com/v/u/en-US/pg143-axi-uart16550

Regards
John