Investigating jh7100 (VisionFive) MMC issue
- Go to: [ bottom of page ] [ top of archives ] [ this month ]
Date: Tue, 07 Feb 2023 13:27:21 UTC
Hello, Here is a description of my efforts to investigate what is missing from jh7100 (VisionFive) MMC host driver -- the driver file is here: https://github.com/mhorne/freebsd/blob/beaglev/sys/riscv/ starfive/starfive_if_dwc.c I use device trees provided in Linux repository: https://github.com/starfive-tech/linux/blob/ 594a662ba20ee2234e2c09a4b25d79f8fb99cfef/arch/riscv/boot/dts/starfive/ I also have my own GPIO and reset drivers in use plus one clock added. These may affect the boot process. Ask me if interested. -- Without "sc->use_pio = 1" in dwmmc_starfive.c the execution halts at msleep() call which is located in the file sys/dev/mmc/mmc.c It seems that it remains waiting for an external event which never occurs. After switching pio mode on (like is done in several other board specific dwmmc driver codes) booting does not stall anymore but we get an error message: mmcsd0: Error indicated: 4 Failed Possibly related to this error – which appears multiple times during the boot – mountroot fails (error 19) Using printfs to track the execution suggests that the earlier manifestation of mmscd0's error 4 happens in the function dwmmc_intr() which is found in a file sys/dev/mmc/host/dwmmc.c. The error number 4 matches to this error code (MMC_ERR_FAILED defined in mmcreg.h) and printfs rule out other places where the same error is assigned. Printing opcodes and register data out at this point indicates that opcodes used when the error manifests are 17, MMC_READ_SINGLE_BLOCK and 18, MMC_READ_MULTIPLE_BLOCK and MINTSTS register has bits 3 and 7 on. These bits indicate data_transfer_over_interrupt and data_crc_error_interrupt according to two datasheets I've found. So it looks like there is a CRC error. Printfs reveal that during the boot many data packets are transferred. CRC error appears with some of these. Data packets transfered during the boot consists of 3735929054 sequences (0xDEADC0DE), only tails of transfers may be different: example 1: ... 3735929054 8309528 4294967232 9 example 2: ... 3735929054 8313624 4294967232 3735929054 Curiously sometimes the very last row is another 3735929054 sequence instead of 9 (example 2) but this difference is seen between different boots and concerns all the transfers instead of marking a difference between error indicating and non-error indicating transfers. The quantity of data transferred seems to be always correct (...at least if value 9 is not an error) Printing out all the possible variables related to these data transfers (struct mmc_command) does not seem to reveal much differences between error giving and non-error giving transfers. Only suspectible variable I have found is "arg" variable. It is usually 0 and sometimes 2 when the error occurs. With errorless transfers (when opcodes are 17, 18) these values seem not to appear. Tracking this arg variable takes us to function mmcsd_rw() in file mmscd.c. There arg gets its value from bp->bio_pblkno (physical block address) via "block" variable. This argument comes from function's parameter struct bio *bp. Ultimately the origin of this value seems to lie in the struct bio's bio_offset which is initialised somewhere in geom system (files like geom_slice.c, geom_disk.c) Might this path be worth of studying further... Another hypothesis is that the error is related to bus/clock rates: fedora linux boot prints: [ 3.071120] mmc_host mmc0: Bus speed (slot 0) = 100000000Hz (slot req 400000Hz, actual 400000HZ div = 125) [ 3.168274] mmc_host mmc0: Bus speed (slot 0) = 100000000Hz (slot req 50000000Hz, actual 50000000HZ div = 1) [ 9.240888] mmc_host mmc1: Bus speed (slot 0) = 123750000Hz (slot req 400000Hz, actual 399193HZ div = 155) [ 9.398503] mmc_host mmc1: Bus speed (slot 0) = 123750000Hz (slot req 50000000Hz, actual 30937500HZ div = 2) freebsd boot prints (with my own printfs): dwmmc_setup_bus(), div: 1, bus_hz: 100000000, freq: 50000000 dwmmc_setup_bus(), div: 125, bus_hz: 100000000, freq: 400000 dwmmc_setup_bus(), div: 155, bus_hz: 123750000, freq: 400000 It seems one combination of rates is missing from FreeBSD (or the booting simply doesn't proceed far enough?) ...though it's related to mmc1 while the relevant action seems to take place in mmc0. I wonder if those rates seen are actually supported by FreeBSD. In some previous cases FreeBSD has used different rates to linux due to less extensive drivers. So here's where I currently stand. Tips are welcome. -- Jari Sihvola <jsihv at gmx.com>