From: "Bjoern A. Zeeb"
Date: Tue, 9 Jan 2024 16:48:53 +0000 (UTC)
To: Emmanuel Vadot
Cc: Søren Schmidt, "freebsd-arm@freebsd.org", Warner Losh
Subject: Re: MMCCAM hang
List-Id: Porting FreeBSD to ARM processors
List-Archive: https://lists.freebsd.org/archives/freebsd-arm

On Tue, 9 Jan 2024, Bjoern A. Zeeb wrote:

> On Tue, 9 Jan 2024, Emmanuel Vadot wrote:
>
>> On Tue, 9 Jan 2024 11:36:32 +0100
>> Søren Schmidt wrote:
>>
>>>> On 28 Dec 2023, at 02.08, Warner Losh wrote:
>>>> On Wed, Dec 27, 2023, 4:55 PM Bjoern A. Zeeb wrote:
>>>>> Hi,
>>>>>
>>>>> sdhci_fsl_fdt0: Desired SD/MMC freq: 50000000, actual: 50000000; base
>>>>> 700000000 prescale 1 divisor 14
>>>>> GEOM: new disk sdda0
>>>>> sdda0 at sdhci_slot0 bus 0 scbus0 target 0 lun 0
>>>>> sdda0: Relative addr: 00000002
>>>>> Card features:
>>>>> Card random: unblocking device.
>>>>> GEOM: new disk sdda0boot0
>>>>> memory OCR: 00ff8080
>>>>> sdda0: Serial Number .......
>>>>> sdda0: MMCHC .................................. by 17 0x0000
>>>>> GEOM: new disk sdda0boot1
>>>>> uhub0: 2 ports with 2 removable, self powered
>>>>>
>>>>> at which point basically anything hangs.  In auto-boot it is
>>>>> before/during file-system checks.
>>>>> In single user mode camcontrol devlist will show sdda0
>>>>> but
>>>>>
>>>>> root@:/ # gpart show sdda0
>>>>> load: 6.06 cmd: gpart 24 [g_waitfor_event] 1.28r 0.00u 0.00s 0% 2088k
>>>>> {forever}
>>>>>
>>>>>
>>>>> Unclear at which point I broke to debugger and this is where it seems to
>>>>> hang:
>>>>>
>>>>> db> trace 100088
>>>>> Tracing pid 4 tid 100088 td 0xffff0000dc527000
>>>>> ipi_stop() at ipi_stop+0x34
>>>>> arm_gic_v3_intr() at arm_gic_v3_intr+0xe4
>>>>> intr_irq_handler() at intr_irq_handler+0x80
>>>>> handle_el1h_irq() at handle_el1h_irq+0x14
>>>>> --- interrupt
>>>>> spinlock_exit() at spinlock_exit+0x44
>>>>> callout_reset_sbt_on() at callout_reset_sbt_on+0x210
>>>>> sdhci_cam_action() at sdhci_cam_action+0x284
>>>>> xpt_run_devq() at xpt_run_devq+0x4c8
>>>>> xpt_action_default() at xpt_action_default+0x470
>>>>> sddastart() at sddastart+0x1bc
>>>>> xpt_run_allocq() at xpt_run_allocq+0xa8
>>>>> xpt_done_process() at xpt_done_process+0x610
>>>>> xpt_done_td() at xpt_done_td+0x1a8
>>>>> fork_exit() at fork_exit+0x8c
>>>>> fork_trampoline() at fork_trampoline+0x18
>>>>>
>>>>>
>>>>> Anyone an idea?
>>>>
>>>>
>>>> Looks like deadlock with another thread. Anybody else in the time keeping
>>>> / callout code?
>>>
>>> I think this is related to the MMC driver having issues (MMCCAM or not).
>>> If I try to use a MMC sdcard on any of my rk35X8 boards as the disk device
>>> it will eventually hang on first access to the MMC controlled media.
>>> I thought I had an issue here with my dev setup but clearly I'm not alone
>>> :)
>>
>> SDCard on RK356X don't use sdhci but dwmmc so it's not related to what
>> bz@ is seeing.
>> That being said I have no problem using dwmmc as the root device on my
>> nanopi r5s or quartz64.
>
> For what it's worth my current feeling seems to be it is related to the
> boot[01] disks on the eMMC.

okay, I quickly tried the funny bit to skip them (no disk created).
The errors from the sdda stopped after about 25-ish times.
I didn't check whether the commands were the same.

But now it looks like this:

# ls -l /dev/*da*
crw-r-----  1 root operator 0x50 Dec 19 10:32 /dev/nda0
crw-r-----  1 root operator 0x55 Dec 19 10:32 /dev/sdda0
# gpart show sdda0
gpart: No such geom: sdda0.
# gpart show nda0
gpart: No such geom: nda0.
# gpart create -s GPT -l 67108864 sdda0    # -l is from D33168 and not the issue here
sdhci_fsl_fdt0-slot0: sdhci_cam_request: ccb 0 ccb 0xffffa0000e440800 curcmd 0 req 0
sdhci_fsl_fdt0-slot0: sdhci_start_command: curcmd 0 cmd 0xffffa0000e4408d0 cmd_done 1 flags 0x000035
sdhci_fsl_fdt0-slot0: sdhci_req_done: curcmd 0xffffa0000e4408d0 ccb 0xffffa0000e440800 cmd_done 1 cmd.flags 0x000035 cmd.error 1
gpart: Input/output error
# gpart create -s GPT sdda0
gpart: geom 'sdda0': File exists
# gpart show sdda0
=>    131104  122011576  sdda0  GPT  (58G)
      131104  122011576         - free -  (58G)

# shutdown -r now
...
Login:
...
# gpart show sdda0
gpart: No such geom: sdda0.

Something obviously non-obvious must be strange here.  I should try
another device though I know this works under Linux.

Should I try legacy mmc again?

> I see geom tasting on boot0 but the consumer for boot1 never shows up in
> ddb> show geom
> I disabled the graid and then the same observation moved on to gpart.
>
> Also once the error starts the fsl is never recovering; eventually the
> ccb and curcmd stay the same pointers even.  It seems to just roto-tile,
> which makes me wonder if some error propagation is missing/gone.
>
> If I enable kern.cam.boot_delay="30000" and have my root on an md(4)
> I get to Login: -- strangely but then the nda and the sdda show up and
> then typing gpart show or whatever else geom-ish a few commands go
> through and then we are in the error again.
>
> I haven't been able to dig much further; no other locks held in debug
> kernels (just a malloc WAITOK complaint early on during "attach").
>
> I'd still be happy to hear of more possible cases; especially if other
> sdhci devices are working with MMCCAM?  It kept me from doing the actual
> work I wanted to do with mmccam over the holidays sadly.
>
>
> Feature request: somehow I wish we could enable/disable FDT/OFW based
> devices like we do for PCI with devctl ... can we?  Like have it
> disabled in FDT at boot but later enable/probe/attach...
>
>
> With SD cards and dwmmc I had mostly mixed results in the past; they
> worked for quite a while but after 600 days of uptime they were gone
> (problem probably long fixed but I am at 900 days now for the last
> running RK device and then won't bother for a long while I hope).
>

-- 
Bjoern A. Zeeb                                                     r15:7
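
A quick cross-check of two figures from the logs in this mail, as a sketch rather than anything taken from the driver. It assumes the SD clock is the base clock divided by (prescale * divisor), which is what the attach message suggests, and it assumes sdda0 reports 512-byte sectors. The helper names sd_clock_hz and sectors_to_gib are made up for illustration.

```python
"""Cross-check two figures quoted in the logs of this mail."""

GIB = 1024 ** 3


def sd_clock_hz(base_hz: int, prescale: int, divisor: int) -> int:
    """SD clock under the ASSUMED relation base / (prescale * divisor)."""
    return base_hz // (prescale * divisor)


def sectors_to_gib(sectors: int, sector_size: int = 512) -> float:
    """Convert a sector count to GiB; the 512-byte sector size is an ASSUMPTION."""
    return sectors * sector_size / GIB


if __name__ == "__main__":
    # Values copied from the sdhci_fsl_fdt0 attach message.
    print(sd_clock_hz(700_000_000, 1, 14))
    # Sector count copied from the `gpart show sdda0` output.
    print(f"{sectors_to_gib(122_011_576):.2f} GiB")
```

Under those assumptions the divider values agree with the reported 50000000 Hz, and the sector count agrees with the 58G that gpart prints, so neither number looks suspicious on its own.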