From: Emmanuel Vadot
To: Andriy Gapon
Cc: "freebsd-arm@freebsd.org"
Subject: Re: rock64 verbose boot hangs
Date: Wed, 29 Sep 2021 19:27:53 +0200
Message-Id: <20210929192753.449ad9a061366ea5e19d735e@bidouilliste.com>
In-Reply-To: <4d24bb8a-0ffe-9073-7863-e83025ffc4fa@FreeBSD.org>
List-Id:
 Porting FreeBSD to ARM processors
List-Archive: https://lists.freebsd.org/archives/freebsd-arm

On Wed, 29 Sep 2021 20:07:25 +0300
Andriy Gapon wrote:

> On 23/09/2021 20:46, Andriy Gapon wrote:
> > On 20/09/2021 20:02, Emmanuel Vadot wrote:
> >>
> >>  Hi Andriy,
> >>
> >> On Sat, 18 Sep 2021 15:58:00 +0300
> >> Andriy Gapon wrote:
> >>
> >>>
> >>> Normal boot works every time, but with boot_verbose="YES" it hung on all
> >>> attempts so far.
> >>>
> >>> Last messages on the console:
> >>> cpulist0: on ofwbus0
> >>> cpu0: on cpulist0
> >>> cpu0: Nominal frequency 600Mhz
> >>> cpufreq_dt0: on cpu0
> >>> cpufreq_dt0: 408.000 Mhz (950000 uV)
> >>> cpufreq_dt0: 600.000 Mhz (950000 uV)
> >>> cpufreq_dt0: 816.000 Mhz (1000000 uV)
> >>> cpufreq_dt0: 1008.000 Mhz (1100000 uV)
> >>> cpufreq_dt0: 1200.000 Mhz (1225000 uV)
> >>> cpufreq_dt0: 1296.000 Mhz (1300000 uV)
> >>> cpu1: on cpulist0
> >>> cpu1: Nominal frequency 600Mhz
> >>> cpufreq_dt1: on cpu1
> >>>
> >>> The kernel is totally unresponsive after that.
> >>
> >>  Can't reproduce here, I'm running 548a706608d with latest DTB and
> >> latest u-boot/atf
> >>
> >>> Any suggestions on how to debug this?
> >>
> >>  Not really sure how to start. It seems weird that the kernel would
> >> hang at the cpufreq attach, but maybe try modifying the DTB to remove
> >> this node?
> >>  Also, did that happen with my recent commit on clock, or was it the
> >> same before?
>
> An update relevant to the question above.
> Actually, after upgrading to a version that includes your clock changes the
> problem went away!
> I don't know what to make of this fact, but it looks like the problem was a
> clock plus timing issue.

 I'm not that surprised. Before my clock changes, netboot always failed
in a really weird way: the APs couldn't be started and the serial
output had its characters swapped around (like "cuolt'd rsart AP").
 So I'm glad it fixed your problem, because I really had no idea how
to debug that :P

> > Thank you and everyone else who responded with information and suggestions.
> >
> > Some extra details.
> > I've been having this problem since I got this board 9 months ago.
> > It's been through several upgrades of FreeBSD, U-Boot and the stuff in the
> > ESP partition.  And the problem was always present.
> >
> > Now I've done more extensive testing with a couple of dozen reboots in a row and
> > some additional debug prints (like, for example, DEBUG in subr_bus.c).
> >
> > I actually see several variations of the problem.
> > Sometimes it's a hang, but sometimes it's a crash.
> > A hang can happen in different places and a crash can happen in different places
> > too.
> > Some crashes happen during AP startup and the information I am getting is not
> > very usable.
> > Some crashes happen during driver probing when the bus code searches the hints
> > memory space.  Those crashes look like memory corruption happening there at random.
> >
> > Given those variations plus some other differences that I have compared to
> > other Rock64 users (like needing special setup for eMMC and for the watchdog), I
> > am inclined to think that the board I have has something special either in the
> > hardware (like a different configuration via some fuses) or in the BootROM.
> > Even though the PCB has the standard markings.
> >
> > And I would not be surprised about that (that it could be a customized
> > production) as I got my Rock64-s via a special / unusual deal on Amazon.
> > Iconikal and Recon Sentinel are keywords to search for, for those interested.
> > Some news articles from the time:
> > https://liliputing.com/2020/09/this-10-single-board-computer-is-faster-than-a-raspberry-pi-3.html
> >
> > https://www.tomshardware.com/news/raspberry-pi-sized-iconikal-rockchip-sbc-only-dollar8-on-amazon
> >
> >
> > So, in the end, I still do not know what causes the verbose boot to hang / crash.
> > Maybe there is some (not fully working) watchdog that gets armed and disarmed by
> > some hardware accesses and the verbose boot is too slow to complete in time.
> >
> > Here is a small subset of panics and hangs that I saw:
> > https://people.freebsd.org/~avg/rock64-verbose-boot-panic.txt
> >
>
>
> --
> Andriy Gapon
>

-- 
Emmanuel Vadot