From: Julio Merino
To: Justin Hibbits
CC: freebsd-ppc@freebsd.org
Subject: RE: PowerMac G5 crashes with "instruction storage interrupt" on recent 13
Date: Sat, 10 Sep 2022 01:41:26 +0000
References: <20220909120857.61f65069@ralga-linux> <20220909151238.5da8b63a@ralga-linux>
In-Reply-To: <20220909151238.5da8b63a@ralga-linux>
List-Id: Porting FreeBSD to the PowerPC
List-Archive: https://lists.freebsd.org/archives/freebsd-ppc

I have now compared the dmesg and sysctl output of a good kernel (built at 9171b8068b92 with the workaround applied) and a recent bad kernel, also with the workaround applied.

 

The main differences in the dmesg output, where the dash prefix marks the good kernel and the plus prefix marks the bad kernel:

 

-----
-bus_dmamem_alloc failed to align memory properly.

-firewire0: 2 nodes, maxhop <= 1 cable IRM irm(1) (me)
+firewire0: 2 nodes, maxhop <= 1 Not IRM capable irm(-1)

+pci1:5:4:0: VPD data does not start with ident (0x8)
+pci1:5:4:0: failed to read VPD data.
+pci1:5:4:0: no valid vpd ident found
+pci1:5:4:1: VPD data does not start with ident (0x8)
+pci1:5:4:1: failed to read VPD data.
+pci1:5:4:1: no valid vpd ident found

+WARNING: Current temperature (CPU A0 DIODE TEMP: 916.0 C) exceeds critical temperature (90.0 C); count=1
-----
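
A comparison of this kind can be reproduced with a short script. The following is only a minimal sketch, assuming the two dmesg captures are already available as lists of lines; the function name `changed_lines` is hypothetical and the sample lines are copied from the output above.

```python
import difflib


def changed_lines(good, bad):
    """Return only the '-' and '+' lines of a unified diff of two logs."""
    diff = list(difflib.unified_diff(good, bad, lineterm="", n=0))
    body = diff[2:]  # drop the '---' and '+++' file header lines
    # Hunk markers start with '@@', so they are filtered out here as well.
    return [line for line in body if line.startswith(("+", "-"))]


good = [
    "bus_dmamem_alloc failed to align memory properly.",
    "firewire0: 2 nodes, maxhop <= 1 cable IRM irm(1) (me)",
]
bad = [
    "firewire0: 2 nodes, maxhop <= 1 Not IRM capable irm(-1)",
]

for line in changed_lines(good, bad):
    print(line)
```

Lines prefixed with a dash come from the first (good) log and lines prefixed with a plus come from the second (bad) log, matching the convention used above.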

 

Note that the measured temperature is obviously wrong; it shows up once the fans spin up like crazy. Soon after this, count grows too high and the machine shuts itself down.
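
The behaviour described here is consistent with a watchdog that counts over-limit readings and requests a shutdown once the count passes a limit. The sketch below illustrates that idea only; it is not the code in powermac_thermal.c. The class name, the `max_count` value of 6, and the reset on a good reading are all assumptions made for illustration. The 90.0 C limit and the 916.0 C reading are taken from the log above.

```python
class CriticalTempWatchdog:
    """Hypothetical model of a 'too many critical readings' shutdown rule."""

    def __init__(self, critical_c, max_count=6):
        self.critical_c = critical_c
        self.max_count = max_count  # assumed limit, not taken from the kernel
        self.count = 0

    def observe(self, temp_c):
        """Feed one reading; return True when a shutdown would be requested."""
        if temp_c > self.critical_c:
            self.count += 1
        else:
            self.count = 0  # assumption: a sane reading clears the counter
        return self.count >= self.max_count


watchdog = CriticalTempWatchdog(critical_c=90.0)
decisions = [watchdog.observe(916.0) for _ in range(6)]
print(decisions)
```

Under these assumptions a sensor that keeps returning a bogus value such as 916.0 C trips the rule after a handful of polls, even though the machine is not actually overheating.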

 

Looking at the differences for all sysctls that mention “temp”:

 

-----
dev.ds1631.0.%pnpinfo: name=temp-monitor compat=ds1631
-dev.ds1631.0.sensor.mlb_inlet_amb.temp: 27.5C
+dev.ds1631.0.sensor.mlb_inlet_amb.temp: 29.6C
dev.ds1775.0.%pnpinfo: name=temp-monitor compat=ds1775
-dev.ds1775.0.sensor.drive_bay.temp: 26.5C
+dev.ds1775.0.sensor.drive_bay.temp: 29.5C
dev.max6690.0.%pnpinfo: name=temp-monitor compat=max6690
-dev.max6690.0.sensor.backside.temp: 36.1C
-dev.max6690.0.sensor.kodiak_diode.temp: 48.7C
+dev.max6690.0.sensor.backside.temp: 42.2C
+dev.max6690.0.sensor.kodiak_diode.temp: 55.2C
dev.max6690.1.%pnpinfo: name=temp-monitor compat=max6690
-dev.max6690.1.sensor.tunnel.temp: 31.2C
-dev.max6690.1.sensor.tunnel_heatsink.temp: 33.7C
+dev.max6690.1.sensor.tunnel.temp: 34.7C
+dev.max6690.1.sensor.tunnel_heatsink.temp: 39.0C
-dev.smusat.0.cpu_a0_diode_temp: 34.2C
-dev.smusat.0.cpu_a1_diode_temp: 35.0C
kstat.zfs.misc.arcstats.arc_tempreserve: 0
-----

 

The fact that dev.smusat.* is gone from the “bad” kernel seems suspicious, but smusat0 is detected properly in both kernels according to dmesg…
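
Missing nodes such as dev.smusat.* are easier to spot by comparing sysctl names rather than values. The following is a minimal sketch, assuming each capture is plain `name: value` text; the helper names `parse_sysctl` and `missing_keys` are hypothetical, and the sample data is a subset of the listing above.

```python
def parse_sysctl(text):
    """Parse 'name: value' lines into a dict, ignoring anything else."""
    result = {}
    for line in text.splitlines():
        name, sep, value = line.partition(": ")
        if sep:
            result[name] = value
    return result


def missing_keys(good_text, bad_text, needle="temp"):
    """Names containing `needle` that exist in the good capture only."""
    good = parse_sysctl(good_text)
    bad = parse_sysctl(bad_text)
    return sorted(k for k in good if needle in k and k not in bad)


good_text = """\
dev.ds1775.0.sensor.drive_bay.temp: 26.5C
dev.smusat.0.cpu_a0_diode_temp: 34.2C
dev.smusat.0.cpu_a1_diode_temp: 35.0C
"""
bad_text = """\
dev.ds1775.0.sensor.drive_bay.temp: 29.5C
"""

for name in missing_keys(good_text, bad_text):
    print(name)
```

This separates "the sensor reads differently" from "the sensor node is not there at all", which is the distinction that matters for the smusat case.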

 

Any thoughts? I can try to bisect this as well, but there are 1500+ changes to sort through so this will take a while.
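
For a rough sense of the effort: a binary search over about 1500 changes needs roughly log2(1500) test builds. The sketch below is illustrative only; `first_bad` is a hypothetical helper that models the search on a list, assuming the last entry is known bad and that every commit after the first bad one is also bad.

```python
import math


def bisect_steps(num_commits):
    """Approximate worst-case number of kernels to build and test."""
    return math.ceil(math.log2(num_commits))


def first_bad(commits, is_bad):
    """Binary search for the first bad commit; also count the tests run."""
    lo, hi = 0, len(commits) - 1  # assumes commits[hi] is known to be bad
    tests = 0
    while lo < hi:
        mid = (lo + hi) // 2
        tests += 1
        if is_bad(commits[mid]):
            hi = mid
        else:
            lo = mid + 1
    return commits[lo], tests


print(bisect_steps(1500))
culprit, tests = first_bad(list(range(1500)), lambda c: c >= 1000)
print(culprit, tests)
```

So the 1500+ changes translate into about eleven build-and-boot cycles, which is still slow on this hardware but bounded.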

 

Thanks!

 

 

From: Justin Hibbits
Sent: Friday, September 9, 2022 12:12
To: Julio Merino
Cc: freebsd-ppc@freebsd.org
Subject: Re: PowerMac G5 crashes with "instruction storage interrupt" on recent 13

 

That seems bizarre.  There haven't been any changes to the controller
thread (powermac_thermal.c) in more than 7 years.  Are there any
problems with sensors?  I tested the change I made back in 2015 on my
dual core G5, with the intent that it would ramp the fans up sooner
(non-linear), and back them down with hysteresis.  So when there's load
that raises the temperature significantly it will ramp the fans up as
quickly as it can, hitting 100% fan long before it can reach maximum
temperature.
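
The idea of a non-linear ramp-up with hysteresis on the way down can be sketched as follows. This is not the algorithm in powermac_thermal.c; the class name, the square-root curve, the 3 C margin, and the example temperatures are all assumptions chosen only to make the behaviour concrete.

```python
class FanController:
    """Illustrative only: non-linear ramp-up, hysteresis when slowing down."""

    def __init__(self, target_c, max_c, hysteresis_c=3.0):
        self.target_c = target_c
        self.max_c = max_c
        self.hysteresis_c = hysteresis_c  # assumed margin
        self.duty = 0.0  # fraction of full fan speed, 0.0 to 1.0

    def _demand(self, temp_c):
        # Position between target and max temperature, clamped to 0..1.
        frac = (temp_c - self.target_c) / (self.max_c - self.target_c)
        frac = min(max(frac, 0.0), 1.0)
        # A square root rises steeply at first, so the fans ramp up early.
        return frac ** 0.5

    def update(self, temp_c):
        up = self._demand(temp_c)
        if up > self.duty:
            self.duty = up  # speed up immediately
        else:
            # Slow down only once the temperature has fallen by the margin.
            down = self._demand(temp_c + self.hysteresis_c)
            if down < self.duty:
                self.duty = down
        return self.duty


fan = FanController(target_c=50.0, max_c=90.0)
for temp in (60.0, 59.0, 40.0, 95.0):
    print(temp, fan.update(temp))
```

With a controller of this shape, a wildly wrong sensor value would push the fans to full speed at once, which is why the sensor readings are the first thing to check.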

- Justin

On Fri, 9 Sep 2022 19:01:06 +0000
Julio Merino <julio@meroh.net> wrote:

> Ah, thanks for the workaround. I applied it on top of 9171b8068b92
> and the kernel was able to boot successfully – and it seems stable so
> far.
>
> However, if I apply the hack on top of stable/13’s HEAD, there is
> still the issue of the fans going crazy at the slightest increase in
> CPU load but they do drop back down to quiet when the load subsides.
> (For example, a simple “git log” in /usr/src makes the fan spin up
> within a couple of seconds and they stop soon after that.) Any ideas
> on where this might come from?
>
>
> From: Justin Hibbits <jhibbits@FreeBSD.org>
> Sent: Friday, September 9, 2022 09:09
> To: Julio Merino <julio@meroh.net>
> Cc: freebsd-ppc@freebsd.org
> Subject: Re: PowerMac G5 crashes with "instruction storage interrupt"
> on recent 13
>
> Hi Julio,
>
> 971cb62e0b23 is the likely culprit.  Alfredo has a patch at
> https://reviews.freebsd.org/D36234 that you can use until the problem
> is solved.  The alternative is you could build everything into the
> kernel instead of using modules.
>
> The problem appears to be in either lld or the kernel linker.
>
> - Justin
>
> On Fri, 9 Sep 2022 16:00:33 +0000
> Julio Merino <julio@meroh.net> wrote:
>
> > Armed with a lot of patience, I was able to bisect where the crashes
> > are coming from. They seem to be due to these three consecutive and
> > related commits (because the first one broke the build and required
> > two extra fixes for powerpc’s GENERIC64 to build):
> >
> > 9171b8068b92 cpuset: Fix the KASAN and KMSAN builds
> > 01f281d0ee52 Fix the build after 47a57144
> > 971cb62e0b23 cpuset: Byte swap cpuset for compat32 on big endian
> > architectures
> >
> > Any idea on how to look into these crashes further?
> >
> > Thank you!
> >
> >
> > From: Julio Merino <julio@meroh.net>
> > Sent: Sunday, July 31, 2022 07:45
> > To: freebsd-ppc@freebsd.org
> > Subject: PowerMac G5 crashes with "instruction storage interrupt" on
> > recent 13
> >
> > Hi all,
> >
> > I have a PowerMac G5 that’s running an old build of FreeBSD 13
> > stable (from around October of last year) that I’m trying to
> > upgrade to recent stable/13.
> >
> > Booting into a new kernel brings two issues: the first is that the
> > fans spin up to jet engine levels right before transferring control
> > to userspace. An old patch I have locally to mitigate this (which I
> > got from whichever outstanding bug exists for this in the bug
> > tracker) doesn’t seem to work any longer.
> >
> > The second is that the kernel crashes (apparently) as soon as it
> > tries to mount a ZFS pool during early stages of the boot process,
> > but after successfully transferring control to userspace. Typing
> > this from a photo of the crash so omitting details that I think
> > aren’t going to be relevant here, like addresses, here is what I
> > get:
> >
> > ----
> > Setting hostid: …
> > ZFS filesystem version: 5
> > ZFS storage pool version: features support (500)
> >
> > Fatal kernel trap:
> >
> > Exception = 0x400 (instruction storage interrupt)
> > …
> > pid = 64, comm = zpool
> >
> > panic: instruction storage interrupt trap
> > cpuid = 1
> > time = …
> > KDB: stack backtrace:
> > #0 kdb_backtrace
> > #1 vpanic
> > #2 panic
> > #3 trap
> > #4 powerpc_interrupt
> > Uptime: 7s
> > ----
> >
> > Any thoughts about what I could look into? Any “recent” commits that
> > you think may be at fault?
> >
> > Thanks!
> > 
>

 
