[Bug 283747] kernel panic after telegraf service restart
- In reply to: bugzilla-noreply_a_freebsd.org: "[Bug 283747] [crash] kernel panic after telegraf service restart"
Date: Fri, 28 Mar 2025 18:06:03 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=283747

--- Comment #47 from Gleb Smirnoff <glebius@FreeBSD.org> ---
Mike,

my current hypothesis is that we have a 32-bit overflow in credential reference counting. The overflow happens when we reap a group of processes and the reference counts of the group, summed up together, overflow.

AFAIU, telegraf will fork+exec arbitrary programs, which in their turn can also fork+exec more programs. While telegraf itself seems to do proper wait(2)-ing on zombies, some external program may leak zombies and not exit itself. Then, when telegraf is restarted, this pack of zombies is reaped, and this is where the overflow could be hit.

This is fixed by attachment 258804. I am not sure of my hypothesis, which is why it is not even committed to CURRENT. However, everyone affected by the bug is advised to use this patch, and let's see what happens. We still have some time before 14.3. I will probably start the review process to get it into CURRENT anyway.

With this info, you may have some idea of how to reproduce it. I know you are good at chasing bugs, Mike :) Sorry that it hits you, but I'm glad that you joined the team chasing this bug.
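
For illustration only, here is a minimal sketch of the kind of wrap-around the hypothesis describes: per-process reference counts that each fit in 32 bits, but whose sum does not. This is not the kernel code and not the patch in attachment 258804. The function names and the counts are hypothetical, and Python integers are masked by hand to imitate a signed 32-bit accumulator.

```python
MASK32 = 0xFFFFFFFF


def to_int32(value):
    """Interpret the low 32 bits of value as a signed two's-complement integer."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


def sum_refs_32(counts):
    """Sum reference counts in a simulated signed 32-bit accumulator."""
    total = 0
    for count in counts:
        # Each addition is truncated to 32 bits, as a C int would be in practice.
        total = to_int32(total + count)
    return total


def sum_refs_64(counts):
    """Sum the same counts without truncation (stands in for a wider accumulator)."""
    return sum(counts)


# Hypothetical numbers: three reaped processes, each holding 10**9 references.
# Every individual count fits in 32 bits; only the combined total does not.
zombie_refs = [10**9, 10**9, 10**9]

print(sum_refs_32(zombie_refs))  # wrapped, negative total
print(sum_refs_64(zombie_refs))  # true total
```

With these made-up counts the narrow accumulator wraps to a negative number while the wide one keeps the true total, which is the failure mode the hypothesis suspects when a large pack of zombies is reaped at once.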