[Bug 260438] dns/bind-tools: dig SIGABRT under high load

From: <bugzilla-noreply_at_freebsd.org>
Date: Wed, 15 Dec 2021 14:24:31 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=260438

            Bug ID: 260438
           Summary: dns/bind-tools: dig SIGABRT under high load
           Product: Ports & Packages
           Version: Latest
          Hardware: Any
                OS: Any
            Status: New
          Severity: Affects Some People
          Priority: ---
         Component: Individual Port(s)
          Assignee: mat@FreeBSD.org
          Reporter: david@isnic.is
             Flags: maintainer-feedback?(mat@FreeBSD.org)

We have some shell scripts to set up zones on new slaves. Since we have a large
number (tens of thousands) of zones, we do this in parallel: each job queries
the slave with dig to see whether the zone is set up and, if it is not, sends
an `rndc addzone`.

When doing a bulk provisioning like this, we see dig very occasionally die with
SIGABRT:
```
[root@hfp-master /usr/home/ansible]# dmesg
pid 72286 (dig), jid 0, uid 0: exited on signal 6 (core dumped)
```

Error output when this happens:
```
dighost.c:2628: REQUIRE((__builtin_expect(!!(((query)) != ((void *)0)), 1) &&
__builtin_expect(!!(((const isc__magic_t *)((query)))->magic == ((('D') << 24 |
('i') << 16 | ('g') << 8 | ('q')))), 1))) failed, back trace
#0 0x4359ba in ??
#1 0x43594a in ??
#2 0x2bc814 in ??
#3 0x462a33 in ??
#4 0x44ade8 in ??
#5 0x447225 in ??
#6 0x800a39ada in ??
#7 0x800a4ac1b in ??
#8 0x800a3a051 in ??
#9 0x44735b in ??
#10 0x465135 in ??
Abort trap (core dumped)
```
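
For reference, the constant in the failed REQUIRE is a four-character magic
tag packed into 32 bits. The arithmetic below is only an illustration of the
expression printed in the assertion (the variable name `magic` is mine); it
shows the tag spells "Digq". In other words, the check failed because `query`
was either NULL or did not carry that tag. That is what one would expect if
the query structure had already been destroyed or invalidated, though the
trace alone does not prove it.

```sh
#!/bin/sh
# Recompute the constant from the assertion text:
#   ('D') << 24 | ('i') << 16 | ('g') << 8 | ('q')
# using the ASCII codes D=0x44, i=0x69, g=0x67, q=0x71.
magic=$(( (0x44 << 24) | (0x69 << 16) | (0x67 << 8) | 0x71 ))
printf '0x%08x\n' "$magic"
```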

This only seems to happen under load, running multiple dig commands in parallel
in a tight loop.

I've tried to put together a concise repro case that doesn't require sharing
our whole DNS deployment script set, but it isn't ready yet. I'll add it later
if I get it working.
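
In the meantime, the load pattern can be sketched roughly as follows. This is
not our deployment script and has not been confirmed to trigger the abort; it
only illustrates "many dig processes in parallel in a tight loop". The function
name `hammer`, the `DIG` variable, the zone list file and the choice of an SOA
query with `+norecurse` are all assumptions made for the sketch.

```sh
#!/bin/sh
# Sketch only: SERVER, ZONEFILE and JOBS are placeholders, not our real setup.
DIG=${DIG:-dig}

# hammer SERVER ZONEFILE JOBS
# Query the SOA of every zone listed in ZONEFILE (one zone per line) against
# SERVER, in batches of JOBS parallel dig processes, and report any dig whose
# exit status is above 128 (which typically means it was killed by a signal).
hammer() {
    server=$1
    zonefile=$2
    jobs=$3
    n=0
    while read -r zone; do
        (
            "$DIG" +short +norecurse "@$server" "$zone" SOA >/dev/null 2>&1
            rc=$?
            if [ "$rc" -gt 128 ]; then
                echo "$zone: dig exited on signal $((rc - 128))"
            fi
        ) </dev/null &
        n=$((n + 1))
        # Wait for the current batch to finish before starting the next one.
        if [ $((n % jobs)) -eq 0 ]; then
            wait
        fi
    done < "$zonefile"
    wait
}
```

It would be called as, for example, `hammer 192.0.2.53 zones.txt 32`, where the
address and file name are made up. A SIGABRT would show up as "signal 6".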

This looks extremely similar to the following upstream bugs:
https://gitlab.isc.org/isc-projects/bind9/-/issues/1981
https://gitlab.isc.org/isc-projects/bind9/-/issues/1971
https://gitlab.isc.org/isc-projects/bind9/-/issues/1956

Based on the comments there, this should have been fixed by this MR:
https://gitlab.isc.org/isc-projects/bind9/-/merge_requests/3721

However, it seems this still happens on FreeBSD.

We are running bind-tools-9.16.23.
