svn commit: r360233 - in head: contrib/jemalloc . . . : This partially breaks a 2-socket 32-bit powerpc (old PowerMac G4) based on head -r360311
Mark Millard
marklmi at yahoo.com
Sun May 3 21:32:42 UTC 2020
[The bit argument ot bitmap_unset seems to be way
too large.]
On 2020-May-3, at 11:08, Mark Millard <marklmi@ at yahoo.com> wrote:
> [At around 4AM local time dhcient got a signal 11,
> despite the jemalloc revert. The other exmaples
> have not happened.]
>
> On 2020-May-2, at 18:46, Mark Millard <marklmi at yahoo.com> wrote:
>
>> [I'm only claiming the new jemalloc is involved and that
>> reverting avoids the problem.]
>>
>> I've been reporting to some lists problems with:
>>
>> dhclient
>> sendmail
>> rpcbind
>> mountd
>> nfsd
>>
>> getting SIGSEGV (signal 11) crashes and some core
>> dumps on the old 2-socket (1 core per socket) 32-bit
>> PowerMac G4 running head -r360311.
>>
>> Mikaël Urankar sent a note suggesting that I try
>> testing reverting head -r360233 for my head -r360311
>> context. He got it right . . .
>>
>>
>> Context:
>>
>> The problem was noticed by an inability to have
>> other machines do a:
>>
>> mount -onoatime,soft OLDPOWERMAC-LOCAL-IP:/... /mnt
>>
>> sort of operation and to have succeed. By contrast, on
>> the old PowerMac G4 I could initiate mounts against
>> other machines just fine.
>>
>> I do not see any such problems on any of (all based
>> on head -r360311):
>>
>> powerpc64 (old PowerMac G5 2-sockets with 2 cores each)
>> armv7 (OrangePi+ 2ed)
>> aarch64 (Rock64, RPi4, RPi3,
>> OverDrive 1000,
>> Macchiatobin Double Shot)
>> amd64 (ThreadRipper 1950X)
>>
>> So I expect something 32-bit powerpc specific
>> is somehow involved, even if jemalloc is only
>> using whatever it is.
>>
>> (A kyua run with a debug kernel did not find other
>> unexpected signal 11 sources on the 32-bit PowerMac
>> compared to past kyua runs, at least that I noticed.
>> There were a few lock order reversals that I do not
>> know if they are expected or known-safe or not.
>> I've reported those reversals to the lists as well.)
>>
>>
>> Recent experiments based on the suggestion:
>>
>> Doing the buildworld, buildkernel and installing just
>> the new kernel and rebooting made no difference.
>>
>> But then installing the new world and rebooting did
>> make things work again: I no longer get core files
>> for the likes of (old cores from before the update):
>>
>> # find / -name "*.core" -print
>> /var/spool/clientmqueue/sendmail.core
>> /rpcbind.core
>> /mountd.core
>> /nfsd.core
>>
>> Nor do I see the various notices for sendmail
>> signal 11's that did not leave behind a core file
>> --or for dhclient (no core file left behind).
>> And I can mount the old PowerMac's drive from
>> other machines just fine.
>>
>>
>> Other notes:
>>
>> I do not actively use sendmail but it was left
>> to do its default things, partially to test if
>> such default things are working. Unfortunately,
>> PowerMacs have a problematical status under
>> FreeBSD and my context has my historical
>> experiments with avoiding various problems.
>
> Looking, I see that I got a:
>
> pid 572 (dhclient), jid 0, uid 0: exited on signal 11 (core dumped)
>
> notice under the reverted build. No instances
> of the other examples. This is the first that a
> dhclient example has produced a .core file.
>
> gdb indicates 0x5180936c for r7 in:
>
> lwz r8,36(r7)
>
> as leading to the failure. This was in
> arena_dalloc_bin_locked_impl (where
> arena_slab_reg_dalloc and bitmap_unset
> were apparently inlined).
>
> The chain for the example seems to be:
> fork_privchld -> dispatch_imsg -> jemalloc
>
> For reference . . .
>
> # gdb dhclient /dhclient.core
> GNU gdb (GDB) 9.1 [GDB v9.1 for FreeBSD]
> Copyright (C) 2020 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
> . . .
> Reading symbols from dhclient...
> Reading symbols from /usr/lib/debug//sbin/dhclient.debug...
> [New LWP 100089]
> Core was generated by `dhclient: gem0 [priv]'.
> Program terminated with signal SIGSEGV, Segmentation fault.
> #0 bitmap_unset (bitmap=0x50407164, binfo=<optimized out>, bit=167842154) at /usr/powerpc32_src/contrib/jemalloc/include/jemalloc/internal/bitmap.h:341
> 341 /usr/powerpc32_src/contrib/jemalloc/include/jemalloc/internal/bitmap.h: No such file or directory.
> (gdb) bt -full
> #0 bitmap_unset (bitmap=0x50407164, binfo=<optimized out>, bit=167842154) at /usr/powerpc32_src/contrib/jemalloc/include/jemalloc/internal/bitmap.h:341
> goff = <optimized out>
> gp = 0x51809390
> propagate = <optimized out>
> g = <optimized out>
> i = <optimized out>
> #1 arena_slab_reg_dalloc (slab=0x50407140, slab_data=0x50407164, ptr=0x50088b50) at jemalloc_arena.c:273
> bin_info = <optimized out>
> binind = 0
> regind = 167842154
> #2 arena_dalloc_bin_locked_impl (tsdn=0x5009f018, arena=<optimized out>, slab=<optimized out>, ptr=<optimized out>, junked=<optimized out>) at jemalloc_arena.c:1540
> slab_data = <optimized out>
> binind = <optimized out>
> bin_info = <optimized out>
> bin = <optimized out>
> nfree = <optimized out>
> #3 0x502916a8 in __je_arena_dalloc_bin_junked_locked (tsdn=<optimized out>, arena=<optimized out>, extent=<optimized out>, ptr=<optimized out>) at jemalloc_arena.c:1559
> No locals.
> #4 0x50250d2c in __je_tcache_bin_flush_small (tsd=0x5009f018, tcache=<optimized out>, tbin=0x5009f1c0, binind=<optimized out>, rem=24) at jemalloc_tcache.c:149
> ptr = <optimized out>
> i = 0
> extent = 0x50407140
> bin_arena = 0x50400380
> bin = <optimized out>
> ndeferred = 0
> merged_stats = <optimized out>
> arena = 0x50400380
> nflush = 75
> __vla_expr0 = <optimized out>
> item_extent = 0xffffd1f0
> #5 0x502508a0 in __je_tcache_event_hard (tsd=<optimized out>, tcache=0x5009f108) at jemalloc_tcache.c:54
> tbin_info = <optimized out>
> binind = 7
> tbin = 0x5009f1c0
> #6 0x5029a684 in __free (ptr=0x500530c0) at /usr/powerpc32_src/contrib/jemalloc/include/jemalloc/internal/rtree.h:374
> tcache = 0x5009f108
> tsd = <optimized out>
> log_var = <optimized out>
> log_var = <optimized out>
> #7 0x10025994 in dispatch_imsg (ifix=<optimized out>, fd=10) at /usr/powerpc32_src/sbin/dhclient/privsep.c:215
> hdr = {code = IMSG_SCRIPT_WRITE_PARAMS, len = 3225}
> lease = {next = 0x0, expiry = 1588504529, renewal = 1588504229, rebind = 1588504454, address = {len = 4, iabuf = "\300\250\001i", '\000' <repeats 11 times>}, nextserver = {len = 4,
> iabuf = '\000' <repeats 15 times>}, server_name = 0x0, filename = 0x0, medium = 0x0, is_static = 0, is_bootp = 0, options = {{len = 0, data = 0x0}, {len = 4,
> data = 0x500530c8 "\377\377\377"}, {len = 0, data = 0x0}, {len = 4, data = 0x500530d0 "\300\250\001\001"}, {len = 0, data = 0x0}, {len = 0, data = 0x0}, {len = 4,
> data = 0x500530d8 "\300\250\001\001"}, {len = 0, data = 0x0}, {len = 0, data = 0x0}, {len = 0, data = 0x0}, {len = 0, data = 0x0}, {len = 0, data = 0x0}, {len = 0, data = 0x0}, {
> len = 0, data = 0x0}, {len = 0, data = 0x0}, {len = 20, data = 0x50055200 "hsd1.or.comcast.net."}, {len = 0, data = 0x0} <repeats 35 times>, {len = 4, data = 0x500530e0 ""}, {len = 0,
> data = 0x0}, {len = 1, data = 0x500530e8 "\005"}, {len = 4, data = 0x500530f0 "\300\250\001\001"}, {len = 0, data = 0x0} <repeats 201 times>}}
> medium_len = <optimized out>
> medium = <optimized out>
> totlen = 3225
> filename_len = <optimized out>
> filename = 0x0
> ret = <optimized out>
> buf = <optimized out>
> mtu = <optimized out>
> servername_len = <optimized out>
> servername = 0x0
> reason_len = <optimized out>
> reason = <optimized out>
> --Type <RET> for more, q to quit, c to continue without paging--
> prefix_len = <optimized out>
> prefix = 0x500530c0 "new_"
> i = 0
> optlen = 0
> #8 0x100189f4 in fork_privchld (fd=10, fd2=<optimized out>) at /usr/powerpc32_src/sbin/dhclient/dhclient.c:2847
> pfd = {{fd = 10, events = 1, revents = 1}}
> nfds = <optimized out>
> #9 0x10017a80 in main (argc=<optimized out>, argv=<optimized out>) at /usr/powerpc32_src/sbin/dhclient/dhclient.c:505
> pipe_fd = {10, 11}
> rights = {cr_rights = {1342801412, 18446706484155777024}}
> immediate_daemon = 0
> i = 0
> ch = <optimized out>
> otherpid = 8
> pw = 0x5039b9d8
> fd = <optimized out>
> capmode = <optimized out>
>
> (gdb) disass
> Dump of assembler code for function arena_dalloc_bin_locked_impl:
> 0x502916b8 <+0>: mflr r0
> 0x502916bc <+4>: stw r0,4(r1)
> 0x502916c0 <+8>: stwu r1,-48(r1)
> 0x502916c4 <+12>: stw r30,40(r1)
> 0x502916c8 <+16>: stw r24,16(r1)
> 0x502916cc <+20>: stw r25,20(r1)
> 0x502916d0 <+24>: stw r26,24(r1)
> 0x502916d4 <+28>: stw r27,28(r1)
> 0x502916d8 <+32>: stw r28,32(r1)
> 0x502916dc <+36>: stw r29,36(r1)
> 0x502916e0 <+40>: bl 0x502916e4 <arena_dalloc_bin_locked_impl+44>
> 0x502916e4 <+44>: mr r27,r3
> 0x502916e8 <+48>: mflr r30
> 0x502916ec <+52>: addis r30,r30,14
> 0x502916f0 <+56>: addi r30,r30,7788
> 0x502916f4 <+60>: mr r28,r4
> 0x502916f8 <+64>: lwz r4,5856(r30)
> 0x502916fc <+68>: lwz r3,4(r5)
> 0x50291700 <+72>: mr r29,r5
> 0x50291704 <+76>: andi. r5,r7,1
> 0x50291708 <+80>: mr r26,r6
> 0x5029170c <+84>: lbz r4,0(r4)
> 0x50291710 <+88>: rlwinm r5,r3,14,25,31
> 0x50291714 <+92>: mulli r24,r5,224
> 0x50291718 <+96>: mulli r25,r5,44
> 0x5029171c <+100>: cmpwi cr1,r4,0
> 0x50291720 <+104>: cror 4*cr5+lt,4*cr1+eq,gt
> 0x50291724 <+108>: bge cr5,0x50291a2c <arena_dalloc_bin_locked_impl+884>
> 0x50291728 <+112>: lwz r4,0(r29)
> 0x5029172c <+116>: lwz r6,6036(r30)
> 0x50291730 <+120>: lwz r7,8(r29)
> 0x50291734 <+124>: rlwinm r8,r5,2,0,29
> 0x50291738 <+128>: li r9,1
> 0x5029173c <+132>: add r24,r28,r24
> 0x50291740 <+136>: lwzx r6,r6,r8
> 0x50291744 <+140>: subf r7,r7,r26
> 0x50291748 <+144>: mulhwu r6,r6,r7
> 0x5029174c <+148>: rlwinm r7,r6,29,3,29
> 0x50291750 <+152>: add r7,r29,r7
> => 0x50291754 <+156>: lwz r8,36(r7)
> 0x50291758 <+160>: clrlwi r10,r6,27
> 0x5029175c <+164>: slw r9,r9,r10
> 0x50291760 <+168>: xor r9,r9,r8
> 0x50291764 <+172>: cmplwi r8,0
> 0x50291768 <+176>: stw r9,36(r7)
> 0x5029176c <+180>: bne 0x502917e4 <arena_dalloc_bin_locked_impl+300>
> 0x50291770 <+184>: lwz r7,4408(r30)
> 0x50291774 <+188>: mulli r8,r5,44
> 0x50291778 <+192>: add r5,r7,r8
> 0x5029177c <+196>: lwz r5,16(r5)
> 0x50291780 <+200>: cmplwi r5,2
> 0x50291784 <+204>: blt 0x502917e4 <arena_dalloc_bin_locked_impl+300
> . . .
>
> (gdb) info reg
> r0 0x502916a8 1344870056
> r1 0xffffd1a0 4294955424
> r2 0x500a6018 1342857240
> r3 0x0 0
> r4 0x0 0
> r5 0x0 0
> r6 0xa01116a 167842154
> r7 0x5180936c 1367380844
> r8 0x0 0
> r9 0x1 1
> r10 0x1e 30
> r11 0x5005d114 1342558484
> r12 0x84000c00 2214595584
> r13 0x0 0
> r14 0xffffd1f0 4294955504
> r15 0xfffffffc 4294967292
> r16 0x4a 74
> r17 0x4b 75
> r18 0x0 0
> r19 0x504009a0 1346374048
> r20 0x0 0
> r21 0xffffd1f0 4294955504
> r22 0x620 1568
> r23 0x50400380 1346372480
> r24 0x50400380 1346372480
> r25 0x0 0
> r26 0x50088b50 1342737232
> r27 0x5009f018 1342828568
> r28 0x50400380 1346372480
> r29 0x50407140 1346400576
> r30 0x50373550 1345795408
> r31 0xffffd310 4294955792
> pc 0x50291754 0x50291754 <arena_dalloc_bin_locked_impl+156>
> msr <unavailable>
> cr 0x42480c00 1112017920
> lr 0x502916e4 0x502916e4 <arena_dalloc_bin_locked_impl+44>
> ctr 0x5005d114 1342558484
> xer 0x0 0
> fpscr 0x0 0
> vscr <unavailable>
> vrsave <unavailable>
bitmap_unset (bitmap=0x50407164, binfo=<optimized out>, bit=167842154)
explains calculating: gp = 0x51809390
via bitmap+(bit/4/8):
(gdb) print/x 0x50407164 +167842154/4/8
$16 = 0x51809390
The last potential bit/4/8 value to be able to access memory (without
spanning a hole) is:
(gdb) print *(bitmap+582566)
$13 = 0
(gdb) print/x (bitmap+582566)
$14 = 0x5063fffc
So it looks like arena_slab_reg_dalloc produced an
invalid bit value. Looking at that code shows that
regind hold the parameter value that matches:
static void
arena_slab_reg_dalloc(extent_t *slab, arena_slab_data_t *slab_data, void *ptr) {
szind_t binind = extent_szind_get(slab);
const bin_info_t *bin_info = &bin_infos[binind];
size_t regind = arena_slab_regind(slab, binind, ptr);
assert(extent_nfree_get(slab) < bin_info->nregs);
/* Freeing an unallocated pointer can cause assertion failure. */
assert(bitmap_get(slab_data->bitmap, &bin_info->bitmap_info, regind));
bitmap_unset(slab_data->bitmap, &bin_info->bitmap_info, regind);
extent_nfree_inc(slab);
}
The backtrace showed binind==0 for arena_slab_reg_dalloc.
That leaves:
arena_slab_regind(slab, binind, ptr)
as producing the odd value.
size_t
arena_slab_regind(extent_t *slab, szind_t binind, const void *ptr) {
size_t diff, regind;
/* Freeing a pointer outside the slab can cause assertion failure. */
assert((uintptr_t)ptr >= (uintptr_t)extent_addr_get(slab));
assert((uintptr_t)ptr < (uintptr_t)extent_past_get(slab));
/* Freeing an interior pointer can cause assertion failure. */
assert(((uintptr_t)ptr - (uintptr_t)extent_addr_get(slab)) %
(uintptr_t)bin_infos[binind].reg_size == 0);
diff = (size_t)((uintptr_t)ptr - (uintptr_t)extent_addr_get(slab));
/* Avoid doing division with a variable divisor. */
regind = div_compute(&arena_binind_div_info[binind], diff);
assert(regind < bin_infos[binind].nregs);
return regind;
}
ptr == 0x50088b50
slab == 0x50407140
static inline void *
extent_addr_get(const extent_t *extent) {
assert(extent->e_addr == PAGE_ADDR2BASE(extent->e_addr) ||
!extent_slab_get(extent));
return extent->e_addr;
}
(gdb) print *slab
$17 = {e_bits = 0, e_addr = 0x0, {e_size_esn = 0, e_bsize = 0}, ql_link = {qre_next = 0x0, qre_prev = 0x0}, ph_link = {phn_prev = 0x0, phn_next = 0x0, phn_lchild = 0x0}, {e_slab_data = {bitmap = {
0 <repeats 17 times>}}, e_prof_tctx = {repr = 0x0}}}
That looks wrong: all fields are zero, which is not
likely to be the description of a slab. But I'll continue
to be sure I get the reported value of bit.
So extent_addr_get(slab)==slab->e_addr
and slab->e_addr==0x0
and diff==ptr .
(gdb) print/x arena_binind_div_info[binind]
$19 = {magic = 0x20000000}
static inline size_t
div_compute(div_info_t *div_info, size_t n) {
assert(n <= (uint32_t)-1);
/*
* This generates, e.g. mov; imul; shr on x86-64. On a 32-bit machine,
* the compilers I tried were all smart enough to turn this into the
* appropriate "get the high 32 bits of the result of a multiply" (e.g.
* mul; mov edx eax; on x86, umull on arm, etc.).
*/
size_t i = ((uint64_t)n * (uint64_t)div_info->magic) >> 32;
#ifdef JEMALLOC_DEBUG
assert(i * div_info->d == n);
#endif
return i;
}
(gdb) print/x ((unsigned long long)0x50088b50 * (unsigned long long)0x20000000) >> 32
$21 = 0xa01116a
(gdb) print ((unsigned long long)0x50088b50 * (unsigned long long)0x20000000) >> 32
$22 = 167842154
(As reported.)
So returning to *slab being all zero . . .
The slab value in the call chain seems to trace back to=
__je_tcache_bin_flush_small code:
bin_t *bin = &bin_arena->bins[binind];
. . .
malloc_mutex_lock(tsd_tsdn(tsd), &bin->lock);
. . .
for (unsigned i = 0; i < nflush; i++) {
void *ptr = *(tbin->avail - 1 - i);
extent = item_extent[i];
assert(ptr != NULL && extent != NULL);
if (extent_arena_get(extent) == bin_arena) {
arena_dalloc_bin_junked_locked(tsd_tsdn(tsd),
bin_arena, extent, ptr);
. . .
malloc_mutex_unlock(tsd_tsdn(tsd), &bin->lock);
(So ptr's value here is later slab's value in the call
chain.)
The backtrace shows binind = 7 via __je_tcache_event_hard .
(Not the same as the earlier binind.)
#4 0x50250d2c in __je_tcache_bin_flush_small (tsd=0x5009f018, tcache=<optimized out>, tbin=0x5009f1c0, binind=<optimized out>, rem=24) at jemalloc_tcache.c:149
ptr = <optimized out>
i = 0
extent = 0x50407140
bin_arena = 0x50400380
bin = <optimized out>
ndeferred = 0
merged_stats = <optimized out>
arena = 0x50400380
nflush = 75
__vla_expr0 = <optimized out>
item_extent = 0xffffd1f0
(gdb) print/x bin_arena->bins[7]
$44 = {lock = {{{prof_data = {tot_wait_time = {ns = 0x0}, max_wait_time = {ns = 0x0}, n_wait_times = 0x0, n_spin_acquired = 0x0, max_n_thds = 0x0, n_waiting_thds = {repr = 0x0},
n_owner_switches = 0x0, prev_owner = 0x0, n_lock_ops = 0x0}, lock = 0x0, postponed_next = 0x504021d0}, witness = {name = 0x0, rank = 0x0, comp = 0x0, opaque = 0x0, link = {qre_next = 0x0,
qre_prev = 0x0}}, lock_order = 0x0}}, slabcur = 0x50407140, slabs_nonfull = {ph_root = 0x0}, slabs_full = {qlh_first = 0x0}, stats = {nmalloc = 0x64, ndalloc = 0x0, nrequests = 0x1,
curregs = 0x64, nfills = 0x1, nflushes = 0x1, nslabs = 0x1, reslabs = 0x0, curslabs = 0x1, mutex_data = {tot_wait_time = {ns = 0x0}, max_wait_time = {ns = 0x0}, n_wait_times = 0x0,
n_spin_acquired = 0x0, max_n_thds = 0x0, n_waiting_thds = {repr = 0x0}, n_owner_switches = 0x0, prev_owner = 0x0, n_lock_ops = 0x0}}}
That indicates: bin_arena->bins[7]->lock = 0x0 .
Expected? Single threaded context?
(gdb) print *item_extent[0]
$27 = {e_bits = 0, e_addr = 0x0, {e_size_esn = 0, e_bsize = 0}, ql_link = {qre_next = 0x0, qre_prev = 0x0}, ph_link = {phn_prev = 0x0, phn_next = 0x0, phn_lchild = 0x0}, {e_slab_data = {bitmap = {
0 <repeats 17 times>}}, e_prof_tctx = {repr = 0x0}}}
Other *item_extent[INDEX] that I tried got the same: all zeros.
This is what contributed to the huge bit value.
item_extent[] is based on the declaration:
VARIABLE_ARRAY(extent_t *, item_extent, nflush);
and:
/* Declare a variable-length array. */
#if __STDC_VERSION__ < 199901L
# ifdef _MSC_VER
# include <malloc.h>
# define alloca _alloca
# else
# ifdef JEMALLOC_HAS_ALLOCA_H
# include <alloca.h>
# else
# include <stdlib.h>
# endif
# endif
# define VARIABLE_ARRAY(type, name, count) \
type *name = alloca(sizeof(type) * (count))
#else
# define VARIABLE_ARRAY(type, name, count) type name[(count)]
#endif
WARNING: C11 turned VLAs into a conditional feature
(__STDC_NO_VLA__). Only C99 has it as required. Thus the
above definition of VARIABLE_ARRAY is incomplete or
limited to C99 and before relative the the language
vintages.
Looking around, the stack frames seem to span the
space okay:
(gdb) print/x &item_extent[75]
$32 = 0xffffd31c
(gdb) print/x &item_extent[0]
$33 = 0xffffd1f0
r1 0xffffd1a0 4294955424
r14 0xffffd1f0 4294955504
r15 0xfffffffc 4294967292
r21 0xffffd1f0 4294955504
(gdb) print/x *(void**)0xffffd1a0
$36 = 0xffffd1d0
(gdb) print/x *(void**)0xffffd1d0
$37 = 0xffffd1e0
(gdb) print/x *(void**)0xffffd1e0
$38 = 0xffffd440
(gdb) print/x *(void**)0xffffd440
$39 = 0xffffd460
And I've run out of ideas for what else to look at
(for now). (It is not like I understand jemalloc.)
(Last I knew, 32-bit powerpc did not have red-zone
stack-space criteria to leave room for signals
to use.)
===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)
More information about the freebsd-current
mailing list