Slow reboots due to ZFS cleanup in kern_shutdown() .. zio_fini()
Peter Eriksson
pen at lysator.liu.se
Wed Dec 4 12:48:10 UTC 2019
Going deeper… now it’s pointing towards pmap_remove(), called from kmem_unback(), which is called from kmem_free(), which is called from page_free() which is called from keg_free_slab()...
kmem_unback: pmap_remove(kernel_pmap, 18446741884786954240, 18446741884786970624) took 1 seconds
kmem_free: kmem_unback(kmem_object, 18446741884786954240, 16384) took 1 seconds
page_free: kmem_free(kmem_arena, 18446741884786954240, 16384) took 1 seconds
keg_free_slab: keg->uk_freef(mem) {page_free} took 1 seconds
kmem_unback: pmap_remove(kernel_pmap, 18446741884989739008, 18446741884989755392) took 1 seconds
kmem_free: kmem_unback(kmem_object, 18446741884989739008, 16384) took 1 seconds
page_free: kmem_free(kmem_arena, 18446741884989739008, 16384) took 1 seconds
keg_free_slab: keg->uk_freef(mem) {page_free} took 1 seconds
kmem_unback: pmap_remove(kernel_pmap, 18446741885257895936, 18446741885257912320) took 1 seconds
kmem_free: kmem_unback(kmem_object, 18446741885257895936, 16384) took 1 seconds
page_free: kmem_free(kmem_arena, 18446741885257895936, 16384) took 1 seconds
keg_free_slab: keg->uk_freef(mem) {page_free} took 1 seconds
- Peter
> On 4 Dec 2019, at 11:13, Peter Eriksson <pen at lysator.liu.se> wrote:
>
> Yeah, I noticed that after I sent that email (should send stuff just before you go to bed :-)
>
> Anyway, some more timing/debug printfs point to the keg->uk_freef(mem) call in keg_free_slab() which sometimes take about a second to execute.
> (Should probably switch to some other time source with greater resolution than 1s next :-)
>
> keg_free_slab: keg->uk_freef(mem) took 1 seconds
> keg_free_slab: keg->uk_freef(mem) took 1 seconds
> keg_free_slab: keg->uk_freef(mem) took 1 seconds
> keg_free_slab: keg->uk_freef(mem) took 1 seconds
> keg_free_slab: keg->uk_freef(mem) took 1 seconds
> keg_free_slab: keg->uk_freef(mem) took 1 seconds
> keg_free_slab: keg->uk_freef(mem) took 1 seconds
>
> - Peter
>
>> On 4 Dec 2019, at 10:48, Andriy Gapon <avg at FreeBSD.org> wrote:
>>
>> On 04/12/2019 00:28, Peter Eriksson wrote:
>>> … and X levels deeper it gives this (not all filesystems mounted - I rebooted fairly quickly):
>>>
>>> keg_drain: while (SLIST_FIRST/SLIST_REMOVE)-loop took 14 seconds [20021 loops]
>>> zone_drain_wait(): zone_foreach_keg(zone, &keg_drain) took 14 seconds
>>> zone_dtor(): zone_drain_wait(zone, M_WAITOK) took 14 seconds
>>> zone_free_item(zone=UMA Zones): zone->uz_dtor() took 14 seconds
>>> uma_zdestroy(zio_buf_12288) took 14 seconds
>>> kmem_cache_destroy: uma_zdestroy(0xfffff803467c8ac0) [zio_buf_12288] took 14 seconds
>>> kmem_cache_destroy(zio_buf_cache[20]) took 14 seconds
>>>
>>>
>>> Hmm… I wonder if it isn’t this code I /usr/src/sys/sys/queue.h:
>>>
>>>> #define SLIST_REMOVE(head, elm, type, field) do { \
>>>> QMD_SAVELINK(oldnext, (elm)->field.sle_next); \
>>>> if (SLIST_FIRST((head)) == (elm)) { \
>>>> SLIST_REMOVE_HEAD((head), field); \
>>>> } \
>>>> else { \
>>>> QUEUE_TYPEOF(type) *curelm = SLIST_FIRST(head); \
>>>> while (SLIST_NEXT(curelm, field) != (elm)) \
>>>> curelm = SLIST_NEXT(curelm, field); \
>>>> SLIST_REMOVE_AFTER(curelm, field); \
>>>> } \
>>>> TRASHIT(*oldnext); \
>>>> } while (0)
>>>
>>> Combined with this in /usr/src/sys/vm/uma_core.c: keg_drain():
>>>
>>>> finished:
>>>> KEG_UNLOCK(keg);
>>>>
>>>> while ((slab = SLIST_FIRST(&freeslabs)) != NULL) {
>>>> SLIST_REMOVE(&freeslabs, slab, uma_slab, us_hlink);
>>>> keg_free_slab(keg, slab, keg->uk_ipers);
>>>> }
>>>
>>> (The keg_drain print above is the time for the while() loop).
>>>
>>> If I’m reading this right it looks like it’s potentially doing a linear search through a (long) linked list, for every entry in the free slabs list. 20000 entries in 14 seconds is 1400 entries per second.
>>
>> It's not a search. The loop is removing *all* items. There is no other way to
>> do it :-)
>> Expanding your calculations, it takes on the order of a millisecond to process a
>> single slab. I bet that the time is spent in keg_free_slab() and it's a lot of
>> time.
>>
>> I wonder if the OR should actually be AND in this condition:
>> if (!uma_dbg_kskip(keg, slab->us_data + (keg->uk_rsize * i)) ||
>> keg->uk_fini != trash_fini)
>> Looks like there is a "ton" of memory reading done here.
>>
>> --
>> Andriy Gapon
>
> _______________________________________________
> freebsd-fs at freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe at freebsd.org"
More information about the freebsd-fs
mailing list