Re: poudriere bulk with ZFS and USE_TMPFS=no on main [14-ALPHA2 based]: extensive vlruwk for cpdup's on new builders after pkg builds in first builder

From: Mark Millard <marklmi_at_yahoo.com>
Date: Thu, 24 Aug 2023 07:22:00 UTC
On Aug 23, 2023, at 22:54, Mateusz Guzik <mjguzik@gmail.com> wrote:

> On 8/24/23, Mark Millard <marklmi@yahoo.com> wrote:
>> On Aug 23, 2023, at 15:10, Mateusz Guzik <mjguzik@gmail.com> wrote:
>> 
>>> On 8/23/23, Mark Millard <marklmi@yahoo.com> wrote:
>>>> [Forked off the ZFS deadlock 14 discussion, per feedback.]
>>>> . . .
>>> 
>>> This is a known problem, but it is unclear if you should be running
>>> into it in this setup.
>> 
>> The changed fixed the issue: so I do run into the the issue
>> for this setup. See below.
>> 
>>> Can you try again but this time *revert*
>>> 138a5dafba312ff39ce0eefdbe34de95519e600d, like so:
>>> git revert 138a5dafba312ff39ce0eefdbe34de95519e600d
>>> 
>>> may want to switch to a different branch first, for example: git
>>> checkout -b vfstesting
>> 
>> # git -C /usr/main-src/ diff sys/kern/vfs_subr.c
>> diff --git a/sys/kern/vfs_subr.c b/sys/kern/vfs_subr.c
>> index 0f3f00abfd4a..5dff556ac258 100644
>> --- a/sys/kern/vfs_subr.c
>> +++ b/sys/kern/vfs_subr.c
>> @@ -3528,25 +3528,17 @@ vdbatch_process(struct vdbatch *vd)
>>        MPASS(curthread->td_pinned > 0);
>>        MPASS(vd->index == VDBATCH_SIZE);
>>  +       mtx_lock(&vnode_list_mtx);
>>        critical_enter();
>> -       if (mtx_trylock(&vnode_list_mtx)) {
>> -               for (i = 0; i < VDBATCH_SIZE; i++) {
>> -                       vp = vd->tab[i];
>> -                       vd->tab[i] = NULL;
>> -                       TAILQ_REMOVE(&vnode_list, vp, v_vnodelist);
>> -                       TAILQ_INSERT_TAIL(&vnode_list, vp, v_vnodelist);
>> -                       MPASS(vp->v_dbatchcpu != NOCPU);
>> -                       vp->v_dbatchcpu = NOCPU;
>> -               }
>> -               mtx_unlock(&vnode_list_mtx);
>> -       } else {
>> -               for (i = 0; i < VDBATCH_SIZE; i++) {
>> -                       vp = vd->tab[i];
>> -                       vd->tab[i] = NULL;
>> -                       MPASS(vp->v_dbatchcpu != NOCPU);
>> -                       vp->v_dbatchcpu = NOCPU;
>> -               }
>> +       for (i = 0; i < VDBATCH_SIZE; i++) {
>> +               vp = vd->tab[i];
>> +               TAILQ_REMOVE(&vnode_list, vp, v_vnodelist);
>> +               TAILQ_INSERT_TAIL(&vnode_list, vp, v_vnodelist);
>> +               MPASS(vp->v_dbatchcpu != NOCPU);
>> +               vp->v_dbatchcpu = NOCPU;
>>        }
>> +       mtx_unlock(&vnode_list_mtx);
>> +       bzero(vd->tab, sizeof(vd->tab));
>>        vd->index = 0;
>>        critical_exit();
>> }
>> 
>> Still with:
>> 
>> # grep USE_TMPFS= /usr/local/etc/poudriere.conf
>> # EXAMPLE: USE_TMPFS="wrkdir data"
>> #USE_TMPFS=all
>> #USE_TMPFS="data"
>> USE_TMPFS=no
>> 
>> 
>> That allowed the other builders to eventually reach "Builder started"
>> and later activity, [00:05:50] [27] [00:02:29] Builder started
>> being the first non-[01] to do so, no vlruwk's observed in what
>> I saw in top:
>> 
>> . . .
>> 
>> Now testing for the zfs deadlock issue should be possible for
>> this setup.
>> 
> 
> Thanks for testing, I wrote a fix:
> 
> https://people.freebsd.org/~mjg/vfs-recycle-fix.diff
> 
> Applies to *stock* kernel (as in without the revert).

I'm going to leave the deadlock test running for when
I sleep tonight. So it is going to be a while before
I get to testing this. $ work will likely happen first
as well. (No deadlock observed yet, by the way. 6+ hrs
and 3000+ ports built so far.)

I can easily restore the sys/kern/vfs_subr.c to then
do normal 14.0-ALPHA2-ish based patching with: so not
a problem. Thanks.


===
Mark Millard
marklmi at yahoo.com