[Bug 262189] ZFS volume not showing up in /dev/zvol when 1 CPU

From: <bugzilla-noreply_at_freebsd.org>
Date: Sat, 26 Feb 2022 12:29:12 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=262189

--- Comment #6 from Janis <zedupsys@gmail.com> ---
Thanks for the useful information; I did not know the internals in this
regard. Now I understand what bug #250297 is talking about. That bug seemed
like it might be relevant to my case. I tried to reproduce it with a zfs
create/destroy shell script loop and could not hit the kernel panic stated
there in the comments. I had not understood that he was talking about how
volmode=dev is created internally, so his use case seemed a bit bizarre.

About the asynchronous nature of the "zfs create" command: at first I thought
this was the case, but it does not seem to be. There are two problems as I see
it:
1) If this were just "zfs create" returning too early, it would be a
mini-bug, since the command is expected to return only once things are done. I
guess I wouldn't even have reported it.
2) With a sleep between "zfs create" and "dd" (a sketch follows below), some
of the dd problems on multi-core systems are "solved", but not all of the
missing ZVOL cases; there are still ZVOLs that never show up in /dev/zvol but
can be seen in "zfs list". In the single CPU case it solves nothing at all and
the ZVOL never appears, only after a reboot or export/import.
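
Roughly, the sleep workaround I tried looks like this sketch (the volume
name, size and 5 second delay are illustrative):

#!/bin/sh
name_pool=zroot/stress
zfs create -o volmode=dev -V 1G $name_pool/data1
# arbitrary delay; papers over some failures on multi-core, none on 1 CPU
sleep 5
dd if=/dev/zero of=/dev/zvol/$name_pool/data1 bs=1M count=1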


To illustrate the 1 CPU case, I run this script:
#!/bin/sh
name_pool=zroot/stress
echo `date`
ls /dev/zvol
seq 1 10 | while read i; do
    zfs create -o volmode=dev -V 1G $name_pool/data$i
done
sleep 300
echo `date`
ls /dev/zvol

Output is:
Sat Feb 26 12:21:08 EET 2022
ls: /dev/zvol: No such file or directory
Sat Feb 26 12:26:11 EET 2022
ls: /dev/zvol: No such file or directory

Even now, after a while:
# date
Sat Feb 26 12:35:03 EET 2022
# ls /dev/zvol
ls: /dev/zvol: No such file or directory


I do not know how long a delay still counts as asynchronous, but this seems
too long, so I assume that the ZVOL will never show up.
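
If this were only a very long asynchronous delay, a bounded poll like this
sketch (the 60 second limit and volume name are arbitrary) should eventually
see the device node; on 1 CPU it never does:

#!/bin/sh
# wait up to 60s for the device node instead of using a fixed sleep
dev=/dev/zvol/zroot/stress/data1
i=0
while [ ! -e "$dev" ] && [ $i -lt 60 ]; do
    sleep 1
    i=$((i + 1))
done
if [ -e "$dev" ]; then
    echo "appeared after ${i}s"
else
    echo "still missing after ${i}s"
fi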

On the 1 CPU machine, capturing /var/run/devd.pipe with cat shows lines like
these for each create command:
!system=DEVFS subsystem=CDEV type=CREATE cdev=zvol/zroot/stress/data20
!system=GEOM subsystem=DEV type=CREATE cdev=zvol/zroot/stress/data20
!system=DEVFS subsystem=CDEV type=DESTROY cdev=zvol/zroot/stress/data20
!system=GEOM subsystem=DEV type=DESTROY cdev=zvol/zroot/stress/data20

This seems wrong, since for both subsystems the last event is DESTROY.


With 4 CPUs it is harder to reproduce, so I ran with 2 CPUs enabled in the
BIOS. Physical hardware, 16G RAM.

So I ran the following script:
#!/bin/sh
name_pool=zroot/stress
zfs create -o mountpoint=none $name_pool
seq 1 1000 | while read i; do
    zfs create -o volmode=dev -V 1G $name_pool/data$i
done

Testing result:
# zfs list | grep stress | wc -l
    1001
# ls /dev/zvol/zroot/stress/ | wc -l
     638

The output clearly shows that ZVOLs are missing (even after discounting the
parent dataset from the "zfs list" count, the difference is far too big).

I created these files and will attach them (though maybe my pointers above
are enough):
zfs list -H -o name > /service/log/zfs_list_001.log
ls /dev/zvol/zroot/stress/ > ls_dev_zvol_stress__001.log
cat /var/run/devd.pipe | grep -v "!system=ZFS" > /service/log/grepped_devd.pipe_no_dd_seq_1000__001.log
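
The missing set can be derived by sorting and diffing the two lists, for
example with a sketch like this (the temporary file names are arbitrary):

zfs list -H -o name | grep '^zroot/stress/' | sed 's|.*/||' | sort > /tmp/zfs_names
ls /dev/zvol/zroot/stress/ | sort > /tmp/dev_names
# names present in "zfs list" but absent from /dev/zvol
comm -23 /tmp/zfs_names /tmp/dev_names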


Diffing the sorted lists shows, for example, that there is no ZVOL for:
-data526
-data527
-data528
-data529

# ls /dev/zvol/zroot/stress/data526
ls: /dev/zvol/zroot/stress/data526: No such file or directory

# zfs get -H volmode zroot/stress/data526
zroot/stress/data526 volmode dev local


For a non-existing case we see this in the captured devd.pipe file:
# cat /service/log/grepped_devd.pipe_no_dd_seq_1000__001.log | grep data526
!system=DEVFS subsystem=CDEV type=CREATE cdev=zvol/zroot/stress/data526
!system=GEOM subsystem=DEV type=CREATE cdev=zvol/zroot/stress/data526
!system=DEVFS subsystem=CDEV type=DESTROY cdev=zvol/zroot/stress/data526
!system=GEOM subsystem=DEV type=DESTROY cdev=zvol/zroot/stress/data526

It is the same fingerprint as in the 1 CPU case: the two DESTROY events come
last.
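
To list every ZVOL in the captured log whose last DEVFS event is DESTROY,
i.e. all volumes with this fingerprint, an awk sketch like this should work:

grep '!system=DEVFS' /service/log/grepped_devd.pipe_no_dd_seq_1000__001.log |
awk '{ for (i = 1; i <= NF; i++) { split($i, kv, "="); f[kv[1]] = kv[2] }
       last[f["cdev"]] = f["type"] }
     END { for (c in last) if (last[c] == "DESTROY") print c }'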

Whereas for an existing ZVOL there is:
# cat /service/log/grepped_devd.pipe_no_dd_seq_1000__001.log | grep data525
!system=DEVFS subsystem=CDEV type=CREATE cdev=zvol/zroot/stress/data525
!system=GEOM subsystem=DEV type=CREATE cdev=zvol/zroot/stress/data525
!system=DEVFS subsystem=CDEV type=DESTROY cdev=zvol/zroot/stress/data525
!system=GEOM subsystem=DEV type=DESTROY cdev=zvol/zroot/stress/data525
!system=DEVFS subsystem=CDEV type=CREATE cdev=zvol/zroot/stress/data525

-- 
You are receiving this mail because:
You are the assignee for the bug.