[Bug 258022] [FUSES] Inode attributes are cached unnecessarily/for too long
Date: Tue, 24 Aug 2021 11:30:38 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=258022

Bug ID: 258022
Summary: [FUSES] Inode attributes are cached unnecessarily/for too long
Product: Base System
Version: 13.0-RELEASE
Hardware: Any
OS: Any
Status: New
Severity: Affects Some People
Priority: ---
Component: kern
Assignee: bugs@FreeBSD.org
Reporter: chogata@moosefs.pro

This is a problem a user of MooseFS reported. Under some circumstances, creating a new fs entry (any type: directory, regular file, special file) on a MooseFS mount fails with the message "Resource temporarily unavailable", and any subsequent operations on this inode (ls -al, rm or rmdir) show the same message. MooseFS also cannot be unmounted; the system reports "Device busy". Only a restart of the whole machine helps.

Since this was a bit similar to a problem some versions of the Linux kernel had, when a process on one machine deleted a CWD of a process on a different machine, we at first thought it had to do with CWDs only and introduced some safeguards in the MooseFS client for FreeBSD. But recent findings show this is much more serious and on the FreeBSD side.

A simple test: we take two machines and mount MooseFS on both. On the FreeBSD machine (13.0-RELEASE-p3) we use these mount options:

mfsmount -o mfsattrcacheto=0 -o mfsxattrcacheto=0 -o mfsentrycacheto=0 -o mfsdirentrycacheto=0 -o mfssymlinkcacheto=0 -o mfsgroupscacheto=0 /mnt/mfs

All the -o options are there to disable any attribute caches that may exist (any lookup, access, mkdir etc. operations will return 0 seconds as cache time).

We also have a second machine with the same MooseFS instance mounted. The operating system on the second machine is irrelevant.

Then we perform these steps, exactly in the order shown below:

***FreeBSD machine***
~# cd /mnt/mfs/testdir
/mnt/mfs/testdir# ls -al
total 2932
drwxr-xr-x   2 root  wheel        1 Aug 23 12:41 .
drwxrwxrwx  43 root  wheel  3001433 Aug 23 12:28 ..
/mnt/mfs/testdir#
******
(FreeBSD "sees" that "testdir" is empty)

***OTHER machine***
~# cd /mnt/mfs/testdir
/mnt/mfs/testdir# mkdir dir
/mnt/mfs/testdir#
******
(Other machine creates a directory named "dir" inside "testdir")

***FreeBSD machine***
/mnt/mfs/testdir# ls -al
total 2933
drwxr-xr-x   3 root  wheel        1 Aug 23 12:59 .
drwxrwxrwx  43 root  wheel  3001433 Aug 23 12:28 ..
drwxr-xr-x   2 root  wheel        1 Aug 23 12:59 dir
/mnt/mfs/testdir#
******
(FreeBSD "sees" that there is now "dir" inside "testdir")

***OTHER machine***
/mnt/mfs/testdir# ls -ali
total 2933
8 drwxr-xr-x   3 root  wheel        1 Aug 23 12:59 .
1 drwxrwxrwx  43 root  wheel  3001433 Aug 23 12:28 ..
9 drwxr-xr-x   2 root  wheel        1 Aug 23 12:59 dir
/mnt/mfs/testdir# rmdir dir
/mnt/mfs/testdir# ls -al
total 2932
drwxr-xr-x   2 root  wheel        1 Aug 23 13:00 .
drwxrwxrwx  43 root  wheel  3001433 Aug 23 12:28 ..
/mnt/mfs/testdir#
******
(We check the inode number of "dir" on the other machine and delete "dir")

***FreeBSD machine***
/mnt/mfs/testdir# ls -al
total 2932
drwxr-xr-x   2 root  wheel        1 Aug 23 13:00 .
drwxrwxrwx  43 root  wheel  3001433 Aug 23 12:28 ..
/mnt/mfs/testdir#
******
(FreeBSD again "sees" that "testdir" is empty)

Now we wait for at least 5 minutes; the timing is explained below.

***FreeBSD machine***
/mnt/mfs/testdir# echo "foo" > file.txt
-bash: file.txt: Resource temporarily unavailable
/mnt/mfs/testdir# ls -al
ls: file.txt: Resource temporarily unavailable
total 2932
drwxr-xr-x   2 root  wheel        1 Aug 23 13:17 .
drwxrwxrwx  43 root  wheel  3001433 Aug 23 12:28 ..
/mnt/mfs/testdir#
******
Ooops?!

***OTHER machine***
/mnt/mfs/testdir# ls -ali
total 2932
8 drwxr-xr-x   2 root  wheel        1 Aug 23 13:17 .
1 drwxrwxrwx  43 root  wheel  3001433 Aug 23 12:28 ..
9 -rw-r--r--   1 root  wheel        0 Aug 23 13:17 file.txt
/mnt/mfs/testdir#
******
The newly created file got the same inode number as the recently deleted directory "dir"...

Notes:

1) The effect is not exclusive to former directory inode numbers becoming file inode numbers.
It happens whenever the new object is of a different type than the old one (so an ex-directory inode number gets reused as a file, ex-file as a fifo, ex-fifo as a device or directory, etc.). The "ls -al" scenario is not the only one; the same will happen if objects are created on the FreeBSD machine and then deleted from another machine, which is of course a normal occurrence in a network file system.

2) Default inode reuse time in MooseFS is 24 hours. It was set to 5 minutes for testing purposes only. The person who first reported the problem (there were others after) used the default 24 hours. And only inodes that are truly "free" are reused, that means: no CWDs (active on any MooseFS client connected to the instance) and no sustained (deleted but still open) files are reused. The 24 hour delay is counted from the moment they are considered free, so if a file is in a sustained state for, let's say, 24 hours after deletion (and then whatever process had a hold on it finally finishes), its inode number is still not reused for another 24 hours.

3) Default cache times in MooseFS:
   - file attributes cache timeout: 1 second
   - extended attributes (xattr) cache timeout: 30 seconds
   - directory entry cache timeout: 1 second
   - negative entry cache timeout: 0 seconds (no negative cache by default)
   - symbolic link cache timeout: 300 seconds
   - supplementary groups cache timeout: 300 seconds

4) Caches in the above experiment were ALL set to 0.

5) The problem was first reported on FreeBSD 12.1.

So, to sum it up: we say "don't cache anything at all/longer than 300 seconds", FreeBSD caches inode attributes (we don't know which ones, but at least the type) for longer than 24 hours, and it causes a serious problem, because a new inode with a reused inode number is basically unusable in the file system.
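
For illustration only, the behaviour described above can be modelled as a cache keyed by inode number that remembers the object type. This is a minimal Python sketch of our guess, not FreeBSD code: `TypeCache`, `observe` and `evict_on_mismatch` are hypothetical names, and we have not checked how the kernel actually stores this information. The only things taken from the report are that the stale data includes at least the type and that the error is "Resource temporarily unavailable" (EAGAIN).

```python
import errno
import stat


class TypeCache:
    """Toy model of a client-side cache keyed by inode number.

    Hypothetical: it remembers only the file type last seen for each
    inode number, the one attribute the report is sure about.
    """

    def __init__(self, evict_on_mismatch=False):
        self.evict_on_mismatch = evict_on_mismatch
        self._type_by_ino = {}

    def observe(self, ino, file_type):
        """Handle a server reply saying that `ino` has type `file_type`.

        Returns 0 on success, or errno.EAGAIN when a remembered type
        disagrees and the stale entry is kept.
        """
        cached = self._type_by_ino.get(ino)
        if cached is not None and cached != file_type:
            if not self.evict_on_mismatch:
                # The stale entry wins, so the new object is unusable.
                return errno.EAGAIN
        # First sighting, matching type, or mismatch with eviction enabled:
        # store what the server reports now.
        self._type_by_ino[ino] = file_type
        return 0


if __name__ == "__main__":
    # Inode 9 is first seen as a directory ("dir"), which the other machine
    # then removes; the number later comes back as a regular file ("file.txt").
    for evict in (False, True):
        cache = TypeCache(evict_on_mismatch=evict)
        cache.observe(9, stat.S_IFDIR)
        err = cache.observe(9, stat.S_IFREG)
        print(f"evict_on_mismatch={evict}: errno={err}")
```

With eviction disabled, every later operation on the reused inode number keeps failing for as long as the entry lives, which matches what we observe. With eviction enabled, the reused number is accepted. Whether dropping the stale entry is the right fix in the kernel is for the maintainers to judge.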