[Bug 261671] rc script fails to start gssd on 12.3
Date: Wed, 02 Feb 2022 03:46:21 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=261671

            Bug ID: 261671
           Summary: rc script fails to start gssd on 12.3
           Product: Base System
           Version: 12.3-STABLE
          Hardware: amd64
                OS: Any
            Status: New
          Severity: Affects Some People
          Priority: ---
         Component: conf
          Assignee: bugs@FreeBSD.org
          Reporter: bugs.freebsd@scourger.nl

Created attachment 231515
  --> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=231515&action=edit
Patch with a workaround.

On FreeBSD 12.3, gssd fails to start on boot.

## Environment

I installed a clean FreeBSD 12.3 system with minimal configuration changes. It mounts a few NFSv4 filesystems using Kerberos for authentication. Users and groups are stored in LDAP. A very minimal set of packages is installed to provide the functionality (see attached pkg.txt).

NFS mounts are specified in /etc/fstab with (among others) the "late" flag set. The contents of /etc/rc.conf are included as an attachment.

The system uses boot environments with subordinate filesystems as shown below (currently only one BE):

    # zfs list -r -o name,mountpoint,canmount,mounted fenrir/ROOT
    NAME                           MOUNTPOINT  CANMOUNT  MOUNTED
    fenrir/ROOT                    none        on        no
    fenrir/ROOT/default            none        noauto    yes
    fenrir/ROOT/default/usr        /usr        noauto    yes
    fenrir/ROOT/default/usr/local  /usr/local  noauto    yes
    fenrir/ROOT/default/var        /var        noauto    yes

After configuring the system, I tested my setup by starting the daemons and invoking "mount -a -l", and the NFS filesystems were mounted successfully. Then came the first reboot, where the boot process was interrupted at the "mountlate" stage (asking whether to go into single-user mode or proceed to multi-user).

I have used virtually the same setup on earlier hosts without problems since the 10.X era (including the FreeBSD 12.2 system I'm writing this on).

For good measure, I also tried to upgrade an existing 12.2 install to 12.3 in a boot environment without subordinate datasets. This resulted in the same error condition.

## Problem description

During boot, gssd(8) fails to start properly on FreeBSD 12.3. Any "late" NFSv4 filesystem in /etc/fstab fails to mount during boot. The console shows an error message when it tries to start gssd, as shown in the following snippet:

    Starting file system checks:
    Mounting local filesystems:.
    /etc/rc: WARNING: run_rc_command: cannot run /usr/sbin/gssd
    ELF ldconfig path: /lib /usr/lib /usr/lib/compat /usr/local/lib /usr/local/lib/compat/pkg /usr/local/lib/compat/pkg
    32-bit compatibility ldconfig path: /usr/lib32

The same configuration works fine on FreeBSD 12.2.

It appears that the culprit is a change in the ordering of rc files. On FreeBSD 12.3, the 'gssd' script gets wedged between 'zfsbe' and 'zfs' (see the attached rcorder-12.3.orig). On 12.2, gssd is started much later in the boot process (well after NETWORKING; see attached rcorder-12.2.orig).

As a test, I made a minor change to the gssd script to see if the rc ordering was indeed the problem. Adding NETWORKING to the REQUIRE line seems to be sufficient to fix the booting problem. I also added "BEFORE: mountcritremote" to make sure gssd doesn't start too late on diskless clients (though I haven't tested diskless). See the attached gssd.patch for the exact changes that I made. The patch changes the startup order to the one listed in rcorder-12.3.fixed.

To test the hypothesis that rc ordering is indeed the issue, I tried four test cases:

Case 1: default /etc/rc.d/gssd, no NFS filesystems in /etc/fstab

The system boots without obvious issues, but gssd is not running. Trying to mount an NFSv4 filesystem immediately returns "Permission denied". If you start gssd manually, mounting NFSv4 works.

Case 2: default /etc/rc.d/gssd, NFS filesystems in /etc/fstab

gssd doesn't start during boot, as in case 1. The boot process is interrupted during the "mountlate" stage, when it tries to mount the NFS filesystems.
If you choose to proceed into multi-user mode, you'll have to manually cancel further mount attempts during boot. Once in multi-user mode, depending on how quickly and how often Ctrl-C was pressed to abort "mountlate", zero or more instances of gssd are running (I've observed 1 and 2). Even if only one instance of gssd is running, it is not possible to mount NFSv4 filesystems. A manual mount hangs in the "[rpccon]" state before timing out with a "Permission denied" error:

    root@fenrir:~ # mount /net/cerberus/incoming/
    load: 0.01 cmd: mount_nfs 48471 [rpccon] 0.86r 0.00u 0.00s 0% 8080k
    load: 0.01 cmd: mount_nfs 48471 [rpccon] 1.88r 0.00u 0.00s 0% 8080k
    load: 0.01 cmd: mount_nfs 48471 [rpccon] 2.99r 0.00u 0.00s 0% 8080k
    mount_nfs: nmount: /net/cerberus/incoming: Permission denied

After killing all gssd instances and running "service gssd restart", mounting the filesystems is possible.

Case 3: modified /etc/rc.d/gssd, no NFS filesystems in /etc/fstab

The system boots without issue, gssd is running, and NFSv4 filesystems can be mounted manually.

Case 4: modified /etc/rc.d/gssd, NFS filesystems in /etc/fstab

The system boots as expected, gssd is running, and the filesystems are mounted automatically.

These results seem to confirm that the problem stems from an attempt to start gssd too early. Note that I haven't tested this with NFSv3 or non-Kerberized NFSv4, so it is possible that those work fine.

## How to reproduce

Do a fresh installation of FreeBSD 12.3, and perform the minimal required configuration for gssd. Running "service gssd start" should successfully launch the daemon. Reboot, and observe that gssd hasn't started.

## Solution

A simple fix would be to change the REQUIRE line in the gssd rc file. But that might just be patchwork that hides the actual problem.

It is unclear to me why the rc ordering is so different between 12.2 and 12.3; as far as I can see there haven't been any big changes to any of the files in /etc/rc.d.
However, one of the few rc scripts that did change is gssd itself (see review D27203). Ironically, that commit doesn't seem to cause the problem: using the 12.2 version of the gssd rc script on FreeBSD 12.3 still causes a startup failure.

In any case, there are huge differences when comparing the output of "rcorder /etc/rc.d/*" between 12.2 and 12.3, while the contents of the files in /etc/rc.d are almost exactly the same. At this point, my guess is that something has changed in the behaviour of rcorder(8) itself. I can't say whether that is intended or a bug.
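
The ordering comparison described above can be checked mechanically. What follows is a minimal sketch, not part of the attached patch: the helper names rc_position and starts_before and the file /tmp/rcorder.out are hypothetical, and the only assumption about the input is that it holds one script path per line, as in the output of "rcorder /etc/rc.d/*".

```shell
#!/bin/sh
# Hypothetical helpers for inspecting a saved rcorder(8) listing.
# Assumed input format: one script path per line, in start order.

# rc_position NAME
# Reads the listing on stdin and prints the 1-based line number of the
# script whose basename is NAME. Returns 1 if NAME is not listed.
rc_position() {
    awk -v name="$1" '
        { n = split($0, parts, "/") }
        parts[n] == name { print NR; found = 1; exit }
        END { exit !found }
    '
}

# starts_before FIRST SECOND FILE
# Returns 0 if FIRST is ordered before SECOND in FILE, 1 if it is not,
# and 2 if either script is missing from the listing.
starts_before() {
    a=$(rc_position "$1" < "$3") || return 2
    b=$(rc_position "$2" < "$3") || return 2
    [ "$a" -lt "$b" ]
}

# Example use on a live system (left as comments so that sourcing this
# file has no side effects):
#   rcorder /etc/rc.d/* 2>/dev/null > /tmp/rcorder.out
#   if starts_before gssd NETWORKING /tmp/rcorder.out; then
#       echo "gssd is ordered before NETWORKING"
#   fi
```

Running the commented example on both a 12.2 and a 12.3 host should make the difference in where gssd lands visible without reading the full listings side by side.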