[Bug 275597] Samba: smbd sometimes aborts by PANIC when 'vfs objects = cap'
- Go to: [ bottom of page ] [ top of archives ] [ this month ]
Date: Mon, 18 Dec 2023 01:45:12 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=275597 --- Comment #2 from uratan <uratan@miomio.jp> --- Created attachment 247109 --> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=247109&action=edit a script to make a set of test files that is used by the debugging I noticed that I could do printf-debugging by setting the debug level in /usr/local/etc/smb4.conf, ... (modification to) /usr/local/etc/smb4.conf +----------------------------------------- |; max log size = 50 | max log size = 9999 | log level = 6 +----------------------------------------- ... so I decided to start tracking down the cause by myself. I made samba-4.16.11 (same version as packages) from ports and gathered information with the DEBUG() macro. which are output to /var/log/samba4/log.<client-hostname>. - * - * - I confirmed at first the place where the 'Segmentation fault' were caused at, it was in function cap_readdir() shown below. (/usr/ports/net/samba416/work/samba-4.16.11/source3/modules/vfs_cap.c) +------------------------------------------------------------------ 87 |static struct dirent *cap_readdir(vfs_handle_struct *handle, 88 | struct files_struct *dirfsp, 89 | DIR *dirp, 90 | SMB_STRUCT_STAT *sbuf) 91 |{ 92 | struct dirent *result; 93 | struct dirent *newdirent; 94 | char *newname; 95 | size_t newnamelen; 96 | DEBUG(3,("cap: cap_readdir\n")); 97 | 98 | result = SMB_VFS_NEXT_READDIR(handle, dirfsp, dirp, NULL); 99 | if (!result) { 100 | return NULL; 101 | } 102 | 103 | newname = capdecode(talloc_tos(), result->d_name); 104 | if (!newname) { 105 | return NULL; 106 | } 107 | DEBUG(3,("cap: cap_readdir: %s\n", newname)); 108 | newnamelen = strlen(newname)+1; 109 | newdirent = talloc_size( 110 | talloc_tos(), sizeof(struct dirent) + newnamelen); 111 | if (!newdirent) { 112 | return NULL; 113 | } 114 | talloc_set_name_const(newdirent, "struct dirent"); **115 | memcpy(newdirent, result, sizeof(struct dirent)); **116 | memcpy(&newdirent->d_name, newname, newnamelen); 117 | return newdirent; 118 |} +------------------------------------------------------------------ In the function cap_readdir(), there are 2 memcpy()s used just before return statement. I confirmed that the 'Segmentation fault' were caused only at first memcpy(), the second memcpy() has no problem, with DEBUG() like below. +------------------------------------------------------------------ 87 |static struct dirent *cap_readdir(vfs_handle_struct *handle, | -- omitted -- | 114 | talloc_set_name_const(newdirent, "struct dirent"); |DEBUG(3,("@@u@@ 456\n")); |memcpy(&newdirent->d_name, newname, newnamelen); //// dup of #116 |DEBUG(3,("@@u@@ 888\n")); **115 | memcpy(newdirent, result, sizeof(struct dirent)); |DEBUG(3,("@@u@@ 789\n")); **116 | memcpy(&newdirent->d_name, newname, newnamelen); |DEBUG(3,("@@u@@ 999\n")); 117 | return newdirent; 118 |} +------------------------------------------------------------------ | | V V (excerpt of /var/log/samba4/log.argon near PANIC abort with above codes) +------------------------------------------------------------------ | : | cap: cap_readdir | cap: cap_readdir: igfxCPL.cpl | @@u@@ 456 | @@u@@ 888 | @@u@@ 789 | @@u@@ 999 | smbd_dirptr_get_entry: dirptr 0x2af766c0 now at offset 7585 | fsp_new: allocated files structure (508 used) | fget_ea_dos_attribute: Cannot get attribute from EA on file win/tst/teHW2/N3x50B-ITX/v15.38.2.64.4189/Graphics/igfxCPL.cpl: Error = Attribute not found | dos_mode_debug_print: fdos_mode returning (0x80): "" | smbd_dirptr_get_entry mask=[*] found win/tst/teHW2/N3x50B-ITX/v15.38.2.64.4189/Graphics/igfxCPL.cpl fname=igfxCPL.cpl (igfxCPL.cpl) | file_free: freed files structure 0 (507 used) | cap: cap_readdir | cap: cap_readdir: igfxCUIService.exe | @@u@@ 456 | @@u@@ 888 | @@u@@ 789 | @@u@@ 999 | smbd_dirptr_get_entry: dirptr 0x2af766c0 now at offset 7681 | fsp_new: allocated files structure (508 used) | fget_ea_dos_attribute: Cannot get attribute from EA on file win/tst/teHW2/N3x50B-ITX/v15.38.2.64.4189/Graphics/igfxCUIService.exe: Error = Attribute not found | dos_mode_debug_print: fdos_mode returning (0x80): "" | smbd_dirptr_get_entry mask=[*] found win/tst/teHW2/N3x50B-ITX/v15.38.2.64.4189/Graphics/igfxCUIService.exe fname=igfxCUIService.exe (igfxCUIService.exe) | file_free: freed files structure 0 (507 used) | cap: cap_readdir | cap: cap_readdir: igfxCUIServicePS.dll | @@u@@ 456 | @@u@@ 888 | =============================================================== | INTERNAL ERROR: Signal 11: Segmentation fault in pid 16090 (4.16.11) | If you are running a recent Samba version, and if you think .... | =============================================================== | PANIC (pid 16090): Signal 11: Segmentation fault in 4.16.11 | BACKTRACE: | #0 log_stack_trace + 0x43 [ip=0x203e8643] [sp=0xffbfd35c] | : +------------------------------------------------------------------ - * - * - I tried various things hoping something would change the situation for about a week (only within the cap_readdir()). Finally, after over 20 builds & tests, I reached some solution that, writing down the first memcpy() by replacing with simple assignment statements (see below) made the problem disappeared. +------------------------------------------------------------------ 87 |static struct dirent *cap_readdir(vfs_handle_struct *handle, | -- omitted -- | 114 | talloc_set_name_const(newdirent, "struct dirent"); |#define Q_HACK_U 1 |#if Q_HACK_U | newdirent->d_pad1 = result->d_pad1 ; | newdirent->d_namlen = result->d_namlen ; | newdirent->d_pad0 = result->d_pad0 ; | newdirent->d_type = result->d_type ; | newdirent->d_reclen = result->d_reclen ; | newdirent->d_off = result->d_off ; | newdirent->d_fileno = result->d_fileno ; |#else /* Q_HACK_U */ **115 | memcpy(newdirent, result, sizeof(struct dirent)); |#endif /* Q_HACK_U */ **116 | memcpy(&newdirent->d_name, newname, newnamelen); 117 | return newdirent; 118 |} +------------------------------------------------------------------ note: I considered that the bottom side of the memory area is more safe because the second memcpy() did not cause any problems, so it is the reason why the structure member assignments are in reverse order (to insert further DEBUG()s if needed). Also member d_name[] is not copied manually because it is done by second memcpy(). - * - * - Since there was nothing unnatural about the pointer values, etc. by the result of another research with DEBUG() like below, ... +------------------------------------------------------------------ 87 |static struct dirent *cap_readdir(vfs_handle_struct *handle, | -- omitted -- | |DEBUG(3,("@u@ %lx %lx %x\n", newdirent, result, sizeof(struct dirent))); **115 | memcpy(newdirent, result, sizeof(struct dirent)); +------------------------------------------------------------------ | | V V (excerpt of /var/log/samba4/log.argon near PANIC abort with above codes) +------------------------------------------------------------------ | : | cap: cap_readdir: HDXRtGi.inf | @u@ 2a7b6130 22e82e90 118 | @@u@@ 999 | cap: cap_readdir: HDXSBCH.inf | @u@ 2a7b6430 22e82eb8 118 | @@u@@ 999 | cap: cap_readdir: HDXSEDS.inf | @u@ 2a7b65b0 22e82ee0 118 | @@u@@ 999 | cap: cap_readdir: HDXSF.inf | @u@ 2a7b6730 22e82f08 118 | =============================================================== | INTERNAL ERROR: Signal 11: Segmentation fault in pid 94386 (4.16.11) | : +------------------------------------------------------------------ ... I can only assume that there is something in the implementation of memcpy() that is incompatible with smbd's behavior... (maybe) (multi-core ? or threading ?) Well, however, I feel very very very happy right now because the diff.exe works well now. (and quiet /var/log/messages) - * - * - ADDITIONAL INFO [a] I also modified the version message to distinguish my smbd. (/usr/ports/net/samba416/work/samba-4.16.11/source3/include/smb.h) +------------------------------------------------------------------ 31 | 32 |/* logged when starting the various Samba daemons */ |#define Q_HACK_U 1 |#if Q_HACK_U |#define COPYRIGHT_STARTUP_MESSAGE "Copyright Andrew Tridgell and the Samba Team 1992-2023 +u" |#else /* Q_HACK_U */ 33 |#define COPYRIGHT_STARTUP_MESSAGE "Copyright Andrew Tridgell and the Samba Team 1992-2022" |#endif /* Q_HACK_U */ 34 | +------------------------------------------------------------------ [b] The subjected 14.0R uses UFS file system. (/home is shared by Samba) +----------------------------------------- | % mount |/dev/ada0p3 on / (ufs, local, noatime) |devfs on /dev (devfs) |/dev/ada0p4 on /usr (ufs, local, noatime, soft-updates) |/dev/ada0p5 on /home (ufs, NFS exported, local, noatime, soft-updates) |/dev/md1 on /mnt (ufs, local) |fdescfs on /var/run/samba4/fd (fdescfs) +----------------------------------------- [c] The depth of the directory hierarchy might have no relations to this problem. Attached sh-scripts: z2-make-test-files.sh makes a set of test files that was used by above debugging. +----------------------------------------- |% sh z2-make-test-files.sh |....... : | ...... done | |% du -d2 -m teHWs/ |620 teHWs/N3x50B-ITX/7464 |621 teHWs/N3x50B-ITX/7492 |322 teHWs/N3x50B-ITX/v15.38.2.64.4189 |261 teHWs/N3x50B-ITX/v15.40.0.4177_BSW |1822 teHWs/N3x50B-ITX |1822 teHWs/ | |% find teHWs/ -type f | wc -l | 1591 |% find teHWs/ -type d | wc -l | 135 +----------------------------------------- Since this problem may or may not appear once per comparison by diff.exe below (about 1min), so the frequency of occurrence is about less less less than 1/3182 (1591 files * 2 trials) in my environment... +----------------------------------------- |U:\> diff.exe -rq j:\win\tst\teHWs u:\tst\teHWs (u: is win7 local disk) +----------------------------------------- -- You are receiving this mail because: You are the assignee for the bug.