Under qemu-aarch64-static "wc /dev/null" gets "Unsupported ancillary data: 1/0" from a sendmsg attempt: because of wrong cmsg_len type in target_cmsghdr
Mark Millard
marklmi at yahoo.com
Thu Jan 3 10:06:58 UTC 2019
[Adding a for-reference note.]
On 2019-Jan-3, at 01:25, Mark Millard <marklmi at yahoo.com> wrote:
> [This note follows the investigation sequence,
> ending with the important conclusions.]
>
> My test context here is a poudriere-devel bulk -i for an
> amd64->aarch64 context.
>
> wc /dev/null or wc //dev/null does:
>
> # wc /dev/null
> Unsupported ancillary data: 1/0
>
> which then hangs until I hit ^C to get back to a prompt.
>
>
> Here is what ktrace/kdump shows for the process, from before the
> hang through to when I hit ^C to stop it:
>
> . . .
> 98475 101033 qemu-aarch64-static 0.000340 CALL sigprocmask[340](SIG_BLOCK,0x7ffffffe3c80,0x7ffffffe3d80)
> 98475 101033 qemu-aarch64-static 0.000003 RET sigprocmask[340] 0
> 98475 101033 qemu-aarch64-static 0.000001 CALL pselect[522](0x6,0,0x7ffffffe3fb0,0,0,0x7ffffffe3d80)
> 98475 101033 qemu-aarch64-static 0.000001 RET pselect[522] 1
> 98475 101033 qemu-aarch64-static 0.000001 CALL sigprocmask[340](SIG_SETMASK,0x7ffffffe3c80,0)
> 98475 101033 qemu-aarch64-static 0.000001 RET sigprocmask[340] 0
> 98475 101033 qemu-aarch64-static 0.000042 CALL write[4](0x2,0x7ffffffe3480,0x20)
> 98475 101033 qemu-aarch64-static 0.000036 GIO fd 2 wrote 32 bytes
> "Unsupported ancillary data: 1/0
> "
> 98475 101033 qemu-aarch64-static 0.000003 RET write[4] 32/0x20
> 98475 101033 qemu-aarch64-static 0.000001 CALL sendmsg[28](0x5,0x7ffffffe3c28,0)
> 98475 101033 qemu-aarch64-static 0.000003 RET sendmsg[28] -1 errno 22 Invalid argument
> 98475 101033 qemu-aarch64-static 0.000184 CALL close[6](0x3)
> 98475 101033 qemu-aarch64-static 0.000040 RET close[6] 0
> 98475 101033 qemu-aarch64-static 0.000017 CALL close[6](0x7)
> 98475 101033 qemu-aarch64-static 0.000005 RET close[6] 0
> 98475 101033 qemu-aarch64-static 0.000002 CALL sigprocmask[340](SIG_BLOCK,0x7ffffffe3c80,0x7ffffffe3d80)
> 98475 101033 qemu-aarch64-static 0.000001 RET sigprocmask[340] 0
> 98475 101033 qemu-aarch64-static 0.000001 CALL pselect[522](0x6,0x7ffffffe3dd0,0,0,0,0x7ffffffe3d80)
> 98475 101539 qemu-aarch64-static 0.000089 RET nanosleep[240] 0
> 98475 101539 qemu-aarch64-static 0.000042 CALL _umtx_op[454](0x86101f008,UMTX_OP_WAIT_UINT_PRIVATE,0,0,0)
> 98475 101033 qemu-aarch64-static 15.845396 RET pselect[522] -1 errno 4 Interrupted system call
>
> Note the qemu-aarch64 generated message and the later:
> sendmsg[28] -1 errno 22 Invalid argument
>
> The qemu-*-static code that wrote the message is from
> t2h_freebsd_cmsg and is:
>
>     if ((cmsg->cmsg_level == TARGET_SOL_SOCKET) &&
>         (cmsg->cmsg_type == SCM_RIGHTS)) {
>         int *fd = (int *)data;
>         int *target_fd = (int *)target_data;
>         int i, numfds = len / sizeof(int);
>
>         for (i = 0; i < numfds; i++) {
>             fd[i] = tswap32(target_fd[i]);
>         }
>     } else if ((cmsg->cmsg_level == TARGET_SOL_SOCKET) &&
>         (cmsg->cmsg_type == SCM_TIMESTAMP) &&
>         (len == sizeof(struct timeval))) {
>         /* copy struct timeval to host */
>         struct timeval *tv = (struct timeval *)data;
>         struct target_freebsd_timeval *target_tv =
>             (struct target_freebsd_timeval *)target_data;
>         __get_user(tv->tv_sec, &target_tv->tv_sec);
>         __get_user(tv->tv_usec, &target_tv->tv_usec);
>     } else {
>         gemu_log("Unsupported ancillary data: %d/%d\n",
>             cmsg->cmsg_level, cmsg->cmsg_type);
>         memcpy(data, target_data, len);
>     }
>
> Well, it turns out that qemu-*-static's code has:
>
> struct target_cmsghdr {
>     abi_long cmsg_len;
>     int32_t cmsg_level;
>     int32_t cmsg_type;
> };
>
> where, on amd64, target_cmsghdr has:
>
> (gdb) p/d sizeof(struct target_cmsghdr)
> $2 = 16
> (gdb) p/d sizeof(((struct target_cmsghdr *)0)->cmsg_len)
> $5 = 8
> (gdb) p/d &((struct target_cmsghdr *)0)->cmsg_level
> $4 = 8
> (gdb) p/d &((struct target_cmsghdr *)0)->cmsg_type
> $1 = 12
>
> which does not match the amd64 or aarch64 native:
>
> struct cmsghdr {
>     socklen_t cmsg_len;   /* data byte count, including hdr */
>     int cmsg_level;       /* originating protocol */
>     int cmsg_type;        /* protocol-specific type */
>     /* followed by u_char cmsg_data[]; */
> };
>
> because the cmsghdr's cmsg_len is smaller, even on a 64-bit architecture:
>
> (gdb) p/d sizeof(((struct cmsghdr *)0)->cmsg_len)
> $6 = 4
>
> /usr/include/arpa/inet.h:typedef __socklen_t socklen_t;
> /usr/include/netinet/in.h:typedef __socklen_t socklen_t;
> /usr/include/netinet6/in6.h:typedef __socklen_t socklen_t;
> /usr/include/sys/_types.h:typedef __uint32_t __socklen_t;
> /usr/include/sys/socket.h:typedef __socklen_t socklen_t;
> . . .
> /usr/include/netdb.h:typedef __socklen_t socklen_t;
>
> so abi_long does not match socklen_t for 64-bit architectures.
>
> So code such as in t2h_freebsd_cmsg:
>
> cmsg->cmsg_level = tswap32(target_cmsg->cmsg_level);
> cmsg->cmsg_type = tswap32(target_cmsg->cmsg_type);
>
> is not using the correct target offsets when, for example,
> aarch64 is the target that it is extracting from.
>
> For comparison on a 64-bit architecture:
>
> (gdb) p/d sizeof(struct cmsghdr)
> $1 = 12
> (gdb) p/d &((struct cmsghdr *)0)->cmsg_level
> $2 = 4
> (gdb) p/d &((struct cmsghdr *)0)->cmsg_type
> $3 = 8
>
>
> I do not yet have a tested change.
>
On aarch64 (like on amd64):
# more cmsghdr_size_offsets.c
#include "/usr/include/sys/socket.h" // cmsghdr
#include <stddef.h> // offsetof
#include <stdio.h> // printf
int
main()
{
    printf("%lu\n", (unsigned long) sizeof(struct cmsghdr));
    printf("cmsg_len %lu\n", (unsigned long) offsetof(struct cmsghdr, cmsg_len));
    printf("cmsg_level %lu\n", (unsigned long) offsetof(struct cmsghdr, cmsg_level));
    printf("cmsg_type %lu\n", (unsigned long) offsetof(struct cmsghdr, cmsg_type));
    return 0;
}
produces:
# ./a.out
12
cmsg_len 0
cmsg_level 4
cmsg_type 8
which qemu-aarch64-static's target_cmsghdr definitely does not match.
===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)