Re: How are syscall functions defined?

From: Pat Maddox <pat_at_patmaddox.com>
Date: Sat, 29 Jul 2023 01:19:23 UTC

Hey Warner,

Thanks for taking the time to walk through that, it's super helpful. Full disclosure: this is much deeper into src/ than I've ever ventured before.

The generated jail_attach.S and corresponding RSYSCALL / KERNCALL definitions make sense to me.

Now I'm wondering about the other side of the boundary - how that assembly makes its way to the kernel implementation.

Here's what I think happens:

1. The `syscall` instruction behind KERNCALL traps into the kernel, which leads to `call amd64_syscall` [1]
2. amd64_syscall [2] calls syscallenter [3]
3. syscallenter calls sv_fetch_syscall_args [4], which is set to cpu_fetch_syscall_args [5]
4. cpu_fetch_syscall_args uses the syscall code (the number libc loaded into %eax) as an index into sysent [6]
5. syscallenter calls the sy_call member of that sysent entry [7]
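
To check my own understanding of steps 3-5, here's a toy model in C. It is not the kernel code: every toy_* name, the TOY_MAXSYSCALL bound, and the shape of the structs are made up for illustration. The only things borrowed from the real tree are the idea of a table of function pointers indexed by syscall number and the sy_call member name.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define TOY_MAXSYSCALL      512   /* hypothetical table size */
#define TOY_SYS_jail_attach 436   /* the number libc loads into %eax */

struct toy_thread;                /* stand-in for the kernel's struct thread */
typedef int toy_sy_call_t(struct toy_thread *td, void *args);

struct toy_sysent {               /* stand-in for struct sysent */
    toy_sy_call_t *sy_call;       /* the handler to dispatch to */
};

struct toy_thread {
    unsigned int code;            /* syscall number as left by userland */
    int          jid;             /* the one argument of our toy jail_attach */
    int          retval;          /* what the handler reports back */
};

/* Toy handler: just records the jail id it was asked to attach to. */
static int toy_sys_jail_attach(struct toy_thread *td, void *args)
{
    td->retval = *(int *)args;
    return 0;
}

/* Toy fallback for numbers with no handler. */
static int toy_nosys(struct toy_thread *td, void *args)
{
    (void)td;
    (void)args;
    return ENOSYS;
}

static struct toy_sysent toy_nosys_ent = { .sy_call = toy_nosys };

/* The dispatch table: index == syscall number. */
static struct toy_sysent toy_sysent[TOY_MAXSYSCALL] = {
    [TOY_SYS_jail_attach] = { .sy_call = toy_sys_jail_attach },
};

/* Step 4: turn the syscall code into a table entry. */
static struct toy_sysent *toy_fetch_syscall_args(struct toy_thread *td)
{
    if (td->code >= TOY_MAXSYSCALL || toy_sysent[td->code].sy_call == NULL)
        return &toy_nosys_ent;
    return &toy_sysent[td->code];
}

/* Steps 3 and 5: look up the entry, then call through sy_call. */
int toy_syscallenter(struct toy_thread *td)
{
    assert(td != NULL);
    struct toy_sysent *callp = toy_fetch_syscall_args(td);
    return callp->sy_call(td, &td->jid);
}
```

With code set to 436, toy_syscallenter ends up in toy_sys_jail_attach; with a number that has no entry, it falls through to toy_nosys and returns ENOSYS.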

So what we get, in shortened form, is:

1. libc produces assembly `mov $436,%eax; KERNCALL`
2. syscallenter grabs sysent[436] and calls its sy_call member, which in this case is sys_jail_attach [8]
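
A small userland experiment along the same lines, as a sketch rather than anything authoritative: if the named libc function really is just "load the number, trap", then going through syscall(2) with the number should behave the same as calling the wrapper. I'm using getpid here instead of jail_attach because it has no side effects and needs no privileges. The helper names raw_getpid and check_wrapper_matches_raw are mine, and the _DEFAULT_SOURCE define is an assumption that only matters when building against glibc.

```c
#define _DEFAULT_SOURCE           /* assumption: exposes syscall() on glibc */
#include <assert.h>
#include <sys/types.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Ask the kernel for our PID by syscall number, skipping the named wrapper. */
static pid_t raw_getpid(void)
{
    return (pid_t)syscall(SYS_getpid);
}

/* The named wrapper and the by-number call should reach the same handler. */
void check_wrapper_matches_raw(void)
{
    assert(getpid() == raw_getpid());
}
```

Swapping SYS_getpid for SYS_jail_attach (plus a jid argument) would be the direct analogue, but that needs root and actually moves the process into a jail.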

Whew.

Is that right?

Pat

---

[1] https://cgit.freebsd.org/src/tree/sys/amd64/amd64/exception.S#n580
[2] https://cgit.freebsd.org/src/tree/sys/amd64/amd64/trap.c#n1187
[3] https://cgit.freebsd.org/src/tree/sys/kern/subr_syscall.c#n58
[4] https://cgit.freebsd.org/src/tree/sys/kern/subr_syscall.c#n82
[5] https://cgit.freebsd.org/src/tree/sys/amd64/amd64/elf_machdep.c#n87
[6] https://cgit.freebsd.org/src/tree/sys/amd64/amd64/trap.c#n1080
[7] https://cgit.freebsd.org/src/tree/sys/kern/subr_syscall.c#n162
[8] https://cgit.freebsd.org/src/tree/sys/kern/kern_jail.c#n2599

On Sat, Jul 1, 2023, at 6:26 AM, Warner Losh wrote:
> OK. System calls are a pain. There's a lot of boilerplate needed to make
> them all work.
>
> So, it's been automated. The process starts after you add a system call to
> syscalls.master.
> 'make sysent' is run which creates a number of different files. It creates
> the kernel glue.
> These glue files are then committed to the tree. On the kernel side we have
> sys/kern/init_sysent.c which has the 'sysent' array which is used to
> dispatch the system
> calls. sys/kern/syscalls.c has the names, and sys/kern/systrace_args.c has
> information
> for dtrace decoding them.
>
> In userland, though, the system calls live in libc. But there's no source
> file for them.
> Instead, libc's sys/Makefile.inc includes sys/sys/syscall.mk, which is also
> generated above,
> which has a list of all the system call files to create. Dependency rules
> in sys/Makefile.inc
> cause those .o's to be created with this rule:
> ${SASM}:
>         printf '/* %sgenerated by libc/sys/Makefile.inc */\n' @ > ${.TARGET}
>         printf '#include "compat.h"\n' >> ${.TARGET}
>         printf '#include "SYS.h"\nRSYSCALL(${.PREFIX})\n' >> ${.TARGET}
>         printf  ${NOTE_GNU_STACK} >>${.TARGET}
>
> which is where the source winds up: in the object tree as jail_attach.S
> likely
> with the contents (generated by hand):
>
> /* jail_attach.S generated by libc/sys/Makefile.inc */
> #include "compat.h"
> #include "SYS.h"
> RSYSCALL(jail_attach)
> .section .note.GNU-stack,"",%progbits
>
> The different __sys_jail_attach wrapping for the threading
> libraries also is part of the RSYSCALL macro, for example amd64:
> #define RSYSCALL(name)  ENTRY(__sys_##name);                            \
>                         WEAK_REFERENCE(__sys_##name, name);             \
>                         WEAK_REFERENCE(__sys_##name, _##name);          \
>                         mov $SYS_##name,%eax; KERNCALL;                 \
>                         jb HIDENAME(cerror); ret;                       \
>                         END(__sys_##name)
>
> The Symbol.map file, etc, all know that this is generated, and is used to
> put the symbols in the proper version area. Symbol versions are beyond
> the scope of this post.
>
> Warner
>
> On Sat, Jul 1, 2023 at 5:23 AM Pat Maddox <pat@patmaddox.com> wrote:
>
>> On Sat, Jul 1, 2023, at 3:11 AM, Pat Maddox wrote:
>> > jail_attach is defined in syscalls.master [1] which generates a
>> > declaration in jail.h [2]. Try as I might, I can’t find any definition
>> > of that specific syscall function (or any other).  I think the closest
>> > I’ve found is sys_jail_attach in kern_jail.c [3]. I suspect there’s
>> > some generation going on that defines jail_attach - but if that’s the
>> > case, I haven’t been able to track it down.
>> >
>> > Can someone point me to how the C function gets defined?
>> >
>> > Thanks,
>> > Pat
>> >
>> > [1]
>> >
>> https://github.com/freebsd/freebsd-src/blob/releng/13.2/sys/kern/syscalls.master#L2307
>> > [2]
>> >
>> https://github.com/freebsd/freebsd-src/blob/releng/13.2/sys/sys/jail.h#L119
>> > [3]
>> >
>> https://github.com/freebsd/freebsd-src/blob/releng/13.2/sys/kern/kern_jail.c#L2340
>>
>> Symbol.map [1] is used to produce a version map [2] which is then fed to
>> the linker [3], which I assume maps the symbols in the resulting binary. I
>> intend to experiment with that a bit, but I think that makes sense.
>>
>> Pat
>>
>> [1]
>> https://github.com/freebsd/freebsd-src/blob/releng/13.2/lib/libc/sys/Symbol.map#L672
>> [2]
>> https://github.com/freebsd/freebsd-src/blob/releng/13.2/share/mk/bsd.symver.mk#L43
>> [3]
>> https://github.com/freebsd/freebsd-src/blob/releng/13.2/share/mk/bsd.lib.mk#L253
>>
>>