git: af949c590bd8 - main - Disable stack gap for ntpd during build.
Dawid Górecki
dgr at semihalf.com
Tue May 25 12:40:42 UTC 2021
Hi Cy,
On Fri, May 21, 2021 at 10:53:51AM -0700, Cy Schubert wrote:
> In message <CAPv3WKe4O--Jne20ozpMfLe3XvyPZXawUx+LgvOF8bsDEVsa7g at mail.gmail.com>, Marcin Wojtas writes:
> > Hi Cy,
> >
> > On Fri, 21 May 2021 at 16:46, Cy Schubert <Cy.Schubert at cschubert.com> wrote:
> > >
> > > In message <02078965-24BE-4F23-92D5-5E8E54A0C3E7 at freebsd.org>, Jessica Clarke writes:
> > > > > On 21 May 2021, at 15:11, Marcin Wojtas <mw at semihalf.com> wrote:
> > > > >
> > > > > Hi Jess
> > > > >
> > > > > On Fri, 21 May 2021 at 15:39, Jessica Clarke <jrtc27 at freebsd.org> wrote:
> > > > >>
> > > > >> On 21 May 2021, at 14:34, Marcin Wojtas <mw at FreeBSD.org> wrote:
> > > > >>>
> > > > >>> The branch main has been updated by mw:
> > > > >>>
> > > > >>> URL: https://cgit.FreeBSD.org/src/commit/?id=af949c590bd8a00a5973b5875d7e0fa6832ea64a
> > > > >>>
> > > > >>> commit af949c590bd8a00a5973b5875d7e0fa6832ea64a
> > > > >>> Author: Marcin Wojtas <mw at FreeBSD.org>
> > > > >>> AuthorDate: 2021-05-21 09:29:22 +0000
> > > > >>> Commit: Marcin Wojtas <mw at FreeBSD.org>
> > > > >>> CommitDate: 2021-05-21 13:33:06 +0000
> > > > >>>
> > > > >>> Disable stack gap for ntpd during build.
> > > > >>>
> > > > >>> When starting, ntpd calls setrlimit(2) to limit maximum size of its
> > > > >>> stack. The stack limit chosen by ntpd is 200K, so when stack gap
> > > > >>> is enabled, the stack gap is larger than this limit, which results
> > > > >>> in ntpd crashing.
> > > > >>
> > > > >> Isn’t the bug that the unusable gap counts as usage?
> > > > >>
> > > > >> Jess
> > > > >>
> > > > >
> > > > > An alternative solution was submitted
> > > > > (https://reviews.freebsd.org/D29832) to extend the limit for
> > > > > ntpd, but eventually it was recommended to simply disable the stack
> > > > > gap for it until it's fixed upstream (see the last comment in the
> > > > > linked revision).
> > > >
> > > > That’s my point, there is nothing to “fix” upstream. NTPD uses less
> > > > than 200K of stack, thus it is perfectly reasonable for it to set its
> > > > limit to that. The fact that FreeBSD decides to count an arbitrary,
> > > > non-deterministic amount of additional unusable virtual address space
> > > > towards that limit is not its fault, but a bug in FreeBSD that needs
> > > > to be fixed as it’s entirely unreasonable for applications to have to
> > > > account for that.
> > >
> > > This latest problem is not stack gap. It is PIE.
> > >
> >
> > I have to disagree.
>
> We are talking at cross purposes. Your examples later on in your email prove
> my point.
>
> > ntpd does not start because of stack gap, not PIE, even though it may
> > seem like PIE causes this. This is due to the fact that stack gap is
> > disabled if PIE is disabled. Because of that, the value of sysctl
> > kern.elf64.aslr.stack_gap does not matter when kern.elf64.aslr.pie_enable
> > is set to 0. When pie_enable is set to 1 and stack gap is enabled, then
> > ntpd fails to start, but when pie_enable is set to 1 and stack_gap
> > is set to 0, then ntpd starts without any issue. We verified this on
> > FreeBSD-CURRENT snapshot from 2021-05-20.
>
> I verified the PIE problem on a -CURRENT as of my comments in the review.
> Enabling stack gap and disabling PIE resolved the issue. The reason
> stack gap is not a problem is that ntpd disables stack gap at line 441 of
> ntpd.c.
>
> Furthermore enabling stack gap and disabling PIE circumvents the problem. I
> tested this myself and left that note in the review.
>
> Enable stack gap and disable PIE: It works. But look at line 441 of ntpd.c
> to see stack gap disabled before ntpd forks itself.
The issue is caused by the stack gap, not by PIE. However, it may seem
as if the pie_enable sysctl causes it.

For a binary of ET_DYN type, the ASLR stack gap is only created if
kern.elf64.aslr.pie_enable (or its elf32 counterpart) is set to 1. For a
binary of ET_EXEC type, kern.elf64.aslr.enable has to be set to 1
instead. Otherwise, the value of kern.elf64.aslr.stack_gap is ignored
and the system behaves as if it were set to 0.
The code governing this behavior can be found in sys/kern/imgact_elf.c
lines 1175-1196, 1228-1232 and in sys/kern/kern_exec.c lines 1547-1557.
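
The rule above can be written down as a small decision function. This is only a sketch of the behaviour as described in this message, not the kernel code: the function name stack_gap_in_effect and its parameters are hypothetical and exist purely for illustration, with the arguments mirroring the kern.elf64.aslr.enable, pie_enable and stack_gap sysctl values.

```python
def stack_gap_in_effect(elf_type, aslr_enable, pie_enable, stack_gap):
    """Return True if the ASLR stack gap would be applied to an image.

    Hypothetical model of the behaviour described in this thread,
    not a transcription of sys/kern/imgact_elf.c.
    elf_type is "ET_DYN" or "ET_EXEC".
    """
    if elf_type == "ET_DYN":
        # PIE binaries are governed by pie_enable.
        aslr_on = pie_enable == 1
    elif elf_type == "ET_EXEC":
        # Non-PIE binaries are governed by the plain enable knob.
        aslr_on = aslr_enable == 1
    else:
        raise ValueError("unexpected ELF type: %r" % (elf_type,))
    # stack_gap is ignored entirely unless ASLR applies to this image.
    return aslr_on and stack_gap > 0


# Configurations that appear in this thread.
cases = [
    ("ET_DYN", 0, 0, 3),   # default 14-CURRENT settings
    ("ET_DYN", 0, 1, 3),   # pie_enable=1, stack_gap=3
    ("ET_DYN", 0, 1, 0),   # pie_enable=1, stack_gap=0
    ("ET_DYN", 1, 0, 3),   # enable=1, pie_enable=0
    ("ET_EXEC", 1, 0, 3),  # 13.0-RELEASE style binary with enable=1
]
for case in cases:
    print(case, "->", stack_gap_in_effect(*case))
```

Under this model, only the second and the last configuration end up with a stack gap, which matches the two setups in which ntpd is reported to segfault.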

About procctl: FreeBSD in fact has two different stack gaps. One is
located at the bottom of the stack; the other has a random size and is
located at the top of the stack. The latter is related to ASLR, while
the former exists to prevent a stack overflow from overwriting nearby
mappings. procctl only affects the first stack gap; the second one,
which is causing the segfault, is not affected by procctl.
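
To give a feel for why the ASLR gap can exceed ntpd's 200K limit, here is a back-of-the-envelope calculation. It rests on two assumptions that are not stated in this thread: that stack_gap is a percentage of the maximum stack size, with the gap chosen at random below that bound, and that the maximum stack size is 512 MiB (a common amd64 default). The helper max_stack_gap is hypothetical.

```python
KIB = 1024
MIB = 1024 * KIB


def max_stack_gap(stack_size, stack_gap_pct):
    """Upper bound on the random gap, ASSUMING it is drawn from
    [0, stack_size * pct / 100). Illustrative model only."""
    return stack_size * stack_gap_pct // 100


ntpd_limit = 200 * KIB      # stack limit set by ntpd, per the commit message
assumed_stack = 512 * MIB   # ASSUMPTION: maximum stack size on amd64
bound = max_stack_gap(assumed_stack, 3)  # stack_gap=3 as in the logs

print(bound // KIB, "KiB upper bound vs", ntpd_limit // KIB, "KiB limit")
# Share of possible gap sizes that would stay below ntpd's limit.
print("fraction of gaps below the limit: %.4f" % (ntpd_limit / bound))
```

If these assumptions hold, the gap is larger than ntpd's whole stack limit in the vast majority of runs, and with stack_gap set to 0 the bound is 0 as well.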
>
> >
> > The fact that this is a stack gap issue can be verified using the
> > following procedure:
> > 1. Install FreeBSD-CURRENT snapshot from 2021-05-20 using default
> > configuration.
> > 2. On a newly installed system start ntpd. With default configuration
> > it should start successfully.
> > 3. Set sysctl kern.elf64.aslr.pie_enable=1 and start ntpd. This time ntpd
> > should fail. An entry indicating that ntpd was killed because of signal
> > 11 should be visible in /var/log/messages.
> > 4. Set sysctl kern.elf64.aslr.stack_gap=0 and start ntpd once again. This
> > time ntpd should start even though pie_enable is set to 1.
> >
> > Exact log from the boot on which it was tested:
> > root at freebsd-ntpd-test:~ # sysctl -a | grep aslr
> > kern.elf32.aslr.stack_gap: 3
> > kern.elf32.aslr.honor_sbrk: 1
> > kern.elf32.aslr.pie_enable: 0
> > kern.elf32.aslr.enable: 0
> > kern.elf64.aslr.stack_gap: 3
> > kern.elf64.aslr.honor_sbrk: 1
> > kern.elf64.aslr.pie_enable: 0
> > kern.elf64.aslr.enable: 0
> > vm.aslr_restarts: 0
> > root at freebsd-ntpd-test:~ # ntpd
> > root at freebsd-ntpd-test:~ # ps aux | grep ntpd
> > root 826 0.0 0.2 22060 6960 - Ss 17:38 0:00.01 ntpd
> > root 828 0.0 0.1 12976 2416 0 S+ 17:38 0:00.00 grep ntpd
> > root at freebsd-ntpd-test:~ # killall ntpd
> > root at freebsd-ntpd-test:~ # ps aux | grep ntpd
> > root 831 0.0 0.1 12976 2416 0 S+ 17:38 0:00.00 grep ntpd
> > root at freebsd-ntpd-test:~ # sysctl kern.elf64.aslr.pie_enable=1
> > kern.elf64.aslr.pie_enable: 0 -> 1
>
> This causes the problem.
Yes, this seems to cause the problem. However, what really happens is
that along with pie_enable, the stack gap is enabled. When pie_enable
was set to 0, stack_gap was ignored.
>
> > root at freebsd-ntpd-test:~ # ntpd
> > root at freebsd-ntpd-test:~ # ps aux | grep ntpd
> > root 836 0.0 0.1 14128 2452 0 S+ 17:39 0:00.00 grep ntpd
> > root at freebsd-ntpd-test:~ # cat /var/log/messages | tail
> > May 21 17:38:25 freebsd-ntpd-test ntpd[826]: ntpd exiting on signal 15
> > (Terminated)
> > May 21 17:39:14 freebsd-ntpd-test ntpd[833]: ntpd 4.2.8p15-a (1): Starting
> > May 21 17:39:14 freebsd-ntpd-test ntpd[833]: Command line: ntpd
> > May 21 17:39:14 freebsd-ntpd-test ntpd[833]:
> > ----------------------------------------------------
> > May 21 17:39:14 freebsd-ntpd-test ntpd[833]: ntp-4 is maintained by
> > Network Time Foundation,
> > May 21 17:39:14 freebsd-ntpd-test ntpd[833]: Inc. (NTF), a non-profit
> > 501(c)(3) public-benefit
> > May 21 17:39:14 freebsd-ntpd-test ntpd[833]: corporation. Support and
> > training for ntp-4 are
> > May 21 17:39:14 freebsd-ntpd-test ntpd[833]: available at
> > https://www.nwtime.org/support
> > May 21 17:39:14 freebsd-ntpd-test ntpd[833]:
> > ----------------------------------------------------
> > May 21 17:39:14 freebsd-ntpd-test kernel: pid 834 (ntpd), jid 0, uid
> > 0: exited on signal 11 (core dumped)
This happened when kern.elf64.aslr.pie_enable=1 and
kern.elf64.aslr.stack_gap=3.
> > root at freebsd-ntpd-test:~ # sysctl kern.elf64.aslr.stack_gap=0
> > kern.elf64.aslr.stack_gap: 3 -> 0
> > root at freebsd-ntpd-test:~ # sysctl -a | grep aslr
> > kern.elf32.aslr.stack_gap: 3
> > kern.elf32.aslr.honor_sbrk: 1
> > kern.elf32.aslr.pie_enable: 0
> > kern.elf32.aslr.enable: 0
> > kern.elf64.aslr.stack_gap: 0
> > kern.elf64.aslr.honor_sbrk: 1
> > kern.elf64.aslr.pie_enable: 1
>
> This is the problem.
At this point the stack gap was disabled while still leaving
pie_enable set to 1.
>
> > kern.elf64.aslr.enable: 0
> > vm.aslr_restarts: 1
> > root at freebsd-ntpd-test:~ # ntpd
> > root at freebsd-ntpd-test:~ # ps aux | grep ntpd
> > root 845 0.0 0.2 22060 6924 - Ss 17:40 0:00.01 ntpd
> > root 847 0.0 0.1 12976 2440 0 S+ 17:40 0:00.00 grep ntpd
Here the ntpd daemon started with pie_enable set to 1. stack_gap was set
to 0. No segfault.
> > root at freebsd-ntpd-test:~ # cat /var/log/messages | tail
> > May 21 17:39:14 freebsd-ntpd-test kernel: pid 834 (ntpd), jid 0, uid
> > 0: exited on signal 11 (core dumped)
> > May 21 17:40:52 freebsd-ntpd-test ntpd[844]: ntpd 4.2.8p15-a (1): Starting
> > May 21 17:40:52 freebsd-ntpd-test ntpd[844]: Command line: ntpd
> > May 21 17:40:52 freebsd-ntpd-test ntpd[844]:
> > ----------------------------------------------------
> > May 21 17:40:52 freebsd-ntpd-test ntpd[844]: ntp-4 is maintained by
> > Network Time Foundation,
> > May 21 17:40:52 freebsd-ntpd-test ntpd[844]: Inc. (NTF), a non-profit
> > 501(c)(3) public-benefit
> > May 21 17:40:52 freebsd-ntpd-test ntpd[844]: corporation. Support and
> > training for ntp-4 are
> > May 21 17:40:52 freebsd-ntpd-test ntpd[844]: available at
> > https://www.nwtime.org/support
> > May 21 17:40:52 freebsd-ntpd-test ntpd[844]:
> > ----------------------------------------------------
> > May 21 17:40:52 freebsd-ntpd-test ntpd[845]: leapsecond file
> > ('/var/db/ntpd.leap-seconds.list'): stat failed: No such file or
> > directory
> > root at freebsd-ntpd-test:~ # killall ntpd
> >
> > Best regards,
> > Marcin
>
> Running on my firewall, which has had this same ASLR configuration for
> about a year.
>
> cwfw# sysctl kern.elf64.aslr
> kern.elf64.aslr.stack_gap: 3
> kern.elf64.aslr.honor_sbrk: 1
> kern.elf64.aslr.pie_enable: 0
> kern.elf64.aslr.enable: 1
> cwfw# ps auxww | grep ntpd
> ntpd 1499 0.0 0.1 22044 5776 - Ss 09:30 0:00.28
> /usr/sbin/ntpd -p /var/db/ntp/ntpd.pid -c /etc/ntp.conf -f
> /var/db/ntp/ntpd.drift -g
> root 3032 0.0 0.0 13044 2384 0 S+ 10:49 0:00.00 grep ntpd
> cwfw# uptime
> 10:49AM up 1:20, 1 user, load averages: 1.06, 1.02, 0.97
> cwfw# uname -a
> FreeBSD cwfw 14.0-CURRENT FreeBSD 14.0-CURRENT #151
> komquats-n246804-af949c590bd8-dirty: Fri May 21 07:09:32 PDT 2021
> root at cwsys:/export/obj/opt/src/git-src/amd64.amd64/sys/PROD2 amd64
> cwfw#
>
> My laptop:
>
> slippy# sysctl kern.elf64.aslr
> kern.elf64.aslr.stack_gap: 3
> kern.elf64.aslr.honor_sbrk: 1
> kern.elf64.aslr.pie_enable: 0
> kern.elf64.aslr.enable: 1
> slippy# ps auxww | grep ntpd
> ntpd 2100 0.0 0.1 22036 8600 - Ss 09:35 0:00.27
> /usr/sbin/ntpd -p /var/db/ntp/ntpd.pid -c /etc/ntp.conf -f
> /var/db/ntp/ntpd.drift -g
> root 4632 0.0 0.0 13040 2724 1 S+ 10:51 0:00.00 grep ntpd
> slippy# uptime
> 10:51AM up 1:17, 0 users, load averages: 0.11, 0.16, 0.16
> slippy# uname -a
> FreeBSD slippy 14.0-CURRENT FreeBSD 14.0-CURRENT #155
> komquats-n246804-af949c590bd8-dirty: Fri May 21 07:07:22 PDT 2021
> root at cwsys:/export/obj/opt/src/git-src/amd64.amd64/sys/BREAK amd64
> slippy#
>
> One of my poudriere machines:
>
> cwsys# sysctl kern.elf64.aslr
> kern.elf64.aslr.stack_gap: 3
> kern.elf64.aslr.honor_sbrk: 1
> kern.elf64.aslr.pie_enable: 0
> kern.elf64.aslr.enable: 1
> cwsys# ps auxww | grep ntpd
> ntpd 4039 0.0 0.1 22040 7340 - Ss 09:34 0:00.46
> /usr/sbin/ntpd -p /var/db/ntp/ntpd.pid -c /etc/ntp.conf -f
> /var/db/ntp/ntpd.drift -g
> root 6385 0.0 0.0 13044 2712 2 S+ 10:52 0:00.01 grep ntpd
> cwsys# uptime
> 10:52AM up 1:19, 2 users, load averages: 0.26, 0.25, 0.24
> cwsys# uname -a
> FreeBSD cwsys 14.0-CURRENT FreeBSD 14.0-CURRENT #155
> komquats-n246804-af949c590bd8-dirty: Fri May 21 07:07:22 PDT 2021
> root at cwsys:/export/obj/opt/src/git-src/amd64.amd64/sys/BREAK amd64
> cwsys#
>
> Three examples of stack gap enabled and PIE disabled. When I enable PIE,
> ntpd fails.
If the binaries are of ET_DYN type, the stack gap is actually disabled
in this configuration, because pie_enable is set to 0. Since this is 14,
that is most likely the case, unless PIE was explicitly disabled with
the WITHOUT_PIE build option.
>
>
> --
> Cheers,
> Cy Schubert <Cy.Schubert at cschubert.com>
> FreeBSD UNIX: <cy at FreeBSD.org> Web: https://FreeBSD.org
> NTP: <cy at nwtime.org> Web: https://nwtime.org
>
> The need of the many outweighs the greed of the few.
>
>
You can also see that ntpd segfaults on 13.0-RELEASE, but there
kern.elf64.aslr.enable has to be set to 1 instead of
kern.elf64.aslr.pie_enable. This is because 13 is built with the
WITHOUT_PIE option, so the executables are of ET_EXEC type.
Again, you can set stack_gap to 0 there and the problem will disappear.

Setting kern.elf64.aslr.stack_gap to a value > 0 does not necessarily
mean that the stack gap is actually enabled, so the whole situation can
be confusing.
Best regards,
Dawid