i386 4/4 change
Bruce Evans
brde at optusnet.com.au
Sat Mar 31 15:57:19 UTC 2018
On Sat, 31 Mar 2018, Konstantin Belousov wrote:
> the change to provide full 4G of address space for both kernel and
> user on i386 is ready to land. The motivation for the work was to both
> mitigate Meltdown on i386, and to give more breathing space for still
> used 32bit architecture. The patch was tested by Peter Holm, and I am
> satisfied with the code.
>
> If you use i386 with HEAD, I recommend you to apply the patch from
> https://reviews.freebsd.org/D14633
> and report any regressions before the commit, not after. Unless
> a significant issue is reported, I plan to commit the change somewhere
> at Wed/Thu next week.
>
> Also I welcome patch comments and reviews.
It crashes at boot time in getmemsize() unless booted with the loader,
which I don't want to use.
It is much slower, and I couldn't find an option to turn it off.
For makeworld, the system time is slightly more than doubled, the user
time is increased by 16%, and the real time is increased by 21%.
On amd64, turning off pti and not having ibrs gives almost no increase
in makeworld times relative to old versions, and pti only costs about
5% IIRC.
Makeworld is not very syscall-intensive. netblast is very syscall-intensive,
and its throughput is down by a factor of 5 (660/136 = 4.9, 1331/242 = 5.5).
netblast 127.0.0.1 5001 5 10 (localhost, port 5001, 5-byte tinygrams for 10 s):
    537 kpps sent, 0 kpps dropped       # before this patch (CPU use 1.3)
    136 kpps sent, 0 kpps dropped       # after (CPU use 2.1)
(Pure software overheads. It uses 1.6 times as much CPU to go 4 times
slower).
netblast 192.168.2.8 (low end PCI33 lem on low latency 1 Gbps LAN):
    275 kpps sent, 1045 kpps dropped    # before (CPU use 1.3)
    245 kpps sent, 0 kpps dropped       # after (CPU use 1.3)
(The hardware can't do anywhere near line rate of ~1500 kpps, so this
becomes a benchmark of syscalls and dropping packets. The change makes
FreeBSD so slow that 8 CPUs at 4.08 GHz can't saturate a low end PCI33 NIC
(the hardware saturates at about 282 kpps for tx and about 400 kpps for
rx)).
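For reference, the ~1500 kpps line rate can be checked with a little arithmetic. The sketch below is not part of the measurements; it assumes plain IPv4/UDP over untagged Ethernet (no VLAN tag, no IP options), and the function names are made up for illustration.

```python
# Rough line-rate check for tiny UDP packets on 1 Gbps Ethernet.
# Assumptions: untagged Ethernet, IPv4 without options, UDP.

def udp_frame_bytes(payload):
    """Ethernet frame size: 14 header + 20 IPv4 + 8 UDP + payload + 4 FCS."""
    return 14 + 20 + 8 + payload + 4

def line_rate_kpps(link_bps=1_000_000_000, frame_bytes=64):
    """Maximum frames per second (in thousands) the wire can carry."""
    preamble_sfd = 8      # bytes sent before every frame
    interframe_gap = 12   # mandatory idle time, expressed in bytes
    frame_bytes = max(frame_bytes, 64)  # short frames are padded to 64 bytes
    wire_bits = (frame_bytes + preamble_sfd + interframe_gap) * 8
    return link_bps / wire_bits / 1000

# A 5-byte payload gives a 51-byte frame, which is padded to the 64-byte minimum.
rate = line_rate_kpps(frame_bytes=udp_frame_bytes(5))
print(f"{rate:.0f} kpps")  # prints "1488 kpps"
```

So a 5-byte tinygram occupies 84 byte times on the wire, which is where the figure of roughly 1500 kpps comes from.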
netblast 192.168.2.8 (low end PCIe em on low latency 1 Gbps LAN):
    1316 kpps sent, 3 kpps dropped      # before (CPU use 1.6)
    243 kpps sent, 0 kpps dropped       # after (CPU use 1.2)
This is seriously slower for the most useful case. It reduces a system
that could almost reach line rate using about 2 of 8 CPUs at 4 GHz to
one that is slower than with 1 CPU at 2 GHz (the latter saturates
in software at about 640 kpps in old versions of FreeBSD and at about
400 kpps in -current).
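The ratios can be recomputed from the per-run figures listed above (which differ slightly from the ones in the summary line). This is only a sketch: it treats "CPU use" as the number of CPUs kept busy, and the "CPU per packet" ratio is a derived figure introduced here for illustration, not something netblast reports.

```python
# Figures copied from the netblast runs listed above: (kpps sent, CPU use).
runs = {
    "localhost": {"before": (537, 1.3), "after": (136, 2.1)},
    "PCI33 lem": {"before": (275, 1.3), "after": (245, 1.3)},
    "PCIe em":   {"before": (1316, 1.6), "after": (243, 1.2)},
}

def slowdown(run):
    """Throughput before divided by throughput after; above 1 means slower."""
    return run["before"][0] / run["after"][0]

def cpu_per_packet_ratio(run):
    """How much more CPU each sent packet costs after the patch (assumed metric)."""
    (kpps_before, cpu_before), (kpps_after, cpu_after) = run["before"], run["after"]
    return (cpu_after / kpps_after) / (cpu_before / kpps_before)

for name, run in runs.items():
    print(f"{name}: {slowdown(run):.1f}x slower, "
          f"{cpu_per_packet_ratio(run):.1f}x CPU per packet")
```

The PCI33 case barely moves because that NIC was already the bottleneck before the patch; the localhost and PCIe cases show the software cost directly.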
Initial debugging of the crash: it crashes on the first pmap_kenter()
in getmemsize(). I configure debug.late_console to 0. That works,
and without it getmemsize() can't even be debugged since it is before
console initialization and ddb entry with -d.
In getmemsize(), of course all the preload calls return 0 and smapbase is
NULL. Then vm86 bios calls work and give basemem = 0x276. Then
basemem_setup() is called and it returns. Then pmap_kenter() is called
and it crashes:
Stopped at getmemsize+0xb3: pushl $0x1000
Stopped at getmemsize+0xb8: pushl $0x1000
Stopped at getmemsize+0xbd: call pmap_kenter
Stopped at pmap_kenter: pushl %ebp
Stopped at pmap_kenter+0x1: movl %esp,%ebp
Stopped at pmap_kenter+0x3: movl 0x8(%ebp),%eax
Stopped at pmap_kenter+0x6: shrl $0xc,%eax
Stopped at pmap_kenter+0x9: movl 0xc(%ebp),%edx
Stopped at pmap_kenter+0xc: orl $0x3,%edx
Stopped at pmap_kenter+0xf: movl %edx,PTmap(,%eax,4)
The last instruction crashes because PTmap is not mapped at this point:
db> p/x $edx
1003
db> p/x PTmap
ff800000
db> p/x $eax
1
db> x/x PTmap
PTmap:KDB: reentering
KDB: stack backtrace:
db_trace_self_wrapper(cec5cb,1420a04,c6de83,1420978,1,...) at db_trace_self_wrapper+0x24/frame 0x142095c
kdb_reenter(1420978,1,ff80003a,1420998,8f1419,...) at kdb_reenter+0x24/frame 0x1420968
trap(1420a10) at trap+0xa0/frame 0x1420a04
calltrap() at calltrap+0x8/frame 0x1420a04
--- trap 0xc, eip = 0xc5c394, esp = 0x1420a50, ebp = 0x1420a88 ---
db_read_bytes(ff800001,3,1420aa0) at db_read_bytes+0x29/frame 0x1420a88
db_get_value(ff800000,4,0,0,d2d304,...) at db_get_value+0x20/frame 0x1420ab4
db_examine(ff800000,1,ffffffff,1420b00) at db_examine+0x144/frame 0x1420ae4
db_command(cb1d99,1420be4,8f0f01,d1d28a,0,...) at db_command+0x20a/frame 0x1420b90
db_command_loop(d1d28a,0,1420bac,1420b9c,1420be4,...) at db_command_loop+0x55/frame 0x1420b9c
db_trap(a,ffff4ff0,1,1,80046,...) at db_trap+0xe1/frame 0x1420be4
kdb_trap(a,ffff4ff0,1420cc4) at kdb_trap+0xb1/frame 0x1420c10
trap(1420cc4) at trap+0x523/frame 0x1420cb8
calltrap() at calltrap+0x8/frame 0x1420cb8
--- trap 0xa, eip = 0xc65a4a, esp = 0x1420d04, ebp = 0x1420d04 ---
pmap_kenter(1000,1000,1429000,8efe13,0,...) at pmap_kenter+0xf/frame 0x1420d04
getmemsize(1,5a8807ff,ee,59a80097,ee,...) at getmemsize+0xc2/frame 0x1420fc4
init386(1428000) at init386+0x2bb/frame 0x1420ff4
btext() at btext+0x55
*** error reading from address ff800000 ***
--More-- KDB: reentering
KDB: stack backtrace:
db_trace_self_wrapper(cec5cb,1420ab4,8ee255,cb1923,ff800000,...) at db_trace_self_wrapper+0x24/frame 0x1420a7c
kdb_reenter(cb1923,ff800000,0) at kdb_reenter+0x24/frame 0x1420a88
db_get_value(ff800000,4,0,0,d2d304,...) at db_get_value+0x3a/frame 0x1420ab4
db_examine(ff800000,1,ffffffff,1420b00) at db_examine+0x144/frame 0x1420ae4
db_command(cb1d99,1420be4,8f0f01,d1d28a,0,...) at db_command+0x20a/frame 0x1420b90
db_command_loop(d1d28a,0,1420bac,1420b9c,1420be4,...) at db_command_loop+0x55/frame 0x1420b9c
db_trap(a,ffff4ff0,1,1,80046,...) at db_trap+0xe1/frame 0x1420be4
kdb_trap(a,ffff4ff0,1420cc4) at kdb_trap+0xb1/frame 0x1420c10
trap(1420cc4) at trap+0x523/frame 0x1420cb8
calltrap() at calltrap+0x8/frame 0x1420cb8
--- trap 0xa, eip = 0xc65a4a, esp = 0x1420d04, ebp = 0x1420d04 ---
pmap_kenter(1000,1000,1429000,8efe13,0,...) at pmap_kenter+0xf/frame 0x1420d04
getmemsize(1,5a8807ff,ee,59a80097,ee,...) at getmemsize+0xc2/frame 0x1420fc4
init386(1428000) at init386+0x2bb/frame 0x1420ff4
btext() at btext+0x55
db>
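The faulting store can be reproduced on paper from the register values in the trace. The sketch below only mirrors the arithmetic of the disassembled instructions; kenter_store() is a made-up name, not a kernel function, and the constants are read off the ddb output and the standard x86 page table entry layout.

```python
PAGE_SHIFT = 12          # 4 KiB pages; matches "shrl $0xc,%eax"
PTE_SIZE = 4             # scale factor in "PTmap(,%eax,4)"
PG_V, PG_RW = 0x1, 0x2   # x86 valid and writable bits; together the $0x3 in the orl
PTMAP = 0xff800000       # value printed by "p/x PTmap"

def kenter_store(va, pa):
    """Return (address written, value written) for the final movl."""
    index = va >> PAGE_SHIFT          # %eax after the shrl
    pte = pa | PG_RW | PG_V           # %edx after the orl
    return PTMAP + index * PTE_SIZE, pte

# Arguments as shown in the backtrace: pmap_kenter(1000,1000,...)
addr, pte = kenter_store(0x1000, 0x1000)
print(hex(addr), hex(pte))  # prints "0xff800004 0x1003"
```

The value matches the 1003 shown for $edx, and the target address lies in the same page as PTmap itself, which ddb also cannot read.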
Bruce