kern/187238: vm.pmap.pcid_enabled="1" causes Java to coredump in FBSD 10

Henrik Gulbrandsen henrik at gulbra.net
Sun Mar 23 12:03:27 UTC 2014


This is the most time-consuming bug I've encountered in my life, and not
only because I started looking for it in the JVM, but now it seems to
have been hiding in plain sight. I'm pretty sure that pmap->pm_save is
handled incorrectly in the current kernel. Judging from the code, it's
supposed to include all CPUs where the pmap has been active since the
latest call to pmap_invalidate_all(...). However, that means that it
should always be a superset of pmap->pm_active, since any CPU where the
pmap is active may cache pmap information at any time. Currently, this
is not the case, and since only CPUs in pmap->pm_save are targeted in
the TLB shootdown, we are left with inconsistencies that crash the
process soon afterwards.
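
To make the superset argument concrete, here is a minimal user-space
sketch. It is not kernel code: cpumask_t, struct toy_pmap and the
toy_pmap_* helpers are hypothetical stand-ins (the real kernel uses
cpuset_t and its own macros), and only the field names pm_active and
pm_save come from the discussion above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's CPU set type: one bit per CPU. */
typedef uint64_t cpumask_t;

/* Toy model of the two per-pmap CPU sets discussed above. */
struct toy_pmap {
	cpumask_t pm_active;	/* CPUs currently running with this pmap */
	cpumask_t pm_save;	/* CPUs that may hold cached translations */
};

/* The invariant argued for above: pm_active is a subset of pm_save. */
static bool
toy_pmap_invariant_holds(const struct toy_pmap *pmap)
{
	return ((pmap->pm_active & ~pmap->pm_save) == 0);
}

/* CPUs that a shootdown restricted to pm_save would fail to reach. */
static cpumask_t
toy_pmap_missed_cpus(const struct toy_pmap *pmap)
{
	return (pmap->pm_active & ~pmap->pm_save);
}
```

In this model, any non-zero result from toy_pmap_missed_cpus() is a CPU
that can keep using stale translations after the shootdown.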

The attached patch solves this by only clearing a CPU from pmap->pm_save
if it is not currently included in pmap->pm_active. As far as I can
tell, that eliminates the bug. The patch is against STABLE, since that's
what I'm currently running, but CURRENT should be pretty close, except
for the default setting of pmap_pcid_enabled.
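
The attachment itself is not reproduced in this message, so the rule it
applies can only be illustrated with a toy user-space sketch. The names
cpumask_t, struct toy_pmap and toy_pmap_clear_save() are hypothetical
and do not appear in the kernel; the real code operates on cpuset_t.

```c
#include <stdint.h>

/* Hypothetical stand-in for the kernel's CPU set type: one bit per CPU. */
typedef uint64_t cpumask_t;

struct toy_pmap {
	cpumask_t pm_active;	/* CPUs currently running with this pmap */
	cpumask_t pm_save;	/* CPUs that may hold cached translations */
};

/*
 * Clear 'cpu' from pm_save, but only when the pmap is not active there.
 * An active CPU may cache translations at any time, so it has to stay
 * a shootdown target. 'cpu' must be less than 64 in this toy model.
 */
static void
toy_pmap_clear_save(struct toy_pmap *pmap, unsigned int cpu)
{
	cpumask_t bit = (cpumask_t)1 << cpu;

	if ((pmap->pm_active & bit) == 0)
		pmap->pm_save &= ~bit;
}
```

With this guard, pm_save cannot lose a CPU that is still in pm_active,
which is the superset property described earlier.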

By the way, the logic in the invalidation functions is a bit messy now
and can probably be simplified. Also, is there a good reason for
ignoring the pmap argument in smp_masked_invltlb(...)?

/Henrik

P.S. After five days it turns out that mx1.FreeBSD.org has been
rejecting this email due to a slight misconfiguration of my mail server.
I hope that I haven't caused too many hours of frustration by this
failure to report the bug fix in due time. Anyway, in the meantime my
test (java/openjdk6 building itself) has been running continuously in
the background. It used to fail almost every single time, but has now
gone through 765 iterations without a single crash. I believe that
indicates that the bug is fixed.

