Stale memory during post fork cow pmap update
Elliott.Rabe at dell.com
Tue Feb 13 09:11:48 UTC 2018
On 02/10/2018 04:56 PM, Konstantin Belousov wrote:
> On Sat, Feb 10, 2018 at 09:56:20PM +0000, Elliott.Rabe at dell.com wrote:
>> On 02/10/2018 05:18 AM, Konstantin Belousov wrote:
>>> On Sat, Feb 10, 2018 at 05:12:11AM +0000, Elliott.Rabe at dell.com wrote:
>>>> ...
>>>> I've been hunting for the root cause of elusive, slight memory
>>>> corruptions in a large, complex process that manages many threads. All
>>>> failures and experimentation thus far have been on x86_64 architecture
>>>> machines, and pmap_pcid is not in use.
>>>> ...
>>> It is necessary for you to provide the test and provide
>>> some kind of the test trace or the output which illustrates the issue
>>> you found.
>> Here is the sequence of actions I am referring to. There is only one
>> lock, and all the writes/reads are on one logical page.
>>
>> +The process is forked transitioning a map entry to COW
>> +Thread A writes to a page on the map entry, faults, updates the pmap to
>> writable at a new phys addr, and starts TLB invalidations...
>> +Thread B acquires a lock, writes to a location on the new phys addr,
>> and releases the lock
>> +Thread C acquires the lock, reads from the location on the old phys addr...
>> +Thread A ...continues the TLB invalidations which are completed
>> +Thread C ...reads from the location on the new phys addr, and releases
>> the lock
>>
>> In this example Thread B and C [lock, use and unlock] properly and
>> neither own the lock at the same time. Thread A was writing somewhere
>> else on the page and so never had/needed the lock. Thread B sees a
>> location that is only ever read|modified under a lock change beneath it
>> while it is the lock owner.
> I believe you mean 'Thread C' in the last sentence.
You are correct, I did mean Thread C.
>> I will get a test patch together and make it available as soon as I can.
> Please.
Sorry for my delayed response; I had been working off a separate project
based on releng/11.1, and it took me longer than I expected to get a dev
rig set up off of master on which I could re-evaluate the situation.
I am attaching my test apparatus; however, calling it a test is probably
a disservice to tests everywhere. I consider this entire fixture
disposable, so I didn't get carried away trying to properly
style/partition/locate the code. I never wanted anything this
complicated either; it pretty much just evolved into a development aid
to spelunk around in the fault/pmap handling. My attempts thus far at
reducing the fixture to be user-space only have not been successful.
Additionally, I have noticed that the fixture is /very/ sensitive to any
changes in timing; several of the debugging entries even seem key to
hitting the problem. I didn't have much luck getting the problem to
manifest on a virtual machine guest with a VirtualBox host either. For
all of these reasons, I don't think there is value in trying to use
this as any sort of regression fixture, unless perhaps someone is
willing to turn it into something less ridiculous. Despite all its
shortcomings, on my hardware at least, it is able to reproduce the
example I described pretty much immediately when I use it with the
debugging knob "-v". Instructions and expectations are at the top of the
main test fixture source file.
I am also attaching a patch that I have been using to prevent the
problem. I was looking at things with a much narrower view and made the
changes directly in pmap_enter. I suspect the internal
double-update-invalidate performs slightly better than taking two whole
faults, but I haven't benchmarked it, and it probably doesn't matter
much compared to the cost and frequency of the actual copies. It also
has the disadvantage of being architecture specific. I also
don't feel like I have enough experience with the vm fault code in
general for my commentary to be very valuable here. However, I do
wonder: 1) whether there are any other scenarios where a potentially
accessible page might be undergoing an [address+writable] change in the
same way (this sort of thing seems hard to read out of code), and
2) whether there is ever any legal reason why an accessible page should
be undergoing such a change. If not, perhaps we could come up with an
appropriate sanity-check condition to guard against any cases of this
sort of thing accidentally slipping in in the future.
The attached git patches should apply and build cleanly on master commit
fe0ee5c. I have verified at least these three scenarios in my environment:
1) the fixture alone reproduces the problem.
2) the fixture with my patch does not reproduce the problem.
3) the fixture with your patch does not reproduce the problem.
Thanks!
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-DISPOSABLE-A-test-fixture-that-can-repro-a-pmap-upda.patch
Type: text/x-patch
Size: 70608 bytes
Desc: 0001-DISPOSABLE-A-test-fixture-that-can-repro-a-pmap-upda.patch
URL: <http://lists.freebsd.org/pipermail/freebsd-hackers/attachments/20180213/73a33d63/attachment-0002.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0002-TRIAL-Double-invalidate-when-finishing-COW-pmap-upda.patch
Type: text/x-patch
Size: 3688 bytes
Desc: 0002-TRIAL-Double-invalidate-when-finishing-COW-pmap-upda.patch
URL: <http://lists.freebsd.org/pipermail/freebsd-hackers/attachments/20180213/73a33d63/attachment-0003.bin>