copyin+copyout in one step ?
Alfred Perlstein
bright at mu.org
Tue May 28 01:10:15 UTC 2013
On 5/27/13 4:56 PM, Alfred Perlstein wrote:
> On 5/27/13 4:38 PM, Luigi Rizzo wrote:
>> Hi,
>> say a process P1 wants to use the kernel to copy the content of a
>> buffer SRC (in its user address space) to a buffer DST (in the
>> address space of another process P2), and assume that P1 issues the
>> request to the kernel when P2 has already told the kernel where the
>> data should go:
>>
>>       P1                       P2
>>     +------+                +--------+
>>     | SRC  |                |  DST   |
>>     +--v---+                +--^-----+
>> -------|-----------------------|----------
>>        |                       |  kernel
>>        |                       ^
>>        |                       |
>>        |       +--------+      |
>>        +------>| tmpbuf +------+
>>         copyin |        | copyout
>>         P1 ctx +--------+ P2 ctx
>>
>> I guess the one above is the canonical way: P1 does a copyin() to a
>> temporary buffer, then notifies P2 which can then issue or complete
>> a syscall to do a copyout from tmpbuf to DST in P2's context.
>>
>>
>> But I wonder, is it possible to do it as follows: P2 tells the kernel
>> where the data should go (DST); later, P1 issues a system call and
>> through a combined "copyinout()" moves data directly from SRC to DST,
>> operating in the context of P1.
>>
>>        |      copyinout() ?    |
>>        +----------->-----------+
>>              issued by P1
>>
>>
>> Is this doable at all ? I suspect that "tell DST to the kernel"
>> might be especially expensive as it needs to pin the page
>> so it is accessible while doing the syscall for P1 ?
>> (the whole point for this optimization is saving the extra
>> copy through the buffer, but it may be pointless if pinning
>> the memory is more expensive than the copy)
>>
> I suspect you'll want to use something like vslock(9) and sf_bufs.
>
> Have a look at vm/vm_glue.c -> vslock() and vm_imgact_hold_page().
>
> On amd64, I *think* mapping an sf_buf would work, or if you are really
> evil you can optimistically wire the page in the VM (cheap). If it's present then
> you can just use the direct map to access it. However, if it's not
> present, then fall back to another method, or maybe just fault it in
> (which will have to happen anyhow) and then retry.
>
> Sounds like a cool project!
>
> -Alfred
Oh, one other thing: look at the pipe code. It used to do what you
suggest, although I think it was driven by the READER pinning the
WRITER's address space and doing a direct copy. However, it may not be
optimized to avoid mapping into KVA, as I suggested doing.
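To make the optimistic wire-then-fall-back idea quoted above concrete, here
is a rough userland model in C. It is only a sketch of the control flow, not
kernel code: struct model_page, model_try_wire(), model_fault_in() and
model_copyinout() are all hypothetical names made up for illustration, the
"pages" are plain arrays, and memcpy() stands in for a copy through the
direct map. A real implementation would use the actual VM interfaces
(vslock(9), sf_bufs and so on) instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MODEL_PAGE_SIZE 16  /* tiny pages so the example is easy to trace */
#define MODEL_NPAGES    4

/* Hypothetical stand-in for one page of P2's DST buffer. */
struct model_page {
    bool resident;                        /* is the page in memory? */
    bool wired;                           /* pinned against pageout? */
    unsigned char data[MODEL_PAGE_SIZE];
};

/* Optimistic wire: succeeds only if the page is already resident. */
static bool model_try_wire(struct model_page *pg)
{
    if (!pg->resident)
        return false;
    pg->wired = true;
    return true;
}

/* Slow path: models faulting the page in before retrying. */
static void model_fault_in(struct model_page *pg)
{
    pg->resident = true;
}

/*
 * Copy up to len bytes from src straight into the destination pages,
 * with no intermediate tmpbuf. Each page is wired optimistically; if
 * that fails the page is faulted in and the wire is retried.
 * Returns the number of bytes copied; *faults counts slow-path trips.
 */
static size_t model_copyinout(struct model_page *dst, size_t npages,
    const unsigned char *src, size_t len, int *faults)
{
    size_t done = 0;

    assert(dst != NULL && src != NULL && faults != NULL);
    *faults = 0;
    for (size_t i = 0; i < npages && done < len; i++) {
        size_t chunk = len - done;

        if (chunk > MODEL_PAGE_SIZE)
            chunk = MODEL_PAGE_SIZE;
        while (!model_try_wire(&dst[i])) {  /* fast path failed */
            model_fault_in(&dst[i]);        /* fault in, then retry */
            (*faults)++;
        }
        memcpy(dst[i].data, src + done, chunk);  /* the single copy */
        dst[i].wired = false;                    /* unwire when done */
        done += chunk;
    }
    return done;
}
```

With only the first destination page resident, a 40-byte copy over these
16-byte model pages takes the fast path once and the fault-and-retry path
twice. Whether real wiring is cheaper than the extra copy through tmpbuf is
exactly the open question in this thread.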
-Alfred
More information about the freebsd-current mailing list