Large array in KVM
Robert Watson
rwatson at FreeBSD.org
Sat Dec 8 23:32:07 PST 2007
On Thu, 6 Dec 2007, Sonja Milicic wrote:
> I'm working on a kernel module that needs to maintain a large structure in
> memory. As this structure could grow too big to be stored in memory, it
> would be good to offload parts of it to the disk. What would be the best way
> to do this? Could using a memory-mapped file help?
Sonja,
I think the answer depends a bit on just how large the data is. The two most
critical limits are consumption of physical memory and consumption of address
space.
There are several parts of the kernel that deal with these sorts of scenarios
for various reasons. You might take a look at the pipe code, which maps
pageable buffers into kernel address space, and the md(4) code, which can
provide swap-backed virtual disk storage. And, of course, the file system is
the quintessential kernel subsystem that brings data in and out of memory from
disk :-).
On 64-bit systems, address space limits won't be much of a concern in most
scenarios, but on 32-bit systems, the kernel address space is quite small
(512 MB or 1 GB in most configurations), and as such is both significantly smaller
than physical memory, and also potentially quite full on busy systems. On
32-bit systems, it is therefore critical to manage address space use and not
just memory use, so it may not be possible to simply map and use large amounts
of memory without careful planning.
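
To make the address space point concrete, here is a minimal userland sketch of
the "sliding window" idea: rather than mapping a large backing file in one go,
map one fixed-size window at a time, so that address space use stays constant
no matter how big the data gets. This is only an analogy using mmap(2); code
inside the kernel would use the kernel's own VM interfaces rather than mmap(2).
The names struct window, window_open(), window_at(), window_close() and the
64 KB WINDOW_SIZE are all made up for illustration.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Assumed window size; a multiple of the page size on common systems. */
#define WINDOW_SIZE (64 * 1024)

/* Hypothetical handle: one mapped window onto a larger backing file. */
struct window {
    int            fd;    /* backing file */
    off_t          base;  /* file offset of the mapped window, or -1 */
    unsigned char *addr;  /* address of the mapped window, or NULL */
};

/* Create/open the backing file and size it to a whole number of windows. */
static int
window_open(struct window *w, const char *path, off_t total_size)
{
    assert(total_size > 0);
    total_size = (total_size + WINDOW_SIZE - 1) / WINDOW_SIZE * WINDOW_SIZE;
    w->fd = open(path, O_RDWR | O_CREAT, 0600);
    if (w->fd < 0)
        return (-1);
    if (ftruncate(w->fd, total_size) != 0) {
        close(w->fd);
        return (-1);
    }
    w->base = -1;
    w->addr = NULL;
    return (0);
}

/*
 * Return a pointer to the byte at file offset 'off', remapping the window
 * if 'off' falls outside it. The caller must keep 'off' below the file
 * size. The pointer is valid only until the next call that moves the window.
 */
static unsigned char *
window_at(struct window *w, off_t off)
{
    off_t base;
    void *p;

    assert(off >= 0);
    base = off - (off % WINDOW_SIZE);
    if (w->addr == NULL || base != w->base) {
        if (w->addr != NULL)
            munmap(w->addr, WINDOW_SIZE);
        p = mmap(NULL, WINDOW_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED,
            w->fd, base);
        if (p == MAP_FAILED) {
            w->addr = NULL;
            w->base = -1;
            return (NULL);
        }
        w->addr = p;
        w->base = base;
    }
    return (w->addr + (off - base));
}

/* Unmap the current window, if any, and close the backing file. */
static void
window_close(struct window *w)
{
    if (w->addr != NULL)
        munmap(w->addr, WINDOW_SIZE);
    close(w->fd);
    w->addr = NULL;
    w->base = -1;
}
```

The trade-off is that every access outside the current window costs an unmap
and a map, so this only pays off when accesses have reasonable locality.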
If you're talking about a relatively small amount of memory -- e.g., a few
megabytes -- that you want to be pageable, the pipe code is a good reference.
Remember that page faults may sleep for an extended period, so you would need
to avoid touching potentially paged-out memory while holding mutexes or
rwlocks, or inside critical sections, as well as from non-sleepable contexts
such as interrupt threads. Using VM, you can explicitly manage the
paging, or you can just make sure to touch the memory only in safe contexts,
such as from the kernel portions of user threads when either no locks are
held, or only sleepable locks (such as lockmgr or sx(9) locks).
For larger amounts of memory, address space limits mean you will probably
want to maintain your own cache of data, either loaded explicitly or mapped
and faulted explicitly. You may find that you want to interact directly with
the buffer cache/VM system, and might find that your code ends up looking a
bit like a file system itself.
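
As a rough illustration of the "own cache, loaded explicitly" approach, here
is a small userland sketch: a fixed number of resident blocks, filled with
pread(2) on a miss and written back with pwrite(2) when a dirty block is
evicted, using least-recently-used replacement. It is an analogy only; kernel
code would go through the buffer cache/VM or vnode I/O rather than these
system calls, and would need to do its I/O from a sleepable context. The
names (struct blkcache, cache_open(), cache_get(), cache_sync()), the block
size, and the slot count are all assumptions made for the example.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define BLOCK_SIZE  4096  /* bytes per cached block (assumed) */
#define CACHE_SLOTS 4     /* resident blocks; bounds memory use (assumed) */

struct slot {
    int64_t       blkno;     /* block held in this slot, or -1 if empty */
    int           dirty;     /* nonzero if data differs from the file */
    uint64_t      last_use;  /* LRU timestamp; 0 means never used */
    unsigned char data[BLOCK_SIZE];
};

struct blkcache {
    int         fd;          /* backing file */
    uint64_t    tick;        /* monotonically increasing use counter */
    struct slot slots[CACHE_SLOTS];
};

/* Open (or create) the backing file and mark every slot empty. */
static int
cache_open(struct blkcache *c, const char *path)
{
    c->fd = open(path, O_RDWR | O_CREAT, 0600);
    if (c->fd < 0)
        return (-1);
    c->tick = 0;
    for (int i = 0; i < CACHE_SLOTS; i++) {
        c->slots[i].blkno = -1;
        c->slots[i].dirty = 0;
        c->slots[i].last_use = 0;
    }
    return (0);
}

/* Write one slot back to the file if it holds modified data. */
static int
slot_flush(struct blkcache *c, struct slot *s)
{
    if (s->blkno >= 0 && s->dirty) {
        if (pwrite(c->fd, s->data, BLOCK_SIZE,
            (off_t)s->blkno * BLOCK_SIZE) != BLOCK_SIZE)
            return (-1);
        s->dirty = 0;
    }
    return (0);
}

/*
 * Return the in-memory copy of block 'blkno', loading it on a miss.
 * Pass for_write != 0 if the caller will modify the block.
 * Returns NULL on I/O error. The pointer is valid until the next call.
 */
static unsigned char *
cache_get(struct blkcache *c, int64_t blkno, int for_write)
{
    struct slot *victim = &c->slots[0];
    ssize_t n;

    assert(blkno >= 0);
    for (int i = 0; i < CACHE_SLOTS; i++) {
        struct slot *s = &c->slots[i];
        if (s->blkno == blkno) {          /* hit */
            s->last_use = ++c->tick;
            s->dirty |= (for_write != 0);
            return (s->data);
        }
        if (s->last_use < victim->last_use)
            victim = s;                   /* least recently used so far */
    }
    /* Miss: write back the victim if needed, then load the new block. */
    if (slot_flush(c, victim) != 0)
        return (NULL);
    n = pread(c->fd, victim->data, BLOCK_SIZE, (off_t)blkno * BLOCK_SIZE);
    if (n < 0) {
        victim->blkno = -1;
        victim->last_use = 0;
        return (NULL);
    }
    /* A short read means we are at or past EOF: zero-fill the rest. */
    memset(victim->data + n, 0, BLOCK_SIZE - (size_t)n);
    victim->blkno = blkno;
    victim->dirty = (for_write != 0);
    victim->last_use = ++c->tick;
    return (victim->data);
}

/* Write every dirty block back to the file. */
static int
cache_sync(struct blkcache *c)
{
    for (int i = 0; i < CACHE_SLOTS; i++)
        if (slot_flush(c, &c->slots[i]) != 0)
            return (-1);
    return (0);
}
```

A caller would fetch a block with cache_get(), use the returned pointer right
away, and call cache_sync() before closing the file. Resident memory is
capped at CACHE_SLOTS * BLOCK_SIZE bytes however large the backing file
grows.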
So, in brief summary: consider both physical memory and address space
limitations, and to what extent you'll need to manage their use to prevent
exhaustion of either resource. You also need to be careful about the locks
and contexts in which you might need to fault in data. The file system code,
pipe code, and md code are all useful reference material.
Robert N M Watson
Computer Laboratory
University of Cambridge