Re: FreeBSD hugepages
- Reply: Jake Freeland : "Re: FreeBSD hugepages"
- In reply to: Jake Freeland : "Re: FreeBSD hugepages"
Date: Thu, 25 Jul 2024 20:18:16 UTC
On Thu, Jul 25, 2024 at 02:47:16PM -0500, Jake Freeland wrote:
> On 7/25/24 14:02, Konstantin Belousov wrote:
> > On Thu, Jul 25, 2024 at 01:46:17PM -0500, Jake Freeland wrote:
> > > Hi there,
> > >
> > > I have been steadily working on bringing Data Plane Development Kit (DPDK)
> > > on FreeBSD up to date with the Linux version. The most significant hurdle so
> > > far has been supporting concurrent DPDK processes, each with their own
> > > contiguous memory regions.
> > >
> > > These contiguous regions are used by DPDK as a heap for allocating DMA
> > > buffers and other miscellaneous resources. Retrieving the underlying memory
> > > and mapping these regions is currently different on Linux and FreeBSD:
> > >
> > > On Linux, hugepages are fetched from the kernel's pre-allocated hugepage
> > > pool and are mapped into virtual address space on DPDK initialization. Since
> > > the hugepages exist in a pool, multiple processes can reserve their own
> > > hugepages and operate concurrently.
> > >
> > > On FreeBSD, DPDK uses an in-house contigmem kernel module that reserves a
> > > large contiguous region of memory on load. During DPDK initialization, the
> > > entire region is mapped into virtual address space. This leaves no memory
> > > for another independent DPDK process, so only one process can operate at a
> > > time.
> > >
> > > I could modify the DPDK contigmem module to mimic Linux's hugepages, but I
> > > thought it would be better to integrate and upstream a hugepage-like
> > > interface directly in the FreeBSD kernel source. I am writing this email to
> > > see if anyone has any advice on the matter. I did not see any previous
> > > attempts at this in Phabricator or the commit log, but it is possible that I
> > > missed it. I have read about transparent superpage promotion, but that seems
> > > like a different mechanism altogether.
> > >
> > > At a quick glance, the implementation seems straightforward: read some
> > > loader tunables, allocate persistent hugepages at boot time, and create a
> > > pseudo filesystem that supports creating and mapping hugepages. I could be
> > > underestimating the magnitude of this task, but that is why I'm asking for
> > > thoughts and advice :)
> > >
> > > For reference, here is Linux's documentation on hugepages:
> > > https://docs.kernel.org/admin-guide/mm/hugetlbpage.html
> > Are posix shm largepages objects enough (they were developed to support
> > DPDK)? Look for shm_create_largepage(3).
> Yes, shm_create_largepage(2) looks promising, but I would like the ability
> to allocate these largepages at boot time when memory fragmentation is at a
> minimum. Perhaps a couple sysctl tunables could be added onto the
> vm.largepages node to specify a pagesize and allocate some number of pages
> at boot?

We could add an rc script which creates named largepage objects. This
can be done using the posixshmcontrol utility. That might not be early
enough during boot for some purposes. In that case, we could have a
module which creates such objects from within the kernel. This is pretty
straightforward to do; I wrote a dumb version of this for a
mips-specific project a few years ago, feel free to take code or
inspiration from it: https://people.freebsd.org/~markj/tlbdemo.c

> It seems Linux had an interface similar to shm_create_largepage(2) back in
> v2.5, but they removed it in favor of their hugetlbfs filesystem. It would
> be nice to stay close to the file-backed Linux interface to maximize code
> sharing in userspace. It looks like the foundation for hugepages is there,
> but the interface for allocation and access needs to be extended.

POSIX shm objects have most of the properties one would want, I'd
expect, save the ability to access them via standard syscalls. What
else is missing besides the ability to reserve memory at boot time?
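
For illustration, a minimal userspace sketch of creating and mapping a
named largepage object with shm_create_largepage() might look like the
following. This is only a sketch under assumptions: the helper names
(pick_psind, round_up, map_largepage), the array bound of 8 page sizes,
and the 0600 mode are all made up for the example, and it assumes the
object has to be sized to a multiple of the chosen large page size
before it is mapped. The FreeBSD-specific part is guarded so the two
small helpers can be built and checked anywhere.

```c
#include <assert.h>
#include <stddef.h>

/* Return the index of the smallest entry in sizes[] that is >= want, or -1. */
static int
pick_psind(const size_t *sizes, int n, size_t want)
{
	int best = -1;

	for (int i = 0; i < n; i++) {
		if (sizes[i] >= want && (best == -1 || sizes[i] < sizes[best]))
			best = i;
	}
	return (best);
}

/* Round len up to the next multiple of pagesz. */
static size_t
round_up(size_t len, size_t pagesz)
{
	assert(pagesz != 0);
	return ((len + pagesz - 1) / pagesz * pagesz);
}

#ifdef __FreeBSD__
#include <sys/types.h>
#include <sys/mman.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: create (or open) a named largepage object and map it.
 * Returns NULL on any failure.
 */
void *
map_largepage(const char *name, size_t want_pagesz, size_t len)
{
	size_t sizes[8];	/* assumed upper bound on supported page sizes */
	void *p;
	int fd, n, psind;

	n = getpagesizes(sizes, 8);
	if (n <= 0)
		return (NULL);
	psind = pick_psind(sizes, n, want_pagesz);
	if (psind <= 0)		/* index 0 is the base page size, not a large page */
		return (NULL);

	len = round_up(len, sizes[psind]);
	fd = shm_create_largepage(name, O_CREAT | O_RDWR, psind,
	    SHM_LARGEPAGE_ALLOC_DEFAULT, 0600);
	if (fd == -1)
		return (NULL);
	if (ftruncate(fd, (off_t)len) == -1) {
		close(fd);
		return (NULL);
	}
	p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	close(fd);		/* the mapping stays valid after close */
	return (p == MAP_FAILED ? NULL : p);
}
#endif
```

Because the object is named, a second process could open the same path
and map it, or each process could create an object under its own name,
which is roughly the "multiple independent DPDK processes" case from the
original mail.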