Re: physical memory chunk statistics?
- In reply to: Konstantin Belousov : "Re: physical memory chunk statistics?"
Date: Sat, 21 Dec 2024 18:47:40 UTC
On Sat, Dec 21, 2024 at 08:17:12PM +0200, Konstantin Belousov wrote:
> On Sat, Dec 21, 2024 at 05:03:45PM +0000, Bjoern A. Zeeb wrote:
> > Hi,
> >
> > upon boot we display our physical memory chunks nicely.
> >
> > Physical memory chunk(s):
> > 0x0000000000001000 - 0x000000000009ffff, 651264 bytes (159 pages)
> > 0x0000000000101000 - 0x0000000012df5fff, 315576320 bytes (77045 pages)
> > 0x0000000013c00000 - 0x0000000013d87fff, 1605632 bytes (392 pages)
> > 0x0000000016401000 - 0x000000001e920fff, 139591680 bytes (34080 pages)
> > 0x000000001eb1b000 - 0x000000001f73afff, 12713984 bytes (3104 pages)
> >
> > Do we have any way on a running system to export some statistics of
> > how much each of them is used up?  Something like [Use, Requests]?
> Look at vm.phys_segs.  These are segments of physical memory as seen
> by the phys allocator (vm_phys.c).  It is mostly the same chunks as were
> reported e.g. by the UEFI parser at boot, but after the kernel's initial
> memory and other stuff took some pages before vm was initialized.
>
> > Say one wanted to debug the good old lower 4GB contigmalloc failing
> > problem (an example, but also something I am just facing again).
> > How would one do that?  The immediate questions are:
> > (a) how much of the available lower physical 4G chunks are used?
> > (b) how much fragmentation is there roughly?
> > (c) what's the largest contig. chunk avail?
>
> Then look at vm.phys_free.  First, there are free lists.  On amd64, freelist
> 0 is the normal freelist (all pages except freelist 1), and freelist 1 is
> the pages below 4G (i.e. DMA32).  See sys/amd64/include/vmparam.h for
> definitions and comments.

This isn't always true: we only create a dedicated DMA32 freelist on
systems with >= VM_DMA32_NPAGES_THRESHOLD pages of physical memory, which
is 64GB by default.  So on smaller machines one will see only two
freelists, default and ISA DMA, instead of three.

I think the reason is that static allocations during boot (e.g., the VM
page array) can deplete all physical RAM below 4GB, so the explicit
freelist helps reserve memory needed for DMA.  There is some overhead in
searching multiple freelists when free pages are scarce, so creating the
DMA32 freelist unconditionally carries some performance penalty.  Maybe
the threshold should be overridable by a tunable.

> The sysctl reports the number of free pages clustered by the contiguous
> order.  Pools are mostly internal to vm; they allow distinguishing allocs
> from the direct map (e.g. for UMA page allocs) vs. generic contig allocs.

The intent of the direct freepool, I believe, is to cluster allocations of
pages accessed solely via the direct map, i.e., small UMA allocs and page
table pages, in order to improve TLB efficiency.

> People actively working with allocators could correct me.
>
> > Given (c) is likely harder and expensive, (a) and (b) could at least
> > give an idea.  Did we ever solve that?
> vm.phys_free should give the answer up to the supported order size.
>
> That said, a radical solution for the problem of memory below 4G is
> to use IOMMU.  It might be enabled just for a specific PCI device.
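FWIW, both sysctls return preformatted text, so they are also easy to
fetch programmatically.  A minimal userland sketch (not from the tree,
just an illustration built on sysctlbyname(3)):

/*
 * Dump the vm.phys_segs and vm.phys_free sysctls.  Both return
 * preformatted text, so a plain string fetch is enough; error handling
 * is kept minimal for the sake of the example.
 */
#include <sys/types.h>
#include <sys/sysctl.h>

#include <err.h>
#include <stdio.h>
#include <stdlib.h>

static void
dump_sysctl(const char *name)
{
        char *buf;
        size_t len;

        /* First call reports the required buffer length. */
        if (sysctlbyname(name, NULL, &len, NULL, 0) != 0)
                err(1, "sysctlbyname(%s)", name);
        if ((buf = malloc(len + 1)) == NULL)
                err(1, "malloc");
        if (sysctlbyname(name, buf, &len, NULL, 0) != 0)
                err(1, "sysctlbyname(%s)", name);
        buf[len] = '\0';
        printf("%s:\n%s\n", name, buf);
        free(buf);
}

int
main(void)
{
        dump_sysctl("vm.phys_segs");
        dump_sysctl("vm.phys_free");
        return (0);
}

Running this is equivalent to "sysctl vm.phys_segs vm.phys_free", but the
same pattern can be reused if one wants to post-process the tables, e.g.
to sum up the free pages per order in the DMA32 freelist.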
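And for completeness, the failing case being debugged looks roughly like
the sketch below: a contigmalloc(9) request bounded to the low 4G.  The
size, malloc type and alignment are arbitrary placeholders, not taken
from any particular driver.

#include <sys/param.h>
#include <sys/systm.h>
#include <sys/malloc.h>

static void *
alloc_below_4g(unsigned long size)
{
        /*
         * low = 0 and high = 0xffffffff keep the allocation in the low
         * 4G; PAGE_SIZE alignment, no boundary-crossing restriction.
         */
        return (contigmalloc(size, M_DEVBUF, M_NOWAIT, 0,
            0xffffffffUL, PAGE_SIZE, 0));
}

Once the low 4G is depleted or fragmented beyond the requested order,
this returns NULL even though plenty of memory may be free above 4G,
which is exactly what vm.phys_free (or the IOMMU route) helps with.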