vm.kmem_size_max and vm.kmem_size capped at 329853485875 (~307GB)
Alan Cox
alc at rice.edu
Mon Aug 20 15:22:32 UTC 2012
On 08/18/2012 19:57, Gezeala M. Bacuño II wrote:
> On Sat, Aug 18, 2012 at 12:14 PM, Alan Cox <alc at rice.edu> wrote:
>> On 08/17/2012 17:08, Gezeala M. Bacuño II wrote:
>>> On Fri, Aug 17, 2012 at 1:58 PM, Alan Cox <alc at rice.edu> wrote:
>>>> vm.kmem_size controls the maximum size of the kernel's heap, i.e., the
>>>> region where the kernel's slab and malloc()-like memory allocators
>>>> obtain their memory. While this heap may occupy the largest portion of
>>>> the kernel's virtual address space, it cannot occupy the entirety of
>>>> the address space. There are other things that must be given space
>>>> within the kernel's address space, for example, the file system buffer
>>>> map.
>>>>
>>>> ZFS does not, however, use the regular file system buffer cache. The
>>>> ARC takes its place, and the ARC abuses the kernel's heap like nothing
>>>> else. So, if you are running a machine that only makes trivial use of a
>>>> non-ZFS file system, like you boot from UFS but store all of your data
>>>> in ZFS, then you can dramatically reduce the size of the buffer map via
>>>> boot loader tuneables and proportionately increase vm.kmem_size.
>>>>
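>>>> As a minimal sketch of the idea in /boot/loader.conf (the tuneable
>>>> names are real; the byte values are placeholders that have to be sized
>>>> for the particular machine):
>>>>
>>>>     # cap the buffer map; this bounds vfs.maxbufspace
>>>>     kern.maxbcache="2000000000"
>>>>     # 450G (450 * 2^30); the KVA the buffer map no longer needs
>>>>     # becomes available to the kernel heap
>>>>     vm.kmem_size="483183820800"
>>>>     vm.kmem_size_max="483183820800"
>>>>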
>>>> Any further increases in the kernel virtual address space size will,
>>>> however, require code changes. Small changes, but changes nonetheless.
>>>>
>>>> Alan
>>>>
>>>>
>>> <<snip>>
>>>
>>>>> Additional Info:
>>>>> 1] Installed using PCBSD-9 Release amd64.
>>>>>
>>>>> 2] uname -a
>>>>> FreeBSD fmt-iscsi-stg1.musicreports.com 9.0-RELEASE FreeBSD
>>>>> 9.0-RELEASE #3: Tue Dec 27 14:14:29 PST 2011
>>>>> root at build9x64.pcbsd.org:/usr/obj/builds/amd64/pcbsd-build90/fbsd-source/9.0/sys/GENERIC
>>>>> amd64
>>>>>
>>>>> 3] first few lines from /var/run/dmesg.boot:
>>>>> FreeBSD 9.0-RELEASE #3: Tue Dec 27 14:14:29 PST 2011
>>>>> root at build9x64.pcbsd.org:/usr/obj/builds/amd64/pcbsd-build90/fbsd-source/9.0/sys/GENERIC
>>>>> amd64
>>>>> CPU: Intel(R) Xeon(R) CPU E7- 8837 @ 2.67GHz (2666.82-MHz K8-class CPU)
>>>>> Origin = "GenuineIntel" Id = 0x206f2 Family = 6 Model = 2f Stepping = 2
>>>>> Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
>>>>> Features2=0x29ee3ff<SSE3,PCLMULQDQ,DTES64,MON,DS_CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,DCA,SSE4.1,SSE4.2,POPCNT,AESNI>
>>>>> AMD Features=0x2c100800<SYSCALL,NX,Page1GB,RDTSCP,LM>
>>>>> AMD Features2=0x1<LAHF>
>>>>> TSC: P-state invariant, performance statistics
>>>>> real memory = 549755813888 (524288 MB)
>>>>> avail memory = 530339893248 (505771 MB)
>>>>> Event timer "LAPIC" quality 600
>>>>> ACPI APIC Table: <ALASKA A M I>
>>>>> FreeBSD/SMP: Multiprocessor System Detected: 64 CPUs
>>>>> FreeBSD/SMP: 8 package(s) x 8 core(s)
>>>>>
>>>>> 4] relevant sysctls; manually tuned values are marked:
>>>>> kern.maxusers: 384
>>>>> kern.maxvnodes: 8222162
>>>>> vfs.numvnodes: 675740
>>>>> vfs.freevnodes: 417524
>>>>> kern.ipc.somaxconn: 128
>>>>> kern.openfiles: 5238
>>>>> vfs.zfs.arc_max: 428422987776
>>>>> vfs.zfs.arc_min: 53552873472
>>>>> vfs.zfs.arc_meta_used: 3167391088
>>>>> vfs.zfs.arc_meta_limit: 107105746944
>>>>> vm.kmem_size_max: 429496729600 ==>> manually tuned
>>>>> vm.kmem_size: 429496729600 ==>> manually tuned
>>>>> vm.kmem_map_free: 107374727168
>>>>> vm.kmem_map_size: 144625156096
>>>>> vfs.wantfreevnodes: 2055540
>>>>> kern.minvnodes: 2055540
>>>>> kern.maxfiles: 197248 ==>> manually tuned
>>>>> vm.vmtotal:
>>>>> System wide totals computed every five seconds: (values in kilobytes)
>>>>> ===============================================
>>>>> Processes: (RUNQ: 1 Disk Wait: 1 Page Wait: 0 Sleep: 150)
>>>>> Virtual Memory: (Total: 1086325716K Active: 12377876K)
>>>>> Real Memory: (Total: 144143408K Active: 803432K)
>>>>> Shared Virtual Memory: (Total: 81384K Active: 37560K)
>>>>> Shared Real Memory: (Total: 32224K Active: 27548K)
>>>>> Free Memory Pages: 365565564K
>>>>>
>>>>> hw.availpages: 134170294
>>>>> hw.physmem: 549561524224
>>>>> hw.usermem: 391395241984
>>>>> hw.realmem: 551836188672
>>>>> vm.kmem_size_scale: 1
>>>>> kern.ipc.nmbclusters: 2560000 ==>> manually tuned
>>>>> kern.ipc.maxsockbuf: 2097152
>>>>> net.inet.tcp.sendbuf_max: 2097152
>>>>> net.inet.tcp.recvbuf_max: 2097152
>>>>> kern.maxfilesperproc: 18000
>>>>> net.inet.ip.intr_queue_maxlen: 256
>>>>> kern.maxswzone: 33554432
>>>>> kern.ipc.shmmax: 10737418240 ==>> manually tuned
>>>>> kern.ipc.shmall: 2621440 ==>> manually tuned
>>>>> vfs.zfs.write_limit_override: 0
>>>>> vfs.zfs.prefetch_disable: 0
>>>>> hw.pagesize: 4096
>>>>> hw.availpages: 134170294
>>>>> kern.ipc.maxpipekva: 8586895360
>>>>> kern.ipc.shm_use_phys: 1 ==>> manually tuned
>>>>> vfs.vmiodirenable: 1
>>>>> debug.numcache: 632148
>>>>> vfs.ncsizefactor: 2
>>>>> vm.kvm_size: 549755809792
>>>>> vm.kvm_free: 54456741888
>>>>> kern.ipc.semmni: 256
>>>>> kern.ipc.semmns: 512
>>>>> kern.ipc.semmnu: 256
>>>>>
>>> Thanks. The machine will mainly run PostgreSQL and Java. We have a huge
>>> database (3TB and growing) and need as much of it as possible in ZFS's
>>> ARC. All data resides on zpools while root is on UFS. On our 8.2 and 9
>>> machines, vm.kmem_size is always auto-tuned to almost the same size as
>>> the installed RAM. What I've tuned on those machines is to lower
>>> vfs.zfs.arc_max to 50% or 75% of vm.kmem_size; that has worked well for
>>> us, and the machines do not swap. On this machine, though, I think I
>>> need to adjust that formula, since reserving 25% for everything else is
>>> probably overkill.
>>>
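>>> (To make the arithmetic concrete: with vm.kmem_size at 429496729600,
>>> the 75% rule would put vfs.zfs.arc_max at 0.75 x 429496729600 =
>>> 322122547200 bytes.)
>>>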
>>> We were able to successfully bump vm.kmem_size_max and vm.kmem_size to
>>> 400GB:
>>> vm.kmem_size_max: 429496729600 ==>> manually tuned
>>> vm.kmem_size: 429496729600 ==>> manually tuned
>>> vfs.zfs.arc_max: 428422987776 ==>> auto-tuned (vm.kmem_size - 1G)
>>> vfs.zfs.arc_min: 53552873472 ==>> auto-tuned
>>>
>>> Which other tuneables do I need to set in /boot/loader.conf so we can
>>> boot the machine with vm.kmem_size > 400G? Since I don't know which
>>> part of the boot-up process fails with vm.kmem_size/_max set to 450G or
>>> 500G, I have no idea what to tune next.
>>
>>
>> Your objective should be to reduce the value reported by "sysctl
>> vfs.maxbufspace". You can do this by setting the loader.conf tuneable
>> "kern.maxbcache" to the desired value.
>>
>> What does your machine currently report for "sysctl vfs.maxbufspace"?
>>
> Here you go:
> vfs.maxbufspace: 54967025664
> kern.maxbcache: 0
Try setting kern.maxbcache to two billion and adding 50 billion to the
setting of vm.kmem_size{,_max}.
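
Spelled out as /boot/loader.conf entries, a sketch using your current
429496729600 as the base (adjust if you start from a different size):

    # two billion, per the above
    kern.maxbcache="2000000000"
    # 429496729600 + 50000000000
    vm.kmem_size="479496729600"
    vm.kmem_size_max="479496729600"
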
> Other (probably) relevant values:
> vfs.hirunningspace: 16777216
> vfs.lorunningspace: 11206656
> vfs.bufdefragcnt: 0
> vfs.buffreekvacnt: 2
> vfs.bufreusecnt: 320149
> vfs.hibufspace: 54966370304
> vfs.lobufspace: 54966304768
> vfs.maxmallocbufspace: 2748318515
> vfs.bufmallocspace: 0
> vfs.bufspace: 10490478592
> vfs.runningbufspace: 0
>
> Let me know if you need other tuneables or sysctl values. Thanks a lot
> for looking into this.
>