[patch] unprivileged mlock(2)
Andrey Zonov
zont at FreeBSD.org
Tue Aug 28 21:15:14 UTC 2012
On 8/29/12 12:20 AM, Andriy Gapon wrote:
> on 28/08/2012 21:07 Bryan Drewery said the following:
>> On 8/28/2012 11:37 AM, Andrey Zonov wrote:
>>> Hi,
>>>
>>> We've got RLIMIT_MEMLOCK for years, but this limit is useless, because
>>> only root may call mlock(2), and root may raise any limits.
>>>
>>> I suggest patch that allows to call mlock(2) for unprivileged users.
>>> Are there any objections to got it in tree?
>>>
>>
>> FYI, see previous recent thread on this here:
>>
>> http://lists.freebsd.org/pipermail/freebsd-arch/2012-May/012552.html
>> and
>> http://lists.freebsd.org/pipermail/freebsd-arch/2012-June/012606.html
>
> Yes, Andrey, I highly suggest that you read those threads completely.
>
> Here are some observations.
>
> It doesn't look like mlockall and mlockall(MCL_FUTURE) in particular properly
> honor RLIMIT_MEMLOCK. If this is not fixed, then it would be premature to
> enable the privilege for non-privileged users.
>
This should be surely fixed, but I don't know how. Any suggestions are
welcome.
> I am against adding the sysctl knob. If RLIMIT_MEMLOCK limit is properly
> implemented then it is sufficient to effectively deny the privilege (and with
> much finer granularity).
>
Until all bugs around this problem will be fixed, to have such sysctl
would be nice, and even keep it turned off to not change default
behavior (not like in patch).
> When the privilege is allowed to ordinary users, then memorylocked in the
> default login.conf would need to be set to something much lower than the current
> 'unlimited' :-)
>
It's not a problem to set it to a new reasonable value in the tree, but
it will be a problem to fix this everywhere.
> Also, note that currently RLIMIT_MEMLOCK is abused at least in vslock() (used at
> least by sysctl's kernel side). The temporary wirings performed as an
> implementation detail or side-effect should not be accounted against the limit.
> The limit is for wirings that a user makes and controls explicitly. It should
> not be applied to something that kernel does behind the scenes without user's
> knowledge.
>
I was surprised when I stepped on this few years ago on machine with
thousands processes. top(8) ate 100% CPU in a forever loop, ps(1)
didn't work, and that is because memorylocked limit was set to low.
Later I submitted two patches which fixed kvm (r230873) and sockstat
(r230874), but I totally agree with you here, we shouldn't check for
limits in vslock().
Index: sys/vm/vm_glue.c
===================================================================
--- sys/vm/vm_glue.c (revision 239772)
+++ sys/vm/vm_glue.c (working copy)
@@ -183,7 +183,6 @@ int
vslock(void *addr, size_t len)
{
vm_offset_t end, last, start;
- unsigned long nsize;
vm_size_t npages;
int error;
@@ -195,18 +194,6 @@ vslock(void *addr, size_t len)
npages = atop(end - start);
if (npages > vm_page_max_wired)
return (ENOMEM);
- PROC_LOCK(curproc);
- nsize = ptoa(npages +
- pmap_wired_count(vm_map_pmap(&curproc->p_vmspace->vm_map)));
- if (nsize > lim_cur(curproc, RLIMIT_MEMLOCK)) {
- PROC_UNLOCK(curproc);
- return (ENOMEM);
- }
- if (racct_set(curproc, RACCT_MEMLOCK, nsize)) {
- PROC_UNLOCK(curproc);
- return (ENOMEM);
- }
- PROC_UNLOCK(curproc);
#if 0
/*
* XXX - not yet
@@ -222,14 +209,6 @@ vslock(void *addr, size_t len)
#endif
error = vm_map_wire(&curproc->p_vmspace->vm_map, start, end,
VM_MAP_WIRE_SYSTEM | VM_MAP_WIRE_NOHOLES);
-#ifdef RACCT
- if (error != KERN_SUCCESS) {
- PROC_LOCK(curproc);
- racct_set(curproc, RACCT_MEMLOCK,
- ptoa(pmap_wired_count(vm_map_pmap(&curproc->p_vmspace->vm_map))));
- PROC_UNLOCK(curproc);
- }
-#endif
/*
* Return EFAULT on error to match copy{in,out}() behaviour
* rather than returning ENOMEM like mlock() would.
--
Andrey Zonov
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 535 bytes
Desc: OpenPGP digital signature
Url : http://lists.freebsd.org/pipermail/freebsd-arch/attachments/20120828/a2515451/signature.pgp
More information about the freebsd-arch
mailing list