amd64/170351: [patch] amd64: 64-bit process can't always get
unlimited rlimit
Bruce Evans
brde at optusnet.com.au
Sat Aug 4 05:42:56 UTC 2012
On Fri, 3 Aug 2012, Konstantin Belousov wrote:
> On Fri, Aug 03, 2012 at 03:35:20PM +0000, Ming Qiao wrote:
> > >Description:
> > On the amd64 platform, if a 32-bit process ever manually set its rlimit,
> > none of its 64-bit child or offspring will be able to get the full 64-bit
> > rlimit anymore, even if they explicitly set the limit to unlimited.
> >
> > Note that for the sake of simplicity, only datasize limit is referred
> > in this report. But the same logic applies to all other memory segment
> > (i.e. stacksize, etc.).
> > ...
> ...
> The problem you described is architectural. By design, Unix resource
> limits cannot be increased after they were decreased, except by root.
> In your scenario, the limits were decreased by mere fact of running the
> 32bit process, which has lower 'infinity' limits than 64bit processes.
>
> That said, I see two possible solutions.
>
> First is to manually set compat.ia32.max* sysctls to 0. Then you get
> desired behaviour for 64bit processes execed from 32bit, it seems.
> It does not require code change. Since you are fine with denying fix
> for infinity, this setting gives the same effect as the patch.
>
> Second approach (which is essentially a correction to your approach
> from fix.diff) is to track the fact that corresponding rlimits are set
> to 'ABI infinity', in some per-struct rlimit flag. Then, get/setrlimit
> should first test the 'ABI infinity' flag and behave as if rlimit is set
> to infinity for current bitness even if the actual value of rlimit is
> not infinity. Flag is set when rlimit is set to infinity by current ABI.
>
> The second approach would provide 'correct' fix, but it is not trivial
> amount of work for very rare situation (execing 64bit process from 32bit),
> and current behaviour of inheriting 32bit limits may be argued as right.
> If you want, feel free to develop such patch, I will review and commit it,
> but I do not want to spend efforts on developing it myself ATM.
Third approach: "unlimited" never really means unlimited, so leave the data
size "unlimited" like most other defaults. RLIM_INFINITY is the same in
32-bit mode as in 64-bit mode, so there is no problem in representing
"unlimited".
Some defaults on a 9.0-STABLE i386 system, according to bash:
% socket buffer size (bytes, -b) unlimited
% core file size (blocks, -c) unlimited
% data seg size (kbytes, -d) 524288
% file size (blocks, -f) unlimited
% max locked memory (kbytes, -l) unlimited
% max memory size (kbytes, -m) unlimited
All the memory and file sizes have finite limits, but the actual limits
are very load-dependent and this won't tell us what these are. The data
size limit is also load-dependent, and this may tell us a wrong value.
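For reference, an entry shown as "unlimited" above is simply a limit whose value equals RLIM_INFINITY. A minimal sketch of that display logic, using Python's resource module rather than anything from bash or the kernel (format_limit is a hypothetical helper name, and the label is copied from the bash output above):

```python
import resource

def format_limit(value, unit=1):
    """Return 'unlimited' for RLIM_INFINITY, else the value in multiples of `unit` bytes."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return str(value // unit)

if __name__ == "__main__":
    # Soft data segment limit of the current process, shown in kbytes like `ulimit -d`.
    soft, _hard = resource.getrlimit(resource.RLIMIT_DATA)
    print("data seg size (kbytes, -d)", format_limit(soft, 1024))
```

This only reports the configured rlimit. It says nothing about how much memory the system can actually supply under load, which is the point made above.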
% open files (-n) 3000
Knowing the actual limit for this is more important. I think this limit
is required to be the same as getdtablesize() and sysconf(_SC_OPEN_MAX).
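One way to check that expectation on a given system is to read both values and compare them. This is only a sketch: open_file_limits is a hypothetical helper, and Python does not expose getdtablesize(), so only the rlimit and sysconf(_SC_OPEN_MAX) are compared here.

```python
import os
import resource

def open_file_limits():
    """Return (soft RLIMIT_NOFILE, sysconf(_SC_OPEN_MAX)) for the current process."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft, os.sysconf("SC_OPEN_MAX")

if __name__ == "__main__":
    soft, open_max = open_file_limits()
    print("RLIMIT_NOFILE (soft): ", soft)
    print("sysconf(_SC_OPEN_MAX):", open_max)
    print("match:", soft == open_max)
```

Whether the two agree may depend on the platform and on whether the limit is infinite, so the comparison is printed rather than assumed.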
% pipe size (512 bytes, -p) 1
Seems to be a bash bug. There is no rlimit for pipes, and the finite limit
for this is not 512. (Like most limits, it depends in a complicated way
on related and unrelated system resources, so the actual limit is sometimes
0 and sometimes closer to the real or virtual memory size. Not so close
to the memory sizes for this, since there is a limit on pipe kva. This
limit is more broken than most, since it is global. Its implementation
has many style bugs.)
% stack size (kbytes, -s) 65536
% cpu time (seconds, -t) unlimited
% max user processes (-u) 5547
This is probably also required to track a sysconf() value.
% virtual memory (kbytes, -v) unlimited
% swap size (kbytes, -w) unlimited
So there are only 4 finite "infinite" rlimits, with 2 probably required.
Bash (4.2.20) is also missing support for the new limit on pseudo-
terminals. This is shown by sh:
% sbsize (bytes, -b) unlimited
% pseudo-terminals (-p) unlimited
The socket buffer limit is shown by both, but sh doesn't describe it
properly (it uses the kernel/API abbreviated name for the description).
On a 10.0-CURRENT amd64 system, according to bash:
% data seg size (kbytes, -d) 33554432
% open files (-n) 11095
% pipe size (512 bytes, -p) 1
% stack size (kbytes, -s) 524288
% max user processes (-u) 5547
Now the finiteness of the data seg size limit is nonsense. The finite
value is 32GB, but the system only has 16GB of RAM and 8GB of swap.
Overcommit may allow more virtual data, but you don't want that unless
you want to have no limit. The correct spelling for no limit on the
data seg size is "unlimited" (RLIM_INFINITY, not 32GB). 64-bit systems
probably all have this limit (but misspelled) without really
trying, since the large virtual address space makes it very easy to
exceed physical resources, and any arbitrary limit is likely to be
smaller than necessary for some loads (especially overcommitted ones),
or too large to always be physically satisfiable.
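The arithmetic behind the 32GB figure can be spelled out. The sketch below assumes the "GB" values above are binary (GiB) and uses a hypothetical helper, backing_shortfall, to compute how much of the limit has no RAM or swap behind it:

```python
GIB = 1024 ** 3

def backing_shortfall(limit_kbytes, ram_bytes, swap_bytes):
    """Bytes by which the limit exceeds RAM + swap (0 if the limit fits)."""
    limit_bytes = limit_kbytes * 1024
    return max(0, limit_bytes - (ram_bytes + swap_bytes))

if __name__ == "__main__":
    # amd64 system above: 33554432 kbytes data limit, 16GB RAM, 8GB swap.
    shortfall = backing_shortfall(33554432, 16 * GIB, 8 * GIB)
    print(shortfall // GIB, "GiB of the limit has no RAM or swap behind it")
```

With these numbers the limit is 32 GiB against 24 GiB of RAM plus swap, so a single process at the limit already needs overcommit.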
8GB swap with 16GB RAM is also nonsense. Might as well have the correct
amount of swap (0), or if you want some swap then spare a few bytes of the
16GB for a RAM disk.
On my i386 local system:
% data seg size (kbytes) 524288
The above i386 system has 2GB of RAM and 4GB of swap. It can actually run
up to 11 threads using the data limit without overcommit. But I have only
1GB of RAM and the correct amount of swap (0), so I can't run more than
1 thread using the data limit without overcommit.
Bruce