Python on FreeBSD is slower than on Linux

Konstantin Belousov kostikbel at gmail.com
Fri Nov 13 08:23:45 UTC 2015


On Fri, Nov 13, 2015 at 09:01:57AM +0100, Baptiste Daroussin wrote:
> On Fri, Nov 13, 2015 at 12:36:29PM +1100, Kubilay Kocak wrote:
> > On 13/11/2015 6:26 AM, Vladimir Bogrecov wrote:
> > > Hello,
> > > 
> > > I'm developing a little project in Python 3.5. The server's operating
> > > system is FreeBSD 10.2. Today I decided to do a little test "just for fun"
> > > and the result confused me. I ran the following code
> > > 
> > > import random
> > > import time
> > > 
> > > 
> > > def test_sort(size):
> > >     sequence = [i for i in range(0, size)]
> > >     random.shuffle(sequence)
> > >     start = time.time()
> > >     ordered_sequence = sorted(sequence)
> > >     print(time.time() - start)
> > > 
> > > 
> > > if __name__ == '__main__':
> > >     test_sort(1000000)
> > > 
> > > on FreeBSD 10.2 x64 and on Debian 8 x64. Both computers were the smallest
> > > ($5 per month) virtual machines on DigitalOcean (
> > > https://www.digitalocean.com). The average result on FreeBSD was 1.5
> > > sec, on Debian 1.0 sec. Both machines were created specifically for the
> > > test and had no customization. Could you help me understand why Python
> > > is so much slower on FreeBSD, and whether there are steps I can take to
> > > make Python run no slower than on Debian?
> > > 
> > > I found a similar question via Google:
> > > https://lists.freebsd.org/pipermail/freebsd-python/2012-June/004306.html so
> > > this is of interest not only to me.
> > > 
> > > P.S. I really like FreeBSD and I would be happy to solve this issue. If you
> > > are interested in this issue, I can provide SSH access to both
> > > machines :)
> > > 
> > > Thank You!
> > 
> > From the FreeBSD Python team's point of view, I can't think of anything
> > obvious off the top of my head that might cause a ~30% performance issue
> > for that workload.
> > 
> > Let's get a trace (truss, strace, dtrace) of what's going on during the run
> > so we can figure out exactly what's happening and in what context.
> > 
> > With respect to the testing environment, certain VPS providers throttle
> > bursts of CPU pretty heavily, so you'll want to account for/isolate that
> > as a potential contributor. Yes, both OSes are being run on the same
> > provider, but as Alfred said, one OS may be mitigating/working around
> > certain virtualisation 'issues'.
> > 
> > A full trace of what the test case is doing is definitely the next best
> > step I can think of, even before profiling in python, which is probably
> > going to provide insight as well.
> > 
> > Personally, I'd love to hear about anything that might result in FreeBSD
> > always topping the charts for Python performance.
> > 
> Well, the Python devs are themselves aware of potential performance issues on
> FreeBSD (and non-Linux systems in general). For example, subprocess will try
> to close fds: on Linux, by getting the list of fds from /proc/self/fd and
> closing only the unwanted ones among those that exist. On FreeBSD they do the
> same if /dev/fd is mounted, meaning that without /dev/fd, performance will
> suck. They do not use closefrom(2) here because on Linux it is not
> async-signal-safe. One could make them use closefrom(2) on non-Linux systems,
> for example, or, even more efficiently but FreeBSD-only, modify the code to
> use kinfo_getfile(3).
> 
> https://bugs.python.org/issue11284
> 
> Another area is AIO: IIRC (this needs to be double-checked), Python uses
> Linux-only facilities for AIO, which makes it much slower on FreeBSD.
> 
> I'm kind of surprised, given the number of Pythonic people we have, that no
> one has looked at how Python performs on FreeBSD and how things are
> implemented in the Python VM to help them.

Note that the code provided does not perform any system actions at all.  It is,
I guess, pure calculation and probably memory allocation.  The latter,
for the initial warm-up, may have a different constant cost depending on the
malloc implementation, the operating system, and the policy the hypervisor
applies to the OS's memory access patterns.

In other words, to meaningfully compare apples to apples, the testing
must isolate each variable. Run the tests on the same _real_
hardware, include a warm-up to isolate the initialization cost, and do a
statistically meaningful analysis. Also trace the test, e.g. using ktrace,
to see the program<->system interaction; in particular, ktrace lets you
see the page faults experienced during execution.
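
As a rough illustration of the warm-up and statistics points, the original
test could be restructured along the following lines. This is only a sketch
under assumptions: the names time_sort and summarize, the warm-up and repeat
counts, and the fixed seed are all illustrative choices, and the minor
page-fault counter from getrusage is a coarse stand-in for what a ktrace run
would show in detail.

```python
import random
import resource
import statistics
import time


def time_sort(size, warmups=2, repeats=5, seed=0):
    """Time sorted() on one shuffled list, discarding the warm-up runs."""
    # Fixed seed (an assumption of this sketch) so every run sorts the
    # same input on a given Python version.
    rng = random.Random(seed)
    sequence = list(range(size))
    rng.shuffle(sequence)

    samples = []
    faults = []
    for run in range(warmups + repeats):
        before = resource.getrusage(resource.RUSAGE_SELF).ru_minflt
        start = time.perf_counter()
        sorted(sequence)
        elapsed = time.perf_counter() - start
        after = resource.getrusage(resource.RUSAGE_SELF).ru_minflt
        if run >= warmups:  # skip runs that pay the initialization cost
            samples.append(elapsed)
            faults.append(after - before)
    return samples, faults


def summarize(samples):
    """Return basic statistics instead of a single timing."""
    return {
        "min": min(samples),
        "median": statistics.median(samples),
        "mean": statistics.mean(samples),
        "stdev": statistics.stdev(samples),
    }


if __name__ == "__main__":
    samples, faults = time_sort(1000000)
    for name, value in summarize(samples).items():
        print("%s: %.4f s" % (name, value))
    print("minor page faults per run:", faults)
```

Comparing the minimum and median across systems, rather than one wall-clock
number, makes it easier to tell a constant start-up cost from a steady-state
difference.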

