Poor read() performance, and I can't profile it
Kirk Strauser
kirk at strauser.com
Thu Jun 12 19:17:58 UTC 2008
On Wednesday 11 June 2008, Chuck Swiger wrote:
> If your data files are small enough to fit into 2GB of address space,
> try using mmap() and then treat the file(s) as an array of records or
> memoblocks or whatever, and let the VM system deal with paging in the
> parts of the file you need. Otherwise, don't fread() 1 record at a
> time, read in at least a (VM page / sizeof(record)) number of records
> at a time into a bigger buffer, and then process that in RAM rather
> than trying to fseek in little increments.
During a marathon session last night, I did just that. I changed the sequential reads
in the "outer" file to fread() many records at a time. Then I switched to mmap() for the
random-access file. The results were much better, with good CPU usage and a wall clock
runtime on FreeBSD only about 3 times that on Linux (3:05 versus 1:11):
kirk at linux$ date; time /tmp/cdbf /tmp/invoice.dbf >/dev/null; date
Thu Jun 12 13:56:49 CDT 2008
/tmp/cdbf /tmp/invoice.dbf > /dev/null 29.00s user 11.16s system 56% cpu 1:11.03 total
Thu Jun 12 13:58:00 CDT 2008
kirk at freebsd$ date; time /tmp/cdbf ~pgsql/data/frodumps/xbase/invoice.dbf invid ln >/dev/null; date
Thu Jun 12 14:10:57 CDT 2008
/tmp/cdbf ~pgsql/data/frodumps/xbase/invoice.dbf invid ln > /dev/null 38.14s user 6.21s system 23% cpu 3:05.13 total
Thu Jun 12 14:14:02 CDT 2008
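For anyone following along, the two changes look roughly like the sketch below. This is not the actual cdbf code: REC_SIZE, BATCH, and the function names scan_records(), map_file(), and record_at() are made up for illustration. It assumes fixed-length records and a file small enough to map in one piece, and the "work" done per record (summing the first byte) is only a placeholder.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

#define REC_SIZE 64    /* hypothetical fixed record length in bytes */
#define BATCH    1024  /* records fetched per fread() call */

/* Sequential pass: pull up to BATCH records per fread() instead of one.
 * As a stand-in for real per-record work, sum each record's first byte.
 * Returns the record count, or -1 if the file cannot be opened. */
long scan_records(const char *path, unsigned long *sum)
{
    static unsigned char buf[REC_SIZE * BATCH];  /* allocated once, reused */
    FILE *fp = fopen(path, "rb");
    long total = 0;
    size_t n, i;

    assert(sum != NULL);
    if (fp == NULL)
        return -1;
    *sum = 0;
    while ((n = fread(buf, REC_SIZE, BATCH, fp)) > 0) {
        for (i = 0; i < n; i++)
            *sum += buf[i * REC_SIZE];
        total += (long)n;
    }
    fclose(fp);
    return total;
}

/* Random access: map the whole file read-only and let the VM page it in.
 * Returns NULL on failure; on success stores the file length in *len. */
const unsigned char *map_file(const char *path, size_t *len)
{
    struct stat st;
    void *p;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return NULL;
    if (fstat(fd, &st) < 0 || st.st_size <= 0) {
        close(fd);
        return NULL;
    }
    p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);  /* the mapping remains valid after the descriptor is closed */
    if (p == MAP_FAILED)
        return NULL;
    *len = (size_t)st.st_size;
    return p;
}

/* Treat the mapping as an array of records; NULL if idx is out of range. */
const unsigned char *record_at(const unsigned char *map, size_t len, size_t idx)
{
    if (idx >= len / REC_SIZE)
        return NULL;
    return map + idx * REC_SIZE;
}
```

The caller is expected to munmap() the pointer returned by map_file() once it is finished with the file.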
> Also, if you're malloc'ing and freeing buf & memohead with every
> iteration of the loop, you're just thrashing the malloc system;
> instead, allocate your buffers once before the loop, and reuse them
> (zeroize or copy new data over the previous results) instead.
Also done. I'd gotten some technical advice from Slashdot (which speaks volumes for my
cluelessness, granted) that made malloc'ing inside the loop sound like a good idea. I
changed almost all the mallocs into static buffers.
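To illustrate the kind of change I mean, here is a minimal sketch of a helper that reuses one static buffer instead of calling malloc() and free() on every record. The name trim_field() and the MAX_FIELD limit are hypothetical and not taken from cdbf; the sketch assumes space-padded character fields no longer than MAX_FIELD bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_FIELD 256  /* hypothetical upper bound on a field's length */

/* Copy a space-padded field into a static buffer and strip the trailing
 * spaces. The buffer is overwritten on every call, so the result is only
 * valid until the next call, and the function is not thread-safe. */
const char *trim_field(const char *field, size_t len)
{
    static char buf[MAX_FIELD + 1];  /* allocated once for the whole run */

    assert(field != NULL);
    if (len > MAX_FIELD)
        len = MAX_FIELD;             /* truncate rather than overflow */
    memcpy(buf, field, len);
    while (len > 0 && buf[len - 1] == ' ')
        len--;
    buf[len] = '\0';
    return buf;
}
```

The trade-off is that the caller has to use or copy the result before the next call, which is fine in a loop that handles one field at a time.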
I'm still offering that shell account to anyone who wants to take a peek. :-)
--
Kirk Strauser