Fast diff command for large files?
Kirk Strauser
kirk at strauser.com
Mon Nov 7 17:29:41 GMT 2005
On Monday 07 November 2005 10:40, francisco at natserv.net wrote:
> I had the same setup a while back.
> A few suggestions.
Thanks for the tips; unfortunately, any fix that involves touching the
FoxPro code is basically impossible. It's not that we *can't*, but that
the sole FoxPro programmer at our company is completely occupied with other
projects.
> What type of system is this? In particular, can any record be modified,
> or are only recent records changed?
Nope - every line in each table is subject to change.
Here's how our current system works:
1) Copy each FoxPro table file (and associated memo file if one exists) to a
   Unix server via Samba.
2) Run my modified version of the "xbase" program to convert each table to a
   tab-delimited file that can be loaded into PostgreSQL using the "copy
   table" command. These files are named "foo.dump", "bar.dump", etc.
3) If "foo.dump-old" exists:
   a) Using Andrew's algorithm, get the difference between foo.dump-old and
      foo.dump. Write these out as a set of "delete from ..." commands and
      a "copy table" command. Pipe this relatively tiny file into the
      "psql" command to upload the modifications.
   Otherwise:
   b) Use the psql command to upload foo.dump
4) "mv foo.dump foo.dump-old"
5) Profit!
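
For anyone wanting to try something similar, step 3a can be sketched roughly
as below. This is not Andrew's algorithm (which isn't reproduced in this
message), just a plain set difference over whole lines. It assumes that the
first tab-separated field of every row is a unique key, that the key column
is called "id", and that key values contain no COPY escape sequences. The
names build_patch and patch_from_files are made up for the illustration.

```python
def build_patch(old_lines, new_lines, table, key_column="id"):
    """Build a small SQL script that turns the old dump into the new one.

    Assumption: the first tab-separated field of each line is a unique key
    stored in `key_column`. Rows are compared as whole lines, so a changed
    row shows up as one removed line plus one added line.
    """
    old_set = set(old_lines)
    new_set = set(new_lines)

    # Lines present only in the old dump: rows that were deleted or changed.
    removed = old_set - new_set
    # Lines present only in the new dump, kept in file order: new or changed rows.
    added = [line for line in new_lines if line not in old_set]

    out = []
    stale_keys = sorted({line.split("\t", 1)[0] for line in removed})
    if stale_keys:
        # Quote each key as a SQL string literal, doubling embedded quotes.
        quoted = ", ".join("'" + k.replace("'", "''") + "'" for k in stale_keys)
        out.append(f"DELETE FROM {table} WHERE {key_column} IN ({quoted});")
    if added:
        # COPY ... FROM stdin reads rows until a line containing only "\."
        out.append(f"COPY {table} FROM stdin;")
        out.extend(added)
        out.append("\\.")
    return "".join(line + "\n" for line in out)


def read_dump(path):
    """Read a tab-delimited dump file into a list of lines, skipping blanks."""
    with open(path, encoding="utf-8") as f:
        return [line.rstrip("\n") for line in f if line.strip()]


def patch_from_files(old_path, new_path, table):
    """Convenience wrapper: diff two dump files on disk."""
    return build_patch(read_dump(old_path), read_dump(new_path), table)
```

The string returned by patch_from_files("foo.dump-old", "foo.dump", "foo")
could be written to a file and piped into psql. Holding both dumps in memory
as sets is the simplest approach but may not suit very large tables.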
I've already cut the runtime in half. The next big step is going to be
getting our Windows admin to install rsync on the fileserver so that we can
minimize the time spent in step one. With the exception of the space
required by keeping the old version of the dump files (step 4), this is
exceeding all of our performance expectations by a wide margin.
Even better, step 3a cuts the time that the PostgreSQL server has to spend
committing the new data by several orders of magnitude. The net effect is
that our web visitors don't see a noticeable slowdown during the import
stage.
--
Kirk Strauser