editing pdf files
Polytropon
freebsd at edvax.de
Sat Oct 13 11:19:16 UTC 2012
On Fri, 12 Oct 2012 16:46:28 -0700, Gary Kline wrote:
> ive got a question that fits in here. hopefully.
>
> last week I found a book from 1901 that google had scanned and listed
> as a pdf file. it was text plus photos of the rich/famous of the
> 1800s. somehow, google found the exact string that matched my great
> grandfather [from the civil war]. I d'loaded the file (maybe 2mbytes)
> and searched using acroread. nada. I used the pdftotext utility.
> same: nothing but some 600 page numbers.
>
> my guess is that google just took photos of the book and used other
> tools to create a pdf file. I am not =that= serious about genealogy,
> but I would like to know if there are any tools to edit this kind of
> pdf file.
In case the PDF is nothing more than a compilation of images,
there's a way to deal with it for editing:
step 1: disassemble
step 2: edit images
step 3: reassemble
The disassembling can be done with
    % pdfimages source.pdf .
Then the files can be edited with whatever tool you like, e.g. Gimp.
They often come out in PBM format.
Finally the images can be converted back to PDF and combined into one
PDF file:
    for IMG in .*.pbm; do
        convert "${IMG}" "${IMG}.pdf"
    done
    pdftk .*.pdf cat output target.pdf
Note the ".*" prefix in the file patterns: because "." was given to
pdfimages as the image root, the extracted images get names that match
that pattern (at least in the case I tested it for). If they get other
names than .0000001.pbm, change the patterns accordingly.
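The reassembly step could also be wrapped in a small sh function, which
makes it easier to adapt when the extracted files get other names. This
is only a sketch under some assumptions: the images were extracted with
a visible image root, e.g. "pdfimages source.pdf page", so that they are
named page-000.pbm, page-001.pbm and so on; the function name rebuild_pdf
and the RUN variable are made up for illustration; and ImageMagick's
convert and pdftk are installed.

```shell
# Sketch: convert every PREFIX-*.pbm image to a one-page PDF and join
# the results with pdftk. "rebuild_pdf" and "RUN" are hypothetical names.
# Set RUN=echo to print the commands instead of executing them.
rebuild_pdf() {
    prefix=$1
    target=$2
    set --                            # collect the per-page PDF names in "$@"
    for img in "$prefix"-*.pbm; do
        [ -e "$img" ] || continue     # the pattern matched nothing
        pdf=${img%.pbm}.pdf
        ${RUN:-} convert "$img" "$pdf"
        set -- "$@" "$pdf"
    done
    if [ "$#" -eq 0 ]; then
        echo "no images matching $prefix-*.pbm" >&2
        return 1
    fi
    # The shell expands the pattern in sorted order, so the zero-padded
    # numbers from pdfimages keep the pages in sequence.
    ${RUN:-} pdftk "$@" cat output "$target"
}

# Usage (dry run first, then for real):
#   RUN=echo; rebuild_pdf page target.pdf
#   RUN=;     rebuild_pdf page target.pdf
```

If the images come out in another format (PPM for colour pages, for
example), the ".pbm" suffix in the function has to be changed to match.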
--
Polytropon
Magdeburg, Germany
Happy FreeBSD user since 4.0
Andra moi ennepe, Mousa, ...