any way to turn a pdf file into something OCR-able?

Olivier Nicole on at cs.ait.ac.th
Mon Dec 1 18:42:50 PST 2008


> >  1) Some PDFs are just wrappers around JPEG images. In this case
> >  there is no text for pdftotext to convert => epic fail.
> 
> 	In this case "convert" from the ImageMagick port will get you a
> series of .jpg/.gif/.<whatever>.  Read the manual carefully before
> attempting; also note this can be a slow process.

pdfimages (from the graphics/xpdf port) can also do that, probably at a
lower cost: it extracts the images already embedded in the PDF instead
of re-rendering each page.
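
For what it is worth, here is a rough sketch of how the two approaches
could be scripted. It is only an illustration: the function names, the
"img" and "page-%03d.png" output names and the 300 dpi default are my
own assumptions, not anything from the ports themselves. It relies on
the -j option of pdfimages (write JPEG images as .jpg files) and the
-density option of convert.

```python
import shutil
import subprocess
from pathlib import Path


def pdfimages_cmd(pdf, out_root):
    """Build a pdfimages command that dumps the embedded images as-is.

    -j writes DCT (JPEG) images as .jpg files; other image types are
    written in PBM/PPM format. out_root is the prefix of the output files.
    """
    return ["pdfimages", "-j", str(pdf), str(out_root)]


def convert_cmd(pdf, out_pattern, dpi=300):
    """Build an ImageMagick command that renders every page to an image.

    -density sets the rendering resolution and has to come before the
    input file. The 300 dpi default is an assumption, not a recommendation.
    """
    return ["convert", "-density", str(dpi), str(pdf), str(out_pattern)]


def extract_images(pdf, out_dir):
    """Hypothetical helper: prefer pdfimages, fall back to convert."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    if shutil.which("pdfimages"):
        cmd = pdfimages_cmd(pdf, out_dir / "img")
    elif shutil.which("convert"):
        cmd = convert_cmd(pdf, out_dir / "page-%03d.png")
    else:
        raise RuntimeError("neither pdfimages nor convert found in PATH")
    subprocess.run(cmd, check=True)
    # Return whatever image files the chosen tool produced.
    return sorted(out_dir.iterdir())
```

Calling extract_images("scan.pdf", "out") would leave the image files
in out/, ready to be fed to whatever OCR program you use.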

Best,

Olivier


More information about the freebsd-questions mailing list