OCR...
Gary Kline
kline at thought.org
Wed Jan 28 11:22:17 PST 2009
On Wed, Jan 28, 2009 at 12:08:55PM +0200, Reko Turja wrote:
> >so what is the best commercial/shareware that can read a 10pt-font
> >file? (also, when i have time to get back into actually hacking,
> >this [[turning imaged pdf into OCR'able ascii or 8859-1]] is going
> >to be a first target. any idea which team i should go with? gOCR
> >looks best so far to me.)
>
> ABBYY FineReader - OmniPage hasn't been able to catch up with it in
> several years, either feature- or quality-wise. No idea if FineReader
> runs under an emulator, though. If the file is already a 72 DPI PDF
> with text as graphics, most of the damage has already been done, and
> it will be extremely hard to OCR.
>
well, the damage is probably done. how can i check the resolution?
i tried to increase it by creating huge ppm and tif files, but
that's really absurd, since an image can only hold so much
data. i _could_ try xv with jpeg and image smoothing to
refine it, but that's too much hassle.
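
On the resolution question, a rough sketch of the arithmetic, assuming you already know the pixel size of the embedded page image and the page size in PDF points (1 point = 1/72 inch). The function names and the US Letter page size (612 x 792 pt) are illustrative assumptions, not anything taken from the file discussed here. It also shows why a 10pt font at 72 DPI leaves so little for an OCR engine to work with, and why upsampling to a huge ppm or tif adds pixels but no new detail.

```python
# Sketch only: estimate the effective scan resolution of a page image
# and how many pixels tall a font of a given point size would be.
# All names and example numbers below are hypothetical.

POINTS_PER_INCH = 72.0  # PDF/PostScript points per inch


def effective_dpi(pixels, page_points):
    """Pixels along one axis divided by the page length in inches."""
    return pixels * POINTS_PER_INCH / page_points


def glyph_height_px(font_pt, dpi):
    """Nominal font height in pixels at a given resolution."""
    return font_pt * dpi / POINTS_PER_INCH


if __name__ == "__main__":
    # Assumed example: a US Letter page (612 pt wide) stored as an
    # image that is 612 pixels wide, i.e. one pixel per point.
    dpi = effective_dpi(612, 612.0)
    print("effective dpi:", dpi)
    print("10pt font height in pixels:", glyph_height_px(10, dpi))
    # The same page scanned at 2550 pixels wide, for comparison.
    dpi_hi = effective_dpi(2550, 612.0)
    print("effective dpi:", dpi_hi)
    print("10pt font height in pixels:", round(glyph_height_px(10, dpi_hi), 1))
```

If the first figure comes out near 72, each 10pt glyph is only about ten pixels tall in the original data, and no amount of resampling afterwards changes that.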
(i used gocr -m 130 and "saw" the glyphs it (presumably) saw.
they seemed pretty much okay to my eyes, but then i'm not a computer
program. [MAYBE :)])
gary
> -Reko
>
--
Gary Kline kline at thought.org http://www.thought.org Public Service Unix
http://jottings.thought.org http://transfinite.thought.org
The 2.23a release of Jottings: http://jottings.thought.org/index.php