OCR...

Gary Kline kline at thought.org
Wed Jan 28 11:22:17 PST 2009


On Wed, Jan 28, 2009 at 12:08:55PM +0200, Reko Turja wrote:
> >so what is the best commercial/shareware that can read a 10pt-font
> >file?  (also, when i have time to get back into actually hacking,
> >this [[turning imaged pdf into OCR'able ascii or 8859-1]] is going
> >to be a first target.  any idea which team i should go with?  gOCR
> >looks best so far to me.)
> 
> ABBYY FineReader - OmniPage hasn't been able to catch up with it in
> several years, either feature- or quality-wise. No idea if FineReader
> runs under an emulator, though.  If the file is already a PDF at 72 DPI
> with text as graphics, most of the damage has already been done, and it
> will be extremely hard to OCR.
> 

	well, the damage is probably done.  how can i check the resolution?
	i tried to increase it by creating huge ppm and tif files, but
	that's really absurd since there can only be just so much
	data per image.  i _could_ try xv and jpeg and smoothing the image
	to refine it, but that's too much hassle.
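
	one rough way to check the resolution, assuming the pixel size of
	the embedded page image is known (for example from the header of
	an image pulled out at its native size with pdfimages) and the
	page size is known in PDF points (1 point = 1/72 inch).  the
	function name and the sample numbers below are made up for
	illustration, and this is only a sketch of the arithmetic:

```python
# Sketch: estimate the effective DPI of a scanned page image in a PDF.
# Assumption: the image fills the whole page, so
#   dpi = pixels / inches, where inches = points / 72.

POINTS_PER_INCH = 72.0


def effective_dpi(pixel_width, pixel_height, page_width_pt, page_height_pt):
    """Return (horizontal_dpi, vertical_dpi) for a full-page image."""
    width_in = page_width_pt / POINTS_PER_INCH
    height_in = page_height_pt / POINTS_PER_INCH
    return pixel_width / width_in, pixel_height / height_in


if __name__ == "__main__":
    # Hypothetical US Letter page (612 x 792 points).
    print(effective_dpi(612, 792, 612, 792))      # a 72 DPI scan
    print(effective_dpi(2550, 3300, 612, 792))    # a 300 DPI scan
```

	rendering the same page to a bigger ppm or tif raises the pixel
	count of the output file but not this number for the embedded
	image, which is why upscaling does not help the OCR much.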

	(i used gocr -m 130 and "saw" the glyphs it (presumably) saw.
	they seemed pretty much okay to my eyes.  but then i'm not a
	computer program.  [MAYBE :)])

	gary



> -Reko 
> 

-- 
 Gary Kline  kline at thought.org  http://www.thought.org  Public Service Unix
        http://jottings.thought.org   http://transfinite.thought.org
    The 2.23a release of Jottings: http://jottings.thought.org/index.php


