Outliner Software
Searching scanned PDFs


Note: This message is from the outliners.com archive kindly provided by Dave Winer.

Outliners.com Message ID: 3723

Posted by graham.smith
2005-08-10 13:03:10

 

Elsewhere I have been discussing the problems of searching documents that have been scanned into PDFs. These are image-based PDFs rather than text-based ones, so they cannot be searched by text-based search engines like DTSearch, Copernic, etc.

My solution is to OCR them, rather tediously, into txt files, and then link each txt file to its PDF file using Zoot.
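
For anyone without Zoot, the same idea can be roughly sketched in a few lines of Python. This is only an illustration under my own assumptions: the OCR step has already been done, each txt file sits in the same folder as its PDF with the same base name (that naming convention stands in for the Zoot link), and `search_sidecars` is a made-up helper name, not part of any product mentioned here.

```python
from pathlib import Path


def search_sidecars(folder, term):
    """Return the PDFs whose same-named .txt file contains `term`.

    Assumption: "report.txt" holds the OCR output for "report.pdf" in the
    same folder. The match is a plain case-insensitive substring test.
    """
    needle = term.lower()
    hits = []
    for txt in sorted(Path(folder).glob("*.txt")):
        pdf = txt.with_suffix(".pdf")
        if not pdf.exists():
            continue  # a txt file with no scanned PDF to point back to
        text = txt.read_text(encoding="utf-8", errors="ignore")
        if needle in text.lower():
            hits.append(pdf)
    return hits
```

Calling `search_sidecars("scans", "invoice")` on a hypothetical folder named `scans` would return the paths of the scanned PDFs whose OCR text mentions "invoice", which is the lookup the txt-to-PDF link is there to provide.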

However, at http://www.lvpaperless.com/i-pi.htm there is a program that claims to search on the basis of word “shapes”, so it can search scanned PDFs that haven’t been OCR’d.

Still, at $1000 you would need to be doing a lot of this sort of thing to make it worth the money!

Graham

 



© 2006-2025 Pixicom - Some Rights Reserved.