Text from Image

Thread Starter

Eric007

Joined Aug 5, 2011
1,154
Hi All,

It's been a very long time since I was last here. Hope everyone is doing alright.

I am working on a project where I need to extract text from an image taken with a phone camera...
I did some research and found Tesseract (which is free). I implemented it, but it does not work very well.

Does anyone know of a good algorithm or library I can implement in my Android project?

Thanks in advance for your responses.

Eric
 

atferrari

Joined Jan 6, 2004
3,954
Hola Eric,

From my (very) limited experience with OCR apps on my mobile, most tend not to recognize the boundaries of what should be converted to text.

Depending on the quality of the images, I got a rather high number of errors.

I finally uninstalled them all.

If you get one working reasonably well, please post it here.
 

jpanhalt

Joined Jan 18, 2008
9,347
Many years ago, we used OCR to convert thousands of pages of manuals and other things to digital. I used Omnipage for my personal work. I don't recall what we used at work. I believe it was part of a much larger system. The software we used for work was pretty good and also expensive. For my personal stuff, Omnipage was OK. Everything was technical, so we built libraries to avoid spellcheck errors. Alternatively, just turn off spellcheck autocorrect.

If you search on OCR, you will find a lot of options, including Omnipage.
 

dl324

Joined Mar 30, 2015
10,705
If anyone has a code I can buy, please let me know.
My HP All-in-One can do OCR on images with at least 300 ppi resolution. The standalone OCR is hpDocCvt.exe. I've found conversion accuracy to be fairly good; I scanned dozens of documents and the error rate wasn't worth worrying about.

Not having the ability to edit PDFs (the OCR can do text only), I bought a program (Nuance?) to convert PDFs to .doc files. I don't recall any issues with it maintaining page layout.
 

MrSoftware

Joined Oct 29, 2013
1,775
Taking a stab in the dark, but maybe running your image through some filters first would help? Maybe convert it to black and white or grayscale, or try to normalize the colors, etc. Then run your OCR on it?
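
A minimal sketch of that filtering idea, in case it helps. The assumptions are mine: the image is a plain list of rows of (r, g, b) tuples, and the black-and-white step uses Otsu's method to pick the cutoff, which is one common choice and not necessarily the best one for phone photos. The function names are made up for illustration; on Android the same arithmetic would be applied to the pixel values read from the photo before handing the result to the OCR engine.

```python
def to_grayscale(rgb_rows):
    """Convert rows of (r, g, b) tuples to rows of 0-255 luminance values."""
    return [
        [int(round(0.299 * r + 0.587 * g + 0.114 * b)) for (r, g, b) in row]
        for row in rgb_rows
    ]


def otsu_threshold(gray_rows):
    """Pick the cutoff that maximizes between-class variance (Otsu's method)."""
    hist = [0] * 256
    for row in gray_rows:
        for value in row:
            hist[value] += 1

    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    count_dark = 0   # pixels at or below the candidate cutoff
    sum_dark = 0     # sum of their intensities
    for t in range(256):
        count_dark += hist[t]
        sum_dark += t * hist[t]
        if count_dark == 0:
            continue          # no dark class yet
        count_light = total - count_dark
        if count_light == 0:
            break             # no light class left
        mean_dark = sum_dark / count_dark
        mean_light = (sum_all - sum_dark) / count_light
        between_var = count_dark * count_light * (mean_dark - mean_light) ** 2
        if between_var > best_var:
            best_var, best_t = between_var, t
    return best_t


def binarize(gray_rows, threshold):
    """Map pixels above the threshold to white (255), the rest to black (0)."""
    return [[255 if v > threshold else 0 for v in row] for row in gray_rows]


def preprocess_for_ocr(rgb_rows):
    """Grayscale, then black-and-white, ready to pass to an OCR engine."""
    gray = to_grayscale(rgb_rows)
    return binarize(gray, otsu_threshold(gray))


if __name__ == "__main__":
    # Tiny hypothetical image: dark "ink" on a light background.
    ink, paper = (10, 10, 10), (240, 240, 240)
    image = [[paper, ink, paper], [ink, paper, ink]]
    for row in preprocess_for_ocr(image):
        print(row)
```

A single global cutoff like this tends to struggle with uneven lighting or shadows, which are common in phone photos, so a locally adaptive threshold may be worth trying if the result is still poor.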
 