Check my previous post to see what OCR is:
http://www.dotnetissues.com/2011/09/ocr-in-c-using-googles-tessnet2-fetch.html
In this post I am going to use MODI (Microsoft Office Document Imaging). In my tests it gave 100% correct results, and it handled digits correctly too.
1.) Install MS Office 2007, which includes MODI.
2.) In Visual Studio, go to Add Reference --> COM and select the Microsoft Office Document Imaging library.
3.) Below is the sample code to extract text from an image using MODI:
using System;
using MODI;

namespace TesseractConsole
{
    class Program
    {
        static void Main(string[] args)
        {
            // Load the image into a MODI document.
            DocumentClass doc = new DocumentClass();
            doc.Create(@"C:\Documents and Settings\lak\Desktop\quotes_7a.jpg");

            // Run OCR in English, with auto-orientation and image straightening enabled.
            doc.OCR(MiLANGUAGES.miLANG_ENGLISH, true, true);

            // Print the recognized text of every page image in the document.
            foreach (MODI.Image image in doc.Images)
            {
                Console.WriteLine(image.Layout.Text);
            }

            // Close the document without saving changes.
            doc.Close(false);
        }
    }
}
I got a 100% correct result for this image and found MODI better than Google's tessnet2.
Excellent tips. Really useful stuff. I never had any idea about this, and I will look for more such informative posts from you. Good job, keep it up.