ASP.NET Issues: OCR in C# using MODI - Microsoft Office Document Imaging (Fetch Text From Image in C#)

September 13, 2011

See my previous post for an explanation of what OCR is:
http://www.dotnetissues.com/2011/09/ocr-in-c-using-googles-tessnet2-fetch.html

In this post I am going to use MODI (Microsoft Office Document Imaging). In my tests it gave 100% correct results, and it handled digits correctly too.

1.) Install MS Office 2007 (the version I used; MODI comes with it).
2.) Go to Add Reference --> COM and select the Microsoft Office Document Imaging library.
3.) Below is the sample code for extracting text from an image using MODI:

using System;
using MODI;

namespace TesseractConsole
{
    class Program
    {
        static void Main(string[] args)
        {
            // Load the image file into a MODI document.
            DocumentClass doc = new DocumentClass();
            doc.Create(@"C:\Documents and Settings\lak\Desktop\quotes_7a.jpg");

            // Run OCR in English; the two flags let MODI auto-rotate and straighten the image.
            doc.OCR(MiLANGUAGES.miLANG_ENGLISH, true, true);

            // Each page of the document is exposed as an Image; print its recognized text.
            foreach (MODI.Image image in doc.Images)
            {
                Console.WriteLine(image.Layout.Text);
            }

            // Release the file without saving the OCR results back into it.
            doc.Close(false);
        }
    }
}
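
For use beyond a one-off console test, the same calls can be wrapped in a helper that always closes the document and reports OCR failures instead of crashing. The sketch below is only an illustration: the names OcrHelper and ExtractText are made up, it assumes the same MODI COM reference as the sample, and it assumes that OCR failures (for example an image with no recognizable text) surface as a COMException.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Text;

static class OcrHelper
{
    // Hypothetical helper: returns the recognized text of every page in the
    // image file, or null if MODI reported an error.
    public static string ExtractText(string imagePath)
    {
        // Using the Document interface instead of DocumentClass also compiles
        // when the interop types are embedded into the assembly.
        MODI.Document doc = new MODI.Document();
        bool opened = false;
        try
        {
            doc.Create(imagePath);
            opened = true;
            doc.OCR(MODI.MiLANGUAGES.miLANG_ENGLISH, true, true);

            StringBuilder text = new StringBuilder();
            foreach (MODI.Image image in doc.Images)
            {
                text.AppendLine(image.Layout.Text);
            }
            return text.ToString();
        }
        catch (COMException ex)
        {
            // Assumption: MODI signals load and OCR failures through COM errors.
            Console.Error.WriteLine("OCR failed: " + ex.Message);
            return null;
        }
        finally
        {
            // Close only a document that was actually opened, then drop the COM reference.
            if (opened) doc.Close(false);
            Marshal.ReleaseComObject(doc);
        }
    }
}
```

It would be called as string text = OcrHelper.ExtractText(path); followed by a null check before using the text.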

The result was 100% correct for my test image, and I found MODI better than Google's tessnet2.
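
To put a number on a claim like this, the OCR output can be compared with a hand-typed copy of the text in the image. The sketch below is one possible way to do that, not something from the original test: it computes a character-level accuracy from the edit distance between the expected and the recognized text. The class and method names (OcrAccuracy, EditDistance, CharacterAccuracy) are invented for this example.

```csharp
using System;

static class OcrAccuracy
{
    // Edit distance by dynamic programming: the minimum number of single-character
    // insertions, deletions and substitutions needed to turn a into b.
    public static int EditDistance(string a, string b)
    {
        int[] prev = new int[b.Length + 1];
        int[] curr = new int[b.Length + 1];
        for (int j = 0; j <= b.Length; j++) prev[j] = j;

        for (int i = 1; i <= a.Length; i++)
        {
            curr[0] = i;
            for (int j = 1; j <= b.Length; j++)
            {
                int cost = a[i - 1] == b[j - 1] ? 0 : 1;
                curr[j] = Math.Min(Math.Min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            // Reuse the two rows instead of allocating a full matrix.
            int[] tmp = prev; prev = curr; curr = tmp;
        }
        return prev[b.Length];
    }

    // Fraction of the expected text that was recognized correctly, from 0.0 to 1.0.
    public static double CharacterAccuracy(string expected, string recognized)
    {
        if (expected.Length == 0) return recognized.Length == 0 ? 1.0 : 0.0;
        int errors = EditDistance(expected, recognized);
        return Math.Max(0.0, 1.0 - (double)errors / expected.Length);
    }
}
```

For example, OcrAccuracy.CharacterAccuracy("Room 101", "Room 1O1") gives 0.875, because one of the eight characters (the zero read as a letter O) is wrong. OCR output often ends with extra line breaks, so it is probably worth calling Trim() on both strings before comparing them.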

1 comment:

Archive Storage, January 17, 2012 at 2:06 AM
Excellent tips. Really useful stuff. Never had an idea about this, will look for more of such informative posts from your side. Good job, keep it up.