This blog contains information on .NET and SQL. You can find various tips and tricks to overcome problems you may be facing in ...

Wednesday, January 5, 2011

OCR in .NET with the MS Office 2007 component (MODI)

I recently had a requirement to implement OCR in one of my .NET applications. There are very few options available in .NET for this, so I thought I would share what I learned along the way.

In .NET, OCR can be done with an MS Office component, the Microsoft Office Document Imaging library (MODI). MS Office must be installed on your PC before you develop or run this application. The component is available in both MS Office 2003 and 2007.

A default Office installation does not put the imaging library on the PC, so we need to install it explicitly. When you add or remove features in your Office setup, check that the Microsoft Office Document Imaging feature is included.

Once you have installed it, you can start to develop your application.

You can use any project type, either Windows or console. Next, add a reference to the Document Imaging library in your project; it is listed in the COM tab of the Add Reference dialog box. Please note that it will only be shown if the component was installed correctly. If you cannot see it even though you have installed it, try restarting your system and check again.

Once you have added that COM reference, you can work with images and read text from them.
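As a rough illustration of reading text from an image, here is a minimal console sketch. It assumes the COM reference generated an interop namespace called MODI, and the image path is a hypothetical placeholder you should replace with your own file. Treat it as a starting point rather than a finished implementation.

```csharp
using System;
using System.Runtime.InteropServices;

class OcrSample
{
    static void Main()
    {
        // Hypothetical path: point this at your own scanned image (TIFF works best).
        string imagePath = @"C:\temp\sample.tif";

        MODI.Document doc = new MODI.Document();
        try
        {
            // Load the image file into the MODI document.
            doc.Create(imagePath);

            // Run OCR in English, letting MODI auto-orient and straighten the page.
            doc.OCR(MODI.MiLANGUAGES.miLANG_ENGLISH, true, true);

            // The recognized text of the first page is exposed through its layout.
            MODI.Image page = (MODI.Image)doc.Images[0];
            Console.WriteLine(page.Layout.Text);
        }
        catch (COMException ex)
        {
            // MODI reports failures (for example, no recognizable text) as COM errors.
            Console.WriteLine("OCR failed: " + ex.Message);
        }
        finally
        {
            // Close without saving changes and release the COM object.
            doc.Close(false);
            Marshal.ReleaseComObject(doc);
        }
    }
}
```

For a multi-page document you could loop from 0 to doc.Images.Count - 1 and read Layout.Text for each page in the same way.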