Friday, September 12, 2008

Optical Character Recognition: What's It All About?

Written by Brett McQueen

Optical character recognition, or "OCR", is a technology that converts digital images of typed or handwritten text into an editable text document. For example, if you connect a scanner to your computer and scan a handwritten essay into a digital image, you can use optical character recognition software to "see" and "grab" the text from that image, then edit the text on your computer in a program like Microsoft Word, Notepad, or TextEdit.

For the more "geeky" readers, optical character recognition takes the picture of text and translates the text into a character encoding such as Unicode or ASCII. From wikipedia.org, "Unicode is an industry standard designed to allow text and symbols from all of the writing systems of the world to be consistently represented and manipulated by computers". Also from wikipedia.org, "ASCII (American Standard Code for Information Interchange) is a character encoding based on the English alphabet." Basically, optical character recognition technology outputs text that computers can recognize and process.
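To make the idea of "text as Unicode or ASCII" concrete, here is a minimal Python sketch. It is not part of any OCR program; the sample strings are just illustrations. It shows that each character is stored as a number, and that ASCII characters keep the same numbers in Unicode.

```python
# Illustrative only: how characters (such as those an OCR program outputs)
# map to numeric codes that a computer can store and edit.
text = "OCR"

# Unicode code point of each character; for ASCII characters the
# Unicode value and the ASCII value are identical.
code_points = [ord(ch) for ch in text]

# ASCII covers only 128 characters, so this works for plain English letters.
ascii_bytes = text.encode("ascii")

# A character outside ASCII needs an encoding that covers Unicode, such as UTF-8.
utf8_bytes = "é".encode("utf-8")

print(code_points)        # [79, 67, 82]
print(list(ascii_bytes))  # [79, 67, 82]
print(list(utf8_bytes))   # [195, 169]
```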

Optical character recognition is built into many different software products, and software alone is considered a low-budget way to use the technology. More complex optical character recognition systems usually combine hardware and software.

While optical character recognition technology has become increasingly popular and research into it has intensified, recognition rates vary. For handwritten text, the recognition rate is 80% to 90% with clean handwriting. For cursive text, the recognition rate is considerably lower because cursive characters carry less distinguishing information.
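The article does not say how these recognition rates are measured. One common way, assumed here purely for illustration, is to compare the OCR output against the correct text character by character and count the edits needed to fix it. The function names and the sample strings below are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: fewest single-character insertions,
    deletions, or substitutions needed to turn a into b."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or keep if equal)
            ))
        previous = current
    return previous[-1]


def recognition_rate(truth: str, ocr_output: str) -> float:
    """Assumed definition: 1 - (character errors / length of the correct text)."""
    if not truth:
        return 1.0 if not ocr_output else 0.0
    errors = edit_distance(truth, ocr_output)
    return max(0.0, 1.0 - errors / len(truth))


# Hypothetical example: the letter "l" was misread as the digit "1".
rate = recognition_rate("clean text", "c1ean text")
print(f"{rate:.0%}")  # 90%
```

With this definition, one wrong character in a ten-character line gives 90%, which sits at the top of the range quoted for clean handwriting.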

As mentioned, optical character recognition is becoming increasingly popular, especially in areas of work that require massive amounts of printed documents to be sorted. The legal profession is one example. Optical character recognition greatly reduces the time required to sort these printed documents. The time saved can amount to days! The United States Post Office has also been using optical character recognition since 1965.

For a normal computer user who wants to be able to edit scanned documents of text, there are many different software solutions available. Some can translate both scanned handwritten and typed text, while others handle only scanned typed text. A couple of optical character recognition programs are available for free.

Some free ones are:

- SimpleOCR: http://www.simpleocr.com

- OCRAD: http://www.gnu.org/software/ocrad/ocrad.html
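If you would rather script one of the free tools than click through a program, something like the following Python sketch could call the OCRAD command-line program. It rests on several assumptions: that `ocrad` is installed and on your PATH, that the scanned page has already been saved in a PNM format (pbm, pgm, or ppm), and that your version accepts `-F utf8` to request UTF-8 output (check `ocrad --help`). The function names are made up for this example.

```python
import shutil
import subprocess


def build_ocrad_command(image_path: str) -> list[str]:
    """Build the command line for one image.

    Assumption: "-F utf8" selects UTF-8 output; verify with `ocrad --help`.
    """
    return ["ocrad", "-F", "utf8", image_path]


def ocr_with_ocrad(image_path: str) -> str:
    """Run OCRAD on a PNM image and return the recognized text."""
    if shutil.which("ocrad") is None:
        raise RuntimeError("ocrad is not installed or not on PATH")
    result = subprocess.run(
        build_ocrad_command(image_path),
        capture_output=True,  # collect the recognized text from stdout
        encoding="utf-8",     # decode stdout/stderr as text
        check=True,           # raise if ocrad exits with an error
    )
    return result.stdout


# Example call (the file name is hypothetical):
# print(ocr_with_ocrad("scanned_page.pbm"))
```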

If you are willing to pay for an optical character recognition program, expect to cough up some cash. These OCR programs cost in the area of $100. The following is a list of some vendors that sell them:

- http://www.abbyy.com

- http://www.nuance.com

- http://www.irislink.com

Check out the websites to see which software solution will work best for you. Many of these programs offer a short trial period. A trial is definitely worth checking out, as it lets you explore optical character recognition technology further.

Brett McQueen is an avid computer user, website designer, and website developer. You can learn about how to edit a scanned document at his blog dedicated to using scanned documents. He also runs a helpful HTML tutorial website that offers in-depth tutorials and HTML video tutorials.
