What Is Optical Character Recognition (OCR)?
Introduction to OmniPage Pro - 6
Optical character recognition (OCR) is the process of turning an image
into computer-editable text. An image is an electronic picture of text such
as a scanned paper document or an electronic fax file. Images do not have
editable text characters; they have many tiny dots (pixels) that together
form a picture of text.
During OCR, OmniPage Pro analyzes an image and defines characters
to produce editable text. This is also called recognizing text. After OCR,
you can export the recognized text to a variety of word-processing, page
layout, and spreadsheet applications.
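The idea of turning pixels into characters can be illustrated with a small
sketch. The example below is not how OmniPage Pro works internally; it is a
toy illustration that assumes every character is a 3-by-5 grid of pixels and
picks the closest match from a three-letter set. All names in it (TEMPLATES,
recognize_glyph, recognize_line, IMAGE) are hypothetical and chosen only for
this illustration.

```python
# Toy illustration only: real OCR engines are far more sophisticated.
# Each character is a 3x5 bitmap: '#' is a dark pixel, '.' is a light pixel.
TEMPLATES = {
    "I": ["###", ".#.", ".#.", ".#.", "###"],
    "L": ["#..", "#..", "#..", "#..", "###"],
    "T": ["###", ".#.", ".#.", ".#.", ".#."],
}


def matching_pixels(a, b):
    """Count the pixel positions at which two equally sized bitmaps agree."""
    return sum(
        pa == pb
        for row_a, row_b in zip(a, b)
        for pa, pb in zip(row_a, row_b)
    )


def recognize_glyph(glyph):
    """Return the template character whose bitmap agrees with the most pixels."""
    return max(TEMPLATES, key=lambda ch: matching_pixels(glyph, TEMPLATES[ch]))


def recognize_line(image_rows, glyph_width=3, gap=1):
    """Cut a one-line image into fixed-width cells and recognize each cell."""
    step = glyph_width + gap
    text = []
    for x in range(0, len(image_rows[0]), step):
        glyph = [row[x:x + glyph_width] for row in image_rows]
        text.append(recognize_glyph(glyph))
    return "".join(text)


# A tiny "scanned" image: a picture of text, not text itself.
IMAGE = [
    "#...###.###",
    "#....#...#.",
    "#....#...#.",
    "#....#...#.",
    "###.###..#.",
]

print(recognize_line(IMAGE))  # prints "LIT"
```

The image starts out as rows of dots with no editable characters in it; only
after each cell is compared with the known shapes does the program have a
text string that could be edited or exported.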
About OmniPage Pro OCR
In addition to text, OmniPage Pro can retain the following elements in a
document during OCR.
Photos, logos, and drawings are examples of graphics.
Font types, font sizes, and font styles (such as bold or italic) are
examples of text formatting.
Column structure, paragraph spacing, and placement of graphics are
examples of page formatting.
OmniPage Pro recognizes printed text characters only. However, it can
retain handwritten text, such as a signature, as a graphic element.
The graphics, text formatting, and page formatting elements that
OmniPage Pro retains depend on the settings you select for your
document before OCR. See Chapter 4, OmniPage Pro Settings, for more
information.