What Is Optical Character Recognition (OCR)?
Optical character recognition (OCR) is the process of turning an image
into computer-editable text. An image is an electronic picture of text
such as a scanned paper document or an electronic fax file. Images do
not have editable text characters; they have many tiny dots (pixels)
that together form a picture of text.
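To make the difference between a picture of text and editable text concrete, here is a minimal Python sketch. The 5x5 grid `LETTER_T` and the helper names are hypothetical, chosen only for illustration; real scanned images are far larger and are not stored this way by OmniPage Pro.

```python
# A 5x5 bitmap of the letter "T": 1 = dark pixel, 0 = light pixel.
# This grid is a made-up example, not data from a real scan.
LETTER_T = [
    [1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
]


def render(bitmap):
    """Return the bitmap as lines of '#' and '.' so the picture is visible."""
    return "\n".join(
        "".join("#" if pixel else "." for pixel in row) for row in bitmap
    )


def count_dark_pixels(bitmap):
    """Count the dark pixels; the image stores dots, not character codes."""
    return sum(sum(row) for row in bitmap)


if __name__ == "__main__":
    print(render(LETTER_T))
    print("dark pixels:", count_dark_pixels(LETTER_T))
```

A person sees the letter T in the rendered grid, but the data holds only dots. Nothing in it says "T" until a recognition step interprets the pattern.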
During OCR, OmniPage Pro analyzes an image and defines characters
to produce editable text. After OCR, you can export the resulting text to
a variety of word-processing, page layout, and spreadsheet
applications.
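The idea of analyzing pixels to produce characters can be sketched with simple template matching. This is only an illustration under stated assumptions: the manual does not describe the recognition method OmniPage Pro uses, and the `TEMPLATES` table and function names below are hypothetical.

```python
# Toy character recognition by template matching.
# Each glyph is a 3x3 picture: '#' = dark pixel, '.' = light pixel.
# The templates are invented for this example.
TEMPLATES = {
    "T": ["###", ".#.", ".#."],
    "L": ["#..", "#..", "###"],
    "I": [".#.", ".#.", ".#."],
}


def similarity(glyph, template):
    """Count the pixel positions where the glyph and the template agree."""
    return sum(
        g == t
        for glyph_row, template_row in zip(glyph, template)
        for g, t in zip(glyph_row, template_row)
    )


def recognize(glyph):
    """Return the character whose template best matches the glyph."""
    return max(TEMPLATES, key=lambda ch: similarity(glyph, TEMPLATES[ch]))


def recognize_line(glyphs):
    """Turn a sequence of glyph pictures into an editable string."""
    return "".join(recognize(glyph) for glyph in glyphs)


if __name__ == "__main__":
    noisy_t = ["###", ".#.", "##."]  # a "T" with one stray dark pixel
    print(recognize(noisy_t))
    print(recognize_line([TEMPLATES["L"], TEMPLATES["I"], TEMPLATES["T"]]))
```

The output of `recognize_line` is an ordinary string, which is what makes the result editable and exportable to other applications.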
OmniPage Pro OCR
In addition to text recognition, OmniPage Pro can retain the following
elements of a document during OCR.
Photos, logos, and drawings are examples of graphics.
Font types, font sizes, and font styles (such as bold or italic) are
examples of text formatting.
Column structure, paragraph spacing, and placement of graphics are
examples of page formatting.
The graphics, text formatting, and page formatting elements that
OmniPage Pro retains are determined by the settings you select. See
“Settings Guidelines” on page 54 for more information.
OmniPage Pro recognizes only machine-printed characters, such as
laser-printed or typewritten text. However, it can retain handwritten
text, such as a signature, as a graphic.