
(1) (2)
The Process
Converting printed documents into computer-editable documents is fairly
straightforward. Upon starting TypeReader, I select my scanner, a Mustek
1200 LP (approx. cost $100). I click on the Auto Straighten and Auto
Orientation buttons and use the default settings for the rest.
I lay the book on my flatbed scanner. I then choose From Scanner under
the Get Page button and select Auto Start. My scanning software
starts, and there I finalize my scan settings. Based on what I've read
and some initial testing (I started at 200 dpi), I chose 300 dpi and
selected black & white. As long as you have good-quality documents
(i.e., clean, readable text), these settings will be sufficient.
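To get a feel for what the jump from 200 dpi to 300 dpi means, here is a rough Python sketch of the arithmetic. The page size (US Letter, 8.5 x 11 inches) and the function name scan_size are my own assumptions for illustration; they are not anything TypeReader or the scanner software reports, and real files will differ because of compression and row padding.

```python
def scan_size(width_in, height_in, dpi, bits_per_pixel=1):
    """Return (pixels_wide, pixels_high, uncompressed_bytes) for a scan.

    bits_per_pixel=1 corresponds to a black & white (1-bit) scan;
    24 would correspond to full color.
    """
    pixels_wide = round(width_in * dpi)
    pixels_high = round(height_in * dpi)
    total_bits = pixels_wide * pixels_high * bits_per_pixel
    # Round up to whole bytes; real file formats may pad rows or compress.
    return pixels_wide, pixels_high, (total_bits + 7) // 8


# Assumed page size: US Letter, 8.5 x 11 inches (not stated in the article).
for dpi in (200, 300):
    wide, high, size = scan_size(8.5, 11, dpi)
    print(f"{dpi} dpi: {wide} x {high} pixels, "
          f"about {size / 1024:.0f} KB uncompressed")
```

Going from 200 to 300 dpi gives the OCR engine 2.25 times as many pixels per character, which is likely why the higher setting reads small type more reliably, at the cost of a larger image.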
[Image: Converted text w/errors highlighted]
[Image: Property window reveals possible errors]
I then scan the document, and it is immediately sent to TypeReader.
There the image is broken up into sections, and the software converts
those sections into computer-editable text. You then have the option
of exporting this new document in a number of formats, including
.html, .rtf, .doc, and others. I tried exporting as .html, but I did
not like the result: it created lots of tables. I ended up saving my
files as .rtf. I then opened the files in Word and ran a spell check,
saved them as .html, and imported them into our Dreamweaver template.
I scanned the images separately (72 dpi, color) and imported them
after cropping them in Photoshop.
99 percent?
Is OCR 99% accurate? The answer is: pretty close. One of the pluses of
TypeReader 6.0 is the 'Properties' window that pops up, revealing the number
of 'suspect' and 'illegible' characters on the page. I did have to spend a
little time correcting mistakes. Most come from converting certain kinds of
type: italicized fonts, quotation marks, and some periods and commas were
the usual suspects that I had to fix. But I was quite impressed with the speed
and accuracy of the process.
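For a sense of what "99%" means in practice, the short Python sketch below turns character counts into a percentage and back. The numbers are hypothetical (a 2,000-character page with 20 flagged characters), and the function names are mine. Keep in mind that flagged 'suspect' characters are only a stand-in for real errors: some flagged characters will be correct, and some mistakes will not be flagged at all.

```python
def estimated_accuracy(total_chars, flagged_chars):
    """Percentage of characters NOT flagged as suspect or illegible."""
    if total_chars <= 0:
        raise ValueError("total_chars must be positive")
    return 100.0 * (total_chars - flagged_chars) / total_chars


def expected_errors(total_chars, accuracy_percent):
    """Roughly how many characters need checking at a given accuracy."""
    return round(total_chars * (100.0 - accuracy_percent) / 100.0)


# Hypothetical page: 2,000 characters, 20 of them flagged.
print(estimated_accuracy(2000, 20))
# At 99% accuracy, the same page would need about this many fixes.
print(expected_errors(2000, 99))
```

Even at 99%, a full page of text still means a couple dozen characters to check by hand, which matches my experience of spending a little time on corrections.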
Serious OCR
If you are seriously considering using OCR to convert many documents, get yourself
a fast flatbed scanner with an auto document feeder. Of course, you cannot
run a book through an auto feeder, and a feeder will add several hundred dollars
to the price of your scanner.
Final Thoughts
OCR is a great technology that has come a long way in the last few years. It
has sure saved me a ton of work. It could do the same for you.