OCR Accuracy Level
A question that comes up often in our engagements is: “What is your typical field acceptance rate and OCR accuracy level for marks, characters, and handwritten text (also red dropout vs. non-dropout)? Does your software do boilerplate dropout?”
Accuracy always depends on the quality and type of the original document. My rule of thumb is that if you can’t see it clearly with your own eyes, OCR will not do a good job with it either. The better the image quality, the higher our accuracy will be. That said, we employ a voting engine that will catch most mistakes.
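How the voting engine works internally is not described here. As a rough illustration only, one common way to vote across several OCR engines is to take the reading that most engines agree on and flag low-agreement fields for manual review. The function names, the 0.6 threshold, and the sample values below are all hypothetical and are not taken from our product.

```python
from collections import Counter


def vote(candidates):
    """Return the most common reading and the share of engines that agreed on it.

    `candidates` is a list of strings, one per OCR engine (hypothetical input).
    """
    if not candidates:
        raise ValueError("need at least one engine result")
    counts = Counter(candidates)
    winner, votes = counts.most_common(1)[0]
    return winner, votes / len(candidates)


def needs_review(candidates, min_agreement=0.6):
    """Flag a field for manual review when agreement is below the threshold.

    The 0.6 default is an illustrative assumption, not a product setting.
    """
    _, agreement = vote(candidates)
    return agreement < min_agreement


# Three engines read the same field; two of them agree.
readings = ["INV-1001", "INV-1001", "INV-I001"]
best, agreement = vote(readings)
```

In this sketch, a field on which the engines split evenly would fall below the threshold and be routed to an operator rather than accepted automatically.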
These figures assume a standard 300 dpi TIFF or an electronically generated PDF, or an image to which processing such as red dropout was applied at scan time. In testing on text-based documents, we typically see upwards of 90% recognition accuracy (that is, at least 90 out of every 100 words and marks in the extracted metadata fields are read correctly). On “clean,” properly registered documents, this figure rises to upwards of 95%.
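To make the percentage concrete, here is a minimal sketch of how such a figure could be computed from a test batch: compare each extracted value against a manually verified value and report the share that match. The function name and the sample data are hypothetical, and exact string matching is an assumption; a real test may normalize case or whitespace first.

```python
def field_accuracy(expected, extracted):
    """Percentage of extracted values that exactly match the verified values."""
    if len(expected) != len(extracted):
        raise ValueError("expected and extracted must be the same length")
    if not expected:
        raise ValueError("need at least one value to score")
    correct = sum(e == x for e, x in zip(expected, extracted))
    return 100.0 * correct / len(expected)


# Ten verified values; the engine misreads one of them ("O" instead of "0").
verified = ["2024", "ACME", "1001", "NY", "X", "42", "PAID", "USD", "07", "A0"]
ocr_out = ["2024", "ACME", "1001", "NY", "X", "42", "PAID", "USD", "07", "AO"]
accuracy = field_accuracy(verified, ocr_out)
```

Here nine of the ten values match, so the batch scores 90%.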
For handwriting recognition (ICR), we typically see upwards of 75% accuracy when the handwriting is constrained, structured block lettering (for example, comb fields).
Cursive handwriting recognition generally delivers upwards of 50% accuracy, and it improves with use as the system learns and is trained.
Boilerplate dropout: all image processing settings are global. If required, separate workflows with their own settings can be designed for document sets identified as coming from a specific destination.
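The idea of global settings plus separate workflows can be pictured as a set of defaults with per-document-set overrides. The sketch below is illustrative only: the setting names, the `claims_forms` document set, and the `settings_for` helper are hypothetical and do not reflect our actual configuration format.

```python
# Global image processing defaults (hypothetical names and values).
GLOBAL_SETTINGS = {
    "dpi": 300,
    "red_dropout": False,
    "boilerplate_dropout": False,
}

# Overrides for a separately designed workflow, keyed by document set.
WORKFLOW_OVERRIDES = {
    "claims_forms": {"red_dropout": True, "boilerplate_dropout": True},
}


def settings_for(document_set):
    """Return the global settings with any workflow-specific overrides applied."""
    overrides = WORKFLOW_OVERRIDES.get(document_set, {})
    return {**GLOBAL_SETTINGS, **overrides}


claims = settings_for("claims_forms")
other = settings_for("general_mail")
```

A document set without its own workflow simply receives the global settings unchanged.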
Whose engines do you use? Do you have a proprietary engine?
We employ two main OCR engines: Nuance OmniPage and OpenText RecoStar.
The A2iA recognition engines are available as an option. We can also license other third-party engines as required.
Call Today to Discuss Our OCR Accuracy Rate
Here at OCR Solutions, we are as transparent as possible when it comes to our products. We can explain in more detail how accurate our software is and answer any other questions you may have. Give us a call today!