Statistical Reporting is a key to success for any document capture system

by | Jan 22, 2020

Your data automation process is almost perfect: your invoices are parsed and cataloged into relational databases, and all of your vital information is at your fingertips. There is still one question you should be asking yourself, though: are you verifying that data extraction tasks actually succeed? Any capture system used in an enterprise needs good reporting, not only to confirm successful extraction but also to track productivity, monthly usage, accuracy, and even server health.

Throughput Reporting

Your organization may handle thousands of invoices, contracts, orders, and other documents on any given day. Throughput reporting gives you a bird's-eye view of how these documents are being handled and processed. You might need to know your average batch throughput during peak hours or on weekdays. If throughput trends below your norm, that might explain why some payments are delayed or other business functions are slowing down.

You might even want to use statistical reporting to gauge throughput for a single user as part of an employee or productivity review. The bottom line is that throughput reporting can be customized to your business needs, whether it is based on document class, time (hours/days/weeks/months), module, or user.
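To make the idea concrete, here is a minimal sketch of grouping a processing log by hour, user, or document class. The log layout, the sample records, and the function names are all assumptions made up for illustration; a real capture system would expose its own data.

```python
from collections import Counter
from datetime import datetime

# Hypothetical processing log: (timestamp, user, document class, pages).
# The layout and values are illustrative only.
LOG = [
    ("2020-01-20 09:15", "alice", "invoice", 3),
    ("2020-01-20 09:40", "bob", "contract", 12),
    ("2020-01-20 10:05", "alice", "invoice", 2),
    ("2020-01-21 09:30", "alice", "order", 1),
]

def throughput(log, key):
    """Count documents and pages per group; `key` maps a record to a group label."""
    docs, pages = Counter(), Counter()
    for record in log:
        group = key(record)
        docs[group] += 1
        pages[group] += record[3]
    return {g: {"documents": docs[g], "pages": pages[g]} for g in docs}

def by_hour(record):
    # Hour of day (0-23) parsed from the timestamp string.
    return datetime.strptime(record[0], "%Y-%m-%d %H:%M").hour

def by_user(record):
    return record[1]

def by_class(record):
    return record[2]

if __name__ == "__main__":
    for label, key in (("hour", by_hour), ("user", by_user), ("class", by_class)):
        print(label, throughput(LOG, key))
```

Swapping the key function is all it takes to move from a peak-hour view to a per-user or per-document-class view of the same log.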

Document Correction

Out of the box, most OCR software works pretty well. Once your capture rules are configured and you start processing thousands of documents with no problem, you feel pretty invincible. However, one of the biggest mistakes you can make is assuming that the extracted data is accurate.

OCR has been around for decades and is one of the most popular ways to convert physical documents into searchable text. Its accuracy is often quoted at 98 to 99 percent. That sounds great, but on a page of 1,000 characters it means roughly 10 to 20 will be incorrect. In many cases this is acceptable, but it might not be for your operation.
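The arithmetic behind a quoted accuracy figure is easy to check. The sketch below is only illustrative: the function names are invented here, and the field-level estimate assumes character errors are independent, which real OCR output does not strictly obey.

```python
def expected_errors(char_count, accuracy):
    """Expected number of misread characters at a given character accuracy."""
    return round(char_count * (1 - accuracy))

def field_success_rate(field_length, accuracy):
    """Chance that every character in a field is read correctly,
    assuming (simplistically) that character errors are independent."""
    return accuracy ** field_length

if __name__ == "__main__":
    for accuracy in (0.98, 0.99):
        print(
            accuracy,
            expected_errors(1000, accuracy),
            round(field_success_rate(10, accuracy), 3),
        )
```

Under that independence assumption, even 99 percent character accuracy means a 10-character invoice number comes through fully correct only about 90 percent of the time, which is why field-level checks matter.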

Statistical reporting through advanced reports lets you implement things such as confidence scoring, whose main objective is to find a threshold that separates bad data from good data. A high confidence score means a higher likelihood that the data is correct. Reporting can also help you identify false positives and review data that is tagged “bad”. Field correction is also an option.
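A confidence threshold can be sketched in a few lines. Everything here is hypothetical: the `Field` record, the sample values, and the 0.90 cutoff are assumptions for illustration, and "false positive" is taken to mean a value that scored above the threshold but turned out to be wrong on manual verification.

```python
from dataclasses import dataclass

@dataclass
class Field:
    name: str
    value: str
    confidence: float  # 0.0 to 1.0, as reported by a hypothetical OCR engine
    correct: bool      # ground truth from manual verification

# Illustrative sample data only.
FIELDS = [
    Field("invoice_number", "INV-1023", 0.98, True),
    Field("total", "450.00", 0.95, False),
    Field("vendor", "Acrne Corp", 0.72, False),
    Field("date", "2020-01-22", 0.85, True),
]

def split_by_threshold(fields, threshold):
    """Separate fields accepted automatically from those sent to review."""
    accepted = [f for f in fields if f.confidence >= threshold]
    review = [f for f in fields if f.confidence < threshold]
    return accepted, review

def false_positives(fields, threshold):
    """Fields accepted automatically that were actually wrong."""
    return [f for f in fields if f.confidence >= threshold and not f.correct]

if __name__ == "__main__":
    accepted, review = split_by_threshold(FIELDS, 0.90)
    print("accepted:", [f.name for f in accepted])
    print("review:", [f.name for f in review])
    print("false positives:", [f.name for f in false_positives(FIELDS, 0.90)])
```

Raising the threshold sends more fields to manual review but lets fewer bad values slip through, so the right cutoff depends on how costly an undetected error is for your operation.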

System Health

Advanced reporting gives you insight not only into employee productivity and how efficiently staff verify document information, but also into the health of the system itself. The system reports errors at the character level per document type. If a document type is not well configured in the system, our advanced reporting module will give you immediate insight into your operation and allow you to fix the exact issue that is causing the problem.
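As a rough illustration of character-level error reporting per document type, the sketch below compares OCR output against the value an operator verified. The sample records and function names are assumptions, and the character comparison uses Python's standard `difflib` as an approximation rather than any particular product's method.

```python
from collections import defaultdict
from difflib import SequenceMatcher

# Hypothetical verification records: (document type, OCR value, corrected value).
SAMPLES = [
    ("invoice", "INV-1O23", "INV-1023"),
    ("invoice", "450.00", "450.00"),
    ("contract", "Acrne Corp", "Acme Corp"),
]

def char_errors(ocr_text, verified_text):
    """Approximate count of characters that differ between OCR and verified text."""
    matcher = SequenceMatcher(None, ocr_text, verified_text, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return max(len(ocr_text), len(verified_text)) - matched

def error_rate_by_type(samples):
    """Character error rate per document type, relative to verified text length."""
    errors, chars = defaultdict(int), defaultdict(int)
    for doc_type, ocr_text, verified_text in samples:
        errors[doc_type] += char_errors(ocr_text, verified_text)
        chars[doc_type] += len(verified_text)
    return {t: errors[t] / chars[t] for t in chars if chars[t]}

if __name__ == "__main__":
    for doc_type, rate in error_rate_by_type(SAMPLES).items():
        print(f"{doc_type}: {rate:.1%}")
```

A document type whose error rate stands out from the rest is a natural first place to look for a capture rule that needs adjusting.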

