Using OCR For Efficient Vendor Invoice Management

by | Jan 8, 2018

In the late 1990s and early 2000s, the advent of the “electronic office” swept through the business world. Because there were many different applications and approaches, efforts to digitize paperwork were piecemeal and varied from business to business. A major concern was streamlining vendor invoice management.

With the arrival of high-speed sheet-fed scanners, companies realized that they could convert large amounts of paper to electronic files. Accounts Payable (AP) departments, in particular, scanned vendor invoices to manage the paper flow they dealt with daily. By scanning documents only at the back end of the process, however, they missed out on the considerable cost savings, faster workflow, and improved reporting (accuracy, compliance, etc.) available when scanning is combined with Optical Character Recognition (OCR) software.

In today’s evolving work environment, many payments are handled electronically. Large corporations recognize that it costs more to process paper invoices than automated ones, and many companies are integrating OCR technology into their processes.

Solution providers know this cost gap and see the powerful potential of OCR/Data Capture solutions. The stumbling block has been finding the right combination of artificial intelligence (AI) algorithms to read the many different invoice formats from various vendors with little human intervention. OCR Solutions as a company has been successfully implementing data capture systems for various industries for over 10 years. To handle varied invoice formats, it first had to drop the template-based approach and develop what are called document definitions.

Templates extract data only if they can find it in the same location every time. Definitions are more powerful: they look for hints on the page, such as field headers, to locate information. For example, if the field “PO Number” appears at the top of the page on Invoice A and at the bottom on Invoice B, the system scans each invoice for the term “PO Number” and its variants (PO, PO#, PO Num., Purchase Order, etc.) and extracts the proper result. AP departments can now send those piles of invoices straight through the scanners, read the information, and extract it directly into their accounting software database without hours of manual processing, while also greatly reducing incorrectly entered data.
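To make the difference between templates and definitions concrete, here is a minimal sketch in Python of label-based lookup. It is not the way any particular capture product works; the label list, the function name `extract_po_number`, and the sample invoice text are all assumptions made up for illustration. The idea is simply to search the recognized page text for any known variant of a field header and take the value that follows it, wherever on the page it appears.

```python
import re

# Hypothetical label variants for the purchase-order field. A real document
# definition would carry a list like this for every field it extracts.
PO_LABELS = ["Purchase Order Number", "Purchase Order", "PO Number",
             "PO Num.", "PO No.", "PO#", "PO"]


def build_label_pattern(labels):
    """Compile a regex that finds any label followed by its value."""
    # Try longer labels first so "PO Number" is preferred over plain "PO".
    ordered = sorted(labels, key=len, reverse=True)
    alternatives = "|".join(re.escape(label) for label in ordered)
    return re.compile(
        r"(?<![A-Za-z])(?:" + alternatives + r")(?![A-Za-z])"  # the label itself
        r"\s*[:#\-]?\s*"                                       # optional separator
        r"([A-Z0-9\-]*\d[A-Z0-9\-]*)",                         # value with a digit
        re.IGNORECASE,
    )


PO_PATTERN = build_label_pattern(PO_LABELS)


def extract_po_number(page_text):
    """Return the PO number found anywhere in the page text, or None."""
    match = PO_PATTERN.search(page_text)
    return match.group(1) if match else None


# Two made-up invoices with the field in different places and under
# different labels, standing in for OCR output.
invoice_a = "ACME Supplies\nPO Number: 45001\nInvoice Date: 01/08/2018\nTotal: 250.00"
invoice_b = "Globex Corp\nInvoice 7781\nWidgets 10 x 2.50\nPurchase Order # GX-2231"

print(extract_po_number(invoice_a))
print(extract_po_number(invoice_b))
```

Requiring a digit in the value is a design choice for this sketch: it keeps a phrase like “PO Box” from being mistaken for a purchase order label followed by its number.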

Front-End Data Capture

The benefits of front-end data capture have been demonstrated many times over. Many companies have moved to convert static (a.k.a. “dead”) paper invoices into interactive (a.k.a. “live”) data streams to improve efficiency in operations, management control, and compliance with requirements such as Sarbanes-Oxley reporting. Apart from persuading vendors to transmit electronic invoices, adopting OCR-based data capture to automate workflow is a top priority in AP departments today.

When Optical Character Recognition (OCR) software is paired with imaging tools for data capture, invoices in a variety of formats can be scanned and their data collected. Today, most advanced capture systems automatically adjust to read line-item detail on any invoice. A good OCR capture system will automatically capture invoice tables whether they fit on one page or extend to 100 pages. This is the power of an advanced capture system: it removes the manual effort that was needed in the past to capture this data. Once digitized, the information from the forms goes into accounting databases and is linked to the vendor account. These electronic documents can be stored and accessed on a single computer, in the cloud, or in an integrated enterprise accounting database shared across multiple offices.

You would think that with all of the technological marvels in today’s world, from spacecraft to lasers to self-driving cars, getting a computer to read paper invoices and accurately convert them to electronic data would be easy. As it turns out, the AI required to locate the data on a variety of forms or invoices, interpret it, and convert the printed characters into digital information for different accounting software databases is harder to manage than the ballistic trajectory of a rocket! Companies that hesitate to adopt OCR/Data Capture often cite concerns over the accuracy of data extraction and the lack of a budget for equipment and software. Once they compare the cost of manual processing with the cost of implementing a capture system, however, most companies now understand that capture will pay for itself within a short time.

The Data Capture Process

OCR and Data Capture solutions are oriented toward eliminating the manual processing of paper. Data Capture solution processes can vary in their approach, and not all of them function equally at each step. So, this is an area to concentrate on when selecting a particular solution. In general, Data Capture with OCR involves:

  • Scanning – Paper invoices are scanned remotely or at a central office.
  • Data Extraction – Invoice information is extracted and placed in the proper fields.
  • Document Classification – The document is identified as an invoice or other form and processed, integrating with existing document management, ERP, and accounting systems for data storage and retrieval.
  • Automatic/Manual Indexing – Once the form is classified, data is extracted and transmitted from document form fields to database form fields. This is simplified by standard invoice templates, but advanced AI also identifies the correct form fields in a variety of formats. A small percentage of data that can’t be read is left for manual indexing.
  • Data Validation – The system checks the data to make sure it falls within the programmed parameters. There are two main kinds of validation: 1) Character Validation – the system has voting logic that tests the validity of each character. If the voting result falls below a specific threshold, the system displays the character in question in red instead of black. 2) Rules Validation – each field can be programmed based on customer specifications. For example, if a specific field is a date, the system will highlight for correction any information that is not in proper date format. There can also be mathematical rules, such as checking that each line item total equals quantity × price. Any information that fails validation is marked for review.
  • Verification – A user logs into a local or cloud verification station and follows system prompts to review low-confidence characters and fields that failed validation rules. Once verification is complete, the user sends the processed batch to an export station, which sends the information where it needs to go.
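The two kinds of validation described in the list above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the logic of any specific capture product: the 0.80 confidence threshold, the MM/DD/YYYY date format, the field names, and the sample data are all hypothetical, and a per-character confidence score stands in for the result of the voting logic.

```python
from datetime import datetime

# Assumed cutoff: characters scoring below this would be shown in red.
CONFIDENCE_THRESHOLD = 0.80


def low_confidence_positions(chars):
    """Character validation: return indexes of characters to flag.

    `chars` is a list of (character, confidence) pairs, where confidence
    is a score between 0 and 1 standing in for the voting result.
    """
    return [i for i, (_, conf) in enumerate(chars) if conf < CONFIDENCE_THRESHOLD]


def is_valid_date(text, fmt="%m/%d/%Y"):
    """Rules validation: the field must parse as a date in the given format."""
    try:
        datetime.strptime(text, fmt)
        return True
    except ValueError:
        return False


def line_total_matches(quantity, unit_price, line_total, tolerance=0.005):
    """Rules validation: line total must equal quantity x price (within rounding)."""
    return abs(quantity * unit_price - line_total) <= tolerance


def validate_invoice(invoice):
    """Return a list of issues to mark for review; empty means all rules pass."""
    issues = []
    if not is_valid_date(invoice["invoice_date"]):
        issues.append("invoice_date: not in MM/DD/YYYY format")
    for n, item in enumerate(invoice["line_items"], start=1):
        if not line_total_matches(item["quantity"], item["unit_price"], item["total"]):
            issues.append(f"line {n}: total does not equal quantity x price")
    return issues


# Made-up sample: the second line item has a total that does not add up.
sample_invoice = {
    "invoice_date": "01/08/2018",
    "line_items": [
        {"quantity": 10, "unit_price": 2.50, "total": 25.00},
        {"quantity": 4, "unit_price": 5.00, "total": 25.00},
    ],
}

print(validate_invoice(sample_invoice))
print(low_confidence_positions([("4", 0.99), ("5", 0.97), ("O", 0.42), ("1", 0.91)]))
```

Anything these checks return would go to the verification station for a person to correct, while fields that pass can move on to export untouched.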

Foolproof systems are hard to come by, though advancements are made all the time. Newer imaging and software combinations have reduced human interaction to minimal corrections of data read from a wide variety of invoice formats.

Solution providers base their marketing and sales on which option can process the widest range of documents and formats in the shortest time with the least manual preparation and correction. Some Data Capture AI can “learn” and adapt to your processes and software. Other AI can be “taught” through human interaction, while lower-end systems simply scan and record an image of the document with minimal data collection capabilities.
