OCR Coupled with Unibase Fills Tall Order for Root3

By Paul Ford & John Elvin, Root3, Bristol, United Kingdom

Recently Root3 filled a tall order from one of its clients for an OCR/ICR process with Unibase Imaging in the Unibase by DMAC environment.

We at Root3 call the form a triple A4 with gatefold. The form measures 30 centimeters (about 11.8 inches) high by 62.5 centimeters (about 24.6 inches) wide and is double sided. It contains a mixture of tick boxes (check boxes) and text to be captured. Some of the free-format text is not captured in the OCR/ICR process but is keyed later in Unibase update mode.

The Solution

The first problem we encountered was finding a scanner that could capture the form. All the sensibly priced scanners would scan only up to A3 (30 cm x 42 cm). The Bell & Howell 6388 fitted the bill when combined with Longscan (software developed using the Kipp tools for the Kofax interface card).

Each side of the form was scanned at 300 dpi, producing a Group 4 TIFF file of some 380k bytes. The scanning rate is 7 double-sided forms per minute.
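
As a rough check on these figures, the arithmetic can be sketched in Python. The per-side size and the scanning rate come from the text above; reading "380k bytes" as 380 x 1,000 bytes is an assumption, and the function names are purely illustrative.

```python
# Figures quoted in the article; the decimal reading of "k" is an assumption.
BYTES_PER_SIDE = 380 * 1000   # approximate size of one Group 4 TIFF image
SIDES_PER_FORM = 2            # the form is double sided
FORMS_PER_MINUTE = 7          # quoted scanning rate


def bytes_per_form() -> int:
    """Approximate image data produced by one double-sided form."""
    return BYTES_PER_SIDE * SIDES_PER_FORM


def bytes_per_minute() -> int:
    """Approximate image data produced by one minute of scanning."""
    return bytes_per_form() * FORMS_PER_MINUTE


if __name__ == "__main__":
    print(f"{bytes_per_form() / 1000:.0f} KB per form")
    print(f"{bytes_per_minute() / 1_000_000:.2f} MB per minute of scanning")
```

On these assumptions each form yields roughly 760 KB of image data, and the scanner produces a little over 5 MB per minute.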

The scanning station runs on the Novell network. As soon as an extraction station (OCR/ICR engine) detects images, it begins extracting the data using previously defined masters. The extraction station needs no operator intervention and runs continuously. Extraction rates vary, but a typical rate is about 750 characters per minute.
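
The unattended "detect images, then extract" behaviour described above can be illustrated with a simple polling loop. This is only a sketch in Python, not the actual extraction software: the shared folder, the `extract` callback, and the function names `find_new_images` and `run_station` are all hypothetical.

```python
import time
from pathlib import Path
from typing import Callable, List, Optional, Set


def find_new_images(inbox: Path, seen: Set[str]) -> List[Path]:
    """Return TIFF files in `inbox` that have not been picked up before."""
    new_files = []
    for path in sorted(inbox.iterdir()):
        if path.suffix.lower() in {".tif", ".tiff"} and path.name not in seen:
            new_files.append(path)
            seen.add(path.name)  # remember the file so it is extracted once only
    return new_files


def run_station(
    inbox: Path,
    extract: Callable[[Path], None],
    poll_seconds: float = 5.0,
    max_polls: Optional[int] = None,
) -> int:
    """Poll `inbox` and pass each new image to `extract`.

    With max_polls=None the loop runs until interrupted, mirroring a
    station that needs no operator. Returns the number of images handled.
    """
    seen: Set[str] = set()
    processed = 0
    polls = 0
    while max_polls is None or polls < max_polls:
        for image in find_new_images(inbox, seen):
            extract(image)  # hypothetical hook for the OCR/ICR step
            processed += 1
        polls += 1
        if max_polls is None or polls < max_polls:
            time.sleep(poll_seconds)
    return processed
```

A call such as `run_station(Path("scans"), print, poll_seconds=1, max_polls=3)` would list any TIFF images found in a hypothetical `scans` folder over three polls.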

With this particular form, the data was loaded from the OCR/ICR engine into fourteen Unibase formats. The smallest format had three fields, while the largest had 390 fields.

Root3's batching programs, written in the powerful Unibase by DMAC AID language, provide the interface between the OCR/ICR engine and Unibase by DMAC. The programs take the data and the images and load them into Unibase batch files ready for data perfection (data entry) in the Unibase by DMAC update mode. On the way they also take care of rotation, resolution, number of images per batch, pre-batch edits and scanned batch headers. For this particular project we are scanning throughout the day (16,500 documents) and batching through the night.
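
To illustrate just one of these tasks, the limit on images per batch, here is a minimal sketch in Python rather than AID. The function name `make_batches`, the document identifiers, and the cap of 100 images per batch are hypothetical; the article does not state the real batch size.

```python
from typing import List, Sequence


def make_batches(
    documents: Sequence[str],
    images_per_document: int,
    max_images_per_batch: int,
) -> List[List[str]]:
    """Split documents into batches without exceeding the image cap.

    A document's images are never divided between two batches.
    """
    if images_per_document <= 0:
        raise ValueError("images_per_document must be positive")
    if max_images_per_batch < images_per_document:
        raise ValueError("a batch must hold at least one whole document")
    docs_per_batch = max_images_per_batch // images_per_document
    return [
        list(documents[start:start + docs_per_batch])
        for start in range(0, len(documents), docs_per_batch)
    ]


if __name__ == "__main__":
    # 16,500 double-sided documents, as in the article; the cap is assumed.
    day = [f"doc{n:05d}" for n in range(16_500)]
    batches = make_batches(day, images_per_document=2, max_images_per_batch=100)
    print(len(batches), "batches of up to", len(batches[0]), "documents")
```

With the assumed cap of 100 images, a day's 16,500 double-sided documents would fall into 330 batches of 50 documents each.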

The operators then open the batches in update mode and are prompted to key at error flags and predetermined update fields.

The Results

For conventional keying of the data from the form, the average was 15 forms in 150 minutes. Using the OCR/ICR approach above in the Unibase by DMAC environment with heads-up keying, the average is 15 forms in 38 minutes. The real result was one happy customer.
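
The improvement implied by these two timings can be worked out with a few lines of Python. Only the three quoted figures come from the article; the function names are illustrative.

```python
# Timings quoted in the article.
FORMS = 15
CONVENTIONAL_MINUTES = 150
OCR_MINUTES = 38


def minutes_per_form(total_minutes: float, forms: int) -> float:
    """Average keying time for one form."""
    return total_minutes / forms


def speedup(before_minutes: float, after_minutes: float) -> float:
    """How many times faster the new process is than the old one."""
    return before_minutes / after_minutes


def time_saved_percent(before_minutes: float, after_minutes: float) -> float:
    """Reduction in keying time, as a percentage of the old time."""
    return (1 - after_minutes / before_minutes) * 100


if __name__ == "__main__":
    print(f"Conventional: {minutes_per_form(CONVENTIONAL_MINUTES, FORMS):.1f} min/form")
    print(f"OCR/ICR:      {minutes_per_form(OCR_MINUTES, FORMS):.1f} min/form")
    print(f"Speedup:      {speedup(CONVENTIONAL_MINUTES, OCR_MINUTES):.2f}x")
    print(f"Time saved:   {time_saved_percent(CONVENTIONAL_MINUTES, OCR_MINUTES):.1f}%")
```

That is 10 minutes per form falling to about 2.5 minutes per form, or nearly four times faster, with operator keying time cut by roughly three quarters.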

Paul Ford and John Elvin can be reached at Root3 Systems Limited, Euro House, Apex Court, Woodlands, Almondsbury, Bristol, United Kingdom BS12 4JT. The telephone number is 011 44 1454 898200. The fax number is 011 44 1454 886969.