OCR and Perception

How is CloudTrade technology different to OCR?

How data is perceived is crucial for it to be understood for automation

When looking at data capture solutions, the term OCR often pops up. A technology that has been around for years, it is often the ‘go-to’ for companies looking to automate their data capture. In this blog post, Richard Develyn, CloudTrade CTO, explains that although OCR may capture some of the data needed, it cannot provide the understanding required to know what that data means or what to do with it. When it comes to the future of data capture and enabling automation, we need to look at data perception and understanding…

I am often asked to explain the difference between the service that we provide here at CloudTrade and those services which are sold under the banner of “Optical Character Recognition” (OCR).

There is almost a straight answer to this, which is that OCR deals with what we might call “human perception” whereas CloudTrade is more about “human understanding”.

I say “almost” because the waters get muddied on a couple of counts. I shall come to these later; but let me first define exactly what I mean by “perception” and “understanding”.

What do we mean by data perception and data understanding?

Perception is all about recognition in its most basic form. It’s the bit in our brains which translates swirly lines and dots and circles into meaningful letters in the English language. It’s also the bit that has to struggle with differentiating between “i” and “j” or “b” and “h” so that we don’t end up wishing people “bappy hjrthdays” or catching “fjshes” on a “fjshjng book”.

Understanding, however, is all about meaning. It’s the bit that comes in after perception has done its job (assuming that it gets it right!) and figures out, say, that the word “fishing” in “fishing for compliments” has nothing to do with the word “fishing” when you’re fishing in the sea.

Where the difference in perception and understanding starts to get muddied is that both the providers of OCR based solutions and we, ourselves, at CloudTrade, offer services which are based on a combination of both of these technologies.

You can’t have one without the other

You can’t, after all, have understanding without perception (unless you’re some sort of yogi floating over a mat in the Himalayas), or perception without understanding (imagine trying to find your way around the Tokyo underground system when you don’t speak Japanese). CloudTrade and OCR-based solutions need to use both of these elements because providing this service means not only extracting the right numbers and letters from those documents that are sent to us but also understanding them well enough to explain that, for example, “quantity 1” in an order line next to “car mats” is probably referring to a pack of 4 whereas the same phrase next to “Lamborghini Veneno Roadster” is unlikely to be referring to a pack of 4 of them at all.

Traditionally, OCR-based solutions have focussed on the perception side of the problem because that is where they have invested the bulk of their R&D, leaving the understanding part to be provided mostly by humans.

The value is in the understanding

CloudTrade, on the other hand, has invested all of its R&D efforts on understanding, succeeding in bypassing the perception part completely by focusing on “data” documents such as “data” PDFs (where, for example, the letter “s” is unambiguously stored as the letter “s” rather than as a set of drawing instructions resulting in something which could look like the letter “s” to the human eye).

Data PDFs do not need OCR and can therefore be thought of as producing a “perception” result which is 100% accurate. 100% perception is the key enabler for the process of understanding, because it allows a highly sophisticated natural language analysis to take place. There is no fear that the logical steps within that analysis will be broken by some stray spanner in the works which changes the word “battery” to “hattery” or omits a very important decimal point in the phrase “don’t exceed the recommended dose of 1.234 ml every 24 hours”.

Providing the fuel for automation

Sophisticated systems of understanding remove the need for human operators and allow services to operate in a fully automated manner. At the time of writing, CloudTrade is processing ten million documents a year in this fashion. As soon as errors in perception are introduced, such as by using OCR, failures start to occur in the grammatical rules which underpin the process of understanding, and more and more human intervention is needed, resulting in less and less automation.

OCR solutions, by contrast, are able to operate in this field because they embrace the human element of document processing. Their advantage is that they are not limited to processing data PDFs. Their disadvantage is that they cannot fully automate.

To assume is to…

The second way in which the difference between perception and understanding has been muddied is that the technology behind OCR has now made inroads into the world of understanding. To quote Douglas Hofstadter from his seminal paper on OCR and AI, “On seeing A’s and seeing As”:

“A tacit assumption is thus that the components of sentences–individual words, or the concepts lying beneath them–are not deeply problematical aspects of intelligence, but rather that the mystery of thought is how these small, elemental, “trivial” items work together in large, complex (and perforce nontrivial) structures.”

Douglas Hofstadter

This assumption is certainly true with data PDFs, and that “mystery of thought” is clearly where CloudTrade has put in all of its R&D efforts. However, unless the need for OCR disappears completely, as it might if all interactions become electronic and “data”-based documents become the norm, the most promising future for OCR is likely to come out of a hybridisation of perception and understanding.

Variety is the spice of life? Not for data.

Although, as I said earlier, OCR makes mistakes such as reading “fish” as “fjsh”, what it actually does is identify lists of variations rather than hard and fast answers and then present those variations with their individual certainty values to a user for arbitration (i.e. it could be “fjsh” (60%) or perhaps it’s “fish” (50%)). OCR vendors can then use dictionaries to automatically strip out nonsense words like “fjsh” and perhaps narrow down the possibilities to arrive at the right answer. This doesn’t work, however, when the OCR mistakes still result in words present in the dictionary, or when a word being considered is not necessarily an English word at all (like a part number in a catalogue).
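To make the dictionary approach concrete, here is a minimal sketch in Python. It is an illustration only and not how any particular OCR product works: the candidate lists, the certainty scores, the tiny dictionary and the function name pick_reading are all assumptions made up for the example.

```python
from typing import Optional

# Hypothetical OCR output for one word: (reading, certainty) pairs.
Candidate = tuple[str, float]

# A tiny stand-in dictionary; a real system would use a full word list.
DICTIONARY = {"fish", "dish", "battery", "hook", "book"}


def pick_reading(candidates: list[Candidate], dictionary: set[str]) -> Optional[str]:
    """Return the highest-certainty candidate that is a dictionary word, or None."""
    known = [(word, score) for word, score in candidates if word.lower() in dictionary]
    if not known:
        return None  # nothing recognisable, so the dictionary cannot help
    return max(known, key=lambda pair: pair[1])[0]


# The nonsense word is stripped out, leaving the right answer.
fish_result = pick_reading([("fjsh", 0.60), ("fish", 0.50)], DICTIONARY)

# Both readings are real words, so the dictionary keeps the wrong one.
hook_result = pick_reading([("book", 0.55), ("hook", 0.45)], DICTIONARY)

# A part number is not a dictionary word at all.
part_result = pick_reading([("AB-1O7", 0.50), ("AB-107", 0.50)], DICTIONARY)
```

In this sketch the first call recovers “fish”, the second settles on “book” even if the page said “hook”, and the third returns nothing, which mirrors the two failure cases described above.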

A far more sophisticated solution would be to bring all these variations in perception straight into the “understanding” engine and then allow the latter to crunch through all of the grammatical options.
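As a rough sketch of that idea, the Python below tries every combination of per-word candidates and keeps the most certain one that a toy “understanding” check accepts. The candidate data, the KNOWN_PHRASES set and the function best_interpretation are hypothetical; a real understanding engine would apply grammatical rules rather than look up a fixed set of phrases.

```python
from itertools import product

# Hypothetical per-word OCR candidates for a two-word phrase.
word_candidates = [
    [("fjshjng", 0.60), ("fishing", 0.40)],
    [("book", 0.55), ("hook", 0.45)],
]

# Toy stand-in for an understanding engine: phrases that make sense.
KNOWN_PHRASES = {("fishing", "hook"), ("cook", "book")}


def best_interpretation(candidates, is_meaningful):
    """Try every combination of readings; keep the most certain meaningful one."""
    best, best_score = None, 0.0
    for combo in product(*candidates):
        words = tuple(word for word, _ in combo)
        score = 1.0
        for _, certainty in combo:
            score *= certainty  # combined certainty of this combination
        if is_meaningful(words) and score > best_score:
            best, best_score = words, score
    return best


result = best_interpretation(word_candidates, lambda words: words in KNOWN_PHRASES)

# Number of combinations the engine had to consider for just two words.
combination_count = len(list(product(*word_candidates)))
```

Here the only combination that makes sense is “fishing hook”, even though neither word was the top-ranked reading on its own. Note that the number of combinations multiplies with every extra word and every extra candidate, so the amount of work grows very quickly on longer documents.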

This is something that we have experimented with at CloudTrade, since it is possible for us to connect to OCR as the “perception” part of our solution. In doing so we have, indeed, found that with a bit of patience and tailoring we can deliver an OCR-based service which is just about acceptable and automatic for header-level capture, but it’s too painful and slow to be feasible on complex scanned images or on those that are not “near-perfect”.

Dictionary lookups have been a standard feature with OCR vendors for some time. Advances in Machine Learning may well improve matters further in the future. I doubt very much that any improvements will happen with things like invoices and purchase orders, where a lot of the key information doesn’t have very much context to draw upon to allow significant automatic corrections to be made, but there could be mileage in using this technology with historical documents written in proper flowing prose.

OCR may well have an interesting future when it comes to scanning documents that were written in the past, but it’s more than likely to now be a past technology when it comes to documents that are to be written in the future.



Why we’re building CloudTrade Logistics


In January this year, we launched CloudTrade Logistics, our solution that helps US carriers, freight audit and freight technology companies get 100% validated data from freight invoices with no OCR or human intervention. As we approach the end of Q1, it’s an opportune time to provide an update on how we’re getting on.

The importance of payments to freight cannot be overstated. Few things slow business more quickly than the late administration and execution of payments. Given the challenges that technology has been able to solve in other sectors, it’s hard to understand why such an integral part of the supply chain still suffers from a large number of problems. These problems lead to unnecessary delays, errors, compliance risks and, in the end, costs.

We launched CloudTrade Logistics to solve these problems. By integrating invoice processing and digital data exchange for all parts of the US freight industry, we enable companies to overcome inefficient methods of freight invoice processing, giving them the opportunity to transact digitally with trading partners, regardless of size, maturity or complexity. Our users include logistics companies, business process outsourcing firms and freight payment providers who needed an alternative to EDI and OCR to process and capture data. The list of potential users is long and the potential for cost savings across the industry is great.

We believe that our solution is different from anything else on offer in the freight industry, for three reasons.

1. 100% compliant validated data

We offer 100% compliant, validated data because our product allows our users to create custom-designed business rules and validations that run on the vendor-supplied invoice. We don’t mistake an ‘I’ or ‘L’ for a ‘1’, which ensures that the data we deliver is 100% accurate. We remove the issues caused by incorrect data appearing in fields such as Tax or VAT, account code or depot location. When this is combined with our rules engine, we can put in place controls to ensure that the PO is included and referenced on the invoice, that the PO is valid and sufficient funds are available to pay the invoice and, in some cases, that the item/ship codes are correct. All of this helps the back-office matching processes in TMS, ERP and other finance systems.
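To give a feel for what such controls might look like, here is a minimal Python sketch of the three PO checks mentioned above. It is not the CloudTrade rules engine: the Invoice class, the OPEN_POS ledger and the validate function are hypothetical names and data invented purely for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Invoice:
    number: str
    po_number: Optional[str]  # PO reference captured from the invoice, if any
    total: float


# Hypothetical ledger of open purchase orders: PO number -> remaining funds.
OPEN_POS = {"PO-1001": 500.0, "PO-1002": 50.0}


def validate(invoice: Invoice, open_pos: dict[str, float]) -> list[str]:
    """Return a list of rule failures; an empty list means the invoice passes."""
    if not invoice.po_number:
        return ["PO reference missing from invoice"]
    if invoice.po_number not in open_pos:
        return ["PO is not valid"]
    if invoice.total > open_pos[invoice.po_number]:
        return ["insufficient funds on PO"]
    return []


passes = validate(Invoice("INV-1", "PO-1001", 120.0), OPEN_POS)
no_po = validate(Invoice("INV-2", None, 120.0), OPEN_POS)
bad_po = validate(Invoice("INV-3", "PO-9999", 120.0), OPEN_POS)
over_budget = validate(Invoice("INV-4", "PO-1002", 120.0), OPEN_POS)
```

Checks of this kind only make sense when the captured values can be trusted: a PO number with one misread character would fail the lookup even though the invoice itself is fine.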

A key risk for any business is collecting and using bad data. Audit and payment processes can be put at risk and an inability to process payments quickly can lead to problems with carriers and shippers. Even if a company uses compliance programs and systems that guarantee they remain compliant for tax and customs requirements, if the problem occurs at source, where data is inaccurately collected, mistakes will be made. This can lead to a delay in payment, increased audit activities and extra expense – and it’s one reason why CloudTrade Logistics is so useful for our customers.

2. Our solution is powerful

Our unique approach to data capture means we are able to collect a large amount of data from an invoice. Most organisations make do with capturing “header-level” data only. But we go beyond this, capturing as much (or as little) information as our customers require, giving them the data they need for their back-office systems with the added benefit of being able to understand their detailed spend profiles. This means a company can process files and payments faster than ever before whilst, in the longer term, identifying and capturing incorrect spend from the detail on complicated invoices.

3. Our solution is fast and transparent

We designed our solution to increase ‘Inbound Invoice Velocity’, which is a positive for all parties, as it means that payments can be made earlier, early payment discounts can be applied and cash/working capital can be made available immediately. Using CloudTrade Logistics also means that an invoice is always visible throughout the capture process, from the moment it is received, through conversion, to the point at which it is sent to the back-office system.

Our solution can convert complex multi-bill carrier/shipment invoices into unique document sets for separate processing in the back-office systems. We can work with a single multi-page document containing thousands of pages, or process multiple files contained in one email. We haven’t failed a challenge yet! We know it can take a manual operator using a key-to-screen or OCR solution up to 30 minutes to capture the data on a multi-page shipment or complicated invoice. CloudTrade Logistics completes this task in a matter of seconds.

Our solution is built upon our unique Gramatica technology and it offers 100% accuracy for most major carriers. It’s also the first solution of its kind to allow freight payment providers, third-party logistics and business process outsourcing partners to increase efficiencies, cut operational costs and increase throughput by doing away with the manual or OCR processing of PDF and paper invoices, which invariably produces errors.

Global ambitions

We have recently launched the product into the freight and logistics market, and it’s been received very positively. We’ve seen an increase in new enquiries, our carrier reach is expanding and we have been working with our existing clients in the industry to help them increase ‘invoice velocity’.

Our plans for the future of CloudTrade Logistics include pushing further into the US. In time, we’re also looking to build upon our existing offering, enabling clients to process other complex documents that need rules and validations running against them. But that’s for the future.

For now, we’ve been delighted with the positive response we’ve had from US users and we look forward to working with them throughout the rest of 2019. Our focus will be on working with key partners and helping as many users in the freight and logistics industry as possible, providing a simple, best-in-class technological solution to a problem that we feel should have been fixed a long time ago.