Choosing the right technology to solve your business problems has always been difficult. Too much commitment, too great a system change, perhaps too great a cost. But what if you could find the best of both worlds? Richard Develyn, Chief Technical Officer, discusses the new and exciting developments in the CloudTrade solution suite that bring a hybrid approach to document content recognition: Project Grandalf.
Implementing a Machine Learning solution in your organisation is a bit of a risky undertaking: you’re never entirely sure how long you’re going to have to wait before you get to see anything useful come out of it, or even whether something useful is ever going to come out of it at all.
In the meantime, however, you have to find a way of delivering your core business in an efficient and dependable way, which these days means using IT: not the strange, new, neural-network style of IT that is attracting so much hype at the moment, but rather the traditional IT which has been holding the world together for the last 50 years or so.
Machine Learning, exciting though it is, is still a very long way from being able to take over from IT entirely. There is a class of problem where traditional IT struggles, such as problems that would otherwise require human assistance, and this is where Machine Learning can usefully participate. Most IT solutions, however, are still delivered the traditional “programming” way.
Hybrid solutions, however, can combine all of these approaches to get the best out of each, as long as they’re created with care. Human judgement is slow to exercise and Machine Learning is slow to learn, so processing still needs to go through the traditional IT route as much as possible if IT-speed levels of automation are going to be achieved. Humans and Neural Networks, however, can configure the system to make it run more accurately or efficiently, without trying to take over the job completely.
Project Grandalf, as it is affectionately known by the CloudTrade Development Team, is CloudTrade’s hybrid solution to the problem of document content recognition.
Extracting properly identified data from human readable documents is a complicated problem to solve. CloudTrade’s flagship product, Gramatica, does so by implementing a rules engine which allows rules to be written specifically for each document format to be processed. It is extraordinarily powerful, and Gramatica can deal with any requirement and complexity as long as it has the right rules written to do so.
Some documents, however, are not sufficiently complicated, or processed in sufficient quantities, to justify the rules-writing effort that Gramatica requires.
This is where ‘Grandalf’ comes in.
Grandalf’s Machine Learning Engine, which runs every night on a comprehensive data sample set, powers its knowledge database of data extraction algorithms. This collection of algorithms is then applied to every new document which arrives at the service, with an operator asked to clarify which algorithm has produced the right answers.
The operator’s responses are persisted in a database so that documents subsequently received from the same sender are automatically addressed by the right algorithm without further need for operator intervention. Should there be a variation in a document so that the right answers cannot be found, then the knowledge-base and operator process can be re-invoked so that the alternatives can be accurately handled.
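The flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not CloudTrade's implementation: the names `KNOWLEDGE_BASE`, `chosen_by_sender`, `after_label`, `candidates` and `process` are all hypothetical, the "algorithms" are trivial label matchers, and an in-memory dictionary stands in for the database of operator responses.

```python
from typing import Callable, Dict, Optional

# Hypothetical type: an extraction algorithm takes document text
# and returns the extracted value, or None if it finds nothing.
Extractor = Callable[[str], Optional[str]]


def after_label(label: str) -> Extractor:
    """Build an extractor returning the text that follows `label` on a line."""
    def extract(text: str) -> Optional[str]:
        for line in text.splitlines():
            if line.startswith(label):
                return line[len(label):].strip() or None
        return None
    return extract


# Stand-in for the knowledge base that the nightly learning run would produce.
KNOWLEDGE_BASE: Dict[str, Extractor] = {
    "invoice_no_label": after_label("Invoice No:"),
    "ref_label": after_label("Ref:"),
}

# Stand-in for the database of operator choices, keyed by sender.
chosen_by_sender: Dict[str, str] = {}


def candidates(text: str) -> Dict[str, str]:
    """Run every known algorithm and keep those that produce a value."""
    results: Dict[str, str] = {}
    for name, algorithm in KNOWLEDGE_BASE.items():
        value = algorithm(text)
        if value is not None:
            results[name] = value
    return results


def process(
    sender: str,
    text: str,
    ask_operator: Callable[[Dict[str, str]], str],
) -> Optional[str]:
    """Use the algorithm remembered for this sender; otherwise ask the operator."""
    name = chosen_by_sender.get(sender)
    if name is not None:
        value = KNOWLEDGE_BASE[name](text)
        if value is not None:
            return value  # deterministic path, no operator involved

    # First document from this sender, or a variation the remembered
    # algorithm cannot handle: re-run the whole knowledge base.
    options = candidates(text)
    if not options:
        return None
    name = ask_operator(options)      # operator picks the right answer
    chosen_by_sender[sender] = name   # remember the choice for next time
    return options[name]
```

In this sketch the operator is just a callback that receives the candidate answers and returns the name of the algorithm that produced the right one. Once a choice is stored, later documents from the same sender skip the callback entirely, and it is only invoked again when the stored algorithm fails to find a value.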
It is this combination of Machine Learning, traditional IT and human intervention which gives Grandalf its key advantage over the competition. It also illustrates the advantage that hybrid systems have over the more common “one technology only” approach.
There has always been a tendency in the marketplace to look for silver bullets. Silver bullets are easy to sell (i.e. “you have this problem, you use this silver bullet; you have that problem, you use that one”). Even if you were shooting at werewolves, however, you would be silly to make your bullets entirely out of silver. Just put enough silver in them to make them toxic to the creature you’re shooting at, then build the rest from good old-fashioned lead-antimony and steel (that’s probably how they did it, back in the day).
We, at CloudTrade, don’t believe in silver bullets (or werewolves). We believe in solutions which are crafted from the best that the different technologies relevant to the problem can offer, especially when they are made to work in harmony. As a result, we are firmly convinced that Grandalf’s hybrid solution is the best way to approach the problem of document content recognition for all documents except those so complicated that they require specific rules to be written to understand them (which is where Gramatica comes in). It is this hybrid combination of approaches that allows Grandalf to hit the sweet spot that the problem demands: Machine Learning and human assistance supporting traditional, deterministic IT.
Grandalf is characterised by the following features:
- It learns from one example only
  - Grandalf has a huge knowledge base of data capture rules which are applied to every document, with an operator then asked to help via a simple question and answer form
- It’s 100% accurate
  - Once chosen, rules don’t exercise judgement or refer back to some Machine Learning database to get possible values and confidence levels; Grandalf’s rules are completely deterministic
- It’s fast
  - Once an operator has helped Grandalf determine which rules should be used, documents fly through it at the speed of IT
- It handles document variations
  - Grandalf returns to an operator if the rules for a given document fail to find a value, re-running its knowledge base to offer more alternatives
- It continually learns and improves
  - CloudTrade’s selected data set feeds into the Machine Learning algorithm which every night updates Grandalf’s rules knowledge base by adding further data capture possibilities
- It’s expandable
  - Grandalf can easily be expanded to cater for additional customer capture requirements
Project Grandalf is set to be released to CloudTrade customers, under its official name, in January 2021. If you’d like to know more about how CloudTrade can help your business automate its data and documents, irrespective of volume or document type, please arrange a short meeting with us here.