Can a technical service really be 100% reliable?

Can a technical service really be 100% reliable? In this blog post, Richard Develyn, CloudTrade CTO, with a nod to Heisenberg, looks at what is meant by 100% accuracy in the world of computers. He discusses the shortcomings of OCR technology, and describes how CloudTrade has devised a systematic methodology to reach the goal of offering a service that is 100% accurate.

How is it that one can claim to offer a service which is 100% accurate? One hundred per cent seems too absolute; too certain; too devoid of margin of error. Nothing is 100% accurate after all, is it?

Perhaps you think it isn’t because you’ve heard of something called Heisenberg’s Uncertainty Principle. Even if you don’t exactly know what that is, the fact that it has the word “uncertainty” in the middle of it and is clearly written by a very clever German physicist probably suggests to you that no one should ever make the claim of being 100% certain about anything.

Well, ok, but Heisenberg’s Uncertainty Principle applies at the quantum level, and if it’s going to affect anything it’ll probably kick in when processors become so small that computers start to become “quantumly” unreliable (unless, of course, quantum computers somehow or another prevent it).

But I doubt you came to this blog to read about quantum computers.

Are computers really 100% accurate?

Normal computers are, in fact, 100% reliable, in much the same way that gravity is 100% reliable. The last time that we witnessed a mistake being made with a computer processor was in 1994, with the now infamous (if you happen to move in those circles) Pentium FDIV floating-point hardware bug, which apparently resulted in 1 in every 9 billion floating-point divisions coming out with the wrong answer. That might not seem too terrible, but it was a pretty serious problem at the time, causing Intel to recall the chip at a cost of just under half a billion dollars, which would be three quarters of a billion dollars in today’s money.
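For the curious, the flaw could be detected with a single division. The snippet below is a minimal sketch of the check that circulated at the time; the operand pair is the widely quoted one and does not come from this post, and the function name is purely illustrative. On a correct processor the residual is zero, or at worst a rounding error far smaller than 1, whereas affected Pentiums were widely reported to return 256.

```python
# Widely quoted operand pair used to detect the 1994 Pentium FDIV flaw
# (an assumption for illustration; the figures are not taken from this post).
x = 4195835.0
y = 3145727.0

quotient = x / y

# On a correct FPU this is zero, or a rounding error far below 1.
# Affected Pentiums were widely reported to return 256 here.
residual = x - quotient * y


def fdiv_looks_correct(value: float, tolerance: float = 1e-6) -> bool:
    """Return True when the residual is small enough to be ordinary rounding."""
    return abs(value) < tolerance


print(f"{x:.0f} / {y:.0f} = {quotient:.15f}")
print("division looks correct:", fdiv_looks_correct(residual))
```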

You can’t really blame the Pentium FDIV processor for the mistake, though. The poor bit of silicon was only doing what it was told. The problem happened because human engineers made a mistake when they built it. This hasn’t happened since, not because human engineers no longer make mistakes, but because human engineers have improved the way in which they test for and correct the mistakes that they make.

How can unreliable people build reliable systems?

The reason people are sceptical about a claim for 100% accuracy is that they imagine that everything must map down to some sort of human process in the end, and human beings are notoriously not 100% reliable. The people who produce computer processors, however, seem to have bucked that trend.

About ten years earlier, in the world of software, Donald Knuth, an eminent computer scientist, offered a reward of $2.56, doubling every year, to anyone who could find a bug in TeX, the very sophisticated typesetting application that he’d written. There were bugs found at first, admittedly, and Knuth duly paid out, but even though the software continues to have widespread use to this day, no one has reported a new bug in it for about the last 12 years.

So, it would seem that it’s possible to produce 100% reliable software as well as 100% reliable hardware. Plenty of software is, actually. Sure, plenty isn’t, but the majority of the stuff that controls the critical aspects of our lives is 100% reliable. We’d soon know if it wasn’t. It might not be 100% reliable when it first comes out but, rather like Donald Knuth’s TeX program, it gets to 100% accuracy with a bit of time and effort.

Here’s the key question: how is it possible for computers and computer programs to be 100% reliable when they are built by people who clearly aren’t?

It really comes down to three things:

  1. Having a well-understood and limited problem to solve
  2. Having the right tools to solve the problem
  3. Ironing out the bugs

The problem with OCR

Optical Character Recognition (OCR), for example, fails to achieve 100% reliability as a technology because of point (1): the problem is not well understood or limited in its scope. There are many ways of representing letters to the human eye; some estimates put the number of fonts in existence today at about half a million. These fonts don’t adhere to some sort of hard-line rule about what differentiates a “b” from an “h” or an “i” from a “j”. They’re “artistic”, and we all know what that means: they don’t adhere to rules. Add to that the imperfections that you get from scanning paper images and you can see how OCR is really trying to solve an impossible problem.

Identifying the problem

We don’t try to solve the OCR problem; we avoid it entirely by extracting raw data from documents which already have that data unambiguously present within them. Data dumps are of no use to anyone, of course, so where we, at CloudTrade, concentrate all of our efforts is on solving the problems of data understanding, allowing us to properly identify and label data before we pass it on to our clients. These problems are well defined, allowing us to achieve 100% accuracy by following the next two steps on the list.

Choosing the right tools

First of all, we use the right tools. “Let’s get to the moon” might be a very well-defined problem but you’re not going to get there with a hammer and chisel. The better the tools you have, of course, the easier the problem is to solve. We have invested many years of development at CloudTrade in producing a rules-writing engine which allows people without an IT background, after about a month of training, to write the rules needed to extract and interpret data from a document in anything from fifteen minutes to an hour, depending on its complexity. This is the key to our success as a company.
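To give a flavour of what a rule might look like, here is a minimal sketch in Python. It is not CloudTrade’s rules engine or its rule language; the sample text, the rule names and the apply_rules function are all hypothetical, made up purely to illustrate the idea of deterministic rules applied to text that is already present in a document.

```python
import re
from decimal import Decimal

# Hypothetical raw text, as it might come out of a document that already
# carries its data as text. The layout is invented for illustration.
RAW_TEXT = """\
Invoice No: INV-00417
Date: 2019-03-14
Total Due: GBP 1,250.00
"""

# Each rule maps a label to a pattern with exactly one capture group.
# Labels and patterns are illustrative assumptions.
RULES = {
    "invoice_number": re.compile(r"^Invoice No:\s*(\S+)$", re.MULTILINE),
    "invoice_date": re.compile(r"^Date:\s*(\d{4}-\d{2}-\d{2})$", re.MULTILINE),
    "total": re.compile(r"^Total Due:\s*[A-Z]{3}\s*([\d,]+\.\d{2})$", re.MULTILINE),
}


def apply_rules(text, rules):
    """Return labelled fields, refusing to guess if a rule is ambiguous."""
    fields = {}
    for label, pattern in rules.items():
        matches = pattern.findall(text)
        if len(matches) != 1:
            raise ValueError(f"rule {label!r} matched {len(matches)} times")
        fields[label] = matches[0]
    # Interpret the total as an exact decimal amount rather than a string.
    fields["total"] = Decimal(fields["total"].replace(",", ""))
    return fields


print(apply_rules(RAW_TEXT, RULES))
```

The point of the sketch is that a rule either matches exactly once or fails loudly. Given the same document, it always produces the same answer.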

Ironing out the bugs

Of course, sometimes the rules writers get it wrong, because they’re only human, and that’s where point (3) comes in. When we make mistakes, we fix them, and fixed mistakes stay fixed.
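One common way to make fixed mistakes stay fixed is to turn every correction into a permanent test case. The sketch below assumes a hypothetical parse_amount rule and an invented past mistake (credit amounts written in brackets once being read as positive); it illustrates the general technique and is not a description of CloudTrade’s internal process.

```python
from decimal import Decimal


def parse_amount(text: str) -> Decimal:
    """Parse an amount such as '1,250.00' or '(45.10)' into a Decimal.

    The bracket branch stands in for a fix added after a hypothetical
    mistake: credits written as '(45.10)' were once read as positive.
    """
    cleaned = text.strip().replace(",", "")
    if cleaned.startswith("(") and cleaned.endswith(")"):
        return -Decimal(cleaned[1:-1])
    return Decimal(cleaned)


# Every corrected mistake becomes a permanent case; the list only grows.
REGRESSION_CASES = [
    ("1,250.00", Decimal("1250.00")),
    ("0.99", Decimal("0.99")),
    ("(45.10)", Decimal("-45.10")),  # added when the bracket mistake was fixed
]


def run_regression(cases):
    """Return the cases that no longer produce the expected value."""
    return [
        (raw, expected, parse_amount(raw))
        for raw, expected in cases
        if parse_amount(raw) != expected
    ]


failures = run_regression(REGRESSION_CASES)
print("all fixed mistakes still fixed:", not failures)
```

Because the cases are re-run after every change, a later edit to the rule cannot quietly reintroduce an old mistake.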

The trick to getting to 100% accuracy is in having a repeatable process that you can correct as you go along. You can’t get to this sort of reliability if you use people in your processes because people can’t be fixed in the same way that computers can (even the most ambitious psychoanalyst in the world wouldn’t try to make that claim). Neural networks, which sort of simulate human thinking, can’t be fixed either because we don’t actually understand how they reason any more than we understand how human beings reason. This is an area of considerable research these days because our inability to discuss reasoning with neural networks greatly limits our ability to use them. Perhaps one day we’ll have neural network psychoanalysts. I wonder whether they’ll also be neural networks. The mind boggles.

So, in conclusion, we can justifiably claim to deliver a 100% accurate service because of these three key facts:

  1. We limit the problem that we’re trying to solve to something well scoped and understood, because we avoid OCR
  2. We have a system in place purpose-built to solve our particular problem
  3. We have processes in place to correct any mistakes that we might make when we use it

No human errors, no OCR errors, no neural network errors. Just a repeatable programmable system. That’s how we get to 100%.