Garbage In, Garbage Out

The world is not an objective place. There seems to be no single point of view, no absolute truth. There is only a little piece of information that could be regarded as reliable - an information that is well summarized by the famous cogito, ergo sum. All the rest is, more or less, speculation.

Consider some quite distant land, for example Antarctica. Have you been there? Have you observed it personally? Most probably you haven't. All you know about Antarctica is second-hand information. They say that strange birds that cannot fly live in Antarctica. Penguins, that's how they are called. Would you believe that? Yes, you probably would. Have you heard that Yeti recently moved to Antarctica? Would you believe that? You probably wouldn't. Both "here be penguins" and "here be Yeti" are information. These are not facts, but mere information. It is the belief that makes them into facts.

But even things and phenomena that you personally observe cannot be automatically regarded as true. Think of David Copperfield and Statue of Liberty. People have witnessed how the statue disappeared. Yet, if you were one of them, would you believe that huge steel-and-copper statue has really ceased to exist for those few moments? Probably not. How many times have you seen pretty ladies sawn in half, disassembled into pieces and reconnected again or levitating freely in the air? What we see may not be what it appears to be. I'm sure you will be amazed by this excellent performance by my favorite duo Penn and Teller.

Think about your date of birth. Do you think that the date of your birth is an unquestionable fact? Not really. You were there when you were born, but you probably do not remember it. And you was quite incapable of checking the date for yourself at that time. Therefore you date of birth is just an information. It comes from quite a trusted source but, strictly speaking, it is not unquestionable.

Any information must be regarded with an appropriate level of confidence. You will probably not really doubt your date of birth, therefore the level of confidence is very high. You believe in that information. But you will probably doubt that Yeti lives in Antarctica (everybody knows that Yeti lives in Himalayas). Therefore a level of confidence for that information is low. You do not believe in that. However, you may slowly increase the level of confidence as more and more expeditions will report encounters with Yeti in Antarctica. As it goes beyond a certain threshold you may start believing that. And once the popular press brings a convincing evidence that what was considered to be Yeti was in fact a mutated giant rat from Mars transported to Antarctica for the sole amusement of penguins by the four headed hyper-intelligent lizards of Sirius IV, you may quite stop believing in Yeti.

Seems pretty obvious, isn't it? Now how is it related to software?

Software is all about information. However, overwhelming majority of software systems have no ability to be "somehow inclined to believe in Yeti" or "quite doubt that Yeti has moved recently". Most software systems have only one level of confidence: fact. That was not a problem when the information systems were small and disconnected. A user working with a specific system was somehow aware what is the reliability of information coming from that system. The user either knew how the system worked or slowly learned how reliable the information is by confronting it with reality. The user as a thinking human being is correcting inability of computer system to deal with uncertainty.

But such a simple approach will fail in case of global distributed hyper-connected information super-highway such as the Web or Semantic Web. Users don't know how the displayed information was acquired and processed and usually have no time spending few years confronting the information with reality. Users of the Web have no way how to asses reliability of information they see. The simple binary model of true-false will not work in this environment. Any system using such binary model that includes computer-to-computer communication on a large scale is doomed to failure.

I quite believe that the future is not really bright for Internet-scale web services and semantic web. Unless they can learn how to doubt.