Is this the death of all world-changing AI applications?
Since the earliest attempts to use machine learning at scale for truly critical tasks, such applications have repeatedly hit the same wall. The very thing that makes machine learning so magical also sometimes makes it useless: the outputs of artificial intelligence are too often inexplicable, in both the narrow and the broad sense.
In saying this, I don't mean that we normal earthlings are too stupid to understand the complex data structures and algorithms behind the current AI hype (ChatGPT, LaMDA, Bard, and the Bing AI thing).
The problem is different, and to some extent inevitable: even the experts who wrote these applications can't understand why a particular input produces the respective output.
This is due to the nature of the algorithms involved. In classical, or symbolic, artificial intelligence, one relies on knowledge representations and formal rules to draw comprehensible conclusions from that knowledge. In other words, the human programmer specifies the rules by which a certain input leads to an output. This has the advantage of complete traceability and consistency, but unfortunately it is not powerful enough for many types of problems - while at the same time requiring very high development effort.
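To make the symbolic approach concrete, here is a minimal sketch of a hand-written, rule-based classifier. The task and the rules are purely hypothetical; the point is that every decision can be traced back to a rule a human wrote.

```python
# Symbolic AI in miniature: the programmer specifies the rules explicitly,
# so every input-to-output mapping is fully traceable.
# (Hypothetical routing rules, for illustration only.)

def classify_ticket(text: str) -> str:
    """Route a support ticket using hand-written keyword rules."""
    text = text.lower()
    if "invoice" in text or "refund" in text:
        return "billing"
    if "password" in text or "login" in text:
        return "account"
    return "general"

print(classify_ticket("I need a refund for my last invoice"))  # billing
```

The traceability is obvious: if a ticket is misrouted, you can point at the exact rule responsible. The development-effort problem is just as obvious: someone has to anticipate and maintain every rule.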
For non-symbolic problems, such as image recognition and audio analysis, such methods are completely unsuitable. This is where machine learning can help: using training data, i.e., "labeled" samples (e.g., pre-categorized images), the AI derives "rules" by itself. Different methods of non-symbolic AI differ in the nature of the "derivation engine".
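The contrast with the rule-based approach can be sketched in a few lines: instead of writing rules, we hand the program labeled samples and let it derive its own decision criterion. This toy "derivation engine" simply counts word overlap per label; real systems use far richer models, but the division of labor is the same.

```python
from collections import Counter

# Machine learning in miniature: no hand-written rules. The classifier
# derives its decision criterion from labeled training samples.
# (Toy example: score a text by overlap with each label's training vocabulary.)

def train(samples):
    """samples: list of (text, label) pairs. Returns per-label word counts."""
    model = {}
    for text, label in samples:
        model.setdefault(label, Counter()).update(text.lower().split())
    return model

def predict(model, text):
    words = text.lower().split()
    return max(model, key=lambda label: sum(model[label][w] for w in words))

model = train([
    ("cheap pills buy now", "spam"),
    ("meeting agenda for monday", "ham"),
])
print(predict(model, "buy cheap pills"))  # spam
```

Even in this trivial case, the "rule" the model learned exists only implicitly, as counts in a table; scale the parameter count up by many orders of magnitude and the explainability problem described below follows.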
Machine learning with a large number of parameters (and correspondingly ginormous amounts of training data) produces models that are no longer predictable or explainable. This, of course, is what makes ChatGPT, for instance, so appealing: the inexplicable humanity of its output, the amount of information it can draw on - until it starts lying or even arguing and you can't do anything about it.
People also lie, deceive, argue. I can't even count how many times I've quoted a fellow human being, only to find out later that I had been fed utter rubbish.
This is damaging to my personal reputation, and I have increasingly taken to checking information I receive from third parties several times - e.g. through Internet research - before passing it on. But what if I now have to rely on a chatbot with a dubious sense of truth, instead of a search engine, to do so?
Certain professionals are expected to be highly reliable and truthful: Lawyers, teachers, scientists face this demand more than others, even if they often fail to meet it. But what result do we expect if we now want to rely on artificial intelligence with dubious factual knowledge, especially in the legal, educational and research domains?
In a previous professional life, I was CEO of a company that was trying to gain a foothold in the "eDiscovery" or "Legal AI" space in the US. I can clearly remember the discussions around the explainability and reliability of our software.
And these kinds of discussions have been going on since long before deep learning: the simplest application in eDiscovery is a kind of "hot-or-not classifier" whose only task is to decide whether a document could be relevant to a case or not. When potentially 20 million company emails are under consideration, such machine classification is critical to whether the case can even proceed to trial. But what error rate is acceptable? Is it OK if the classifier only finds 80% of the essential documents? Would a lawyer have found more in this huge haystack? What if an additional 5 million documents are flagged as relevant when they are not? Who is to blame for the many billed hours of the lawyers who now have to read those documents?
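It is worth putting numbers on those questions. The following sketch uses the figures from the scenario above, plus one assumption the text does not state: that 1 million of the 20 million documents are truly relevant.

```python
# Worked example with the figures from the eDiscovery scenario.
# ASSUMPTION (not from the scenario): 1 million of the 20 million
# documents are truly relevant.

total_docs     = 20_000_000
truly_relevant = 1_000_000

recall = 0.80                                   # finds 80% of essentials
true_positives = int(truly_relevant * recall)   # 800,000 found
missed = truly_relevant - true_positives        # 200,000 missed

false_positives = 5_000_000                     # irrelevant docs flagged anyway
flagged = true_positives + false_positives
precision = true_positives / flagged

print(f"missed essential documents: {missed:,}")          # 200,000
print(f"precision of the flagged set: {precision:.1%}")   # 13.8%
```

Under these assumptions, 200,000 essential documents are never seen, and fewer than one in seven flagged documents is actually relevant - which is exactly why the billed-hours question in the scenario is not rhetorical.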
And this involved algorithms whose errors and omissions were at least partially explicable, e.g., by pointing to specific example documents from the training phase that had led to misclassification.
This ceases completely with Large Language Models, the current showpiece of the deep learning hype. The model cannot tell where it is getting a piece of information from. And the best a human can do is guess which input data was mined for a particular output - assuming they know the training data very well, in which case they could also just replace the AI.
So let's keep in mind: Deep Learning systems, and especially Large Language Models, are unpredictable and incomprehensible in their outputs - proverbial black boxes. These technologies therefore cannot be used to build software for use cases that require reliability, indication of sources, or full automation (no more humans in the loop). With this in mind, I am also convinced that the classic search engine is far from dead. The bias and falsehoods in the indexed documents are quite enough for me, even without yet another dialog engine adding its unpredictable two cents. The current attempts to marry e.g. GPT with Bing, or Bard with the Google search index, do weave source information and search results into the conversation, but this does not solve the original problem - as becomes painfully obvious after a few sample queries.
For detailed background on Large Language Models and their features, see our YouTube video on the topic: