(7 min read, ~1,400 words)
Is AI Only a Catchphrase?
ChatGPT hit the world like a storm in November 2022, when it became available to the public as a free tool. Scientists, however, have been working on Artificial Intelligence (AI) for decades; as far back as 1989, Roger Penrose published his popular science book, ‘The Emperor’s New Mind’, which discussed AI. The term AI is the present-day catchphrase for advanced technological systems that can mimic human intelligence. While AI has undoubtedly made significant strides in various domains, the term itself can be misleading and is susceptible to misuse as a marketing gimmick to sell products. ‘AI’ now appears in product brochures for everything from washing machines to refrigerators. Is there a huge technology leap in these AI-enabled refrigerators? A couple of decades back, manufacturers claimed that these machines used fuzzy logic; how different is the effective performance of fuzzy logic and AI? From our experience of using these machines, we can safely say that there is no perceptible difference; all that labelling seems to be a marketing stunt.
The Intelligence in ChatGPT
So to start with, let us keep all these products aside and examine the domain of ChatGPT, a system built on Natural Language Processing (NLP). This means it is a programme that can process the natural language used by humans. Prior to this, some level of computer literacy was needed for a person to make use of technology; with NLP, however, one can interact with the system without any prior technical knowledge. In simple terms, humans can hold a conversation with ChatGPT using ordinary human language. Let us examine whether ChatGPT is intelligent and whether it understands the conversation it holds with the user.
Intelligence is a hazy term. We all know what it means, but it is difficult to describe, because it is a multifaceted property that includes cognitive abilities, problem-solving skills, learning abilities, adaptability, creativity and so on. It is of paramount importance that we distinguish knowledge from intelligence, since the former is often mistaken for the latter. A caretaker looked after my ageing mother. This caretaker could not understand numbers, count money or read the clock. Nevertheless, to our surprise, we discovered that this Marathi-speaking lady picked up Malayalam to communicate with my mother over a relatively short period. There is no way this caretaker would score decently on the MENSA IQ test, but she is unquestionably intelligent. Therefore, while evaluating the intelligence of a machine, we must set aside the retrieval of stored information, i.e. knowledge, and computational ability, and then judge it on the other parameters, as we would test a human.
The mathematician Alan Turing, in an attempt to define intelligence for machines, devised a test now called the Turing Test. In this test, a machine and a human are asked a series of questions by a human judge. The judge examines the answers without knowing which answer came from which source. If the judge cannot identify the machine’s answers, the machine passes the test of intelligence. The test thus treats intelligence as a behavioural phenomenon, demanding no comprehension, emotion or thought of the machine.
We have always used calculators and computers as high-fidelity devices that never give a wrong answer. So when we come across a computer system that has AI, our instinct is to consider it infallible. How far are we justified in this belief? To answer this question, let us take a deep dive into how ChatGPT, or any NLP system, works.
How NLP works
The term ‘natural’ in NLP signifies that these systems can process the language humans use in conversation and communication; we do not have to use a computer language with strict syntax rules, like C++ or Python. The first step is to input some natural language into the system; let us consider it a paragraph, ‘Para’. The system normalises this Para by converting all letters into lowercase and removing grammatical endings like “ing”, “ed”, etc. that signify parts of speech. This is a detail that need not bother us at this point; all we need to know is that the computer converts the input Para into a normalised ParaN. This ParaN is then broken into tokens. Tokens are strings of characters of varying length; each token is roughly a word or less, or may straddle parts of two words. The system assigns each token a unique number as an identifier.
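The two steps above can be sketched in a few lines of Python. This is a deliberately crude toy, not the real ChatGPT tokeniser (which is far more sophisticated); the function names, the two suffixes stripped and the fixed four-character chunks are all illustrative assumptions.

```python
# Toy sketch of normalisation and tokenisation -- illustrative only,
# not how OpenAI's actual tokeniser works.

def normalise(para):
    """Lowercase the text and crudely strip 'ing'/'ed' endings."""
    stripped = []
    for w in para.lower().split():
        for suffix in ("ing", "ed"):
            # only strip when enough of the word remains
            if w.endswith(suffix) and len(w) > len(suffix) + 2:
                w = w[: -len(suffix)]
                break
        stripped.append(w)
    return " ".join(stripped)

def tokenise(para_n, vocab):
    """Break text into sub-word pieces and give each a unique number."""
    token_ids = []
    for word in para_n.split():
        # naive sub-word split: chunks of up to 4 characters
        for i in range(0, len(word), 4):
            piece = word[i : i + 4]
            if piece not in vocab:
                vocab[piece] = len(vocab)  # assign the next free id
            token_ids.append(vocab[piece])
    return token_ids

vocab = {}
para_n = normalise("The cat was sleeping on the mat")
ids = tokenise(para_n, vocab)
```

Running this turns ‘Para’ into a lowercase, suffix-stripped ‘ParaN’, and then into a list of token numbers, with the `vocab` dictionary recording which piece of text each number stands for.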
The system generates tables that record the probability of token patterns, i.e. which token tends to follow or precede which; from these, the probability of each token at each location is computed. Now, when a sentence (a string of tokens) is presented to the computer with a word (token) missing, the computer uses these probability computations to predict the missing word. This is the kernel of the learning process of an AI system.
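A minimal version of such a probability table can be built by simply counting which token follows which. The sketch below uses whole words as tokens and looks only one step back, purely for illustration; real systems use sub-word tokens and much longer contexts.

```python
# Toy "probability table": count which token follows which, then use
# the counts to predict a missing token. Illustrative only.
from collections import defaultdict

def build_table(tokens):
    """For every token, count how often each other token follows it."""
    follows = defaultdict(lambda: defaultdict(int))
    for a, b in zip(tokens, tokens[1:]):
        follows[a][b] += 1
    return follows

def predict_next(table, token):
    """Return the most frequently observed follower of `token`."""
    candidates = table[token]
    return max(candidates, key=candidates.get) if candidates else None

corpus = "the cat sat on the mat the cat sat on the chair".split()
table = build_table(corpus)
print(predict_next(table, "the"))  # "cat" follows "the" most often here
```

Given the blank in “the ___ sat on the mat”, this table would fill in “cat”, because that is the statistically most likely follower of “the” in the training text; exactly the same counting idea, at an enormously larger scale, underlies the prediction step described above.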
The process described above is carried out on the vast amount of data fed into the ChatGPT system. OpenAI, the company behind ChatGPT, fed in all of Wikipedia, some 10,000 books on various subjects, and more. The system tokenised all this input and prepared probability tables. Then the system was trained by presenting it with strings of tokens in which some tokens were blanked out; it had to predict the missing tokens. Each blank could potentially be filled by multiple tokens, each with a different probability, and a randomiser was used to select one token for the blank from this list of candidates. Initially, human trainers validated the predicted tokens; these human inputs enabled the system to adjust its probability tables, and this went on for a while. Over time, the system was programmed to assess token suitability without human intervention. This is a simplified description of how ChatGPT was trained. Now imagine a human holding a conversation with ChatGPT: the human input is like the string of tokens with blanks, and the system predicts a suitable string of tokens, which becomes its reply. Such a system is referred to as ‘Generative AI’. ChatGPT is thus pre-trained, as explained above; hence the name, which stands for Chat Generative Pre-trained Transformer. The term ‘Transformer’ can be understood as a programme designed along the lines described above.
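The randomiser mentioned above can be illustrated in a couple of lines: instead of always taking the single most probable candidate, the system picks one in proportion to its probability. This sketch is again a toy under assumed names and counts, not OpenAI’s actual sampling code.

```python
# Toy randomiser: pick a candidate token in proportion to how often
# it was observed, rather than always taking the top one.
import random

def sample_token(candidates):
    """candidates: dict mapping token -> count. Weighted random pick."""
    tokens = list(candidates)
    weights = [candidates[t] for t in tokens]
    return random.choices(tokens, weights=weights, k=1)[0]

candidates = {"cat": 8, "mat": 1, "chair": 1}
picked = sample_token(candidates)  # most often "cat", occasionally not
```

This injected randomness is why ChatGPT can give two different, equally fluent answers to the same question; a deterministic “always pick the top token” rule would repeat itself verbatim.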
Fallible but useful
Once we understand this process, the magic disappears and we can appreciate the system for what it is. ChatGPT is essentially a complex search engine operating at a very large scale, with an element of randomness to mimic creativity. An ordinary search engine searches pre-existing data; ChatGPT instead searches for patterns of token sequences, which is why a token is not a word but some sequence of characters from broken-up words. Thus, the output of ChatGPT is based on statistical likelihood rather than any contemplation by the system. This should dispel any notion of infallibility we might have attached to ChatGPT, misled by the connotation of the term ‘Intelligence’ in AI.
As with the internet, WhatsApp or saloon gossip, so with ChatGPT: the information fidelity is low. We cannot outsource intelligence; we need to depend on our own rationality to sieve information and arrive at valid conclusions. All these are just information systems that suffer from the ‘Garbage In, Garbage Out’ phenomenon. ChatGPT draws on a huge repertoire of information from varied sources, and is hence somewhat less susceptible to individual biases and errors. Even so, we need to use information from primary sources or good peer-reviewed sources, and from multiple sources, and then apply human rationality to it; there is no alternative yet. With this caveat, I encourage you to use ChatGPT extensively in daily life. It can do a lot of heavy lifting, just as calculators and computers sidestep the drudgery of computation: drafting letters, seeking references for some information, etc., while we humans concentrate on interpreting the results, thus improving productivity and the quality of our work. As ChatGPT uses patterns of tokens to form an answer, it gives better answers when our questions give it sufficient context about the subject of our inquiry. This knack of asking questions is what is now popularly called ‘Prompt Engineering’.
NB: the above pic is AI-generated, with the prompt: create a cyber-style image of Rodin's "Thinking Man"
Very nice elucidation. ChatGPT is, after all, not "intelligent" so much as it is an intelligent collator. If you are intelligent enough to understand how ChatGPT works and to ask intelligent questions, you will get solutions in return that will be useful to you🙂.