Wednesday, February 21, 2024

Call for Papers: Workshop on CBR and LLMs

Generated by Gemini
Last year was the most remarkable year in AI that I can recall. Large Language Models (LLMs) like ChatGPT changed the public perception of AI, and what had previously seemed like science fiction became a reality. I was only tangentially familiar with LLM research, having been working on emotion recognition in speech with a PhD student. Last year, however, I started diving into LLM research in depth, which, as one commentator put it, was like trying to drink from a fire hydrant, such was the volume of publications appearing on arXiv.

I view all problems through a lens coloured by case-based reasoning (CBR), my long-term AI research speciality. I quickly saw synergies between CBR and LLMs where both could benefit from each other's approaches, and I wrote up my initial thoughts and published them on arXiv.

CBR has an annual international conference, and I proposed the idea of a conference workshop on CBR-LLM synergies to some colleagues, who all thought it was a great idea and agreed to co-organise the workshop with me. The Case-Based Reasoning and Large Language Models Synergies Workshop will take place at ICCBR 2024 in Mérida, Yucatán, México on July 1st, 2024. The Call for Papers can be accessed here, and submissions are via EasyChair.

Thursday, February 15, 2024

A Long-term Memory for ChatGPT

Generated by Gemini

In October last year, I published a short position paper, A Case-Based Persistent Memory for a Large Language Model, arguing that ChatGPT and other LLMs need a persistent long-term memory of their interactions with a user to be truly useful. It seems OpenAI was listening because a couple of days ago, they announced that ChatGPT would retain a persistent memory of chats across multiple conversations. As reported in Wired, the memory will be used to add helpful background context to your prompts, improving their specificity to you over time. I argued in my October paper that the LLM community should look to the Case-Based Reasoning community for help with memory, since we are the discipline within AI that has been explicitly concerned with memory for decades. For example, we long ago realised that while remembering is vital, a memory must also be able to forget some things to remain functional. This is a non-trivial problem discussed in Smyth and Keane's 1997 paper Remembering to Forget: A Competence-Preserving Case Deletion Policy for Case-Based Reasoning Systems. The synergies between CBR and LLMs will be the focus of a workshop at ICCBR-24 in July in Mérida, Yucatán, México.

Thursday, January 4, 2024

Intelligent Agents: the transformative AI trend for 2024

Generated by DALL-E
As we move into 2024, the spotlight in AI will increasingly be on Intelligent Agents. As outlined in the influential paper by Wooldridge and Jennings (1995), agents are conceptualized as systems with autonomy, social ability, reactivity, and pro-activeness. Their evolution signifies a shift from mere tools to entities that can perceive, interact, and take initiative in their environments, aligning with the vision of AI as a field aiming to construct entities exhibiting intelligent behaviour.

The fusion of theory and practice in agent development is critical. Agent theories focus on conceptualizing and reasoning about agents' properties, architectures translate these theories into tangible systems, and languages provide the framework for programming these agents. This triad underpins the development of agents that range from simple automated processes to systems embodying human-like attributes such as knowledge, belief, and intention.

Ethan Mollick's exploration of GPTs (Generative Pre-trained Transformers) as interfaces to intelligent agents adds a contemporary dimension to this conversation. GPTs, in their current state, demonstrate the foundational capabilities for agent development - from structured prompts facilitating diverse tasks to integration with various systems. As envisioned by Wooldridge, Jennings, and Mollick, the future points towards agents integrated with a myriad of systems capable of tasks like managing expense reports or optimizing financial decisions.

Yet, this promising future has its challenges. The road to developing fully autonomous intelligent agents is fraught with technical and ethical considerations. Issues like logical omniscience in agent reasoning, the relationship between intention and action, and managing conflicting intentions remain unresolved. Mollick raises concerns about the vulnerabilities and risks in an increasingly interconnected AI landscape.

The explosion in agents will be fuelled, like throwing gasoline on a fire, by the opening of OpenAI's GPT store sometime in early 2024. Many online pundits will think "agents" are a new thing! But as this post shows, the ideas and a vast body of AI research date back to the mid-1990s and early 2000s, as exemplified by the Trading Agent Competition.

Intelligent Agents represent a transformative trend in AI for 2024 and beyond. Their development, grounded in a combination of theoretical and practical advancements, paves the way for a future where AI is not just a tool but a proactive, interactive, and intelligent entity. 

Thursday, December 28, 2023

Weizenbaum's ELIZA: A Reflection on AI and Transference

Generated by DALL-E
Sometimes, the simplest creations leave the most profound impacts. This was true for Joseph Weizenbaum's ELIZA, a chatbot I became familiar with during my MSc studies in 1985. My first assignment was to code a version of ELIZA in Prolog, and it was surprisingly easy. Yet, the implications of this simple program were anything but.

ELIZA, created in the mid-1960s, was one of the earliest examples of what we now call a chatbot. Its most famous script, DOCTOR, simulated a Rogerian psychotherapist. The program's simplicity was deceptive; it merely echoed user inputs back in the form of questions, yet it evoked profound emotional responses from users.

(You can try out ELIZA for yourself here.)

When I was tasked with coding ELIZA in Prolog as a new AI MSc student, I was struck by the simplicity of the task. Prolog, with its natural language processing capabilities, seemed almost tailor-made for this assignment. The ease with which I could replicate aspects of ELIZA's functionality was both exhilarating and unnerving. It was a testament to both the power of declarative AI programming languages like Prolog and the ingenious design of ELIZA.
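
To give a flavour of just how little machinery is involved, here is a minimal sketch in Python (rather than the Prolog of my original assignment) of the keyword matching and pronoun reflection that an ELIZA-style DOCTOR script relies on. The rules and canned responses below are illustrative inventions, not Weizenbaum's originals.

```python
import re
import random

# Swap first- and second-person words so the echo reads naturally.
REFLECTIONS = {
    "i": "you", "me": "you", "my": "your", "am": "are",
    "you": "I", "your": "my", "yours": "mine", "are": "am",
}

# A handful of illustrative DOCTOR-style rules: a pattern plus response templates.
RULES = [
    (r"i need (.*)", ["Why do you need {0}?", "Would it really help you to get {0}?"]),
    (r"i am (.*)", ["How long have you been {0}?", "Why do you think you are {0}?"]),
    (r"because (.*)", ["Is that the real reason?"]),
    (r"(.*)", ["Please tell me more.", "How does that make you feel?"]),
]

def reflect(fragment: str) -> str:
    """Replace 'I' with 'you', 'my' with 'your', and so on."""
    return " ".join(REFLECTIONS.get(word, word) for word in fragment.lower().split())

def respond(sentence: str) -> str:
    """Match the input against each rule in turn and echo it back as a question."""
    cleaned = sentence.lower().strip().rstrip(".")
    for pattern, responses in RULES:
        match = re.match(pattern, cleaned)
        if match:
            return random.choice(responses).format(*[reflect(g) for g in match.groups()])
    return "Please go on."

if __name__ == "__main__":
    print(respond("I need a holiday"))        # e.g. "Why do you need a holiday?"
    print(respond("I am feeling anxious"))    # e.g. "How long have you been feeling anxious?"
```

Weizenbaum's program, and our Prolog versions of it, did essentially the same thing on a larger scale: find a keyword, decompose the input around it, reflect the pronouns, and reassemble the fragment inside a question template.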

The real intrigue of ELIZA lies not in its technical complexity but in the psychological phenomenon it inadvertently uncovered, one first recognised by Freud: transference. Users often attributed understanding, empathy, and even human-like concern to ELIZA despite knowing it was a mere program. This highlighted the human tendency to anthropomorphise and to seek connection, even in unlikely places.

Joseph Weizenbaum himself was startled by this phenomenon. As a technologist who understood the mechanical underpinnings of ELIZA, he was disturbed by the emotional attachment users developed with the program. This led him to become a vocal critic of unrestrained AI development, warning of the ethical and psychological implications.

My journey with ELIZA and Prolog was more than an academic exercise; it was a window into the complex relationship between humans and AI. It highlighted the ease with which we can create seemingly intelligent systems and the profound, often unintended, psychological impacts they can have. As we venture further into the age of ChatGPT, Weizenbaum's cautionary tale remains as relevant as ever.

In an era where AI is more advanced and pervasive, revisiting the lessons from ELIZA and Weizenbaum's reflections, as highlighted in articles like this recent one from The Guardian, is crucial. It reminds us that in our quest to advance AI, we must remain mindful of the human element at the core of our interactions with machines. Weizenbaum's legacy, through ELIZA, is not just a technological artefact but a cautionary tale about the depth of human interaction with machines and the ethical boundaries we must navigate as we move ahead in the realm of AI.

Saturday, December 23, 2023

AI is (not) a bubble


Image generated by DALL-E
2023 has been an unprecedented year for Artificial Intelligence (AI). I know this because I have worked in the area since 1985 and have never seen AI get so much attention in the media. This is due to the release of ChatGPT and other generative AI applications based on Large Language Models, which have captured the public's attention like never before. Consequently, many pundits are naysayers, declaring that AI is a bubble bound to burst, leaving fortunes in tatters and start-ups bankrupt. Undeniably, there is a small and finite market for apps that help students cheat on their essays or create the perfect dating-site profile. However, AI is not a bubble.

This blog post by Cory Doctorow, What Kind of Bubble is AI?, is typical, making the common error of conflating AI with Large Language Models (LLMs) like ChatGPT. ChatGPT is merely one type of AI, a field with a 70+ year research and development history. Your smartphone map app uses the A* algorithm to find your route from A to B; it was developed at the Stanford Research Institute (SRI) in 1968 (the same place that later gave us Apple's Siri). Fuzzy logic manages the autofocus in your phone's camera. Case-based reasoning provides knowledge to the help-desk operator when you call an 0800 number, and there are countless other examples of different AI methods embedded in all aspects of modern society. We AI people call Large Language Models "Foundation Models" because they provide a foundation that other AIs can use to offer a two-way, multimodal conversational interface. Yes, they are expensive to build and train, but as their name suggests, you only need a few "Foundation" models to underlie a multitude of applications. This is a genuine breakthrough that will have a lasting impact on the uptake of AI once essay-cheating apps fall out of the public's focus.

Cory Doctorow's blog post, for example, says that "Radiologists might value the AI's guess about whether an X-ray suggests a cancerous mass. But with AIs' tendency to "hallucinate" and confabulate, there's an increasing recognition that these AI judgments require a "human in the loop" to carefully review their judgments." This mistakenly assumes that medical image analysis uses the same techniques as LLMs like ChatGPT. It does not; AI-assisted radiology is a mature application built on rigorously tested machine-learning algorithms that do not "guess" or "hallucinate". A recently published paper, Redefining Radiology: A Review of Artificial Intelligence Integration in Medical Imaging, by Reabal Najjar (Diagnostics 2023, 13, 2760, https://doi.org/10.3390/diagnostics13172760), details the development of AI-assisted medical imaging. The article clearly shows that AI is now a fixture in medical image analysis and diagnosis, although there is always room for improvement.

AI is just coming of age. ChatGPT has focused a spotlight on AI, which is now mature enough, and has the processing power available in the cloud, to succeed. Why wasn't A*-powered navigation a thing in the 1960s? Back then, there simply wasn't enough portable processing power (or GPS). 2024 is going to be the year of "agents." OpenAI's release of its GPT Builder and an app store for GPTs that can interact with a myriad of online resources and tools will focus attention on the notion of intelligent agents. Many ill-informed pundits will think this is a brand-new invention, whereas, once again, intelligent agents are a mature discipline within AI dating back to the mid-1990s. The review paper by Michael Wooldridge and Nicholas Jennings, Intelligent Agents: Theory and Practice (Knowledge Engineering Review 10(2), 1995), would be an excellent place to start and to see that agents won't be a flash in the pan either.

Undeniably, there is a lot of hype around AI, but within the bubble is a solid core of mature technologies ready to be exploited by people with knowledge and imagination. 

Tuesday, November 28, 2023

Moore's Law visualised

Last week, a photo from my social media feeds perfectly illustrated Moore's Law. It shows a computer being manhandled into a local government building in 1957. A little Internet sleuthing identified it as an Elliott Series 405 and turned up its full spec. These English business computers were 32-bit machines with 8k of memory. And that's not the entire computer; there were bulky peripherals too, and a typical installation cost around £85,000, which is about $1,094,915 (USD) in today's money.

The computer below, photographed against the same building, is a Raspberry Pi. Even a base model has 1GB of RAM and costs $100 or less. Together, the two photos are a beautiful illustration of Moore's Law, named after the late Gordon Moore, co-founder of Intel, who observed that the number of transistors in an integrated circuit doubles about every two years, the corollary being that the cost of computation falls just as relentlessly.
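
As a back-of-the-envelope illustration (assuming the Elliott's "8k" to be 8 kilobytes and using the approximate prices quoted above, so the figures are indicative only), the gulf between the two machines can be expressed in doublings:

```python
import math

# Rough, illustrative figures: the Elliott 405's "8k" is assumed here to be
# 8 kilobytes, and the prices are the approximate ones quoted above.
elliott_memory_bytes = 8 * 1024       # ~8 KB in 1957
pi_memory_bytes = 1 * 1024**3         # 1 GB in a base-model Raspberry Pi
elliott_price_usd = 1_094_915         # ~£85,000 in today's money
pi_price_usd = 100

memory_ratio = pi_memory_bytes / elliott_memory_bytes
price_ratio = elliott_price_usd / pi_price_usd
doublings = math.log2(memory_ratio)
years = 2023 - 1957

print(f"Memory grew by a factor of {memory_ratio:,.0f} (about {doublings:.0f} doublings)")
print(f"Price fell by a factor of roughly {price_ratio:,.0f}")
print(f"That is one memory doubling every {years / doublings:.1f} years over {years} years")
```

Strictly speaking, Moore's Law is about transistor counts rather than installed memory or street prices, so this anecdote is only a loose proxy, but it conveys the scale of the change.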

Thursday, November 23, 2023

Has AI passed the Turing Test?

The Turing Test

In 1950, the British mathematician Alan Turing published a paper called Computing Machinery and Intelligence. The paper opens with the remarkable sentence, "I propose to consider the question 'Can machines think?'" Remember that back in 1950, there were only a few computers in the world, and they were used exclusively for mathematical and engineering purposes. In this paper, Turing describes The Imitation Game, which we now call the Turing Test for machine intelligence. The test is quite simple: an interrogator using a teletype has to converse via a Q&A session with two hidden entities. One is a person, and the other is an AI chatbot. If the interrogator guesses wrong, that is, identifies the chatbot as the human, then the computer has passed the Turing Test. Remember, Turing called this the Imitation Game; hence, the computer is successfully imitating intelligence. We can leave philosophers to decide whether the computer is actually intelligent (note: no group of philosophers will ever agree on this).

Now consider the maths of the Turing Test. If the interrogator simply guesses randomly between human and computer, paying no attention to the merits of the Q&A session, they will be correct fifty per cent of the time, since there are only two options. So, in a large experiment, interrogators need to identify the computer correctly significantly more often than fifty per cent of the time to show that the AI has failed the Turing Test.

One such large experiment involving three large language models, including GPT-4 (the AI behind ChatGPT), has recently been published: Human or Not? A Gamified Approach to the Turing Test. Over 1.5 million participants spent two minutes chatting with either a person or an AI. The AI was prompted to make small spelling mistakes and to quit if the tester became aggressive. With this prompting, interrogators could only correctly guess that they were talking to an AI system 60% of the time, a little better than random chance.

However, if the AI was prompted to be vulgar and use rude language, its success increased, and interrogators identified it correctly only 52.1% of the time, leading the authors to observe "that users associated impoliteness with human behaviour."
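
To make the earlier point about "significantly more than fifty per cent" concrete, here is a small sketch using a normal approximation to the binomial. The sample sizes are illustrative assumptions, not the study's actual numbers; the point is how sample size turns a modest margin over chance into a statistically clear-cut result:

```python
import math

def z_score(correct: int, trials: int, p_chance: float = 0.5) -> float:
    """How many standard deviations an observed number of correct
    identifications sits above the 50% expected from pure guessing
    (normal approximation to the binomial)."""
    expected = p_chance * trials
    std_dev = math.sqrt(trials * p_chance * (1 - p_chance))
    return (correct - expected) / std_dev

# Illustrative sample sizes paired with the identification rates quoted above.
for trials, rate in [(100, 0.60), (10_000, 0.60), (10_000, 0.521)]:
    z = z_score(round(trials * rate), trials)
    print(f"{rate:.1%} correct over {trials:,} trials -> z = {z:.1f}")

# 60% over 100 trials is only about 2 standard deviations above chance,
# but over 10,000 trials it is about 20, and even 52.1% is about 4 --
# with enough trials, a small margin over 50% is statistically meaningful
# even though it is practically unimpressive.
```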

Turing himself set a low threshold for passing his eponymous test: "I believe that in 50 years’ time, it will be possible to make computers play the imitation game so well that an average interrogator will have no more than 70% chance of making the right identification after 5 minutes of questioning.” Well, it has been more than seventy years, but AI has now reduced the chance of identification to 60%, and to barely better than guesswork if the AI curses.

This is a historic milestone. Passing the Turing Test has been held up as a grand challenge for AI since Turing's paper was first published, akin to summiting Everest or splitting the atom. The philosophers (and theologians) will continue to argue about the nature of intelligence, consciousness and free will, while computer scientists continue developing machines that imitate intelligence.