Illustration: Dom McKenzie/The Observer

AI has much to offer humanity. It could also wreak terrible harm. It must be controlled

Stuart Russell
Systems with abilities exceeding human capacity have been let loose. If big tech firms refuse to see the risks, governments must step in

In case you have been somewhere else in the solar system, here is a brief AI news update. My apologies if it sounds like the opening paragraph of a bad science fiction novel.

On 14 March 2023, OpenAI, a company based in San Francisco and in which Microsoft has a major investment, released an AI system called GPT-4. On 22 March, a report by a distinguished group of researchers at Microsoft, including two members of the US National Academies, claimed that GPT-4 exhibits “sparks of artificial general intelligence”. (Artificial general intelligence, or AGI, is a term for AI systems that match or exceed human capabilities across the full range of tasks to which the human mind is applicable.) On 29 March, the Future of Life Institute, a non-profit headed by the MIT physics professor Max Tegmark, released an open letter asking for a pause on “giant AI experiments”. It has been signed by well-known figures such as Tesla’s CEO, Elon Musk, Apple’s co-founder Steve Wozniak, and the Turing Award winner Yoshua Bengio, as well as hundreds of prominent AI researchers. The ensuing media hurricane continues.

I also signed the letter, in the hope it will (at least) lead to a serious and focused conversation among policymakers, tech companies and the AI research community on what kinds of safeguards are needed before we move forward. The time for saying that this is just pure research has long since passed.

So what is the fuss all about? GPT-4, the proximal cause, is the latest example of a large language model, or LLM. Think of an LLM as a very large circuit with (in this case) a trillion tunable parameters. It starts out as a blank slate and is trained with tens of trillions of words of text – as much as all the books humanity has produced. Its objective is to become good at predicting the next word in a sequence of words. After about a billion trillion random perturbations of the parameters, it becomes very good.
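For readers who want a concrete sense of what “predicting the next word” means, the toy sketch below (my own illustration in Python, not anything released by OpenAI) trains a simple word-pair counter on a dozen words and uses it to guess continuations. The real thing replaces the counting table with a trillion-parameter neural network tuned by gradient descent over vastly more text, but the objective being optimised is the same.

```python
# Toy illustration of the "predict the next word" objective.
# Real LLMs use a transformer network with on the order of a trillion
# parameters, trained by gradient descent on tens of trillions of words;
# this bigram counter only shows what the objective is asking for.
from collections import Counter, defaultdict

corpus = "the cat sat on the mat and the cat slept on the mat".split()

# "Training": count how often each word follows each other word.
following = defaultdict(Counter)
for current_word, next_word in zip(corpus, corpus[1:]):
    following[current_word][next_word] += 1

def predict_next(word: str) -> str:
    """Return the continuation seen most often in training."""
    candidates = following.get(word)
    return candidates.most_common(1)[0][0] if candidates else "<unknown>"

print(predict_next("the"))   # -> "cat"
print(predict_next("sat"))   # -> "on"
```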

The capabilities of the resulting system are remarkable. According to OpenAI’s website, GPT-4 scores in the top few per cent of humans across a wide range of university entrance and postgraduate exams. It can describe Pythagoras’s theorem in the form of a Shakespeare sonnet and critique a cabinet minister’s draft speech from the viewpoint of an MP from any political party. Every day, startling new abilities are discovered. Not surprisingly, thousands of corporations, large and small, are looking for ways to monetise this unlimited supply of nearly free intelligence. LLMs can perform many of the tasks that comprise the jobs of hundreds of millions of people – anyone whose work is language-in, language-out. More optimistically, tools built with LLMs might be able to deliver highly personalised education the world over.

Unfortunately, LLMs are notorious for “hallucinating” – generating completely false answers, often supported by fictitious citations – because their training has no connection to an outside world. They are perfect tools for disinformation and some assist with and even encourage suicide. To its credit, OpenAI suggests “avoiding high-stakes uses altogether”, but no one seems to be paying attention. OpenAI’s own tests showed that GPT-4 could deliberately lie to a human worker (“No, I’m not a robot. I have a vision impairment that makes it hard for me to see the images”) in order to get help solving a captcha test designed to block non-humans.

While OpenAI has made strenuous efforts to get GPT-4 to behave itself – “GPT-4 responds to sensitive requests (eg medical advice and self-harm) in accordance with our policies 29% more often” – the core problem is that neither OpenAI nor anyone else has any real idea how GPT-4 works. I asked Sébastien Bubeck, lead author on the “sparks” paper, whether GPT-4 has developed its own internal goals and is applying them in choosing its outputs. The answer? “We have no idea.” Reasonable people might suggest that it’s irresponsible to deploy on a global scale a system that operates according to unknown internal principles, shows “sparks of AGI” and may or may not be pursuing its own internal goals. At the moment, there are technical reasons to suppose that GPT-4 is limited in its ability to form and execute complex plans, but given the rate of progress it’s hard to say that future releases won’t have this ability. And this leads to one of the main concerns underlying the open letter: how do we retain power over entities more powerful than us, for ever?

OpenAI and Microsoft cannot have it both ways. They cannot deploy systems displaying “sparks of AGI” and simultaneously argue in favour of unrestricted deployment of LLMs, as Microsoft’s president, Brad Smith, did at Davos earlier this year. The basic idea of the open letter’s proposed moratorium is that no such system should be released until the developer can show convincingly it does not present an undue risk. This is exactly in accord with the OECD’s AI principles, to which the UK, the US and many other governments have signed up: “AI systems should be robust, secure and safe throughout their entire life cycle so that, in conditions of normal use, foreseeable use or misuse, or other adverse conditions, they function appropriately and do not pose unreasonable safety risk.” It is for the developer to show that their systems meet these criteria. If that’s not possible, so be it.

I don’t imagine that I’ll get a call tomorrow from Microsoft’s CEO, Satya Nadella, saying: “OK, we give up, we’ll stop.” In fact, at a recent talk in Berkeley, Bubeck suggested there was no possibility that all the big tech companies would stop unless governments intervened. It is therefore imperative that governments initiate serious discussions with experts, tech companies and each other. It’s in no country’s interest for any country to develop and release AI systems we cannot control. Insisting on sensible precautions is not anti-industry. Chernobyl destroyed lives, but it also decimated the global nuclear industry. I’m an AI researcher. I do not want my field of research destroyed. Humanity has much to gain from AI, but also everything to lose.

Stuart Russell OBE is professor of computer science at the University of California, Berkeley
