What is AI?
A common definition of AI is that it seeks to mimic or simulate the intelligence of the human mind. As Margaret Boden puts it, "AI seeks to make computers do the sorts of things that minds can do."1 Meanwhile, John Kelleher describes AI as "that field of research that is focused on developing computational systems that can perform tasks and activities normally considered to require human intelligence".2 The implication here is that AI might soon be able to take over these tasks and activities.3
In the longer term, however, AI is likely to exceed the intelligence of the human mind.4 Human intelligence does not constitute the absolute pinnacle of intelligence. It merely constitutes "human-level intelligence". After all, there are already specific domains, such as the games of Chess and Go, where AI outperforms human intelligence. In the future, there are likely to be forms of intelligence that far exceed the intelligence of the human mind. Alternatively, then, we could define research into AI as an attempt to understand intelligence itself.
Typically the tasks performed by AI involve learning and problem-solving. But not all these tasks require intelligence.5 Some, for example, merely involve vision or speech recognition. Importantly, however, as Boden notes, they all relate to our cognitive abilities: "All involve psychological skills – such as perception, association, prediction, planning, motor control – that enable humans and animals to attain their goals."6 Thus, although some of the operations included in research into AI are not intelligent in themselves, they must nonetheless be included in any purview of AI, as they are crucial "characteristics or behaviours" in the field of AI.7
When defining AI, it is necessary to draw up a series of distinctions. First and foremost, although the term "intelligence" is used for both human beings and machines, we must be careful to distinguish AI from human intelligence. For the moment, at any rate, AI does not possess consciousness.8 This is important to recognise. For example, AI might be capable of beating humans at a game of Chess or Go, but this does not mean that AI is aware that it is playing a game of Chess or Go.
At present, then, we are still limited to a relatively modest realm of AI known as "narrow AI", which – as its name implies – is narrow and circumscribed in its potential.9 "Narrow AI" – also known as "weak AI" – needs to be distinguished from "strong AI" or artificial general intelligence (AGI), which is AI with consciousness. At the moment, AGI remains a long way off. The only "examples" of AGI at present are pure fiction: film characters such as Agent Smith in The Matrix or Ava in Ex Machina.10 Nonetheless, some philosophers, such as David Chalmers, think that the development of GPT-3 by OpenAI has brought the possibility of AGI much closer.11
Classic examples of "narrow AI" would be Siri, Alexa, Cortana or any other form of AI assistant. However sophisticated these assistants might appear, they are operating within a limited range of predetermined functions. They cannot think for themselves any more than a pocket calculator can think, and are incapable of undertaking any activity that requires consciousness.12
The different forms of AI
The term AI is often used as though it is a singular, homogeneous category. Indeed, this is how the general public understands the term. However, there are in fact many different forms of AI, and even these can be further divided into a series of sub-categories. In order to understand AI, then, we need to differentiate the various forms of AI.
Within "narrow AI" we should make a further distinction between the broader category of AI itself, "machine learning" and "deep learning". These three can be seen to be nested within each other – somewhat like a series of Russian dolls, or layers in an onion – in that "deep learning" is part of "machine learning" that is itself part of AI. Early versions of AI referred to machines that had to be programmed to process a set of data. This is also known as "Classical AI", or even, somewhat disparagingly, "Good Old-Fashioned AI" (GOFAI). The important point here is that with early AI, the machine could only do what it was programmed to do. The machine itself could not learn.
FIGURE 1.1 Diagram of the relationship between deep learning, machine learning and AI. This Venn diagram illustrates how deep learning is nested inside machine learning that is itself nested within the broader category of AI.
By contrast, machine learning goes one step further, and is able to train itself using vast quantities of data. Importantly, machine learning challenges the popular myth that computers cannot do anything that they have not been programmed to do. As the term implies, machine learning is able to learn, and even programme itself, although we should be careful not to conflate the way that machines "learn" with human learning. Like other terms used for both AI and human intelligence, "learning" does not necessarily have the same meaning in both contexts.
Stuart Russell comments that "Learning is a key ability for modern artificial intelligence. Machine learning has always been a subfield of AI, and it simply means improving your ability to do the right thing, as a result of experience."13 Furthermore, learning is essential if AI is ever going to match human intelligence. As Pedro Domingos observes, "The goal of AI is to teach computers to do what humans currently do better, and learning is arguably the most important of those things: without it, no computer can keep up with a human for long; with it, the rest follows."14
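Russell's sense of learning – "improving your ability to do the right thing, as a result of experience" – can be made concrete with a minimal sketch. The data, parameter names and learning rate below are purely illustrative assumptions, not taken from any particular system: a programme starts with no knowledge of the rule behind some example data and gradually adjusts itself to fit it.

```python
# A minimal sketch of machine learning: fit y = w*x + b to example data
# by gradient descent. The data and all values here are illustrative.

data = [(1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]  # examples following y = 2x + 1

w, b = 0.0, 0.0   # the machine starts "knowing" nothing
lr = 0.05         # learning rate: how strongly each error adjusts the guess

for _ in range(2000):          # each pass over the data is more "experience"
    for x, y in data:
        err = (w * x + b) - y  # how wrong the current guess is on this example
        w -= lr * err * x      # nudge the parameters to reduce that error
        b -= lr * err

print(round(w, 2), round(b, 2))  # values close to 2.0 and 1.0
```

Nobody programmed the rule "y = 2x + 1" into the loop; it emerges from the examples, which is exactly the sense in which the machine has "learned" – and exactly why such learning should not be confused with human understanding of that rule.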
Deep learning is a relatively recent development within machine learning and has led to many significant advances in the field of AI. It is deep learning that is, for now at least, the most promising form of AI. In fact deep learning has become the dominant paradigm to the point that – somewhat misleadingly – it has become almost synonymous with AI, at least within the popular media.15 Indeed, whenever AI is mentioned in this book, it is invariably deep learning that is being referred to, and certainly not GOFAI. Importantly, however, deep learning depends on a vast amount of data, so much so that data, as is often said, has now become "the new oil".
Kelleher defines deep learning as "the subfield of machine learning that designs and evaluates training algorithms and architectures for modern neural network models".16 Here we come to an important point regarding the term "architecture". Confusingly, "architecture" is also used within computer science to refer to the internal organisation of a computer.17 Readers should therefore be careful not to confuse the "architecture" of computer science with the "architecture" of buildings and the construction industry.
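In the deep-learning sense of Kelleher's definition, an "architecture" simply describes how a neural network's layers are arranged and connected. The toy network below is a hedged illustration of that idea only – two inputs feeding two hidden units feeding one output – with random, untrained weights; every name and number in it is an assumption for the sketch, not a real system.

```python
import math
import random

random.seed(0)

# A minimal neural network "architecture": 2 inputs -> 2 hidden units -> 1 output.
# The weights are random and untrained; the point is only the layered structure.

def layer(inputs, weights, biases):
    """One fully connected layer with a sigmoid activation."""
    return [
        1.0 / (1.0 + math.exp(-(sum(w * x for w, x in zip(ws, inputs)) + b)))
        for ws, b in zip(weights, biases)
    ]

w1 = [[random.uniform(-1, 1) for _ in range(2)] for _ in range(2)]  # input -> hidden
b1 = [0.0, 0.0]
w2 = [[random.uniform(-1, 1) for _ in range(2)]]                    # hidden -> output
b2 = [0.0]

hidden = layer([0.5, -0.3], w1, b1)  # first layer of the architecture
output = layer(hidden, w2, b2)       # second layer
print(output)                        # a single value between 0 and 1
```

"Deep" learning, on this picture, just means stacking many more such layers; the design choices – how many layers, how wide, how they connect – are what the term "architecture" refers to here, and they have nothing to do with buildings.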
Although we might trace its origins back to the neural networks developed by McCulloch and Pitts in 1943, deep learning has since developed at an astonishing rate. Several factors have contributed to this:
1. Major advances in algorithms have fuelled the deep learning breakthrough.
2. Cloud services have made access to significant computational power possible.
3. There has been a significant influx of capital investment from both the public and private sectors.18
4. There are now significantly more students in the fields of computer science and data science.
5. There has been an exponential increase in the amount of data generated.19
In short, the differences between early neural networks and more recent neural networks used in deep learning should not be underestimated. There is an enormous gulf between these two in terms of their performance and capabilities. Think of the difference between early cars – once known as "horseless carriages" – and the sophisticated, potentially self-driving cars of today.
Training techniques
Further, we need to distinguish the...