1.1 Introduction
Artificial Intelligence (AI) has made rapid progress in recent years. From smart speakers and question-answering chatbots, to factory robots and self-driving cars, to AI-generated music, artwork and perfumes, to game-playing and debating systems, we have experienced the transition of AI from a largely theoretical discipline into a practical tool empowering a plethora of new applications. Some might say that "AI is the new IT (Information Technology)," and we are seeing the evidence across the industry: machine learning and other foundational AI subjects have record-breaking enrollment on university campuses, while AI-enabled tools are already assisting doctors to spot melanoma, recruiters to find qualified candidates, and banks to decide to whom to extend a loan. Algorithms are powering product recommendations, targeted advertising, essay grading, employee promotion and retention, risk scoring, image labelling, fraud detection, cybersecurity defenses, and a host of other applications.
The explosion and broad adoption of algorithmic decision-making has spurred a great amount of interest and triggered a variety of reactions (along with a substantial amount of "hype"), from excitement about how AI capabilities will augment human decision-making and improve business performance, to questions about fairness and ethics, fears of job eliminations and economic disparity, and even speculation about a threat to humanity. The term "AI" itself has evolved and has come to mean different things to different people; it includes machine learning, neural networks, and deep learning, but has also become an umbrella term for many other analytics- and data-related subjects (part of the "AI is the new IT" phenomenon).
The goal of this chapter is to give a brief introduction to AI and describe its evolution from the current "narrow" state, to a point where capabilities are more advanced and "broad", through to a futuristic state of "general AI". We also explore considerations for organizations and management, including the role of AI in business operations tasks such as strategy planning, marketing, product design, and customer support. Finally, we detail requirements for organizations in defining a comprehensive AI strategy, supported by an understanding of the value of AI to the organization and a focus on the needs, including data and skills, for executing that strategy appropriately.
1.1.1 Defining AI
"Viewed narrowly, there seem to be almost as many definitions of intelligence as there were experts asked to define it," observed Richard Gregory in his book The Oxford Companion to the Mind (Gregory 1998), while one study identifies over 70 definitions of intelligence (Legg and Marcus 2007). Broadly speaking, AI is a field of computer science that studies how machines can be made to act intelligently. AI has many functions, including, but not limited to:
Learning, which includes approaches for learning patterns from data. The two main types of learning are unsupervised and supervised. In unsupervised learning, the computer learns directly from raw data, whereas in supervised learning, human input is provided to label or identify important aspects of the data, and these labels guide the training. Deep learning is a specialized class of primarily supervised learning built on artificial neural networks;
Understanding, which includes techniques for knowledge representation required for domain-specific tasks, such as medicine, accounting, and law;
Reasoning, which comes in several varieties, such as deductive, inductive, temporal, probabilistic, and quantitative; and
Interacting, which includes working with people or other machines to perform tasks collaboratively, and interacting with the environment.
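The distinction between supervised and unsupervised learning drawn in the list above can be illustrated with a minimal sketch. The data values, the label names, and the choice of a nearest-centroid classifier and 2-means clustering are assumptions made for illustration only; they are not methods prescribed by this chapter.

```python
# Toy 1-D data: two groups of measurements (values invented for illustration).
points = [1.0, 1.2, 0.8, 5.0, 5.3, 4.9]
labels = ["low", "low", "low", "high", "high", "high"]  # human-provided labels


def train_supervised(xs, ys):
    """Supervised: use the labels to compute one mean (centroid) per class."""
    sums, counts = {}, {}
    for x, y in zip(xs, ys):
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}


def predict(centroids, x):
    """Assign x to the class whose centroid is closest."""
    return min(centroids, key=lambda y: abs(centroids[y] - x))


def cluster_unsupervised(xs, iterations=10):
    """Unsupervised: 2-means clustering on the raw values; no labels are used."""
    centers = [min(xs), max(xs)]  # simple deterministic initialisation
    groups = [[], []]
    for _ in range(iterations):
        groups = [[], []]
        for x in xs:
            nearest = 0 if abs(x - centers[0]) <= abs(x - centers[1]) else 1
            groups[nearest].append(x)
        # Recompute each center; keep the old one if a group is empty.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups


centroids = train_supervised(points, labels)
centers, groups = cluster_unsupervised(points)
print(predict(centroids, 4.5))  # classification learned from labels
print(groups)                   # grouping discovered without labels
```

In the supervised case the class names come from the human-provided labels, while in the unsupervised case the algorithm only discovers that the values fall into two groups and has no names for them.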
1.1.2 A Brief History of AI
AI has received significant attention recently, but it is not a new concept. The idea of creating a "thinking" machine precedes modern computing. For example, the study of formal reasoning dates back to ancient thinkers such as Aristotle and Euclid. Calculating machines were built in antiquity and were improved throughout history by many mathematicians. In the seventeenth century, Leibniz, Hobbes and Descartes explored the possibility that all rational thought could be made as systematic as algebra or geometry. The concept of an artificial neural network is not new either. In 1943, Warren S. McCulloch, a neuroscientist, and Walter Pitts, a logician, tried to understand how the brain could produce highly complex patterns by using many basic cells, neurons, that are connected together, and they outlined a highly simplified computational model of a neuron (McCulloch and Pitts 1943). This work made an important contribution to the development of artificial neural networks, which are the underpinning of many AI systems today. Another important contribution was made by Donald Hebb, who proposed that neural pathways strengthen with each successive use, especially those between neurons that tend to fire at the same time (Hebb 1949). This idea is the basis of Hebbian learning and, in the context of artificial neural networks, of the process of setting and learning the weights between neurons in the model.
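The two ideas in the paragraph above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not a reproduction of the original formulations: the neuron is modeled as a binary threshold unit, and the Hebbian rule is written in its common textbook form, where a weight grows in proportion to the product of input and output activity. The function names and the learning rate are hypothetical.

```python
def threshold_neuron(inputs, weights, threshold):
    """Fire (1) if the weighted sum of binary inputs reaches the threshold, else 0."""
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total >= threshold else 0


def logical_and(a, b):
    # Both inputs must be active to reach a threshold of 2.
    return threshold_neuron([a, b], [1, 1], threshold=2)


def logical_or(a, b):
    # A single active input is enough to reach a threshold of 1.
    return threshold_neuron([a, b], [1, 1], threshold=1)


def hebbian_update(weights, inputs, output, rate=0.1):
    """Strengthen each weight in proportion to (input activity * output activity)."""
    return [w + rate * x * output for w, x in zip(weights, inputs)]


# Only the connection whose input was active while the neuron fired is strengthened.
updated = hebbian_update([0.5, 0.5], inputs=[1, 0], output=1)
print(updated)
```

With unit weights, changing only the threshold turns the same unit into an AND gate or an OR gate, which hints at how networks of such simple units can compute more complex functions.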
In 1950, Alan Turing published his seminal paper "Computing Machinery and Intelligence", where he laid out several criteria to assess whether a machine could be deemed intelligent. These have since become known as the "Turing test" (Turing 1950). The term "artificial intelligence" was coined in 1955 by John McCarthy, then a young assistant professor of mathematics at Dartmouth College, when he decided to organize a working group to clarify and develop ideas about thinking machines. The workshop was held at Dartmouth in the summer of 1956, and AI as an academic discipline took off. Three years later, in 1959, IBM scientist Arthur Samuel coined the term "machine learning" to refer to computer algorithms that learn from and make predictions on data by building a model from sample inputs, without following a set of static instructions. Machine learning techniques were core to Samuel's game-playing program for checkers. It was the first game-playing program to achieve sufficient skill to challenge a world champion. Game playing continued to be a way to challenge AI and measure its progress over the next few decades, with applications in checkers, chess, backgammon and Go.
The period from 1956 to 1974 was known as the "golden years of AI". Many prominent scientists believed that breakthroughs were imminent, and government and industrial sponsors flooded the field with grants.
The field of AI has gone through phases of rapid progress and hype in the past, quickly followed by a cooling in investment and interest, often referred to as "AI winters." The first AI winter occurred in the 1970s as AI researchers underestimated the difficulty of problems they were trying to solve. Once the breakthroughs failed to materialize, government and other funding dried up. During an AI winter, AI research programs had to disguise their research under different names (e.g. "pattern recognition", "informatics", "knowledge-based system") in order to receive funding.
Starting in the mid-seventies, by focusing on methods for knowledge representation, researchers began to build practically usable systems. AI came back in the form of expert systems: programs that answer questions or solve problems in a specific narrow domain, using logical rules that encapsulate and implement the knowledge of a subject matter expert. As an example, in 1980, Digital Equipment Corporation (DEC) deployed R1 to assist in the ordering of DEC's VAX computer systems by automatically selecting components based on the customer's requirements. By 1986, R1 had about 2500 rules, had processed 80,000 orders, and achieved 95-98% accuracy; it was saving the company $40M per year by reducing the need to give customers free components when technicians made errors, speeding the assembly process, and increasing customer satisfaction (Crevier 1993).
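The rule-based reasoning described above can be sketched as a small forward-chaining loop, one common way such systems derive conclusions from facts. The rules and fact names below are invented for illustration and are not taken from R1; the inference strategy shown is likewise an assumption, not a description of how R1 worked internally.

```python
# Each rule: (set of facts that must all hold, fact to conclude).
# Hypothetical rules and fact names, invented for this example.
RULES = [
    ({"needs_database_server"}, "needs_large_disk"),
    ({"needs_large_disk"}, "add_disk_controller"),
    ({"needs_large_disk", "rack_mounted"}, "add_rack_rail_kit"),
]


def forward_chain(initial_facts, rules):
    """Apply rules repeatedly until no rule can add a new fact."""
    facts = set(initial_facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            # A rule fires when all of its conditions are already known facts.
            if conditions <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts


order = forward_chain({"needs_database_server", "rack_mounted"}, RULES)
print(sorted(order))
```

The expert's knowledge lives entirely in the `RULES` table, so extending the system means adding rules rather than changing the inference code, which is part of what made this style of system practical to maintain in a narrow domain.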
The 1980s also saw the birth of Cyc, the first attempt to create a database that contains the general knowledge most individuals are expected to have, with the goal of enabling AI applications to perform human-like reasoning. The Cyc project continues to this day, under the umbrella of Cycorp. The Cyc ontology grew to about 100,000 terms during the first decade of the project, and as of 2017 it contained about 1,500,000 terms.
In 1989, the chess-playing programs HiTech and Deep Thought defeated chess masters. Both were developed at Carnegie Mellon University, and they paved the way for Deep Blue, a chess-playing computer system developed by IBM and the first computer to win both a chess game and a chess match against a reigning world champion.
1.1.3 The Rise of Machine Learning and Neural Networks
Artificial neural networks are inspired by the architecture of the human brain. They contain many interconnected processing units, called artificial neurons, which are analogous to biological neurons in the brain. Each neuron takes one or more inputs and transforms them into an output, typically by computing a weighted sum and applying a nonlinear function to it. Neurons are usually organized in layers. Different layers may perform different kinds of operations on their inputs, while the connections between the neurons carry weights, echoing the concept of Hebbian learning.
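As a rough sketch of the layered structure described above, the following code passes an input through two layers of artificial neurons. The sigmoid activation, the layer sizes, and all weight and bias values are assumptions chosen for illustration.

```python
import math


def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def layer_forward(inputs, weights, biases):
    """One layer: each neuron takes a weighted sum of all inputs, adds a bias, applies sigmoid."""
    return [
        sigmoid(sum(w * x for w, x in zip(neuron_weights, inputs)) + b)
        for neuron_weights, b in zip(weights, biases)
    ]


def network_forward(inputs, layers):
    """Feed the outputs of each layer into the next one."""
    activations = inputs
    for weights, biases in layers:
        activations = layer_forward(activations, weights, biases)
    return activations


# Illustrative weights: one row of weights per neuron.
layers = [
    ([[0.5, -0.4], [0.3, 0.8]], [0.0, 0.1]),  # hidden layer: 2 neurons, 2 inputs each
    ([[1.0, -1.0]], [0.0]),                   # output layer: 1 neuron, 2 inputs
]
output = network_forward([1.0, 0.0], layers)
print(output)
```

Everything the network "knows" is held in the weights and biases; learning consists of adjusting those numbers.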
For close to three decades, symbolic AI dominated both research and commercial applications of AI. Even though artificial neural networks and other machine learning algorithms were actively researched, their practical use was hindered by insufficient computational power and the lack of digitized data from which to learn. It was only in the mid-1980s that the rediscovery of an already known concept pushed neural nets into the mainstream of AI. Backpropagation, a method for training neural nets devised by researchers in the 1960s, was revisited by Rumelhart, Hinton, and Williams; their paper outlined a clear and concise formulation of the technique, which then made its way into the mainstream of machine learning research (Rumelhart et al. 1986). The ability to train practical neural networks, the intersection of computer science and statistics, and rapidly increasing computational power together led to a shift in the dominant AI paradigm from symbolic AI and knowledge-driven approaches to machine learning and data-driven approaches. Scientists started to build systems that were able to analyze and learn from large amounts of labeled data, and applied them to diverse application areas, such as data mining, speech recognition, optical character recognition, image processing, and computer vision.
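To make the idea of backpropagation concrete, here is a minimal sketch for a tiny network with two inputs, two hidden neurons, and one output. It is an illustration under stated assumptions rather than the formulation from the cited paper: sigmoid activations, a squared-error loss, no bias terms, a single training example, and arbitrary starting weights and learning rate.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w_hidden, w_out):
    """Return the hidden activations and the network output (biases omitted for brevity)."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    output = sigmoid(sum(w * h for w, h in zip(w_out, hidden)))
    return hidden, output


def loss(x, target, w_hidden, w_out):
    _, output = forward(x, w_hidden, w_out)
    return 0.5 * (output - target) ** 2


def backprop(x, target, w_hidden, w_out):
    """Gradients of the loss with respect to every weight, via the chain rule."""
    hidden, output = forward(x, w_hidden, w_out)
    # Error signal at the output neuron (derivative of loss w.r.t. its weighted sum).
    delta_out = (output - target) * output * (1.0 - output)
    grad_out = [delta_out * h for h in hidden]
    # Propagate the error signal backwards to each hidden neuron.
    delta_hidden = [delta_out * w * h * (1.0 - h) for w, h in zip(w_out, hidden)]
    grad_hidden = [[d * xi for xi in x] for d in delta_hidden]
    return grad_hidden, grad_out


def train_step(x, target, w_hidden, w_out, rate=0.5):
    """One gradient-descent update: move every weight against its gradient."""
    grad_hidden, grad_out = backprop(x, target, w_hidden, w_out)
    new_hidden = [[w - rate * g for w, g in zip(row, grad_row)]
                  for row, grad_row in zip(w_hidden, grad_hidden)]
    new_out = [w - rate * g for w, g in zip(w_out, grad_out)]
    return new_hidden, new_out


x, target = [1.0, 0.5], 1.0
w_hidden = [[0.2, -0.3], [0.4, 0.1]]
w_out = [0.3, -0.2]

before = loss(x, target, w_hidden, w_out)
for _ in range(100):
    w_hidden, w_out = train_step(x, target, w_hidden, w_out)
after = loss(x, target, w_hidden, w_out)
print(before, after)  # the loss should shrink as the weights are adjusted
```

The key point is that the error measured at the output is reused to compute the gradients for the earlier layer, so the cost of training grows with the number of weights rather than requiring a separate experiment per weight.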
The first decades of the twenty-first century saw the explosion of digital data. The growth in processing speed and power, and the availability of specialized processing devices such as graphics processing units (GPUs), finally intersected with large-enough data sets that had been collected and labeled by humans. This allowed researchers to build larger neural networks, called deep learning networks, capable of performing complex, human-like tasks with great accuracy and, in many cases, achieving super-human performance. Today, deep learning powers a variety of applications, including computer vision, speech recognition, machine translation, friend recommendations on social networks, playing board and video games, home assistants, conversational devices and chatbots, medical diagnostics, self-driving cars, and operating robots.