What Is AI?
The goal of AI is to build machines that are capable of performing tasks that we define as requiring intelligence, such as reasoning, learning, planning, problem-solving, and perception. The field was given its name by computer scientist John McCarthy, who, along with Marvin Minsky, Nathaniel Rochester, and Claude Shannon, organized the Dartmouth Conference in 1956 (McCarthy, Minsky, Rochester, & Shannon, 1955). The goal of the conference was to bring together leading experts to set forward a new field of science involving the study of intelligent machines. A central premise discussed at the conference was that “Every aspect of learning or any other feature of intelligence can be so precisely described that a machine can be made to simulate it” (McCarthy et al., 1955). During the conference, Allen Newell, J.C. Shaw, and Herbert Simon demonstrated the Logic Theorist (LT), the first computer program deliberately engineered to mimic the problem-solving skills of a human being (Newell & Simon, 1956).
Over the last 60 years, AI has grown into a multidisciplinary field involving computer science, engineering, psychology, philosophy, ethics, and more. Some of the goals of AI are to design technology to accomplish very specialized functions, such as computer vision, speech processing, and analysis and prediction of patterns in data. This focus on specific intelligent tasks is referred to as Weak AI (sometimes called Applied AI or Narrow AI) (Velik, 2012). An example of a Weak AI machine is IBM’s Deep Blue chess-playing system that beat the world chess champion, Garry Kasparov, in 1997. Rather than simulating how a human would play chess, Deep Blue relied on brute-force techniques, calculating probabilities to determine its offensive and defensive moves. The term “Strong AI,” introduced by the philosopher John Searle in 1980 (Searle, 1980), refers to the goal of building machines with Artificial General Intelligence. The goal of Strong AI is thus to build machines with intellectual ability that is indistinguishable from that of human beings (Copeland, 2000). The overall aim of AI is not necessarily to build machines that mimic human intelligence; rather, intelligent machines are often designed to far exceed the capabilities of human intelligence, generally at narrow and specific tasks, such as the performance of mathematical operations.
The term AI is also sometimes used to describe the intelligent behavior of machines, such that a machine can be said to possess “AI” when it performs tasks that we consider intelligent. AI can be in the form of hardware or software that can be stand-alone, distributed across computer networks, or embodied in a robot. AI can also be in the form of intelligent autonomous agents (e.g., virtual or robotic) that are capable of interacting with their environment and making their own decisions. AI technology can also be coupled to biological processes (as in the case of brain–computer interfaces (BCIs)), made of biological materials (biological AI), or be as small as molecular structures (nanotechnology). For the purposes of this chapter, I use the term AI to refer to the field of science and AI technologies or intelligent machines to refer to technologies that perform intelligent functions.
Machine Learning and Artificial Neural Networks
Machine learning (ML) is a core branch of AI that aims to give computers the ability to learn without being explicitly programmed (Samuel, 2000). ML has many subfields and applications, including statistical learning methods, neural networks, instance-based learning, genetic algorithms, data mining, image recognition, natural language processing (NLP), computational learning theory, inductive logic programming, and reinforcement learning (for a review see Mitchell, 1997).
Essentially, ML is the capability of software or a machine to improve its performance of tasks through exposure to data and experience. A typical ML model first learns knowledge from the data it is exposed to and then applies this knowledge to make predictions about emerging (future) data. In supervised ML, the program is “trained” on a pre-defined set of labeled “training examples” (a “training set”) and learns to map inputs to known outputs. In unsupervised ML, the program is provided with unlabeled data and must discover the patterns and relationships in that data on its own.
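The distinction between the two paradigms can be made concrete with a minimal sketch in plain Python; the one-dimensional data points, class labels, and two-group clustering routine below are hypothetical and chosen purely for illustration:

```python
def train_supervised(examples):
    """Supervised: learn the mean of each labeled class from training examples."""
    sums, counts = {}, {}
    for x, label in examples:
        sums[label] = sums.get(label, 0.0) + x
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}

def predict(model, x):
    """Assign x to the class whose learned mean is nearest."""
    return min(model, key=lambda label: abs(model[label] - x))

def cluster_unsupervised(points, iterations=10):
    """Unsupervised: two-means clustering discovers groups without labels."""
    a, b = min(points), max(points)  # initial guesses for the two centers
    for _ in range(iterations):
        group_a = [p for p in points if abs(p - a) <= abs(p - b)]
        group_b = [p for p in points if abs(p - a) > abs(p - b)]
        a = sum(group_a) / len(group_a)
        b = sum(group_b) / len(group_b)
    return sorted(group_a), sorted(group_b)

# Supervised: class labels are provided during training.
model = train_supervised([(1.0, "low"), (1.2, "low"), (9.0, "high"), (9.5, "high")])
print(predict(model, 8.0))   # -> high

# Unsupervised: the same points, but no labels; structure must be discovered.
print(cluster_unsupervised([1.0, 1.2, 9.0, 9.5]))
```

The supervised routine can only predict labels it was shown during training, whereas the unsupervised routine recovers the two groups without ever seeing a label.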
The ability to search and identify patterns in large quantities of data and in some applications without a priori knowledge is a particular benefit of ML approaches. For example, ML software can be used to detect patterns in large electronic health record datasets by identifying subsets of data records and attributes that are atypical (e.g., indicate risks) or that reveal factors associated with patient outcomes (McFowland, Speakman, & Neill, 2013; Neill, 2012). ML techniques can also be used to automatically predict future patterns in data (e.g., predictive analytics or predictive modeling) or to help perform decision-making tasks under uncertainty. ML methods are also applied to Internet websites to enable them to learn the patterns of care seekers, adapt to their preferences, and customize information and content that is presented. ML is also the underlying technique that allows robots to learn new skills and adapt to their environment.
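As a toy illustration of flagging atypical records, the sketch below assumes a simple z-score criterion over one numeric attribute; real systems such as those cited above use far richer subset-scanning methods, and the readings and 2.0 threshold here are hypothetical:

```python
import statistics

def atypical_records(values, threshold=2.0):
    """Flag values more than `threshold` standard deviations from the mean."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)  # population standard deviation
    return [v for v in values if abs(v - mean) / sd > threshold]

# Hypothetical measurements; one value stands out from the rest.
readings = [98.1, 98.4, 98.0, 98.6, 98.2, 104.9]
print(atypical_records(readings))   # -> [104.9]
```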
Artificial neural networks (ANNs) are a type of ML technique that simulates the structure and function of neuronal networks in the brain. With traditional digital computing, the computational steps are sequential and follow linear modeling techniques. In contrast, modern neural networks use nonlinear statistical data modeling techniques that respond in parallel to the pattern of inputs presented to them. As with biological neurons, connections are made and strengthened with repeated use (also known as Hebbian learning; Hebb, 1949). Modern examples of ANN applications include handwriting recognition, computer vision, and speech recognition (Haykin, 2004; Jain, Mao, & Mohiuddin, 1996). ANNs are also used in theoretical and computational neuroscience to create models of biological neural systems in order to study the mechanisms of neural processing and learning (Alonso & Mondragón, 2011). ANNs have also been tested as a statistical method for accomplishing practical tasks in mental health care, such as predicting lengths of psychiatric hospital stay (Lowell & Davis, 1994), determining the costs of psychiatric medication (Mirabzadeh et al., 2013), and predicting obsessive-compulsive disorder (OCD) treatment response (Salomoni et al., 2009).
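The Hebbian principle ("cells that fire together wire together") can be sketched in a few lines: a connection weight is strengthened in proportion to the joint activity of its input and output units. The learning rate and activity patterns below are hypothetical, chosen only to show how repeated use strengthens active connections:

```python
def hebbian_update(weights, inputs, output, learning_rate=0.1):
    """Strengthen each weight in proportion to pre- and post-synaptic activity."""
    return [w + learning_rate * x * output for w, x in zip(weights, inputs)]

weights = [0.0, 0.0, 0.0]
pattern = [1, 0, 1]          # units 0 and 2 fire together with the output
for _ in range(5):           # repeated use strengthens the active connections
    weights = hebbian_update(weights, pattern, output=1)
print(weights)               # weights on the co-active connections grow to ~0.5
```

Only the connections whose input unit was active alongside the output are strengthened; the silent middle unit's weight stays at zero, mirroring the use-dependent strengthening described above.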
ML algorithms and neural networks also provide useful methods for modern expert systems (see Chapter 2). Expert systems are a form of AI program that simulates the knowledge and analytical skills of human experts. Clinical decision support systems (CDSSs) are a subtype of expert system specifically designed to aid in the process of clinical decision-making (Finlay, 1994). Traditional CDSSs rely on preprogrammed facts and rules to provide decision options. However, incorporating modern ML and ANN methods allows CDSSs to provide recommendations without preprogrammed knowledge. Fuzzy modeling and fuzzy-genetic algorithms are specific ancillary techniques used to assist with the optimization of rules and membership classification (see Jagielska, Matthews, & Whitfort, 1999). These techniques are based on the concept of fuzzy logic (Zadeh, 1965), a method of reasoning that involves approximate values (e.g., some degree of “true”) rather than fixed and exact values (e.g., “true” or “false”). These methods provide a useful qualitative computational approach for working with uncertainties that can help mental healthcare professionals make better decisions and improve patient outcomes.
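The idea of degrees of truth can be illustrated with a minimal sketch. The membership function below is an invented example (a linear ramp over hypothetical systolic blood-pressure bounds), and fuzzy AND/OR are taken as min/max in the standard Zadeh formulation:

```python
def membership_high(systolic, low=120.0, high=140.0):
    """Degree (0..1) to which a systolic reading counts as 'high'."""
    if systolic <= low:
        return 0.0
    if systolic >= high:
        return 1.0
    return (systolic - low) / (high - low)   # linear ramp between the bounds

def fuzzy_and(a, b):
    """Zadeh fuzzy AND: the minimum of the two membership degrees."""
    return min(a, b)

def fuzzy_or(a, b):
    """Zadeh fuzzy OR: the maximum of the two membership degrees."""
    return max(a, b)

print(membership_high(130.0))                  # -> 0.5 (partially "high")
print(fuzzy_and(membership_high(130.0), 0.8))  # -> 0.5
```

A reading of 130 is neither simply “high” nor “not high”; it is “high” to degree 0.5, and rules can then combine such degrees with min/max rather than Boolean logic.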