John Dinsmore
Washington University
This is the unfolding story of symbolicism and connectionism, the argumentative paradigms that inhabit the modern study of cognition, and of the gap that stands between them. The purpose of this initial chapter is to provide a general overview of the debate between the paradigms, and to show where each of the remaining chapters slices through this debate. It also provides the introductory background, both technical and philosophical, necessary for the comprehension of the remaining chapters. I begin by looking at symbolicism and connectionism independently, what each is and what each has accomplished or failed to accomplish. Then I look at ways in which the gap between them might be handled.
1. THE SYMBOLIC PARADIGM
Symbolic theories are representation-oriented in the sense that each begins by positing basic syntactic structures assumed to possess a transparent compositional semantics. As we will see shortly, connectionist models, in contrast, are process-oriented. The greatest problems for symbolic theories begin in looking for the kinds of processes needed to produce the properties observed in human cognition.
In this section I try to characterize symbolicism a little more accurately, and give a general assessment of the many strengths but especially of the apparent limitations found in current work within the symbolic paradigm.
1.1. Symbolic Mechanisms
The central principles that characterize traditional work in the symbolic paradigm can be summarized as follows:
• there are such things as symbols, which can be combined into larger symbolic structures (or expressions),
• these symbolic structures have a combinatorial semantics whereby what a symbolic structure represents is a function of what the parts represent, and
• at the same time all cognitive processes (reasoning) are manipulations of these symbolic structures.
This position is represented by Fodor and Pylyshyn (1988), one of the most cited references in this book; by Pylyshyn (1984); and by Newell and Simon (1976; Newell, 1982). The latter incorporated this position as the central idea of what they called the physical symbol system hypothesis.
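The second principle, combinatorial semantics, can be illustrated with a minimal sketch. The toy arithmetic domain and the names Symbol, Expr, DENOTATION, and meaning are assumptions introduced here for illustration; they do not come from any of the works cited. The only point is that the meaning of a structure is computed from the meanings of its parts.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Symbol:
    """An atomic symbol, identified only by its name."""
    name: str


@dataclass(frozen=True)
class Expr:
    """A symbolic structure: a head symbol applied to argument terms."""
    head: Symbol
    args: Tuple["Term", ...]


Term = Union[Symbol, Expr]

# Assumed toy interpretation of the atomic symbols (hypothetical).
DENOTATION = {
    "two": 2,
    "three": 3,
    "plus": lambda a, b: a + b,
    "times": lambda a, b: a * b,
}


def meaning(term):
    """What a structure represents is a function of what its parts represent."""
    if isinstance(term, Symbol):
        return DENOTATION[term.name]
    function = meaning(term.head)
    return function(*(meaning(arg) for arg in term.args))


# plus(two, times(two, three))
expr = Expr(
    Symbol("plus"),
    (Symbol("two"), Expr(Symbol("times"), (Symbol("two"), Symbol("three")))),
)
print(meaning(expr))  # 8
```

Because meaning recurses over parts, any new well-formed combination of the same symbols is interpretable without further stipulation.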
Other properties have often been attributed to symbolic systems, but either do not adhere very firmly on closer inspection, or do not turn out to be of critical importance in the debate between the paradigms. Most of these involve the thesis that the symbolic paradigm is computationalist, that is, that it takes the computer metaphor very seriously in talking about human cognition. Newell and Simon in fact borrowed specific assumptions related to modern digital computers, such as universality (the ability to simulate any other system), discreteness, and the ability to execute programs. But as Derthick (1990) pointed out, these assumptions seem to go by the wayside in Fodor and Pylyshyn’s well-known arguments against the viability of a connectionist cognitive science. Another feature sometimes attributed to the symbolic paradigm is the assumption of seriality, but there are simply too many exceptions in modern artificial intelligence, such as marker-passing systems discussed later, to take this attribute seriously, and the advent of massive parallelism in computer architecture no longer supports seriality as characteristic of computers in any case. Nevertheless, I return later to the issue of whether the abstract concept of computation offers a principled distinction between the paradigms.
1.2. Examples of Symbolic Models
The principles listed earlier are central in the sense that symbolic theories tend to conform to them, but, as will become evident, systems often deviate from them in restricted ways. I look at a couple of kinds of symbolic models that come up most often in discussing contrasts between the paradigms.
1.2.1. Logic and Rule-based Systems. Perhaps the prototype of symbolic models is the logic-based system, especially common in artificial intelligence, whereby a proposition is expressed discretely by a predicate (a symbol for a relation) and a linear sequence of arguments (each a symbol for an object). Inference procedures define manipulations of those structures, generally in a manner that preserves meanings or truth conditions. For example, such a procedure would allow the presence of the following expressions,
bird(Tweety)
bird(X) → feathered(X)
to result in the derivation of the following expression.
feathered(Tweety)
Alongside logic-based systems are rule-based or production systems, common in psychology and linguistics as well as artificial intelligence. These systems interpret rules (each of which has a condition that potentially matches some symbolic structure and an action that specifies some symbolic manipulations) by repeatedly selecting a rule (generally one of many) whose condition is satisfied and then executing the action. For instance, the following rule might cause the inference discussed earlier to be derived.
condition: bird(X)
action: add feathered(X)
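The rule above can be run in a minimal production-system sketch. The tuple encoding of facts, the "?x" convention for variables, and the function names match, substitute, and run are assumptions made for this illustration. The loop also simplifies the select-and-execute cycle described earlier: it fires every applicable rule until nothing new is added, rather than selecting one rule among many.

```python
def match(pattern, fact):
    """Return variable bindings if pattern matches fact, else None.

    Terms beginning with '?' are treated as variables (an assumed convention).
    """
    if len(pattern) != len(fact):
        return None
    bindings = {}
    for p, f in zip(pattern, fact):
        if p.startswith("?"):
            # A variable must be bound consistently throughout the pattern.
            if bindings.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return bindings


def substitute(pattern, bindings):
    """Replace variables in pattern by their bound values."""
    return tuple(bindings.get(p, p) for p in pattern)


def run(rules, facts):
    """Fire rules until no rule adds a new fact (no conflict resolution)."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for condition, addition in rules:
            for fact in list(facts):
                bindings = match(condition, fact)
                if bindings is None:
                    continue
                new_fact = substitute(addition, bindings)
                if new_fact not in facts:
                    facts.add(new_fact)
                    changed = True
    return facts


# condition: bird(X)    action: add feathered(X)
rules = [(("bird", "?x"), ("feathered", "?x"))]
facts = run(rules, {("bird", "Tweety")})
print(sorted(facts))  # [('bird', 'Tweety'), ('feathered', 'Tweety')]
```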
1.2.2. Associative Networks. Also known as semantic networks, these were introduced in psychology (Quillian, 1968) to model associative memory. We observe associative memory when the sound of a voice causes us to think about the person belonging to the voice. An associative network represents concepts in an encyclopedic way, as a set of nodes interconnected with links labeled for the relations holding between concepts. The original semantic network models attributed to particular symbolic nodes activations that increased when a node was processed and then automatically spread to connected nodes. For instance, consider the kind of semantic priming that occurs in a sentence like "The astronomer married a star," whereby the reading of "star" as a celestial object, strongly suggested by "astronomer," confuses the interpretation. This can be modeled in an associative network by activation spreading from the node for "astronomer."
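Spreading activation of this kind can be sketched in a few lines. The network below, the decay factor, and the number of steps are all hypothetical choices made for illustration, not parameters taken from Quillian's model.

```python
from collections import defaultdict

# Hypothetical fragment of an associative network (links left unlabeled here).
LINKS = {
    "astronomer": ["telescope", "star/celestial"],
    "telescope": ["astronomer", "star/celestial"],
    "star/celestial": ["astronomer", "telescope"],
    "star/celebrity": ["movie"],
    "movie": ["star/celebrity"],
}


def spread(source, decay=0.5, steps=2):
    """Spread activation outward from source, attenuating at every link."""
    activation = defaultdict(float)
    activation[source] = 1.0
    frontier = {source: 1.0}
    for _ in range(steps):
        next_frontier = defaultdict(float)
        for node, amount in frontier.items():
            for neighbor in LINKS.get(node, []):
                next_frontier[neighbor] += amount * decay
        for node, amount in next_frontier.items():
            activation[node] += amount
        frontier = next_frontier
    return dict(activation)


act = spread("astronomer")
# The celestial sense of "star" is primed; the celebrity sense is not reached.
print(act["star/celestial"] > act.get("star/celebrity", 0.0))  # True
```

In this toy network the node for the celebrity sense receives no activation at all, which is one way to picture why the celestial reading intrudes on the intended interpretation.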
In many associative network models the passing of discrete symbolic markers in a much more controlled manner replaces spreading activation (see Lange, chap. 10 in this volume). Marker passing, however, is capable of achieving many specific inferences through graph-traversal instead of by rule application. The links serve as an indexing mechanism to facilitate matching one structure against another and finding intersections of sets or concepts.
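As a rough sketch of how marker passing finds intersections by graph traversal, the following code passes a marker upward from each of two nodes and reports the nodes reached by both. The triples, the single "isa" relation, and the function names are assumptions for illustration only; the systems discussed in this volume are considerably richer.

```python
from collections import deque

# Hypothetical labeled links: (source, relation, target).
TRIPLES = [
    ("Tweety", "isa", "canary"),
    ("canary", "isa", "bird"),
    ("Opus", "isa", "penguin"),
    ("penguin", "isa", "bird"),
    ("bird", "isa", "animal"),
]


def neighbors(node, relation):
    """Nodes reachable from node over one link with the given label."""
    return [t for s, r, t in TRIPLES if s == node and r == relation]


def pass_marker(origin, relation="isa", max_depth=5):
    """Breadth-first marker propagation; returns {node: distance from origin}."""
    marked = {origin: 0}
    queue = deque([origin])
    while queue:
        node = queue.popleft()
        if marked[node] >= max_depth:
            continue
        for nxt in neighbors(node, relation):
            if nxt not in marked:
                marked[nxt] = marked[node] + 1
                queue.append(nxt)
    return marked


def intersections(a, b):
    """Nodes holding markers from both origins, closest intersection first."""
    marks_a, marks_b = pass_marker(a), pass_marker(b)
    common = set(marks_a) & set(marks_b)
    return sorted(common, key=lambda n: (marks_a[n] + marks_b[n], n))


print(intersections("Tweety", "Opus"))  # ['bird', 'animal']
```

No rule is applied here: the inference that Tweety and Opus are both birds falls out of following links.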
Associative networks are of interest in the symbolicism/connectionism debate because they are perhaps the most connectionist of the symbolic models. In fact, although work in associative networks is generally assumed to belong firmly in the symbolic tradition, we undoubtedly find some stretching of the central premises of the symbolic paradigm in this framework: In associative networks manipulations are sensitive to current activation levels (or the presence of certain markers), yet these do not contribute to semantics. Thus, processing in associative networks is not defined in terms of strictly symbolic structures. Lange (chap. 10 in this volume) and some others like to classify these as connectionist in a very broad sense to underscore these similarities, while recognizing that they remain fundamentally symbolic.
1.3. Representations in Symbolic Systems
Philosophers, linguists, psychologists, researchers in artificial intelligence, and many others have been comfortable with the languagelike expressive abilities provided in the symbolic paradigm, the clarity with which the symbolic paradigm treats reasoning in science or other conscious domains and its natural affinity for dealing with concepts at a level subject to conscious introspection. Almost all of the practical successes in artificial intelligence in implementing higher-level abilities in such domains as expert systems, language understanding, machine translation, goal-oriented planning, and mathematical reasoning, have relied on symbolicism.
The symbolic paradigm is in effect designed around its support for representation, and it is therefore hardly surprising that this is where its strengths lie. Fodor and Pylyshyn (1988) saw this strength in systematicity and productivity of cognitive representations, which arise from the way the compositionality of symbolic expressions allows for the decomposition and recomposition of representations. Barnden (chap. 7 in this volume) contrasts these representational advantages with those of connectionist representations.
1.4. Learning
What has been easy for symbolic systems is acquiring information already encoded in some propositional language, that is, learning by being told. However, symbolic models are very poor at adapting or organizing themselves dynamically on the basis of experience. Although a lot of attention has been given within artificial intelligence to symbolic learning algorithms (Michalski, Carbonell, & Mitchell, 1983, 1986), each of these generally works in some specialized domain and does not reorganize the system in any profound way. The lack of generality in learning means that computational symbolic models are invariably hand-coded with a particular algorithm in mind. This contrasts very markedly with connectionist systems for which very general and powerful learning techniques exist, which make it unnecessary to explicitly program the networks for specific behaviors.
1.5. Mysterious Processes
There appear to be many cognitive processes that do involve symbolic manipulations, in the sense of mapping one symbolic structure, A, into another, B, but nevertheless resist symbolic analysis, in the sense of an explicit specification of what the relation of A to B must be strictly in boolean terms of the symbolic structures involved. Dinsmore (1991) called symbolic processes with this property mysterious. Let’s look at a few processes that, given experience with symbolic models to date, appear to be mysterious.
1.5.1. Holistic Processes. As an example, face recognition might be seen abstractly as a mapping that takes a set of symbolic structures representing facial features, and returns a symbol representing some individual. Now, humans can recognize the same face in a variety of contexts, from different angles, under different illuminations, with parts obscured by shadows or other objects, and so on. No known boolean combination of features determines this mapping, yet a given face is recognized fairly reliably. Dreyfus and Dreyfus (1986) described such processes as holistic, resisting the explicit decomposition of objects into component features.
1.5.2. Noise and Unexpected Input. Symbolic models are notoriously brittle when presented with unusual, unplanned, faulty, or noisy input. Even where the symbolic mapping is analyzable under ideal noise-free conditions, noise introduces a degree of complexity into the mapping that humans, but not symbolic models, are typically capable of overcoming.
Kwasny and Faisal (chap. 9 in this volume) discuss some examples in the context of parsing natural language. Symbolic parsers generally cannot successfully parse ungrammatical sentences like "John did hitting Jack," even though they are close to grammatical sentences and can be processed by humans. The symbolic parsers that can handle such sentences do so by explicitly anticipating such anomalies, in effect making the grammar accepted by the parser more general to accommodate them. The problem is that there are always new instances that will not be anticipated, so that the process involved must either become enormously complex or remain inadequate. Kwasny and Faisal in fact handle such cases by building in a connectionist component to account for the mysterious mapping.
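The brittleness at issue can be seen in even a deliberately tiny recognizer. The lexicon, the flat tag patterns, and the function name parses below are hypothetical simplifications and bear no relation to the parser of Kwasny and Faisal; the sketch shows only how an unanticipated form is rejected outright until the grammar is widened by hand.

```python
# Hypothetical lexicon mapping words to part-of-speech tags.
LEXICON = {
    "John": "NP",
    "Jack": "NP",
    "hit": "V",
    "did": "AUX",
    "hitting": "VING",
}

# The only sentence shapes this toy grammar accepts.
PATTERNS = [
    ("NP", "V", "NP"),
    ("NP", "AUX", "V", "NP"),
]


def parses(sentence, patterns=PATTERNS):
    """Accept a sentence only if its tag sequence exactly matches a pattern."""
    tags = tuple(LEXICON.get(word) for word in sentence.split())
    return tags in patterns


print(parses("John did hit Jack"))      # True
print(parses("John did hitting Jack"))  # False: close to grammatical, still rejected

# Handling the anomaly means anticipating it explicitly in the grammar.
extended = PATTERNS + [("NP", "AUX", "VING", "NP")]
print(parses("John did hitting Jack", extended))  # True
```

Each further anomaly requires another hand-written pattern, which is the growth in complexity described above.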
1.5.3. Associative Memory. Associations between concepts have proven particularly resistant to symbolic analysis. Such associations should allow the retrieval of a complete memory given a partial description of the content of the memory, such as "high-ranking government official who is not too bright." Examples from natural language processing are semantic priming (discussed earlier), metonymy (the reference to one object or concept by means of an expression that stands for an associated object or concept), and the resolution of anaphoric, for example, pronominal, reference.
1.5.4. The Problem of Mysterious Processes for Symbolicism. In summary, mysterious processes can be seen to map symbolic representations to others as required in the symbolic paradigm, but the symbolic vocabulary often does not seem to allow us to fully analyze the mapping, not to the degree that we can predict which representations will get mapped onto which, or such that we could develop a purely symbolic artificially intelligent program that achieves behavior that depends on mysterious processes. This suggests that a complete theory of cognition is unlikely to be strictly symbolic. Work in connectionism, to which we now turn, suggests that dropping down to a lower, subsymbolic, level allows mysterious processes to be analyzed more successfully in terms of a nonsymbolic vocabulary.
2. THE CONNECTIONIST PARADIGM
The connectionist paradigm is, in marked contrast to the symbolic paradigm, process-oriented. That is, it starts with the assumption that the basic objects posited in a theory of cognition will be processed in a simple and usually uniform way. Unfortunately, this orientation then leaves connectionist models with the rather daunting task of discovering how representations emerge from the simple processing components.
This section provides a brief description of connectionism and lists some of its accomplishments and problems. Keep in mind that not as much work has been done to date in connectionist theories as in symbolic theories. It is hardly surprising, therefore, that (a) connectionism has logged fewer substantial achievements than symbolicism, (b) fewer clear weaknesses in the connectionist paradigm have been demonstrated, and (c) researchers in connectionism tend to have very high expectations of future success.
2.1. Connectionism in a Nutshell
A connectionist system is a network of very simple processing units that stores knowledge in the weights of the connections between units and realizes computation in the dynamic global behavior that results from the local interactions among units. The following few pages give a basic introduction to the fundamentals of connectionism, and in particular introduce some of the terminology used in the other chapters of this book. The uninitiated will have to hold onto their hats, but in principle should come out with enough background to appreciate the rest of the book.
2.1.1. Units and Connections. A connectionist model consists of a set of units linked together in a network by a set of connections. Units and connections are neurologically inspired by neurons and synapses respecti...