Chapter 1
On the On-Line Study of
Language Comprehension
MANUEL CARREIRAS AND
CHARLES CLIFTON, JR.
Language has been studied in many ways. It has been examined as an art, as a basis of philosophical investigations, and as a way of gaining insight into the human mind. During the past 50 years, practitioners of various scientific disciplines have developed objective ways of studying language. Linguists try to understand it through the construction of theories of abstract linguistic knowledge. Psycholinguists try to understand how language users engage this knowledge in comprehension and production. Neuroscientists have been digging into the biological substrate of language to study this exclusively human mental activity through identifying the time course of brain processes that are involved in language comprehension. This book provides a glimpse of how research done by linguists, psycholinguists, and neuroscientists can be brought to bear on a fundamentally psychological question: how language is comprehended in real time.
The topic of real-time comprehension of language has held center stage through most of the history of psycholinguistics. This is because psycholinguists have tried to develop theories of how language is comprehended that spell out the cognitive processes taking place when language is being understood. Such information-processing models are essentially claims about how mental representations are created, transformed, and stored, about what types of information are used in performing these operations, and about the architecture of the system that supports the proposed processes. A minimal (but difficult!) criterion for an adequate process model is whether or not it can create appropriate final outcomes; that is, can it create representations that adequately capture what readers and listeners understand sentences and texts to mean? Given that this has been met, even crudely, a second very challenging criterion can be addressed, which is posed by the following question: Is the model supported by "on-line" evidence about the temporal and logical flow of information, about the moment-by-moment processes that are claimed to take place between the presentation of auditory or written material and the achievement of understanding?
As a practical matter of psychological experimentation, these two criteria have traditionally been addressed by taking two types of measures: frequency of success on a task and performance speed. Reaction time has been one of the favorite dependent variables in cognitive psychology in general and psycholinguistics in particular. The use of reaction times to evaluate theories of cognitive processes can be traced to the 19th century when Donders (1868) invented the subtraction method to estimate the speed of internal cognitive processes. In more recent times, Sternberg's (1966, 1969) use of additive factors analysis and Posner's (1978) analysis of mental chronometry have stimulated a vast amount of theoretical and experimental work and have led to a notable increase in our understanding of the nature of cognition.
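The logic of the subtraction method can be illustrated with a minimal sketch. It assumes the classic three-task arrangement usually attributed to Donders (a simple reaction task, a go/no-go task, and a choice task) and that processing stages add up serially. The reaction times and all function and variable names below are hypothetical, chosen only for illustration; they are not data from any study.

```python
from statistics import mean


def subtraction_estimate(rt_with_stage, rt_without_stage):
    """Estimate a stage's duration (ms) as the difference in mean RT
    between a task that includes the stage and one that does not."""
    return mean(rt_with_stage) - mean(rt_without_stage)


# Hypothetical reaction times in milliseconds (illustrative only).
simple_rt = [210, 190, 200]    # detect a stimulus, always respond
go_no_go_rt = [260, 240, 250]  # adds stimulus discrimination
choice_rt = [310, 290, 300]    # adds response selection as well

# Assumes strictly serial, additive stages.
discrimination_ms = subtraction_estimate(go_no_go_rt, simple_rt)
selection_ms = subtraction_estimate(choice_rt, go_no_go_rt)

print("Estimated discrimination time (ms):", discrimination_ms)
print("Estimated response-selection time (ms):", selection_ms)
```

The estimate is only as good as the assumption that adding a stage leaves the other stages unchanged, which is one reason later approaches such as additive factors analysis were developed.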
Psycholinguists commonly apply the experimental techniques of mental chronometry. For instance, when studying reading, they try to draw conclusions about what representations are formed, when, and on what basis, as a way of evaluating their claims about the architecture and operation of the reading system. Evidence for their conclusions has traditionally come from the speed and accuracy of responses in laboratory tasks such as lexical decision, word naming, self-paced reading, question answering, and sentence verification. While the evidence that these tasks provide has greatly increased our understanding of the process of language comprehension, they have not always proved adequate to discriminate between competing theoretical proposals. To take just one example, some theorists adopt modular positions following Fodor (1983) (e.g., Frazier & Rayner, 1982; Frazier, 1987); other theorists advocate more interactive positions (Marslen-Wilson & Tyler, 1987; MacDonald, Pearlmutter, & Seidenberg, 1994; Tanenhaus & Trueswell, 1995). Those who push modularity argue that certain types of information must be used and certain types of representations must be built in order for other types of information and representations to come into play. Those who argue for interactive processing envision a far less differentiated representational vocabulary and a far less constrained interplay among different types of information. This modular vs. interactive contrast would appear to have clear implications about the logical and temporal sequence of distinct processes in sentence comprehension. Although traditional measures of comprehension accuracy and speed have provided informative tests of these implications (see Mitchell, chapter 2, this volume), they have not provided evidence that settles the crucial questions to everybody's satisfaction.
Part of the problem with traditional methods is their relatively coarse granularity. Knowing how long it takes to read a sentence, or even a word, does not tell the researcher how long any particular component process took. More diagnostic evidence comes from patterns of eye movements observed while reading text for comprehension or memory, such as what words are fixated, how long each fixation lasts, and how often the eyes regress to previous parts of the sentence. (See van Gompel et al., chapter 7, this volume, for an illustration of how the finer granularity of eyetracking measures can shed light on the underlying processes of comprehending anaphors.)
Another part of the problem with traditional measures (and with many uses of eyetracking) is their lack of specificity. An increase in comprehension time or a disruption in the eye-movement record is just an increase in time. It does not by itself carry any sort of signature about what processes gave rise to the disruption. Sometimes (e.g., Meseguer, Carreiras, & Clifton, 2002) the location or pattern of eye movements can carry hints about what processes are going on at some particular point in time: the eyes may well go to what the reader is thinking about at any moment. Similarly, the visual-world paradigm of measuring eye movements to visual scenes during listening (Tanenhaus, Spivey-Knowlton, Eberhard, & Sedivy, 1995) relies on the apparent fact that listeners tend to look at the referent of what they are hearing. Under this analysis, it can provide quite specific information about what a listener thinks a word or sentence refers to moment-by-moment (see Boland, chapter 4, this volume). Potentially, even more diagnostic information can come from measuring brain activity during reading or listening. Measures of event-related brain potentials (ERPs) can arguably provide information about what processes are taking place as well as when they occur (cf. van Berkum, chapter 13, this volume; Osterhout et al., chapter 14, this volume). Such measures, as well as measures of functional brain imaging, can provide information about what is happening in the sentence comprehension process, and (at least for ERP) when it is happening.
Measures of brain activity such as ERP and functional magnetic resonance imaging (fMRI), coming from the new interdisciplinary field of cognitive neuroscience, can in principle do even more. They can ground the cognitive processes of language comprehension, previously treated largely as abstract information processing operations, in the neural underpinnings of the brain (e.g., Posner & Raichle, 1994; Gazzaniga, 2000). Knowing what brain structures are involved in computing different kinds of linguistic information, and when, can place some constraints on what categories of cognitive processes are involved in different aspects of language comprehension. If two language comprehension phenomena involve different regions of the brain, they necessarily engage at least partly distinct cognitive processes; if two phenomena involve the same regions of the brain, they may involve overlapping processes. (See Fiebach et al., chapter 17, this volume, for an example of this reasoning.) Further, knowing what linguistic and nonlinguistic tasks involve which brain regions should place some constraints on the nature of the processes that engage these brain regions.
The methodological advances provided by various uses of eyetracking and measures of brain activity will not, by themselves, solve the problems of identifying the cognitive processes underlying language comprehension. They have to be combined with careful theoretical analyses of how the measures might be related to the presumed underlying processes (see Boland, chapter 4, this volume, and Tanenhaus, chapter 18, this volume, for discussions of linking hypotheses) and used in theoretically sophisticated experimentation. With the goal of highlighting current progress in using these new on-line methodologies and stimulating future progress in their use, the organizers of the Eighth Annual Meeting of Architectures and Mechanisms of Language Processing (AMLaP), which took place September 19 to 21, 2002, in Costa Adeje, Tenerife, Canary Islands, invited several distinguished researchers to present overviews of how eyetracking and cognitive neuroscience measures are advancing our knowledge of language comprehension. Each of these researchers was asked to consider the following questions in preparing their presentations:
- What is an on-line process?
- Why is on-line processing a very important theoretical issue?
- Are eyetracking and ERP good and useful on-line measures?
- Have these technologies helped theoretical knowledge advance?
- What theoretical debates have these technologies promoted?
- What theoretical questions have they helped us answer?
- What new theoretical questions can be asked with these technologies?
The resulting presentations appear as chapters in the present volume. They are complemented by written versions of several papers and posters that were presented at AMLaP. The editors of this book made a (sometimes difficult!) selection of papers and posters that illustrated informative uses of eyetracking and brain activity measures, and invited their authors to submit them to this book. The results, we trust, show that substantial progress is being made in understanding the on-line nature of language processing, and should stimulate even further progress.
We turn to a brief overview of the on-line measures emphasized in this book, and then return to an attempt to highlight the contributions made in the individual chapters. Readers familiar with the basics of eyetracking and measures of brain activity can skip the following sections without loss.
1.1 EYE MOVEMENTS
Eye-movement recording has become a very popular technique, or better, a family of techniques. There are two sister techniques under the label of eye movements. One has been applied to measure eye movements during reading (see Rayner, 1998). The other has been used to measure eye movements to regions of a scene while participants listen to speech related to what the scene is about (Tanenhaus et al., 1995). Even though in both cases eye movements are recorded, assumptions about what each technique is tapping are different. (See Boland, chapter 4, this volume, for a description of both techniques.)
During reading, the eyes do not sweep smoothly along a line of print, but advance through little jumps called saccades. A target word is brought to the fovea by a saccade, and the eyes then fixate on the word for something like a quarter of a second to identify it. About 90% of reading time is spent in fixations, including some regressions to an earlier misperceived word. The typical reader makes about three to four saccadic movements per second. Each movement lasts between 20 and 40 ms, and the eyes typically remain fixated for about 200 to 400 ms. Nearly 15% of the eye movements made by typical college students are regressive, meaning they go back to material previously fixated. The continuous recording of eye movements enables researchers to identify locations and durations of fixations during reading, allowing them to draw inferences about cognitive operations while reading.
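The two summary statistics quoted above (the share of reading time spent in fixations and the proportion of regressive saccades) can be computed from a fixation record along the following lines. This is a sketch under simplifying assumptions: all fixations fall on a single line of text, so any leftward movement counts as a regression, and positions are given in character units. The data and function names are hypothetical.

```python
def regression_rate(positions):
    """Fraction of saccades that move backward (to a smaller position).

    Assumes all fixations are on one line of left-to-right text, so
    return sweeps to the next line do not occur.
    """
    moves = list(zip(positions, positions[1:]))
    if not moves:
        return 0.0
    regressive = sum(1 for start, end in moves if end < start)
    return regressive / len(moves)


def fixation_share(fixation_ms, saccade_ms):
    """Proportion of total time spent fixating rather than moving."""
    fixating = sum(fixation_ms)
    return fixating / (fixating + sum(saccade_ms))


# Hypothetical record: character positions of seven successive fixations.
positions = [3, 10, 18, 12, 20, 27, 35]
fixation_ms = [250] * 7  # one duration per fixation
saccade_ms = [30] * 6    # one duration per saccade between fixations

print("Regression rate:", round(regression_rate(positions), 3))
print("Share of time in fixations:", round(fixation_share(fixation_ms, saccade_ms), 3))
```

With fixations of about a quarter of a second and saccades of a few tens of milliseconds, the fixation share comes out near the 90% figure cited in the text.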
A reader's fixation patterns vary greatly over a text, depending on the linguistic characteristics of the words. In developing their early model of text comprehension based on readers' eye movements, Just and Carpenter (1980) made two assumptions: the immediacy and the eye-mind assumptions. According to these assumptions, a word is the unit of processing, and processing occurs immediately and completely at the time the word is encountered. (See Pickering et al., chapter 3, this volume, for a discussion of these two assumptions.) Gaze duration, which is the summed duration of consecutive fixations on one word before the reader's eyes leave that word, is assumed to reflect processing time of that particular word.
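The definition of gaze duration given above can be made concrete with a short sketch. It assumes that each fixation has already been assigned to a word, and it represents the record as a list of (word index, duration in ms) pairs; this representation, the sample data, and the function names are hypothetical. The sketch follows only the definition in the text and ignores further exclusion criteria that particular analyses may apply.

```python
def gaze_duration(fixations, word_index):
    """Sum the durations of the first run of consecutive fixations
    on `word_index`, stopping as soon as the eyes leave that word."""
    total = 0
    on_word = False
    for word, duration in fixations:
        if word == word_index:
            total += duration
            on_word = True
        elif on_word:
            break  # the eyes have left the word; later returns are not counted
    return total


def total_time(fixations, word_index):
    """Sum of all fixations on the word, including later rereading."""
    return sum(d for word, d in fixations if word == word_index)


# Hypothetical record: (word index, fixation duration in ms).
fixations = [(0, 220), (1, 240), (1, 180), (2, 260), (1, 200), (3, 230)]

print("Gaze duration on word 1 (ms):", gaze_duration(fixations, 1))
print("Total time on word 1 (ms):", total_time(fixations, 1))
```

Comparing the two values for word 1 shows how a regression back to a word adds to total time without changing gaze duration, which is why the measures can dissociate.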
A substantial amount of research on eye movements in reading was conducted early in the 20th century (see Huey, 1908; Tinker, 1946, 1958). By midcentury, research in this field had nearly stopped. However, prompted by the development of new methodologies and the appearance of information-processing theories of cognition, the study of eye movements in reading reappeared with vigor in the last third of the 20th century (see Rayner, 1998, for a review). Eye-movement measures have since been used successfully to understand the functioning of several components of language processing, such as phonological and orthographic processing (Lee, Binder, Kim, Pollatsek, & Rayner, 1999; Rayner, Pollatsek, & Binder, 1998), the effects of neighborhood (Perea & Pollatsek, 1998), the processing of syllables (Ashby & Rayner, in press; Carreiras & Perea, 2004), lexical ambiguity (Duffy, Morris, & Rayner, 1988), morphological processing (Pollatsek, Hyönä, & Bertram, 2000), syntactic processing (Carreiras & Clifton, 1999; Frazier & Rayner, 1982; Ferreira & Clifton, 1986; Trueswell, Tanenhaus, & Garnsey, 1994), plausibility (Pickering & Traxler, 1998), discourse context effects (Altmann, Garnham, & Dennis, 1992), and inference processing (O'Brien, Shank, Myers, & Rayner, 1988).
Recording eye movements during reading to answer theoretical questions about language processing and language architecture has helped us to better understand the cognitive processes involved in on-line reading. On the one hand, results obtained with other laboratory techniques have generally been replicated with the eye-movement technique (see Mitchell, chapter 2, this volume). Such converging evidence enhances our confidence in the phenomena. On the other hand, due to its impressive temporal resolution and its ability to fractionate reading time into distinct components (long initial fixations, refixations on a word, regressions to earlier words, rereading a word after a regression, etc.), the eye-movement technique provides potentially useful detailed information about what cognitive processes might be occurring at any moment in time. (See Boland, chapter 4, this volume, and van Gompel et al., chapter 7, this volume, for illustrations.) The full value of eye-movement measures, however, will surely be realized only when we have a better understanding of how eye movements are controlled by various sorts of cognitive processes. Powerful and informative models of eye-movement control do exist (e.g., Reichle,...