The Problem of Language-Learning
Out of the Mouths of Babes
The fact that children learn language so effortlessly at a time when tying their shoes is a real hurdle makes language-learning appear miraculous. Children are faced with the seemingly difficult task of learning a complex symbolic system, one that varies from culture to culture in seemingly arbitrary ways. We have all had the experience of listening to a foreign language fluently spoken by a native speaker: it feels to us, although not to the speaker, that there are no breaks in the flow. Where are the words? Where do the sentences stop and start? This is the first task that faces young children: discovering the units of the language they are to learn.
In addition, children must learn how the units of their language are combined. When children produce utterances that they have never heard before, and that follow the patterns of the language they are witnessing, we know that they have learned something about the underlying regularities that make English English, Swahili Swahili, or American Sign Language American Sign Language. Children hear particular sentences, yet they acquire rules. Moreover, every child hears a different set of particular sentences, yet they all acquire the same rules and in approximately the same sequence. This is the wonder of language acquisition.
To begin to understand this miraculous process, we will take a brief tour through the steps children follow in acquiring language, beginning with their discovery of the sounds of language.
Discovering the Units of Sound
When we speak, we run words together without reliable pauses between them, which is what gives listeners who don't know the language the feeling that there are no breaks in the stream. How then do native language-users parse the language they hear into recognizable units? Adult listeners use their knowledge of regularities in the sound structure of the language to predict the boundaries of units like words. Since sound structure differs across languages, knowledge of regularities in one language may not be useful in identifying boundaries in another language. Infants thus need to learn the particular features of the sound structure of their native language in order to be able to find words in the stream of talk that is addressed to them. When do they accomplish this feat?
Much to everyone's surprise, infants know something about the language they are to learn on the day they are born. Newborn babies born to French-speaking mothers listened to tapes of French and Russian speech and sucked on a wired nipple while doing so. The babies sucked more (and, by inference, were more aroused) when they listened to the French tapes than when they listened to the Russian tapes. Babies who had not heard French during their prenatal months showed no such effect (Mehler, Jusczyk, Lambertz, Halsted, Bertoncini, & Amiel-Tison, 1988). What the babies appear to have learned about French during those months in the womb was its prosodic structure (its intonation contours, or "music"); when the speech samples were filtered so that only the prosodic cues remained, the findings were unchanged. Thus, babies are already attuned to the music of their mother's tongue on day 1.
However, babies do not become sensitive to the particular sounds of their native language until the second half of their first year. Babies start out ready to learn any language, an essential characteristic since, in principle, they could be exposed to any of the world's languages, present or future. Babies are able to make essentially all the discriminations in sound contrasts that languages across the globe require. But sometime during the latter part of the first year, the ability to discriminate between contrasts not found in the infant's native language fades. (Adults are only able to hear those contrasts used in their particular language and can no longer hear the rest.) For example, Hindi and Salish have consonant contrasts not found in English. Surprising those who believe that adults always know more than children, infants learning English are able to make these Hindi and Salish discriminations; adult English-speakers are not. However, the infants are only able to do so during their first year; by 12 months the ability fades and they begin to listen like adult English-speakers. Importantly, the ability to discriminate these contrasts does not fade if the infant is exposed to input that makes use of the contrasts: infants learning Hindi or Salish are still able to make the discriminations in their respective languages at 12 months (Werker & Tees, 1984). Babies start to fail to make discriminations among vowels (as opposed to consonants) that are not found in their language even earlier (perhaps as early as 6 months; Kuhl, Williams, Lacerda, Stevens, & Lindblom, 1992).
By 9 months, infants can recognize words in their language independent of prosodic cues. English and Dutch have very similar prosodic characteristics. They differ, however, in phonetic and phonotactic structure (that is, in which sounds are produced in the language and how those sounds combine). For example, the [r] in English words is very different from the [r] found in Dutch words (a phonetic difference). English allows [d] to occur at the end of a syllable while Dutch doesn't, and Dutch allows [kn] or [zw] to begin syllables while English doesn't (phonotactic differences). When presented with a spoken list of English and Dutch words, 9-month-old English-learners listen longer to the English than the Dutch words. In contrast, 6-month-old English-learners show no preferences (Jusczyk, 1993). By 9 months, babies have learned enough about the sound structure of their language to prefer their own language to others, even those that have the same "music."
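The two phonotactic differences mentioned above can be written down as a toy checker. This is only an illustrative sketch: the function names are hypothetical, the input is assumed to be a rough phonetic transcription rather than spelling (English "knee" is spelled with "kn" but does not begin with [kn]), and word edges stand in for syllable edges. It covers nothing beyond the two constraints named in the text and is not a model of how infants actually process speech.

```python
# Toy phonotactic check using only the two constraints named in the text.
# Inputs are assumed to be rough phonetic transcriptions, not spellings.

DUTCH_ONLY_ONSETS = ("kn", "zw")  # may begin a syllable in Dutch, not in English
ENGLISH_ONLY_CODA = "d"           # may end a syllable in English, not in Dutch


def fits_english(transcription: str) -> bool:
    """Return False if the word starts with an onset English does not allow."""
    return not transcription.startswith(DUTCH_ONLY_ONSETS)


def fits_dutch(transcription: str) -> bool:
    """Return False if the word ends in [d], which Dutch does not allow."""
    return not transcription.endswith(ENGLISH_ONLY_CODA)


for word in ("knop", "zwart", "bed"):
    print(word, "English-like:", fits_english(word), "Dutch-like:", fits_dutch(word))
```

A listener sensitive to these patterns could treat "knop" and "zwart" as unlikely English words and "bed" as an unlikely Dutch word without relying on prosody at all.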
While learning to listen to sounds, infants are also learning to produce them. Infants do not produce what we might recognize as words until they are approximately 1 year old. However, long before then they use their voices in changing ways. They begin by using their voices to cry reflexively and to make vegetative sounds; they then coo, laugh, and begin to play with sound. Sometime around 6 to 9 months, infants begin to babble (Oller & Lynch, 1992): they produce true syllables, often in a reduplicated series of the same consonant-vowel combination, for example, [dada] or [mamama]. Later still, infants begin to produce variegated babbling in which the range of consonants and vowels expands and sounds no longer need to be reduplicated (Stark, 1986). The child is now adding prosody to strings of babbles and, as a result, begins to sound like a native speaker (as long as you're not listening too closely).
Indeed, it is at this point (around 8 months) that native listeners can begin to identify an infant as one who is learning their language. For example, when they heard tapes of 8-month-old babbling, French speakers were able to tell the difference between French babies' babbling and Arabic or Chinese babies' babbling (deBoysson-Bardies, Sagart, & Durand, 1984). When the infants' babbles were closely examined by trained linguists, the French babies were found to display lengthenings and softer modulations than the Arabic and Chinese babies, who exhibited other characteristics that were found in the languages they had been hearing for 8 months (deBoysson-Bardies, 1999). Thus, by the end of the first year, children are beginning to speak, and listen, like native users of their language.
All natural languages, spoken or signed, are structured at many levels. Meaningless units (phonemes) combine to create morphemes, the smallest meaningful units of a language, which in turn combine to form words, phrases, and sentences. Having made significant progress in learning the sound system underlying their language in the first several months of life, children are then free to tackle larger units. Regardless of the language learned, children tend to enter the system of larger units at the level of the word, rather than the morpheme or sentence. Between 10 and 15 months, children produce their first words, typically using each word as an isolated unit. Children then proceed in two directions, learning (1) that the word can be composed of smaller, meaningful parts (morphology), and (2) that the word is a building block for larger, meaningful phrases and sentences (syntax).
What is a word? Consider a child who wants a jar opened and whines while attempting to do the deed herself. This child has conveyed her desires to those around her, but has she produced a word? A word does more than communicate information: it stands for something; it's a symbol. Moreover, the mapping between a word and what it stands for is arbitrary: "dog" is the term we use in English for furry four-legged canines, but the term is "chien" in French and "perro" in Spanish. There is nothing about the form of each of these three words that makes it a good label for a furry creature; the word works to refer to the creature only because speakers of each language act as though they agree that this is what it stands for.
At the earliest stages of development, children may use a sequence of sounds consistently for a particular meaning, but the sequence bears no resemblance to the sound of any word in their language. These "proto-words" (Bates, 1976) are transitional forms that are often tied to particular contexts. For example, a child uses the sound sequence "brmm-brmm" every time he plays with or sees his toy truck. In fact, a child's proto-word need not be verbal at all; gesture works quite well. For example, a child smacks her lips every time she feeds her fish, or flaps her arms when she sees a picture of a butterfly (Acredolo & Goodwyn, 1985, 1988; Iverson, Capirci, & Caselli, 1994). Indeed, some children rely heavily on gestural "words" to communicate with others at the early stages.
Sometime around 18 months, children's vocabularies reach 50 words (Nelson, 1973), and they continue to add an average of nine words a day throughout the preschool years (Carey, 1978). Children's most common words are names for people and pets ("mama," "Metro"), objects ("bottle"), and substances ("milk"). These nominal terms are among the earliest terms children learn, along with social words ("want," "no," "bye-bye"). Adjectives ("hot") and verbs ("go," "up") are part of a young child's repertoire, but tend to be rare relative to nouns (Gentner, 1982; Goldin-Meadow, Seligman, & Gelman, 1976), although children learning languages other than English may show the noun bias less than English-learners (e.g., Korean: Gopnik & Choi, 1995; Mandarin: Tardif, 1996).
It is, of course, not trivial for the child to figure out exactly what adults mean when they use a word like "dog" or "run" (let alone abstract terms like "liberty" and "conjecture"). "Dog" could refer to the furry creature in its entirety, its paws, its tail, its fur, and so forth. "Run" could refer to the trajectory of the motion, the manner in which it is carried out, its direction, and so on. An infinite number of hypotheses are logically possible for the meaning of a word in a given adult utterance (Quine, 1960). Yet children are able to zero in on adult meanings of the words they hear. How? It would certainly be easier for children to settle on the meaning of a word if they were constrained to consider only a subset of the possibilities as candidate word meanings: if, for example, they were biased to assume that labels refer to wholes instead of parts (the creature, not the tail) and to classes instead of particular items (all dogs, not one dog). Constraints of this sort have, in fact, been proposed as part of the equipment that children bring to the language-learning situation (Markman, 1991, 1994), although others are equally convinced that inherent constraints are not needed to explain how children learn the meanings of words (Bloom, 2000).
Constraints on what a word means can also come from the discourse context in which the word is used. An adult may, for example, label an object when the child is directing full attention to that object, effectively making a set of meanings particularly salient to the child. More constraining still, the linguistic frame in which a word appears narrows down the meanings that the word can have (Gleitman, 1990). For example, English sentence structure conveys who does what to whom and, in this way, provides clues as to the meaning of the verb in the sentence. A 2-year-old child who hears "the rabbit is gorping the duck" will look longer at a scene in which the rabbit is acting on the duck than at a scene in which both the rabbit and the duck are circling their arms. The child correctly assumes that "gorp" must refer to an action on an object, in this case, the duck (Naigles, 1990). This is an impressive inference, particularly since the same child when hearing "the rabbit and the duck are gorping" will look longer at the scene in which the rabbit and the duck are each circling their arms. Now the child correctly assumes that "gorp" refers to the arm-circling action. Thus, the clues that children exploit to determine the meaning of a word come not only from how that word is used in relation to the world of objects and actions, but also from how it is used in sentences: children use language itself to bootstrap their way into word meanings (Fisher, Hall, Rakowitz, & Gleitman, 1994; Landau & Gleitman, 1985).
Learning That Words Are Made of Parts
Words in English, and in all languages, can be broken down into parts. For example, the word "dogs" refers to more than one furry creature, but it does so quite systematically: "dog" stands for the animal, while "s" stands for many-ness. We know this because we know that words like "cats," "shoes," and "books" all refer to more than one cat, shoe, or book. We have extracted (albeit not consciously) what the words have in common (the "-s" ending in their forms and "plural" in their meanings) to form what is called a morpheme, a consistent pairing between a form and a meaning.
At the earliest stages, children learn morphologically complex words as unanalyzed wholes, "amalgams" (MacWhinney, 1978). For example, "shoes" may not, in the child's mind, be composed of the stem "shoe" plus the plural "-s," particularly if the child never produces the form "shoe" and uses "shoes" to refer to footwear in singles and in pairs. At some point, English-learning children discover that "shoes" is composed of parts ("shoe," "-s") and that each part has a meaning (footwear, plural). It is often not easy to tell when this analysis has taken place, particularly since it is not likely to have been done consciously.
One key piece of evidence, possible only when the pattern in the language the child is learning is not completely regular, comes from children's overregularizations, errors in which children make exceptions to the adult pattern (e.g., feet) conform to the regular pattern (e.g., foots). Children who produce the incorrect form "foots" must have extracted the plural morpheme "-s" from a variety of other regular forms in their system, and added it to the noun "foot." Similarly, children who produce "eated" must have extracted the past tense morpheme "-ed" from verbs like "walked" and "stopped" and added it to the verb "eat" (Marcus, 1995). Creative errors of this sort also indicate that children know the differences between nouns and verbs; children add the "-ed" ending to verbs like "eat" or "walk" but rarely to nouns like "foot" or "shoe." In addition to waiting for creative errors to occur, experimenters can give older children nonsense words and ask them to generate novel forms for different sentence frames, as Jean Berko (1958) did in her well-known "wug" test. Berko showed children a picture of two unknown creatures and said, "This is a wug. Now there are two of them. There are two _____." The child who knows about plural endings should supply the word "wugs."
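The logic behind overregularization and the "wug" test can be sketched in a few lines. The sketch is a deliberately simplified illustration, not a claim about how children represent words: the names `pluralize`, `regular_plural`, and `IRREGULAR_PLURALS` are hypothetical, and the rule works on spelling only, ignoring "-es" forms and pronunciation variants of the plural. A learner who has extracted the regular pattern but has not yet stored an exception will produce "foots," and the same rule extends to a word never heard before.

```python
# Toy model: a productive regular rule plus a store of memorized exceptions.

IRREGULAR_PLURALS = {"foot": "feet", "mouse": "mice"}  # forms that must be memorized


def regular_plural(noun: str) -> str:
    """Apply the regular pattern: stem + "-s" (spelling only, simplified)."""
    return noun + "s"


def pluralize(noun: str, known_exceptions: dict[str, str]) -> str:
    """Use a memorized exception if one is stored; otherwise apply the rule."""
    return known_exceptions.get(noun, regular_plural(noun))


# Rule extracted, exception not yet stored: an overregularization error.
print(pluralize("foot", {}))                 # foots
# Exception stored: the adult form.
print(pluralize("foot", IRREGULAR_PLURALS))  # feet
# Novel word: only the rule can supply a form.
print(pluralize("wug", IRREGULAR_PLURALS))   # wugs
```

The point of the sketch is that "foots" and "wugs" cannot come from memorized forms; both require a rule that has been separated from the particular words it was learned from.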
The "wug" test gives children the sentence frame ("There are two _____") and a referent (a picture with two items) and asks them to supply the form of the appropriate grammatical morpheme ("-s"). In a clever study, Brown (1957) turned the question around and gave children the form of the grammatical morpheme and asked them to supply the meaning. For example, Brown showed children a picture of hands acting on a confetti-like substance in a pail and told them that this was "sibbing." They were then asked to identify sibbing in a set of three pictures showing just the acting hands, just the substance, or just the pail. They correctly pointed to the acting hands, indicating that they knew the grammatical morpheme "-ing" attaches to words for actions, not objects. Importantly, when they were told that the original picture contained "some sib," they pointed to the substance picture, and when they heard it cont...