This Is the Voice
eBook - ePub

John Colapinto

320 pages · English · ePUB (mobile friendly)

About This Book

A New York Times bestselling writer explores what our unique sonic signature reveals about our species, our culture, and each one of us. Finally, a vital topic that has never had its own book gets its due. There's no shortage of books about public speaking, language, or song. But until now, there has been no book about the miracle that underlies them all: the human voice itself. And there are few writers who could take on this surprisingly vast topic with more artistry and expertise than John Colapinto. Beginning with the novel and compelling argument that our ability to speak is what made us the planet's dominant species, he guides us from the voice's beginnings in lungfish millions of years ago to its culmination in the talent of Pavarotti, Martin Luther King Jr., and Beyoncé, and each of us, every day.

Along the way, he shows us why the voice is the most efficient and effective means of communication ever devised: it works in all directions, in all weathers, even in the dark, and it can be calibrated to reach one other person or thousands. He reveals why speech is the single most complex and intricate activity humans can perform. He travels up the Amazon to meet the Pirahã, a reclusive tribe whose singular language, more musical than any other, can help us hear how melodic principles underpin every word we utter. He heads to Harvard to see how professional voices are helped and healed, and he ventures out on the campaign trail to see how demagogues wield their voices as weapons.

As far-reaching as this book is, much of the delight of reading it lies in how intimate it feels. Everything Colapinto tells us can be tested by our own lungs, mouths, ears, and brains. He shows us that, for those who pay attention, the voice is an eloquent means of communicating not only what the speaker means, but also their mood, sexual preference, age, income, even psychological and physical illness. It overstates the case only slightly to say that anyone who talks, or sings, or listens will find a rich trove of thrills in This Is the Voice.




The first experiments in fetal hearing were conducted in the early 1920s. German researchers placed a hand against a pregnant woman’s belly and blasted a car horn close by. The fetus’s startle movements established that, by around twenty-eight weeks’ gestation, the fetus can detect sounds.1 Since then, new technologies, including small waterproof microphones implanted in the womb, have dramatically increased our knowledge of the rich auditory environment2 where the fetus receives its first lessons in how the human voice transmits language, feelings, mood, and personality.
The mother’s voice is especially critical to this learning—a voice heard not only through airborne sound waves that penetrate the womb, but through bone conduction along her skeleton, so that her voice is felt as vibrations against the body. As the fetus’s primary sensory input, the mother’s voice makes a strong and indelible “first impression.” Monitors that measure fetal heart rate show that, by the third trimester, the fetus not only distinguishes its mother’s voice from all other sounds, but is emotionally affected by it: her rousing tones kick up the fetal pulse; her soothing tones slow it.3 Some researchers have proposed that the mother’s voice thus attunes the developing nervous system in ways that predispose a person, in later life, toward anxiety or anger, calm or contentment.4 Such prenatal “psychological” conditioning is unproven, but it is probably not a bad idea for expectant mothers to be conscious, in the final two months of pregnancy, that someone is eavesdropping on everything they say, and that what the listener hears might have lasting impact. The novelist Ian McEwan used this conceit in his 2016 novel, Nutshell, which retells Shakespeare’s Hamlet from the point of view of a thirty-eight-week-old narrator-fetus who overhears a plot (through “pillow talk of deadly intent”) between his adulterous mother and uncle.
As carefully researched as that novel is regarding the surprisingly acute audio-perceptual abilities of late-stage fetuses, McEwan takes considerable poetic license. For even if a fetus could understand language, the ability to hear speech in the womb is sharply limited. The uterine wall muffles voices, even the mother’s, into an indistinct rumble that permits only the rises and falls of emotional prosody to penetrate—in the same way that you can tell through the wall you share with your neighbor that the people talking on the other side are happy, sad, or angry, but you can’t hear what they’re actually saying. Nevertheless, after two months of intense focus on the mother’s vocal signal in the womb, a newborn emerges into the world clearly recognizing the mother’s voice and showing a marked preference for it.5 We know this thanks to an ingenious experiment invented in the early 1970s for exploring the newborn mind. Investigators placed a pressure-sensitive switch inside a feeding nipple hooked to a tape recorder. When the baby sucked, prerecorded sounds were broadcast from a speaker. Sounds that interested the infant prompted harder and longer sucking to keep the sound going and to raise its volume. Psychologist Anthony DeCasper used the device to show that three-day-olds will work harder, through sucking, to hear a recording of their own mother’s voice over that of any other female.6 The father’s voice sparked no special interest in the newborn7—which, on acoustical grounds, isn’t surprising. The male’s lower pitch penetrates the uterine wall less effectively and his voice is also not borne along the maternal skeleton. Newborns thus lack the two months of enwombed exposure to dad’s speech that creates such a special familiarity with, and “umbilical” connection to, mom’s voice.

The sucking test has revealed another intriguing facet of the newborn’s intense focus on adult voices. In 1971, Brown University psychologist Peter Eimas (who invented the test) showed that we are born with the ability to hear the tiny acoustic differences between highly similar speech sounds, like the p and b at the beginning of the words “pass” and “bass.” Both are made by an identical lip pop gesture. They sound different only because, with b, we make the lip pop while vibrating our vocal cords—an amazingly well-coordinated act of split-second synchronization between lips and larynx that results in a “voiced” consonant. With the p, we pop the lips while holding the vocal cords in the open position, making it “unvoiced.” We can do this with every consonant: t, voiced, becomes d; k becomes hard g; f becomes v; ch becomes j. Babies, Eimas showed, hear these distinctions at birth, sucking hard with excitement and interest when a speech sound with which they’ve become bored (ga ga ga) switches to a fascinating new one (ka ka ka).8 Prior to Eimas’s pioneering studies, it was believed that newborns only gradually learn these subtle phonemic differences.
The significance of this for the larger question of how we learn to talk emerged when Eimas tested whether infants could discriminate between speech sounds from languages they had never heard—in the womb or anywhere else. For English babies this included Kikuyu (an African language), Chinese, Japanese, French, and Spanish, all of which feature minuscule differences in shared speech sounds, according to the precise position of the tongue or lips, or the pitch of the voice. The experiments revealed that newborns can do something that adults cannot: detect the most subtle differences in sounds. Newborns, in short, emerge from the womb ready and willing to hear, and thus learn, any language—all seven thousand of them. This stands to reason, because a baby doesn’t know if it is going to be born into a small French town, a hamlet in Sweden, a tribe in the Amazon, or New York City, and must be ready for any eventuality.9 For this reason, neuroscientist Patricia Kuhl, a leading infant language researcher, calls babies “linguistic global citizens”10 at birth.
But after a few months, babies lose the ability to hear speech sounds not relevant to their native tongue—which has huge implications for how infants sound when they start speaking. Japanese speakers provide a good example: when speaking English, adults routinely swap the r and l sounds, saying “rake” for “lake,” and vice versa. They do this because they cannot hear the difference between English r and l. But Japanese newborns can, as Eimas’s sucking test shows. Change the ra sound to la, and Japanese babies register the difference with frantic sucking. But around seven months of age, they start having trouble telling the difference. At ten months old, they don’t react at all when ra changes to la. They can’t tell the difference anymore. English babies, meanwhile, actually get better at it.
The reason is exposure and reinforcement. The ten-month-old English baby has spent almost a year hearing the English-speaking adults around her say words that are distinguished by clearly different r and l sounds. Not the Japanese baby, who spent the ten months after birth hearing a Japanese r that sounds almost identical to our English l, the tongue lightly pushing against the gum ridge behind the upper front teeth. Because there is no clear acoustic difference between the Japanese r and the English l, Japanese babies stop hearing a difference. They don’t need to, because their language doesn’t depend on it.
All of which is to say that the developing brain works on a “use it or lose it” basis. Circuitry not activated by environmental stimuli (mom’s and dad’s voices) is pruned away. The opposite happens for brain circuits that are repeatedly stimulated by the human voice. They grow stronger, more efficient. This is the result of an actual physical process: the stimulated circuits grow a layer of fatty cells, called myelin, along their axons, the spidery branches that extend from the cell’s nucleus to communicate with other cells. Like the insulation on a copper wire, this myelin sheath speeds the electrical impulses that flash along the nerve branches that connect the neurons which represent specific speech sounds. Neuroscientists have a saying: “Neurons that fire together, wire together”—which is why the English babies in Eimas’s experiments got better at hearing the difference between ra and la: the neuron assemblies for those sounds fired a whole lot and wired themselves together. Not so for Japanese babies.
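The “fire together, wire together” rule can be sketched as a toy Hebbian-learning loop. Everything below (the unit indices, the learning rate, the exposure counts) is invented for illustration; this is not a model from the book, just a minimal sketch of how repeatedly co-activated connections strengthen while unused ones stay weak enough to be pruned.

```python
# Toy Hebbian learning: connections between co-active "neurons" strengthen,
# while connections that never fire together stay weak (and would be pruned).
# Purely illustrative -- not a biological simulation.

def hebbian_train(patterns, n_units, rate=0.1):
    # weights[i][j] = connection strength between unit i and unit j
    weights = [[0.0] * n_units for _ in range(n_units)]
    for active in patterns:  # each pattern: a set of co-active unit indices
        for i in active:
            for j in active:
                if i != j:
                    weights[i][j] += rate  # fire together -> wire together
    return weights

# Units 0 and 1 stand in for a contrast an English baby hears constantly
# (like r vs. l); units 2 and 3 for a contrast never exercised.
exposure = [{0, 1}] * 50          # heavily reinforced pairing
w = hebbian_train(exposure, n_units=4)

print(w[0][1])  # strong connection (≈ 5.0)
print(w[2][3])  # never co-activated: stays at 0.0
```

The point of the sketch is only the asymmetry: exercised pathways accumulate strength with every repetition, unexercised ones never leave zero.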
In short, the voices we hear around us in infancy physically sculpt our brain, pruning away unneeded circuits, strengthening the necessary ones, specializing the brain for perceiving (and eventually producing) the specific sounds of our native tongue.

Some infants fail to “wire in” the circuits necessary for discriminating highly similar sounds. Take the syllables ba, da, and ga, which are distinguished by where, in the mouth, the initial sound is produced (b with a pop of the lips; d with a tongue tap at the gum ridge; g with the back of the tongue hitting the soft palate, also called the velum). These articulatory targets determine how the resulting burst of noise transitions into the orderly, musical overtones of the a-vowel that follows: a sweep of rapidly changing frequencies, over tens of milliseconds, that the normal baby brain, with repetition, wires in through myelinating the correct nerve pathways.
But some 20 percent or so of babies, for unknown reasons, fail to develop the circuits for detecting those fast frequency sweeps. Sometimes a child registers ba, sometimes ga or da. Parents are unaware of the problem because kids compensate by using contextual clues. They know that mom is saying “bat” and not “pat” because she’s holding a bat in her hand. They know dad is talking about a “car” because he’s pointing at one. The problem surfaces only when the child starts school and tries to learn to read: that is, to translate written letter-symbols into the speech sounds they represent. He can’t do it, because his brain hasn’t wired in the sounds clearly. He might read the word “dad” as “bad,” or “gab,” or “dab.” These children are diagnosed with dyslexia, a reading disorder long believed to be a vision problem (it was once called “word blindness”). Thanks to pioneering research in the early 1990s by neuroscientist Paula Tallal at Rutgers University, dyslexia is now understood to be a problem of hearing, of processing human voice sounds.11 Tallal has been helping to devise software that slows the frequency sweeps in those consonant-vowel transitions so that very young children can train their auditory circuits to detect the different speech sounds, and thus wire them in through myelination of the nerve pathways. All to improve their reading.
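The idea behind such training software—stretch the consonant-vowel transition in time while keeping its start and end frequencies intact, so the sweep rate drops—can be illustrated with a toy sweep generator. This is a hypothetical sketch, not Tallal's actual software; the sample rate, frequencies, and durations are invented for illustration.

```python
import math

def formant_transition(f_start, f_end, duration, sr=16000):
    """Synthesize a linear frequency sweep (a crude stand-in for the
    fast formant transition at a consonant-vowel boundary)."""
    n = int(duration * sr)
    samples = []
    phase = 0.0
    for i in range(n):
        t = i / n
        f = f_start + (f_end - f_start) * t   # instantaneous frequency
        phase += 2 * math.pi * f / sr          # accumulate phase
        samples.append(math.sin(phase))
    return samples

# A fast 40 ms sweep (hard for some listeners to resolve) ...
fast = formant_transition(2200, 1200, duration=0.040)
# ... slowed to 80 ms: same start/end frequencies, half the sweep rate.
slow = formant_transition(2200, 1200, duration=0.080)

print(len(fast), len(slow))  # 640 1280
```

Both signals traverse the same frequencies; the slowed version simply gives the auditory system twice as long to register the change.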

Of course, to learn a language, it is not enough simply to recognize the difference between pa and ba, or la and ra. To understand speech—and to produce it one day—babies must accomplish another exceedingly difficult feat of voice perception. Though it might seem, to us, as if we insert tiny gaps of silence between words when we speak (like the spaces between words on a printed page), that’s a perceptual illusion. All voiced language is actually an unbroken ribbon of sounds all slurring together. To learn our native tongue, we had to first cut that continuous ribbon into individual words—not easy when you’re a newborn and have no idea what any words mean. You can get an idea of what you were up against by listening to a YouTube clip of someone speaking a language you don’t know: Croatian, or Swahili, or Tagalog. Try listing ten words. You can’t do it because you can’t tell where one word ends and another begins. This is the problem you faced at birth—and, by around eight months, had solved.
Here’s how. Despite appearances, babies, reclining in their strollers or lying in their cribs, are anything but passive receptors of the speech that resounds all around them. Indeed, even before birth—from the seventh month of gestation onward—the fetus runs a complex statistical analysis on the voices it perceives, and registers patterns. The sucking test shows that one pattern newborns detect is word stress.12 English, on average, emphasizes the first syllable of words: contact, football, hero, sentence, mommy, purple, pigeon; words that emphasize the second syllable (like surprise) are far less common. In French, it’s the reverse—a weak-strong pattern: “bonjour,” “merci,” “vitale,” “heureux.” Babies zero in on these patterns and use them to locate word boundaries. Take a mystifying sequence of speech sounds like:
staytleeplumpbuk…
An American baby will apply English’s strong-weak probability to identify the first sound clusters (staytlee) as a possible stand-alone word (STAYT-lee—or “Stately”). The next two syllables, however (plumpbuk), don’t make an English word, no matter what stress pattern you apply (PLUMP-buk; plump-BUK). To deal with that, the baby uses another type of statistical analysis. In all languages, the likelihood that one speech sound will follow another is highest within words, less likely across words. Patricia Kuhl supplies a good example from Polish, where the zb combination is common, as in the name Zbigniew.13 But in English zb occurs only across word boundaries, as in “leaveZ Blow” or “windowZ Break”—and thus crops up less frequently. Sophisticated listening tests show that eight-month-olds use these “transition probabilities” to segment the sound stream—and babies can do this after just two minutes’ exposure to a stream of unfamiliar speech sounds.
This staggering speed of learning speaks to Darwin’s assertion, in The Descent of Man, that speech acquisition in children reveals not an instinct for language, but an instinct to learn—as in an English baby’s lightning-fast realization that the pb in plumpbuk is illegal and that it makes sense to split the speech stream there, to create the separate chunks plump and buk. Eventually, the child will use both statistical strategies to help segment the entire sequence and arrive at the first words of James Joyce’s Ulysses:
Stately, plump Buck Mulligan came from the stairhead…
She will accomplish this stunning feat before her first birthday, well before she has the least clue about what any of the words actually mean. But in snipping the sound ribbon into its separate parts, the baby stands a chance of figuring out how to assign meaning to each small cluster of sounds—clusters we call “words.”
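The transition-probability strategy described above can be sketched as a toy algorithm: tally how often each syllable follows each other syllable across a stream of speech, then cut the stream wherever the observed transition is rare. The micro-corpus below is invented for illustration (it reuses the staytlee and plumpbuk clusters from the text); infant learning is, of course, not literally this code.

```python
from collections import defaultdict

def transition_probs(streams):
    """Estimate P(next syllable | current syllable) from syllable streams."""
    counts = defaultdict(lambda: defaultdict(int))
    for stream in streams:
        for a, b in zip(stream, stream[1:]):
            counts[a][b] += 1
    probs = {}
    for a, nexts in counts.items():
        total = sum(nexts.values())
        probs[a] = {b: c / total for b, c in nexts.items()}
    return probs

def segment(stream, probs, threshold=0.5):
    """Insert a word boundary wherever the syllable-to-syllable
    transition probability drops below the threshold."""
    words, word = [], [stream[0]]
    for a, b in zip(stream, stream[1:]):
        if probs.get(a, {}).get(b, 0.0) < threshold:  # rare transition
            words.append(word)                        # -> word boundary
            word = []
        word.append(b)
    words.append(word)
    return [" ".join(w) for w in words]

# Invented micro-corpus: "stayt-lee" and "plump-buk" always occur as units,
# but which word follows which varies -- so within-word transitions are
# frequent and across-word transitions are rare.
corpus = [
    ["stayt", "lee", "plump", "buk"],
    ["plump", "buk", "stayt", "lee"],
    ["stayt", "lee", "stayt", "lee", "stayt", "lee"],
]
probs = transition_probs(corpus)
print(segment(["stayt", "lee", "plump", "buk"], probs))
# -> ['stayt lee', 'plump buk']
```

Here P(lee | stayt) and P(buk | plump) are 1.0 because those pairs always co-occur, while P(plump | lee) is low because "lee" is followed by different syllables on different occasions, so the cut falls between the two words.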

Babies do not do all this work on their own. They receive significant help from adults, who unconsciously adopt a highly artificial vocal style when addressing them.
Remarkably, no language expert took any formal notice of the unusual way we talk to infants until 1964, when Charles A. Ferguson, a linguist at Stanford University, published the paper “Baby Talk in Six Languages.” It catalogued the identical way parents speak to babies in a slew of widely different tongues, including Syrian Arabic, Marathi (a language of western India), and Gilyak (spoken in Outer Manchuria), as well as English and Spanish. In each instance, caregivers prune consonants (as when English parents use “tummy” rather than “stomach”) and use onomatopoeia (in English, “choo choo” for “train,” and “bow wow” for “dog”).14 Ferguson was not, however, investigating how babies learn to speak—you could even say he was doing the exa...
