Part I
Embodied language
1
From Modal Brain Structures to Word Meaning
Raphael Fargier
The first time he saw my air-plane, for instance
(I shall not draw my air-plane; that would be much too complicated for me), he asked me:
- What is that object?
- That is not an object. It flies. It is an air-plane. It is my air-plane.
And I was proud to have him learn that I could fly.
(Antoine de Saint-Exupery, The Little Prince, 1943)
In this short passage extracted from the famous novel by Saint-Exupery, the narrator describes the first encounter of the little prince with a plane: how he taught him what this object was, what it was used for, to whom it belonged and, of course, what its name was. Following this, one could argue that every piece of information, from the visual properties of the object to more abstract information such as its affiliation, is related to this particular experience of the first encounter with the plane. That meaning is anchored in one’s perceptual, affective and motor experience is one of the core statements of embodied semantics (Glenberg 1997; Barsalou 1999, 2008; Pulvermüller 1999, 2005; Gallese et al. 2004; Keysers and Perrett 2004; Pecher and Zwaan 2005; Gallese and Lakoff 2005; Fischer and Zwaan 2008; Meteyard et al. 2012; Gallese and Sinigaglia 2011; Kiefer and Pulvermüller 2012).
There is evidence that the brain structures devoted to perception and action, in other words the modal structures, are involved in semantic processing. For instance, it has been shown that reading or listening to words that refer to motor actions triggers activity in regions involved in the execution of the actions depicted by the words (e.g. Hauk, Johnsrude, and Pulvermüller 2004; Tettamanti et al. 2005; Kemmerer et al. 2008). However, several studies have also shown that the patterns of activity observed during conceptual processing and/or during processing of words and the patterns observed during perceptual/motor processing are not aligned (Willems et al. 2010). This might suggest that there is not a strict one-to-one mapping between referents (objects or events), concepts and their corresponding words.
In this chapter, we focus on the idea that breakthroughs in our understanding of how the brain processes word meanings can only be achieved by addressing the critical question of how the networks underlying semantic representations1 develop in the first instance. We discuss the neural correlates underlying the acquisition of word-referent relationships, and the extent to which semantic networks are shaped by referential labels. To do so, however, it is useful to begin with mature functioning, in other words, with how semantic representations of well-known words are retrieved in adults. In the first section, we review evidence for two levels of semantic representations in modality-specific and heteromodal brain regions. In the second section, we show that these two levels of representations offer the flexibility needed by comprehenders. We then focus on how these representations develop by reviewing learning experiments. Finally, we consider learning studies that examined the specificity of language in the development of these representations. Besides informing on the degree of embodiment of word meaning, this will be critical for assessing the degree of interaction between language and conceptual structure.
Evidence for two levels of representations
Modal structures and semantic representations
Patient studies were the first to suggest that modal structures are recruited during semantic processing. Case studies show selective deficits for one or several categories while other categories are spared (Capitani et al. 2003 for a review; Caramazza and Mahon 2003). Despite variability in brain impairments, these category-related deficits can be accounted for by damage to areas responsible for processing specific modality-dependent information (Warrington and McCarthy 1983; Warrington and Shallice 1984). Behavioural and neuroimaging studies have provided compelling evidence on this issue. For instance, behavioural studies have repeatedly reported cross-talk between action and language systems (Gentilucci and Gangitano 1998; Gentilucci et al. 2000; Glenberg and Kaschak 2002; Boulenger et al. 2006, 2008; Glenberg et al. 2008; Nazir et al. 2008; Fischer and Zwaan 2008; Scorolli et al. 2009; Dalla et al. 2009; Aravena et al. 2010; Chersi et al. 2010; Fargier, Ménoret, et al. 2012; Shiller et al. 2013; de Vega et al. 2013) or between perception and language systems (Stanfield and Zwaan 2001; Meteyard et al. 2007; Richter and Zwaan 2010). Studies that used electroencephalographic recordings (EEG) added to this by revealing distinct neural correlates for words that pertain to different semantic categories (Preissl et al. 1995; Koenig and Lehmann 1996; Pulvermüller et al. 1996; Pulvermüller, Lutzenberger et al. 1999; Pulvermüller, Mohr et al. 1999; Hauk and Pulvermüller 2004; Barber et al. 2010; Ploux et al. 2012; see Vigliocco et al. 2011 for a review).
A large number of imaging studies have reported that processing words referring to actions activates motor structures of the brain (Hauk, Johnsrude, and Pulvermüller 2004; Aziz-Zadeh et al. 2006; Aziz-Zadeh and Damasio 2008; Kemmerer et al. 2008; Boulenger et al. 2009, 2011; Raposo et al. 2009), whereas processing words that refer to gustatory (Barrós-Loscertales et al. 2012), olfactory (González et al. 2006), auditory (Kiefer et al. 2008) or visual (Pulvermüller and Hauk 2006; Simmons et al. 2007; Desai et al. 2009) sensations triggers activity in the brain regions involved in the perception of such sensations (see also Goldberg et al. 2006). Finally, Vigliocco and colleagues (2014) recently reported that the hedonic valence of words modulated activity in the rostral anterior cingulate cortex, a region associated with emotion processing. Altogether, these data support the idea that semantic representations of concrete words may be related to (or grounded in) sensory and motor knowledge, while those of abstract words may be grounded in affective and emotional experiences (Vigliocco et al. 2014).
The notion of convergence zones
Nonetheless, there is now a general consensus that conceptual or semantic content is not an exact copy of the perceptual, motor or internal states that are captured during experience (Barsalou 1999). In fact, the retrieval of this knowledge might be mediated by convergence zones (Damasio 1989; Simmons and Barsalou 2003; Meyer and Damasio 2009). Damasio’s proposal rests on the idea that heteromodal areas bind information from distinct modality-specific regions into coherent events, while not containing refined representations. In their related proposal, Simmons and Barsalou (2003) suggested that the association areas that are close to visual brain areas would capture visual activation patterns, whereas association areas close to motor brain regions would capture corresponding motor activation patterns. Hence, it is not primary modal cortices that participate in the retrieval of modality-specific semantic information but rather the association areas that are close to them.
In fact, most of the empirical evidence on embodied semantics shows that only adjacent motor and sensory areas are engaged during semantic processing (Willems and Hagoort 2007; Kiefer et al. 2008; Binder and Desai 2011; Meteyard et al. 2012). Critically, when different kinds of mental representations are directly compared, such as motor imagery and action language (Willems et al. 2010) or, more generally, action observation and action language perception/production (Tremblay and Small 2010), activity in parts of the motor regions does not overlap. Moreover, with regard to the relationship between conceptual representations and word meanings, it is assumed that concepts comprise distributed feature representations (Barsalou 2008; Kiefer and Pulvermüller 2012) while word meanings bind these distributed representations for the purpose of language use (Vigliocco and Vinson 2007). Therefore, there might be some kind of intermediary lexical unit that binds these features for linguistically mediated communication.
Interestingly, the presence of intermediary representations mediated by ‘convergence zones’ that link word-form representations to perceptual and motor (semantic) aspects of words has been postulated in theories of lexical access (Levelt et al. 1999). In Mesulam’s proposal (1998), the achievement of verbal naming includes a first step that comprises the activation of the ‘intermediary’ labeling area ‘Lex’ (proximal to perceptual and motor regions) to encode prelexical representations (see also Vigliocco et al. 2004; Nazir et al. 2012). This first process echoes the lemma level that appears in all (Dell 1986; Dell and O’Seaghdha 1992; Levelt 1999; Levelt et al. 1999; Indefrey and Levelt 2004; Indefrey 2011; Roelofs 2014) but one (Caramazza 1997, ‘empty lemma’) theory of lexical access. In these psycholinguistic theories, the lemma allows the mapping between the concept and the corresponding lexical item and is thought to code abstract word properties such as semantic or syntactic features but not phonological forms. Hence, Zorzi and Vigliocco (1999) underlined that the lemma is well suited to be the lexical unit that binds semantic features for the purpose of language use and that the organization of this lemma must be constrained by the conceptual level (Vigliocco et al. 2004). Despite differences in the flow of information between language production and language comprehension that may result in the recruitment of specific neural networks, both modal regions and convergence zones that mediate more abstract information seem to be part of the networks that underlie semantic representation. Moreover, because language can be seen as a modulator of a distributed and interactive system (Kemmerer and Gonzalez-Castillo 2010; Lupyan 2012), lexical-semantic representations and conceptual (non-linguistic) representations might be dissociable but interacting for the purpose of language use (Evans 2009).
It should be noted that several proposals, for instance the LASS theory (Language and Situated Simulation) (Barsalou et al. 2008), combine modal representations and linguistic representations (see also Zwaan 2014). These proposals argue that word meanings are partly context-specific and are flexibly activated by adults. Sensory, motor and affective aspects of word meaning are thus thought to be activated according to task and context factors (the idea of ‘situated simulations’). This idea is developed in the following section.
How lexical semantic representations interact with mental models to construct meaning
In Evans’s view (2009), this context-dependency is illustrated by the non-linguistic knowledge that forms the cognitive or mental model. Access to meaning therefore requires confronting the lexical concept (the means of modeling the units of semantic structure, what we referred to as semantic representation) with this cognitive or mental model (the means of modeling the conceptual structure) (Evans 2009) (see Figure 1.1b).
The idea of having two levels of semantic representations, mediated either by modal structures or lexical convergence zones in association areas, or by higher-level structures that help in the recollection of experiential states, is coherent with the degree of flexibility that is needed by comprehenders to construct mental models. On the one hand, modality-specific or, most likely, association areas (i.e. low-level convergence zones) may entail relevant representations for many situations, including, for instance, those in which the specific experiential traces are explicitly needed. On the other hand, more abstract representations (Barsalou 1999) or multimodal schemata (Richter et al. 2009) might be required for conceptual combination, reasoning and situational model computation.
Figure 1.1 a) Schematic presentation of the semantic/conceptual representations in the brain. Modal area corresponds to any modality-specific area such as primary sensory or motor areas. CZ: Convergence Zones. CZ corresponds to non-linguistic convergence zones, in contrast to Lexical CZ, which are specific to words. Asterisks (*) indicate that a hierarchical system of convergence zones is assumed, with low-level CZ and higher-level CZ, the latter mediating more complex representations. b) The interaction between cognitive model and lexical/semantic representations as expressed in the Theory of Lexical Concepts and Cognitive Models (LCCM) from Evans (2009). A parallel can be drawn between the lexical concept (b) and what is mediated by lexical convergence zones (a).
Indeed, it is assumed that meaning retrieval benefits from a representational state of the situation described by a context (Nieuwland and Van Berkum 2006; Hagoort and van Berkum 2007; Metusalem et al. 2012). This representational state, which can assimilate information about space, objects and events but also time, social relations and mental acts (Frank and Vigliocco 2011), corresponds to the ‘mental model’, ‘situation model’ or ‘cognitive model’ that have been introduced by linguists and philosophers (Johnson-Laird 1983; Barsalou et al. 2008; Zwaan and Radvansky 1998; Zwaan and Madden 2004; Evans 2006; see Aravena et al. 2014; Zwaan 2014). Activity in modal brain regions during language comprehension can thus be considered as ‘situated simulations’ (Barsalou et al. 2008; see Chapter 12 by Gamez-Djokic et al. for a review on embodied simulations).
Empirical evidence for the use of such mental models comes from the disparities observed in language-induced perceptual and motor activity (Willems and Casasanto 2011). Interestingly, the contribution of modal structures to conceptual or word processing is sensitive to linguistic as well as extra-linguistic context (Hoenig et al. 2008; Sato et al. 2008; Raposo et al. 2009; van Dam et al. 2011; Rueschemeyer et al. 2010; van Dam et al. 2010; Aravena et al. 2012, 2014...