The Handbook of Spanish Second Language Acquisition

About this book

Bringing together a comprehensive collection of newly-commissioned articles, this Handbook covers the most recent developments across a range of sub-fields relevant to the study of second language Spanish.

  • Provides a unique and much-needed collection of new research in this subject, compiled and written by experts in the field
  • Offers a critical account of the most current, ground-breaking developments across key fields, each of which has seen innovative empirical research in the past decade
  • Covers a broad range of issues including current theoretical approaches, alongside a variety of entries within such areas as the sound system, morphosyntax, individual and social factors, and instructed language learning
  • Presents a variety of methodological approaches spanning the active areas of research in language acquisition

Information

Year
2013
Print ISBN
9781119457053
9780470674437
Edition
1
eBook ISBN
9781118584439

Part I

Theoretical and Methodological Approaches to the Study of Second Language Spanish

Chapter 1

Corpus-based Research in Second Language Spanish

Amaya Mendikoetxea

1.1 Introduction

Second Language Acquisition (SLA) is a diverse field, both conceptually and empirically. Conceptually, it draws from several disciplines (linguistics, psychology, sociology, etc.) and encompasses a variety of theoretical frameworks. Empirically, it relies on data types drawn from different data elicitation techniques and a variety of methodological approaches. From a cognitive perspective, the main objective of SLA research is to build models of the underlying systems of knowledge that learners have at a particular point in the SLA process (their interlanguage) and to provide a principled account of how that knowledge is acquired and how it develops. As Myles (2005, 372) points out, “the language produced by learners, whether spontaneously or through various elicitation procedures, remains a central source of evidence for these mental processes, and the success of SLA research therefore relies on having access to good-quality data.” Learner language is thus primary data for the study of SLA, and learner corpora (a special type of corpora containing second language (L2) learners' written or oral language samples; see Section 1.2) should occupy a central role in SLA research.
Methodologically, L2 researchers have traditionally, but not exclusively, relied on (quasi)experimental and introspective data (see overviews in, e.g., Gass and Mackey 2007; Mackey and Gass 2005; Mitchell and Myles 2004; White 2003). While the use of large-scale corpora has become standard practice in first language (L1) acquisition research, large L2 corpora are still scarce and relatively little use has been made of corpora in L2 research. In this chapter, I discuss the use of learner corpora in the study of L2 Spanish acquisition. It is not my intention to provide a comprehensive survey of the corpora available and related work but to describe the most relevant projects as examples of what learner corpora can contribute to the field of SLA. In Section 1.2, I define learner corpora and learner corpus research. In Section 1.3, two currently available L2 Spanish learner corpora are described: a spoken corpus, SPLLOC (Spanish Learner Language Oral Corpus), and a written corpus, CEDEL2 (Corpus Escrito del Español como L2), as well as the research carried out with them. A brief overview of corpus-based research in L2 Spanish is also provided. In Section 1.4, I point out the way forward for corpus-based SLA research.

1.2 Learner Corpora and SLA

1.2.1 What learner corpora are and why we need them

Based on Sinclair's (1996) definition of language corpora, Granger (2002) defines learner corpora as:
Electronic collections of authentic F(oreign) L(anguage)/S(econd) L(anguage) textual data assembled according to explicit design criteria for a particular SLA/FLT(eaching) purpose. They are encoded in a standardised and homogeneous way and documented as to their origin and provenance. (Granger 2002, 7)
The compilation and exploitation of a learner corpus requires a wider range of expertise than is required for native language corpora (Granger 2009, 15). On the one hand, researchers need to be familiar with the methodology of corpus linguistics: corpus design, corpus annotation, automated data extraction and analysis, and so on. This has the additional complication that most available tools have been designed for native corpora and are therefore not fully suitable for learner corpora; one example is Part-of-Speech (POS) tagging, which, as Granger (2009, 15) points out, is affected by the high rate of errors in learner language. On the other hand, a good background in linguistic theory, as well as SLA theory, is necessary for analyzing and interpreting the data. These two types of expertise are not often found together: “many corpus-based researchers do not know enough about the theoretical background of SLA research to communicate with them [SLA researchers] effectively, while SLA researchers typically know little about what corpora can do for them” (Tono 2003, 806).
One of the main contributions of learner corpora is that they provide a much wider empirical base than has previously been available. SLA studies are often conducted on the basis of a very limited number of subjects, which raises questions about whether results can be generalized (Granger 2002, 6). These studies have served the purpose of hypothesis-building in SLA research, but there is an increasing awareness of the need to test hypotheses on larger and better constructed databases (see Myles 2005, 2007a, 2007b). Moreover, corpora are often used in an exploratory fashion: to discover sets of data not normally found in small studies, which can become crucial to inform current debates in SLA, and to discover patterns of use, as well as for quantitative studies (e.g., frequency). The latter is especially useful for usage-based approaches and input-driven models of SLA (see Gries 2008), but corpora may also be used to inform current debates on the role of input in more formal approaches to SLA. Finally, the use of corpora, by the very nature of the data (contextualized discourse), enables researchers to tackle some previously neglected aspects of SLA, such as lexis, phraseology, information structure, and so on, instead of just morphology and syntax, which are the traditional focus of SLA research (Granger 2009, 17).
Some caveats are necessary regarding two apparent dichotomies emerging from the definition of learner corpora and the discussion above: authentic vs. non-authentic (elicited) data and corpus vs. experiments in SLA. It is actually difficult to define what constitutes “authentic” data when dealing with learners' production. Granger (2002, 8) defines authentic learner data in instructional settings as data resulting from authentic classroom activity: texts that are produced for pedagogical reasons and for the corpus, but that use procedures exerting very little control. Compositions guided by pictures and typical experimental data resulting from elicitation techniques are not, according to this author, “authentic” language samples. For Nesselhauf (2004, 128): “Since the distinction between more or less controlled is, naturally, not clear-cut, such collections might be considered peripheral parts of learner corpora”. Sinclair (1996) claims that data collected through major intervention by the linguist form “experimental” corpora. It turns out that learner corpora are most often either semi-authentic or experimental, rather than fully natural, but still a highly valuable source of learner language. The position in the scale of naturalness depends on the degree of control researchers wish to exert over the data and this is, in turn, dependent on their research questions. Given that corpora contain data with varying degrees of naturalness, there is in fact no strict corpora-experiments dichotomy (see Gilquin and Gries 2009, 6). Additionally, a growing number of researchers are arguing for combining data extracted from different sources (see Sections 1.3.2 and 1.3.3). However, this has to be done in a systematic way in order to obtain reliable conclusions on converging evidence. As there is no direct access to learners' interlanguage, we need to triangulate from all available sources of data: investigating different types of behavior may help us narrow down the range of possibilities. Combining naturalistic and experimental data is crucial for this purpose.

1.2.2 Learner corpus research in SLA

Learner corpus research is L2 research which uses learner corpora as the main source of data. Interest in learner corpus research was sparked mostly by the publication of the first version of ICLE (International Corpus of Learner English, Granger, Dagneaux, and Meunier 2002), the starting point in the exploitation of large-scale learner corpora. ICLE consists of 2.5 million words of argumentative essays by L2 English university students, organized in different subcorpora according to the learners' L1: Spanish, Italian, French, Russian, etc. Most of the studies done with ICLE have analyzed lexical aspects of learner language, probably due to limitations in concordancers and query software. However, some researchers have gone beyond the word by analyzing phrases and structures (Fitzpatrick 2007), collocations (Nesselhauf 2005), and word order alternations (Lozano and Mendikoetxea 2008, 2010). Research within the ICLE tradition is inherently contrastive. Contrastive Interlanguage Analysis (CIA) (see, e.g., Granger 1996; Gilquin 2001) is the term used for a research paradigm which establishes comparisons between (i) two (or more) interlanguage varieties (e.g., L1 Spanish–L2 English vs. L1 Italian–L2 English), and (ii) L1 and L2 grammars, by comparing native and non-native corpora.
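The kind of comparison CIA involves can be illustrated with a minimal Python sketch. It is not taken from ICLE or from any of the studies cited: the sample sentences are invented, the tokenizer is deliberately naive, and the function names (tokenize, per_thousand, compare) are hypothetical. The only point it makes is that frequencies must be normalized (here, per 1,000 tokens) before subcorpora of different sizes can be compared.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (naive, for illustration only)."""
    return re.findall(r"\w+", text.lower())


def per_thousand(tokens, target):
    """Relative frequency of `target` per 1,000 tokens; 0.0 for an empty sample."""
    if not tokens:
        return 0.0
    return 1000 * Counter(tokens)[target] / len(tokens)


# Invented toy samples standing in for real subcorpora (assumption, not real data).
subcorpora = {
    "L1 Spanish - L2 English": "there exist many problems and there exist solutions",
    "L1 Italian - L2 English": "many problems exist but solutions exist too and more",
    "native English": "many problems exist and so do many good solutions",
}


def compare(samples, target):
    """Normalized frequency of `target` in each subcorpus, keyed by subcorpus name."""
    return {name: per_thousand(tokenize(text), target) for name, text in samples.items()}


if __name__ == "__main__":
    for name, freq in compare(subcorpora, "there").items():
        print(f"{name}: {freq:.1f} per 1,000 tokens")
```

A real study would replace the toy strings with full subcorpora and would add a significance test, since raw differences in normalized frequency say nothing about whether the contrast is reliable.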
As a whole, studies using learner corpora in SLA fall within two categories: (i) hypothesis-driven/corpus-based studies and (ii) hypothesis-finding/corpus-driven studies (see Barlow 2005; Granger 1998; Tognini-Bonelli 2001). This reflects the tension between deductive vs. inductive approaches in language acquisition research (see Myles 2007b for an overview and discussion), with most studies falling within category (ii) (e.g., Aston, Bernardini, and Stewart 2004; Granger, Gilquin, and Meunier, forthcoming; Granger, Hung, and Petch-Tyson 2002). Learner corpus research is a relatively young but very active field. As pointed out by Díaz Negrillo and Thompson (forthcoming), in the last decade we have seen an increasing number of resources, a broadening of the uses that learner corpora are put to, and a wider diversity of users. While research within this field is increasingly important, its contribution has been much more substantial in the description than in the interpretation of SLA data (Granger 2004, 134–135), with very little reference to current debates, hypotheses, and theories of SLA (Myles 2005).

1.2.3 Corpus design: deciding on criteria for learner corpora

Most available learner corpora are of L2 English. Most are written and only a few of them are spoken. Where different proficiency levels are represented in a corpus, most are cross-sectional, offering a cross-section of the learner population (containing texts from groups of learners at different proficiency levels collected at the same point in time), and very few are longitudinal, following learners' development over a period of time (containing texts from the same learners at different stages of acquisition).
For a corpus to be useful for linguistic analysis, it is crucial to have strict design criteria that follow standard practices (see, for instance, Wynne 2005). In practice, however, most learner corpora are opportunistic; researchers collect data which are readily available and do not require vast investments of resources and time. Some are designed following ad hoc methodology, that is, to elicit particular types of structures or lexical items (against what is considered good practice in corpus design, according to Sinclair 2005). The collection of texts gathered in a corpus has to be (i) representative, with a high degree of inclusiveness and a low degree of language bias, so that the corpus could potentially contain all likely morphosyntactic forms and a variety of language structures and vocabulary items (see Sinclair 2005; Gries 2008), and (ii) balanced, containing a fair and equally proportioned sample of each of the language varieties it is supposed to be representative of (e.g., a roughly equivalent number of words for each of the proficiency levels in cross-sectional and longitudinal corpora).
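The balance criterion in (ii) lends itself to a simple check. The following Python sketch is only an illustration under stated assumptions: words are counted by whitespace splitting, "roughly equivalent" is operationalized as every level lying within a chosen tolerance of the mean size, and both the 20% default and the function names are hypothetical rather than drawn from the corpus design literature cited above.

```python
from collections import defaultdict


def words_per_level(texts):
    """Sum whitespace-delimited word counts for each proficiency level.

    `texts` is an iterable of (level, text) pairs.
    """
    totals = defaultdict(int)
    for level, text in texts:
        totals[level] += len(text.split())
    return dict(totals)


def is_roughly_balanced(totals, tolerance=0.2):
    """True if every level's word count is within `tolerance` of the mean count.

    The tolerance is an arbitrary illustrative threshold, not an established standard.
    """
    if not totals:
        return True
    mean = sum(totals.values()) / len(totals)
    return all(abs(count - mean) <= tolerance * mean for count in totals.values())


if __name__ == "__main__":
    # Invented placeholder texts; a real corpus would supply learner productions.
    sample = [
        ("beginner", "me gusta la playa"),
        ("intermediate", "me gusta ir a la playa en verano"),
        ("advanced", "aunque prefiero la montaña, suelo ir a la playa"),
    ]
    totals = words_per_level(sample)
    print(totals, is_roughly_balanced(totals))
```

A check of this kind addresses balance only; representativeness depends on how the texts were sampled and cannot be verified from word counts alone.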
In addition, to be useful for SLA research, learner corpora should incorporate a reliable measure of learners' proficiency (see Tono 2003) to allow for contrastive analyses of learners' interlanguage at different proficiency levels, as well as for developmental research. This, together with other types of background information (e.g., L1, length of exposure, learning environment, etc.), is essential to conduct L2 research concerning interlanguage grammars, as well as, for instance, critical period effects, language use patterns, likely cross-linguistic effects, stay abroad effects, and so on. Finally, decisions have to be made about annotation, which is often manual or semiautomatic. Though standard annotation would be desirable, researchers tend to adopt their own annotation schemes to suit their research purposes. Rutherford and Thomas (2001) argue in favor of reexamining the procedures and tools of the CHILDES project, originally conceived for L1 acquisition, to explore their potential for learner corpus analysis: the set of transcription conventions in CHAT (Codes for the Human Analysis of Transcripts) and the CLAN (Computerized Language ANalysis) suite for POS tagging. Using the same annotation scheme facilitates sharing and comparing results.
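One way to picture the background information described above is as a metadata record attached to each text. The Python sketch below is hypothetical throughout: the field names, the 0 to 100 proficiency scale, and the select function are illustrative assumptions and do not reproduce the metadata scheme of any corpus discussed in this chapter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LearnerText:
    """One corpus text plus illustrative learner background variables."""
    text: str
    l1: str                    # learner's first language
    proficiency: float         # placement-test score on an assumed 0-100 scale
    years_of_exposure: float   # length of exposure to the L2
    stay_abroad: bool          # whether the learner has lived in an L2-speaking country


def select(corpus, l1=None, min_proficiency=0.0):
    """Return the texts matching an optional L1 and a minimum proficiency score."""
    return [
        t for t in corpus
        if (l1 is None or t.l1 == l1) and t.proficiency >= min_proficiency
    ]


if __name__ == "__main__":
    # Invented records for illustration only.
    corpus = [
        LearnerText("Yo soy estudiante.", "English", 35.0, 1.0, False),
        LearnerText("Llevo tres años estudiando español.", "English", 78.0, 3.0, True),
        LearnerText("Me mudé a Madrid el año pasado.", "Greek", 82.0, 4.0, True),
    ]
    for t in select(corpus, l1="English", min_proficiency=50.0):
        print(t.text)
```

Storing proficiency as a score rather than a course-level label is what makes cross-sectional grouping and developmental comparisons possible after the fact.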

1.3 Spanish Learner Corpora

This section offers a brief overview of L2 Spanish learner corpora, focusing on two c...

Table of contents

  1. Cover
  2. Blackwell Handbooks in Linguistics
  3. Title Page
  4. Copyright
  5. List of Figures
  6. List of Tables
  7. Notes on Contributors
  8. Acknowledgments
  9. Introduction
  10. Part I: Theoretical and Methodological Approaches to the Study of Second Language Spanish
  11. Part II: Phonology in Second Language Spanish
  12. Part III: Developing Grammars in Second Language Spanish
  13. Part IV: Individual and Social Factors in Second Language Spanish
  14. Part V: Acquisition in the Second Language Spanish Classroom
  15. Index

The Handbook of Spanish Second Language Acquisition by Kimberly L. Geeslin is available in PDF and/or ePUB format, alongside other books in Languages & Linguistics.