
- 214 pages
- English
About this book
Corpus Linguistics for ELT provides a practical guide to undertaking ELT-related corpus research. Aimed at researchers, advanced undergraduate and postgraduate students of ELT and TESOL, and English language teachers, this volume:
- covers corpus research in the main areas of language study relevant to ELT: grammar, lexis, ESP, spoken grammar and discourse;
- presents a review of relevant corpus research in these areas, and discusses the implications of this research for ELT;
- suggests potential ELT-focused corpus research projects, and equips the reader with all the required tools and techniques to carry them out;
- deals with the growing area of learner corpora and direct classroom application of corpus material.
Corpus Linguistics for ELT empowers and inspires readers to carry out their own ELT corpus research, and will allow them in turn to make a significant contribution to corpus-informed ELT pedagogy.
Corpus Linguistics for ELT by Ivor Timmis is available in PDF and/or ePUB format.
Chapter 1 Introduction
DOI: 10.4324/9781315715537-1
Aims
The challenge of fostering a fruitful relationship between corpus linguistics and ELT was clearly set out by Conrad (2000: 556):
Corpus grammarians must strive to reach more audiences that include teachers and must emphasize concrete pedagogical applications … In fact, the strongest force for change could be a new generation of ESL teachers who were introduced to corpus-based research in their training programs [and] have practiced conducting their own corpus investigations and designing materials based on corpus research.
Indeed, this comment by Conrad encapsulates the main aim of this book: to help move corpus linguistics from what Römer (2012) terms its ‘minority sport’ status in language teaching to a point where the ability to carry out and interpret corpus research is seen as a normal part of an English language teacher’s repertoire. Familiarity with corpus research and practice should be a standard part of an English language teacher’s toolkit, I would argue, because most people in ELT will at some time have had thoughts like these:
- How many words do my learners need to learn?
- Why is everyone talking about lexical chunks and collocations?
- Do my students really need this grammar point?
- Which words should I use to exemplify this structure?
- Am I teaching my learners language they will need to use when they speak the language?
- Does the grammar explanation in the coursebook really reflect how we use this structure?
- What vocabulary do my English for dentistry students need to get their teeth into?
If you have had questions like these, this book is designed to help you to answer them by consulting corpora and corpus-informed literature. It is also designed to help you to generate and investigate similar questions. It is, however, important to keep corpora in perspective throughout this book. The argument presented here is that corpora are a resource and a reference source and, as is the case with all resources, pedagogic judgement is vitally important in determining how and when they are deployed to best effect.
The book does not assume prior knowledge or experience of corpus research; nor does it assume any technical expertise. Technophobes can relax: contemporary corpus interfaces and corpus software are user-friendly and often include tutorial packages. The tasks in this book will help to familiarise readers with publicly available user-friendly corpora such as the British National Corpus hosted at http://corpus.byu.edu/bnc/
And if you know how to save a document, you are, as we shall see in the next chapter, well on the way to being able to compile your own corpus for teaching purposes; and then things get really interesting.
What is a corpus?
Defining a corpus
If you are reading this book, you probably know what a corpus is, but it is useful to draw out some key points from definitions in the literature to be sure that we have a shared understanding. Brazil (1995: 24) defines a corpus as ‘a collection of used language’, explaining that ‘used language’ is ‘language which has occurred under circumstances in which the speaker was known to be doing something more than demonstrate the way the system works’. This definition is useful in that it focuses on the fact that language in a corpus is naturally occurring. We need to note, however, that a corpus is not just a collection of naturally occurring language in the form of isolated words or sentences randomly collected; it consists of spoken and/or written texts (the word ‘text’ in corpus linguistics is used to refer to both spoken and written language). And the collection of texts also has to be purposeful: ‘A corpus is not simply a collection of texts. Rather a corpus seeks to represent a language or some part of a language’ (Biber, Conrad and Reppen 1998: 246). In practice, as McEnery and Wilson (1996) note, in contemporary usage a corpus almost always refers to texts collected in machine-readable form, i.e. electronic texts which can be automatically analysed with software packages. For our purposes, it is important to note that while ‘big-name’ corpora such as the British National Corpus (BNC) and the Corpus of Contemporary American English (COCA) consist of hundreds of millions of words, size is not an absolute criterion for corpus design: size is a question of fitness for purpose. O’Keeffe, McCarthy and Carter (2007: 4) stress that the design of the corpus is more important than the size:
For corpora of spoken language, anything over a million words is considered to be large; for written corpora, anything below five million is considered quite small. In terms of suitability, however, it is often the design of a corpus as opposed to its size which is the determining factor.
It is the design of a corpus which will ensure that it represents what it seeks to represent. Design issues include demographic factors such as gender, age and social class, as well as questions of the genres and contexts of the language included in the corpus. Even a very large corpus such as the BNC self-evidently does not tell us how English is used in the USA, in India, or as a lingua franca between non-native speakers.
Types of corpus
It is important to be aware of the range of corpora available (see Appendix 2 for a fuller list). While large general corpora such as BNC and COCA have both written and spoken components, many corpora are either written or spoken. The five million word CANCODE (Cambridge and Nottingham Corpus of Discourse in English) is a well-known spoken corpus often cited in ELT studies. There are also English for Specific Purposes corpora, e.g. MICASE (Michigan Corpus of Academic Spoken English), CANBEC (Cambridge and Nottingham Business English Corpus) and the Hong Kong Engineering corpus. For ELT purposes, corpora of non-native English are important, e.g. VOICE (Vienna–Oxford International Corpus of English), a spoken corpus of English used as a Lingua Franca (ELF). Learner corpora are a specific type of non-native corpus, self-evidently containing data produced by learners of English, e.g. ICLE (International Corpus of Learner English) which ‘contains argumentative essays written by higher intermediate to advanced learners of English from several mother tongue backgrounds’ (http://www.uclouvain.be/en-cecl-icle.html). We need to consider one further type of corpus: a pedagogic corpus or, to use Leech’s (1997) term, a teaching-oriented corpus. A pedagogic corpus is one that has been compiled specifically for language teaching purposes. An interesting suggestion for ‘pedagogic corpora’ has been made by Willis (2003), who proposes that a pedagogic corpus be made up of the texts already used by the learners in class, which are then exploited for the study of particular language features. The advantage of such corpora, Willis (2003) argues, is that learners will already be familiar with the co-text, i.e. the text immediately surrounding the target feature, as they will previously have studied the whole text in class. Similarly, Römer (2006) has suggested that coursebooks themselves can be made into corpora so that ‘coursebook English’ can be compared with ‘real English’.
The SACODEYL (System Aided Compilation and Distribution of European Youth Language) corpus could also be seen as a pedagogic corpus, though it was not compiled from learning materials; it was deliberately constructed for language learning purposes, as described below on the SACODEYL website: ‘The [SACODEYL] corpora are based on structured video interviews with pupils between 13 and 18 years of age. The interviews have been annotated and enriched for language learning purposes.’ http://sacodeyl.inf.um.es/sacodeyl-search2/
While SACODEYL might not be the most transparent project title, it has the significant benefit of being free to access and providing online guidance on how to use it.
Corpus Search
Visit the four websites below and consider which you might find most useful for your teaching, research or studies:
http://corpus.byu.edu/bnc/
http://sacodeyl.inf.um.es/sacodeyl-search2/
http://www.uclouvain.be/en-cecl-icle.html
http://www.univie.ac.at/voice/
What can we do with a corpus?
Questions corpora can answer – quantitative analysis
Though corpus linguistics has come to be seen as a domain of applied linguistics in its own right, it will be useful for our purposes to view it also as a methodology through which various domains of applied linguistics can be investigated, e.g. grammar, lexis, discourse, pragmatics, SLA (second language acquisition). Corpora are most often associated with quantitative research as frequency information can be generated with striking ease. The most basic kinds of frequency question we can ask are:
- What are the most frequent words in our corpus, i.e. rank order?
- How many instances of a given word are there in the corpus, i.e. raw frequency?
- What percentage of the total number of tokens in the corpus does the raw frequency represent, i.e. relative frequency?
- What are the most frequent collocations of a given word in our corpus?
- What are the most frequent phrases of a given length (e.g. 2-word phrases, 3-word phrases, 4-word phrases and so on)?
- What are the most frequent grammatical structures in our corpus?
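The first five of these questions can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two-sentence corpus is invented, the tokenisation rule is deliberately crude, and the function names (`tokenize`, `relative_frequency`, `ngrams`, `collocates`) are hypothetical rather than taken from any corpus tool mentioned in this book. The last question, on grammatical structures, would additionally need a grammatically annotated corpus and is not covered here.

```python
import re
from collections import Counter

# Invented toy corpus; a real corpus would be far larger.
texts = [
    "I think we should go now because I think it is late",
    "Do you know what I mean I mean it is sort of late",
]

def tokenize(text):
    """Lowercase a text and split it into word tokens (a deliberately simple rule)."""
    return re.findall(r"[a-z']+", text.lower())

tokens_per_text = [tokenize(t) for t in texts]
all_tokens = [tok for toks in tokens_per_text for tok in toks]

# Rank order and raw frequency: how often each word form occurs.
word_counts = Counter(all_tokens)
top_words = word_counts.most_common(5)

def relative_frequency(word):
    """Raw frequency as a percentage of all tokens in the corpus."""
    return 100 * word_counts[word] / len(all_tokens)

def ngrams(tokens, n):
    """All contiguous n-word sequences in one text."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

# Frequent 2-word phrases, counted within each text so that
# no phrase spans a text boundary.
bigram_counts = Counter(bg for toks in tokens_per_text for bg in ngrams(toks, 2))

def collocates(node, window=2):
    """Count words occurring within `window` tokens either side of `node`."""
    counts = Counter()
    for toks in tokens_per_text:
        for i, tok in enumerate(toks):
            if tok == node:
                left = toks[max(0, i - window):i]
                right = toks[i + 1:i + 1 + window]
                counts.update(left + right)
    return counts

print(top_words)
print(relative_frequency("i"))
print(bigram_counts.most_common(3))
print(collocates("think").most_common(3))
```

Raw co-occurrence counts like these are only a starting point: corpus software typically also reports association measures, which this sketch does not attempt.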
Each of these questions may be applied with a more specific focus, but we will take word frequency as an example:
- What are the most frequent words used in a given component of the corpus, e.g. academic or business or technical English?
- What are the most frequent words used by a particular demographic group of people, e.g. women, people under 30, people of a given social class or from a given region?
- What are the most frequent words used in a particular kind of text, e.g. scientific articles?
- What are the most frequent words in a given genre, e.g. self-descriptions on internet dating sites?
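More focused questions of this kind amount to counting within a subset of the corpus. The sketch below assumes each text is stored with a metadata label; the `component` field, the three sample texts and the function `frequencies_by` are all invented for illustration and do not reflect how any particular corpus is actually annotated.

```python
import re
from collections import Counter

# Hypothetical mini-corpus: each text carries a metadata label.
corpus = [
    {"component": "academic",
     "text": "The results suggest that the data support the hypothesis"},
    {"component": "academic",
     "text": "The data were analysed and the results were reported"},
    {"component": "business",
     "text": "We need to close the deal before the meeting"},
]

def tokenize(text):
    """Lowercase a text and split it into word tokens (a deliberately simple rule)."""
    return re.findall(r"[a-z']+", text.lower())

def frequencies_by(field):
    """Build one word-frequency table per value of a metadata field."""
    tables = {}
    for doc in corpus:
        tables.setdefault(doc[field], Counter()).update(tokenize(doc["text"]))
    return tables

by_component = frequencies_by("component")
academic_top = by_component["academic"].most_common(3)

print(academic_top)
print(by_component["business"].most_common(3))
```

The same function would work for any other label (speaker age, region, genre), but only if that information was recorded when the corpus was compiled.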
These questions do not exhaust the possibilities, but give some idea of the range of questions which can be asked of a corpus. It is crucial to note, however, that the kind of question which can be investigated depends on the composition of the corpus and the information wh...
Table of contents
- Cover Page
- Half Title Page
- Title Page
- Copyright Page
- Dedication
- Table of Contents
- List of figures
- List of tables
- Acknowledgements
- 1 Introduction
- 2 Building a corpus
- 3 Corpora and lexis
- Conclusion
- 4 Corpus research and grammar
- 5 Spoken corpus research
- 6 Corpora and the classroom
- 7 Corpora and ESP
- 8 Corpora in perspective
- 9 Conclusion
- Appendices
- Index