Corpus Linguistics for Education provides a practical and comprehensive introduction to the use of corpus research methods in the field of education. Taking a hands-on approach to showcase the applications of corpora in the exploration of educationally relevant topics, this book:
• covers 18 key skills including corpus building, the role of frequency, different corpus methods, transcription and annotation;
• demonstrates the use of available corpora and desktop and online corpus analysis tools to conduct original analyses;
• features case studies and step-by-step guides within each chapter;
• emphasises the use of interview data in research projects.
Corpus Linguistics for Education is an essential guide for students and researchers studying or conducting their own corpus-based research in education.
Corpus Linguistics for Education by Pascual Pérez-Paredes is available in PDF and/or ePUB format, alongside other books in Education & Education General.
If you expect to find a definition of corpus linguistics in this opening paragraph, you will not be disappointed. Actually, you will find two. One is short; the second is a bit longer. Corpus linguistics (CL) studies language usage empirically. That was the first definition. It is inspired by McEnery and Wilson (1996: 1): ‘CL is the study of language based on examples of real-life language use’. And this is the slightly longer definition: CL studies the usage of language by examining how representative texts of a given genre reflect the discursive practices of actual language users. Do not worry if the second definition is a bit difficult to process now, or if you just do not seem to find how this may be relevant in education research. The aim of this book is precisely to show you how you can use CL research methods in your area of investigation. We will come back to these definitions throughout this chapter.
Corpus linguists have always been concerned with actual usage. In the past decades, this interest has run parallel to an interest in describing language performance, that is, what people actually say or write. Linguist Geoffrey Leech noted that it was Randolph Quirk’s Survey of English Usage in 1959 and Nelson Francis’s collection of the Brown Corpus in 1962 that contributed to the development of CL applications before the massive, widespread use of computers. Leech observed (Viana, Zyngier & Barnbrook, 2011: 155) that ‘both [linguists] hit on the idea of collecting a large body of texts (and transcriptions) wide-ranging enough to represent, to a reasonable extent, the contemporary English language’. One of the early applications of CL was lexicography. Not so long ago, most dictionary entries contained examples of use made up by lexicographers and a selection of entries based on their expert insight. Before the use of CL methods, lexicographers had consistently tried to portray the meanings of words in the most accurate way, but only in more recent times have they begun to rely on descriptions of language use based on attested uses contributed by a community of speakers and users of the language.
In CL a large body of texts is known as a corpus, hence the name corpus linguistics. A corpus is used to model usage and we can think of a corpus as a proxy for usage. In this view, a corpus is an instrument, a method, that researchers use to answer research questions. Linguists regularly use corpora (plural form of corpus) to investigate questions concerning the characterisation of usage (Figure 1.1).
Figure 1.1 Corpora as a research method
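To make the idea of a corpus as a proxy for usage more concrete, the following minimal Python sketch builds a word frequency list from a toy corpus. The three sentences, the names `TOY_CORPUS`, `tokenize` and `frequency_list`, and the very simple tokenisation rule are all assumptions made for illustration; they are not taken from any of the corpora or tools discussed in this book.

```python
import re
from collections import Counter

# Hypothetical toy corpus: three invented sentences standing in for real texts.
TOY_CORPUS = [
    "The students read the report before the seminar.",
    "The teacher asked the students to discuss the report.",
    "Students often discuss reading in the seminar.",
]


def tokenize(text):
    """Lowercase a text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def frequency_list(texts):
    """Count how often each word form occurs across all texts."""
    counts = Counter()
    for text in texts:
        counts.update(tokenize(text))
    return counts


freq = frequency_list(TOY_CORPUS)
total_tokens = sum(freq.values())

# Show the five most frequent word forms in the toy corpus.
for word, count in freq.most_common(5):
    print(word, count)
```

A real corpus would contain far more text and would normally be sampled to represent a given genre or community of users, but the logic is the same: questions about usage are answered by counting and inspecting what is attested in the texts.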
CL research methods offer us a means to understand how language is used by a group of individuals while engaged in communication. For example, you may want to know how speakers of English in both hemispheres use language and you need massive amounts of data that can illustrate your query. The iWeb corpus contains 14 billion words compiled from websites in English-speaking countries. It is an excellent resource to look at how language is currently used across national varieties (US, UK, NZ, etc.) and different types of text.
Let’s now turn our attention to three concrete research questions and how corpora can be used to help researchers answer them:
Question A: What language is used in TV shows?
Question B: What characterises Higher Education (HE) student writing in the UK?
Question C: What characterises dentists’ communication in professional contexts?
Before we look at how these questions are examined by means of a corpus, can you think of other research methods that can be used to answer these questions? How will your data be collected? How will it be analysed?
Table 1.1 Connecting research questions, methods and data collection and analysis
Discussion: Research methods that can be used to answer these questions
A. What language is used in TV shows?
• Research method: ……………………………….
• Data collection and analysis:
• Research method: ……………………………….
• Data collection and analysis:
B. What does Higher Education (HE) student writing look like in the UK?
• Research method: ……………………………….
• Data collection and analysis:
• Research method: ……………………………….
• Data collection and analysis:
C. What characterises dentists’ communication in professional contexts?
• Research method: ……………………………….
• Data collection and analysis:
• Research method: ……………………………….
• Data collection and analysis:
Arguably, these questions can be answered by drawing on different research methodologies and methods, as suggested in Table 1.1. However, corpus linguistics will put more emphasis on the notion of usage and the need to use a representative body of textual evidence. The three questions above can be answered by putting together and querying different corpora that can be used as proxies of the phenomena under investigation. In the case of the first research question (A), we may want to use the English-Corpora.org TV Corpus. This corpus contains 325 million words from 75,000 episodes of TV shows dating from the 1950s to 2018 produced and recorded in, among other countries, the US, the UK, Australia and New Zealand. Depending on your area of interest, you may want to narrow down your focus and examine only talk shows or, for example, all soaps or just one specific genre (e.g. comedy). Bednarek (2018) used the Sydney Corpus of Television Dialogue (SydTV) to investigate dialogues in American TV series and developed a categorisation of their functions, namely, narrative-related functions (progressing the plot or filling out character) and medium-related functions (endorsing products or engaging audience emotions).
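Narrowing a corpus down to one genre or period amounts to filtering texts by their metadata. The sketch below illustrates this with a handful of invented records. The field names (`genre`, `year`, `transcript`), the episode titles and the function `select_subcorpus` are hypothetical and do not reflect how the TV Corpus or SydTV are actually stored or queried.

```python
# Invented records: titles, genres, years and transcripts are placeholders.
EPISODES = [
    {"title": "Show A, ep. 1", "genre": "comedy", "year": 1995,
     "transcript": "well that went better than expected"},
    {"title": "Show B, ep. 4", "genre": "talk show", "year": 2010,
     "transcript": "please welcome our next guest"},
    {"title": "Show C, ep. 2", "genre": "comedy", "year": 2016,
     "transcript": "you cannot be serious right now"},
    {"title": "Show D, ep. 9", "genre": "soap", "year": 2016,
     "transcript": "i never meant to hurt you"},
]


def select_subcorpus(episodes, genre=None, first_year=None, last_year=None):
    """Keep only the episodes that match the requested genre and year range."""
    selected = []
    for episode in episodes:
        if genre is not None and episode["genre"] != genre:
            continue
        if first_year is not None and episode["year"] < first_year:
            continue
        if last_year is not None and episode["year"] > last_year:
            continue
        selected.append(episode)
    return selected


comedies = select_subcorpus(EPISODES, genre="comedy")
recent_comedies = select_subcorpus(EPISODES, genre="comedy", first_year=2000)

# Size of the comedy sub-corpus in whitespace-separated words.
comedy_word_count = sum(len(ep["transcript"].split()) for ep in comedies)
print(len(comedies), len(recent_comedies), comedy_word_count)
```

Whatever analysis follows is then run on the selected sub-corpus only, so any claim you make applies to that genre or period rather than to TV language as a whole.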
Question B calls for the compilation of a specific collection of texts representative of student writing in HE. The British Academic Written English Corpus (BAWE) seems like a good fit when approaching this research question. The BAWE contains university-level student writing: around 3,000 good-standard student assignments totalling 6,506,995 words. The corpus showcases different types of text (essays, critiques, explanations, literature reviews, etc.) across different disciplines (agriculture, economics, biological sciences, business, classics, engineering, etc.). Nesi & Gardner (2018) have discussed how the genres in this corpus can be linked to different social purposes:
• demonstrating knowledge and understanding
• developing powers of independent reasoning
• building research skills
• preparing for professional practice
• writing for oneself and others.
For each of these purposes, Nesi & Gardner have analysed an inventory of subgenres and have developed specific materials that can be used to teach HE writing across different levels of expertise and disciplines. These findings are solely based on the evidence provided by the texts included in the British Academic Written English Corpus.
Question C represents one of the areas where corpus linguistics has been most productive in the last decades: the analysis of specialised languages (Bhatia, Sánchez Hernández & Pérez-Paredes, 2011). The use of professional registers has attracted the attention of applied linguists, who have found in corpora an opportunity to examine evidence of how specialised discourse is used and to apply this evidence-based knowledge in education. Thus, Biber & Conrad (2009) have stressed the educational potential of the analysis of corpora:
Text varieties and the differences among them constantly affect people’s daily lives. Proficiency with these varieties affects not only success as a student, but also as a practitioner of any profession, from engineering to creative writing to teaching. Receptive mastery of different text varieties increases access to information, while productive mastery increases the ability to participate in varying communities. And if you cannot analyze a variety that is new to you, you cannot help yourself or others learn to master it.
(Biber & Conrad, 2009: 4)
The use of language is, therefore, constrained by the communities of users where those uses are meaningful, either because they see them as part of a discursive practice or a non-discursive one. In the case of the language used by dentistry professionals, Crosthwaite & Cheung (2019) have identified register features from three subgenres: published experimental research articles, case reports, and novice and professional research reports within the Dental Public Health domain. Crosthwaite & Cheung (2019) decided to use subgenres that undergraduates will encounter shortly after graduation. Awareness of the differences among these subgenres at all levels (lexical, syntactic, phraseological, etc.) is key to understanding communication within this area of practice. Biber & Conrad (2009: 3), among others, have stressed the educational needs of students ‘to control and interpret the language of different varieties’ as a vital factor to succeed at school and in their careers.
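When sub-corpora differ in size, raw counts of a feature cannot be compared directly. A common step is to normalise them, for instance to occurrences per 1,000 words. The sketch below shows the arithmetic with invented figures. The sub-corpus labels, counts and sizes are hypothetical and are not results reported by Crosthwaite & Cheung (2019), and the function `per_thousand` is an illustrative name.

```python
def per_thousand(count, corpus_size):
    """Normalise a raw count to occurrences per 1,000 words."""
    if corpus_size <= 0:
        raise ValueError("corpus_size must be positive")
    return count * 1000 / corpus_size


# Invented figures: (raw count of some feature, total words in the sub-corpus).
SUBCORPORA = {
    "research articles": (120, 60000),
    "case reports": (45, 15000),
    "student reports": (30, 20000),
}

normalised = {
    name: per_thousand(count, size)
    for name, (count, size) in SUBCORPORA.items()
}

for name, rate in normalised.items():
    print(f"{name}: {rate:.1f} per 1,000 words")
```

In these made-up figures the feature has the highest raw count in the research articles, yet it is relatively most frequent in the case reports. Differences between subgenres need to be read from normalised rather than raw frequencies.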
We have just seen how researchers use corpora as a method to answer questions such as A, B and C above. Corpora, therefore, become the central research instrument for corpus linguists when trying to answer a wide range of different questions. McEnery & Hardie (2012: 15) have stressed the idea that most corpus designs follow the principle of total accountability in that the researcher tries to avoi...
Table of contents
Cover
Half Title
Series Information
Title Page
Copyright Page
Dedication
Table of Contents
List of Figures
List of Tables
Preface
Acknowledgements
Chapter 1 Introduction: Corpus linguistics and education research
Chapter 2 Analysing text
Chapter 3 Corpus linguistics approaches to understanding language use
Chapter 4 Researching education policies: Using your own corpus
Chapter 5 Interview data: Transcription and annotation
Chapter 6 Examining lexis: Analysing peace treaties and children’s literature