1 Introduction
Jesse Egbert and Paul Baker
How Tall Are the Pyramids?
In the sixth century B.C., the Greek philosopher Thales of Miletus wanted to know the height of the pyramids in Egypt. Even if he had possessed a measuring tape of sufficient length and had been able to climb to the top, the shape of the pyramids would have rendered it impossible to measure their exact height at the center. So Thales discovered a clever solution to his problem in which “he measured the height of the pyramids by the shadow they cast, taking the observation at the hour when our shadow is of the same length as ourselves” (Diogenes Laertius, 2018, 1.1, p. 27). Thales used his knowledge of geometry to deduce that his shadow and the shadow of the pyramids projected similar right-angled triangles. Thus, at the moment when the length of his shadow matched his height, the length of the pyramids’ shadows would also match their height. This was the first recorded use of triangulation, or the process of using “distances from and direction to two landmarks in order to elicit bearings on the location of a third point (hence completing one triangle)” (Baker & Egbert, 2016, p. 3).
The efficacy of triangulation has ensured that it is still used today. For example, land surveyors use distance from and direction to two landmarks in order to elicit bearings on the location of a third point (hence completing a triangle). However, the concept has been extended to involve other forms of analysis. A related term, methodological triangulation, has been used for decades by social scientists as a means of explaining behavior by studying it from two or more perspectives (Webb, Campbell, Schwartz, & Sechrest, 1966; Glaser & Strauss, 1967; Newby, 1977; Layder, 1993; Cohen & Manion, 2000).
Methodological triangulation is clearly on the rise in linguistic research that includes corpus data, that is, large bodies of naturally occurring text encoded in electronic form. Recent decades have seen a surge in the use of corpora, and of the accompanying corpus linguistic methods that involve specialist computer software, in empirical linguistic research (e.g. Sampson, 2013). While many studies in linguistics use corpora as their sole source of data, it is becoming increasingly common in empirical linguistics for researchers to triangulate corpus linguistic methods with methods and data from other areas in linguistics. This type of methodological triangulation has proven to be a highly effective means of explaining linguistic phenomena. McEnery and Hardie (2012) note the trend, explicitly encouraging researchers to engage with this type of research, stating that:
the triangulation of corpus methods with other research methodologies will be an important further step in enhancing both the rigour of corpus linguistics and its incorporation into all kinds of research, both linguistic and non-linguistic. To put it another way, the way ahead is methodological pluralism. This kind of methodological triangulation is already happening, to some extent…. But we would argue that it needs to be taken further.
(p. 227)
Despite this apparent trend, there has been very little work that has described and evaluated state-of-the-art methods for triangulating corpus linguistics with other research methods in linguistics. One notable exception is a special issue of Corpus Linguistics and Linguistic Theory on “Corpora and Experimental Methods” (see Gilquin & Gries, 2009). However, the studies included in the special issue focussed solely on triangulating corpora with experimental psycholinguistic methods, which is only one of many potentially fruitful areas of research in linguistics that can benefit from triangulation with corpus data.
In a 2016 volume, Triangulating Methodological Approaches in Corpus Linguistics, published in Routledge’s Advances in Corpus Linguistics series (the same book series as this volume), we (the two editors) carried out a large-scale experiment on methodological triangulation within corpus linguistics (Baker & Egbert, 2016). Our book presented a cohesive, demonstrative overview of ten methods within corpus linguistics using a single corpus, exploring the extent to which those methods complemented each other. We gave analysts the same corpus and research questions, asking them to work independently and produce a report of their approach and findings, which we then subjected to a comparative analysis in the final chapter of the book. We found a largely complementary picture, with most authors making unique discoveries, a smaller number of shared findings, and a tiny number of contradictory ones. At the end of the book we argued that
While there are certainly challenges inherent to triangulation research, we believe the benefits of triangulation make efforts to overcome these challenges worthwhile. Moreover, the challenges associated with triangulation research may help motivate corpus linguists to develop synergy in their research through effective collaborative relationships with scholars from different research orientations.
(ibid, 207–8)
The success of that book project inspired us to think about triangulation between corpus linguistic methods and other linguistic methodologies, which we see as a natural extension and an exciting next step.
Thus, this edited volume focuses on triangulating corpus linguistic methods with a wide range of other research methods in linguistics (e.g. psycholinguistic experiments, critical discourse analysis, language assessments). Specifically, we have three primary goals for this volume:
- Showcase a variety of state-of-the-art research methods in linguistics outside of the realm of corpus linguistics (CL).
- Include a series of empirical studies on a range of topics in psycholinguistics, applied linguistics, and discourse analysis which triangulate non-CL methods with CL methods.
- Investigate the extent to which non-CL research methods can complement CL methods to enhance our understanding of linguistic processes and variation.
This book is structured around nine empirical studies, each of which triangulates CL data with non-CL data. We have selected expert researchers who have a strong track record of triangulating corpus methods with other linguistic methods. The nine studies fall into three major areas of linguistics: discourse analysis, applied linguistics, and psycholinguistics. The introduction and conclusion chapters are written by the two editors. In this introduction chapter we consider the relationship between corpus linguistics and triangulation with reference to relevant earlier research. We then describe the perspective that this collection takes on triangulation and outline the aims of the book and its more specific research questions. This is followed by a summary of how we selected the topics and the instructions that were given to authors, and we conclude the chapter by introducing the nine approaches taken by the authors who contributed chapters, as well as giving a brief summary of the book’s concluding chapter. Let us begin, then, with a brief introduction to corpus linguistics, which, along with triangulation, is one of the two thematic threads which run through this book.
Corpus Linguistics
Linguistics is the scientific study of language. In essence, linguists share the goal of discovering and describing patterns in human language. Linguistics, broadly defined, encompasses many research areas related to the underlying structure of language, the use of languages by their speakers, how languages vary across social groups and change over time, how language is processed in the brain and acquired by speakers, and how languages are typologically related, to name just a few.
As with any scientific enterprise, linguistics relies heavily on empirical data to test hypotheses and explore patterns. There are many different types of linguistic data. For example, psycholinguists use response times, eye tracking data, and neuroimaging; sociolinguists use interviews, questionnaires and surveys; and critical discourse analysts consider both texts and the myriad set of contexts which account for the ways that such texts were produced and received. Although the sources of linguistic data differ widely across the different subfields of linguistics, one source of data is used in nearly all of them: the corpus.
A corpus is a large collection of naturally occurring texts (i.e. those which were not originally created for the purposes of language analysis). Ideally, corpora are sampled from and thus meant to be representative of a larger linguistic population, allowing researchers to generalize findings from the corpus sample back to the population. Corpora are typically stored digitally and processed with the aid of computers, enabling researchers to process large amounts of corpus data automatically, providing increased speed and reliability. Even corpus methods that are qualitative in nature benefit from the use of computer software that allows researchers to search patterns, sort results, and annotate texts more easily and efficiently.
Corpus-based research has proliferated during the past two or three decades with the existence of several dedicated corpus linguistics journals including International Journal of Corpus Linguistics, Corpora, Corpus Linguistics and Linguistic Theory, and International Journal of Learner Corpus Research. It is also increasingly common to find corpus-based studies published in applied linguistics journals (e.g. Applied Linguistics, TESOL Quarterly), sociolinguistics journals (e.g. Language Variation and Change, Journal of Sociolinguistics), discourse analysis journals (e.g. Discourse Studies, Discourse Processes), and general linguistics journals (e.g. Language, Journal of English Linguistics).
The reason for the widespread appeal of corpus methods is quite simple: corpora reveal how real people use real language. Whether you’re a psycholinguist interested in the effect of input frequency, a historical linguist interested in diachronic language change, a sociolinguist interested in dialect variation, a language acquisition expert interested in developmental sequences or a critical discourse analyst interested in social relations and the power of language to influence people, corpora can often provide a good amount of naturally occurring language data to answer at least some of the research questions in a study. Paradoxically, despite being arguably the most ubiquitous type of data in linguistics, corpora are seldom used as the sole data source in most subfields of linguistics. In other words, it is becoming increasingly common for researchers to triangulate corpora and corpus linguistic methods with other data sources and methods in linguistic research. So, what is triangulation and how has it been used in corpus-based research?
Triangulation
As mentioned earlier, triangulation is a research approach that takes two or more perspectives to investigate a research question. In the words of Cohen, Manion, and Morrison (2000), triangulation is an “attempt to map out, or explain more fully, the richness and complexity of human behavior by studying it from more than one standpoint” (p. 112). Denzin (1970) identified six major types of triangulation: time triangulation, space triangulation, combined levels of triangulation, theoretical triangulation, investigator triangulation, and methodological triangulation. The first three of these types can be grouped together within the category of data triangulation, leaving us with four main types of triangulation. In data triangulation, the researcher applies the same method to data sets from different times, locations or groups of participants. There are many examples of corpus studies that investigate language from different times (i.e. diachronic studies) and places (i.e. dialect studies), but these studies are typically focussed on describing variation rather than triangulating data. Data triangulation can also be performed with two or more groups of participants. In contrast to other studies with multiple participant groups, the primary goal in data triangulation studies is to validate data through cross-verification.
In theoretical triangulation, a researcher draws on two or more theories by actively testing each of them or appealing to each of them to interpret research findings. In investigator triangulation, two or more researchers apply the same methods to the same data set. Whereas data and theoretical triangulation are not particularly common in corpus-based research, there have been some studies based on investigator triangulation. For example, Marchi and Taylor (2009) used investigator triangulation in a corpus-based study of journalist language, and Baker (2015) triangulated findings about representations of foreign doctors from five investigators.
The final type, methodological triangulation, or “between methods triangulation” as Denzin (1970) calls it, relies on more than one method to answer the same research question. Although the triangulation of data, investigators, and theories are valuable approaches to data analysis, in this volume we focus on methodological triangulation, the most widely used and, arguably, the most useful and applicable type of triangulation for corpus research. Despite criticisms from some scholars (see, e.g. Silverman, 1985; Fielding & Fielding, 1986), researchers have noted a large number of benefits associated with methodological triangulation, arguing that it:
- Presents the situation in a more detailed and balanced way (Altrichter, Posch, & Somekh, 1996).
- Leads to a deeper and more comprehensive understanding of the issue under investigation (Denzin & Lincoln, 1994, p. 5; Bekhet & Zauszniewski, 2012; Baker & Egbert, 2016).
- Provides confirmation of findings (Bekhet & Zauszniewski, 2012).
- Facilitates cross-checking of data to search for (ir)regularities (O’Donoghue & Punch, 2003; Baker & Egbert, 2016).
- Provides evidence for the validity of research findings (Thurmond, 2001; Cohen et al., 2000).
- Demonstrates the reliability of the methods and findings (Marchi & Taylor, 2009).
- Provides opportunities for more robust interpretations (Layder, 1993, p. 128).
- Leads to increased collaboration among scholars with different theoretical orientations and/or methodological expertise (Baker & Egbert, 2016).
Defining Triangulation
Some scholars in previous research apply the term “triangulation” only to cases where the sole objective of using more than one method is to determine whether the validity of one method is confirmed by the other(s). As noted earlier, this is an oft-cited benefit of methodological triangulation (see Bekhet & Zauszniewski, 2012), but we argue that triangulation has benefits that reach beyond confirmation of findings. Thus, in this volume we adopt a much broader definition of methodological triangulation that extends to:
- “applying two methodologies separately to the same question” (Marchi & Taylor, 2009, p. 5)
- “the combination of two or more … methodologic approaches … within the same study” (Thurmond, 2001, p. 253)
- “the use of two or more different kinds of methods in a single line of inquiry” (Risjord, M...