Corpus Stylistics

Name: Corpus Stylistics
ISBN: 9781134447190

Speech, Writing and Thought Presentation in a Corpus of English Writing

Elena Semino and Mick Short

272 pages
English

About this book

This book combines stylistic analysis with corpus linguistics to present an innovative account of the phenomenon of speech, writing and thought presentation - commonly referred to as 'speech reporting' or 'discourse presentation'. This new account is based on an extensive analysis of a quarter-of-a-million word electronic collection of written narrative texts, including both fiction and non-fiction. The book includes detailed discussions of:

The construction of this corpus of late twentieth-century written British narratives taken from fiction, newspaper news reports and (auto)biographies.
The development of a manual annotation system for speech, writing and thought presentation and its application to the corpus.
The findings of a quantitative and qualitative analysis of the forms and functions of speech, writing and thought presentation in the three genres represented in the corpus.
The findings of the analysis of a range of specific phenomena, including hypothetical speech, writing and thought presentation, embedded speech, writing and thought presentation and ambiguities in speech, writing and thought presentation.
Two case studies concentrating on specific texts from the corpus.

Corpus Stylistics shows how stylistics, and text/discourse analysis more generally, can benefit from the use of a corpus methodology. The authors' innovative approach results in a more reliable and comprehensive categorisation of the forms of speech, writing and thought presentation than has been suggested so far. This book is essential reading for linguists interested in the areas of stylistics and corpus linguistics.

Information

Publisher: Routledge
Year: 2004
eBook ISBN: 9781134447190
Topic: Languages & Linguistics
Subtopic: Linguistics

1 Introduction: A corpus-based approach to the study of discourse presentation in written narratives

1.1 Introduction

We hope that this book will be of interest to at least two different kinds of linguists: (i) textlinguists (e.g. stylisticians and critical discourse analysts) who are involved in the analysis of discourse presentation in written and spoken language, and (ii) corpus linguists or other linguists who are interested in developing dedicated electronic corpora to elucidate textual phenomena. As we try to take both of these main readerships into account, we may, to some degree, tell one readership what it already knows. We apologize in advance if we sometimes do this, and we will try to keep such descriptions to a minimum. Nonetheless, we think it helpful to try to draw the textlinguistic and corpus traditions closer together through this specific study.

Our book describes the research on discourse presentation in written narratives we have been involved in since 1994, and which is still ongoing.¹ This work has involved the systematic and detailed annotation of a corpus of written fictional and non-fictional narratives for speech, writing and thought presentation categories, in order to throw light on discourse presentation theory and on how patterns of discourse presentation vary in three different written narrative genres (fiction, news reports and (auto)biographies).

Since 1996 we have published seven articles and book chapters on our work.² However, because these articles are spread through different books and journals, it is difficult for scholars to access the reports of the work we have undertaken. This volume, which draws from parts of these articles but also contains new material, is a summation of our work to date – work which aims to offer insights in relation to the study of discourse presentation in texts and to what is a relatively innovative methodology for textlinguists. We will also use this book to consolidate what has been for us a constantly developing method of textual annotation and theory building. Because our research project has evolved over time, our articles to date have some descriptive and annotational inconsistencies among them. We have gradually changed some of the terms and annotations we have used as we have come to grips with new discourse presentation phenomena in our data. These inconsistencies may well have been confusing for those who have read more than one of our articles, and this volume provides an opportunity to explain the changes we have made and our reasons for making them, and to arrive at a reasonably stable set of descriptive terms and annotations for further research. We do not, of course, assume that our work to date is the end of the story in descriptive, annotational, analytical or theoretical terms.³ We hope that others might be interested in applying the analytical methods we have developed to yet other spoken and written genres/text types,⁴ to see how well our approach works for these other genres and how the patterns of discourse presentation in these genres compare with those we have analysed.

Before we proceed further, it will be helpful if we make some points about our use of terminology in this book. We have used the term ‘discourse’ in the discussion above for two reasons. First, we sometimes need a general, and briefer, term to refer to what we otherwise call ‘speech, writing and thought presentation’ (SW&TP).⁵ We will strive to use the term ‘discourse presentation’ only in this general, overarching sense. Our second reason for using the term was that we wanted to connect our work to that of other scholars who have written about the way in which the discourse of others is presented, and who often use the term ‘discourse presentation’ for this enterprise. However, we are conscious of the fact that the term ‘discourse’ is often used vaguely and/or with somewhat different meanings by different scholars. We have pointed out before (Short et al. 2002) that one of the dangers of the term ‘discourse presentation’ is that, if it is used as an elegant variant of the more specific terms ‘speech presentation’, ‘writing presentation’ and ‘thought presentation’, it is possible to move seamlessly from the discussion of one mode of presentation to another without making the change clear to oneself, or to others. This in turn can lead to mis-analyses and a less accurate understanding of the phenomena under investigation. We believe that, although there are commonalities among speech presentation, writing presentation and/or thought presentation, there are also important differences which are unhelpfully hidden if the general term ‘discourse presentation’ is used as an alternative for these more specific, mode-related terms and concepts. Hence, when discussing specific discourse presentation phenomena, we will strive to use the more specific terms and not to use the general term as a substitute for them.

The other term which we have already made considerable use of is ‘presentation’. We use this term as a default, rather than ‘report’ or ‘representation’ (which are often used as default terms by other linguists), because we are specifically interested in how the discourse of others (or the speaker/writer on some previous occasion) is presented. This is what textual annotation and analysis can most sensibly be used for (and explains why stylisticians tend to use this term). We prefer not to use the term ‘report’, which is often used as a default by grammarians (e.g. Huddleston and Pullum 2002: 1023–30; Quirk et al. 1985: 1020–33) and other linguists who are part of a tradition where examples are invented when discussing discourse presentation. This is because the term ‘report’ suggests an unproblematic relationship between the discourse presentation and the anterior discourse which is being presented. Tannen (1989), among others, has shown that an assumption of faithful report for direct speech presentation in casual conversation is unrealistic (yet interestingly she uses the term ‘report’ even when undermining this assumption). However, we do not want to use the term ‘representation’ as a default either, as this tends to be used by linguists (e.g. critical discourse analysts like Caldas-Coulthard 1994 and Fairclough 1988) who want to concentrate mainly on distortions and misrepresentations in the reporting of anterior discourses. ‘Presentation’ is thus helpfully neutral for the discussion of speech, writing and thought presentation in a corpus of written texts where, for the most part, we do not, in any case, have easy access to the anterior speech, writing or thought being presented. We discuss this issue of terminology in more detail in Short et al. (2002).⁶

Many studies have proposed models of the forms and functions of discourse presentation in a range of text-types (e.g. Bally 1912a, 1912b; Banfield 1982; Collins 2001; Fairclough 1988; Fludernik 1993; Fowler 1986; McHale 1978; Pascal 1977; Tannen 1989; Thompson 1994, 1996; Volosinov 1973; Waugh 1995; see also papers in Coulmas 1986 and Lucy 1993). The original motivation for our corpus-based study of discourse presentation, however, was to test how well the particular model of speech and thought presentation outlined in Leech and Short (1981: Ch. 10) worked on written text types other than the novel. The Leech and Short model was developed specifically to account for the range of speech and thought presentation forms and their effects in novels written in English. We wanted to test this model, not only because one of us has a rather obvious personal interest in it, but also because (i) it is still the most analytically specific account of speech and thought presentation to date, and (ii) it has been influential and widely used by other textlinguists.

Many analysts of prose fiction, including Fludernik (1993: 283–316, passim) and Simpson (1993: 21–30), have discussed the Leech and Short approach. Person (1999: 28–37) and Toolan (2001: 136–40) also include discussions of some of our more recent work referred to above. A number of studies have also applied the Leech and Short approach to non-literary texts. McKenzie (1987) uses Leech and Short to analyse how free indirect speech was used to circumvent a ban on direct quotation of the ANC in a booklet by South African students, and Roeh and Nir (1990) use it in the analysis of Israeli radio broadcasts. Thompson’s (1996) account of the dimensions of choice available to speakers or writers when reporting the language of others also draws on the Leech and Short model, which he describes as ‘comprehensive in its coverage’ and ‘[t]he most fully developed’ of the various approaches to speech and thought presentation (Thompson 1996: 504).

1.2 Why a corpus-based approach?

The Leech and Short model, like all theoretical models in stylistics up to that point, was developed through the use of scholarly intuition, based on extensive personal reading experience, which was in turn exemplified and tested through the analysis of examples chosen from previous reading. The model was also designed to account specifically for speech and thought presentation in fictional texts (indeed, most of the discourse presentation work by stylisticians and narratologists has concentrated on fiction). Hence it was difficult to know how generalizable the model was to other text-types, or how descriptively adequate it was when ‘tested to destruction’ on texts (including fictional texts) in a way that could not avoid inconvenient or borderline cases. It was for this reason that we decided to develop and annotate a dedicated corpus to test out the model.

We should also point out that some of the non-corpus work on discourse presentation which has already been completed has been based on the accumulation of very large numbers of examples accrued from previous reading. Specific mention should be made here of the monumental work of the narratologists Cohn (1978) and Fludernik (1993). We have benefited considerably from these two very insightful works. Cohn grounded her analysis of what we would call thought presentation through the accumulation of a manually collected corpus of examples:

Equipped with these basic abstractions [of narrative theory] I could then travel around in narrative literature, selecting works and passages in works that would best display the entire spectrum of possibilities, while in turn allowing these works themselves to reveal unforeseen hues.

(Cohn 1978: v)

Cohn’s motivation is not unlike ours, except that we want to compare discourse presentation across text types, including narrative fiction, and want to be much more explicit about our criteria for text selection, as well as being more explicit and systematic in our analysis of the texts in our corpus. Cohn was writing before computers could be used to store and interrogate large corpora of texts, of course, and we could well imagine that if she were beginning her work now, she might also want to make use of an electronic corpus, as we have.

Fludernik’s (1993) study of what she calls free indirect discourse is even more impressive in terms of the wide range of textual examples she uses to illustrate the points she wants to make. We have learned much from her work but, as with Cohn’s study, we were concerned that her relatively informal analytical approach might mean that important factors in the study of discourse presentation would be missed. In her research, Fludernik specifically considered the possibility of a corpus-based approach, and the quantification that comes with it, but rejected this option (i) because she did not want to restrict herself to the literature of just one language, nation, period, etc., which she thought a corpus-based approach would prevent, and (ii) because she believed that a corpus and its associated annotation would have created serious methodological problems, in the sense that she thinks it would have been necessary to ‘institute arbitrary definitions of the relevant categories’ (Fludernik 1993: 9):

Such arbitrariness would necessarily have resulted in an erosion of the actual usefulness of the statistical data, since one would have had either to decide on larger categories that include marginal and ambiguous phenomena, or to indulge in a proliferation of subcategories and intermediary categories which would have rendered the statistics next to useless for interpretation. From previous experience with statistical research (Fludernik 1982) I have also acquired a profound distrust of the methodological relevance of statistical data. Statistics typically take individual occurrences of certain phenomena out of context. Since the present study attempts to document the crucial importance of context for the purpose of the even preliminary establishment of basic categories, a statistical approach would from the outset have vitiated one of the major aims of the project. These remarks are, however, not meant to discredit statistical research in itself. On the contrary, I would welcome a series of statistical analyses that might help to corroborate, modify or refute some of the theses I am here proposing.

(Fludernik 1993: 9)

We have quoted from Fludernik at length because we have effectively tried to do what she decided to avoid, namely to use a set of categories and subcategories to analyse the textual extracts in our corpus comprehensively and systematically. Consequently, we certainly recognize some of the problems she points to, though we think that the annotation difficulties have not been as damaging as she thought they would be. Indeed, we would claim that forcing ourselves to be as clear and precise as possible about our annotations has helped us to isolate, and come to terms with, phenomena we may not otherwise even have noticed. Similarly, we believe that forcing ourselves to account for ambiguity and marginal phenomena in our annotations has helped us to understand more exactly how the speech, writing and thought presentation scales operate, and what factors are at work in producing ambiguity on those scales. Because we take this explicit analytical approach, we are able to provide some of the statistical information which, at the end of the above quotation, Fludernik says that she would welcome.

We very much agree with Fludernik that statistical analysis has limitations as well as advantages, and this is why we present both quantitative and qualitative analysis in this book. We do not think that the one precludes the other (though doing both does increase the workload still further, as, from experience, we are very well aware). Indeed, we would want to argue that both forms of analysis are needed, and work best when used interdependently. Although Fludernik decided not to adopt a corpus-based and quantitative approach (the experience of the dissertation she refers to as Fludernik 1982 was clearly salutary!), she makes a point of saying that she is not antipathetic to such work. She is very open to the fact that all approaches have advantages and disadvantages, and that we can all learn from different approaches to the same phenomenon. This tolerant and inclusive attitude is in contrast to the attacks on corpus linguistics by some other linguists, which we allude to briefly below.

It was natural for us to move to a corpus-based approach as we work in a department which has members who have been involved in corpus construction and annotation for some years, and who could easily be called upon for advice and help. The Lancaster–Oslo/Bergen (LOB) corpus was one of the early modern linguistic corpora to be developed; Lancaster is the ‘home’ of the British National Corpus (BNC), for which Lancaster did much of the work, and our colleagues are involved in the building and exploitation of other corpora too. However, not all linguists are sympathetic to a corpus-based approach, and so we will take a little space here to explore some of the pros and cons in the use of electronic corpora, to help explain our decision to develop our corpus and to use ‘corpus stylistics’ as the main title of this book.

The first point that we would like to make is that although this book, and much of our current work, involves the use of a corpus-based approach in stylistics, we do not think that this approach should supplant other work within our field. Rather, our decision to use a corpus-based approach was because it was the best tool we could find to carry o...

Cover Page
Title Page
Copyright Page
Routledge advances in corpus linguistics
Figures
Tables
Acknowledgements
1: Introduction: A corpus-based approach to the study of discourse presentation in written narratives
2: Methodology: The construction and annotation of the corpus
3: A revised model of speech, writing and thought presentation
4: Speech presentation in the corpus: A quantitative and qualitative analysis
5: Writing presentation in the corpus: A quantitative and qualitative analysis
6: Thought presentation in the corpus: A quantitative and qualitative analysis
7: Specific phenomena in speech, writing and thought presentation
8: Case studies of specific texts from the corpus
9: Conclusion
Appendix 1: List of texts sampled
Appendix 2: The SW&TP tagset
Appendix 3: Alphabetical list of reporting verbs for Indirect Speech presentation
Appendix 4: Alphabetical list of reporting verbs for Direct Speech presentation
Appendix 5: Alphabetical list of reporting verbs for Indirect Writing presentation
Appendix 6: Alphabetical list of reporting verbs for Direct Writing presentation
Appendix 7: Alphabetical list of reporting verbs for Direct Thought presentation
Appendix 8: Alphabetical list of reporting verbs for Indirect Thought presentation
Bibliography