PART ONE
APPROACHES TO DISCOURSE ANALYSIS
CHAPTER ONE
Data collection and transcription in discourse analysis: A technological history
RODNEY H. JONES
DATA COLLECTION AS MEDIATED ACTION
The focus of this chapter will be on data collection and analysis as cultural and material practices of discourse analysts (Jaffe 2007). In particular I will focus on how, over the past half century, these practices have been affected by different technologies such as tape recorders, video cameras and computers, each of which made new kinds of knowledge and new kinds of disciplinary identities possible, and each of which fundamentally changed our understanding of discourse itself. I will limit myself to discussing the collection and transcription of data from real-time social interactions (especially spoken discourse). I will not be considering issues around the collection of written texts, which has its own set of complications.
Since the publication of Elinor Ochs's groundbreaking 1979 article 'Transcription as Theory', it has become axiomatic that data collection and transcription are affected by the theoretical interests of the analyst, which inevitably determine which aspects of an interaction will be attended to and how they will be represented (see also Edwards 1993, Mishler 1991). Since then, much of the debate around transcription has focused on choosing the 'best system' for transcribing spoken discourse (Du Bois et al. 1993, Psathas and Anderson 1990) or 'multimodal interaction' (Baldry and Thibault 2006, Norris 2004), or on arguing about the need for standardization in transcription conventions (Bucholtz 2007, Lapadat and Lindsay 1999). In order to engage productively in such debates, however, it is necessary to consider more practical questions about data collection and transcription: questions having to do with the materiality of what we call data, and with the effects of the technologies we use to collect and transcribe it on the ways we are able to formulate theories about discourse in the first place.
The theoretical framework I will use to approach these issues is mediated discourse analysis (Norris and Jones 2005). Central to this perspective is the concept of mediation, the idea that all (inter)actions are mediated through cultural tools (which include technological tools like tape recorders and semiotic tools like transcription systems) and that the affordances and constraints of these tools help to determine what kinds of actions are possible in different circumstances. This focus on mediation invites us to look at data collection and transcription as physical actions which take place within a material world governed by a host of technological, semiotic and sociological affordances and constraints on what can be captured from the complex stream of phenomena we call 'social interaction', what can be known about it, and how we as analysts exist in relation to it, affordances and constraints that change as new cultural tools are introduced.
Mediated discourse analysis allows us to consider data collection and transcription as both situated practices, tied to particular times, places and material configurations of cultural tools, and community practices, tied to particular disciplinary identities.
FIVE PROCESSES OF ENTEXTUALIZATION
Nearly all of the practices discourse analysts engage in involve 'entextualization' – transforming actions into texts and texts into actions. We turn ideas into research proposals, proposals into practices of interviewing, observation and recording, recordings into transcripts, transcripts into analyses, analyses into academic papers and academic papers into job promotions and academic accolades. Ashmore and Reed (2000) argue that the business of an analyst consists chiefly of creating artefacts – such as transcripts and articles – that are endowed with both 'analytic utility' and professional value.
Bauman and Briggs (1990) define 'entextualization' as the process whereby language becomes detachable from its original context of production and reified as 'texts' or portable linguistic objects. In the case of discourse analysts, this usually involves two discrete activities – one in which discourse is 'collected' with the aid of some kind of recording device and the other in which the recording is transformed into some kind of artefact suitable for analysis.
Practices of entextualization have historically defined elite communities in society – scribes, police officers, researchers – who, through the 'authority' of their entextualizations, are able to exercise power over others. To create texts is to define reality.
Whether we are talking about discourse analysts making transcripts or police officers issuing reports, entextualization normally involves at least five processes:
1. framing, in which borders are drawn around the phenomenon in question;
2. selecting, in which particular features of the phenomenon are selected to represent the phenomenon;
3. summarizing, in which we determine the level of detail with which to represent these features;
4. resemiotizing, in which we translate the phenomena from one set of semiotic materialities into another; and
5. positioning, in which we claim and impute social identities based on how we have performed the first four processes.
These processes are themselves mediated through various 'technologies of entextualization' (Jones 2009), tools like tape recorders, video cameras, transcription systems and computer programs, each with its own set of affordances and constraints as to what aspects of a phenomenon can be entextualized and what kinds of identities are implicated in this act. Changes in these technologies result in changes in the practice of entextualization itself, what can be done with it, what kinds of authority adhere to it and what kinds of identities are made possible by it.
DATA IN THE AUDIO AGE
The act of writing down what people say was pioneered as a research practice at the turn of the twentieth century by anthropologists and linguists working to document the phonological and grammatical patterns of 'native' languages. Up until fifty years ago, however, what people actually said was treated quite casually by the majority of social scientists, mostly because they lacked the technology to conveniently and accurately record it. On-the-spot transcriptions and field notes composed after the fact failed to offer the degree of detail necessary to analyse the moment-by-moment unfolding of interaction. The 'technologies of entextualization' necessary to make what we now know as 'discourse analysis' possible were not yet available.
This all changed in the 1960s when tape recorders became portable enough to enable the recording of interactions in the field. According to Erickson (2004), the first known instance of recording spoken interaction for research purposes was by Soskin and John in 1963, and involved a tape recorder, with a battery the size of an automobile battery, placed into a rowboat occupied by two arguing newlyweds. By the end of the decade, the problem of battery size had been solved and small portable audio recorders became ubiquitous, as did studies of what came to be known as 'naturally occurring talk', a class of data which, ironically, did not exist before tape recorders were invented to capture it (Speer 2002).
The development of portable audio-recording technology, along with the IBM Selectric typewriter, made the inception of fields like conversation analysis, interactional sociolinguistics and discursive psychology possible by making accessible to scrutiny the very features of interaction that would become the analytical objects of these fields. The transcription conventions analysts developed for these disciplines arose from what audio tapes allowed them to hear, and these affordances eventually became standardized as practices of 'professional hearing' (Ashmore et al. 2004) among these analysts.
The introduction of these new technologies of entextualization brought a host of new affordances and constraints to how phenomena could be framed, what features could be selected for analysis, how these features could be represented, the ways meanings could be translated across modes and the kinds of positions analysts could take up vis-à-vis others.
Framing refers to the process through which a segment of interaction is selected for collection. Scollon and Scollon (2004) use the term 'circumferencing'. All data collection, they argue, involves the analyst drawing a 'circumference' around phenomena, which, in effect, requires making a decision about the widest and narrowest 'timescales' upon which the interaction depends. All interactions are parts of longer timescale activities (e.g. relationships, life histories) and are made up of shorter scale activities (e.g. turns, thought units). The act of 'circumferencing' is one of determining which processes on which timescales are relevant.
Among the most important ways audio recording transformed the process of framing for discourse analysts was that it enabled them, and in some respects compelled them, to focus on processes occurring on shorter timescales at the expense of those occurring on longer ones. One reason for this was that tapes themselves had a finite duration; another was that audio recordings permitted the analyst to attend to smaller and smaller units of talk.
This narrowing of the circumference of analysis had a similar effect on the processes of selecting and summarizing that went into creating textual artefacts from recordings. Selecting and summarizing have to do with how we choose to represent the portion of a phenomenon around which we have drawn our boundaries. Selecting is the process of choosing what to include in our representation, and summarizing is the process of representing what we have selected in greater or lesser detail.
The most obvious effect of audio recording technology on the processes of selecting and summarizing was that, since audiotape only captured the auditory channel of the interaction, that was the only one available to select. While many researchers accompanied their recordings with notes about non-verbal behaviour, these notes could hardly compete with the richness, accuracy and 'authority' of the recorded voice. As a result, speech came to be regarded as the 'text' – and all the other aspects of the interaction became the 'context'.
It is important to remember that this privileging of speech in the study of social interaction was largely a matter of contingency. Analysts privileged what they had access to. Sacks himself (1984: 26) admitted that the 'single virtue' of tape recordings is that they gave him something he could analyse. 'The tape-recorded materials constituted a "good enough" record of what had happened,' he wrote. 'Other things, to be sure, happened, but at least what was on the tape had happened.'
While limiting what could be selected, the technology of audio recording hardly simplified the selection process. Because tapes could be played over and over again and divided into smaller and smaller segments, the amount of detail about audible material that could be included in transcripts increased dramatically. Whereas most analysts based their decisions about what features of talk to include on specific theoretical projects – conversation analysts, for example, focusing on features which they believed contributed to the construction of 'order' in talk – some analyst...