Corpus-Based Sociolinguistics

A Guide for Students

About this book

In the last decade, the availability of corpora and the technological advancements of corpus tools have increased dramatically. Applied linguists have greater access to data from around the world and in a variety of languages through websites, blogs, and social networking sites, and there is a high level of interest among these scholars in applying corpora and corpus-based methods to other research areas, particularly sociolinguistics.

This innovative guidebook presents a systematic, in-depth account of using corpora in sociolinguistics. It introduces and expands the application of corpora and corpus approaches and tools in sociolinguistic research, surveys the growing number of studies in corpus-based sociolinguistics, and provides instructions and options for designing and developing corpus-based studies. Readers will find practical information on such contemporary topics as workplace registers, megacorpora, and using the web as a corpus. Vignettes, case studies, discussion questions, and activities throughout further enhance students' involvement with the material and provide opportunities for hands-on practice of the methods discussed. Corpus-Based Sociolinguistics is a comprehensive and accessible guide, a must-read for any student or scholar interested in exploring this popular and promising approach to sociolinguistic research.

Authors: Eric Friginal and Jack Hardy
Section B
Survey of Corpus-Based Sociolinguistic Studies

B1
Corpora and the Study of Languages and Dialects

B1.1 Languages, Dialects, and Varieties

There are various definitions of language and dialect in the interrelated fields of linguistics, sociology, and anthropology. Most of these definitions overlap, and scholars sometimes use the two terms interchangeably. In general, there are no commonly accepted criteria for distinguishing a language from a dialect, and subjective definitions are largely based on a scholar’s particular field or research focus. For example, it is common for Chinese speakers to refer to Mandarin and Cantonese as two distinct dialects, although Mandarin and Cantonese are not mutually intelligible and are spoken by people representing diverse regions and cultural traditions in China. In other contexts, Mandarin and Cantonese are identified as two independent languages. The complication is intensified by the fact that Mandarin and Cantonese can be written using the same classical or standard “Chinese” script. Dialects have also been categorized according to speakers’ social judgments (e.g., what is a proper or standard form as opposed to an informal, nonstandard one) and by level of prestige or usage (e.g., not written or codified, or having few speakers).
In sociolinguistics, it is important to operationalize these two terms, as a clear distinction helps identify the meanings and contexts of variation in speakers’/writers’ use of words, sentences, and discourse. In this book, we define language and dialect as follows:
  • Language: a collection of words, meaningful sounds, and gestures that form a system for common use by groups of individuals belonging to the same speech community. A speech community may cover a geographical region, a nation, or people with the same cultural tradition, norms, and identities (e.g., the French language, the English language in the United States, Kinyarwanda in Rwanda).
  • Dialect: a variety of a language that can clearly be distinguished from other varieties of the same language. These dialects are typically mutually intelligible but with clear differences in features such as accent and pronunciation (phonology), sentence structures (syntax), and use of vocabulary (lexis). Dialect speakers may be separated from other dialect speakers geographically or socially.
    The English language in the United States has been classified into regional dialects in many ways by linguists. For example, the most detailed dialect distinction in American English based on pronunciation lists the following groups: (1) Northern New England, (2) the North, (3) Greater New York City, (4) the Midland, (5) the South, (6) North Central, and (7) the West. We discuss some of these dialect groups in this section.

B1.2 The Study of Regional Dialectology

In this section, we focus first on regional dialectology research and how this tradition has influenced the use of corpora in sociolinguistic studies of languages and dialects. In this tradition, linguistic variation is attributed primarily to geography and regional differences. The study of regional variation in language has been one of the most important foci of sociolinguistic research, with many pioneering projects originating in Europe in the early nineteenth century. The primary goal of these dialectology studies was to interview older speakers in order to list common vocabulary and local terms and phrases unique to speakers of a particular village or region. In France, for example, the Atlas Linguistique de la France was a product of extensive fieldwork by Jules Gilliéron and Edmond Edmont using standard questionnaires, note-taking techniques, and a systematic transcription of how locals pronounced different words or phrases. Over the years, the works of Gilliéron and Edmont have inspired various atlas projects in European countries including Italy, Spain, Germany, Switzerland, and the United Kingdom. These studies have produced dialect maps and translation dictionaries that illustrate regional variation in speakers’ vocabulary use and pronunciation of speech sounds.
In the United States, a project to develop the Linguistic Atlas of the United States and Canada was begun in 1931 by Hans Kurath. Kurath (1891–1992) studied German linguistics at the University of Chicago, and his original research allowed him to travel to various remote areas of the eastern United States, tracking migration movements of German speakers and collecting a variety of spoken and written data. This experience in data gathering and in geographically tracking dialect distinctions helped Kurath to identify and map distinctive dialects and speech and pronunciation patterns, as well as the structure and evolution of American English. Kurath’s interests in historical linguistics and in the study of English as brought by European settlers inspired a groundbreaking set of projects that eventually produced the Linguistic Atlas of the United States. He introduced a systematic data-gathering approach, interviewing speakers of dialect groups, particularly in the New England region, and plotting similarities and differences on maps. In the 1930s, supported by the Modern Language Association, Kurath directed and completed the Linguistic Atlas of New England. The first volume of this atlas was published in 1939. The primary result of Kurath’s work suggested that there are three major dialect areas in the eastern United States: the North, the Midland, and the South (see Figure B1.1). This atlas became the model for succeeding regional atlases that have been initiated by teams of American linguists.
FIGURE B1.1. Three major dialect areas of the Eastern United States
Source: adapted from Kurath, 1949, p. 91
In the United Kingdom, the Linguistic Atlas of England, edited by Orton, Sanderson, and Widdowson (1978), mapped regional linguistic variation in British English (including lexical, phonological, morphological, and syntactic features) based on the data gathered for the Survey of English Dialects. Isoglosses were also plotted midway between adjacent locations with contrastive forms. Nine primary regions (Southwest, Southeast, London, East, West Midlands, East Midlands, Yorkshire and Humber, Northwest, and Northeast), now often referred to as the dialect regions of present-day England, have been identified based on this atlas.
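The idea of plotting an isogloss midway between adjacent locations with contrastive forms can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the four survey points, their coordinates, and the labels form1 and form2 are invented and are not Survey of English Dialects data, and "adjacent" is approximated here by a simple distance threshold rather than by whatever procedure the atlas editors used.

```python
from itertools import combinations
from math import dist

# Hypothetical survey points: (name, x, y, recorded form).
# Coordinates and forms are invented for illustration only.
LOCATIONS = [
    ("A", 0.0, 0.0, "form1"),
    ("B", 10.0, 0.0, "form1"),
    ("C", 0.0, 10.0, "form2"),
    ("D", 10.0, 10.0, "form2"),
]


def isogloss_points(locations, max_distance):
    """Return midpoints between neighbouring locations whose forms differ.

    Two locations count as adjacent when they lie within max_distance
    of each other (an assumed, simplified notion of adjacency).
    """
    points = []
    for (_, x1, y1, f1), (_, x2, y2, f2) in combinations(locations, 2):
        adjacent = dist((x1, y1), (x2, y2)) <= max_distance
        if adjacent and f1 != f2:
            points.append(((x1 + x2) / 2, (y1 + y2) / 2))
    return points


midpoints = isogloss_points(LOCATIONS, max_distance=10.0)
print(midpoints)  # the isogloss would be drawn through these points
```

With these toy points, the two northern and two southern locations use different forms, so the midpoints fall on a line running between them; joining such midpoints is what yields the boundary drawn on a dialect map.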
For the Survey of English Dialects, over 1,300 questions designed to elicit 730 lexical items, 387 phonological items, 128 morphological items, and 77 syntactic items were used, primarily focusing on the oldest and most conservative forms of vernacular English. Data collection occurred primarily in rural and farming areas, and most questions covered topics about the countryside and farming as well as universal subjects such as numbers, time, and the human body. The field investigators were instructed to interview older residents who had been born and raised in the area they represented and whose parents were also likely to be native to the area. Moreover, because it was assumed that males would be more likely than females to speak traditional dialects, the investigators were told to favor male over female informants. In addition to the focus on rural England, the settings for interviews were based on location and population, with the overall objective of sampling one community in England every 15 miles. The survey was conducted from 1950 to 1961, facilitated by recording technology that was developed after 1953. A total of 11 field investigators collected data in 313 locations across England during this period. Overall, several findings of these interviews were not as consistent or as clear-cut as those of the American surveys directed by Kurath; however, this may be attributable more to the potentially conservative approach adopted by the British dialectologists, compared to their American counterparts, than to the nature of British dialect regions (Grieve, 2009).
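As a quick check on the questionnaire figures above, the four item counts can be summed and expressed as shares of the whole. The short Python sketch below uses only the numbers given in the text; the variable names are illustrative.

```python
# Item counts reported for the survey questionnaire (from the text above).
items = {
    "lexical": 730,
    "phonological": 387,
    "morphological": 128,
    "syntactic": 77,
}

# Total number of targeted items, consistent with "over 1,300 questions".
total = sum(items.values())

# Percentage share of each item type, rounded to one decimal place.
shares = {kind: round(100 * count / total, 1) for kind, count in items.items()}

print(total)
print(shares)
```

The total comes to 1,322 items, and lexical items make up more than half of the questionnaire, which fits the survey's emphasis on local vocabulary.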

B1.3 Linguistic Atlases in the United States

Who uses skeeter hawk, snake doctor, and dragonfly to refer to the same insect?
Who says gum band instead of rubber band? (Kretzschmar, McDavid, Lerud, & Johnson, 1993)
Subsequent to Kurath, linguistic atlases in the United States have been produced by scholars mostly affiliated with the Linguistic Atlas of the United States and Canada. In addition, other groups of linguists collected data that focused on specific sub-areas in the eastern United States. The first three primary categorizations of dialect groups in the Eastern Seaboard (the North, the Midland, and the South) were further divided into subcategories (e.g., the North was further divided into northeastern, southeastern, and southwestern New England). New York, with its ever-growing number of immigrant communities and traditional European settlers, has also been further divided into groups such as Upstate, Hudson Valley, and Metropolitan Area.
Regional Linguistic Atlas projects have produced dictionaries, various publications, and digital archives, and they continue to be maintained in U.S. universities. These projects include the Linguistic Atlas of the Middle and South Atlantic States (LAMSAS), Linguistic Atlas of the Western States (LAWS), Digital Archive of Southern Speech (DASS), Linguistic Atlas of the Gulf States (LAGS), and Linguistic Atlas of the North-Central States (LANCS).
  • Linguistic Atlas of the Middle and South Atlantic States (LAMSAS)
    LAMSAS came directly from Kurath’s framework of extensive interviews, with 1,200 informants from the state of New York to Florida (northern Florida only) and from the Atlantic coast to the borders of Kentucky and Ohio. Interviews were collected from the 1930s to the 1940s. Regional variations in word use, grammar, and pronunciation were mapped in LAMSAS at a time when migration movements were more limited than they are today. Hence, data from these interviews allow for correlations between language patterns and settlement or migration movements in the United States. LAMSAS is considered the largest single survey of regional and social differences in spoken American English (Preston, 1993).
  • Linguistic Atlas of the Western States (LAWS)
    It took a while before U.S. linguists started covering the Western States, as more concentrated efforts were dedicated to the Eastern Seaboard and its population’s slow migration movements to the Midwest and the West. LAWS is still a work in progress, but fieldwork has been completed in Colorado, Utah, and Wyoming. Additional interviews in Texas and California began in the 1990s. The primary goal of LAWS is to provide recorded data on the speech of the American West by creating an inventory of regional and social markers characterizing Western culture and traditions. Also highlighted are influences from Mexico and the Spanish language (e.g., prevalence of Spanish words, proper nouns, and code-switched terms). Preston (1993) noted that work with LAWS extends beyond traditional atlas dialectology, as all interviews are recorded, which allows linguists to explore all features of discourse.
  • Digital Archive of Southern Speech (DASS) and the Linguistic Atlas of the Gulf States (LAGS)
    LAGS benefits from digitized versions of its interviews, which feature Southern English and are now accessible as computer-based files (e.g., .wav or .mp3 files). LAGS covers Florida, Georgia, Tennessee, Alabama, Mississippi, Louisiana, Arkansas, and Texas. Interviews were sampled across the LAGS region to cover older residents and a range of topics. In addition, one African American speaker for each of the 16 LAGS areas is included in the database. The presence of subcategories of speakers, especially African American speakers of southern English, makes LAGS an important atlas that directly addresses sociolinguistic regional and racial variation.
  • Linguistic Atlas of the North-Central States (LANCS)
    The states of Wisconsin, Michigan, Illinois, Indiana, Ohio, and Kentucky are the primary focus of LANCS, but this group also includes speech and interview samples from Ontario, Canada. Most of the questions used to interview participants followed the traditional Linguistic Atlas model, but data collection took an extended period, running from as early as 1933 through 1978. A total of 564 interviews with 154 audio tapes have been collected for LANCS (Preston, 1993).

B1.4 The Dictionary of American Regional English (DARE)

The Dictionary of American Regional English (DARE) resulted from dialect surveys in the United States created by Fred Cassidy from the 1960s to the early 1970s. Although it is not directly a product of the Linguistic Atlas projects, DARE follows the same research tradition. Data from DARE came from responses to questionnaires from local residents of over 1,000 locations systematically sampled from states that have clearly defined migration patterns and historical developments in the United States. More than 1,600 questions focusing on weather patterns, agricultural practices...

Table of contents

  1. Cover
  2. Title
  3. Copyright
  4. TABLE OF CONTENTS
  5. List of Figures
  6. List of Tables
  7. Acknowledgements
  8. Preface
  9. SECTION A Introduction to Corpus-Based Sociolinguistics
  10. SECTION B Survey of Corpus-Based Sociolinguistic Studies
  11. SECTION C Conducting Corpus-Based Sociolinguistic Studies
  12. Bibliography
  13. Index