
eBook - ePub
Web Social Science
Concepts, Data and Tools for Social Scientists in the Digital Age
- 224 pages
- English
About this book
"Although written simply enough to be accessible to undergraduates, accomplished scholars are likely to appreciate it too. Reading it taught me quite a lot about a subject I thought I knew rather well."
- Paul Vogt, Illinois State University
"This book brings the art and science of building and applying innovative online research tools to students and faculty across the social sciences."
- William H. Dutton, University of Oxford
A comprehensive guide to the theory and practice of web Social Science. This book demonstrates how the web is being used to collect social research data, such as online surveys and interviews, as well as digital trace data from social media environments, such as Facebook and Twitter. It also illuminates how the advent of the web has led to traditional social science concepts and approaches being combined with those from other scientific disciplines, leading to new insights into social, political and economic behaviour.
Situating social sciences in the digital age, this book aids:
- understanding of the fundamental changes to society, politics and the economy that have resulted from the advent of the web
- choice of appropriate data, tools and research methods for conducting research using web data
- learning how web data are providing new insights into long-standing social science research questions
- appreciation of how social science can facilitate an understanding of life in the digital age
It is ideal for students and researchers across the social sciences, as well as those from information science, computer science and engineering who want to learn about how social scientists are thinking about and researching the web.
Web Social Science is written by Robert Ackland and published by SAGE Publications Ltd.
1
Introduction
This chapter aims to provide a context for web social science, introducing some of the major themes that are addressed elsewhere in the book.
Section 1.1 provides an introduction to the key technologies and governance structures that underlie the Internet and the web, and presents a timeline of key events (from the perspective of web social science) in the history of the web. Section 1.2 introduces examples of online computer-mediated interaction which feature throughout the book. Section 1.3 introduces three important phases in the conceptualisation of the web: cyberspace, virtual communities and online social networks. Section 1.4 outlines four disciplinary approaches for conducting empirical research using data from the web. Section 1.5 introduces the concept of construct validity in the context of web data. Finally, Section 1.6 looks at whether the web should be viewed as a tool that people use for achieving social, political and economic outcomes, rather than a force that shapes behaviour.
1.1 THE WEB: TECHNOLOGY, HISTORY AND GOVERNANCE
The starting point for a book on web social science is necessarily a brief introduction to the technology that underlies the web. While the average social scientist will not need to know much about the technology of the web, it is important to know, for example, that the web and the Internet are not synonymous. The Internet came before the web, and the web is in fact built on top of the Internet.
The Internet is a massive, distributed network of computers, originally developed in the US in the 1960s with funding from the Defense Advanced Research Projects Agency (DARPA). Data that is transferred between computers on the Internet is split into relatively small blocks (‘packets’) which are then reconstituted at the final destination. Packets follow the most efficient pathway to the final destination; if a particular computer is not available they are automatically rerouted. This enables efficient transfer of data and also means that packets can be delivered even if parts of the network are not functioning (the original interest of DARPA was in ensuring communications in the event of war).
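The idea of splitting data into packets and reconstituting it at the destination can be illustrated with a toy sketch. This is not how TCP/IP is implemented (real packets carry headers, checksums and are retransmitted when lost); the function names and the 8-byte packet size are assumptions made purely for illustration.

```python
import random


def split_into_packets(data: bytes, size: int) -> list[tuple[int, bytes]]:
    """Split data into (sequence_number, payload) packets of at most `size` bytes."""
    return [
        (seq, data[start:start + size])
        for seq, start in enumerate(range(0, len(data), size))
    ]


def reassemble(packets: list[tuple[int, bytes]]) -> bytes:
    """Rebuild the original data by ordering packets on their sequence number."""
    return b"".join(payload for _, payload in sorted(packets))


message = b"Packets may arrive out of order."
packets = split_into_packets(message, size=8)

# Simulate packets following different routes and arriving in a different order.
random.shuffle(packets)

restored = reassemble(packets)
```

Because each packet carries its own sequence number, the order of arrival does not matter, which is what lets individual packets be rerouted independently.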
For the packets to be successfully sent and received there need to be rules or protocols. Two critical protocols are the Transmission Control Protocol (TCP) and the Internet Protocol (IP), jointly referred to as TCP/IP. But TCP/IP are not the only important protocols: the delivery of email, for example, involves an additional protocol called the Simple Mail Transfer Protocol (SMTP). The World Wide Web (or web) is a massive distributed network of resources such as documents, sounds and images (Box 1.1). The protocol that underlies the web is the HyperText Transfer Protocol (HTTP), which is used to transfer web pages written in the HyperText Markup Language (HTML); these pages are used to access information on the web. The web is therefore built on top of the Internet. While the Internet is a network of computers connected by cables, the web is a network of documents connected by hypertext links.
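To make the notion of a "network of documents connected by hypertext links" concrete, the following minimal sketch pulls the link targets out of an HTML page using Python's standard-library parser. The sample page is invented for illustration, and the class and function names are assumptions rather than anything prescribed by the book.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collect the href value of every <a> tag encountered while parsing."""

    def __init__(self) -> None:
        super().__init__()
        self.links: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html: str) -> list[str]:
    """Return hyperlink targets in the order they appear in the page."""
    parser = LinkExtractor()
    parser.feed(html)
    return parser.links


# A hypothetical page with one absolute and one relative hyperlink.
page = (
    '<html><body>'
    '<a href="http://voson.anu.edu.au/news">News</a> '
    '<a href="/index.html">Home</a>'
    '</body></html>'
)
links = extract_links(page)
```

Each extracted link can be read as a directed tie from the page containing it to the page it points at, which is the raw material of a hyperlink network.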
The word network is very important – a major aim of this book is to show that web social science is network-based social science. However, the networks that are discussed in the book are not networks of computers or documents, but networks of individuals, groups and organisations. That is, the web allows individuals, groups and organisations to form and maintain networks and, in doing so, create digital trace data that can be studied by social scientists. While it is relatively easy to conceive of Facebook as a network of individuals, this book shows that other web applications also facilitate networked behaviour.
The web, which is regarded by some as being the ‘largest human information construct in history’, was invented by Tim Berners-Lee while based at CERN and was publicly released in 1991. Box 1.2 presents a list of important milestones in the development of the web. The focus is on events that are important for web social science, and references to relevant chapters and sections in the book are provided.
The web is commonly understood to have had three overlapping phases of development or eras: Web 1.0, Web 2.0 and Web 3.0 (Box 1.3). Under Web 1.0, webmasters create content that is then read or consumed by users. Web 1.0 websites are sometimes referred to as comprising the Static Web since they typically do not allow a lot of interactivity and the information presented (often reflecting organisational goals, products, services) does not change regularly (relative to the constant flow of change on sites such as Facebook and Twitter).
Web 2.0 blurs the distinction between webmasters and users, with blogging tools, social network sites (e.g. Facebook) and microblog services (e.g. Twitter) enabling non-technical people to both produce and consume content. The act of a person both consuming and producing web content has been referred to as ‘prosumption’ (e.g. Ritzer and Jurgenson, 2010) and ‘produsage’ (e.g. Bruns, 2008).
BOX 1.1 RESOURCES ON THE WEB
So how are resources on the web found? Resources such as websites are identified via unique numeric IP addresses that consist of four numbers (between 0 and 255) separated by dots. The Domain Name System (DNS) translates an easier-to-remember, character-based, fully qualified domain name (also known as the hostname, sitename or subdomain), which is the unique name by which a computer is known on a network, into an IP address.
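As a rough illustration of the address format described here, the sketch below uses Python's standard `ipaddress` module to check that an address has four numbers between 0 and 255, and wraps the standard `socket.gethostbyname` call as one way of asking the system's DNS resolver for a translation. The helper names `octets` and `resolve` are assumptions, and `resolve` needs network access, so it is defined but not called.

```python
import ipaddress
import socket


def octets(address: str) -> list[int]:
    """Return the four numbers of a dotted IP address.

    Raises ValueError if the address is not four numbers in the range 0-255.
    """
    return list(ipaddress.IPv4Address(address).packed)


def resolve(hostname: str) -> str:
    """Translate a hostname into an IP address via DNS (requires network access)."""
    return socket.gethostbyname(hostname)


example = octets("150.203.224.58")  # [150, 203, 224, 58]
```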
The hostname comprises two parts (joined by a ‘.’): the name of the host (this is the computer that is connected to the network) and the domain name. A domain name usually consists of two parts. A top-level domain (TLD) identifies the type of organisation. There are two types of TLD: generic TLDs (e.g. ‘.com’, ‘.edu’) and country-code TLDs (e.g. ‘.au’, ‘.uk’). A second-level domain such as ‘google’ or ‘yahoo’ identifies the organisation.
For example, the hostname voson.anu.edu.au consists of the host ‘voson’ and the domain name ‘anu.edu.au’, and currently translates (via DNS) into the IP address 150.203.224.58. The generic TLD is ‘.edu’, the country-code TLD is ‘.au’, and the second-level domain is ‘anu’.
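The decomposition in this example can be sketched in a few lines. The function name is an assumption, and the sketch follows the simplified scheme used in this box (host first, domain name after the first dot); it is not a general-purpose parser, since real domain names vary in how many parts they have.

```python
def split_hostname(hostname: str) -> tuple[str, str]:
    """Split a hostname into (host, domain name) at the first dot."""
    host, _, domain_name = hostname.partition(".")
    return host, domain_name


host, domain_name = split_hostname("voson.anu.edu.au")

# For this particular example the domain name has three labels:
# second-level domain, generic TLD and country-code TLD.
second_level, generic_tld, country_tld = domain_name.split(".")
```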
A uniform resource locator (URL) is an address that defines a route to a file on an Internet server (e.g. web server, FTP server). The first part of the address is the protocol identifier, while the second part is the resource name, with the first and second parts being separated by ‘://’. Thus, the URL http://voson.anu.edu.au/index.html consists of the protocol identifier ‘http’ indicating that this is a resource that is hosted on a web server, and thus requires HTTP to access it, and the resource name is ‘voson.anu.edu.au/index.html’. The resource name is composed of the hostname (‘voson.anu.edu.au’), the directory path to the file (‘/’), and the file (‘index.html’).
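The same breakdown of a URL can be reproduced with Python's standard `urllib.parse` module. This is a minimal sketch; the function name and the dictionary keys are assumptions chosen to mirror the terms used above.

```python
import posixpath
from urllib.parse import urlsplit


def describe_url(url: str) -> dict[str, str]:
    """Break a URL into protocol identifier, hostname, directory path and file."""
    parts = urlsplit(url)
    directory, filename = posixpath.split(parts.path)
    return {
        "protocol": parts.scheme,
        "hostname": parts.hostname or "",
        "directory": directory,
        "file": filename,
    }


info = describe_url("http://voson.anu.edu.au/index.html")
```

For the example URL, `info` holds the protocol identifier `http`, the hostname `voson.anu.edu.au`, the directory path `/` and the file `index.html`.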
A subsite is a collection of pages within a particular website. For example, the subsite http://voson.anu.edu.au/news is a part of the VOSON project website and contains pages with details on project activities, e.g. http://voson.anu.edu.au/news/2012, http://voson.anu.edu.au/news/2011.
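When working with lists of collected URLs it is often useful to test whether a page belongs to a given subsite. One simple way to do this, assuming a subsite is identified by a shared hostname and path prefix, is sketched below; the function name is hypothetical.

```python
from urllib.parse import urlsplit


def in_subsite(url: str, subsite: str) -> bool:
    """Return True if `url` is the subsite's root page or lies beneath it."""
    page, root = urlsplit(url), urlsplit(subsite)
    root_path = root.path.rstrip("/")
    same_host = page.hostname == root.hostname
    under_path = page.path == root_path or page.path.startswith(root_path + "/")
    return same_host and under_path


news = "http://voson.anu.edu.au/news"
is_member = in_subsite("http://voson.anu.edu.au/news/2012", news)
```

Matching on `root_path + "/"` rather than on the bare prefix avoids treating an unrelated path that merely starts with the same characters as part of the subsite.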
BOX 1.2 WEB TIMELINE
1983 – TCP/IP implemented
1984 – William Gibson publishes Neuromancer (Section 1.3.1)
1985 – Domain Name System (DNS) introduced
1990 – The Internet comprises over 100,000 hosts
1991 – Linus Torvalds begins work on the open source Linux operating system (based on the MINIX variant of the Unix operating system) (Section 9.1.1)
1990–1994 – New content-publishing services released, e.g. news/bulletin boards, FTP, gopher (menu-driven system for accessing files), first content search engines (e.g. Brewster Kahle’s Wide Area Information Service, WAIS)
1991 – Tim Berners-Lee’s World Wide Web is publicly released. The web eventually swamped all other content publishing services
1994 – Netscape web browser released
1997 – Internet Archive starts archiving the web, currently available via the Wayback Machine (Section 4.3.2)
1998 – Sergey Brin and Larry Page publish (and patent) their ‘PageRank’ search algorithm, paving the way for Google (Section 7.1.1)
Mid-2003 – There are an estimated 180 million registered hosts on the Internet, 40 million websites and between 600 and 700 million users
2003 – Linden Lab launches Second Life virtual world (Section 9.3.2)
2004 – Mark Zuckerberg founds Facebook.com, heralding the rise of social network sites (Sections 3.3.3, 5.1.2)
2004 – Political bloggers play prominent role in US Presidential election (Section 7.3)
2005 – YouTube video-sharing website launched
2006 – Twitter microblogging service launched (Section 5.2.2)
2007 – iPhone launched by Apple, igniting the market for smartphones (Section 1.3.1)
2011 – Social media play prominent role in the Arab Spring and the Occupy Movement (Section 8.2)
BOX 1.3 PHASES IN THE EVOLUTION OF THE WEB
| Phase | Name | Key languages/protocols | Key applications |
| --- | --- | --- | --- |
| Web 1.0 | Static Web | HTML, HTTP | Websites (hosted by web server software such as Apache), web browsers (e.g. Firefox) |
| Web 2.0 | Collaborative Web | AJAX, RSS, SOAP, XML | Web blogs, social network services, microblogs, smartphone operating systems (e.g. Android), software as a service (e.g. Google Docs) |
| Web 3.0 | Semantic Web | RDF, SWRL, SPARQL | Semantic databases, intelligent personal agents |
Web 3.0, or the Semantic Web, involves technologies that make the web more machine-readable, leading to a ‘web of data’, which is an evolution of the Web 1.0 ‘web of documents’ (Shadbolt et al., 2006). While the technologies underlying the Semantic Web are proven, there is yet to be a general take-up of Web 3.0. While it is possible to retrofit existing websites to make them Web 3.0 compatible, this would entail a massive amount of work, so webmasters are unlikely to do this until there are clear benefits or reasons to do so. The exception is the government sector, where Open Data initiatives are drawing on Web 3.0 technologies. But for the vast majority of the web, while Web 1.0 and Web 2.0 are ubiquitous, Web 3.0 is still in its infancy.
A common feature of all three phases of the web is the use of technologies to help people find the content they want. With Web 1.0, and to a lesser extent Web 2.0, the core enabling technologies are the hyperlink, which enables users to move efficiently around the web (‘web surfing’), and search engines, which index web content and present search results to users. In contrast, Web 3.0 envisages intelligent personal agents finding content on behalf of users by drawing on users’ preferences and browsing habits.
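What it means for a search engine to index web content can be sketched with a toy inverted index, which maps each word to the pages that contain it. This is an illustrative assumption about the general technique, not a description of any particular search engine; real engines also rank their results. The page addresses and texts below are invented.

```python
from collections import defaultdict


def build_index(pages: dict[str, str]) -> dict[str, set[str]]:
    """Map each lower-cased word to the set of page addresses containing it."""
    index: dict[str, set[str]] = defaultdict(set)
    for address, text in pages.items():
        for word in text.lower().split():
            index[word].add(address)
    return index


def search(index: dict[str, set[str]], query: str) -> set[str]:
    """Return the pages that contain every word in the query."""
    words = query.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


# Hypothetical pages: address -> text content.
pages = {
    "page-a": "Web social science methods",
    "page-b": "Social network analysis",
    "page-c": "Web archives",
}
index = build_index(pages)
hits = search(index, "web social")
```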
Governance of the Internet occurs at two levels: architecture and operation. In relation to architecture, design and refinement of protocol specifications is undertaken by various working groups coordinated by the Internet Engineering Task Force (IETF). Other organisations take specific roles in particular areas. For example, issues relating to transmission media are handled by the Institute of Electrical and Electronics Engineers (IEEE) and the International Telecommunication Union (ITU), and protocols to do with the web are the province of the World Wide Web Consortium (W3C) industry association. The main organisation involved with Internet operation governance is the Internet Corporation for Assigned Names and Numbers (ICANN), which coordinates the DNS, IP addresses and the generic and country code TLD system.
1.2 EXAMPLES OF ONLINE COMPUTER-MEDIATED INTERACTION
This section aims to familiarise readers with several forms of online computer-mediated interaction. The list is not complete, with a focus on the types of online interaction that are discussed elsewhere in this book.
Threaded conversations: newsgroups, discussion groups and chat rooms
Newsgroups are repositories of emails set up for different topics, often hosted on the Usenet system (an example is rec.pets.cats – a Usenet newsgroup dedicated to discussing pet cats). Threaded conversations occur within newsgroups when individuals make posts to newsgroups (thus starting a ‘thread’), and respond to the posts of other people. Discussion groups (or chat rooms) are hosted on the web and are often functionally similar to newsgroups (which do not necessarily involve web technologies). They can be moderated or unmoderated. An example is the chat rooms that are hosted on America Online (AOL). Another example is Slashdot – a popular web-based technology-related forum, with articles and comments from readers. Slashdot has developed its own subculture involving the accumulation of ‘karma’ scores, with volunteer moderators being selected from those with high scores. Threaded conversations are looked at in Sections 3.3.2 and 9.1.2.
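A threaded conversation can be represented as a set of posts in which each reply records the post it responds to. The sketch below, with hypothetical field names and invented data, shows one way to turn such a thread into a "who replies to whom" network of the kind a social scientist might analyse.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class Post:
    post_id: int
    author: str
    parent_id: Optional[int]  # None marks the post that starts a thread


def reply_edges(posts: list[Post]) -> Counter:
    """Count directed (replier, replied-to) pairs across all posts."""
    by_id = {post.post_id: post for post in posts}
    edges: Counter = Counter()
    for post in posts:
        if post.parent_id is not None:
            edges[(post.author, by_id[post.parent_id].author)] += 1
    return edges


# A hypothetical thread: alice starts it, others reply.
thread = [
    Post(1, "alice", None),
    Post(2, "bob", 1),
    Post(3, "carol", 1),
    Post(4, "alice", 2),
    Post(5, "bob", 4),
]
edges = reply_edges(thread)
```

The resulting counts form a weighted, directed network among participants, with the thread-starting post contributing no edge of its own.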
Web 1.0 websites
A static website is the ‘face’ of Web 1.0. These generally represent organisational web presence, rather than the web...
Table of contents
- Cover Page
- Title Page
- Copyright Page
- Contents
- List of Figures
- List of Tables
- List of Boxes
- About the Author
- Preface
- 1 Introduction
- I Web Social Science Methods
- II Web Social Science Examples
- References
- Index