1 Value-Laden Biases in Data Analytics
Who is responsible for the outcomes of an analytics program that tracks the facial expressions of therapy patients? Is the program itself responsible? Does Lemonade Insurance’s AI Jim “act” when it makes decisions about fraudulent claims? Or, as some may argue, are these programs neutral, with any bad decisions being more the product of society and human choices?
The goal of this chapter is to examine how technologies—including computer programs and data analytics—have biases or preferences. The discussion about whether technology does things or has preferences emanates from a concern as to who is responsible for outcomes. In other words, when an organization or individual uses data analytics, who is responsible for the outcome? The arguments traditionally fall into two camps: those that focus on the technology as the actor that “does” things and is at fault (technological determinists) and those that focus on the users of that technology as determining the outcome (social determinists). The readings chosen take a different approach by acknowledging the value-laden biases of technology—including data analytics—while preserving the ability of humans to control the design, development, and deployment of technology.
For technological determinists, technology is the primary actor of the story, and some even argue that technology has an internal guiding force that propels its development and use and shapes society. As such, technology is to “blame” for the outcome. Strident technological determinists frequently see technology as having an internal dynamic that leads the best technology to survive in the market. This faction within computer science argues that the ethical evaluation of technology is not appropriate since it may curtail development. The technological imperative frames technologies as almost inevitable and outside all societal control; a technological determinist also believes that technology is always correct.¹ Accordingly, technology should be adopted for the good of society.²
For example, in an argument against scholars who have highlighted the dangers of using artificial intelligence and predictive analytics without regard to their biases or moral implications, Alex Miller, in “Want Less-Biased Decisions? Use Algorithms,” lists the ways AI could be an improvement because humans are bad at decisions (true: we aren’t great³). His argument joins a common refrain that technology, because it can be an improvement if designed properly, is therefore always an improvement.⁴
For data analytics, we hear technological determinist arguments when the algorithm or program is the main actor in the paragraph or the sentence. For example, “The algorithm decided …” or “the program categorized …” For AI Jim, who is already given a name (!), Lemonade Insurance reports the good that AI Jim has done for the company.
For social determinists, society is the main actor of the story, constructing technology and determining the outcome. If a technology is not performing correctly, then a social determinist would point to the many ways that people created that technology and decided how it would be used. For social determinists, what matters is not technology itself but the social or economic system in which it is embedded. We hear social determinist arguments in data analytics in two ways. First, we may blame the use of the program rather than the design of the program. Second, others may acknowledge that the data may be flawed (“it’s just the data”) and that society needs to get better data for data analysts.
This tension—between social determinists and technological determinists—is important to the ethics of data analytics because who is “acting” or doing things is normally who we look to hold responsible for those acts. For social determinist approaches (blaming the data or the users or society), a data analytics program is neutral. Society is then responsible for the moral implications of the technology in use; we can’t blame developers. For technological determinists, data analytics programs have biases and do things; but these inherent biases are then outside the influence of society, designers, and developers. Interestingly, both mistakenly absolve developers—computer scientists, data analysts, and corporations—of their responsibility. Whether you hold the users of the algorithm responsible (social determinism) or the algorithm itself (technological determinism), you are not holding responsible the systems of power—the government or company—that designed, developed, and implemented the program.
However, scholars (not surprisingly) have tackled this issue with a variety of approaches.
Wiebe Bijker is a classic social constructionist (not a determinist!). In Of Bicycles, Bakelites, and Bulbs, Bijker explores “both the social shaping of technology and the technical shaping of society.” Rather than claiming all technologies are socially determined or all technologies determine society, Bijker notes that “some artifacts [technologies for Bijker] are more obdurate, harder to get around and to change, than others.” This allows for some data analytics programs to be more obdurate, “harder to get around,” than others.
Deborah Johnson⁵ directly addresses the question underlying many of these determinist debates: who can be responsible for the moral implications of technology. Johnson’s “claim is that those who argue for the moral agency (or potential moral agency) of computers are right in recognizing the moral importance of computers, but they go wrong in viewing computer systems as independent, autonomous moral agents.”⁶ This difference is important for some in that the term moral agent carries with it the idea of responsibility for one’s actions. In this case, Johnson’s account allows us to identify the important value-laden biases and moral implications of data analytics programs but not attribute some sort of intentional agency that would lie outside human control. For Johnson, society is still responsible for the technology it designs, develops, and brings to market.
The authors of the readings included here attempt to acknowledge both the ability of humans to create and mold technology for their purposes and the value-laden biases or politics that technology has once it is in use. In terms of data analytics, this would mean that developers and designers make value-laden decisions in the development and coding of AI, predictive analytics, and machine learning (any type of analytics), and those decisions have moral implications for the use of that program.
Summary of Readings
In the classic article “Do Artifacts Have Politics?”⁷ Professor Langdon Winner explicitly addresses the ideas of social and technological determinism. Winner argues against the idea that “what matters is not technology itself but the social or economic system in which it is embedded,” which he sees as an overreaction to the claim that technology has an internal dynamic which, “unmediated by any other influence, molds society to fit its patterns.” In other words, Winner sees social determinism as an overcorrection to claims of technological determinism. He argues that technology, designed and used by society, has politics or “arrangements of power and authority in human associations.” Winner uses examples such as bridges, molding machines, and tomato harvesters to explore the many ways technology can have politics, both in the decision to have the technology and in the specific features of its design. For example, he notes that the size and cost of the tomato harvester required an amount of capital to enter the market that drove out smaller farmers. This was not a “plot” according to Winner, but the “social process in which scientific knowledge, technological invention, and corporate profit reinforce each other in deeply entrenched patterns that bear the unmistakable stamp of political and economic power.” Winner’s concepts are just as applicable today, for example to the critiques, examined in the next chapter, of large language models as environmentally damaging and concentrated in labs that are funded by large corporations.
When applying Winner’s approach to a data analytics case, we would (1) identify the politics or arrangements of power and authority in a program, and (2) examine whether the technology is “inherently political” or whether its politics are due to specific design choices that “can affect the relative distribution of power, authority, privilege in a community.” Winner may see the tracking and recording of patients as shifting power to the company and away from the patients, since they do not have visibility into or control over the data collected. Some may go further and ask whether this type of surveillance has inherent politics (according to Winner), as it requires a particular distribution of authority to collect, protect, and analyze data, as opposed to the alternative of a therapist taking notes.
In “Bias in Computer Systems,”⁸ Professors Batya Friedman and Helen Nissenbaum explore the idea of “bias” in computer systems. Friedman and Nissenbaum define bias as the tendency of a computer system to “systematically and unfairly discriminate” against certain individuals. In other words, computer systems have preferences as to who “gets” certain things and who does not. The authors focus specifically on systematic discrimination and do not include random mistakes or glitches. In addition, and unlike Winner, Friedman and Nissenbaum define bias as something that is unethical or unfair, and therefore undesirable. Where Winner sees politics as either good or bad (we would need to analyze the degree to which they are good/bad), Friedman and Nissenbaum, in this reading, define bias as a bad thing.⁹
Friedman and Nissenbaum identify three types of biases based on how the bias emerges: preexisting biases, technical biases, and emergent biases. These categories are helpful in thinking through how a data analytics program, such as Lemonade Insurance’s AI Jim, could have biases (a) preexisting in the data, then (b) embedded in the chosen technology, and (c) emergent in how the program is then deployed on live data. While Winner appears to argue that all technologies have good and bad politics, Friedman and Nissenbaum see a possibility of a technology with no bias. This is an important distinction, and many today may not agree that a data analytics program could ever be free of biases. In analyzing a data analytics program according to Friedman and Nissenbaum, one would examine whether the program has the types of biases outlined in the article: preexisting, technical, and emergent.
In an excerpt from Gabbrielle Johnson’s “Are Algorithms Value-Free?”¹⁰ Johnson pushes us to think more deeply about the many ways algorithms are not value-free. Some in computer science and data analytics acknowledge that the data we use is problematic, thus shifting the “blame” for the moral implications of a data analytics model to either (a) those who created the data some time ago or (b) those who used the algorithm on live data. The refrain “it’s just bad data,” however, masks that developing data analytics models, whether through AI or other programming, is a value-laden enterprise or, as Johnson says, “values are constitutive of the very operation of algorithmic decision-making.” It is not possible to be “value-free.” In making this argument, Johnson relies on a body of work in philosophy of science, including Rudner, who is included later in this volume, that examines the value-laden-ness of science and technology: those who argue “values can shape not only the research programs scientists choose to pursue, but also practices internal to scientific inquiry itself, such as evidence gathering, theory confirmation, and scientific inference.”¹¹
Finally, in “Algorithmic Bias and Corporate Responsibility: How Companies Hide behind the False Veil of the Technological Imperative,” I tie determinist arguments explicitly to corporate responsibility of value-laden design. I argue that judging AI on effi...