1 Value-Laden Biases in Data Analytics
Who is responsible for the outcomes of an analytics program that tracks the facial expressions of therapy patients? Is the program itself responsible? Does Lemonade Insurance’s AI Jim “act” when it makes decisions about fraudulent claims? Or, as some may argue, are these programs neutral, such that any bad decisions are more the product of society and human decisions?
The goal of this chapter is to examine how technologies, including computer programs and data analytics, have biases or preferences. The discussion about whether technology does things or has preferences emanates from a concern as to who is responsible for outcomes. In other words, when an organization or individual uses data analytics, who is responsible for the outcome? The arguments traditionally fall into two camps: those that focus on the technology as the actor that “does” things and is at fault (technological determinists) and those that focus on the users of that technology as determining the outcome (social determinists). The readings chosen take a different approach by acknowledging the value-laden biases of technology, including data analytics, while preserving the ability of humans to control the design, development, and deployment of technology.
For technological determinists, technology is the primary actor of the story; some even argue that technology has an internal guiding force that propels its development and use and shapes society. As such, technology is to “blame” for the outcome. Strident technological determinists frequently see this internal dynamic as leading the best technology to survive in the market. This faction within computer science argues that the ethical evaluation of technology is not appropriate since it may curtail development. The technological imperative frames technologies as almost inevitable and outside all societal control; a technological determinist also believes that technology is always correct.[1]
Accordingly, technology should be adopted for the good of society.[2]
For example, in an argument against scholars who have highlighted the dangers of using artificial intelligence and predictive analytics without regard to their biases or moral implications, Alex Miller, in “Want Less-Biased Decisions? Use Algorithms,” lists the ways AI could be an improvement because humans are bad at decisions (true, we aren’t great[3]). His argument joins a common refrain: because technology can be an improvement if designed properly, it is therefore always an improvement.[4]
For data analytics, we hear technological determinist arguments whenever the algorithm or program is the main actor in the paragraph or the sentence, for example, “The algorithm decided …” or “the program categorized ….” For AI Jim, who has already been given a name (!), Lemonade Insurance reports the good that AI Jim has done for the company.
For social determinists, society is the main actor of the story, constructing technology and determining the outcome. If a technology is not performing correctly, then a social determinist would point to the many ways that people created that technology and decided how it would be used. For social determinists, what matters is not technology itself but the social or economic system in which it is embedded. We hear social determinist arguments in data analytics in two ways. First, we may blame the use of the program rather than the design of the program. Second, others may acknowledge that the data may be flawed (“it’s just the data”) and that society needs to get better data for data analysts.
This tension between social determinists and technological determinists is important to the ethics of data analytics because whoever is “acting,” or doing things, is normally whom we look to hold responsible for those acts. For social determinist approaches (blaming the data or the users or society), a data analytics program is neutral. Society is then responsible for the moral implications of the technology in use; we can’t blame developers. For technological determinists, data analytics programs have biases and do things, but these inherent biases are then outside the influence of society, designers, and developers. Interestingly, both positions mistakenly absolve developers (computer scientists, data analysts, and corporations) of their responsibility. Whether you hold the users of the algorithm responsible (social determinism) or the algorithm itself (technological determinism), you are not holding responsible the systems of power (the government or company) that designed, developed, and implemented the program.
However, scholars (not surprisingly) have tackled this issue with a variety of approaches.
Wiebe Bijker is a classic social constructionist (not a determinist!). In Of Bicycles, Bakelites, and Bulbs, Bijker explores “both the social shaping of technology and the technical shaping of society.” Rather than claiming that all technologies are socially determined or that all technologies determine society, Bijker notes that “some artifacts [technologies, for Bijker] are more obdurate, harder to get around and to change, than others.” This allows for some data analytics programs to be more obdurate, “harder to get around,” than others.
Deborah Johnson[5] directly addresses the question underlying many of these determinist debates: who can be responsible for the moral implications of technology. Johnson’s “claim is that those who argue for the moral agency (or potential moral agency) of computers are right in recognizing the moral importance of computers, but they go wrong in viewing computer systems as independent, autonomous moral agents.”[6] This difference is important for some in that the term moral agent carries with it the idea of responsibility for one’s actions. In this case, Johnson’s account allows us to identify the important value-laden biases and moral implications of data analytics programs without attributing some sort of intentional agency that would lie outside human control. For Johnson, humans are still responsible for the technology they design, develop, and bring to market.
For the readings included here, the authors attempt to acknowledge both the ability of humans to create and mold technology for their purposes and the value-laden biases or politics technology has once it is in use. In terms of data analytics, this means that developers and designers make value-laden decisions in the development and coding of AI, predictive analytics, and machine learning (any type of analytics), and those decisions have moral implications for the use of that program.
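To make this concrete, consider a minimal sketch of a claim-flagging step. It is written for this chapter rather than drawn from any real insurer’s system; the function, the score cutoff, and the outcome labels are all hypothetical. Even these few lines force value-laden choices about who bears the cost of the model’s errors.

```python
# A hypothetical fraud-flagging step. Every constant and default below
# is a value-laden design decision by a developer, not a neutral
# technical fact; none of this reflects any real insurer's system.

def flag_claim(fraud_score: float, threshold: float = 0.7) -> str:
    """Route a claim based on a model's fraud score in [0, 1].

    Setting threshold = 0.7 trades false accusations against missed
    fraud: lowering it shifts harm toward honest claimants; raising it
    shifts cost to the insurer. Neither setting is value-free.
    """
    return "deny_and_investigate" if fraud_score >= threshold else "pay"

# Choosing to auto-deny rather than route the claim to a human
# adjuster is itself an arrangement of power in Winner's sense.
print(flag_claim(0.72))  # -> deny_and_investigate
```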
Summary of Readings
In the classic article “Do Artifacts Have Politics?”[7] Professor Langdon Winner explicitly addresses the ideas of social and technological determinism. Winner argues against the idea that “what matters is not technology itself but the social or economic system in which it is embedded,” which he sees as an overreaction to the claim that technology has an internal dynamic which, “unmediated by any other influence, molds society to fit its patterns.” In other words, Winner sees social determinism as an overcorrection to claims of technological determinism. He argues that technology, designed and used by society, has politics, or “arrangements of power and authority in human associations.” Winner uses examples such as bridges, molding machines, and tomato harvesters to explore the many ways technology can have politics, both in the decision to adopt the technology and in specific features of its design. For example, he notes that the tomato harvester’s size and cost required so much capital to enter the market that smaller farmers were driven out. This was not a “plot,” according to Winner, but the “social process in which scientific knowledge, technological invention, and corporate profit reinforce each other in deeply entrenched patterns that bear the unmistakable stamp of political and economic power.” Winner’s concepts are just as applicable today: consider the critiques, examined in the next chapter, of large language models as environmentally damaging and concentrated in labs funded by large corporations.
When applying Winner’s approach to a data analytics case, we would (1) identify the politics or arrangements of power and authority in a program, and (2) examine whether the technology is “inherently political” or whether its politics are due to specific design choices that “can affect the relative distribution of power, authority, privilege in a community.” Winner may see the tracking and recording of patients as shifting power to the company and away from the patient, as patients do not have visibility or control over the data collected. Some may go further to question whether this type of surveillance has inherent politics (in Winner’s sense), as it requires a particular distribution of authority to collect, protect, and analyze data, as opposed to the alternative of a therapist taking notes.
In “Bias in Computer Systems,”[8] Professors Batya Friedman and Helen Nissenbaum explore the idea of “bias” in computer systems. Friedman and Nissenbaum define bias as the tendency of a computer system to “systematically and unfairly discriminate” against certain individuals. In other words, computer systems have preferences as to who “gets” certain things and who does not. The authors focus specifically on systematic discrimination and do not include random mistakes or glitches. In addition, and unlike Winner, Friedman and Nissenbaum define bias as something that is unethical or unfair, and therefore undesirable. Where Winner sees politics as either good or bad (we would need to analyze the degree to which they are good or bad), Friedman and Nissenbaum, in this reading, define bias as a bad thing.[9]
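To see the difference between systematic discrimination and a random glitch in a system’s outputs, consider a minimal sketch with made-up group labels and counts; the 0.8 benchmark borrowed here is the U.S. “four-fifths” rule of thumb, an assumption of this example rather than anything from Friedman and Nissenbaum.

```python
# Toy illustration: unlike a random glitch, a systematic bias shows up
# as consistently different outcome rates by group. The groups, counts,
# and 0.8 benchmark are illustrative assumptions only.

approvals = {"group_a": (80, 100), "group_b": (50, 100)}  # (approved, applied)

rates = {g: ok / n for g, (ok, n) in approvals.items()}
ratio = min(rates.values()) / max(rates.values())

print(rates)        # {'group_a': 0.8, 'group_b': 0.5}
print(ratio < 0.8)  # True: a patterned, not random, disparity
```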
Friedman and Nissenbaum identify three types of biases based on how the bias emerges: preexisting biases, technical biases, and emergent biases. These categories are helpful in thinking through how a data analytics program, such as Lemonade Insurance’s AI Jim, could have biases (a) preexisting in the data, (b) embedded in the chosen technology, and (c) emergent in how the program is deployed on live data. While Winner appears to argue that all technologies have good and bad politics, Friedman and Nissenbaum see the possibility of a technology with no bias. This is an important distinction, and one that many may not agree with now: that a data analytics program could ever be free of biases. In analyzing a data analytics program according to Friedman and Nissenbaum, one would examine whether the program has the types of biases outlined in the article: preexisting, technical, and emergent.
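The three categories can be mapped onto the stages of a typical analytics workflow. The sketch below is schematic, with toy data and a deliberately crude stand-in “model”; the comments mark where each kind of bias could enter.

```python
# Schematic pipeline marking where Friedman and Nissenbaum's three
# bias types could enter. All data and the "model" are toy placeholders.

def train_fraud_model(training_data):
    # Toy stand-in for training: learn a single score cutoff from
    # past (score, label) pairs.
    denied = [score for score, label in training_data if label == "denied"]
    cutoff = min(denied) if denied else 1.0
    return lambda score: "deny" if score >= cutoff else "pay"

# (a) Preexisting bias: these labels inherit whatever discrimination
#     shaped the human decisions that produced them.
historical_claims = [(0.9, "denied"), (0.4, "paid"), (0.6, "denied")]

# (b) Technical bias: the modeling choice itself (here, one blunt
#     cutoff) constrains how claims can be treated.
model = train_fraud_model(historical_claims)

# (c) Emergent bias: live claims may come from populations or contexts
#     the training data never represented.
live_scores = [0.55, 0.7, 0.3]
print([model(s) for s in live_scores])  # ['pay', 'deny', 'pay']
```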
In an excerpt from Gabbrielle Johnson’s “Are Algorithms Value-Free?,”[10] Johnson pushes us to think more deeply about the many ways algorithms are not value-free. Some in computer science and data analytics acknowledge that the data we use is problematic, thus shifting the “blame” for the moral implications of a data analytics model to either (a) those who created the data some time ago or (b) those who used the algorithm on live data. The refrain “it’s just bad data,” however, masks that developing data analytics models, whether through AI or programming, is a value-laden enterprise or, as Johnson says, “values are constitutive of the very operation of algorithmic decision-making.” It is not possible to be “value-free.” In making this argument, Johnson relies on a body of work in the philosophy of science, including Rudner, who is included later in this volume, that examines the value-ladenness of science and technology and argues that “values can shape not only the research programs scientists choose to pursue, but also practices internal to scientific inquiry itself, such as evidence gathering, theory confirmation, and scientific inference.”[11]
Finally, in “Algorithmic Bias and Corporate Responsibility: How Companies Hide behind the False Veil of the Technological Imperative,” I tie determinist arguments explicitly to corporate responsibility for value-laden design. I argue that judging AI on effi...