Big data and/versus people knowledge
On the ambiguities of humanistic research
Ingrid M. Hoofd
Introduction: big data, the neo-liberal evil?
Many universities across the globe have undergone radical changes with the advent of so-called big data techniques. From new ways of generating global university rankings, to new computer-driven research methods in Social Sciences and Humanities, these techniques have led to novel and, at times, more efficient ways of conducting teaching and research and managing large academic institutions. Big data techniques are a central aspect of what some scholars have termed the drive towards “informatization” under “cognitive capitalism” – terms that mark the increasing importance of cognitive work and the manipulation of symbols in order to spur economic growth (Hardt and Negri 2000, p. 280; Moulier-Boutang 2011, p. 47). Yet many scholars lament the advent of big data in universities, a phenomenon they see as constitutive of the near-pervasive “audit culture” (Shore and Wright 2000, p. 57; Strathern 2000, p. 2; Shore 2008, p. 278; Morrish 2017, p. 142). In these well-argued narratives, oppressive evaluation and audit practices are generally attributed to post-1980s neo-liberalisation, which saw the demise of the welfare state and the devaluation of educational institutions in favour of a more market-driven approach (Shore 2008, p. 280; Watts 2017, p. 112).
While I certainly sympathise with these critiques of neo-liberalisation and its imbrication in the demise of healthier university practices, I would nevertheless like to invite you to consider the possibility that the problem of big data audit culture in academia – the continued assessment and surveillance of staff and students through digital-platform- and social-media-generated analytics – is actually more complex and cannot be situated in an ‘evil’ capitalist force residing outside the university walls alone. Resisting both the problematic covert nostalgia for the ‘demise’ of a ‘nobler’ past university, and the overly optimistic interpretations of the supposedly emancipatory potential of big data techniques, I suggest that the encroachment of data surveillance on the university is, in fact, a continuation of a longer history of the university’s entanglements with ‘modern’ technologies that dates back to the European Enlightenment. I develop this argument by zooming in on debates around the advent of big data in Humanities that have given rise to a new field: Digital Humanities. Proponents of big data in Humanities have so far argued that the automated gathering and visualisation of data affords new insights into social and human relations, as well as into the larger Republic of Letters (Burdick et al. 2012, p. 4; McGann 2014, p. 4). However, opponents bemoan the increasing encroachment of neo-liberal techniques of automation on the Humanities, arguing that they signal the demise of rich research and teaching practices such as “close reading” (Grusin 2014, p. 85); other scholars have complicated the opposition between “close” and “distant” reading by claiming that big data techniques generate sophisticated novel interpretations of classical texts (Hammond, Brooke, and Hirst 2016, p. 50).
In order to move beyond the too-convenient (even if partially justified) ‘neo-liberalisation’ narrative, I further complicate this adversarial ‘for-or-against’ stance by teasing out the ambiguities of big data’s implementation in staff and student surveillance, curriculum changes, new research and teaching methods, and student surveys in the Humanities Faculty at Utrecht University. Founded in 1636, Utrecht University is one of the larger Dutch research universities, with approximately 6,000 staff and 30,000 students. Currently, the Faculty uses a plethora of big-data-analytics-generating platforms: Elsevier’s research performance platform Pure, the online teaching and learning environment Blackboard, the course evaluation tool Caracal, the peer evaluation tool Peergrade, the network analysis and visualisation software Gephi, and CLARIAH, a distributed infrastructure for archival work in Humanities. This chapter will illustrate that the turn to big data in Humanities signals a more profound conundrum in the concept of the university, dating to its idealistic beginnings, than a mere spat around Digital Humanities. This deeper conundrum pivots on the paradoxical claim that big data renders its object of analysis simultaneously more unknowable (or superficial) and more knowable (or deep), a paradox inseparable from the aporia inherent in the humanist endeavour to understand yet liberate alterity, to totalise yet render un-finishable the project of totalising knowledge. Following this line of argument, I suggest that the very quest for knowledge – especially knowledge about people/s, which ties the university project to the history of colonialism – is becoming a near-pervasive ‘exposing-itself’ of academia. The Humanities’ debates around big data illustrate that the problem of the university today consists of a cybernetic acceleration of the university’s idealistic yet oppressive mission. Moreover, as much as academia has never been an isolated ‘ivory tower’, with no connection to ‘the market’, this ‘exposing-itself’ is similar to the effect the widespread use of social media has in all spheres of life. However, academia’s self-exposure also utilises new, supposedly ‘postcolonial’ ethnographic methods that romanticise ‘direct contact’ with peoples and ‘nature’ beyond neo-liberal perversities that obviously damage ‘natural’ and human habitats and communities. My radical conclusion is that the most cherished ideals of Utrecht University – emancipation, innovation, and knowledge accumulation and exchange – are therefore precisely what produces unjust and unsustainable practices, both ‘within’ and ‘without’ Higher Education institutions in Dutch (and global) society.
Historical traces of big data: Utrecht University’s missions and visions
As a large teaching and research university, Utrecht University has, in the past decade, embraced big data technologies in many realms. Every year, its students participate in the National Student Survey (Nationale Studenten Enquête or NSE), which grades degree programmes across the country on a scale from 1 (very poor) to 10 (excellent) by digitally analysing tens of thousands of online questionnaires. The ‘grades’ that degree programmes ‘earn’ are independent of the global university or department rankings. Students evaluate teaching through anonymous online forms eight times per year. The Department of Media and Culture’s curriculum has recently wholeheartedly embraced ‘data’ as part of its teaching content, too; feedback to students is increasingly located in ‘rubrics’, where skillsets are evaluated by way of ticks or numbers rather than narrative comments. Such ‘datafied’ assessment and feedback tools usually consist of 5 to 20 rubrics – for example, ‘method’, ‘readability’, and ‘analysis’ – which are marked with a plus or a minus, or with numbers from 1 to 5, so that a pass or fail can be transparently calculated. Meanwhile, staff in Medical and Social Sciences are taught the intricacies of Python, a general-purpose programming language widely used for the management and analysis of large datasets. Humanities have jumped on the Digital Humanities bandwagon by instituting their own Centre for Digital Humanities, which “aspires to accelerate [sic] and support the development of these digital methods, in order to gain new insights in all of the humanities” (Centre for Digital Humanities 2019). In its 2016–2020 Strategic Plan, Utrecht University stated that it will “future-proof its educational model by implementing innovative pedagogical models that revolve around technical and educational support in the use of IT such as blended (online and classroom) learning” (Utrecht University Strategic Plan 2016, p. 6). It will also prioritise multidisciplinary research (especially between Science, Technology, Engineering, and Mathematics (STEM), Social Sciences and Humanities) by investing in research infrastructures (Utrecht University Strategic Plan 2016, p. 7), a large part of which will go to the Research IT programme that comprises “a high-quality network, capacity for the storage of research data and facilities for high-performance computing” (23).
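To make concrete what such ‘transparent’ calculation amounts to, consider a minimal sketch in Python, the very language the University teaches its staff. The rubric names echo the examples above, but the scores, threshold, and aggregation rule are hypothetical illustrations rather than the Faculty’s actual scheme:

```python
# Minimal sketch of a 'datafied' rubric assessment: each rubric is scored
# from 1 to 5, and the verdict is reduced to a single threshold comparison.
# The threshold and scores below are hypothetical, not Utrecht's actual scheme.

RUBRICS = ["method", "readability", "analysis"]
PASS_THRESHOLD = 3.0  # hypothetical mean rubric score required to pass

def assess(scores: dict[str, int]) -> tuple[float, str]:
    """Average the rubric scores and return a 'transparent' pass/fail verdict."""
    mean = sum(scores[r] for r in RUBRICS) / len(RUBRICS)
    return mean, "pass" if mean >= PASS_THRESHOLD else "fail"

mean, verdict = assess({"method": 4, "readability": 2, "analysis": 3})
print(f"mean rubric score: {mean:.1f} -> {verdict}")  # mean rubric score: 3.0 -> pass
```

The sketch makes the chapter’s point visible: whatever narrative judgement informed the individual scores disappears into an arithmetic mean and a threshold, which is precisely the ‘transparency’ such tools promise.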
In its Strategic Plan, Utrecht University justifies all these developments and investments by appealing to its primary responsibility to society:
These are not mere neo-liberal catchphrases; the demand for transparency and accountability resonates with the University’s mission, which revolves around “innovation, new insights and societal impact” (6), and with its motto, “Sol Iustitiae Illustra Nos” (May the Sun of Righteousness Enlighten Us), engraved in the University’s emblem at the time of its inception in 1636. Evident here are Utrecht University’s connections with the Dutch Reformed Church; the motto is taken from Malachi 4:2 of the Latin Bible. The Dutch Reformed Church (Nederlandse Hervormde Kerk) was, from the 1571 Reformation in the Netherlands until its disbandment in 2004, the largest Protestant denomination in the country, and it had historically close ties with the Dutch government and its economic policies. Interestingly, a variation on the Utrecht University motto returns in the emblem of the US Rutgers University, which reads “Sol Iustitiae Et Occidentem Illustra” (“May the Sun of Righteousness Also Shine Upon the West”). Founded by Dutch Reformed colonists in 1766, Rutgers, like Utrecht, initially admitted only male students, and then only those from the cultured elites.
Innovation, transparency, and social responsibility imperatives seem to be at the idealistic heart of the university. Far from being propaganda or a facile PR strategy, they continue the University’s moral trajectory towards techniques like big data and fields like Digital Humanities. This is not to say that neo-liberalisation plays no role, but rather that neo-liberal management, teaching, and research techniques are appealing because they are steeped in certain deep-seated Christian, ‘enlightened’, and humanist moral imperatives. As with the university’s initially positive, optimistic mission and its simultaneous embroilment in deeply problematic practices (colonialism, patriarchy, and class elitism), it is these very imperatives and their aporetic tensions – for example, the transparentisation, through ethnographic research, of the precise number of ‘natives’ in a newly colonised territory (Jenkins 2003; Manickam 2015) – that lead, through today’s accelerated communication and computation tools, to an aggravation of these long-lasting tensions. The aporetic logic of these ‘moral imperatives’ and their accelerated aggravation – staff and student disorientation, stress, and distress – is operative in a host of problematic oppositions (such as “close” and “distant” reading) that typify the turn to big data in the university. In the face of these imperatives’ accelerated implementation, their grounding aporetic logic fractures into a potentially indefinite number of signifiers, which, in our current cybernetic condition, are imbricated in the fundamentally binary logic of computing, leading to a variety of exhausted polarisations. While computer programming seems to offer a method for controlling vast data sets, the exponentially growing size of big data undermines procedural knowledge, which leads to an increase in uncertainty. Paradoxically, then, it is these polarisations that keep the debate alive; they are not only unresolvable but also fuel the infrastructural machines that have, from the historical beginnings of the university, through the ‘Republic of Letters’, been intimately intertwined with scientific research, collaboration, and publishing. An obvious example, of course, is this chapter itself, which productively challenges the opposition between the university’s ‘moral core’ and its ‘evil’ neo-liberal ‘environment’ – an opposition the following section details in order to draw out the ambiguities and alternative futures of humanistic research in the era of big data.
Complicating oppositions in the Digital Humanities
In “Building Theories or Theories of Building?”, Warwick recounts the “methodology storms” that have marred the birth and establishment of Digital Humanities, pointing out the difference between positions that emphasise “making” or “doing” and those that favour “theorising” or “critique” (Warwick 2016, p. 540). By helpfully tracing these “storms” to their historical precedents in English Studies and History, she argues that any young field tends to be beset by such oppositional in-fights, which are, or at least can be, “healthy” for the field’s development (541). She shows that there are elements of “thinking” in “making” and vice versa – for instance, English composition is envisioned as contributing to critical media literacy, while Critical Making teaches coding to make transparent to students the power of computing. However, she also observes that many of these historical and contemporary “storms” take place within the communication technologies of their era: fervent arguments in printed pamphlets in English Studies versus impassioned Twitter debates in Digital Humanities. Moreover, Warwick relates these “storms” to uncertainty about how to measure what counts as proper scientific knowledge in those fields, given the lack of appropriate parameters and objective values, which, in turn, leads to uncertainties about whose research is evaluated and who gets hired (546).
Given the above-mentioned aporetic entanglement of academic aspirations and new technologies, the uncertainty inherent in traditional Humanities, and now, again, in Digital Humanities, goes much deeper than teething problems. It concerns nothing less than fundamental epistemological questions about the nature of knowledge and its (technical) procurement and ratification. In proposing this, I heed Frabetti’s suggestion in “Have the Humanities Always Been Digital?” that such foundational epistemological concerns in Sciences and Humanities emphasise the need to “understand how exactly new technologies change the very nature and content of academic knowledge” (Frabetti 2012, p. 168). She does this by looking at how software engineering, in its attempt to control the “constitutive fallibility of software-based technology”, undoes itself through the “unexpected consequences” of the very technologies it relies on (167). This, Frabetti claims, is true not only of optimising and adaptive algorithms, but of computer software in general. Software is always open to the world; it comes into existence through use. In other words, Frabetti suggests that cybernetic machines and their software programming always already deconstruct themselves, generating elements that are in excess of their instrumentalist attempt at control (the word ‘cybernetics’, after all, derives from the Greek κυβερνήτης, ‘steersman’ or ‘governor’). This, in turn, means that “software is always both conceptualised according to … a metaphysical framework and capable of escaping it”, so that if we “uncover the conceptual presuppositions”, we can better understand what counts as academic knowledge, especially in Humanities, and may, in turn, politicise concerns around cybernetic machines better (167; emphasis original). I would add that, in the spirit of deconstruction, uncovering such presuppositions necessarily also entails picking apart the problematic oppositions surrounding these machines. While Warwick’s argument does a good job of uncovering the historical backdrop of one of these oppositions, we need a more fine-grained understanding of how and why this exhausted opposition between theory and practice appears in fields that are pressed to contribute to the general social good. One may think here, for instance, of socialist scholarship that devalues theory in favour of a ...