An Online Field Study on Scholarly Journal Annotations
Empirical Evidence and Implications for Software Design in the Digital Humanities
Max-Planck-Institut für Wissenschaftsgeschichte
Abstract
Even though there is an abundance of web-based annotation tools that allow users to share their data across the internet, little is known about how these tools are actually used in the daily work routines of scholars in the Humanities. This chapter presents an empirical study on public inline annotations by publishers, article authors and readers in a scholarly open-access journal. The findings of this study are combined with a meta-analysis of the existing empirical literature on marginal annotations in the Humanities and scholars’ willingness to share them. The most important conclusion that can be drawn from the empirical data is that the publication of annotations is not a feature that needs to be offered by all types of scholarly annotation software packages.
keywords: Collaboration, Evaluation, Classifying, Commenting, Form, Tool, Digital Humanities
1 Introduction
In his contribution to this volume, Willard McCarty describes his personal way of writing, storing and processing notes on scholarly texts. An early step in this workflow is to “record ideas, keywords and references to other sources I want to come back to later for more detailed note-making” on separate paper slips (McCarty 2020, 276 ff.). Two aspects of this description, the temporary, transient nature of preliminary comments and the working context in which they are made, can also be taken as distinctive features of individual scholarly annotations written in the margins of texts, henceforth marginalia (cf. Bold and Wagstaff 2017). It has been shown that these textual notes, consisting of a couple of sentences or even only some symbols, have been an important element of academic reading throughout the ages (Agosti et al. 2007; Blair 2004). Less clear is the relationship between this text genre and the scholarly practices of communication and data sharing. In the Digital Humanities (DH), which set out to foster collaboration and information sharing, there have been numerous initiatives to provide the means to share or publish this type of data. Connecting primary sources, scholarly literature and annotations on these texts could result in a “Scholarly Web” (Perkel 2015) that crosses the boundaries of disciplines and links once isolated digital collections (Lordick 2015, 2). The individual work of text analysis might be opened up to larger audiences even in the early stages of research (Becker et al. 2016, 10). These ideas are taken to the extreme by Hemminger and TerMaat (2014, 2278): “One can now foresee a time when every scholar’s thoughts about a particular article are electronically captured and displayed to other scholars.” This vision has already received considerable technological support: The “Web Annotation Data Model” (henceforth WADM) issued by the W3C¹ provides interoperability across software and collections (Hunter et al. 2010), and a number of DH tools offer annotation-sharing capabilities (Müller-Birn et al. 2015; Grassi et al. 2013).²
At present, though, it is unclear if a Scholarly Web of annotations will come into existence. For other types of annotations such as linguistic tags, established infrastructures for the publication of annotated data exist and are in constant use,³ but so far no comparable platform has attracted a large number of textual free-form annotations. It seems safe to say that the web-scale publication⁴ of marginalia has not yet become a regular feature of scholarly work in the Humanities. The question arises as to whether it should be a feature of annotation environments at all. After all, there have been a number of reports of a general mismatch between user needs and software designs in the Digital Humanities (Juola 2008, 75; Pape et al. 2012, 3 f.). This potential mismatch could reflect a general problem with annotation tools. An annotation feature was devised for one of the first graphical web browsers in the early 1990s (Carpenter 2013), and later Adriano and Ricarte (2012) were able to list eighty different systems in a comparative study of general-purpose annotation software tools. In the (now defunct) DH tool directory DiRT, “Annotation” was among the three functions that were referred to most often (Borek et al. 2016, Par. 9). But web annotation was not included as a feature of later browser generations, whose present-day descendants still do not conform to the WADM (Shaikh-Lesko 2019), and so far no killer application (cf. Juola 2008) has been presented for web-based digital annotations. It seems clear that more research is needed on the real potential for annotation-sharing within and outside academia.
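As a point of reference for the WADM mentioned in this section, a single marginal note might be serialized roughly as follows. This is a minimal sketch in Python, not the format used by any of the tools cited here: the function name `make_marginal_note`, the example IRIs and the note text are hypothetical, while the property names (`@context`, `type`, `motivation`, `body`, `target`, `selector`) follow the W3C data model.

```python
import json


def make_marginal_note(annotation_id: str, target_url: str,
                       quoted_text: str, note: str) -> dict:
    """Build a minimal WADM-style annotation as a JSON-LD-compatible dict.

    All argument values are supplied by the caller; nothing here is
    taken from a real annotation service.
    """
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "id": annotation_id,           # IRI identifying the annotation itself
        "type": "Annotation",
        "motivation": "commenting",    # one of the motivations defined by the WADM
        "body": {
            "type": "TextualBody",     # free-form text, as in a marginal note
            "value": note,
            "format": "text/plain",
        },
        "target": {
            "source": target_url,      # the annotated document
            "selector": {
                "type": "TextQuoteSelector",
                "exact": quoted_text,  # the passage the note is attached to
            },
        },
    }


# Hypothetical example values, for illustration only.
annotation = make_marginal_note(
    "https://example.org/annotations/1",
    "https://example.org/articles/42",
    "an important element of academic reading",
    "Come back to this for more detailed note-making.",
)
print(json.dumps(annotation, indent=2))
```

Because the result is plain JSON, a note of this shape can in principle be exchanged between any two systems that implement the model, which is the interoperability the standard aims for.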
In light of these problems, this chapter follows Antonijević Ubois’ (2016) maxim of “Developing Research Tools via Voices from the Field”, gathering empirical evidence on scholarly annotation practices in order to derive ideas for software design. It presents the findings of a study on public inline annotations authored by publishers, article authors and readers in the open-access journal eLife.⁵ The findings of this study are combined with a meta-analysis of the existing empirical literature on marginalia in the Humanities and scholars’ willingness to share them.
The remainder of this chapter is structured as follows. Section 2 offers an overview and a categorization of annotation systems in the Digital Humanities. Section 3 reviews the existing literature on annotations, and Section 4 presents new data on public scholarly annotations. Sections 5 and 6 set out the findings and derive recommendations for software design.
2 Annotations in the Digital Humanities: Concepts and Systems
At least since the 1990s, shared digital annotation environments have been an active field of study, both within the Digital Humanities and in Computer and Information Sciences (e.g. Ovsiannikov et al. 1999) in general. However, the types of annotations discussed range from marginalia written for private use to digital editions and linguistic markup in text corpora (Hunter 2009, 1). Annotations of the latter two types constitute research findings that are published together with their respective annotation targets (i.e. the objects that annotations are attached to). It is clear that publishing these annotations is normally useful or even necessary. Therefore, annotations need to be categorized to distinguish between different degrees of a priori suitability for publication. However, there is no consensus in the literature on a useful typology of digital annotations. In his influential work on “Scholarly Primitives”, Unsworth (2000, 1) counted the practice of annotating among the “basic functions common to scholarly activity across disciplines”. In a similar manner, the “Taxonomy of Digital Research Activities in the Humanities” project (TaDiRAH, Borek et al. 2016), which draws on Unsworth’s work, does not subcategorize “Annotating” any further, but subsumes the practices of “adding, e.g., comments, metadata or keywords”⁶ under the entry. In this taxonomy, annotating is a subtype of “Enrichment”, as it makes information inherent to the annotation target explicit. Annotating is explicitly contrasted with “Commenting” (a subtype of “Dissemination”), an activity that “serves to express some opinion, to add contextual information, or to engage in communication or collaboration.”⁷ If these definitions are used to inform software design one-to-one, annotating has to be modeled as one function, and commenting as another. But the distinction between “contextual information” and information which is “inherent” to the annotation target is too subtle for that purpose.
Furthermore, there are conceptual doubts about whether a clear-cut distinction between the two activities is empirically adequate: Walkowski (2016b, 9 f.) notes that in practice, annotating is most often part of other research activities. And with respect to annotating as a “Primitive”, Unsworth later considered the possibility that some of the initial categories might have to be further subcategorized (Unsworth and Tupman 2016, 232). Indeed, it can be shown that a more fine-grained subcategorization of annotating practices is helpful in constructing suitable use cases and, accordingly, functional requirements for software design. The factors presented in Table 1, which have in part been derived from Hunter’s comprehensive typology (Hunter 2009, 4–14), form the basis for a tentative subcategorization of annotations and the software systems with which they can be produced.
Tab. 1: Annotation dimensions
| Category | Annotation Author/Reader Scope | Target Type | Annotation Target Granularity | Metadata Depth |
| Values | 1: Individual | 1: Research Literature | 1: Publication/File | 1: Technical/Application-Specific |
| | 2: Collaboratory | 2: Primary Source | 2: Part of Publication/File | 2: Ad-hoc Semantics |
| | 3: (Scholarly) Public | | | 3: Std.-Conformant Metadata (WADM) |
| | | | | 4: LOD (Target, Body) |
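For readers who think in terms of data structures, the dimensions of Table 1 could be encoded along the following lines. This is only an illustrative sketch in Python: the class and member names are invented here, the decision to store author scope and reader scope as two separate fields is an assumption, and the two helper methods are one possible reading of how scope values might translate into functional requirements, not a rule stated in the table.

```python
from dataclasses import dataclass
from enum import IntEnum


class Scope(IntEnum):
    INDIVIDUAL = 1
    COLLABORATORY = 2
    PUBLIC = 3               # "(Scholarly) Public" in Table 1


class TargetType(IntEnum):
    RESEARCH_LITERATURE = 1
    PRIMARY_SOURCE = 2


class Granularity(IntEnum):
    WHOLE = 1                # Publication/File
    PART = 2                 # Part of Publication/File


class MetadataDepth(IntEnum):
    TECHNICAL = 1            # Technical/Application-Specific
    AD_HOC_SEMANTICS = 2
    STANDARD_CONFORMANT = 3  # Std.-Conformant Metadata (WADM)
    LINKED_OPEN_DATA = 4     # LOD (Target, Body)


@dataclass(frozen=True)
class AnnotationProfile:
    """One combination of values from Table 1 (hypothetical helper type)."""
    author_scope: Scope
    reader_scope: Scope
    target_type: TargetType
    granularity: Granularity
    metadata_depth: MetadataDepth

    def needs_networked_software(self) -> bool:
        # Assumption: anything beyond individual use implies sharing
        # annotations over a network.
        return max(self.author_scope, self.reader_scope) > Scope.INDIVIDUAL

    def authoring_offered_to_all_readers(self) -> bool:
        # Assumption: if both scopes coincide, every reader is also a
        # potential author, so one interface can serve both roles.
        return self.author_scope == self.reader_scope


# Hypothetical profile: private notes on parts of a research article.
private_marginalia = AnnotationProfile(
    author_scope=Scope.INDIVIDUAL,
    reader_scope=Scope.INDIVIDUAL,
    target_type=TargetType.RESEARCH_LITERATURE,
    granularity=Granularity.PART,
    metadata_depth=MetadataDepth.TECHNICAL,
)
print(private_marginalia.needs_networked_software())
```

Using ordered enumerations keeps the numbering of Table 1 visible in the code and allows simple comparisons between scope levels.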
This choice of features is motivated as follows. Author scope and reader scope⁸ indicate whether annotation authoring must be a function presented to all annotation readers, or if writing and reading functions can be facilitated by different software modules and interfaces. “Collaboratories” are defined in Cerf et al. (1993, 7 f., cf. Agosti et al. 2004) as networked infrastructures enabling scientific collaboration. They differ from solitary working contexts in that they require networked software for shared annotations. In contrast to web-scale annotations, however, sharing...