PART 1
The Practical, Technical and Theoretical Context
Chapter 1
Analysis of an Audiovisual Resource
1.1. Introduction
This book’s goal is to present a functional approach based on the semiotics* of the audiovisual text* [STO 03] for the analysis, i.e. the description, interpretation and indexing of digital audiovisual corpora.
The central notion of this approach is that of the model of description* of an audiovisual object, such as a video. Such a model rests on a set of criteria which semiotics uses to treat the text object* and which will be presented in greater detail in Chapter 3 of this book. The main criteria are the following:
– the criterion of the text as a compositional entity (a text can, in principle, be broken down into “smaller” textual units, and in turn forms part of a textual environment, of what is, metaphorically speaking, a textscape or mediascape);
– the criterion of the text as a structural entity possessing a set of characteristic constituents (such as the thematic constituent, the narrative constituent, the rhetorical and discursive constituent, the multimodal expression of content, or the formal and physical organization of the content in the text); and finally,
– the criterion of the text as a historical entity (the text as a genre) and an evolutive entity (the text as the product of savoir-faire, in principle always modifiable).
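The three criteria above can be read as a small data structure. The following Python sketch is only an illustration under stated assumptions: the class name `TextUnit`, its field names and the sample values are hypothetical and are not part of the approach presented in this book.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class TextUnit:
    """Hypothetical record for one audiovisual text or one of its segments."""
    title: str
    # Historical criterion: the text as a genre.
    genre: Optional[str] = None
    # Evolutive criterion: the text is, in principle, always modifiable.
    version: int = 1
    # Structural criterion: characteristic constituents (thematic, narrative, ...).
    constituents: Dict[str, str] = field(default_factory=dict)
    # Compositional criterion: the text breaks down into "smaller" textual units.
    parts: List["TextUnit"] = field(default_factory=list)

    def add_part(self, part: "TextUnit") -> None:
        """Attach a smaller textual unit to this one."""
        self.parts.append(part)

    def count_units(self) -> int:
        """Count this unit plus all units nested below it."""
        return 1 + sum(p.count_units() for p in self.parts)


# Illustrative values only.
seminar = TextUnit("Research seminar recording", genre="seminar")
seminar.constituents["thematic"] = "semiotics of the audiovisual text"
seminar.add_part(TextUnit("Session 1"))
seminar.add_part(TextUnit("Session 2"))
```

The nesting of `parts` stands for the compositional criterion only in one direction (text to smaller units); the surrounding textual environment is deliberately left out of this sketch.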
The hypothesis behind this book is that any project of analysis of a textual corpus in general, and of an audiovisual corpus in particular, whatever its level of specialization, relies on representations, “visions”, or theories: 1) about the object text and 2) about the activity of the analysis* of the text.
Thus, all told, a model of description is nothing more or less than the explicitized, formalized (in the broader sense of the word) part of a theory or vision which guides the task of analyzing a textual corpus (in our case, audiovisual).
Any gap of “satisfaction” between the model and the theory or vision underlying the work of analysis can be explained in one of two ways: either by a more or less significant implicit factor which guides the analyst's work and which the model cannot take into consideration, or by the simplifications which must be imposed on a theoretical frame of reference in order to develop an explicit and functional approach to the analysis of textual, or audiovisual, corpora.
The work of defining, developing, validating and tracking models of description of textual and, particularly, audiovisual corpora still represents an occupation in its own right, i.e. a set of specialist skills and know-how calling on a varied body of culture and knowledge. This body covers not only practical and technological domains such as information and knowledge technology, applied sciences of documentation, archiving, library sciences or the management of cultural heritage lato sensu, but also – and, in our opinion, crucially – a set of disciplines in human sciences such as text sciences (and particularly semiotics*), linguistic sciences or even that heterogeneous emerging set of approaches and problems classified under the general umbrella label of “cultural sciences” (Kulturwissenschaften, in German).
The occupation in question is that of the modelizer*, sometimes also called concept designer*, or information technician or engineer; indeed, the terminology is still very fuzzy and unstable. However, it is a central role in the workflow* [STO 11e] defining the constitution, analysis and publication/diffusion of bodies of knowledge heritage which are channeled by audiovisual corpora. The modelizer prepares, develops and manages all the metalinguistic resources necessary for the other actors involved to carry out their work.
1.2. Functionally different corpora
As part of the process of digitizing knowledge heritage, we can distinguish a series of categories of models (i.e. metalinguistic resources) needed to accomplish the various activities making up that process.
As set out in [STO 11e], the process of constituting a body of knowledge heritage in the form of, e.g., a digital archive takes place in various canonical stages, notably:
1) the stage of preparation of a field for collection of data documenting a body of cultural heritage;
2) the stage of the realization of the field work1;
3) the stage of technical and authorial treatment of the data collected (including, amongst other things, the derushing, montage and postproduction of the audiovisual data collected);
4) the stage of analysis (description, indexing but also pragmatic adaptation) of the data collected and documenting a field;
5) the stage of the publication and diffusion of the data collected and/or analyzed and, finally,
6) the stage of conservation of the data collected/analyzed/published.
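The six stages listed above can be sketched as an ordered enumeration. This is a minimal illustration, not a description of any existing tool: the names `Stage` and `next_stage` are hypothetical, and the strictly linear ordering is a simplifying assumption made for the sketch.

```python
from enum import IntEnum
from typing import Optional


class Stage(IntEnum):
    """Canonical stages of constituting a body of knowledge heritage."""
    PREPARATION = 1   # preparing the field for data collection
    FIELD_WORK = 2    # carrying out the field work
    PROCESSING = 3    # technical and authorial treatment of collected data
    ANALYSIS = 4      # description, indexing, pragmatic adaptation
    PUBLICATION = 5   # publication and diffusion
    CONSERVATION = 6  # conservation of collected/analyzed/published data


def next_stage(stage: Stage) -> Optional[Stage]:
    """Return the stage that follows, or None after the last one.

    Assumes a strictly linear workflow, which is a simplification.
    """
    if stage < Stage.CONSERVATION:
        return Stage(stage + 1)
    return None
```

For instance, `next_stage(Stage.ANALYSIS)` yields `Stage.PUBLICATION`, and `next_stage(Stage.CONSERVATION)` yields `None`.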
Each stage in this process of digitization of a body of knowledge heritage necessarily involves its own functionally specialized type of corpus* (in our case, an audiovisual corpus):
1) The stage of preparation of a field for collecting audiovisual data can only be conceived of in reference to a pre-existing corpus, or by compiling the knowledge and sources of information necessary to the proper functioning of the field work (knowledge and sources which could cover bibliographical references, online resources, personal information, ‘good practices’, examples of similar projects underway or already carried out, directories, etc.).2
2) The stage of data collection leads to the creation of a field corpus*, or to the updating/enriching of a pre-existing one. The field corpus is not made up solely of data produced within the boundaries of the field. Take the example of the recording of a field as circumscribed as a research seminar whose sessions to be filmed are spread out over a whole academic year. The corpus of data documenting the field “research seminar” is not (necessarily) restricted to the audiovisual recordings of the various sessions. It covers all the data deemed pertinent to give an account of that field (i.e. to make it an archive of knowledge in the true sense of the term), to facilitate a high-quality analysis of such-and-such an aspect of the filmed session, to provide a documentary base for one or more (online) publications of the seminar, or to transform it (as it is, or after a process of selection of documents which must be preserved “absolutely”) into a heritage corpus (see below) documenting, e.g. the history of a discipline or of a research institution.3
3) The stage of technical and authorial processing relies on a selection of collected data forming part of a field corpus, or else of several field corpora, or on data stemming from different periods in the life of a field corpus (a field corpus can be updated, enriched, etc.). In any case, a processing corpus* is composed of data selected, e.g., with a view to being cut together to constitute a new audiovisual creation corresponding to an authorial intention to publish (i.e. to a scenario defining such a creation). Thus, an intention to publish the recordings of a research seminar may be aimed at diffusing a certain problem dealt with during the said seminar. In this case, the intention to publish concerns not the “entire” seminar but only those parts of it in which the chosen problem is dealt with. Yet even when a decision is taken to publish “the entirety” of the seminar, the recordings made during the field phase have to undergo technical processing (encoding, checking of the image and sound quality, deletion of unusable passages, etc.) before the seminar and its various sessions can be published in the form, e.g., of a website. Hence, no matter whether the processing stage is reduced to a “simple” activity of technical processing or whether it also covers a genuine authorial activity per se, the question of the definition and constitution of the processing corpus arises every time. Note in addition that, in the context of digital archives of knowledge, a processing corpus can be fed not only by data from one or more field corpora, but also by data already published and “re-injected”, i.e. reused in the context of a new technical, and above all authorial, treatment. In concrete terms, a corpus documenting a scientific problem which is dealt with in a seminar and which is the object of a montage with a view to publication online may perfectly well include, alongside original data (e.g. from a new field corpus), parts from pre-published contributions.
4) The stage of analyzing the data collected is the one which interests us most, and to which this book is dedicated. For the moment, let us highlight that the analysis of a piece of textual information (or, in our case, audiovisual information) cannot be reduced to a “simple” free indexing, nor to indexing controlled according to this-or-that standard or this-or-that documentary language. The analysis includes all intellectual activities, from documentary indexing to the most personal interpretation, through the various forms of professional assessment of the information, which “use” and “exploit” the object text* to satisfy a need (a desire, or a simple curiosity) for knowledge. Such a need or desire may stem from very variable motivations, and arise in extremely different social and cultural contexts. It remains true that analysis, as an activity to satisfy a need or desire for knowledge, can only be carried out successfully if the right object is available as its primary material: the text* or rather, the corpus of texts. In the context of the constitution and diffusion of a body of cultural/knowledge heritage, the analysis corpus*, i.e. in our case the corpus of audiovisual data being analyzed, is not necessarily coextensive with a field corpus; far from it, in fact. Indeed, everything depends on the goal of the analysis* and, more generally, on the analytical policy* (e.g. in the context of exploitation of the contents of an archive* of knowledge). If the analysis is conceived as an activity of description and classification of data collected beforehand and documenting a particular field with a view, e.g. to their publication online, the field corpus and analysis corpus become similar, although they do not merge. 
If the analysis is conceived independently of the activity of collection, the corpus of audiovisual data needed for the analysis to fulfill its goals, obviously, no longer has anything to do with this-or-that field corpus. The analysis corpus is constructed and enriched solely according to the objectives of the analysis itself. In [STO 11b], two examples are provided of the constitution of an analysis corpus fed by data from different field corpora: the first example is of the analysis of traditional bread-making in France and Portugal [DEP 11d]; the second of the comparativ...