INFORMATION ORGANIZATION IN KNOWLEDGE RESOURCES
Knowledge Organization from Libraries to the Web: Strong Demands on the Weakest Side of International Librarianship
Maria Inês Cordeiro
SUMMARY. This paper reflects on some major aspects of library subject access systems in the era of networked information. The main argument builds on the fact that we now witness the strongest demands and expectations placed on subject access tools, coming from far beyond the traditional library world, while the field remains the weakest side of international librarianship. While the urgency of coping with the practical challenges of a wider environment is emphasized, the need to reinforce the internationalization of knowledge organization as a professional library matter is stressed, not only at the pragmatic level but also, more importantly, in theoretical terms.
[Article copies available for a fee from The Haworth Document Delivery Service: 1-800-HAWORTH. Website: <http://www.HaworthPress.com> © 2003 by The Haworth Press, Inc. All rights reserved.]
KEYWORDS. Knowledge organization, networked information retrieval, subject access, subject indexing, subject heading languages
INTRODUCTION
Knowledge organization has always been recognized as one of the library's primary functions, and libraries have been the major providers of such a service. The profound changes introduced by the Internet have increased the concerns around knowledge and knowledge organization, diversifying the perspectives on information management. The field has become more complex and less of a unique and clearly identified professional field. Quite suddenly, knowledge organization finds itself at the multiple crossroads of the overcrowded and distributed information landscape brought about by the WWW. In this new context, neither good models for finding one's way along the information routes nor efficient means of embarking on safe, purposeful and successful information retrieval journeys have yet appeared.
The complexity in devising retrieval solutions for a growing diversity of scattered and independently managed information resources can explain why subject access has become central in the network environment. Being more resource-independent and at a higher level when compared to other aspects of resource description and identification, subject access appears as a more powerful common denominator, potentially capable of alleviating or bypassing the countless other heterogeneities found in many other aspects of information resource management. Yet such a common denominator is among the most complex to define, express, manage and use, as it is the least bound to objective characteristics and the most exposed to all sorts of contextual factors.
Subject access has increased in importance along with an expanding range of information discovery and retrieval services, among which professional library services are only a small part, though apparently one with an augmented and special role. This situation brings as many opportunities as challenges, and the current context and circumstances of library knowledge organization (KO) tools should be regarded not only in the light of novel contexts but also from the perspective of their intrinsic weaknesses and strengths, concerning either their original targets or the expanded environment they currently face. One aspect that deserves particular attention is how consolidated the subject matter is on the professional scene, especially at the international level.
THE EXPANDED ENVIRONMENT OF KNOWLEDGE ORGANIZATION
The global network is the transforming reality of the information environment. First of all, it is about distributed information resources and distributed computing. Distributed computing has elicited the trend towards higher levels of abstraction and more formalized ontological support for information systems. The concept of ontology, so close to the library concept of knowledge organization (the space where classification systems, thesauri and other controlled vocabularies for information retrieval have been produced), arose in the information systems field with little relation to the experience held in library and information science.1 Ontologies "emerged from academic obscurity into mainstream business and practice on the Web,"2 expanding from the research level of knowledge representation to applied fields such as knowledge bases, new methods of software engineering,3 or information brokering based on metadata for knowledge domains. Simultaneously, interoperability became an ever more crucial aspect of information systems, and evolved by concentrating the core issues in the semantic axis of information structures and services. Such evolution from simple syntax and structure to semantics is well illustrated by the Semantic Web movement4 and all the developments around it.
All of these trends are leading to changes with impacts wider than the field of information retrieval (IR), normally the one where knowledge organization issues have been discussed. They convey changes in systems paradigms, raising new base concepts, such as the concept of "composability," in which components tend to be system independent, adaptable, extendable and reusable. KO tools are only one among the various kinds of information components envisaged by distributed, interoperable systems. But they are of a fundamental kind, because the use of common, shared or otherwise imparted formal languages (i.e., controlled vocabularies) is of utmost importance to convey explicit and shareable knowledge representations in the distributed environment. Yet most of them have not been developed, or modified, with practical solutions for wide shareability, adaptability, extensibility and reusability in mind. This is generally the case for the data structure and content of library authority files and for the solutions to manage and convey KO tools.5
Besides structure and content, the conceptual basis of KO tools is often not explicit or well documented, because they have been mostly developed in a pragmatic way and for the needs of a confined reality: a library information system or a community of libraries. In this respect, what can be learned from the above trends in technology is the need for higher levels of abstraction in such tools, providing key common conceptualisations, in order to support semantic interoperability solutions that require cross-domain and interdisciplinary expertise. This is as important nowadays as, in the words of Doerr, "the problems computer scientists and system implementers have to comprehend the logic of cultural concepts seems to be equally notorious as the inability of the cultural professionals to communicate those to computer scientists."6 All these are novel or currently highlighted aspects that make the future of KO tools more complex and demanding.
SUBJECT ACCESS IN A TURMOIL: WIDER TARGETS, MORE COMPLEXITY, UNCLEAR DIRECTIONS
Now in company with more diversified information service providers, libraries populate the new and turbulent network landscape with a difficult threefold purpose: to maintain and improve the quality of the subject access service concerning the kind of resources they have long been focused on; to extend and develop the same level of service to every new kind of information resource; and to provide support for subject access interoperability across different domains and services, encompassing not only the distributed library world, but also non-library information stakeholders, service providers and users. Libraries cannot, and do not, ignore these drivers of change, but future directions are far from clear. The field of library KO is going through a transitional state that is composed simultaneously of uncertainty and of expectation: uncertainty about the future of library subject indexing practices, both in terms of sustainability and theoretical approaches (a situation that the profession has felt since the '80s without major evolution), and expectation about the role and contribution of traditional library KO tools as ingredients for IR solutions for newly emerged network services.
On the one hand, the central challenge in extending bibliographic control to WWW content is mostly a matter of scale, requiring feasible and sustainable solutions. Therefore, for the sake of efficacy, a useful balance between scale and practicability should be reflected in the adoption of technical solutions. This balance is managerial by nature and it is not new in terms of subject access management, as we can read in Fugmann.7 But the fact is that significant and visible experience in true managerial approaches cannot be drawn from the practical field of subject access in libraries, while scale and practicability have been better addressed by other IR service providers, relying more on technological solutions than on intellectually based mechanisms, as is notably the case of Web search engines.
On the other hand, we cannot say that the field has renewed or reinforced its theoretical basis. Amid the pragmatic approaches, the continuation of long-standing practices, and the recurrent technical discussions around the same kinds of questions and issues for almost twenty years, the profession has been breathing a sense of failure,8 of more of the same,9 of obsolete philosophy,10 or even of being a neglected and poorly understood field.11
While this is not easily perceived from outside the profession, feasible and sustainable solutions for WWW coverage have been mostly based on pure technological devices with a very low level of semantic processing. Misunderstandings about their virtues and limitations, and also about what knowledge organization is really about (namely the difference between indexing and structuring12), are major aspects in need of clarification in the still naïve stage of the WWW. Naïveté about subject retrieval in the networked environment can appear from different sides: from many of the producers of Web search services (notably subject directories), arguably more intuitive than traditional KO tools but ignoring basic qualities of knowledge organization schemes,13 to traditionalistic approaches where the solution appears to lie in simply transferring existing practices and tools to the networked environment. In between these, the need for further research comes into view, either because of indications about different user behaviour in the hypertext environment14 or because traditional tools may not be scalable and widely applicable enough.15 Besides, as pointed out by Jansen and Pooch,16 simple comparisons between traditional library services and the performance of Web search services can be misleading and of little use because the Web is a new search environment requiring its own metrics and methodologies, independent from traditional information retrieval and OPACs.
From this turmoil of different targets and understandings, the clearest idea shared by all parties interested in these matters is that library subject access tools are needed and useful beyond the traditional scope of library services. While this idea is positive and stimulating, it is less clear how such professional tools can be directly re-used by other professional communities with different professional backgrounds and concerns, or even by non-professionals, as is the case of anyone publishing resources with subject access metadata taken from a given classification, subject heading list or thesaurus using, for example, the Dublin Core element set. Without further analysis, it may seem that creating subject access is only about KO tools, that both the tools and the results of their application are intended for the same audiences and, therefore, that they are good enough if they can be understood and used by anyone. The matter is not so simple, for several reasons: first, because library KO tools are not only vocabularies, and they have been prepared to be used mostly by library and information professionals, but the results of their application have had other users in mind; second, because subject IR products and services that appear simple to use can often hide a degree of complexity which is proportionate to the sophistication needed to support the functionalities they provide, and this is not evident from the user side.
Reflecting on these aspects (and this reflection can hardly come from outside the library profession) is important in order to overcome misconceptions and to help devise solutions to extend the library KO tools' applicability, without necessarily having to downgrade them in their application in library services. Besides, these aspects are only little branches stemming from another main trunk of the matter, right in the heart of the profession: despite its core importance, subject access remains the weakest side of international librarianship. Recognizing this is important in order to clearly discern the place and state of library subject access systems in the face of the current challenging environment, and such a reality should not be clouded by the many changes, new pressures and different understandings that come with it.
LIBRARY SUBJECT ACCESS: A BRIEF INTROSPECTION
The knowledge organization function in libraries is built upon formalized information structures (subject indexing languages as surrogates of natural language) aimed at some kind of unification regarding both the conceptual reference framework in which concepts/subjects can be understood and the diversity of their possible forms of representation and communication. In this perspective the library function is primarily one of communication, but implying at its core a large spectrum of issues related to conceptual structures and language. Subject indexing languages encompass both these fields and are assumed as abstractions and reductions aiming at producing knowledge repres...