INTRODUCTION
Metadata best practices, encompassing metadata guidelines and application profiles, function as an essential mechanism for metadata planning, application, and management; examination of how such best practices are documented is therefore essential. The body of digital repositories and collections has been growing rapidly, and a wide range of digital projects and initiatives have accordingly adopted various metadata standards. Because the resources they manage differ in format and knowledge domain, these projects and initiatives inevitably have different metadata needs. Therefore, when a metadata standard is adopted across institutions and organizations, it is often modified to reflect community needs and the characteristics of the given resources.
For instance, in the case of the Dublin Core (DC) metadata standard, flexibility is one of the salient characteristics that account for Dublin Core’s popularity. Although the Dublin Core Metadata Initiative offers a set of standard elements and basic guidelines for using DC elements, various metadata best practices and application profiles have been created to serve the special needs of individual digital projects and initiatives. Even when based on the same metadata standard, different metadata best practices and guidelines may select different sets of metadata elements. They may also have different requirements regarding other aspects such as controlled or uncontrolled vocabularies and cardinality/metadata element status (e.g., mandatory, optional, repeatable).
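Such local decisions can themselves be recorded in a machine-readable form. The sketch below is purely illustrative, assuming hypothetical element choices, obligation values, and vocabulary names rather than those of any particular guideline; it shows how a simple profile might capture, for each DC element, its obligation, repeatability, and controlled vocabulary, and how a record could be checked against those rules:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ElementRule:
    """Local usage rule for one Dublin Core element (illustrative only)."""
    element: str                      # DC element name, e.g., "dc:subject"
    obligation: str                   # "mandatory", "recommended", or "optional"
    repeatable: bool                  # whether the element may occur more than once
    vocabulary: Optional[str] = None  # controlled vocabulary, if any (None = free text)

# Hypothetical profile illustrating the kinds of decisions a local
# best-practice document records for each element.
PROFILE = [
    ElementRule("dc:title",   "mandatory",   repeatable=False),
    ElementRule("dc:creator", "recommended", repeatable=True),
    ElementRule("dc:subject", "mandatory",   repeatable=True, vocabulary="LCSH"),
    ElementRule("dc:type",    "mandatory",   repeatable=False, vocabulary="DCMI Type Vocabulary"),
    ElementRule("dc:rights",  "optional",    repeatable=False),
]

def check_record(record: dict) -> list:
    """Return a list of human-readable problems found in a metadata record."""
    problems = []
    for rule in PROFILE:
        values = record.get(rule.element, [])
        if rule.obligation == "mandatory" and not values:
            problems.append(f"{rule.element} is mandatory but missing")
        if not rule.repeatable and len(values) > 1:
            problems.append(f"{rule.element} is not repeatable but has {len(values)} values")
    return problems

if __name__ == "__main__":
    record = {"dc:title": ["Dear Comrade letters"], "dc:subject": []}
    for issue in check_record(record):
        print(issue)
```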
In an ongoing exploratory study, we analyze the variations and commonalities among twenty local metadata best practices and guidelines based on the DC metadata standard, in terms of metadata element status, metadata semantics in relation to the label names and definitions/descriptions of metadata elements, content encoding rules, usage of controlled and uncontrolled vocabularies, and locally added homegrown metadata elements (Park, Tosaka, & Lu, in press). Results of the study show great divergence in the application of the DC metadata standard: each set of guidelines uses different labels, different interpretations of metadata elements, and locally defined homegrown additions and variants to the DC standard. Locally defined homegrown metadata elements have the potential to serve the information needs of users in local environments. The same flexibility, however, may also hinder interoperability and resource discovery across digital repositories (Park et al., in press; see also the study by Han, Cho, Cole, & Jackson in this issue).
Owing to the flexibility and complex structure of natural language, which allow a concept to be represented in various ways, there is a critical need for common understanding and definitions of terms in a given metadata standard (Park, 2006). As briefly discussed, when we compare two or more locally created metadata best practices, issues of interoperability arise. Despite these problems, however, metadata best practices remain fundamental for metadata planning and quality metadata generation.
Heery (2004) points out that local metadata application profiles must be documented and made accessible, at least in human-readable form, in order to encourage the reuse of existing metadata beyond the immediate local environment. The current difficulty of finding information about documentation practices is particularly alarming, because sharing such local best practices and application profiles may in turn increase metadata interoperability and facilitate efficient metadata planning, application, and management, especially for new digital projects and initiatives.
There is a pressing need for systematic examination of documentation practices. Notably, however, there is a lack of studies that address such needs based on empirical examination of existing metadata guidelines, best practices, and application profiles, particularly through comparative analysis.
AIM AND SCOPE OF THE SPECIAL ISSUE
The general aim of this special issue is to address such needs and to present current practices and trends in the creation and implementation of metadata best practices, guidelines, and application profiles among various institutions and organizations. It seeks to outline the major issues, challenges, applications and tools, and future perspectives vis-à-vis documentation practices for resource description and access as well as for metadata interoperability and resource sharing across distributed digital repositories.
The special issue includes discussion of metadata decisions drawn from international surveys, locally defined unique fields, educational metadata standards, semiautomatic metadata generation, and the Semantic Web. Several case studies present documentation practices and the implementation of local best practices for metadata creation, quality control, and management. Metadata best practices in relation to collaboration with cross-institutional repositories are also discussed, as are best practices in relation to metadata standards (e.g., Dublin Core, Text Encoding Initiative (TEI), Metadata Encoding & Transmission Standard (METS), Metadata Object Description Schema (MODS)) and resource types including electronic theses and dissertations, electronic texts, and maps. Below is a brief introduction to the contributed studies.
METADATA BEST PRACTICES: ISSUES AND CHALLENGES
In “Metadata Decisions for Digital Libraries: A Survey Report,” Marcia Lei Zeng, Jaesun Lee, and Allene F. Hayes report on the results of an IFLA-initiated survey drawn from over 400 responses covering 49 countries. The aim of the survey is to identify major issues and concerns in the design and planning of digital projects, element set standards, data contents, authority files and controlled vocabularies, and metadata encoding. The survey results will be addressed in the IFLA Guidelines for Digital Libraries, set to be released in 2010. A workflow chart drawn from the survey results is presented in the paper. The authors emphasize the importance of recognizing the ultimate aim of creating metadata element sets, content standards, and value-encoding schemes: namely, the generation of high-quality metadata.
Jane Greenberg, Hollie C. White, Sarah Carrier, and Ryan Scherle, the authors of “A Metadata Best Practice for a Scientific Data Repository,” report on the metadata best practices of the Dryad Repository, designed for the preservation, access, and reuse of data underlying scientific research publications. Dryad’s efforts address the two prongs of its metadata approach: (1) the immediate need for content access in DSpace via an Extensible Markup Language (XML) schema and (2) the long-term goal of aligning with the Semantic Web through a Dublin Core–based metadata application profile. Dryad’s metadata functional requirements are based on simplicity, interoperability, and Semantic Web alignment, and machine-processable metadata are provided in the repository. The authors address the challenges of the project, including communication issues among diverse project members, the lack of metadata registries that fully support Semantic Web agents, and limited resource description framework (RDF) data.
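To make the Semantic Web alignment discussed above concrete, the sketch below shows how Dublin Core–based descriptions can be expressed as RDF. It is an illustration only, not Dryad’s actual application profile; the identifier and property values are hypothetical, and the third-party rdflib package is assumed.

```python
# A minimal sketch of expressing Dublin Core-based metadata as RDF,
# the kind of machine-processable, Semantic Web-aligned representation
# discussed above. Requires rdflib (pip install rdflib).
from rdflib import Graph, Literal, URIRef
from rdflib.namespace import DCTERMS

g = Graph()
g.bind("dcterms", DCTERMS)

# Hypothetical identifier and values used purely for illustration.
data_package = URIRef("http://example.org/repository/package/123")
g.add((data_package, DCTERMS.title, Literal("Example data package")))
g.add((data_package, DCTERMS.creator, Literal("Doe, Jane")))
g.add((data_package, DCTERMS.subject, Literal("population genetics")))
g.add((data_package, DCTERMS.relation, URIRef("https://example.org/article/456")))

# Serialize the description as Turtle for inspection.
print(g.serialize(format="turtle"))
```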
The study entitled “Metadata for Special Collections in CONTENTdm: How to Improve Interoperability of Unique Fields through OAI-PMH” by Myung-Ja Han, Christine Cho, Timothy W. Cole, and Amy S. Jackson examines locally defined unique fields created across 21 CONTENTdm-based collections utilizing the Dublin Core metadata standard. The researchers found that, of a total of 491 fields, 171 are locally defined unique fields that are not in simple or qualified Dublin Core. The unique fields are categorized into descriptive, administrative, and technical metadata. The study shows that 107 (84.3%) of the 127 unique descriptive metadata fields can be mapped onto pertinent DC metadata elements. This indicates that DC metadata semantics affect the correct application of the DC standard in local contexts. It also makes clear that collection curators often misunderstand the definitions of certain Dublin Core elements, notably <type>, <format>, <source>, <relation>, and <identifier> (see also Park & Childress, 2009). Insightful strategies for increasing the interoperability of metadata beyond local collections are discussed.
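The mapping work the study describes amounts to a crosswalk from local field names to Dublin Core before records are exposed through OAI-PMH. The following minimal sketch assumes hypothetical local field names and mappings, not the fields analyzed in the study:

```python
# Illustrative crosswalk: mapping locally defined CONTENTdm-style field
# names onto simple Dublin Core. Field names and targets are hypothetical.
LOCAL_TO_DC = {
    "Photographer":         "dc:creator",
    "Digitization Date":    "dc:date",
    "Original Dimensions":  "dc:format",
    "Collection Name":      "dc:relation",
    "Ordering Information": None,  # administrative field with no DC equivalent
}

def crosswalk(local_record: dict) -> dict:
    """Map a local record {field: [values]} onto simple Dublin Core."""
    dc_record = {}
    for field, values in local_record.items():
        target = LOCAL_TO_DC.get(field)
        if target is None:
            continue  # unmapped or administrative-only fields are dropped
        dc_record.setdefault(target, []).extend(values)
    return dc_record

if __name__ == "__main__":
    record = {"Photographer": ["J. Doe"], "Ordering Information": ["Contact library"]}
    print(crosswalk(record))  # {'dc:creator': ['J. Doe']}
```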
In “Implications and Challenges of Educational Standards Metadata,” Anne R. Diekema discusses the practice of standards-based education in public elementary and secondary schools. The author discusses the difficulties educators face in addressing standards in teaching, relating these issues to the representation of educational standards information in resource metadata. Issues and challenges drawn from educational standards metadata are discussed in depth. The study also reports on the semiautomatic metadata generation tool CAT (content assignment tool) developed at Syracuse University. Using natural language processing and machine-learning techniques, CAT assists catalogers and teachers by suggesting relevant standards for the educational resource being cataloged. The author addresses unresolved issues in semiautomatic metadata assignment and identifies the principal cause of such difficulties: conceptual and vocabulary mismatches between standards and educational resources.
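As a rough illustration of how such suggestion tools can work, the sketch below matches a resource description against standards statements by text similarity. It is a toy example, not the CAT implementation; the standards, resource text, and use of scikit-learn’s TF-IDF vectorizer are assumptions made purely for illustration:

```python
# Toy sketch of suggesting educational standards for a resource by text
# similarity (not the actual CAT tool). Requires scikit-learn.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical standards statements.
standards = {
    "MATH.3.1": "Add and subtract fractions with like denominators.",
    "MATH.3.2": "Represent and interpret data in bar graphs.",
    "SCI.4.1":  "Describe the water cycle, including evaporation and condensation.",
}

resource_text = "Worksheet: students practice adding fractions that share a denominator."

# Place the standards and the resource in one TF-IDF vector space.
vectorizer = TfidfVectorizer(stop_words="english")
matrix = vectorizer.fit_transform(list(standards.values()) + [resource_text])

# Compare the resource (last row) against every standard statement.
scores = cosine_similarity(matrix[-1], matrix[:-1]).flatten()
ranked = sorted(zip(standards.keys(), scores), key=lambda pair: pair[1], reverse=True)

for code, score in ranked:
    print(f"{code}: {score:.2f}")
```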
DOCUMENTATION AND IMPLEMENTATION OF METADATA BEST PRACTICES
The four case studies, based on institutional digital repositories, discuss experiences and lessons drawn from the implementation of locally developed metadata best practices and guidelines. These experiences are invaluable in that they may offer insights and efficient mechanisms for metadata planning and for the reuse of best practices in new digital initiatives.
Rebecca L. Lubas, the author of “Defining Best Practices in Electronic Thesis and Dissertation Metadata,” reports on a case study of electronic theses and dissertations (ETD) deposit using the open source DSpace platform at the University of New Mexico. After reviewing current practices for thesis and dissertation metadata creation, the study presents recommendations for enhancing author-submitted metadata and for metadata quality control and enhancement, together with crosswalking the metadata to the library’s catalog and training metadata practitioners.
In “Implementing TEI Projects and Accompanying Metadata for Small Libraries: Rationale and Best Practices,” Richard Wisneski and Virginia Dressler present the electronic text encoding project at Case Western Reserve University. The study presents documentation practices covering workflow, TEI header metadata, structural markup, the creation of metadata records such as MODS, METS, and Dublin Core, and techniques for analyzing policies and procedures. It argues that electronic text encoding projects provide enhanced access to the collection and a long-term solution to preservation, and it demonstrates the feasibility of implementing such projects, even in small institutions, through proper planning, training, and collaborative work.
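For readers unfamiliar with TEI header metadata, the sketch below builds a minimal header using only the Python standard library. It is illustrative only, not the Case Western Reserve documentation; the title, publisher, and source note are hypothetical placeholders.

```python
# Build a minimal TEI header (teiHeader > fileDesc) with the standard library.
# All values are hypothetical placeholders for illustration.
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
ET.register_namespace("", TEI_NS)

header = ET.Element(f"{{{TEI_NS}}}teiHeader")
file_desc = ET.SubElement(header, f"{{{TEI_NS}}}fileDesc")

# Bibliographic description of the electronic text itself.
title_stmt = ET.SubElement(file_desc, f"{{{TEI_NS}}}titleStmt")
ET.SubElement(title_stmt, f"{{{TEI_NS}}}title").text = "Example Electronic Text"

pub_stmt = ET.SubElement(file_desc, f"{{{TEI_NS}}}publicationStmt")
ET.SubElement(pub_stmt, f"{{{TEI_NS}}}publisher").text = "Example University Library"

# Description of the source from which the electronic text was derived.
source_desc = ET.SubElement(file_desc, f"{{{TEI_NS}}}sourceDesc")
ET.SubElement(source_desc, f"{{{TEI_NS}}}bibl").text = "Digitized from the print original."

ET.indent(header)
print(ET.tostring(header, encoding="unicode"))
```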
The study, “Providing Metadata for Compound Digital Objects: Strategic Planning for an Institution’s First Use of METS, MODS, and MIX,” by Michael Dulock and Christopher Cronin presents the implementation processes of METS, MODS, and the NISO Metadata for Images in XML Schema (MIX) for a collection of digitized Sanborn fire insurance maps at the University of Colorado at Boulder. The study illustrates lessons learned through the implementation of these metadata standards. It offers an in-depth discussion on the decision-making process and insightful strategies regarding implementation of the new metadata structures.
In “Documenting Local Procedures: The Development of Standard Digitization Processes through the Dear Comrade Project,” Emily Symonds and Cinda May present a digital project involving the Eugene V. Debs correspondence collection using the digital collection-management software CONTENTdm at Cunningham Memorial Library at Indiana State University. The project is part of the Wabash Valley Visions & Voices Digital Memory Project built upon cross-institutional collaboration and partnerships. The authors present local practices and workflow for the project, including the usage of a metadata template for consistent metadata creation across multiple organizations and collections. Lessons drawn from the project, applicable to other similar projects, are presented.
FUTURE DIRECTIONS
In summary, the rapidly growing body of digital repositories calls for further investigation of documentation practices. The growing number of metadata best practices, guidelines, and application profiles demands novel approaches and techniques for extracting, analyzing, and comparing these locally developed documents. Such future endeavors may bring about a better understanding of the core and emergent semantics of metadata best practices, which may in turn contribute to the development of mechanisms for sharable and interoperable metadata. Further work also lies in integrating best practices with semiautomatic metadata generation applications and tools. Approaches and techniques for converting best practices into machine-processable formats, and for the formalization of metadata best practices more generally, are pressing areas for future study.
Jung-ran Park
Editor
REFERENCES
Heery, R. (2004). Metadata futures: Steps toward semantic interoperability. In D. I. Hillmann & E. L. Westbrooks (Eds.), Metadata in practice (pp. 257–271). Chicago: ALA Editions.
Park, J. (2006). Semantic interoperability and metadata quality: An analysis of metadata item records of digital image collections. Knowledge Organization, 33(1): 20–34.
Park, J., & Childress, E. (2009). Dublin Core metadata semantics: An analysis of the perspectives of information professionals. Journal of Information Science, 35(6): 727–739.
Park, J., Tosaka, Y., & Lu, C. (in press). Locally added home-grown metadata semantics: Issues and implications. Proceedings of the 11th International Society for Knowledge Organization (ISKO) Conference, February 23–26, 2010, Rome, Italy.
Metadata Decisions for Digital Libraries: A Survey Report
MARCIA LEI ZENG
School of Library and Information Science, Kent State University, Kent, Ohio, USA
JAESUN LEE
Korea Research Institute for Library and Information, The National Library of Korea, Seoul, Republic of Korea
ALLENE F. HAYES
Acquisitions and Bibliographic Access Directorate, Library of Congress, Washington, DC, USA
A survey on metadata conducted at the end of 2007 received over 400 responses from 49 countries around the world. It helped the authors identify major issues and concerns regarding metadata that should be addressed in the IFLA Guidelines for Digital Libraries. The questionnaire included a question about the roles respondents play and five questions about the major concerns in any metadata-related project: the design and planning of digital projects, element set standards, data contents in a record, authority files and controlled vocabularies, and metadata encoding. Findings from the survey are reported, and a workflow chart is included in this paper.
BACKGROUND
In June 2005, the Librarian of Congress, James H. Billington, presented a proposal to UNESCO (United Nations Educational, Scientific and Cultural Organization) to establish a World Digital Library (WDL). The objectives of the World Digital Library are to promote international and intercultural understanding and awareness, provide resources to educators, expand non-English and non-Western content on the Internet, and contribute to scholarly research. UNESCO and the Library of Congress co-sponsored an experts meeting in December 2006 with key stakeholders from all regions of the world. That meeting result...