Semantic Multimedia Analysis and Processing
  1. 555 pages
  2. English

About this book

Broad in scope, Semantic Multimedia Analysis and Processing provides a complete reference of techniques, algorithms, and solutions for the design and the implementation of contemporary multimedia systems. Offering a balanced, global look at the latest advances in semantic indexing, retrieval, analysis, and processing of multimedia, the book features the contributions of renowned researchers from around the world. Its contents are based on four fundamental thematic pillars: 1) information and content retrieval, 2) semantic knowledge exploitation paradigms, 3) multimedia personalization, and 4) human-computer affective multimedia interaction. Its 15 chapters cover key topics such as content creation, annotation and modeling for the semantic web, multimedia content understanding, and efficiency and scalability.

Fostering a deeper understanding of a popular area of research, the text:

  • Describes state-of-the-art schemes and applications
  • Supplies authoritative guidance on research and deployment issues
  • Presents novel methods and applications in an informative and reproducible way
  • Contains numerous examples, illustrations, and tables summarizing results from quantitative studies
  • Considers ongoing trends and designates future challenges and research perspectives
  • Includes bibliographic links for further exploration
  • Uses both SI and US units

Ideal for engineers and scientists specializing in the design of multimedia systems, software applications, and image/video analysis and processing technologies, Semantic Multimedia Analysis and Processing aids researchers, practitioners, and developers in finding innovative solutions to existing problems, opening up new avenues of research in uncharted waters.

Part I

Information and Content Retrieval

1

Image Retrieval Using Keywords: The Machine Learning Perspective

Zenonas Theodosiou
Cyprus University of Technology
Nicolas Tsapatsoulis
Cyprus University of Technology

Contents

  1.1 Introduction
  1.2 Background
    1.2.1 Key Issues in Automatic Image Annotation
  1.3 Low-Level Feature Extraction
    1.3.1 Local Features
    1.3.2 Global or Holistic Features
    1.3.3 Feature Fusion
  1.4 Visual Models Creation
    1.4.1 Dataset Creation
    1.4.2 Learning Approaches
  1.5 Performance Evaluation
  1.6 A Study on Creating Visual Models
    1.6.1 Feature Extraction
    1.6.2 Keywords Modeling
    1.6.3 Experimental Results
  1.7 Conclusion
In recent years, much effort has been devoted to automatic image annotation in order to exploit the advantages of both text-based and content-based image retrieval methods and mitigate their drawbacks, with the ultimate goal of allowing content-based keyword searching. This chapter focuses on image retrieval using keywords from the perspective of machine learning, covering different aspects of the current research in this area. These include low-level feature extraction, creation of training sets and development of machine learning methodologies. Moreover, it presents the evaluation framework of automatic image annotation and discusses various methods and metrics utilized within it. Finally, it proposes the idea of addressing automatic image annotation by creating visual models, one for each available keyword, and presents an example of the proposed idea by comparing different features and machine learning algorithms in creating visual models for keywords referring to the athletics domain.

1.1 Introduction

Given the rapid growth in the number of available digital images, image retrieval has attracted considerable research interest over the last decades. Image retrieval research efforts fall into content-based and text-based methods. Content-based methods take a given example image as a starting point and retrieve images by analyzing and comparing their content. Text-based methods are similar to document retrieval, and retrieve images using keywords. The latter is the preferred approach for both ordinary users and search engine engineers. Most users are familiar with text-based queries, whereas content-based image retrieval lacks semantic meaning. Furthermore, the example images that must be given as a query are rarely available. From the search engine perspective, text-based image retrieval methods benefit from well-established techniques for document indexing, and they are integrated into a unified document retrieval framework. However, for text-based image retrieval to be feasible, images must somehow be associated with specific keywords or a textual description. In contemporary search engines this kind of textual description is usually obtained from the web page, or the document, containing the corresponding images and includes HTML alternative text, the file names of the images, captions, surrounding text, metadata tags or the keywords of the whole web page [588]. Besides not being directly related to the content of the images, this type of information can be utilized only in web-page image retrieval. As a result, image retrieval from dedicated image collections can be done either by content-based methods or by explicitly annotating images with tags to allow text-based search. The latter process is collectively known as “image annotation” or “image tagging.”
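Text-based retrieval over annotated images, as described above, can be illustrated with a small inverted index of the kind used in document retrieval. This is only a minimal sketch: the function names `build_index` and `search`, the file names, and the tags are all hypothetical and invented for illustration, and a real search engine would add ranking, stemming, and much more.

```python
from collections import defaultdict


def build_index(annotations):
    """Map each keyword to the set of image ids annotated with it."""
    index = defaultdict(set)
    for image_id, keywords in annotations.items():
        for kw in keywords:
            index[kw.lower()].add(image_id)
    return index


def search(index, query):
    """Return the ids of images annotated with every keyword in the query."""
    terms = [t.lower() for t in query.split()]
    if not terms:
        return set()
    # Start from the first term's postings and intersect with the rest.
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Hypothetical annotations (image id -> tags), invented for illustration.
annotations = {
    "img_001.jpg": ["running", "track", "athlete"],
    "img_002.jpg": ["high", "jump", "athlete"],
    "img_003.jpg": ["running", "crowd"],
}
index = build_index(annotations)
print(sorted(search(index, "running athlete")))  # ['img_001.jpg']
```

The sketch makes the dependency on annotation explicit: an image that carries no tags can never be returned, however relevant its visual content may be.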
Image annotation can be achieved using various approaches, such as free text descriptions or keywords chosen from controlled vocabularies. Nevertheless, the annotation process remains a significant difficulty in image retrieval, since manual annotation seems to be the only way of guaranteeing success. This partly explains why content-based image retrieval is still considered an option for accessing the enormous amount of digital images. Despite the plethora of available tools, manual annotation is an extremely difficult and laborious task, since keywords are assigned image by image. Furthermore, manual annotations cannot always be considered correct, because visual information always leaves room for contradictory interpretations and ambiguity [395].
In recent years, much effort has been devoted to automatic image annotation in order to exploit the advantages of both text-based and content-based image retrieval methods and mitigate the drawbacks mentioned above. In any case, the ultimate goal is to allow keyword searching based on the image content [1035]. Thus, automatic image annotation efforts try to mimic humans by associating the visual features that describe the image content with semantic labels.
This chapter focuses on image retrieval using keywords from the perspective of machine learning. It covers different aspects of the current research in this area, including low-level feature extraction, creation of training sets and development of machine learning methodologies. It also presents the evaluation framework of automatic image annotation and discusses various methods and metrics utilized within it. Furthermore, it proposes the idea of addressing automatic image annotation by creating visual models, one for each available keyword, and presents an example of the proposed idea by comparing different features and machine learning algorithms in creating visual models for keywords referring to the athletics domain.
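The idea of one visual model per keyword can be sketched as a set of independent binary classifiers, each deciding whether its keyword applies to an image. The sketch below assumes a nearest-centroid rule purely as a stand-in; it is not the learning algorithm evaluated in the chapter, and the two-dimensional feature vectors, the keywords, and the names `train_keyword_models` and `annotate` are hypothetical.

```python
import math


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]


def train_keyword_models(features, labels, vocabulary):
    """Build one binary model per keyword: (positive centroid, negative centroid)."""
    models = {}
    for kw in vocabulary:
        pos = [f for f, tags in zip(features, labels) if kw in tags]
        neg = [f for f, tags in zip(features, labels) if kw not in tags]
        if pos and neg:  # a keyword needs both kinds of examples
            models[kw] = (centroid(pos), centroid(neg))
    return models


def annotate(models, feature):
    """Assign every keyword whose positive centroid is the closer one."""
    return sorted(
        kw
        for kw, (pos_c, neg_c) in models.items()
        if math.dist(feature, pos_c) < math.dist(feature, neg_c)
    )


# Toy feature vectors and tags, invented for illustration only.
features = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
labels = [{"track"}, {"track"}, {"pool"}, {"pool"}]
models = train_keyword_models(features, labels, ["track", "pool"])
print(annotate(models, [0.7, 0.3]))  # ['track']
```

Because each keyword has its own model, a new keyword can be added by training one more classifier without retraining the others, and an image may receive several keywords or none.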

1.2 Background

Automatic image annotation has been a topic of ongoing research for more than a decade. Several interesting techniques have been proposed during this period [641]. Although it is a particularly complex problem, and although automatically obtained annotation is not expected to reach the level of detail of annotation produced by humans, it remains a hot research topic. The reason is obvious: manual annotation of the enormous number of images created and uploaded to the web every day is not only impractical; it is simply impossible. Therefore, automatic assignment of keywords to images for retrieval purposes is highly desirable. The proposed methods have attempted to address, first, the difficulty of relating high-level human interpretations to low-level visual features and, second, the lack of correspondence between the keywords and image regions in the (training) data.
Traditionally in content-based image retrieval, images are represented and retrieved using low-level features such as color, texture and shape regions. Similarly, in automatic image annotation, a manually annotated set of data is used to train a system to identify the joint or conditional probability of an annotation together with a certain distribution of feature vectors corresponding to image content [83]. Different models and machine learning techniques have been developed to learn the correlation between image features and textual words from examples of annotated images. The learned models of this correlation are then applied to predict keywords for previously unseen images [1040]. Although the low-level features extracted from an image cannot be automatically and reliably translated into high-level semantics [263], the selection of visual features that better describe the content of an image is an essential step in automatic image annotation. The interpretation inconsistency between image descriptors and high-level semantics is known as the “semantic gap” [856] or “perceptual gap” [457]. Recent research focuses on new low-level feature extraction algorithms to bridge the gap between the simplicity of available visual features and...
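Learning a conditional probability of keywords given low-level features from annotated examples can be sketched as follows. The sketch assumes a coarse RGB color histogram as the only feature and reduces each image to its dominant histogram bin before counting co-occurrences; both choices, the helper names, and the pixel data are hypothetical simplifications made for illustration and are far cruder than the features and models the literature uses.

```python
from collections import Counter, defaultdict


def color_histogram(pixels, bins_per_channel=2):
    """Normalized joint RGB histogram of (r, g, b) tuples with values in 0..255."""
    step = 256 / bins_per_channel
    hist = [0.0] * bins_per_channel ** 3
    for r, g, b in pixels:
        # Quantize each channel, clamping to the last bin for safety.
        qr, qg, qb = (min(int(c / step), bins_per_channel - 1) for c in (r, g, b))
        hist[(qr * bins_per_channel + qg) * bins_per_channel + qb] += 1.0
    return [h / len(pixels) for h in hist]


def dominant_bin(hist):
    """Index of the most populated histogram bin."""
    return max(range(len(hist)), key=hist.__getitem__)


def estimate_conditionals(training_set):
    """Estimate P(keyword | dominant color bin) by relative frequency."""
    bin_counts = Counter()
    joint_counts = defaultdict(Counter)
    for pixels, keywords in training_set:
        b = dominant_bin(color_histogram(pixels))
        bin_counts[b] += 1
        for kw in keywords:
            joint_counts[b][kw] += 1
    return {
        b: {kw: c / bin_counts[b] for kw, c in kws.items()}
        for b, kws in joint_counts.items()
    }


# Tiny synthetic "images" (lists of pixels) with manual tags, invented here.
red = [(250, 10, 10)] * 4
green = [(10, 200, 10)] * 4
training_set = [(red, ["track"]), (red, ["track", "athlete"]), (green, ["field"])]
cond = estimate_conditionals(training_set)

# Predict keywords for an unseen, mostly red image.
unseen_bin = dominant_bin(color_histogram([(240, 30, 20)] * 3))
print(cond.get(unseen_bin, {}))
```

The same toy setup also hints at the gap discussed above: a dominant color says very little about whether an image actually shows a track, so predictions from such simple features are only as good as the correlation in the training data.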

Table of contents

  1. Cover Page
  2. Title Page
  3. Copyright Page
  4. Table of Contents
  5. Digital Imaging and Computer Vision Series
  6. List of Figures
  7. List of Tables
  8. Preface
  9. The Editors
  10. Contributors
  11. PART I Information and Content Retrieval
  12. PART II Semantic Knowledge Exploitation and Applications
  13. PART III Multimedia Personalization
  14. PART IV Human–Computer Affective Multimedia Interactions
  15. Bibliography
  16. Index

Semantic Multimedia Analysis and Processing by Evaggelos Spyrou, Dimitris Iakovidis, and Phivos Mylonas is available in PDF and/or ePUB format, in the Computer Science & Computer Graphics category.