Object Detection and Recognition in Digital Images
eBook - ePub

Object Detection and Recognition in Digital Images

Theory and Practice

  1. English
  2. ePUB (mobile friendly)
  3. Available on iOS & Android
eBook - ePub

Object Detection and Recognition in Digital Images

Theory and Practice

About this book

Object detection, tracking and recognition in images are key problems in computer vision. This book provides the reader with a balanced treatment between the theory and practice of selected methods in these areas to make the book accessible to a range of researchers, engineers, developers and postgraduate students working in computer vision and related fields.

Key features:

  • Explains the main theoretical ideas behind each method (which are augmented with a rigorous mathematical derivation of the formulas), their implementation (in C++) and demonstrated working in real applications.
  • Places an emphasis on tensor and statistical based approaches within object detection and recognition.
  • Provides an overview of image clustering and classification methods which includes subspace and kernel based processing, mean shift and Kalman filter, neural networks, and k-means methods.
  • Contains numerous case study examples of mainly automotive applications.
  • Includes a companion website hosting full C++ implementation, of topics presented in the book as a software library, and an accompanying manual to the software platform.

Trusted by 375,005 students

Access to over 1.5 million titles for a fair monthly price.

Study more efficiently using our study tools.

Information

Publisher
Wiley
Year
2013
Print ISBN
9780470976371
Edition
1
eBook ISBN
9781118618363
1
Introduction
Look in, let not either the proper quality, or the true worth of anything pass thee, before thou hast fully apprehended it.
—MARCUS AURELIUS Meditations, 170–180 AD (Translated by Meric Casaubon, 1634)
This book presents selected object detection and recognition methods in computer vision, joining theory, implementation as well as applications. The majority of the selected methods were used in real automotive vision systems. However, two groups of methods were distinguished. The first group contains methods which are based on tensors, which in the last decade have opened new frontiers in image processing and pattern analysis. The second group of methods builds on mathematical statistics. In many cases, object detection and recognition methods draw from these two groups. As indicated in the title, equally important is the explanation of the main concepts of the methods and presentation of their mathematical derivations, as their implementations and usage in real applications. Although object detection and recognition are strictly connected, to some extent both domains can be seen as pattern classification and frequently detection precedes recognition, we make a distinction between the two. Object detection in our definition mostly concerns answering a question about whether a given type of object is present in images. Sometimes, their current appearance and position are also important. On the other hand, the goal of object recognition is to tell its particular type. For instance, we can detect a face, or after that identify a concrete person. Similarly, in the road sign recognition system for some signs, their detection unanimously reveals their category, such as “Yield.” However, for the majority of them, we first detect their characteristic shapes, then we identify their particular type, such as “40km/h speed limit, ” and so forth.
Detection and recognition of objects in the observed scenes is a natural biological ability. People and animals perform this effortlessly in daily life to move without collisions, to find food, avoid threats, and so on. However, similar computer methods and algorithms for scene analysis are not so straightforward, despite their unprecedented development. Nevertheless, biological systems after close observations and analysis provide some hints for their machine realizations. A good example here are artificial neural networks which in their diversity resemble biological systems of neurons and which – in their software realization – are frequently used by computers to recognize objects. This is how the branch of computer science, called computer vision (CV), developed. Its main objective is to make computers see as humans, or even better. Sometimes it becomes possible.
Due to technological breakthroughs, domains of object detection and recognition have changed so dynamically that preparation of even a multivolume publication on the majority of important subjects in this area seems impossible. Each month hundreds of new papers are published with new ideas, theorems, algorithms, etc. On the other hand, the fastest and most ample source of information is Internet. One can easily look up almost all subjects on a myriad of webpages, such as Wikipedia. So, nowadays the purpose of writing a book on computer vision has to be stated somewhat differently than even few years ago. The difference between an ample set of information versus knowledge and experience starts to become especially important when we face a new technological problem and our task is to solve it or design a system which will do this for us. In this case we need a way of thinking, which helps us to understand the state of nature, as well as a methodology which takes us closer to a potential solution. This book grew up in just this way, alongside my work on different projects related to object recognition in images. To be able to apply a given method we need first to understand it. At this stage not just a final formula summarizing a method, but also its detailed mathematical background, are of great use. On the other hand, bare formulas don't yet solve the problem. We need their implementations. This is the second stage, sometimes requiring more time and work than the former. One of the main goals of this book is to join the two domains on a selected set of useful methods of object detection and recognition. In this respect I hope this book will be of practical use, both for self study and also as a reference when working on a concrete problem. Nevertheless, we are not able to go through all stages of all the methods, but I hope the book will provide at least a solid start for further study and development in this fascinating and dynamically changing area.
As indicated in the title, one of my goals was to join theory and practice. My experience is that such composition leads to an in-depth understanding of the subject. This is further underpinned by case studies of mostly automotive applications of object detection and recognition. Thus, sections of this book can be grouped as follows:
  • Presentations of methods, their main concepts, and mathematical background.
  • Method implementations which contain C++ code listings (sections of this type are indicated with word IMPLEMENTATION).
  • Analysis of special applications (their names start with CASE STUDY).
Apart from this we have some special entries which contain brief explanations of some mathematical concepts with examples which aim is to help in understanding the mathematical derivation in the surrounding sections.
A comment on code examples. I have always been convinced that in a book like this we should not spoil pages with an introduction to C, C++ or other basic principles of computer science, as sometimes is the case. The reasons are at least twofold: the first is that for computer science there are a lot of good books available, for which I provide the references. The second reason, is so to not divert a Reader from the main purpose of this book, which is an in-depth presentation of the modern computer vision methods and algorithms. On the other hand, Readers who are not familiar with C++ can skip detailed code explanations and focus on implementation in other platforms. However, there is no better way of learning the method than through practical testing and usage in applications.
This book is based on my experience gathered while working on many scientific projects. Results of these were published in a number of conference and journal articles. In this respect, two previous books are special. The first, An Introduction to 3D Computer Vision Techniques and Applications, written together with J. Paul Siebert, was published by Wiley in 2009 [1]. The second is my habilitation thesis [2], also issued in 2009 by the AGH University of Science and Technology Press in Kraków, Poland. Extended parts of the latter are contained in different sections of this book, permission for which was granted by the AGH University Press.
Most of all, I have always found being involved in scientific and industry projects real fun and an adventure leading to self-development. I wish the same to you.
1.1 A Sample of Computer Vision
In this section let us briefly take a look at some applications of computer vision in the systems of driver monitoring, as well as scene analysis. Both belong to the on-car Driver Assisting System aimed at facilitating driving, for example by notifying drivers of incoming road signs, and most of all by preventing car accidents, for example due to the driver falling asleep.
Figure 1.1 depicts a system of cameras mounted in a test car. The cameras can observe the driver and allow the system to monitor his or her state. Cameras can also observe the front of the car for pedestrian detection or road sign recognition, in which case they can send an image like the one presented in Figure 1.2.
Figure 1.1 System of cameras mounted in a car. The cameras can observe a driver to monitor his/her state. Cameras can also observe the front of a car for pedestrian detection or road sign recognition. Such vision modules will probably soon become standard equipment, being a part of the on-board Driver Assisting System of a car.
c01f001
Figure 1.2 A traffic scene. A car-mounted computer with cameras can provide information on the road scene to help safe driving. But computer vision can also help you identify where the picture was taken.
c01f002
What type of information can we draw from such an image? This depends on our goal, certainly. In the real traffic situation depicted we are mainly interested in driving the car safely, avoiding pedestrians and other vehicles in motion or parked, as well as spotting and reacting to traffic signals and signs. However, in a situation where someone sent us this image we might be interested in finding out the name of that street, for instance. What can computer vision do for us? To some extent all of the above, and soon driving a car, at least in special conditions. Let us look at some stages of processing by computer vision methods, details of which are discussed in the next chapters.
Let us first observe that even a single color image has three dimensions, as shown in Figure 1.3(a). In the case of multiple images or a video stream, dimensions grow. Thus, we need tools to analyze such structures. As we will see, tensors offer new possibilities in this respect. Also, their recently developed decompositions allow insight into information contained in such multidimensional structures, as well as their compression or extraction of features for further classification. Much research into computer vision and pattern recognition is on feature detection and their properties. In this respect such transformations are investigated which change the original intensity or color pixels into some new representation which provides some knowledge about image contents or is more appropriate for finding specific objects. An example of an application of the structural tensor to image in Figure 1.2 for detection of areas with strong local structures is shown in Figure 1.3(b). Found structures are encoded with color – their orientation is represented by different colors, whereas strength is by color saturation. Let us observe that areas with no prominent structures show no response of this filter – in Figure 1.3(b) they are simply black. As will be shown, such representation proves very useful in finding specific figures in images, such as pedestrians, cars, or road signs, and so forth.
Figure 1.3 A color image can be seen as a 3D structure (a). Internal properties of such multidimensional signals can be analyzed with tensors. Local structures can be detected with the structural tensor (b). Here different colors encode orientations of areas with strong signal variations, such as edges. Areas with weak texture are in black. These features can be used to detect pedestrians, cars, road signs and other objects.
c01f003
Let us now briefly show the possible steps that lead to detection of road signs in the image in Figure 1.2. In this method signs are first detected with fast segmentation by specific colors characteristic to different groups of expected signs. For instance, red color segmentation is used to spot all-red objects, among which could also be the red rims of the prohibitive signs, and so on for all colors of interest.
Figure 1.4 shows binary maps obtained of the image in Figure 1.2 after red and blue segmentations, respectively. There are many segmentation methods which are discussed in this book. In this case we used manually gathered color samples which were used to train the support vector classifiers.
Figure 1.4 Segmentation of image in Figure 1.2. Red (a), blue color segmentation (b).
c01f004
From the maps in Figure 1.4 we need to find a way of selecting objects whose shape and size potentially correspond to the road signs we are looking for. This is done by specific methods which rely on detection of salient points, as well as on fuzzy logic rules which define the potential shape and size of the candidate objects.
Figure 1.5 shows the detected areas of the signs. These now need to be fed to the next classifier which will provide a final response, first if we are really observing...

Table of contents

  1. Cover
  2. Title Page
  3. Copyright Page
  4. Dedication
  5. Preface
  6. Acknowledgements
  7. Notations and Abbreviations
  8. Chapter 1: Introduction
  9. Chapter 2: Tensor Methods in Computer Vision
  10. Chapter 3: Classification Methods and Algorithms
  11. Chapter 4: Object Detection and Tracking
  12. Chapter 5: Object Recognition
  13. A Appendix
  14. Index

Frequently asked questions

Yes, you can cancel anytime from the Subscription tab in your account settings on the Perlego website. Your subscription will stay active until the end of your current billing period. Learn how to cancel your subscription
No, books cannot be downloaded as external files, such as PDFs, for use outside of Perlego. However, you can download books within the Perlego app for offline reading on mobile or tablet. Learn how to download books offline
Perlego offers two plans: Essential and Complete
  • Essential is ideal for learners and professionals who enjoy exploring a wide range of subjects. Access the Essential Library with 800,000+ trusted titles and best-sellers across business, personal growth, and the humanities. Includes unlimited reading time and Standard Read Aloud voice.
  • Complete: Perfect for advanced learners and researchers needing full, unrestricted access. Unlock 1.5M+ books across hundreds of subjects, including academic and specialized titles. The Complete Plan also includes advanced features like Premium Read Aloud and Research Assistant.
Both plans are available with monthly, semester, or annual billing cycles.
We are an online textbook subscription service, where you can get access to an entire online library for less than the price of a single book per month. With over 1.5 million books across 990+ topics, we’ve got you covered! Learn about our mission
Look out for the read-aloud symbol on your next book to see if you can listen to it. The read-aloud tool reads text aloud for you, highlighting the text as it is being read. You can pause it, speed it up and slow it down. Learn more about Read Aloud
Yes! You can use the Perlego app on both iOS and Android devices to read anytime, anywhere — even offline. Perfect for commutes or when you’re on the go.
Please note we cannot support devices running on iOS 13 and Android 7 or earlier. Learn more about using the app
Yes, you can access Object Detection and Recognition in Digital Images by Boguslaw Cyganek in PDF and/or ePUB format, as well as other popular books in Physical Sciences & Physics. We have over 1.5 million books available in our catalogue for you to explore.