eBook - ePub

Intelligent Video Surveillance Systems

Name: Intelligent Video Surveillance Systems
Author: Jean-Yves Dufour

English

About This Book

Belonging to the wider academic field of computer vision, video analytics has attracted a phenomenal surge of interest since the start of the current millennium. Video analytics is intended to address the inability to exploit video streams in real time for the purposes of detection or anticipation. It involves analyzing videos using algorithms that detect and track objects of interest over time and that indicate the presence of events or suspect behavior involving these objects.
The aims of this book are to highlight attempts to put video analytics into operational use, to identify possible driving forces behind potential evolutions in years to come, and above all to present the state of the art and the technological hurdles that have yet to be overcome. The need for video surveillance is introduced through two major applications (the security of rail transportation systems and a posteriori investigation). The characteristics of the videos considered are presented through the cameras that capture them and the compression methods that allow us to transport and store them. Technical topics are then discussed: the analysis of objects of interest (detection, tracking and recognition) and "high-level" video analysis, which aims to give a semantic interpretation of the observed scene (events, behaviors, types of content). The book concludes with the problem of performance evaluation.

Information

Publisher

Wiley-ISTE

Year

2012

ISBN

9781118577936

Topic

Physical Sciences

Subtopic

Waves & Wave Mechanics

Chapter 1 Image Processing: Overview and Perspectives

“The power of the image, they say? In truth, it is rather a matter of the extreme richness of the most highly evolved of our senses: sight – or, to put it better, of the most remarkable of our functions of contact with the environment: vision, eye and brain. Indeed, in terms of the quantity of information conveyed and the complexity of its processing, there is hardly anything, for a human being, other than the function of reproduction that can bear comparison with the function of vision.”

D. Estournet¹

1.1. Half a century ago

In any forward-looking exercise, it is always helpful to look back toward the foundation of the domain in question, and to examine the context of its emergence and then that of its evolution, in order to identify the reasons for its setbacks or, conversely, the avenues of its progress. Above all, the greatest benefit can be found in revisiting the promises made by the discipline, comparing them with what has actually been achieved and measuring the differences.

Today, the field of image processing is a little over 50 years old. Indeed, it was in the 1960s that elementary techniques began to emerge – in parallel but often independently of one another – which gradually came together to form image processing as we now know it, which is partly the subject of this book.

Of these techniques, we will begin by discussing the extension of signal processing methods to two or three dimensions (2D or 3D). Among other great names, the following distinguished themselves in this effort: R.M. Mersereau, L.R. Rabiner, J.H. McClellan, T.S. Huang, J.L. Shanks, B.R. Hunt, H.C. Andrews, A. Bijaoui, etc., recognized for their contributions in both 1D and 2D. The aim of their work was to enable images to benefit from all the modeling, prediction, filtering and restoration tools that were becoming established at the time in acoustics, radar and speech. Based on the discovery of fast transforms and their extension to 2D, these works naturally gave rise to spectral analysis of images – a technique that is still very much in use today. However, this route is littered with insightful but unfulfilled or abandoned projects that have so far not been widely exploited – relating, for example, to the stability of multidimensional filters or 2D recursive processes – because the principle of causality that governs temporal signals long thwarted image processing practitioners, who expected to find it in the television signal, for instance. From then on, this field of signal processing became particularly fertile. It is directly at the root of the extremely fruitful approaches of tomographic reconstruction, which nowadays is indispensable in medical diagnostics and physical experimentation, and of wavelet theory, which is useful in image analysis and compression. More recently, it is to be found at the heart of sparse approaches, which carry many hopes of producing the next “great leap forward” in image processing.

A second domain, which also developed in the 1960s, was based on discrete – and often binary – representation of images. Using completely different tools, the pioneers of this domain turned their attention to other properties of images: the connectivity, the morphology, and the topology of the shapes and spatial meshes that are a major component of an image. Turning away from a continuous, faithful representation of the signal, they set about identifying abstract properties: relative position, inside and outside, contact and inclusion, thereby opening the way to shape semantics on the one hand and, on the other hand, to a verbal description of space, which naturally led to scene analysis. In this discipline as well, a number of great names can be cited: A. Rosenfeld, T. Pavlidis, M. Eden, M.J.E. Golay, A. Guzman, H. Freeman, G. Matheron and J. Serra.

The third field of activity, which was crucial in the foundation of image processing as we know it, is that of pattern recognition. This accompanied the emergence of artificial intelligence (AI) and machine learning. Both statistical and structural classification methods emerged during these years, following the works of F. Rosenblatt, S. Watanabe, T. Pavlidis, E. Diday, R.O. Duda, M. Levine, P.E. Hart, M. Pavel, K.S. Fu, J.C. Simon, etc. In image processing, these methods found an exceptional field for development and progress, because it offers an infinite base for experimentation, where each programmer is also the expert who verifies the quality and robustness of the results.

1.2. The use of images

In the 1960s, in the particular context of the Western world, in a society deeply scarred by the Cold War, highly open to mass consumption and marked by social welfare, one may wonder what applications these various techniques were developed for. Three fields of application largely dominated the academic scene: biological and medical imaging, document processing and television (today, we would speak of “multimedia”). Other domains also emerged, but in a less structured way, e.g. around sensors in physics or the nascent space applications.

In medical imaging, to begin with, efforts were concentrated around radiology, with the aim of dealing with a very high demand for mass preventive health screening. Around radiography, algorithms were constructed for filtering, detection, recognition, contour tracking, density evaluation, etc. The requirements in terms of memory, display, networking and archiving also became apparent, as did the concepts of interaction and annotation. The notions of calibration, registration and change detection also emerged. For a long time, radiologists retained control of the technical platforms and their costly imaging systems. However, at the other side of the hospital, far from the huge instruments of in vivo inspection, another research activity was rapidly emerging in the specialist services (cytology, hematology, histology, etc.), with a view to acquiring biological samples and processing them quickly and safely. This led to the development of imaging to determine shape and carry out cell counting, classification and quantification. The notion of texture came into existence. Mathematical morphology found very fertile soil in this domain.

In the domain of television, all work was – unsurprisingly – aimed at compression of the images with a view to reducing the bandwidth of the transmission channels. Fortunately, this work was accompanied by research that went far beyond that narrow objective and produced a great many results that are still drawn upon today, concerning the quality of the image, its statistical properties (whether static or moving), the psycho-physiological properties of the human observer and the social expectations of viewers. These results have greatly enriched the other application domains, providing them with an exceptional basis of founding principles that have been used in the processing algorithms and hardware developed to date.

Today, it could be said that document processing has lost its place as the driving force behind image processing; however, it was the object of the most noteworthy efforts in the early 1960s, to support postal sorting and the archiving of plans and books, and it accompanied the explosion of telecommunications, laying the groundwork for the emergence of “paper-free” office automation. It contributed greatly to the development of cheap analysis hardware (scanners, printers and graphics tablets) and therefore caused the demise of photographic film and photographic paper. To a large extent, it was because of the requirements of document processing that theories and low-level processing techniques, discrete representation, detection, recognition, filtering and tracking were developed. It stimulated the emergence of original methods for pattern recognition and drove forward the development of syntactic and structural descriptions, grammars, pattern description languages, etc.

To conclude this brief review of the past, let us cite a few phrases taken from old texts that illuminate this particular context, and reread them in the light of our contemporary society. It is striking to note their ongoing pertinence, even if certain words seem very quaint:

“The demand for picture transmission (picturephone images, space pictures, weather maps, newspapers, etc.) has been ever increasing recently, which makes it desirable if not necessary for us to consider the possibility of picture bandwidth compression”. [HUA 72]

Or indeed:

“The rapid proliferation of computers during the past two decades has barely kept pace with the explosive increase in the amount of information that needs to be processed”. [ROS 76]

Compression and processing, the problems facing society half a century ago, are obviously still facing us today, expressed in more or less the same words. We might, therefore, be forgiven for wondering: so what has image processing been doing all this time?

1.3. Strengths and weaknesses of image processing

Let us answer this provocative question with a two-pronged witticism:

– Image processing has solved none of the theoretical problems it set out to solve, but it has solved many of its practical problems.

– Image processing, by solving a handful of problems, has created an armful for itself.

1.3.1. What are these theoretical problems that image processing has been unable to overcome?

To begin with, it is the problem of segmentation that constitutes an unsolved problem after half a century of effort and thousands of articles and communications. We still do not know how to properly deal with this issue without an explicit reference to a human observer who serves simultaneously as worker, reference point and referee. Certainly, the methods have been greatly improved; they are easier to reproduce, more reliable, more easily controllable (see, for example, [KUM 10, GRO 09, LAR 10]), but they are also still just as blind to the object that they are processing, and ignorant of the intentions of their user.

Then, we turn to contour detection – an ambiguous abstraction in itself but commonly shared, necessary at numerous stages but too often unpredictable and with disappointing results (in spite of highly interesting works such as [ARB 11]). Along with segmentation, contours have the great privilege of having mobilized legions of image processors and witnessed the advent of cohorts of “optimal” detectors that sit in tool boxes, awaiting a user who will likely never come.

Finally, texture detection and recognition still pose a problem: the practical importance of textures is proven in all fields of application, but they do not as yet have a commonly held definition, and far less a robust, reliable and transferable methodology (the recent works [GAL 11, XIA 10] would be of great interest to an inquisitive reader).

1.3.2. What are the problems that image processing has overcome?

To begin with, we might cite the problem of compression, which, by successive stages, has enabled the establishment of standards of which the user may not even know the name (or the principles, for that matter), but which enable him to carry, on a USB stick, enough movies for a flight from Paris to New York – or which, at the other end of the scale, compress an artistic photo to a tenth of its original size, without adversely affecting the quality, even for an exacting photographer. It is through the work of successive generations of researchers – who worked on Hadamard transforms, then on discrete cosine transforms (DCTs) and then on wavelets; who optimized coefficients, truncations and scans; who developed motion prediction, interframe coding, visual masking and chromatic quantization – that we have witnessed the emergence of successive representations, ever more powerful and yet ever more flexible so as to adapt to the image, making use of efficient and clever algorithms capable of meeting the real-time constraints of increasingly demanding applications [CHE 09, COS 10]. An overview of the evolution of compression in the domain of video is given in Chapter 5 of this book. In connection with this topic, Chapter 6 presents an approach for detecting moving objects in a compressed video, which exploits the mode of video compression in the MPEGx standards.
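The role of the transform and of coefficient truncation mentioned above can be illustrated with a minimal sketch. It is not taken from the book and does not reproduce any particular standard: it applies an orthonormal 1D DCT-II to a hypothetical row of eight pixel values, keeps only the three lowest-frequency coefficients and inverts the transform. The function names (`dct_ii`, `idct_ii`, `truncate`) and the sample data are assumptions chosen for illustration.

```python
import math

def _scale(k, n):
    # Normalization that makes the DCT-II orthonormal
    return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

def dct_ii(x):
    """Orthonormal 1D DCT-II of a sequence of numbers."""
    n = len(x)
    return [
        _scale(k, n) * sum(
            x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n)) for i in range(n)
        )
        for k in range(n)
    ]

def idct_ii(c):
    """Inverse of dct_ii (the orthonormal DCT-III)."""
    n = len(c)
    return [
        sum(
            _scale(k, n) * c[k] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for k in range(n)
        )
        for i in range(n)
    ]

def truncate(coeffs, keep):
    """Zero all but the first `keep` (lowest-frequency) coefficients."""
    return [c if k < keep else 0.0 for k, c in enumerate(coeffs)]

# Hypothetical row of 8 pixel intensities (a smooth ramp, as in a flat image area)
row = [52, 55, 61, 66, 70, 73, 76, 79]
coeffs = dct_ii(row)
approx = idct_ii(truncate(coeffs, 3))

# Largest per-pixel deviation after discarding 5 of the 8 coefficients
max_error = max(abs(a - b) for a, b in zip(row, approx))
print(f"max reconstruction error with 3 of 8 coefficients: {max_error:.2f}")
```

Because the transform is orthonormal, the squared reconstruction error equals the energy of the discarded coefficients, which is why smooth image regions tolerate aggressive truncation. Practical codecs work on 2D blocks and quantize the coefficients rather than simply zeroing them.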

Enormous leaps forward have been made in the field of pattern recognition: witness the exceptional capability of face detection and recognition systems, whatever the size or type of face, within complex scenes, in crowds and on varied media [PAR 10]. This function is now routinely available not only in image databases distributed as free products, but also on all compact cameras, mobile telephones and video cameras, where it governs the focusing function, and perhaps in the future will govern the framing and the subsequent stages as well. In this book, applications of these techniques in video analytics are presented for detection (Chapter 7), tracking (Chapters 8 and 9) and recognition of people by facial or iris scans (Chapter 10), as well as for vehicle recognition (Chapter 11).

Next, we can cite the capacity to restore degraded documents [DEL 11], by way of linear or nonlinear filters, identifying the defects either blindly or under supervision, dealing with non-homogeneities [DEL 06, RAB 11], and filling in the missing parts with inpainting techniques [AUJ 10]. Here, the available quick and precise focusing systems, astutely combining optical principles and image processing [YAS 10, ZHO 11], are in competition with techniques that, on the contrary, ignore the focusing issues and reconstruct a scene in depth from a plethora of views, all out of focus.

Finally, a major step forward lies in the management of large image databases, the search for specific objects, the detection of identical elements [MOR 09, SIV 05], the determination of overlap and possibly the automatic mosaicing of complex scenes.

1.4. What is left for the future?

The progress made through the efforts of researchers creates new requirements and new aspirations in turn. The availability of on-line digital resources has given rise to a universal demand, which at present the communication channels and the archiving media have a limited capacity to satisfy. Attempts are being made to deliver even greater compression than that achieved by wavelets. Technical progress made by developing the classic methods will certainly yield further gains but, among the decisive steps, sparse representation approaches hold out the hope of greater progress [BAR 07, HOR 10]. This progress will probably come at the expense of a great deal of computation, both at the source and at the receiving end, but it seems that, today, the resources to perform these computations are available – particularly in users’ homes, where the workload demanded of the resources is often well below their capacity, but also (why not?) in the cloud. The domain of image or video compression has always progressed in stages. Over the past few years, a number of techniques have become well established: differential pulse code modulation (DPCM), DCTs and wavelets. Their competitors, even those better equipped, appear unable to rival their performance, which has gradually been achieved by way of painstaking – and collaborative – optimization of all the parameters. Hence, the superiority of the most powerful approaches can be seen in their performance, but for now at the expense of software or hardware so complex that years of assimilation will still be needed before they can be affordably implemented in silicon or in algorithms. To date, we have not yet reached a point where the performance of techniques based on redundant dictionaries, model selection and statistical learning can surpass the Daubechies 9/7 wavelets or the Le Gall 5/3 wavelets, but numerous examples are appearing today which suggest that we could soon reach that point.
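To give a concrete idea of why the Le Gall 5/3 wavelet is such an inexpensive baseline to beat, here is a minimal sketch of one decomposition level of its reversible integer form, written as two lifting steps (predict, then update). This is an illustration under stated assumptions rather than the implementation of any codec: the function names are hypothetical, the input is assumed to be an even-length list of integers, and symmetric extension is the boundary rule chosen at both edges.

```python
def legall53_forward(x):
    """One level of the reversible Le Gall 5/3 transform (lifting form).

    Assumes an even-length list of integers. Returns (s, d): the low-pass
    (smooth) and high-pass (detail) halves.
    """
    if len(x) < 2 or len(x) % 2:
        raise ValueError("expected an even-length signal")
    even, odd = x[0::2], x[1::2]
    half = len(even)
    # Predict: detail = odd sample minus the floor-average of its even
    # neighbours (symmetric extension at the right edge).
    d = [odd[i] - ((even[i] + even[min(i + 1, half - 1)]) >> 1) for i in range(half)]
    # Update: smooth = even sample plus a rounded quarter of the two
    # neighbouring details (symmetric extension at the left edge).
    s = [even[i] + ((d[max(i - 1, 0)] + d[i] + 2) >> 2) for i in range(half)]
    return s, d

def legall53_inverse(s, d):
    """Exact inverse of legall53_forward: undo the update, then the predict."""
    half = len(s)
    even = [s[i] - ((d[max(i - 1, 0)] + d[i] + 2) >> 2) for i in range(half)]
    odd = [d[i] + ((even[i] + even[min(i + 1, half - 1)]) >> 1) for i in range(half)]
    x = [0] * (2 * half)
    x[0::2], x[1::2] = even, odd
    return x

# Hypothetical signal: a linear ramp, which the predict step cancels
# everywhere except at the boundary.
ramp = [10, 12, 14, 16, 18, 20, 22, 24]
smooth, detail = legall53_forward(ramp)
print(smooth, detail)  # [10, 14, 18, 23] [0, 0, 0, 2]
```

The whole transform uses only integer additions and shifts and is exactly invertible, which gives a sense of how much complexity a dictionary-based or learned representation must justify in order to displace it.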

In the domain of restoration and filtering, groups of techniques are gradually emerging that could offer rapid progress [ABE 97, GAS 07, YAN 10]. They relate to restoration by the use of a large number of potentially heterogeneous images. Successfully employed in satellite imaging to reconstruct high-resolution multispectral images from low-resolution multispectral images and high-resolution panchromatic images, they have also been used to reconstruct images with improved resolution from a series of images with lower resolution, but always in somewhat canonical configurations, which are difficult to generalize. In the next few years we should see the emergence of techniques that exploit the diversity of sensor resolutions, different angles of observation, varied lighting conditions, and differing sensitivities or depths of field in 3D scenes to reconstruct reference representations of the scenes observed, based on the classical work in matching, stereovision and signal processing.

Yet above all, it is in the extracting of information (data mining) and specifically in the mining of semantic data, that this progress is expected (Figure 1.1). The Internet has, in recent years, become very specialized in the use of keywords to access in...

Cover
Contents
Title page
Copyright page
Introduction
Chapter 1: Image Processing: Overview and Perspectives
Chapter 2: Focus on Railway Transport
Chapter 3: A Posteriori Analysis for Investigative Purposes
Chapter 4: Video Surveillance Cameras
Chapter 5: Video Compression Formats
Chapter 6: Compressed Domain Analysis for Fast Activity Detection
Chapter 7: Detection of Objects of Interest
Chapter 8: Tracking of Objects of Interest in a Sequence of Images
Chapter 9: Tracking Objects of Interest Through a Camera Network
Chapter 10: Biometric Techniques Applied to Video Surveillance
Chapter 11: Vehicle Recognition in Video Surveillance
Chapter 12: Activity Recognition
Chapter 13: Unsupervised Methods for Activity Analysis and Detection of Abnormal Events
Chapter 14: Data Mining in a Video Database
Chapter 15: Analysis of Crowded Scenes in Video
Chapter 16: Detection of Visual Context
Chapter 17: Example of an Operational Evaluation Platform: PPSL
Chapter 18: Qualification and Evaluation of Performances
List of Authors
Index