eBook - ePub

Active Perception

Name: Active Perception
ISBN: 9781134776092

Yiannis Aloimonos

300 pages
English

About this book

This book defines the emerging field of Active Perception, which calls for studying perception coupled with action. It is devoted to technical problems related to the design and analysis of intelligent systems possessing perception, such as the existing biological organisms and the "seeing" machines of the future. Since the appearance of the first technical results on active vision, researchers have begun to realize that perception -- and intelligence in general -- is not transcendental and disembodied. It is becoming clear that in the effort to build intelligent visual systems, consideration must be given to the fact that perception is intimately related to the physiology of the perceiver and the tasks that it performs. This viewpoint -- known as Purposive, Qualitative, or Animate Vision -- is the natural evolution of the principles of Active Vision. The seven chapters in this volume present various aspects of active perception, ranging from general principles and methodological matters to technical issues related to navigation, manipulation, recognition, learning, planning, reasoning, and topics related to the neurophysiology of intelligent systems.

Publisher

Psychology Press

Year

2013

Print ISBN

9780805812909

eBook ISBN

9781134776092

Topic

Psychology

Subtopic

Cognitive Psychology & Cognition

ACTIVE VISION AS A METHODOLOGY
Kourosh Pahlavan, Tomas Uhlin and Jan-Olof Eklundh
The Royal Institute of Technology

ABSTRACT

If vision is a way of interacting with the environment, then it must be active. That is, either the observer or the environment has to undergo certain changes in order for vision to be a meaningful process. One could say that vision is trivially active. Still, this does not legitimize active vision as a methodology.

Traditionally, active vision is considered from two points of view. One approach regards active vision as a set of techniques that can be used to simplify the computational aspects of vision in certain situations; the other points to the specific computational advantages of having anthropomorphic features. This chapter attempts to address active vision as a methodology, and elucidate its methodological superiority to passive vision. The idea is to formalize active vision by defining it, classifying its different components and expanding its generality rather than by isolating it as an endless list of merits. We also address implementational issues like system design, purpose dependency and control problems; these issues are discussed in accordance with the formal definition. At the end, some experiments with an active vision system, the KTH-head, are briefly discussed.

1. INTRODUCTION

Active vision is attracting an increasing interest among researchers in computational vision. A look at the outcome of more than three decades of research in machine vision points to the sources of this trend. Although the achievements in the field are by no means negligible, fundamental difficulties in developing computational theories have made progress relatively slow. Since the 70s, these complications have resulted in the use of more sophisticated physical models and in engaging high-level mathematics. Such efforts, in spite of complexity and abstraction, have not solved many of the primary problems which seem so easy for a biological observer to deal with, like motion tracking and figure-ground segmentation.

As the field of computer vision matured in the 1970s and 1980s, there was an emphasis on its information-processing nature, and the problem was therefore formalized as such. Computational theories were formed, and substantial work was and is being done on the development of algorithms embodying these theories and taking advantage of the rich information contained in an image. Since this information in turn is contained in an enormous set of potentially available data about the scene, it was very clear at an early stage that a main issue is to have the right kind of data for the right kind of processing; this could be seen as the major explanation for why there has been such a focus on representational issues.

The attempts to find appropriate representations of the relevant information point to the need for attentional mechanisms. This insight, in conjunction with the discovery of the difficulties in finding the proper constraints for processing images in the traditional paradigm, has largely motivated the interest in active vision. Other motivations come from interdisciplinary influences from the fields of psychophysics and neurophysiology on one hand, and progress in robotics on the other.

In particular, an active vision platform could not be built without the presence of the compact CCD arrays, motors and control systems of today, as well as current microprocessor technology. Hence, the emergence of active vision is tied to general developments in hardware. Also, it could certainly not have happened without an increasing familiarity with the great achievements in physiology. In summary, with the interest in active vision, computer vision has, more than ever, developed a strong relationship with other disciplines such as robotics, psychology and physiology.

All this background cannot, however, motivate the use of active vision as a methodology until the methodological advantage of this paradigm vs. passive vision¹ is shown. Many researchers, among them the very first pioneers of active vision, like Bajcsy [6], Ballard [8] and Aloimonos [3], have in different portions and from different viewpoints elaborated the advantages of active vision in both the computational and the qualitative sense in specific applications. Hence, such arguments are available elsewhere [35]. Despite this, we will discuss what lies in the notion of active vision and why we feel it adds something to existing approaches for two reasons. First, we feel it is important to explain the methodological supremacy of active vision before working out its computational advantages in each instance, and secondly, we think that an analysis of active vision as a methodology—which is the objective of this chapter—provides such an explanation.

It should be noted that vision is trivially active, both because it must necessarily be tied to some action (the information is used for something) and in the sense that the world surrounding the visual agent in general is a dynamic, steadily changing environment. However, it is often not the observer that causes these changes. The observer cannot help being affected by the environmental changes and should therefore react accordingly. Nevertheless one could argue that this does not necessarily justify the active vision approach.

In active vision the observer selectively acquires visual data in space and time, while in passive vision the observer relies on given (usually prerecorded) data. The question is what this actually entails.

The emphasis in much work has largely been on active vision as body movements and, more recently, as eye movements. This is clear from an early published definition like:

An observer is called active when engaged in some kind of activity whose purpose is to control the geometric parameters of the sensory apparatus. [3] p. 35

indicating that it is particularly the manipulation of the “geometric” parameters that predominantly forms the notion of active vision.²

In this chapter we will try to show that not only the observer’s geometric parameters are relevant, but a whole set of other visual parameters as well. The ability to manipulate them in a controlled manner, both as an action and a reaction, builds the concept of active vision. However, the way the manipulation is done, to what extent and what kind of manipulation is done, is task dependent.

We will also discuss the components of an active vision system, i.e., the building blocks that form the system. These building blocks should enable us to construct a system that can simplify visual sensing according to the paradigm and allow not only cue integration, but also process integration. Our own work and the results from other research groups are examples that we will use to illustrate our arguments. There are indeed considerable advances in the field which substantiate our claim that methodological advantages of the paradigm exist and can be exploited in practice [1, 13, 16, 17, 24, 29].

Next we consider the implications of this discussion of active vision as a methodology on our strategy for actually studying it. We argue that a close look at biological vision is necessary and also stress the differences between reactive and active behaviors. Thereafter the control issue is discussed, which in our view is an inherent problem that cannot be separated from the other aspects of active vision. Our presentation ends with a final brief account of some of our experiments which illustrate our general discussion and findings.

2. WHAT IS ACTIVE VISION?

Active vision should not be seen as a total contradiction to passive vision. The two approaches agree, at least, upon addressing the problem of developing seeing systems. The difference is methodological, and as such it does not deal with optimality and efficiency; rather it deals with the question of whether the major tasks can be carried out by one methodology or not. Not surprisingly, there are situations where the passive approach is as good as the active approach. One example can be found in recognition, where Biederman [10] has shown that humans can perform object classification without eye movements and so rapidly that active feedback seems excluded. Note, however, that in order to recognize an object in a real situation, it should be found in the image, at the right scale and with proper visual parameters.

In this context, let us point out that in our ensuing discussion of active vision we refer to seeing systems that are highly flexible and that can perform a large number of tasks. Our arguments are less relevant in more limited situations.

2.1. ACTIVE AND PASSIVE VISION

The typical property of passive vision is that the observer is not capable of choosing how to view the scene, but is instead limited to what is offered, determined by the preset visual parameters and environmental conditions, including the time sampling. The active observer, on the other hand, utilizes its capability to change its visual parameters to acquire favorable data from the scene in solving the specific task it has at the time. A passive system has to extract all information needed from the given images, possibly engaging in complicated reasoning and computations, but cannot acquire more data which could facilitate the computations.

We will now give our definition of active vision:

An active visual system is a system which is able to manipulate its visual parameters in a controlled manner in order to extract useful data about the scene in time and space.
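The contrast between the two kinds of observer can be illustrated with a toy sketch. It is not taken from the chapter: the one-dimensional `SCENE`, the `Observer` class, and the `gaze` and `width` parameters are hypothetical stand-ins for a real scene and real visual parameters. The passive observer is limited to what its preset parameters offer, while the active observer changes one parameter in a controlled manner until it acquires useful data.

```python
from dataclasses import dataclass

# Hypothetical toy world: a 1-D scene in which the value 9 marks the target.
SCENE = [0, 0, 0, 0, 0, 0, 0, 9, 0, 0]


@dataclass
class Observer:
    gaze: int = 0   # leftmost visible cell (an assumed "visual parameter")
    width: int = 3  # field of view, in cells

    def view(self, scene):
        """Return the part of the scene visible with the current parameters."""
        return scene[self.gaze:self.gaze + self.width]


def passive_find(observer, scene, target):
    """Passive: only the data offered by the preset parameters is available."""
    return target in observer.view(scene)


def active_find(observer, scene, target):
    """Active: shift the gaze step by step until the target becomes visible."""
    for gaze in range(0, len(scene), observer.width):
        observer.gaze = gaze  # controlled manipulation of a visual parameter
        if target in observer.view(scene):
            return True
    return False
```

With the preset gaze at cell 0, `passive_find` never sees the target at cell 7, whereas `active_find` reaches it on its third gaze shift. The sketch says nothing about optimality; it only shows that the task is solvable under one methodology and not the other.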

The proponents of active vision have traditionally stressed the issue of optimality and the benefits of anthropomorphic vision techniques in specific applications, instead of pointing out the methodological superiority of active vision applied to visual tasks in general. We want to emphasize the latter aspect.

Once it is accepted that active vision is the methodologically superior approach to computer vision, one can begin to discuss how to do it optimally and in practice.

2.2. THE GIBSONIAN OPTIC ARRAY AND ACTIVE VISION

We defined an active system functioning in time and space. Gibson [20] introduced the notion of the spatial optic array as being the set of visual data that, in principle, can be acquired from the surrounding world from a given viewing point (see Figure 1). Seeing in particular requires sampling the optic array, and it is worthwhile to discuss concretely how both the active and passive approaches to vision could perform this function.

Figure 1. An illustration of the optic array. All visual data is supposed to be gathered in the optic array from the viewing point. The question is, what methodology makes it feasible to sample the array?

An observer utilizing active vision, by definition, is able to sample all visual data, limited only by its degrees of freedom. The process of data acquisition occurs in time, but the observer is of course only capable of capturing the data at the rate it can sample them. Hence, accepting the discrete steps in time and space, the active methodology, based on the given definition, is capable of acquiring all the information given by the abstract Gibsonian optic array.
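A small sketch can make the sampling argument concrete, under strong simplifying assumptions that are ours rather than the chapter's: the optic array is reduced to a ring of `n_directions` discrete viewing directions, the observer has a single degree of freedom (pan), and it visits one pan position per time step. The function name and parameters are hypothetical.

```python
def sample_optic_array(n_directions, fov, pan_positions):
    """Return the set of direction indices seen after visiting each pan position.

    Assumes one pan position is visited per discrete time step, and that each
    visit captures `fov` adjacent directions starting at the pan position.
    """
    seen = set()
    for pan in pan_positions:
        seen.update((pan + k) % n_directions for k in range(fov))
    return seen


N = 12  # toy optic array with 12 discrete directions

# Unrestricted pan: four time steps are enough to cover the whole array.
full = sample_optic_array(N, fov=3, pan_positions=range(0, N, 3))

# Restricted pan (fewer reachable positions): only part of the array is sampled.
limited = sample_optic_array(N, fov=3, pan_positions=[0, 3])
```

Here `full` contains all 12 directions, while `limited` contains only the 6 directions reachable within the restricted pan range. Coverage is thus bounded by the observer's degrees of freedom and accumulates only at the rate at which it can sample.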

As far as static vision is concerned, there are two major cases to study here. The first case is when the observer can access prerecorded data about the world and process those data. Unless the system is capable of prerecording all possible samplings in time and space and storing them, the system cannot process in real time, that is at the rate of the relevant events in the world. It is obvious that such an approach either would cause delayed responses, making it insensitive to what is happening in the world, or require a...

Cover
Half Title
Title Page
Copyright
Contents
Contributors
Introduction: Active Vision Revisited
1. Active Vision as a Methodology
2. Designing Visual Systems: Purposive Navigation
3. Navigational Preliminaries
4. Vision During Action
5. Visual Servoing from 2-D Image Cues
6. Computational Modelling of Hand–Eye Coordination
7. Principles of Animate Vision
Subject Index
Author Index
