eBook - ePub

Computational Models for Cognitive Vision

Name: Computational Models for Cognitive Vision
Author: Hiranmay Ghosh

About this book

Learn how to apply cognitive principles to the problems of computer vision

Computational Models for Cognitive Vision formulates the computational models for the cognitive principles found in biological vision, and applies those models to computer vision tasks. Such principles include perceptual grouping, attention, visual quality and aesthetics, knowledge-based interpretation and learning, to name a few. The author's ultimate goal is to provide a framework for creation of a machine vision system with the capability and versatility of the human vision.

Written by Dr. Hiranmay Ghosh, the book takes readers through the basic principles and the computational models for cognitive vision, Bayesian reasoning for perception and cognition, and other related topics, before establishing the relationship of cognitive vision with the multi-disciplinary field broadly referred to as "artificial intelligence". The principles are illustrated with diverse application examples in computer vision, such as computational photography, digital heritage and social robots. The author concludes with suggestions for future research and salient observations about the state of the field of cognitive vision.

Other topics covered in the book include:

· knowledge representation techniques

· evolution of cognitive architectures

· deep learning approaches for visual cognition

Undergraduate students, graduate students, engineers, and researchers interested in cognitive vision will consider this an indispensable and practical resource in the development and study of computer vision.

Information

Publisher: Wiley-IEEE Computer Society Pr
Year: 2020
Print ISBN: 9781119527862
eBook ISBN: 9781119527893
Topic: Computer Science
Subtopic: Computer Science General

1 Introduction

The human vision system (HVS) has a remarkable capability of building three-dimensional models of the environment from the visual signals received through the eyes. The goal of computer vision research is to emulate this capability on man-made apparatus, such as computers. The twentieth century saw tremendous growth in the field of computer vision. Starting with signal processing techniques for demarcating objects in the space-time continuum of visual signals, the field has embraced several other disciplines, such as artificial intelligence and machine learning, for interpreting visual content. As research in computer vision matured, it was pushed to address several real-life problems toward the turn of the century. Examples of such challenging applications include visual surveillance, medical image analysis, computational photography, digital heritage, and robotic navigation.

Though computer vision has shown extremely promising results in many applications in restricted domains, its performance lags that of the HVS by a large margin. While the HVS can effortlessly interpret complex scenes, e.g. those shown in Figure 1.1, artificial vision fails to do so. It is “intuitive” for humans to comprehend the semantics of the scenes at multiple levels of abstraction, and to predict the next movements with some degree of certainty. Derivation of such semantics remains a formidable challenge for artificial vision systems. Further, many real-life applications demand analysis of imperfect imagery, for example with poor lighting, blur, occlusions, noise, or background clutter. While human vision is robust to such imperfections, computer vision systems often fail to perform in such cases. These observations motivated a deeper study of the HVS and the application of its principles to computer vision.

**Figure 1.1** Hard challenges for computer vision. (a) “The offensive player … is about to shoot the ball at the goal …” (b) A facial expression in Bharatnatyam dance.

*Source*: File shared by Rick Dikeman through Wikimedia Commons, file name: Football_iu_1996.webp.

*Source*: File shared by Suyash Dwivedi through Wikimedia Commons, file name: Bharatnatyam_different_facial_expressions_(9).webp.

1.1 What Is Cognitive Vision

Though there is broad agreement in the scientific community that cognitive vision pertains to the application of principles of biological (especially human) vision systems to computer vision, the scope of cognitive vision studies is not well defined (Vernon 2006). The boundary between vision and cognition is thin, and cognitive vision operates in that gray area. Broadly speaking, cognitive vision involves the ability to survey a visual scene, recognize and locate objects of interest, act on visual stimuli, learn and generate new knowledge, dynamically update a visual map that represents reality, and so on. Perception and reasoning are the two pillars on which cognitive vision stands. A crucial point is that the entire gamut of activities must happen in real time to enable an agent to engage with the real world. It is an emerging area of research that integrates methodologies from various disciplines such as artificial intelligence, computer vision, machine learning, cognitive science, and psychology. There is no single approach to cognitive vision, and the proposed solutions to the different problems appear like islands in an ocean. In this book, we have put together computational theories for a set of cognitive vision problems and organized them in an attempt to develop a coherent narrative for the subject. We shall gain more insight into what cognitive vision is as we proceed through the book, and shall characterize it in clearer terms in Chapter 10.

1.2 Computational Approaches for Cognitive Vision

Two branches of science have contributed significantly to the understanding of the processes of cognition from visual as well as other sensory signals. One of them is psychophysics, which is defined as the “study of quantitative relations between psychological events and physical events or, more specifically, between sensations and the stimuli that produce them” (Encyclopedia Britannica). The subject was established by Gustav Fechner and is a marriage between the study of sensory processes and the study of physical stimuli. The other branch of science that has facilitated our understanding of perception and cognition is neurophysiology, which combines physiology and the neural sciences to understand the functions of the nervous system. The two approaches are complementary. While psychophysics answers what happens during cognition, neurophysiology explains how it is realized in the biological nervous system.

Researchers in cognitive vision have long recognized it as an information processing activity carried out by the biological neural system. However, a formal computational approach to understanding cognition was a fundamental contribution of David Marr (1976). Marr abstracted vision into three separable layers, namely (i) hardware, (ii) representation and algorithms, and (iii) computational theory. This abstraction enables computational theories of cognitive vision to be formulated independently of their implementation in the biological vision system. It also provides a theory for realizing cognitive functions in artificial systems made up of altogether different hardware, and possibly using different representations and algorithms. Further, Marr's model of vision assumes modularity and a pipelined architecture, two important properties of information processing systems that allow the different cognitive processes to be formulated independently, with defined interfaces between them. Marr identifies three stages of processing for vision. The first involves finding the basic contours that mark the object boundaries. The second stage discovers the surfaces and their orientations, which results in an observer-centric 2.5-dimensional model. The third involves knowledge-based interpretation of that model into an observer-neutral set of objects that constitute the 3D environment. These three stages roughly correspond to the early vision, perception, and cognition stages of vision, as recognized in the modern literature, and which we shall describe shortly.
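The modular, pipelined organization described above can be illustrated with a deliberately tiny sketch. It is not Marr's algorithm: it works on a one-dimensional scanline of intensities rather than an image, and every name in it (`find_contours`, `find_surfaces`, `interpret`, `run_pipeline`, the `Surface` record, and the intensity-range "knowledge" table) is a hypothetical stand-in chosen for illustration. What it aims to show is only the structure: three independent stages that communicate through defined interfaces.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Scanline = List[float]  # toy stand-in for an image: one row of intensities


@dataclass(frozen=True)
class Surface:
    """Interface between stage 2 and stage 3 (hypothetical, for illustration)."""
    start: int    # first pixel index of the segment (inclusive)
    end: int      # last pixel index of the segment (inclusive)
    mean: float   # mean intensity over the segment


def find_contours(signal: Scanline, threshold: float = 0.5) -> List[int]:
    """Stage 1: indices i where intensity jumps between pixel i and pixel i + 1."""
    return [i for i in range(len(signal) - 1)
            if abs(signal[i + 1] - signal[i]) > threshold]


def find_surfaces(signal: Scanline, contours: List[int]) -> List[Surface]:
    """Stage 2: split the scanline at the contours into homogeneous segments."""
    if not signal:
        return []
    surfaces = []
    start = 0
    # Each contour index closes a segment; the last pixel closes the final one.
    for boundary in contours + [len(signal) - 1]:
        segment = signal[start:boundary + 1]
        surfaces.append(Surface(start, boundary, sum(segment) / len(segment)))
        start = boundary + 1
    return surfaces


def interpret(surfaces: List[Surface],
              knowledge: Dict[str, Tuple[float, float]]) -> List[str]:
    """Stage 3: label each segment using prior knowledge (intensity ranges)."""
    labels = []
    for surface in surfaces:
        label = next((name for name, (low, high) in knowledge.items()
                      if low <= surface.mean < high), "unknown")
        labels.append(label)
    return labels


def run_pipeline(signal: Scanline,
                 knowledge: Dict[str, Tuple[float, float]]) -> List[str]:
    """Chain the three stages; each one sees only the previous stage's output."""
    contours = find_contours(signal)
    surfaces = find_surfaces(signal, contours)
    return interpret(surfaces, knowledge)


if __name__ == "__main__":
    scanline = [0.1, 0.1, 0.1, 0.9, 0.9, 0.2, 0.2]
    knowledge = {"background": (0.0, 0.5), "object": (0.5, 1.01)}
    print(run_pipeline(scanline, knowledge))
```

Because each stage depends only on the data type handed to it, any one of them could be replaced (for example, a different contour detector) without touching the others, which is the practical benefit of the modularity assumption.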

As suggested by David Marr, it is possible to study computational theories of cognitive vision in isolation from the biological systems, and we propose to do exactly that in this book. However, such computational models need to explain the what part of cognition. For that purpose, we shall refer to the results of psychophysical experiments wherever relevant, without going into details of the experimental setups. Further, though the goal of computational modeling is to support alternate (artificial) implementations of cognition that need not be based on biological implementation models, analysis of the latter often provides clues to plausible implementation schemes. We shall therefore discuss the results of some relevant neurophysiological studies in the book. We shall consciously keep such discussions at a superficial level, so that the text can be followed without a deep knowledge of either psychology or the neurosciences.

1.3 A Brief Review of Human Vision System

In this section, we briefly look into how human vision works, in order to put the rest of the text in this book in context. A broad overview of the HVS is presented in Figure 1.2. It comprises a pair of eyes connected to the brain via the optic nerves. When one looks at a scene, the light rays enter the eyes to form a pair of inverted images on the screens at the back of the eyes, which are known as the retinas. This corresponds to a mapping of the external 3D world to a pair of 2D images with slightly different perspectives. Internal representations of the images are transmitted by the optic nerves to the visual cortex at the rear end of the brain, where the images are correlated and interpreted to reconstruct a symbolic description of the 3D world.
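The mapping from the 3D world to a pair of 2D images with slightly different perspectives can be made concrete with a small sketch. It assumes an idealized pinhole model that is not taken from the text: two parallel "eyes" separated horizontally by a baseline, a common focal length, and an upright virtual image plane (so the inversion of the retinal image is ignored). The function names and numeric values are hypothetical, chosen only for illustration.

```python
from typing import Tuple

Point3D = Tuple[float, float, float]  # (X, Y, Z), with Z the distance in front of the eyes
Point2D = Tuple[float, float]


def project(point: Point3D, eye_x: float, focal_length: float = 1.0) -> Point2D:
    """Pinhole projection of a 3D point onto the image plane of an eye at (eye_x, 0, 0)."""
    x, y, z = point
    if z <= 0:
        raise ValueError("the point must lie in front of the eye (Z > 0)")
    return (focal_length * (x - eye_x) / z, focal_length * y / z)


def depth_from_disparity(x_left: float, x_right: float,
                         baseline: float, focal_length: float = 1.0) -> float:
    """Recover depth Z from the horizontal shift between the two images."""
    disparity = x_left - x_right
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of both eyes")
    return focal_length * baseline / disparity


if __name__ == "__main__":
    baseline = 0.06                 # assumed eye separation, in metres
    point = (0.2, 0.1, 2.0)         # a point two metres in front of the observer

    left = project(point, eye_x=-baseline / 2)
    right = project(point, eye_x=+baseline / 2)

    print("left image :", left)
    print("right image:", right)
    print("recovered depth:", depth_from_disparity(left[0], right[0], baseline))
```

Under these assumptions the two images differ only in their horizontal coordinate, and the size of that difference is inversely proportional to depth. This is one simple sense in which correlating the two 2D images allows 3D structure to be recovered.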

In this simple model of biological vision, the eyes primarily act as the image capture devices of the system, and the brain as the interpreter. In reality, things are much more complex. The output from the eyes is not a faithfu...

Cover
Table of Contents
About the Author
Acknowledgments
Preface
Acronyms
1 Introduction
2 Early Vision
3 Bayesian Reasoning for Perception and Cognition
4 Late Vision
5 Visual Attention
6 Cognitive Architectures
7 Knowledge Representation for Cognitive Vision
8 Deep Learning for Visual Cognition
9 Applications of Visual Cognition
10 Conclusion
References
Index
End User License Agreement