
eBook - ePub
Convolutional Neural Networks in Visual Computing
A Concise Guide
- 168 pages
- English
- ePUB (mobile friendly)
- Available on iOS & Android
About this book
This book covers the fundamentals of designing and deploying techniques using deep architectures. It is intended to serve as a beginner's guide for engineers or students who want a quick start on learning and/or building deep learning systems. This book provides a good theoretical and practical understanding and a complete toolkit of basic information and knowledge required to understand and build convolutional neural networks (CNN) from scratch. The book focuses explicitly on convolutional neural networks, filtering out other material that co-occurs with CNN topics in many deep learning books.
1
INTRODUCTION TO VISUAL COMPUTING
The goal of human scientific exploration is to advance human capabilities. We invented fire to cook food, thereby outgrowing our dependence on the basic food processing capability of our own stomach. This led to increased caloric consumption and perhaps sped up the growth of civilization, something that no other known species has accomplished. We invented the wheel and vehicles so that our speed of travel does not have to be limited to the ambulatory speed of our legs. Indeed, we built airplanes, if for no other reason than to realize our dream of being able to take to the skies. The story of human invention and technological growth is a narrative of the human species endlessly outgrowing its own capabilities and therefore endlessly expanding its horizons and marching further into the future.
Many of these advances are credited to the wiring in the human brain. The human neural system and its capabilities are far-reaching and complicated. Humans enjoy a very intricate neural system capable of thought, emotion, reasoning, imagination, and philosophy. As scientists working on computer vision, perhaps we are a little tendentious when it comes to the significance of human vision, but for us, the most fascinating part of human capabilities, intelligence included, is the cognitive-visual system. Although the human visual system and its associated cognitive decision-making processes are among the fastest we know of, humans may not have the most powerful visual system among all species where, for example, acuity or night vision is concerned (Thorpe et al., 1996; Watamaniuk and Duchon, 1992). Also, humans peer through a very narrow range of the electromagnetic spectrum. There are many other species that have a wider visual sensory range than we do. Humans have also become prone to many corneal visual deficiencies such as near-sightedness. Given all this, it is only natural that we as humans want to work on improving our visual capabilities, as we did with other deficiencies in human capabilities.
We have been developing tools for many centuries, trying to see further and beyond the eye that nature has bestowed upon us. Telescopes, binoculars, microscopes, and magnifiers were invented to see objects that are much farther away or much smaller. Radio, infrared, and x-ray devices let us see in parts of the electromagnetic spectrum beyond the visible band that we can naturally perceive. Recently, interferometers were perfected and built, extending human vision to include gravitational waves and opening yet another way to look at the world through gravitational astronomy. While all these devices extend the human visual capability, scholars and philosophers have long realized that we do not see just with our eyes. Eyes are but mere imaging instruments; it is the brain that truly sees.
While many scholars from Plato, Aristotle, Charaka, and Euclid to Leonardo da Vinci studied how the eye sees the world, it was Hermann von Helmholtz in 1867 in his Treatise on Physiological Optics who first postulated in scientific terms that the eye only captures images and it is the brain that truly sees and recognizes the objects in the image (Von Helmholtz, 1867). In his book, he presented novel theories on depth and color perception and motion perception, and also built upon da Vinci's earlier work. While it had been studied in some form or other since ancient times in many civilizations, Helmholtz first described the idea of unconscious inference, where he postulated that not all ideas, thoughts, and decisions that the brain makes are made consciously. Helmholtz noted how susceptible humans are to optical illusions, famously citing the misunderstanding of the sun revolving around the earth, while in reality it is the horizon that is moving, and that humans are drawn to the emotions of a staged actor even though they are only staged. Using such analogies, Helmholtz proposed that the brain understands the images that the eye sees and that it is the brain that makes inferences about what objects are being seen, without the person consciously noticing them. This was probably the first insight into neurological vision. Some early modern scientists such as Campbell and Blakemore started arguing what is now an established fact: that there are neurons in the brain responsible for estimating object sizes and sensitivity to orientation (Blakemore and Campbell, 1969). Later studies during the same era discovered more complex intricacies of the human visual system and how we perceive and detect color, shapes, orientation, depth, and even objects (Field et al., 1993; McCollough, 1965; Campbell and Kulikowski, 1966; Burton, 1973).
The above brief historical accounts serve only to illustrate that the field of computer vision has its own place in the rich collection of stories of human technological development. This book focuses on a concise presentation of modern computer vision techniques, which might be stamped as neural computer vision since many of them stem from artificial neural networks. To ensure the book is self-contained, we start with a few foundational chapters that introduce a reader to the general field of visual computing by defining basic concepts, formulations, and methodologies, starting with a brief presentation of image representation in the subsequent section.
Image Representation Basics
Any computer vision pipeline begins with an imaging system that captures light rays reflected from the scene and converts the optical light signals into an image in a format that a computer can read and process. During the early years of computational imaging, an image was obtained by digitizing a film or a printed picture; today, images are typically acquired directly by digital cameras that capture and store an image of a scene in terms of a set of ordered numbers called pixels. There are many textbooks covering image acquisition and a camera's inner workings (like its optics, mechanical controls and color filtering, etc.) (Jain, 1989; Gonzalez and Woods, 2002), and thus we will present only a brief account here. We use the simple illustration of Figure 1.1 to highlight the key process of sampling (i.e., discretization via the image grid) and quantization (i.e., representing each pixel's color values with only a finite set of integers) of...
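The two steps named in the paragraph above, sampling and quantization, can be illustrated with a minimal Python sketch. It is not taken from the book: the `scene` function, the grid size, and the 8-bit depth are all assumptions chosen for illustration, and a real camera pipeline involves optics and color filtering that are ignored here.

```python
import math

def scene(x, y):
    """Hypothetical continuous scene: brightness in [0, 1] at real-valued (x, y) in [0, 1)."""
    return 0.5 + 0.5 * math.sin(2 * math.pi * x) * math.cos(2 * math.pi * y)

def sample(f, rows, cols):
    """Sampling: evaluate the continuous scene at the centre of each cell of a rows x cols grid."""
    return [[f((c + 0.5) / cols, (r + 0.5) / rows) for c in range(cols)]
            for r in range(rows)]

def quantize(samples, bits=8):
    """Quantization: map each real-valued sample in [0, 1] to one of 2**bits integer levels."""
    levels = 2 ** bits
    return [[max(0, min(levels - 1, int(v * levels))) for v in row]
            for row in samples]

# A 4 x 6 grayscale image whose pixels are integers in 0..255 (assumed 8-bit depth).
image = quantize(sample(scene, rows=4, cols=6), bits=8)
for row in image:
    print(row)
```

A finer grid (larger `rows` and `cols`) reduces the spatial detail lost in sampling, while a larger `bits` value reduces the brightness detail lost in quantization; both increase the amount of data stored per image.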
Table of contents
- Cover
- Half Title Page
- Title Page
- Copyright Page
- Dedication
- Contents
- Preface
- Acknowledgments
- Authors
- Chapter 1 Introduction to Visual Computing
- Chapter 2 Learning as a Regression Problem
- Chapter 3 Artificial Neural Networks
- Chapter 4 Convolutional Neural Networks
- Chapter 5 Modern and Novel Usages of CNNs
- Appendix A Yann
- Postscript
- Index
Yes, you can access Convolutional Neural Networks in Visual Computing by Ragav Venkatesan and Baoxin Li in PDF and/or ePUB format, as well as other popular books in Computer Science & Computer Science General. We have over one million books available in our catalogue for you to explore.