Hands-On Computer Vision with TensorFlow 2
eBook - ePub

Leverage deep learning to create powerful image processing apps with TensorFlow 2.0 and Keras

Benjamin Planche, Eliot Andres

  1. 372 pages
  2. English
  3. ePUB (mobile friendly)
  4. Available on iOS & Android

About This Book

A practical guide to building high-performance systems for object detection, segmentation, video processing, smartphone applications, and more

Key Features

  • Discover how to build, train, and serve your own deep neural networks with TensorFlow 2 and Keras
  • Apply modern solutions to a wide range of applications such as object detection and video analysis
  • Learn how to run your models on mobile devices and web pages and improve their performance

Book Description

Computer vision solutions are becoming increasingly common, making their way into fields such as healthcare, the automotive industry, social media, and robotics. This book will help you explore TensorFlow 2, the brand new version of Google's open source framework for machine learning. You will understand how to benefit from using convolutional neural networks (CNNs) for visual tasks.

Hands-On Computer Vision with TensorFlow 2 starts with the fundamentals of computer vision and deep learning, teaching you how to build a neural network from scratch. You will discover the features that have made TensorFlow the most widely used AI library, along with its intuitive Keras interface. You'll then move on to building, training, and deploying CNNs efficiently. Complete with concrete code examples, the book demonstrates how to classify images with modern solutions, such as Inception and ResNet, and extract specific content using You Only Look Once (YOLO), Mask R-CNN, and U-Net. You will also build generative adversarial networks (GANs) and variational autoencoders (VAEs) to create and edit images, and long short-term memory networks (LSTMs) to analyze videos. In the process, you will acquire advanced insights into transfer learning, data augmentation, domain adaptation, and mobile and web deployment, among other key concepts.

By the end of the book, you will have both the theoretical understanding and practical skills to solve advanced computer vision problems with TensorFlow 2.0.

What you will learn

  • Create your own neural networks from scratch
  • Classify images with modern architectures including Inception and ResNet
  • Detect and segment objects in images with YOLO, Mask R-CNN, and U-Net
  • Tackle problems faced when developing self-driving cars and facial emotion recognition systems
  • Boost your application's performance with transfer learning, GANs, and domain adaptation
  • Use recurrent neural networks (RNNs) for video analysis
  • Optimize and deploy your networks on mobile devices and in the browser

Who this book is for

If you're new to deep learning and have some background in Python programming and image processing, such as reading and writing image files and editing pixels, this book is for you. Even if you're an expert curious about the new TensorFlow 2 features, you'll find this book useful.

While some theoretical concepts require knowledge of algebra and calculus, the book covers concrete examples focused on practical applications such as visual recognition for self-driving cars and smartphone apps.

Section 1: TensorFlow 2 and Deep Learning Applied to Computer Vision
This section covers the fundamentals of computer vision and deep learning, with the help of concrete TensorFlow examples. Starting with a presentation of these technical domains, the first chapter will then walk you through the inner workings of neural networks. This section continues with an introduction to the instrumental features of TensorFlow 2 and Keras, and their key concepts and ecosystems. It ends with a description of machine learning techniques adopted by computer vision experts.
The following chapters will be covered in this section:
  • Chapter 1, Computer Vision and Neural Networks
  • Chapter 2, TensorFlow Basics and Training a Model
  • Chapter 3, Modern Neural Networks
Computer Vision and Neural Networks
In recent years, computer vision has grown into a key domain for innovation, with more and more applications reshaping businesses and lifestyles. We will start this book with a brief presentation of this field and its history so that we can get some background information. We will then introduce artificial neural networks and explain how they have revolutionized computer vision. Since we believe in learning through practice, by the end of this first chapter, we will even have implemented our own network from scratch!
The following topics will be covered in this chapter:
  • Computer vision and why it is a fascinating contemporary domain
  • How we got there—from local hand-crafted descriptors to deep neural networks
  • Neural networks, what they actually are, and how to implement our own for a basic recognition task

Technical requirements

Throughout this book, we will be using Python 3.5 (or higher). As a general-purpose programming language, Python has become the main tool for data scientists thanks to its useful built-in features and renowned libraries.
For this introductory chapter, we will only use two cornerstone libraries: NumPy and Matplotlib. Both are available, with installation instructions, from www.numpy.org and matplotlib.org. However, we recommend using Anaconda (www.anaconda.com), a free Python distribution that makes package management and deployment easy.
Complete installation instructions—as well as all the code presented alongside this chapter—can be found in the GitHub repository at github.com/PacktPublishing/Hands-On-Computer-Vision-with-TensorFlow2/tree/master/Chapter01.
We assume that our readers already have some knowledge of Python and a basic understanding of image representation (pixels, channels, and so on) and matrix manipulation (shapes, products, and so on).
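As a quick sanity check before going further, the requirements above can be verified from Python itself. The following is only a minimal sketch and not part of the book's repository: the helper name check_environment and the constants MIN_PYTHON and REQUIRED_PACKAGES are made up for this illustration. It avoids f-strings so that it also runs on Python 3.5.

```python
import importlib.util
import sys

# Assumptions for this sketch: the minimum version and the package list simply
# mirror the requirements stated in the text (Python 3.5+, NumPy, Matplotlib).
MIN_PYTHON = (3, 5)
REQUIRED_PACKAGES = ("numpy", "matplotlib")


def check_environment():
    """Return a dict mapping each requirement to True (satisfied) or False."""
    report = {"python": sys.version_info[:2] >= MIN_PYTHON}
    for name in REQUIRED_PACKAGES:
        # find_spec returns None when the package cannot be located,
        # so nothing is actually imported here.
        report[name] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    for requirement, ok in check_environment().items():
        print("{:<12} {}".format(requirement, "OK" if ok else "MISSING"))
```

If a package is reported as missing, installing it through Anaconda or pip should be enough for this chapter.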

Computer vision in the wild

Computer vision is everywhere nowadays, to the point that its definition can drastically vary from one expert to another. In this introductory section, we will paint a global picture of computer vision, highlighting its domains of application and the challenges it faces.

Introducing computer vision

Computer vision can be hard to define because it sits at the junction of several research and development fields, such as computer science (algorithms, data processing, and graphics), physics (optics and sensors), mathematics (calculus and information theory), and biology (visual stimuli and neural processing). At its core, computer vision can be summarized as the automated extraction of information from digital images.
Our brain works wonders when it comes to vision. Our ability to decipher the visual stimuli our eyes constantly capture, to instantly tell one object from another, and to recognize the face of someone we have met only once, is just incredible. For computers, images are just blobs of pixels, matrices of red-green-blue values with no further meaning.
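To make the idea of images as matrices of red-green-blue values concrete, here is a small NumPy sketch. The tiny 2 × 3 image and its pixel values are invented for illustration; real images are stored the same way, only with many more rows and columns.

```python
import numpy as np

# A hypothetical 2x3 color image: 2 rows, 3 columns, 3 channels (red, green, blue).
# Each value is an 8-bit integer between 0 and 255.
image = np.array(
    [
        [[255, 0, 0], [0, 255, 0], [0, 0, 255]],        # red, green, blue pixels
        [[0, 0, 0], [128, 128, 128], [255, 255, 255]],  # black, gray, white pixels
    ],
    dtype=np.uint8,
)

print(image.shape)  # (height, width, channels)
print(image[0, 1])  # the three RGB values of the pixel at row 0, column 1

# Averaging the three channels gives a crude grayscale version of the image,
# that is, a single 2x3 matrix of intensities.
grayscale = image.mean(axis=-1)
print(grayscale)
```

Nothing in this array says "face" or "car"; any such meaning has to be extracted by an algorithm.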
The goal of computer vision is to teach computers how to make sense of these pixels the way humans (and other creatures) do, or even better. Indeed, computer vision has come a long way and, since the rise of deep learning, it has started achieving superhuman performance in some tasks, such as face verification and handwritten text recognition.
With a hyperactive research community fueled by the biggest IT companies, and the ever-increasing availability of data and visual sensors, more and more ambitious problems are being tackled: vision-based navigation for autonomous driving, content-based image and video retrieval, and automated annotation and enhancement, among others. It is truly an exciting time for experts and newcomers alike.

Main tasks and their applications

New computer vision-based products are appearing every day (for instance, control systems for industries, interactive smartphone apps, and surveillance systems) that cover a wide range of tasks. In this section, we will go through the main ones, detailing their applications in relation to real-life problems.

Content recognition

A central goal in computer vision is to make sense of images, that is, to extract meaningful, semantic information from pixels (such as the objects present in images, their location, and their number). This generic problem can be divided into several sub-domains. Here is a non-exhaustive list.

Object classification

Object classification (or image classification) is the task of assigning proper labels (or classes) to images among a predefined set and is illustrated in the following diagram:
Figure 1.1: Example of a classifier for the labels of people and cars applied to an image set
Object classification became famous as the first success story of deep convolutional neural networks applied to computer vision, back in 2012 (this will be presented later in this chapter). Progress in this domain has been so fast since then that superhuman performance is now achieved in various use cases (a well-known example is the classification of dog breeds; deep learning methods have become extremely efficient at spotting the discriminative features of man's best friend).
Common applications are text digitization (using character recognition) and the automatic annotation of image databases.
In Chapter 4, Influential Classification Tools, we will present advanced classification methods and their impact on computer vision in general.
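In code, assigning a label from a predefined set usually boils down to picking the class with the highest score for each image. The sketch below only illustrates that final step. The LABELS tuple is borrowed from the Figure 1.1 caption, while the classify helper and the scores are made up; a real classifier would compute the scores from the pixels.

```python
import numpy as np

# Predefined label set (assumption: the two classes named in the Figure 1.1 caption).
LABELS = ("person", "car")


def classify(scores):
    """Map per-label scores (one row per image) to label names."""
    scores = np.asarray(scores)
    best = scores.argmax(axis=-1)  # index of the highest score for each image
    return [LABELS[i] for i in best]


# Invented scores for three images, in the same order as LABELS.
example_scores = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]]
print(classify(example_scores))  # ['person', 'car', 'person']
```

The hard part, which later chapters address, is producing reliable scores in the first place.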

Object identification

While ...
