Hands-On Convolutional Neural Networks with TensorFlow

Solve computer vision problems with modeling in TensorFlow and Python.

Iffat Zafar, Giounona Tzanidou, Richard Burton, Nimesh Patel, Leonardo Araujo

  1. 272 pages
  2. English

About This Book

Learn how to apply TensorFlow to a wide range of deep learning and machine learning problems with this practical guide to training CNNs for image classification, image recognition, object detection, and many other computer vision challenges.

Key Features

  • Learn the fundamentals of Convolutional Neural Networks
  • Harness Python and TensorFlow to train CNNs
  • Build scalable deep learning models that can process millions of items

Book Description

Convolutional Neural Networks (CNNs) are among the most popular architectures used in computer vision applications. This book introduces CNNs by solving real-world deep learning problems, while teaching you how to implement them in TensorFlow, a popular Python library. By the end of the book, you will be training CNNs in no time!

We start with an overview of popular machine learning and deep learning models, and then get you set up with a TensorFlow development environment. This environment is the basis for implementing and training deep learning models in later chapters. Then, you will use Convolutional Neural Networks to work on problems such as image classification, object detection, and semantic segmentation.

After that, you will use transfer learning to see how these models can solve other deep learning problems. You will also get a taste of implementing generative models such as autoencoders and generative adversarial networks.

Later on, you will see useful tips on machine learning best practices and troubleshooting. Finally, you will learn how to apply your models on large datasets of millions of images.

What you will learn

  • Train machine learning models with TensorFlow
  • Create systems that can evolve and scale during their life cycle
  • Use CNNs in image recognition and classification
  • Use TensorFlow for building deep learning models
  • Train popular deep learning models
  • Fine-tune a neural network to improve the quality of results with transfer learning
  • Build TensorFlow models that can scale to large datasets and systems

Who this book is for

This book is for Software Engineers, Data Scientists, or Machine Learning practitioners who want to use CNNs for solving real-world problems. Knowledge of basic machine learning concepts, linear algebra and Python will help.

Image Classification in TensorFlow

Image classification refers to the problem of classifying images into categories according to their contents. Let's start with an example task: deciding whether a picture is an image of a dog or not. A naive approach to this task is to take an input image, reshape it into a vector, and then train a linear classifier (or some other kind of classifier), as we did in Chapter 1, Setup and Introduction to TensorFlow. However, you would very quickly discover that this idea is bad for several reasons. Besides scaling poorly with the size of the input image, a linear classifier will simply have a hard time separating one image from another.
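To get a rough feel for the scaling problem, the following sketch counts the parameters of a linear classifier applied to a flattened image. The image sizes and the helper name `linear_classifier_params` are illustrative assumptions and are not taken from the chapter.

```python
def linear_classifier_params(height, width, channels, num_classes):
    """Count the weights and biases of a linear classifier on a flattened image."""
    input_size = height * width * channels  # length of the flattened vector
    weights = input_size * num_classes      # one weight per (input value, class) pair
    biases = num_classes                    # one bias per class
    return weights + biases


# Hypothetical image sizes, chosen only for illustration.
small = linear_classifier_params(32, 32, 3, 1)    # small color image, dog / no dog
large = linear_classifier_params(224, 224, 3, 1)  # larger color photo, dog / no dog
print(small, large)
```

The parameter count grows in proportion to the number of pixels, and every weight is tied to one fixed pixel location, which is part of why small shifts in pose or position hurt such a classifier.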
In contrast to humans, who can see meaningful patterns and content in an image, the computer only sees an array of numbers from 0 to 255. The wide fluctuation of these numbers at the same locations for different images of the same class prohibits using them directly as an input to the classifier. These 10 example dog images taken from the Canadian Institute For Advanced Research (CIFAR) dataset illustrate this problem perfectly. Not only does the appearance of the dogs differ, but so do their pose and position in front of the camera. For a machine, each image at a glance is completely different with no commonalities, whereas we as humans can clearly see that these are all dogs:
A better solution to our problem might be to tell the computer to extract some meaningful features from an input image, for example, common shapes, textures, or colors. We could then use these features, rather than the raw input image, as input to our classifier. Now, we are looking for the presence of these features in an image to tell us if it contains the object we want to identify or not.
To us, these extracted features will simply look like a high-dimensional vector (though usually of much lower dimension than the original image space) that can be used as input for our classifier. Some well-known feature extraction methods that have been developed over the years are the scale-invariant feature transform (SIFT), maximally stable extremal regions (MSER), local binary patterns (LBP), and the histogram of oriented gradients (HOG).
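As a rough illustration of what a handcrafted feature looks like, here is a minimal sketch of the basic 3x3 local binary pattern operator, followed by a histogram of the resulting codes. The neighbor ordering, the `>=` comparison, the function names, and the tiny example patch are all assumptions made for this sketch; practical LBP implementations differ in these details.

```python
def lbp_code(image, row, col):
    """Basic 3x3 local binary pattern code for the pixel at (row, col)."""
    center = image[row][col]
    # Neighbors visited clockwise, starting from the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        # Set the bit when the neighbor is at least as bright as the center.
        if image[row + dr][col + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            hist[lbp_code(image, r, c)] += 1
    return hist


# A tiny hypothetical grayscale patch with values in the 0-255 range.
patch = [
    [10, 20, 30, 40],
    [50, 60, 70, 80],
    [90, 100, 110, 120],
    [130, 140, 150, 160],
]
features = lbp_histogram(patch)
```

The histogram always has 256 entries, whatever the image size, so it can be passed to a classifier as a fixed-length feature vector in place of the raw pixels.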
The year 2012 saw one of the biggest turning points for computer vision (and subsequently, other machine learning areas) when the use of convolutional neural networks for image classification started a paradigm shift in how to solve this task (and many others). Rather than focusing on handcrafting better features to extract from our images, we use a data-driven approach that finds the optimal set of features to represent our problem dataset. A CNN will use a large number of training images and learn for itself the best features to represent our data in order to solve the classification task.
In this chapter, we will cover the following topics:
  • A look at the loss functions used for classification
  • The ImageNet and CIFAR datasets
  • Training a CNN to classify the CIFAR dataset
  • Introduction to the data API
  • How to initialize your weights
  • How to regularize your models to get better results

CNN model architecture

The crucial part of an image classification model is its CNN layers. These layers are responsible for extracting features from image data. Their output is a feature vector which, as before, we can use as input for the classifier of our choice. For many CNN models, the classifier is just a fully connected layer attached to the output of the CNN. As shown in Chapter 1, Setup and Introduction to TensorFlow, our linear classifier is just a fully connected layer; this is exactly the case here, except that the size of the layer and its input will be different.
It is important to note that, at its core, the CNN architecture used for a classification problem or for a regression problem such as localization (or any other problem that uses images, for that matter) would be the same. The only real difference is what happens after the CNN layers have done their feature extraction. For example, one difference could be the loss function used for different tasks, as shown in the following diagram:
You will see a recurring pattern in this book when we look at the different problems that CNNs can be used to solve. It will become apparent that lots of tasks involving images can be solved by using a CNN to extract some meaningful feature vector from the input data, which is then manipulated in some way and fed to different loss functions, depending on the task. For now, let's crack on and focus first on the task of image classification by looking at the loss functions commonly used for it.
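The idea of one shared feature vector feeding different task-specific losses can be sketched in a few lines of plain Python. Everything here is a hypothetical stand-in: the feature values, the weights, the `dense` helper, and the choice of a mean squared error for the regression head are assumptions for illustration, not the book's models.

```python
import math


def dense(features, weights, bias):
    """Fully connected layer: one output per row of weights."""
    return [sum(w * f for w, f in zip(row, features)) + b
            for row, b in zip(weights, bias)]


# Hypothetical feature vector, standing in for the output of the CNN layers.
features = [0.5, -1.0, 2.0]

# Classification head: one logit, a sigmoid, then cross-entropy for label y = 1.
cls_logit = dense(features, [[0.2, -0.4, 0.1]], [0.0])[0]
cls_prob = 1.0 / (1.0 + math.exp(-cls_logit))
cls_loss = -math.log(cls_prob)

# Regression head (for example, localization): four outputs, mean squared error.
box_weights = [[0.1, 0.0, 0.0],
               [0.0, 0.1, 0.0],
               [0.0, 0.0, 0.1],
               [0.1, 0.1, 0.1]]
box_pred = dense(features, box_weights, [0.0] * 4)
box_true = [0.0, 0.0, 0.25, 0.25]  # hypothetical target coordinates
reg_loss = sum((p - t) ** 2 for p, t in zip(box_pred, box_true)) / len(box_true)
```

Both heads consume the same `features` list; only the final layer and the loss differ between the two tasks.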

Cross-entropy loss (log loss)

The simplest form of image classification is binary classification. This is where we have a classifier that has just one object to classify, for example, dog/no dog. In this case, a loss function we are likely to use is the binary cross-entropy loss.
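The cross-entropy function between true labels p and model predictions q is defined as:
$$H(p, q) = -\sum_{i} p_i \log q_i$$
with i being the index for each possible element of our labels and predictions.
However, as we are dealing with the binary case, where we have only two possible outcomes, y=1 and y=0, we have $p \in \{y, 1-y\}$ and $q \in \{\hat{y}, 1-\hat{y}\}$, where $\hat{y}$ is the predicted probability that y=1. The expression then simplifies to:
$$H(p, q) = -\left(y \log \hat{y} + (1-y) \log(1-\hat{y})\right)$$
Iterating over $m$ training examples, the cost function L to be minimized then becomes this:
$$L = -\frac{1}{m} \sum_{j=1}^{m} \left(y_j \log \hat{y}_j + (1-y_j) \log(1-\hat{y}_j)\right)$$
This is intuitively correct, as when $y=1$, we want to minimize $-\log \hat{y}$, which requires a large $\hat{y}$, and when $y=0$, we want to minimize $-\log(1-\hat{y})$, which requires a small $\hat{y}$.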
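A minimal pure-Python sketch of this cost function may make the intuition concrete. It assumes labels of 0 or 1 and predicted probabilities strictly between 0 and 1; the function name and the clipping constant `eps` are illustrative choices, not part of the chapter.

```python
import math


def binary_cross_entropy(y_true, y_prob, eps=1e-12):
    """Mean binary cross-entropy over a batch of labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid taking log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)


# Confident and correct predictions give a small cost ...
confident_right = binary_cross_entropy([1, 0], [0.9, 0.1])
# ... while confident and wrong predictions give a large one.
confident_wrong = binary_cross_entropy([1, 0], [0.1, 0.9])
print(confident_right, confident_wrong)
```

Swapping the predicted probabilities while keeping the labels fixed raises the cost from roughly 0.105 to roughly 2.303, which matches the behavior of the two log terms described above.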
In TensorFlow, the binary cross-entropy loss can be found in the tf.losses module. It is useful to know that the raw output of our model is called the logits. Before we can pass this to the cross-entropy loss, we need to apply the sigmoid function to it so that our output is scaled between 0 and 1. TensorFlow actually combines all these steps into one operation, as shown in the following code. TensorFlow will also take care of averaging the loss across the batch for us.
loss = tf.losses.sigmoid_cross_entropy(multi_class_labels=labels_in, logits=model_prediction)
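To see why it helps to fuse the sigmoid and the cross-entropy into a single operation, here is a pure-Python sketch of the numerically stable form max(x, 0) - x*z + log(1 + exp(-|x|)), which is the formulation given in the TensorFlow documentation for `tf.nn.sigmoid_cross_entropy_with_logits`. The functions below are illustrative stand-ins under that assumption, not the library's implementation, and the labels and logits are made-up values.

```python
import math


def sigmoid(x):
    """Logistic function; fine for the moderate values used in this example."""
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_cross_entropy_with_logits(label, logit):
    """Stable per-example loss: max(x, 0) - x * z + log(1 + exp(-|x|))."""
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))


def mean_sigmoid_cross_entropy(labels, logits):
    """Average the per-example losses across the batch."""
    losses = [sigmoid_cross_entropy_with_logits(z, x)
              for z, x in zip(labels, logits)]
    return sum(losses) / len(losses)


labels = [1.0, 0.0, 1.0]   # hypothetical ground-truth labels
logits = [2.0, -1.0, 0.5]  # hypothetical raw model outputs

fused = mean_sigmoid_cross_entropy(labels, logits)

# Two-step version: apply the sigmoid first, then the cross-entropy.
two_step = sum(
    -(z * math.log(sigmoid(x)) + (1 - z) * math.log(1 - sigmoid(x)))
    for z, x in zip(labels, logits)
) / len(labels)

# The fused form stays finite even for an extreme logit.
extreme = sigmoid_cross_entropy_with_logits(1.0, -1000.0)
```

For moderate logits the two versions agree, but the two-step version would fail for the extreme logit, because computing the sigmoid overflows before the logarithm is ever taken.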

Mul...
