Hands-On Convolutional Neural Networks with TensorFlow

Solve computer vision problems with modeling in TensorFlow and Python.

Iffat Zafar, Giounona Tzanidou, Richard Burton, Nimesh Patel, Leonardo Araujo

  1. 272 pages
  2. English
  3. ePUB (mobile friendly)
  4. Available on iOS and Android


About this book

Learn how to apply TensorFlow to a wide range of deep learning and machine learning problems with this practical guide to training CNNs for image classification, image recognition, object detection, and many other computer vision challenges.

Key Features

  • Learn the fundamentals of Convolutional Neural Networks
  • Harness Python and TensorFlow to train CNNs
  • Build scalable deep learning models that can process millions of items

Book Description

Convolutional Neural Networks (CNNs) are among the most popular architectures used in computer vision applications. This book introduces CNNs by solving real-world deep learning problems while teaching you how to implement them in TensorFlow, a popular Python library. By the end of the book, you will be training CNNs in no time!

We start with an overview of popular machine learning and deep learning models, and then get you set up with a TensorFlow development environment. This environment is the basis for implementing and training deep learning models in later chapters. Then, you will use Convolutional Neural Networks to work on problems such as image classification, object detection, and semantic segmentation.

After that, you will use transfer learning to see how these models can solve other deep learning problems. You will also get a taste of implementing generative models such as autoencoders and generative adversarial networks.

Later on, you will see useful tips on machine learning best practices and troubleshooting. Finally, you will learn how to apply your models on large datasets of millions of images.

What you will learn

  • Train machine learning models with TensorFlow
  • Create systems that can evolve and scale during their life cycle
  • Use CNNs in image recognition and classification
  • Use TensorFlow for building deep learning models
  • Train popular deep learning models
  • Fine-tune a neural network to improve the quality of results with transfer learning
  • Build TensorFlow models that can scale to large datasets and systems

Who this book is for

This book is for Software Engineers, Data Scientists, or Machine Learning practitioners who want to use CNNs for solving real-world problems. Knowledge of basic machine learning concepts, linear algebra and Python will help.


Information

Image Classification in TensorFlow

Image classification refers to the problem of classifying images into categories according to their contents. Let's start with an example task: classifying whether a picture is an image of a dog or not. A naive approach that someone might take to accomplish this task is to take an input image, reshape it into a vector, and then train a linear classifier (or some other kind of classifier), like we did in Chapter 1, Setup and Introduction to TensorFlow. However, you would very quickly discover that this idea is bad for several reasons. Besides scaling poorly with the size of your input image, your linear classifier will simply have a hard time separating images of one class from images of another.
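As a rough illustration of the naive approach, the sketch below flattens a tiny grayscale image into a vector and scores it with a linear classifier. The image size, the weights, and the helper names `flatten` and `linear_score` are hypothetical and chosen only for illustration. Note that the number of weights grows with the number of pixels, which is the scaling problem mentioned above.

```python
# Naive approach: flatten an image and apply a linear classifier.
# All values here are made up for illustration.

def flatten(image):
    """Reshape a 2-D list of pixel rows into a single flat vector."""
    return [pixel for row in image for pixel in row]

def linear_score(x, weights, bias):
    """Linear classifier score: w . x + b."""
    return sum(w * v for w, v in zip(weights, x)) + bias

# A hypothetical 3x3 grayscale image with values in [0, 255].
image = [
    [0, 255, 0],
    [0, 255, 0],
    [0, 255, 0],
]

x = flatten(image)                 # 9 inputs for a 3x3 image
weights = [0.0, 0.01, 0.0] * 3     # one weight per pixel (hypothetical)
bias = -5.0

score = linear_score(x, weights, bias)
is_dog = score > 0                 # positive score means "dog" in this sketch

# The weight vector must match the pixel count: a 224x224 RGB image
# would already need 224 * 224 * 3 weights per class.
num_weights_large = 224 * 224 * 3
```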
In contrast to humans, who can see meaningful patterns and content in an image, the computer only sees an array of numbers from 0 to 255. The wide fluctuation of these numbers at the same locations for different images of the same class prohibits using them directly as an input to the classifier. These 10 example dog images, taken from the Canadian Institute For Advanced Research (CIFAR) dataset, illustrate this problem perfectly. Not only does the appearance of the dogs differ, but their pose and position in front of the camera differ too. To a machine, each image looks at a glance completely different, with no commonalities, whereas we as humans can clearly see that these are all dogs.
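To make the fluctuation concrete, the following sketch compares a small pattern with a copy of itself shifted one pixel to the right. The 1-D "images" and the helper `pixel_distance` are hypothetical. The point is only that identical content at a different position can produce a large pixel-wise difference.

```python
def pixel_distance(a, b):
    """Sum of absolute per-pixel differences between two equal-length images."""
    return sum(abs(p - q) for p, q in zip(a, b))

# A hypothetical 1-D "image": a bright bar on a dark background.
original = [0, 255, 255, 0, 0, 0]
# The same bar moved one pixel to the right.
shifted = [0, 0, 255, 255, 0, 0]

same = pixel_distance(original, original)     # identical images
moved = pixel_distance(original, shifted)     # same content, new position

# A position-independent summary (here, the sorted pixel values) still matches.
same_content = sorted(original) == sorted(shifted)
```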
A better solution to our problem might be to tell the computer to extract some meaningful features from an input image, for example, common shapes, textures, or colors. We could then use these features, rather than the raw input image, as input to our classifier. Now, we are looking for the presence of these features in an image to tell us if it contains the object we want to identify or not.
To us, these extracted features will simply look like a high-dimensional vector (though usually of much lower dimension than the original image space) that can be used as input for our classifier. Some well-known feature extraction methods that have been developed over the years are the scale-invariant feature transform (SIFT), maximally stable extremal regions (MSER), local binary patterns (LBP), and histogram of oriented gradients (HOG).
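As one concrete example of a handcrafted feature, here is a minimal sketch of a basic local binary pattern descriptor: each interior pixel is compared with its 8 neighbours to form an 8-bit code, and the codes are counted in a 256-bin histogram. The neighbour ordering, the `>=` threshold convention, and the function names are assumptions made for this sketch; practical LBP variants differ in these details.

```python
# Neighbour offsets (row, col), clockwise from the top-left corner.
# The ordering is a convention chosen for this sketch.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
              (1, 1), (1, 0), (1, -1), (0, -1)]

def lbp_code(image, r, c):
    """8-bit LBP code of the pixel at (r, c); it must not lie on the border."""
    center = image[r][c]
    code = 0
    for bit, (dr, dc) in enumerate(NEIGHBOURS):
        if image[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code

def lbp_histogram(image):
    """256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            hist[lbp_code(image, r, c)] += 1
    return hist

# Hypothetical 4x4 grayscale image: a vertical edge between dark and bright.
image = [
    [10, 10, 200, 200],
    [10, 10, 200, 200],
    [10, 10, 200, 200],
    [10, 10, 200, 200],
]

features = lbp_histogram(image)   # fixed-length vector, whatever the image size
```

The histogram has the same length for any input size, which is what lets it serve as a classifier input in place of the raw pixels.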
The year 2012 saw one of the biggest turning points for computer vision (and subsequently, other machine learning areas) when the use of convolutional neural networks for image classification started a paradigm shift in how to solve this task (and many others). Rather than focusing on handcrafting better features to extract from our images, we use a data-driven approach that finds the optimal set of features to represent our problem dataset. A CNN will use a large number of training images and learn for itself the best features to represent our data in order to solve the classification task.
In this chapter, we will cover the following topics:
  • A look at the loss functions used for classification
  • The ImageNet and CIFAR datasets
  • Training a CNN to classify the CIFAR dataset
  • Introduction to the data API
  • How to initialize your weights
  • How to regularize your models to get better results

CNN model architecture

The crucial part of an image classification model is its CNN layers. These layers are responsible for extracting features from image data. The output of these CNN layers will be a feature vector, which, like before, we can use as input for the classifier of our choice. For many CNN models, the classifier will be just a fully connected layer attached to the output of our CNN. As shown in Chapter 1, Setup and Introduction to TensorFlow, our linear classifier is just a fully connected layer; this is exactly the case here, except that the size of the layer and its input will be different.
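The split between feature extraction and classification can be sketched without any framework. Below, a single hand-written 2-D convolution followed by ReLU stands in for the CNN layers, and a dot product stands in for the fully connected classifier. The kernel, the weights, and the function names are hypothetical and fixed by hand; in a real CNN they would be learned during training.

```python
def conv2d_valid(image, kernel):
    """2-D cross-correlation with no padding and stride 1
    (the operation deep learning libraries call 'convolution')."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

def relu(feature_map):
    """Element-wise max(0, v)."""
    return [[max(0.0, v) for v in row] for row in feature_map]

def fully_connected(features, weights, bias):
    """One output unit of a fully connected layer."""
    return sum(w * f for w, f in zip(weights, features)) + bias

# Hypothetical 3x4 image with a dark-to-bright vertical edge.
image = [
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
]
# Hand-picked kernel that responds to dark-to-bright horizontal transitions.
kernel = [[-1.0, 1.0]]

feature_map = relu(conv2d_valid(image, kernel))            # shape 3x3
feature_vector = [v for row in feature_map for v in row]   # flattened, length 9

# The classifier only ever sees the feature vector, not the raw pixels.
logit = fully_connected(feature_vector, [1.0] * 9, -2.0)
```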
It is important to note that, at its core, the CNN architecture used for a classification problem or for a regression problem such as localization (or any other problem that uses images, for that matter) would be the same. The only real difference will be what happens after the CNN layers have done their feature extraction. For example, one difference could be the loss function used for different tasks, as shown in the following diagram:
You will see a recurring pattern in this book when we look at the different problems that CNNs can be used to solve. It will become apparent that lots of tasks involving images can be solved using a CNN to extract some meaningful feature vector from the input data, which will then be manipulated in some way and fed to different loss functions, depending on the task. For now, let's crack on and focus first on the task of image classification by looking at the loss functions commonly used for it.
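This recurring pattern can be illustrated with a toy sketch: one shared feature vector is passed to two different task heads, each with its own loss. The feature values, head weights, and targets are made up. Squared error stands in for a typical regression loss and is an assumption here, since the text does not name one.

```python
import math

def dense(features, weights, bias):
    """One output unit of a fully connected layer."""
    return sum(w * f for w, f in zip(weights, features)) + bias

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def binary_cross_entropy(y_true, y_prob):
    return -(y_true * math.log(y_prob) + (1 - y_true) * math.log(1 - y_prob))

def squared_error(target, prediction):
    return (target - prediction) ** 2

# Feature vector as it might come out of the CNN layers (hypothetical values).
features = [0.5, 1.0, 0.0]

# Classification head: "dog" vs "no dog", scored with cross-entropy.
class_prob = sigmoid(dense(features, [2.0, 1.0, -1.0], -2.0))
classification_loss = binary_cross_entropy(1, class_prob)

# Regression head: for example one localization coordinate, scored with squared error.
box_x = dense(features, [0.2, 0.3, 0.1], 0.0)
regression_loss = squared_error(0.5, box_x)
```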

Cross-entropy loss (log loss)

The simplest form of image classification is binary classification. This is where we have a classifier that has just one object to classify, for example, dog/no dog. In this case, a loss function we are likely to use is the binary cross-entropy loss.
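The cross entropy function between true labels $p$ and model predictions $q$ is defined as:
$$H(p, q) = -\sum_i p_i \log(q_i)$$
With $i$ being the index for each possible element of our labels and predictions.
However, as we are dealing with the binary case when we have only two possible outcomes, $y=1$ and $y=0$, then $p \in \{y, 1-y\}$ and $q \in \{\hat{y}, 1-\hat{y}\}$, where $\hat{y}$ is the predicted probability that $y=1$, can be simplified down and we get:
$$H(p, q) = -\sum_i p_i \log(q_i) = -y \log(\hat{y}) - (1-y) \log(1-\hat{y})$$
Iterating over $m$ training examples, the cost function $L$ to be minimized then becomes this:
$$L = -\frac{1}{m} \sum_{j=1}^{m} \left[ y_j \log(\hat{y}_j) + (1-y_j) \log(1-\hat{y}_j) \right]$$
This is intuitively correct, as when $y=1$, we want to minimize $-\log(\hat{y})$, which requires a large $\hat{y}$, and when $y=0$, we want to minimize $-\log(1-\hat{y})$, which requires a small $\hat{y}$.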
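This behaviour can be checked numerically. The sketch below implements the binary cross-entropy averaged over a batch in plain Python; the function name and the sample values are hypothetical.

```python
import math

def binary_cross_entropy(y_true, y_pred):
    """Mean binary cross-entropy over a batch.

    y_true: labels, each 0 or 1.
    y_pred: predicted probabilities, each strictly between 0 and 1.
    """
    total = 0.0
    for y, y_hat in zip(y_true, y_pred):
        total += -(y * math.log(y_hat) + (1 - y) * math.log(1 - y_hat))
    return total / len(y_true)

# With label y = 1 the loss reduces to -log(y_hat): larger y_hat, smaller loss.
loss_confident = binary_cross_entropy([1], [0.9])
loss_unsure = binary_cross_entropy([1], [0.1])

# With label y = 0 the loss reduces to -log(1 - y_hat): smaller y_hat, smaller loss.
loss_negative_good = binary_cross_entropy([0], [0.1])
loss_negative_bad = binary_cross_entropy([0], [0.9])

# Average over a small batch of three examples.
batch_loss = binary_cross_entropy([1, 0, 1], [0.9, 0.1, 0.8])
```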
In TensorFlow, the binary cross entropy loss can be found in the tf.losses module. It is useful to know that the raw output of our model is called the logits. Before we can pass this to the cross entropy loss, we need to apply the sigmoid function to it so that our output is scaled between 0 and 1. TensorFlow actually combines all these steps into one operation, as shown in the code below. TensorFlow will also take care of averaging the loss across the batch for us.
loss = tf.losses.sigmoid_cross_entropy(multi_class_labels=labels_in, logits=model_prediction)
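To show roughly what such a combined operation computes, here is a plain-Python sketch that takes logits, applies the sigmoid and the cross-entropy in one numerically stable expression, and averages over the batch. The expression max(x, 0) - x*z + log(1 + exp(-|x|)) is the one given in TensorFlow's documentation for sigmoid cross-entropy with logits. The function names are hypothetical, and this is an illustration, not TensorFlow's actual implementation.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_cross_entropy_from_logits(labels, logits):
    """Mean sigmoid cross-entropy computed directly from logits.

    Uses max(x, 0) - x * z + log(1 + exp(-|x|)), which avoids overflow in
    exp() and log(0) for logits of large magnitude.
    """
    losses = [
        max(x, 0.0) - x * z + math.log1p(math.exp(-abs(x)))
        for z, x in zip(labels, logits)
    ]
    return sum(losses) / len(losses)

def naive_loss(labels, logits):
    """Same quantity via an explicit sigmoid; only safe for moderate logits."""
    losses = [
        -(z * math.log(sigmoid(x)) + (1 - z) * math.log(1 - sigmoid(x)))
        for z, x in zip(labels, logits)
    ]
    return sum(losses) / len(losses)

labels = [1.0, 0.0, 1.0]
logits = [2.0, -1.0, 0.5]

stable = sigmoid_cross_entropy_from_logits(labels, logits)
naive = naive_loss(labels, logits)      # agrees with `stable` up to rounding

# A very large logit saturates the explicit sigmoid to exactly 1.0, so the
# naive version would hit log(0); the stable form still returns a finite value.
extreme = sigmoid_cross_entropy_from_logits([0.0], [1000.0])
```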

Mul...
