Hands-On Convolutional Neural Networks with TensorFlow
eBook - ePub

Hands-On Convolutional Neural Networks with TensorFlow

Solve computer vision problems with modeling in TensorFlow and Python.

Iffat Zafar, Giounona Tzanidou, Richard Burton, Nimesh Patel, Leonardo Araujo

  1. 272 pages
  2. English
  3. ePUB (mobile friendly)

Book Information

Learn how to apply TensorFlow to a wide range of deep learning and machine learning problems with this practical guide to training CNNs for image classification, image recognition, object detection, and many other computer vision challenges.

Key Features

  • Learn the fundamentals of Convolutional Neural Networks
  • Harness Python and TensorFlow to train CNNs
  • Build scalable deep learning models that can process millions of items

Book Description

Convolutional Neural Networks (CNNs) are one of the most popular architectures used in computer vision applications. This book is an introduction to CNNs through solving real-world problems in deep learning while teaching you how to implement them in the popular Python library TensorFlow. By the end of the book, you will be training CNNs in no time!

We start with an overview of popular machine learning and deep learning models, and then get you set up with a TensorFlow development environment. This environment is the basis for implementing and training deep learning models in later chapters. Then, you will use Convolutional Neural Networks to work on problems such as image classification, object detection, and semantic segmentation.

After that, you will use transfer learning to see how these models can solve other deep learning problems. You will also get a taste of implementing generative models such as autoencoders and generative adversarial networks.

Later on, you will see useful tips on machine learning best practices and troubleshooting. Finally, you will learn how to apply your models on large datasets of millions of images.

What you will learn

  • Train machine learning models with TensorFlow
  • Create systems that can evolve and scale during their life cycle
  • Use CNNs in image recognition and classification
  • Use TensorFlow for building deep learning models
  • Train popular deep learning models
  • Fine-tune a neural network to improve the quality of results with transfer learning
  • Build TensorFlow models that can scale to large datasets and systems

Who this book is for

This book is for software engineers, data scientists, and machine learning practitioners who want to use CNNs to solve real-world problems. Knowledge of basic machine learning concepts, linear algebra, and Python will help.


Information

Year
2018
ISBN
9781789132823

Image Classification in TensorFlow

Image classification refers to the problem of classifying images into categories according to their contents. Let's start with an example task: deciding whether a picture contains a dog or not. A naive approach to this task would be to take an input image, reshape it into a vector, and then train a linear classifier (or some other kind of classifier) on it, as we did in Chapter 1, Setup and Introduction to TensorFlow. However, you would very quickly discover that this idea is bad for several reasons. Besides not scaling well to the size of your input image, a linear classifier will simply have a hard time separating one image from another.
In contrast to humans, who can see meaningful patterns and content in an image, the computer only sees an array of numbers from 0 to 255. The wide fluctuation of these numbers at the same locations across different images of the same class prohibits using them directly as input to the classifier. Ten example dog images taken from the Canadian Institute For Advanced Research (CIFAR) dataset illustrate this problem perfectly. Not only does the appearance of the dogs differ, but their pose and position in front of the camera also do. For a machine, each image at a glance is completely different, with no commonalities, whereas we as humans can clearly see that these are all dogs.
A better solution to our problem might be to tell the computer to extract some meaningful features from an input image, for example, common shapes, textures, or colors. We could then use these features, rather than the raw input image, as input to our classifier. Now, we are looking for the presence of these features in an image to tell us if it contains the object we want to identify or not.
These extracted features will look to us as simply a high-dimensional vector (but usually a much lower dimension than the original image space) that can be used as input for our classifier. Some well-known feature extraction methods that have been developed over the years are scale invariant features (SIFT), maximally stable extremal regions (MSER), local binary patterns (LBP), and histogram of oriented gradients (HOG).
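As a toy illustration of this style of hand-crafted feature extraction, the following sketch computes a basic 8-neighbour local binary pattern (LBP) descriptor in plain NumPy. The function names are my own, and a production pipeline would normally use an optimized library implementation (for example, the one in scikit-image) rather than this minimal version:

```python
import numpy as np

def lbp_codes(image):
    """Compute basic 8-neighbour Local Binary Pattern codes for the
    interior pixels of a 2-D grayscale image."""
    h, w = image.shape
    centre = image[1:-1, 1:-1]
    # Eight neighbour offsets, clockwise from top-left; each contributes one bit.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros_like(centre, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = image[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        codes |= (neighbour >= centre).astype(np.uint8) << bit
    return codes

def lbp_histogram(image):
    """Summarize an image as a normalized 256-bin LBP histogram,
    usable as a fixed-length feature vector for a classifier."""
    hist, _ = np.histogram(lbp_codes(image), bins=256, range=(0, 256))
    return hist / hist.sum()
```

A feature vector like this could then be fed to the linear classifier from Chapter 1 instead of the raw pixels.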
The year 2012 saw one of the biggest turning points for computer vision (and subsequently, other machine learning areas) when the use of convolutional neural networks for image classification started a paradigm shift in how to solve this task (and many others). Rather than focusing on handcrafting better features to extract from our images, we use a data-driven approach that finds the optimal set of features to represent our problem dataset. A CNN will use a large number of training images and learn for itself the best features to represent our data in order to solve the classification task.
In this chapter, we will cover the following topics:
  • A look at the loss functions used for classification
  • The ImageNet and CIFAR datasets
  • Training a CNN to classify the CIFAR dataset
  • Introduction to the data API
  • How to initialize your weights
  • How to regularize your models to get better results

CNN model architecture

The crucial part of an image classification model is its CNN layers. These layers will be responsible for extracting features from image data. The output of these CNN layers will be a feature vector, which like before, we can use as input for the classifier of our choice. For many CNN models, the classifier will be just a fully connected layer attached to the output of our CNN. As shown in Chapter 1, Setup and Introduction to TensorFlow, our linear classifier is just a fully connected layer; this is exactly the case here, except that the size and input to the layer will be different.
It is important to note that at its core, the CNN architecture used for classification or for a regression problem such as localization (or any other problem that uses images, for that matter) would be the same. The only real difference is what happens after the CNN layers have done their feature extraction. For example, one difference could be the loss function used for different tasks, as shown in the following diagram:
You will see a recurring pattern in this book when we look at the different problems that CNNs can be used to solve. It will become apparent that lots of tasks involving images can be solved using a CNN to extract some meaningful feature vector from the input data, which is then manipulated in some way and fed to different loss functions, depending on the task. For now, let's crack on and focus first on the task of image classification by looking at the loss functions commonly used for it.
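To make the "CNN feature extractor plus fully connected classifier" pattern concrete, here is a minimal NumPy sketch of the forward pass: one convolution, a ReLU, flattening into a feature vector, and a linear classifier head with softmax. All names and shapes are illustrative; a real model would stack many learned layers, and this naive loop-based convolution is for clarity only:

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Naive 'valid' 2-D convolution (cross-correlation, as in most
    deep learning libraries) of a single-channel image."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def softmax(z):
    e = np.exp(z - z.max())  # shift for numerical stability
    return e / e.sum()

def tiny_cnn_forward(image, kernel, fc_weights, fc_bias):
    """Conv -> ReLU -> flatten -> fully connected -> softmax."""
    features = np.maximum(conv2d_valid(image, kernel), 0.0)  # ReLU
    feature_vector = features.ravel()          # the extracted feature vector
    logits = fc_weights @ feature_vector + fc_bias  # linear classifier head
    return softmax(logits)                     # class probabilities
```

Swapping the final head and loss function while keeping the convolutional feature extractor is exactly the recurring pattern described above.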

Cross-entropy loss (log loss)

The simplest form of image classification is binary classification. This is where we have a classifier that has just one object to classify, for example, dog/no dog. In this case, a loss function we are likely to use is the binary cross-entropy loss.
The cross-entropy function between true labels $p$ and model predictions $q$ is defined as:

$$H(p, q) = -\sum_{i} p_i \log q_i$$

with $i$ being the index for each possible element of our labels and predictions.

However, as we are dealing with the binary case, where we have only two possible outcomes, $y = 1$ and $y = 0$, then $p \in \{y,\, 1 - y\}$ and $q \in \{\hat{y},\, 1 - \hat{y}\}$ can be simplified down and we get:

$$-\big(y \log \hat{y} + (1 - y)\log(1 - \hat{y})\big)$$

This is equivalent to the general cross-entropy above with the sum written out for the two outcomes. Iterating over $m$ training examples, the cost function $L$ to be minimized then becomes this:

$$L = -\frac{1}{m} \sum_{j=1}^{m} \Big[ y^{(j)} \log \hat{y}^{(j)} + \big(1 - y^{(j)}\big) \log\big(1 - \hat{y}^{(j)}\big) \Big]$$

This is intuitively correct: when $y = 1$, we want to minimize $-\log \hat{y}$, which requires a large $\hat{y}$, and when $y = 0$, we want to minimize $-\log(1 - \hat{y})$, which requires a small $\hat{y}$.
In TensorFlow, the binary cross-entropy loss can be found in the tf.losses module. It is useful to know that the name for the raw output of our model is logits. Before we can pass this to the cross-entropy loss, we need to apply the sigmoid function to it so that our output is scaled between 0 and 1. TensorFlow actually combines all of these steps together into one operation, as shown in the following code. TensorFlow will also take care of averaging the loss across the batch for us.
loss = tf.losses.sigmoid_cross_entropy(multi_class_labels=labels_in, logits=model_prediction)
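As a sanity check on the formula above, here is a small NumPy re-implementation of what this operation computes: apply the sigmoid to the logits, evaluate the binary cross-entropy per example, and average over the batch. The function names are my own, and this naive version is for illustration only (TensorFlow's fused op is more numerically stable for extreme logits):

```python
import numpy as np

def sigmoid(z):
    """Squash raw logits into the (0, 1) range."""
    return 1.0 / (1.0 + np.exp(-z))

def binary_cross_entropy_from_logits(labels, logits):
    """Sigmoid followed by binary cross-entropy, averaged over the batch --
    the same quantity the TensorFlow line above is computing."""
    y = np.asarray(labels, dtype=float)
    y_hat = sigmoid(np.asarray(logits, dtype=float))
    per_example = -(y * np.log(y_hat) + (1.0 - y) * np.log(1.0 - y_hat))
    return per_example.mean()
```

For example, a logit of 0 means the model predicts 0.5, so the loss for a positive label is $-\log 0.5 \approx 0.693$, while a confidently correct logit drives the loss toward zero.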

Mul...
