Hands-On Convolutional Neural Networks with TensorFlow
eBook - ePub

Hands-On Convolutional Neural Networks with TensorFlow

Solve computer vision problems with modeling in TensorFlow and Python.

Iffat Zafar, Giounona Tzanidou, Richard Burton, Nimesh Patel, Leonardo Araujo

  1. 272 pages
  2. English
  3. ePUB (mobile friendly)

Book Information

Learn how to apply TensorFlow to a wide range of deep learning and machine learning problems with this practical guide to training CNNs for image classification, image recognition, object detection, and many other computer vision challenges.

Key Features

  • Learn the fundamentals of Convolutional Neural Networks
  • Harness Python and TensorFlow to train CNNs
  • Build scalable deep learning models that can process millions of items

Book Description

Convolutional Neural Networks (CNNs) are one of the most popular architectures used in computer vision applications. This book is an introduction to CNNs through solving real-world problems in deep learning while teaching you how to implement them in the popular Python library TensorFlow. By the end of the book, you will be training CNNs in no time!

We start with an overview of popular machine learning and deep learning models, and then get you set up with a TensorFlow development environment. This environment is the basis for implementing and training deep learning models in later chapters. Then, you will use Convolutional Neural Networks to work on problems such as image classification, object detection, and semantic segmentation.

After that, you will use transfer learning to see how these models can solve other deep learning problems. You will also get a taste of implementing generative models such as autoencoders and generative adversarial networks.

Later on, you will see useful tips on machine learning best practices and troubleshooting. Finally, you will learn how to apply your models on large datasets of millions of images.

What you will learn

  • Train machine learning models with TensorFlow
  • Create systems that can evolve and scale during their life cycle
  • Use CNNs in image recognition and classification
  • Use TensorFlow for building deep learning models
  • Train popular deep learning models
  • Fine-tune a neural network to improve the quality of results with transfer learning
  • Build TensorFlow models that can scale to large datasets and systems

Who this book is for

This book is for software engineers, data scientists, and machine learning practitioners who want to use CNNs to solve real-world problems. Knowledge of basic machine learning concepts, linear algebra, and Python will help.


Information

Year
2018
ISBN
9781789132823

Image Classification in TensorFlow

Image classification refers to the problem of classifying images into categories according to their contents. Let's start with an example task: deciding whether a picture contains a dog or not. A naive approach to this task would be to take an input image, reshape it into a vector, and then train a linear classifier (or some other kind of classifier) on it, as we did in Chapter 1, Setup and Introduction to TensorFlow. However, you would very quickly discover that this idea is bad for several reasons. Besides not scaling well to the size of your input image, a linear classifier will simply have a hard time separating one image from another.
In contrast to humans, who can see meaningful patterns and content in an image, the computer only sees an array of numbers from 0 to 255. The wide fluctuation of these numbers at the same locations across different images of the same class prohibits using them directly as input to the classifier. Ten example dog images taken from the Canadian Institute For Advanced Research (CIFAR) dataset illustrate this problem perfectly. Not only does the appearance of the dogs differ, but their pose and position in front of the camera also do. For a machine, each image at a glance is completely different, with no commonalities, whereas we as humans can clearly see that these are all dogs.
A better solution to our problem might be to tell the computer to extract some meaningful features from an input image, for example, common shapes, textures, or colors. We could then use these features, rather than the raw input image, as input to our classifier. Now, we are looking for the presence of these features in an image to tell us if it contains the object we want to identify or not.
These extracted features will look to us as simply a high-dimensional vector (but usually a much lower dimension than the original image space) that can be used as input for our classifier. Some well-known feature extraction methods that have been developed over the years are scale invariant features (SIFT), maximally stable extremal regions (MSER), local binary patterns (LBP), and histogram of oriented gradients (HOG).
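As a toy illustration of this style of hand-crafted feature extraction, the following sketch computes a basic 8-neighbour local binary pattern (LBP) descriptor in plain NumPy. The function names are my own, and a production pipeline would normally use an optimized library implementation (for example, the one in scikit-image) rather than this minimal version:

```python
import numpy as np

def lbp_codes(image):
    """Compute basic 8-neighbour Local Binary Pattern codes for the
    interior pixels of a 2-D grayscale image."""
    h, w = image.shape
    centre = image[1:-1, 1:-1]
    # Eight neighbour offsets, clockwise from top-left; each contributes one bit.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros_like(centre, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = image[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        codes |= (neighbour >= centre).astype(np.uint8) << bit
    return codes

def lbp_histogram(image):
    """Summarize an image as a normalized 256-bin LBP histogram,
    usable as a fixed-length feature vector for a classifier."""
    hist, _ = np.histogram(lbp_codes(image), bins=256, range=(0, 256))
    return hist / hist.sum()
```

A feature vector like this could then be fed to the linear classifier from Chapter 1 instead of the raw pixels.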
The year 2012 saw one of the biggest turning points for computer vision (and subsequently, other machine learning areas) when the use of convolutional neural networks for image classification started a paradigm shift in how to solve this task (and many others). Rather than focusing on handcrafting better features to extract from our images, we use a data-driven approach that finds the optimal set of features to represent our problem dataset. A CNN will use a large number of training images and learn for itself the best features to represent our data in order to solve the classification task.
In this chapter, we will cover the following topics:
  • A look at the loss functions used for classification
  • The ImageNet and CIFAR datasets
  • Training a CNN to classify the CIFAR dataset
  • Introduction to the data API
  • How to initialize your weights
  • How to regularize your models to get better results

CNN model architecture

The crucial part of an image classification model is its CNN layers. These layers will be responsible for extracting features from image data. The output of these CNN layers will be a feature vector, which like before, we can use as input for the classifier of our choice. For many CNN models, the classifier will be just a fully connected layer attached to the output of our CNN. As shown in Chapter 1, Setup and Introduction to TensorFlow, our linear classifier is just a fully connected layer; this is exactly the case here, except that the size and input to the layer will be different.
It is important to note that at its core, the CNN architecture used for classification or for a regression problem such as localization (or any other problem that uses images, for that matter) would be the same. The only real difference is what happens after the CNN layers have done their feature extraction. For example, one difference could be the loss function used for different tasks, as shown in the following diagram:
You will see a recurring pattern in this book when we look at the different problems that CNNs can be used to solve. It will become apparent that lots of tasks involving images can be solved using a CNN to extract some meaningful feature vector from the input data, which is then manipulated in some way and fed to different loss functions, depending on the task. For now, let's crack on and focus first on the task of image classification by looking at the loss functions commonly used for it.
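To make the "CNN feature extractor plus fully connected classifier" pattern concrete, here is a minimal NumPy sketch of the forward pass: one convolution, a ReLU, flattening into a feature vector, and a linear classifier head with softmax. All names and shapes are illustrative; a real model would stack many learned layers, and this naive loop-based convolution is for clarity only:

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Naive 'valid' 2-D convolution (cross-correlation, as in most
    deep learning libraries) of a single-channel image."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def softmax(z):
    e = np.exp(z - z.max())  # shift for numerical stability
    return e / e.sum()

def tiny_cnn_forward(image, kernel, fc_weights, fc_bias):
    """Conv -> ReLU -> flatten -> fully connected -> softmax."""
    features = np.maximum(conv2d_valid(image, kernel), 0.0)  # ReLU
    feature_vector = features.ravel()          # the extracted feature vector
    logits = fc_weights @ feature_vector + fc_bias  # linear classifier head
    return softmax(logits)                     # class probabilities
```

Swapping the final head and loss function while keeping the convolutional feature extractor is exactly the recurring pattern described above.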

Cross-entropy loss (log loss)

The simplest form of image classification is binary classification. This is where we have a classifier that has just one object to classify, for example, dog/no dog. In this case, a loss function we are likely to use is the binary cross-entropy loss.
The cross-entropy function between true labels $p$ and model predictions $q$ is defined as:

$$H(p, q) = -\sum_{i} p_i \log q_i$$

with $i$ being the index for each possible element of our labels and predictions.

However, as we are dealing with the binary case, where we have only two possible outcomes, $y = 1$ and $y = 0$, then $p \in \{y,\, 1 - y\}$ and $q \in \{\hat{y},\, 1 - \hat{y}\}$ can be simplified down and we get:

$$-\big(y \log \hat{y} + (1 - y)\log(1 - \hat{y})\big)$$

This is equivalent to the general cross-entropy above with the sum written out for the two outcomes. Iterating over $m$ training examples, the cost function $L$ to be minimized then becomes this:

$$L = -\frac{1}{m} \sum_{j=1}^{m} \Big[ y^{(j)} \log \hat{y}^{(j)} + \big(1 - y^{(j)}\big) \log\big(1 - \hat{y}^{(j)}\big) \Big]$$

This is intuitively correct: when $y = 1$, we want to minimize $-\log \hat{y}$, which requires a large $\hat{y}$, and when $y = 0$, we want to minimize $-\log(1 - \hat{y})$, which requires a small $\hat{y}$.
In TensorFlow, the binary cross-entropy loss can be found in the tf.losses module. It is useful to know that the name for the raw output of our model is logits. Before we can pass this to the cross-entropy loss, we need to apply the sigmoid function to it so that our output is scaled between 0 and 1. TensorFlow actually combines all of these steps together into one operation, as shown in the following code. TensorFlow will also take care of averaging the loss across the batch for us.
loss = tf.losses.sigmoid_cross_entropy(multi_class_labels=labels_in, logits=model_prediction)
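As a sanity check on the formula above, here is a small NumPy re-implementation of what this operation computes: apply the sigmoid to the logits, evaluate the binary cross-entropy per example, and average over the batch. The function names are my own, and this naive version is for illustration only (TensorFlow's fused op is more numerically stable for extreme logits):

```python
import numpy as np

def sigmoid(z):
    """Squash raw logits into the (0, 1) range."""
    return 1.0 / (1.0 + np.exp(-z))

def binary_cross_entropy_from_logits(labels, logits):
    """Sigmoid followed by binary cross-entropy, averaged over the batch --
    the same quantity the TensorFlow line above is computing."""
    y = np.asarray(labels, dtype=float)
    y_hat = sigmoid(np.asarray(logits, dtype=float))
    per_example = -(y * np.log(y_hat) + (1.0 - y) * np.log(1.0 - y_hat))
    return per_example.mean()
```

For example, a logit of 0 means the model predicts 0.5, so the loss for a positive label is $-\log 0.5 \approx 0.693$, while a confidently correct logit drives the loss toward zero.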

Mul...
