Deep Learning for Computer Vision
eBook - ePub

Rajalingappaa Shanmugamani, Abdul Ghani Abdul Rahman, Stephen Maurice Moore, Nishanth Koganti

  1. 310 pages
  2. English
  3. ePUB (mobile friendly)
  4. Available on iOS and Android

About this book

Learn how to model and train advanced neural networks to implement a variety of Computer Vision tasks

Key Features

  • Train different kinds of deep learning models from scratch to solve specific problems in Computer Vision
  • Combine the power of Python, Keras, and TensorFlow to build deep learning models for object detection, image classification, similarity learning, image captioning, and more
  • Includes tips on optimizing and improving the performance of your models under various constraints

Book Description

Deep learning has shown its power in several application areas of Artificial Intelligence, especially in Computer Vision. Computer Vision is the science of understanding and manipulating images, and it has many applications in areas such as robotics and automation. This book shows you, with practical examples, how to develop Computer Vision applications by leveraging the power of deep learning.

In this book, you will learn different techniques related to object classification, object detection, image segmentation, captioning, image generation, face analysis, and more. You will also explore their applications using popular Python libraries such as TensorFlow and Keras. This book will help you master state-of-the-art deep learning algorithms and their implementation.

What you will learn

  • Set up an environment for deep learning with Python, TensorFlow, and Keras
  • Define and train a model for image and video classification
  • Use features from a pre-trained Convolutional Neural Network model for image retrieval
  • Understand and implement object detection using the real-world Pedestrian Detection scenario
  • Learn about various problems in image captioning and how to overcome them by training on images and text together
  • Implement similarity matching and train a model for face recognition
  • Understand the concept of generative models and use them for image generation
  • Deploy your deep learning models and optimize them for high performance

Who this book is for

This book is for data scientists and Computer Vision practitioners who wish to apply Deep Learning concepts to problems in Computer Vision. A basic knowledge of programming in Python, along with some understanding of machine learning concepts, is required to get the best out of this book.

Information

Generative Models

Generative models have become an important part of computer vision. Unlike the applications discussed in previous chapters, which made predictions from images, generative models create images for specific objectives. In this chapter, we will cover:
  • The applications of generative models
  • Algorithms for style transfer
  • Training a model for super-resolution of images
  • Implementation and training of generative models
  • Drawbacks of current models
By the end of the chapter, you will be able to implement some great applications for transferring style and understand the possibilities, as well as difficulties, associated with generative models.

Applications of generative models

Let's start this chapter with the possible applications of generative models. The applications are numerous; we will look at a few of them to understand the motivation and possibilities.

Artistic style transfer

Artistic style transfer is the process of applying the style of an artwork to another image. For example, an image can be created that combines the artistic style of one image with the content of another. An example of one image combined with several different styles, from Gatys et al. (https://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Gatys_Image_Style_Transfer_CVPR_2016_paper.pdf), is shown here. Image A is the photo to which the style is applied, and the other images show the results:
Reproduced from Gatys et al.
This application has caught the public's attention, and there are several mobile apps in the market providing this facility.
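
In the approach of Gatys et al., the split between "style" and "content" is made concrete by comparing Gram matrices of CNN feature maps for style, and the feature maps themselves for content. The following is a minimal NumPy sketch of those two measures, not the implementation used later in this chapter. The random arrays stand in for real CNN activations, and the function names and the normalization of the Gram matrix are assumptions chosen for illustration.

```python
import numpy as np


def gram_matrix(features):
    """Channel-by-channel correlations of a (height, width, channels) feature map.

    The division by height * width is an illustrative normalization (an assumption).
    """
    height, width, channels = features.shape
    flat = features.reshape(height * width, channels)
    return flat.T @ flat / (height * width)


def style_loss(generated, style):
    """Mean squared difference between the Gram matrices of two feature maps."""
    return float(np.mean((gram_matrix(generated) - gram_matrix(style)) ** 2))


def content_loss(generated, content):
    """Mean squared difference between two feature maps, position by position."""
    return float(np.mean((generated - content) ** 2))


# Hypothetical stand-in for the activations of one CNN layer.
rng = np.random.default_rng(0)
style_features = rng.normal(size=(8, 8, 4))

# Shuffle the spatial positions: the same feature vectors, in different places.
perm = rng.permutation(8 * 8)
shuffled = style_features.reshape(64, 4)[perm].reshape(8, 8, 4)

print("style loss after shuffling:  ", style_loss(shuffled, style_features))
print("content loss after shuffling:", content_loss(shuffled, style_features))
```

Shuffling the spatial positions leaves the Gram matrix essentially unchanged, so the style loss stays near zero while the content loss becomes large. This is why Gram matrices are a reasonable summary of style: they record which features occur together, but not where in the image they occur.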

Predicting the next frame in a video

Generative models can predict future frames of a video, as demonstrated on synthetic video sets. In the following image from Lotter et al. (https://arxiv.org/pdf/1511.06380.pdf), the images on the left are the preceding frames given to the models, and on the right, the predictions of two algorithms are compared with the ground truth:
Reproduced from Lotter et al.
The frames produced by generative models can look realistic.

Super-resolution of images

Super-resolution is the process of creating a higher-resolution image from a lower-resolution one. Traditionally, interpolation was used to create such larger images, but interpolation loses high-frequency detail and produces a smoothed result. Generative models trained specifically for super-resolution create images with much finer detail. The following is an example of such a model, as proposed by Ledig et al. (https://arxiv.org/pdf/1609.04802.pdf). The image on the left is generated with 4x scaling and looks nearly indistinguishable from the original on the right:
Reproduced from Ledig et al.
Super-resolution is useful for rendering a low-resolution image on a high-quality display or in print. Another application is reconstructing good-quality images from compressed ones.
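
The smoothing effect of interpolation mentioned above can be seen with a small experiment. The following NumPy sketch shows only the interpolation baseline, not a generative super-resolution model. The helper functions `downsample` and `upsample_bilinear`, and the two test images, are hypothetical and written for this illustration.

```python
import numpy as np


def downsample(image, factor):
    """Shrink a 2-D image by averaging non-overlapping factor x factor blocks."""
    height, width = image.shape
    blocks = image.reshape(height // factor, factor, width // factor, factor)
    return blocks.mean(axis=(1, 3))


def upsample_bilinear(image, factor):
    """Enlarge a 2-D image by the given factor using bilinear interpolation."""
    height, width = image.shape
    rows = np.linspace(0, height - 1, height * factor)
    cols = np.linspace(0, width - 1, width * factor)
    r0 = np.floor(rows).astype(int)
    c0 = np.floor(cols).astype(int)
    r1 = np.minimum(r0 + 1, height - 1)
    c1 = np.minimum(c0 + 1, width - 1)
    fr = (rows - r0)[:, None]  # fractional row offsets
    fc = (cols - c0)[None, :]  # fractional column offsets
    top = image[r0][:, c0] * (1 - fc) + image[r0][:, c1] * fc
    bottom = image[r1][:, c0] * (1 - fc) + image[r1][:, c1] * fc
    return top * (1 - fr) + bottom * fr


def round_trip_error(image, factor=2):
    """Mean squared error after shrinking and then enlarging the image."""
    restored = upsample_bilinear(downsample(image, factor), factor)
    return float(np.mean((restored - image) ** 2))


# High-frequency detail: a checkerboard of single pixels.
checkerboard = (np.indices((8, 8)).sum(axis=0) % 2).astype(float)
# Low-frequency content: a smooth horizontal gradient.
gradient = np.tile(np.linspace(0.0, 1.0, 8), (8, 1))

print("checkerboard error:", round_trip_error(checkerboard))
print("gradient error:    ", round_trip_error(gradient))
```

The smooth gradient survives the round trip with a small error, while the checkerboard collapses to a uniform gray: once the fine pattern has been averaged away, interpolation has no information from which to restore it. A model trained for super-resolution instead learns what plausible detail looks like from its training images.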

Interactive image generation

Generative models can be used to create images interactively. A user makes edits, and images reflecting those edits are generated, as shown here in an example from Zhu et al. (https://arxiv.org/pdf/1609.03552v2.pdf):
Reproduced from Zhu et al.
As shown, the images are generated based on the shape and color of the edits. A green stroke at the bottom creates a grassland, a rectangle creates a skyscraper, and so on. The images are generated and refined as the user provides further input. The generated image can also be used to retrieve the most similar real image. Interactiv...
