eBook - ePub

Fundamentals of Deep Learning and Computer Vision

Name: Fundamentals of Deep Learning and Computer Vision
Author: Nikhil Singh

A Complete Guide to become an Expert in Deep Learning and Computer Vision

Nikhil Singh

English

About This Book

Master Computer Vision concepts using Deep Learning with easy-to-follow steps

Key Features

Set up the Python and TensorFlow environment
Learn core TensorFlow concepts using TensorFlow 2.0
Learn Deep Learning for computer vision applications
Understand different computer vision concepts and use cases
Understand different state-of-the-art CNN architectures
Build deep neural networks with transfer learning using features from pre-trained CNN models
Apply computer vision concepts with easy-to-follow code in Jupyter Notebook

Description
This book starts with setting up a Python virtual environment with the deep learning framework TensorFlow and then introduces the fundamental concepts of TensorFlow. Before moving on to Computer Vision, you will learn about neural networks and related aspects such as loss functions, gradient descent optimization, activation functions and how backpropagation works for training multi-layer perceptrons.
To understand how a Convolutional Neural Network (CNN) is used for computer vision problems, you need to learn about the basic convolution operation. You will learn how a CNN differs from a multi-layer perceptron, work through a thorough discussion of the building blocks of the CNN architecture, such as kernel size, stride, padding, and pooling, and finally learn how to build a small CNN model.
The book concludes with a chapter on sequential models where you will learn about RNN, GRU, and LSTMs and their architectures and understand their applications in machine translation, image/video captioning and video classification.

Information

Publisher

BPB Publications

Year

2020

ISBN

9789388511858

Topic

Computer Science

Subtopic

Computer Vision & Pattern Recognition

CHAPTER 1 Introduction to TensorFlow

We live in the information age, or more precisely, the digital age. Technology has advanced by leaps and bounds over the past few years, and this has led to the creation of various smart devices. Smart devices such as smartphones, vehicles, smartwatches, household appliances, and other Internet of Things (IoT) devices are becoming ubiquitous, and they communicate with databases maintained in the cloud. These communications create large volumes of data that are stored in huge databases, and the amount of data on the Internet grows with every passing second. Around 2.5 quintillion bytes of data are created each day at the current pace, and images and videos are the major contributors. With the development of cloud and flexible storage capacity, developers are taking a “the more, the merrier” approach and actively working to gather more data, which helps them enhance their technology.

With the proliferation of IoT devices and the advent of social media, a huge amount of multimedia data is being generated, and most of it is unstructured and multimodal.

Making use of this multimedia data requires substantial computation, which has created huge opportunities in storage, processing, and analysis.

Structure

In this chapter, we will cover:

Defining tensors
Basic operations using TensorFlow
Session logging and variables
TensorBoard

Objective

Learn basic manipulations such as assigning variables, multiplying and transposing matrices, and resizing vectors and matrices using TensorFlow.

Machine learning and deep learning

It is fair to say that computer vision sits at the intersection of computation, storage, and the future of deep learning research. Some important applications of computer vision include the following:

Self-driving transportation
Fraud detection
Security systems
Public administration
Content analysis, management, and retrieval

As data proliferates, computationally efficient techniques are needed to use it in a meaningful manner. However, growth in CPU speed has not kept pace with the rate of data creation, which has led to the development of many parallel processing architectures. Lately, GPUs, which were primarily used for computer games, have increasingly been used for general-purpose computation to overcome this problem, and this has helped immensely in the rise of the machine learning field.

Machine learning is a technique that uses statistical and mathematical models to extract desired insights or information from data. It has been used to forecast values from previous years’ data and various other indicators, to classify emails into different categories, and to build recommendation systems that offer users choices aligned with their past behavior, among many other things. Recently, a branch of machine learning called deep learning has become popular among developers.

Deep learning is a powerful technique that provides flexible models by combining the multilayer perceptron algorithm with various mathematical concepts. When properly tuned, a deep learning model automatically finds a good combination of input features, and hence it can improve the accuracy of a decision-making process. We will get to know more about deep learning in later chapters, but before that, let’s discuss the frameworks and libraries that we will use to learn and implement various deep learning concepts and techniques.

The deep learning literature proposes many mathematical concepts and techniques. Implementing those concepts and training deep models on huge amounts of data requires dedicated programming tools and frameworks. Several programming libraries have been developed in recent years, but most of them came with trade-offs between flexibility and scalability. Researchers want libraries with a flexible structure, but these often do not scale well. Other libraries are fast and scalable but are built for specific models and networks, which makes them unsuitable for research, where the goal is to experiment quickly and develop better models.

In November 2015, Google released TensorFlow, a new open-source library, to overcome the above-mentioned problems.

What is TensorFlow?

When we visit the TensorFlow website (https://www.tensorflow.org), it is defined at the very beginning as an open-source software library for machine intelligence. But when we start to read its first paragraph, TensorFlow is defined as an open-source software library for numerical computation using data flow graphs. The latter definition seems a more cogent and comprehensive explanation of TensorFlow, since it includes its core structure. Don’t worry if you are not familiar with terms like data flow graph; we will look into all those terms and get familiar with them.
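To make the term a little more concrete before it is covered properly, the following is a minimal pure-Python sketch of a data flow graph. It is not TensorFlow’s implementation or API; the Node class and the constant helper are hypothetical names introduced only for illustration. Nodes stand for operations, edges carry values from one node to the next, and nothing is computed until a node is evaluated.

```python
import operator


class Node:
    """One vertex of a toy data flow graph (hypothetical, not a TensorFlow class)."""

    def __init__(self, name, op=None, inputs=(), value=None):
        self.name = name
        self.op = op                  # function applied to the incoming values
        self.inputs = tuple(inputs)   # upstream nodes, i.e. the incoming edges
        self.value = value            # only used by source nodes

    def evaluate(self):
        # A source node has no operation and simply emits its stored value.
        if self.op is None:
            return self.value
        # Otherwise pull values along the incoming edges, then apply the operation.
        args = [node.evaluate() for node in self.inputs]
        return self.op(*args)


def constant(name, value):
    """Create a source node that holds a fixed value."""
    return Node(name, value=value)


# Describe the computation (a + b) * c as a graph. Nothing is computed yet.
a = constant("a", 2.0)
b = constant("b", 3.0)
c = constant("c", 4.0)
total = Node("add", operator.add, (a, b))
result = Node("mul", operator.mul, (total, c))

# Evaluating the final node makes the values flow through the graph.
print(result.evaluate())  # 20.0
```

Describing the computation and running it are separate steps in this sketch, which is the core idea behind expressing numerical computation as a graph.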

From the website, we can see that TensorFlow is not defined merely as a machine learning library; instead, its definition uses the more comprehensive term numerical computation. Early releases of TensorFlow included a high-level wrapper package called Scikit Flow, which offered functionality comparable to that of Scikit-learn, but TensorFlow is not primarily designed to provide ready-made machine learning solutions. Instead, TensorFlow helps users design models from the basics by providing various functions and classes, and hence it helps them build customized and flexible models. Although TensorFlow does offer machine learning functionality, it is equally well suited to complex mathematical computations.

TensorFlow installation

Before diving deeper into TensorFlow concepts, let’s install the TensorFlow library first, so that we can validate our discussions and arguments by writing code as we go. The TensorFlow website provides a step-by-step procedure to install TensorFlow on macOS, Ubuntu, and Windows. We have illustrated the complete process below for an Ubuntu environment.

In this section, we will also look at the importance of other software, such as pip, virtual environments, and Jupyter Notebook, which will be helpful in installing and using TensorFlow.

If you already know how to use all this software, you can install TensorFlow directly by following the official guide on the TensorFlow website.
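Before installing anything, it can be useful to see whether the current Python environment already has TensorFlow. The snippet below is a small sketch that uses only the Python standard library (Python 3.8 or later); the helper name installed_version is hypothetical. It assumes the package was installed under the distribution name tensorflow, so a variant installed under a different name would not be detected.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Assumption: TensorFlow, if present, is installed as the "tensorflow" distribution.
version = installed_version("tensorflow")
if version is None:
    print("TensorFlow is not installed in this environment.")
else:
    print("TensorFlow", version, "is installed.")
```

Because the check reads package metadata rather than importing TensorFlow, it runs quickly and works the same way inside or outside a virtual environment.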

Okay, let’s discuss the importance of third-party software.

Jupyter Notebook and Matplotlib are two popular open-source tools that are widely used in data science. Jupyter Notebook lets us check the output of a script chunk by chunk or step by step, so we can use it as a kind of debugger for our script. Because TensorFlow takes a graph-based approach to solving a problem, Jupyter Notebook is handy for checking the output at each node while debugging.

Matplotlib is another useful library, which is used for visualization.

Virtual environment

We can install software directly at the machine level, but this has some disadvantages.

Let’s understand how. ...