eBook - ePub

Elements of Deep Learning for Computer Vision

Name: Elements of Deep Learning for Computer Vision
Author: Bharat Sikka

Explore Deep Neural Network Architectures, PyTorch, Object Detection Algorithms, and Computer Vision Applications for Python Coders (English Edition)

Bharat Sikka


About This Book

Conceptualizing deep learning in computer vision applications using PyTorch and Python libraries.

Key Features
- Covers a variety of computer vision projects, including face recognition and object recognition with YOLO and Faster R-CNN.
- Includes graphical representations and illustrations of neural networks and teaches how to program them.
- Includes deep learning techniques and architectures introduced by Microsoft, Google, and the University of Oxford.

Description
Elements of Deep Learning for Computer Vision gives a thorough understanding of deep learning and shows how to build highly accurate computer vision solutions with libraries like PyTorch. This book introduces you to deep learning and explains the concepts required to understand the basic working, development, and tuning of a neural network using PyTorch. The book then addresses the field of computer vision using two libraries: the Python wrapper of OpenCV and PIL. After establishing both of these primary concepts, the book brings them together by explaining Convolutional Neural Networks (CNNs). CNNs are further elaborated using top industry standards and research to explain how they perform complicated object detection in images and videos, and how they are evaluated. Towards the end, the book explains how to develop a fully functional object detection model, including its deployment over APIs. By the end of this book, you will be well versed in the role of deep learning in computer vision and will have a guided process for designing deep learning solutions.

What you will learn
- Get to know the mechanism of deep learning and how neural networks operate.
- Learn to develop a highly accurate neural network model.
- Use rich Python libraries to address computer vision challenges.
- Build deep learning models using PyTorch and learn how to deploy them through an API.
- Learn to develop object detection and face recognition models along with their deployment.

Who this book is for
This book is for readers who aspire to gain a strong fundamental understanding of how to apply deep learning to computer vision and image processing applications. Readers are expected to have intermediate Python skills. No previous knowledge of PyTorch or computer vision is required.

Table of Contents
1. An Introduction to Deep Learning
2. Supervised Learning
3. Gradient Descent
4. OpenCV with Python
5. Python Imaging Library and Pillow
6. Introduction to Convolutional Neural Networks
7. GoogLeNet, VGGNet, and ResNet
8. Understanding Object Detection
9. Popular Algorithms for Object Detection
10. Faster RCNN with PyTorch and YoloV4 with Darknet
11. Comparing Algorithms and API Deployment with Flask
12. Applications in Real World

About the Authors
Bharat Sikka is a data scientist based in Mumbai, India. Over the years, he has worked on implementing algorithms such as YOLOv3/v4, Faster-RCNN, and Mask-RCNN, among others. He is currently working as a data scientist at the State Bank of India. He also has a thorough knowledge of programming languages such as Python, R, MATLAB, and Octave for machine learning and deep learning, and of data visualization and analysis with Python, R, Power BI, and Tableau. He holds an MS degree in Data Science and Analytics from Royal Holloway, University of London, and a BTech degree in Information Technology from Symbiosis International University, and has earned multiple certifications, including MOOCs in varied fields such as machine learning. He is a science fiction fanatic, loves to travel, and is a great cook. Blog links: https://github.com/bharatsikka
LinkedIn Profile: www.linkedin.com/in/bharat-sikka


Information

Publisher: BPB Publications
Year: 2021
ISBN: 9789390684687
Topic: Computer Science
Subtopic: Computer Vision and Pattern Recognition

SECTION 1 Introductory Concepts

"Any sufficiently advanced technology is indistinguishable from magic."

-Arthur C. Clarke, Profiles of the Future (revised edition, 1973)

The next three chapters form the introductory section of this book. They explain the essential details of AI and deep learning that readers need in order to understand the later sections. If you already have a good understanding of these basic concepts, you can move on to Section 2: Computer Vision.

CHAPTER 1 An Introduction to Deep Learning

In this chapter, we will cover the basics of deep learning (DL) and how it is a subset of two popular terms, Machine Learning (ML) and Artificial Intelligence (AI). We need to know how AI works and how it serves our purpose. Many definitions have been developed by different people, and unless they are understood properly, they can cause confusion. We will dive deep into the background of these terms to understand their origins and how AI has evolved through the years.

The following topics will be covered in this chapter:

Deep learning and its basic concepts
Artificial intelligence, deep learning, and machine learning
History of AI and relationship with data science
Focus of this book i.e. computer vision
A brief understanding of a popular neural network developed by the University of Oxford
Future of deep learning

Objectives

By the end of this chapter, you should be able to:

Understand how AI has evolved through the years, from the 1950s to the 2020s.
Understand the meaning of AI, ML, neural networks, and deep learning.
Develop an intuition of what neural networks look like.

Figure 1.1: Artificial intelligence hierarchy

1.1 Artificial intelligence

There has been a great deal of hype around AI and data science in recent years. From engineers and researchers to data analysts and business decision makers, many people have been able to relate these terms to their work and have become keen to understand them. Many companies have promised a future of driverless cars and intelligent robots that reduce manual human effort, or of tasks handled entirely by robots or AI. The Economist has also argued that data has become more valuable than oil [12], and data, information, and knowledge all contribute to the development of AI. But how do we define intelligence? Is it the ability to understand patterns and predict? Many abilities represent intelligence, but once we are able to explain one of them, it no longer seems so intelligent.

Some see AI as automation, a way to reduce manual effort and human error, while many others see it as decision-making support that first analyzes data and then forecasts. AI has existed for a long time and is probably older than many people reading this book: it dates back to the 1950s, when a handful of computer science experts started asking whether computers could be made to think. We are going to study the vision challenge of AI, and by the end of this book you will be able to develop a variety of computer vision applications that can be used in an AI agent to provide visual aid.

AI is not a single algorithm that performs a single intelligent task, but a complete infrastructure that successfully automates tasks that humans generally perform and that require intelligence.

Today, AI is usually mentioned alongside machine learning and deep learning. Earlier, the approach to building an AI was to identify and automate all the subtasks of a bigger task, or to automate a set of rules, rather than to have the algorithm learn to make decisions. For example, to automate a chess game, the tasks were hard-coded with a huge set of rules and no learning was involved. This approach was known as symbolic AI, and until the 1980s it was the primary practice. But if symbolic AI could play chess and can be called intelligence, why do we need machine learning or deep learning? Although symbolic AI can perform really well at chess, it is complex to build and cannot perform tasks such as image classification or language translation. Hence, we arrive at another explanation of AI: an AI must be developed from experience rather than from programmed tasks.

An AI is a program whose behavior is substantially determined by its experience rather than its original programming. Experience may consist of observation, data mining, receiving instruction, or problem analysis.

—Conrad McDonnell
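The contrast between hard-coded rules and behavior determined by experience can be illustrated with a minimal sketch. Everything in it is a hypothetical assumption made for illustration: the toy task (deciding whether an image is "bright" from a single brightness value), the function names, and the example data. It is not taken from the book.

```python
# Hypothetical toy task: decide whether an image is "bright" from one number,
# its average brightness. Names and data are illustrative assumptions.

def rule_based_is_bright(brightness):
    """Symbolic style: a programmer hard-codes the rule by hand."""
    return brightness > 100  # threshold picked by the programmer


def learn_threshold(examples):
    """Learning style: pick the threshold that makes the fewest mistakes
    on labelled examples of the form (brightness, is_bright)."""
    candidates = sorted(value for value, _ in examples)
    best_threshold, best_errors = None, len(examples) + 1
    for t in candidates:
        # Count examples where "value > t" disagrees with the label.
        errors = sum((value > t) != label for value, label in examples)
        if errors < best_errors:
            best_threshold, best_errors = t, errors
    return best_threshold


# The "experience": a handful of labelled examples.
examples = [(10, False), (25, False), (40, False),
            (60, True), (80, True), (120, True)]
threshold = learn_threshold(examples)


def learned_is_bright(brightness):
    return brightness > threshold
```

With these examples, the hand-written rule labels a brightness of 60 as not bright, while the learned rule follows the data and labels it as bright. Changing the examples changes the learned behavior without touching the code, which is the point of the definition above.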

One of the most popular personalities in AI, Andrew Ng, also made a bold claim about AI and automation: any decision that a person can make within a second can and should be automated. Although this statement may well be correct, some one-second decisions may in turn depend on other decisions that take more than a second to make.

To understand deep learning conceptually, we first need to dive into ML and neural networks, which form the basis of deep learning and of the upcoming chapters in this book.

Figure 1.2: A publicly available image from Microsoft's COCO dataset with detections such as human and motorcycle [1]

1.2 Machine learning

ML is both a concept and a field of study in which machines or systems learn from experience to improve their decision making and perform tasks using algorithms. These pattern recognition algorithms help us predict further occurrences of data from the data already provided. Learning in ML describes the process of reducing prediction errors until a minimum error rate is reached, after which future occurrences in data can be predicted accurately. ML can be divided into categories by the type of learning:

Statistical learning: Using conventional statistical learning techniques, which include parametric approaches like linear regression and logistic regression, and non-parametric approaches like k-nearest neighbors (KNN).
Neural networks: Using neurons and neural networks. This is the main area of learning in this book, with a focus on computer vision.
Reinforcement learning: Using conventional animal learning techniques that rely on factors like state, action, reward, and policy. Learning is achieved with algorithms and techniques such as the Markov decision process, the Bellman equation, and Q-learning.

Figure 1.3: Machine Learning categorized by learning techniques
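The idea of learning as reducing prediction error can be sketched in a few lines of plain Python. The sketch below is only an illustration under stated assumptions: the toy data follows an assumed rule y = 2x, the model is a single weight w, and the error is reduced with gradient descent (the subject of a later chapter according to the table of contents). The variable names and the learning rate are hypothetical choices.

```python
# Toy data generated from an assumed rule y = 2x; the model must find w.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]


def mean_squared_error(w):
    """Average squared difference between predictions w*x and targets y."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


w = 0.0               # initial guess for the weight
learning_rate = 0.01  # illustrative step size
errors = []           # error recorded before each update

for step in range(200):
    errors.append(mean_squared_error(w))
    # Derivative of the mean squared error with respect to w.
    gradient = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
    # Move w a small step in the direction that reduces the error.
    w -= learning_rate * gradient

print(f"learned w = {w:.4f}")
print(f"first error = {errors[0]:.4f}, last error = {errors[-1]:.2e}")
```

The recorded error starts at 30.0 for w = 0 and shrinks toward zero as w approaches 2, which is what "reaching a minimum error rate" means for this toy problem.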

We will only discuss neural networks in the subsequent chapters; statistical learning and reinforcement learning are beyond the scope of this book. A prior understanding of statistical learning is advised before moving further, as it will be really beneficial in the upcoming chapters. ML can also be categorized according to the available dataset:

Supervised learning: When the dataset comes with examples or labels, and patterns have to be understood with the help of these examples.
Unsupervised learning: When the dataset has no examples or labels, and patterns have to be understood from the data itself, using only what is already known about it.
Semi-supervised learning: When there is some information about the data but only a limited number of labeled examples are provided, which may not be enough for supervised learning.
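The difference between the first two categories can be sketched with a small example. It is a sketch under assumptions: the one-dimensional points, the labels "small" and "large", and both function names are hypothetical, and the two techniques shown (a nearest-neighbor classifier and a two-group version of k-means clustering) are picked only as simple representatives of each category.

```python
# Hypothetical 1-D dataset. The labels are used only by the supervised part.
points = [1.0, 1.2, 0.8, 5.0, 5.3, 4.9]
labels = ["small", "small", "small", "large", "large", "large"]


def nearest_neighbour_predict(x):
    """Supervised: copy the label of the closest labelled example."""
    distances = [abs(x - p) for p in points]
    return labels[distances.index(min(distances))]


def two_means(data, iterations=10):
    """Unsupervised: split the data into two groups without any labels.
    Assumes the data contains at least two distinct values."""
    centres = [min(data), max(data)]  # start from the two extremes
    for _ in range(iterations):
        groups = [[], []]
        for x in data:
            # Assign each point to the nearer of the two centres.
            nearer = 0 if abs(x - centres[0]) <= abs(x - centres[1]) else 1
            groups[nearer].append(x)
        # Move each centre to the mean of its group.
        centres = [sum(g) / len(g) for g in groups]
    return centres, groups


print(nearest_neighbour_predict(4.0))  # uses the labels
centres, groups = two_means(points)    # ignores the labels
print(centres, groups)
```

The supervised function can name its answer because the labels tell it what the groups mean. The unsupervised function recovers the same two groups from the data alone, but it has no names for them.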

We have mentioned earlier that ML algorithms are pattern recognition algorithms, but what are these patterns in data and how do we perceive and understa...