OpenCV 3 Computer Vision with Python Cookbook
eBook - ePub

  1. 306 pages
  2. English

About this book

Recipe-based approach to tackle the most common problems in Computer Vision by leveraging the functionality of OpenCV using Python APIs

About This Book

  • Build computer vision applications with OpenCV functionality via Python API
  • Get to grips with image processing, multiple view geometry, and machine learning
  • Learn to use deep learning models for image classification, object detection, and face recognition

Who This Book Is For

This book is for developers who have a basic knowledge of Python. If you are aware of the basics of OpenCV and are ready to build computer vision systems that are smarter, faster, more complex, and more practical than the competition, then this book is for you.

What You Will Learn

  • Get familiar with low-level image processing methods
  • See the common linear algebra tools needed in computer vision
  • Work with different camera models and epipolar geometry
  • Find out how to detect interesting points in images and compare them
  • Binarize images and mask out regions of interest
  • Detect objects and track them in videos

In Detail

OpenCV 3 is a native cross-platform library for computer vision, machine learning, and image processing. OpenCV's convenient high-level APIs hide very powerful internals designed for computational efficiency that can take advantage of multicore and GPU processing. This book will help you tackle increasingly challenging computer vision problems by providing a number of recipes that you can use to improve your applications.

In this book, you will learn how to process an image by manipulating pixels and analyze an image using histograms. Then, we'll show you how to apply image filters to enhance image content and exploit the image geometry in order to relay different views of a pictured scene. We'll explore techniques to achieve camera calibration and perform a multiple-view analysis.

Later, you'll work on reconstructing a 3D scene from images, converting low-level pixel information to high-level concepts for applications such as object detection and recognition. You'll also discover how to process video from files or cameras and how to detect and track moving objects. Finally, you'll get acquainted with recent approaches in deep learning and neural networks.

By the end of the book, you'll be able to apply your skills in OpenCV to create computer vision applications in various domains.

Style and approach

This book helps you learn the core concepts of OpenCV faster by taking a recipe-based approach where you can try out different code snippets to understand a concept.

Object Detection and Machine Learning

In this chapter, we will cover the following recipes:
  • Obtaining an object mask using the GrabCut algorithm
  • Finding edges using the Canny algorithm
  • Detecting lines and circles using the Hough transform
  • Finding objects via template matching
  • The real-time median-flow object tracker
  • Tracking objects using different algorithms via the tracking API
  • Computing the dense optical flow between two frames
  • Detecting chessboard and circle grid patterns
  • A simple pedestrian detector using the SVM model
  • Optical character recognition using different machine learning models
  • Detecting faces using Haar/LBP cascades
  • Detecting ArUco patterns for AR applications
  • Detecting text in natural scenes
  • The QR code detector and recognizer

Introduction

Our world contains a lot of objects. Each type of object has its own features that distinguish it from some types and, at the same time, make it similar to others. Understanding the scene through the objects in it is a key task in computer vision. Being able to find and track various objects, detect basic patterns and complex structures, and recognize text are challenging and useful skills, and this chapter addresses questions on how to implement and use them with OpenCV functionality.
We will review the detection of geometric primitives, such as lines, circles, and chessboards, and of more complex objects, such as pedestrians, faces, and ArUco and QR code patterns. We will also perform object tracking tasks.

Obtaining an object mask using the GrabCut algorithm

There are cases where we want to separate an object from other parts of a scene; in other words, where we want to create masks for the foreground and background. This job is tackled by the GrabCut algorithm. It can build object masks in semi-automatic mode. All that it needs are initial assumptions about object location. Based on these assumptions, the algorithm performs a multi-step iterative procedure to model statistical distributions of foreground and background pixels and find the best division according to the distributions. This sounds complicated, but the usage is very simple. Let's find out how easily we can apply this sophisticated algorithm in OpenCV.

Getting ready

Before you proceed with this recipe, you need to install the OpenCV 3.x Python API package.

How to do it...

  1. Import the modules:
import cv2
import numpy as np
  2. Open an image and define the mouse callback function to draw a rectangle on the image:
img = cv2.imread('../data/Lena.png', cv2.IMREAD_COLOR)
show_img = np.copy(img)

mouse_pressed = False
y = x = w = h = 0

def mouse_callback(event, _x, _y, flags, param):
    global show_img, x, y, w, h, mouse_pressed

    if event == cv2.EVENT_LBUTTONDOWN:
        # Start a new rectangle at the click position
        mouse_pressed = True
        x, y = _x, _y
        show_img = np.copy(img)

    elif event == cv2.EVENT_MOUSEMOVE:
        if mouse_pressed:
            # Redraw the rectangle while the button is held down
            show_img = np.copy(img)
            cv2.rectangle(show_img, (x, y),
                          (_x, _y), (0, 255, 0), 3)

    elif event == cv2.EVENT_LBUTTONUP:
        mouse_pressed = False
        # Normalize so that dragging in any direction gives w, h >= 0
        x, y, w, h = min(x, _x), min(y, _y), abs(_x - x), abs(_y - y)
  3. Display the image. Once the rectangle has been drawn, press the A key to close the window:
cv2.namedWindow('image')
cv2.setMouseCallback('image', mouse_callback)

while True:
    cv2.imshow('image', show_img)
    k = cv2.waitKey(1)

    # Leave the loop only when a non-empty rectangle has been drawn
    if k == ord('a') and not mouse_pressed:
        if w*h > 0:
            break

cv2.destroyAllWindows()
  4. Call cv2.grabCut to create an object mask based on the rectangle that was drawn. Then, darken the pixels labeled as background and show the result:
labels = np.zeros(img.shape[:2], np.uint8)

labels, bgdModel, fgdModel = cv2.grabCut(img, labels, (x, y, w, h), None, None, 5, cv2.GC_INIT_WITH_RECT)

# Dim everything that is definitely or probably background
show_img = np.copy(img)
show_img[(labels == cv2.GC_PR_BGD)|(labels == cv2.GC_BGD)] //= 3

cv2.imshow('image', show_img)
cv2.waitKey()
cv2.destroyAllWindows()
  5. Define a mouse callback that paints corrections onto the mask. This is needed to repair mistakes left by the previous cv2.grabCut call:
label = cv2.GC_BGD
lbl_clrs = {cv2.GC_BGD: (0,0,0), cv2.GC_FGD: (255,255,255)}

def mouse_callback(event, x, y, flags, param):
    global mouse_pressed

    if event == cv2.EVENT_LBUTTONDOWN:
        mouse_pressed = True
        # Write the current label into the mask and draw it on the preview
        cv2.circle(labels, (x, y), 5, label, cv2.FILLED)
        cv2.circle(show_img, (x, y), 5, lbl_clrs[label], cv2.FILLED)

    elif event == cv2.EVENT_MOUSEMOVE:
        if mouse_pressed:
            cv2.circle(labels, (x, y), 5, label, cv2.FILLED)
            cv2.circle(show_img, (x, y), 5, lbl_clrs[label], cv2.FILLED)

    elif event == cv2.EVENT_LBUTTONUP:
        mouse_pressed = False
  6. Show the image with the mask and paint the corrections: use white where object pixels have been labeled as background, and black where background areas have been marked as belonging to the object. Press L to switch between the two labels and A to finish. Then, call cv2.grabCut again to get the corrected mask. Finally, update the mask on the image and show it:
cv2.namedWindow('image')
cv2.setMouseCallback('image', mouse_callback)

while True:
    cv2.imshow('image', show_img)
    k = cv2.waitKey(1)

    if k == ord('a') and not mouse_pressed:
        break
    elif k == ord('l'):
        # Toggle between cv2.GC_BGD and cv2.GC_FGD
        label = cv2.GC_FGD - label

cv2.destroyAllWindows()

labels, bgdModel, fgdModel = cv2.grabCut(img, labels, None, bgdModel, fgdModel, 5, cv2.GC_INIT_WITH_MASK)

show_img = np.copy(img)
show_img[(labels == cv2.GC_PR_BGD)|(labels == cv2.GC_BGD)] //= 3

cv2.imshow('image', show_img)
cv2.waitKey()
cv2.destroyAllWindows()

How it works...

OpenCV's cv2.grabCut implements the GrabCut algorithm. The function can work in several modes and takes the following arguments: an 8-bit 3-channel input image, a matrix with initial labels for the pixels, a rectangle in (x, y, w, h) format that defines the label initialization, two matrices that store the process state, the number of iterations, and the mode in which the function should run.
The function returns the labels matrix and the two matrices that hold the state of the process. The labels matrix is single-channel, and each pixel stores one of these values: cv2.GC_BGD (the pixel definitely belongs to the background), cv2.GC_PR_BGD (the pixel probably belongs to the background), cv2.GC_PR_FGD (the pixel probably belongs to the foreground), or cv2.GC_FGD (the pixel definitely belongs to the foreground). The two state matrices are needed if we want to continue the process for a few more iterations.
There are three possible modes for the function: cv2.GC...

Table of contents

  1. Title Page
  2. Copyright and Credits
  3. Packt Upsell
  4. Contributors
  5. Preface
  6. I/O and GUI
  7. Matrices, Colors, and Filters
  8. Contours and Segmentation
  9. Object Detection and Machine Learning
  10. Deep Learning
  11. Linear Algebra
  12. Detectors and Descriptors
  13. Image and Video Processing
  14. Multiple View Geometry
  15. Other Books You May Enjoy

OpenCV 3 Computer Vision with Python Cookbook is written by Alexey Spizhevoy and Aleksandr Rybnikov.