eBook - ePub

Image Processing Masterclass with Python

Name: Image Processing Masterclass with Python
ISBN: 9789389898644

50+ Solutions and Techniques Solving Complex Digital Image Processing Challenges Using Numpy, Scipy, Pytorch and Keras

Sandipan Dey

About this book

Over 50 problems solved with classical algorithms + ML / DL models

Key Features

Problem-driven approach to practice image processing.
Practical usage of popular Python libraries: Numpy, Scipy, scikit-image, PIL and SimpleITK.
End-to-end demonstration of popular facial image processing challenges using MTCNN and Microsoft's Cognitive Vision APIs.

Description
This book starts with basic image processing and manipulation problems and demonstrates how to solve them with popular Python libraries and modules. It then concentrates on problems based on geometric image transformations and problems that can be solved with image hashing. Next, the book focuses on problems based on sampling, convolution, the discrete Fourier transform, frequency-domain filtering, and image restoration with deconvolution. It also solves image enhancement problems using different algorithms, such as spatial filters, and creates a super-resolution image using SRGAN. Finally, it explores popular facial image processing problems and solves them with machine learning and deep learning models using popular Python ML/DL libraries.

What you will learn

Develop a strong grip on the fundamentals of image processing and image manipulation.
Solve popular image processing problems using machine learning and deep learning models.
Gain working knowledge of Python libraries including numpy, scipy and scikit-image.
Use popular Python machine learning packages such as scikit-learn, Keras and PyTorch.
Live implementation of facial image processing techniques such as face detection, recognition, and parsing using dlib and MTCNN.

Who this book is for
This book is designed especially for computer vision users, machine learning engineers, and image processing experts who are looking to solve modern image processing/computer vision challenges.

Table of Contents
1. Chapter 1: Basic Image & Video Processing
2. Chapter 2: More Image Transformation and Manipulation
3. Chapter 3: Sampling, Convolution and Discrete Fourier Transform
4. Chapter 4: Discrete Cosine / Wavelet Transform and Deconvolution
5. Chapter 5: Image Enhancement
6. Chapter 6: More Image Enhancement
7. Chapter 7: Face Image Processing

About the Author
Sandipan Dey is a Data Scientist with a wide range of interests, covering topics such as Machine Learning, Deep Learning, Image Processing and Computer Vision. He has worked in numerous data science fields, such as recommender systems, predictive models for the events industry, sensor localization models, sentiment analysis, and device prognostics. He earned his master's degree in Computer Science from the University of Maryland, Baltimore County, and has published in a few IEEE data mining conferences and journals. He has also authored a couple of Image Processing books, published by an international publishing house. He has earned certifications from 100+ MOOCs on data science and related courses. He is a regular blogger (at sandipanweb @wordpress, medium and data science central) and is a Machine Learning education enthusiast. LinkedIn Profile: https://www.linkedin.com/in/sandipan-dey-0370276

CHAPTER 1 Basic Image and Video Processing

Introduction

Image processing refers to the automatic processing, manipulation, analysis, and interpretation of images using algorithms and code on a computer. Video processing is a special case of image processing that often employs video filters and in which the input and output signals are video files or video streams. Image and video processing have applications in many fields of science and technology, such as television, photography, robotics, remote sensing, medical diagnosis (CT scan/X-ray/MRI), and industrial inspection. Social networking sites such as Facebook and Instagram, which have become part of our daily lives and to which we upload huge numbers of images and videos every day, are typical examples of industries that need to use and innovate many image/video processing algorithms to process that uploaded content.

In this chapter, we shall solve a few initial image and video processing problems that will help us understand the basic concepts of image and video processing. Before we start processing or analysing an image or video, we need to be able to load it into memory using a suitable data structure and to save the processed result back to disk. It is also important to be able to visualize (plot) the image on the computer screen, so that the impact of an image processing algorithm can be seen immediately. Often an image or video needs to be pre-processed before it can be used in more complex algorithms (such as classification or segmentation, which you will learn more about in later chapters); for this, transformation and manipulation techniques such as resizing, cropping, and changing brightness and contrast are very useful. Similarly, as a post-processing step, we may need to apply some image/video manipulation or transformation techniques to obtain the desired output. With image transformation and manipulation, we can also enhance the appearance of an image (for example, by applying a filter).
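As a rough illustration of the kind of pre-processing mentioned above, the following minimal sketch crops an image and changes its brightness and contrast with a linear transformation, using only numpy. It assumes a synthetic image instead of one loaded from disk; the helper names crop and adjust_linear and the parameters alpha (contrast gain) and beta (brightness offset) are hypothetical names chosen for this example.

```python
import numpy as np

def crop(image, top, left, height, width):
    """Return the height x width window whose top-left corner is (top, left)."""
    return image[top:top + height, left:left + width]

def adjust_linear(image, alpha=1.0, beta=0.0):
    """Apply g = alpha * f + beta per pixel and clip back to the uint8 range."""
    out = alpha * image.astype(np.float64) + beta
    return np.clip(out, 0, 255).astype(np.uint8)

# Synthetic 6x8 RGB image (an assumption for this sketch): every pixel is (100, 100, 100).
image = np.full((6, 8, 3), 100, dtype=np.uint8)

patch = crop(image, top=1, left=2, height=3, width=4)
brighter = adjust_linear(patch, alpha=1.5, beta=20)   # 1.5 * 100 + 20 = 170
saturated = adjust_linear(patch, alpha=3.0)           # 3 * 100 = 300, clipped to 255

print(patch.shape, brighter[0, 0], saturated[0, 0])
```

The computation is done in floating point and clipped before converting back to uint8, because arithmetic carried out directly on uint8 arrays wraps around instead of saturating.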

In this chapter, you are going to learn how to use different Python libraries (numpy, scipy, scikit-image, opencv-python, and matplotlib) for basic image/video processing, manipulation, and transformation. We shall start by displaying the three channels of an RGB image with 3D visualizations. Next, we shall demonstrate how to capture a video from a camera and extract frames. Then, we shall show how to implement an Instagram-like Gotham filter. Finally, we shall explore the following problems on image manipulation and see how to solve them using Python libraries:

Plot an image montage, crop/resize images, and draw contours
Convert a PNG image with a palette to grayscale
Rotate an image and convert RGB to YUV color space (using scikit-image, PIL, python-opencv, and scipy.ndimage/misc)

Structure

This chapter is organized as follows:

Objectives
Problems
Display RGB image color channels in 3D

Video I/O

Read/write video files

Capture video from camera and extract frames with OpenCV-Python

Implement Instagram-like Gotham filter

Explore image manipulations (using scikit-image, PIL, python-opencv, and scipy ndimage/misc)

Plot image montage with scikit-image

Crop/resize images with SciPy ndimage module

Draw contours with OpenCV-Python

Counting objects in an image

Convert a PNG image with a palette to grayscale with PIL

Different ways to convert an RGB image to grayscale

Rotating an image with scipy.ndimage

Image differences with PIL

Converting RGB to HSV and YUV color spaces with scikit-image

Resizing an image with OpenCV-Python

Add a logo to an image with scikit-image

Change brightness/contrast of an image with linear transformation and gamma correction with OpenCV-Python

Detecting colors and changing colors of objects with OpenCV-Python

Object removal with seam carving

Creating fake miniature effect
Summary
Questions
Key terms
References

Objectives

After studying this chapter, you should be able to:

Understand image/video storage and data structures in Python
Do image/video file I/O in Python using different libraries
Write Python code to do basic image/video manipulations

Problems

Display RGB image color channels in 3D

It is very useful to be able to conceptualize an image as a function and visualize it, in order to understand what it is before doing further analysis or processing. A grayscale image can be thought of as a 2-D function f(x, y) of the pixel locations (x, y), which maps each pixel to its corresponding grey level (for example, an integer in [0, 255] or, equivalently, a floating-point number in [0, 1]), that is:

f : (x, y) → R
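To make the function view concrete, here is a minimal sketch that stores a small grayscale image as a numpy array and evaluates f(x, y) by indexing. The synthetic ramp image and the helper name f are assumptions made for this illustration; note that numpy indexes an image as image[row, column], that is, image[y, x].

```python
import numpy as np

# Hypothetical 4x6 grayscale image: a horizontal ramp of grey levels 0..255.
height, width = 4, 6
gray_u8 = np.tile(np.linspace(0, 255, width).astype(np.uint8), (height, 1))

def f(x, y, image=gray_u8):
    """Grey level at column x and row y of the image."""
    return image[y, x]

# The same function with grey levels rescaled to floats in [0, 1].
gray_float = gray_u8.astype(np.float64) / 255.0

print(f(0, 0), f(width - 1, height - 1))
print(gray_float.min(), gray_float.max())
```

Both arrays describe the same image; only the range of the function values differs.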

For an RGB image, there are three such functions that can be denoted as:

f_R(x, y), f_G(x, y) and f_B(x, y)

which correspond to the channels R, G, and B, respectively. The 3-D plot functions of the matplotlib library can be used to plot each of these functions. The following Python code shows how to plot the RGB channels separately in 3D.
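A minimal sketch of such a plot might look as follows. It is an illustration under stated assumptions and should not be read as the book's own listing: it uses a synthetic image in place of one read from disk, and the function name plot_channels_3d, the colormaps, and the non-interactive Agg backend are choices made for this example.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend (assumption); remove to open a window
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D  # noqa: F401  (registers the 3-D projection on older matplotlib)

def plot_channels_3d(rgb):
    """Plot each channel of an (H, W, 3) array as a surface z = f_c(x, y)."""
    h, w = rgb.shape[:2]
    x, y = np.meshgrid(np.arange(w), np.arange(h))  # pixel coordinate grids, shape (H, W)
    fig = plt.figure(figsize=(15, 5))
    for i, (name, cmap) in enumerate(zip("RGB", ("Reds", "Greens", "Blues"))):
        ax = fig.add_subplot(1, 3, i + 1, projection="3d")
        ax.plot_surface(x, y, rgb[..., i], cmap=cmap)
        ax.set_title(f"f_{name}(x, y)")
        ax.set_xlabel("x")
        ax.set_ylabel("y")
        ax.set_zlabel("intensity")
    return fig

# Synthetic 32x48 float image in [0, 1] (an assumption for this sketch).
height, width = 32, 48
xx, yy = np.meshgrid(np.linspace(0, 1, width), np.linspace(0, 1, height))
rgb = np.dstack([xx, yy, 1.0 - xx * yy])

fig = plot_channels_3d(rgb)
```

With an interactive backend, calling plt.show() displays the three surfaces; with the Agg backend used here, fig.savefig() with a file name of your choice writes them to disk instead.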

The following are the steps you need to follow:

First, start by importing all the required packages by using the following code. For reading an image, we need the imread() function from the scikit-image library’s io module. For array operations, ...
