Image Processing Masterclass with Python
eBook - ePub

50+ Solutions and Techniques Solving Complex Digital Image Processing Challenges Using Numpy, Scipy, Pytorch and Keras

About this book

Over 50 problems solved with classical algorithms + ML / DL models

Key Features

  • Problem-driven approach to practice image processing.
  • Practical usage of popular Python libraries: Numpy, Scipy, scikit-image, PIL and SimpleITK.
  • End-to-end demonstration of popular facial image processing challenges using MTCNN and Microsoft's Cognitive Vision APIs.

Description
This book starts with basic image processing and manipulation problems and demonstrates how to solve them with popular Python libraries and modules. It then concentrates on problems based on geometric image transformations and problems that can be solved with image hashing. Next, the book focuses on problems based on sampling, convolution, the discrete Fourier transform, frequency-domain filtering, and image restoration with deconvolution. It also solves image enhancement problems using different algorithms, such as spatial filters, and creates a super-resolution image using SRGAN. Finally, it explores popular facial image processing problems and solves them with machine learning and deep learning models using popular Python ML/DL libraries.

What you will learn

  • Develop a strong grip on the fundamentals of image processing and image manipulation.
  • Solve popular image processing problems using machine learning and deep learning models.
  • Gain working knowledge of Python libraries including NumPy, SciPy, and scikit-image.
  • Use popular Python machine learning packages such as scikit-learn, Keras, and PyTorch.
  • Implement facial image processing techniques such as face detection, recognition, and parsing with dlib and MTCNN.

Who this book is for
This book is designed for computer vision users, machine learning engineers, and image processing experts who are looking to solve modern image processing and computer vision challenges.

Table of Contents
1. Chapter 1: Basic Image & Video Processing
2. Chapter 2: More Image Transformation and Manipulation
3. Chapter 3: Sampling, Convolution and Discrete Fourier Transform
4. Chapter 4: Discrete Cosine / Wavelet Transform and Deconvolution
5. Chapter 5: Image Enhancement
6. Chapter 6: More Image Enhancement
7. Chapter 7: Face Image Processing

About the Author
Sandipan Dey is a data scientist with a wide range of interests, covering topics such as machine learning, deep learning, image processing, and computer vision. He has worked in numerous data science fields, such as recommender systems, predictive models for the events industry, sensor localization models, sentiment analysis, and device prognostics. He earned his master's degree in Computer Science from the University of Maryland, Baltimore County, and has published in a few IEEE data mining conferences and journals. He has also authored a couple of image processing books, published by an international publishing house. He has earned certifications from 100+ MOOCs on data science and related courses. He is a regular blogger (at sandipanweb on WordPress, Medium, and Data Science Central) and is a machine learning education enthusiast. LinkedIn Profile: https://www.linkedin.com/in/sandipan-dey-0370276


CHAPTER 1

Basic Image and Video Processing

Introduction

Image processing refers to the automatic processing, manipulation, analysis, and interpretation of images using algorithms and code on a computer. Video processing is a special case of image processing that often employs video filters and in which the input and output signals are video files or video streams. Image and video processing have applications in many fields of science and technology, such as television, photography, robotics, remote sensing, medical diagnosis (CT scan/X-ray/MRI), and industrial inspection. Social networking sites such as Facebook and Instagram, which have become part of our daily lives and to which we upload huge numbers of images and videos every day, are typical examples of industries that need to use and innovate on many image/video processing algorithms to process the content we upload.
In this chapter, we shall solve a few initial image and video processing problems that will help us understand the basic concepts of image and video processing. Before we start processing or analysing an image/video, we need to be able to load it into memory using a suitable data structure and to save the processed image/video back to disk. It is also important to be able to visualize (plot) the image on the computer screen, so that the impact of an image processing algorithm can be seen immediately. Often an image/video needs to be pre-processed before it can be used in more complex image/video processing algorithms (such as classification or segmentation, which you will learn more about in later chapters); transformation and manipulation techniques such as resizing, cropping, and changing brightness and contrast are very useful for this. Similarly, as a post-processing step, we may need to apply some image/video manipulation or transformation techniques to obtain the desired output. With image transformation and manipulation, we can also enhance the appearance of an image (for example, by applying a filter).
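As a rough illustration of the pre-processing steps mentioned above, the following minimal sketch crops an image and changes its brightness and contrast with a linear map, using NumPy only. The helper names `crop` and `adjust_brightness_contrast`, the parameters `alpha` and `beta`, and the synthetic random image are assumptions made for this example; they are not taken from the book's code.

```python
import numpy as np

def crop(image, top, left, height, width):
    """Return a rectangular sub-region of the image (a view, not a copy)."""
    return image[top:top + height, left:left + width]

def adjust_brightness_contrast(image, alpha=1.0, beta=0.0):
    """Apply the linear map alpha * pixel + beta, then clip to the uint8 range."""
    out = alpha * image.astype(np.float64) + beta
    return np.clip(out, 0, 255).astype(np.uint8)

# Synthetic 4x6 RGB image standing in for a loaded photo (assumption).
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(4, 6, 3), dtype=np.uint8)

patch = crop(image, top=1, left=2, height=2, width=3)
brighter = adjust_brightness_contrast(image, alpha=1.2, beta=30)
print(patch.shape, brighter.dtype)
```

Here `alpha` scales the contrast and `beta` shifts the brightness; clipping keeps the result inside the valid 8-bit range before converting back to `uint8`.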
In this chapter, you are going to learn how to use different Python libraries (numpy, scipy, scikit-image, opencv-python, and matplotlib) for basic image/video processing, manipulation, and transformation. We shall start by displaying the three channels of an RGB image with 3D visualizations. Next, we shall demonstrate how to capture a video from a camera and extract frames. Then, we shall show how to implement an Instagram-like Gotham filter. Finally, we shall explore the following problems on image manipulation and see how to solve them using Python libraries:
  • Plot image montage, crop/resize images, and draw contours
  • Convert PNG image with a palette to grayscale
  • Rotate an image and convert RGB to YUV color space (using scikit-image, PIL, opencv-python, and scipy.ndimage/misc)

Structure

This chapter is organized as follows:
  • Objectives
  • Problems
    Display RGB image color channels in 3D
    Video I/O
    Read/write video files
    Capture video from camera and extract frames with OpenCV-Python
    Implement Instagram-like Gotham filter
    Explore image manipulations (using scikit-image, PIL, opencv-python, and scipy.ndimage/misc)
    Plot image montage with scikit-image
    Crop/resize images with SciPy ndimage module
    Draw contours with OpenCV-Python
    Counting objects in an image
    Convert a PNG image with a palette to grayscale with PIL
    Different ways to convert an RGB image to grayscale
    Rotating an image with scipy.ndimage
    Image differences with PIL
    Converting RGB to HSV and YUV color spaces with scikit-image
    Resizing an image with OpenCV-Python
    Add a logo to an image with scikit-image
    Change brightness/contrast of an image with linear transformation and gamma correction with OpenCV-Python
    Detecting colors and changing colors of objects with OpenCV-Python
    Object removal with seam carving
    Creating fake miniature effect
  • Summary
  • Questions
  • Key terms
  • References

Objectives

After studying this chapter, you should be able to:
  • Understand image/video storage and data structures in Python
  • Do image/video file I/O in Python using different libraries
  • Write Python code to do basic image/video manipulations

Problems

Display RGB image color channels in 3D

It is very useful to be able to conceptualize an image as a function and visualize it, both to understand what it is and to support further analysis/processing. A grayscale image can be thought of as a 2-D function f(x, y) of the pixel locations (x, y), a function that maps each pixel to its corresponding grey level (for example, an integer in [0, 255] or, equivalently, a floating-point number in [0, 1]), that is:
$f : (x, y) \to \mathbb{R}$
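The idea of a grayscale image as a function can be sketched with a small NumPy array. The array values, the helper name `f`, and the convention that x indexes columns and y indexes rows are assumptions chosen for this illustration.

```python
import numpy as np

# A tiny 3x4 grayscale image with integer grey levels in [0, 255].
gray = np.array([[  0,  64, 128, 255],
                 [ 32,  96, 160, 224],
                 [ 16,  80, 144, 208]], dtype=np.uint8)

def f(x, y, image=gray):
    """Grey level at column x, row y (NumPy indexes as [row, column])."""
    return image[y, x]

# Equivalent floating-point representation with grey levels in [0, 1].
gray_float = gray.astype(np.float64) / 255.0

print(f(3, 0))           # grey level of the top-right pixel
print(gray_float[0, 3])  # the same pixel on the [0, 1] scale
```

Evaluating the function at a pixel location is simply an array lookup, and dividing by 255 moves between the integer and floating-point grey-level scales.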
For an RGB image, there are three such functions, which can be denoted as:
$f_R(x, y)$, $f_G(x, y)$ and $f_B(x, y)$
which correspond to the channels R, G, and B, respectively. Matplotlib's 3-D plot functions can be used to plot each of these functions. The following Python code shows how to plot the RGB channels separately in 3D.
The following are the steps you need to follow:
  1. First, start by importing all the required packages by using the following code. For reading an image, we need the imread() function from the scikit-image library’s io module. For array operations, ...
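Since the original listing is cut off here, the following is only a minimal sketch of how the three channel functions might be plotted as 3-D surfaces with matplotlib; it is not the book's code. The function name `plot_rgb_channels_3d`, the colormaps, and the synthetic gradient image are assumptions. In practice the image would be read from disk, for example with the `imread()` function from scikit-image's io module as the text describes.

```python
import numpy as np
import matplotlib.pyplot as plt

def plot_rgb_channels_3d(image):
    """Draw one 3-D surface per colour channel and return the figure."""
    height, width = image.shape[:2]
    # Pixel-coordinate grids: x runs over columns, y over rows.
    x, y = np.meshgrid(np.arange(width), np.arange(height))
    fig = plt.figure(figsize=(15, 5))
    channels = [("R", "Reds"), ("G", "Greens"), ("B", "Blues")]
    for i, (name, cmap) in enumerate(channels):
        ax = fig.add_subplot(1, 3, i + 1, projection="3d")
        # The channel intensity plays the role of the height f(x, y).
        ax.plot_surface(x, y, image[..., i], cmap=cmap, linewidth=0)
        ax.set_title(f"f_{name}(x, y)")
        ax.set_xlabel("x")
        ax.set_ylabel("y")
    return fig

# Synthetic 32x48 RGB gradient image with values in [0, 1] (assumption),
# used so the sketch runs without an image file.
yy, xx = np.mgrid[0:32, 0:48]
image = np.stack([xx / 47.0, yy / 31.0, (xx + yy) / 78.0], axis=-1)

fig = plot_rgb_channels_3d(image)
# plt.show()  # uncomment to view the figure interactively
```

Each subplot treats one channel as a height map over the pixel grid, which makes smooth gradients and sharp intensity changes easy to see.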
