In this series of articles, we will learn about Image Processing in Python. We will use the OpenCV (Open Source Computer Vision) library for that purpose. It provides numerous functions to perform image processing and real-time computer vision.
Introduction to Image Processing
Image data plays a central role in today’s machine learning and deep learning tasks. Many advances in computer vision, such as object detection and face recognition, have been made possible by large amounts of such data. These images need to be processed, manipulated, and analyzed before they are fed into a computer vision model. This is where image processing (digital image processing) comes into play. For example, images are resized to a common size, noise is removed from noisy images, and uneven illumination and reflections are reduced. In the rest of the article, we are going to learn about Image Processing in Python using OpenCV.
Image Processing vs. Computer Vision
Computer vision and image processing are often confused, so let’s clarify the difference before moving further. In digital image processing, both the input and the output are images. For example, you provide a noisy image as input to an image processing algorithm, and it returns the same image with the noise removed. In computer vision, however, the input is an image, and the output is some useful information about it. For example, you provide a person’s photo as input to a face recognition algorithm, and the result is the name of the person.
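The distinction can be sketched with two tiny functions. This is only an illustration using NumPy: the names invert and mean_brightness are made up for this example and are not part of OpenCV, and the brightness measure is a deliberately trivial stand-in for a real vision algorithm.

```python
import numpy as np


def invert(image: np.ndarray) -> np.ndarray:
    """Image processing: an image goes in, an image comes out."""
    return 255 - image


def mean_brightness(image: np.ndarray) -> float:
    """Computer-vision style: an image goes in, information comes out."""
    return float(image.mean())


# A hypothetical 2x2 grayscale image with 8-bit intensities.
image = np.array([[0, 50], [100, 250]], dtype=np.uint8)

inverted = invert(image)             # still a 2x2 uint8 image
brightness = mean_brightness(image)  # a single number describing the image

print(inverted)
print(brightness)
```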
Now, let’s get to know what an image is.
What is an Image?
An image is a two-dimensional function, F(x, y), where x and y are the spatial coordinates, and F(x, y) is the intensity at the point (x, y). Simply put, a digital image is a matrix of pixel values.
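To make the matrix view concrete, here is a minimal sketch that builds a tiny grayscale image by hand with NumPy. The pixel values are invented for illustration. Note that NumPy indexes the row first and the column second, so the array is addressed as image[row, column].

```python
import numpy as np

# A hypothetical 3x4 grayscale image: 3 rows (height), 4 columns (width).
# Each entry is an intensity from 0 (black) to 255 (white).
image = np.array(
    [
        [0, 64, 128, 255],
        [10, 20, 30, 40],
        [255, 255, 0, 0],
    ],
    dtype=np.uint8,
)

rows, cols = image.shape
print(rows, cols)   # number of rows and columns
print(image[0, 2])  # intensity at row 0, column 2
```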
Now we know what digital image processing is, its usage, and how it is different from computer vision. So, without further ado, let’s get started.
Installing OpenCV
First, we need to install OpenCV for Python. The pip package opencv-python is available for Windows, Linux, and macOS.
For Windows
pip install opencv-python
For Linux
$ sudo apt-get install python3-opencv
For macOS
$ brew install opencv
To see if the installation was successful, run the following code.
import cv2

print(cv2.__version__)
If the program runs without any error, then you have installed it successfully.
Let’s now see how to read an image.
Read Image
Reading an image is usually the first step of any image processing task in Python. The cv2.imread() function is used to read an image. It takes two arguments. The first argument is the path of the image. Make sure that the image is in the working directory; otherwise, provide the full path of the image. The second argument is optional. It takes a flag that specifies how the image should be read. The possible flags are:
- cv2.IMREAD_COLOR or pass 1. It loads the image in the color mode, i.e., three-channel BGR color format.
- cv2.IMREAD_GRAYSCALE or pass 0. It loads the image in the grayscale format.
- cv2.IMREAD_UNCHANGED or pass -1. As the name suggests, it loads the image as is.
We will learn about color formats in detail in later articles. The cv2.imread() function returns an image.
Note that if you provide a wrong path as an argument, cv2.imread() won’t raise any error. Instead, the return value will be None.
Display Image
To display an image, the cv2.imshow() function is used. It displays the image in a window. The function takes two arguments. The first argument is the name of the window, and the second argument is the image that we want to display.
Note that if you only use cv2.imshow() in your code, the window will close and the program will exit before you can even see the image. To avoid this, call the cv2.waitKey(0) function. It keeps the image on screen until you press a key. Finally, cv2.destroyAllWindows() closes all open windows.
We will use the following puppy image as our reference image.
Let’s read and display a cute puppy photo.
import cv2

img = cv2.imread("puppy.jpg", cv2.IMREAD_COLOR)
if img is None:
    print("Error in reading image.")
else:
    cv2.imshow("A cute puppy", img)
    cv2.waitKey(0)
    cv2.destroyAllWindows()
Output
In the above example, we read the image in the color format. Then, we check if it is loaded successfully, i.e., the given path is right. If it is, then we display that image.
Let’s now read the same image in the grayscale format. For simplicity, we will pass 0 instead of cv2.IMREAD_GRAYSCALE.
import cv2

img = cv2.imread("puppy.jpg", 0)
if img is None:
    print("Error in reading image.")
else:
    cv2.imshow("A cute puppy", img)
    cv2.waitKey(0)
    cv2.destroyAllWindows()
Output
Let’s now learn to save an image.
Save Image
The cv2.imwrite() function is used to save the image. It takes two arguments, the file path and the image that you want to save. It returns True if the image is saved successfully. Otherwise, it returns False.
Let’s use the same code as above and save the image that we read in the grayscale format.
import cv2

img = cv2.imread("puppy.jpg", 0)
if img is None:
    print("Error in reading image.")
else:
    cv2.imshow("A cute puppy", img)
    cv2.waitKey(0)
    cv2.destroyAllWindows()
    cv2.imwrite("puppygrayscale.jpg", img)
The image is saved in the current working directory with the name puppygrayscale.jpg.
Access and modify the pixel value
To access a pixel value in a grayscale image, write img[i, j], where i is the row and j is the column. However, if the image is in the color format, then you will have to specify the channel as well, i.e., img[i, j, c]. In the BGR format, c=0 represents the Blue channel, c=1 represents the Green channel, and c=2 represents the Red channel. If you want to modify it, just assign a value to it. Let’s see.
import cv2

img = cv2.imread("puppy.jpg", 0)
print(img[12, 0])
img[12, 0] = 0
print(img[12, 0])
Output
241
0
In the above code, first, we access the pixel value at row 12 and column 0 (indices start at 0). Then, we set it to 0 and display that value again.
Let’s now do it for the colored image as well.
import cv2

img = cv2.imread("puppy.jpg", 1)

# Access the pixel values of all three channels
print(img[12, 0])

# Access only the Blue channel
print(img[12, 0, 0])
Output
[229 242 244]
229
In the above code, first, we print the values of the three channels at row 12 and column 0. Then, we access only the value of the Blue channel.
All images in OpenCV Python are stored as NumPy arrays. Therefore, you can access and modify them, and retrieve their properties, in the same way as you do for a NumPy array.
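Because images are NumPy arrays, ordinary slicing works on whole regions and channels, not just single pixels. The sketch below uses a synthetic black image instead of a loaded photo, and the region and color are arbitrary choices for illustration.

```python
import numpy as np

# Synthetic 4x6 BGR image filled with black, standing in for a loaded photo.
img = np.zeros((4, 6, 3), dtype=np.uint8)

# Paint the top-left 2x3 block red. In BGR order, red is (0, 0, 255).
img[0:2, 0:3] = (0, 0, 255)

# Slice out a single channel: index 2 is Red in BGR.
red_channel = img[:, :, 2]

print(img[0, 0])          # a painted pixel
print(img[3, 5])          # an untouched pixel
print(red_channel.sum())  # 6 painted pixels x 255
```

Keep in mind that a NumPy slice such as red_channel is a view, so writing to it also changes img. Call .copy() on the slice if you need an independent array.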
Get Image Properties
The image.shape property returns a tuple containing the number of rows (height), columns (width), and channels. If the image is grayscale, the tuple contains only the rows and columns, with no channel entry.
The image.size property returns the total number of elements in the image array, i.e., number of rows × number of columns × number of channels. For a grayscale image, this equals the number of pixels.
The image.dtype property provides the data type of an image.
Consider the following example.
import cv2

img = cv2.imread("puppy.jpg", 1)
print(f"Dimension: {img.shape}")
print(f"Total number of elements: {img.size}")
print(f"Data type of image: {img.dtype}")
Output
Dimension: (321, 640, 3)
Total number of elements: 616320
Data type of image: uint8
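Since the shape tuple has two entries for grayscale images and three for color images, code that unpacks it has to handle both cases. Here is a minimal sketch of one way to do that. The helper name describe is hypothetical, and the arrays are blank stand-ins with the same dimensions as the puppy image above.

```python
import numpy as np


def describe(image):
    """Hypothetical helper: return (height, width, channels) for any image."""
    height, width = image.shape[:2]
    # Grayscale arrays are 2-D, so treat a missing third axis as one channel.
    channels = image.shape[2] if image.ndim == 3 else 1
    return height, width, channels


color = np.zeros((321, 640, 3), dtype=np.uint8)
gray = np.zeros((321, 640), dtype=np.uint8)

print(describe(color))
print(describe(gray))
print(color.size, gray.size)
```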
Now you know how to read, write, and display an image. You can also retrieve basic image properties and access and modify intensity values. That’s it for this article! See you in the next one, in which we will discuss basic operations on images. Over the course of the series, we will explore Image Processing in Python using OpenCV in depth.