We as humans can see and interpret images using our visual system and remember using our memory, but how do computers remember and interpret images?
Machines store an image as a grid of small square cells, each holding a number; each cell is called a pixel. Together, the pixel values make up the image, and they are stored as a matrix of numbers. The size of the matrix matches the size of the image (n x m), so an image with n rows and m columns contains n x m pixels.
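To make the matrix idea concrete, here is a minimal sketch using NumPy (an assumption on my part, although OpenCV's Python bindings do represent images as NumPy arrays). The pixel values are made up for illustration.

```python
import numpy as np

# A tiny 3x4 grayscale "image": 3 rows (n) by 4 columns (m), one number per pixel.
image = np.array(
    [[0, 64, 128, 255],
     [32, 96, 160, 224],
     [255, 128, 64, 0]],
    dtype=np.uint8,
)

rows, cols = image.shape   # matrix dimensions n x m
pixel_count = image.size   # total number of pixels, n * m
top_right = image[0, 3]    # pixel at row 0, column 3
```

Indexing is always row first, then column, which is worth remembering because OpenCV drawing functions take points as (x, y) instead.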
OpenCV: OpenCV (Open Source Computer Vision Library) is an image processing library that originated at Intel in 1999 and is written in C/C++. It performs image processing tasks such as resizing, blurring, sharpening, and geometric transformations.
With OpenCV we can process and pre-process image data. By applying certain operations to images, we can extract information from them for a specific goal such as edge detection or face recognition.
Importing an image: The most basic task in image processing with OpenCV is to load an image in Python and display it.
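import cv2 as cv
img = cv.imread('image.jpg')  # returns None if the file cannot be read
cv.imshow('Image_tag', img)   # first argument is the window title
cv.waitKey(0)                 # keep the window open until a key is pressed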
Images in different color spaces:
Color spaces: A color space is a way of representing the colors of an image as a combination of channels. OpenCV can convert images between many color spaces.
A grayscale image has a single channel whose values range from 0 to 255: the lowest value (0) represents black and the highest value (255) represents white. The size of a grayscale image is n*m.
To convert images into grayscale images:
img_gray = cv.cvtColor(img, cv.COLOR_BGR2GRAY)
cv.imshow('Image', img_gray)
cv.waitKey(0)
cv.destroyAllWindows()
A color image is made of three color channels, red, green, and blue (RGB). Its size is n*m*3, where n and m are the number of rows and columns and 3 is the number of color channels. Note that OpenCV stores the channels in BGR order (blue, green, red) when it loads an image.
Other color spaces include HSV, which stands for hue, saturation, and value, and HSL, which stands for hue, saturation, and lightness.
In both models, hue is the angle around the central axis and identifies the actual color, while saturation is the distance from that axis. The third component runs along the axis: value in HSV describes how bright the color is, and lightness in HSL describes how much light it contains, from black through the pure color to white.
Drawing shapes on images: In image processing, we often need to draw shapes such as a rectangle, a circle, or a line to highlight specific information in an image, and we can also write text to label something.
Rectangle: To draw a rectangle we use the function cv.rectangle() with the parameters: image, coordinate of the top-left corner, coordinate of the bottom-right corner, color of the rectangle, thickness of the rectangle.
Circle: To draw a circle we use the function cv.circle() with the parameters: image on which to draw the circle, coordinate of the center point, radius, color of the circle, thickness of the circle.
Line: To draw a line of any length we use the function cv.line() with the parameters: image on which to draw the line, starting coordinate, ending coordinate, color of the line, thickness of the line.
Text: To add text to an image we use the function cv.putText() with the parameters: image on which to write, the message, coordinate of the bottom-left corner of the text, font type, font scale, color of the text, thickness, line type.
<Code Link given below>
Rescaling or Resizing Frames: Images come in many resolutions and sizes, so we often need to make an image larger or smaller.
A video frame or image can be resized to any size with the OpenCV function cv.resize(), which takes the image, the target size as (width, height), and the interpolation method used for enlarging or shrinking.
Translation: Translation shifts an image left, right, up, or down within the frame. We create a 2x3 transformation matrix [[1, 0, tx], [0, 1, ty]], where tx and ty are the shifts along x and y, and pass it to cv.warpAffine() together with the image and the output dimensions (width, height).
Rotation: In OpenCV, an image can be rotated with cv.warpAffine(), which takes the image, the rotation matrix, and the output dimensions. The rotation matrix is built with cv.getRotationMatrix2D() from the center of rotation (usually the image center), the angle of rotation, and the scaling value.
<Code Link given below>
Kernel: Extracting information from images requires operations that work on groups of pixels. A kernel is a small 2-dimensional matrix of numbers that defines such an operation on the pixels of the image.
The kernel slides over the image. At each position it combines the values of the surrounding pixel neighborhood to determine the new value of the center pixel, and it repeats this until the whole image has been scanned. Depending on the kernel's weights, the result is, for example, a blurred or a sharpened version of the original image.
Each pixel of the convolved image is the weighted sum of the pixels in its neighborhood: the neighborhood and the kernel are multiplied element-wise, and the products are added up to give the new value.
Smoothing: Smoothing removes noise, i.e. the high-frequency content of the image, with a low-pass filter (LPF) that averages pixel values with the help of a kernel. The result is a blurred image.
We can blur an image with the cv.GaussianBlur() function.
Sharpening: Sharpening emphasizes the edges in an image, which helps computers differentiate between foreground and background and detect objects accurately. Edges are the places in an image where brightness changes sharply.
We can detect edges in an image with the cv.Canny() function in the OpenCV library.
Thresholding: Thresholding binarizes an image so that every pixel becomes either black or white, for example black for the background and white for the foreground, or vice versa.
In simple words, when a pixel value is greater than the set threshold, the pixel is assigned one value (for example white); otherwise it is assigned the other (for example black).
The arguments of the cv.threshold() function are: source image (grayscale), threshold value, maximum value to assign to pixels above the threshold, and type of thresholding. The function returns the threshold that was used and the thresholded image.
Adaptive Thresholding: A single global threshold does not suit every image. Calculating a separate threshold for each small region of the image gives better results when the illumination varies across the image. The arguments of the cv.adaptiveThreshold() function are: source image, maximum value, adaptive method (for example ADAPTIVE_THRESH_MEAN_C), thresholding type (for example THRESH_BINARY), block size, and a constant C that is subtracted from the computed value.
ADAPTIVE_THRESH_MEAN_C: the threshold value is the mean of the neighborhood area, minus C.
ADAPTIVE_THRESH_GAUSSIAN_C: the threshold value is the weighted sum of the neighborhood values, where the weights form a Gaussian window, minus C.
Histogram Computation: A histogram is a graph that shows the intensity distribution of an image. The X-axis covers the pixel values from 0 to 255 and the Y-axis gives the number of pixels in the image with each value.
We can calculate a histogram in OpenCV with the cv.calcHist() function. The parameters of the function are:
cv.calcHist(images, channels, mask, histSize, ranges)
images: the source image of type uint8 or float32, given in square brackets because the function expects a list of images.
channels: the index of the channel for which we want to calculate the histogram, also in square brackets. For a grayscale image it is [0]. For a color image loaded by OpenCV, [0], [1], and [2] select the blue, green, and red channels respectively.
mask: a mask image that restricts the histogram to a particular region of the image. To compute the histogram of the full image, pass None.
histSize: the number of bins (intervals of pixel values) in which pixels are counted, in square brackets. For the full scale it is [256].
ranges: the range of pixel values to measure; normally it is [0, 256].
Masking: Masking lets us work with the pixel intensities of one particular region of the image. The area of interest is extracted explicitly using bitwise operations.
Bitwise operations are the bitwise AND, OR, NOT, and XOR operations. The blank image used for the mask and the source image need to have the same size for a bitwise operation to work.
Create a mask: To mask a region, we draw a filled circle or rectangle on a blank image (see the section on drawing shapes above).
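# create a mask (assumes cv2 is imported as cv, numpy as np, and img is already loaded)
mask = np.zeros(img.shape[:2], np.uint8)
# draw a filled white circle at the image center; the radius of 100 is an assumed value
cv.circle(mask, (img.shape[1]//2, img.shape[0]//2), 100, 255, -1)
# keep the pixels inside the circle and set everything else to black
masked_img = cv.bitwise_and(img, img, mask=mask)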
These are some of the basic image processing techniques with which we can extract information from images, so that machines are able to interpret images and power applications like face detection, gesture recognition, etc.
I hope you find this tutorial informative and practical on your journey to learn image processing.