The “Maybe just a quick one” series title is inspired by my most common reply to “Fancy a drink?”, which may or may not turn into a long night. Likewise, these posts are intended to be short, but I get carried away sometimes, so apologies in advance.
There are two major types of segmentation that the library can help with:
Semantic: Objects in an image belonging to the same class are segmented with the same colormap.
Instance: Instances of the same object are segmented with different colormaps.
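To make the distinction concrete, here is a toy example (plain Python, nothing to do with PixelLib's internals) of how the two schemes would label a scene containing two cats and a dog: class-level ids for semantic segmentation, per-object ids for instance segmentation. The labels are made up for illustration.

```python
# A scene with two cats and one dog (illustrative labels, not PixelLib output).
objects = ["cat", "dog", "cat"]

# Semantic segmentation: one label per class, so both cats share a label.
class_ids = {"cat": 1, "dog": 2}
semantic_labels = [class_ids[obj] for obj in objects]
print(semantic_labels)  # [1, 2, 1] -> both cats get the same colormap

# Instance segmentation: one label per object, so each cat is distinct.
instance_labels = list(range(1, len(objects) + 1))
print(instance_labels)  # [1, 2, 3] -> each object gets its own colormap
```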
PixelLib can accomplish a variety of tasks. In this article, we will go through some of them.
Before starting, some necessary packages need to be installed: Tensorflow (version 2 and above), OpenCV, and of course, PixelLib itself:
pip install opencv-python
pip install tensorflow
pip install pixellib --upgrade
For the use cases illustrated in this article, three models will be used:

Mask R-CNN model trained on the coco dataset: The h5 model file can be downloaded here.

Deeplabv3+ model trained on the pascalvoc dataset: PixelLib supports two deeplabv3+ models, Keras and TensorFlow. The Keras model is extracted from the TensorFlow model’s checkpoint, and the TensorFlow model performs better than the Keras model extracted from it, so the TensorFlow model will be used in this article. It can be downloaded here.

Xception model trained on the ade20k dataset: The h5 model file can be downloaded here.

All models and assets can also be found on the repo’s releases page.
Everything is set up now. Let’s write some code, shall we?
The folder structure I will be using is this:
├── app.py
├── input_images
│   ├── clem-onojeghuo-L_hK813fu9k-unsplash.jpg
│   ├── docusign-yiW2yzZNnFo-unsplash.jpg
│   └── jen-theodore-C6LzqZakyp4-unsplash.jpg
└── models
    ├── deeplabv3_xception65_ade20k.h5
    ├── mask_rcnn_coco.h5
    └── xception_pascalvoc.pb
app.py: The Python script where all the coding will happen.
input_images: Images that will be used for demonstration purposes.
models: The saved models.
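Before loading anything, it can save some head-scratching to check that the model files are actually where the scripts expect them. This is an optional helper of my own (not part of PixelLib) that reports any missing paths:

```python
from pathlib import Path

# Paths the snippets in this article rely on (helper is my own, not PixelLib's).
EXPECTED = [
    "models/mask_rcnn_coco.h5",
    "models/deeplabv3_xception65_ade20k.h5",
    "models/xception_pascalvoc.pb",
]

def missing_files(root="."):
    """Return the expected model files that are not present under root."""
    return [p for p in EXPECTED if not (Path(root) / p).exists()]

for path in missing_files():
    print(f"Missing: {path}")
```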
Import the necessary packages:
import pixellib
from pixellib.instance import instance_segmentation
from pixellib.semantic import semantic_segmentation
import cv2
from pixellib.tune_bg import alter_bg
Let’s see how instance vs. semantic segmentation looks when applied to this image:
Instance segmentation using the Mask R-CNN model
segment_image = instance_segmentation()
segment_image.load_model("models/mask_rcnn_coco.h5")
segment_image.segmentImage("input_images/docusign-yiW2yzZNnFo-unsplash.jpg",
                           output_image_name="instance_seg.jpg",
                           text_size=8,
                           box_thickness=5,
                           text_thickness=5,
                           show_bboxes=True)
output_image_name: Name under which the output image will be saved.
show_bboxes: Show bounding boxes around the detected instances.
text_size, box_thickness, text_thickness: Size of the labels and thickness of the boxes and text, respectively.
Semantic segmentation using the Xception model trained on ade20k dataset
segment_image = semantic_segmentation()
segment_image.load_ade20k_model("models/deeplabv3_xception65_ade20k.h5")
segment_image.segmentAsAde20k("input_images/docusign-yiW2yzZNnFo-unsplash.jpg",
                              output_image_name="semantic_seg.jpg")
Notice how the semantically segmented image uses the same colormap for the same object types.
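Under the hood, a semantic result is essentially a label map where every pixel holds a class id, and rendering it means mapping each id to a fixed color. Here is a minimal sketch of that idea (my own illustration; PixelLib’s actual palette and data structures differ):

```python
# Fixed palette: each class id always maps to the same RGB color,
# which is why two objects of the same class share a colormap.
PALETTE = {0: (0, 0, 0), 1: (128, 0, 0), 2: (0, 128, 0)}

def colorize(label_map):
    """Turn a 2D grid of class ids into a 2D grid of RGB tuples."""
    return [[PALETTE[label] for label in row] for row in label_map]

labels = [[1, 1, 0],
          [2, 2, 0]]
colored = colorize(labels)
print(colored[0][0] == colored[0][1])  # True: same class, same color
```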
The underlying object segmentation capabilities of PixelLib can be applied to accomplish image tuning tasks as well. For example, we can change the background of an image and replace it with another one. I will use this image to serve as the foreground:
…and this as a background:
The model used for this task is the Deeplabv3+ model trained on pascalvoc dataset.
change_bg = alter_bg(model_type="pb")
change_bg.load_pascalvoc_model("models/xception_pascalvoc.pb")

# Change background
change_bg.change_bg_img(f_image_path="input_images/jen-theodore-C6LzqZakyp4-unsplash.jpg",
                        b_image_path="input_images/clem-onojeghuo-L_hK813fu9k-unsplash.jpg",
                        output_image_name="new_img.jpg")
f_image_path, b_image_path: The foreground and background image paths, respectively.
output_image_name: Name under which the output image will be saved.
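Conceptually, the background swap boils down to: segment the foreground, build a binary mask, then take foreground pixels where the mask is set and background pixels everywhere else. A toy sketch of that compositing step (plain Python and made-up “pixels”; PixelLib performs this on real masks and images internally):

```python
def composite(foreground, background, mask):
    """Pick foreground pixels where mask is 1, background pixels where it is 0."""
    return [
        [fg if m else bg for fg, bg, m in zip(fg_row, bg_row, m_row)]
        for fg_row, bg_row, m_row in zip(foreground, background, mask)
    ]

fg   = [["dog", "dog"], ["dog", "dog"]]    # pixels of the foreground image
bg   = [["sand", "sea"], ["sand", "sea"]]  # pixels of the new background
mask = [[1, 0], [1, 1]]                    # 1 = foreground object detected here

print(composite(fg, bg, mask))
# [['dog', 'sea'], ['dog', 'dog']]
```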
Let’s see the outcome:
That is not too bad, is it? The little fella was teleported to a lovely sandy beach!
I will apply the same instance segmentation logic, using the Mask R-CNN model, but this time the input source will be a camera capturing in real time. Only a small amount of code needs to change:
capture = cv2.VideoCapture(0)

segment_video = instance_segmentation()
segment_video.load_model("models/mask_rcnn_coco.h5")
segment_video.process_camera(capture,
                             frames_per_second=15,
                             output_video_name="output.mp4",
                             show_frames=True,
                             show_bboxes=True,
                             frame_name="frame",
                             extract_segmented_objects=False,
                             save_extracted_objects=False)
You might have noticed that I have included the extract_segmented_objects and save_extracted_objects arguments here, which are False by default. If these are set to True, the extracted objects will be saved as individual images.
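Extraction rests on the same masking idea as the background swap: keep the pixels inside an object’s mask and blank out everything else. An illustrative sketch of my own (the real arguments save actual image crops to disk):

```python
def extract_object(image, mask, fill=0):
    """Keep pixels where mask is 1; replace everything else with fill."""
    return [
        [px if m else fill for px, m in zip(img_row, m_row)]
        for img_row, m_row in zip(image, mask)
    ]

image = [[10, 20], [30, 40]]  # made-up grayscale pixel values
mask  = [[1, 0], [0, 1]]      # 1 = pixel belongs to the detected object

print(extract_object(image, mask))  # [[10, 0], [0, 40]]
```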
And here are some screenshots taken from the processed video:
PixelLib manages to identify and segment some common objects with high confidence.
PixelLib is a library that promises to simplify object segmentation on images and videos with just a few lines of code, and it doesn’t fail to deliver. It is a good entry point into computer vision thanks to its simplicity and ease of use. If you are feeling adventurous, you can spend some time examining the source code. I would also recommend reading the paper, Simplifying Object Segmentation with PixelLib Library, available on paperswithcode. It is an effortless read.
In this article, I covered some of the use cases and examples. There are, however, more examples in the documentation, such as using your own dataset to train the models. Feel free to have a look if you are up for it.