In machine learning and deep learning, having more data is very important to help you get good performance from your models. You can create more data by using a technique called data augmentation. Data augmentation is a technique used by practitioners to increase the amount of data by creating modified copies of the existing data.

“We don’t have better algorithms. We just have more data.” - Peter Norvig

It is good practice to use data augmentation techniques if you have a small dataset for your project or you want to reduce overfitting in your ML or deep learning (DL) models.

In this article, you will learn how to perform data augmentation by using a new open-source library from Facebook called AugLy.

What is AugLy?

AugLy is a data augmentation library that can help you evaluate and improve the robustness of your models. The library supports four modalities (audio, video, image, and text) and it contains over 100 ways to perform data augmentations.

If you are working on a machine learning or deep learning project that uses audio, video, image, or text datasets, you can use this library to increase your data and improve your model performance.

The library was developed by Joanna Bitton, a software engineer at Facebook AI, Zoe Papakipos, a research engineer at FAIR, and other researchers and engineers at Facebook.

The library has been used in different projects such as:

- Image Similarity Challenge: a NeurIPS 2021 competition run by Facebook AI with $200k in prizes. It has produced the DISC21 dataset, which will be made publicly available after the challenge concludes!
- DeepFake Detection Challenge: a Kaggle competition run by Facebook AI in 2020 with $1 million in prizes; it also produced the DFDC dataset.
- SimSearchNet: a near-duplicate detection model developed at Facebook AI to identify infringing content on the platforms.

How to Install AugLy

AugLy is a Python 3.6+ library.
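Since AugLy needs Python 3.6 or newer, it can help to confirm the interpreter version before installing anything. The snippet below is a minimal sketch that uses only the standard library; the helper name `meets_minimum_version` and the constant `MIN_PYTHON` are made up for illustration and are not part of AugLy.

```python
import sys

# Minimum Python version stated in this article for AugLy.
MIN_PYTHON = (3, 6)


def meets_minimum_version(version_info=None, minimum=MIN_PYTHON):
    """Return True if the given (or current) interpreter is new enough.

    Hypothetical helper for illustration only; not part of AugLy.
    """
    if version_info is None:
        version_info = sys.version_info
    # Compare only (major, minor) so the tuple comparison stays simple.
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    if meets_minimum_version():
        print("Python version is new enough for AugLy.")
    else:
        print("Please upgrade to Python 3.6 or newer before installing AugLy.")
```

If the check fails, upgrading Python (or creating a fresh virtual environment with a newer interpreter) is usually the simplest fix.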
You can install AugLy with pip:

    pip install augly

Note: The above command installs only the base requirements to use the image and text modalities. For the audio and video modalities, you can install the extra dependencies required with:

    pip install augly[av]

In some environments, pip doesn't install python-magic as expected. In that case, you will need to additionally run:

    conda install -c conda-forge python-magic

Data Augmentation Techniques for Text Data

The first step is to import the text modality, which contains augmentation techniques for text data.

    import augly.text as textaugs

Then create a simple text input.

    # Define input text
    input_text = "Hello, world! Today we learn Data Augmentation techniques"

Now we can apply various augmentations as follows:

(a) Simulate Typos

This simulates typos in each text using misspellings, keyboard distance, and swapping techniques.

    print(textaugs.simulate_typos(input_text))

    Hello, world! Today ew leanr Dtaa Augmentation techniques

As you can see, this technique misspells some of the words in the text and swaps characters in others.

(b) Insert Punctuation Chars

You can insert punctuation characters in each input text.

    print(textaugs.insert_punctuation_chars(input_text))

    ['H,e,l,l,o,,, ,w,o,r,l,d,!, ,T,o,d,a,y, ,w,e, ,l,e,a,r,n, ,D,a,t,a, ,A,u,g,m,e,n,t,a,t,i,o,n, ,t,e,c,h,n,i,q,u,e,s']

(c) Replace Bidirectional

This technique reverses each word (or part of the word) in each input text and uses bidirectional marks to render the text in its original order. It reverses each word separately, which keeps the word order even when a line wraps.

    print(textaugs.replace_bidirectional(input_text))

    ['\u202eseuqinhcet noitatnemguA ataD nrael ew yadoT !dlrow ,olleH\u202c']

(d) Replace Similar Characters

This replaces letters in each text with similar-looking characters.

    print(textaugs.replace_similar_chars(input_text))

    Hello, wor7d! T()day we learn Data Augm3^tati[]n techniques

As you can see, the character “l” has been replaced with the number 7, an “o” has been replaced with “()”, the character “e” has been replaced with the number 3, and another “o” has been replaced with “[]”.

(e) Replace Upside Down

This flips words in the text upside down depending on the granularity.

    print(textaugs.replace_upside_down(input_text))

    sǝnbᴉuɥɔǝʇ uoᴉʇɐʇuǝɯɓnⱯ ɐʇɐᗡ uɹɐǝl ǝʍ ʎɐpoꞱ ¡plɹoʍ 'ollǝH

(f) Split Words

This function splits words in the text into subwords.

    print(textaugs.split_words(input_text))

    He llo, world! To day we learn Data Augmentation techniques

Data Augmentation Techniques for Image Data

The first step is to import the image modality with its dependencies, which contains augmentation techniques for image data.

    import os
    import augly.image as imaugs
    import augly.utils as utils
    from IPython.display import display

Now we can apply various augmentations as follows:

(a) Image Scaling

The scale function can help you alter the resolution of an image. You can use an argument called factor to define the ratio by which the image should be downscaled or upscaled.

    input_img_path = "images/simple-image.jpg"

    # We can use the AugLy scale augmentation
    input_img = imaugs.scale(input_img_path, factor=0.2)
    display(input_img)

(b) Blur the Image

In this function, the larger the radius, the blurrier the image.

    input_img = imaugs.blur(input_img, radius=5.0)
    display(input_img)

(c) Change the Brightness of the Image

To change the brightness, you need to adjust the factor argument in this function. Values less than 1.0 darken the image and values greater than 1.0 brighten the image. Setting the factor to 1.0 will not alter the image's brightness.

Let's set the factor's value to 1.5.

    input_img = imaugs.brightness(input_img, factor=1.5)
    display(input_img)

Then let’s set the factor's value to 0.5 to make it darker.

    # make it darker
    input_img = imaugs.brightness(input_img, factor=0.5)
    display(input_img)

(d) Change the Aspect Ratio of the Image

In this function, the ratio is the width/height of the new image you want to create.

    input_img = imaugs.change_aspect_ratio(input_img, ratio=0.8)
    display(input_img)

(e) Alter the Contrast of the Image

In this function, the factor argument handles everything. A factor of zero gives a grayscale image, values below 1.0 decrease the contrast, a factor of 1.0 gives the original image, and a factor greater than 1.0 increases the contrast.

    input_img = imaugs.contrast(input_img, factor=1.7)
    display(input_img)

(f) Crop the Image

To crop the image, you need to define the positions of the left, right, top, and bottom edges of the cropped image, each given as a fraction of the original image's width or height.

    input_img = imaugs.crop(input_img, x1=0.25, x2=0.75, y1=0.25, y2=0.75)
    display(input_img)

Final Thoughts on Data Augmentation with the AugLy Library

In this article, you have learned the importance of data augmentation in your ML or DL project. You have also learned how to perform data augmentation with the AugLy library for image and text data.

As I explained earlier, the library has over 100 augmentation techniques, and most of them were not covered in this article.

If you want to learn how to perform data augmentation for audio and video data, please read more in the README for each modality!

Audio - https://github.com/facebookresearch/AugLy/tree/main/augly/audio
Video - https://github.com/facebookresearch/AugLy/tree/main/augly/video

If you learned something new or enjoyed reading this article, please share it so that others can see it. Until then, see you in the next post!

You can also find me on Twitter @Davis_McDavid.

And you can read more articles like this here.