I have written articles on classifying different types of images, such as:

• Iris genus classification | DeepCognition | Azure ML studio (towardsdatascience.com)
• How good is ‘your’ selfie? (medium.com)
• How To Make AI That Classifies Dog Breeds, DeepLearningStudio (hackernoon.com)

Or on generating stories using RNNs:

• Generate stories using RNNs | pure Mathematics with code (hackernoon.com)

So I thought of writing an article that explains how to classify different sounds using AI. In this article, we’ll see how to prepare a dataset for sound classification and how to use it with our Deep Learning model.

DATA SET:

We are going to use the dataset from the Urban Sound Classification Challenge. The audio is in ‘.wav’ format and consists of 8700+ sound excerpts from 10 different sources.

Sources of Sound:

• air conditioner
• car horn
• children playing
• dog bark
• drilling
• engine idling
• gun shot
• jackhammer
• siren
• street music

The size of this dataset is around 5.6GB, which made me a bit reluctant to train a model on it. So I have written a Python script that can be used to decrease the size of the dataset. Basically, the dataset contains around 600 excerpts of sound from each source. I have reduced them to 170 per source, bringing the dataset down to around 1.1GB. You can download the script from my repo:

• Manik9/Urban-Sound: Urban Sound classification using Neural Nets (github.com)

Platform to train the program

For training a program that requires high computational power, such as a Deep Learning model, I prefer to use Deep Learning Studio’s (DLS) Jupyter notebooks. DLS provides Amazon Deep Learning Instances with GPUs, which can be used to train the model. Check it out at deepcognition.ai.
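The per-class reduction described in the DATA SET section (around 600 excerpts per source cut down to 170) could be sketched roughly as follows. This is not the script from the repo. It is a minimal illustration that assumes the labels live in a CSV file with hypothetical columns `ID` and `Class`, and it only rewrites the label file; copying the matching ‘.wav’ files is left out.

```python
import csv
import random
from collections import defaultdict


def subsample_per_class(rows, per_class=170, seed=0):
    """Keep at most `per_class` randomly chosen rows for every class label.

    `rows` is a list of dicts with the (assumed) keys "ID" and "Class".
    """
    by_class = defaultdict(list)
    for row in rows:
        by_class[row["Class"]].append(row)

    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    kept = []
    for label in sorted(by_class):
        group = by_class[label]
        # Classes with fewer than `per_class` rows are kept in full.
        kept.extend(rng.sample(group, min(per_class, len(group))))
    return kept


def reduce_csv(src_csv, dst_csv, per_class=170, seed=0):
    """Read a label file, subsample it, and write the smaller label file."""
    with open(src_csv, newline="") as f:
        reader = csv.DictReader(f)
        fieldnames = reader.fieldnames
        rows = list(reader)

    kept = subsample_per_class(rows, per_class, seed)

    with open(dst_csv, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(kept)
    return kept
```

Sampling at random within each class, rather than taking the first 170 files, keeps the reduced set balanced without favouring whichever recordings happen to come first in the label file.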
Upload Dataset

Upload the ‘Urban sound’ dataset to the datasets folder.
Start DLS’s Jupyter notebooks.

Audio processing

In the case of images, we generally pass the pixel values to our model. In the case of audio too, we need to pass some numerical values that represent the audio.

• librosa is a Python library that can be used for audio pre-processing.

The preprocessing code does the following:

line 1: train.csv contains the location of each audio clip and its label.
line 7: the filename of a particular clip.
line 8: ‘x’ is the audio clip and ‘s’ is the sampling rate, i.e. the rate at which the clip is read by librosa.
line 9: mfccs, the Mel-frequency cepstral coefficients (MFCCs) of the clip (definition source: Wikipedia).
line 17: the function ‘parser’ is applied to each row of the ‘train’ DataFrame and the results are stored in ‘temp’.

After all of the above steps, we have a DataFrame ‘temp’ that represents each of our clips (row by row) with numerical values.

Converting Outputs into One-Hot encoding

Each class label is converted into a one-hot vector with one entry per class.

Model Architecture

Each example contains 40 columns, i.e. (1x40), so 1150 examples form a 1150x40 matrix. The transpose of this is passed to the model.

Figure: Architecture of our model

We get a 1x10 output, which represents the score of each class.

The architecture shown above is replicated in the code, where line 26 refers to the number of different classes of sound.

Training

After 100 epochs:

Training Results

Our trained model obtained an accuracy of 83.04% on the validation set, which is quite good considering that we reduced the dataset to about one third of its original size. We can still improve the accuracy of this model by using CNNs. We’ll see that in another article. 😄

Thanks for Reading

If you liked this article, do 👏 and share it. Follow me, Manik Soni, on LinkedIn and Medium.
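For readers who want to experiment with the audio processing, one-hot encoding, and 1x10 output steps described in this article, here is a minimal sketch. It is not the original notebook code. The function names `extract_features`, `one_hot`, and `decode` are hypothetical, the class order simply follows the list of sound sources given earlier, averaging the MFCCs over time is an assumption made to arrive at the 1x40 row per clip, and applying a softmax to the output scores is likewise an assumption.

```python
import math

# The ten sound sources listed in the article; this ordering is an assumption.
CLASSES = [
    "air conditioner", "car horn", "children playing", "dog bark", "drilling",
    "engine idling", "gun shot", "jackhammer", "siren", "street music",
]


def extract_features(wav_path, n_mfcc=40):
    """Return a length-`n_mfcc` vector: each MFCC averaged over time."""
    # Imported here so the label helpers below work without librosa installed.
    import librosa
    import numpy as np

    # librosa.load resamples to 22050 Hz by default and returns (signal, rate).
    signal, sample_rate = librosa.load(wav_path)
    mfccs = librosa.feature.mfcc(y=signal, sr=sample_rate, n_mfcc=n_mfcc)
    # mfccs has shape (n_mfcc, n_frames). Averaging over frames (an assumption)
    # collapses a clip of any length into one fixed-size row.
    return np.mean(mfccs, axis=1)


def one_hot(label, classes=CLASSES):
    """Return a row of zeros with a single 1 at the label's position."""
    row = [0] * len(classes)
    row[classes.index(label)] = 1
    return row


def decode(scores, classes=CLASSES):
    """Turn a row of class scores into (label, probability) via softmax."""
    if len(scores) != len(classes):
        raise ValueError("expected one score per class")
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return classes[best], probs[best]
```

Calling `extract_features` on every clip and stacking the results gives the examples-by-40 matrix mentioned in the Model Architecture section, while `one_hot` produces the matching 1x10 target row for each clip. `decode` goes the other way, mapping a 1x10 score row back to the most likely sound source.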