Audio_Classification

An end-to-end audio recognition project using deep learning and Flask.

Dataset

The dataset is UrbanSound8K: https://urbansounddataset.weebly.com/download-urbansound8k.html

It was created by:

Justin Salamon*^, Christopher Jacoby* and Juan Pablo Bello*

* Music and Audio Research Laboratory (MARL), New York University

^ Center for Urban Science and Progress (CUSP), New York University

Model

A fully connected neural network (FCN) is used to solve the classification problem. The network has 78,017 parameters in total and consists of 7 Dense layers. The final layer uses a softmax activation function, which turns the network's outputs into class probabilities.
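To make the parameter count and the softmax output more concrete, here is a minimal sketch in plain Python. The input size (40 features) and the layer widths are hypothetical placeholders, so the result will not match the 78,017 parameters of the trained network; the real layer sizes are in Training_Notebook.ipynb. The 10 output units correspond to the 10 classes in UrbanSound8K.

```python
import math


def dense_param_count(input_dim, layer_units):
    """Count trainable parameters of a stack of fully connected layers.

    Each Dense layer has (inputs * units) weights plus one bias per unit.
    """
    total = 0
    fan_in = input_dim
    for units in layer_units:
        total += fan_in * units + units
        fan_in = units
    return total


def softmax(logits):
    """Turn raw output scores into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    norm = sum(exps)
    return [e / norm for e in exps]


# Hypothetical sizes (NOT the project's real ones): 40 input features,
# seven Dense layers, 10 output classes.
HYPOTHETICAL_UNITS = [128, 128, 64, 64, 32, 16, 10]
print(dense_param_count(40, HYPOTHETICAL_UNITS))

# The predicted class is the index with the highest probability.
probs = softmax([2.0, 1.0, 0.1])
print(max(range(len(probs)), key=probs.__getitem__))
```

Plugging the notebook's actual input dimension and layer widths into dense_param_count should reproduce the total reported by the model summary.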

Model Training

1. Open a command line at the root of the repository.
2. Install the dependencies by running the command: pip install -r requirements.txt
3. Open the notebook Training_Notebook.ipynb to follow all the preprocessing and training steps of the model.

NOTE: To change paths, variables, or other settings, edit the config.yaml file.

Model Deployment

A Dockerfile is provided for deployment. From this Dockerfile, a Docker image can be built and deployed, for example to a cloud service.
1. To create a Docker image, first download Docker for your OS from the official Docker website.
2. Then, open a command line at the root of the repository, and run the command: docker build -t audio_classification_image:v1 .
3. Once the image is created, you can sign in to Docker Hub and push the image there, from where it can be pulled and used.
4. To run the Docker image, open a command line at the root of the repository, and run the command: docker run -p 5000:5000 audio_classification_image:v1
5. Open the link in your preferred browser: http://127.0.0.1:5000/, or check the logs printed by Docker in the command line to find the link.
A separate templates folder and app.py are also provided. They serve as the frontend and backend of a web application for uploading an audio file and getting back a prediction.

To run the application without Docker, open a command line at the root of the repository, and run the command: flask run
In the future, all models could be stored in the cloud, so that a request can be sent and a response returned for on-demand prediction.
Screenshots of the deployed application are shown below.
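Once the application is running, a file can also be sent to it from a script instead of the browser. The following is only a sketch using the Python standard library: the route ("/"), the form field name ("file"), and the file name are assumptions, not taken from app.py, so check app.py and the templates for the real values before using it.

```python
import os
import uuid
import urllib.request


def build_multipart(field_name, filename, payload, content_type="audio/wav"):
    """Encode one file as a multipart/form-data body.

    Returns the body bytes and the matching Content-Type header value.
    """
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; '
        f'filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + payload + tail, f"multipart/form-data; boundary={boundary}"


def upload_audio(url, path, field_name="file"):
    """POST an audio file to the running app and return the response text."""
    with open(path, "rb") as fh:
        body, content_type = build_multipart(
            field_name, os.path.basename(path), fh.read()
        )
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


# Example call (hypothetical route, field name, and file name):
# print(upload_audio("http://127.0.0.1:5000/", "dog_bark.wav"))
```

The same call works against the Docker container, since both expose port 5000 on localhost.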

Additional Information

Use of Librosa

We can use scipy or librosa to read the audio files, but librosa has the added advantage that it resamples every input file to a common sample rate (22,050 Hz by default). Librosa also converts stereo (2 channels) into mono (single channel), and it returns the signal as floating-point samples in the range [-1, 1].
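In librosa, all of this happens inside librosa.load(path), which returns the samples together with the sample rate. As an illustration of what the mono conversion and scaling amount to, here is a small plain-Python sketch. It is not librosa's implementation, it skips resampling entirely, and the function name and sample values are made up for the example.

```python
def to_mono_float(frames, bit_depth=16):
    """Average channels and scale integer PCM samples into [-1, 1].

    `frames` is a list of per-frame tuples, one integer sample per channel.
    """
    full_scale = float(2 ** (bit_depth - 1))  # 32768 for 16-bit audio
    return [sum(frame) / len(frame) / full_scale for frame in frames]


# Three made-up 16-bit stereo frames: (left, right)
stereo = [(32767, 32767), (-32768, -32768), (16384, 0)]
mono = to_mono_float(stereo)
print(mono)
```

Reading the same file with scipy would instead return the raw integer samples at the file's original sample rate, with one column per channel, so these steps would have to be done by hand.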

Python Version

The project was developed with Python 3.7.7 and pip 19.2.3.

Contact

If you run into errors, feel free to contact us on LinkedIn at Adnan.
