Animal Sound Classification (Cats vs Dogs Audio Sentiment Classification)
This is a simple audio classification API built to classify the sound in an audio file as either a cat or a dog.
Response
Given a .wav audio file, the model classifies the sound in it as belonging to either a cat or a dog, and the server returns a response of the following form:
{
  "predictions": {
    "class": "dog",
    "label": 1,
    "probability": 1.0
  },
  "success": true
}
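A client might unpack this response as sketched below. This is only an illustration: the `Prediction` dataclass and the `parse_response` helper are hypothetical names, not part of this API, and the check on `success` assumes that failed requests set it to false, which is not confirmed here. Only the JSON field names come from the response above.

```python
import json
from dataclasses import dataclass


@dataclass
class Prediction:
    """Hypothetical container for the fields of the "predictions" object."""
    class_name: str     # value of "class", e.g. "cat" or "dog"
    label: int          # value of "label", e.g. 1 in the example above
    probability: float  # value of "probability"


def parse_response(raw: str) -> Prediction:
    """Parse the JSON body returned by the server into a Prediction."""
    data = json.loads(raw)
    # Assumption: an unsuccessful request is reported with "success": false.
    if not data.get("success"):
        raise ValueError("server reported an unsuccessful classification")
    pred = data["predictions"]
    return Prediction(pred["class"], int(pred["label"]), float(pred["probability"]))


example = '{"predictions": {"class": "dog", "label": 1, "probability": 1.0}, "success": true}'
print(parse_response(example))
```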
Starting the server
To start the server and begin classifying audio, first make sure you are in the server folder, then run the following commands:
- creating a virtual environment
virtualenv venv && .\venv\Scripts\activate.bat
- installing packages
pip install -r requirements.txt
- Starting the server
python api/app.py
The server will start on the default port 3001, and you will be able to make API requests to it for audio classification.
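Before sending requests, it can help to check that the server is accepting connections. The following is a minimal sketch, assuming the server listens on 127.0.0.1 at the default port 3001 mentioned above; `server_is_up` is a hypothetical helper name and is not part of this repository.

```python
import socket


def server_is_up(host: str = "127.0.0.1", port: int = 3001, timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False


print("server reachable:", server_is_up())
```

This only confirms that a process is listening on the port; it does not verify that the process is the classification server.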
Model Metrics
The following table summarizes the metrics obtained after training the model for 15 epochs.
| model name | model description | test accuracy | validation accuracy | train accuracy | test loss | validation loss | train loss |
|---|---|---|---|---|---|---|---|
| cats-dogs-sound-cnn.pt | audio sentiment classification for dogs and cats CNN. | 90.7% | 90.7% | 93.5% | 0.621 | 0.218 | 0.209 |
Classification report
The following is the classification report for the model on the test dataset.
| # | precision | recall | f1-score | support |
|---|---|---|---|---|
| accuracy | - | - | 90% | 2305 |
| macro avg | 91% | 90% | 90% | 2305 |
| weighted avg | 92% | 89% | 90% | 2305 |
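For readers unfamiliar with how the rows of such a report are derived, the sketch below computes precision, recall, F1, and the macro and weighted averages from a confusion matrix. The counts are hypothetical placeholders chosen only for illustration; they are not this model's results, and the `report` function is not taken from the notebooks.

```python
def report(cm):
    """Compute per-class and averaged metrics.

    cm[i][j] is the number of samples whose true class is i and predicted class is j.
    """
    n_classes = len(cm)
    total = sum(sum(row) for row in cm)
    rows = []
    for k in range(n_classes):
        tp = cm[k][k]
        predicted_k = sum(cm[i][k] for i in range(n_classes))  # column sum
        support = sum(cm[k])                                   # row sum
        precision = tp / predicted_k if predicted_k else 0.0
        recall = tp / support if support else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        rows.append((precision, recall, f1, support))
    accuracy = sum(cm[k][k] for k in range(n_classes)) / total
    # Macro average: unweighted mean over classes.
    macro = tuple(sum(r[m] for r in rows) / n_classes for m in range(3))
    # Weighted average: mean over classes weighted by support.
    weighted = tuple(sum(r[m] * r[3] for r in rows) / total for m in range(3))
    return {"per_class": rows, "accuracy": accuracy, "macro avg": macro, "weighted avg": weighted}


# Hypothetical counts (NOT the model's results): rows = true cat/dog, columns = predicted cat/dog.
cm = [[80, 20],
      [10, 190]]
result = report(cm)
for name in ("accuracy", "macro avg", "weighted avg"):
    print(name, result[name])
```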
Confusion matrix
The following figure shows a confusion matrix for the classification model.
Audio Sentiment classification
If you send a POST request to http://localhost:3001/classify and provide the audio file the server expects, you will get the expected response described below.
Expected Response
A request to http://localhost:3001/classify with an audio file in the right format yields the following JSON response to the client.
{
  "predictions": {
    "class": "dog",
    "label": 1,
    "probability": 1.0
  },
  "success": true
}
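Because the server expects an audio file in the right format, a client may want to check a file before uploading it. The sketch below uses Python's standard `wave` module; `is_valid_wav` is a hypothetical helper, and the check is only an approximation of whatever the server actually accepts (the `wave` module mainly handles uncompressed PCM, so some valid .wav files may be rejected here).

```python
import io
import wave


def is_valid_wav(source) -> bool:
    """Return True if `source` (a path or binary file object) parses as a non-empty WAV."""
    try:
        with wave.open(source, "rb") as wav:
            return wav.getnframes() > 0
    except (wave.Error, EOFError, OSError):
        return False


# Build a tiny in-memory WAV (0.1 s of silence, 16-bit mono, 8 kHz) to exercise the check.
buffer = io.BytesIO()
with wave.open(buffer, "wb") as wav:
    wav.setnchannels(1)
    wav.setsampwidth(2)
    wav.setframerate(8000)
    wav.writeframes(b"\x00\x00" * 800)
buffer.seek(0)

print(is_valid_wav(buffer))                    # well-formed WAV data
print(is_valid_wav(io.BytesIO(b"not audio")))  # arbitrary bytes
```

For a file on disk, the same helper can be called with a path, for example `is_valid_wav("cat.wav")`.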
Using curl
Make sure that you have an audio file named cat.wav in the folder from which you are running the command; otherwise, provide an absolute or relative path to the audio file.
To make a curl POST request at http://localhost:3001/classify with the file cat.wav, we run the following command.
# for cat
curl -X POST -F audio=@cat.wav http://127.0.0.1:3001/classify
# for dog
curl -X POST -F audio=@dog.wav http://127.0.0.1:3001/classify
Using Postman client
To make this request with Postman, proceed as follows:
- Change the request method to POST at http://127.0.0.1:3001/classify
- Click on form-data
- Select type to be file on the KEY attribute
- For the KEY, type audio and select the audio you want to predict under value
- Click send
If everything went well, you will get the following response, depending on the audio you have selected:
{
"predictions": { "class": "dog", "label": 1, "probability": 1.0 },
"success": true
}
Using the JavaScript fetch API
- First, get the file input from the HTML
- Create a FormData object
- Make a POST request
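// Read the selected file from the file input element with id "input".
const input = document.getElementById("input").files[0];

// Attach the file under the "audio" key expected by the server.
const formData = new FormData();
formData.append("audio", input);

// Send the form as a POST request and log the parsed JSON response.
fetch("http://127.0.0.1:3001/classify", {
  method: "POST",
  body: formData,
})
  .then((res) => res.json())
  .then((data) => console.log(data))
  .catch((err) => console.error(err));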
If everything went well, you will get the expected response:
{
"predictions": { "class": "dog", "label": 1, "probability": 1.0 },
"success": true
}
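The same request can also be made from Python. The sketch below uses only the standard library and builds the multipart/form-data body by hand; `build_multipart` and `classify` are hypothetical helper names, and the `audio/wav` content type is an assumption. The form field name `audio` and the URL come from the examples above. A third-party HTTP client would make this shorter, but the standard library keeps the example self-contained.

```python
import json
import os
import urllib.request
import uuid


def build_multipart(field: str, filename: str, payload: bytes, content_type: str = "audio/wav"):
    """Encode one file as multipart/form-data; returns (body, Content-Type header value)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + payload + tail, f"multipart/form-data; boundary={boundary}"


def classify(path: str, url: str = "http://127.0.0.1:3001/classify") -> dict:
    """POST the file at `path` under the form field "audio" and return the decoded JSON."""
    with open(path, "rb") as f:
        body, content_type = build_multipart("audio", os.path.basename(path), f.read())
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Example (requires the server to be running and cat.wav in the current folder):
# print(classify("cat.wav"))
```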
Notebooks
- All notebooks for training and saving the models are found in the notebooks folder of this repository.

