# Audio-Classification **Repository Path**: yingavn1986/Audio-Classification ## Basic Information - **Project Name**: Audio-Classification - **Description**: Code for YouTube series: Deep Learning for Audio Classification - **Primary Language**: Unknown - **License**: MIT - **Default Branch**: master - **Homepage**: None - **GVP Project**: No ## Statistics - **Stars**: 0 - **Forks**: 2 - **Created**: 2021-05-19 - **Last Updated**: 2021-05-19 ## Categories & Tags **Categories**: Uncategorized **Tags**: None ## README # Audio-Classification (Kapre Version) Pipeline for prototyping audio classification algorithms with TF 2.3 ![melspectrogram](docs/mel_spectrograms.png) - [YouTube](#youtube) - [Environment](#environment) - [Jupyter Notebooks](#jupyter-notebooks) - [Audio Preprocessing](#audio-preprocessing) - [Training](#training) - [Plot History](#plot-history) - [Confusion Matrix](#confusion-matrix) - [Receiver Operating Characteristic](#receiver-operating-characteristic) - [Kapre](#kapre) ### YouTube This series has been re-worked. There are new videos to support this repository. It is recommended to follow the new series. https://www.youtube.com/playlist?list=PLhA3b2k8R3t0SYW_MhWkWS5fWg-BlYqWn If you want to follow the old videos, restore to a previous commit. `git checkout 404f2a6f989cec3421e8217d71ef070f3593a84d` ### Environment ``` conda create -n audio python=3.7 activate audio pip install -r requirements.txt ``` ### Jupyter Notebooks Assuming you have ipykernel installed from your conda environment `ipython kernel install --user --name=audio` `conda activate audio` `jupyter-notebook` ### Audio Preprocessing clean.py can be used to preview the signal envelope at a threshold to remove low magnitude data When you uncomment split_wavs, a clean directory will be created with downsampled mono audio split by delta time `python clean.py` ![signal envelope](docs/signal_envelope.png) ### Training Change model_type to: conv1d, conv2d, lstm Sample rate and delta time should be the same from clean.py `python train.py` ### Plot History Assuming you have ran all 3 models and saved the images into logs, check `notebooks/Plot History.ipynb` ![history](docs/model_history.png) `notebooks/Confusion Matrix and ROC.ipynb` ### Confusion Matrix ![conf_mat](docs/conf_mat.png) ### Receiver Operating Characteristic ![roc](docs/roc.png) ### Kapre For computation of audio transforms from time to frequency domain on the fly https://github.com/keunwoochoi/kapre https://arxiv.org/pdf/1706.05781.pdf