# Speech-Recognition_mod

**Repository Path**: weimingtom2000/Speech-Recognition_mod

## Basic Information

- **Project Name**: Speech-Recognition_mod
- **Description**: Imported from https://github.com/weimingtom/Speech-Recognition_mod
- **Primary Language**: Python
- **License**: Not specified
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2021-04-30
- **Last Updated**: 2021-04-30

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# Speech-Recognition_mod

My mod of Speech-Recognition

## Original sources

* https://github.com/iamlekh/Speech-Recognition

## Dependencies

* Xubuntu 20.04 64bit, or use Baidu AI Studio
* Python 3.8
  $ sudo apt-get install python3 python3-pip python-is-python3
* librosa, to calculate MFCCs
  $ pip3 install librosa
* tensorflow-cpu 2.3.1 (without AVX2), for deep learning model training and recognition, using tf.keras
  $ pip3 install tensorflow-cpu==2.3.1

## How to run on Baidu AI Studio

```
$ pip install tensorflow-cpu
$ python
import tensorflow as tf
tf.__version__
exit()
$ cd
$ wget http://download.tensorflow.org/data/speech_commands_v0.01.tar.gz
$ mkdir ./speech_commands
$ tar xzf speech_commands_v0.01.tar.gz -C ./speech_commands
$ mv Speech-Recognition_mod Speech-Recognition
$ cd Speech-Recognition
$ cd local
$ python data_preparation.py
(runs for about 13 minutes, generates data.json and label_data.json)
$ cat label_data.json
$ pip install tensorflow-cpu==2.3.1
$ pip list
tensorflow-cpu 2.3.1
$ python model_training.py
Total params: 36,063
Trainable params: 35,807
Non-trainable params: 256
Epoch 1/50
853/853 - 46s 54ms/step - loss: 2.4780 - accuracy: 0.3033 - val_loss: 1.5747 - val_accuracy: 0.5276
134/134 - 3s 22ms/step - loss: 0.4158 - accuracy: 0.9055
Test loss: 0.4157963693141937, test accuracy: 90.54656624794006
(runs for about 35 minutes, generates model.h5)
$ cd ../server
$ cp ../local/model.h5 .
$ cp ../local/label_data.json .
$ python keyword_spotting_service.py tests/down.wav
```

## Original README

1) local/classifier/data_preparation.py -> prepares the data
   Data source: https://ai.googleblog.com/2017/08/launching-speech-commands-dataset.html
2) local/classifier/model_training.py -> builds and trains the CNN model
3) server/flask/model.h5 -> the trained model
4) server/flask/keyword_spotting_service.py -> makes predictions
5) server/flask/server.py -> Flask app

To run:

* git clone
* server/init.sh

Reference for the project: https://www.youtube.com/playlist?list=PL-wATfeyAMNpCRQkKgtOZU_ykXc63oyzp
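
## Sketch: indexing the dataset

The data preparation step above walks the extracted `speech_commands` folder, where each keyword has its own sub-folder of `.wav` files. The following is a minimal sketch of that indexing step only, not the code in `data_preparation.py`. The function name `index_dataset` and the keys `mapping`, `files` and `labels` are assumptions made for illustration, and the actual `data.json` / `label_data.json` layout may differ. MFCC extraction with librosa is deliberately left out so that the sketch needs only the standard library.

```python
from pathlib import Path


def index_dataset(dataset_dir):
    """Assign an integer label to each keyword folder and list its .wav files.

    Hypothetical helper: the returned layout is an assumption, not the
    format written by data_preparation.py.
    """
    root = Path(dataset_dir)

    # Keep keyword folders only. Folders starting with "_" (for example
    # _background_noise_ in the Speech Commands archive) are skipped.
    keyword_dirs = sorted(
        d for d in root.iterdir()
        if d.is_dir() and not d.name.startswith("_")
    )

    # The position of a keyword in this list is its integer label.
    mapping = [d.name for d in keyword_dirs]

    files, labels = [], []
    for label, keyword_dir in enumerate(keyword_dirs):
        for wav_path in sorted(keyword_dir.glob("*.wav")):
            files.append(str(wav_path))
            labels.append(label)

    return {"mapping": mapping, "files": files, "labels": labels}
```

Calling `index_dataset("./speech_commands")` after the `tar xzf` step would return the keyword list together with one label per audio file. Each path in `files` could then be loaded and turned into MFCCs before training.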