Constructing different types of neural networks trained on Common Voice speech datasets to compare how the approaches perform on speech-to-text metrics.
- Clone the repository: git clone https://github.com/common-voice/common-voice-asr.git, then cd common-voice-asr and cd Common-voice-asr
- Create the environment: make create_environment, then conda activate $(PROJECT_NAME)
- Install dependencies: make requirements
- Train the model: make train
- Test the model: make test
Training:
- You must cd into Common-voice-asr to run these commands
- train.py requires a model type and a number of epochs:
- python -m neural_networks.modeling.train --model_type [type] --epochs # --logdir runs/week3_[type]
- --model_type must be "cnn" or "rnn"
- To check the data, add the --check-data flag before the other flags:
- python -m neural_networks.modeling.train --check-data --model_type cnn --epochs 1 --logdir runs/week3_cnn
- Don't be alarmed by the error "No such option: --model_type"; the run still works fine
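As a rough illustration of how these flags fit together, here is a minimal, hypothetical sketch of a train.py-style CLI using argparse and TensorBoard's SummaryWriter; the actual neural_networks/modeling/train.py uses its own CLI handling (hence the --check-data ordering quirk above), and all names here are illustrative only.

```python
# Hypothetical sketch of a train.py-style CLI; not the repo's actual implementation.
import argparse
from torch.utils.tensorboard import SummaryWriter


def main():
    parser = argparse.ArgumentParser(description="Train a CNN or RNN ASR model")
    parser.add_argument("--model_type", choices=["cnn", "rnn"], required=True)
    parser.add_argument("--epochs", type=int, required=True)
    parser.add_argument("--logdir", default="runs/week3_cnn")
    parser.add_argument("--check-data", dest="check_data", action="store_true",
                        help="Validate the dataset and exit without training")
    args = parser.parse_args()

    if args.check_data:
        print("Running data checks only...")
        return

    writer = SummaryWriter(log_dir=args.logdir)  # where TensorBoard scalars are written
    for epoch in range(args.epochs):
        train_loss = 0.0  # placeholder: the real training loop would go here
        writer.add_scalar("loss/train", train_loss, epoch)
    writer.close()


if __name__ == "__main__":
    main()
```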
Jupyter Notebook:
- Under the notebooks folder, 04_first_cnn_rnn.ipynb (Run All) offers demos of displaying a spectrogram, running a spectrogram through the models, launching training, and plotting the logged loss curves.
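For reference, displaying a mel spectrogram outside the notebook can be done along these lines; this is a minimal sketch assuming torchaudio and matplotlib, with a placeholder clip path, and the notebook's own helper code may differ.

```python
# Minimal sketch: load an audio clip and display its mel spectrogram.
# "clip.wav" is a placeholder path; swap in a real Common Voice clip.
import torchaudio
import matplotlib.pyplot as plt

waveform, sample_rate = torchaudio.load("clip.wav")  # (channels, samples)
mel = torchaudio.transforms.MelSpectrogram(sample_rate=sample_rate, n_mels=80)(waveform)
mel_db = torchaudio.transforms.AmplitudeToDB()(mel)  # convert power to decibels

plt.imshow(mel_db[0].numpy(), origin="lower", aspect="auto")
plt.xlabel("Frames")
plt.ylabel("Mel bins")
plt.title("Mel spectrogram")
plt.colorbar()
plt.show()
```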
Within common-voice-asr, run:
- To fetch the full mini dataset: python -m Common_voice_asr.fetch_mini --full_mini
- To preprocess the full mini dataset, transforming the audio into mel spectrograms: python -m Common_voice_asr.preprocess --full_mini
- To train the RNN model on the full mini dataset for 5 epochs with CTCLoss (see the CTC sketch after this list): python -m neural_networks.modeling.train --full_mini --model_type rnn --epochs 5 --logdir runs/week4_ctc
- To train the CNN model on the full mini dataset for 5 epochs with CTCLoss and a learning rate of 1e-3: python -m neural_networks.modeling.train --full_mini --model_type cnn --epochs 5 --lr 1e-3 --logdir runs/week4_ctc
- To visualize the loss and WER metrics logged for each training session: tensorboard --logdir Common-voice-asr/neural_networks/runs/week4_ctc
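For context on what the CTC training objective and the WER metric involve, here is a minimal, self-contained sketch with dummy tensors; the shapes, the blank index, and the jiwer dependency are assumptions for illustration, not the repo's actual code.

```python
# Sketch of PyTorch's CTCLoss on dummy data, plus WER on decoded transcripts.
import torch
import torch.nn as nn
from jiwer import wer  # assumed WER utility; the repo may compute WER differently

T, N, C = 50, 4, 30  # time steps, batch size, number of classes (blank assumed at index 0)
log_probs = torch.randn(T, N, C).log_softmax(dim=-1)       # model output, shape (T, N, C)
targets = torch.randint(1, C, (N, 10), dtype=torch.long)   # padded target label indices
input_lengths = torch.full((N,), T, dtype=torch.long)
target_lengths = torch.full((N,), 10, dtype=torch.long)

ctc = nn.CTCLoss(blank=0)
loss = ctc(log_probs, targets, input_lengths, target_lengths)
print("CTC loss:", loss.item())

# WER compares a reference transcript against the decoded hypothesis.
print("WER:", wer("the quick brown fox", "the quick brown box"))
```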
Hyperparameter Sweeps (W&B):
- Define the grid in configs/week5_sweep.yaml
- Static parameters can be added as a list with only a single value
- The sweep method can be bayes, grid, or random. We began with random for quick tests, then moved to bayes for fine-tuning at a larger scale
- Run neural_networks/sweep.py in the terminal:
- python neural_networks/sweep.py
- You do not need to create the sweep or the wandb agent manually; both are handled in the code (see the sketch at the end of this section)
- You should, however, create and log into a W&B account
- Analyze the results using W&B's graphing and sorting tools on their website
- To retrain the best config, see models/best_cnn.pth or models/best_rnn.pth
- You can also use log_best.py, replacing the hyperparameter inputs to train with those from your own best runs
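As a rough sketch of what sweep.py is described as handling internally (creating the sweep and running the agent), here is an illustrative example using the standard W&B Python API; the project name, config keys, and training function are placeholders, not the repo's actual code.

```python
# Illustrative sketch of creating and running a W&B sweep entirely in code.
import yaml
import wandb


def train_run():
    # Called once per trial by wandb.agent; run.config holds the sampled hyperparameters.
    run = wandb.init()
    lr = run.config.get("lr", 1e-3)       # key names are placeholders
    epochs = run.config.get("epochs", 5)
    for epoch in range(epochs):
        fake_loss = 1.0 / (epoch + 1)      # placeholder for the real training loop
        run.log({"loss": fake_loss, "epoch": epoch})
    run.finish()


if __name__ == "__main__":
    with open("configs/week5_sweep.yaml") as f:
        sweep_config = yaml.safe_load(f)  # method (bayes/grid/random) + parameter grid
    sweep_id = wandb.sweep(sweep_config, project="common-voice-asr")  # project name assumed
    wandb.agent(sweep_id, function=train_run, count=10)
```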