sist2-scripts

simon987/sist2-scripts


mirror of https://github.com/simon987/sist2-scripts.git synced 2025-04-24 12:55:53 +00:00


Latest commit: simon987, f87e947e02, "updates", 2022-04-19 12:07:18 -04:00

Files (each last changed by "updates" on 2022-04-19 12:07:18 -04:00):

data
.gitignore
export_meta.py
README.md
requirements.txt
transcribe_aws.py
transcribe.py

README.md

Create the conda environment and install the dependencies with:

conda create -y -n sist2-scripts -c conda-forge python=3.7 cudnn=8.1 cudatoolkit=11.2
conda clean --force-pkgs-dirs -y && conda clean --all -y
conda activate sist2-scripts
pip install -r requirements.txt

transcribe.py

Transcribe audio files using a transformers speech-to-text (STT) model.

Example usage (process files one at a time; do not use multithreading):

find /path/to/audio/files/ -name "*.mp3" -exec python transcribe.py {} \;
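The same one-file-at-a-time loop can be driven from Python instead of find. This is only a sketch: the helper names find_audio and transcribe_all are hypothetical, and it assumes transcribe.py takes a single file path as its only argument, as in the command above.

```python
import subprocess
import sys
from pathlib import Path


def find_audio(root, pattern="*.mp3"):
    """Return matching files under root, sorted for a stable order."""
    return sorted(p for p in Path(root).rglob(pattern) if p.is_file())


def transcribe_all(root, script="transcribe.py", run=subprocess.run):
    """Invoke the script once per file, strictly one after another.

    `script` and its single-argument interface are assumptions taken from
    the usage line in this README, not from the script's source.
    """
    for audio in find_audio(root):
        # check=True stops at the first failing file instead of continuing silently.
        run([sys.executable, script, str(audio)], check=True)


if __name__ == "__main__" and len(sys.argv) > 1:
    transcribe_all(sys.argv[1])
```

Each call blocks until the previous file is finished, so no two transcriptions ever run at the same time.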

transcribe_aws.py

Transcribe audio files using AWS Transcribe

Example usage:

find /path/to/audio/files/ -name "*.mp3" | parallel -j8 python transcribe_aws.py --bucket my-s3-bucket-name {}

export_meta.py

Save all .s2meta files to a zip archive for easy sharing

Example usage:

python export_meta.py [--json] /path/to/dataset/
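For readers curious what such an export involves, here is a minimal sketch of the idea using only the standard library. It is not the code in export_meta.py: the function name zip_s2meta and the choice to store paths relative to the dataset root are assumptions, and the --json option is not modelled.

```python
import zipfile
from pathlib import Path


def zip_s2meta(dataset_dir, archive_path):
    """Collect every *.s2meta file under dataset_dir into one zip archive.

    Hypothetical helper for illustration; returns the archived names.
    """
    dataset = Path(dataset_dir)
    # Sort so the archive order is deterministic.
    meta_files = sorted(p for p in dataset.rglob("*.s2meta") if p.is_file())
    names = [p.relative_to(dataset).as_posix() for p in meta_files]
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for path, name in zip(meta_files, names):
            # Store paths relative to the dataset root so the archive is portable.
            zf.write(path, arcname=name)
    return names
```

Keeping the relative paths means the archive can be unpacked on top of another copy of the same dataset and each file lands in its matching directory.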