DeepSpeech

Project DeepSpeech

DeepSpeech is an open source Speech-To-Text engine, using a model trained by machine learning techniques based on Baidu's Deep Speech research paper. Project DeepSpeech uses Google's TensorFlow to make the implementation easier.

NOTE: This documentation applies to the master branch of DeepSpeech only. If you're using a stable release, use the documentation for the corresponding version, which you can select with GitHub's branch switcher.

To install and use deepspeech all you have to do is:

    # Create and activate a virtualenv
    virtualenv -p python3 $HOME/tmp/deepspeech-venv/
    source $HOME/tmp/deepspeech-venv/bin/activate

    # Install DeepSpeech
    pip3 install deepspeech

    # Download pre-trained English model and extract
    curl -LO https://github.com/mozilla/DeepSpeech/releases/download/v0.6.0/deepspeech-0.6.0-models.tar.gz
    tar xvf deepspeech-0.6.0-models.tar.gz

    # Download example audio files
    curl -LO https://github.com/mozilla/DeepSpeech/releases/download/v0.6.0/audio-0.6.0.tar.gz
    tar xvf audio-0.6.0.tar.gz

    # Transcribe an audio file
    deepspeech --model deepspeech-0.6.0-models/output_graph.pbmm --lm deepspeech-0.6.0-models/lm.binary --trie deepspeech-0.6.0-models/trie --audio audio/2830-3980-0043.wav
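If you want to transcribe several files, the final command above can be driven from a small Python script. The following is a minimal sketch, not part of DeepSpeech itself: the helper names `build_command` and `transcribe` are hypothetical, it uses only the four flags shown above, and it assumes the `deepspeech` executable is on your `PATH` and prints the transcript to standard output.

```python
import subprocess
from pathlib import Path


def build_command(model_dir, audio_path):
    """Return the argument list for one `deepspeech` invocation.

    Only the flags from the command shown above are used; the file names
    inside model_dir are those of the v0.6.0 model archive.
    """
    model_dir = Path(model_dir)
    return [
        "deepspeech",
        "--model", str(model_dir / "output_graph.pbmm"),
        "--lm", str(model_dir / "lm.binary"),
        "--trie", str(model_dir / "trie"),
        "--audio", str(audio_path),
    ]


def transcribe(model_dir, audio_path):
    """Run the CLI once and return its standard output.

    Assumption: the transcript is written to stdout. A non-zero exit
    status raises subprocess.CalledProcessError.
    """
    result = subprocess.run(
        build_command(model_dir, audio_path),
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()
```

For example, `transcribe("deepspeech-0.6.0-models", "audio/2830-3980-0043.wav")` runs the same command as above. Each call starts a new process and reloads the model, so this approach is convenient rather than fast.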

A pre-trained English model is available for use and can be downloaded using the instructions below. Currently, only 16-bit, 16 kHz, mono-channel WAVE audio files are supported in the Python client. A package with some example audio files is available for download in our release notes.
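Because the Python client accepts only 16-bit, 16 kHz, mono WAVE input, it can help to check a file before transcribing it. The sketch below uses only Python's standard `wave` module; the function name `check_wav` is hypothetical and is not part of the DeepSpeech package.

```python
import wave


def check_wav(path):
    """Return a list of format problems; an empty list means the file
    is 16-bit, 16 kHz, mono PCM WAVE."""
    problems = []
    # wave.open raises wave.Error for files that are not PCM WAVE.
    with wave.open(str(path), "rb") as wav:
        channels = wav.getnchannels()
        sample_width = wav.getsampwidth()  # bytes per sample
        frame_rate = wav.getframerate()    # samples per second

    if channels != 1:
        problems.append(f"expected 1 channel, found {channels}")
    if sample_width != 2:
        problems.append(f"expected 16-bit samples, found {8 * sample_width}-bit")
    if frame_rate != 16000:
        problems.append(f"expected 16000 Hz, found {frame_rate} Hz")
    return problems
```

A file that fails the check needs to be converted with an audio tool of your choice before it is passed to `deepspeech`.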

Quicker inference can be performed using a supported NVIDIA GPU on Linux. See the release notes to find which GPUs are supported. To run deepspeech on a GPU, install the GPU specific package:

    # Create and activate a virtualenv
    virtualenv -p python3 $HOME/tmp/deepspeech-gpu-venv/
    source $HOME/tmp/deepspeech-gpu-venv/bin/activate

    # Install DeepSpeech CUDA enabled package
    pip3 install deepspeech-gpu

    # Transcribe an audio file.
    deepspeech --model deepspeech-0.6.0-models/output_graph.pbmm --lm deepspeech-0.6.0-models/lm.binary --trie deepspeech-0.6.0-models/trie --audio audio/2830-3980-0043.wav

Please ensure you have the required CUDA dependencies.

See the output of deepspeech -h for more information on the use of deepspeech. If you experience problems running deepspeech, please check the required runtime dependencies.

