They achieve good error rates. Free Speech is in good hands; go there if you are an end user.
For now this project is only maintained for educational purposes.
Ultimate goal
Create a decent standalone speech recognition system for Linux etc.
Some people say we have the models but not enough training data.
We disagree: there is plenty of training data (100GB here and 21GB here on openslr.org,
synthetic text-to-speech snippets, movies with transcripts,
Gutenberg, YouTube with captions, etc.). We just need a simple yet
powerful model. It's only a question of time...
Sample spectrogram: Karen uttering 'zero' at 160 words per minute.