Project DeepSpeech - Open source Speech-To-Text engine, using a model trained by machine learning techniques, based on Baidu's Deep Speech research paper.
wav2letter++ - Fast, open source speech processing toolkit from the Speech team at Facebook AI Research built to facilitate research in end-to-end models for speech recognition.
Kaldi - Speech Recognition Toolkit.
Real-Time Voice Cloning - Clone a voice in 5 seconds to generate arbitrary speech in real-time.
Kaldi Active Grammar - Python Kaldi speech recognition with grammars that can be set active/inactive dynamically at decode-time.
SpecAugment with PyTorch - PyTorch implementation of Google Brain's SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition.
Dragonfly - Speech recognition framework for Python that makes it convenient to create custom commands to use with speech recognition software.
Gentle - Robust yet lenient forced-aligner built on Kaldi. A tool for aligning speech with text.
Porcupine - On-device wake word detection powered by deep learning.
Eesen - End-to-End Speech Recognition using Deep RNN Models and WFST-based Decoding.
Silero Models - Pre-trained STT models and benchmarks made embarrassingly simple.
Wavenet For Speech Denoising - Neural network for end-to-end speech denoising, as described in the paper "A Wavenet For Speech Denoising".
Vosk - Offline speech recognition toolkit built on Kaldi, with low latency and bindings for many languages, including Rust.
Voicegain - Speech-to-text platform and APIs.