Speech Recognition

HN: Facebook open-sources a speech-recognition system and a machine learning library (2018)

Project DeepSpeech - Open source Speech-To-Text engine, using a model trained by machine learning techniques, based on Baidu's Deep Speech research paper.

Online speech recognition with wav2letter@anywhere (2020)

wav2letter++ - Fast, open source speech processing toolkit from the Speech team at Facebook AI Research built to facilitate research in end-to-end models for speech recognition.

Kaldi - Speech Recognition Toolkit.

Building an end-to-end Speech Recognition model in PyTorch (HN)

Real-Time Voice Cloning - Clone a voice in 5 seconds to generate arbitrary speech in real-time.

Kaldi Active Grammar - Python Kaldi speech recognition with grammars that can be set active/inactive dynamically at decode-time.

SpecAugment with PyTorch - PyTorch implementation of Google Brain's SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition.

Dragonfly - Speech recognition framework for Python that makes it convenient to create custom commands to use with speech recognition software.

Gentle - Robust yet lenient forced-aligner built on Kaldi. A tool for aligning speech with text.

Porcupine - On-device wake word detection powered by deep learning.

Eesen - End-to-End Speech Recognition using Deep RNN Models and WFST-based Decoding.

Ask HN: Is there any work being done in speech-to-code with deep learning? (2020)

Silero Models - Pre-trained STT models and benchmarks made embarrassingly simple.

High-quality pre-trained speech-to-text models now available on Torch Hub (HN)

Wavenet For Speech Denoising - Neural network for end-to-end speech denoising, as described in: "A Wavenet For Speech Denoising".

Vosk - Speech recognition toolkit with state-of-the-art accuracy and low latency in Rust.

Voicegain - Speech-to-text platform and APIs.

Links