speech

Here are 702 public repositories matching this topic...

coqui-ai / TTS

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

python text-to-speech deep-learning speech pytorch tts speech-synthesis voice-conversion vocoder voice-synthesis tacotron voice-cloning speaker-encodings melgan speaker-encoder multi-speaker-tts glow-tts hifigan tts-model

Updated Aug 16, 2024
Python

babysor / MockingBird

🚀 AI voice cloning: clone a voice in 5 seconds to generate arbitrary speech in real time

text-to-speech ai deep-learning speech pytorch tts

Updated Nov 15, 2024
Python

svc-develop-team / so-vits-svc

SoftVC VITS Singing Voice Conversion

flow ai deep-learning voice speech pytorch audio-analysis generative-adversarial-network variational-inference voice-conversion vc voice-changer vits singing-voice-conversion voiceconversion sovits so-vits-svc

Updated Nov 11, 2023
Python

huggingface / datasets

🤗 The largest hub of ready-to-use datasets for AI models with fast, easy-to-use and efficient data manipulation tools

nlp machine-learning natural-language-processing ai computer-vision deep-learning tensorflow numpy speech pandas pytorch artificial-intelligence datasets llm dataset-hub

Updated Oct 8, 2025
Python

m-bain / whisperX

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

speech speech-recognition speech-to-text whisper asr

Updated Oct 8, 2025
Python

AIGC-Audio / AudioGPT

AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head

audio music speech sound gpt talking-head

Updated Jul 6, 2024
Python

modelscope / modelscope

ModelScope: bring the notion of Model-as-a-Service to life.

python nlp science machine-learning deep-learning cv speech multi-modal

Updated Oct 1, 2025
Python

netease-youdao / EmotiVoice

EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine

python text-to-speech ai deep-learning style prompt speech emotion pytorch tts speech-synthesis multi-speaker emotivoice

Updated Aug 13, 2024
Python

snakers4 / silero-vad

Silero VAD: pre-trained enterprise-grade Voice Activity Detector

voice-commands speech pytorch voice-recognition vad voice-control speech-processing voice-detection voice-activity-detection onnx onnxruntime onnx-runtime

Updated Aug 26, 2025
Python

PaddlePaddle / models

Officially maintained, supported by PaddlePaddle, including CV, NLP, Speech, Rec, TS, big models and so on.

nlp natural-language-processing computer-vision deep-learning neural-network models cv speech recommendation paddlepaddle

Updated Jan 15, 2025
Python

fixie-ai / ultravox

A fast multimodal LLM for real-time voice

ai speech slm llm

Updated Sep 2, 2025
Python

huggingface / speech-to-speech

Speech To Speech: an effort for an open-sourced and modular GPT-4o

python machine-learning ai speech speech-synthesis assistant speech-to-text language-model speech-translation

Updated Apr 15, 2025
Python

metavoiceio / metavoice-src

Foundational model for human-like, expressive TTS

text-to-speech ai deep-learning speech pytorch tts speech-synthesis voice-clone zero-shot-tts

Updated Jul 30, 2024
Python

jianchang512 / stt

Voice Recognition to Text Tool: a local tool that runs offline and converts audio and video to subtitles, with output in JSON, SRT subtitle, and plain-text formats

speech speech-recognition speech-to-text stt

Updated Aug 29, 2025
Python

modelscope / ClearerVoice-Studio

An AI-Powered Speech Processing Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Enhancement, Separation, and Target Speaker Extraction, etc.

audio deep-learning speech pytorch speech-separation speech-enhancement noise-suppression speaker-extraction bandwidth-extension speech-quality-evaluation speech-super-resolution

Updated Aug 14, 2025
Python

Rikorose / DeepFilterNet

Noise suppression using deep filtering

audio rust deep-learning speech pytorch speech-enhancement noise-suppression

Updated Oct 17, 2024
Python

ahmetoner / whisper-asr-webservice

OpenAI Whisper ASR Webservice API

docker speech speech-recognition automatic-speech-recognition speech-to-text asr openai-whisper

Updated Jul 1, 2025
Python

tensorflow / lingvo

Lingvo

nlp research translation tensorflow machine-translation speech distributed tts speech-synthesis mnist speech-recognition lm seq2seq speech-to-text gpu-computing language-model asr

Updated Sep 26, 2025
Python

readbeyond / aeneas

aeneas is a Python/C library and a set of tools to automagically synchronize audio and text (aka forced alignment)

Updated Jun 22, 2024
Python

pytorch / audio

Data manipulation and transformation for audio signal processing, powered by PyTorch

audio python machine-learning speech pytorch io audio-processing

Updated Oct 8, 2025
Python
