Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable…

Jupyter Notebook 21,669 2,274 Updated Mar 13, 2025

infinite-id / infinite-id.github.io

Webpage for Infinite-ID

HTML 2 Updated Mar 18, 2024

mir-aidj / all-in-one

All-In-One Music Structure Analyzer

Python 514 73 Updated May 9, 2024

fishaudio / Bert-VITS2

vits2 backbone with multilingual-bert

Python 8,322 1,173 Updated Mar 17, 2025

Plachtaa / VALL-E-X

An open-source implementation of Microsoft's VALL-E X zero-shot TTS model. A demo is available at https://plachtaa.github.io/vallex/

Python 7,826 776 Updated Feb 11, 2024

enhuiz / vall-e

An unofficial PyTorch implementation of the audio LM VALL-E

Python 2,988 417 Updated May 10, 2023

Zain-Jiang / Speech-Editing-Toolkit

It's a repository for implementations of neural speech editing algorithms.

Python 194 19 Updated Jan 9, 2024

microsoft / NeuralSpeech

Python 1,410 180 Updated Feb 11, 2024

zelokuo / VPIDM_demo

JavaScript 1 Updated Mar 23, 2024

open-mmlab / Amphion

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audi…

Python 8,694 678 Updated Mar 3, 2025

judiebig / DR-DiffuSE

Revisiting Denoising Diffusion Probabilistic Models for Speech Enhancement: Condition Collapse, Efficiency and Refinement, Thirty-Seventh AAAI Conference on Artificial Intelligence (AAAI), 2023.

Python 39 5 Updated Dec 5, 2023

gabrielmittag / NISQA

NISQA - Non-Intrusive Speech Quality and TTS Naturalness Assessment

Python 747 126 Updated Dec 1, 2024

zelokuo / VPIDM

Official repository of a new SOTA diffusion-model-based method for speech enhancement

Python 38 8 Updated Jul 31, 2024

diff-usion / Awesome-Diffusion-Models

A collection of resources and papers on Diffusion Models

HTML 11,540 968 Updated Aug 1, 2024

neillu23 / DiffuSE

Python 34 5 Updated Aug 21, 2021

zelokuo / gumbel-rao-pytorch

Forked from nshepperd/gumbel-rao-pytorch

Python 1 Updated Jul 25, 2021

Jasonlee1995 / Gumbel_Softmax

Unofficial PyTorch implementation of the papers 'Categorical Reparameterization with Gumbel-Softmax' and 'The Concrete Distribution: A Continuous Relaxation of Discrete Random Variables'

Jupyter Notebook 11 1 Updated Apr 27, 2021

Lightning-AI / pytorch-lightning

Pretrain, finetune ANY AI model of ANY size on multiple GPUs, TPUs with zero code changes.

Python 29,142 3,455 Updated Mar 17, 2025

key2miao / TSTNN

Transformer-based neural network for speech enhancement in the time domain

Python 69 13 Updated Mar 3, 2022

sooftware / conformer

[Unofficial] PyTorch implementation of "Conformer: Convolution-augmented Transformer for Speech Recognition" (INTERSPEECH 2020)

Python 1,006 183 Updated Dec 22, 2023

tvuong123 / ModulationDomainLoss

Official repo for "A MODULATION-DOMAIN LOSS FOR NEURAL-NETWORK-BASED REAL-TIME SPEECH ENHANCEMENT" to appear in ICASSP 2021

Jupyter Notebook 39 4 Updated Oct 14, 2021

raymondxyy / strfnet-IS2020

Official repo for the STRFNet system that appeared in INTERSPEECH 2020

Python 12 1 Updated Mar 6, 2021

mkurop / composite-measure

Composite measure of speech quality from the book by Philipos Loizou, "Speech Enhancement - Theory and Practice"

C++ 6 3 Updated Jun 24, 2021

jzi040941 / PercepNet

Unofficial implementation of PercepNet: A Perceptually-Motivated Approach for Low-Complexity, Real-Time Enhancement of Fullband Speech

C++ 338 94 Updated Jan 22, 2023

schmiph2 / pysepm

Python implementation of performance metrics in Loizou's Speech Enhancement book

Python 408 89 Updated Feb 15, 2025

pranaymanocha / PerceptualAudio

Perceptual Metrics of Audio - perceptually relevant loss function. DPAM and CDPAM

Python 360 33 Updated Mar 24, 2023


ZackGuo zelokuo
