Python 24 2 Updated Jun 30, 2025

Unsupervised WaveNet-based Singing Voice Conversion Using Pitch Augmentation and Two-phase Approach

Python 70 15 Updated Oct 27, 2022

[ICASSP 2025] FreeSVC: Towards Zero-shot Multilingual Singing Voice Conversion

Python 79 9 Updated Jul 23, 2025

VoiceBench: Benchmarking LLM-Based Voice Assistants

Python 289 16 Updated Aug 22, 2025

MiMo-Audio: Audio Language Models are Few-Shot Learners

Python 716 64 Updated Sep 20, 2025

VoxCPM: Tokenizer-Free TTS for Context-Aware Speech Generation and True-to-Life Voice Cloning

Python 1,509 158 Updated Sep 28, 2025

Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable…

Jupyter Notebook 22,507 2,450 Updated Mar 13, 2025

Long-form streaming TTS system for multi-speaker dialogue generation

Python 724 89 Updated Sep 17, 2025

Alignment examples for Interspeech 2024

27 Updated Jul 5, 2024

Official code for EZ-VC: Easy Zero-shot Any-to-Any Voice Conversion [EMNLP 2025 Findings]

Python 24 6 Updated Sep 9, 2025

Readable Implementation of Ministral 8B from Mistral

Python 3 Updated Sep 12, 2025

Customize Soundstorm for voice conversion use case

Python 2 Updated Jan 5, 2024

Try to replicate the architecture of MiniMaxTTS mentioned in its technical report

Python 48 3 Updated Sep 2, 2025

DiFlow-TTS delivers low-latency zero-shot TTS via discrete flow matching and factorized speech tokens. A compact, open framework for fast voice synthesis.🐙

Python 43 4 Updated Sep 30, 2025

Step-Audio 2 is an end-to-end multi-modal large language model designed for industry-strength audio understanding and speech conversation.

Python 1,126 78 Updated Sep 22, 2025

Readable implementation of Qwen3 0.6B model

Python 4 1 Updated Sep 4, 2025

AudioStory: Generating Long-Form Narrative Audio with Large Language Models

Jupyter Notebook 278 17 Updated Sep 21, 2025

This repository contains a series of works on diffusion-based speech tokenizers, including the official implementation of the paper: "TaDiCodec: Text-aware Diffusion Speech Tokenizer for Speech Lan…

Python 179 12 Updated Sep 21, 2025

Frontier Open-Source Text-to-Speech

9,376 1,135 Updated Sep 5, 2025

A Unified Framework for Expressive Speech Synthesis with Voice Cloning

Python 376 31 Updated Aug 18, 2025
Python 44 7 Updated Jul 16, 2025

Official PyTorch Implementation for "DDDM-VC: Decoupled Denoising Diffusion Models with Disentangled Representation and Prior Mixup for Verified Robust Voice Conversion" (AAAI 2024)

Python 236 24 Updated Jul 31, 2024

Text-audio foundation model from Boson AI

Python 7,382 532 Updated Sep 15, 2025

Self-Supervised Speech Pre-training and Representation Learning Toolkit

Python 2,459 512 Updated Jun 13, 2025

PyTorch Implementation of TCSinger 2 (ACL 2025): Customizable Multilingual Zero-shot Singing Voice Synthesis

Python 148 27 Updated Sep 4, 2025

A repo that builds text-to-music datasets from scratch, used in MuseControlLite [ICML 2025]

Python 25 Updated May 20, 2025

Reproduction code for Qwen-Audio fine-tuning

Python 49 4 Updated Aug 15, 2024

Extending the StableTTS project for the application of updating recordings to content which has changed.

Python 1 1 Updated Mar 6, 2025

Next-generation TTS model using flow-matching and DiT, inspired by Stable Diffusion 3

Python 427 45 Updated Sep 13, 2024

Repo for StableTTS consistency flow matching

Python 6 1 Updated Feb 18, 2025