A lightweight, powerful framework for multi-agent workflows

Python 5,826 551 Updated Mar 13, 2025

A neural full-band audio codec for general audio sampled at 48 kHz, operating at 7.5 kbps or 4.5 kbps.

Python 115 11 Updated Mar 11, 2025

Source & evaluation code for ICAMCS 2024 paper "Emotional Vietnamese Speech-Based Depression Diagnosis Using Dynamic Attention Mechanism"

Jupyter Notebook 2 1 Updated Oct 21, 2024

🐫 CAMEL: Finding the Scaling Law of Agents. The first and the best multi-agent framework. https://www.camel-ai.org

Python 10,251 1,047 Updated Mar 15, 2025

AudioBench: A Universal Benchmark for Audio Large Language Models

Python 156 5 Updated Mar 14, 2025

A Conversational Speech Generation Model

Python 8,934 557 Updated Mar 15, 2025

Implement a ChatGPT-like LLM in PyTorch from scratch, step by step

Jupyter Notebook 41,968 5,734 Updated Mar 14, 2025

Get up and running with Llama 3.3, DeepSeek-R1, Phi-4, Gemma 3, and other large language models.

Go 133,083 10,988 Updated Mar 15, 2025

RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.

TypeScript 44,825 4,034 Updated Mar 15, 2025

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 41,526 6,272 Updated Mar 15, 2025

End-to-end stack for WebRTC. SFU media server and SDKs.

Go 11,896 1,043 Updated Mar 15, 2025

Examples and guides for using the OpenAI API

MDX 62,316 10,067 Updated Mar 13, 2025

A repo containing download guidance and corresponding scripts for the VoxBlink dataset.

Python 25 1 Updated Apr 16, 2024

LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.

Python 2,854 196 Updated Nov 14, 2024

🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.

Python 17,765 1,779 Updated Mar 14, 2025

A fast multimodal LLM for real-time voice

Python 3,721 273 Updated Feb 14, 2025

Build datasets using natural language

Python 428 52 Updated Mar 5, 2025
Python 1 6 Updated Dec 19, 2024

Use Microsoft Edge's online text-to-speech service from Python WITHOUT needing Microsoft Edge or Windows or an API key

Python 7,660 729 Updated Feb 27, 2025

KUIELAB-MDX-Net placed 2nd on Leaderboard A and 3rd on Leaderboard B in the MDX-Challenge at ISMIR 2021

Python 199 23 Updated Feb 27, 2023

Zero-shot voice conversion and singing voice conversion, with real-time support

Python 1,692 188 Updated Mar 11, 2025

SSL Layerwise analysis for speech deepfake detection

Python 20 Updated Feb 17, 2025

https://hf.co/hexgrad/Kokoro-82M

JavaScript 1,657 163 Updated Mar 1, 2025

LLaSA: Scaling Train-time and Inference-time Compute for LLaMA-based Speech Synthesis

Python 460 34 Updated Mar 12, 2025

The official Pytorch implementation of "Frame-wise streaming end-to-end speaker diarization with non-autoregressive self-attention-based attractors". [ICASSP 2024] and "LS-EEND: long-form streaming…

Python 120 5 Updated Mar 4, 2025

Unified automatic quality assessment for speech, music, and sound.

Python 416 26 Updated Mar 7, 2025

Score-based Generative Models (Diffusion Models) for Speech Enhancement and Dereverberation

Python 582 79 Updated Feb 21, 2025

Collection of papers on state-space models

584 20 Updated Mar 2, 2025

Gradio WebUI for creators and developers, featuring key TTS (Edge-TTS, kokoro) and zero-shot Voice Cloning (E2E, F5-TTS, CosyVoice), with Whisper audio processing, RVC voice changer, YouTube downlo…

Python 3,467 262 Updated Mar 14, 2025