Starred repositories
Libriheavy: a 50,000 hours ASR corpus with punctuation casing and context
An open source implementation of Microsoft's VALL-E X zero-shot TTS model. A demo is available at https://plachtaa.github.io
The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.
This is the code for the SpeechTokenizer presented in "SpeechTokenizer: Unified Speech Tokenizer for Speech Language Models". Samples are presented on
Implementation of SoundStorm built upon SpeechTokenizer.
Deezer source separation library including pretrained models.
Core Engine of Singing Voice Conversion & Singing Voice Clone
Implementation of Spear-TTS, a multi-speaker text-to-speech attention network, in PyTorch
[CVPR 2025] Video Narration as Vocabulary & Video as Long Document
A fully working PyTorch implementation of NaturalSpeech (Tan et al., 2022)
An up-to-date scraper for Quora topics and questions (works in 2023)
Let ChatGPT teach your own chatbot in hours with a single GPU!
Due to the restrictions of LLaMA, we reimplement BLOOM-LoRA (the much less restrictive BLOOM license is here: https://huggingface.co/spaces/bigscience/license) using Alpaca-LoRA and Alpaca_data_cleaned.json
The ChatGPT Retrieval Plugin lets you easily find personal or work documents by asking questions in natural language.
PDF GPT allows you to chat with the contents of your PDF file by using GPT capabilities. The most effective open source solution to turn your PDF files into a chatbot!
A repository of Indonesian laws, consisting only of articles and paragraphs.
PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech). Reproduced demo: https://lifeiteng.github.io/valle/index.html
An unofficial PyTorch implementation of the audio LM VALL-E
A repo for distributed training of language models with Reinforcement Learning from Human Feedback (RLHF)
Pipeline for pulling and processing online language model pretraining data from the web
Using Low-rank adaptation to quickly fine-tune diffusion models.
fast-stable-diffusion + DreamBooth
Robust Speech Recognition via Large-Scale Weak Supervision