acul3

Starred repositories

65 stars written in Python

🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.

Python 150,493 30,567 Updated Sep 30, 2025

Robust Speech Recognition via Large-Scale Weak Supervision

Python 88,861 11,064 Updated Sep 8, 2025

Deezer source separation library including pretrained models.

Python 27,414 3,019 Updated Apr 2, 2025

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities

Python 21,752 2,663 Updated Jul 3, 2025

The ChatGPT Retrieval Plugin lets you easily find personal or work documents by asking questions in natural language.

Python 21,230 3,681 Updated Jul 4, 2024

Go ahead and axolotl questions

Python 10,517 1,157 Updated Sep 30, 2025

The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.

Python 8,763 569 Updated May 3, 2024

fast-stable-diffusion + DreamBooth

Python 7,855 1,371 Updated Aug 25, 2025

PDF GPT allows you to chat with the contents of your PDF file by using GPT capabilities. The most effective open source solution to turn your PDF files into a chatbot!

Python 7,151 847 Updated Mar 3, 2025

Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator (SynthDoG), ECCV 2022

Python 6,567 539 Updated Jul 11, 2024

A treasure chest for visual classification and recognition powered by PaddlePaddle

Python 5,730 1,190 Updated Jul 1, 2025

PyTorch native post-training library

Python 5,517 675 Updated Sep 30, 2025

A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF)

Python 4,712 482 Updated Jan 8, 2024

Let ChatGPT teach your own chatbot in hours with a single GPU!

Python 3,171 286 Updated Mar 17, 2024

An unofficial PyTorch implementation of the audio LM VALL-E

Python 2,993 413 Updated May 10, 2023

Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verified research papers.

Python 2,899 219 Updated Sep 29, 2025
Python 2,890 331 Updated Sep 26, 2025

Core Engine of Singing Voice Conversion & Singing Voice Clone

Python 2,823 923 Updated Apr 23, 2024

Official repository of OFA (ICML 2022). Paper: OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework

Python 2,538 248 Updated Apr 24, 2024

Scrape Twitter for Tweets

Python 2,454 575 Updated Oct 5, 2022

PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech). Reproduced demo: https://lifeiteng.github.io/valle/index.html

Python 2,171 331 Updated Sep 10, 2025

Scalable data preprocessing and curation toolkit for LLMs

Python 1,160 181 Updated Sep 30, 2025

A family of lightweight multimodal models.

Python 1,044 77 Updated Nov 18, 2024

An implementation of Tacotron 2 that supports multilingual experiments with parameter-sharing, code-switching, and voice cloning.

Python 838 158 Updated Oct 10, 2023

Verbatim Automatic Speech Recognition with improved word-level timestamps and filler detection

Python 827 44 Updated Jun 3, 2025

All-in-one text de-duplication

Python 718 74 Updated Aug 31, 2025

Grounded search engine (i.e. with source reference) based on LLM / ChatGPT / OpenAI API. It supports web search, file content search etc.

Python 701 74 Updated Aug 25, 2024

Original Implementation of Prompt Tuning from Lester, et al, 2021

Python 695 60 Updated Mar 6, 2025

This is the code for the SpeechTokenizer presented in the SpeechTokenizer: Unified Speech Tokenizer for Speech Language Models. Samples are presented on

Python 603 55 Updated Jun 9, 2024