Starred repositories (yasiral)

51 stars written in Python

🙌 OpenHands: Code Less, Make More

Python 63,880 7,722 Updated Sep 30, 2025

🚀🤖 Crawl4AI: Open-source LLM Friendly Web Crawler & Scraper. Don't be shy, join here: https://discord.gg/jP8KfhDhyN

Python 54,029 5,391 Updated Sep 30, 2025

ALL IN ONE Hacking Tool For Hackers

Python 53,662 5,814 Updated Mar 3, 2025

Transforms complex documents like PDFs into LLM-ready markdown/JSON for your Agentic workflows.

Python 45,023 3,732 Updated Sep 29, 2025

We write your reusable computer vision tools. 💜

Python 35,424 2,911 Updated Sep 29, 2025

Qwen3 is the large language model series developed by Qwen team, Alibaba Cloud.

Python 24,822 1,734 Updated Sep 28, 2025

SOTA Open Source TTS

Python 23,040 1,901 Updated Sep 23, 2025

Python scraper based on AI

Python 21,404 1,830 Updated Aug 13, 2025

Educational framework exploring ergonomic, lightweight multi-agent orchestration. Managed by OpenAI Solution team.

Python 20,469 2,196 Updated Mar 11, 2025

Open Source AI Platform - AI Chat with advanced features that works with every LLM

Python 14,935 2,006 Updated Sep 30, 2025

Automate browser-based workflows with LLMs and Computer Vision

Python 14,497 1,234 Updated Sep 30, 2025

Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"

Python 13,319 1,956 Updated Sep 13, 2025

Text and image to video generation: CogVideoX (2024) and CogVideo (ICLR 2023)

Python 11,967 1,190 Updated Sep 7, 2025

Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audio codec.

Python 8,956 796 Updated Sep 25, 2025
Python 8,652 513 Updated Oct 9, 2024

Retrieval Augmented Generation (RAG) chatbot powered by Weaviate

Python 7,356 815 Updated Jul 14, 2025

Official release of InternLM series (InternLM, InternLM2, InternLM2.5, InternLM3).

Python 7,065 497 Updated Jul 24, 2025

High-resolution models for human tasks.

Python 5,161 299 Updated Nov 18, 2024

[ECCV2024] IDM-VTON : Improving Diffusion Models for Authentic Virtual Try-on in the Wild

Python 4,717 765 Updated Mar 7, 2025

Create 🔥 videos with Stable Diffusion by exploring the latent space and morphing between text prompts

Python 4,624 447 Updated Sep 21, 2024

Kolors Team

Python 4,545 345 Updated Nov 13, 2024

Official implementations for paper: Anydoor: zero-shot object-level image customization

Python 4,188 372 Updated Apr 8, 2024

[NeurIPS 2024] Official code for PuLID: Pure and Lightning ID Customization via Contrastive Alignment

Python 3,480 258 Updated Jul 31, 2025

Open-source multimodal large language model that can hear and talk while thinking, featuring real-time end-to-end speech input and streaming audio output conversational capabilities.

Python 3,416 290 Updated Nov 5, 2024

LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.

Python 3,073 216 Updated May 19, 2025

RF-DETR is a real-time object detection model architecture developed by Roboflow, SOTA on COCO and designed for fine-tuning.

Python 3,064 364 Updated Sep 29, 2025

Streamline the fine-tuning process for multimodal models: PaliGemma 2, Florence-2, and Qwen2.5-VL

Python 2,633 217 Updated Sep 29, 2025

A simple, high-quality voice conversion tool focused on ease of use and performance.

Python 2,626 435 Updated Sep 29, 2025

An AI-powered file management tool that ensures privacy by organizing local texts, images. Using Llama3.2 3B and Llava v1.6 models with the Nexa SDK, it intuitively scans, restructures, and organiz…

Python 2,569 233 Updated Oct 21, 2024

The code used to train and run inference with the ColVision models, e.g. ColPali, ColQwen2, and ColSmol.

Python 2,231 199 Updated Sep 30, 2025