Yingshi New Concept English

645 126 Updated Sep 8, 2022

Official release of InternLM series (InternLM, InternLM2, InternLM2.5, InternLM3).

Python 7,064 497 Updated Jul 24, 2025

Documentation for Google's Gen AI site - including the Gemini API and Gemma

Jupyter Notebook 2,155 734 Updated Sep 8, 2025

Reproduce R1 Zero on Logic Puzzle

Python 2,399 163 Updated Mar 20, 2025

s1: Simple test-time scaling

Python 6,562 765 Updated Jun 25, 2025

The official repo of MiniMax-Text-01 and MiniMax-VL-01, large-language-model & vision-language-model based on Linear Attention

Python 3,157 292 Updated Jul 7, 2025

[NeurIPS 2024 Best Paper Award][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". A…

Jupyter Notebook 8,417 541 Updated May 18, 2025

More relighting!

Python 8,231 519 Updated Feb 20, 2025

Official codebase used to develop Vision Transformer, SigLIP, MLP-Mixer, LiT and more.

Jupyter Notebook 3,158 198 Updated May 19, 2025

Notes mainly covering multimodal knowledge for large language model (LLM) algorithm (application) engineers

HTML 247 7 Updated May 12, 2024

Code for acl2017 paper "An unsupervised neural attention model for aspect extraction"

Python 339 117 Updated Aug 2, 2024

Unsupervised Aspect Extraction for restaurant review dataset using mistral-7b model.

Jupyter Notebook 2 Updated Oct 27, 2023

Lumina-T2X is a unified framework for Text to Any Modality Generation

Python 2,223 92 Updated Feb 16, 2025

The ultimate training toolkit for finetuning diffusion models

Python 6,415 752 Updated Sep 30, 2025

The most powerful and modular diffusion model GUI, api and backend with a graph/nodes interface.

Python 89,755 10,028 Updated Oct 1, 2025

Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 500+ LLMs (Qwen3, Qwen3-MoE, Llama4, GLM4.5, InternLM3, DeepSeek-R1, ...) and 200+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, Llava, GLM4v, Ph…

Python 10,166 893 Updated Oct 1, 2025

Official inference repo for FLUX.1 models

Python 24,388 1,783 Updated Jul 31, 2025

Let us control diffusion models!

Python 33,118 2,970 Updated Feb 25, 2024

Qwen3-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.

Jupyter Notebook 13,486 1,026 Updated Sep 28, 2025

Official implementation code of the paper <AnyText: Multilingual Visual Text Generation And Editing>

Python 4,772 305 Updated Mar 7, 2025

A collection of AIGC-interview/CV-interview/LLMs-interview questions and answers, along with new ideas, new problems, new resources, and new projects from work and research

2,627 235 Updated Jul 18, 2025

[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o performance

Python 9,272 718 Updated Sep 22, 2025

Modifying LAVIS' BLIP2 Q-former with models pretrained on Japanese datasets.

Python 13 1 Updated Sep 12, 2025

Experiments with LAVIS library to perform image2text and text2image retrieval with BLIP and BLIP2 models

Jupyter Notebook 15 5 Updated Sep 25, 2023

A quietly gathered 2025 collection of 10,000+ Telegram groups, plus the most fun and useful bots on the web 🤖 [dianbaodaohang.com]

18,803 1,288 Updated Sep 24, 2025

LAVIS - A One-stop Library for Language-Vision Intelligence

Jupyter Notebook 10,929 1,067 Updated Nov 18, 2024

This is an official implementation for "Video Swin Transformers".

Python 1,583 210 Updated Mar 8, 2023

Video Swin Transformer - PyTorch

Python 265 38 Updated Jan 4, 2022