Stars
This repository contains the official implementation of "FastVLM: Efficient Vision Encoding for Vision Language Models" - CVPR 2025
Minimal reproduction of DeepSeek R1-Zero
Use PEFT or full-parameter training for CPT/SFT/DPO/GRPO on 500+ LLMs (Qwen3, Qwen3-MoE, Llama4, GLM4.5, InternLM3, DeepSeek-R1, ...) and 200+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, Llava, GLM4v, Ph…
AISystem covers AI systems, i.e., the full-stack low-level technologies such as AI chips, AI compilers, and AI inference and training frameworks
Learning paths and knowledge summaries for machine learning and deep learning
VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs
Monkey (LMM): Image Resolution and Text Label Are Important Things for Large Multi-modal Models (CVPR 2024 Highlight)
Official code for Goldfish model for long video understanding and MiniGPT4-video for short video understanding
VILA is a family of state-of-the-art vision language models (VLMs) for diverse multimodal AI tasks across the edge, data center, and cloud.
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
An Autonomous LLM Agent for Complex Task Solving
Create Customized Software using Natural Language Idea (through LLM-powered Multi-Agent Collaboration)
🚀 Power Your World with AI - Explore, Extend, Empower.
PyTorch code and models for V-JEPA self-supervised learning from video.
MiniCPM4 & MiniCPM4.1: Ultra-Efficient LLMs on End Devices, achieving a 3x+ generation speedup on reasoning tasks
Official Code for DragGAN (SIGGRAPH 2023)
[CVPR 2024] Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data. Foundation Model for Monocular Depth Estimation
🔥🔥🔥 [IEEE TCSVT] Latest papers, code, and datasets on Vid-LLMs.
[EMNLP 2024 🔥] Video-LLaVA: Learning United Visual Representation by Alignment Before Projection
Official code implementation of Vary-toy (Small Language Model Meets with Reinforced Vision Vocabulary)
The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.
A collection of papers on the topic of "Computer Vision in the Wild (CVinW)"
Official implementation for "Multimodal Chain-of-Thought Reasoning in Language Models" (stay tuned; more updates to come)
Data and code for NeurIPS 2022 Paper "Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering".