xujz18

Organizations

@THUDM


Official repository of "Visual-RFT: Visual Reinforcement Fine-Tuning"

Python 1,238 56 Updated Mar 12, 2025

EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL

Python 1,436 85 Updated Mar 14, 2025

Solve Visual Understanding with Reinforced VLMs

Python 4,101 255 Updated Mar 14, 2025

CodeI/O: Condensing Reasoning Patterns via Code Input-Output Prediction

Python 467 29 Updated Feb 21, 2025

Infinity ∞ : Scaling Bitwise AutoRegressive Modeling for High-Resolution Image Synthesis

Python 1,004 42 Updated Feb 23, 2025

MoBA: Mixture of Block Attention for Long-Context LLMs

Python 1,659 99 Updated Mar 7, 2025

Extend OpenRLHF to support LMM RL training for reproduction of DeepSeek-R1 on multimodal tasks.

Python 526 31 Updated Mar 13, 2025

Qwen2.5-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.

Jupyter Notebook 8,688 616 Updated Mar 7, 2025

verl: Volcano Engine Reinforcement Learning for LLMs

Python 4,778 465 Updated Mar 14, 2025
JavaScript 46 6 Updated Nov 5, 2024

[ICML 2024] Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model

Python 3,285 221 Updated Feb 13, 2025

A fork to add multimodal model training to open-r1

Python 1,052 54 Updated Feb 8, 2025

A journey to a real multimodal R1! We are running large-scale experiments.

Python 272 5 Updated Mar 8, 2025

Qwen2.5 is the large language model series developed by Qwen team, Alibaba Cloud.

Shell 16,167 1,125 Updated Mar 14, 2025

[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o performance.

Python 7,239 559 Updated Feb 26, 2025

Align Anything: Training All-modality Model with Feedback

Python 2,792 365 Updated Mar 14, 2025

Official codebase for "STAIR: Improving Safety Alignment with Introspective Reasoning"

Python 26 1 Updated Feb 26, 2025

An LLM-powered knowledge curation system that researches a topic and generates a full-length report with citations.

Python 23,383 2,024 Updated Jan 23, 2025

Witness the aha moment of VLM with less than $3.

Python 3,249 255 Updated Mar 1, 2025

Official codebase for Margin-aware Preference Optimization for Aligning Diffusion Models without Reference (MaPO).

Python 71 8 Updated Jun 11, 2024

RichHF-18K dataset contains rich human feedback labels we collected for our CVPR'24 paper: https://arxiv.org/pdf/2312.10240, along with the file name of the associated labeled images (no urls or im…

121 4 Updated Jun 25, 2024

Scalable toolkit for efficient model alignment

Python 742 90 Updated Mar 14, 2025
4 Updated Feb 4, 2025

VisionReward: Fine-Grained Multi-Dimensional Human Preference Learning for Image and Video Generation

Python 174 3 Updated Feb 17, 2025

Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting yo…

TypeScript 82,145 12,066 Updated Mar 14, 2025

An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & RingAttention & RFT)

Python 5,630 550 Updated Mar 14, 2025