Tsinghua University, KEG Group
Beijing, China
(UTC +08:00) · @xujz0703
in/jiazheng-xu-6a96a11b2
https://www.semanticscholar.org/author/Jiazheng-Xu/2214082934

Stars
Official repository of "Visual-RFT: Visual Reinforcement Fine-Tuning"
EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL
Solve Visual Understanding with Reinforced VLMs
CodeI/O: Condensing Reasoning Patterns via Code Input-Output Prediction
Infinity ∞ : Scaling Bitwise AutoRegressive Modeling for High-Resolution Image Synthesis
MoBA: Mixture of Block Attention for Long-Context LLMs
Extend OpenRLHF to support LMM RL training for reproduction of DeepSeek-R1 on multimodal tasks.
Qwen2.5-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.
verl: Volcano Engine Reinforcement Learning for LLMs
[ICML 2024] Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model
A fork to add multimodal model training to open-r1
A journey to real multimodal R1! We are running large-scale experiments.
Qwen2.5 is the large language model series developed by Qwen team, Alibaba Cloud.
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o performance.
Align Anything: Training All-modality Model with Feedback
Official codebase for "STAIR: Improving Safety Alignment with Introspective Reasoning"
An LLM-powered knowledge curation system that researches a topic and generates a full-length report with citations.
Witness the aha moment of VLM with less than $3.
Official codebase for Margin-aware Preference Optimization for Aligning Diffusion Models without Reference (MaPO).
RichHF-18K dataset contains rich human feedback labels we collected for our CVPR'24 paper: https://arxiv.org/pdf/2312.10240, along with the file name of the associated labeled images (no urls or im…
Scalable toolkit for efficient model alignment
VisionReward: Fine-Grained Multi-Dimensional Human Preference Learning for Image and Video Generation
Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting yo…
An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & RingAttention & RFT)