UC Santa Cruz
xk-huang.github.io
Stars
Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 500+ LLMs (Qwen3, Qwen3-MoE, Llama4, GLM4.5, InternLM3, DeepSeek-R1, ...) and 200+ MLLMs (Qwen2.5-VL, Qwen2.5-Omni, InternVL3.5, Ovis2.5, Llava, GLM4v…
The most open diffusion language model for code generation — releasing pretraining, evaluation, inference, and checkpoints.
Tooling to download and prepare the ILSVRC2012 dataset
p-doom / jasmine
Forked from FLAIROx/jafar
A simple, performant and scalable JAX-based world modeling codebase
MedVLThinker: Simple Baselines for Multimodal Medical Reasoning
Pocket Flow: 100-line LLM framework. Let Agents build Agents!
Official inference code and LongText-Bench benchmark for our paper X-Omni (https://arxiv.org/pdf/2507.22058).
GPT-IMAGE-EDIT-1.5M: A Million-Scale, GPT-Generated Image Dataset
Wan: Open and Advanced Large-Scale Video Generative Models
SVG Differentiable Rendering: Generating vector graphics using neural networks. Support: text-to-SVG, Image-to-SVG, SVG Editing.
An open-source AI agent that brings the power of Gemini directly into your terminal.
Get started with building Fullstack Agents using Gemini 2.5 and LangGraph
Official PyTorch implementation for "Large Language Diffusion Models"
🤗 smolagents: a barebones library for agents that think in code.
A Large-Scale, High-Quality Math Dataset for Reinforcement Learning in Language Models
Matrix-Game 2.0: An Open-Source, Real-Time, and Streaming Interactive World Model
UniVG-R1: Reasoning Guided Universal Visual Grounding with Reinforcement Learning
The simplest, fastest repository for training/finetuning medium-sized GPTs.
[SIGGRAPH 2025] Official code of the paper "FlexiAct: Towards Flexible Action Control in Heterogeneous Scenarios"
X-Reasoner: Towards Generalizable Reasoning Across Modalities and Domains
EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL
🌟 The Multi-Agent Framework: First AI Software Company, Towards Natural Language Programming
MAGI-1: Autoregressive Video Generation at Scale
Lightweight coding agent that runs in your terminal
Single-file implementation to advance vision-language-action (VLA) models with reinforcement learning.
Skywork-R1V is an advanced multimodal AI model series developed by Skywork AI (Kunlun Inc.), specializing in vision-language reasoning.