Costa Huang vwxyzjn

😃

Prev: RL @allenai @huggingface.

1.7k followers · 127 following

@huggingface
Philadelphia, PA
https://costa.sh
@vwxyzjn

Stars (each entry: repository, description, then language, star count, fork count, last update)

AlongWY / TransformerEngine_wheels

wheels for TransformerEngine

Python 4 1 Updated Sep 21, 2025

llm-d / llm-d

llm-d enables high-performance distributed LLM inference on Kubernetes

Makefile 1,803 172 Updated Sep 30, 2025

kvcache-ai / Mooncake

Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.

C++ 4,036 388 Updated Sep 29, 2025

NVIDIA-NeMo / RL

Scalable toolkit for efficient model reinforcement

Python 905 144 Updated Sep 30, 2025

flexagoon / ream

Python 25 1 Updated Jul 31, 2025

cohere-ai / cohere-terrarium

A simple Python sandbox for helpful LLM data agents

Python 285 46 Updated Jun 18, 2024

PrimeIntellect-ai / prime-rl

Async RL Training at Scale

Python 658 104 Updated Sep 30, 2025

joerick / pyinstrument

🚴 Call stack profiler for Python. Shows you why your code is slow!

Python 7,370 249 Updated Sep 22, 2025

McGill-NLP / nano-aha-moment

Single File, Single GPU, From Scratch, Efficient, Full Parameter Tuning library for "RL for LLMs"

Jupyter Notebook 535 51 Updated Jul 7, 2025

hendrycks / apps

APPS: Automated Programming Progress Standard (NeurIPS 2021)

Python 488 67 Updated Jun 19, 2024

amir20 / dozzle

Realtime log viewer for containers. Supports Docker, Swarm and K8s.

Go 9,705 413 Updated Sep 29, 2025

SzymonOzog / GPU_Programming

Python 78 8 Updated Sep 21, 2025

huggingface / Math-Verify

Python 948 44 Updated Jul 2, 2025

deepseek-ai / smallpond

A lightweight data processing framework built on DuckDB and 3FS.

Python 4,788 420 Updated Mar 5, 2025

allenai / olmocr

Toolkit for linearizing PDFs for LLM datasets/training

Python 14,174 1,066 Updated Sep 29, 2025

codingfisch / flashrl

Fast reinforcement learning 💨

Cython 26 1 Updated Jul 15, 2025

nebius / kvax

A FlashAttention implementation for JAX with support for efficient document mask computation and context parallelism.

Python 143 8 Updated Apr 11, 2025

PrimeIntellect-ai / verifiers

Environments for LLM Reinforcement Learning

Python 3,222 363 Updated Sep 30, 2025

pytorch / torchtitan

A PyTorch native platform for training generative AI models

Python 4,480 539 Updated Sep 30, 2025

joey00072 / nanoGRPO

nanoGRPO is a lightweight implementation of Group Relative Policy Optimization (GRPO)

Python 120 6 Updated May 8, 2025

AI-Hypercomputer / JetStream

JetStream is a throughput and memory optimized engine for LLM inference on XLA devices, starting with TPUs (and GPUs in future -- PRs welcome).

Python 380 53 Updated Jun 10, 2025

deepseek-ai / FlashMLA

FlashMLA: Efficient Multi-head Latent Attention Kernels

C++ 11,754 902 Updated Sep 29, 2025

deepseek-ai / open-infra-index

Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation

7,922 284 Updated May 15, 2025

keraJLi / rejax

Hardware-Accelerated Reinforcement Learning Algorithms in pure Jax!

Python 236 19 Updated May 26, 2025

open-thought / reasoning-gym

[NeurIPS 2025 Spotlight] Reasoning Environments for Reinforcement Learning with Verifiable Rewards

Python 1,165 94 Updated Sep 29, 2025

jingyaogong / minimind

🚀🚀 Train a 26M-parameter GPT from scratch in just 2h! 🌏

Python 26,901 3,200 Updated Apr 30, 2025

kvcache-ai / ktransformers

A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations

Python 15,118 1,091 Updated Sep 16, 2025

simplescaling / s1

s1: Simple test-time scaling

Python 6,561 765 Updated Jun 25, 2025

unikernelLinux / ukl

Unikernel Linux

C 210 20 Updated Aug 13, 2025

huggingface / picotron_tutorial

Python 221 34 Updated Feb 13, 2025

Lists (5)

🔥 CleanRL-supported Projects

Fancy Tech

🔮 Future tech

🚀 My stack

🔨 My tools
