152334H

Organizations

@IRS-Cybersec


Base docker image used in Codex environments

Dockerfile 615 173 Updated Aug 28, 2025

Big & Small LLMs working together

Python 1,170 131 Updated Sep 30, 2025

Shared Middle-Layer for Triton Compilation

MLIR 288 78 Updated Sep 26, 2025

DeepEP: an efficient expert-parallel communication library

Cuda 8,568 941 Updated Sep 30, 2025

🐳 Efficient Triton implementations for "Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention"

Python 880 45 Updated Mar 19, 2025

MLGym: A New Framework and Benchmark for Advancing AI Research Agents

Python 557 55 Updated Aug 10, 2025

Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation

7,922 284 Updated May 15, 2025

Implementation of the sparse attention pattern proposed by the Deepseek team in their "Native Sparse Attention" paper

Python 762 48 Updated Aug 15, 2025

This repo contains the dataset and code for the paper "SWE-Lancer: Can Frontier LLMs Earn $1 Million from Real-World Freelance Software Engineering?"

1,438 138 Updated Jul 18, 2025

Code, Data and Red Teaming for ZeroBench

46 3 Updated May 3, 2025

Official PyTorch implementation for "Large Language Diffusion Models"

Python 2,972 195 Updated Sep 17, 2025

Learnings and programs related to CUDA

Cuda 418 18 Updated Jun 29, 2025

A high-efficiency system-on-chip for floating-point compute workloads.

Python 43 19 Updated Jan 13, 2025

Scalable RL solution for advanced reasoning of language models

Python 1,742 100 Updated Mar 18, 2025

CUDA/Metal accelerated language model inference

C 614 29 Updated May 29, 2025

Tritonbench is a collection of PyTorch custom operators with example inputs to measure their performance.

Python 230 42 Updated Sep 30, 2025

Test suite for probing the numerical behavior of NVIDIA tensor cores

Cuda 41 13 Updated Jul 24, 2024

Code for Paper: Training Software Engineering Agents and Verifiers with SWE-Gym [ICML 2025]

Jupyter Notebook 545 33 Updated Jul 29, 2025

[ICML2025] Aguvis: Unified Pure Vision Agents for Autonomous GUI Interaction

Python 361 24 Updated Mar 7, 2025

Simplifying reinforcement learning for complex game environments

C 3,437 250 Updated Sep 30, 2025

Scalable and Performant Data Loading

Python 304 16 Updated Sep 20, 2025

How to optimize some algorithms in CUDA.

Cuda 2,530 228 Updated Sep 30, 2025

TORCH_LOGS parser for PT2

Rust 61 19 Updated Sep 20, 2025

Fastest kernels written from scratch

Cuda 361 48 Updated Sep 18, 2025

prime is a framework for efficient, globally distributed training of AI models over the internet.

Python 827 88 Updated May 22, 2025

Run Slurm in Kubernetes

Go 288 40 Updated Sep 30, 2025
Python 101 12 Updated Apr 24, 2025

Supporting PyTorch FSDP for optimizers

Python 84 4 Updated Dec 8, 2024

Fault tolerance for PyTorch (HSDP, LocalSGD, DiLoCo, Streaming DiLoCo)

Python 411 46 Updated Sep 29, 2025