Stars: 3 repositories written in CUDA
DeepEP: an efficient expert-parallel communication library
DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling
Implement Flash Attention using CuTe.