
Starred repositories
nanomsg-next-generation -- light-weight brokerless messaging
Eclipse iceoryx™ - true zero-copy inter-process-communication
EntityX - A fast, type-safe C++ Entity-Component system
Tengine is a lightweight, high-performance, modular inference engine for embedded devices
A tutorial for learning LLVM: I implement a backend that compiles to machine code for cpu0, a simple RISC CPU.
Lock-free concurrent work stealing deque in C++
NSRunLoop reactor-style implementation: uses BSD kqueue to implement the iOS/Mac NSRunLoop and run-loop-related Foundation features, such as performing a selector (optionally after a delay) on another thread, Timer, URLConne…
A fast work-stealing queue template in C++
Personal Notes for Learning HPC & Parallel Computation [Actively Adding New Content]
Complete implementations from "Algorithms for Modern Hardware"
Access private members and statics of a C++ class
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
Stepwise optimizations of DGEMM on CPU, eventually outperforming Intel MKL, even under multithreading.
An open-source ebook about the TensorFlow kernel and its implementation mechanisms.
OpenBLAS is an optimized BLAS library based on GotoBLAS2 1.13 BSD version.
This repository contains several applications, demonstrating the Meltdown bug.
A collection of C++ headers which make it easier to write Python C extension modules.
A lightweight LLVM Python binding for writing JIT compilers
Automatically exported from code.google.com/p/smhasher
Chinese version of Agner Fog's optimization series
Optimizing Software in C++: unofficial Chinese translation