
AirLab, Latent AI, AI Monk Labs, The Learning Machines, Ignitus
Pittsburgh, PA · https://linktr.ee/smj007
Starred repositories
anushrisuresh / kv-quantization
Forked from meta-pytorch/gpt-fast. Reduce LLM KV cache memory usage by ~2x with minimal accuracy loss using selective quantization. Experiment with full precision as well as different compression strategies.
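A minimal sketch of the kind of selective KV-cache quantization the fork describes, assuming a per-token int8 scheme (the names and layout here are illustrative, not the repository's code):

```python
import torch

def quantize_kv(x: torch.Tensor):
    """Quantize a (num_tokens, head_dim) cache slice to int8 with per-token scales."""
    scale = x.abs().amax(dim=-1, keepdim=True).clamp(min=1e-6) / 127.0
    q = torch.clamp(torch.round(x / scale), -127, 127).to(torch.int8)
    return q, scale

def dequantize_kv(q: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    return q.to(scale.dtype) * scale

k = torch.randn(128, 64, dtype=torch.float16)      # stand-in for one head's keys
q, s = quantize_kv(k)
recon_err = (dequantize_kv(q, s) - k).abs().max()  # small but nonzero
```

Storing int8 values plus one fp16 scale per token roughly halves the fp16 cache footprint, which is the ~2x figure the description cites.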
[CVPR 2025] DEIM: DETR with Improved Matching for Fast Convergence
Smoothed Gradient Descent-Ascent for Min-Max Optimization
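A toy sketch of the smoothing idea behind the title, assuming the usual proximal-anchor formulation; the objective and constants below are illustrative, not taken from the paper:

```python
import torch

def f(x, y):                      # toy bilinear-plus-quadratic saddle objective
    return x * y + 0.1 * x**2 - 0.1 * y**2

x, y = torch.tensor(1.0), torch.tensor(1.0)
z = x.clone()                     # smoothing anchor for the min player
eta, p, beta = 0.05, 1.0, 0.5     # step size, smoothing weight, anchor tracking rate
for _ in range(500):
    xg, yg = x.clone().requires_grad_(), y.clone().requires_grad_()
    smoothed = f(xg, yg) + 0.5 * p * (xg - z) ** 2  # proximal term stabilizes x
    gx, gy = torch.autograd.grad(smoothed, (xg, yg))
    x, y = x - eta * gx, y + eta * gy  # descent on x, ascent on y
    z = z + beta * (x - z)             # anchor slowly tracks the iterate
# (x, y) approaches the saddle point at the origin
```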
E5-V: Universal Embeddings with Multimodal Large Language Models
Source Code for "Map It Anywhere (MIA): Empowering Bird’s Eye View Mapping using Large-scale Public Data"
Images to inference with no labeling (use foundation models to train supervised models).
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
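For context, a minimal LoRA fine-tuning setup with PEFT's documented API; the base checkpoint and target modules are just one example configuration:

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, TaskType, get_peft_model

model = AutoModelForCausalLM.from_pretrained("facebook/opt-350m")  # example base model
config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,                                   # rank of the low-rank update
    lora_alpha=16,                         # scaling applied to the update
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],   # attention projections to adapt
)
model = get_peft_model(model, config)
model.print_trainable_parameters()         # only the LoRA weights train
```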
Code repository for Zero123++: a Single Image to Consistent Multi-view Diffusion Base Model.
[ICRA'23] BEVFusion: Multi-Task Multi-Sensor Fusion with Unified Bird's-Eye View Representation
Monocular Depth Estimation Toolbox based on MMSegmentation.
Generative Models by Stability AI
Original reference implementation of "3D Gaussian Splatting for Real-Time Radiance Field Rendering"
Framework for Analysis of Class-Incremental Learning with 12 state-of-the-art methods and 3 baselines.
Hello AI World guide to deploying deep-learning inference networks and deep vision primitives with TensorRT and NVIDIA Jetson.
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
Caption-Anything is a versatile tool combining image segmentation, visual captioning, and ChatGPT, generating tailored captions with diverse controls for user preferences. https://huggingface.co/sp…
(CVPR2023/TPAMI2024) Integrally Pre-Trained Transformer Pyramid Networks -- A Hierarchical Vision Transformer for Masked Image Modeling
The repository provides code for running inference with the SegmentAnything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
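Point-prompted inference follows the pattern in the repository's README; the checkpoint filename, dummy image, and click coordinates below are placeholders:

```python
import numpy as np
from segment_anything import sam_model_registry, SamPredictor

sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
predictor = SamPredictor(sam)

image = np.zeros((480, 640, 3), dtype=np.uint8)  # HxWx3 RGB image placeholder
predictor.set_image(image)
masks, scores, logits = predictor.predict(
    point_coords=np.array([[320, 240]]),  # one foreground click at (x, y)
    point_labels=np.array([1]),           # 1 = foreground, 0 = background
    multimask_output=True,                # return three candidate masks
)
```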
🐙 Guides, papers, lectures, notebooks, and resources for prompt engineering
A resource for learning about machine learning & deep learning
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
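A minimal sketch of handing a model to `deepspeed.initialize`; the ZeRO stage-2 config dict is illustrative, and real jobs are usually launched via the `deepspeed` CLI:

```python
import torch
import deepspeed

model = torch.nn.Linear(512, 512)                 # stand-in model
ds_config = {
    "train_micro_batch_size_per_gpu": 8,
    "optimizer": {"type": "Adam", "params": {"lr": 1e-4}},
    "zero_optimization": {"stage": 2},            # shard optimizer state + grads
}
engine, optimizer, _, _ = deepspeed.initialize(
    model=model, model_parameters=model.parameters(), config=ds_config
)
# engine.backward(loss) and engine.step() replace the usual PyTorch calls
```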
Implementation of the LLaMA language model based on nanoGPT. Supports flash attention, Int8 and GPTQ 4-bit quantization, LoRA and LLaMA-Adapter fine-tuning, and pre-training. Apache 2.0-licensed.
Convex optimizers for LASSO, including subgradient, projected gradient, proximal gradient, smoothing, Lagrangian, and stochastic gradient descent variants.
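As a concrete instance of one of these solvers, a self-contained proximal-gradient (ISTA) sketch for minimizing 0.5·||Ax - b||² + λ||x||₁:

```python
import numpy as np

def soft_threshold(v, tau):
    """Prox of tau*||.||_1: shrink each coordinate toward zero by tau."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def ista(A, b, lam, iters=500):
    """Proximal gradient descent on 0.5*||Ax - b||^2 + lam*||x||_1."""
    x = np.zeros(A.shape[1])
    step = 1.0 / np.linalg.norm(A, 2) ** 2   # 1/L, L = squared spectral norm of A
    for _ in range(iters):
        grad = A.T @ (A @ x - b)             # gradient of the smooth part
        x = soft_threshold(x - step * grad, step * lam)
    return x

rng = np.random.default_rng(0)
A = rng.normal(size=(100, 50))
x_true = np.zeros(50)
x_true[:5] = rng.normal(size=5)              # sparse ground truth
x_hat = ista(A, A @ x_true, lam=0.1)         # recovers the sparse support
```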