Note: DO NOT USE IT! THIS CODE IS PROVEN TO CONTAIN DATA LEAKAGE! Archive version of "Text Is MASS: Modeling as Stochastic Embedding for Text-Video Retrieval (CVPR 2024 Highlight)"

Python 18 5 Updated May 1, 2025

google-deepmind / robovqa

Jupyter Notebook 30 5 Updated Dec 13, 2023

UT-Austin-RPL / VIOLA

Official implementation for VIOLA

Python 120 7 Updated Jun 18, 2023

xuguohai / X-CLIP

An official implementation for "X-CLIP: End-to-End Multi-grained Contrastive Learning for Video-Text Retrieval"

Python 174 19 Updated Apr 6, 2024

JingXiaolun / TC-MGC

Python 5 Updated May 26, 2025

ruc-aimc-lab / TeachCLIP

[CVPR 2024] TeachCLIP for Text-to-Video Retrieval

Python 40 Updated May 7, 2025

NVlabs / EAGLE

Eagle: Frontier Vision-Language Models with Data-Centric Strategies

Python 876 48 Updated Aug 8, 2025

DAMO-NLP-SG / VideoLLaMA3

Frontier Multimodal Foundation Models for Image and Video Understanding

Jupyter Notebook 996 70 Updated Aug 14, 2025

rail-berkeley / bridge_data_v2

Python 216 27 Updated Mar 17, 2024

openvla / openvla

Forked from TRI-ML/prismatic-vlms

OpenVLA: An open-source vision-language-action model for robotic manipulation.

Python 3,983 470 Updated Mar 23, 2025

LLaVA-VL / LLaVA-NeXT

Python 4,278 407 Updated Sep 14, 2025

QwenLM / Qwen3-VL

Qwen3-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.

Jupyter Notebook 13,485 1,027 Updated Sep 28, 2025

showlab / videollm-online

VideoLLM-online: Online Video Large Language Model for Streaming Video (CVPR 2024)

Python 554 56 Updated Sep 2, 2025

BM-K / KoSentenceBERT-SKT

Sentence Embeddings using Siamese SKT KoBERT

Python 142 31 Updated Jan 6, 2023

OpenGVLab / InternVideo

[ECCV2024] Video Foundation Models & Data for Multimodal Understanding

Python 2,062 127 Updated Aug 7, 2025

kyegomez / RT-2

Democratization of RT-2 "RT-2: New model translates vision and language into action"

Python 516 65 Updated Jul 26, 2024

danijar / dreamerv3

Mastering Diverse Domains through World Models

Python 2,176 371 Updated Sep 23, 2025

cremebrule / digital-cousins

Codebase for Automated Creation of Digital Cousins for Robust Policy Learning

Python 229 20 Updated Mar 31, 2025

Genesis-Embodied-AI / RoboGen

A generative and self-guided robotic agent that endlessly proposes and masters new skills.

Python 1,079 101 Updated May 31, 2024

octo-models / octo

Octo is a transformer-based robot policy trained on a diverse mix of 800k robot trajectories.

Python 1,368 233 Updated Jul 31, 2024

IL-SEONG ilseong827

Stars

safety-research / circuit-tracer

PRIME-RL / SimpleVLA-RL

fuwei007 / Navbot-EN01

toshikwa / wappo.pytorch

yufeiwang63 / RL-VLM-F

sumedh7 / RoboCLIP

fuyw / FuRL

AlignmentResearch / vlmrm

aryopg / clinical_peft

microsoft / XPretrain

patrick-0817 / T-MASS-text-video-retrieval