Benchmark

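We provide scripts for training and evaluating models on each task dataset. The benchmark results below are included for reference. In the tables, TR, IR, and VR denote text, image, and video retrieval, and R1, R5, and R10 denote recall at rank 1, 5, and 10.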
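As a rough illustration of how a recall-at-K figure like those in the retrieval tables can be computed, here is a minimal sketch in plain Python. It assumes the common definition (a query counts as a hit if any of its ground-truth items appears among the top K ranked candidates); the function name ``recall_at_k`` and the toy data are hypothetical, and the evaluation scripts linked in the tables may differ in detail.

```python
def recall_at_k(scores, positives, ks=(1, 5, 10)):
    """Return recall at each K as a percentage.

    scores[i][j] is the similarity between query i and candidate j.
    positives[i] is the set of candidate indices that are correct for query i.
    (Both inputs are hypothetical stand-ins for real model outputs.)
    """
    hits = {k: 0 for k in ks}
    for row, correct in zip(scores, positives):
        # Rank candidate indices by descending similarity.
        ranked = sorted(range(len(row)), key=lambda j: row[j], reverse=True)
        for k in ks:
            # A query is a hit at K if any correct candidate is in the top K.
            if any(j in correct for j in ranked[:k]):
                hits[k] += 1
    n = len(scores)
    return {k: 100.0 * hits[k] / n for k in ks}


# Toy example: 3 queries, 3 candidates each.
toy_scores = [
    [0.9, 0.1, 0.3],
    [0.2, 0.4, 0.8],
    [0.5, 0.6, 0.1],
]
toy_positives = [{0}, {1}, {0}]
toy_result = recall_at_k(toy_scores, toy_positives, ks=(1, 2))
print(toy_result)
```

In the toy data only the first query ranks its correct candidate first, while all three queries have it within the top two, so recall at 1 is one third and recall at 2 is 100 percent.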

ALBEF

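+-------------+--------------------------+--------+
| Pretraining | COCO (download)          | script |
|             +--------------------------+        |
|             | Visual Genome (download) |        |
|             +--------------------------+        |
|             | SBU (download)           |        |
|             +--------------------------+        |
|             | CC3M (download)          |        |
|             +--------------------------+        |
|             | CC12M (download)         |        |
+-------------+--------------------------+--------+

=========  ====================  ====  ====  ====  ========  ==========
Retrieval  Dataset               R1    R5    R10   Training  Evaluation
=========  ====================  ====  ====  ====  ========  ==========
TR         COCO (download)       77.6  94.1  97.2  script    script
IR         COCO (download)       61.0  84.5  90.7  script    script
TR         Flickr30k (download)  77.6  94.1  97.2  script    script
IR         Flickr30k (download)  61.0  84.5  90.7  script    script
=========  ====================  ====  ====  ====  ========  ==========

=================  ========  =============  ========  ==========
VQA                test-dev  test-std/test  Training  Evaluation
=================  ========  =============  ========  ==========
VQAv2 (download)   76.35     76.54          script    script
OKVQA (download)   NA        54.7           script    NA
AOKVQA (download)  54.5      NA             script    NA
=================  ========  =============  ========  ==========

=========================  =====  =====  ========  ==========
Multimodal Classification  val    test   Training  Evaluation
=========================  =====  =====  ========  ==========
SNLI-VE (download)         80.60  81.04  script    script
NLVR2 (download)           82.47  82.91  script    script
=========================  =====  =====  ========  ==========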

BLIP

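+-------------------+--------------------------+--------+
| Pretraining (14M) | COCO (download)          | script |
|                   +--------------------------+        |
|                   | Visual Genome (download) |        |
|                   +--------------------------+        |
|                   | SBU (download)           |        |
|                   +--------------------------+        |
|                   | CC3M (download)          |        |
|                   +--------------------------+        |
|                   | CC12M (download)         |        |
+-------------------+--------------------------+--------+

=========  ====================  ====  ====  =====  ========  ==========
Retrieval  Dataset               R1    R5    R10    Training  Evaluation
=========  ====================  ====  ====  =====  ========  ==========
TR         COCO (download)       82.0  95.8  98.1   script    script
IR         COCO (download)       64.5  86.0  91.7   script    script
TR         Flickr30k (download)  96.9  99.9  100.0  script    script
IR         Flickr30k (download)  87.5  97.6  98.9   script    script
=========  ====================  ====  ====  =====  ========  ==========

=================  ========  =============  ========  ==========
VQA                test-dev  test-std/test  Training  Evaluation
=================  ========  =============  ========  ==========
VQAv2 (download)   78.23     78.29          script    script
OKVQA (download)   NA        55.4           script    script
AOKVQA (download)  56.2      50.1           script    script
=================  ========  =============  ========  ==========

=================  ======  =====  =====  ========  ==========
Image Captioning   BLEU@4  CIDEr  SPICE  Training  Evaluation
=================  ======  =====  =====  ========  ==========
COCO (download)    39.9    133.5  23.7   script    script
NoCaps (download)  31.9    109.1  14.7   NA        script
=================  ======  =====  =====  ========  ==========

=========================  =====  =====  ========  ==========
Multimodal Classification  val    test   Training  Evaluation
=========================  =====  =====  ========  ==========
NLVR2 (download)           82.48  83.25  script    script
=========================  =====  =====  ========  ==========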

CLIP

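=====================  ====================  ====  ====  ====  ==========
Retrieval (Zero-shot)  Dataset               R1    R5    R10   Evaluation
=====================  ====================  ====  ====  ====  ==========
TR                     COCO (download)       57.2  80.5  87.8  script
IR                     COCO (download)       36.5  60.8  71.0  script
TR                     Flickr30k (download)  86.5  98.0  99.1  script
IR                     Flickr30k (download)  67.0  88.9  93.3  script
=====================  ====================  ====  ====  ====  ==========

=========================  ====  ==========
Multimodal Classification  val   Evaluation
=========================  ====  ==========
ImageNet                   76.5  script
=========================  ====  ==========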

ALPRO

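=========  =================  ====  ====  ====  ========  ==========
Retrieval  Dataset            R1    R5    R10   Training  Evaluation
=========  =================  ====  ====  ====  ========  ==========
TR         MSRVTT (download)  33.2  60.5  71.7  script    script
VR         MSRVTT (download)  33.8  61.4  72.7  script    script
TR         DiDeMo (download)  38.8  66.4  76.8  script    script
VR         DiDeMo (download)  36.6  67.5  77.9  script    script
=========  =================  ====  ====  ====  ========  ==========

========  ====  ========  ==========
Video QA  test  Training  Evaluation
========  ====  ========  ==========
MSRVTT    42.1  script    script
MSVD      46.0  script    script
========  ====  ========  ==========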