MedicalEval

Model checkpoints

Most weights and checkpoint files are downloaded automatically when the corresponding testers are initialized. However, a few files must be downloaded manually and placed in a directory. The downloaded files should be organized as follows:

/path/to/VLP_web_data
├── llama_checkpoints
│   ├── 7B
│   │   ├── checklist.chk
│   │   ├── consolidated.00.pth
│   │   └── params.json
│   ├── 7B_hf
│   │   ├── config.json
│   │   ├── generation_config.json
│   │   ├── pytorch_model.bin
│   │   ├── pytorch_model.bin.index.json
│   │   ├── special_tokens_map.json
│   │   ├── tokenizer_config.json
│   │   └── tokenizer.model
│   └── tokenizer.model
├── MiniGPT-4
│   ├── alignment.txt
│   └── pretrained_minigpt4_7b.pth
├── llava_med
│   └── llava_med_in_text_60k
│       ├── added_tokens.json
│       ├── config.json
│       ├── generation_config.json
│       ├── pytorch_model-00001-of-00002.bin
│       ├── pytorch_model-00002-of-00002.bin
│       ├── pytorch_model.bin.index.json
│       ├── special_tokens_map.json
│       ├── tokenizer_config.json
│       └── tokenizer.model
├── VPGTrans_Vicuna
│
├── Radfm
│   └── pytorch_model.bin
│
├── MedVInT
│   ├── MedVInT-TD
│   └── MedVInT-TE
│
├── otter-9b-hf
│
└── test_path.json
  • For LLaMA-Adapter-v2, please obtain the LLaMA backbone weights using this form.

  • For MiniGPT-4, please download alignment.txt and pretrained_minigpt4_7b.pth.

  • For VPGTrans, please download VPGTrans_Vicuna.

  • For Otter, you can download the version we used in our evaluation from this repo. Note, however, that the authors of Otter have since updated their model, and the newer version outperforms the one we evaluated; please check their GitHub repo for the latest release.

  • For LLaVA-Med, please follow the instructions here to prepare the weights of LLaVA-Med. Note that the authors of LLaVA-Med have updated their model; you can download the latest version of the parameters for evaluation here.

  • For RadFM, please follow the instructions here to prepare the environment and download the network weights.

  • For MedVInT, please follow the instructions here to prepare the environment and download the network weights.

  • For Med-flamingo, please follow the instructions here to prepare the environment and download the network weights.

  • For test_path.json in VLP_web_data, you need to add the paths of the evaluated json files from OmniMedVQA; a sketch of the expected format is given after this list.

  • We sincerely thank all the authors of the evaluated methods and appreciate their contributions to building these LVLMs. If you use the above models for evaluation, please remember to cite the corresponding works.
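
The exact schema of test_path.json is defined by the Radfm, Med-flamingo, and MedVInT test scripts, so the snippet below is only a sketch under the assumption that the file holds a plain list of OmniMedVQA json paths; the paths are placeholders, and you should confirm the real format against those scripts.

[
  "/path/to/OmniMedVQA/dataset_A.json",
  "/path/to/OmniMedVQA/dataset_B.json"
]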

Dataset

The evaluation data come from the OmniMedVQA benchmark; please download its images and QA json files before running the evaluations below.

How to evaluate

Prefix_based_scores:

  • For MiniGPT-4, BLIP2, InstructBLIP, LLaMA-Adapter-v2, LLaVA, Otter, mPLUG-Owl, VPGTrans, and llava-med, you only need to modify the Prefix_based_Score/scripts/run_eval_loss.sh file to specify the MODEL you want to evaluate and the path of the evaluated DATA json file from OmniMedVQA; see the sketch after this list.
  • For Radfm, Med-flamingo, and MedVInT, you should prepare a test_path.json file in VLP_web_data (see the example above) to specify the list of paths to the evaluated json files from OmniMedVQA.
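
For illustration, the edit usually amounts to setting two variables near the top of the script. This is a hedged sketch: the variable names MODEL and DATA are assumptions taken from the description above, and the values are placeholders, so check the actual run_eval_loss.sh before editing.

# Hypothetical excerpt of Prefix_based_Score/scripts/run_eval_loss.sh;
# variable names and values are assumptions, not verified against the script.
MODEL="MiniGPT-4"                               # which LVLM to evaluate
DATA="/path/to/OmniMedVQA/evaluated_file.json"  # OmniMedVQA QA json file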

MiniGPT-4, BLIP2, InstructBLIP, LLaMA-Adapter-v2, LLaVA, Otter, mPLUG-Owl, VPGTrans, llava-med

cd Prefix_based_Score
bash scripts/run_eval_loss.sh

Radfm:

cd Prefix_based_Score
bash Radfm/test/test.sh

Med-flamingo:

cd Prefix_based_Score
bash Med-flamingo/scripts/test.sh

MedVInT:

cd Prefix_based_Score
bash MedVInT/src/MedVInT_TE/test.sh

Question_answering_scores:

  • For MiniGPT-4, BLIP2, InstructBLIP, LLaMA-Adapter-v2, LLaVA, Otter, mPLUG-Owl, VPGTrans, and llava-med, you only need to modify the Question_answering_scores/scripts/run_eval.sh file to specify the MODEL you want to evaluate and the path of the evaluated DATA json file from OmniMedVQA, following the same pattern as the prefix-based sketch above.
  • For Radfm, Med-flamingo, and MedVInT, you should prepare a test_path.json file in VLP_web_data (see the example above) to specify the list of paths to the evaluated json files from OmniMedVQA.

MiniGPT-4, BLIP2, InstructBLIP, LLaMA-Adapter-v2, LLaVA, Otter, mPLUG-Owl, VPGTrans, llava-med

cd Question_answering_scores
bash scripts/run_eval.sh

Radfm:

cd Question_answering_scores
bash Radfm/test/test.sh

Med-flamingo:

cd Question_answering_scores
bash Med-flamingo/scripts/test.sh

MedVInT:

cd Question_answering_scores
bash MedVInT/src/MedVInT_TD/test.sh