MedicalEval

Model checkpoints

Most weights and checkpoint files are downloaded automatically when the corresponding testers are initialized. However, a few files must be downloaded manually and placed in a directory. The downloaded files should be organized as follows:

/path/to/VLP_web_data
├── llama_checkpoints
│   ├── 7B
│   │   ├── checklist.chk
│   │   ├── consolidated.00.pth
│   │   └── params.json
│   ├── 7B_hf
│   │   ├── config.json
│   │   ├── generation_config.json
│   │   ├── pytorch_model.bin
│   │   ├── pytorch_model.bin.index.json
│   │   ├── special_tokens_map.json
│   │   ├── tokenizer_config.json
│   │   └── tokenizer.model
│   └── tokenizer.model
├── MiniGPT-4
│   ├── alignment.txt
│   └── pretrained_minigpt4_7b.pth
├── llava_med
│   └── llava_med_in_text_60k
│       ├── added_tokens.json
│       ├── config.json
│       ├── generation_config.json
│       ├── pytorch_model-00001-of-00002.bin
│       ├── pytorch_model-00002-of-00002.bin
│       ├── pytorch_model.bin.index.json
│       ├── special_tokens_map.json
│       ├── tokenizer_config.json
│       └── tokenizer.model
├── VPGTrans_Vicuna
│
├── Radfm
│   └── pytorch_model.bin
│
├── MedVInT
│   ├── MedVInT-TD
│   └── MedVInT-TE
│
├── otter-9b-hf
│
└── test_path.json
  • For LLaMA-Adapter-v2, please obtain the LLaMA backbone weights using this form.

  • For MiniGPT-4, please download alignment.txt and pretrained_minigpt4_7b.pth.

  • For VPGTrans, please download VPGTrans_Vicuna.

  • For Otter, you can download the version we used in our evaluation from this repo. Note, however, that the authors of Otter have since updated their model, and the newer version outperforms the one we evaluated; please check their GitHub repo for the latest release.

  • For LLaVA-Med, please follow the instructions here to prepare the weights of LLaVA-Med. Note that the authors of LLaVA-Med have updated their model; you can download the latest version of the parameters for evaluation here.

  • For RadFM, please follow the instructions here to prepare the environment and download the network weights.

  • For MedVInT, please follow the instructions here to prepare the environment and download the network weights.

  • For Med-flamingo, please follow the instructions here to prepare the environment and download the network weights.

  • For test_path.json in VLP_web_data, you need to add the paths of the evaluated json files from OmniMedVQA; a sketch of the expected format is given after this list.

  • We sincerely thank all the authors of the evaluated methods and appreciate their contributions to building these LVLMs. If you use the above models for evaluation, please remember to cite the corresponding works.
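
The exact schema of test_path.json is defined by the Radfm, Med-flamingo, and MedVInT test scripts, so the snippet below is only a sketch under the assumption that the file holds a plain list of OmniMedVQA json paths; the paths are placeholders, and you should confirm the real format against those scripts.

[
  "/path/to/OmniMedVQA/dataset_A.json",
  "/path/to/OmniMedVQA/dataset_B.json"
]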

Dataset

The evaluation data come from the OmniMedVQA benchmark; please download its images and QA json files before running the evaluations below.

How to evaluate

Prefix_based_scores:

  • For MiniGPT-4, BLIP2, InstructBLIP, LLaMA-Adapter-v2, LLaVA, Otter, mPLUG-Owl, VPGTrans, and llava-med, you only need to modify the Prefix_based_Score/scripts/run_eval_loss.sh file to specify the MODEL you want to evaluate and the path of the evaluated DATA json file from OmniMedVQA; see the sketch after this list.
  • For Radfm, Med-flamingo, and MedVInT, you should prepare a test_path.json file in VLP_web_data (see the example above) to specify the list of paths to the evaluated json files from OmniMedVQA.
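
For illustration, the edit usually amounts to setting two variables near the top of the script. This is a hedged sketch: the variable names MODEL and DATA are assumptions taken from the description above, and the values are placeholders, so check the actual run_eval_loss.sh before editing.

# Hypothetical excerpt of Prefix_based_Score/scripts/run_eval_loss.sh;
# variable names and values are assumptions, not verified against the script.
MODEL="MiniGPT-4"                               # which LVLM to evaluate
DATA="/path/to/OmniMedVQA/evaluated_file.json"  # OmniMedVQA QA json file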

MiniGPT-4, BLIP2, InstructBLIP, LLaMA-Adapter-v2, LLaVA, Otter, mPLUG-Owl, VPGTrans, llava-med

cd Prefix_based_Score
bash scripts/run_eval_loss.sh

Radfm:

cd Prefix_based_Score
bash Radfm/test/test.sh

Med-flamingo:

cd Prefix_based_Score
bash Med-flamingo/scripts/test.sh

MedVInT:

cd Prefix_based_Score
bash MedVInT/src/MedVInT_TE/test.sh

Question_answering_scores:

  • For MiniGPT-4, BLIP2, InstructBLIP, LLaMA-Adapter-v2, LLaVA, Otter, mPLUG-Owl, VPGTrans, and llava-med, you only need to modify the Question_answering_scores/scripts/run_eval.sh file to specify the MODEL you want to evaluate and the path of the evaluated DATA json file from OmniMedVQA, following the same pattern as the prefix-based sketch above.
  • For Radfm, Med-flamingo, and MedVInT, you should prepare a test_path.json file in VLP_web_data (see the example above) to specify the list of paths to the evaluated json files from OmniMedVQA.

MiniGPT-4, BLIP2, InstructBLIP, LLaMA-Adapter-v2, LLaVA, Otter, mPLUG-Owl, VPGTrans, llava-med

cd Question_answering_scores
bash scripts/run_eval.sh

Radfm:

cd Question_answering_scores
bash Radfm/test/test.sh

Med-flamingo:

cd Question_answering_scores
bash Med-flamingo/scripts/test.sh

MedVInT:

cd Question_answering_scores
bash MedVInT/src/MedVInT_TD/test.sh