Name		Name	Last commit message	Last commit date
parent directory ..
README.md		README.md
benchmark_decode.py		benchmark_decode.py
benchmark_pytorch_engine_a100.sh		benchmark_pytorch_engine_a100.sh
benchmark_turbomind_engine_a100.sh		benchmark_turbomind_engine_a100.sh
profile_generation.py		profile_generation.py
profile_hf_generation.py		profile_hf_generation.py
profile_pipeline_api.py		profile_pipeline_api.py
profile_restful_api.py		profile_restful_api.py
profile_throughput.py		profile_throughput.py
profile_triton_python_backend.py		profile_triton_python_backend.py

README.md

Benchmark

We provide several profiling tools to benchmark our models.

profile with dataset

Download the dataset below or create your own dataset.

wget https://huggingface.co/datasets/anon8231489123/ShareGPT_Vicuna_unfiltered/resolve/main/ShareGPT_V3_unfiltered_cleaned_split.json

Profiling your model with profile_throughput.py

python profile_throughput.py \
 ShareGPT_V3_unfiltered_cleaned_split.json \
 /path/to/your/model \
 --concurrency 64

profile without dataset

profile_generation.py perform benchmark with dummy data.

pip install nvidia-ml-py

python profile_generation.py \
 /path/to/your/model \
 --concurrency 1 8 --prompt-tokens 1 512 --completion-tokens 2048 512

profile restful api

profile_restful_api.py is used to do benchmark on api server.

wget https://huggingface.co/datasets/anon8231489123/ShareGPT_Vicuna_unfiltered/resolve/main/ShareGPT_V3_unfiltered_cleaned_split.json

python3 profile_restful_api.py --backend lmdeploy --dataset-path ./ShareGPT_V3_unfiltered_cleaned_split.json

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

benchmark

benchmark

README.md

Benchmark

profile with dataset

profile without dataset

profile restful api

Files

benchmark

Directory actions

More options

Directory actions

More options

Latest commit

History

benchmark

Folders and files

parent directory

README.md

Benchmark

profile with dataset

profile without dataset

profile restful api