I was recently fine-tuning the Qwen3 1.7B model with LoRA for a use case and noticed this output from the fine-tuning script:
==((====))==  Unsloth - 2x faster free finetuning | Num GPUs used = 1
   \\   /|    Num examples = 22,948 | Num Epochs = 1 | Total steps = 11,474
O^O/ \_/ \    Batch size per device = 1 | Gradient accumulation steps = 2
\        /    Data Parallel GPUs = 1 | Total batch size (1 x 2 x 1) = 2
 "-____-"     Trainable parameters = 8,716,288/7,000,000,000 (0.12% trained)
It is surprising to see the total parameter count shown as 7B instead of 1.7B. Why would that be? The reported 0.12% is also consistent with a 7B denominator; 8,716,288 trainable parameters would be roughly 0.51% of 1.7B. The Llama 3B parameter count is displayed correctly. Is this an issue with the code?
I am training with the "unsloth/Qwen3-1.7B-bnb-4bit" model, using:
unsloth==2025.4.7
unsloth-zoo==2025.5.8
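
For reference, here is a minimal sketch (not from my actual training script) of how I would cross-check the parameter counts directly, independent of the banner. It assumes the standard FastLanguageModel API and a typical LoRA config; the r / target_modules values are placeholders, not my exact settings.

```python
# Minimal sketch: load the 4-bit Qwen3 1.7B model, attach LoRA adapters,
# then count parameters directly instead of relying on the banner printout.
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Qwen3-1.7B-bnb-4bit",
    max_seq_length=2048,   # placeholder value
    load_in_4bit=True,
)
model = FastLanguageModel.get_peft_model(
    model,
    r=16,                  # placeholder LoRA rank
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    lora_alpha=16,
)

# Note: 4-bit bitsandbytes layers store packed weights, so numel() on those
# tensors may under-report the logical parameter count; this is only a rough
# sanity check against the ~7B figure shown in the banner.
total = sum(p.numel() for p in model.parameters())
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"Total parameters:     {total:,}")
print(f"Trainable parameters: {trainable:,} ({100 * trainable / total:.2f}%)")
```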
Thanks