@Datta0 Datta0 commented May 31, 2025

Fixes: #2661

The current regex is very basic and doesn't handle cases such as the following:

  1. Decimals/floats in the model name, e.g. unsloth/Qwen3-1.7B-bnb-4bit
  2. A lowercase 'b' in the model name, e.g. google/gemma-3-1b-it
    I don't know if there are any others, but I've tested with a few popular models and it works.

Note: Picking the parameter count from the model name is not 100% accurate, but it is a decent enough estimate.
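For illustration, the two cases above can be covered by a single pattern. This is a minimal sketch and not the code in this PR: the function name `approx_params_from_name` and the exact regex are assumptions made for the example, and it treats the number in front of `b`/`B` as billions.

```python
import re
from typing import Optional

# Hypothetical pattern (not the PR's regex): an integer or decimal number
# followed by 'b' or 'B'. The lookarounds keep it from matching inside
# tokens such as "4bit" or in the middle of a longer number.
_PARAM_RE = re.compile(r"(?<![A-Za-z0-9.])(\d+(?:\.\d+)?)[bB](?![A-Za-z0-9])")


def approx_params_from_name(model_name: str) -> Optional[int]:
    """Return an approximate parameter count parsed from a model name.

    Returns None when the name carries no recognizable size token.
    """
    match = _PARAM_RE.search(model_name)
    if match is None:
        return None
    # "1.7B" is read as 1.7 billion parameters.
    return round(float(match.group(1)) * 1e9)


if __name__ == "__main__":
    for name in ("unsloth/Qwen3-1.7B-bnb-4bit", "google/gemma-3-1b-it", "gpt2"):
        print(name, approx_params_from_name(name))
```

The negative lookahead is what stops `4bit` from being read as "4 billion", so this sketch does not need a separate cleanup pass for quantization suffixes.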

Copilot AI left a comment

Pull Request Overview

This PR improves the regex used to extract the approximate parameter count from model names in the quantization module, addressing issues with decimal values and lower-case “b” identifiers.

  • Introduces a new helper function extract_approx_params_from_config to extract the parameter count from the model configuration.
  • Updates get_model_param_count to use the new extraction function while removing the previous basic regex calculation.

@Datta0 Datta0 requested a review from Copilot June 1, 2025 05:58

Copilot AI left a comment

Pull Request Overview

This pull request improves the regex used for extracting approximate parameter counts from model configuration strings, addressing issues with decimal numbers in model names and lower-case "b" identifiers.

  • Introduces a new helper function extract_approx_params_from_config.
  • Updates get_model_param_count to use the new extraction function instead of the basic regex.

lowercase_b_families = ["gemma"] # gemma uses small 'b' : google/gemma-3-1b-it
model_name = getattr(config, "name_or_path", "")
import re
cleaned = re.sub(r"[-_]?bnb[-_]?4bit|[-_]?4bit|[-_]?8bit|[-_]?bnb", "", model_name, flags=re.IGNORECASE) # replace bnb and xbit

Copilot AI Jun 1, 2025

The regex substitution comment mentions 'xbit', but the pattern does not match any 'xbit' strings. Please update either the comment or the regex to correctly handle 'xbit' if intended.

Suggested change
- cleaned = re.sub(r"[-_]?bnb[-_]?4bit|[-_]?4bit|[-_]?8bit|[-_]?bnb", "", model_name, flags=re.IGNORECASE) # replace bnb and xbit
+ cleaned = re.sub(r"[-_]?bnb[-_]?4bit|[-_]?4bit|[-_]?8bit|[-_]?bnb|[-_]?xbit", "", model_name, flags=re.IGNORECASE) # replace bnb and xbit
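To make the behavior of the pattern under review concrete, the sketch below runs it on a few sample names. `PR_PATTERN` is copied from the line above. `GENERAL_PATTERN` and `strip_quant_suffix` are hypothetical and not part of this PR: they assume that "xbit" in the code comment is shorthand for "any `<digits>bit` suffix" rather than the literal string `xbit`.

```python
import re

# Pattern copied from the line under review.
PR_PATTERN = r"[-_]?bnb[-_]?4bit|[-_]?4bit|[-_]?8bit|[-_]?bnb"

# Hypothetical generalization (an assumption, not in the PR): strip any
# "<digits>bit" suffix instead of listing 4bit and 8bit explicitly.
GENERAL_PATTERN = r"[-_]?bnb[-_]?\d+bit|[-_]?\d+bit|[-_]?bnb"


def strip_quant_suffix(name: str, pattern: str = PR_PATTERN) -> str:
    """Remove quantization markers such as '-bnb-4bit' from a model name."""
    return re.sub(pattern, "", name, flags=re.IGNORECASE)


if __name__ == "__main__":
    for name in ("unsloth/Qwen3-1.7B-bnb-4bit", "model-8bit", "model-16bit"):
        print(name, "->", strip_quant_suffix(name), "|",
              strip_quant_suffix(name, GENERAL_PATTERN))
```

With the pattern from the PR, a name ending in `-16bit` is left unchanged, while the generalized variant strips it. Whether that broader match is wanted depends on which suffixes actually occur in model names.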

@shimmyshimmer shimmyshimmer merged commit c5a2a36 into unslothai:main Jun 1, 2025
@Datta0 Datta0 deleted the model_param_fix branch October 21, 2025 04:12

Development

Successfully merging this pull request may close these issues.

[Issue] Qwen3 Number of parameters displayed incorrectly
