
@Erland366
Collaborator

A few updates ago, we patched trl.Config, specifically the bf16 and fp16 args, to bypass the Ampere GPU error -> https://github.com/unslothai/unsloth/blob/main/unsloth/models/rl.py#L235-L267

But the patch does not propagate to subclasses that are declared with dataclass. Therefore, we should inherit using a plain Python class instead, so that the patch propagates.
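For reviewers, here is a minimal sketch of one way this can happen. It is not the unsloth or trl code: `BaseConfig`, `DataclassChild`, `PlainChild`, and the patch itself are hypothetical stand-ins, and it assumes the patch works by wrapping `__init__`. The `@dataclass` decorator generates a fresh `__init__` for the subclass that assigns fields directly and never calls the parent's `__init__`, so a patched parent `__init__` is skipped. A plain subclass defines no `__init__` of its own and picks up the patched one.

```python
from dataclasses import dataclass


@dataclass
class BaseConfig:
    # Hypothetical stand-in for a trl config class.
    bf16: bool = False
    fp16: bool = False


_original_init = BaseConfig.__init__


def _patched_init(self, *args, **kwargs):
    # Hypothetical patch: run the original init, then force bf16 off.
    _original_init(self, *args, **kwargs)
    self.bf16 = False


BaseConfig.__init__ = _patched_init


@dataclass
class DataclassChild(BaseConfig):
    # @dataclass generates a new __init__ here, bypassing _patched_init.
    extra: int = 0


class PlainChild(BaseConfig):
    # No __init__ defined, so BaseConfig.__init__ (the patched one) is used.
    pass


dc = DataclassChild(bf16=True)
pc = PlainChild(bf16=True)
print(dc.bf16, pc.bf16)
```

In this sketch the dataclass child keeps the user-supplied `bf16=True`, while the plain child has it overridden by the patch.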

This fixes a training error on Mistral CPT.

@Erland366 requested a review from danielhanchen July 3, 2025 19:08
@shimmyshimmer merged commit 51a7023 into unslothai:main Jul 3, 2025