
@Datta0 Datta0 commented Oct 23, 2025

For FP8 models, add support for the non-fast-inference path.
Also fix the case where the weight shape is not a multiple of 8. Read here.
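The shape fix can be sketched as follows. This is a minimal illustration, not the PR's actual code: FP8 kernels typically require dimensions divisible by 8, so a weight whose last dimension is not a multiple of 8 is zero-padded before quantization and the output sliced back afterwards. The helper name `pad_to_multiple` is an assumption for illustration.

```python
import torch

def pad_to_multiple(weight: torch.Tensor, multiple: int = 8) -> torch.Tensor:
    """Zero-pad the last dimension of `weight` up to the next multiple.

    Hypothetical helper: FP8 kernels often require sizes divisible by 8,
    so a shape like (4096, 11002) must be padded before quantization and
    the extra columns ignored (or sliced off) afterwards.
    """
    rows, cols = weight.shape
    pad_cols = (-cols) % multiple   # columns needed to reach the next multiple
    if pad_cols == 0:
        return weight
    # F.pad pads the last dimension with (left, right); pad on the right only.
    return torch.nn.functional.pad(weight, (0, pad_cols))

w = torch.randn(4, 10)
padded = pad_to_multiple(w)
assert padded.shape == (4, 16)          # 10 -> next multiple of 8 is 16
assert torch.equal(padded[:, :10], w)   # original values are preserved
```

The original weight values survive unchanged in the leading columns, so dequantization can recover the unpadded matrix by slicing.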

Also move the HF quantizer patch from unsloth-zoo into unsloth so that it runs for non-fast inference.

Patch the forward methods of the FbgemmFp8Linear and FP8Linear classes here so they work for compiled models (which don't use matmul_lora explicitly).
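The patching approach can be sketched roughly as below. This is a hedged illustration, not the PR's exact code: `FbgemmFp8Linear` here is a full-precision stand-in for the real quantized class (the real one would dequantize FP8 weights first), and the LoRA attribute names are assumptions. The point is that replacing `forward` at class level gives `torch.compile` a plain matmul-plus-LoRA path instead of the fast-inference `matmul_lora` helper.

```python
import torch

class FbgemmFp8Linear(torch.nn.Module):
    """Stand-in for the real quantized linear class, for illustration only."""
    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        self.weight = torch.nn.Parameter(torch.randn(out_features, in_features))

    def forward(self, x):
        return x @ self.weight.t()

def patched_forward(self, x):
    # Sketch of the patched path: express the layer as ordinary tensor ops
    # (base matmul plus an optional LoRA correction) so compiled models
    # trace it directly instead of relying on matmul_lora being called.
    # The real class would dequantize its FP8 weight before the matmul.
    out = x @ self.weight.t()
    lora_A = getattr(self, "lora_A", None)  # assumed attribute names
    lora_B = getattr(self, "lora_B", None)
    if lora_A is not None and lora_B is not None:
        out = out + (x @ lora_A.t()) @ lora_B.t()
    return out

# Monkey-patch at class level: every instance now uses the compiled-friendly path.
FbgemmFp8Linear.forward = patched_forward
```

Patching the class (rather than each instance) means modules created later, including those built inside `torch.compile`d graphs, pick up the new path automatically.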

Depends on unslothai/unsloth-zoo#337.

Sample Qwen 2.5 VL 7B on FP8 GRPO :)

@Datta0 changed the title from "Fix FP8 for models with non 8 multiple weights" to "FP8 training enhancements" on Oct 23, 2025.
@danielhanchen merged commit fc178b5 into unslothai:main on Oct 27, 2025.