Replies: 3 comments 1 reply
-
It was implemented like this in #12412 - cc @MollySophia
-
It was implemented according to: https://docs.pytorch.org/docs/stable/generated/torch.nn.functional.normalize.html
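For reference, the linked docs define the operation with a hard clamp on the norm (default $\epsilon = 10^{-12}$):

$$v = \frac{v}{\max(\lVert v \rVert_p,\ \epsilon)}$$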
-
Yeah, I see that base PyTorch uses the hard clamp, but both Llama4 and Qwen3 Next use the soft clamp (adding the epsilon instead of taking the max). Maybe we should add a flag to specify which kind of clamping is needed? I'm not sure this matters much in real-world scenarios; I'm just bringing it up since I did hit a discrepancy with a mock model.
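For concreteness, a minimal sketch of the two variants as discussed above. The function names and the `eps` value are just for illustration, and I'm assuming the soft variant adds the epsilon to the sum of squares before the rsqrt, as the HF Llama4/Qwen3 Next code does:

```python
import torch

def l2norm_soft(x: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    # soft clamp (Llama4 / Qwen3 Next style): add eps to the sum of squares
    return x * torch.rsqrt(x.pow(2).sum(dim=-1, keepdim=True) + eps)

def l2norm_hard(x: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    # hard clamp (GGML behavior as described above): floor the sum of squares at eps
    return x * torch.rsqrt(torch.clamp(x.pow(2).sum(dim=-1, keepdim=True), min=eps))
```

The two only diverge when the squared norm is on the order of eps, which is probably why it only showed up with a tiny mock model.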
-
Does anyone know why the l2norm implementation is different from the typical PyTorch implementations? I ran into the divergence when testing Qwen3Next with some tiny models. The PyTorch-side code does soft clamping (adding the epsilon instead of taking a max), while GGML does hard clamping (with fmax(sum, eps)). Is there a specific rationale for it?
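To make the divergence concrete, a tiny check (the values and eps are picked arbitrarily, not taken from the actual models):

```python
import torch

eps = 1e-6
x = torch.full((4,), 5e-4)  # squared norm = 1e-6, i.e. right around eps

soft = x * torch.rsqrt(x.pow(2).sum() + eps)                   # ~0.354 per element
hard = x * torch.rsqrt(torch.clamp(x.pow(2).sum(), min=eps))   # 0.5 per element
print(soft, hard)
```

When the squared norm is far from eps in either direction, the two agree closely.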