Bad codegen for widen followed by ARM vdupq_n_* #137407

Open
CatsAreFluffy opened this issue Feb 22, 2025 · 8 comments
Labels
A-SIMD Area: SIMD (Single Instruction Multiple Data) C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such O-AArch64 Armv8-A or later processors in AArch64 mode P-medium Medium priority regression-from-stable-to-stable Performance or correctness regression from one stable version to another. S-has-bisection Status: a bisection has been found for this issue T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@CatsAreFluffy

Code

I tried this code:

use std::arch::aarch64::*;

#[no_mangle]
pub unsafe fn half_dup(x: u16) -> uint16x8_t {
    vaddq_u16(vreinterpretq_u16_u32(vdupq_n_u32(x as u32)), vdupq_n_u16(1))
}

I expected to see this happen: The compiled output would use the dup instruction.

Instead, this happened: The compiled output uses a sequence of mov instructions.
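
For readers without an AArch64 toolchain, here is a portable scalar model of what the reproducer computes. It is only a sketch: `half_dup_model` is a hypothetical name, plain arrays stand in for the NEON vector types, and a little-endian lane layout is assumed for the reinterpret step.

```rust
/// Scalar model of `half_dup`: widen `x` to u32, broadcast it to four
/// u32 lanes, view those lanes as eight u16 lanes, then add 1 to each.
/// Assumption: little-endian layout, so the low half of each u32 lane
/// comes first.
pub fn half_dup_model(x: u16) -> [u16; 8] {
    // Models vdupq_n_u32(x as u32): four identical u32 lanes.
    let wide = [x as u32; 4];

    // Models vreinterpretq_u16_u32: same bytes, viewed as u16 lanes.
    let mut lanes = [0u16; 8];
    for (i, w) in wide.iter().enumerate() {
        let bytes = w.to_le_bytes();
        lanes[2 * i] = u16::from_le_bytes([bytes[0], bytes[1]]);
        lanes[2 * i + 1] = u16::from_le_bytes([bytes[2], bytes[3]]);
    }

    // Models vaddq_u16(.., vdupq_n_u16(1)): lane-wise wrapping add of 1.
    lanes.map(|lane| lane.wrapping_add(1))
}
```

Under those assumptions, `half_dup_model(5)` yields `[6, 1, 6, 1, 6, 1, 6, 1]`: the even lanes carry `x + 1` and the odd lanes carry the zero-extended upper halves plus 1.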

Godbolt

Version it worked on

It most recently worked on: Rust 1.81.0

Version with regression

rustc --version --verbose:

rustc 1.82.0 (f6e511eec 2024-10-15)
binary: rustc
commit-hash: f6e511eec7342f59a25f7c0534f1dbea00d01b14
commit-date: 2024-10-15
host: x86_64-unknown-linux-gnu
release: 1.82.0
LLVM version: 19.1.1

and

rustc 1.87.0-nightly (f280acf4c 2025-02-19)
binary: rustc
commit-hash: f280acf4c743806abbbbcfe65050ac52ec4bdec0
commit-date: 2025-02-19
host: x86_64-unknown-linux-gnu
release: 1.87.0-nightly
LLVM version: 20.1.0

@CatsAreFluffy CatsAreFluffy added C-bug Category: This is a bug. regression-untriaged Untriaged performance or correctness regression. labels Feb 22, 2025
@rustbot rustbot added I-prioritize Issue: Indicates that prioritization has been requested for this issue. needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. labels Feb 22, 2025
@jieyouxu jieyouxu added T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. E-needs-bisection Call for participation: This issue needs bisection: https://github.com/rust-lang/cargo-bisect-rustc labels Feb 22, 2025
@nikic
Contributor

nikic commented Feb 22, 2025

@jieyouxu Rust 1.82 would be the LLVM 19 upgrade.

@nikic nikic added regression-from-stable-to-stable Performance or correctness regression from one stable version to another. C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such and removed regression-untriaged Untriaged performance or correctness regression. C-bug Category: This is a bug. labels Feb 22, 2025
@jieyouxu
Member

> @jieyouxu Rust 1.82 would be the LLVM 19 upgrade.

Yeah sorry somehow I didn't notice the 1.82

@hkratz
Contributor

hkratz commented Feb 22, 2025

@rustbot label +A-SIMD +O-aarch64

@rustbot rustbot added A-SIMD Area: SIMD (Single Instruction Multiple Data) O-AArch64 Armv8-A or later processors in AArch64 mode labels Feb 22, 2025
@nikic
Contributor

nikic commented Feb 22, 2025

Upstream issue: llvm/llvm-project#128349

@apiraino
Contributor

WG-prioritization assigning priority (Zulip discussion).

@rustbot label -I-prioritize +P-medium

@rustbot rustbot added P-medium Medium priority and removed I-prioritize Issue: Indicates that prioritization has been requested for this issue. labels Feb 24, 2025
@jieyouxu jieyouxu removed the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Feb 24, 2025
@nikic
Contributor

nikic commented Feb 24, 2025

Looks like Rust is using a non-canonical pattern for vector splat generation: https://github.com/rust-lang/stdarch/blob/1c6113f37b44baeb3364867f52bd67a3f70e7a6f/crates/core_arch/src/simd.rs#L25-L34

We shouldn't go through a 1-element vector here. We probably need to add a simd_splat intrinsic to generate the correct form.
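
To make the two shapes concrete, here is a scalar sketch of both splat patterns. This is not the stdarch code and not the proposed intrinsic: the names `shuffle`, `splat_via_one_lane` and `splat_via_insert` are hypothetical, plain arrays stand in for SIMD vectors, and the "insert into lane 0 of an N-lane vector, then broadcast lane 0" form is assumed to be the one the backend recognizes as a splat.

```rust
/// Scalar stand-in for a vector shuffle: output lane `k` takes
/// `src[idx[k]]`.
fn shuffle<const M: usize, const N: usize>(src: [u32; M], idx: [usize; N]) -> [u32; N] {
    let mut out = [0u32; N];
    for (o, &i) in out.iter_mut().zip(idx.iter()) {
        *o = src[i];
    }
    out
}

/// Pattern described in the comment above: build a 1-element vector,
/// then shuffle it out to N lanes with an all-zero index list.
pub fn splat_via_one_lane<const N: usize>(x: u32) -> [u32; N] {
    let one = [x];
    shuffle(one, [0; N])
}

/// Assumed canonical shape: place the scalar in lane 0 of an N-lane
/// vector, then broadcast lane 0. The zero fill stands in for the
/// "don't care" lanes of the real pattern.
pub fn splat_via_insert<const N: usize>(x: u32) -> [u32; N] {
    let mut v = [0u32; N];
    if N > 0 {
        v[0] = x;
    }
    shuffle(v, [0; N])
}
```

Both functions return the same lanes, for example `[7, 7, 7, 7]` for `N = 4` and `x = 7`. The difference is only in the intermediate vector width, which is the part that matters for pattern matching in the backend.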

@moxian
Contributor

moxian commented Mar 25, 2025

Bisects to #128866, and more specifically to rust-lang/stdarch@7365074.

cc @scottmcm FYI, although I'm not sure this is actionable by you.

@rustbot label: -E-needs-bisection +S-has-bisection

@rustbot rustbot added S-has-bisection Status: a bisection has been found for this issue and removed E-needs-bisection Call for participation: This issue needs bisection: https://github.com/rust-lang/cargo-bisect-rustc labels Mar 25, 2025