[MLIR][Python] IntegerAttribute doesn't accept unsigned values greater than 63 bits #128072

teqdruid · 2025-02-20T21:20:15Z

# This works
ir.IntegerAttr.get(ir.IntegerType.get_signless(64), 0x7fffffffffffffff)

# This doesn't
ir.IntegerAttr.get(ir.IntegerType.get_signless(64), 0xffffffffffffffff)

TypeError: get(): incompatible function arguments. The following argument types are supported:
    1. get(type: pycde.circt._mlir_libs._mlir.ir.Type, value: int) -> pycde.circt._mlir_libs._mlir.ir.IntegerAttr

Invoked with types: pycde.circt._mlir_libs._mlir.ir.IntegerType, int

The problem is that the nanobind (and CAPI) calls use int64_t (signed) which 0xffffffffffffffff overflows. (This doesn't produce the nicest error message, but that's a problem with nanobind.) Generally speaking, we need a way to pass >= 64-bit values into nanobind (easy enough by using the python int type) and through the CAPI (the real issue). The 64-bit case is pretty easy to solve (using uint64_t), but we need a more generic way. In C++ world, we have APInt. Is there a C equivalent we could use? Is there some standard way we could support this?

The text was updated successfully, but these errors were encountered:

llvmbot · 2025-02-20T21:20:37Z

@llvm/issue-subscribers-mlir-python

Author: John Demme (teqdruid)

```python # This works ir.IntegerAttr.get(ir.IntegerType.get_signless(64), 0x7fffffffffffffff)

This doesn't

ir.IntegerAttr.get(ir.IntegerType.get_signless(64), 0xffffffffffffffff)

TypeError: get(): incompatible function arguments. The following argument types are supported:
1. get(type: pycde.circt._mlir_libs._mlir.ir.Type, value: int) -> pycde.circt._mlir_libs._mlir.ir.IntegerAttr

Invoked with types: pycde.circt._mlir_libs._mlir.ir.IntegerType, int


The problem is that the nanobind (and CAPI) calls use `int64_t` (signed) which 0xffffffffffffffff overflows. (This doesn't produce the nicest error message, but that's a problem with nanobind.) Generally speaking, we need a way to pass &gt;= 64-bit values into nanobind (easy enough by using the python int type) and through the CAPI (the real issue). The 64-bit case is pretty easy to solve (using `uint64_t`), but we need a more generic way. In C++ world, we have APInt. Is there a C equivalent we could use? Is there some standard way we could support this?
</details>

makslevental · 2025-02-20T21:28:04Z

In C++ world, we have APInt. Is there a C equivalent we could use? Is there some standard way we could support this?

I have been told by @stellaraccident that llvm/Support is exempt from the decree that "no C++ shall be linked to python extensions" - see MLIRPythonExtension.Core::PRIVATE_LINK_LIBS. So possibly the solution here is simply to add APIs that pass APInt back and forth between the python extension and the code in libMLIRPython.

stellaraccident · 2025-02-20T21:51:40Z

It's probably hopeless to try to standardize on the internal representation of an arbitrary precision int like what python uses. With that said, what you need is a c struct that lets you get in/out of both APInt and the procedure python has for accessing large ints. APInt itself is pretty simple in this regard: it has a numbits field and a union that either holds a single uint64 or a pointer to an array of uint64 that is heap allocated if more than one word.

This isn't a great design for dealing with a lot of large values... Will result in a lot of heap allocations.

What I'd do: define an apint_interop_t struct that has a number of bits and a pointer to uint64 words that is at least of size numbits / 64. For APIs that need to take an APInt, the caller just populates that as needed. For those that return, have the caller pass in a pointer to the struct with numbits set to the max that is allocated. Then if too small, have the read function return a failure and populate the actual numbits needed.

It's annoying, but you can build reasonably efficient interop with such a mechanism, even for the array cases, if a bit careful about it.

teqdruid · 2025-02-21T18:41:47Z

Yeah, that sounds like the response I expected. I had hoped that someone would point me to a C APInt interop which LLVM has already using somewhere.

I'll keep this in mind for the future. I'm in sprint/hack to a demo mode right now.

Or if anyone else wants to do it. Strikes me as a pretty good first project.

llvmbot · 2025-02-21T18:42:24Z

Hi!

This issue may be a good introductory issue for people new to working on LLVM. If you would like to work on this issue, your first steps are:

Check that no other contributor has already been assigned to this issue. If you believe that no one is actually working on it despite an assignment, ping the person. After one week without a response, the assignee may be changed.
In the comments of this issue, request for it to be assigned to you, or just create a pull request after following the steps below. Mention this issue in the description of the pull request.
Fix the issue locally.
Run the test suite locally. Remember that the subdirectories under test/ create fine-grained testing targets, so you can e.g. use make check-clang-ast to only run Clang's AST tests.
Create a Git commit.
Run git clang-format HEAD~1 to format your changes.
Open a pull request to the upstream repository on GitHub. Detailed instructions can be found in GitHub's documentation. Mention this issue in the description of the pull request.

If you have any further questions about this issue, don't hesitate to ask via a comment in the thread below.

llvmbot · 2025-02-21T18:42:25Z

@llvm/issue-subscribers-good-first-issue

Author: John Demme (teqdruid)

```python # This works ir.IntegerAttr.get(ir.IntegerType.get_signless(64), 0x7fffffffffffffff)

This doesn't

ir.IntegerAttr.get(ir.IntegerType.get_signless(64), 0xffffffffffffffff)

TypeError: get(): incompatible function arguments. The following argument types are supported:
1. get(type: pycde.circt._mlir_libs._mlir.ir.Type, value: int) -> pycde.circt._mlir_libs._mlir.ir.IntegerAttr

Invoked with types: pycde.circt._mlir_libs._mlir.ir.IntegerType, int


The problem is that the nanobind (and CAPI) calls use `int64_t` (signed) which 0xffffffffffffffff overflows. (This doesn't produce the nicest error message, but that's a problem with nanobind.) Generally speaking, we need a way to pass &gt;= 64-bit values into nanobind (easy enough by using the python int type) and through the CAPI (the real issue). The 64-bit case is pretty easy to solve (using `uint64_t`), but we need a more generic way. In C++ world, we have APInt. Is there a C equivalent we could use? Is there some standard way we could support this?
</details>

teqdruid · 2025-02-21T18:48:50Z

In C++ world, we have APInt. Is there a C equivalent we could use? Is there some standard way we could support this?

I have been told by @stellaraccident that llvm/Support is exempt from the decree that "no C++ shall be linked to python extensions" - see MLIRPythonExtension.Core::PRIVATE_LINK_LIBS. So possibly the solution here is simply to add APIs that pass APInt back and forth between the python extension and the code in libMLIRPython.

This seems pretty hacky since we'd have to add a C++ API somewhere which supports the CAPI attributes and APInt. Possible, but ugly as it doesn't solve the problem for non-C++ users.

bababuck · 2025-02-21T19:13:44Z

@teqdruid @stellaraccident It doesn't seem like anyone has started work on this, could I give it a go?

stellaraccident · 2025-02-21T20:34:07Z

Sure - tag me explicitly if you'd like my help on reviews... I'm not otherwise watching super closely.

teqdruid · 2025-02-21T22:22:58Z

Sure - tag me explicitly if you'd like my help on reviews... I'm not otherwise watching super closely.

Same here

llvm/llvm-project#128072 To mitigate this, if a constant over 63-bits is specified, create it by concatenating 32-bit constants.

bababuck · 2025-03-06T00:48:01Z

@stellaraccident @teqdruid

Apolgies for the slow movement on this.

My understanding is that the interfaces involved are:
Python interface:

get (PyType &type, int64_t value) {
       MlirAttribute attr = mlirIntegerAttrGet(type, value);
       return PyIntegerAttribute(type.getContext(), attr);
};

static py::int_ toPyInt(PyIntegerAttribute &self) {
    MlirType type = mlirAttributeGetType(self);
    if (mlirTypeIsAIndex(type) || mlirIntegerTypeIsSignless(type))
          return mlirIntegerAttrGetValueInt(self);
    if (mlirIntegerTypeIsSigned(type))
          return mlirIntegerAttrGetValueSInt(self);
    return mlirIntegerAttrGetValueUInt(self);

CAPI:

/// Returns the value stored in the given integer attribute, assuming the value
/// is of signless type and fits into a signed 64-bit integer.
MLIR_CAPI_EXPORTED int64_t mlirIntegerAttrGetValueInt(MlirAttribute attr);

/// Returns the value stored in the given integer attribute, assuming the value
/// is of signed type and fits into a signed 64-bit integer.
MLIR_CAPI_EXPORTED int64_t mlirIntegerAttrGetValueSInt(MlirAttribute attr);

/// Returns the value stored in the given integer attribute, assuming the value
/// is of unsigned type and fits into an unsigned 64-bit integer.
MLIR_CAPI_EXPORTED uint64_t mlirIntegerAttrGetValueUInt(MlirAttribute attr);

For Python, my understanding is that way in and out for large integers in python is via use of py::byte.

I just want to make sure I'm understanding correctly what was meant by the following:

For APIs that need to take an APInt, the caller just populates that as needed. For those that return, have the caller pass in a pointer to the struct with numbits set to the max that is allocated.

From this, my understanding is that the CAPI should be updated with functions like the following:

MLIR_CAPI_EXPORTED int mlirIntegerAttrGetValueUInterop(MlirAttribute attr, apint_interop_t *interop);

The Python interface would call these new functions to receive an interop and the interface would convert the uint64_t array to interface with the py::int_ class via py::byte. Does this sound correct?

I also wanted to understand better how your proposed design helps with heap accesses, it's unclear to me where we could save on allocations. Thanks!

Fixes issue llvm#128072. Allows for arbitrarily sized integers to be requested via Python.

teqdruid added the mlir:python MLIR Python bindings label Feb 20, 2025

teqdruid added the good first issue https://github.com/llvm/llvm-project/contribute label Feb 21, 2025

teqdruid added a commit to llvm/circt that referenced this issue Feb 27, 2025

[PyCDE] Mitigate >63-bit constant bug upstream

bae6727

llvm/llvm-project#128072 To mitigate this, if a constant over 63-bits is specified, create it by concatenating 32-bit constants.

bababuck added a commit to bababuck/llvm-project that referenced this issue Mar 10, 2025

Handle large integers (> 64 bits) for the IntegerAttr C-API

9a64fda

Fixes issue llvm#128072. Allows for arbitrarily sized integers to be requested via Python.

bababuck mentioned this issue Mar 10, 2025

[mlir] Handle large integers (> 64 bits) for the IntegerAttr C-API #130539

Open

bababuck added a commit to bababuck/llvm-project that referenced this issue Mar 20, 2025

Handle large integers (> 64 bits) for the IntegerAttr C-API

ee6c0aa

Fixes issue llvm#128072. Allows for arbitrarily sized integers to be requested via Python.

bababuck added a commit to bababuck/llvm-project that referenced this issue Mar 20, 2025

Handle large integers (> 64 bits) for the IntegerAttr C-API

4ba791e

Fixes issue llvm#128072. Allows for arbitrarily sized integers to be requested via Python.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[MLIR][Python] IntegerAttribute doesn't accept unsigned values greater than 63 bits #128072

[MLIR][Python] IntegerAttribute doesn't accept unsigned values greater than 63 bits #128072

teqdruid commented Feb 20, 2025

llvmbot commented Feb 20, 2025

This doesn't

makslevental commented Feb 20, 2025 •

edited

Loading

stellaraccident commented Feb 20, 2025

teqdruid commented Feb 21, 2025

llvmbot commented Feb 21, 2025

llvmbot commented Feb 21, 2025

This doesn't

teqdruid commented Feb 21, 2025

bababuck commented Feb 21, 2025

stellaraccident commented Feb 21, 2025

teqdruid commented Feb 21, 2025

bababuck commented Mar 6, 2025 •

edited

Loading

[MLIR][Python] IntegerAttribute doesn't accept unsigned values greater than 63 bits #128072

[MLIR][Python] IntegerAttribute doesn't accept unsigned values greater than 63 bits #128072

Comments

teqdruid commented Feb 20, 2025

llvmbot commented Feb 20, 2025

This doesn't

makslevental commented Feb 20, 2025 • edited Loading

stellaraccident commented Feb 20, 2025

teqdruid commented Feb 21, 2025

llvmbot commented Feb 21, 2025

llvmbot commented Feb 21, 2025

This doesn't

teqdruid commented Feb 21, 2025

bababuck commented Feb 21, 2025

stellaraccident commented Feb 21, 2025

teqdruid commented Feb 21, 2025

bababuck commented Mar 6, 2025 • edited Loading

makslevental commented Feb 20, 2025 •

edited

Loading

bababuck commented Mar 6, 2025 •

edited

Loading