Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Compact layout for instance field smaller than 32-bit #21251

Open
hangshao0 opened this issue Mar 3, 2025 · 13 comments
Open

Compact layout for instance field smaller than 32-bit #21251

hangshao0 opened this issue Mar 3, 2025 · 13 comments

Comments

@hangshao0
Copy link
Contributor

hangshao0 commented Mar 3, 2025

For instance fields smaller than 32-bit, they take 32-bit in current OpenJ9 object layout.

For example, a value type like the following has 4 fields:

value class SimpleVT {
	byte b1;
        byte b2;
        byte b3;
        byte b4;
}

These 4 fields take 128-bit.

Project Valhalla supports atomic value types, this limits what OpenJ9 can flatten. To support SimpleVT atomically, OpenJ9 cannot flatten SimpleVT in its container class (any class that has SimpleVT as instance field) now.

Openj9 is already distinguishing fields smaller than 32 bit in the interpreter via flag J9FieldTypeBoolean, J9FieldTypeByte, J9FieldTypeChar, J9FieldTypeShort. Ideally in the object layout, boolean/byte should only take 8 bits, short/char should only take 16 bits, so that we can atomically flatten classes like SimpleVT, which will take only 32 bits.

We put 64-bit fields first, then followed by 32-bit fields. Additionally, we need to record the offsets for the first 16-bit and 8-bit field. The 16-bit and 8-bits fields should be put starting from these offsets. For atomic VTs size <= 32 bits, they need to be put at 32-bit aligned address. For atomic VTs 32-bit to 64-bit, they need to be put at 64-bit aligned address. For atomic VT larger than 64-bit, we won’t flatten them.

In the bytecode interpreter, getfieldLogic() needs to be updated to check for J9FieldTypeBoolean, J9FieldTypeByte, J9FieldTypeChar, J9FieldTypeShort. New helper inlineMixedObjectReadU16/I16/U8/I8() need to be added.

Putfield() also needs to be updated. Instead of using inlineMixedObjectStoreU32(), New helper inlineMixedObjectWriteU16/I16/U8/I8() need to be used accordingly.

16-bit and 8-bit fields cannot be reference, so there is no impact on the instanceDescription. Also each instanceDescription bit represent 1 reference size data, so that the final object size needs to be rounded up to a multiple of reference size if necessary,

This issue only deals with instance fields now. Similar change could be applied to static field as well in the future.

This is not limited to Valhalla. We can implement this in Valhalla first, then look at extending the new layout to non-Valhalla builds as well.

@hangshao0
Copy link
Contributor Author

@tajila
Copy link
Contributor

tajila commented Mar 7, 2025

@gacholio FYI

@gacholio
Copy link
Contributor

gacholio commented Mar 7, 2025

Many (and I mean many) years ago we did an analysis that determined this was not going to save us any real heap space. Has there been a recent analysis?

@tajila
Copy link
Contributor

tajila commented Mar 7, 2025

Many (and I mean many) years ago we did an analysis that determined this was not going to save us any real heap space. Has there been a recent analysis?

The main goal is not to save heap space in java objects, although that is not a bad secondary goal if it occurs. The real purpose for this is to allow us to support more flattened VTs in a data consistent manner (no tearing). Realistically, we cannot support anything larger than 64bits. With our given layout policy it eliminates types with a few small fields given that we will expand everything to 32bits.

Another benefit of this work is that we will align ourselves with the layout policy of the RI. In the past issues have been brought up due to the difference in layout policy.

@gacholio
Copy link
Contributor

The idea of atomic VTs is a bit foolish, IMO. Who would ever actually need to swap 4 booleans at once (or 8, since we've dropped 32-bit support)?

Having said this, this does simply the get/set field operations. I don't thinks it simplifies returns.

@tajila
Copy link
Contributor

tajila commented Mar 11, 2025

The idea of atomic VTs is a bit foolish, IMO. Who would ever actually need to swap 4 booleans at once (or 8, since we've dropped 32-bit support)?

Its important because today with a given reference type it is impossible to observe tearing. If you convert an existing ref type to a value type (as would be the case with all value-based classes) then it would be possible to observe tearing if those fields were flattened. So the atomic variant is a way to guarantee data consistency in order to preserve program behaviour.

@tajila
Copy link
Contributor

tajila commented Mar 11, 2025

Having said this, this does simply the get/set field operations. I don't thinks it simplifies returns.

Yes, this only impacts get/set. The returns remain single or double slots.

@tajila
Copy link
Contributor

tajila commented Mar 12, 2025

FYI @amicic @dmitripivkine

@tajila
Copy link
Contributor

tajila commented Mar 12, 2025

One example Id like to call out is:

java.time.LocalDate {
    private final int year;

    private final short month;

    private final short day;
}

Currently, this would be 96 bits, but with the new proposal it can fit in 64bits.

@tajila
Copy link
Contributor

tajila commented Mar 12, 2025

Another is

class Float16 {
    private final short value;

Where the expectation is that 8 of them can be contiguously loaded with an xmm register for vector API.

@jdmpapin
Copy link
Contributor

jdmpapin commented Mar 12, 2025

This will also be a step toward simplifying Unsafe get/put for boolean, byte, char, and short. Currently the size of the actual access is 1 or 2 bytes for native memory and array elements, but it's 4 bytes for instance and static fields. With this change, only the static fields would still be 4 bytes. If the follow-on change were made for static fields as well, then the size of the access would always be the same (for a given type). Even then it would still be necessary to distinguish the static case, but at least not for this reason. If we can eventually also make the meaning of the offset consistent between the instance and static case, then (aside from arraylets/off-heap) all of the cases would be unified, and these operations would become simple loads and stores with no control flow. Clearly this kind of benefit is pretty far off, but personally I'd like to get there one day, and this is moving in that direction.

@gacholio
Copy link
Contributor

I imagine it would not be difficult to do this for statics as well. It might be best to continue to packege the fields in minimum 4 bytes, but only the low-order bits (depending on the field size).

@gacholio
Copy link
Contributor

I imagine it would not be difficult to do this for statics as well. It might be best to continue to packege the fields in minimum 4 bytes, but only the low-order bits (depending on the field size).

Actually, that makes little sense, so ignore that for now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants