Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bpo-45340: Don't create object dictionaries unless actually needed #28802

Merged
merged 25 commits into from
Oct 13, 2021

Conversation

markshannon
Copy link
Member

@markshannon markshannon commented Oct 7, 2021

A "normal" Python objects is conceptually just a pair of pointers, one to the class, and one to the dictionary.

With shared keys, the dictionary is redundant as it is no more than a pair of pointers, one to the keys and one to the values.

By adding a pointer to the values to the object and fetching the keys via the class, we can avoid creating a dictionary for many objects.

See faster-cpython/ideas#72 for more details.

About 1% faster, which is nice, but the main benefit is the reduced memory consumption.

https://bugs.python.org/issue45340

@markshannon
Copy link
Member Author

Needs NEWS and some additional documentation.

Copy link
Member

@Fidget-Spinner Fidget-Spinner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some comments about LOAD_METHOD.

Comment on lines +943 to 945
if (*owner_dictptr) {
SPECIALIZATION_FAIL(LOAD_METHOD, SPEC_FAIL_IS_ATTR);
goto fail;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this reduce the specialization rates by much?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't check. Overall performance is up a bit, so that's good enough for this PR 🙂

We will need to recheck the specialization rates and kinds of failures for LOAD_ATTR, STORE_ATTR and LOAD_METHOD. From my superficial scan, it appears that the kinds of failures are more clustered, which bodes well.

But changing all that in this PR would just confuse matters.

assert(self_cls->tp_dictoffset > 0);
assert(self_cls->tp_inline_values_offset > 0);
PyDictObject *dict = *(PyDictObject **)(((char *)self) + self_cls->tp_dictoffset);
DEOPT_IF(dict != NULL, LOAD_METHOD);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait is this due to the new scheme that objects shouldn't have dict unless really required?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it is. We assume that having no __dict__ is the most common case, so it isn't worth bothering to check for the shared key dict case.


PyObject *descr = NULL;
DesciptorClassification kind = 0;
kind = analyze_descriptor(owner_cls, name, &descr, 0);
// Store the version right away, in case it's modified halfway through.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of curiosity: is there a reason why we don't need to do this immediately anymore? Are the new dictionary operations atomic/don't call any _PyType_Lookup?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We used to call _Py_dict_lookup, which could do anything, now we call _PyDictKeys_StringLookup which cannot.

@markshannon markshannon added the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Oct 9, 2021
@bedevere-bot
Copy link

🤖 New build scheduled with the buildbot fleet by @markshannon for commit 7978220 🤖

If you want to schedule another build, you need to add the ":hammer: test-with-buildbots" label again.

@bedevere-bot bedevere-bot removed the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Oct 9, 2021
@markshannon markshannon added the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Oct 12, 2021
@bedevere-bot
Copy link

🤖 New build scheduled with the buildbot fleet by @markshannon for commit 8c894f3 🤖

If you want to schedule another build, you need to add the ":hammer: test-with-buildbots" label again.

@bedevere-bot bedevere-bot removed the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Oct 12, 2021
@markshannon
Copy link
Member Author

The buildbot failures are:

  • buildbot/AMD64 Windows10 PR: Not enough memory on that machine. Lots of MemoryErrors.
  • buildbot/s390x RHEL7 LTO PR: Segfault due to stack overflow. Pre-existing issue (although one that needs fixing if we are going to support s390)
  • buildbot/x86 Gentoo Installed with X PR: Tk issue
  • buildbot/x86 Gentoo Non-Debug with X PR: Tk issue

@markshannon
Copy link
Member Author

Benchmark of final version: 1% faster

@markshannon markshannon merged commit a8b9350 into python:main Oct 13, 2021
if (owner_cls->tp_inline_values_offset) {
PyObject **owner_dictptr = _PyObject_DictPointer(owner);
assert(owner_dictptr);
if (*owner_dictptr) {
Copy link
Member

@Fidget-Spinner Fidget-Spinner Oct 22, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@markshannon I'm somewhat confused why anything with a dict pointer is now an attribute, even without looking into said dict. Is there a reason elsewhere (or is this so that we don't need another opcode for the other dict form)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Anything will a dict pointer will be a miss for LOAD_INSTANCE_VALUE, so I'm assuming it isn't worth optimizing.
Maybe we should attempt to optimize this using LOAD_ATTR_WITH_HINT at this point?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code seems to be for LOAD_METHOD_CACHED. I'll try to play around with it and see if the dictionary matters (versus LOAD_ATTR_INSTANCE_VALUE/LOAD_ATTR_WITH_HINT and such.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants