-
-
Notifications
You must be signed in to change notification settings - Fork 31.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bpo-45340: Don't create object dictionaries unless actually needed #28802
bpo-45340: Don't create object dictionaries unless actually needed #28802
Conversation
… keys more freely.
Needs NEWS and some additional documentation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some comments about LOAD_METHOD
.
if (*owner_dictptr) { | ||
SPECIALIZATION_FAIL(LOAD_METHOD, SPEC_FAIL_IS_ATTR); | ||
goto fail; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this reduce the specialization rates by much?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't check. Overall performance is up a bit, so that's good enough for this PR 🙂
We will need to recheck the specialization rates and kinds of failures for LOAD_ATTR
, STORE_ATTR
and LOAD_METHOD
. From my superficial scan, it appears that the kinds of failures are more clustered, which bodes well.
But changing all that in this PR would just confuse matters.
assert(self_cls->tp_dictoffset > 0); | ||
assert(self_cls->tp_inline_values_offset > 0); | ||
PyDictObject *dict = *(PyDictObject **)(((char *)self) + self_cls->tp_dictoffset); | ||
DEOPT_IF(dict != NULL, LOAD_METHOD); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wait is this due to the new scheme that objects shouldn't have dict unless really required?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it is. We assume that having no __dict__
is the most common case, so it isn't worth bothering to check for the shared key dict case.
|
||
PyObject *descr = NULL; | ||
DesciptorClassification kind = 0; | ||
kind = analyze_descriptor(owner_cls, name, &descr, 0); | ||
// Store the version right away, in case it's modified halfway through. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Out of curiosity: is there a reason why we don't need to do this immediately anymore? Are the new dictionary operations atomic/don't call any _PyType_Lookup
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We used to call _Py_dict_lookup
, which could do anything, now we call _PyDictKeys_StringLookup
which cannot.
🤖 New build scheduled with the buildbot fleet by @markshannon for commit 7978220 🤖 If you want to schedule another build, you need to add the ":hammer: test-with-buildbots" label again. |
🤖 New build scheduled with the buildbot fleet by @markshannon for commit 8c894f3 🤖 If you want to schedule another build, you need to add the ":hammer: test-with-buildbots" label again. |
The buildbot failures are:
|
Benchmark of final version: 1% faster |
if (owner_cls->tp_inline_values_offset) { | ||
PyObject **owner_dictptr = _PyObject_DictPointer(owner); | ||
assert(owner_dictptr); | ||
if (*owner_dictptr) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@markshannon I'm somewhat confused why anything with a dict pointer is now an attribute, even without looking into said dict. Is there a reason elsewhere (or is this so that we don't need another opcode for the other dict form)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Anything will a dict pointer will be a miss for LOAD_INSTANCE_VALUE
, so I'm assuming it isn't worth optimizing.
Maybe we should attempt to optimize this using LOAD_ATTR_WITH_HINT
at this point?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code seems to be for LOAD_METHOD_CACHED
. I'll try to play around with it and see if the dictionary matters (versus LOAD_ATTR_INSTANCE_VALUE
/LOAD_ATTR_WITH_HINT
and such.
A "normal" Python objects is conceptually just a pair of pointers, one to the class, and one to the dictionary.
With shared keys, the dictionary is redundant as it is no more than a pair of pointers, one to the keys and one to the values.
By adding a pointer to the values to the object and fetching the keys via the class, we can avoid creating a dictionary for many objects.
See faster-cpython/ideas#72 for more details.
About 1% faster, which is nice, but the main benefit is the reduced memory consumption.
https://bugs.python.org/issue45340