-
-
Notifications
You must be signed in to change notification settings - Fork 31.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gh-129149: Add fast path for medium-size integers in PyLong_FromSsize_t()
#129301
Conversation
Use it in PyLong_FromLong() and PyLong_FromLongLong(). This is just a refactoring and will create the same binary code.
Use it in now in PyLong_FromSsize_t(), too.
PyLong_FromSsize_t
PyLong_FromSsize_t()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Few stylistic nitpicks, feel free to ignore.
Co-authored-by: Sergey B Kirpichev <skirpichev@gmail.com>
Co-authored-by: Sergey B Kirpichev <skirpichev@gmail.com>
Co-authored-by: Sergey B Kirpichev <skirpichev@gmail.com>
I like your suggestions, brings us more in sync with |
BTW, next time you can apply several suggestions in one shot (subscribers will get less notifications): it's possible add them in one batch (on tab "files changed"). |
Great, will do! I've done some more syncs with PYLONG_FROM_UINT, if we want to keep the diff small, we can remove (all of) them. |
@chris-eibl, please don't sync with the main branch, unless you want to fix file conflict or trigger a CI build. |
Just to reassure: this needs no further action from me atm - correct? Most probably, another core reviewer is needed since it is still labeled as "awaiting core review"? Do I have to request this? If yes, whom should I ask for a review? |
CC @picnixz |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Out of curiosity, can we have some benchmarks?
(on a multiple of 4 spaces)
pyperformance was neutral on it. I did think of some micro benchmarks, but I've found no good candidate without writing an external c application to directly benchmark the Maybe something with ctypes would work out, too? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you actually tests for checking the medium-size path? as well as perhaps checks for checking SSIZE_T_MIN
(namely, tests to testcapi
)
Yeah, Especially one negative medium-size int would be fine, too? Let's change values = (0, 1, 1234, max_val)
if min_val < 0:
values += (-1, min_val) to values = (0, 1, 512, 1234, max_val)
if min_val < 0:
values += (-1, -512, -1234, min_val) |
The bots are failing. Seems not related to the PR. Maybe updating with the main branch can fix it - shall I give it a try? |
yes, the CI config changed |
Yeah - that fixed it :) |
🤖 New build scheduled with the buildbot fleet by @picnixz for commit fda2844 🤖 Results will be shown at: https://buildbot.python.org/all/#/grid?branch=refs%2Fpull%2F129301%2Fmerge If you want to schedule another build, you need to add the 🔨 test-with-buildbots label again. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, conditioned to build bot success (one build bot is know to fail though)
So the 2 build bots currently failing are know to fail so I think we can safely merge this one. Just asking Victor if he wants to add something else here. |
Here are some benchmarks:
before:
after:
Details
#include <chrono>
#include <iomanip>
#include <iostream>
#include "Python.h"
#define _PY_NSMALLPOSINTS 257
#define _PY_NSMALLNEGINTS 5
const Py_ssize_t PyLong_MASK_ssize_t = static_cast<Py_ssize_t>(PyLong_MASK);
void bench(Py_ssize_t val)
{
const auto start{ std::chrono::steady_clock::now() };
for (size_t i = 0; i < 10000000; ++i)
{
PyObject* obj = PyLong_FromSsize_t(val);
Py_DECREF(obj);
}
const auto finish{ std::chrono::steady_clock::now() };
const std::chrono::duration<double> elapsed_seconds{ finish - start };
std::cout << std::setw(12) << val << ": " << std::setw(8) << elapsed_seconds.count() * 1000 << " ms\n";
}
int main()
{
Py_Initialize();
std::cout << "some small ints\n";
std::cout << std::setprecision(2) << std::fixed;
bench(-_PY_NSMALLNEGINTS);
bench(0);
bench(_PY_NSMALLPOSINTS - 1);
std::cout << "some medium ints\n";
bench(0 - PyLong_MASK_ssize_t);
bench(1000);
bench(PyLong_MASK_ssize_t - 1);
std::cout << "some bigger ints\n";
bench(0 - PyLong_MASK_ssize_t - 1);
bench(PyLong_MASK_ssize_t + 1);
Py_Finalize();
}
|
The benchmarks are promising. We are gaining like 10-20% so it's non-negligible. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Interesting optimization.
@vstinner: shall |
Let's wait until this change PR is merged. |
I'll merge this by the end of the week (hence my assignment). @chris-eibl can you update the branch just to run the latest CI? TiA (I'll do it tomorrow otherwise) |
Thank you! |
You can reuse the same issue for this follow-up PR btw. |
…romSsize_t()` (python#129301) The implementation of `PyLong_FromLong()`, `PyLong_FromLongLong()` and `PyLong_FromSsize_t()` are now handled by a common macro `PYLONG_FROM_INT` which contains fast paths depending on the size of the integer to convert. Consequently, `PyLong_FromSsize_t()` for medium-sized integers is faster by roughly 25%. --------- Co-authored-by: Sergey B Kirpichev <skirpichev@gmail.com>
Add the macro
PYLONG_FROM_INT
and use it inPyLong_FromLong()
andPyLong_FromLongLong()
.There, this is just a binary compatible refactoring.
Use it in
PyLong_FromSsize_t()
, too, to get the fast path for medium-size integers.