Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-129149: Add fast path for medium-size integers in PyLong_FromSsize_t() #129301

Merged
merged 13 commits into from
Mar 13, 2025

Conversation

chris-eibl
Copy link
Contributor

@chris-eibl chris-eibl commented Jan 25, 2025

Add the macro PYLONG_FROM_INT and use it in PyLong_FromLong() and PyLong_FromLongLong().
There, this is just a binary compatible refactoring.

Use it in PyLong_FromSsize_t(), too, to get the fast path for medium-size integers.

Use it in PyLong_FromLong() and PyLong_FromLongLong().
This is just a refactoring and will create the same binary code.
Use it in now in PyLong_FromSsize_t(), too.
@chris-eibl chris-eibl changed the title gh-129149: Add fast path for medium-size integers in :c:func:PyLong_FromSsize_t gh-129149: Add fast path for medium-size integers in PyLong_FromSsize_t() Jan 25, 2025
Copy link
Member

@skirpichev skirpichev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Few stylistic nitpicks, feel free to ignore.

chris-eibl and others added 3 commits January 26, 2025 06:22
Co-authored-by: Sergey B Kirpichev <skirpichev@gmail.com>
Co-authored-by: Sergey B Kirpichev <skirpichev@gmail.com>
Co-authored-by: Sergey B Kirpichev <skirpichev@gmail.com>
@chris-eibl
Copy link
Contributor Author

I like your suggestions, brings us more in sync with PYLONG_FROM_UINT.
Just wanted to keep the diff small.

@skirpichev
Copy link
Member

BTW, next time you can apply several suggestions in one shot (subscribers will get less notifications): it's possible add them in one batch (on tab "files changed").

@skirpichev skirpichev self-requested a review January 26, 2025 05:43
@chris-eibl
Copy link
Contributor Author

chris-eibl commented Jan 26, 2025

BTW, next time you can apply several suggestions in one shot (subscribers will get less notifications): it's possible add them in one batch (on tab "files changed").

Great, will do!

I've done some more syncs with PYLONG_FROM_UINT, if we want to keep the diff small, we can remove (all of) them.

@skirpichev
Copy link
Member

@chris-eibl, please don't sync with the main branch, unless you want to fix file conflict or trigger a CI build.

@chris-eibl
Copy link
Contributor Author

Just to reassure: this needs no further action from me atm - correct?

Most probably, another core reviewer is needed since it is still labeled as "awaiting core review"?

Do I have to request this? If yes, whom should I ask for a review?

@skirpichev
Copy link
Member

CC @picnixz

@picnixz picnixz self-requested a review February 24, 2025 20:09
Copy link
Member

@picnixz picnixz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of curiosity, can we have some benchmarks?

(on a multiple of 4 spaces)
@chris-eibl
Copy link
Contributor Author

Out of curiosity, can we have some benchmarks?

pyperformance was neutral on it. I did think of some micro benchmarks, but I've found no good candidate without writing an external c application to directly benchmark the PyAPI_FUNC PyLong_AsSsize_t, yet.

Maybe something with ctypes would work out, too?

Copy link
Member

@picnixz picnixz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you actually tests for checking the medium-size path? as well as perhaps checks for checking SSIZE_T_MIN (namely, tests to testcapi)

@chris-eibl
Copy link
Contributor Author

Yeah, check_long_asint uses 1234, which is a medium-size int. I could add more.
As you suggest, best in there, so that we get more coverage for the other test_long_as*.

Especially one negative medium-size int would be fine, too?

Let's change

        values = (0, 1, 1234, max_val)
        if min_val < 0:
            values += (-1, min_val)

to

        values = (0, 1, 512, 1234, max_val)
        if min_val < 0:
            values += (-1, -512, -1234, min_val)

@chris-eibl
Copy link
Contributor Author

The bots are failing. Seems not related to the PR.

Maybe updating with the main branch can fix it - shall I give it a try?

@picnixz
Copy link
Member

picnixz commented Mar 1, 2025

yes, the CI config changed

@chris-eibl
Copy link
Contributor Author

Yeah - that fixed it :)

@picnixz picnixz self-requested a review March 1, 2025 13:42
@picnixz picnixz added the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Mar 1, 2025
@bedevere-bot
Copy link

🤖 New build scheduled with the buildbot fleet by @picnixz for commit fda2844 🤖

Results will be shown at:

https://buildbot.python.org/all/#/grid?branch=refs%2Fpull%2F129301%2Fmerge

If you want to schedule another build, you need to add the 🔨 test-with-buildbots label again.

@bedevere-bot bedevere-bot removed the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Mar 1, 2025
Copy link
Member

@picnixz picnixz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, conditioned to build bot success (one build bot is know to fail though)

@picnixz
Copy link
Member

picnixz commented Mar 1, 2025

So the 2 build bots currently failing are know to fail so I think we can safely merge this one. Just asking Victor if he wants to add something else here.

@picnixz picnixz requested a review from vstinner March 1, 2025 15:59
@chris-eibl
Copy link
Contributor Author

Here are some benchmarks:

  • on my Windows 10 PC (i5-4570 CPU)
  • Microsoft Visual Studio 2022 17.13.0 Preview 5.0
  • release build (non-PGO)
  • about 25% faster

before:

some small ints
          -5:    21.26 ms
           0:    20.83 ms
         256:    20.90 ms
some medium ints
 -1073741823:   123.28 ms
        1000:   124.79 ms
  1073741822:   122.32 ms
some bigger ints
 -1073741824:   196.28 ms
  1073741824:   195.87 ms

after:

some small ints
          -5:    20.90 ms
           0:    20.45 ms
         256:    20.50 ms
some medium ints
 -1073741823:   100.39 ms
        1000:    98.54 ms
  1073741822:    98.61 ms
some bigger ints
 -1073741824:   198.79 ms
  1073741824:   198.45 ms

Details

#include <chrono>
#include <iomanip>
#include <iostream>

#include "Python.h"

#define _PY_NSMALLPOSINTS           257
#define _PY_NSMALLNEGINTS           5

const Py_ssize_t PyLong_MASK_ssize_t = static_cast<Py_ssize_t>(PyLong_MASK);

void bench(Py_ssize_t val)
{
    const auto start{ std::chrono::steady_clock::now() };

    for (size_t i = 0; i < 10000000; ++i)
    {
        PyObject* obj = PyLong_FromSsize_t(val);
        Py_DECREF(obj);
    }

    const auto finish{ std::chrono::steady_clock::now() };
    const std::chrono::duration<double> elapsed_seconds{ finish - start };

    std::cout << std::setw(12) << val << ": " << std::setw(8) << elapsed_seconds.count() * 1000 << " ms\n";
}

int main()
{
    Py_Initialize();

    std::cout << "some small ints\n";
    std::cout << std::setprecision(2) << std::fixed;
    bench(-_PY_NSMALLNEGINTS);
    bench(0);
    bench(_PY_NSMALLPOSINTS - 1);

    std::cout << "some medium ints\n";
    bench(0 - PyLong_MASK_ssize_t);
    bench(1000);
    bench(PyLong_MASK_ssize_t - 1);

    std::cout << "some bigger ints\n";
    bench(0 - PyLong_MASK_ssize_t - 1);
    bench(PyLong_MASK_ssize_t + 1);

    Py_Finalize();
}
 

@picnixz
Copy link
Member

picnixz commented Mar 1, 2025

The benchmarks are promising. We are gaining like 10-20% so it's non-negligible.

Copy link
Member

@vstinner vstinner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Interesting optimization.

cc @serhiy-storchaka

@chris-eibl
Copy link
Contributor Author

@vstinner: shall PyLong_FromInt32(), PyLong_FromUInt32() and PyLong_FromUInt64() use these macros in a follow up PR, too? AFAIR you've introduced those methods.

@vstinner
Copy link
Member

vstinner commented Mar 3, 2025

@vstinner: shall PyLong_FromInt32(), PyLong_FromUInt32() and PyLong_FromUInt64() use these macros in a follow up PR, too? AFAIR you've introduced those methods.

Let's wait until this change PR is merged.

@picnixz picnixz self-assigned this Mar 12, 2025
@picnixz
Copy link
Member

picnixz commented Mar 12, 2025

I'll merge this by the end of the week (hence my assignment). @chris-eibl can you update the branch just to run the latest CI? TiA (I'll do it tomorrow otherwise)

@picnixz picnixz merged commit 119bcfa into python:main Mar 13, 2025
42 checks passed
@picnixz
Copy link
Member

picnixz commented Mar 13, 2025

Thank you!

@picnixz
Copy link
Member

picnixz commented Mar 13, 2025

shall PyLong_FromInt32(), PyLong_FromUInt32() and PyLong_FromUInt64() use these macros in a follow up PR, too? AFAIR you've introduced those methods.

You can reuse the same issue for this follow-up PR btw.

@chris-eibl chris-eibl deleted the gh-129149 branch March 13, 2025 18:54
plashchynski pushed a commit to plashchynski/cpython that referenced this pull request Mar 17, 2025
…romSsize_t()` (python#129301)

The implementation of `PyLong_FromLong()`, `PyLong_FromLongLong()` and `PyLong_FromSsize_t()`
are now handled by a common macro `PYLONG_FROM_INT` which contains fast paths depending on the
size of the integer to convert. Consequently, `PyLong_FromSsize_t()` for medium-sized integers is faster
by roughly 25%.

---------

Co-authored-by: Sergey B Kirpichev <skirpichev@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants