gh-115999: Add free-threaded specialization for UNPACK_SEQUENCE
#126600
Conversation
    def test_unpack_sequence(self):
        def f():
            for _ in range(100):
                a, b = 1, 2
Sort of unrelated to this PR, but I'm surprised constant unpacking like this isn't peepholed into LOAD_CONST, STORE_FAST, LOAD_CONST, STORE_FAST. Maybe because it's actually a longer instruction sequence than the original? It certainly does less work.
@iritkatriel, thoughts?
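(For reference, here is a minimal sketch of how to check what the compiler actually emits today with the dis module; the exact bytecode layout varies across CPython versions, but the constant tuple currently survives into an UNPACK_SEQUENCE rather than being split into per-target stores:)

```python
import dis

def f():
    # Constant unpacking: the RHS tuple (1, 2) is folded into a single
    # constant, then taken apart at runtime with UNPACK_SEQUENCE.
    a, b = 1, 2
    return a, b

# Disassemble f to inspect the emitted instruction sequence.
dis.dis(f)
```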
Thanks for doing this! The changes look good. I think the current implementations of UNPACK_SEQUENCE_{TUPLE,TWO_TUPLE} are already thread-safe since tuples are immutable. The implementation of UNPACK_SEQUENCE_LIST isn't thread-safe (there is nothing preventing another thread from adding or removing items from the list while the instruction is executing), so we're going to need a slightly different implementation in the free-threaded build.
I would suggest taking a critical section around the list size check and the code that pushes items onto the stack. Something like:
    inst(UNPACK_SEQUENCE_LIST, (unused/1, seq -- values[oparg])) {
        PyObject *seq_o = PyStackRef_AsPyObjectBorrow(seq);
        DEOPT_IF(!PyList_CheckExact(seq_o));
        int should_deopt = 0;
        Py_BEGIN_CRITICAL_SECTION(seq_o);
        should_deopt = PyList_GET_SIZE(seq_o) != oparg;
        if (!should_deopt) {
            STAT_INC(UNPACK_SEQUENCE, hit);
            PyObject **items = _PyList_ITEMS(seq_o);
            for (int i = oparg; --i >= 0; ) {
                *values++ = PyStackRef_FromPyObjectNew(items[i]);
            }
        }
        Py_END_CRITICAL_SECTION();
        DEOPT_IF(should_deopt);
        DECREF_INPUTS();
    }
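(As an aside, the length guard being protected here mirrors the check Python code observes directly: unpacking raises ValueError on a size mismatch, which is why a concurrent resize must not slip in between the size check and the pushes. A minimal Python-level illustration:)

```python
# Unpacking checks the sequence length against the number of targets.
a, b = [1, 2]          # lengths match: a == 1, b == 2
assert (a, b) == (1, 2)

try:
    a, b = [1, 2, 3]   # too many values: raises ValueError at runtime
except ValueError as exc:
    print(exc)         # e.g. "too many values to unpack (expected 2)"
```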
LGTM!
Python/bytecodes.c (outdated):
    *values++ = PyStackRef_FromPyObjectNew(items[i]);
    int should_deopt = 0;
    Py_BEGIN_CRITICAL_SECTION(seq_o);
    should_deopt = PyList_GET_SIZE(seq_o) != oparg;
Could this be rewritten as something like:
    if (PyList_GET_SIZE(seq_o) != oparg) {
        END_CRITICAL_SECTION
        DEOPT_IF(true);
    }
We can't use the macros because they introduce and close a new scope.
I don't think there's precedent in CPython for using the critical section functions manually (everything uses the macros), but if it's important that we maintain the structure you're suggesting, we could do something like:
    #ifdef Py_GIL_DISABLED
        PyCriticalSection _py_cs;
        PyCriticalSection_Begin(&_py_cs, seq_o);
    #endif
        if (PyList_GET_SIZE(seq_o) != oparg) {
    #ifdef Py_GIL_DISABLED
            PyCriticalSection_End(&_py_cs);
    #endif
            DEOPT_IF(true);
        }
        STAT_INC(UNPACK_SEQUENCE, hit);
        PyObject **items = _PyList_ITEMS(seq_o);
        for (int i = oparg; --i >= 0; ) {
            *values++ = PyStackRef_FromPyObjectNew(items[i]);
        }
    #ifdef Py_GIL_DISABLED
        PyCriticalSection_End(&_py_cs);
    #endif
        DECREF_INPUTS();
Note that the preprocessor guards are not necessary for correctness; the critical section functions are a no-op in default builds.
This doesn't fit into the general pattern of specialized instructions:
- One or more guards
- Zero or one actions
See the docs for why this matters.
Has this been benchmarked?
Performance is neutral overall on the default build, though, strangely, the results for the unpack sequence benchmark suggest this change is significantly faster; I'm not sure how reliable that benchmark is. Performance is improved by 2% overall on the free-threaded build.
@markshannon - Please have another look at this.
LGTM!
Co-authored-by: mpage <mpage@cs.stanford.edu>
@Eclips4 - I think it would be fine to merge this PR. We can put up follow-up PRs if any subsequent changes are required.
gh-115999: Add free-threaded specialization for `UNPACK_SEQUENCE` (python#126600)

Add free-threaded specialization for the `UNPACK_SEQUENCE` opcode. `UNPACK_SEQUENCE_TUPLE`/`UNPACK_SEQUENCE_TWO_TUPLE` are already thread-safe since tuples are immutable. `UNPACK_SEQUENCE_LIST` is not thread-safe because of the mutable nature of lists (there is nothing preventing another thread from adding items to or removing them from the list while the instruction is executing). To achieve thread safety we add a critical section to the implementation of `UNPACK_SEQUENCE_LIST`, around the parts where we check the size of the list and push items onto the stack.

---------

Co-authored-by: Matt Page <mpage@meta.com>
Co-authored-by: mpage <mpage@cs.stanford.edu>
Linked issue: --disable-gil builds #115999