gh-128942: make arraymodule.c free-thread safe (lock-free) #130771

Open · wants to merge 34 commits into base: main

Conversation

@tom-pytel (Contributor) commented Mar 2, 2025

I added lock-free single-element reads and writes, mostly by copying the list object's homework; a rough sketch of the read path is below. TL;DR: pyperformance scimark seems to be back to about what it was without the free-threading-safe changes (pending confirmation, of course). I tried a few other things, but the list strategy seems good enough (except for the negative index thing I mentioned in #130744, if that is an issue).
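
For context, here is a minimal sketch of what the lock-free read path looks like under the list-style strategy. It is illustrative only, not this PR's actual code: the arraydata struct, the self->data field and a 'd'-typecode getter are assumptions for the sketch, while the FT_ATOMIC_* names are the usual free-threading atomic wrappers used elsewhere in CPython.

typedef struct {
    Py_ssize_t allocated;
    char items[];                 /* element storage, published as a unit */
} arraydata;

/* Lock-free single-element read for typecode 'd' (sketch only). The acquire
   load of self->data pairs with the release store done when a resize publishes
   a new buffer, so the reader always sees a fully initialized arraydata. */
static PyObject *
d_getitem_lock_free(arrayobject *self, Py_ssize_t i)
{
    arraydata *data = FT_ATOMIC_LOAD_PTR_ACQUIRE(self->data);
    Py_ssize_t size = FT_ATOMIC_LOAD_SSIZE_RELAXED(((PyVarObject *)self)->ob_size);
    if (i < 0 || i >= size) {
        PyErr_SetString(PyExc_IndexError, "array index out of range");
        return NULL;
    }
    return PyFloat_FromDouble(((double *)data->items)[i]);
}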

Timings below. The relevant rows are "OLD" (the non-free-threading-safe arraymodule), "SLOW" (the previous, slower PR), and the last two, "LFREERW".

### scimark_fft ###
WITHGIL: Mean +- std dev: 200 ms +- 6 ms
OLD:     Mean +- std dev: 238 ms +- 7 ms
SLOW:    Mean +- std dev: 274 ms +- 4 ms
ITEMFIX: Mean +- std dev: 272 ms +- 14 ms
NOLOCK:  Mean +- std dev: 240 ms +- 4 ms
TEST1:   Mean +- std dev: 267 ms +- 4 ms
TEST2:   Mean +- std dev: 255 ms +- 6 ms
LFREERD: Mean +- std dev: 256 ms +- 7 ms   # left in standard Py_SIZE() in getarrayitem
LFREERD: Mean +- std dev: 262 ms +- 7 ms   # atomic read of size in getarrayitem
LFREERD: Mean +- std dev: 270 ms +- 7 ms   # build and run again
LFREERD: Mean +- std dev: 255 ms +- 10 ms  # orthodox
LFREERD: Mean +- std dev: 259 ms +- 8 ms   # build and run again
LFREERW: Mean +- std dev: 239 ms +- 7 ms   # lockfree write single element
LFREERW: Mean +- std dev: 242 ms +- 6 ms   # build and run again
STATICL: Mean +- std dev: 223 ms +- 6 ms   # statically linked
STATICL: Mean +- std dev: 223 ms +- 5 ms   # build and run again
CLEANUP: Mean +- std dev: 224 ms +- 5 ms   # including atomic get/set
AMEMCPY: Mean +- std dev: 226 ms +- 6 ms   # atomic item copy on resize (instead of memcpy)
AMEMCPY: Mean +- std dev: 222 ms +- 5 ms   # build and run again
LOCKWR:  Mean +- std dev: 226 ms +- 4 ms   # locked writes again, but much better with static linking
LOCKWR:  Mean +- std dev: 226 ms +- 5 ms   # build and run again
AMCAGG:  Mean +- std dev: 222 ms +- 7 ms   # aggregate memcpy (just using best build)

### scimark_lu ###
WITHGIL: Mean +- std dev: 64.6 ms +- 1.9 ms
OLD:     Mean +- std dev: 89.0 ms +- 2.1 ms
SLOW:    Mean +- std dev: 95.4 ms +- 3.1 ms
ITEMFIX: Mean +- std dev: 92.0 ms +- 2.1 ms
NOLOCK:  Mean +- std dev: 88.5 ms +- 2.5 ms
TEST1:   Mean +- std dev: 89.7 ms +- 2.1 ms
TEST2:   Mean +- std dev: 90.5 ms +- 2.4 ms
LFREERD: Mean +- std dev: 89.9 ms +- 2.7 ms
LFREERD: Mean +- std dev: 90.9 ms +- 2.6 ms
LFREERD: Mean +- std dev: 93.4 ms +- 4.0 ms
LFREERD: Mean +- std dev: 91.9 ms +- 3.7 ms
LFREERD: Mean +- std dev: 91.0 ms +- 5.0 ms
LFREERW: Mean +- std dev: 89.2 ms +- 2.8 ms
LFREERW: Mean +- std dev: 88.8 ms +- 2.5 ms
STATICL: Mean +- std dev: 86.9 ms +- 1.9 ms
STATICL: Mean +- std dev: 87.1 ms +- 2.2 ms
CLEANUP: Mean +- std dev: 88.9 ms +- 2.5 ms
AMEMCPY: Mean +- std dev: 87.9 ms +- 1.9 ms
AMEMCPY: Mean +- std dev: 88.7 ms +- 4.8 ms
LOCKWR:  Mean +- std dev: 88.0 ms +- 2.4 ms
LOCKWR:  Mean +- std dev: 88.5 ms +- 3.5 ms
AMCAGG:  Mean +- std dev: 88.5 ms +- 3.5 ms

### scimark_monte_carlo ###
WITHGIL: Mean +- std dev: 40.1 ms +- 2.3 ms
OLD:     Mean +- std dev: 53.4 ms +- 1.0 ms
SLOW:    Mean +- std dev: 56.8 ms +- 1.4 ms
ITEMFIX: Mean +- std dev: 55.9 ms +- 0.7 ms
NOLOCK:  Mean +- std dev: 54.4 ms +- 1.3 ms
TEST1:   Mean +- std dev: 54.1 ms +- 0.6 ms
TEST2:   Mean +- std dev: 54.1 ms +- 0.6 ms
LFREERD: Mean +- std dev: 54.3 ms +- 0.9 ms
LFREERD: Mean +- std dev: 55.9 ms +- 0.9 ms
LFREERD: Mean +- std dev: 55.4 ms +- 1.1 ms
LFREERD: Mean +- std dev: 55.4 ms +- 1.1 ms
LFREERD: Mean +- std dev: 55.2 ms +- 1.5 ms
LFREERW: Mean +- std dev: 53.2 ms +- 1.1 ms
LFREERW: Mean +- std dev: 53.9 ms +- 1.5 ms
STATICL: Mean +- std dev: 55.0 ms +- 1.9 ms
STATICL: Mean +- std dev: 53.3 ms +- 1.0 ms
CLEANUP: Mean +- std dev: 52.6 ms +- 0.5 ms
AMEMCPY: Mean +- std dev: 53.3 ms +- 0.6 ms
AMEMCPY: Mean +- std dev: 53.4 ms +- 1.2 ms
LOCKWR:  Mean +- std dev: 55.4 ms +- 4.0 ms
LOCKWR:  Mean +- std dev: 54.8 ms +- 1.8 ms
AMCAGG:  Mean +- std dev: 53.1 ms +- 1.0 ms

### scimark_sor ###
WITHGIL: Mean +- std dev: 70.3 ms +- 3.3 ms
OLD:     Mean +- std dev: 98.1 ms +- 1.8 ms
SLOW:    Mean +- std dev: 99.9 ms +- 1.1 ms
ITEMFIX: Mean +- std dev:  100 ms +- 1 ms
NOLOCK:  Mean +- std dev: 97.3 ms +- 1.4 ms
TEST1:   Mean +- std dev: 97.5 ms +- 0.7 ms
TEST2:   Mean +- std dev: 98.5 ms +- 1.5 ms
LFREERD: Mean +- std dev: 97.0 ms +- 2.0 ms
LFREERD: Mean +- std dev: 98.2 ms +- 1.4 ms
LFREERD: Mean +- std dev: 98.4 ms +- 2.7 ms
LFREERD: Mean +- std dev: 97.6 ms +- 1.5 ms
LFREERD: Mean +- std dev: 97.9 ms +- 1.8 ms
LFREERW: Mean +- std dev: 97.2 ms +- 1.6 ms
LFREERW: Mean +- std dev: 96.9 ms +- 2.0 ms
STATICL: Mean +- std dev: 97.8 ms +- 1.9 ms
STATICL: Mean +- std dev: 97.2 ms +- 1.7 ms
CLEANUP: Mean +- std dev: 97.7 ms +- 1.2 ms
AMEMCPY: Mean +- std dev: 96.0 ms +- 2.0 ms
AMEMCPY: Mean +- std dev: 96.4 ms +- 1.5 ms
LOCKWR:  Mean +- std dev: 97.7 ms +- 1.5 ms
LOCKWR:  Mean +- std dev: 96.8 ms +- 2.5 ms
AMCAGG:  Mean +- std dev: 95.6 ms +- 0.5 ms

### scimark_sparse_mat_mult ###
WITHGIL: Mean +- std dev: 2.89 ms +- 0.11 ms
OLD:     Mean +- std dev: 3.94 ms +- 0.15 ms
SLOW:    Mean +- std dev: 4.68 ms +- 0.07 ms
ITEMFIX: Mean +- std dev: 4.34 ms +- 0.07 ms
NOLOCK:  Mean +- std dev: 3.94 ms +- 0.58 ms
TEST1:   Mean +- std dev: 3.99 ms +- 0.12 ms
TEST2:   Mean +- std dev: 3.80 ms +- 0.12 ms
LFREERD: Mean +- std dev: 3.84 ms +- 0.17 ms
LFREERD: Mean +- std dev: 3.83 ms +- 0.15 ms
LFREERD: Mean +- std dev: 3.94 ms +- 0.12 ms
LFREERD: Mean +- std dev: 3.74 ms +- 0.12 ms
LFREERD: Mean +- std dev: 3.83 ms +- 0.19 ms
LFREERW: Mean +- std dev: 3.84 ms +- 0.19 ms
LFREERW: Mean +- std dev: 3.87 ms +- 0.17 ms
STATICL: Mean +- std dev: 3.37 ms +- 0.08 ms
STATICL: Mean +- std dev: 3.49 ms +- 0.12 ms
CLEANUP: Mean +- std dev: 3.45 ms +- 0.14 ms
AMEMCPY: Mean +- std dev: 3.44 ms +- 0.13 ms
AMEMCPY: Mean +- std dev: 3.46 ms +- 0.14 ms
LOCKWR:  Mean +- std dev: 3.52 ms +- 0.47 ms
LOCKWR:  Mean +- std dev: 3.44 ms +- 0.10 ms
AMCAGG:  Mean +- std dev: 3.38 ms +- 0.09 ms

@tom-pytel (Contributor, Author):

ping @colesbury

@ZeroIntensity (Member) left a comment:

Nice! Disclaimer: I'm not an expert on the FT list implementation, so take some of my comments with a grain of salt.

Seeing good single-threaded performance is nice, but what about multi-threaded scaling? The number of locks still here scares me a little; it would be nice if this scaled well for concurrent use as well, especially for operations that don't require concurrent writes (e.g., comparisons and copies).

@tom-pytel (Contributor, Author):

Note: this is not ready to go; the memory issue still needs resolving.

@tom-pytel (Contributor, Author):

@ZeroIntensity you can remove the do-not-merge label; it's not an arraymodule issue, it's a QSBR issue, see #130794.

@tom-pytel (Contributor, Author):

The main thing needed here for acceptance is a benchmark run, which I am not able to start (I only did a local pyperformance check against main), so someone with access will have to initiate it to compare against main.

@colesbury self-requested a review, March 4, 2025 16:22
@colesbury (Contributor) left a comment:

I haven't gotten a chance to look through arraymodule.c yet. I'll review that later this week.

@colesbury (Contributor) left a comment:

The overall approach here seems good. A few comments below.

@tom-pytel (Contributor, Author) commented Mar 5, 2025

The actual arraymodule cleanup is fine; just one thing you should pay attention to when looking at this:

data->items is not a dereference but pointer arithmetic, so it is safe to use even when data is NULL in memcpy calls and _PyBytes_Repeat as long as the associated size is 0. But there are places where this might not be safe, which are accommodated (a guard sketch follows below), specifically:

  • array_array_buffer_info_impl
  • array_array_tobytes_impl
  • array_array_fromunicode_impl
  • array_array_tounicode_impl
  • array_buffer_getbuf_lock_held

Are there any other places where this needs to be handled?
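
To illustrate the kind of accommodation meant above (a sketch; self->data and ob_descr->itemsize as used here are assumptions, not necessarily the PR's exact layout), a function like array_array_tobytes_impl takes an explicit empty-array path instead of handing PyBytes a pointer derived from NULL:

static PyObject *
tobytes_sketch(arrayobject *self)
{
    /* callers hold the object lock, so a plain load of self->data is fine here */
    arraydata *data = self->data;
    if (data == NULL || Py_SIZE(self) == 0) {
        /* empty array: never touch data->items at all */
        return PyBytes_FromStringAndSize(NULL, 0);
    }
    return PyBytes_FromStringAndSize(data->items,
                                     Py_SIZE(self) * self->ob_descr->itemsize);
}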

What's problematic is the test and trying to run it with --parallel-threads under TSAN:

  • warnings doesn't play nice with --parallel-threads; I found a least-evil solution, but it's not wonderful. This still gets "env changed" but at least it can run. Is there a way to check for parallel-threads other than this?
    def setUp(self):
        if (not support.Py_GIL_DISABLED or
                any('"parallel_threads": null' in a for a in sys.argv) or
                all('parallel_threads' not in a for a in sys.argv)):
            self.enterContext(warnings.catch_warnings())
  • --parallel-threads doesn't play well with the threaded FreeThreadingTest; I sometimes get dangling threads.

  • ./Include/cpython/pyatomic_gcc.h:615:3: warning: ‘atomic_thread_fence’ is not supported with ‘-fsanitize=thread’ [-Wtsan]
    Do we care?

  • And worst of all: I cleaned up the data races in arraymodule (as far as I can see), and I used to be able to run tests with suppressions set without problems, but with test_array I get:

$ ./python -m test --parallel-threads=4 -j4 test_array
Using random seed: 2354862211
0:00:00 load avg: 0.55 Run 1 test in parallel using 1 worker process
0:00:30 load avg: 0.80 running (1): test_array (30.1 sec)
0:00:41 load avg: 0.75 [1/1/1] test_array worker non-zero exit code (Exit code 66)
==================

WARNING: ThreadSanitizer: data race (pid=1075807)
  Read of size 8 at 0x7f58e22c0098 by thread T4:
    #0 _Py_TYPE Include/object.h:268 (python+0x22a21a)
    #1 Py_IS_TYPE Include/object.h:291 (python+0x22a21a)
    #2 compare_unicode_unicode_threadsafe Objects/dictobject.c:1413 (python+0x22a21a)
    #3 do_lookup Objects/dictobject.c:1010 (python+0x22bb38)
    #4 unicodekeys_lookup_unicode_threadsafe Objects/dictobject.c:1438 (python+0x22bd9b)
    #5 _Py_dict_lookup_threadsafe Objects/dictobject.c:1493 (python+0x232e50)
    #6 _PyDict_GetItemRef_KnownHash Objects/dictobject.c:2342 (python+0x233f39)
    #7 PyDict_GetItemRef Objects/dictobject.c:2378 (python+0x234070)
    #8 cache_struct_converter Modules/_struct.c:2524 (_struct.cpython-314td-x86_64-linux-gnu.so+0xc916)
    #9 calcsize Modules/clinic/_struct.c.h:249 (_struct.cpython-314td-x86_64-linux-gnu.so+0xceb0)
    #10 cfunction_vectorcall_O Objects/methodobject.c:523 (python+0x25961c)
    #11 _PyObject_VectorcallTstate Include/internal/pycore_call.h:167 (python+0x1896b3)
    #12 PyObject_Vectorcall Objects/call.c:327 (python+0x189810)
    #13 _PyEval_EvalFrameDefault Python/generated_cases.c.h:1371 (python+0x42310d)
    ...

  Previous write of size 8 at 0x7f58e22c0098 by thread T1:
    #0 memset ../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:799 (libtsan.so.0+0x614cb)
    #1 memset ../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:797 (libtsan.so.0+0x614cb)
    #2 memset /usr/include/x86_64-linux-gnu/bits/string_fortified.h:59 (python+0x274169)
    #3 fill_mem_debug Objects/obmalloc.c:2739 (python+0x274169)
    #4 _PyMem_DebugRawAlloc Objects/obmalloc.c:2823 (python+0x27429d)
    #5 _PyMem_DebugRawMalloc Objects/obmalloc.c:2839 (python+0x2742d6)
    #6 _PyMem_DebugMalloc Objects/obmalloc.c:3004 (python+0x274331)
    #7 PyObject_Malloc Objects/obmalloc.c:1412 (python+0x296e92)
    #8 PyUnicode_New Objects/unicodeobject.c:1407 (python+0x31fd7d)
    #9 PyUnicode_Concat Objects/unicodeobject.c:11647 (python+0x335e53)
    #10 PyNumber_Add Objects/abstract.c:1135 (python+0x14e11c)
    #11 _PyEval_EvalFrameDefault Python/generated_cases.c.h:62 (python+0x41b60b)
    ...

WARNING: ThreadSanitizer: data race (pid=1079529)
  Write of size 8 at 0x7f0050cbd0f0 by thread T17:
    #0 descr_get_qualname Objects/descrobject.c:624 (python+0x1ac687)
    #1 getset_get Objects/descrobject.c:193 (python+0x1aaf41)
    #2 _PyObject_GenericGetAttrWithDict Objects/object.c:1694 (python+0x268a4e)
    ...

  Previous write of size 8 at 0x7f0050cbd0f0 by thread T20:
    #0 descr_get_qualname Objects/descrobject.c:624 (python+0x1ac687)
    #1 getset_get Objects/descrobject.c:193 (python+0x1aaf41)
    #2 _PyObject_GenericGetAttrWithDict Objects/object.c:1694 (python+0x268a4e)
    ...

Those are not arraymodule issues and are not suppressed, so as of now it's not runnable cleanly under TSAN (at least on my system, the way I am running it).

I left the bad setUp() in test_array for now so you can see it, but it will be removed (or fixed).

@colesbury (Contributor) commented Mar 5, 2025

I'd like test_array added to TSAN_TESTS, not TSAN_PARALLEL_TESTS.

warnings doesn't play nice with --parallel-threads

Yes, warnings is not thread-safe (even with the GIL). Neil is working on making it thread-safe, but for now we shouldn't add the test to TSAN_PARALLEL_TESTS. It's not worth trying to work around the warnings stuff.

@tom-pytel (Contributor, Author) commented Mar 11, 2025

TL;DR:

If you are fine with a data race on resize (because the memcpy from the current array buffer to the new array buffer is not atomic), then this PR is ready to go (or at least ready to review and benchmark properly). If you want to eliminate the race on resize, then it's either atomic memcpy or locked writes.

With respect to this PR:

I don't want to deal with a complicated atomic memcpy at this point.

That's fine, I only added it because I was under the impression you wanted to eliminate ALL data races. Removed.

The main thread-safety benefit here is related to resizing the array concurrently with other operations.

Data race on the memcpy or not, I am fairly confident the resize here happens correctly: the new arraydata is stored, once ready after a resize, with release ordering and loaded for use with acquire ordering in both the individual-element lock-free get and set (a sketch of this publish pattern follows the list below). All other accesses for other operations are locked, like resize is. I've also been stress-testing this constantly in the background and haven't run into anything so far, for what that's worth. To be concise, the following operations are lock-free; all others are locked:

  • array_item
  • array_ass_item - when it's an assignment and not a delete
  • array_subscr - when it's a single-element access and not a slice
  • array_ass_subscr - when it's a single-element replacing assignment and not a delete or slice
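
As a sketch of that publish pattern (field names are again assumptions; the real code also has to defer freeing the old buffer through QSBR so a lock-free reader that already acquired the old pointer stays valid):

static int
array_resize_sketch(arrayobject *self, Py_ssize_t newsize)
{
    /* callers hold the per-object lock, so only one resize runs at a time */
    Py_ssize_t itemsize = self->ob_descr->itemsize;
    arraydata *olddata = self->data;
    arraydata *newdata = PyMem_Malloc(sizeof(arraydata) + newsize * itemsize);
    if (newdata == NULL) {
        PyErr_NoMemory();
        return -1;
    }
    newdata->allocated = newsize;
    Py_ssize_t ncopy = Py_MIN(Py_SIZE(self), newsize);
    if (olddata != NULL && ncopy > 0) {
        memcpy(newdata->items, olddata->items, ncopy * itemsize);   /* the racy copy discussed above */
    }

    /* publish: the release store pairs with the acquire loads in the lock-free get/set */
    FT_ATOMIC_STORE_PTR_RELEASE(self->data, newdata);
    FT_ATOMIC_STORE_SSIZE_RELAXED(((PyVarObject *)self)->ob_size, newsize);

    /* olddata must not be freed immediately: a concurrent reader may still hold it,
       so it goes through QSBR-deferred freeing (details omitted in this sketch) */
    return 0;
}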

I care a lot less about data races when reading and writing elements of the array. If we can avoid them with judicious use of relaxed loads and stores, then that's fine,

So, seeing as the element-wise atomic reads/writes as they are here don't appear to affect performance, I would say they are fine to leave in.

Other points:

We should be careful not to regress performance here.

EDIT: If you mean vs. the non-free-threading-safe version, updated scimark numbers are in the header of this PR.

I was surprised that the aggregate atomic memcpy had the same performance as vanilla memcpy (at least according to the quick timings I did). For reference, the atomic memcpy inner loop compiled to the asm below, and the code preceding it appears to align this block to a cache line:

.L733:
	mov	r11, QWORD PTR [rdi]
	add	rsi, 64	# dest,
	add	rdi, 64	# src,
	mov	QWORD PTR -64[rsi], r11
	mov	r8, QWORD PTR -56[rdi]
	mov	QWORD PTR -56[rsi], r8
	mov	r11, QWORD PTR -48[rdi]
	mov	QWORD PTR -48[rsi], r11
	mov	r8, QWORD PTR -40[rdi]
	mov	QWORD PTR -40[rsi], r8
	mov	r11, QWORD PTR -32[rdi]
	mov	QWORD PTR -32[rsi], r11
	mov	r8, QWORD PTR -24[rdi]
	mov	QWORD PTR -24[rsi], r8
	mov	r11, QWORD PTR -16[rdi]
	mov	QWORD PTR -16[rsi], r11
	mov	r8, QWORD PTR -8[rdi]
	mov	QWORD PTR -8[rsi], r8
	cmp	r10, rsi
	ja	.L733

So maybe performance is not a worry there when/if we decide to do atomic memory ops eventually.
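
For reference, a standalone C11 sketch of the general shape of such a word-wise atomic copy (an assumption of the form, not the code that was removed from this PR; it presumes 8-byte-aligned buffers):

#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy 8 bytes at a time with relaxed atomic loads/stores so a concurrent
   lock-free reader never observes a torn 64-bit word. With relaxed ordering the
   compiler is free to unroll this into plain mov pairs, as in the asm above. */
static void
atomic_word_memcpy(void *dst, void *src, size_t nbytes)
{
    _Atomic uint64_t *d = dst;
    _Atomic uint64_t *s = src;
    size_t nwords = nbytes / sizeof(uint64_t);
    for (size_t i = 0; i < nwords; i++) {
        uint64_t v = atomic_load_explicit(&s[i], memory_order_relaxed);
        atomic_store_explicit(&d[i], v, memory_order_relaxed);
    }
    /* any tail bytes are copied non-atomically */
    memcpy((char *)dst + nwords * 8, (char *)src + nwords * 8, nbytes % 8);
}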

Currently, only a small subset of tests run under TSAN, but I want to get to the point where the entire test suite runs under TSAN without warnings. If necessary, we can split up the newly added tests and skip the ones under TSAN that would issue warnings.

Regarding the tests added here: the big blob of tests in test_free_threading is for data integrity, not data races. The data-race tests added in test_free_threading_tsan are only the ones that I know to be safe from races (note the ones commented out after removing the atomic memcpy). Did you want to add more that may race and separate them out?

@tom-pytel (Contributor, Author):

In the meantime, is there any particular test or module that needs attention for your clean-TSAN project, or just everything?

@colesbury (Contributor):

I think ctypes may be the most important right now. See:

https://github.com/python/cpython/issues?q=is%3Aissue%20state%3Aopen%20ctypes%20label%3Atopic-free-threading

I think @ZeroIntensity started working on it, but I don't think he's working on it right now.

I would really like to fix:

  • Races when using unrelated ctypes data (e.g., non thread-safe global state)
  • Races when using ctypes data in a read-only manner (e.g., constructing multiple instances of the same type concurrently)

I am less concerned about concurrent modifications to the same ctypes instances. I don't really want to deal with that for now.

@ZeroIntensity (Member):

Yeah, sorry, that's on my list of things to do. I got burnt out a little while ago and haven't thought of picking it up again. I want to do a couple of things with #129824, and then I'll get back to fixing ctypes in the next week or two.

@colesbury (Contributor) left a comment:

Most of the changes look good, but I am still pretty unsure about changing the code that loads and stores items to use FT_ATOMIC_ macros. I think we may be opening a can of worms with all of the different data types.

I think we may want to stick with normal pointer loads and stores. If someone concurrently modifies the same array from multiple threads, that will be a data race and UB, but the mixing of data types here is also technically UB.

@colesbury (Contributor):

I'm sorry for the conflicting advice I've given. It's hard to know how we should pursue this without seeing the actual code changes.

@tom-pytel (Contributor, Author) commented Mar 15, 2025

Most of the changes look good, but I am still pretty unsure about changing the code that loads and stores items to use FT_ATOMIC_ macros. I think we may be opening a can of worms with all of the different data types.

Have a look at the possible memoryview format codes, and there is also numpy. They are doable, though.

I think we may want to stick with normal pointer loads and stores. If someone concurrently modifies the same array from multiple threads, that will be a data race and UB, but the mixing of data types here is also technically UB.

IMHO you are overly worried about the phrase "undefined behavior", which in practice here really just means "undefined (stale) value"; otherwise probably no modern system would boot. Also, I have never had a problem writing an int to an unsigned int * location or vice versa, though the warnings are annoying, yes.

In any case, I changed back to non-atomic get/set and also removed the TSAN-specific test.

@tom-pytel (Contributor, Author):

Small detail: do you want to leave test_array in TSAN_TESTS or remove it?

@colesbury (Contributor) left a comment:

Mostly minor comments below, but I think the alignment of items is important.

@tom-pytel (Contributor, Author):

Tools/c-analyzer/check-c-globals.py may be a little unstable. It complained about:

#ifdef _MSC_VER

typedef __declspec(align(8)) struct {
    Py_ssize_t allocated;
    char items[];
} arraydata;

#else

typedef struct {
    Py_ssize_t allocated;
    _Alignas(8) char items[];
} arraydata;

#endif

And:

#ifdef _MSC_VER
#define STRUCT_ALIGN __declspec(align(8))
#define ATTR_ALIGN
#else
#define STRUCT_ALIGN
#define ATTR_ALIGN _Alignas(8)
#endif

typedef STRUCT_ALIGN struct {
    Py_ssize_t allocated;
    ATTR_ALIGN char items[];
} arraydata;

for arraymodule.c, leading to what is there now.

@colesbury (Contributor):

Use something like: https://gist.github.com/colesbury/96f27e2ddf6b151adeeb4c28ed7554d8

The alignment specifier has to be on char items[] because we are putting things that are not char there. It's not useful for it to be on the arraydata struct.

@tom-pytel (Contributor, Author):

Just a minor nit: "MS_WINDOWS" or "_MSC_VER"? The latter is used in the codebase for MSVC-specific directives, and you can run gcc under Windows.

@colesbury (Contributor):

pyport.h uses _MSC_VER to define _Py_thread_local as __declspec(thread), so I guess _MSC_VER.
