Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-126895: fix readline module in free-threaded build #131208

Merged
merged 7 commits into from
Mar 17, 2025

Conversation

tom-pytel
Copy link
Contributor

@tom-pytel tom-pytel commented Mar 13, 2025

This is nothing more than adding @critical_section to functions which use the readline library. It is not thread-safe (because readline is not thread-safe), but it doesn't crash like before.

The test test_free_threading_doctest_difflib is probably overkill, but included at first to show that it works. Can comment out or remove as desired.

Also, without test_free_threading_doctest_difflib (that uses stuff which is not free-thread safe), test_readline fails without crashing with --parallel-threads, where previously it crashed. Also without it doesn't complain when tested with TSAN.

Subinterpreters are not an issue because readline just doesn't load in them: module readline does not support loading in subinterpreters

@tom-pytel
Copy link
Contributor Author

ping @ZeroIntensity , @colesbury

Copy link
Member

@ZeroIntensity ZeroIntensity left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This covers free-threading, but after re-reading some of my analysis from back in November, I'm pretty sure it's the GNU readline library itself that's not thread-safe. Critical sections are per-interpreter, so this will continue to race in subinterpreters. I'd rather handle both cases in one fix.

That said, I'm not sure whether it's unsafe to call any readline function concurrently, or just the same function across multiple threads. I suspect its the former. In that case, let's go with a global lock around readline calls, but otherwise we can go with a local static PyMutex to serialize single calls.

@tom-pytel
Copy link
Contributor Author

tom-pytel commented Mar 14, 2025

Critical sections are per-interpreter, so this will continue to race in subinterpreters.

What I meant above is that readline just straight up doesn't import in a subinterpreter, it gives that error: module readline does not support loading in subinterpreters, so this is not an issue.

That said, I'm not sure whether it's unsafe to call any readline function concurrently,

You can call across threads and it won't crash because it is synchronized with GIL, you will just get wrong results. But if you called free-threaded it would crash, now it just behaves like not free-threaded and doesn't crash.

but otherwise we can go with a local static PyMutex to serialize single calls

Could do that, but is that safe if one call dies in an unexpected way and mutex is left locked?

@ZeroIntensity
Copy link
Member

  • Oh, I thought we got everything in the stdlib over to multi-phase init :(. It might be worth fixing that and going with global locks for 3.14, but this fix makes sense for 3.13.
  • Yes, but I'm trying to figure out the actual implications here. Can we call unrelated readline functions concurrently, or does all of readline need to be protected by a single lock?
  • Critical sections have the same issue. We shouldn't worry about threads randomly dying and leaving things locked.

@tom-pytel
Copy link
Contributor Author

tom-pytel commented Mar 14, 2025

  • Yes, but I'm trying to figure out the actual implications here. Can we call unrelated readline functions concurrently, or does all of readline need to be protected by a single lock?

The single lock on module is what I implemented here, so they are all mutually exclusive (since readline lib is specifically not thread-safe).

  • Critical sections have the same issue. We shouldn't worry about threads randomly dying and leaving things locked.

I was under the impression that critical sections, as opposed to pure PyMutex, are fairly deadlock proof? Don't they release after a period of time? And also: https://docs.python.org/3/c-api/init.html#python-critical-section-api -

"Critical sections avoid deadlocks by implicitly suspending active critical sections and releasing the locks during calls to PyEval_SaveThread(). When PyEval_RestoreThread() is called, the most recent critical section is resumed, and its locks reacquired. This means the critical section API provides weaker guarantees than traditional locks ..."

@ZeroIntensity
Copy link
Member

Critical sections are based off an object's PyMutex, it just implicitly releases that mutex when the thread state is detached (for example, when waiting on another lock), which prevents deadlocks. If the thread crashes, the lock is still held. (These details don't matter much here, anyway.)

@tom-pytel
Copy link
Contributor Author

So what's the verdict then wrt this, changes?

@ZeroIntensity
Copy link
Member

Let's just confirm that we really do need a global lock here; I don't want to drastically inhibit scaling if we don't have to.

@tom-pytel
Copy link
Contributor Author

tom-pytel commented Mar 14, 2025

Let's just confirm that we really do need a global lock here; I don't want to drastically inhibit scaling if we don't have to.

Short: Yes it needs to be global because readline is not thread-safe. But its only applied on things that actually call into readline.

@colesbury colesbury self-requested a review March 14, 2025 20:59
Copy link
Contributor

@colesbury colesbury left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding critical sections seems fine to me:

  • I don't think the new tests are worth it
  • There are a bunch of functions that don't have critical sections: readline.write_history_file, readline.append_history_file, readline.set_history_length, ...

@ZeroIntensity
Copy link
Member

I'm fine with critical sections too, I'm just worried about basically enabling the GIL for all usage of readline.

Yes it needs to be global because readline is not thread-safe. But its only applied on things that actually call into readline.

I think you're misinterpreting what I mean a little. "Thread-safety" is a blanket term; I mean that we might not need locks synchronizing the entire module. For example, it makes sense to me that write_history_file isn't thread-safe against read_history_file, but I see no reason that set_startup_hook couldn't run in another thread while those two are synchronized.

@colesbury
Copy link
Contributor

Adding a lock to the readline module is not like enabling the GIL. The GIL affects everything in the process, including things that don't touch the readline module. The readline lock only affects things that use readline. Using multiple threads to concurrently muck around with readline is not a real use case that we should optimize for. So far the only time I've seen anyone try to use readline from multiple threads at all is when fuzzing CPython.

...but I see no reason that set_startup_hook couldn't run in another thread while those two are synchronized...

In general, we should prefer the simplest approach to thread safety that is efficient. In this case, that means a single lock for the entire readline module.

@tom-pytel
Copy link
Contributor Author

read_history_file had critical section but I did miss write_history_file and append_history_file. Other points:

  • @critical_section module -> @critical_section.
  • Added atomics instead of critical sections to _history_length stuff.
  • Removed tests but left one single minimized one because there really should be at least one to check for crash?
  • readline_set_completion_display_matches_hook_impl has critical section because is setting rl_ global var which may be used inside locked readline functions. Anything that touches global rl_ stuff is critical sectioned (except setup / constructor stuff).
  • I didn't give stuff set_hook and set_startup_hook and the like critical sections because they just setting py vars, not readline stuff, maybe should have at least atomics, but does it matter?

One potential issue. call_readline() uses readline stuff but not locked here because is super low level and no clean way to access lock. Also not used here but exposed via PyOS_ReadlineFunctionPointer which is used by myreadline.c in Parser. Could store module in a global to use for lock or look it up everytime in sys.modules in this function, but is it worth it? Or can just count on caller to sync and that nothing else is using readline simultaneously?

@colesbury
Copy link
Contributor

  • set_hook should not have it's own critical section, but the callers should have a critical section
  • call_readline cannot (and should not) use a critical section. It's called without the GIL held in the default build.

@colesbury colesbury added the 🔨 test-with-refleak-buildbots Test PR w/ refleak buildbots; report in status section label Mar 17, 2025
@bedevere-bot
Copy link

🤖 New build scheduled with the buildbot fleet by @colesbury for commit 0dca261 🤖

Results will be shown at:

https://buildbot.python.org/all/#/grid?branch=refs%2Fpull%2F131208%2Fmerge

If you want to schedule another build, you need to add the 🔨 test-with-refleak-buildbots label again.

@bedevere-bot bedevere-bot removed the 🔨 test-with-refleak-buildbots Test PR w/ refleak buildbots; report in status section label Mar 17, 2025
@colesbury colesbury self-assigned this Mar 17, 2025
@colesbury colesbury removed the needs backport to 3.13 bugs and security fixes label Mar 17, 2025
@colesbury colesbury merged commit 863d54c into python:main Mar 17, 2025
59 checks passed
@colesbury
Copy link
Contributor

Thanks Tom!

I'm not going to backport this to 3.13. I think it's unlikely that anyone is going to try to use the readline module from multiple threads concurrently outside of fuzzing.

plashchynski pushed a commit to plashchynski/cpython that referenced this pull request Mar 17, 2025
…31208)

The underlying readline library is not thread-safe so this adds `@critical_section` to functions that use it.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants