gh-151518: Avoid STW starvation of attaching threads #152826
Draft
tpn wants to merge 1 commit into
Free-threaded stop-the-world pauses can otherwise starve a thread trying to reattach after it was suspended while detached. A tight manual gc.collect() loop can release and immediately request the next stop-the-world pause, repeatedly parking the detached thread before it can attach and make progress.

Add a distinct _Py_THREAD_SUSPENDED_DETACHED state for tstates parked from DETACHED. tstate_wait_attach() marks an attach waiter only after observing that detached-origin suspended state, and park_detached_threads() skips only those active waiters on later stop-the-world passes. The ordinary successful tstate_try_attach() path remains the baseline CAS-only path.

Teach the related stop-the-world paths about both suspended states, including start_the_world() and tstate_delete_common(). Keep the new wait flag after the existing hot free-threaded _PyThreadStateImpl fields so their offsets do not move.

Add a free-threaded GC regression test that runs a subprocess with a tight gc.collect() worker and verifies the main thread can reattach after sleeping and stop the worker.
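The state transitions described in the commit message can be sketched as a small single-threaded toy model. This is only an illustration under stated assumptions, not the C implementation: `ToyThread`, `wait_attach_step()`, `rounds_until_attached()`, and the `honor_waiters` switch are hypothetical names, the real code uses atomic operations on the thread state, and the point at which the wait flag is cleared is assumed.

```python
import enum
from dataclasses import dataclass


class State(enum.Enum):
    DETACHED = enum.auto()
    ATTACHED = enum.auto()
    SUSPENDED = enum.auto()            # parked while attached
    SUSPENDED_DETACHED = enum.auto()   # parked while detached (the new state)


@dataclass
class ToyThread:
    state: State = State.DETACHED
    attach_waiting: bool = False       # stands in for the new wait flag


def park_detached_threads(threads, honor_waiters=True):
    """One stop-the-world pass: park detached threads, skipping active waiters."""
    for t in threads:
        if t.state is not State.DETACHED:
            continue
        if honor_waiters and t.attach_waiting:
            continue                   # let the waiter finish attaching
        t.state = State.SUSPENDED_DETACHED


def start_the_world(threads):
    """End the pause: both suspended states go back to DETACHED."""
    for t in threads:
        if t.state in (State.SUSPENDED, State.SUSPENDED_DETACHED):
            t.state = State.DETACHED


def try_attach(t):
    """Fast path: a single DETACHED -> ATTACHED transition, no flag traffic."""
    if t.state is State.DETACHED:
        t.state = State.ATTACHED
        return True
    return False


def wait_attach_step(t):
    """Slow path: mark a waiter only after seeing the detached-origin state."""
    if try_attach(t):
        t.attach_waiting = False       # assumption: cleared once attached
        return True
    if t.state is State.SUSPENDED_DETACHED:
        t.attach_waiting = True
    return False


def rounds_until_attached(max_rounds, honor_waiters):
    """Adversarial schedule: every pause is followed at once by the next one."""
    t = ToyThread()
    threads = [t]
    park_detached_threads(threads, honor_waiters)      # first pause parks the sleeper
    for n in range(1, max_rounds + 1):
        wait_attach_step(t)                            # thread wakes while parked
        start_the_world(threads)                       # pause ends
        park_detached_threads(threads, honor_waiters)  # next pause starts immediately
        if wait_attach_step(t):
            return n
    return None                                        # starved for every round
```

In this model, `rounds_until_attached(5, honor_waiters=True)` returns `1`, while `rounds_until_attached(5, honor_waiters=False)` returns `None` because the thread is parked again on every pass.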
Fixes #151518.
Repeated stop-the-world requests can starve a thread that was detached when
an earlier request parked it. `start_the_world()` restores the tstate to
`DETACHED`, but the next requester can park it again before its OS thread
completes `_PyThreadState_Attach()`.

This change distinguishes tstates parked while detached from those suspended
while attached, and marks a waiter only when `tstate_wait_attach()` observes
that detached-origin state. Later STW passes leave only active attach waiters
unparked until they attach. Threads that are merely detached for sleep or I/O
remain immediately parkable, and the normal uncontended attach path remains a
single CAS. The waiter flag reuses existing `_PyThreadStateImpl` tail padding.

The regression test runs a tight `gc.collect()` loop in a subprocess and
verifies that another thread can return from `time.sleep()` and stop the
collector.
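A rough sketch of the scenario that the regression test exercises is shown below. It is not the test added by this PR: the child script, the helper name `run_stress_child()`, the 0.05 second sleep, and the `"ok"` marker are all assumptions made for illustration. The 30 second timeout mirrors the timeout reported in the validation notes.

```python
import subprocess
import sys
import textwrap

# Child script (hypothetical): one thread collects in a tight loop while the
# main thread sleeps, then tries to stop the collector.
CHILD = textwrap.dedent("""
    import gc
    import threading
    import time

    stop = threading.Event()

    def collector():
        while not stop.is_set():
            gc.collect()

    worker = threading.Thread(target=collector)
    worker.start()
    time.sleep(0.05)   # the main thread is detached while it sleeps
    stop.set()         # reached only if the main thread managed to reattach
    worker.join()
    print("ok")
""")


def run_stress_child(timeout=30.0):
    """Run the child in a subprocess; report whether it finished in time."""
    try:
        proc = subprocess.run(
            [sys.executable, "-c", CHILD],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return False   # the main thread never got to stop the collector
    return proc.returncode == 0 and proc.stdout.strip() == "ok"
```

Running the scenario in a subprocess means a starved main thread shows up as a timeout in the parent rather than a hang of the test runner itself.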
Validation:
- Base revision: `main` at `efcfb1a4e0f4`.
- The regression test timed out after 30 seconds in 2/3 runs on unpatched
  `main`; the patched CI-style debug build passed 5/5 in about 1.2 seconds
  per run.
- CI-style debug build
  (`--with-pydebug --enable-safety --enable-slower-safety --disable-gil`):
  `PYTHON_GIL=0 ./python -m test --fast-ci test_free_threading.test_gc test_threading`
  passed, 251 tests run and 5 skipped.
- Build configured with `--disable-gil`: the same focused test command
  passed, 251 tests run and 5 skipped.
- Both `_PyThreadStateImpl` layouts are 15,792 bytes, with all common offsets
  unchanged; `stw_attach_waiting` overlays `__padding` at offset 15,728.
- `make patchcheck` passed.

Earlier no-debug performance comparisons on 64-CPU AMD and 28-CPU Intel
systems found the no-thread and detached-sleeper paths roughly flat. In the
adversarial attached/churn cases, lower repeated-collector throughput was
paired with greater worker progress, which is the intended fairness effect.
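The layout claim in the validation notes (same struct size, unchanged common offsets, new flag placed in tail padding) can be illustrated with a small `ctypes` sketch. The two structs and their field names are hypothetical stand-ins and bear no relation to the real `_PyThreadStateImpl` fields or sizes; the sketch only shows why a one-byte flag added into trailing padding does not grow a struct or move earlier fields.

```python
import ctypes


class BeforeLayout(ctypes.Structure):
    # An 8-byte field followed by a 1-byte field leaves trailing padding.
    _fields_ = [
        ("hot_counter", ctypes.c_uint64),
        ("existing_flag", ctypes.c_uint8),
    ]


class AfterLayout(ctypes.Structure):
    # Same fields plus one extra byte that lands inside the old padding.
    _fields_ = [
        ("hot_counter", ctypes.c_uint64),
        ("existing_flag", ctypes.c_uint8),
        ("attach_waiting", ctypes.c_uint8),
    ]


def field_offsets(struct):
    """Map each field name of a ctypes Structure to its byte offset."""
    return {name: getattr(struct, name).offset for name, _ in struct._fields_}


before = field_offsets(BeforeLayout)
after = field_offsets(AfterLayout)

# Offsets of the fields shared by both layouts.
common_unchanged = all(after[name] == offset for name, offset in before.items())
same_size = ctypes.sizeof(BeforeLayout) == ctypes.sizeof(AfterLayout)
```

Here `common_unchanged` and `same_size` are both true, and the new field sits one byte after `existing_flag`, which is the same kind of check the size and offset figures above report for the real struct.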