diff options
author | KJ Tsanaktsidis <kj@kjtsanaktsidis.id.au> | 2023-05-19 09:12:10 +1000 |
---|---|---|
committer | Nobuyoshi Nakada <nobu@ruby-lang.org> | 2023-05-26 14:48:08 +0900 |
commit | 8e1abef46951c56ea4c9d164d33be5858e1e758d (patch) | |
tree | feed68e2fd8f53f17e4be619e2879a8aa0fa02af /thread.c | |
parent | 3dae55ea2f2e305305eea53186988c4f02e53338 (diff) |
Fix a potential busy-loop in the thread scheduler (esp. on FreeBSD)
This patch fixes a potential busy-loop in the thread scheduler. If there
are two threads, the main thread (where Ruby signal handlers must run)
and a sleeping thread, it is possible for the following sequence of
events to occur:
* The sleeping thread is in native_sleep -> sigwait_sleep A signal
* arives, kicking this thread out of rb_sigwait_sleep The sleeping
* thread calls THREAD_BLOCKING_END and eventually
thread_sched_to_running_common
* the sleeping thread writes into the sigwait_fd pipe by calling
rb_thread_wakeup_timer_thread
* the sleeping thread re-loops around in native_sleep() because
the desired sleep time has not actually yet expired
* that calls rb_sigwait_sleep again the ppoll() in rb_sigwait_sleep
* immediately returns because
of the byte written into the sigwait_fd by
rb_thread_wakeup_timer_thread
* that wakes the thread up again and kicks the whole cycle off again.
Such a loop can only be broken by the main thread waking up and handling
the signal, such that ubf_threads_empty() below becomes true again;
however this loop can actually keep things so busy (and cause so much
contention on the main thread's interrupt_lock) that the main thread
doesn't deal with the signal for many seconds. This seems particuarly
likely on FreeBSD 13.
(the cycle can also be broken by the sleeping thread finally elapsing
its desired sleep time).
The fix for _this_ loop is to only wakeup the timer thrad in
thread_sched_to_running_common if the current thread is not itself the
sigwait thread.
An almost identical loop also happens in the same circumstances because
the call to check_signals_nogvl (through sigwait_timeout) in
rb_sigwait_sleep returns true if there is any pending signal for the
main thread to handle. That then causes rb_sigwait_sleep to skip over
sleeping entirely.
This is unnescessary and counterproductive, I believe; if the main
thread needs to be woken up that is done inline in check_signals_nogvl
anyway.
See https://bugs.ruby-lang.org/issues/19680
Notes
Notes:
Merged: https://github.com/ruby/ruby/pull/7864
Diffstat (limited to 'thread.c')
-rw-r--r-- | thread.c | 1 |
1 files changed, 0 insertions, 1 deletions
@@ -4509,7 +4509,6 @@ check_signals_nogvl(rb_thread_t *th, int sigwait_fd) else { threadptr_trap_interrupt(vm->ractor.main_thread); } - ret = TRUE; /* for rb_sigwait_sleep */ } return ret; } |