summaryrefslogtreecommitdiff
path: root/thread.c
AgeCommit message (Collapse)Author
2023-10-13fix `native_thread_destroy()` timingKoichi Sasada
With M:N thread scheduler, the native thread (NT) related resources should be freed when the NT is no longer needed. So the calling `native_thread_destroy()` at the end of `is will be freed when `thread_cleanup_func()` (at the end of Ruby thread) is not correct timing. Call it when the corresponding Ruby thread is collected.
2023-10-12M:N thread scheduler for RactorsKoichi Sasada
This patch introduce M:N thread scheduler for Ractor system. In general, M:N thread scheduler employs N native threads (OS threads) to manage M user-level threads (Ruby threads in this case). On the Ruby interpreter, 1 native thread is provided for 1 Ractor and all Ruby threads are managed by the native thread. From Ruby 1.9, the interpreter uses 1:1 thread scheduler which means 1 Ruby thread has 1 native thread. M:N scheduler change this strategy. Because of compatibility issue (and stableness issue of the implementation) main Ractor doesn't use M:N scheduler on default. On the other words, threads on the main Ractor will be managed with 1:1 thread scheduler. There are additional settings by environment variables: `RUBY_MN_THREADS=1` enables M:N thread scheduler on the main ractor. Note that non-main ractors use the M:N scheduler without this configuration. With this configuration, single ractor applications run threads on M:1 thread scheduler (green threads, user-level threads). `RUBY_MAX_CPU=n` specifies maximum number of native threads for M:N scheduler (default: 8). This patch will be reverted soon if non-easy issues are found. [Bug #19842]
2023-09-07Optimize handle_interrupt(Exception => ..) as a common caseMatthew Draper
When interrupt behavior is configured for all possible exceptions using 'Exception', there's no need to iterate the pending exception's ancestors for hash lookups. More significantly, by storing the catch-all timing symbol directly in the mask stack, we can skip allocating the hash we would otherwise need. Notes: Merged: https://github.com/ruby/ruby/pull/8278
2023-09-07Skip allocation if handle_interrupt arg is already usableMatthew Draper
If the supplied hash is already frozen and compare-by-identity, we can use it directly (still checking its contents are valid symbols), without making a new copy. Notes: Merged: https://github.com/ruby/ruby/pull/8278
2023-07-13Remove RARRAY_CONST_PTR_TRANSIENTPeter Zhu
RARRAY_CONST_PTR now does the same things as RARRAY_CONST_PTR_TRANSIENT. Notes: Merged: https://github.com/ruby/ruby/pull/8071
2023-06-30Compile disabled code for thread cache alwaysNobuyoshi Nakada
2023-06-30Don't check for null pointer in calls to freePeter Zhu
According to the C99 specification section 7.20.3.2 paragraph 2: > If ptr is a null pointer, no action occurs. So we do not need to check that the pointer is a null pointer. Notes: Merged: https://github.com/ruby/ruby/pull/8004
2023-06-03Fix `Thread#join(timeout)` when running inside the fiber scheduler. (#7903)Samuel Williams
Notes: Merged-By: ioquatix <samuel@codeotaku.com>
2023-06-01Use a real Ruby mutex in rb_io_close_wait_list (#7884)KJ Tsanaktsidis
Because a thread calling IO#close now blocks in a native condvar wait, it's possible for there to be _no_ threads left to actually handle incoming signals/ubf calls/etc. This manifested as failing tests on Solaris 10 (SPARC), because: * One thread called IO#close, which sent a SIGVTALRM to the other thread to interrupt it, and then waited on the condvar to be notified that the reading thread was done. * One thread was calling IO#read, but it hadn't yet reached the actual call to select(2) when the SIGVTALRM arrived, so it never unblocked itself. This results in a deadlock. The fix is to use a real Ruby mutex for the close lock; that way, the closing thread goes into sigwait-sleep and can keep trying to interrupt the select(2) thread. See the discussion in: https://github.com/ruby/ruby/pull/7865/ Notes: Merged-By: ioquatix <samuel@codeotaku.com>
2023-05-26* expand tabs. [ci skip]git
Please consider using misc/expand_tabs.rb as a pre-commit hook.
2023-05-26Fix busy-loop when waiting for file descriptors to closeKJ Tsanaktsidis
When one thread is closing a file descriptor whilst another thread is concurrently reading it, we need to wait for the reading thread to be done with it to prevent a potential EBADF (or, worse, file descriptor reuse). At the moment, that is done by keeping a list of threads still using the file descriptor in io_close_fptr. It then continually calls rb_thread_schedule() in fptr_finalize_flush until said list is empty. That busy-looping seems to behave rather poorly on some OS's, particulary FreeBSD. It can cause the TestIO#test_race_gets_and_close test to fail (even with its very long 200 second timeout) because the closing thread starves out the using thread. To fix that, I introduce the concept of struct rb_io_close_wait_list; a list of threads still using a file descriptor that we want to close. We call `rb_notify_fd_close` to let the thread scheduler know we're closing a FD, which fills the list with threads. Then, we call rb_notify_fd_close_wait which will block the thread until all of the still-using threads are done. This is implemented with a condition variable sleep, so no busy-looping is required. Notes: Merged: https://github.com/ruby/ruby/pull/7865
2023-05-26Fix a potential busy-loop in the thread scheduler (esp. on FreeBSD)KJ Tsanaktsidis
This patch fixes a potential busy-loop in the thread scheduler. If there are two threads, the main thread (where Ruby signal handlers must run) and a sleeping thread, it is possible for the following sequence of events to occur: * The sleeping thread is in native_sleep -> sigwait_sleep A signal * arives, kicking this thread out of rb_sigwait_sleep The sleeping * thread calls THREAD_BLOCKING_END and eventually thread_sched_to_running_common * the sleeping thread writes into the sigwait_fd pipe by calling rb_thread_wakeup_timer_thread * the sleeping thread re-loops around in native_sleep() because the desired sleep time has not actually yet expired * that calls rb_sigwait_sleep again the ppoll() in rb_sigwait_sleep * immediately returns because of the byte written into the sigwait_fd by rb_thread_wakeup_timer_thread * that wakes the thread up again and kicks the whole cycle off again. Such a loop can only be broken by the main thread waking up and handling the signal, such that ubf_threads_empty() below becomes true again; however this loop can actually keep things so busy (and cause so much contention on the main thread's interrupt_lock) that the main thread doesn't deal with the signal for many seconds. This seems particuarly likely on FreeBSD 13. (the cycle can also be broken by the sleeping thread finally elapsing its desired sleep time). The fix for _this_ loop is to only wakeup the timer thrad in thread_sched_to_running_common if the current thread is not itself the sigwait thread. An almost identical loop also happens in the same circumstances because the call to check_signals_nogvl (through sigwait_timeout) in rb_sigwait_sleep returns true if there is any pending signal for the main thread to handle. That then causes rb_sigwait_sleep to skip over sleeping entirely. This is unnescessary and counterproductive, I believe; if the main thread needs to be woken up that is done inline in check_signals_nogvl anyway. See https://bugs.ruby-lang.org/issues/19680 Notes: Merged: https://github.com/ruby/ruby/pull/7864
2023-05-18Add Fiber#kill, similar to Thread#kill. (#7823)Samuel Williams
Notes: Merged-By: ioquatix <samuel@codeotaku.com>
2023-05-15Remove explicit SIGCHLD handling. (#7816)Samuel Williams
* Remove unused SIGCHLD handling. * Remove unused `init_sigchld`. * Remove unnecessary `#define RUBY_SIGCHLD (0)`. * Remove unused `SIGCHLD_LOSSY`. Notes: Merged-By: ioquatix <samuel@codeotaku.com>
2023-04-04fix deadlock on `Thread#join`Koichi Sasada
because of 9720f5ac894566ade2aabcf9adea0a3235de1353 http://rubyci.s3.amazonaws.com/solaris11-sunc/ruby-master/log/20230403T130011Z.fail.html.gz ``` 1) Failure: TestThread#test_signal_at_join [/export/home/chkbuild/chkbuild-sunc/tmp/build/20230403T130011Z/ruby/test/ruby/test_thread.rb:1488]: Exception raised: <#<fatal:"No live threads left. Deadlock?\n1 threads, 1 sleeps current:0x00891288 main thread:0x00891288\n* #<Thread:0xfef89a18 sleep_forever>\n rb_thread_t:0x00891288 native:0x00000001 int:0\n \n">> Backtrace: -:30:in `join' -:30:in `block (3 levels) in <main>' -:21:in `times' -:21:in `block (2 levels) in <main>'. ``` The mechanism: * Main thread (M) calls `Thread#join` * M: calls `sleep_forever()` * M: set `th->status = THREAD_STOPPED_FOREVER` * M: do `checkints` * M: handle a trap handler with `th->status = THREAD_RUNNABLE` * M: thread switch at the end of the trap handler * Another thread (T) will process `Thread#kill` by M. * T: `rb_threadptr_join_list_wakeup()` at the end of T tris to wakeup M, but M's state is runnable because M is handling trap handler and just ignore the waking up and terminate T$a * T: switch to M. * M: after the trap handler, reset `th->status = THREAD_STOPPED_FOREVER` and check deadlock -> Deadlock because only M is living. To avoid such situation, add new sleep flags `SLEEP_ALLOW_SPURIOUS` and `SLEEP_NO_CHECKINTS` to skip any check ints. BTW this is instentional to leave second `vm_check_ints_blocking()` without checking `SLEEP_NO_CHECKINTS` because `SLEEP_ALLOW_SPURIOUS` should be specified with `SLEEP_NO_CHECKINTS` and skipping this checkints can skip any interrupts. Notes: Merged: https://github.com/ruby/ruby/pull/7647
2023-04-01use `sleep_forever()` on `thread_join_sleep()`Koichi Sasada
because it does same thing. Notes: Merged: https://github.com/ruby/ruby/pull/7642
2023-03-31cosmetic changeKoichi Sasada
reorder `sleep_forever()` and so on.
2023-03-31pass `th` to `thread_sched_to_waiting()`Koichi Sasada
for future extension Notes: Merged: https://github.com/ruby/ruby/pull/7639
2023-03-31remove "\n" for `RUBY_DEBUG_LOG()`Koichi Sasada
because `RUBY_DEBUG_LOG()` add "\n" at the end of message.
2023-03-30`rb_ractor_thread_list()` only for current ractorKoichi Sasada
so that no need to lock the ractor. Notes: Merged: https://github.com/ruby/ruby/pull/7616
2023-03-30cosmetic changeKoichi Sasada
Notes: Merged: https://github.com/ruby/ruby/pull/7618
2023-03-15Rename RB_GC_SAVE_MACHINE_CONTEXT -> RB_VM_SAVE_MACHINE_CONTEXTMatt Valentine-House
Notes: Merged: https://github.com/ruby/ruby/pull/7465
2023-03-15Remove SIGCHLD `waidpid`. (#7527)Samuel Williams
* Remove `waitpid_lock` and related code. * Remove un-necessary test. * Remove `rb_thread_sleep_interruptible` dead code. Notes: Merged-By: ioquatix <samuel@codeotaku.com>
2023-03-14Revert SIGCHLD changes to diagnose CI failures. (#7517)Samuel Williams
* Revert "Remove special handling of `SIGCHLD`. (#7482)" This reverts commit 44a0711eab7fbc71ac2c8ff489d8c53e97a8fe75. * Revert "Remove prototypes for functions that are no longer used. (#7497)" This reverts commit 4dce12bead3bfd91fd80b5e7195f7f540ffffacb. * Revert "Remove SIGCHLD `waidpid`. (#7476)" This reverts commit 1658e7d96696a656d9bd0a0c84c82cde86914ba2. * Fix change to rjit variable name. Notes: Merged-By: ioquatix <samuel@codeotaku.com>
2023-03-09Remove SIGCHLD `waidpid`. (#7476)Samuel Williams
* Remove `waitpid_lock` and related code. * Remove un-necessary test. * Remove `rb_thread_sleep_interruptible` dead code. Notes: Merged-By: ioquatix <samuel@codeotaku.com>
2023-03-07Remove obsoleted functions in rjit.cTakashi Kokubun
2023-03-07Get rid of MJIT's special forkTakashi Kokubun
2023-03-06s/mjit/rjit/Takashi Kokubun
Notes: Merged: https://github.com/ruby/ruby/pull/7462
2023-03-06s/MJIT/RJIT/Takashi Kokubun
Notes: Merged: https://github.com/ruby/ruby/pull/7462
2023-03-07Correctly clean up `keeping_mutexes` before resuming any other threads. (#7460)Samuel Williams
It's possible (but very rare) to have a race condition between setting `mutex->fiber = NULL` and `thread_mutex_remove(th, mutex)` which results in the following bug: ``` [BUG] invalid keeping_mutexes: Attempt to unlock a mutex which is not locked ``` Fixes <https://bugs.ruby-lang.org/issues/19480>. Notes: Merged-By: ioquatix <samuel@codeotaku.com>
2023-03-06Stop exporting symbols for MJITTakashi Kokubun
Notes: Merged: https://github.com/ruby/ruby/pull/7459
2023-03-06TestThreadInstrumentation: emit the EXIT event soonerJean Boussier
``` 1) Failure: TestThreadInstrumentation#test_thread_instrumentation [/tmp/ruby/src/trunk-repeat20-asserts/test/-ext-/thread/test_instrumentation_api.rb:33]: Call counters[4]: [3, 4, 4, 4, 0]. Expected 0 to be > 0. ``` We fire the EXIT hook after the call to `thread_sched_to_dead` which mean another thread might be running before the `EXIT` hook have been executed. Notes: Merged: https://github.com/ruby/ruby/pull/7249
2023-02-09Merge gc.h and internal/gc.hMatt Valentine-House
[Feature #19425] Notes: Merged: https://github.com/ruby/ruby/pull/7273
2023-02-08Only emit circular dependency warning for owned thread shieldsJean byroot Boussier
[Bug #19415] If multiple threads attemps to load the same file concurrently it's not a circular dependency issue. So we check that the existing ThreadShield is owner by the current fiber before warning about circular dependencies. Notes: Merged: https://github.com/ruby/ruby/pull/7257
2023-02-06Revert "Only emit circular dependency warning for owned thread shields"Jean byroot Boussier
This reverts commit fa49651e05a06512e18ccb2f54a7198c9ff579de. Notes: Merged: https://github.com/ruby/ruby/pull/7256
2023-02-06Only emit circular dependency warning for owned thread shieldsJean Boussier
[Bug #19415] If multiple threads attemps to load the same file concurrently it's not a circular dependency issue. So we check that the existing ThreadShield is owner by the current fiber before warning about circular dependencies. Notes: Merged: https://github.com/ruby/ruby/pull/7252
2023-01-20Remove unused struct member thgroup->groupMatt Valentine-House
Notes: Merged: https://github.com/ruby/ruby/pull/7162
2022-12-01Introduce `Fiber#storage` for inheritable fiber-scoped variables. (#6612)Samuel Williams
Notes: Merged-By: ioquatix <samuel@codeotaku.com>
2022-11-16Using UNDEF_P macroS-H-GAMELINKS
Notes: Merged: https://github.com/ruby/ruby/pull/6721
2022-11-09Make pending_interrupt?(Exception) workYusuke Endoh
A patch from katsu (Katsuhiro Ueno) [Bug #19110] Notes: Merged: https://github.com/ruby/ruby/pull/6689
2022-10-20Avoid missed wakeup with fiber scheduler and Fiber.blocking. (#6588)Samuel Williams
* Ensure that blocked fibers don't prevent valid wakeups. Notes: Merged-By: ioquatix <samuel@codeotaku.com>
2022-10-07Add IO#timeout attribute and use it for blocking IO operations. (#5653)Samuel Williams
Notes: Merged-By: ioquatix <samuel@codeotaku.com>
2022-09-11MJIT: Do not hang after forking with threadsTakashi Kokubun
First, rb_mjit_fork should call rb_thread_atfork to stop threads after fork in the child process. Unfortunately, we cannot use rb_fork_ruby to prevent this kind of mistakes because MJIT needs special handling of waiting_pid and mjit_pause/resume. Second, mjit_waitpid_finished should be checked regardless of trap_interrupt. It doesn't seem like the flag is not set when SIGCHLD is handled for an MJIT child process.
2022-09-07Exit status macros need sys/wait.h on FreeBSDNobuyoshi Nakada
2022-09-06Do not fork the process on --mjit-waitTakashi Kokubun
fork is for parallel compilation, but --mjit-wait cancels it. It's more useful to not fork it for binding.irb, debugging, etc.
2022-08-06Allow `RUBY_DEBUG_LOG` format to be emptyNobuyoshi Nakada
GCC warns of empty format strings, perhaps because they have no effects in printf() and there are better ways than sprintf(). However, ruby_debug_log() adds informations other than the format, this warning is not the case.
2022-08-02Implement Queue#pop(timeout: sec)Jean Boussier
[Feature #18774] As well as `SizedQueue#pop(timeout: sec)` If both `non_block=true` and `timeout:` are supplied, ArgumentError is raised. Notes: Merged: https://github.com/ruby/ruby/pull/6185
2022-07-26Rename rb_ary_tmp_new to rb_ary_hidden_newPeter Zhu
rb_ary_tmp_new suggests that the array is temporary in some way, but that's not true, it just creates an array that's hidden and not on the transient heap. This commit renames it to rb_ary_hidden_new. Notes: Merged: https://github.com/ruby/ruby/pull/6180
2022-07-21Expand tabs [ci skip]Takashi Kokubun
[Misc #18891] Notes: Merged: https://github.com/ruby/ruby/pull/6094
2022-07-13GVL Instrumentation: remove the EXITED count assertionJean Boussier
It's very flaky for some unknown reason. Something we have an extra EXITED event. I suspect some other test is causing this. Notes: Merged: https://github.com/ruby/ruby/pull/6133