ruby.git/internal/thread.h, branch v3_4_9

introduce `rb_ec_check_ints()`

2024-11-08T09:02:46+00:00

to avoid TLS issue with N:M threads.

`interrupt_exec`

2024-11-08T09:02:46+00:00

introduce
- rb_threadptr_interrupt_exec
- rb_ractor_interrupt_exec

to intercept the thread/ractor execution.

Ensure fiber scheduler is woken up when close interrupts read

2024-09-17T00:11:44+00:00

If one thread is reading and another closes that socket, the close
blocks waiting for the read to abort cleanly. This ensures that Ruby is
totally done with the file descriptor _BEFORE_ we tell the OS to close
and potentially re-use it.

When the read is correctly terminated, the close should be unblocked.
That currently works if closing is happening on a thread, but if it's
happening on a fiber with a fiber scheduler, it does NOT work.

This patch ensures that if the close happened in a fiber scheduled
thread, that the scheduler is notified that the fiber is unblocked.

[Bug #20723]

Proof of Concept: Allow to prevent fork from happening in known fork unsafe API

2024-09-05T09:43:46+00:00

[Feature #20590]

For better of for worse, fork(2) remain the primary provider of
parallelism in Ruby programs. Even though it's frowned uppon in
many circles, and a lot of literature will simply state that only
async-signal safe APIs are safe to use after `fork()`, in practice
most APIs work well as long as you are careful about not forking
while another thread is holding a pthread mutex.

One of the APIs that is known cause fork safety issues is `getaddrinfo`.
If you fork while another thread is inside `getaddrinfo`, a mutex
may be left locked in the child, with no way to unlock it.

I think we could reduce the impact of these problem by preventing
in for the most notorious and common cases, by locking around
`fork(2)` and known unsafe APIs with a read-write lock.

Do not `poll` first

2024-01-04T20:51:25+00:00

Before this patch, the MN scheduler waits for the IO with the
following steps:

1. `poll(fd, timeout=0)` to check fd is ready or not.
2. if fd is not ready, waits with MN thread scheduler
3. call `func` to issue the blocking I/O call

The advantage of advanced `poll()` is we can wait for the
IO ready for any fds. However `poll()` becomes overhead
for already ready fds.

This patch changes the steps like:

1. call `func` to issue the blocking I/O call
2. if the `func` returns `EWOULDBLOCK` the fd is `O_NONBLOCK`
   and we need to wait for fd is ready so that waits with MN
   thread scheduler.

In this case, we can wait only for `O_NONBLOCK` fds. Otherwise
it waits with blocking operations such as `read()` system call.
However we don't need to call `poll()` to check fd is ready
in advance.

With this patch we can observe performance improvement
on microbenchmark which repeats blocking I/O (not
`O_NONBLOCK` fd) with and without MN thread scheduler.

```ruby
require 'benchmark'

f = open('/dev/null', 'w')
f.sync = true

TN = 1
N = 1_000_000 / TN

Benchmark.bm{|x|
  x.report{
    TN.times.map{
      Thread.new{
        N.times{f.print '.'}
      }
    }.each(&:join)
  }
}
__END__
TN = 1
                 user     system      total        real
ruby32       0.393966   0.101122   0.495088 (  0.495235)
ruby33       0.493963   0.089521   0.583484 (  0.584091)
ruby33+MN    0.639333   0.200843   0.840176 (  0.840291) <- Slow
this+MN      0.512231   0.099091   0.611322 (  0.611074) <- Good
```

declare `rb_thread_io_blocking_call`

2023-12-19T22:00:41+00:00

M:N thread scheduler for Ractors

2023-10-12T05:47:01+00:00

This patch introduce M:N thread scheduler for Ractor system.

In general, M:N thread scheduler employs N native threads (OS threads)
to manage M user-level threads (Ruby threads in this case).
On the Ruby interpreter, 1 native thread is provided for 1 Ractor
and all Ruby threads are managed by the native thread.

From Ruby 1.9, the interpreter uses 1:1 thread scheduler which means
1 Ruby thread has 1 native thread. M:N scheduler change this strategy.

Because of compatibility issue (and stableness issue of the implementation)
main Ractor doesn't use M:N scheduler on default. On the other words,
threads on the main Ractor will be managed with 1:1 thread scheduler.

There are additional settings by environment variables:

`RUBY_MN_THREADS=1` enables M:N thread scheduler on the main ractor.
Note that non-main ractors use the M:N scheduler without this
configuration. With this configuration, single ractor applications
run threads on M:1 thread scheduler (green threads, user-level threads).

`RUBY_MAX_CPU=n` specifies maximum number of native threads for
M:N scheduler (default: 8).

This patch will be reverted soon if non-easy issues are found.

[Bug #19842]

Use a real Ruby mutex in rb_io_close_wait_list (#7884)

2023-06-01T08:37:18+00:00

Because a thread calling IO#close now blocks in a native condvar wait,
it's possible for there to be _no_ threads left to actually handle
incoming signals/ubf calls/etc.

This manifested as failing tests on Solaris 10 (SPARC), because:

* One thread called IO#close, which sent a SIGVTALRM to the other
  thread to interrupt it, and then waited on the condvar to be notified
  that the reading thread was done.
* One thread was calling IO#read, but it hadn't yet reached the actual
  call to select(2) when the SIGVTALRM arrived, so it never unblocked
  itself.

This results in a deadlock.

The fix is to use a real Ruby mutex for the close lock; that way, the
closing thread goes into sigwait-sleep and can keep trying to interrupt
the select(2) thread.

See the discussion in: https://github.com/ruby/ruby/pull/7865/

Fix busy-loop when waiting for file descriptors to close

2023-05-26T05:51:23+00:00

When one thread is closing a file descriptor whilst another thread is
concurrently reading it, we need to wait for the reading thread to be
done with it to prevent a potential EBADF (or, worse, file descriptor
reuse).

At the moment, that is done by keeping a list of threads still using the
file descriptor in io_close_fptr. It then continually calls
rb_thread_schedule() in fptr_finalize_flush until said list is empty.

That busy-looping seems to behave rather poorly on some OS's,
particulary FreeBSD. It can cause the TestIO#test_race_gets_and_close
test to fail (even with its very long 200 second timeout) because the
closing thread starves out the using thread.

To fix that, I introduce the concept of struct rb_io_close_wait_list; a
list of threads still using a file descriptor that we want to close. We
call `rb_notify_fd_close` to let the thread scheduler know we're closing
a FD, which fills the list with threads. Then, we call
rb_notify_fd_close_wait which will block the thread until all of the
still-using threads are done.

This is implemented with a condition variable sleep, so no busy-looping
is required.

Add Fiber#kill, similar to Thread#kill. (#7823)

2023-05-18T14:33:42+00:00