<feed xmlns='http://www.w3.org/2005/Atom'>
<title>ruby.git/internal, branch v4.0.2</title>
<subtitle>The Ruby Programming Language</subtitle>
<link rel='alternate' type='text/html' href='https://git.ruby-lang.org/ruby.git/'/>
<entry>
<title>Rename fiber_serial into ec_serial</title>
<updated>2025-12-16T08:51:07+00:00</updated>
<author>
<name>Jean Boussier</name>
<email>jean.boussier@gmail.com</email>
</author>
<published>2025-12-15T23:43:41+00:00</published>
<link rel='alternate' type='text/html' href='https://git.ruby-lang.org/ruby.git/commit/?id=e42bcd7ce76e75601ef3adf35467edf277471af2'/>
<id>e42bcd7ce76e75601ef3adf35467edf277471af2</id>
<content type='text'>
Since it now lives in the EC.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Since it now lives in the EC.
</pre>
</div>
</content>
</entry>
<entry>
<title>Store the fiber_serial in the EC to allow inlining</title>
<updated>2025-12-16T08:51:07+00:00</updated>
<author>
<name>Jean Boussier</name>
<email>jean.boussier@gmail.com</email>
</author>
<published>2025-12-14T09:46:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.ruby-lang.org/ruby.git/commit/?id=28b195fc67788c03be59c2a4cbf0cad52ac3b90f'/>
<id>28b195fc67788c03be59c2a4cbf0cad52ac3b90f</id>
<content type='text'>
Mutexes spend a significant amount of time in `rb_fiber_serial`
because it can't be inlined (except with LTO).
The fiber struct is opaque, so the function can't be defined as inlineable.

Ideally the whole fiber struct would not be opaque to the rest of
Ruby core, but it's tricky to do.

Instead we can store the fiber serial in the execution context
itself, and make its access cheaper:

```
$ hyperfine './miniruby-baseline --yjit /tmp/mut.rb' './miniruby-inline-serial --yjit /tmp/mut.rb'
Benchmark 1: ./miniruby-baseline --yjit /tmp/mut.rb
  Time (mean ± σ):      4.011 s ±  0.084 s    [User: 3.977 s, System: 0.011 s]
  Range (min … max):    3.950 s …  4.245 s    10 runs

Benchmark 2: ./miniruby-inline-serial --yjit /tmp/mut.rb
  Time (mean ± σ):      3.495 s ±  0.150 s    [User: 3.448 s, System: 0.009 s]
  Range (min … max):    3.340 s …  3.869 s    10 runs

Summary
  ./miniruby-inline-serial --yjit /tmp/mut.rb ran
    1.15 ± 0.05 times faster than ./miniruby-baseline --yjit /tmp/mut.rb
```

```ruby
i = 10_000_000
mut = Mutex.new
while i &gt; 0
  i -= 1
  mut.synchronize { }
  mut.synchronize { }
  mut.synchronize { }
  mut.synchronize { }
  mut.synchronize { }
  mut.synchronize { }
  mut.synchronize { }
  mut.synchronize { }
  mut.synchronize { }
  mut.synchronize { }
end
```
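
As background on why Mutex code would need a per-fiber identifier at all
(an inference from Ruby's documented behaviour, not something stated above):
since Ruby 3.0 a Mutex is acquired per fiber rather than per thread, so
ownership checks have to identify the current fiber. A minimal sketch,
assuming Ruby 3.0 or later; the variable names are illustrative only:

```ruby
# Sketch: a Mutex belongs to the fiber that locked it (Ruby 3.0 or later),
# so the VM needs a cheap way to identify the current fiber.
mutex = Mutex.new
mutex.lock

# Checked from the fiber that took the lock.
owned_in_locking_fiber = mutex.owned?

# Checked from a different fiber running on the same thread.
owned_in_other_fiber = Fiber.new { mutex.owned? }.resume

mutex.unlock

puts "locking fiber owns it: #{owned_in_locking_fiber}"
puts "other fiber owns it: #{owned_in_other_fiber}"
```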
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Mutexes spend a significant amount of time in `rb_fiber_serial`
because it can't be inlined (except with LTO).
The fiber struct is opaque, so the function can't be defined as inlineable.

Ideally the whole fiber struct would not be opaque to the rest of
Ruby core, but it's tricky to do.

Instead we can store the fiber serial in the execution context
itself, and make its access cheaper:

```
$ hyperfine './miniruby-baseline --yjit /tmp/mut.rb' './miniruby-inline-serial --yjit /tmp/mut.rb'
Benchmark 1: ./miniruby-baseline --yjit /tmp/mut.rb
  Time (mean ± σ):      4.011 s ±  0.084 s    [User: 3.977 s, System: 0.011 s]
  Range (min … max):    3.950 s …  4.245 s    10 runs

Benchmark 2: ./miniruby-inline-serial --yjit /tmp/mut.rb
  Time (mean ± σ):      3.495 s ±  0.150 s    [User: 3.448 s, System: 0.009 s]
  Range (min … max):    3.340 s …  3.869 s    10 runs

Summary
  ./miniruby-inline-serial --yjit /tmp/mut.rb ran
    1.15 ± 0.05 times faster than ./miniruby-baseline --yjit /tmp/mut.rb
```

```ruby
i = 10_000_000
mut = Mutex.new
while i &gt; 0
  i -= 1
  mut.synchronize { }
  mut.synchronize { }
  mut.synchronize { }
  mut.synchronize { }
  mut.synchronize { }
  mut.synchronize { }
  mut.synchronize { }
  mut.synchronize { }
  mut.synchronize { }
  mut.synchronize { }
end
```
</pre>
</div>
</content>
</entry>
<entry>
<title>Fewer calls to `GET_EC()` and `GET_THREAD()` (#15506)</title>
<updated>2025-12-12T19:47:43+00:00</updated>
<author>
<name>Luke Gruber</name>
<email>luke.gruber@shopify.com</email>
</author>
<published>2025-12-12T19:47:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.ruby-lang.org/ruby.git/commit/?id=3add3db797c4216423fdaa4bef6e2ee3c7630303'/>
<id>3add3db797c4216423fdaa4bef6e2ee3c7630303</id>
<content type='text'>
The changes are to `io.c` and `thread.c`.
I changed the API of 2 exported thread functions from `internal/thread.h` that
didn't look like they had any use in C extensions:

* rb_thread_wait_for_single_fd
* rb_thread_io_wait

I didn't change the following exported internal function because it's
used in C extensions:

* rb_thread_fd_select

I added a comment to note that this function, although internal, is used
in C extensions.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The changes are to `io.c` and `thread.c`.
I changed the API of 2 exported thread functions from `internal/thread.h` that
didn't look like they had any use in C extensions:

* rb_thread_wait_for_single_fd
* rb_thread_io_wait

I didn't change the following exported internal function because it's
used in C extensions:

* rb_thread_fd_select

I added a comment to note that this function, although internal, is used
in C extensions.</pre>
</div>
</content>
</entry>
<entry>
<title>thread_sync.c: directly pass the execution context to yield</title>
<updated>2025-12-12T09:08:05+00:00</updated>
<author>
<name>Jean Boussier</name>
<email>jean.boussier@gmail.com</email>
</author>
<published>2025-12-12T08:10:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.ruby-lang.org/ruby.git/commit/?id=ff831eb0572b2d8f794acca478ea77c7bfefbc61'/>
<id>ff831eb0572b2d8f794acca478ea77c7bfefbc61</id>
<content type='text'>
Saves one more call to GET_EC()
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Saves one more call to GET_EC()
</pre>
</div>
</content>
</entry>
<entry>
<title>Mutex: avoid repeated calls to `GET_EC`</title>
<updated>2025-12-11T22:25:57+00:00</updated>
<author>
<name>Jean Boussier</name>
<email>jean.boussier@gmail.com</email>
</author>
<published>2025-12-10T10:44:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.ruby-lang.org/ruby.git/commit/?id=07b2356a6ad314b9a7b2bb9fc0527b440f004faa'/>
<id>07b2356a6ad314b9a7b2bb9fc0527b440f004faa</id>
<content type='text'>
That call is surprisingly expensive, so doing it once
in `#synchronize` and then passing the EC to lock and unlock
saves quite a few cycles.

Before:

```
ruby 4.0.0dev (2025-12-10T09:30:18Z master c5608ab4d7) +YJIT +PRISM [arm64-darwin25]
Warming up --------------------------------------
               Mutex     1.888M i/100ms
             Monitor     1.633M i/100ms
Calculating -------------------------------------
               Mutex     22.610M (± 0.2%) i/s   (44.23 ns/i) -    113.258M in   5.009097s
             Monitor     19.148M (± 0.3%) i/s   (52.22 ns/i) -     96.366M in   5.032755s
```

After:
```
ruby 4.0.0dev (2025-12-10T10:40:07Z speedup-mutex 1c901cd4f8) +YJIT +PRISM [arm64-darwin25]
Warming up --------------------------------------
               Mutex     2.095M i/100ms
             Monitor     1.578M i/100ms
Calculating -------------------------------------
               Mutex     24.456M (± 0.4%) i/s   (40.89 ns/i) -    123.584M in   5.053418s
             Monitor     19.176M (± 0.1%) i/s   (52.15 ns/i) -     96.243M in   5.018977s
```

Bench:

```ruby
require 'bundler/inline'

gemfile do
  gem "benchmark-ips"
end

mutex = Mutex.new
require "monitor"
monitor = Monitor.new

Benchmark.ips do |x|
  x.report("Mutex") { mutex.synchronize { } }
  x.report("Monitor") { monitor.synchronize { } }
end
```
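
The shape of the change can be modelled in plain Ruby. This is only a
sketch of the "look it up once, then pass it down" pattern, not the C
implementation: `ContextRegistry.current` is a hypothetical stand-in for
`GET_EC()`, and every method name below is made up for illustration.

```ruby
# Hypothetical stand-in for an expensive context lookup such as GET_EC().
module ContextRegistry
  @lookups = 0

  def self.lookups
    @lookups
  end

  def self.reset
    @lookups = 0
  end

  # Counts how often the lookup runs.
  def self.current
    @lookups += 1
    :context
  end
end

# Before: lock and unlock each look the context up on their own.
def lock_naive
  ContextRegistry.current
end

def unlock_naive
  ContextRegistry.current
end

def synchronize_naive
  lock_naive
  begin
    yield
  ensure
    unlock_naive
  end
end

# After: synchronize looks it up once and passes it to lock and unlock.
def lock_with(context)
  context
end

def unlock_with(context)
  context
end

def synchronize_hoisted
  context = ContextRegistry.current
  lock_with(context)
  begin
    yield
  ensure
    unlock_with(context)
  end
end

ContextRegistry.reset
synchronize_naive { }
naive_lookups = ContextRegistry.lookups

ContextRegistry.reset
synchronize_hoisted { }
hoisted_lookups = ContextRegistry.lookups

puts "naive: #{naive_lookups} lookups, hoisted: #{hoisted_lookups} lookup"
```

In this model the naive version performs two lookups per `synchronize`
call and the hoisted version performs one.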
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
That call is surprisingly expensive, so doing it once
in `#synchronize` and then passing the EC to lock and unlock
saves quite a few cycles.

Before:

```
ruby 4.0.0dev (2025-12-10T09:30:18Z master c5608ab4d7) +YJIT +PRISM [arm64-darwin25]
Warming up --------------------------------------
               Mutex     1.888M i/100ms
             Monitor     1.633M i/100ms
Calculating -------------------------------------
               Mutex     22.610M (± 0.2%) i/s   (44.23 ns/i) -    113.258M in   5.009097s
             Monitor     19.148M (± 0.3%) i/s   (52.22 ns/i) -     96.366M in   5.032755s
```

After:
```
ruby 4.0.0dev (2025-12-10T10:40:07Z speedup-mutex 1c901cd4f8) +YJIT +PRISM [arm64-darwin25]
Warming up --------------------------------------
               Mutex     2.095M i/100ms
             Monitor     1.578M i/100ms
Calculating -------------------------------------
               Mutex     24.456M (± 0.4%) i/s   (40.89 ns/i) -    123.584M in   5.053418s
             Monitor     19.176M (± 0.1%) i/s   (52.15 ns/i) -     96.243M in   5.018977s
```

Bench:

```ruby
require 'bundler/inline'

gemfile do
  gem "benchmark-ips"
end

mutex = Mutex.new
require "monitor"
monitor = Monitor.new

Benchmark.ips do |x|
  x.report("Mutex") { mutex.synchronize { } }
  x.report("Monitor") { monitor.synchronize { } }
end
```
</pre>
</div>
</content>
</entry>
<entry>
<title>Speed up class allocator search</title>
<updated>2025-12-11T17:53:10+00:00</updated>
<author>
<name>John Hawthorn</name>
<email>john@hawthorn.email</email>
</author>
<published>2025-12-06T07:31:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.ruby-lang.org/ruby.git/commit/?id=32e6dc0f31b98cf17dd9ace46561d74a55966b20'/>
<id>32e6dc0f31b98cf17dd9ace46561d74a55966b20</id>
<content type='text'>
This rewrites the class allocator search to be faster. Instead of using
RCLASS_SUPER, which is now even slower due to Box, we can scan the
superclasses list to find a class where the allocator is defined.

This also disallows allocating from an ICLASS. Previously I believe that
was only done for FrozenCore, and that was changed in
e596cf6e93dbf121e197cccfec8a69902e00eda3.
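
A rough Ruby-level model of the search, for illustration only. The real
code is C and works on the VM's internal superclasses list; here
`ALLOCATORS`, `superclasses_list` and `find_allocator` are hypothetical
names, and `ancestors` filtered down to classes stands in for that list:

```ruby
# Small hierarchy built without subclass syntax to keep the sketch compact.
Base = Class.new
Middle = Class.new(Base)
Leaf = Class.new(Middle)

# Hypothetical table of allocators: only Base defines one.
ALLOCATORS = {}
ALLOCATORS[Base] = lambda { :allocated_by_base }

# Stand-in for the superclasses list: the class itself and its
# superclasses, with included modules filtered out.
def superclasses_list(klass)
  klass.ancestors.select { |mod| mod.is_a?(Class) }
end

# Scan the list once and return the first class that defines an allocator.
def find_allocator(klass, allocators)
  owner = superclasses_list(klass).find { |candidate| allocators.key?(candidate) }
  owner ? [owner, allocators[owner]] : nil
end

owner, allocator = find_allocator(Leaf, ALLOCATORS)
puts "allocator for Leaf comes from #{owner}"
puts "allocation result: #{allocator.call}"
```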
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This rewrites the class allocator search to be faster. Instead of using
RCLASS_SUPER, which is now even slower due to Box, we can scan the
superclasses list to find a class where the allocator is defined.

This also disallows allocating from an ICLASS. Previously I believe that
was only done for FrozenCore, and that was changed in
e596cf6e93dbf121e197cccfec8a69902e00eda3.
</pre>
</div>
</content>
</entry>
<entry>
<title>Add `NUM2PTR` and `PTR2NUM` macros</title>
<updated>2025-12-10T03:09:50+00:00</updated>
<author>
<name>Nobuyoshi Nakada</name>
<email>nobu@ruby-lang.org</email>
</author>
<published>2025-12-10T03:09:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.ruby-lang.org/ruby.git/commit/?id=3636277dc5837bcedcd5ef43d49423194064a676'/>
<id>3636277dc5837bcedcd5ef43d49423194064a676</id>
<content type='text'>
These macros have been defined here and there, so collect them.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
These macros have been defined here and there, so collect them.
</pre>
</div>
</content>
</entry>
<entry>
<title>Box: remove copied extension files</title>
<updated>2025-12-09T14:41:50+00:00</updated>
<author>
<name>Nobuyoshi Nakada</name>
<email>nobu@ruby-lang.org</email>
</author>
<published>2025-11-08T02:05:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.ruby-lang.org/ruby.git/commit/?id=573896a40ac25bd9febb2bbc0502b43ef36f9b9b'/>
<id>573896a40ac25bd9febb2bbc0502b43ef36f9b9b</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>Fix strict aliasing warning in rb_int128_to_numeric</title>
<updated>2025-12-08T23:01:05+00:00</updated>
<author>
<name>Peter Zhu</name>
<email>peter@peterzhu.ca</email>
</author>
<published>2025-12-07T17:43:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.ruby-lang.org/ruby.git/commit/?id=55ea3ec00f5166423cd7dcd67e220cd264a766f6'/>
<id>55ea3ec00f5166423cd7dcd67e220cd264a766f6</id>
<content type='text'>
If we don't have uint128, then rb_int128_to_numeric emits a strict
aliasing warning:

    numeric.c:3641:39: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]
    3641 |         return rb_uint128_to_numeric(*(rb_uint128_t*)&amp;n);
         |                                       ^~~~~~~~~~~~~~~~~
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
If we don't have uint128, then rb_int128_to_numeric emits a strict
aliasing warning:

    numeric.c:3641:39: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]
    3641 |         return rb_uint128_to_numeric(*(rb_uint128_t*)&amp;n);
         |                                       ^~~~~~~~~~~~~~~~~
</pre>
</div>
</content>
</entry>
<entry>
<title>Make `ruby_reset_leap_second_info` internal</title>
<updated>2025-12-08T03:11:58+00:00</updated>
<author>
<name>Nobuyoshi Nakada</name>
<email>nobu@ruby-lang.org</email>
</author>
<published>2025-12-08T03:04:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.ruby-lang.org/ruby.git/commit/?id=a82aa08fe0112aefd35e28dc5ca3f9ea9238ae17'/>
<id>a82aa08fe0112aefd35e28dc5ca3f9ea9238ae17</id>
<content type='text'>
It is exported only for the extension library to test, but the method
is no longer used since 29e31e72fb5a14194a78ec974c4ba56c33ad8d45.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
It is exported only for the extension library to test, but the method
is no longer used since 29e31e72fb5a14194a78ec974c4ba56c33ad8d45.
</pre>
</div>
</content>
</entry>
</feed>
