<feed xmlns='http://www.w3.org/2005/Atom'>
<title>ruby.git/benchmark, branch v4.0.4</title>
<subtitle>The Ruby Programming Language</subtitle>
<link rel='alternate' type='text/html' href='https://git.ruby-lang.org/ruby.git/'/>
<entry>
<title>Avoid allocating intermediate string in zone_str</title>
<updated>2025-11-16T23:23:28+00:00</updated>
<author>
<name>John Hawthorn</name>
<email>john@hawthorn.email</email>
</author>
<published>2025-11-14T20:53:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.ruby-lang.org/ruby.git/commit/?id=bdeee7012c37834a17712c7aaa8c602ae1208a8e'/>
<id>bdeee7012c37834a17712c7aaa8c602ae1208a8e</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>Micro-optimize Object#class</title>
<updated>2025-08-30T12:14:10+00:00</updated>
<author>
<name>Jean Boussier</name>
<email>jean.boussier@gmail.com</email>
</author>
<published>2025-08-30T08:51:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.ruby-lang.org/ruby.git/commit/?id=fd0c772db7e5098c2b8e03559317a3592074dfe7'/>
<id>fd0c772db7e5098c2b8e03559317a3592074dfe7</id>
<content type='text'>
Since `BUILTIN_TYPE` and `RCLASS_SINGLETON_P` are both stored in
`RBasic.flags`, we can combine these two checks in a single bitmask.

This rely on `T_ICLASS` and `T_CLASS` not overlapping, and assume
`klass` is always either of these types.

Just combining the masks brings a small but consistent 1.08x speedup on the simple case benchmark.

```
compare-ruby: ruby 3.5.0dev (2025-08-30T01:45:42Z obj-class 01a57bd6cd) +YJIT +PRISM [arm64-darwin24]
built-ruby: ruby 3.5.0dev (2025-08-30T09:56:24Z obj-class 2685f8dbb4) +YJIT +PRISM [arm64-darwin24]

|           |compare-ruby|built-ruby|
|:----------|-----------:|---------:|
|obj        |     444.410|   478.895|
|           |           -|     1.08x|
|extended   |     135.139|   140.206|
|           |           -|     1.04x|
|singleton  |     165.155|   155.832|
|           |       1.06x|         -|
|immediate  |     380.103|   432.090|
|           |           -|     1.14x|
```

But with the RB_UNLIKELY compiler hint, it's much more significant, however
the singleton and enxtended cases are slowed down.
However we can assume the simple case is way more common than the other two.

```
compare-ruby: ruby 3.5.0dev (2025-08-30T01:45:42Z obj-class 01a57bd6cd) +YJIT +PRISM [arm64-darwin24]
built-ruby: ruby 3.5.0dev (2025-08-30T09:51:01Z obj-class 12d01a1b02) +YJIT +PRISM [arm64-darwin24]

|           |compare-ruby|built-ruby|
|:----------|-----------:|---------:|
|obj        |     444.951|   556.191|
|           |           -|     1.25x|
|extended   |     136.836|   113.871|
|           |       1.20x|         -|
|singleton  |     166.335|   167.747|
|           |           -|     1.01x|
|immediate  |     379.642|   509.515|
|           |           -|     1.34x|
```
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Since `BUILTIN_TYPE` and `RCLASS_SINGLETON_P` are both stored in
`RBasic.flags`, we can combine these two checks in a single bitmask.

This rely on `T_ICLASS` and `T_CLASS` not overlapping, and assume
`klass` is always either of these types.

Just combining the masks brings a small but consistent 1.08x speedup on the simple case benchmark.

```
compare-ruby: ruby 3.5.0dev (2025-08-30T01:45:42Z obj-class 01a57bd6cd) +YJIT +PRISM [arm64-darwin24]
built-ruby: ruby 3.5.0dev (2025-08-30T09:56:24Z obj-class 2685f8dbb4) +YJIT +PRISM [arm64-darwin24]

|           |compare-ruby|built-ruby|
|:----------|-----------:|---------:|
|obj        |     444.410|   478.895|
|           |           -|     1.08x|
|extended   |     135.139|   140.206|
|           |           -|     1.04x|
|singleton  |     165.155|   155.832|
|           |       1.06x|         -|
|immediate  |     380.103|   432.090|
|           |           -|     1.14x|
```

But with the RB_UNLIKELY compiler hint, it's much more significant, however
the singleton and enxtended cases are slowed down.
However we can assume the simple case is way more common than the other two.

```
compare-ruby: ruby 3.5.0dev (2025-08-30T01:45:42Z obj-class 01a57bd6cd) +YJIT +PRISM [arm64-darwin24]
built-ruby: ruby 3.5.0dev (2025-08-30T09:51:01Z obj-class 12d01a1b02) +YJIT +PRISM [arm64-darwin24]

|           |compare-ruby|built-ruby|
|:----------|-----------:|---------:|
|obj        |     444.951|   556.191|
|           |           -|     1.25x|
|extended   |     136.836|   113.871|
|           |       1.20x|         -|
|singleton  |     166.335|   167.747|
|           |           -|     1.01x|
|immediate  |     379.642|   509.515|
|           |           -|     1.34x|
```
</pre>
</div>
</content>
</entry>
<entry>
<title>YJIT: getinstancevariable cache indexes for types other than T_OBJECT</title>
<updated>2025-08-28T22:02:29+00:00</updated>
<author>
<name>Jean Boussier</name>
<email>jean.boussier@gmail.com</email>
</author>
<published>2025-08-28T17:29:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.ruby-lang.org/ruby.git/commit/?id=b6d4882c05d64aa6cb5ff8761584ae910de67f21'/>
<id>b6d4882c05d64aa6cb5ff8761584ae910de67f21</id>
<content type='text'>
While accessing the ivars of other types is too complicated to
realistically generate the ASM for it, we can at least provide
the ivar index as to not have to lookup the shape tree every
time.

```
compare-ruby: ruby 3.5.0dev (2025-08-27T14:58:58Z merge-vm-setivar-d.. 5b749d8e53) +YJIT +PRISM [arm64-darwin24]
built-ruby: ruby 3.5.0dev (2025-08-28T17:58:32Z yjit-get-exivar efaa8c9b09) +YJIT +PRISM [arm64-darwin24]

|                           |compare-ruby|built-ruby|
|:--------------------------|-----------:|---------:|
|vm_ivar_get_on_obj         |     930.458|   936.865|
|                           |           -|     1.01x|
|vm_ivar_get_on_class       |     134.471|   431.622|
|                           |           -|     3.21x|
|vm_ivar_get_on_generic     |     146.679|   284.408|
|                           |           -|     1.94x|
```

Co-Authored-By: Aaron Patterson &lt;tenderlove@ruby-lang.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
While accessing the ivars of other types is too complicated to
realistically generate the ASM for it, we can at least provide
the ivar index as to not have to lookup the shape tree every
time.

```
compare-ruby: ruby 3.5.0dev (2025-08-27T14:58:58Z merge-vm-setivar-d.. 5b749d8e53) +YJIT +PRISM [arm64-darwin24]
built-ruby: ruby 3.5.0dev (2025-08-28T17:58:32Z yjit-get-exivar efaa8c9b09) +YJIT +PRISM [arm64-darwin24]

|                           |compare-ruby|built-ruby|
|:--------------------------|-----------:|---------:|
|vm_ivar_get_on_obj         |     930.458|   936.865|
|                           |           -|     1.01x|
|vm_ivar_get_on_class       |     134.471|   431.622|
|                           |           -|     3.21x|
|vm_ivar_get_on_generic     |     146.679|   284.408|
|                           |           -|     1.94x|
```

Co-Authored-By: Aaron Patterson &lt;tenderlove@ruby-lang.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Populate ivar caches for types other than T_OBJECT</title>
<updated>2025-08-28T07:25:51+00:00</updated>
<author>
<name>Jean Boussier</name>
<email>jean.boussier@gmail.com</email>
</author>
<published>2025-08-27T16:13:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.ruby-lang.org/ruby.git/commit/?id=e7fb87ee3a826a387e5bfb9edea059f1aa065462'/>
<id>e7fb87ee3a826a387e5bfb9edea059f1aa065462</id>
<content type='text'>
`vm_setinstancevariable` had a codepath to try to match the inline
cache for types other than T_OBJECT, but the cache population path
in `vm_setivar_slowpath` was exclusive to T_OBJECT, so `vm_setivar_default`
would never match anything.

This commit improves `vm_setivar_slowpath` so that it is capable of
filling the cache for all types, and adds a `vm_setivar_class` codepath
for `T_CLASS` and `T_MODULE`.

`vm_setivar`, `vm_setivar_default` and `vm_setivar_class` could be unified,
but based on the very explicit `NOINLINE` I assume they were split to minimize
codesize.

```
compare-ruby: ruby 3.5.0dev (2025-08-27T14:58:58Z merge-vm-setivar-d.. 5b749d8e53) +PRISM [arm64-darwin24]
built-ruby: ruby 3.5.0dev (2025-08-27T16:30:31Z setivar-cache-gene.. 4fe78ff296) +PRISM [arm64-darwin24]

|                         |compare-ruby|built-ruby|
|:------------------------|-----------:|---------:|
|vm_ivar_set_on_instance  |     161.809|   164.688|
|                         |           -|     1.02x|
|vm_ivar_set_on_generic   |      58.769|   115.638|
|                         |           -|     1.97x|
|vm_ivar_set_on_class     |      70.034|   141.042|
|                         |           -|     2.01x|
```
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
`vm_setinstancevariable` had a codepath to try to match the inline
cache for types other than T_OBJECT, but the cache population path
in `vm_setivar_slowpath` was exclusive to T_OBJECT, so `vm_setivar_default`
would never match anything.

This commit improves `vm_setivar_slowpath` so that it is capable of
filling the cache for all types, and adds a `vm_setivar_class` codepath
for `T_CLASS` and `T_MODULE`.

`vm_setivar`, `vm_setivar_default` and `vm_setivar_class` could be unified,
but based on the very explicit `NOINLINE` I assume they were split to minimize
codesize.

```
compare-ruby: ruby 3.5.0dev (2025-08-27T14:58:58Z merge-vm-setivar-d.. 5b749d8e53) +PRISM [arm64-darwin24]
built-ruby: ruby 3.5.0dev (2025-08-27T16:30:31Z setivar-cache-gene.. 4fe78ff296) +PRISM [arm64-darwin24]

|                         |compare-ruby|built-ruby|
|:------------------------|-----------:|---------:|
|vm_ivar_set_on_instance  |     161.809|   164.688|
|                         |           -|     1.02x|
|vm_ivar_set_on_generic   |      58.769|   115.638|
|                         |           -|     1.97x|
|vm_ivar_set_on_class     |      70.034|   141.042|
|                         |           -|     2.01x|
```
</pre>
</div>
</content>
</entry>
<entry>
<title>Update string_casecmp.yml</title>
<updated>2025-08-11T13:22:38+00:00</updated>
<author>
<name>Erim Icel</name>
<email>erimicel@gmail.com</email>
</author>
<published>2025-08-10T14:36:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.ruby-lang.org/ruby.git/commit/?id=c914389ae83d0a5b8ddc00d2c1577c937de72743'/>
<id>c914389ae83d0a5b8ddc00d2c1577c937de72743</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>Optimize `str_casecmp` length check using pointer end</title>
<updated>2025-08-11T13:22:38+00:00</updated>
<author>
<name>Erim Icel</name>
<email>erimicel@gmail.com</email>
</author>
<published>2025-08-10T12:50:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.ruby-lang.org/ruby.git/commit/?id=5e324ac11c2c9c6712e2cdff37f212367f71e094'/>
<id>5e324ac11c2c9c6712e2cdff37f212367f71e094</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>Struct: keep direct reference to IMEMO/fields when space allows</title>
<updated>2025-08-06T15:07:49+00:00</updated>
<author>
<name>Jean Boussier</name>
<email>jean.boussier@gmail.com</email>
</author>
<published>2025-08-06T12:46:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.ruby-lang.org/ruby.git/commit/?id=f3206cc79bec2fd852e81ec56de59f0a67ab32b7'/>
<id>f3206cc79bec2fd852e81ec56de59f0a67ab32b7</id>
<content type='text'>
It's not rare for structs to have additional ivars, hence are one
of the most common, if not the most common type in the `gen_fields_tbl`.

This can cause Ractor contention, but even in single ractor mode
means having to do a hash lookup to access the ivars, and increase
GC work.

Instead, unless the struct is perfectly right sized, we can store
a reference to the associated IMEMO/fields object right after the
last struct member.

```
compare-ruby: ruby 3.5.0dev (2025-08-06T12:50:36Z struct-ivar-fields-2 9a30d141a1) +PRISM [arm64-darwin24]
built-ruby: ruby 3.5.0dev (2025-08-06T12:57:59Z struct-ivar-fields-2 2ff3ec237f) +PRISM [arm64-darwin24]
warming up.....

|                      |compare-ruby|built-ruby|
|:---------------------|-----------:|---------:|
|member_reader         |    590.317k|  579.246k|
|                      |       1.02x|         -|
|member_writer         |    543.963k|  527.104k|
|                      |       1.03x|         -|
|member_reader_method  |    213.540k|  213.004k|
|                      |       1.00x|         -|
|member_writer_method  |    192.657k|  191.491k|
|                      |       1.01x|         -|
|ivar_reader           |    403.993k|  569.915k|
|                      |           -|     1.41x|
```

Co-Authored-By: Étienne Barrié &lt;etienne.barrie@gmail.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
It's not rare for structs to have additional ivars, hence are one
of the most common, if not the most common type in the `gen_fields_tbl`.

This can cause Ractor contention, but even in single ractor mode
means having to do a hash lookup to access the ivars, and increase
GC work.

Instead, unless the struct is perfectly right sized, we can store
a reference to the associated IMEMO/fields object right after the
last struct member.

```
compare-ruby: ruby 3.5.0dev (2025-08-06T12:50:36Z struct-ivar-fields-2 9a30d141a1) +PRISM [arm64-darwin24]
built-ruby: ruby 3.5.0dev (2025-08-06T12:57:59Z struct-ivar-fields-2 2ff3ec237f) +PRISM [arm64-darwin24]
warming up.....

|                      |compare-ruby|built-ruby|
|:---------------------|-----------:|---------:|
|member_reader         |    590.317k|  579.246k|
|                      |       1.02x|         -|
|member_writer         |    543.963k|  527.104k|
|                      |       1.03x|         -|
|member_reader_method  |    213.540k|  213.004k|
|                      |       1.00x|         -|
|member_writer_method  |    192.657k|  191.491k|
|                      |       1.01x|         -|
|ivar_reader           |    403.993k|  569.915k|
|                      |           -|     1.41x|
```

Co-Authored-By: Étienne Barrié &lt;etienne.barrie@gmail.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>benchmark_driver: Stop using `Ractor#take`</title>
<updated>2025-07-04T06:23:20+00:00</updated>
<author>
<name>Jean Boussier</name>
<email>jean.boussier@gmail.com</email>
</author>
<published>2025-07-03T09:56:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.ruby-lang.org/ruby.git/commit/?id=95235fd528f38dcf2400abdc09a1ec2d71384ced'/>
<id>95235fd528f38dcf2400abdc09a1ec2d71384ced</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>Fix minor typos in comments, specs, and docs</title>
<updated>2025-06-17T22:51:16+00:00</updated>
<author>
<name>Tim Smith</name>
<email>tsmith84@gmail.com</email>
</author>
<published>2025-06-17T05:15:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.ruby-lang.org/ruby.git/commit/?id=3cfd71e7e48475d8d36ad1f1f4ca7924f67ef72e'/>
<id>3cfd71e7e48475d8d36ad1f1f4ca7924f67ef72e</id>
<content type='text'>
Just a bit of minor cleanup

Signed-off-by: Tim Smith &lt;tsmith84@gmail.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Just a bit of minor cleanup

Signed-off-by: Tim Smith &lt;tsmith84@gmail.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Move more NilClass methods to ruby</title>
<updated>2025-06-12T07:30:09+00:00</updated>
<author>
<name>Hartley McGuire</name>
<email>skipkayhil@gmail.com</email>
</author>
<published>2025-06-02T22:47:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.ruby-lang.org/ruby.git/commit/?id=81209719321f9cded2c4bdf50203f5ef34e3db7e'/>
<id>81209719321f9cded2c4bdf50203f5ef34e3db7e</id>
<content type='text'>
```
$ make benchmark ITEM=nilclass COMPARE_RUBY="/opt/rubies/ruby-master/bin/ruby"
/opt/rubies/3.4.2/bin/ruby --disable=gems -rrubygems -I../benchmark/lib ../benchmark/benchmark-driver/exe/benchmark-driver \
                    --executables="compare-ruby::/opt/rubies/ruby-master/bin/ruby -I.ext/common --disable-gem" \
                    --executables="built-ruby::./miniruby -I../lib -I. -I.ext/common  ../tool/runruby.rb --extout=.ext  -- --disable-gems --disable-gem" \
                    --output=markdown --output-compare -v $(find ../benchmark -maxdepth 1 -name 'nilclass' -o -name '*nilclass*.yml' -o -name '*nilclass*.rb' | sort)
compare-ruby: ruby 3.5.0dev (2025-06-02T13:52:25Z master cbd49ecbbe) +PRISM [arm64-darwin24]
built-ruby: ruby 3.5.0dev (2025-06-02T22:47:21Z hm-ruby-nilclass 3e7f1f0466) +PRISM [arm64-darwin24]

|             |compare-ruby|built-ruby|
|:------------|-----------:|---------:|
|rationalize  |     24.056M|   53.908M|
|             |           -|     2.24x|
|to_c         |     23.652M|   82.781M|
|             |           -|     3.50x|
|to_i         |     89.526M|   84.388M|
|             |       1.06x|         -|
|to_f         |     84.746M|   96.899M|
|             |           -|     1.14x|
|to_r         |     25.107M|   83.472M|
|             |           -|     3.32x|
|splat        |     42.772M|   42.717M|
|             |       1.00x|         -|
```

This makes them much faster
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
```
$ make benchmark ITEM=nilclass COMPARE_RUBY="/opt/rubies/ruby-master/bin/ruby"
/opt/rubies/3.4.2/bin/ruby --disable=gems -rrubygems -I../benchmark/lib ../benchmark/benchmark-driver/exe/benchmark-driver \
                    --executables="compare-ruby::/opt/rubies/ruby-master/bin/ruby -I.ext/common --disable-gem" \
                    --executables="built-ruby::./miniruby -I../lib -I. -I.ext/common  ../tool/runruby.rb --extout=.ext  -- --disable-gems --disable-gem" \
                    --output=markdown --output-compare -v $(find ../benchmark -maxdepth 1 -name 'nilclass' -o -name '*nilclass*.yml' -o -name '*nilclass*.rb' | sort)
compare-ruby: ruby 3.5.0dev (2025-06-02T13:52:25Z master cbd49ecbbe) +PRISM [arm64-darwin24]
built-ruby: ruby 3.5.0dev (2025-06-02T22:47:21Z hm-ruby-nilclass 3e7f1f0466) +PRISM [arm64-darwin24]

|             |compare-ruby|built-ruby|
|:------------|-----------:|---------:|
|rationalize  |     24.056M|   53.908M|
|             |           -|     2.24x|
|to_c         |     23.652M|   82.781M|
|             |           -|     3.50x|
|to_i         |     89.526M|   84.388M|
|             |       1.06x|         -|
|to_f         |     84.746M|   96.899M|
|             |           -|     1.14x|
|to_r         |     25.107M|   83.472M|
|             |           -|     3.32x|
|splat        |     42.772M|   42.717M|
|             |       1.00x|         -|
```

This makes them much faster
</pre>
</div>
</content>
</entry>
</feed>
