<feed xmlns='http://www.w3.org/2005/Atom'>
<title>ruby.git/re.c, branch v4.0.4</title>
<subtitle>The Ruby Programming Language</subtitle>
<link rel='alternate' type='text/html' href='https://git.ruby-lang.org/ruby.git/'/>
<entry>
<title>merge revision(s) 526344b56ea968d5704bdefe6e10bb3cf7f4f569, 8ad6baa01746e8de0460f0ccdaee69953a70af17: [Backport #21933]</title>
<updated>2026-05-11T20:43:47+00:00</updated>
<author>
<name>Takashi Kokubun</name>
<email>takashikkbn@gmail.com</email>
</author>
<published>2026-05-11T20:43:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.ruby-lang.org/ruby.git/commit/?id=dd78605b2d06600750c331f307083d60df702814'/>
<id>dd78605b2d06600750c331f307083d60df702814</id>
<content type='text'>
	[PATCH] Fix Box regexp match vars after non-match

	[PATCH] Use box_ready for $&amp;, $`, $', $+

	These variables have rb_gvar_readonly_setter, so box_ready is sufficient.
	Only $~ needs box_dynamic due to its custom match_setter.
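
A minimal sketch of the ordinary Ruby semantics these variables follow, under the assumption that behavior inside a Box is meant to match it. `match_vars_after` is a hypothetical helper for illustration and is not part of re.c:

```ruby
# Hypothetical helper: run one match and report the match variables.
# $~ is frame-local, so it is read inside the same method that matched.
def match_vars_after(str, pattern)
  str =~ pattern
  # $~ is the MatchData (or nil); $` and $' are the text before and
  # after the match; $+ is the last matched group.
  [$~, $`, $', $+]
end

matched = match_vars_after("hello", /l(l)/)
missed  = match_vars_after("hello", /zz/)

puts matched.inspect
puts missed.inspect
```

After a non-match, all four values are expected to be nil, since the derived variables are read from `$~`.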
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
	[PATCH] Fix Box regexp match vars after non-match

	[PATCH] Use box_ready for $&amp;, $`, $', $+

	These variables have rb_gvar_readonly_setter, so box_ready is sufficient.
	Only $~ needs box_dynamic due to its custom match_setter.
</pre>
</div>
</content>
</entry>
<entry>
<title>use `SET_SHAREABLE`</title>
<updated>2025-10-23T04:08:26+00:00</updated>
<author>
<name>Koichi Sasada</name>
<email>ko1@atdot.net</email>
</author>
<published>2025-09-24T20:50:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.ruby-lang.org/ruby.git/commit/?id=bc00c4468e0054ca896d2b83d3020180915f64cf'/>
<id>bc00c4468e0054ca896d2b83d3020180915f64cf</id>
<content type='text'>
to adopt the strict shareable rule.

* (basically) shareable objects only refer to shareable objects
* (exception) shareable objects can refer to unshareable objects
  but should not leak references to unshareable objects to the Ruby world
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
to adopt the strict shareable rule.

* (basically) shareable objects only refer to shareable objects
* (exception) shareable objects can refer to unshareable objects
  but should not leak references to unshareable objects to the Ruby world
</pre>
</div>
</content>
</entry>
<entry>
<title>ZJIT: Compile toregexp (#14200)</title>
<updated>2025-08-19T14:02:13+00:00</updated>
<author>
<name>Daniel Colson</name>
<email>danieljamescolson@gmail.com</email>
</author>
<published>2025-08-19T14:02:13+00:00</published>
<link rel='alternate' type='text/html' href='https://git.ruby-lang.org/ruby.git/commit/?id=fc5ee247d5307a292cd2b083ce82fc24005bb385'/>
<id>fc5ee247d5307a292cd2b083ce82fc24005bb385</id>
<content type='text'>
`toregexp` is fairly similar to `concatstrings`, so this commit extracts
a helper for pushing and popping operands on the native stack.

There's probably an opportunity to move some of this into LIR (e.g. Alan
suggested a push_many that could use STP on ARM to push 2 at a time),
but I might save that for another day.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
`toregexp` is fairly similar to `concatstrings`, so this commit extracts
a helper for pushing and popping operands on the native stack.

There's probably an opportunity to move some of this into LIR (e.g. Alan
suggested a push_many that could use STP on ARM to push 2 at a time),
but I might save that for another day.</pre>
</div>
</content>
</entry>
<entry>
<title>[DOC] Fill undocumented documents</title>
<updated>2025-08-03T17:23:43+00:00</updated>
<author>
<name>Nobuyoshi Nakada</name>
<email>nobu@ruby-lang.org</email>
</author>
<published>2025-08-03T17:23:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.ruby-lang.org/ruby.git/commit/?id=6179cc011829b9e4c7b253ac2d2a3f47d8fd6890'/>
<id>6179cc011829b9e4c7b253ac2d2a3f47d8fd6890</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>* adjust indent</title>
<updated>2025-06-17T03:30:18+00:00</updated>
<author>
<name>Nobuyoshi Nakada</name>
<email>nobu@ruby-lang.org</email>
</author>
<published>2025-06-16T15:42:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.ruby-lang.org/ruby.git/commit/?id=a60bf9e693706c69484521d4967c9beb4d45772b'/>
<id>a60bf9e693706c69484521d4967c9beb4d45772b</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>Only use regex internal reg_cache when in main ractor</title>
<updated>2025-06-12T20:13:18+00:00</updated>
<author>
<name>Luke Gruber</name>
<email>luke.gruber@shopify.com</email>
</author>
<published>2025-06-12T15:10:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.ruby-lang.org/ruby.git/commit/?id=97994c77fb5b82ca959e1188ecaee7d633d60a8e'/>
<id>97994c77fb5b82ca959e1188ecaee7d633d60a8e</id>
<content type='text'>
Using this `reg_cache` is racy across ractors, so don't use it when in a
non-main ractor. Also, its use across ractors can cause a regular expression
created in one ractor to be used in another ractor (an isolation bug).
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Using this `reg_cache` is racy across ractors, so don't use it when in a
non-main ractor. Also, its use across ractors can cause a regular expression
created in one ractor to be used in another ractor (an isolation bug).
</pre>
</div>
</content>
</entry>
<entry>
<title>Fix regular expressions across ractors that match different encodings</title>
<updated>2025-06-10T16:00:17+00:00</updated>
<author>
<name>Luke Gruber</name>
<email>luke.gruber@shopify.com</email>
</author>
<published>2025-06-09T22:21:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.ruby-lang.org/ruby.git/commit/?id=585dcffff1a0ed5fe43657661644628707ff0869'/>
<id>585dcffff1a0ed5fe43657661644628707ff0869</id>
<content type='text'>
In commit d42b9ffb206, an optimization was introduced that can speed up
Regexp#match by 15% when it matches with strings of different encodings.
This optimization, however, does not work across ractors. To fix this,
we only use the optimization if no ractors have been started. In the
future, we could use atomics for the reference counting if we find it's
needed and if it's more performant.

The backtrace of the misbehaving native thread:

```
  * frame #0: 0x0000000189c94388 libsystem_kernel.dylib`__pthread_kill + 8
    frame #1: 0x0000000189ccd88c libsystem_pthread.dylib`pthread_kill + 296
    frame #2: 0x0000000189bd6c60 libsystem_c.dylib`abort + 124
    frame #3: 0x0000000189adb174 libsystem_malloc.dylib`malloc_vreport + 892
    frame #4: 0x0000000189adec90 libsystem_malloc.dylib`malloc_report + 64
    frame #5: 0x0000000189ae321c libsystem_malloc.dylib`___BUG_IN_CLIENT_OF_LIBMALLOC_POINTER_BEING_FREED_WAS_NOT_ALLOCATED + 32
    frame #6: 0x00000001001c3be4 ruby`onig_free_body(reg=0x000000012d84b660) at regcomp.c:5663:5
    frame #7: 0x00000001001ba828 ruby`rb_reg_prepare_re(re=4748462304, str=4748451168) at re.c:1680:13
    frame #8: 0x00000001001bac58 ruby`rb_reg_onig_match(re=4748462304, str=4748451168, match=(ruby`reg_onig_search [inlined] rbimpl_RB_TYPE_P_fastpath at value_type.h:349:14
ruby`reg_onig_search [inlined] rbimpl_rstring_getmem at rstring.h:391:5
ruby`reg_onig_search at re.c:1781:5), args=0x000000013824b168, regs=0x000000013824b150) at re.c:1708:20
    frame #9: 0x00000001001baefc ruby`rb_reg_search_set_match(re=4748462304, str=4748451168, pos=&lt;unavailable&gt;, reverse=0, set_backref_str=1, set_match=0x0000000000000000) at re.c:1809:27
    frame #10: 0x00000001001bae80 ruby`rb_reg_search0(re=&lt;unavailable&gt;, str=&lt;unavailable&gt;, pos=&lt;unavailable&gt;, reverse=&lt;unavailable&gt;, set_backref_str=&lt;unavailable&gt;, match=&lt;unavailable&gt;) at re.c:1861:12 [artificial]
    frame #11: 0x0000000100230b90 ruby`rb_pat_search0(pat=&lt;unavailable&gt;, str=&lt;unavailable&gt;, pos=&lt;unavailable&gt;, set_backref_str=&lt;unavailable&gt;, match=&lt;unavailable&gt;) at string.c:6619:16 [artificial]
    frame #12: 0x00000001002287f4 ruby`rb_str_sub_bang [inlined] rb_pat_search(pat=4748462304, str=4748451168, pos=0, set_backref_str=1) at string.c:6626:12
    frame #13: 0x00000001002287dc ruby`rb_str_sub_bang(argc=1, argv=0x00000001381280d0, str=4748451168) at string.c:6668:11
    frame #14: 0x000000010022826c ruby`rb_str_sub
```

You can reproduce this by running:
```
RUBY_TESTOPTS="--name=/test_str_capitalize/" make test-all TESTS=test/ruby/test_m17n.comb
```

However, you need to run it with multiple ractors at once.

Co-authored-by: jhawthorn &lt;john@hawthorn.email&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In commit d42b9ffb206, an optimization was introduced that can speed up
Regexp#match by 15% when it matches with strings of different encodings.
This optimization, however, does not work across ractors. To fix this,
we only use the optimization if no ractors have been started. In the
future, we could use atomics for the reference counting if we find it's
needed and if it's more performant.

The backtrace of the misbehaving native thread:

```
  * frame #0: 0x0000000189c94388 libsystem_kernel.dylib`__pthread_kill + 8
    frame #1: 0x0000000189ccd88c libsystem_pthread.dylib`pthread_kill + 296
    frame #2: 0x0000000189bd6c60 libsystem_c.dylib`abort + 124
    frame #3: 0x0000000189adb174 libsystem_malloc.dylib`malloc_vreport + 892
    frame #4: 0x0000000189adec90 libsystem_malloc.dylib`malloc_report + 64
    frame #5: 0x0000000189ae321c libsystem_malloc.dylib`___BUG_IN_CLIENT_OF_LIBMALLOC_POINTER_BEING_FREED_WAS_NOT_ALLOCATED + 32
    frame #6: 0x00000001001c3be4 ruby`onig_free_body(reg=0x000000012d84b660) at regcomp.c:5663:5
    frame #7: 0x00000001001ba828 ruby`rb_reg_prepare_re(re=4748462304, str=4748451168) at re.c:1680:13
    frame #8: 0x00000001001bac58 ruby`rb_reg_onig_match(re=4748462304, str=4748451168, match=(ruby`reg_onig_search [inlined] rbimpl_RB_TYPE_P_fastpath at value_type.h:349:14
ruby`reg_onig_search [inlined] rbimpl_rstring_getmem at rstring.h:391:5
ruby`reg_onig_search at re.c:1781:5), args=0x000000013824b168, regs=0x000000013824b150) at re.c:1708:20
    frame #9: 0x00000001001baefc ruby`rb_reg_search_set_match(re=4748462304, str=4748451168, pos=&lt;unavailable&gt;, reverse=0, set_backref_str=1, set_match=0x0000000000000000) at re.c:1809:27
    frame #10: 0x00000001001bae80 ruby`rb_reg_search0(re=&lt;unavailable&gt;, str=&lt;unavailable&gt;, pos=&lt;unavailable&gt;, reverse=&lt;unavailable&gt;, set_backref_str=&lt;unavailable&gt;, match=&lt;unavailable&gt;) at re.c:1861:12 [artificial]
    frame #11: 0x0000000100230b90 ruby`rb_pat_search0(pat=&lt;unavailable&gt;, str=&lt;unavailable&gt;, pos=&lt;unavailable&gt;, set_backref_str=&lt;unavailable&gt;, match=&lt;unavailable&gt;) at string.c:6619:16 [artificial]
    frame #12: 0x00000001002287f4 ruby`rb_str_sub_bang [inlined] rb_pat_search(pat=4748462304, str=4748451168, pos=0, set_backref_str=1) at string.c:6626:12
    frame #13: 0x00000001002287dc ruby`rb_str_sub_bang(argc=1, argv=0x00000001381280d0, str=4748451168) at string.c:6668:11
    frame #14: 0x000000010022826c ruby`rb_str_sub
```

You can reproduce this by running:
```
RUBY_TESTOPTS="--name=/test_str_capitalize/" make test-all TESTS=test/ruby/test_m17n.comb
```

However, you need to run it with multiple ractors at once.

Co-authored-by: jhawthorn &lt;john@hawthorn.email&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Fix memory leak in rb_reg_search_set_match</title>
<updated>2025-03-12T01:55:03+00:00</updated>
<author>
<name>Peter Zhu</name>
<email>peter@peterzhu.ca</email>
</author>
<published>2025-03-11T19:05:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.ruby-lang.org/ruby.git/commit/?id=1cdec3240b3c998c0cbf73556786aa3fa0b02ae7'/>
<id>1cdec3240b3c998c0cbf73556786aa3fa0b02ae7</id>
<content type='text'>
https://github.com/ruby/ruby/pull/12801 changed regexp matches to reuse
the backref, which causes memory to leak if the original registers of the
match are not freed.

For example, the following script leaks memory:

    10.times do
      1_000_000.times do
        "aaaaaaaaaaa".gsub(/a/, "")
      end

      puts `ps -o rss= -p #{$$}`
    end

Before:

    774256
    1535152
    2297360
    3059280
    3821296
    4583552
    5160304
    5091456
    5114256
    4980192

After:

    12480
    11440
    11696
    11632
    11632
    11760
    11824
    11824
    11824
    11888
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
https://github.com/ruby/ruby/pull/12801 changed regexp matches to reuse
the backref, which causes memory to leak if the original registers of the
match are not freed.

For example, the following script leaks memory:

    10.times do
      1_000_000.times do
        "aaaaaaaaaaa".gsub(/a/, "")
      end

      puts `ps -o rss= -p #{$$}`
    end

Before:

    774256
    1535152
    2297360
    3059280
    3821296
    4583552
    5160304
    5091456
    5114256
    4980192

After:

    12480
    11440
    11696
    11632
    11632
    11760
    11824
    11824
    11824
    11888
</pre>
</div>
</content>
</entry>
<entry>
<title>Reuse the backref if it isn't marked as busy.</title>
<updated>2025-02-24T17:32:46+00:00</updated>
<author>
<name>Jean Boussier</name>
<email>jean.boussier@gmail.com</email>
</author>
<published>2025-02-24T13:42:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.ruby-lang.org/ruby.git/commit/?id=97e6ad49a4604e7e4ca04da2aaafc63cbd5d29d8'/>
<id>97e6ad49a4604e7e4ca04da2aaafc63cbd5d29d8</id>
<content type='text'>
[Misc #20652]
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[Misc #20652]
</pre>
</div>
</content>
</entry>
<entry>
<title>String#gsub! Elide MatchData allocation when we know it can't escape</title>
<updated>2025-02-24T17:32:46+00:00</updated>
<author>
<name>Jean Boussier</name>
<email>jean.boussier@gmail.com</email>
</author>
<published>2025-02-24T10:39:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.ruby-lang.org/ruby.git/commit/?id=87f9c3c65e38fa3e5c6ef097e2cf63ff448f48d6'/>
<id>87f9c3c65e38fa3e5c6ef097e2cf63ff448f48d6</id>
<content type='text'>
If `gsub` is used with a string replacement or a hash that doesn't
have a default proc, we know for sure no code can cause the MatchData
to escape the `gsub` call.

In such cases, we still have to allocate a new MatchData because we
don't know the lifetime of the backref, but for any subsequent
match we can reuse the MatchData we allocated ourselves, reducing
allocations significantly.

This partially fixes [Misc #20652], except when a block is used,
and partially reduces the performance impact of
abc0304cb28cb9dcc3476993bc487884c139fd11 / [Bug #17507]
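
A minimal sketch of the three call forms discussed above, to show which ones let user code observe the MatchData. The method names are hypothetical and only illustrate the distinction. Nothing here measures allocations:

```ruby
# String replacement: no user code runs during the call, so the
# MatchData cannot escape.
def strip_a(str)
  str.gsub(/a/, "")
end

# Hash replacement without a default proc: lookups do not call back
# into Ruby code, so the MatchData cannot escape either.
def swap_with_table(str)
  table = Hash["a", "1", "b", "2"]
  str.gsub(/[ab]/, table)
end

# Block form: the block can read $~ and keep it, so each MatchData
# may escape the gsub call.
def collect_matches(str)
  seen = []
  str.gsub(/a/) { seen.push($~); "" }
  seen
end

puts strip_a("banana")
puts swap_with_table("abc")
puts collect_matches("banana").size
```

Only the first two forms qualify for the reuse described here. The block form is the case this commit leaves unoptimized.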

```
compare-ruby: ruby 3.5.0dev (2025-02-24T09:44:57Z master 5cf146399f) +PRISM [arm64-darwin24]
built-ruby: ruby 3.5.0dev (2025-02-24T10:58:27Z gsub-elude-match da966636e9) +PRISM [arm64-darwin24]
warming up....

|                 |compare-ruby|built-ruby|
|:----------------|-----------:|---------:|
|escape           |      3.577k|    3.697k|
|                 |           -|     1.03x|
|escape_bin       |      5.869k|    6.743k|
|                 |           -|     1.15x|
|escape_utf8      |      3.448k|    3.738k|
|                 |           -|     1.08x|
|escape_utf8_bin  |      6.361k|    7.267k|
|                 |           -|     1.14x|
```

Co-Authored-By: Étienne Barrié &lt;etienne.barrie@gmail.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
If `gsub` is used with a string replacement or a hash that doesn't
have a default proc, we know for sure no code can cause the MatchData
to escape the `gsub` call.

In such cases, we still have to allocate a new MatchData because we
don't know the lifetime of the backref, but for any subsequent
match we can reuse the MatchData we allocated ourselves, reducing
allocations significantly.

This partially fixes [Misc #20652], except when a block is used,
and partially reduces the performance impact of
abc0304cb28cb9dcc3476993bc487884c139fd11 / [Bug #17507]

```
compare-ruby: ruby 3.5.0dev (2025-02-24T09:44:57Z master 5cf146399f) +PRISM [arm64-darwin24]
built-ruby: ruby 3.5.0dev (2025-02-24T10:58:27Z gsub-elude-match da966636e9) +PRISM [arm64-darwin24]
warming up....

|                 |compare-ruby|built-ruby|
|:----------------|-----------:|---------:|
|escape           |      3.577k|    3.697k|
|                 |           -|     1.03x|
|escape_bin       |      5.869k|    6.743k|
|                 |           -|     1.15x|
|escape_utf8      |      3.448k|    3.738k|
|                 |           -|     1.08x|
|escape_utf8_bin  |      6.361k|    7.267k|
|                 |           -|     1.14x|
```

Co-Authored-By: Étienne Barrié &lt;etienne.barrie@gmail.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
