ruby.git/test/ruby/test_regexp.rb, branch v3_4_9

[Backport #13671] Fix that "ss" in look-behind causes syntax error

2025-11-06T18:25:26+00:00

Fixes k-takata/Onigmo#92.

This fix was ported from oniguruma:
https://github.com/kkos/oniguruma/commit/257082dac8c6019198b56324012f0bd1830ff4ba

https://github.com/k-takata/Onigmo/commit/b1a5445fbeba97b3e94a733c2ce11c033453af73

Make word prop match join_control to conform to UTS 18

2025-08-27T22:17:15+00:00

See .

https://unicode.org/reports/tr18/#word states word should match join_control chars.

It did not previously:

```ruby
[*0x0..0xD799, *0xE000..0x10FFFF].map { |n| n.chr 'utf-8' } => all_chars
all_chars.grep(/\p{join_control}/) => jc
jc.count # => 2
jc.grep(/\p{word}/).count # => 0
```

[Backport #19417]
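
A narrower check than the full code-point scan above, as a minimal sketch: it looks only at the two join-control characters (U+200C and U+200D) and reports whether each matches `\p{word}`. The `report` variable and its layout are illustrative, not part of the commit. The `word` column depends on whether the interpreter in use includes this change, so no expected output is shown.

```ruby
# ZERO WIDTH NON-JOINER and ZERO WIDTH JOINER, the two Join_Control characters.
joiners = ["\u200C", "\u200D"]

# Map each code point label to the properties it matches.
report = joiners.to_h do |ch|
  [
    format("U+%04X", ch.ord),
    {
      join_control: ch.match?(/\p{join_control}/),
      word: ch.match?(/\p{word}/) # true only on builds that include the fix
    }
  ]
end

report.each { |code_point, props| puts "#{code_point}: #{props}" }
```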

---

Backporting note: I regenerated `enc/unicode/15.0.0/name2ctype.h` using
`make update-unicode`.

Fix regex timeout double-free after stack_double

2024-11-12T07:33:21+00:00

As of 10574857ce167869524b97ee862b610928f6272f, it's possible to crash
on a double free due to `stk_alloc` AKA `msa->stack_p` being freed
twice, once at the end of match_at and a second time in `FREE_MATCH_ARG`
in the parent caller.

Fixes [Bug #20886]

Fix memory leak in Regexp capture group when timeout

2024-07-25T13:23:49+00:00

[Bug #20650]

The capture group allocates memory that is leaked when the match times out.

For example:

    re = Regexp.new("^#{"(a*)" * 10_000}x$", timeout: 0.000001)
    str = "a" * 1000000 + "x"

    10.times do
      100.times do
        re =~ str
      rescue Regexp::TimeoutError
      end

      puts `ps -o rss= -p #{$$}`
    end

Before:

    34688
    56416
    78288
    100368
    120784
    140704
    161904
    183568
    204320
    224800

After:

    16288
    16288
    16880
    16896
    16912
    16928
    16944
    17184
    17184
    17200

Add MatchData#bytebegin and MatchData#byteend

2024-07-16T05:48:06+00:00

These methods return the byte-based offset of the beginning or end of the specified match.

[Feature #20576]
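
A small sketch of how the byte-based offsets differ from the character-based ones on a multibyte string. The sample string and variable names are illustrative. The `respond_to?` guard and the `pre_match.bytesize` fallback are assumptions added so the snippet also runs on interpreters that predate these methods.

```ruby
str = "\u00E9ab" # "éab": "é" takes 2 bytes in UTF-8, "a" and "b" take 1 each
m = /b/.match(str)

# Character-based offsets of the whole match.
char_span = [m.begin(0), m.end(0)]

# Byte-based offsets of the same match.
byte_span =
  if m.respond_to?(:bytebegin)
    [m.bytebegin(0), m.byteend(0)]
  else
    # Fallback for older interpreters: count the bytes before the match.
    start = m.pre_match.bytesize
    [start, start + m[0].bytesize]
  end

p char_span # => [2, 3]
p byte_span # => [3, 4]
p str.byteslice(byte_span[0], byte_span[1] - byte_span[0]) # => "b"
```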

TestRegexp#test_match_cache_positive_look_behind: Extend the timeout limit

2024-06-07T14:29:59+00:00

TestRegexp#test_timeout_shorter_than_global: Extend the timeout limit

2024-06-07T14:11:10+00:00

TestRegexp#test_s_timeout: accept timeout errors more tolerantly

2024-06-07T13:37:08+00:00

This test seems flaky on macOS GitHub Actions

Don't use assert_separately in Bug 20453 test

2024-04-25T15:28:56+00:00

https://github.com/ruby/ruby/pull/10630#discussion_r1579565056

The PR was merged before I had a chance to address this feedback.
`assert_separately` is not necessary for this test if I don't use a
global timeout.
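
The difference can be sketched as follows: a timeout passed to `Regexp.new` applies only to that object, so the process-wide `Regexp.timeout` setting is never touched and the test needs no separate process. This is a minimal illustration, not the test from the commit. The pattern is a backreference-based ReDoS case, chosen on the assumption that it does not finish within the 0.1 second limit.

```ruby
global_before = Regexp.timeout

# Per-instance timeout: only this Regexp object is affected.
re = Regexp.new('^(a*)*\1$', timeout: 0.1)

outcome =
  begin
    re =~ "a" * 1_000_000 + "x"
    :finished
  rescue Regexp::TimeoutError
    :timed_out
  end

global_after = Regexp.timeout

puts outcome
puts "global timeout unchanged: #{global_before == global_after}"
```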

[Bug #20453] segfault in Regexp timeout

2024-04-25T14:28:18+00:00

https://bugs.ruby-lang.org/issues/20228 started freeing `stk_base` to
avoid a memory leak. But `stk_base` is sometimes stack allocated (using
`xalloca`), so the free only works if the regex stack has grown enough
to hit `stack_double` (which uses `xmalloc` and `xrealloc`).

To reproduce the problem on master and 3.3.1:

```ruby
Regexp.timeout = 0.001
/^(a*)x$/ =~ "a" * 1000000 + "x"
```

Some details about this potential fix:

`stk_base == stk_alloc` on
[init](https://github.com/ruby/ruby/blob/dde99215f2bc60c22a00fc941ff7f714f011e920/regexec.c#L1153),
so if `stk_base != stk_alloc` we can be sure we called
[`stack_double`](https://github.com/ruby/ruby/blob/dde99215f2bc60c22a00fc941ff7f714f011e920/regexec.c#L1210)
and it's safe to free. It's also safe to free if we've
[saved](https://github.com/ruby/ruby/blob/dde99215f2bc60c22a00fc941ff7f714f011e920/regexec.c#L1187-L1189)
the stack to `msa->stack_p`, since we do the `stk_base != stk_alloc`
check before saving.
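
The ownership rule can be pictured with a toy model. This is only a sketch: Ruby has no manual `free`, the `ToyMatchStack` class is hypothetical, and it leaves out the `msa->stack_p` save path. The instance variable names mirror the C variables above, and object identity stands in for pointer equality.

```ruby
# Toy model only: mirrors the stk_base / stk_alloc relationship, not regexec.c itself.
class ToyMatchStack
  def initialize(inline_slots)
    @stk_alloc = Array.new(inline_slots) # stands in for the xalloca buffer
    @stk_base  = @stk_alloc              # init: stk_base == stk_alloc
  end

  # Stands in for stack_double: growth moves the stack to a new object.
  def stack_double
    @stk_base = @stk_base + Array.new(@stk_base.size)
    self
  end

  # The guard: freeing is valid only once the stack has left the inline buffer.
  def safe_to_free?
    !@stk_base.equal?(@stk_alloc)
  end
end

fresh = ToyMatchStack.new(4)
grown = ToyMatchStack.new(4).stack_double

puts fresh.safe_to_free? # false: still the stack-allocated buffer
puts grown.safe_to_free? # true: stack_double replaced it
```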

This matches the check we do inside
[`stack_double`](https://github.com/ruby/ruby/blob/dde99215f2bc60c22a00fc941ff7f714f011e920/regexec.c#L1221)