| Age | Commit message (Collapse) | Author |
|
to ensure the issue doesn't exist in ruby_3_2 branch.
|
|
Use UTF-8 encoding for literal extended regexps with UTF-8 characters
in comments
Fixes [Bug #19455]
---
re.c | 9 ++++++++-
test/ruby/test_regexp.rb | 7 +++++++
2 files changed, 15 insertions(+), 1 deletion(-)
|
|
[Bug #19587] Fix `reset_match_cache` arguments
---
regexec.c | 2 +-
test/ruby/test_regexp.rb | 8 ++++++++
2 files changed, 9 insertions(+), 1 deletion(-)
|
|
[Bug #19471] `Regexp.compile` should handle keyword arguments
As well as `Regexp.new`, it should pass keyword arguments to the
`Regexp#initialize` method.
---
re.c | 2 +-
test/ruby/test_regexp.rb | 5 +++++
2 files changed, 6 insertions(+), 1 deletion(-)
|
|
[Bug #19476]: correct cache index computation for repetition (#7457)
---
regexec.c | 4 ++--
test/ruby/test_regexp.rb | 5 +++++
2 files changed, 7 insertions(+), 2 deletions(-)
|
|
[Bug #19467] correct cache points and counting failure on
`OP_ANYCHAR_STAR_PEEK_NEXT` (#7454)
---
regexec.c | 20 ++++++++++++++++----
test/ruby/test_regexp.rb | 10 ++++++++++
2 files changed, 26 insertions(+), 4 deletions(-)
|
|
Fix parsing of regexps that toggle extended mode on/off inside regexp
This was broken in ec3542229b29ec93062e9d90e877ea29d3c19472. That commit
didn't handle cases where extended mode was turned on/off inside the
regexp. There are two ways to turn extended mode on/off:
```
/(?-x:#y)#z
/x =~ '#y'
/(?-x)#y(?x)#z
/x =~ '#y'
```
These can be nested inside the same regexp:
```
/(?-x:(?x)#x
(?-x)#y)#z
/x =~ '#y'
```
As you can probably imagine, this makes handling these regexps
somewhat complex. Due to the nesting inside portions of regexps,
the unassign_nonascii function needs to be recursive. In
recursive mode, it needs to track both opening and closing
parentheses, similar to how it already tracked opening and
closing brackets for character classes.
When scanning the regexp and coming to `(?` not followed by `#`,
scan for options, and use `x` and `i` to determine whether to
turn on or off extended mode. For `:`, indicating only the
current regexp section should have the extended mode
switched, recurse with the extended mode set or unset. For `)`,
indicating the remainder of the regexp (or current regexp portion
if already recursing) should turn extended mode on or off, just
change the extended mode flag and keep scanning.
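The toggling forms described above can be tried from plain Ruby. This is a minimal sketch, not part of the patch: the helper name `extended` is made up for illustration, and the regexps are built from strings so that the newline ending each comment is explicit.

```ruby
# Hypothetical helper: compile a source string with the /x flag.
def extended(source)
  Regexp.new(source, Regexp::EXTENDED)
end

# (?-x:...) turns extended mode off for that group only, so "#y" is a
# literal and the trailing "#z" is a comment again.
scoped = extended("(?-x:#y)#z\n")

# (?-x) turns extended mode off for the remainder of the enclosing
# group, and (?x) turns it back on.
inline = extended("(?-x)#y(?x)#z\n")

# Both forms nested inside one regexp.
nested = extended("(?-x:(?x)#x\n(?-x)#y)#z\n")

# Each pattern reduces to the literal "#y".
p [scoped, inline, nested].map { |re| re =~ "#y" }
```

Each `=~` returns the match offset, so a non-nil result means the `#y` portion was treated as a literal rather than a comment.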
While testing this, I noticed that `a`, `d`, and `u` are accepted
as options, in addition to `i`, `m`, and `x`, but I can't see
where those options are documented. I'm not sure whether or not
handling `a`, `d`, and `u` as options is a bug.
Fixes [Bug #19379]
---
re.c | 153 +++++++++++++++++++++++++++++++++++++----------
test/ruby/test_regexp.rb | 56 +++++++++++++++++
2 files changed, 176 insertions(+), 33 deletions(-)
|
|
Fix [Bug #19273], set correct value to `outer_repeat` on `OP_REPEAT`
(#7035)
---
regexec.c | 2 +-
test/ruby/test_regexp.rb | 5 +++++
2 files changed, 6 insertions(+), 1 deletion(-)
|
|
argument
Previously, only certain values of the 3rd argument triggered a
deprecation warning.
First step for fix for bug #18797. Support for the 3rd argument
will be removed after the release of Ruby 3.2.
Fix minor fallout discovered by the tests.
Co-authored-by: Nobuyoshi Nakada <nobu@ruby-lang.org>
Notes:
Merged: https://github.com/ruby/ruby/pull/6976
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/6988
|
|
Notes:
Merged-By: makenowjust <make.just.on@gmail.com>
|
|
https://bugs.ruby-lang.org/issues/19104#change-100542
Notes:
Merged: https://github.com/ruby/ruby/pull/6902
|
|
Regexp tests are flaky.
http://rubyci.s3.amazonaws.com/s390x/ruby-master/log/20221128T050004Z.fail.html.gz
|
|
It fails on RISC-V (QEMU)
http://rubyci.s3.amazonaws.com/debian-riscv64/ruby-master/log/20221124T000021Z.fail.html.gz
```
1) Error:
TestRegexp#test_cache_optimization_square:
Regexp::TimeoutError: regexp match timeout
/home/rubyci/chkbuild/tmp/build/20221124T000021Z/ruby/test/ruby/test_regexp.rb:1693:in `<main>'
/home/rubyci/chkbuild/tmp/build/20221124T000021Z/ruby/test/ruby/test_regexp.rb:1688:in `test_cache_optimization_square'
```
|
|
The timeout seems too short for some CIs.
http://rubyci.s3.amazonaws.com/debian11-aarch64/ruby-master/log/20221120T012840Z.fail.html.gz
|
|
The tests failed on Windows
https://github.com/ruby/ruby/actions/runs/3440997073/jobs/5740085169#step:18:62
```
1) Failure:
TestRegexp#test_s_timeout [D:/a/ruby/ruby/src/test/ruby/test_regexp.rb:1586]:
<0.30000000000000004> expected but was
<0.3>.
2) Failure:
TestRegexp#test_timeout_shorter_than_global [D:/a/ruby/ruby/src/test/ruby/test_regexp.rb:1631]:
<0.30000000000000004> expected but was
<0.3>.
```
|
|
It does not work well in assert_separately
|
|
Fix per-instance Regexp timeout
This makes it follow what was decided in [Bug #19055]:
* `Regexp.new(str, timeout: nil)` should respect the global timeout
* `Regexp.new(str, timeout: huge_val)` should use the maximum value that
can be represented in the internal representation
* `Regexp.new(str, timeout: 0 or negative value)` should raise an error
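The three rules above could be exercised roughly as follows. This is a sketch that assumes Ruby 3.2 or later, where the `timeout:` keyword and `Regexp#timeout` exist; `instance_timeout` is a made-up helper, not part of Ruby.

```ruby
# Hypothetical helper: build a regexp with the given per-instance
# timeout and report what Regexp#timeout returns for it.
def instance_timeout(value)
  Regexp.new("a", timeout: value).timeout
end

# Guarded so the script is a no-op on Rubies without Regexp timeouts.
if Regexp.respond_to?(:timeout)
  # nil means "no per-instance value": the global Regexp.timeout applies.
  p instance_timeout(nil)

  # A positive value is kept as the per-instance timeout, in seconds.
  p instance_timeout(1.5)

  # Zero and negative values are rejected.
  [0, -1].each do |bad|
    begin
      instance_timeout(bad)
    rescue ArgumentError => e
      puts "timeout: #{bad} rejected (#{e.message})"
    end
  end
end
```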
Notes:
Merged-By: mame <mame@ruby-lang.org>
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/6216
|
|
`Regexp.new` now supports passing the regexp flags not only as an
`Integer`, but also as a `String`. Unknown flags raise errors.
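A short sketch of the string form, assuming Ruby 3.2 or later for the multi-letter and error cases (older versions treat any truthy second argument as "ignore case"). `flags_of` is a made-up helper for illustration.

```ruby
# Hypothetical helper: compile with a flag string and return the
# resulting option bits.
def flags_of(flag_string)
  Regexp.new("a.c", flag_string).options
end

# Each letter corresponds to the literal-regexp flag of the same name.
p flags_of("i") == Regexp::IGNORECASE
p flags_of("mx") == (Regexp::MULTILINE | Regexp::EXTENDED)

# An unknown letter is reported instead of being silently ignored.
begin
  flags_of("z")
rescue ArgumentError => e
  puts "rejected: #{e.message}"
end
```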
Notes:
Merged: https://github.com/ruby/ruby/pull/6039
|
|
The second argument should now be `true`, `false`, `nil`, or an Integer.
This flag is sometimes confused with the third argument.
Notes:
Merged: https://github.com/ruby/ruby/pull/6039
|
|
Invalid escapes are handled at multiple levels. The first level
is in parse.y, so skip invalid unicode escape checks for regexps
in parse.y.
Make rb_reg_preprocess and unescape_nonascii accept the regexp
options. In unescape_nonascii, if the regexp is an extended
regexp, when "#" is encountered, ignore all characters until the
end of line or end of regexp.
Unfortunately, in extended regexps, you can use "#" as a non-comment
character inside a character class, so also parse "[" and "]"
specially for extended regexps, and only skip comments if "#" is
not inside a character class. Handle nested character classes as well.
This issue doesn't just affect extended regexps, it also affects
"(?#" comments inside all regexps. So for those comments, scan
until trailing ")" and ignore content inside.
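The three parsing cases mentioned above (line comments, "#" inside a character class, and "(?#" comments) behave as follows at the Ruby level. This sketch only illustrates the syntax rules the preprocessing has to respect, not the C code itself; `matched` is a made-up helper.

```ruby
# Hypothetical helper: return the matched text, or nil if no match.
def matched(source, subject, options = 0)
  md = Regexp.new(source, options).match(subject)
  md && md[0]
end

# Extended mode: whitespace is ignored and "#" starts a comment that
# runs to the end of the line, so only "a" is part of the pattern.
p matched("a # b\n", "a # b", Regexp::EXTENDED)   # => "a"

# Inside a character class "#" is an ordinary character, even with /x.
p matched("[#]", "a#b", Regexp::EXTENDED)         # => "#"

# "(?#...)" is a comment in any regexp, extended or not.
p matched("a(?# ignored )b", "ab")                # => "ab"
```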
I'm not sure if there are other corner cases not handled. A
better fix would be to redesign the regexp parser so that it
unescapes during parsing instead of before parsing, so you already
know the current parsing state.
Fixes [Bug #18294]
Co-authored-by: Nobuyoshi Nakada <nobu@ruby-lang.org>
Notes:
Merged: https://github.com/ruby/ruby/pull/5721
Merged-By: jeremyevans <code@jeremyevans.net>
|
|
https://hackerone.com/reports/1220911
Notes:
Merged: https://github.com/ruby/ruby/pull/5793
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/5740
Merged-By: nobu <nobu@ruby-lang.org>
|
|
[Bug #18669]
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/5703
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/5703
|
|
[Feature #17837]
Notes:
Merged: https://github.com/ruby/ruby/pull/5703
|
|
Idea from Jirka Marsik.
Fixes [Bug #18631]
Notes:
Merged: https://github.com/ruby/ruby/pull/5710
|
|
|
|
* Add String#byteindex, String#byterindex, and MatchData#byteoffset [Feature #13110]
Co-authored-by: NARUSE, Yui <naruse@airemix.jp>
Notes:
Merged-By: shugo <shugo@ruby-lang.org>
|
|
In certain conditions, Regexp#match could return a MatchData with
missing captures. This seems to require, at the least, multiple
threads calling a method that calls the same block/proc/lambda
which calls Regexp#match.
The race condition happens because the MatchData is passed
indirectly via the backref, and other threads can modify the
backref.
Fix the issue by:
1. Not reusing the existing MatchData from the backref, and always
allocating a new MatchData.
2. Passing the MatchData directly to the caller using a VALUE*,
instead of indirectly through the backref.
It's likely that variants of this issue exist for other Regexp
methods. Anywhere that MatchData is passed implicitly through
the backref is probably vulnerable to this issue.
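The shape of the scenario described above, several threads sharing one lambda that calls Regexp#match, might look like the following. This is a hypothetical sketch rather than the reproduction from the bug report; the method name, pattern, and counts are invented, and on a fixed Ruby every thread is expected to see complete captures.

```ruby
# Hypothetical check: each thread calls the same lambda and verifies
# that the MatchData it gets back carries both of its own captures.
def captures_seen_by_threads(thread_count: 4, iterations: 1_000)
  matcher = ->(text) { /(\w+)-(\d+)/.match(text) }

  threads = Array.new(thread_count) do |i|
    Thread.new do
      iterations.times.all? do
        md = matcher.call("job#{i}-#{i}")
        md && md[1] == "job#{i}" && md[2] == i.to_s
      end
    end
  end

  # Thread#value joins the thread and returns the block's result.
  threads.map(&:value)
end

p captures_seen_by_threads
```

Reading the captures from the returned MatchData, rather than from `$~`, mirrors the second point of the fix: the result reaches the caller directly instead of through the backref.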
Fixes [Bug #17507]
Notes:
Merged: https://github.com/ruby/ruby/pull/4734
|
|
The method to return the length of the matched substring
corresponding to the given argument.
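A brief usage sketch, assuming this entry refers to `MatchData#match_length` (available from Ruby 3.1); the `date_match` helper and its pattern are invented for illustration.

```ruby
# Hypothetical helper with an optional third group.
def date_match(text)
  /(?<year>\d{4})-(?<month>\d{2})(?:-(?<day>\d{2}))?/.match(text)
end

md = date_match("2021-09")

# Guarded so the script still runs on Rubies without the method.
if md.respond_to?(:match_length)
  p md.match_length(0)      # => 7, the whole match "2021-09"
  p md.match_length(:year)  # => 4
  p md.match_length(:day)   # => nil, the group did not participate
end
```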
Notes:
Merged: https://github.com/ruby/ruby/pull/4851
|
|
The method to return the single matched substring corresponding to
the given argument.
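A brief usage sketch, assuming this entry refers to `MatchData#match` (available from Ruby 3.1); the `date_parts` helper and its pattern are invented for illustration.

```ruby
# Hypothetical helper with an optional third group.
def date_parts(text)
  /(?<year>\d{4})-(?<month>\d{2})(?:-(?<day>\d{2}))?/.match(text)
end

md = date_parts("2021-09")

# Guarded so the script still runs on Rubies without the method.
if md.respond_to?(:match)
  p md.match(:year)   # => "2021"
  p md.match(2)       # => "09", groups can also be given by index
  p md.match(:day)    # => nil, the group did not participate
end
```

Unlike `MatchData#[]`, the method takes exactly one group reference and returns a single substring or nil.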
Notes:
Merged: https://github.com/ruby/ruby/pull/4851
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/4822
|
|
Ruby uses a recursive algorithm for handling control/meta escapes
in strings (read_escape). However, the equivalent code for regexps
(tokadd_escape) did not use a recursive algorithm. Due to this,
handling of control/meta escapes in regexps did not have the same
behavior as in strings, leading to behavior such as the following
returning nil:
```ruby
/\c\xFF/ =~ "\c\xFF"
```
Switch the code for handling \c, \C and \M in literal regexps to
use the same code as for strings (read_escape), to keep behavior
consistent between the two.
Fixes [Bug #14367]
Notes:
Merged: https://github.com/ruby/ruby/pull/4495
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/4059
|
|
Also document that both :deprecated and :experimental are supported
:category option values.
The locations where warnings were marked as deprecation warnings
were previously reviewed by shyouhei.
Comment a couple locations where deprecation warnings should probably
be used but are not currently used because deprecation warning
enablement has not occurred at the time they are called
(RUBY_FREE_MIN, RUBY_HEAP_MIN_SLOTS, -K).
Add assert_deprecated_warn to test assertions. Use this to simplify
some tests, and fix failing tests after marking some warnings with
deprecated category.
Notes:
Merged: https://github.com/ruby/ruby/pull/3917
|
|
Only one warning is shown for the same Regexp object, so create
different objects to support repeating tests.
http://ci.rvm.jp/results/trunk-repeat20@phosphorus-docker/3290658
|
|
Instead of suppressing all warnings wholly in each test script by
setting `$VERBOSE` to `nil` in `setup` methods.
Notes:
Merged: https://github.com/ruby/ruby/pull/3925
Merged-By: nobu <nobu@ruby-lang.org>
|
|
Quantifier reduction when using +?)* and +?)+ should not be done
as it affects which text will be matched.
This removes the need for the RQ_PQ_Q ReduceType, so remove the
enum entry and related switch case.
Test that these are the only two patterns affected by testing all
quantifier reduction tuples for both the captured and uncaptured
cases and making sure the matched text is the same for both.
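The captured-versus-uncaptured comparison described above can be sketched for one of the two affected patterns. This assumes a Ruby that includes the fix; `matched_text` is a made-up helper, and the actual test enumerates every quantifier pair rather than a single one.

```ruby
# Hypothetical helper: text of the overall match.
def matched_text(source, subject)
  Regexp.new(source).match(subject)[0]
end

# A lazy "+?" repeated by a greedy "*": the outer star keeps taking
# iterations, so the whole run of "a" is consumed.
captured   = matched_text("(a+?)*", "aa")
uncaptured = matched_text("(?:a+?)*", "aa")

# With the reduction removed, grouping style no longer changes the
# matched text.
p captured
p uncaptured
p captured == uncaptured
```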
Fixes [Bug #17341]
Notes:
Merged: https://github.com/ruby/ruby/pull/3808
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/3483
|
|
|