Age | Commit message (Collapse) | Author |
|
|
|
Avoid race condition in Regexp#match
In certain conditions, Regexp#match could return a MatchData with
missing captures. This seems to require at the least, multiple
threads calling a method that calls the same block/proc/lambda
which calls Regexp#match.
The race condition happens because the MatchData is passed from
indirectly via the backref, and other threads can modify the
backref.
Fix the issue by:
1. Not reusing the existing MatchData from the backref, and always
allocating a new MatchData.
2. Passing the MatchData directly to the caller using a VALUE*,
instead of indirectly through the backref.
It's likely that variants of this issue exist for other Regexp
methods. Anywhere that MatchData is passed implicitly through
the backref is probably vulnerable to this issue.
Fixes [Bug #17507]
---
re.c | 46 +++++++++++++++++++---------------------------
test/ruby/test_regexp.rb | 21 +++++++++++++++++++++
2 files changed, 40 insertions(+), 27 deletions(-)
|
|
Preserve the encoding of the argument in IndexError [Bug #18160]
---
re.c | 20 ++++++++++----------
test/ruby/test_regexp.rb | 7 ++++++-
2 files changed, 16 insertions(+), 11 deletions(-)
|
|
0846c2da457e7523819236ac7da492029b3ef73d,6c7cb00c094332a208cf36e5cd723a9ba60c41b8: [Backport #16376]
Check backref number buffer overrun [Bug #16376]
---
regcomp.c | 21 ++++++++++++---------
test/ruby/test_regexp.rb | 6 ++++++
2 files changed, 18 insertions(+), 9 deletions(-)
test/ruby/test_regexp.rb: Avoid "ambiguity between regexp and two
divisions"
---
test/ruby/test_regexp.rb | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
|
|
Capture to reserved name variables if already defined [Bug #17533]
---
parse.y | 5 +++--
test/ruby/test_regexp.rb | 11 +++++++++++
2 files changed, 14 insertions(+), 2 deletions(-)
|
|
Also document that both :deprecated and :experimental are supported
:category option values.
The locations where warnings were marked as deprecation warnings
was previously reviewed by shyouhei.
Comment a couple locations where deprecation warnings should probably
be used but are not currently used because deprecation warning
enablement has not occurred at the time they are called
(RUBY_FREE_MIN, RUBY_HEAP_MIN_SLOTS, -K).
Add assert_deprecated_warn to test assertions. Use this to simplify
some tests, and fix failing tests after marking some warnings with
deprecated category.
Notes:
Merged: https://github.com/ruby/ruby/pull/3917
|
|
Only one warning is shown for the same Regexp object, so create
different objects to support repeating tests.
http://ci.rvm.jp/results/trunk-repeat20@phosphorus-docker/3290658
|
|
Instead of suppressing all warnings wholly in each test scripts by
setting `$VERBOSE` to `nil` in `setup` methods.
Notes:
Merged: https://github.com/ruby/ruby/pull/3925
Merged-By: nobu <nobu@ruby-lang.org>
|
|
Quantifier reduction when using +?)* and +?)+ should not be done
as it affects which text will be matched.
This removes the need for the RQ_PQ_Q ReduceType, so remove the
enum entry and related switch case.
Test that these are the only two patterns affected by testing all
quantifier reduction tuples for both the captured and uncaptured
cases and making sure the matched text is the same for both.
Fixes [Bug #17341]
Notes:
Merged: https://github.com/ruby/ruby/pull/3808
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/3483
|
|
|
|
Default to ONIGERR_INVALID_CHAR_PROPERTY_NAME in
fetch_char_property_to_ctype and only set otherwise if an ending
} is found.
Fixes [Bug #17340]
Notes:
Merged: https://github.com/ruby/ruby/pull/3807
|
|
`String#sub` with a string pattern defers creating a `Regexp`
until `MatchData#regexp` creates a `Regexp` from the matched
string. `Regexp#last_match(group_name)` accessed its content
without creating the `Regexp` though. [Bug #16508]
|
|
[Feature #8948] [Feature #16377]
Since Regexp literals always reference the same instance,
allowing to mutate them can lead to state leak.
Notes:
Merged: https://github.com/ruby/ruby/pull/2705
|
|
This reverts commit 2a22a6b2d8465934e75520a7fdcf522d50890caf.
Revert [Feature #13083]
|
|
This reverts commit 452bee3ee8d68059fabd9b1c7a75661b14e3933e.
|
|
|
|
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/2637
|
|
* {String|Symbol}#match{?} with nil returns falsy
To improve consistency with Regexp#match{?}
* String#match(nil) returns `nil` instead of TypeError
* String#match?(nil) returns `false` instead of TypeError
* Symbol#match(nil) returns `nil` instead of TypeError
* Symbol#match?(nil) returns `false` instead of TypeError
* Prefer exception
* Follow empty ENV
* Drop outdated specs
* Write ruby/spec for above
https://github.com/ruby/ruby/pull/1506/files#r183242981
* Fix merge miss
|
|
|
|
Unicode Version 12.1.0 adds one single character, U+32FF SQUARE ERA NAME REIWA,
for the new Japanese era starting on May 1st. 12.1.0 will be finalized only on
May 7th, so we go with the beta version because further changes in the data we
need are highly unlikely, and we want to make sure Ruby is ready for the new era.
* common.mk: change UNICODE_VERSION to 12.1.0, UNICODE_BETA to YES
* enc/unicode/12.1.0, enc/unicode/12.1.0/casefold.h, enc/unicode/12.1.0/name2ctype.h:
add directory and generated data files for new version
* lib/unicode_normalize/tables.rb: update for new character
* test/ruby/test_regexp.rb: add test for character property age=12.1
* test/test_unicode_normalize.rb: add test for NFKC decomposition of new character
This (mostly) completes issue #15195.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67441 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
after `/' operator
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66821 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
* parse.y (reg_named_capture_assign_iter): ignore non-local name
captures, including non-ASCII constant names.
[ruby-dev:50719] [Bug #15437]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66463 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66193 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
In test/ruby/test_regexp.rb and test/ruby/test_string.rb, change
some instances of COMBINING DIAERESIS (U+0308, above) to
COMBINING DIAERESIS BELOW (U+0324) to make it more easily visible
in test output, particularly in the context of double quotes
surrounding strings.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66192 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
* regparse.c (node_extended_grapheme_cluster): as Unicode 10 has
added Grapheme_Cluster_Break properties to some characters,
remove duplicated ranges for Unicode 9.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65086 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
* re.c (rb_reg_str_with_term): change terminator.
* re.c (rb_reg_s_union): terminator in source string does not need
to be escaped. terminators are outside of regexp source itself.
[ruby-core:86149] [Bug #14608]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@62779 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
* re.c (unescape_nonascii): escaped multibyte character should be
copied as-is, just with checking if the encoding matches.
https://twitter.com/sakuro/status/972014409986883584
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@62718 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
* More general fix coming.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@61194 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
* Reverts part of r54522.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@61193 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
* regexec.c (match_at): invalidate end position not yet matched
when new start position is pushed, to dispose previously stored
position. [ruby-core:83743] [Bug #14101]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60755 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
* test/ruby/test_regexp.rb (test_absent): add simple tests for
absent operator.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60754 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
* re.c (rb_reg_to_s): needs embedded options to check syntax of
sub-regexp. [ruby-core:82328] [Bug #13798]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59574 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
* parse.y (reg_named_capture_assign_iter): do not insert trace
instructions before local variable assinments. putobject is
expected at first. [ruby-core:79940] [Bug #13287]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57801 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
* regparse.c (parse_char_class): initialize return values before
depth limit check. returned values will be freed in callers
regardless the error. [ruby-core:79624] [Bug #13234]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57660 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
Follow r49675, r57098 and r57110. Don't assume RMatch::regexp always
contains a valid Regexp instance; it will be Qnil if the MatchData is
created by rb_backref_set_string(). [ruby-core:78741] [Bug #13054]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57123 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
rb_enc_ascget() erroneously reports success even if the given byte
sequence is incomplete, for non-ASCII compatible encoding strings.
rb_enc_precise_mbclen() may return a negative value on error, and thus
rb_enc_ascget() must not store the return value in 'unsigned int';
otherwise the subsequent MBCLEN_CHARFOUND_P() check won't catch the
error. [ruby-core:78646] [Bug #13034]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57078 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
* re.c (rb_reg_match_m_p): consider char boundary. rb_str_subpos
does not adjust to the boundary if len == 0.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57051 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
Patch by: Koichi ITO <koic.ito@gmail.com>
[Fix GH-1498]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56999 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
Follow r16757 ("* re.c: fix SEGV by Regexp.allocate.names,
Match.allocate.names, etc.", 2008-06-02). Don't do null dereference if
MatchData#hash or #== is called against an uninitialized instance.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56994 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
* meta character \X matches Unicode 9.0.0 characters with some workarounds
for UTR #51 Unicode Emoji, Version 4.0 emoji zwj sequences.
[Feature #12831] [ruby-core:77586]
The term "character" can have many meanings bytes, codepoints, combined
characters, and so on. "grapheme cluster" is highest one of such words,
which means user-perceived characters.
Unicode Standard Annex #29 UNICODE TEXT SEGMENTATION specifies how to
handle grapheme clusters (extended grapheme cluster).
But some specs aren't updated to current situation because Unicode Emoji
is rapidly extended without well definition.
It breaks the precondition of UTR#29 "Grapheme cluster boundaries can be
easily tested by looking at immediately adjacent characters". (the
sentence will be removed in the next version)
Though some of its detail are described in Unicode Technical Report #51
UNICODE EMOJI but it is not merged into UTR#29 yet.
http://unicode.org/reports/tr29/
http://unicode.org/reports/tr51/
http://unicode.org/Public/emoji/4.0/
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56949 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56037 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
Ruby's Regexp doesn't allow normal numbered groups if the regexp
has named groups. On such case it optimizes out related NT_ENCLOSE.
But even on the case it can use \g<0>.
This fix not to remove NT_ENCLOSE whose regnum is 0.
[ruby-core:75828] [Bug #12454]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55562 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
character is too big. [Bug #12420] [Bug #12423]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55163 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
if compile_length_tree raised error [Bug #12418]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55154 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
* re.c (rb_reg_match_m_p): [DOC] fix return value in rdoc.
* test/ruby/test_regexp.rb (TestRegexp#test_match_p): add some
tests from document.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55075 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
* re.c (rb_reg_match_m_p): fix match against empty string.
rb_str_offset returns the end when the position exceeds the
length. fix the range parameter of onig_search.
[ruby-core:75604] [Bug #12394]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55069 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
* re.c (rb_reg_match_m_p): should return nil if no match, as the
document says. [Feature #8110]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55067 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
bool and doesn't save backref.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55061 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|