path: root/test/ruby/test_regexp.rb
Age | Commit message | Author
2021-05-12 | Fix handling of control/meta escapes in literal regexps | Jeremy Evans
Ruby uses a recursive algorithm for handling control/meta escapes in strings (read_escape). However, the equivalent code for regexps (tokadd_escape) did not use a recursive algorithm. Because of this, handling of control/meta escapes in regexps did not behave the same as in strings, leading to behavior such as the following returning nil:

```ruby
/\c\xFF/ =~ "\c\xFF"
```

Switch the code for handling \c, \C and \M in literal regexps to use the same code as for strings (read_escape), to keep behavior consistent between the two.

Fixes [Bug #14367]

Notes: Merged: https://github.com/ruby/ruby/pull/4495
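The consistency goal can be illustrated with a plain ASCII control escape, which behaves the same in both literal kinds. This is only a minimal sketch; it does not reproduce the nested `\c\xFF` case from the bug report, and the variable name is illustrative.

```ruby
# A control escape should denote the same character in a string
# literal and in a regexp literal.
control_a = "\cA"                 # Ctrl-A, i.e. the byte 0x01

puts control_a == "\x01"          # the string escape yields byte 0x01
puts(/\cA/.match?(control_a))     # the regexp escape matches that byte
puts(/\C-a/.match?(control_a))    # \C-a is another spelling of Ctrl-A
```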
2021-03-16 | test/ruby/test_regexp.rb: Avoid "ambiguity between regexp and two divisions" | Yusuke Endoh
2021-03-15 | Check backref number buffer overrun [Bug #16376] | xtkoba (Tee KOBAYASHI)
2021-01-13 | Capture to reserved name variables if already defined [Bug #17533] | Nobuyoshi Nakada
Notes: Merged: https://github.com/ruby/ruby/pull/4059
2020-12-18 | Use category: :deprecated in warnings that are related to deprecation | Jeremy Evans
Also document that both :deprecated and :experimental are supported :category option values. The locations where warnings were marked as deprecation warnings were previously reviewed by shyouhei. Comment a couple of locations where deprecation warnings should probably be used but are not currently used because deprecation warning enablement has not occurred at the time they are called (RUBY_FREE_MIN, RUBY_HEAP_MIN_SLOTS, -K). Add assert_deprecated_warn to test assertions. Use this to simplify some tests, and fix failing tests after marking some warnings with deprecated category. Notes: Merged: https://github.com/ruby/ruby/pull/3917
2020-12-18 | use eval to create different Regexp objects | Koichi Sasada
Only one warning is shown for the same Regexp object, so create different objects to support repeating tests. http://ci.rvm.jp/results/trunk-repeat20@phosphorus-docker/3290658
2020-12-17 | test/ruby: Check warning messages at a finer granularity | Nobuyoshi Nakada
Instead of suppressing all warnings wholesale in each test script by setting `$VERBOSE` to `nil` in `setup` methods. Notes: Merged: https://github.com/ruby/ruby/pull/3925 Merged-By: nobu <nobu@ruby-lang.org>
2020-12-02 | Do not reduce quantifiers if it affects which text will be matched | Jeremy Evans
Quantifier reduction when using +?)* and +?)+ should not be done as it affects which text will be matched. This removes the need for the RQ_PQ_Q ReduceType, so remove the enum entry and related switch case. Test that these are the only two patterns affected by testing all quantifier reduction tuples for both the captured and uncaptured cases and making sure the matched text is the same for both. Fixes [Bug #17341] Notes: Merged: https://github.com/ruby/ruby/pull/3808
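The property checked by this commit can be sketched as follows, assuming a Ruby release that contains the fix (3.0 or later). Wrapping the lazy inner quantifier in a capturing or a non-capturing group should not change which text is matched. The sample string and the hash layout are illustrative only.

```ruby
text = "aaa"

# Capturing and non-capturing forms of the two affected shapes.
pairs = {
  "+?)*" => [/(a+?)*/, /(?:a+?)*/],
  "+?)+" => [/(a+?)+/, /(?:a+?)+/],
}

pairs.each do |label, (captured, uncaptured)|
  # String#[] with a Regexp returns the matched text (or nil).
  same = text[captured] == text[uncaptured]
  puts "#{label}: same matched text? #{same}"
end
```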
2020-11-28 | [Feature #17136] Remove special behavior from $KCODE | Nobuyoshi Nakada
Notes: Merged: https://github.com/ruby/ruby/pull/3483
2020-11-27 | Separated tests for $KCODE and $= | Nobuyoshi Nakada
2020-11-24 | Detect the premature end of char property in regexp | Jeremy Evans
Default to ONIGERR_INVALID_CHAR_PROPERTY_NAME in fetch_char_property_to_ctype and only set otherwise if an ending } is found. Fixes [Bug #17340] Notes: Merged: https://github.com/ruby/ruby/pull/3807
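Seen from Ruby code, an unterminated `\p{` should surface as a `RegexpError` when the pattern is compiled. The sketch below assumes a Ruby release that contains this fix; `compile_error` is a hypothetical helper written for this illustration, not part of Ruby.

```ruby
# Hypothetical helper: compile a pattern from a string and return the
# error message instead of raising, or nil when compilation succeeds.
def compile_error(source)
  Regexp.new(source)
  nil
rescue RegexpError => e
  e.message
end

p compile_error('\p{')        # property name never closed by "}"
p compile_error('\p{Alpha}')  # well-formed property
```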
2020-01-16 | `Regexp` in `MatchData` can be `nil` | Nobuyoshi Nakada
`String#sub` with a string pattern defers creating a `Regexp` until `MatchData#regexp` creates a `Regexp` from the matched string. `Regexp#last_match(group_name)` accessed its content without creating the `Regexp` though. [Bug #16508]
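The lazy creation described here can be observed from Ruby. In this minimal sketch the input string is illustrative: `String#sub` is called with a string pattern, and the `Regexp` only comes into existence when `MatchData#regexp` is asked for it.

```ruby
result = "a.c".sub(".", "-")   # string pattern: no Regexp is compiled up front
match  = Regexp.last_match     # MatchData left behind by String#sub

puts result                    # text after substitution
puts match[0]                  # the matched text
p match.regexp                 # Regexp built on demand from the matched string
```

Because the pattern is derived from the matched string, characters that are special in a regexp, such as `.`, come back escaped in `match.regexp.source`.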
2020-01-15 | Freeze Regexp literals | Jean Boussier
[Feature #8948] [Feature #16377] Since Regexp literals always reference the same instance, allowing them to be mutated can leak state. Notes: Merged: https://github.com/ruby/ruby/pull/2705
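A short illustration of the effect, assuming Ruby 3.0 or later, where this change shipped. The instance variable name is hypothetical and only serves to attempt a mutation.

```ruby
literal = /ab+c/
puts literal.frozen?   # Regexp literals are frozen

begin
  # Any attempt to attach state to the shared literal now fails.
  literal.instance_variable_set(:@note, "leaky")
rescue FrozenError => e
  puts e.class
end
```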
2019-12-04 | Revert "Regexp#match{?} with nil raises TypeError as String, Symbol (#1506)" | NARUSE, Yui
This reverts commit 2a22a6b2d8465934e75520a7fdcf522d50890caf. Revert [Feature #13083]
2019-12-04 | Revert "Revert nil error and adding deprecation message" | NARUSE, Yui
This reverts commit 452bee3ee8d68059fabd9b1c7a75661b14e3933e.
2019-11-06 | Undefine MatchData.allocate [Feature #16294] | Nobuyoshi Nakada
2019-11-03 | Added assertions for linebreak | Nobuyoshi Nakada
2019-11-03 | Revert nil error and adding deprecation message | Kenichi Kamiya
Notes: Merged: https://github.com/ruby/ruby/pull/2637
2019-10-17 | Regexp#match{?} with nil raises TypeError as String, Symbol (#1506) | Kenichi Kamiya
* {String|Symbol}#match{?} with nil returns falsy

To improve consistency with Regexp#match{?}

* String#match(nil) returns `nil` instead of TypeError
* String#match?(nil) returns `false` instead of TypeError
* Symbol#match(nil) returns `nil` instead of TypeError
* Symbol#match?(nil) returns `false` instead of TypeError
* Prefer exception
* Follow empty ENV
* Drop outdated specs
* Write ruby/spec for above

https://github.com/ruby/ruby/pull/1506/files#r183242981

* Fix merge miss
2019-06-29 | Escape control codes in regexp warning message | Nobuyoshi Nakada
2019-04-05 | update to Unicode Version 12.1.0 (beta) | duerst
Unicode Version 12.1.0 adds one single character, U+32FF SQUARE ERA NAME REIWA, for the new Japanese era starting on May 1st. 12.1.0 will be finalized only on May 7th, so we go with the beta version because further changes in the data we need are highly unlikely, and we want to make sure Ruby is ready for the new era.
* common.mk: change UNICODE_VERSION to 12.1.0, UNICODE_BETA to YES
* enc/unicode/12.1.0, enc/unicode/12.1.0/casefold.h, enc/unicode/12.1.0/name2ctype.h: add directory and generated data files for new version
* lib/unicode_normalize/tables.rb: update for new character
* test/ruby/test_regexp.rb: add test for character property age=12.1
* test/test_unicode_normalize.rb: add test for NFKC decomposition of new character
This (mostly) completes issue #15195.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67441 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
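The `age` property mentioned for the regexp test can be tried directly. This is a small sketch assuming a Ruby built with Unicode 12.1 data or newer; the `age=X` property matches every character assigned in version X or earlier.

```ruby
reiwa = "\u32FF"   # U+32FF SQUARE ERA NAME REIWA

puts(/\p{age=12.1}/.match?(reiwa))   # assigned as of Unicode 12.1
puts(/\p{age=12.0}/.match?(reiwa))   # not yet assigned in Unicode 12.0
```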
2019-01-15 | Suppress warning: ambiguous first argument; put parentheses or a space even after `/' operator | naruse
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66821 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-12-20 | parse.y: ignore constant name captures | nobu
* parse.y (reg_named_capture_assign_iter): ignore non-local name captures, including non-ASCII constant names. [ruby-dev:50719] [Bug #15437] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66463 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-12-04 | commit miss | duerst
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66193 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-12-04 | change diaeresis from above to below for better visibility | duerst
In test/ruby/test_regexp.rb and test/ruby/test_string.rb, change some instances of COMBINING DIAERESIS (U+0308, above) to COMBINING DIAERESIS BELOW (U+0324) to make it more easily visible in test output, particularly in the context of double quotes surrounding strings. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66192 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-10-15 | regparse.c: Suppress duplicated range warning by mere \X | nobu
* regparse.c (node_extended_grapheme_cluster): as Unicode 10 has added Grapheme_Cluster_Break properties to some characters, remove duplicated ranges for Unicode 9. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65086 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-03-16 | re.c: do not escape terminator in Regexp.union | nobu
* re.c (rb_reg_str_with_term): change terminator.
* re.c (rb_reg_s_union): terminator in source string does not need to be escaped. terminators are outside of regexp source itself. [ruby-core:86149] [Bug #14608]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@62779 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-03-11 | re.c: fixed escaped multibyte char | nobu
* re.c (unescape_nonascii): escaped multibyte character should be copied as-is, just with checking if the encoding matches. https://twitter.com/sakuro/status/972014409986883584 git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@62718 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-12-12 | Revert r61192 and r61193 | eregon
* More general fix coming. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@61194 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-12-12 | Use EnvUtil.with_default_external in tests needing it | eregon
* Reverts part of r54522. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@61193 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-11-13 | regexec.c: invalidate previously matched position | nobu
* regexec.c (match_at): invalidate end position not yet matched when new start position is pushed, to dispose previously stored position. [ruby-core:83743] [Bug #14101] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60755 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-11-13 | test_regexp.rb: test_absent | nobu
* test/ruby/test_regexp.rb (test_absent): add simple tests for absent operator. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60754 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
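For readers unfamiliar with the absent operator, `(?~X)` matches the longest text that does not contain `X`. The sketch below is not taken from the test file; it is a typical use, with a hypothetical input string, that finds C-style comments.

```ruby
# (?~\*/) consumes as much as possible without ever containing "*/".
c_comment = %r{/\*(?~\*/)\*/}

source = "x = 1; /* first */ y = 2; /* second */"

puts source[c_comment]            # stops at the first "*/"
p source.scan(c_comment)          # each comment is found separately
```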
2017-08-11 | re.c: options for sub-regexp | nobu
* re.c (rb_reg_to_s): needs embedded options to check syntax of sub-regexp. [ruby-core:82328] [Bug #13798] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59574 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-03-07 | parse.y: remove trace | nobu
* parse.y (reg_named_capture_assign_iter): do not insert trace instructions before local variable assignments. putobject is expected at first. [ruby-core:79940] [Bug #13287] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57801 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-02-20 | regparse.c: initialize return values | nobu
* regparse.c (parse_char_class): initialize return values before depth limit check. returned values will be freed in callers regardless of the error. [ruby-core:79624] [Bug #13234] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57660 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-12-20 | re.c: consider the case of RMatch::regexp is nil | rhe
Follow r49675, r57098 and r57110. Don't assume RMatch::regexp always contains a valid Regexp instance; it will be Qnil if the MatchData is created by rb_backref_set_string(). [ruby-core:78741] [Bug #13054] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57123 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-12-14 | encoding.c: handle needmore error from rb_enc_precise_mbclen() | rhe
rb_enc_ascget() erroneously reports success even if the given byte sequence is incomplete, for non-ASCII compatible encoding strings. rb_enc_precise_mbclen() may return a negative value on error, and thus rb_enc_ascget() must not store the return value in 'unsigned int'; otherwise the subsequent MBCLEN_CHARFOUND_P() check won't catch the error. [ruby-core:78646] [Bug #13034] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57078 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-12-12 | re.c: char boundary | nobu
* re.c (rb_reg_match_m_p): consider char boundary. rb_str_subpos does not adjust to the boundary if len == 0. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57051 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-12-06 | Fix typos | kazu
Patch by: Koichi ITO <koic.ito@gmail.com> [Fix GH-1498] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56999 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-12-06 | re.c: check that MatchData is initialized | rhe
Follow r16757 ("* re.c: fix SEGV by Regexp.allocate.names, Match.allocate.names, etc.", 2008-06-02). Don't do null dereference if MatchData#hash or #== is called against an uninitialized instance. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56994 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-11-30 | Regexp supports Unicode 9.0.0's \X | naruse
* meta character \X matches Unicode 9.0.0 characters with some workarounds for UTR #51 Unicode Emoji, Version 4.0 emoji zwj sequences. [Feature #12831] [ruby-core:77586]
The term "character" can have many meanings: bytes, codepoints, combined characters, and so on. "Grapheme cluster" is the highest-level of these terms and means a user-perceived character. Unicode Standard Annex #29 UNICODE TEXT SEGMENTATION specifies how to handle grapheme clusters (extended grapheme clusters). But some specs have not been updated to the current situation, because Unicode Emoji is being extended rapidly without being well defined. This breaks the precondition of UAX #29 that "Grapheme cluster boundaries can be easily tested by looking at immediately adjacent characters" (the sentence will be removed in the next version). Some of the details are described in Unicode Technical Report #51 UNICODE EMOJI, but they have not been merged into UAX #29 yet.
http://unicode.org/reports/tr29/ http://unicode.org/reports/tr51/ http://unicode.org/Public/emoji/4.0/
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56949 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
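The difference between code points and grapheme clusters can be seen with a short string. This is a minimal sketch using only a combining mark and a CRLF pair, which do not depend on the emoji workarounds discussed above; the sample text is illustrative.

```ruby
text = "e\u0301\r\nz"   # "e" + COMBINING ACUTE ACCENT, CR LF, "z"

puts text.each_char.count    # number of code points
puts text.scan(/\X/).size    # number of extended grapheme clusters
```

The combining accent attaches to the preceding `e`, and CR LF is treated as a single cluster, so `\X` yields fewer items than there are code points.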
2016-08-30 | Use qualified names | nobu
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56037 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-07-01 | * regcomp.c (noname_disable_map): don't optimize out group 0 | naruse
Ruby's Regexp doesn't allow normal numbered groups if the regexp has named groups. In that case it optimizes out the related NT_ENCLOSE nodes. But even in that case \g<0> can be used, so this fix avoids removing the NT_ENCLOSE whose regnum is 0. [ruby-core:75828] [Bug #12454] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55562 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
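The combination this fix concerns, named groups together with a `\g<0>` call to the whole pattern, can be sketched with a balanced-parentheses matcher. The pattern and group names below are hypothetical and are not taken from the bug report; the sketch assumes a Ruby release containing this fix.

```ruby
# \g<0> re-enters the whole pattern, so nested parentheses are matched
# recursively even though the pattern also uses named groups.
balanced = /(?<open>\()(?:[^()]|\g<0>)*(?<close>\))/

m = balanced.match("f(a(b)c) + g")
puts m[0]                               # the outermost balanced group
puts balanced.match?("no parens here")  # nothing to match
```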
2016-05-25 | * regparse.c (fetch_token_in_cc): raise error if given octal escaped | naruse
character is too big. [Bug #12420] [Bug #12423] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55163 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-05-25 | * regcomp.c (compile_length_tree): return error code immediately | naruse
if compile_length_tree raised error [Bug #12418] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55154 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-05-19 | fix document of Regexp#match? | kazu
* re.c (rb_reg_match_m_p): [DOC] fix return value in rdoc.
* test/ruby/test_regexp.rb (TestRegexp#test_match_p): add some tests from document.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55075 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-05-19 | re.c: fix match? | nobu
* re.c (rb_reg_match_m_p): fix match against empty string. rb_str_offset returns the end when the position exceeds the length. fix the range parameter of onig_search. [ruby-core:75604] [Bug #12394] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55069 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-05-19 | re.c: match? should return nil if no match | nobu
* re.c (rb_reg_match_m_p): should return nil if no match, as the document says. [Feature #8110] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55067 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-05-18 | * re.c (rb_reg_match_m_p): Introduce Regexp#match?, which returns | naruse
bool and doesn't save backref. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55061 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
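A brief sketch of the two properties named here, as they behave on current Ruby releases (where `match?` returns `true` or `false`). `probe` is a hypothetical helper; it exists only because `$~` is local to a method, which guarantees it starts out as `nil`.

```ruby
# Hypothetical helper: report the result of match? together with the
# backref ($~) observed immediately afterwards.
def probe(re, str)
  matched = re.match?(str)   # boolean result only
  [matched, $~]              # $~ is untouched, so it is still nil
end

p probe(/b/, "abc")
p probe(/x/, "abc")
p(/b/.match?("abc", 2))      # optional second argument: start position
```

By contrast, `=~` and `Regexp#match` do set `$~`, which is the extra work `match?` avoids.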
2016-05-18 | * re.c (match_ary_subseq): get subseq of match array without creating | naruse
temporary array.
* re.c (match_ary_aref): get element(s) of match array without creating temporary array.
* re.c (match_aref): Use match_ary_subseq with handling irregulars.
* re.c (match_values_at): Use match_ary_aref.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55053 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
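The C functions named above sit behind `MatchData#[]` and `MatchData#values_at`. The Ruby-level calls they serve look like this; the pattern and input are illustrative, and the sketch shows only the public behavior, not the internal optimization.

```ruby
m = /(\d+)-(\d+)-(\d+)/.match("2016-05-18")

p m[1, 2]              # start index and length
p m[1..3]              # range of indexes
p m.values_at(1, -1)   # individual indexes; negatives count from the end
```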