path: root/test/strscan
Age, Commit message, Author
2025-11-05[ruby/strscan] Deprecate constant `Id`Nobuyoshi Nakada
`$Id$` is for RCS, CVS, and SVN; it carries no information under Git. https://github.com/ruby/strscan/commit/9e3db14fa2
2025-11-05[ruby/strscan] Deprecate undocumented toplevel constant `ScanError`Nobuyoshi Nakada
https://github.com/ruby/strscan/commit/b4ddc3a2a6
2025-06-03[ruby/strscan] Support `Ractor#value`Hiroshi SHIBATA
(https://github.com/ruby/strscan/pull/157) This is the same change as https://github.com/ruby/stringio/pull/134 --------- https://github.com/ruby/strscan/commit/141f9cf9b6 Co-authored-by: Koichi Sasada <ko1@atdot.net>
2025-05-31`Ractor::Port`Koichi Sasada
* Added `Ractor::Port`
  * `Ractor::Port#receive` (support multi-threads)
  * `Ractor::Port#close`
  * `Ractor::Port#closed?`
* Added some methods
  * `Ractor#join`
  * `Ractor#value`
  * `Ractor#monitor`
  * `Ractor#unmonitor`
* Removed some methods
  * `Ractor#take`
  * `Ractor.yield`
* Change the spec
  * `Ractor.select`

You can wait for multiple sequences of messages with `Ractor::Port`.

```ruby
ports = 3.times.map{ Ractor::Port.new }
ports.map.with_index do |port, ri|
  Ractor.new port,ri do |port, ri|
    3.times{|i| port << "r#{ri}-#{i}"}
  end
end

p ports.each{|port| pp 3.times.map{port.receive}}
```

In this example, we use 3 ports, and 3 Ractors send messages to them respectively. We can receive a series of messages from each port.

You can use `Ractor#value` to get the last value of a Ractor's block:

```ruby
result = Ractor.new do
  heavy_task()
end.value
```

You can wait for the termination of a Ractor with `Ractor#join` like this:

```ruby
Ractor.new do
  some_task()
end.join
```

`#value` and `#join` are similar to `Thread#value` and `Thread#join`.

To implement `#join`, `Ractor#monitor` (and `Ractor#unmonitor`) is introduced.

This commit changes the `Ractor.select()` method. It now only accepts ports or Ractors, and returns when a port receives a message or a Ractor terminates.

We remove `Ractor.yield` and `Ractor#take` because:

* `Ractor::Port` supports most similar use cases in a simpler manner.
* Removing them significantly simplifies the code.

We also change the internal thread scheduler code (thread_pthread.c):

* During barrier synchronization, we keep the `ractor_sched` lock to avoid deadlocks. This lock is released by `rb_ractor_sched_barrier_end()` which is called at the end of operations that require the barrier.
* Fix potential deadlock issues by checking interrupts just before setting UBF.

https://bugs.ruby-lang.org/issues/21262

Notes: Merged: https://github.com/ruby/ruby/pull/13445
2025-05-08[ruby/strscan] jruby: Check if len++ walked off the endCharles Oliver Nutter
(https://github.com/ruby/strscan/pull/153) Fix https://github.com/ruby/strscan/pull/152 CRuby can walk off the end because there's always a null byte. In JRuby, the byte array is often (usually?) the exact size of the string. So we need to check if len++ walked off the end. This code was ported from a version by @byroot in https://github.com/ruby/strscan/pull/127 but I missed adding this check due to a lack of tests. A test is included for both "-" and "+" parsing. https://github.com/ruby/strscan/commit/1abe4ca556
2025-05-08[ruby/strscan] jruby: Pass end index to byteListToInumCharles Oliver Nutter
(https://github.com/ruby/strscan/pull/150) These parse methods take begin and end indices, not begin and length. A test is included. Fixes https://github.com/jruby/jruby/issues/8823 https://github.com/ruby/strscan/commit/9690e39e73
2025-05-02[ruby/strscan] named_captures: fix incompatibility withSutou Kouhei
MatchData#named_captures (https://github.com/ruby/strscan/pull/146) Fix https://github.com/ruby/strscan/pull/145 `MatchData#named_captures` uses the last matched value for each name. Reported by Linus Sellberg. Thanks!!! https://github.com/ruby/strscan/commit/a6086ea322
2025-02-25[ruby/strscan] Enable tests passing on TruffleRubyAndrii Konchyn
(https://github.com/ruby/strscan/pull/144) Changes: - enabled tests passing on TruffleRuby - removed `truffleruby` and kept only `truffleruby-head` in CI https://github.com/ruby/strscan/commit/4aadfc8408 Notes: Merged: https://github.com/ruby/ruby/pull/12804
2025-02-25[ruby/strscan] Fix inconsistency of IndexError vs nil forNAITOH Jun
unknown capture group (https://github.com/ruby/strscan/pull/143) Fix https://github.com/ruby/strscan/pull/139 Reported by Benoit Daloze. Thanks!!! https://github.com/ruby/strscan/commit/bc8a0d2623 Notes: Merged: https://github.com/ruby/ruby/pull/12804
2025-02-25[ruby/strscan] Fix a bug that scanning methods that don't use RegexpNAITOH Jun
don't clear named capture groups (https://github.com/ruby/strscan/pull/142) Fix https://github.com/ruby/strscan/pull/135 https://github.com/ruby/strscan/commit/b957443e20 Notes: Merged: https://github.com/ruby/ruby/pull/12804
2025-02-21[ruby/strscan] `scan_integer(base: 16)` ignore x suffix if notJean Boussier
followed by hexadecimal (https://github.com/ruby/strscan/pull/141) Fix: https://github.com/ruby/strscan/issues/140 `0x<EOF>`, `0xZZZ` should be parsed as `0` instead of not matching at all. https://github.com/ruby/strscan/commit/c4e4795ed2
2025-02-17[ruby/strscan] Fix a bug that scan_until behaves differently withNAITOH Jun
Regexp and String patterns (https://github.com/ruby/strscan/pull/138) Fix https://github.com/ruby/strscan/pull/131 https://github.com/ruby/strscan/commit/e1cec2e726
2025-02-14[ruby/strscan] Fix a bug that scan_integer doesn't update matchedJean Boussier
data (https://github.com/ruby/strscan/pull/133) Fix https://github.com/ruby/strscan/pull/130 Reported by Andrii Konchyn. Thanks!!! https://github.com/ruby/strscan/commit/4e5f17f87a
2024-12-02[ruby/strscan] test: don't omit "(...)" for method calls that have at least one argumentSutou Kouhei
https://github.com/ruby/strscan/commit/dddae9c99a
2024-12-02StringScanner#scan_integer support base 16 integers (#116)Jean Boussier
Followup: https://github.com/ruby/strscan/pull/115 `scan_integer` is now implemented in Ruby so that it can handle keyword arguments efficiently without allocating a Hash. Given that the goal of `scan_integer` is to parse integers more efficiently without allocating an intermediary object, using `rb_scan_args` would defeat the purpose. Additionally, the C implementation now uses `rb_isdigit` and `rb_isxdigit`, because on Windows `isdigit` is locale dependent.
2024-12-02[ruby/strscan] Prevent a warning "ambiguous first argument" during aYusuke Endoh
test (https://github.com/ruby/strscan/pull/118) https://rubyci.s3.amazonaws.com/debian11/ruby-master/log/20241128T153002Z.log.html.gz ``` /home/chkbuild/chkbuild/tmp/build/20241128T153002Z/ruby/test/strscan/test_stringscanner.rb:908: warning: ambiguous first argument; put parentheses or a space even after `-` operator ``` https://github.com/ruby/strscan/commit/af3fd2f045
2024-11-27[ruby/strscan] Implement #scan_integer to efficiently parse IntegerJean Boussier
(https://github.com/ruby/strscan/pull/115) Fix: https://github.com/ruby/strscan/issues/113 This allows parsing an Integer directly from a String without first allocating a substring. Notes: The implementation is limited by design; it is meant as a first step, and only the most straightforward base-10 integers are supported. https://github.com/ruby/strscan/commit/6a3c74b4c8
2024-10-26[ruby/strscan] [JRuby] Optimize `scan()`: Remove duplicate `ifNAITOH Jun
(restLen() < patternsize()) return context.nil;` checks in `!headonly`. (https://github.com/ruby/strscan/pull/110)

- before: #109

## Why?

https://github.com/ruby/strscan/blob/d31274f41b7c1e28f23d58cf7bfea03baa818cb7/ext/jruby/org/jruby/ext/strscan/RubyStringScanner.java#L371-L373

This means the following: `if (str.size() - curr < pattern.size()) return context.nil;`

A similar check is made within `StringSupport#index()` within `!headonly`.

https://github.com/jruby/jruby/blob/be7815ec02356a58891c8727bb448f0c6a826d96/core/src/main/java/org/jruby/util/StringSupport.java#L1706-L1720

```Java
public static int index(ByteList source, ByteList other, int offset, Encoding enc) {
    int sourceLen = source.realSize();
    int sourceBegin = source.begin();
    int otherLen = other.realSize();

    if (otherLen == 0) return offset;
    if (sourceLen - offset < otherLen) return -1;
```

- source = `strBL`
- other = `patternBL`
- offset = `strBeg + curr`

This means the following: `if (strBL.realSize() - (strBeg + curr) < patternBL.realSize()) return -1;`

Both checks are the same.

## Benchmark

It shows String as a pattern is 2.40x faster than Regexp as a pattern.

```
$ benchmark-driver benchmark/check_until.yaml
Warming up --------------------------------------
              regexp     7.613M i/s -      7.593M times in 0.997350s (131.35ns/i)
          regexp_var     7.793M i/s -      7.772M times in 0.997364s (128.32ns/i)
              string    13.222M i/s -     13.199M times in 0.998297s (75.63ns/i)
          string_var    15.283M i/s -     15.216M times in 0.995667s (65.43ns/i)
Calculating -------------------------------------
              regexp    10.003M i/s -     22.840M times in 2.283361s (99.97ns/i)
          regexp_var     9.991M i/s -     23.378M times in 2.340019s (100.09ns/i)
              string    23.454M i/s -     39.666M times in 1.691221s (42.64ns/i)
          string_var    23.998M i/s -     45.848M times in 1.910447s (41.67ns/i)

Comparison:
          string_var:  23998466.3 i/s
              string:  23453777.5 i/s - 1.02x  slower
              regexp:  10002809.4 i/s - 2.40x  slower
          regexp_var:   9990580.1 i/s - 2.40x  slower
```

https://github.com/ruby/strscan/commit/843e931d13
2024-09-17[ruby/strscan] Accept String as a pattern at non headNAITOH Jun
(https://github.com/ruby/strscan/pull/106)

It supports non-head match cases such as StringScanner#scan_until. If we use a String as a pattern, we can improve match performance. Here are the results of the included benchmark.

## CRuby

It shows String as a pattern is 1.18x faster than Regexp as a pattern.

```
$ benchmark-driver benchmark/check_until.yaml
Warming up --------------------------------------
              regexp     9.403M i/s -      9.548M times in 1.015459s (106.35ns/i)
          regexp_var     9.162M i/s -      9.248M times in 1.009479s (109.15ns/i)
              string     8.966M i/s -      9.274M times in 1.034343s (111.54ns/i)
          string_var    11.051M i/s -     11.190M times in 1.012538s (90.49ns/i)
Calculating -------------------------------------
              regexp    10.319M i/s -     28.209M times in 2.733707s (96.91ns/i)
          regexp_var    10.032M i/s -     27.485M times in 2.739807s (99.68ns/i)
              string     9.681M i/s -     26.897M times in 2.778397s (103.30ns/i)
          string_var    12.162M i/s -     33.154M times in 2.726046s (82.22ns/i)

Comparison:
          string_var:  12161920.6 i/s
              regexp:  10318949.7 i/s - 1.18x  slower
          regexp_var:  10031617.6 i/s - 1.21x  slower
              string:   9680843.7 i/s - 1.26x  slower
```

## JRuby

It shows String as a pattern is 2.11x faster than Regexp as a pattern.

```
$ benchmark-driver benchmark/check_until.yaml
Warming up --------------------------------------
              regexp     7.591M i/s -      7.544M times in 0.993780s (131.74ns/i)
          regexp_var     6.143M i/s -      6.125M times in 0.997038s (162.77ns/i)
              string    14.135M i/s -     14.079M times in 0.996067s (70.75ns/i)
          string_var    14.079M i/s -     14.057M times in 0.998420s (71.03ns/i)
Calculating -------------------------------------
              regexp     9.409M i/s -     22.773M times in 2.420268s (106.28ns/i)
          regexp_var    10.116M i/s -     18.430M times in 1.821820s (98.85ns/i)
              string    21.389M i/s -     42.404M times in 1.982519s (46.75ns/i)
          string_var    20.897M i/s -     42.237M times in 2.021187s (47.85ns/i)

Comparison:
              string:  21389191.1 i/s
          string_var:  20897327.5 i/s - 1.02x  slower
          regexp_var:  10116464.7 i/s - 2.11x  slower
              regexp:   9409222.3 i/s - 2.27x  slower
```

See: https://github.com/jruby/jruby/blob/be7815ec02356a58891c8727bb448f0c6a826d96/core/src/main/java/org/jruby/util/StringSupport.java#L1706-L1736

---------

https://github.com/ruby/strscan/commit/f9d96c446a

Co-authored-by: Sutou Kouhei <kou@clear-code.com>
2024-03-27[ruby/strscan] Omit tests for `#scan_byte` and `#peek_byte` onAndrii Konchyn
TruffleRuby temporarily (https://github.com/ruby/strscan/pull/91) The methods were added in #89 but they aren't implemented in TruffleRuby yet. So let's omit them for now to keep CI green. https://github.com/ruby/strscan/commit/844d963b56
2024-02-26[ruby/strscan] Add a method for peeking and reading bytes asAaron Patterson
integers (https://github.com/ruby/strscan/pull/89) This commit adds `scan_byte` and `peek_byte`. `scan_byte` will scan the current byte, return it as an integer, and advance the cursor. `peek_byte` will return the current byte as an integer without advancing the cursor. Currently `StringScanner#get_byte` returns a string, but I want to get the current byte without allocating a string. I think this will help with writing high performance lexers. --------- https://github.com/ruby/strscan/commit/873aba2e5d Co-authored-by: Sutou Kouhei <kou@clear-code.com>
2024-02-08[ruby/strscan] Don't add begin to length for new string sliceCharles Oliver Nutter
(https://github.com/ruby/strscan/pull/87) Fixes https://github.com/ruby/strscan/pull/86 https://github.com/ruby/strscan/commit/c17b015c00
2024-01-19[ruby/strscan] Add test to check encoding for empty stringNAITOH Jun
(https://github.com/ruby/strscan/pull/80) See: https://github.com/ruby/strscan/issues/78#issuecomment-1890849891 https://github.com/ruby/strscan/commit/d0508518a9
2024-01-14[ruby/strscan] StringScanner#captures: Return nil not "" forNAITOH Jun
unmatched capture (https://github.com/ruby/strscan/pull/72)

fix https://github.com/ruby/strscan/issues/70

If there is no substring matching the group (s[3]), the behavior differs from `MatchData#captures`. If there is no substring matching the group, the corresponding element (s[3]) should be nil.

Before:

```
s = StringScanner.new('foobarbaz') #=> #<StringScanner 0/9 @ "fooba...">
s.scan(/(foo)(bar)(BAZ)?/) #=> "foobar"
s[0] #=> "foobar"
s[1] #=> "foo"
s[2] #=> "bar"
s[3] #=> nil
s.captures #=> ["foo", "bar", ""]
s.captures.compact #=> ["foo", "bar", ""]
```

After:

```
s = StringScanner.new('foobarbaz') #=> #<StringScanner 0/9 @ "fooba...">
s.scan(/(foo)(bar)(BAZ)?/) #=> "foobar"
s[0] #=> "foobar"
s[1] #=> "foo"
s[2] #=> "bar"
s[3] #=> nil
s.captures #=> ["foo", "bar", nil]
s.captures.compact #=> ["foo", "bar"]
```

For comparison, `MatchData#captures` (https://docs.ruby-lang.org/ja/latest/method/MatchData/i/captures.html):

```
/(foo)(bar)(BAZ)?/ =~ "foobarbaz" #=> 0
$~.to_a #=> ["foobar", "foo", "bar", nil]
$~.captures #=> ["foo", "bar", nil]
$~.captures.compact #=> ["foo", "bar"]
```

* StringScanner#captures is not yet documented. https://docs.ruby-lang.org/ja/latest/class/StringScanner.html

https://github.com/ruby/strscan/commit/1fbfdd3c6f
2023-07-27[ruby/strscan] Sync missed commitPeter Zhu
Syncs commit ruby/strscan@76b377a5d875ec77282d9319d62d8f24fe283b40.
2023-02-21[ruby/strscan] Mask out this test on JRuby/WindowsCharles Oliver Nutter
See https://github.com/jruby/jruby/issues/7644 for the root issue, which will require fixes to JRuby's regular expression engine, JOni. https://github.com/ruby/strscan/commit/29a65abff2
2023-02-21[ruby/strscan] test: Run test more with fixed anchor modeSutou Kouhei
(https://github.com/ruby/strscan/pull/60) fix https://github.com/ruby/strscan/pull/56
2023-02-21[ruby/strscan] Add test case to `test_string`OKURA Masafumi
(https://github.com/ruby/strscan/pull/58) `string` returns the original string after `scan` is called. The existing test didn't check this behavior; it is now covered.
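A minimal sketch of the behavior this test covers, assuming a plain ASCII input; the variable names are illustrative and not taken from the test file:

```ruby
require "strscan"

scanner = StringScanner.new("test string")
first_word = scanner.scan(/\w+/) # consumes the leading word and advances the scan pointer

# `string` still returns the whole original string,
# while `rest` returns only the part that has not been scanned yet.
whole = scanner.string
rest  = scanner.rest

puts whole
puts rest
```

Here `whole` should still be the full input, which is the point of the added test case.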
2022-12-09Merge strscan-3.0.5Hiroshi SHIBATA
Notes: Merged: https://github.com/ruby/ruby/pull/6890
2021-05-06[ruby/strscan] Fix segmentation fault of `StringScanner#charpos` when `String#byteslice` returns a non-string valueKenichi Kamiya
[Bug #17756] (#20) https://github.com/ruby/strscan/commit/92961cde2b
2021-05-06Import from https://github.com/ruby/strscan/pull/19Hiroshi SHIBATA
* Use Gemfile instead of Gem::Specification#add_development_dependency. * Use pend instead of skip for test-unit.
2020-12-18[strscan] Make strscan Ractor safe (#17)Kenta Murata
* Make strscan Ractor safe * Add test-unit in the development dependencies https://github.com/ruby/strscan/commit/3c93c2bebe
2019-11-18Deprecate taint/trust and related methods, and make the methods no-opsJeremy Evans
This removes the related tests, and puts the related specs behind version guards. This affects all code in lib, including some libraries that may want to support older versions of Ruby. Notes: Merged: https://github.com/ruby/ruby/pull/2476
2019-10-14Import StringScanner 1.0.3 (#2553)Sutou Kouhei
Notes: Merged-By: kou <kou@clear-code.com>
2017-11-29strscan.c: add MatchData-like methodsnobu
* ext/strscan/strscan.c: added `size`, `captures` and `values_at` to StringScanner, shorthands of accessing the matched data. based on the patch by apeiros (Stefan Rusterholz) at [ruby-core:20412]. [Feature #836] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60929 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
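As a rough illustration of the three MatchData-like shorthands named above, the following sketch scans a date-like string with three capture groups. The input string and variable names are only examples:

```ruby
require "strscan"

scanner = StringScanner.new("Fri Dec 12 1975")
scanner.scan(/(\w+) (\w+) (\d+) /)

group_count = scanner.size                # whole match plus the three groups
groups      = scanner.captures            # the groups only, without the whole match
picked      = scanner.values_at(0, -1, 2) # whole match, last group, second group

p group_count
p groups
p picked
```

These mirror `MatchData#size`, `MatchData#captures`, and `MatchData#values_at`, so code written against a `MatchData` can often be moved to a scanner with few changes.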
2017-07-21strscan.c: fix segfault in arefnobu
* ext/strscan/strscan.c (strscan_aref): fix segfault after get_byte or getch which do not apply regexp. [ruby-core:82116] [Bug #13759] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59384 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
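The situation this fix addresses can be sketched as follows: `getch` and `get_byte` advance the scanner without applying a regexp, and `[]` is then called on the result. On current versions `scanner[0]` is expected to return the character or byte that was just read; the input string here is only an example:

```ruby
require "strscan"

scanner = StringScanner.new("Fri Dec")

scanner.getch                  # reads one character without using a regexp
after_getch = scanner[0]       # index 0 refers to what getch just read

scanner.get_byte               # reads one byte without using a regexp
after_get_byte = scanner[0]    # index 0 refers to what get_byte just read

p after_getch
p after_get_byte
```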
2017-02-06{ext,test}/strscan: Specify frozen_string_literal: true.kazu
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57551 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2015-12-16Add frozen_string_literal: false for all filesnaruse
When you change this to true, you may need to add more tests. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53141 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2014-08-03strscan.c: encoding in messagesnobu
* ext/strscan/strscan.c (strscan_aref): preserve argument encoding in error messages. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@47044 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2014-03-11* test: get rid of warnings.usa
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@45313 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2013-05-24* ext/strscan/strscan.c (strscan_aref): raise error if givennaruse
name reference is not found. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@40912 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2013-05-21* ext/strscan/strscan.c (strscan_aref): support named captures.naruse
patched by Konstantin Haase [ruby-core:54664] [Feature #8343] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@40881 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
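Taken together, the two `strscan_aref` changes above mean that `[]` accepts a group name and raises when the name is not defined in the pattern. A small sketch, assuming a pattern with three named groups (the group names and input are illustrative):

```ruby
require "strscan"

scanner = StringScanner.new("Fri Dec 12 1975")
scanner.scan(/(?<wday>\w+) (?<month>\w+) (?<day>\d+)/)

wday  = scanner[:wday]    # a Symbol name works
month = scanner["month"]  # a String name works too

# A name that the pattern does not define raises IndexError.
unknown_error =
  begin
    scanner[:year]
    nil
  rescue IndexError => e
    e
  end

p wday
p month
p unknown_error.class
```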
2012-11-28Added #charpos for multibyte string position.ryan
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@37916 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
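The difference between `pos` and `charpos` only shows up with multibyte input. A short sketch, assuming a UTF-8 string in which `ä` (written as `\u00e4`) occupies two bytes:

```ruby
require "strscan"

scanner = StringScanner.new("abc\u00e4def")
scanner.scan_until(/\u00e4/) # scan through the two-byte character

byte_position = scanner.pos     # counted in bytes
char_position = scanner.charpos # counted in characters

p byte_position
p char_position
```

After the scan, the byte position is one larger than the character position because of the single two-byte character.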
2010-02-14avoid method redefinition.akr
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@26663 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-08-26* ext/strscan/strscan.c (strscan_set_string): set string should not benobu
dupped or frozen, because freezing it makes the #concat method fail, and dupping is unnecessary without freezing. a patch from Aaron Patterson at [ruby-core:25145]. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@24679 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
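The interaction described here can be sketched like this: a string assigned through `string=` must stay appendable so that `concat` (also available as `<<`) keeps working. The buffer contents are illustrative, and `String.new` is used only to guarantee an unfrozen string:

```ruby
require "strscan"

scanner = StringScanner.new(String.new("ignored"))

# Replace the scanned string, then append to it through the scanner.
scanner.string = String.new("Fri Dec 12")
scanner.scan(/Fri /)
scanner << " 1975" # would fail if string= had frozen the assigned string

appended  = scanner.string
remaining = scanner.rest

p appended
p remaining
```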
2009-06-17* ext/strscan/strscan.c (Init_strscan): remove obsoletematz
matchedsize method, use matched_size instead. [ruby-dev:38591] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@23721 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
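For reference, a brief sketch of `matched_size`, the spelling that replaces the removed `matchedsize`; the input string is only an example:

```ruby
require "strscan"

scanner = StringScanner.new("test string")

scanner.scan(/\w+/)
size_after_match = scanner.matched_size   # byte size of the last match

scanner.scan(/\d+/)                       # does not match at the current position
size_after_failure = scanner.matched_size # nil when the last match failed

p size_after_match
p size_after_failure
```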
2008-09-24* test: assert_raises has long been deprecated.nobu
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19536 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-06-05* test/stringio/test_stringio.rb: add tests to achieve over 95% testmame
coverage of stringio. * test/strscan/test_stringscanner.rb: ditto for strscan. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@16847 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-05-12* re.c (rb_reg_prepare_re): made non static with small refactoring.matz
* ext/strscan/strscan.c (strscan_do_scan): should adjust encoding before regex searching. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@16387 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-28add a test.akr
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14773 b2dd03c8-39d4-4d8f-98ff-823fe69b080e