Age | Commit message | Author |
|
|
|
This patch speeds up setting the backref match object by avoiding some
memcopies. Take the following code for example:
```ruby
"hello world" =~ /hello/
p $~
```
When the regexp matches the string, we have to set the match object in the
backref global. Previously we would allocate a match object[^1] and use
`rb_reg_region_copy`[^2] to make a deep copy of the stack-allocated
`re_registers` struct[^3] into the newly created Ruby object. This
could trigger GC[^4] and would allocate new memory.
This patch instead makes a shallow copy of the `re_registers` struct onto the
match object, letting the match object manage the `re_registers`
pointer and avoiding some calls to `xmalloc` and some manual
memcpy.
Benchmark looks like this:
```ruby
require "benchmark/ips"

def test_re(thing)
  thing =~ /hello/
end

Benchmark.ips do |x|
  x.report("re hit") do
    test_re "hello world"
  end

  x.report("re miss") do
    test_re "world"
  end
end
```
Before this patch:
```
$ ruby -v test.rb
ruby 3.2.0dev (2022-07-27T22:29:00Z master 4ad69899b7) [arm64-darwin21]
Ignoring bcrypt-3.1.16 because its extensions are not built. Try: gem pristine bcrypt --version 3.1.16
Warming up --------------------------------------
re hit 345.401k i/100ms
re miss 673.584k i/100ms
Calculating -------------------------------------
re hit 3.452M (± 0.5%) i/s - 17.270M in 5.002535s
re miss 6.736M (± 0.4%) i/s - 34.353M in 5.099593s
```
After this patch:
```
$ ./ruby -v test.rb
ruby 3.2.0dev (2022-08-01T21:24:12Z less-memcpy 0ff2a56606) [arm64-darwin21]
Warming up --------------------------------------
re hit 419.578k i/100ms
re miss 673.251k i/100ms
Calculating -------------------------------------
re hit 4.201M (± 0.7%) i/s - 21.398M in 5.093593s
re miss 6.716M (± 0.4%) i/s - 33.663M in 5.012756s
```
Matches get faster (from about 3.45M to 4.20M i/s in this run) and misses maintain the same speed.
[^1]: https://github.com/ruby/ruby/blob/24204d54ab730791bfbd0cd66b8e12f0bd62ca5d/re.c#L1737
[^2]: https://github.com/ruby/ruby/blob/24204d54ab730791bfbd0cd66b8e12f0bd62ca5d/re.c#L1738
[^3]: https://github.com/ruby/ruby/blob/24204d54ab730791bfbd0cd66b8e12f0bd62ca5d/re.c#L1686
[^4]: https://github.com/ruby/ruby/blob/24204d54ab730791bfbd0cd66b8e12f0bd62ca5d/re.c#L981
Notes:
Merged: https://github.com/ruby/ruby/pull/6206
|
|
[Misc #18891]
Notes:
Merged: https://github.com/ruby/ruby/pull/6094
|
|
|
|
Related to [Feature #18838]
Notes:
Merged: https://github.com/ruby/ruby/pull/6047
|
|
Co-Authored-By: Janosch Müller <janosch.mueller@betterplace.org>
Notes:
Merged: https://github.com/ruby/ruby/pull/6039
|
|
`Regexp.new` now supports passing the regexp flags not only as an
`Integer`, but also as a `String`. Unknown flags raise errors.
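
A minimal sketch of the behavior described above, assuming Ruby 3.2 or later (the pattern `"hello"` and the flag letters used here are illustrative choices, not taken from the patch):

```ruby
# Assumes Ruby 3.2+, where Regexp.new accepts flags as a String.
by_integer = Regexp.new("hello", Regexp::IGNORECASE | Regexp::MULTILINE)
by_string  = Regexp.new("hello", "im")

# Both spellings should produce the same option bits.
p by_integer.options == by_string.options
p by_string.match?("HELLO")

# An unknown flag letter is rejected instead of being silently accepted.
unknown_flag_error =
  begin
    Regexp.new("hello", "z")
    nil
  rescue ArgumentError => e
    e
  end
p unknown_flag_error.class
```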
Notes:
Merged: https://github.com/ruby/ruby/pull/6039
|
|
The second argument should now be `true`, `false`, `nil`, or an Integer.
This flag is sometimes confused with the third argument.
Notes:
Merged: https://github.com/ruby/ruby/pull/6039
|
|
|
|
|
|
|
|
Invalid escapes are handled at multiple levels. The first level
is in parse.y, so skip invalid unicode escape checks for regexps
in parse.y.
Make rb_reg_preprocess and unescape_nonascii accept the regexp
options. In unescape_nonascii, if the regexp is an extended
regexp, when "#" is encountered, ignore all characters until the
end of line or end of regexp.
Unfortunately, in extended regexps, you can use "#" as a non-comment
character inside a character class, so also parse "[" and "]"
specially for extended regexps, and only skip comments if "#" is
not inside a character class. Handle nested character classes as well.
This issue doesn't just affect extended regexps, it also affects
"(?#" comments inside all regexps. So for those comments, scan
until the trailing ")" and ignore the content inside.
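
The three comment-related cases above can be sketched with ordinary regexps. This only illustrates the syntax rules the preprocessor has to respect; it does not reproduce the invalid-escape bug itself, and the patterns are made up for the example:

```ruby
# In extended mode, "#" starts a comment that runs to the end of the line.
with_comment = /
  a  # this text is ignored
  b
/x
p with_comment.match?("ab")        # true

# Inside a character class, "#" is an ordinary character, even with /x.
hash_in_class = /[#]+/x
p hash_in_class.match("a##b")[0]   # "##"

# A (?#...) group is a comment in any regexp, extended or not.
inline_comment = /a(?# ignored)b/
p inline_comment.match?("ab")      # true
```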
I'm not sure if there are other corner cases not handled. A
better fix would be to redesign the regexp parser so that it
unescapes during parsing instead of before parsing, so you already
know the current parsing state.
Fixes [Bug #18294]
Co-authored-by: Nobuyoshi Nakada <nobu@ruby-lang.org>
Notes:
Merged: https://github.com/ruby/ruby/pull/5721
Merged-By: jeremyevans <code@jeremyevans.net>
|
|
Treats:
#to_s
#named_captures
#string
#inspect
#hash
#==
Notes:
Merged-By: BurdetteLamar <BurdetteLamar@Yahoo.com>
|
|
Treats:
#[]
#values_at
Notes:
Merged-By: BurdetteLamar <BurdetteLamar@Yahoo.com>
|
|
Treats:
#pre_match
#post_match
#to_a
#captures
Notes:
Merged-By: BurdetteLamar <BurdetteLamar@Yahoo.com>
|
|
Treats:
#begin
#end
#match
#match_length
Notes:
Merged-By: BurdetteLamar <BurdetteLamar@Yahoo.com>
|
|
Treats:
#regexp
#names
#size
#offset
Notes:
Merged-By: BurdetteLamar <BurdetteLamar@Yahoo.com>
|
|
Treats:
::new
::escape
::try_convert
::union
::last_match
Notes:
Merged-By: BurdetteLamar <BurdetteLamar@Yahoo.com>
|
|
Treats:
#fixed_encoding?
#hash
#==
#=~
#match
#match?
Also, in regexp.rdoc:
Changes heading from 'Special Global Variables' to 'Regexp Global Variables'.
Add tiny section 'Regexp Interpolation'.
Notes:
Merged-By: BurdetteLamar <BurdetteLamar@Yahoo.com>
|
|
Treats:
#source
#inspect
#to_s
#casefold?
#options
#names
#named_captures
Notes:
Merged-By: BurdetteLamar <BurdetteLamar@Yahoo.com>
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/5740
Merged-By: nobu <nobu@ruby-lang.org>
|
|
[Bug #18669]
|
|
Currently it has only one function prototype.
Notes:
Merged: https://github.com/ruby/ruby/pull/5703
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/5703
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/5703
|
|
[Feature #17837]
Notes:
Merged: https://github.com/ruby/ruby/pull/5703
|
|
* Add String#byteindex, String#byterindex, and MatchData#byteoffset [Feature #13110]
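
A short usage sketch, assuming Ruby 3.2 or later where these methods are available. The sample string is an arbitrary choice with one multibyte character so that character and byte positions differ:

```ruby
# Assumes Ruby 3.2+. In UTF-8, "é" occupies 2 bytes.
text = "héllo"

p text.index("l")        # character index: 2
p text.byteindex("l")    # byte index: 3
p text.rindex("l")       # character index: 3
p text.byterindex("l")   # byte index: 4

m = /l+/.match(text)
p m.offset(0)            # character offsets: [2, 4]
p m.byteoffset(0)        # byte offsets: [3, 5]
```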
Co-authored-by: NARUSE, Yui <naruse@airemix.jp>
Notes:
Merged-By: shugo <shugo@ruby-lang.org>
|
|
https://github.com/ruby/ruby/pull/5518#discussion_r809645406
|
|
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/5449
|
|
* Adding links to literals and Kernel
Notes:
Merged-By: BurdetteLamar <BurdetteLamar@Yahoo.com>
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/4925
Merged-By: nobu <nobu@ruby-lang.org>
|
|
In certain conditions, Regexp#match could return a MatchData with
missing captures. This seems to require at the least, multiple
threads calling a method that calls the same block/proc/lambda
which calls Regexp#match.
The race condition happens because the MatchData is passed
indirectly via the backref, and other threads can modify the
backref.
Fix the issue by:
1. Not reusing the existing MatchData from the backref, and always
allocating a new MatchData.
2. Passing the MatchData directly to the caller using a VALUE*,
instead of indirectly through the backref.
It's likely that variants of this issue exist for other Regexp
methods. Anywhere that MatchData is passed implicitly through
the backref is probably vulnerable to this issue.
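
As a rough illustration of the scenario (not a reproduction of the original bug report), the sketch below has several threads call one shared lambda that calls `Regexp#match` and reads the captures from the returned MatchData rather than from `$~`. The pattern, thread count, and input strings are assumptions made for the example:

```ruby
pattern = /(\w+)@(\w+)/

# One lambda shared by all threads, as in the conditions described above.
extract = ->(s) {
  m = pattern.match(s)   # use the return value, not the backref
  m && m.captures
}

threads = 4.times.map do |i|
  Thread.new do
    200.times.map { |j| extract.call("user#{i}x#{j}@host#{i}") }
  end
end

# Count results whose captures are missing or belong to another thread.
bad = 0
threads.each_with_index do |t, i|
  t.value.each do |caps|
    bad += 1 unless caps && caps.size == 2 && caps[1] == "host#{i}"
  end
end
p bad
```

On a Ruby that includes this fix, every call should see its own complete captures.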
Fixes [Bug #17507]
Notes:
Merged: https://github.com/ruby/ruby/pull/4734
|
|
The method to return the length of the matched substring
corresponding to the given argument.
Notes:
Merged: https://github.com/ruby/ruby/pull/4851
|
|
The method to return the single matched substring corresponding to
the given argument.
Notes:
Merged: https://github.com/ruby/ruby/pull/4851
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/4837
Merged-By: nobu <nobu@ruby-lang.org>
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/4822
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/4822
|
|
|
|
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/4695
Merged-By: nobu <nobu@ruby-lang.org>
|
|
Following non-special_const literals:
* T_REGEXP
Notes:
Merged: https://github.com/ruby/ruby/pull/4548
|
|
* add static modifier for rb_reg_eqq func
* add static modifier for rb_check_regexp_type func
Notes:
Merged-By: k0kubun <takashikkbn@gmail.com>
|
|
|
|
|
|
Also document that both :deprecated and :experimental are supported
:category option values.
The locations where warnings were marked as deprecation warnings
were previously reviewed by shyouhei.
Comment a couple locations where deprecation warnings should probably
be used but are not currently used because deprecation warning
enablement has not occurred at the time they are called
(RUBY_FREE_MIN, RUBY_HEAP_MIN_SLOTS, -K).
Add assert_deprecated_warn to test assertions. Use this to simplify
some tests, and fix failing tests after marking some warnings with
deprecated category.
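
A minimal sketch of how a categorized warning behaves from Ruby code, assuming Ruby 3.0 or later with the `category:` keyword on `Kernel#warn` and the `Warning[]` switch. The helper `captured_warning` and the message text are hypothetical, introduced only for this example:

```ruby
require "stringio"

# Hypothetical helper: emit one deprecation warning with the category
# switched on or off, and return whatever reached $stderr.
def captured_warning(enabled)
  saved_stderr, saved_flag = $stderr, Warning[:deprecated]
  $stderr = StringIO.new
  Warning[:deprecated] = enabled
  warn "old_api is deprecated", category: :deprecated
  $stderr.string
ensure
  $stderr = saved_stderr
  Warning[:deprecated] = saved_flag
end

p captured_warning(true)    # the message is written
p captured_warning(false)   # nothing is written
```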
Notes:
Merged: https://github.com/ruby/ruby/pull/3917
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/3483
|
|
Regexp literals are frozen, and dynamically compiled Regexp
literals (/#{expr}/) are frozen as well.
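
A quick check of the behavior described above, assuming Ruby 3.0 or later. The variable names are illustrative, and the `Regexp.new` line is included only for contrast:

```ruby
name = "world"

literal      = /hello/
interpolated = /hello #{name}/
constructed  = Regexp.new("hello")

p literal.frozen?        # true
p interpolated.frozen?   # true
p constructed.frozen?    # false: only literals are frozen
```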
Notes:
Merged: https://github.com/ruby/ruby/pull/3676
|
|
Some global variables should be used from non-main Ractors.
[Bug #17268]
```ruby
# ractor-local (derived from created ractor): debug
'$DEBUG' => $DEBUG,
'$-d' => $-d,
# ractor-local (derived from created ractor): verbose
'$VERBOSE' => $VERBOSE,
'$-w' => $-w,
'$-W' => $-W,
'$-v' => $-v,
# process-local (readonly): other commandline parameters
'$-p' => $-p,
'$-l' => $-l,
'$-a' => $-a,
# process-local (readonly): getpid
'$$' => $$,
# thread local: process result
'$?' => $?,
# scope local: match
'$~' => $~.inspect,
'$&' => $&,
'$`' => $`,
'$\'' => $',
'$+' => $+,
'$1' => $1,
# scope local: last line
'$_' => $_,
# scope local: last backtrace
'$@' => $@,
'$!' => $!,
# ractor local: stdin, out, err
'$stdin' => $stdin.inspect,
'$stdout' => $stdout.inspect,
'$stderr' => $stderr.inspect,
```
Notes:
Merged: https://github.com/ruby/ruby/pull/3670
|
|
https://github.com/ruby/ruby/runs/1041040167?check_suite_focus=true#step:11:177
```
compiling ../src/re.c
re.c
../src/re.c(317): error C2057: expected constant expression
../src/re.c(317): error C2466: cannot allocate an array of constant size 0
../src/re.c(467): error C2057: expected constant expression
../src/re.c(467): error C2466: cannot allocate an array of constant size 0
../src/re.c(467): error C2133: 'opts': unknown size
../src/re.c(559): error C2057: expected constant expression
../src/re.c(559): error C2466: cannot allocate an array of constant size 0
../src/re.c(559): error C2133: 'optbuf': unknown size
../src/re.c(673): error C2057: expected constant expression
../src/re.c(673): error C2466: cannot allocate an array of constant size 0
../src/re.c(673): error C2133: 'opts': unknown size
NMAKE : fatal error U1077: '"C:\Program Files (x86)\Microsoft Visual Studio\2019\Enterprise\VC\Tools\MSVC\14.27.29110\bin\HostX64\x64\cl.EXE"' : return code '0x2'
Stop.
```
|