| Age | Commit message (Collapse) | Author |
|
6b4f8945d600168bf530d21395da8293fbd5e8ba: [Backport #20909]
Check negative integer underflow
Many of Oniguruma functions need valid encoding strings
|
|
d5080f6e8b77364483ff6727b1065e45e180f05d: [Backport #20292]"
This reverts commit a54c717c7a74b91a3cdf20742c355e3ea42052d1.
|
|
d5080f6e8b77364483ff6727b1065e45e180f05d: [Backport #20292]
[Bug #20292] Truncate embedded string to new capacity
Fix -Wsign-compare on String#initialize
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
../string.c:1886:57: warning: comparison of integer expressions of different signedness: ‘size_t’ {aka ‘long unsigned int’} and ‘long int’ [-Wsign-compare]
1886 | if (STR_EMBED_P(str)) RUBY_ASSERT(osize <= str_embed_capa(str));
| ^~
|
|
Fix coderange of invalid_encoding_string.<<(ord)
Appending valid encoding character can change coderange from invalid to valid.
Example: "\x95".force_encoding('sjis')<<0x5C will be a valid string "\x{955C}"
---
string.c | 6 +++++-
test/ruby/test_string.rb | 3 +++
2 files changed, 8 insertions(+), 1 deletion(-)
|
|
Fix memory leak in grapheme clusters
[Bug #20150]
String#grapheme_cluters and String#each_grapheme_cluster leaks memory
because if the string is not UTF-8, then the created regex will not
be freed.
For example:
str = "hello world".encode(Encoding::UTF_32LE)
10.times do
1_000.times do
str.grapheme_clusters
end
puts `ps -o rss= -p #{$$}`
end
Before:
26000
42256
59008
75792
92528
109232
125936
142672
159392
176160
After:
9264
9504
9808
10000
10128
10224
10352
10544
10704
10896
---
string.c | 98 +++++++++++++++++++++++++++++++-----------------
test/ruby/test_string.rb | 11 ++++++
2 files changed, 75 insertions(+), 34 deletions(-)
|
|
[Bug #19748] Fix out-of-bound access in `String#byteindex`
---
string.c | 17 +++++++----------
test/ruby/test_string.rb | 3 +++
2 files changed, 10 insertions(+), 10 deletions(-)
|
|
String#bytesplice should return self
In Feature #19314, we concluded that the return value of String#bytesplice
should be changed from the source string to the receiver, because the source
string is useless and confusing when extra arguments are added.
This change should be included in Ruby 3.2.1.
---
string.c | 4 ++--
test/ruby/test_string.rb | 2 +-
2 files changed, 3 insertions(+), 3 deletions(-)
|
|
Calling `String#scan` without a block creates an incomplete MatchData
object whose `RMATCH(match)->str` is Qfalse. Usually this object is not
leaked, but it was possible to pull it by using ObjectSpace.each_object.
This change hides the internal MatchData object by using rb_obj_hide.
Fixes [Bug #19159]
Notes:
Merged: https://github.com/ruby/ruby/pull/6836
|
|
It's questionable whether we want to allow rstrip to work for strings
where the broken coderange occurs before the trailing whitespace and
not after, but this approach is probably simpler, and I don't think
users should expect string operations like rstrip to work on broken
strings.
In some cases, this changes rstrip to raise
Encoding::CompatibilityError instead of ArgumentError. However, as
the problem is related to an encoding issue in the receiver, and due
not due to an issue with an argument, I think
Encoding::CompatibilityError is the more appropriate error.
Fixes [Bug #18931]
Notes:
Merged: https://github.com/ruby/ruby/pull/6282
|
|
Co-Authored-By: Aaron Patterson <tenderlove@ruby-lang.org>
Notes:
Merged: https://github.com/ruby/ruby/pull/6590
|
|
This is an inelegant hack, by manually checking for this specific
code point in rb_str_inspect. Some testing indicates that this is
the only code point affected.
It's possible a better fix would be inside of lower-level encoding
code, such that rb_enc_isprint would return false and not true for
codepoint 0x85.
Fixes [Bug #16842]
Notes:
Merged: https://github.com/ruby/ruby/pull/4229
|
|
Previously, it was including one newline when chomp was used,
which is inconsistent with IO#each_line behavior. This makes
behavior consistent with IO#each_line, chomping all paragraph
separators (multiple consecutive newlines), but not single
newlines.
Partially Fixes [Bug #18768]
Notes:
Merged: https://github.com/ruby/ruby/pull/5960
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/5584
|
|
|
|
* Add String#byteindex, String#byterindex, and MatchData#byteoffset [Feature #13110]
Co-authored-by: NARUSE, Yui <naruse@airemix.jp>
Notes:
Merged-By: shugo <shugo@ruby-lang.org>
|
|
|
|
|
|
|
|
Also, check if a suffix is empty, to guarantee the assumption of
`onigenc_get_left_adjust_char_head` that `*s` is always accessible,
even in the case of `SHARABLE_MIDDLE_SUBSTRING`.
|
|
String#initialize can leak memory when called on a string that is marked
with STR_NOFREE because it does not unset the STR_NOFREE flag.
Notes:
Merged: https://github.com/ruby/ruby/pull/4814
|
|
It looks like GitHub syntax-highlighting does not support an empty
heredoc. This change adds a newline to make GitHub can handle the syntax
appropriately.
https://bugs.ruby-lang.org/issues/17662
|
|
The documentation already specifies that they strip whitespace
and defines whitespace to include null.
This wraps the new behavior in the appropriate guards in the specs,
but does not specify behavior for previous versions, because this
is a bug that could be backported.
Fixes [Bug #17467]
Notes:
Merged: https://github.com/ruby/ruby/pull/4164
|
|
This modifies the following String methods to return String instances
instead of subclass instances:
* String#*
* String#capitalize
* String#center
* String#chomp
* String#chop
* String#delete
* String#delete_prefix
* String#delete_suffix
* String#downcase
* String#dump
* String#each/#each_line
* String#gsub
* String#ljust
* String#lstrip
* String#partition
* String#reverse
* String#rjust
* String#rpartition
* String#rstrip
* String#scrub
* String#slice!
* String#slice/#[]
* String#split
* String#squeeze
* String#strip
* String#sub
* String#succ/#next
* String#swapcase
* String#tr
* String#tr_s
* String#upcase
This also fixes a bug in String#swapcase where it would return the
receiver instead of a copy of the receiver if the receiver was the
empty string.
Some string methods were left to return subclass instances:
* String#+@
* String#-@
Both of these methods will return the receiver (subclass instance)
in some cases, so it is best to keep the returned class consistent.
Fixes [#10845]
Notes:
Merged: https://github.com/ruby/ruby/pull/3701
|
|
And `-w` option turns it on.
Notes:
Merged: https://github.com/ruby/ruby/pull/3481
|
|
Returns `nil` instead of an empty string when non-integer number is given (to make it 2.7 compatible).
Notes:
Merged-By: soutaro <matsumoto@soutaro.com>
|
|
https://bugs.ruby-lang.org/issues/6670#change-75907
|
|
If the passed string is frozen, bare and not shared, then there
is no need to duplicate it.
Ref: 4ab69ebbd7cef8539f687e1f948845d076461dc6
Ref: https://bugs.ruby-lang.org/issues/11386
Notes:
Merged: https://github.com/ruby/ruby/pull/3430
|
|
When the pattern Regexp given to String#index and String#rindex
contain a /\K/ (lookbehind) operator, these methods return the
position where the beginning of the lookbehind pattern matches, while
they are expected to return the position where the \K matches.
```
# without patch
"abcdbce".index(/b\Kc/) # => 1
"abcdbce".rindex(/b\Kc/) # => 4
```
This patch fixes this problem by using BEG(0) instead of the return
value of rb_reg_search.
```
# with patch
"abcdbce".index(/b\Kc/) # => 2
"abcdbce".rindex(/b\Kc/) # => 5
```
Fixes [Bug #17118]
Notes:
Merged: https://github.com/ruby/ruby/pull/3414
|
|
When the pattern given to String#partition and String#rpartition
contain a /\K/ (lookbehind) operator, the methods return strings
sliced at incorrect positions.
```
# without patch
"abcdbce".partition(/b\Kc/) # => ["a", "c", "cdbce"]
"abcdbce".rpartition(/b\Kc/) # => ["abcd", "c", "ce"]
```
This patch fixes the problem by using BEG(0) instead of the return
value of rb_reg_search.
```
# with patch
"abcdbce".partition(/b\Kc/) # => ["ab", "c", "dbce"]
"abcdbce".rpartition(/b\Kc/) # => ["abcdb", "c", "e"]
```
As a side-effect this patch makes String#partition 2x faster when the
pattern is a costly Regexp by performing Regexp search only once,
which was unexpectedly done twice in the original implementation.
Fixes [Bug #17119]
Notes:
Merged: https://github.com/ruby/ruby/pull/3413
|
|
Use BEG(0) instead of the result of rb_reg_search to handle the cases
when the separator Regexp contains /\K/ (lookbehind) operator.
Fixes [Bug #17113]
Notes:
Merged: https://github.com/ruby/ruby/pull/3407
|
|
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/2920
|
|
Split with the matched part when the separator matches the empty
part at the beginning. [Bug #11014]
|
|
[Bug #16438]
|
|
This reverts commit 2a22a6b2d8465934e75520a7fdcf522d50890caf.
Revert [Feature #13083]
|
|
This removes the related tests, and puts the related specs behind
version guards. This affects all code in lib, including some
libraries that may want to support older versions of Ruby.
Notes:
Merged: https://github.com/ruby/ruby/pull/2476
|
|
* {String|Symbol}#match{?} with nil returns falsy
To improve consistency with Regexp#match{?}
* String#match(nil) returns `nil` instead of TypeError
* String#match?(nil) returns `false` instead of TypeError
* Symbol#match(nil) returns `nil` instead of TypeError
* Symbol#match?(nil) returns `false` instead of TypeError
* Prefer exception
* Follow empty ENV
* Drop outdated specs
* Write ruby/spec for above
https://github.com/ruby/ruby/pull/1506/files#r183242981
* Fix merge miss
|
|
Cfuncs that use rb_scan_args with the : entry suffer similar keyword
argument separation issues that Ruby methods suffer if the cfuncs
accept optional or variable arguments.
This makes the following changes to : handling.
* Treats as **kw, prompting keyword argument separation warnings
if called with a positional hash.
* Do not look for an option hash if empty keywords are provided.
For backwards compatibility, treat an empty keyword splat as a empty
mandatory positional hash argument, but emit a a warning, as this
behavior will be removed in Ruby 3. The argument number check
needs to be moved lower so it can correctly handle an empty
positional argument being added.
* If the last argument is nil and it is necessary to treat it as an option
hash in order to make sure all arguments are processed, continue to
treat the last argument as the option hash. Emit a warning in this case,
as this behavior will be removed in Ruby 3.
* If splitting the keyword hash into two hashes, issue a warning, as we
will not be splitting hashes in Ruby 3.
* If the keyword argument is required to fill a mandatory positional
argument, continue to do so, but emit a warning as this behavior will
be going away in Ruby 3.
* If keyword arguments are provided and the last argument is not a hash,
that indicates something wrong. This can happen if a cfunc is calling
rb_scan_args multiple times, and providing arguments that were not
passed to it from Ruby. Callers need to switch to the new
rb_scan_args_kw function, which allows passing of whether keywords
were provided.
This commit fixes all warnings caused by the changes above.
It switches some function calls to *_kw versions with appropriate
kw_splat flags. If delegating arguments, RB_PASS_CALLED_KEYWORDS
is used. If creating new arguments, RB_PASS_KEYWORDS is used if
the last argument is a hash to be treated as keywords.
In open_key_args in io.c, use rb_scan_args_kw.
In this case, the arguments provided come from another C
function, not Ruby. The last argument may or may not be a hash,
so we can't set keyword argument mode. However, if it is a
hash, we don't want to warn when treating it as keywords.
In Ruby files, make sure to appropriately use keyword splats
or literal keywords when calling Cfuncs that now issue keyword
argument separation warnings through rb_scan_args. Also, make
sure not to pass nil in place of an option hash.
Work around Kernel#warn warnings due to problems in the Rubygems
override of the method. There is an open pull request to fix
these issues in Rubygems, but part of the Rubygems tests for
their override fail on ruby-head due to rb_scan_args not
recognizing empty keyword splats, which this commit fixes.
Implementation wise, adding rb_scan_args_kw is kind of a pain,
because rb_scan_args takes a variable number of arguments.
In order to not duplicate all the code, the function internals need
to be split into two functions taking a va_list, and to avoid passing
in a ton of arguments, a single struct argument is used to handle
the variables previously local to the function.
Notes:
Merged-By: jeremyevans <code@jeremyevans.net>
|
|
* string.c (rb_str_sub_bang): retrieves a pointer to the
replacement string buffer just before using it, for the case of
replacement with the receiver string itself. [Bug #16105]
|
|
* string.c (rb_str_split_m): occupy match data not to be modified
during yielding the block. [Bug #16024]
|
|
|
|
rb_fstring behavior in this case is to freeze the receiver. I'm
not sure if that should be changed, so this takes the conservative
approach of duping the receiver in String#-@ before passing
to rb_fstring.
Fixes [Bug #15926]
|
|
* string.c (get_reg_grapheme_cluster): make regexp from properly
encoded sources fro wide-char encodings. [Bug #15965]
* regparse.c (node_extended_grapheme_cluster): suppress false
duplicated range warning for the time being.
|
|
|
|
* string.c (rb_str_init): allocate new buffer if the string is
shared. [Bug #15937]
|
|
* string.c (rb_str_init): preserve the embedded content when
self-copying with a capacity. [Bug #15937]
|
|
Registering a string that depend on a dependent string as fstring
can lead to use-after-free. See c06ddfe and 3f95620 for details.
The following script triggers use-after-free on trunk, 2.4.6, 2.5.5
and 2.6.3. Credits to @wanabe for using eval as a cross-version way
of registering a fstring.
```ruby
a = ('j' * 24).b.b
eval('', binding, a)
p a
4.times { GC.start }
p a
```
- string.c (str_replace_shared_without_enc): when given a
dependent string, depend on the root of the dependent
string.
[Bug #15934]
|
|
Skip the webrick httpauth tests that use crypt when testing on
OpenBSD.
Fixes [Bug #11363]
|
|
* string.c (str_duplicate): share the root shared string if the
original string is already sharing, so that all shared strings
refer the root shared string directly. indirect sharing can
cause a dangling pointer.
[Bug #15792]
|
|
* string.c (rb_str_split_m): warn use of non-nil $;.
* string.c (rb_fs_setter): warn when set to non-nil value.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67603 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|