| Age | Commit message (Collapse) | Author |
|
|
|
[Bug #20292] Truncate embedded string to new capacity
|
|
#20190] (#10300)
Fix coderange of invalid_encoding_string.<<(ord)
Appending valid encoding character can change coderange from invalid to valid.
Example: "\x95".force_encoding('sjis')<<0x5C will be a valid string "\x{955C}"
|
|
#20150] (#10253)
Fix memory leak in grapheme clusters
[Bug #20150]
String#grapheme_cluters and String#each_grapheme_cluster leaks memory
because if the string is not UTF-8, then the created regex will not
be freed.
For example:
str = "hello world".encode(Encoding::UTF_32LE)
10.times do
1_000.times do
str.grapheme_clusters
end
puts `ps -o rss= -p #{$$}`
end
Before:
26000
42256
59008
75792
92528
109232
125936
142672
159392
176160
After:
9264
9504
9808
10000
10128
10224
10352
10544
10704
10896
---
string.c | 98 +++++++++++++++++++++++++++++++-----------------
test/ruby/test_string.rb | 11 ++++++
2 files changed, 75 insertions(+), 34 deletions(-)
|
|
The test fails when RGENGC_CHECK_MODE is turned on:
TestString#test_sub_gc_compact_stress = 9.42 s
1) Failure:
TestString#test_sub_gc_compact_stress [test/ruby/test_string.rb:2089]:
<"aaa [amp] yyy"> expected but was
<"aaa [] yyy">.
|
|
|
|
|
|
String#chomp! returned nil without checking the number of passed
arguments in this case.
|
|
|
|
We need to guard match from GC because otherwise it could end up being
reclaimed or moved in compaction.
|
|
We need to guard match from GC because otherwise it could end up being
reclaimed or moved in compaction.
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/8348
|
|
- String#start_with?
- String#delete_prefix
- String#delete_prefix!
Notes:
Merged: https://github.com/ruby/ruby/pull/8296
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/8296
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/8080
Merged-By: nobu <nobu@ruby-lang.org>
|
|
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/7988
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/7988
|
|
|
|
bytesplice(index, length, str, str_index, str_length) -> string
bytesplice(range, str, str_range) -> string
In these forms, the content of +self+ is replaced by str.byteslice(str_index, str_length) or str.byteslice(str_range); however the substring of +str+ is not allocated as a new string.
Notes:
Merged: https://github.com/ruby/ruby/pull/7160
|
|
In Feature #19314, we concluded that the return value of String#bytesplice
should be changed from the source string to the receiver, because the source
string is useless and confusing when extra arguments are added.
This change should be included in Ruby 3.2.1.
|
|
This optimisation is no longer helpful now that we use VWA to allocate
strings in larger size pools where they can be embedded.
Notes:
Merged: https://github.com/ruby/ruby/pull/6965
|
|
Calling `String#scan` without a block creates an incomplete MatchData
object whose `RMATCH(match)->str` is Qfalse. Usually this object is not
leaked, but it was possible to pull it by using ObjectSpace.each_object.
This change hides the internal MatchData object by using rb_obj_hide.
Fixes [Bug #19159]
Notes:
Merged: https://github.com/ruby/ruby/pull/6836
|
|
It's questionable whether we want to allow rstrip to work for strings
where the broken coderange occurs before the trailing whitespace and
not after, but this approach is probably simpler, and I don't think
users should expect string operations like rstrip to work on broken
strings.
In some cases, this changes rstrip to raise
Encoding::CompatibilityError instead of ArgumentError. However, as
the problem is related to an encoding issue in the receiver, and due
not due to an issue with an argument, I think
Encoding::CompatibilityError is the more appropriate error.
Fixes [Bug #18931]
Notes:
Merged: https://github.com/ruby/ruby/pull/6282
|
|
Co-Authored-By: Aaron Patterson <tenderlove@ruby-lang.org>
Notes:
Merged: https://github.com/ruby/ruby/pull/6590
|
|
This is an inelegant hack, by manually checking for this specific
code point in rb_str_inspect. Some testing indicates that this is
the only code point affected.
It's possible a better fix would be inside of lower-level encoding
code, such that rb_enc_isprint would return false and not true for
codepoint 0x85.
Fixes [Bug #16842]
Notes:
Merged: https://github.com/ruby/ruby/pull/4229
|
|
Previously, it was including one newline when chomp was used,
which is inconsistent with IO#each_line behavior. This makes
behavior consistent with IO#each_line, chomping all paragraph
separators (multiple consecutive newlines), but not single
newlines.
Partially Fixes [Bug #18768]
Notes:
Merged: https://github.com/ruby/ruby/pull/5960
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/5584
|
|
|
|
* Add String#byteindex, String#byterindex, and MatchData#byteoffset [Feature #13110]
Co-authored-by: NARUSE, Yui <naruse@airemix.jp>
Notes:
Merged-By: shugo <shugo@ruby-lang.org>
|
|
|
|
|
|
|
|
Also, check if a suffix is empty, to guarantee the assumption of
`onigenc_get_left_adjust_char_head` that `*s` is always accessible,
even in the case of `SHARABLE_MIDDLE_SUBSTRING`.
|
|
String#initialize can leak memory when called on a string that is marked
with STR_NOFREE because it does not unset the STR_NOFREE flag.
Notes:
Merged: https://github.com/ruby/ruby/pull/4814
|
|
It looks like GitHub syntax-highlighting does not support an empty
heredoc. This change adds a newline to make GitHub can handle the syntax
appropriately.
https://bugs.ruby-lang.org/issues/17662
|
|
The documentation already specifies that they strip whitespace
and defines whitespace to include null.
This wraps the new behavior in the appropriate guards in the specs,
but does not specify behavior for previous versions, because this
is a bug that could be backported.
Fixes [Bug #17467]
Notes:
Merged: https://github.com/ruby/ruby/pull/4164
|
|
This modifies the following String methods to return String instances
instead of subclass instances:
* String#*
* String#capitalize
* String#center
* String#chomp
* String#chop
* String#delete
* String#delete_prefix
* String#delete_suffix
* String#downcase
* String#dump
* String#each/#each_line
* String#gsub
* String#ljust
* String#lstrip
* String#partition
* String#reverse
* String#rjust
* String#rpartition
* String#rstrip
* String#scrub
* String#slice!
* String#slice/#[]
* String#split
* String#squeeze
* String#strip
* String#sub
* String#succ/#next
* String#swapcase
* String#tr
* String#tr_s
* String#upcase
This also fixes a bug in String#swapcase where it would return the
receiver instead of a copy of the receiver if the receiver was the
empty string.
Some string methods were left to return subclass instances:
* String#+@
* String#-@
Both of these methods will return the receiver (subclass instance)
in some cases, so it is best to keep the returned class consistent.
Fixes [#10845]
Notes:
Merged: https://github.com/ruby/ruby/pull/3701
|
|
And `-w` option turns it on.
Notes:
Merged: https://github.com/ruby/ruby/pull/3481
|
|
Returns `nil` instead of an empty string when non-integer number is given (to make it 2.7 compatible).
Notes:
Merged-By: soutaro <matsumoto@soutaro.com>
|
|
https://bugs.ruby-lang.org/issues/6670#change-75907
|
|
If the passed string is frozen, bare and not shared, then there
is no need to duplicate it.
Ref: 4ab69ebbd7cef8539f687e1f948845d076461dc6
Ref: https://bugs.ruby-lang.org/issues/11386
Notes:
Merged: https://github.com/ruby/ruby/pull/3430
|
|
When the pattern Regexp given to String#index and String#rindex
contain a /\K/ (lookbehind) operator, these methods return the
position where the beginning of the lookbehind pattern matches, while
they are expected to return the position where the \K matches.
```
# without patch
"abcdbce".index(/b\Kc/) # => 1
"abcdbce".rindex(/b\Kc/) # => 4
```
This patch fixes this problem by using BEG(0) instead of the return
value of rb_reg_search.
```
# with patch
"abcdbce".index(/b\Kc/) # => 2
"abcdbce".rindex(/b\Kc/) # => 5
```
Fixes [Bug #17118]
Notes:
Merged: https://github.com/ruby/ruby/pull/3414
|
|
When the pattern given to String#partition and String#rpartition
contain a /\K/ (lookbehind) operator, the methods return strings
sliced at incorrect positions.
```
# without patch
"abcdbce".partition(/b\Kc/) # => ["a", "c", "cdbce"]
"abcdbce".rpartition(/b\Kc/) # => ["abcd", "c", "ce"]
```
This patch fixes the problem by using BEG(0) instead of the return
value of rb_reg_search.
```
# with patch
"abcdbce".partition(/b\Kc/) # => ["ab", "c", "dbce"]
"abcdbce".rpartition(/b\Kc/) # => ["abcdb", "c", "e"]
```
As a side-effect this patch makes String#partition 2x faster when the
pattern is a costly Regexp by performing Regexp search only once,
which was unexpectedly done twice in the original implementation.
Fixes [Bug #17119]
Notes:
Merged: https://github.com/ruby/ruby/pull/3413
|
|
Use BEG(0) instead of the result of rb_reg_search to handle the cases
when the separator Regexp contains /\K/ (lookbehind) operator.
Fixes [Bug #17113]
Notes:
Merged: https://github.com/ruby/ruby/pull/3407
|
|
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/2920
|
|
Split with the matched part when the separator matches the empty
part at the beginning. [Bug #11014]
|
|
[Bug #16438]
|
|
This reverts commit 2a22a6b2d8465934e75520a7fdcf522d50890caf.
Revert [Feature #13083]
|