Age | Commit message | Author |
|
Fix memory leak
* string.c (str_make_independent_expand): free independent buffer.
[Bug #15935]
Co-Authored-By: luke-gru (Luke Gruber) <luke.gru@gmail.com>
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/branches/ruby_2_6@67805 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
Tag string shared roots to fix use-after-free
The buffer deduplication codepath in rb_fstring can be used to free the buffer
of shared string roots, which leads to use-after-free.
Introduce a new flag to tag strings that at one point have been a shared root.
Check for it in rb_fstring to avoid freeing buffers that are shared by
multiple strings. This change is based on nobu's idea in [ruby-core:94838].
The included test case tests for the sequence of calls to internal functions
that lead to this bug. See attached ticket for Ruby level repros.
[Bug #16151]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/branches/ruby_2_6@67804 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
Fixed heap-use-after-free
* string.c (rb_str_sub_bang): retrieves a pointer to the
replacement string buffer just before using it, for the case of
replacement with the receiver string itself. [Bug #16105]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/branches/ruby_2_6@67747 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
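The case described above, where the replacement passed to `String#sub!` is the receiver itself, can be illustrated at the Ruby level. This is a minimal sketch of the call shape only; it is not the reproduction from the ticket, and the variable name is illustrative.

```ruby
# The replacement argument is the very string being modified.
s = "abc".dup
s.sub!(/b/, s)   # "b" is replaced by the original content "abc"
p s              # => "aabcc"
```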
|
|
f1b76ea63ce40670071a857f408a4747c571f1e9,1d1f98d49c9908f4e3928e582d31fd2e9f252f92: [Backport #16024]
Occupy match data
* string.c (rb_str_split_m): occupy match data not to be modified
during yielding the block. [Bug #16024]
Reuse match data
* string.c (rb_str_split_m): reuse occupied match data. [Bug #16024]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/branches/ruby_2_6@67743 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
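A rough Ruby-level picture of the situation these two commits address: `String#split` is given a block, and the block itself performs another regexp match while the split is still in progress. This is a sketch under that assumption, not the test case from the ticket; the names are illustrative.

```ruby
fields = []
result = "a1,b2,c3".split(/,/) do |field|
  field =~ /\d/        # a separate match performed while split is yielding
  fields << field
end

p fields   # each piece is yielded intact
p result   # with a block, split returns the receiver
```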
|
|
8aecc90974ab1ac87056f77e2cb3406c5c041504,2f6cc15cdb3d64135b29cfd5ee376a5a03ebbee7: [Backport #15965]
Hoisted out WIDE_ENCODINGS
Fixed String#grapheme_clusters with wide encodings
* string.c (get_reg_grapheme_cluster): make regexp from properly
encoded sources for wide-char encodings. [Bug #15965]
* regparse.c (node_extended_grapheme_cluster): suppress false
duplicated range warning for the time being.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/branches/ruby_2_6@67741 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
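As a hedged illustration of what "wide encodings" means here, the sketch below calls `String#grapheme_clusters` on a UTF-16LE string. The sample text is an assumption chosen for illustration (a base letter plus a combining accent, followed by a plain letter); it is not taken from the ticket.

```ruby
utf8 = "e\u0301a"                 # "e" + combining acute accent, then "a"
wide = utf8.encode("UTF-16LE")    # a wide (2-byte unit) encoding

clusters = wide.grapheme_clusters
p clusters.size                               # the accent stays attached to "e"
p clusters.map { |c| c.encode("UTF-8") }
```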
|
|
Get rid of undefined behavior
* string.c (rb_str_sub_bang): str and repl can be same.
[Bug #15946]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/branches/ruby_2_6@67739 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
28678997e40869f5591eae60edd9757334426ffb,8797f48373dcfa3ff8e748667732dea8aea4347e: [Backport #15937]
Preserve the string content at self-copying
* string.c (rb_str_init): preserve the embedded content when
self-copying with a capacity. [Bug #15937]
New buffer for shared string
* string.c (rb_str_init): allocate new buffer if the string is
shared. [Bug #15937]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/branches/ruby_2_6@67738 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
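"Self-copying with a capacity" can be pictured as re-initializing a string from itself while also passing `capacity:`. The following is a minimal sketch of that call shape, assuming this is the scenario the commit refers to; `initialize` is private, hence the `send`.

```ruby
s = "hello".dup
s.send(:initialize, s, capacity: 100)   # copy the string onto itself
p s                                     # the content should be preserved
```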
|
|
String#b: Don't depend on dependent string
Registering a string that depends on a dependent string as fstring
can lead to use-after-free. See c06ddfe and 3f95620 for details.
The following script triggers use-after-free on trunk, 2.4.6, 2.5.5
and 2.6.3. Credits to @wanabe for using eval as a cross-version way
of registering a fstring.
```ruby
a = ('j' * 24).b.b
eval('', binding, a)
p a
4.times { GC.start }
p a
```
- string.c (str_replace_shared_without_enc): when given a
dependent string, depend on the root of the dependent
string.
[Bug #15934]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/branches/ruby_2_6@67733 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
Fix memory leak
* string.c (str_replace_shared_without_enc): free previous buffer
before replaced.
* parse.y (gettable): make sure in advance that the `__FILE__`
object shares a fstring, to get rid of replacement with the
fstring later.
TODO: this hack may be needed in other places.
[Bug #15916]
Co-Authored-By: luke-gru (Luke Gruber) <luke.gru@gmail.com>
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/branches/ruby_2_6@67732 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
3f9562015e651735bfc2fdd14e8f6963b673e22a,c06ddfee878524168e4af07443217ed2f8d0954b,3b3b4a44e57dfe03ce3913009d69a33d6f6100be: [Backport #15792]
Get rid of indirect sharing
* string.c (str_duplicate): share the root shared string if the
original string is already sharing, so that all shared strings
refer to the root shared string directly. Indirect sharing can
cause a dangling pointer.
[Bug #15792]
str_duplicate: Don't share with a frozen shared string
This is a follow up for 3f9562015e651735bfc2fdd14e8f6963b673e22a.
Before this commit, it was possible to create a shared string which
shares with another shared string by passing a frozen shared string
to `str_duplicate`.
Such a string looks like:
```
--------                    -----------------
| root | ------ owns -----> | root's buffer |
--------                    -----------------
    ^                             ^   ^
-----------                       |   |
| shared1 | ------ references -----   |
-----------                           |
    ^                                 |
-----------                           |
| shared2 | ------ references ---------
-----------
```
This is bad news because `rb_fstring(shared2)` can make `shared1`
independent, which severs the reference from `shared1` to `root`:
```c
/* from fstr_update_callback() */
str = str_new_frozen(rb_cString, shared2); /* can return shared1 */
if (STR_SHARED_P(str)) { /* shared1 is also a shared string */
str_make_independent(str); /* no frozen check */
}
```
If `shared1` was the only reference to `root`, then `root` can be
reclaimed by the GC, leaving `shared2` in a corrupted state:
```
-----------                         --------------------
| shared1 | -------- owns --------> | shared1's buffer |
-----------                         --------------------
    ^
    |
-----------                         -------------------------
| shared2 | ------ references ----> | root's buffer (freed) |
-----------                         -------------------------
```
Here is a reproduction script for the situation this commit fixes.
```ruby
a = ('a' * 24).strip.freeze.strip
-a
p a
4.times { GC.start }
p a
```
- string.c (str_duplicate): always share with the root string when
the original is a shared string.
- test_rb_str_dup.rb: specifically test `rb_str_dup` to make
sure it does not try to share with a shared string.
[Bug #15792]
Closes: https://github.com/ruby/ruby/pull/2159
Update dependencies
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/branches/ruby_2_6@67731 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
Fix potential memory leak
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/branches/ruby_2_6@67712 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
string.c: respect the actual encoding
* string.c (rb_enc_str_coderange): respect the actual encoding
if a BOM is present, and scan for the actual code range.
[ruby-core:91662] [Bug #15635]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/branches/ruby_2_6@67181 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
Follow behaviour of IO#ungetbyte
see r65802 and [Bug #14359]
* expand tabs.
setbyte / ungetbyte allow out-of-range integers
* string.c: String#setbyte to accept arbitrary integers [Bug #15460]
* io.c: ditto for IO#ungetbyte
* ext/stringio/stringio.c: ditto for StringIO#ungetbyte
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/branches/ruby_2_6@66845 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
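A short sketch of the `String#setbyte` behaviour described above, on the assumption that an out-of-range integer is reduced to its low 8 bits before being stored. The values are illustrative.

```ruby
s = "a".dup

s.setbyte(0, 0x141)     # out of the 0..255 range
low = s.getbyte(0)      # only the low byte, 0x41 ("A"), is stored

s.setbyte(0, -1)        # negative integers are accepted too
neg = s.getbyte(0)

p low, neg
```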
|
|
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66375 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
The modern Georgian script is special in that it has an 'uppercase'
variant called MTAVRULI which can be used for emphasis of whole words,
for screamy headlines, and so on. However, in contrast to all other
bicameral scripts, there is no usage of capitalizing the first letter
in a word or a sentence. Words with mixed capitalization are not used
at all.
We therefore implement special behavior for String#capitalize. Formally,
we define String#capitalize as first applying String#downcase for the
whole string, then using titlecase on the first letter. Because Georgian
defines titlecase as the identity function both for MTAVRULI ('uppercase')
and Mkhedruli (lowercase), this results in String#capitalize being
equivalent to String#downcase for Georgian. This avoids undesirable
mixed case.
* enc/unicode.c: Actual implementation
* string.c: Add mention of this special case for documentation
* test/ruby/enc/test_case_mapping.rb: Add two tests, a general one
that uses String#capitalize on some (including nonsensical)
combinations of MTAVRULI and Mkhedruli, and a canary test to
detect the potential assignment of characters to the currently
open slots (holes) at U+1CBB and U+1CBC.
* test/ruby/enc/test_case_comprehensive.rb: Tweak generation of
expectation data.
Together with r65933, this closes issue #14839.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66300 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
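The rule above, that `String#capitalize` is equivalent to `String#downcase` for Georgian, can be sketched as follows. The three sample strings are illustrative and are written with escapes so the code points are explicit.

```ruby
mtavruli  = "\u1C90\u1C91\u1C92"   # "uppercase" variant
mkhedruli = "\u10D0\u10D1\u10D2"   # the same letters in lowercase
mixed     = "\u1C90\u10D1\u10D2"   # nonsensical mixed capitalization

# capitalize never produces mixed case for Georgian text.
[mtavruli, mkhedruli, mixed].each do |word|
  p word.capitalize == word.downcase
end
```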
|
|
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66245 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
Especially over checking argc then calling rb_scan_args just to
raise an ArgumentError.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66238 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66154 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65957 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65956 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
Unicode Text Segmentation considers CRLF as a character. [Bug #15337]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65954 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
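A minimal sketch of the rule stated above, using `String#grapheme_clusters`: the CR LF pair is expected to come back as a single cluster rather than two.

```ruby
clusters = "a\r\nb".grapheme_clusters
p clusters          # => ["a", "\r\n", "b"]
```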
|
|
It seems that decades ago, ruby was written under the assumption that
char is unsigned, which is of course a false assumption. We
need to explicitly store a numeric value into an unsigned char
variable to tell that we expect a 0..255 value.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65900 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
The behaviour of String#setbyte has been depending on the width
of int, which is not portable. Must check explicitly.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65804 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
Looking at the lines right above, it is clearer than a blue sky
that we cannot assume `p` to be aligned at all when
UNALIGNED_WORD_ACCESS is true. It is a wrong idea to use
__builtin_assume_aligned for that situation.
See also: https://travis-ci.org/ruby/ruby/jobs/451710732#L2007
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65592 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
These APIs are much like <valgrind/memcheck.h>. Use them to
fine-grain annotate the usage of our memory.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65573 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
* string.c (rb_str_format_m): should pass `int`.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65456 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
* transient_heap.c, transient_heap.h: implement TransientHeap (theap).
theap is designed for Ruby's object system. theap is like the Eden heap
in generational GC terminology. theap allocation is very fast because
it only needs to bump up pointer and deallocation is also fast because
we don't do anything. However we need to evacuate (Copy GC terminology)
if theap memory is long-lived. Evacuation logic is needed for each type.
See [Bug #14858] for details.
* array.c: Now, theap for T_ARRAY is supported.
ary_heap_alloc() tries to allocate memory area from theap. If this trial
succeeds, this array has theap ptr and RARRAY_TRANSIENT_FLAG is turned on.
We don't need to free theap ptr.
* ruby.h: RARRAY_CONST_PTR() returns malloc'ed memory area. It means that
if ary is allocated at theap, force evacuation to malloc'ed memory.
It makes programs slow, but very compatible with current code because
theap memory can be evacuated (theap memory will be recycled).
If you want to get transient heap ptr, use RARRAY_CONST_PTR_TRANSIENT()
instead of RARRAY_CONST_PTR(). If you can't understand when evacuation
will occur, use RARRAY_CONST_PTR().
(re-commit of r65444)
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65449 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65448 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65447 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
* transient_heap.c, transient_heap.h: implement TransientHeap (theap).
theap is designed for Ruby's object system. theap is like the Eden heap
in generational GC terminology. theap allocation is very fast because
it only needs to bump up pointer and deallocation is also fast because
we don't do anything. However we need to evacuate (Copy GC terminology)
if theap memory is long-lived. Evacuation logic is needed for each type.
See [Bug #14858] for details.
* array.c: Now, theap for T_ARRAY is supported.
ary_heap_alloc() tries to allocate memory area from theap. If this trial
succeeds, this array has theap ptr and RARRAY_TRANSIENT_FLAG is turned on.
We don't need to free theap ptr.
* ruby.h: RARRAY_CONST_PTR() returns malloc'ed memory area. It means that
if ary is allocated at theap, force evacuation to malloc'ed memory.
It makes programs slow, but very compatible with current code because
theap memory can be evacuated (theap memory will be recycled).
If you want to get transient heap ptr, use RARRAY_CONST_PTR_TRANSIENT()
instead of RARRAY_CONST_PTR(). If you can't understand when evacuation
will occur, use RARRAY_CONST_PTR().
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65444 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
* string.c: [DOC] improve docs for String#{strip,lstrip,rstrip}{,!}:
small clarification, avoid referring to the receiver as `str'
(does not appear in the call-seq of the generated HTML docs),
enable links for cross-references, simplify rdoc.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65382 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65185 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
* string.c (get_reg_grapheme_cluster): show error info and relax
to rb_fatal from rb_bug.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65096 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65068 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
* string.c: [DOC] move unaltered case for String#strip to the end,
similar to other strip methods.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65067 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
The former states explicitly that the argument must be a literal,
and can optimize away `strlen` on all compilers.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65059 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
`ptr` for these functions must refer to constant string literals.
Otherwise, the result string's content can be modified/discarded
unexpectedly.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65058 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
Patch by Josh Goldberg. [Fix GH-1933] [ci skip]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@64757 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
Just avoid being loose.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63755 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63632 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
* Document optional getline arguments
* Add examples, especially for the demonstration of `chomp: true`
[Fix GH-1886]
From: Koki Takahashi <hakatasiloving@gmail.com>
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63610 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
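A small demonstration of `chomp: true`, assuming the getline arguments in question are those accepted by `String#lines` and `String#each_line` (an optional separator plus the `chomp:` keyword). The sample strings are illustrative.

```ruby
text = "first\nsecond\n"

with_newlines = text.lines                 # line terminators are kept
chomped       = text.lines(chomp: true)    # line terminators are removed

# A custom separator is chomped in the same way.
by_comma = "a,b,c".each_line(",", chomp: true).to_a

p with_newlines, chomped, by_comma
```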
|
|
[Feature #14478] [ruby-core:85669]
Thanks-to: Sam Saffron <sam.saffron@gmail.com>
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63566 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
* string.c (rb_str_aset): prefer BUILTIN_TYPE over TYPE after
SPECIAL_CONST_P check.
* string.c (rb_str_start_with): prefer RB_TYPE_P over switch by
TYPE.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63543 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
* string.c (rb_str_start_with): [DOC] start_with? example with
regexp.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63541 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
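For reference, a sketch of `String#start_with?` with a regexp argument. This is not the example text added to the documentation by this commit, only an illustration of the behaviour: the pattern must match at the beginning of the string.

```ruby
case_insensitive = "hello".start_with?(/H/i)   # matches at position 0
not_at_start     = "hello".start_with?(/ll/)   # matches, but not at the start

p case_insensitive, not_at_start
```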
|
|
Building with HAVE_MALLOC_USABLE_SIZE currently makes
SIZED_REALLOC_N ignore the old size arg.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63487 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
Another part of the plan to reduce dependencies on malloc_usable_size:
https://bugs.ruby-lang.org/issues/10238
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63485 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
* range.c (range_each_func): adjust the signature of the callback
function to rb_str_upto_each, and exit the loop if the callback
returned non-zero.
* string.c (rb_str_upto_endless_each): ditto.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63290 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
* string.c (scan_once): fix the matched substring with `\K`, the
beginning of that string may differ from the matched position.
[ruby-core:86663] [Bug #14707]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63252 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
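A sketch of the `\K` behaviour this fix concerns: `\K` discards everything matched so far, so the reported substring starts later than the position where the match began. The inputs are illustrative rather than taken from the ticket.

```ruby
# Each match starts at "a", but only the part after \K is reported.
matches = "abcabc".scan(/a\Kb/)
p matches            # => ["b", "b"]
```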
|
|
Typical usages:
```
p ary[1..] # drop the first element; identical to ary[1..-1]
(1..).each {|n|...} # iterate forever from 1; identical to 1.step{...}
```
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63192 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
* string.c (str_undump): get rid of warning C4129 by VC.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63170 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|