Age | Commit message (Collapse) | Author |
|
Similar to the check used for String#gsub. Can fix possible
segfault.
Fixes [Bug #15941]
|
|
rb_fstring behavior in this case is to freeze the receiver. I'm
not sure if that should be changed, so this takes the conservative
approach of duping the receiver in String#-@ before passing
to rb_fstring.
Fixes [Bug #15926]
|
|
|
|
* string.c (get_reg_grapheme_cluster): make regexp from properly
encoded sources fro wide-char encodings. [Bug #15965]
* regparse.c (node_extended_grapheme_cluster): suppress false
duplicated range warning for the time being.
|
|
When a string is #frozen, it's capacity is resized to fit (if it is much
larger), since we know it will no longer be mutated.
> puts ObjectSpace.dump(String.new("a"*30, capacity: 1000))
{"type":"STRING", "class":"0x7feaf00b7bf0", "bytesize":30, "capacity":1000, "value":"...
> puts ObjectSpace.dump(String.new("a"*30, capacity: 1000).freeze)
{"type":"STRING", "class":"0x7feaf00b7bf0", "frozen":true, "bytesize":30, "value":"...
(ObjectSpace.dump doesn't show capacity if capacity is equal to bytesize)
Previously, if we dedup into an fstring, using String#-@, capacity would
not be reduced.
> puts ObjectSpace.dump(-String.new("a"*30, capacity: 1000))
{"type":"STRING", "class":"0x7feaf00b7bf0", "frozen":true, "fstring":true, "bytesize":30, "capacity":1000, "value":"...
This commit makes rb_fstring call rb_str_resize, the same as
rb_str_freeze does.
Closes: https://github.com/ruby/ruby/pull/2256
|
|
|
|
* string.c (rb_str_sub_bang): str and repl can be same.
[Bug #15946]
|
|
* string.c (rb_str_init): allocate new buffer if the string is
shared. [Bug #15937]
|
|
* string.c (rb_str_init): preserve the embedded content when
self-copying with a capacity. [Bug #15937]
|
|
* string.c (str_make_independent_expand): free independent buffer.
[Bug# 15935]
Co-Authored-By: luke-gru (Luke Gruber) <luke.gru@gmail.com>
|
|
|
|
Registering a string that depend on a dependent string as fstring
can lead to use-after-free. See c06ddfe and 3f95620 for details.
The following script triggers use-after-free on trunk, 2.4.6, 2.5.5
and 2.6.3. Credits to @wanabe for using eval as a cross-version way
of registering a fstring.
```ruby
a = ('j' * 24).b.b
eval('', binding, a)
p a
4.times { GC.start }
p a
```
- string.c (str_replace_shared_without_enc): when given a
dependent string, depend on the root of the dependent
string.
[Bug #15934]
|
|
* string.c (str_replace_shared_without_enc): free previous buffer
before replaced.
* parse.y (gettable): make sure in advance that the `__FILE__`
object shares a fstring, to get rid of replacement with the
fstring later.
TODO: this hack may be needed in other places.
[Bug #15916]
Co-Authored-By: luke-gru (Luke Gruber) <luke.gru@gmail.com>
|
|
|
|
This is a follow up for 3f9562015e651735bfc2fdd14e8f6963b673e22a.
Before this commit, it was possible to create a shared string which
shares with another shared string by passing a frozen shared string
to `str_duplicate`.
Such string looks like:
```
-------- -----------------
| root | ------ owns -----> | root's buffer |
-------- -----------------
^ ^ ^
----------- | |
| shared1 | ------ references ----- |
----------- |
^ |
----------- |
| shared2 | ------ references ---------
-----------
```
This is bad news because `rb_fstring(shared2)` can make `shared1`
independent, which severs the reference from `shared1` to `root`:
```c
/* from fstr_update_callback() */
str = str_new_frozen(rb_cString, shared2); /* can return shared1 */
if (STR_SHARED_P(str)) { /* shared1 is also a shared string */
str_make_independent(str); /* no frozen check */
}
```
If `shared1` was the only reference to `root`, then `root` can be
reclaimed by the GC, leaving `shared2` in a corrupted state:
```
----------- --------------------
| shared1 | -------- owns --------> | shared1's buffer |
----------- --------------------
^
|
----------- -------------------------
| shared2 | ------ references ----> | root's buffer (freed) |
----------- -------------------------
```
Here is a reproduction script for the situation this commit fixes.
```ruby
a = ('a' * 24).strip.freeze.strip
-a
p a
4.times { GC.start }
p a
```
- string.c (str_duplicate): always share with the root string when
the original is a shared string.
- test_rb_str_dup.rb: specifically test `rb_str_dup` to make
sure it does not try to share with a shared string.
[Bug #15792]
Closes: https://github.com/ruby/ruby/pull/2159
|
|
This reverts commit 5776ae347540ac19c40d146a3566a806cd176bf1.
Mistaken `max` as `min`.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
No problem for unaligned-ness because we never dereference.
|
|
|
|
* string.c (str_duplicate): share the root shared string if the
original string is already sharing, so that all shared strings
refer the root shared string directly. indirect sharing can
cause a dangling pointer.
[Bug #15792]
|
|
* string.c (rb_str_split_m): warn use of non-nil $;.
* string.c (rb_fs_setter): warn when set to non-nil value.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67603 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
* string.c (rb_str_split_m): improve splitting into chars by an
empty string, without a regexp.
Comparison:
to_chars-1
built-ruby: 1273527.6 i/s
compare-ruby: 189423.3 i/s - 6.72x slower
to_chars-10
built-ruby: 120993.5 i/s
compare-ruby: 37075.8 i/s - 3.26x slower
to_chars-100
built-ruby: 15646.4 i/s
compare-ruby: 4012.1 i/s - 3.90x slower
to_chars-1000
built-ruby: 1295.1 i/s
compare-ruby: 408.5 i/s - 3.17x slower
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67582 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67312 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
* string.c: remove <code> markups, which are not only unnecessary
but also prevented cross-references.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67311 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
* string.c (rb_str_crypt): fix indent not to make the whole list
verbatim entirely.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67310 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
* string.c (rb_enc_str_coderange): respect the actual encoding of
if a BOM presents, and scan for the actual code range.
[ruby-core:91662] [Bug #15635]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67167 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
[Bug #11391]
From: Franck Verrot <franck@verrot.fr>
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67018 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66906 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66897 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
* Officially states that String#dump is intended for round-trip.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66894 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66830 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
* string.c: String#setbyte to accept arbitrary integers [Bug #15460]
* io.c: ditto for IO#ungetbyte
* ext/strringio/stringio.c: ditto for StringIO#ungetbyte
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66824 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
* eval_error.c (print_errinfo): defer escaping control char in
error messages until writing to stderr, instead of quoting at
building the message. [ruby-core:90853] [Bug #15497]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66753 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
And its friends: lines, chars, grapheme_clusters, and codepoints.
[Feature #6670] [ruby-core:90728]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66579 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
Forgot to write the ticket number in the commit log...
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66578 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
And its friends: lines, chars, grapheme_clusters, and codepoints.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66575 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66375 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
The modern Georgian script is special in that it has an 'uppercase'
variant called MTAVRULI which can be used for emphasis of whole words,
for screamy headlines, and so on. However, in contrast to all other
bicameral scripts, there is no usage of capitalizing the first letter
in a word or a sentence. Words with mixed capitalization are not used
at all.
We therefore implement special behavior for String#capitalize. Formally,
we define String#capitalize as first applying String#downcase for the
whole string, then using titlecase on the first letter. Because Georgian
defines titlecase as the identity function both for MTAVRULI ('uppercase')
and Mkhedruli (lowercase), this results in String#capitalize being
equivalent to String#downcase for Georgian. This avoids undesirable
mixed case.
* enc/unicode.c: Actual implementation
* string.c: Add mention of this special case for documentation
* test/ruby/enc/test_case_mapping.rb: Add two tests, a general one
that uses String#capitalize on some (including nonsensical)
combinations of MTAVRULI and Mkhedruli, and a canary test to
detect the potential assignment of characters to the currently
open slots (holes) at U+1CBB and U+1CBC.
* test/ruby/enc/test_case_comprehensive.rb: Tweak generation of
expectation data.
Together with r65933, this closes issue #14839.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66300 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66245 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
Especially over checking argc then calling rb_scan_args just to
raise an ArgumentError.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66238 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66154 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65957 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65956 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
Unicode Text Segmentation considers CRLF as a character. [Bug #15337]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65954 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
It seems that decades ago, ruby was written under assumption that
char is unsigned. Which is of course a false assumption. We
need to explicitly store a numeric value into an unsigned char
variable to tell we expect 0..255 value.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65900 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|