Remove Encoding#replicate

2022-11-10 Transition shape when object's capacity changes (Jemma Issroff)
This commit adds a `capacity` field to shapes, and adds shape transitions whenever an object's capacity changes. Objects which are allocated out of a bigger size pool will also make a transition from the root shape to the shape with the correct capacity for their size pool when they are allocated. This commit will allow us to remove numiv from objects completely, and will also mean we can guarantee that if two objects share shapes, their IVs are in the same positions (an embedded and extended object cannot share shapes). This will enable us to implement ivar sets in YJIT using object shapes. Co-Authored-By: Aaron Patterson <>
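As a rough illustration of why a capacity edge keeps embedded and extended objects on separate shapes, here is a toy model in Ruby. It is not Ruby's internal implementation: the `Shape` class, its methods, and the capacities 3 and 6 are all hypothetical names and values chosen for the sketch.

```ruby
# Toy model of a shape tree; NOT how shapes are implemented inside the VM.
class Shape
  attr_reader :parent, :capacity, :ivars

  def initialize(capacity:, parent: nil, ivars: [])
    @parent = parent
    @capacity = capacity
    @ivars = ivars
    @edges = {} # transition cache: edge key => child shape
  end

  # Transition taken when an instance variable is added.
  def add_ivar(name)
    @edges[[:ivar, name]] ||=
      Shape.new(capacity: capacity, parent: self, ivars: ivars + [name])
  end

  # Transition taken when the object's capacity changes.
  def change_capacity(new_capacity)
    return self if new_capacity == capacity
    @edges[[:capacity, new_capacity]] ||=
      Shape.new(capacity: new_capacity, parent: self, ivars: ivars)
  end
end

root  = Shape.new(capacity: 3)                  # assumed root capacity
small = root.add_ivar(:@a)
same  = root.add_ivar(:@a)
large = root.change_capacity(6).add_ivar(:@a)   # e.g. a bigger size pool

puts small.equal?(same)   # same transitions reach the same shape
puts small.equal?(large)  # same ivars, different capacity: distinct shapes
```

Because the capacity change is itself an edge in the tree, two objects can only share a shape if they took the same capacity transitions, which is the property the commit message relies on.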
Decouple GC slot sizes from RVALUE
Add a new macro BASE_SLOT_SIZE that determines the slot size. For Variable Width Allocation (compiled with USE_RVARGC=1), all slot sizes are powers-of-2 multiples of BASE_SLOT_SIZE. For USE_RVARGC=0, BASE_SLOT_SIZE is set to sizeof(RVALUE).
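The "powers-of-2 multiples" rule can be sketched with a little arithmetic. The base size of 40 bytes and the count of 5 size pools below are assumptions for illustration only; the real values depend on the build.

```ruby
# Assumed values for illustration only; the real ones are build-dependent.
base_slot_size  = 40 # assumption: sizeof(RVALUE) on a typical 64-bit build
size_pool_count = 5  # assumption: number of size pools

# Each pool's slot size is base_slot_size * 2**i.
slot_sizes = Array.new(size_pool_count) { |i| base_slot_size << i }
p slot_sizes # => [40, 80, 160, 320, 640]

# Smallest slot size that fits a request of `bytes` (nil if none does).
def slot_size_for(bytes, sizes)
  sizes.find { |size| bytes <= size }
end

p slot_size_for(100, slot_sizes) # => 160
```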
2022-01-28 Remove assert_equal that will never be run (Peter Zhu)
`@s1.set_len(3)` will raise so the `assert_equal` will never be run.
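The reasoning can be shown with a minimal sketch in plain Ruby. `always_raises` is a hypothetical stand-in for the raising call, and the exception class is picked arbitrarily for the example.

```ruby
# Hypothetical stand-in for a call that always raises.
def always_raises
  raise RangeError, "out of range"
end

reached = false
begin
  always_raises
  reached = true # never executed: control has already left the block
rescue RangeError
  # expected
end

p reached # => false
```

Any assertion placed after the raising call inside the same block is dead code in the same way.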
Make embedded string length a long for VWA
A short (2 bytes) will cause unaligned struct accesses when strings are used as a buffer to directly store binary data.
Remove tainted and trusted features
Their removal in 3.2 had already been announced.
[Feature #18239] Implement VWA for strings
This commit adds support for embedded strings with variable capacity and uses Variable Width Allocation to allocate strings.
2021-06-26 Scan the coderange in the given encoding (Nobuyoshi Nakada)
2021-03-22 rb_enc_interned_str: handle autoloaded encodings (Jean Boussier)
If called with an autoloaded encoding that was not yet initialized, `rb_enc_interned_str` would crash with a NULL pointer exception. See:
Fix rb_interned_str_* functions to not assume static strings
Fixes [Feature #13381] When passed a `fake_str`, `register_fstring` would create new strings with `str_new_static`. That's not what was expected, and answers almost no use cases.
2020-11-20 Make String methods return String instances when called on a subclass instance (Jeremy Evans)
This modifies the following String methods to return String instances instead of subclass instances:

* String#*
* String#capitalize
* String#center
* String#chomp
* String#chop
* String#delete
* String#delete_prefix
* String#delete_suffix
* String#downcase
* String#dump
* String#each/#each_line
* String#gsub
* String#ljust
* String#lstrip
* String#partition
* String#reverse
* String#rjust
* String#rpartition
* String#rstrip
* String#scrub
* String#slice!
* String#slice/#[]
* String#split
* String#squeeze
* String#strip
* String#sub
* String#succ/#next
* String#swapcase
* String#tr
* String#tr_s
* String#upcase

This also fixes a bug in String#swapcase where it would return the receiver instead of a copy of the receiver if the receiver was the empty string.

Some string methods were left to return subclass instances:

* String#+@
* String#-@

Both of these methods will return the receiver (subclass instance) in some cases, so it is best to keep the returned class consistent.

Fixes [#10845]
Notes: Merged:
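A short sketch of the visible effect, assuming a Ruby version that includes this change (3.0 or later). `MyString` is a hypothetical subclass used only for the example.

```ruby
class MyString < String; end # hypothetical subclass for illustration

s = MyString.new("hello")
up_class    = s.upcase.class
slice_class = s[0, 2].class
p up_class, slice_class # String on Rubies with this change

# The swapcase fix: an empty receiver now yields a copy, not itself.
empty = MyString.new("")
p empty.swapcase.equal?(empty)

# String#+@ is one of the methods left alone: an unfrozen receiver
# is returned as-is, so the subclass is preserved.
p (+s).class
```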
2020-03-08 Word array instead of splitting (Nobuyoshi Nakada)
2019-11-18 test/-ext-/string/test_fstring.rb: suppress a warning for taint (Yusuke Endoh)
Deprecate taint/trust and related methods, and make the methods no-ops
This removes the related tests, and puts the related specs behind version guards. This affects all code in lib, including some libraries that may want to support older versions of Ruby.
2019-11-05 Revert "[EXPERIMENTAL] Make Symbol#to_s return a frozen String [Feature #16150]" (NARUSE, Yui)
This reverts commit 6ffc045a817fbdf04a6945d3c260b55b0fa1fd1e.
2019-10-08 more use of RbConfig::LIMITS (卜部昌平)
`8 * RbConfig::SIZEOF` ... is not straightforward.
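For comparison, a sketch of the two ways to obtain the largest `long` value. Both `RbConfig::SIZEOF` and `RbConfig::LIMITS` come from `rbconfig/sizeof`; choosing `long` as the type is just an example.

```ruby
require "rbconfig/sizeof"

# Deriving the limit from the byte size of long (the roundabout way) ...
derived = (1 << (8 * RbConfig::SIZEOF["long"] - 1)) - 1

# ... versus reading it directly.
direct = RbConfig::LIMITS["LONG_MAX"]

p derived == direct # => true
```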
2019-09-30 test/-ext-/string/test_fstring.rb: suppress "possibly useless use of -@" (Yusuke Endoh)
"in void context" by assigning the result to a dummy variable.
[EXPERIMENTAL] Make Symbol#to_s return a frozen String
* Always the same frozen String for a given Symbol.
* Avoids extra allocations whenever calling Symbol#to_s.
* See [Feature #16150]
2019-09-26 Tag string shared roots to fix use-after-free (Alan Wu)
The buffer deduplication codepath in rb_fstring can be used to free the buffer of shared string roots, which leads to use-after-free. Introduce a new flag to tag strings that at one point have been a shared root. Check for it in rb_fstring to avoid freeing buffers that are shared by multiple strings. This change is based on nobu's idea in [ruby-core:94838]. The included test case tests for the sequence of calls to internal functions that lead to this bug. See attached ticket for Ruby level repros. [Bug #16151]
2019-06-26 Resize capacity for fstring (John Hawthorn)
When a string is frozen, its capacity is resized to fit (if it is much larger), since we know it will no longer be mutated.

    > puts ObjectSpace.dump(String.new("a"*30, capacity: 1000))
    {"type":"STRING", "class":"0x7feaf00b7bf0", "bytesize":30, "capacity":1000, "value":"...
    > puts ObjectSpace.dump(String.new("a"*30, capacity: 1000).freeze)
    {"type":"STRING", "class":"0x7feaf00b7bf0", "frozen":true, "bytesize":30, "value":"...

(ObjectSpace.dump doesn't show capacity if capacity is equal to bytesize)

Previously, if we dedup into an fstring, using String#-@, capacity would not be reduced.

    > puts ObjectSpace.dump(-String.new("a"*30, capacity: 1000))
    {"type":"STRING", "class":"0x7feaf00b7bf0", "frozen":true, "fstring":true, "bytesize":30, "capacity":1000, "value":"...

This commit makes rb_fstring call rb_str_resize, the same as rb_str_freeze does.
Closes:
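The transcripts above can be turned into a small script. This is a sketch: `reported_capacity` is a hypothetical helper, and the exact numbers printed depend on the Ruby version and build, so no specific output is claimed.

```ruby
require "objspace"

# Capacity as reported by ObjectSpace.dump. Falls back to bytesize because
# the dump omits "capacity" when it is not different from the byte size.
def reported_capacity(str)
  dump = ObjectSpace.dump(str)
  (dump[/"capacity":(\d+)/, 1] || dump[/"bytesize":(\d+)/, 1]).to_i
end

before = reported_capacity(String.new("a" * 30, capacity: 1000))
after  = reported_capacity(-String.new("a" * 30, capacity: 1000))

puts "before: #{before}, after dedup: #{after}"
```

On a Ruby that includes this commit, the deduplicated string should not report the oversized buffer any more.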
2019-06-23 Get rid of error with frozen string literal (Nobuyoshi Nakada)
[Bug #14194]
2019-05-09 str_duplicate: Don't share with a frozen shared string (Alan Wu)
This is a follow up for 3f9562015e651735bfc2fdd14e8f6963b673e22a. Before this commit, it was possible to create a shared string which shares with another shared string by passing a frozen shared string to `str_duplicate`. Such string looks like:

```
--------                    -----------------
| root | ------ owns -----> | root's buffer |
--------                    -----------------
    ^                             ^   ^
-----------                       |   |
| shared1 | ------ references -----   |
-----------                           |
    ^                                 |
-----------                           |
| shared2 | ------ references ---------
-----------
```

This is bad news because `rb_fstring(shared2)` can make `shared1` independent, which severs the reference from `shared1` to `root`:

```c
/* from fstr_update_callback() */
str = str_new_frozen(rb_cString, shared2);  /* can return shared1 */
if (STR_SHARED_P(str)) { /* shared1 is also a shared string */
    str_make_independent(str);  /* no frozen check */
}
```

If `shared1` was the only reference to `root`, then `root` can be reclaimed by the GC, leaving `shared2` in a corrupted state:

```
-----------                         --------------------
| shared1 | -------- owns --------> | shared1's buffer |
-----------                         --------------------
     ^
     |
-----------                         -------------------------
| shared2 | ------ references ----> | root's buffer (freed) |
-----------                         -------------------------
```

Here is a reproduction script for the situation this commit fixes.

```ruby
a = ('a' * 24).strip.freeze.strip
-a
p a
4.times { GC.start }
p a
```

- string.c (str_duplicate): always share with the root string when the original is a shared string.
- test_rb_str_dup.rb: specifically test `rb_str_dup` to make sure it does not try to share with a shared string.

[Bug #15792]
Closes:
2017-12-12 Add FrozenError as a subclass of RuntimeError (shyouhei)
FrozenError will be used instead of RuntimeError for exceptions raised when there is an attempt to modify a frozen object. The reason for this change is to differentiate exceptions related to frozen objects from generic exceptions such as those generated by Kernel#raise without an exception class. From: Jeremy Evans <> Signed-off-by: Urabe Shyouhei <>
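A minimal sketch of what the subclass relationship means for callers, assuming Ruby 2.5 or later: existing `rescue RuntimeError` clauses keep working, while new code can rescue the narrower class.

```ruby
frozen = "abc".freeze

rescued_as = begin
  frozen << "d"
  nil
rescue RuntimeError => e # FrozenError is caught here too
  e.class
end

p rescued_as                                   # FrozenError on Ruby 2.5+
p FrozenError.ancestors.include?(RuntimeError) # => true
```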
Add test for Bug::String.buf_new

2017-12-02 string.c: fix rb_external_str_new_with_enc (nobu)
* string.c (rb_external_str_new_with_enc): do not search non-ascii by NULL pointer. [ruby-core:84055] [Bug #14150]
Fixed misspelling words.

These are detected by
2017-08-31 io.c: shrink read buffer (nobu)
* io.c (io_setstrbuf): return true if the buffer is newly created.
* io.c (io_set_read_length): shrink the read buffer if it is a new object and is too large. [ruby-core:81370] [Bug #13597]
2016-12-22 test_modify_expand.rb: skip if no overflow (nobu)
* test/-ext-/string/test_modify_expand.rb (test_integer_overflow): no longer happens on platforms where size_t is larger than long, e.g. 64bit windows, since r57122.
2016-11-12 test_fstring.rb: fix exception (nobu)
* test/-ext-/string/test_fstring.rb (test_singleton_class): fix expected exception class. [ruby-dev:49867] [Bug #12923]
2016-11-12 class.c: no fstring singleton class (nobu)
* class.c (singleton_class_of): prohibit fstrings from creating singleton classes. temporary measure for [ruby-dev:49867] [Bug #12923]
2016-09-13 string.c: fix buffer overflow check condition in rb_str_set_len() (rhe)
* string.c (rb_str_set_len): The buffer overflow check is wrong. The space for termlen is allocated outside the capacity returned by rb_str_capacity(). This fixes r41920 ("string.c: multi-byte terminator", 2013-07-11). [ruby-core:77257] [Bug #12757]
* test/-ext-/string/test_set_len.rb (test_capacity_equals_to_new_size): Test for this change. Applying only the test will trigger [BUG].
require "rbconfig/sizeof"

They may fail parallel test-all
2016-05-30 string.c: return reallocated pointer (nobu)
* string.c (str_fill_term): return new pointer reallocated by filling terminator.
2016-05-18 string.c: integer overflow (nobu)
* string.c (rb_str_modify_expand): check integer overflow. [ruby-core:75592] [Bug #12390]
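The idea behind such a check can be modelled in Ruby. This is only a sketch of the arithmetic, not the C source: `expand_overflows?` is a hypothetical helper, and Ruby integers never overflow, so the code merely mirrors the comparison a C implementation would need to make before adding the two lengths.

```ruby
require "rbconfig/sizeof"

# Hypothetical helper mirroring the idea of the check, not the C source.
# Comparing against (long_max - len) avoids computing len + expand,
# which is the addition that could overflow in C.
def expand_overflows?(len, expand, long_max = RbConfig::LIMITS["LONG_MAX"])
  raise ArgumentError, "expand must not be negative" if expand < 0
  expand > long_max - len
end

long_max = RbConfig::LIMITS["LONG_MAX"]
p expand_overflows?(10, 100)          # => false
p expand_overflows?(10, long_max)     # => true
p expand_overflows?(0, long_max)      # => false
```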
* string.c (rb_str_init): introduce size)

[Feature #12024]
2016-02-03 * string.c (str_new_frozen): if the given string is embeddable (naruse)
but not embedded, embed a new copied string. [Bug #11946]
2016-01-27 * test/-ext-/string/test_capacity.rb: Added missing library. (hsbt)
2016-01-27 Add tests about String's internal capacity (naruse)
2015-12-16 Add frozen_string_literal: false for all files (naruse)
When you change this to true, you may need to add more tests.
2015-12-01 string.c: no frozen error at cstr (nobu)
* string.c (rb_string_value_cstr): should not raise on frozen string.
2015-12-01 cstr.c: split bug_str_cstr_unterm (nobu)
* ext/-test-/string/cstr.c (bug_str_cstr_unterm): split unterminating from bug_str_cstr_term.
2015-07-25 string.c: fill the terminator (nobu)
* string.c (str_replace_shared_without_enc): fill the terminator of embedded strings in wide char encodings.
2015-07-24 string.c: pool only bare strings in fstring (nobu)
* string.c (fstr_update_callback): pool bare strings only.
* string.c (rb_fstring): return the original string with sharing a fstring if it has extra attributes, not the fstring itself. [ruby-dev:49188] [Bug #11386]
2015-07-15 encoding.c: drop dummy encoding flag (nobu)
* encoding.c (enc_autoload): drop dummy encoding flag from the loaded encoding index. this flag is used only in this source.
2015-07-15 -test-/string: move (nobu)
* ext/-test-/string/extconf.rb: move "-test-/string/" to "-test-/".
2015-07-07 file.c: skip invalid byte (nobu)
* file.c (rb_str_normalize_ospath): skip invalid byte sequence not to loop infinitely. this case usually does not happen as the input name should come from real file systems.
2015-06-12 test_nofree.rb: fix commit miss (nobu)
* test/-ext-/string/test_nofree.rb (test_no_memory_leak): remove limit and make the iteration longer.
2015-06-12 test_nofree.rb: rehearsal (nobu)
* test/-ext-/string/test_nofree.rb (test_no_memory_leak): add a rehearsal.
2015-04-17 string.c: clear NOFREE flag at embedding (nobu)
* string.c (STR_SET_EMBED): clear NOFREE flag at embedding as embedded strings no longer refer static strings. [ruby-core:68436] [Bug #10942]
2015-01-26 string.c: consider widechar (nobu)
* string.c (str_make_independent_expand): consider wide char encoding. [Fix GH-821]