path: root/string.c
Age | Commit message | Author
2023-12-24 | Fix Symbol#inspect for GC compaction | Peter Zhu
The test fails when RGENGC_CHECK_MODE is turned on: 1) Failure: TestSymbol#test_inspect_under_gc_compact_stress [test/ruby/test_symbol.rb:123]: <":testing"> expected but was <":\x00\x00\x00\x00\x00\x00\x00">.
2023-12-23 | Fix String#sub for GC compaction | Peter Zhu
The test fails when RGENGC_CHECK_MODE is turned on: TestString#test_sub_gc_compact_stress = 9.42 s 1) Failure: TestString#test_sub_gc_compact_stress [test/ruby/test_string.rb:2089]: <"aaa [amp] yyy"> expected but was <"aaa [] yyy">.
2023-12-17 | Stir the hash value more with encoding index | Nobuyoshi Nakada
2023-12-16 | [Bug #20068] Encoding does not matter to empty strings | Nobuyoshi Nakada
2023-12-13 | Make String#chomp! raise ArgumentError for 2+ arguments if string is empty | Jeremy Evans
String#chomp! returned nil without checking the number of passed arguments in this case.
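A minimal Ruby sketch of the argument check described above. The helper name `chomp_bang_outcome` is hypothetical. On an interpreter that predates this commit, the empty-string call is expected to return nil rather than raise, so the sketch reports the outcome instead of assuming one.

```ruby
# Hypothetical helper: reports how String#chomp! reacts to an argument list.
def chomp_bang_outcome(str, *args)
  str.chomp!(*args)
  :returned
rescue ArgumentError
  :argument_error
end

# A non-empty receiver rejects two arguments.
p chomp_bang_outcome(+"abc", "a", "b")

# An empty receiver: per the commit this now raises as well, while older
# interpreters returned nil here without checking the argument count.
p chomp_bang_outcome(+"", "a", "b")
```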
2023-12-01 | Make String#undump compaction safe | Peter Zhu
2023-12-01 | Pin embedded shared strings | Peter Zhu
Embedded shared strings cannot be moved because strings point into the slot of the shared string. There may be code using the RSTRING_PTR on the stack, which would pin the string but not pin the shared string, causing it to move.
2023-11-29 | Guard match from GC in String#gsub | Peter Zhu
We need to guard match from GC because otherwise it could end up being reclaimed or moved in compaction.
2023-11-27 | Guard match from GC when scanning string | Peter Zhu
We need to guard match from GC because otherwise it could end up being reclaimed or moved in compaction.
2023-11-20 | Specialize String#dup | Jean Boussier
`String#+@` is 2-3 times faster than `String#dup` because it can go directly through `rb_str_dup` instead of using the generic, much slower `rb_obj_dup`. This fact led to the existence of the ugly `Performance/UnfreezeString` rubocop performance rule, which encourages users to rewrite the much more readable and convenient `"foo".dup` into the ugly `(+"foo")`. Let's make that rubocop rule useless.

```
compare-ruby: ruby 3.3.0dev (2023-11-20T02:02:55Z master 701b0650de) [arm64-darwin22]
last_commit=[ruby/prism] feat: add encoding for IBM865 (https://github.com/ruby/prism/pull/1884)
built-ruby: ruby 3.3.0dev (2023-11-20T12:51:45Z faster-str-lit-dup 6b745bbc5d) [arm64-darwin22]
warming up..

|       |compare-ruby|built-ruby|
|:------|-----------:|---------:|
|uplus  |     16.312M|   16.332M|
|       |           -|     1.00x|
|dup    |      5.912M|   16.329M|
|       |           -|     2.76x|
```
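For readers unfamiliar with the two spellings compared above, this short Ruby sketch shows that both yield an unfrozen copy of a frozen string. The helper name `unfrozen_copies` is hypothetical, and the sketch makes no claim about relative speed.

```ruby
# Hypothetical helper: returns copies of a frozen string made with
# String#dup and String#+@. For an unfrozen receiver, +@ would return
# the receiver itself instead of a copy.
def unfrozen_copies(frozen_str)
  [frozen_str.dup, +frozen_str]
end

literal = "foo".freeze
via_dup, via_uplus = unfrozen_copies(literal)

p via_dup.frozen?       # false
p via_uplus.frozen?     # false
p via_dup == via_uplus  # true
```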
2023-11-09 | String#force_encoding don't clear coderange if encoding is unchanged | Jean Boussier
Some code out there blindly calls `force_encoding` without checking what the original encoding was, which clears the coderange uselessly. If the String is big, it can be a rather costly mistake. For instance, the `rack-utf8_sanitizer` gem does this on request bodies.
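As an illustration, here is the kind of caller-side guard that this change makes unnecessary. It is a sketch only: `ensure_encoding` is a hypothetical helper, not part of Ruby or of the gem mentioned above.

```ruby
# Hypothetical helper: re-tags the string only when the encoding differs,
# so an already-correct string is left untouched.
def ensure_encoding(str, encoding)
  str.force_encoding(encoding) unless str.encoding == encoding
  str
end

body = +"caf\u00E9"
ensure_encoding(body, Encoding::UTF_8)

p body.encoding         # #<Encoding:UTF-8>
p body.valid_encoding?  # true
```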
2023-11-08 | String for string literal is not resizable | Nobuyoshi Nakada
2023-11-02 | Make String.new size pools aware. | Jean Boussier
If the required capacity would fit in an embedded string, returns one. This can reduce malloc churn for code that uses string buffers.
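The string-buffer pattern this refers to looks roughly like the following Ruby sketch. `build_greeting` is a hypothetical helper. Whether the buffer ends up embedded is an interpreter detail that this code does not observe.

```ruby
# Hypothetical helper: builds a small string through a pre-sized buffer.
def build_greeting(name)
  buffer = String.new(capacity: 32) # capacity is a hint, not a length
  buffer << "Hello, " << name << "!"
  buffer
end

p build_greeting("Ruby")             # "Hello, Ruby!"
p String.new(capacity: 32).bytesize  # 0
```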
2023-09-27 | [DOC] Missing comment markers | Nobuyoshi Nakada
2023-09-26 | [Bug #19902] Update the coderange regarding the changed region | Nobuyoshi Nakada
2023-09-01 | Use end of char boundary in start_with? | John Hawthorn
Previously we used the next character following the found prefix to determine if the match ended on a broken character. This had caused surprising behaviour when a valid character was followed by a UTF-8 continuation byte. This commit changes the behaviour to instead look for the end of the last character in the prefix. [Bug #19784] Co-authored-by: ywenc <ywenc@github.com> Co-authored-by: Nobuyoshi Nakada <nobu@ruby-lang.org> Notes: Merged: https://github.com/ruby/ruby/pull/8348
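The two situations discussed here can be probed with a small Ruby sketch. `prefix_report` is a hypothetical helper. The results for the broken inputs depend on the interpreter version, so they are printed rather than stated.

```ruby
# Hypothetical helper: reports how a prefix check and removal behave.
def prefix_report(str, prefix)
  { starts_with: str.start_with?(prefix),
    without_prefix: str.delete_prefix(prefix) }
end

whole = "\u3042bc" # the first character occupies three bytes in UTF-8
# Only the first byte of that character, tagged as UTF-8 (a broken prefix).
partial = [0xE3].pack("C").force_encoding(Encoding::UTF_8)
# A valid character followed by a stray UTF-8 continuation byte.
trailing = "a".b.concat([0x81].pack("C")).force_encoding(Encoding::UTF_8)

p prefix_report(whole, "\u3042") # complete character as prefix
p prefix_report(whole, partial)  # prefix ends in the middle of a character
p prefix_report(trailing, "a")   # valid prefix before a continuation byte
```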
2023-08-26 | [Bug #19784] Fix behaviors against prefix with broken encoding | Nobuyoshi Nakada
- String#start_with?
- String#delete_prefix
- String#delete_prefix!

Notes: Merged: https://github.com/ruby/ruby/pull/8296
2023-08-26 | Introduce `at_char_boundary` function | Nobuyoshi Nakada
Notes: Merged: https://github.com/ruby/ruby/pull/8296
2023-08-23 | Fix premature string collection during append | Alan Wu
Previously, the following crashed due to use-after-free with AArch64 Alpine Linux 3.18.3 (aarch64-linux-musl):

```ruby
str = 'a' * (32*1024*1024)
p({z: str})
```

32 MiB is the default for `GC_MALLOC_LIMIT_MAX`, and the crash could be dodged by setting `RUBY_GC_MALLOC_LIMIT_MAX` to large values. Under a debugger, one can see the `str2` of rb_str_buf_append() getting prematurely collected while str_buf_cat4() allocates capacity. Add GC guards so the buffer of `str2` lives across the GC run initiated in str_buf_cat4(). [Bug #19792]
2023-08-22 | Use STR_EMBED_P instead of testing STR_NOEMBED | Peter Zhu
2023-08-18 | Don't check for STR_NOEMBED in rb_fstring | Peter Zhu
We don't need to check for STR_NOEMBED because the check above for STR_EMBED_P means that it can never be false. Notes: Merged: https://github.com/ruby/ruby/pull/8238
2023-08-11 | [DOC] Don't suppress autolinks (#8208) | Burdette Lamar
Notes: Merged-By: peterzhu2118 <peter@peterzhu.ca>
2023-08-03 | No computing embed_capa_max in str_subseq | Kunshan Wang
Fix str_subseq so that it does not attempt to predict the size of the object returned by str_alloc_heap. Notes: Merged: https://github.com/ruby/ruby/pull/8165
2023-07-28 | Fill terminator properly | Nobuyoshi Nakada
2023-07-15 | [Bug #19769] Fix range of size 1 in `String#tr` | alexandre184
Notes: Merged: https://github.com/ruby/ruby/pull/8080 Merged-By: nobu <nobu@ruby-lang.org>
2023-07-09 | Make the string index functions closer to symmetric | Nobuyoshi Nakada
So that irregular parts may be more noticeable. Notes: Merged: https://github.com/ruby/ruby/pull/8047
2023-07-09 | Make `rb_str_rindex` return byte index | Nobuyoshi Nakada
Leave callers to convert byte index to char index, as well as `rb_str_index`, so that `rb_str_rpartition` does not need to re-convert char index to byte index. Notes: Merged: https://github.com/ruby/ruby/pull/8047
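The byte-index to char-index conversion left to callers can be pictured at the Ruby level as follows. This is a sketch of the idea only, not the C code in string.c. `char_index_from_byte_index` is a hypothetical helper, and it assumes the byte offset falls on a character boundary.

```ruby
# Hypothetical helper: converts a byte offset into a character offset by
# counting the characters that precede it.
def char_index_from_byte_index(str, byte_index)
  str.byteslice(0, byte_index).length
end

text = "h\u00E9llo w\u00F6rld"         # two 2-byte characters in UTF-8
byte_index = text.b.rindex("\u00F6".b) # search on the raw bytes

p byte_index                                   # 8
p char_index_from_byte_index(text, byte_index) # 7
p text.rindex("\u00F6")                        # 7
```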
2023-07-09 | [Bug #19763] Raise same message exception for regexp | Nobuyoshi Nakada
Notes: Merged: https://github.com/ruby/ruby/pull/8045
2023-06-28 | Ensure the byte position is a valid boundary | Nobuyoshi Nakada
Notes: Merged: https://github.com/ruby/ruby/pull/7991
2023-06-28 | [Bug #19748] Fix out-of-bound access in `String#byteindex` | Nobuyoshi Nakada
2023-06-28 | [Bug #19746] `String#index` with regexp should clear `$~` unless matched | Nobuyoshi Nakada
Notes: Merged: https://github.com/ruby/ruby/pull/7988
2023-06-20 | [DOC] Regexp doc (#7923) | Burdette Lamar
Notes: Merged-By: peterzhu2118 <peter@peterzhu.ca>
2023-06-09 | Assign into optimal size pools using String#split("") | Matt Valentine-House
When String#split is used with an empty string as the field separator it effectively splits the original string into chars, and there is a pre-existing fast path for this using SPLIT_TYPE_CHARS. However, this path creates an empty array in the smallest size pool and grows from there, despite already knowing the size of the desired array. This commit pre-allocates the correct size array in this case in order to allow the arrays to be embedded and avoid being allocated in the transient heap. Notes: Merged: https://github.com/ruby/ruby/pull/7919
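The Ruby-level behavior that this fast path serves can be seen in a short sketch. `split_into_chars` is a hypothetical helper. The allocation change itself is internal and not visible from this code.

```ruby
# Hypothetical helper: splitting on an empty separator yields one element
# per character, so the result size is known up front.
def split_into_chars(str)
  str.split("")
end

p split_into_chars("abc")                 # ["a", "b", "c"]
p split_into_chars("abc") == "abc".chars  # true
p split_into_chars("h\u00E9y").size       # 3
```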
2023-06-06 | Unify length field for embedded and heap strings (#7908) | Peter Zhu
* Unify length field for embedded and heap strings

  The length field is of the same type and position in RString for both embedded and heap allocated strings, so we can unify it.

* Remove RSTRING_EMBED_LEN

Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2023-06-05 | [DOC] Update flags doc for strings | Peter Zhu
The length of an embedded string is no longer in the flags.
2023-06-01 | Simplify duplicated code | Peter Zhu
The capacity of the string can be calculated using the str_capacity function. Notes: Merged: https://github.com/ruby/ruby/pull/7879
2023-06-01 | Don't refetch ptr and len | Peter Zhu
The call to RSTRING_GETMEM already fetched the pointer and length, so we don't need to fetch it again. Notes: Merged: https://github.com/ruby/ruby/pull/7879
2023-05-26 | Remove dead code in string.c | Peter Zhu
The STR_DEC_LEN macro is not used.
2023-04-06 | [Feature #19474] Refactor NEWOBJ macros | Matt Valentine-House
NEWOBJ_OF is now our canonical newobj macro. It takes an optional ec Notes: Merged: https://github.com/ruby/ruby/pull/7393
2023-04-04 | [Feature #19579] Remove !USE_RVARGC code (#7655) | Peter Zhu
Remove !USE_RVARGC code [Feature #19579] The Variable Width Allocation feature was turned on by default in Ruby 3.2. Since then, we haven't received bug reports or backports to the non-Variable Width Allocation code paths, so we assume that nobody is using it. We also don't plan on maintaining the non-Variable Width Allocation code, so we are going to remove it. Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2023-03-18 | RJIT: Optimize String#bytesize | Takashi Kokubun
2023-03-06 | Stop exporting symbols for MJIT | Takashi Kokubun
Notes: Merged: https://github.com/ruby/ruby/pull/7459
2023-03-05 | Optimize String#getbyte | Takashi Kokubun
Notes: Merged: https://github.com/ruby/ruby/pull/7448
2023-03-03 | rb_str_modify_expand: clear the string coderange | Rômulo Ceccon
[Bug #19468] b0b9f7201acab05c2a3ad92c3043a1f01df3e17f erroneously stopped clearing the coderange. Since `rb_str_modify` clears it, `rb_str_modify_expand` should too. Notes: Merged: https://github.com/ruby/ruby/pull/7437
2023-02-27 | Fix spelling (#7389) | John Bampton
Notes: Merged-By: k0kubun <takashikkbn@gmail.com>
2023-02-27 | Symbol#end_with? accepts Strings only | Adam Daniels
Regular expressions are not supported (same as String#end_with?). Notes: Merged: https://github.com/ruby/ruby/pull/7384
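A brief Ruby sketch of the distinction stated above. `suffix_check` is a hypothetical helper that turns the TypeError raised for a non-String argument into a symbol so both cases can be printed.

```ruby
# Hypothetical helper: returns the result of Symbol#end_with?, or
# :type_error when the argument cannot be used as a String.
def suffix_check(sym, suffix)
  sym.end_with?(suffix)
rescue TypeError
  :type_error
end

p suffix_check(:testing, "ing")  # true
p suffix_check(:testing, "abc")  # false
p suffix_check(:testing, /ing/)  # :type_error
```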
2023-02-19 | Remove (newly unneeded) remarks about aliases | BurdetteLamar
2023-02-19 | [DOC] Small adjustment for String method docs | zverok
* Hide freeze method (no useful docs, same as Object#freeze)
* Add dedup to call-seq of str_uminus

Notes: Merged: https://github.com/ruby/ruby/pull/7316
2023-02-09 | Rename rb_str_splice_{0,1} -> rb_str_update_{0,1} | Matt Valentine-House
Notes: Merged: https://github.com/ruby/ruby/pull/7278
2023-02-09 | Remove alias macro rb_str_splice | Matt Valentine-House
Notes: Merged: https://github.com/ruby/ruby/pull/7278