summaryrefslogtreecommitdiff
path: root/string.c
AgeCommit message (Collapse)Author
2017-12-22encoding.c: rb_enc_find_index2nobu
* string.c (str_undump): use rb_enc_find_index2 to find encoding by unterminated string. check the format before encoding name. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@61396 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-12-21string.c: fix memory leaknobu
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@61386 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-12-21Don't allow mixed escapenaruse
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@61381 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-12-21move dump format validation into parsing epiloguenaruse
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@61380 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-12-21fix escapes in undumpnaruse
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@61379 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-12-16string.c: multiple codepointsnobu
* string.c (undump_after_backslash): fix multiple codepoints in braces. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@61290 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-12-16string.c: suppress warningnobu
* string.c (str_undump): suppress maybe-uninitialized warning by gcc 7 and later. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@61289 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-12-14Implement String#undump to unescape String#dump-ed stringtadd
[Feature #12275] [close GH-1765] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@61228 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-12-02string.c: fix rb_external_str_new_with_encnobu
* string.c (rb_external_str_new_with_enc): do not search non-ascii by NULL pointer. [ruby-core:84055] [Bug #14150] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60979 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-11-14string.c: prefer rb_syserr_failnobu
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60761 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-11-12string.c: fix up r60748rhe
An #ifdef was missing in r60748 and build broke on systems without crypt_r(). https://rubyci.org/logs/rubyci.s3.amazonaws.com/unstable11s/ruby-trunk/log/20171112T162503Z.fail.html.gz git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60749 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-11-12string.c: fix memory leak in String#cryptrhe
Use ALLOCV to allocate struct crypt_data for slightly cleaner and less error-prone code. It is currently possible it leaks when an invalid argument is passed to String#crypt or rb_str_new_cstr() fails to allocate memory. SIZEOF_CRYPT_DATA macro in missing/crypt.h is removed since it is not used any longer. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60748 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-11-07string.c: improve docs for String#{concat,<<}stomar
* string.c: [DOC] remove a misleading call-seq for String#concat, which suggests that all arguments must be Integers in this case; also clarify in the example that the receiver is modified; fix grammar for String#<<; move references to the end. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60712 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-11-07string.c: fix typosstomar
* string.c: [DOC] fix typos in doxygen comments. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60707 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-10-29string.c: improve docsstomar
* string.c: [DOC] fix rdoc for cross reference; fix grammar. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60574 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-10-27string.c: Improve String#prepend performance if only one argument is givenwatson1978
* string.c (rb_str_prepend_multi): Prepend the string without generating temporary String object if only one argument is given. This is very similar with https://github.com/ruby/ruby/pull/1634 String#prepend -> 47.5 % up [Fix GH-1670] [ruby-core:82195] [Bug #13773] * Before String#prepend 1.517M (± 1.8%) i/s - 7.614M in 5.019819s * After String#prepend 2.236M (± 3.4%) i/s - 11.234M in 5.029716s * Test code require 'benchmark/ips' Benchmark.ips do |x| x.report "String#prepend" do |loop| loop.times { "!".prepend("hello") } end end git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60480 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-10-22string.c: comment layout [ci skip]nobu
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60331 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-10-21* remove trailing spaces.svn
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60329 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-10-21* string.c: [DOC] Split rdoc of String#<< and String#concat [ci skip]sonots
Split String#<< and String#concat docs to reflect single and multiple arguments patched by MSP-Greg [fix GH-1614] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60328 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-10-21* string.c: Remove errant "the" in gsub documentationsonots
patched by jlmuir (J. Lewis Muir) [fix GH-1679] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60324 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-10-21Improve performance of string interpolationnobu
This patch will add pre-allocation in string interpolation. By this, unecessary capacity resizing is avoided. For small strings, optimized `rb_str_resurrect` operation is faster, so pre-allocation is done only when concatenated strings are large. `MIN_PRE_ALLOC_SIZE` was decided by experimenting with local machine (x86_64-apple-darwin 16.5.0, Apple LLVM version 8.1.0 (clang - 802.0.42)). String interpolation will be faster around 72% when large string is created. * Before ``` Calculating ------------------------------------- Large string interpolation 1.276M (± 5.9%) i/s - 6.358M in 5.002022s Small string interpolation 5.156M (± 5.5%) i/s - 25.728M in 5.005731s ``` * After ``` Calculating ------------------------------------- Large string interpolation 2.201M (± 5.8%) i/s - 11.063M in 5.043724s Small string interpolation 5.192M (± 5.7%) i/s - 25.971M in 5.020516s ``` * Test code ```ruby require 'benchmark/ips' Benchmark.ips do |x| x.report "Large string interpolation" do |t| a = "Hellooooooooooooooooooooooooooooooooooooooooooooooooooo" b = "Wooooooooooooooooooooooooooooooooooooooooooooooooooorld" t.times do "#{a}, #{b}!" end end x.report "Small string interpolation" do |t| a = "Hello" b = "World" t.times do "#{a}, #{b}!" end end end ``` [Fix GH-1626] From: Nao Minami <south37777@gmail.com> git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60320 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-10-21Add documentation for `chomp` option.hsbt
https://github.com/ruby/ruby/pull/1717 Patch by @ksss [fix GH-1717] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60308 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-10-21* string.c (deleted_prefix_length, deleted_suffix_length):sonots
Add doxygen comment. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60254 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-10-21[Feature #13712] String#start_with? supports regexpnaruse
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60234 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-10-01string.c: avoid unnecessary call of str_strlen()glass
* string.c (rb_strseq_index): refactor and avoid call of str_strlen() when offset == 0. it will improve performance of String#index and #include? * benchmark/bm_string_index.rb: benchmark for this change git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60086 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-09-30string.c: fix ASCII-only on succnobu
* string.c (str_succ): clear coderange cache when no alpha-numeric character case, carried part may become ASCII-only. [ruby-core:83062] [Bug #13952] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60066 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-09-29string.c: ASCII-incompatible is not ASCII onlynobu
* string.c (tr_trans): ASCII-incompatible encoding strings cannot be ASCII-only even if valid. [ruby-core:83056] [Bug #13950] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60060 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-09-23dup String#split return valuenobu
* string.c (rb_str_split): return duplicated receiver, when no splits. patched by tompng (tomoya ishida) in [ruby-core:82911], and the test case by Seiei Miyagi <hanachin@gmail.com>. [Bug#13925] [Fix GH-1705] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60002 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-09-23dup String#rpartition return valuenobu
* string.c (rb_str_rpartition): return duplicated receiver, when no splits. [ruby-core:82911] [Bug#13925] Author: Seiei Miyagi <hanachin@gmail.com> git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60001 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-09-23dup String#partition return valuenobu
* string.c (rb_str_partition): return duplicated receiver, when no splits. [ruby-core:82911] [Bug#13925] Author: Seiei Miyagi <hanachin@gmail.com> git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60000 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-09-18refinements in string interpolationnobu
* compile.c (iseq_compile_each0): insert to_s method call, so that refinements activated at the caller should take place. [Feature #13812] * insns.def (tostring): fix up converted object to a string, infect and fallback. * insns.def (branchiftype): new instruction for conversion. branches if TOS is an instance of the given type. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59950 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-09-06Fix a typo [ci skip]kazu
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59764 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-09-06string.c: fix false coderangenobu
* string.c (rb_enc_str_scrub): enc can differ from the actual encoding of the string, the cached coderange is useless then. [ruby-core:82674] [Bug #13874] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59763 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-09-06string.c: optimize enumerate_grapheme_clustersnobu
* string.c (rb_str_enumerate_grapheme_clusters): optimize when single byte only. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59762 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-09-04string.c: grapheme clusters on frozen stringnobu
* string.c (rb_str_enumerate_grapheme_clusters): enumerate on shared frozen string. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59743 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-09-04string.c: enumerator_elementnobu
* string.c (enumerator_element): push or yield elements, and return 1 if needs checks. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59742 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-09-03string.c: make array in WANTARRAYnobu
* string.c (WANTARRAY): make array for the result in method functions and pass it to enumerator functions. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59736 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-09-03string.c: enumerator_wantarraynobu
* string.c (enumerator_wantarray): show warnings at method functions for proper method names. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59732 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-09-03string.c: fix for non-Unicode encodingsnobu
* string.c (rb_str_enumerate_grapheme_clusters): should enumerate chars for non-Unicode encodings. [Feature #13780] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59731 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-09-03string.c: suppress a warningnobu
* string.c (rb_str_enumerate_grapheme_clusters): suppress a maybe-uninitialized warning by old gcc. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59730 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-08-31string.c: adjust indent [ci skip]nobu
* string.c (rb_str_enumerate_grapheme_clusters): adjust indent. [Feature #13780] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59700 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-08-31String#each_grapheme_cluster and String#grapheme_clustersnaruse
added to enumerate grapheme clusters [Feature #13780] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59698 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-08-28string.c: fix potential bug in String#splitglass
* string.c (rb_str_split_m): fix potential bug when rb_memsearch() matches a octet in the middle of a multi-byte character sequence. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59673 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-08-17Add optimization for creating zerofill stringnaruse
``` require 'benchmark' n = 1 * 1024 * 1024 * 1024 Benchmark.bmbm do |x| x.report("*") { 0.chr * n } x.report("ljust") { String.new(capacity: n).ljust(n, "\0") } end ``` Before ```% ./ruby test.rb Rehearsal ----------------------------------------- * 0.358396 0.392753 0.751149 ( 1.134231) ljust 0.203277 0.389223 0.592500 ( 0.594816) -------------------------------- total: 1.343649sec user system total real * 0.282647 0.304600 0.587247 ( 0.589205) ljust 0.201834 0.283801 0.485635 ( 0.487617) ``` After ```% ./ruby test.rb Rehearsal ----------------------------------------- * 0.000522 0.000021 0.000543 ( 0.000534) ljust 0.208551 0.321030 0.529581 ( 0.542083) -------------------------------- total: 0.530124sec user system total real * 0.000069 0.000006 0.000075 ( 0.000069) ljust 0.206698 0.301032 0.507730 ( 0.517674) ``` git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59614 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-08-04string.c: improve String#scannobu
* string.c (rb_str_rstrip_bang): improve the performance in 50% for a string pattern, and in 10% for a regexp pattern. get rid of making MatchData in middle, which is not used. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59496 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-07-30string.c: rb_str_initializenobu
* string.c (rb_str_initialize): new function to (re)initialize a string with data and encoding. extracted from rb_external_str_new_with_enc. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59448 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-07-20string.c: add String#delete_suffix and String#delete_suffix!sonots
to remove trailing suffix [Feature #13665] [Fix GH-1661] * string.c (rb_str_delete_suffix_bang): add a new method to remove suffix destuctively. * string.c (rb_str_delete_suffix): add a new method to remove suffix non-destuctively. * test/ruby/test_string.rb: add tests. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59377 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-07-19revert r59359, r59356, r59355, r59354normal
These caused numerous CI failures I haven't been able to reproduce [ruby-core:82102] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59364 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-07-18string: preserve taint flag with String#-@ (uminus)normal
* string.c (tainted_fstr_update): move up (rb_fstring): support registering tainted strings (register_fstring_tainted): extract from rb_fstring_existing0 (rb_tainted_fstring_existing): use register_fstring_tainted instead git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59359 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-07-18hash: keep fstrings of tainted strings for string keysnormal
The same hash keys may be loaded from tainted data sources frequently (e.g. parsing headers from socket or loading YAML data from a file). If a non-tainted fstring already exists (because the application expects the hash key), cache and deduplicate the tainted version in the new tainted_frozen_strings table. For non-embedded strings, this also allows sharing with the underlying malloc-ed data. * vm_core.h (rb_vm_struct): add tainted_frozen_strings * vm.c (ruby_vm_destruct): free tainted_frozen_strings (Init_vm_objects): initialize tainted_frozen_strings (rb_vm_tfstring_table): accessor for tainted_frozen_strings * internal.h: declare rb_fstring_existing, rb_vm_tfstring_table * hash.c (fstring_existing_str): remove (moved to string.c) (hash_aset_str): use rb_fstring_existing * string.c (rb_fstring_existing): new, based on fstring_existing_str (tainted_fstr_update): new (rb_fstring_existing0): new, based on fstring_existing_str (rb_tainted_fstring_existing): new, special case for tainted strings (rb_str_free): delete from tainted_frozen_strings table * test/ruby/test_optimization.rb (test_hash_reuse_fstring): new test [ruby-core:82012] [Bug #13737] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59354 b2dd03c8-39d4-4d8f-98ff-823fe69b080e