path: root/string.c
Age | Commit message | Author
2017-10-01 | string.c: avoid unnecessary call of str_strlen() | glass
* string.c (rb_strseq_index): refactor and avoid call of str_strlen() when offset == 0. it will improve performance of String#index and #include? * benchmark/bm_string_index.rb: benchmark for this change git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60086 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-09-30 | string.c: fix ASCII-only on succ | nobu
* string.c (str_succ): clear the coderange cache in the case with no alphanumeric characters, since the carried part may become ASCII-only. [ruby-core:83062] [Bug #13952] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60066 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-09-29 | string.c: ASCII-incompatible is not ASCII only | nobu
* string.c (tr_trans): ASCII-incompatible encoding strings cannot be ASCII-only even if valid. [ruby-core:83056] [Bug #13950] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60060 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-09-23 | dup String#split return value | nobu
* string.c (rb_str_split): return duplicated receiver, when no splits. patched by tompng (tomoya ishida) in [ruby-core:82911], and the test case by Seiei Miyagi <hanachin@gmail.com>. [Bug#13925] [Fix GH-1705] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60002 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-09-23 | dup String#rpartition return value | nobu
* string.c (rb_str_rpartition): return duplicated receiver, when no splits. [ruby-core:82911] [Bug#13925] Author: Seiei Miyagi <hanachin@gmail.com> git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60001 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-09-23 | dup String#partition return value | nobu
* string.c (rb_str_partition): return duplicated receiver, when no splits. [ruby-core:82911] [Bug#13925] Author: Seiei Miyagi <hanachin@gmail.com> git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60000 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
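As an illustrative sketch (not part of the commits), the behaviour described in the three String#split, String#rpartition and String#partition entries above can be observed from Ruby: when nothing is split, the returned element is a duplicate of the receiver rather than the receiver itself. The variable names are arbitrary, and the results assume a Ruby that includes these fixes.

```ruby
s = String.new("hello")                # unfrozen receiver

split_result      = s.split(",")       # delimiter absent: one element
partition_result  = s.partition(",")   # receiver copy, then two empty strings
rpartition_result = s.rpartition(",")  # two empty strings, then receiver copy

# Each result holds a duplicate, so mutating it leaves the receiver alone.
split_result.first << "!"
partition_result.first << "!"
rpartition_result.last << "!"

puts s
```

Because each returned string is a copy, `s` should still read "hello" after the three appends.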
2017-09-18 | refinements in string interpolation | nobu
* compile.c (iseq_compile_each0): insert to_s method call, so that refinements activated at the caller should take place. [Feature #13812] * insns.def (tostring): fix up converted object to a string, infect and fallback. * insns.def (branchiftype): new instruction for conversion. branches if TOS is an instance of the given type. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59950 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-09-06 | Fix a typo [ci skip] | kazu
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59764 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-09-06 | string.c: fix false coderange | nobu
* string.c (rb_enc_str_scrub): enc can differ from the actual encoding of the string, the cached coderange is useless then. [ruby-core:82674] [Bug #13874] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59763 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-09-06 | string.c: optimize enumerate_grapheme_clusters | nobu
* string.c (rb_str_enumerate_grapheme_clusters): optimize when single byte only. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59762 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-09-04 | string.c: grapheme clusters on frozen string | nobu
* string.c (rb_str_enumerate_grapheme_clusters): enumerate on shared frozen string. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59743 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-09-04 | string.c: enumerator_element | nobu
* string.c (enumerator_element): push or yield elements, and return 1 if needs checks. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59742 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-09-03 | string.c: make array in WANTARRAY | nobu
* string.c (WANTARRAY): make array for the result in method functions and pass it to enumerator functions. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59736 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-09-03 | string.c: enumerator_wantarray | nobu
* string.c (enumerator_wantarray): show warnings at method functions for proper method names. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59732 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-09-03 | string.c: fix for non-Unicode encodings | nobu
* string.c (rb_str_enumerate_grapheme_clusters): should enumerate chars for non-Unicode encodings. [Feature #13780] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59731 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-09-03 | string.c: suppress a warning | nobu
* string.c (rb_str_enumerate_grapheme_clusters): suppress a maybe-uninitialized warning by old gcc. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59730 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-08-31 | string.c: adjust indent [ci skip] | nobu
* string.c (rb_str_enumerate_grapheme_clusters): adjust indent. [Feature #13780] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59700 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-08-31 | String#each_grapheme_cluster and String#grapheme_clusters | naruse
added to enumerate grapheme clusters [Feature #13780] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59698 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
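A minimal sketch of the two methods added here, assuming Ruby 2.5 or later. The sample string is chosen for illustration: a base letter followed by a combining accent counts as two code points but one grapheme cluster.

```ruby
s = "e\u0301a"   # "e" + U+0301 COMBINING ACUTE ACCENT, then "a"

codepoints = s.chars.length                 # counts code points
clusters   = s.grapheme_clusters            # array of user-perceived characters
enumerated = s.each_grapheme_cluster.to_a   # same clusters via the enumerator form

p codepoints
p clusters.length
```

Here `codepoints` is 3 while `clusters` has 2 elements, since the accent is grouped with the preceding "e".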
2017-08-28 | string.c: fix potential bug in String#split | glass
* string.c (rb_str_split_m): fix potential bug when rb_memsearch() matches an octet in the middle of a multi-byte character sequence. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59673 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
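To illustrate the kind of input this concerns (an assumed example, not taken from the commit): in Shift_JIS the character U+8868 is encoded as the bytes 0x95 0x5C, and 0x5C is also the backslash. A byte-wise search for a backslash therefore hits the trailing byte of that character, and the split must skip that hit.

```ruby
kanji = "\u8868".encode("Shift_JIS")      # two bytes: 0x95 0x5C
sjis  = "\u8868\\a".encode("Shift_JIS")   # kanji, a real backslash, then "a"

# Split on the backslash; only the stand-alone 0x5C is a real delimiter.
parts = sjis.split("\\")

p kanji.bytes
p parts.length
```

On a correct implementation `parts` has two elements, the intact kanji and "a".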
2017-08-17 | Add optimization for creating zerofill string | naruse
```
require 'benchmark'

n = 1 * 1024 * 1024 * 1024

Benchmark.bmbm do |x|
  x.report("*") { 0.chr * n }
  x.report("ljust") { String.new(capacity: n).ljust(n, "\0") }
end
```

Before

```
% ./ruby test.rb
Rehearsal -----------------------------------------
*       0.358396   0.392753   0.751149 (  1.134231)
ljust   0.203277   0.389223   0.592500 (  0.594816)
-------------------------------- total: 1.343649sec

            user     system      total        real
*       0.282647   0.304600   0.587247 (  0.589205)
ljust   0.201834   0.283801   0.485635 (  0.487617)
```

After

```
% ./ruby test.rb
Rehearsal -----------------------------------------
*       0.000522   0.000021   0.000543 (  0.000534)
ljust   0.208551   0.321030   0.529581 (  0.542083)
-------------------------------- total: 0.530124sec

            user     system      total        real
*       0.000069   0.000006   0.000075 (  0.000069)
ljust   0.206698   0.301032   0.507730 (  0.517674)
```

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59614 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-08-04 | string.c: improve String#scan | nobu
* string.c (rb_str_scan): improve the performance by 50% for a string pattern, and by 10% for a regexp pattern. get rid of making an intermediate MatchData, which is not used. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59496 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-07-30 | string.c: rb_str_initialize | nobu
* string.c (rb_str_initialize): new function to (re)initialize a string with data and encoding. extracted from rb_external_str_new_with_enc. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59448 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-07-20 | string.c: add String#delete_suffix and String#delete_suffix! | sonots
to remove trailing suffix [Feature #13665] [Fix GH-1661] * string.c (rb_str_delete_suffix_bang): add a new method to remove suffix destructively. * string.c (rb_str_delete_suffix): add a new method to remove suffix non-destructively. * test/ruby/test_string.rb: add tests. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59377 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
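A short usage sketch of the two methods, assuming Ruby 2.5 or later. The file names are made up for illustration.

```ruby
name = "report.txt"
without_ext = name.delete_suffix(".txt")   # new string; receiver unchanged
unchanged   = name.delete_suffix(".csv")   # suffix absent: an equal copy comes back

buf = String.new("report.txt")
first_call  = buf.delete_suffix!(".txt")   # modifies buf in place and returns it
second_call = buf.delete_suffix!(".txt")   # nothing left to remove: returns nil

p without_ext, unchanged, buf, second_call
```

As with other bang methods, the destructive form signals "no change" by returning nil rather than the receiver.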
2017-07-19 | revert r59359, r59356, r59355, r59354 | normal
These caused numerous CI failures I haven't been able to reproduce [ruby-core:82102] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59364 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-07-18 | string: preserve taint flag with String#-@ (uminus) | normal
* string.c (tainted_fstr_update): move up (rb_fstring): support registering tainted strings (register_fstring_tainted): extract from rb_fstring_existing0 (rb_tainted_fstring_existing): use register_fstring_tainted instead git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59359 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-07-18 | hash: keep fstrings of tainted strings for string keys | normal
The same hash keys may be loaded from tainted data sources frequently (e.g. parsing headers from socket or loading YAML data from a file). If a non-tainted fstring already exists (because the application expects the hash key), cache and deduplicate the tainted version in the new tainted_frozen_strings table. For non-embedded strings, this also allows sharing with the underlying malloc-ed data.

* vm_core.h (rb_vm_struct): add tainted_frozen_strings
* vm.c (ruby_vm_destruct): free tainted_frozen_strings
  (Init_vm_objects): initialize tainted_frozen_strings
  (rb_vm_tfstring_table): accessor for tainted_frozen_strings
* internal.h: declare rb_fstring_existing, rb_vm_tfstring_table
* hash.c (fstring_existing_str): remove (moved to string.c)
  (hash_aset_str): use rb_fstring_existing
* string.c (rb_fstring_existing): new, based on fstring_existing_str
  (tainted_fstr_update): new
  (rb_fstring_existing0): new, based on fstring_existing_str
  (rb_tainted_fstring_existing): new, special case for tainted strings
  (rb_str_free): delete from tainted_frozen_strings table
* test/ruby/test_optimization.rb (test_hash_reuse_fstring): new test

[ruby-core:82012] [Bug #13737]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59354 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-07-06 | string.c: preserve coderange in String#setbyte | rhe
Fix a wrong jump so replacing a byte in an ASCII-only string with an ASCII character won't clear the coderange. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59272 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
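The cached coderange itself is internal and not directly visible from Ruby, so the following is only a sketch of the observable behaviour that the cache must stay consistent with. The byte values are chosen for illustration.

```ruby
s = String.new("abc", encoding: "UTF-8")
before = s.ascii_only?            # starts out ASCII-only

s.setbyte(0, 0x41)                # replace "a" with ASCII "A"
after_ascii = s.ascii_only?       # still ASCII-only

s.setbyte(1, 0xE3)                # a lone UTF-8 lead byte
after_non_ascii = s.ascii_only?   # no longer ASCII-only
still_valid     = s.valid_encoding?

p before, after_ascii, after_non_ascii, still_valid
```

Writing an ASCII byte into an ASCII-only string keeps it ASCII-only, which is the case where the cached coderange can be preserved. Writing a non-ASCII byte has to invalidate it.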
2017-07-06 | string.c: remove dead code in str_fill_term() | rhe
The length of a string never exceeds the capacity. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59271 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-06-21 | string.c: add String#delete_prefix and String#delete_prefix! | sonots
to remove leading substr [Feature #12694] [fix GH-1632] * string.c (rb_str_delete_prefix_bang): add a new method to remove prefix destructively. * string.c (rb_str_delete_prefix): add a new method to remove prefix non-destructively. * test/ruby/test_string.rb: add tests. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59132 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
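A short usage sketch of the prefix variants, assuming Ruby 2.5 or later. The path strings are made up for illustration.

```ruby
path = "lib/foo.rb"
relative  = path.delete_prefix("lib/")    # new string; receiver unchanged
untouched = path.delete_prefix("spec/")   # prefix absent: an equal copy comes back

buf = String.new("lib/foo.rb")
changed = buf.delete_prefix!("lib/")      # modifies buf in place and returns it
no_op   = buf.delete_prefix!("lib/")      # prefix already gone: returns nil

p relative, untouched, buf, no_op
```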
2017-06-18 | string.c: check just before modification | nobu
* string.c (rb_str_chomp_bang): check if modifiable after checking an argument and just before modification, as it can get frozen during the argument conversion to String. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59112 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
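A sketch of the situation this guards against, using a hypothetical argument object whose implicit conversion (to_str) freezes the receiver. On a Ruby with this check, the modification is expected to be rejected rather than applied to the now-frozen string.

```ruby
target = String.new("hello\n")

# Hypothetical separator object: converting it to a String freezes the receiver.
separator = Object.new
separator.define_singleton_method(:to_str) do
  target.freeze
  "\n"
end

raised = false
begin
  target.chomp!(separator)
rescue RuntimeError   # FrozenError (a RuntimeError subclass) on Ruby 2.5+
  raised = true
end

p raised, target
```

The receiver passes the frozen check on entry, so a second check just before the actual modification is what catches the freeze that happened during argument conversion.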
2017-06-02 | string.c: docs for String#split | stomar
* string.c: [DOC] clarify docs for String#split when called with limit and capture groups. Reported by Cichol Tsai. [ruby-core:81505] [Bug #13621] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59002 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-05-31 | Improve performance of implicit type conversion | watson1978
To convert the object implicitly, convert_type() has had two parts:

1. looking up the method's id
2. calling the method

It seems that strncmp() and strcmp() in convert_type() are slightly heavy for looking up the method's id for type conversion. This patch adds and uses internal APIs (rb_convert_type_with_id, rb_check_convert_type_with_id) to call the method without looking up the method's id when converting the object.

Array#flatten -> 19 % up
Array#+ -> 3 % up

[ruby-dev:50024] [Bug #13341] [Fix GH-1537]

### Before
Array#flatten 104.119k (± 1.1%) i/s - 525.690k in 5.049517s
Array#+ 1.993M (± 1.8%) i/s - 10.010M in 5.024258s

### After
Array#flatten 124.005k (± 1.0%) i/s - 624.240k in 5.034477s
Array#+ 2.058M (± 4.8%) i/s - 10.302M in 5.019328s

### Test Code
require 'benchmark/ips'

class Foo
  def to_ary
    [1,2,3]
  end
end

Benchmark.ips do |x|
  ary = []
  100.times { |i| ary << i }
  array = [ary]

  x.report "Array#flatten" do |i|
    i.times { array.flatten }
  end

  x.report "Array#+" do |i|
    obj = Foo.new
    i.times { array + obj }
  end
end

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58978 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-05-26 | string.c: adjust style [ci skip] | nobu
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58897 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-05-25 | string.c: Optimize String#concat when argc is 1 | k0kubun
Optimize performance regression introduced in r56021.

* Benchmark (i7-4790K @ 4.00GHz, x86_64 GNU/Linux)

  Benchmark.ips do |x|
    x.report("String#concat (1)") { "a".concat("b") }
    if RUBY_VERSION >= "2.4.0"
      x.report("String#concat (2)") { "a".concat("b", "c") }
    end
  end

* Ruby 2.3

  Calculating -------------------------------------
  String#concat (1) 6.003M (± 5.2%) i/s - 30.122M in 5.031646s

* Ruby 2.4 (Before this patch)

  Calculating -------------------------------------
  String#concat (1) 4.458M (± 8.9%) i/s - 22.298M in 5.058084s
  String#concat (2) 3.660M (± 5.6%) i/s - 18.314M in 5.020527s

* Ruby 2.4 (After this patch)

  Calculating -------------------------------------
  String#concat (1) 6.448M (± 5.2%) i/s - 32.215M in 5.010833s
  String#concat (2) 3.633M (± 9.0%) i/s - 18.056M in 5.022603s

[fix GH-1631]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58886 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-05-25 | vm_insnhelper.c: rb_eql_opt should call eql? | nobu
* vm_insnhelper.c (rb_eql_opt): should call #eql? on Float and String, not #==. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58882 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
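For context, a small sketch of why the distinction matters at the Ruby level: #== converts between numeric types, while #eql? does not, so an optimized path that substitutes one for the other would give different answers. The values are chosen for illustration.

```ruby
float_eq    = (1.0 == 1)               # numeric comparison with conversion
float_eql   = 1.0.eql?(1)              # strict: Float vs Integer
string_eql  = "a".eql?("a")            # same class and same content
hash_lookup = { 1.0 => :float }[1]     # Hash keys are matched with eql?

p float_eq, float_eql, string_eql, hash_lookup
```

`float_eq` is true but `float_eql` is false, and for the same reason the Integer key 1 does not find the Float key 1.0 in the hash.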
2017-05-24 | string.c: fix String#crypt leak introduced in r58866 | normal
* string.c (rb_str_crypt): define LARGE_CRYPT_DATA when allocating git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58876 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-05-24 | string.c: for small crypt_data | nobu
* string.c (rb_str_crypt): struct crypt_data defined in missing/crypt.h is small enough. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58866 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-05-24 | Add debug counters. | ko1
* debug_counter.h: add the following counters to measure object types.

  obj_free: freed count
  obj_str_ptr: freed count of Strings that have an extra buffer.
  obj_str_embed: freed count of Strings that don't have an extra buffer.
  obj_str_shared: freed count of Strings that have a shared extra buffer.
  obj_str_nofree: freed count of Strings that are marked as nofree.
  obj_str_fstr: freed count of Strings that are marked as fstr.
  obj_ary_ptr: freed count of Arrays that have an extra buffer.
  obj_ary_embed: freed count of Arrays that don't have an extra buffer.
  obj_obj_ptr: freed count of Objects (T_OBJECT) that have an extra buffer.
  obj_obj_embed: freed count of Objects that don't have an extra buffer.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58865 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-05-24 | string.c (rb_str_crypt): fix excessive stack use with crypt_r | normal
"struct crypt_data" is 131232 bytes on x86-64 GNU/Linux, making it unsafe to use tiny Fiber stack sizes. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58864 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-05-21 | string.c: fix String#{casecmp,casecmp?} for non-string arguments | stomar
* string.c: make String#{casecmp,casecmp?} return nil for non-string arguments instead of raising a TypeError. * test/ruby/test_string.rb: add tests. Reported by Marcus Stollsteimer. Based on a patch by Shingo Morita. [ruby-core:80145] [Bug #13312] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58837 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
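A brief sketch of the resulting behaviour, assuming a Ruby that includes this change. The arguments are arbitrary examples.

```ruby
same_ignoring_case = "abc".casecmp("ABC")    # comparison result for equal strings
predicate          = "abc".casecmp?("ABC")   # boolean form
non_string_cmp     = "abc".casecmp(1)        # non-string argument
non_string_pred    = "abc".casecmp?(1)       # non-string argument

p same_ignoring_case, predicate, non_string_cmp, non_string_pred
```

The first two calls give 0 and true. Both non-string calls return nil instead of raising a TypeError, which is how incomparable arguments are usually reported by comparison methods such as #<=>.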
2017-05-14 | string.c: cut down intermediate string | nobu
* string.c (rb_external_str_new_with_enc): cut down intermediate string for conversion source, by appending with conversion. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58709 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-05-13 | revert r58703 & r58705 | nobu
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58708 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-05-13 | string.c: fix up r58703 | nobu
* string.c (rb_external_str_new_with_enc): fix the case of conversion failure. when conversion failed for some reason, just ignores the default internal encoding and returns in the given encoding. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58705 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-05-13 | string.c: cut down intermediate string | nobu
* string.c (rb_external_str_new_with_enc): cut down intermediate string for conversion source, by appending with conversion. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58703 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-05-13 | string.c: fix one-off bug | nobu
* string.c (rb_str_cat_conv_enc_opts): fix one-off bug. `ofs` equals `olen` when appending at the end. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58702 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-05-12 | string.c: remove bare Unicode. | nobu
* string.c (rb_str_unicode_normalize): remove bare Unicode. do not assume that all compilers can handle UTF-8. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58688 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-05-11 | string.c: docs for String#match | stomar
* string.c: [DOC] add example for String#match with pos argument. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58669 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-05-11 | string.c: docs for Symbol | stomar
* string.c: [DOC] adopt call-seq's for Symbol#{match,match?} from String methods; other small improvements for Symbol docs. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58668 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-05-11 | string.c: docs for Symbol#{match,match?} | stomar
* string.c: [DOC] mention pos argument for Symbol#{match,match?}. Patch by Yuki Kurihara (ksss). [Fix GH-1606] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58666 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-05-09 | string.c: fix r58618 | nobu
* string.c (unicode_normalize_common): aggregation type cannot be initialized with dynamic values, in C89. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58621 b2dd03c8-39d4-4d8f-98ff-823fe69b080e