summaryrefslogtreecommitdiff
path: root/string.c
AgeCommit message (Collapse)Author
2017-08-31string.c: adjust indent [ci skip]nobu
* string.c (rb_str_enumerate_grapheme_clusters): adjust indent. [Feature #13780] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59700 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-08-31String#each_grapheme_cluster and String#grapheme_clustersnaruse
added to enumerate grapheme clusters [Feature #13780] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59698 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-08-28string.c: fix potential bug in String#splitglass
* string.c (rb_str_split_m): fix potential bug when rb_memsearch() matches a octet in the middle of a multi-byte character sequence. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59673 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-08-17Add optimization for creating zerofill stringnaruse
``` require 'benchmark' n = 1 * 1024 * 1024 * 1024 Benchmark.bmbm do |x| x.report("*") { 0.chr * n } x.report("ljust") { String.new(capacity: n).ljust(n, "\0") } end ``` Before ```% ./ruby test.rb Rehearsal ----------------------------------------- * 0.358396 0.392753 0.751149 ( 1.134231) ljust 0.203277 0.389223 0.592500 ( 0.594816) -------------------------------- total: 1.343649sec user system total real * 0.282647 0.304600 0.587247 ( 0.589205) ljust 0.201834 0.283801 0.485635 ( 0.487617) ``` After ```% ./ruby test.rb Rehearsal ----------------------------------------- * 0.000522 0.000021 0.000543 ( 0.000534) ljust 0.208551 0.321030 0.529581 ( 0.542083) -------------------------------- total: 0.530124sec user system total real * 0.000069 0.000006 0.000075 ( 0.000069) ljust 0.206698 0.301032 0.507730 ( 0.517674) ``` git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59614 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-08-04string.c: improve String#scannobu
* string.c (rb_str_rstrip_bang): improve the performance in 50% for a string pattern, and in 10% for a regexp pattern. get rid of making MatchData in middle, which is not used. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59496 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-07-30string.c: rb_str_initializenobu
* string.c (rb_str_initialize): new function to (re)initialize a string with data and encoding. extracted from rb_external_str_new_with_enc. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59448 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-07-20string.c: add String#delete_suffix and String#delete_suffix!sonots
to remove trailing suffix [Feature #13665] [Fix GH-1661] * string.c (rb_str_delete_suffix_bang): add a new method to remove suffix destuctively. * string.c (rb_str_delete_suffix): add a new method to remove suffix non-destuctively. * test/ruby/test_string.rb: add tests. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59377 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-07-19revert r59359, r59356, r59355, r59354normal
These caused numerous CI failures I haven't been able to reproduce [ruby-core:82102] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59364 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-07-18string: preserve taint flag with String#-@ (uminus)normal
* string.c (tainted_fstr_update): move up (rb_fstring): support registering tainted strings (register_fstring_tainted): extract from rb_fstring_existing0 (rb_tainted_fstring_existing): use register_fstring_tainted instead git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59359 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-07-18hash: keep fstrings of tainted strings for string keysnormal
The same hash keys may be loaded from tainted data sources frequently (e.g. parsing headers from socket or loading YAML data from a file). If a non-tainted fstring already exists (because the application expects the hash key), cache and deduplicate the tainted version in the new tainted_frozen_strings table. For non-embedded strings, this also allows sharing with the underlying malloc-ed data. * vm_core.h (rb_vm_struct): add tainted_frozen_strings * vm.c (ruby_vm_destruct): free tainted_frozen_strings (Init_vm_objects): initialize tainted_frozen_strings (rb_vm_tfstring_table): accessor for tainted_frozen_strings * internal.h: declare rb_fstring_existing, rb_vm_tfstring_table * hash.c (fstring_existing_str): remove (moved to string.c) (hash_aset_str): use rb_fstring_existing * string.c (rb_fstring_existing): new, based on fstring_existing_str (tainted_fstr_update): new (rb_fstring_existing0): new, based on fstring_existing_str (rb_tainted_fstring_existing): new, special case for tainted strings (rb_str_free): delete from tainted_frozen_strings table * test/ruby/test_optimization.rb (test_hash_reuse_fstring): new test [ruby-core:82012] [Bug #13737] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59354 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-07-06string.c: preserve coderange in String#setbyterhe
Fix a wrong jump so replacing a byte in an ASCII-only string with an ASCII character won't clear the coderange. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59272 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-07-06string.c: remove dead code in str_fill_term()rhe
The length of a string never exceeds the capacity. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59271 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-06-21string.c: add String#delete_prefix and String#delete_prefix!sonots
to remove leading substr [Feature #12694] [fix GH-1632] * string.c (rb_str_delete_prefix_bang): add a new method to remove prefix destuctively. * string.c (rb_str_delete_prefix): add a new method to remove prefix non-destuctively. * test/ruby/test_string.rb: add tests. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59132 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-06-18string.c: check just before modificationnobu
* string.c (rb_str_chomp_bang): check if modifiable after checking an argument and just before modification, as it can get frozen during the argument conversion to String. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59112 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-06-02string.c: docs for String#splitstomar
* string.c: [DOC] clarify docs for String#split when called with limit and capture groups. Reported by Cichol Tsai. [ruby-core:81505] [Bug #13621] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59002 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-05-31Improve performance of implicit type conversionwatson1978
To convert the object implicitly, it has had two parts in convert_type() which are 1. lookink up the method's id 2. calling the method Seems that strncmp() and strcmp() in convert_type() are slightly heavy to look up the method's id for type conversion. This patch will add and use internal APIs (rb_convert_type_with_id, rb_check_convert_type_with_id) to call the method without looking up the method's id when convert the object. Array#flatten -> 19 % up Array#+ -> 3 % up [ruby-dev:50024] [Bug #13341] [Fix GH-1537] ### Before Array#flatten 104.119k (± 1.1%) i/s - 525.690k in 5.049517s Array#+ 1.993M (± 1.8%) i/s - 10.010M in 5.024258s ### After Array#flatten 124.005k (± 1.0%) i/s - 624.240k in 5.034477s Array#+ 2.058M (± 4.8%) i/s - 10.302M in 5.019328s ### Test Code require 'benchmark/ips' class Foo def to_ary [1,2,3] end end Benchmark.ips do |x| ary = [] 100.times { |i| ary << i } array = [ary] x.report "Array#flatten" do |i| i.times { array.flatten } end x.report "Array#+" do |i| obj = Foo.new i.times { array + obj } end end git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58978 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-05-26string.c: adjust style [ci skip]nobu
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58897 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-05-25string.c: Optimize String#concat when argc is 1k0kubun
Optimize performance regression introduced in r56021. * Benchmark (i7-4790K @ 4.00GH, x86_64 GNU/Linux) Benchmark.ips do |x| x.report("String#concat (1)") { "a".concat("b") } if RUBY_VERSION >= "2.4.0" x.report("String#concat (2)") { "a".concat("b", "c") } end end * Ruby 2.3 Calculating ------------------------------------- String#concat (1) 6.003M (± 5.2%) i/s - 30.122M in 5.031646s * Ruby 2.4 (Before this patch) Calculating ------------------------------------- String#concat (1) 4.458M (± 8.9%) i/s - 22.298M in 5.058084s String#concat (2) 3.660M (± 5.6%) i/s - 18.314M in 5.020527s * Ruby 2.4 (After this patch) Calculating ------------------------------------- String#concat (1) 6.448M (± 5.2%) i/s - 32.215M in 5.010833s String#concat (2) 3.633M (± 9.0%) i/s - 18.056M in 5.022603s [fix GH-1631] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58886 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-05-25vm_insnhelper.c: rb_eql_opt should call eql?nobu
* vm_insnhelper.c (rb_eql_opt): should call #eql? on Float and String, not #==. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58882 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-05-24string.c: fix String#crypt leak introduced in r58866normal
* string.c (rb_str_crypt): define LARGE_CRYPT_DATA when allocating git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58876 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-05-24string.c: for small crypt_datanobu
* string.c (rb_str_crypt): struct crypt_data defined in missing/crypt.h is small enough. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58866 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-05-24Add debug counters.ko1
* debug_counter.h: add the following counters to measure object types. obj_free: freed count obj_str_ptr: freed count of Strings they have extra buff. obj_str_embed: freed count of Strings they don't have extra buff. obj_str_shared: freed count of Strings they have shared extra buff. obj_str_nofree: freed count of Strings they are marked as nofree. obj_str_fstr: freed count of Strings they are marked as fstr. obj_ary_ptr: freed count of Arrays they have extra buff. obj_ary_embed: freed count of Arrays they don't have extra buff. obj_obj_ptr: freed count of Objects (T_OBJECT) they have extra buff. obj_obj_embed: freed count of Objects they don't have extra buff. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58865 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-05-24string.c (rb_str_crypt): fix excessive stack use with crypt_rnormal
"struct crypt_data" is 131232 bytes on x86-64 GNU/Linux, making it unsafe to use tiny Fiber stack sizes. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58864 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-05-21string.c: fix String#{casecmp,casecmp?} for non-string argumentsstomar
* string.c: make String#{casecmp,casecmp?} return nil for non-string arguments instead of raising a TypeError. * test/ruby/test_string.rb: add tests. Reported by Marcus Stollsteimer. Based on a patch by Shingo Morita. [ruby-core:80145] [Bug #13312] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58837 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-05-14string.c: cut down intermediate stringnobu
* string.c (rb_external_str_new_with_enc): cut down intermediate string for conversion source, by appending with conversion. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58709 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-05-13revert r58703 & r58705nobu
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58708 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-05-13string.c: fix up r58703nobu
* string.c (rb_external_str_new_with_enc): fix the case of conversion failure. when conversion failed for some reason, just ignores the default internal encoding and returns in the given encoding. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58705 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-05-13string.c: cut down intermediate stringnobu
* string.c (rb_external_str_new_with_enc): cut down intermediate string for conversion source, by appending with conversion. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58703 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-05-13string.c: fix one-off bugnobu
* string.c (rb_str_cat_conv_enc_opts): fix one-off bug. `ofs` equals `olen` when appending at the end. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58702 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-05-12string.c: remove bare Unicode.nobu
* string.c (rb_str_unicode_normalize): remove bare Unicode. do not assume that all compilers can handle UTF-8. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58688 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-05-11string.c: docs for String#matchstomar
* string.c: [DOC] add example for String#match with pos argument. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58669 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-05-11string.c: docs for Symbolstomar
* string.c: [DOC] adopt call-seq's for Symbol#{match,match?} from String methods; other small improvements for Symbol docs. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58668 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-05-11string.c: docs for Symbol#{match,match?}stomar
* string.c: [DOC] mention pos argument for Symbol#{match,match?}. Patch by Yuki Kurihara (ksss). [Fix GH-1606] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58666 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-05-09string.c: fix r58618nobu
* string.c (unicode_normalize_common): aggregation type cannot be initialized with dynamic values, in C89. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58621 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-05-09replace hand-written argument check by call to rb_scan_args in ↵duerst
unicode_normalize_common In string.c, replace hand-written argument count check by call to rb_scan_args. This allows to use rb_funcallv once, rather than using rb_funcall twice. Thanks to Hanmac (Hans Mackowiak) for the idea, see https://bugs.ruby-lang.org/issues/11078#note-7. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58618 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-05-06string.c: fix typesnobu
* string.c (id_normalize, id_normalized_p): fix types, IDs should be ID. * string.c (unicode_normalize_common): ditto. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58575 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-05-04string.c: [DOC] improve docs for String.newstomar
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58569 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-05-04string.c: [DOC] Properly refer to keyword argument by its namektsj
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58567 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-05-04refactor common parts of unicode normalization functions into ↵duerst
unicode_normalize_common In string.c, refactor the common parts (requiring of unicode_normalize/normalize.rb, check of number of arguments) of the unicode normalization functions (rb_str_unicode_normalize, rb_str_unicode_normalize_bang, rb_str_unicode_normalized_p) into the new function unicode_normalize_common. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58558 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-05-04move definition of String#unicode_normalized? to C to make sure it is documentedduerst
* lib/unicode_normalize.rb: Remove definition of String#unicode_normalized? (including documentation). Leave a comment explaining that the file is now empty. * string.c: Define String#unicode_normalized? in rb_str_unicode_normalized_p in C, (including documentation) * lib/unicode_normalize/normalize.rb: Remove (re)definition of String#unicode_normalized? to avoid warnings (when $VERBOSE==true) and problems when String is frozen git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58555 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-05-04move definition of String#unicode_normalize! to C to make sure it is documentedduerst
* lib/unicode_normalize.rb: Remove definition of String#unicode_normalize! (including documentation) * string.c: Define String#unicode_normalize! in rb_str_unicode_normalize_bang in C, (including documentation) * lib/unicode_normalize/normalize.rb: Remove (re)definition of String#unicode_normalize! to avoid warnings (when $VERBOSE==true) and problems when String is frozen git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58553 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-05-03* remove trailing spaces.svn
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58551 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-05-03move definition of String#unicode_normalize to C to make sure it is documentedduerst
* lib/unicode_normalize.rb: Remove definition of String#unicode_normalize (including documentation) * string.c: Define String#unicode_normalize in rb_str_unicode_normalize in C, (including documentation) * lib/unicode_normalize/normalize.rb: Remove (re)definition of String#unicode_normalize to avoid warnings (when $VERBOSE==true) and problems when String is frozen git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58550 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-04-17string.c: improve insertion performacenobu
* string.c (rb_str_splice_0): improve performace of single byte optimizable cases, insertion 7bit string to 7bit string. [ruby-dev:49984] [Bug #13228] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58383 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-03-29string.c: Supress logical-op-parentheses warningsorah
* string.c(rb_str_upcase_bang): Supress logical-op-parentheses warning Patch by Fukuo Kadota <fukuo-kadota@cookpad.com>, Closes [GH-1570] [Bug #13387]. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58211 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-03-21string.c: use the usable sizenobu
* string.c (rb_str_change_terminator_length): when called after the content has been copied, old terminator length no longer makes sense. use the whole usable size instead of capacity without terminator. [ruby-core:80257] [Bug #13339] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58042 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-03-18fix accidental reversal of r57997 in r58000duerst
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58009 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-03-17clarifiy 'codepoint' in documentation of String#each_codepointduerst
Make sure it's clear that the returned values are not Unicode codepoints for encodings other than UTF-8/UTF-16(BE|LE)/UTF-32(BE|LE). [ci skip] [Bug #13321] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58000 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-03-17deduplicate static rb_str_format format stringsnormal
Anybody who hits these code paths can hit them again in the future, so try deduplicating across multiple runs of these methods to reduce garbage. * string.c (str_upto_each): fstring on "%.*d" * strftime.c (rb_strftime_with_timespec): fstring on "%0*d" git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57997 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-03-15string.c: shortcut argument checknobu
* string.c (str_casecmp, str_casecmp_p): split to skip argument check when it is a String certainly. * string.c (sym_casecmp, sym_casecmp_p): shortcut argument checks. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57978 b2dd03c8-39d4-4d8f-98ff-823fe69b080e