path: root/string.c
Age | Commit message | Author
2017-09-14 | merge revision(s) 59763: [Backport #13874] | usa
string.c: fix false coderange * string.c (rb_enc_str_scrub): enc can differ from the actual encoding of the string, the cached coderange is useless then. [ruby-core:82674] [Bug #13874] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/branches/ruby_2_3@59883 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-06-30 | merge revision(s) 59002: [Backport #13621] | usa
string.c: docs for String#split * string.c: [DOC] clarify docs for String#split when called with limit and capture groups. Reported by Cichol Tsai. [ruby-core:81505] [Bug #13621] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/branches/ruby_2_3@59227 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
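A short Ruby sketch of the behavior this documentation change describes, as far as it can be inferred from the commit message: when the pattern contains capture groups, the captured separators are returned too, so the result can hold more elements than `limit`. The sample strings are illustrative and do not come from the commit.

```ruby
# Without capture groups, limit bounds the size of the result.
p "1,2,3".split(/,/, 2)    # => ["1", "2,3"]

# With a capture group, each matched separator is returned as well,
# so the array can be longer than limit.
p "1,2,3".split(/(,)/, 2)  # => ["1", ",", "2,3"]
```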
2017-03-27 | merge revision(s) 57374: [Backport #13135] | nagachika
string.c: rindex(//) should set $~. This seems a bug introduced by r520 (1.4.0). [ruby-core:79110] [Bug #13135] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/branches/ruby_2_3@58176 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
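As a rough illustration of what "should set `$~`" means, the sketch below calls `String#rindex` with a regexp and then reads the last-match variable. The sample string is an assumption made for the example; only the empty-pattern case at the end corresponds to the bug report.

```ruby
str = "hello"

index = str.rindex(/l/)
p index          # => 3
p $~[0]          # => "l"
p $~.pre_match   # => "hel"

# The case named in the commit: an empty pattern.
# On a Ruby that includes this fix, $~ should be a MatchData here, not nil.
str.rindex(//)
p $~.nil?
```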
2017-03-27 | merge revision(s) 57302,57303,57304: [Backport #13119] | nagachika
string.c: block for scrub with ASCII-incompatible * string.c (rb_enc_str_scrub): honor the given block with ASCII-incompatible encoding. [ruby-core:79039] [Bug #13120] string.c: yield invalid part * string.c (rb_enc_str_scrub): yield the invalid part only with ASCII-incompatible. [ruby-core:79039] [Bug #13120] string.c: replacement and block * string.c (rb_enc_str_scrub): only one of replacement and block is allowed. [ruby-core:79038] [Bug #13119] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/branches/ruby_2_3@58175 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
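The two calling styles of `String#scrub` that these fixes concern can be sketched as follows. The example uses an invalid UTF-8 string for brevity, whereas the commits are about ASCII-incompatible encodings such as UTF-16; the input and the replacement format are assumptions for illustration.

```ruby
broken = "abc\xE3"          # ends with a truncated multibyte sequence
p broken.valid_encoding?    # => false

# Replacement-string form.
p broken.scrub("?")         # => "abc?"

# Block form: the block receives the invalid bytes.
p broken.scrub { |bytes| "<#{bytes.unpack1('H*')}>" }   # => "abc<e3>"

# Passing both is expected to be rejected on a Ruby with this fix.
begin
  broken.scrub("?") { |bytes| "!" }
rescue ArgumentError => e
  puts "rejected: #{e.message}"
end
```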
2016-08-22 | merge revision(s) 55547,55551,55552,55555,55557,55559,55575,55691,55568: [Backport #12536] | nagachika
* string.c: Fix memory corruptions when using UTF-16/32 strings. [Bug #12536] [ruby-dev:49699]
* string.c (TERM_LEN_MAX): Macro for the longest TERM_FILL length, the same as largest value of rb_enc_mbminlen(enc) among encodings.
* string.c (str_new, rb_str_buf_new, str_shared_replace): Allocate +TERM_LEN_MAX bytes instead of +1. This change may increase memory usage.
* string.c (rb_str_new_with_class): Use TERM_LEN of the "obj".
* string.c (rb_str_plus, rb_str_justify): Use str_new0 which is aware of termlen.
* string.c (str_shared_replace): Copy +termlen bytes instead of +1.
* string.c (rb_str_times): termlen should not be included in capa.
* string.c (RESIZE_CAPA_TERM): When using RSTRING_EMBED_LEN_MAX, termlen should be counted with it because embedded strings are also processed by TERM_FILL.
* string.c (rb_str_capacity, str_shared_replace, str_buf_cat): ditto.
* string.c (rb_str_drop_bytes, rb_str_setbyte, str_byte_substr): ditto.
* string.c (rb_str_subseq, str_substr): When RSTRING_EMBED_LEN_MAX is used, TERM_LEN(str) should be considered with it because embedded strings are also processed by TERM_FILL. Additional fix for [Bug #12536] [ruby-dev:49699].
* string.c (rb_usascii_str_new, rb_utf8_str_new): Specify termlen which is apparently 1 for the encodings.
* string.c (str_new0_cstr): New static function to create a String object from a C string with specifying termlen.
* string.c (rb_usascii_str_new_cstr, rb_utf8_str_new_cstr): Specify termlen by using new str_new0_cstr().
* string.c (str_new_static): Specify termlen from the given encoding when creating a new String object is needed.
* string.c (rb_tainted_str_new_with_enc): New function to create a tainted String object with the given encoding. This means that the termlen is correctly specified. Currently static function. The function name might be renamed to rb_tainted_enc_str_new or rb_enc_tainted_str_new.
* string.c (rb_external_str_new_with_enc): Use encoding by using the above rb_tainted_str_new_with_enc().
* string.c (str_fill_term): When termlen increases, re-allocation of memory for termlen should always be needed. In this fix, if possible, decrease capa instead of realloc. [Bug #12536] [ruby-dev:49699]
* string.c: Partially reverts r55547 and r55555. ChangeLog entries about the reverted changes are also deleted in this file. [Bug #12536] [ruby-dev:49699] [ruby-dev:49702]
* string.c (rb_str_change_terminator_length): New function to change termlen and resize heap for the terminator. This is split from rb_str_fill_terminator (str_fill_term) because filling terminator and changing terminator length are different things. [Bug #12536]
* internal.h: declaration for rb_str_change_terminator_length.
* string.c (str_fill_term): Simplify only to zero-fill the terminator. For non-shared strings, it assumes that (capa + termlen) bytes of heap are allocated. This partially reverts r55557.
* encoding.c (rb_enc_associate_index): rb_str_change_terminator_length is used, and it should be called whenever the termlen is changed.
* string.c (str_capacity): New static function to return capacity of a string with the given termlen, because the termlen may sometimes be different from TERM_LEN(str) especially during changing termlen or filling terminator with specific termlen.
* string.c (rb_str_capacity): Use str_capacity.
* string.c (str_buf_cat): Fix capa size for embed string. Fix bug in r55547. [Bug #12536]
* string.c: Specify termlen as far as possible.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/branches/ruby_2_3@55988 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-08-18 | merge revision(s) 55729: [Backport #12611] | nagachika
* vm.c (vm_set_main_stack): remove unnecessary check. toplevel binding must be initialized. [Bug #12611] (N1) * win32/win32.c (w32_symlink): fix return type. [Bug #12611] (N3) * string.c (rb_str_split_m): simplify the condition. [Bug #12611](N4) git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/branches/ruby_2_3@55959 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-08-15 | merge revision(s) 55181: [Backport #12431] | nagachika
* transcode.c (str_transcode0): scrub in the given encoding when the source encoding is given, not in the encoding of the receiver. [ruby-core:75732] [Bug #12431] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/branches/ruby_2_3@55905 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-07-01 | merge revision(s) 55427: [Backport #12503] | nagachika
* string.c (tr_trans): consider terminator length and fix heap overflow. reported by Guido Vranken <guido AT guidovranken.nl>. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/branches/ruby_2_3@55561 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-06-15 | merge revision(s) 55054: [Backport #12390] | nagachika
* string.c (rb_str_modify_expand): check integer overflow. [ruby-core:75592] [Bug #12390] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/branches/ruby_2_3@55426 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-03-29 | merge revision(s) 53724: [Backport #11946] | naruse
* string.c (str_new_frozen): if the given string is embeddable but not embedded, embed a new copied string. [Bug #11946] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/branches/ruby_2_3@54416 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-03-29 | merge revision(s) 54210: [Backport #12204] | naruse
* string.c (enc_succ_alnum_char): try to skip an invalid character gap between GREEK CAPITAL RHO and SIGMA. [ruby-core:74478] [Bug #12204] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/branches/ruby_2_3@54384 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2015-12-23 | * string.c: Fix document. | yui-knk
Default value of the first argument of `String#split` is not `$;` but `nil`. When `nil` is passed as first argument, `$;` is used. [ci skip] [Bug #11729] [ruby-dev:49378] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53260 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
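A minimal sketch of the documented default, assuming `$;` has been left at its initial value of `nil`: omitting the pattern and passing `nil` explicitly behave the same way. The sample text is made up for the example.

```ruby
text = "a b  c"

# No argument: the pattern defaults to nil, which falls back to $;
# (and, when $; is nil too, to splitting on whitespace).
p text.split        # => ["a", "b", "c"]

# Passing nil explicitly takes the same path.
p text.split(nil)   # => ["a", "b", "c"]
```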
2015-12-22 | string.c: no exception on dummy encoding | nobu
* string.c (str_compat_and_valid): as scrub does nothing for dummy encoding string now, incompatible encoding is not a matter. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53235 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2015-12-17 | string.c: infection | nobu
* string.c (rb_str_scrub): the result should be infected by the original string. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53169 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2015-12-15 | string.c: radix indicators [ci skip] | nobu
* string.c (rb_str_oct): [DOC] mention radix indicators. [ruby-core:71310] [Bug #11648] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53122 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
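For readers unfamiliar with radix indicators, the following sketch shows how `String#oct` treats a leading `0x`, `0b`, or `0o` prefix. The inputs are chosen for illustration and are not taken from the documentation patch.

```ruby
# Plain digits are read as octal.
p "123".oct     # => 83

# A radix indicator overrides the octal default.
p "0x1A".oct    # => 26
p "0b101".oct   # => 5
p "0o17".oct    # => 15
```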
2015-12-14 | * enum.c: fix a typo in documentation. | hsbt
[ci skip][fix GH-1140] Patch by @jutaz * io.c: ditto. * iseq.c: ditto. * numeric.c: ditto. * process.c: ditto. * string.c: ditto. * vm_trace.c: ditto. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53105 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2015-12-10 | * object.c (rb_inspect): dump inspected result with rb_str_escape() instead of raising Encoding::CompatibilityError. [Feature #11801] | naruse
* string.c (rb_str_escape): added to dump given string like rb_str_inspect without quotes and always dump in US-ASCII like rb_str_dump. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53027 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2015-12-08 | string.c: use rb_id_encoding | nobu
* string.c (rb_str_init): rb_id_encoding() returns same ID with caching. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@52982 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2015-12-08 | * string.c (rb_str_init): now accepts new option parameter `encoding'. | usa
[Feature #11785] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@52976 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2015-12-08 | string.c: removed unused variable | duerst
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@52944 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2015-12-07 | * string.c: introduce String#+@ and String#-@ to control String mutability. [Feature #11782] | ko1
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@52917 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
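A small sketch of how the two unary operators are used. Variable names are illustrative; whether `-str` returns the receiver itself or a deduplicated copy differs between Ruby versions, so the example only checks frozenness.

```ruby
frozen = "abc".freeze

# Unary plus gives back something mutable: a copy if the receiver is frozen.
copy = +frozen
p copy.frozen?           # => false
p copy.equal?(frozen)    # => false

# On an already mutable string, unary plus returns the receiver itself.
mutable = "abc".dup
p((+mutable).equal?(mutable))   # => true

# Unary minus gives back a frozen string.
p((-mutable).frozen?)    # => true
```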
2015-12-04 | string.c: should not taint fstring | nobu
* string.c (rb_obj_as_string): fstring should not be infected. re-apply r52872 and fix a typo. TODO: other frozen strings also may not be. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@52882 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2015-12-04 | Revert r52872 "string.c: should not taint fstring" | naruse
This reverts commit b887c7c20ab81b50ed7cb8c7db3218c443985d6b. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@52878 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2015-12-03 | string.c: should not taint fstring | nobu
* string.c (rb_obj_as_string): fstring should not be infected. TODO: other frozen strings also may not be. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@52872 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2015-12-02 | string.c: adjust argument qualifier | nobu
* string.c (str_make_independent_expand): adjust argument qualifier to get rid of a VC bug. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@52844 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2015-12-01 | string.c: no frozen error at cstr | nobu
* string.c (rb_string_value_cstr): should not raise on frozen string. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@52833 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2015-11-07 | string.c: use predefined IDs for minor bloat reduction | normal
* string.c (id_to_s): remove redundant variable
  (rb_obj_as_string): trade id_to_s for idTo_s
  (rb_str_equal): replace rb_intern(...) with pre-defined ID
  (rb_str_cmp_m): ditto
  (rb_str_match): ditto
  (str_upto_each): ditto
  (rb_str_sum): ditto
  (Init_String): remove id_to_s initialization
This leads to a minor size reduction on my x86 (32-bit) system:
   text    data     bss     dec     hex filename
 129373       8      32  129413   1f985 string.o-orig
 129082       8       8  129098   1f84a string.o
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@52479 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2015-10-29 | * encoding.c (rb_enc_check_str): add for performance. | ko1
This function only accepts T_STRING (and T_REGEXP). This patch improves performance of a tiny_segmenter benchmark (num=2) 2.54sec -> 2.42sec on my machine. https://github.com/chezou/TinySegmenter.jl/blob/master/benchmark/benchmark.rb * encoding.c: add ENC_DEBUG and ENC_ASSERT() macros. * internal.h: add a decl. of rb_enc_check_str(). * string.c (rb_str_plus): use rb_enc_check_str(). * string.c (rb_str_subpat_set): ditto. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@52350 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2015-10-29 | RUBY_DTRACE_CREATE_HOOK | nobu
* internal.h (RUBY_DTRACE_CREATE_HOOK): macro to call hook at object creation. * vm.c (rb_source_location, rb_source_loc): retrieve source path and line number at once. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@52340 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2015-10-29 | revert r52336 (commit miss) | ko1
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@52337 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2015-10-29 | * gc.c (gc_mark_ptr): remove debug code for #11244. | ko1
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@52336 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2015-10-23 | * string.c: Added method signature to include hash. | hsbt
It was inconsistent with the `gsub` method signature. [ci skip][fix GH-1023] Patch by @danielevans git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@52241 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
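The hash form that the added signature documents can be sketched as below; the commit message does not name the method, so the use of `String#sub` here is an assumption, and the mapping is invented for the example. The matched text is looked up as a key and replaced by the corresponding value.

```ruby
map = { "e" => "3", "l" => "1" }

# sub replaces only the first match.
p "hello".sub(/[el]/, map)    # => "h3llo"

# gsub replaces every match, using the same kind of hash.
p "hello".gsub(/[el]/, map)   # => "h311o"
```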
2015-10-22 | fix backslash [ci skip] | nobu
* string.c (rb_str_tr): [DOC] Escape backslash in String#tr documentation. [Fix GH-1063] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@52225 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
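To show why the backslash needs care in `String#tr` documentation, here is a sketch of escaping `^` and `-` inside the character set. Note that the Ruby string literal itself needs `\\` to produce a single backslash.

```ruby
# "\\^" is a literal caret rather than a negation marker.
p "hello^world".tr("\\^aeiou", "*")   # => "h*ll**w*rld"

# "\\-" is a literal hyphen rather than a range operator.
p "hello-world".tr("a\\-eo", "*")     # => "h*ll**w*rld"
```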
2015-10-17 | string.c: rb_str_cat_conv_enc_opts | nobu
* file.c (rb_file_expand_path_internal): concatenate converted string to the result instead of making converted string and append it. * string.c (rb_str_cat_conv_enc_opts): from rb_str_conv_enc_opts, separate function to concatenate with transcoding. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@52147 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2015-10-15 | proc.c: proc without env | nobu
* proc.c (rb_sym_to_proc): move from string.c and create a Proc with no environments. [ruby-core:71088] [Bug #11594] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@52129 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2015-10-10 | * import a github pull request | ko1
https://github.com/ruby/ruby/pull/1050 by Kazuho Oku <kazuho@natadeco.co>. This pull request has the following commits. * gc.c: reduce # of args to 6 (max. of register args on x86-64) so that the `newobj_of_slowpass` can be called via TCO. * gc.c (newobj_of), string.c (str_duplicate): for performance, the hot functions must be inlined. * gc.c: for performance, preceding arguments of `.*newobj_of.*` must be same, so that the arg registers can be reused in case of TCO. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@52099 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2015-10-07 | string.c: str_duplicate | nobu
* string.c (str_duplicate): move from rb_str_resurrect to short circuit initialization. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@52074 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2015-10-07 | string.c: fix non-embedded string | nobu
* string.c (rb_str_resurrect): fix resurrection of a string that is short enough to be embedded but is not embedded. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@52073 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2015-10-07 | string.c: optimize String#times | nobu
* string.c (rb_str_times): optimize for the argument 0 and 1. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@52071 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2015-10-07 | string.c: use raw macro | nobu
* string.c (str_new_frozen): use raw macro for RString object. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@52070 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2015-10-06 | * string.c (rb_sym_to_proc): rename | nobu
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@52057 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2015-10-06 | vm_args.c: wrap symbol ifunc | nobu
* vm_args.c (args_setup_block_parameter): wrap a symbol in ifunc by a proc as a block parameter. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@52056 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2015-10-05 | string.c: optimize rb_str_resurrect | nobu
* string.c (rb_str_resurrect): optimize by short circuit to copy hidden string without checking length, encoding and so on. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@52048 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2015-10-01 | * string.c (rb_sym_proc_call): constify | nobu
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@51994 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2015-09-30 | proc.c: include symbol name | nobu
* proc.c (proc_to_s): include the original symbol name in string form. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@51988 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2015-09-29 | compile.c: fix performance of strconcat | nobu
* compile.c (compile_dstr_fragments): fix performance by omitting the first empty string only for keeping literal encoding if other literals are too. [ruby-core:70930] [Bug #11556] * string.c (rb_str_append_literal): append but keep encoding non US-ASCII. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@51970 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2015-09-20 | string.c: separate resetting code range | nobu
* string.c (rb_str_setbyte): separate resetting code range by each code range, and remove unnecessary branches. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@51907 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2015-09-16 | string.c: keep coderange | nobu
* string.c (rb_str_setbyte): keep the code range where possible. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@51873 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
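The code range is an internal cache, but its effect is visible through `ascii_only?` and `valid_encoding?`. The sketch below, with made-up byte values, shows one write that leaves the string 7-bit ASCII and one that makes it invalid, which are the kinds of cases `setbyte` has to distinguish when deciding whether the cached range can be kept.

```ruby
s = "abc".dup

# Overwriting an ASCII byte with another ASCII byte keeps the string 7-bit.
s.setbyte(1, 0x42)       # 0x42 is "B"
p s                      # => "aBc"
p s.ascii_only?          # => true

# Writing a lone high byte breaks UTF-8 validity.
s.setbyte(1, 0xE9)
p s.ascii_only?          # => false
p s.valid_encoding?      # => false
```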
2015-09-15 | encindex.h: ENCINDEX | nobu
* encindex.h: separate encoding index constants from internal.h. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@51861 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2015-08-22 | string.c: move common statement | nobu
* string.c (sym_inspect): move common statement and change variable. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@51666 b2dd03c8-39d4-4d8f-98ff-823fe69b080e