path: root/string.c
Age | Commit message | Author
2017-03-15 | string.c: shortcut argument check | nobu
* string.c (str_casecmp, str_casecmp_p): split to skip argument check when it is a String certainly. * string.c (sym_casecmp, sym_casecmp_p): shortcut argument checks. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57978 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-03-14 | string.c: use rb_check_string_type | nobu
* string.c (rb_str_cmp_m): use rb_check_string_type for check and conversion, instead of calling the conversion method directly. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57965 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-03-13 | docs for Symbol#casecmp and Symbol#casecmp? | stomar
* string.c: [DOC] improve docs of Symbol#casecmp and Symbol#casecmp? according to the similar String methods; fix RDoc markup and typos; fix call-seq's for Symbol#{upcase,downcase,capitalize,swapcase}. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57963 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-03-13 | string.c (rb_str_set_len): pathological check | nobu
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57961 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-03-13 | string.c: $; is a GC-root | nobu
* string.c (Init_String): $; must be a GC-root, not to be collected. [ruby-core:79582] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57958 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-03-11 | docs for String#casecmp and String#casecmp? | stomar
* string.c: [DOC] specify when String#casecmp and String#casecmp? return nil; modify examples to better show difference to <=>; fix RDoc markup and typos. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57886 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
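The behaviour these docs describe can be sketched as follows. The sample strings are illustrative and not taken from the commit itself; the `nil` cases follow the documented rule that `casecmp` and `casecmp?` return `nil` for a non-String argument or for incompatible encodings.

```ruby
# casecmp ignores ASCII case, whereas <=> compares bytes as they are.
p "aBcDeF" <=> "abcdef"         # -1 ("B" sorts before "b")
p "aBcDeF".casecmp("abcdef")    # 0
p "aBcDeF".casecmp?("abcdef")   # true

# A non-String argument yields nil rather than an exception.
p "foo".casecmp(1)              # nil
p "foo".casecmp?(1)             # nil

# Incompatible encodings also yield nil.
latin1 = "\u{e4 f6 fc}".encode("ISO-8859-1")
p latin1.casecmp("\u{c4 d6 dc}")   # nil
```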
2017-03-08 | string.c (str_uminus): update doc for deduplication | normal
As of r57698, String#-@ can return pre-existing strings. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57813 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-03-08 | fix paren | nobu
* string.c (str_byte_substr): fix misplaced parenthesis at r56155. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57809 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-03-07 | string.c: [DOC] Fix a typo in String#dump | kazu
[Fix GH-1531][ci skip] Author: Alex Semyonov <alex@semyonov.us> git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57802 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-03-07 | string.c: negation of LONG_MIN | nobu
* string.c (rb_str_update): do not use negation of LONG_MIN, which is negative too. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57800 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-03-07 | string.c: fix integer overflow | nobu
* string.c (str_byte_substr): fix another integer overflow which can happen only when SHARABLE_MIDDLE_SUBSTRING is enabled. [ruby-core:79951] [Bug #13289] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57799 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-03-07 | string.c: fix integer overflow | nobu
* string.c (rb_str_subpos): fix integer overflow which can happen only when SHARABLE_MIDDLE_SUBSTRING is enabled. incorporate https://github.com/mruby/mruby/commit/7db0786abdd243ba031e24683f git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57797 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-03-04 | string.c: [DOC] fix doc formatting for String#==, #=== | stomar
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57778 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-03-02 | string.c: restore documentation for String#<< | stomar
* string.c: [DOC] restore documentation for String#<< which became undocumented with r56021; fix a typo. [ruby-core:79865] [Bug #13268] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57758 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-02-24 | string.c (str_uminus): deduplicate strings | normal
This exposes the rb_fstring internal function to return a deduped and frozen string when a non-frozen string is given. This is useful for all sorts of record processing where arbitrary keys and values may be stored but certain keys and values are duplicated at a high frequency, so the memory savings can be noticeable.

Use cases are many:

* email/NNTP header processing: there are some standard header keys everybody uses (From/To/Cc/Date/Subject/Received/Message-ID/References/In-Reply-To), as well as common ones specific to certain lists (ruby-core has X-Redmine-* headers). It is also useful to dedupe values, as most inboxes have multiple messages from the same sender or MUA.
* package management systems: things like RubyGems store identical strings for licenses, dependency names, author names/emails, etc.
* HTTP headers/trailers: standard headers (Host/Accept/Accept-Encoding/User-Agent/...) are common, but there are also uncommon ones. Values may be deduped as well, as it is likely a user agent will make multiple/parallel requests to the same server.
* version control systems: this can be useful for deduplicating names of frequent committers (like "nobu" :). In linux.git and git.git, there are also common trailers such as Signed-Off-By/Acked-by/Reviewed-by/Fixes/... as well as less common ones.
* audio metadata: there are commonly used tags (Artist/Album/Title/Tracknumber), but Vorbis comments allow arbitrary key values to be stored. Music collections contain songs by the same artist or multiple songs from the same album, so deduplicating values will be helpful there, too.
* JSON, YAML, XML, HTML processing: certain fields, tags and attributes are commonly used across the same and multiple documents.

There is no security concern in this being a DoS vector by causing immortal strings. The fstring table is not a GC-root and not walked during the mark phase. GC-able dynamic symbols since Ruby 2.2 are handled in the same manner, and that implementation also relies on the non-immortality of fstrings.

[Feature #13077] [ruby-core:79663] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57698 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
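A minimal sketch of the deduplication described above, assuming a Ruby version that includes this change. The helper name `interned` and the header value are hypothetical, chosen only for illustration.

```ruby
# Hypothetical helper: String#-@ returns a frozen, deduplicated string.
def interned(str)
  -str
end

# Two distinct, unfrozen strings with identical content, built at run time.
a = "Content-" + "Type"
b = "Content-" + "Type"

p a.equal?(b)                       # false: separate objects
p interned(a).frozen?               # true
p interned(a).equal?(interned(b))   # true: both map to one shared object
```

Because the fstring table is not a GC-root, a deduplicated string can still be collected once nothing references it.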
2017-02-14 | string.c: assertion | nobu
* string.c (str_shared_replace): use RUBY_ASSERT for pre-condition. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57628 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-02-14 | initialize variables | nobu
* string.c (rb_str_enumerate_lines): initialize conditionally used variable. * thread.c (rb_fd_no_init): ditto. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57625 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-02-13 | suppress warnings | nobu
* string.c (rb_str_enumerate_lines): hint to suppress a maybe-uninitialized warning by gcc. * thread.c (rb_fd_no_init): ditto. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57618 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-02-05 | doc: Add example for Symbol#to_s | normal
* string.c: add example for Symbol#to_s. The docs for Symbol#to_s only include an example for Symbol#id2name, but not for #to_s which is an alias; the docs should include examples for both methods. From: Marcus Stollsteimer <sto.mar@web.de> git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57536 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-02-03 | symbol.c (rb_id2str): eliminate branch to set class | normal
Since the fstring table encompasses all strings in the symbol table, we may reuse the fstring table walk to set the class and eliminate the branch in rb_id2str. * string.c (Init_String): use rb_cString immediately after definition * symbol.c (rb_id2str): eliminate branch to set class git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57521 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-01-30 | string.c (rb_str_tmp_frozen_release): release embedded strings | normal
Handle the embedded case first, since we may have an embedded duplicate and non-embedded original string. * string.c (rb_str_tmp_frozen_release): handled embedded strings * test/ruby/test_io.rb (test_write_no_garbage): new test [ruby-core:78898] [Bug #13085] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57471 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-01-30 | io.c: recycle garbage on write | normal
* string.c (STR_IS_SHARED_M): new flag to mark shared multiple times
  (STR_SET_SHARED): set STR_IS_SHARED_M
  (rb_str_tmp_frozen_acquire, rb_str_tmp_frozen_release): new functions
  (str_new_frozen): set/unset STR_IS_SHARED_M as appropriate
* internal.h: declare new functions
* io.c (fwrite_arg, fwrite_do, fwrite_end): new
  (io_fwrite): use new functions

Introduce rb_str_tmp_frozen_acquire and rb_str_tmp_frozen_release to manage a hidden, frozen string. Reuse one bit of the embed length for shared strings as STR_IS_SHARED_M to indicate a string has been shared multiple times. In the common case, the string is only shared once so the object slot can be reclaimed immediately.

minimum results in each 3 measurements. (time and size)

Execution time (sec)
name                          trunk   built
io_copy_stream_write          0.682   0.254
io_copy_stream_write_socket   1.225   0.751

Speedup ratio: compare with the result of `trunk' (greater is better)
name                          built
io_copy_stream_write          2.680
io_copy_stream_write_socket   1.630

Memory usage (last size) (B)
name                          trunk           built
io_copy_stream_write          95436800.000    6512640.000
io_copy_stream_write_socket   117628928.000   7127040.000

Memory consuming ratio (size) with the result of `trunk' (greater is better)
name                          built
io_copy_stream_write          14.654
io_copy_stream_write_socket   16.505

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57469 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-01-19 | string.c: rindex(//) should set $~. | shugo
This seems to be a bug introduced by r520 (1.4.0). [ruby-core:79110] [Bug #13135] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57374 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
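The intended behaviour can be checked with a small sketch, assuming a Ruby that includes this fix. The helper name `last_match_after_rindex` is hypothetical; it wraps the call in a method so that `$~` starts out as `nil` in a fresh frame.

```ruby
# Hypothetical helper: run String#rindex and report the MatchData it leaves
# behind. $~ is frame-local, so it is nil on entry to this method.
def last_match_after_rindex(str, pattern)
  str.rindex(pattern)    # return value (the index) is ignored here
  Regexp.last_match      # same object as $~ in this frame
end

m = last_match_after_rindex("hello world", /o/)
p m[0]          # "o"
p m.begin(0)    # 7, the start of the last "o"

# The empty pattern from the bug report should now also set $~.
p last_match_after_rindex("foo", //).nil?   # false
```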
2017-01-16 | file.c: refine message | nobu
* file.c (rb_get_path_check_convert): refine the error message when the path name contains null byte. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57336 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-01-11 | string.c: replacement and block | nobu
* string.c (rb_enc_str_scrub): only one of replacement and block is allowed. [ruby-core:79038] [Bug #13119] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57304 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
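For context, `String#scrub` accepts either a replacement string or a block, and this change rejects supplying both at once. The two valid forms can be sketched as below; the input string and the `<..>` formatting are illustrative choices, not taken from the commit.

```ruby
broken = "abc\xFFdef"            # UTF-8 string containing one invalid byte

# Form 1: an explicit replacement string.
p broken.scrub("?")              # "abc?def"

# Form 2: a block that receives the invalid bytes and returns a replacement.
p broken.scrub { |bytes| "<#{bytes.unpack1('H*')}>" }   # "abc<ff>def"

# With neither, a Unicode string gets U+FFFD as the replacement.
p broken.scrub == "abc\uFFFDdef" # true
```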
2017-01-11 | string.c: yield invalid part | nobu
* string.c (rb_enc_str_scrub): yield the invalid part only with ASCII-incompatible. [ruby-core:79039] [Bug #13120] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57303 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-01-11 | string.c: block for scrub with ASCII-incompatible | nobu
* string.c (rb_enc_str_scrub): honor the given block with ASCII-incompatible encoding. [ruby-core:79039] [Bug #13120] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57302 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-12-25 | string.c: CRLF in paragraph mode | nobu
* string.c (rb_str_enumerate_lines): allow CRLF to separate paragraphs. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57185 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-12-25 | string.c: consistent paragraph mode with IO | nobu
* string.c (rb_str_enumerate_lines): in paragraph mode, do not include newlines which separate paragraphs, so that it will be consistent with IO#each_line. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57184 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-12-22 | string.c: suppress a warning | nobu
* string.c (rb_str_casecmp_p): [DOC] use Unicode escape form to get rid of warning C4819 by Microsoft Visual C++. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57154 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-12-20 | string.c: add missing size_t cast | rhe
Add size_t cast to avoid signed integer overflow. r56157 ("string.c: avoid signed integer overflow", 2016-09-13) missed this. Suppresses UBSan. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57122 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-12-16 | no crypt.h on FreeBSD 12 | nobu
* string.c (crypt.h): crypt_r() was added in FreeBSD 12.0 but is declared in unistd.h. [ruby-core:78664] [Bug #13038] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57091 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-12-16 | fix chomping newline only line | nobu
* string.c (chomp_newline): fix chomping newline only line. rb_enc_prev_char returns NULL if there is no previous character, and rb_enc_ascget must not be called on it. a patch by Ary Borenszweig <asterite AT gmail.com> at [ruby-core:78666]. [Bug #13037] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57088 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-12-12 | string.c: fix method name in rdoc [ci skip] | nobu
* string.c (rb_str_equal): [DOC] fix fallback method name. the peer's == method will be used, not ===. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57056 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-12-12 | String#match? and Symbol#match? | nobu
* string.c (rb_str_match_m_p): inverse of Regexp#match?. based on the patch by Herwin Weststrate <herwin@snt.utwente.nl>. [Fix GH-1483] [Feature #12898] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57053 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
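A short sketch of the new predicate, assuming a Ruby that includes this feature. Like `Regexp#match?`, it returns a plain boolean and does not update `$~`. The helper name `match_and_backref` is hypothetical and only exists to observe `$~` in a fresh frame.

```ruby
# Hypothetical helper: returns the result of match? together with whatever
# Regexp.last_match ($~) holds afterwards in this frame.
def match_and_backref(receiver, pattern)
  [receiver.match?(pattern), Regexp.last_match]
end

p match_and_backref("Ruby", /R../)    # [true, nil]
p match_and_backref(:ruby, /x/)       # [false, nil]

# An optional second argument gives the position at which to start searching.
p "Ruby".match?(/R/, 1)               # false
```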
2016-12-03 | string.c: chomp option | nobu
* string.c (rb_str_enumerate_lines): implement chomp option. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56972 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
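As a rough illustration, assuming the option is spelled `chomp: true` as in released Ruby versions, line enumeration can drop the trailing separator from each yielded line. The sample text is arbitrary.

```ruby
text = "alpha\nbeta\n"

p text.each_line.to_a                 # ["alpha\n", "beta\n"]
p text.each_line(chomp: true).to_a    # ["alpha", "beta"]
p text.lines(chomp: true)             # ["alpha", "beta"]

# A line consisting only of a newline chomps to an empty string.
p "\n".lines(chomp: true)             # [""]
```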
2016-11-29 | Fix/improve documentation of String/Symbol#casecmp[?] | duerst
Fix documentation of String#casecmp? (examples didn't have the '?'). Add an example with non-ASCII characters. Clarify that casecmp, unlike casecmp?, only does case-insensitivity on A-Z/a-z. [ci skip] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56926 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-11-29 | string.c: use xmalloc | nobu
* string.c (rb_str_casemap): use xmalloc simply instead of ALLOC_N. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56920 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-11-28 | string.c: fix zero-length array | nobu
* string.c (mapping_buffer): get rid of zero-length array member, which is not a part of C90. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56915 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-11-28 | string.c: enable rdoc | nobu
* string.c (rb_str_casecmp_p): [DOC] move forward declaration of rb_str_downcase to enable rdoc. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56913 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-11-28 | implement String/Symbol#casecmp? including Unicode case folding | duerst
* string.c: Implement String#casecmp? and Symbol#casecmp? by using String#downcase :fold for Unicode case folding. This does not include options such as :turkic, because these currently cannot be combined with the :fold option. This implements feature #12786. * test/ruby/test_string.rb/test_symbol.rb: Tests for above. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56912 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
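The difference between the ASCII-only `casecmp` and the Unicode-folding `casecmp?` can be sketched as follows. The strings are written with `\u` escapes so the example does not depend on the source file's encoding.

```ruby
lower = "\u00E4\u00F6\u00FC"    # "äöü"
upper = "\u00C4\u00D6\u00DC"    # "ÄÖÜ"

# casecmp folds only A-Z/a-z, so these non-ASCII letters compare as different.
p lower.casecmp(upper)          # 1

# casecmp? applies Unicode case folding, equivalent to downcase(:fold).
p lower.casecmp?(upper)         # true
p lower.downcase(:fold) == upper.downcase(:fold)   # true

# Symbol#casecmp? behaves the same way.
p lower.to_sym.casecmp?(upper.to_sym)              # true
```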
2016-11-05 | chomp option | nobu
* io.c (extract_getline_opts): extract chomp option. [Feature #12553] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56581 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-10-26 | [DOC] replace Fixnum with Integer [ci skip] | nobu
* numeric.c: [DOC] update document for Integer class. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56492 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-10-21 | Fixed typo [ci skip] | nobu
* string.c (rb_str_sub, rb_str_gsub): [DOC] 'backlash' should read 'backslash'. [Fix GH-1461] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56460 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-10-04 | * internal.h (ST2FIX): new macro to convert st_index_t to Fixnum. | usa
A hash value of an Object might be a Bignum, but that causes many troubles, especially when the Object is used as a key of a hash, so I gave up on doing so. * array.c (rb_ary_hash): use above macro. * bignum.c (rb_big_hash): ditto. * hash.c (rb_obj_hash, rb_hash_hash): ditto. * numeric.c (rb_dbl_hash): ditto. * proc.c (proc_hash): ditto. * re.c (rb_reg_hash, match_hash): ditto. * string.c (rb_str_hash_m): ditto. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56340 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-10-01 | string.c: negative hash | nobu
* string.c (rb_str_hash_m): hash values may be negative. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56321 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-10-01 | * string.c (rb_str_hash_m): st_index_t is not guaranteed as the same | usa
size with int, and of course also not guaranteed the value can be Fixnum. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56320 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-09-26 | string.c: fast path of lstrip_offset | nobu
* string.c (lstrip_offset): add a fast path in the case of single byte optimizable strings, as well as rstrip_offset. [ruby-core:77392] [Feature #12788] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56250 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-09-26 | string.c: fix integer overflow in enc_strlen() and rb_enc_strlen_cr() | rhe
* string.c (enc_strlen, rb_enc_strlen_cr): Avoid signed integer overflow. The result type of a pointer subtraction may have the same size as long. This fixes String#size returning a negative value on i686-linux environment:

    str = "\x00" * ((1<<31)-2)
    str.slice!(-3, 3)
    str.force_encoding("UTF-32BE")
    str << 1234
    p str.size

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56247 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-09-16 | * internal.h (WARN_UNUSED_RESULT): moved to configure.in, to | shyouhei
actually check its availability rather than checking GCC's version.
* configure.in (WARN_UNUSED_RESULT): moved to here.
* configure.in (RUBY_FUNC_ATTRIBUTE): change function declaration to return int rather than void, because it makes no sense for a warn_unused_result attributed function to return void. Funny thing however is that it also makes no sense for noreturn attributed function to return int. So there is a fundamental conflict between them. While I tested this, I confirmed both GCC 6 and Clang 3.8 prefer int over void to correctly detect necessary attributes under this setup. Maybe subject to change in future.
* internal.h (UNINITIALIZED_VAR): renamed to MAYBE_UNUSED, then moved to configure.in for the same reason we move WARN_UNUSED_RESULT.
* configure.in (MAYBE_UNUSED): moved to here.
* internal.h (__has_attribute): deleted, because it has no use now.
* string.c (rb_str_enumerate_lines): refactor macro rename.
* string.c (rb_str_enumerate_bytes): ditto.
* string.c (rb_str_enumerate_chars): ditto.
* string.c (rb_str_enumerate_codepoints): ditto.
* thread.c (do_select): ditto.
* vm_backtrace.c (rb_debug_inspector_open): ditto.
* vsnprintf.c (BSD_vfprintf): ditto.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56169 b2dd03c8-39d4-4d8f-98ff-823fe69b080e