path: root/string.c
Age (Author) Commit message
2007-12-18 (matz) * string.c (rb_str_splice): propagate encoding.
* string.c (rb_str_subpat_set): ditto. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14295 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-18 (matz) * string.c (str_nth): need not raise out-of-range exception.
* test/ruby/test_m17n.rb (TestM17N::test_str_aref_len): removed debug print. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14287 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-17 (matz) * re.c (rb_reg_initialize): raise error if non-Unicode fixed
encoding option is specified for regexp literals with \u{} escapes. * string.c (rb_str_squeeze_bang): should squeeze multibyte characters as well. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14275 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
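As a rough illustration of the rb_str_squeeze_bang change above, the following Ruby sketch shows String#squeeze collapsing runs of a multibyte character. It assumes the behavior of current Ruby releases and was not checked against r14275; the sample string is made up.

```ruby
# Hypothetical sample data: \u3042 is HIRAGANA LETTER A, \u3044 is HIRAGANA LETTER I.
s = "\u3042\u3042\u3044\u3044\u3044"

squeezed_all = s.squeeze            # collapse every run of repeated characters
squeezed_a   = s.squeeze("\u3042")  # collapse only runs of \u3042

puts squeezed_all.length  # number of characters left after squeezing all runs
puts squeezed_a.length    # runs of \u3044 are left untouched here
```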
2007-12-17 (matz) * string.c (scan_once): need no encoding compatibility check.
it's done inside of rb_reg_search(). * string.c (rb_str_split_m): ditto. * re.c (rb_reg_regsub): ditto. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14269 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-17 (matz) * string.c (rb_str_index): check if substring is broken.
* string.c (rb_str_rindex): ditto. * string.c (rb_str_succ): should carry over. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14268 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
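The "should carry over" note for rb_str_succ refers to String#succ propagating a carry to the left. A minimal sketch, assuming the behavior of current Ruby releases (the inputs are arbitrary examples):

```ruby
wrapped_letter = "az".succ  # rightmost letter wraps around, carry moves one position left
wrapped_digit  = "a9".succ  # a digit wraps to 0 and carries into the letter
grown          = "zz".succ  # carry past the leftmost character adds a new character

puts wrapped_letter
puts wrapped_digit
puts grown
```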
2007-12-17 (akr) * string.c (rb_enc_str_asciionly_p): use rb_enc_str_coderange.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14263 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-17 (akr) * string.c (rb_enc_str_coderange): set ENC_CODERANGE_BROKEN using
rb_enc_precise_mbclen. (rb_str_valid_encoding_p): just check coderange is ENC_CODERANGE_BROKEN or not. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14262 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-17 (akr) * include/ruby/encoding.h (ENC_CODERANGE_VALID): rename from
ENC_CODERANGE_8BIT. * string.c (rb_enc_str_coderange): follow the renaming. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14257 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-17 (matz) * string.c (tr_find): wrong condition fixed.
* sprintf.c (rb_str_format): check encoding based on result, not the format string. * string.c (rb_str_upto): add encoding check. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14256 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-14 (matz) * io.c (rb_f_p): RDoc update. a patch from murphy <murphy AT rubychan.de>.
[ruby-core:14010] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14228 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-14 (matz) * string.c (rb_str_cmp): encoding aware comparison.
* string.c (rb_str_casecmp): ditto. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14227 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-13 (matz) * string.c (str_nth): direct jump if string is 7bit only. great
performance boost for worst case. * string.c (str_strlen): direct size if string is 7bit only. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14221 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-13 (akr) * string.c (rb_str_shared_replace): make str noembed after free.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14215 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-13 (matz) * string.c (rb_str_succ): should not enter infinite loop for
non-ASCII, non-alphanumeric character at the bottom. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14213 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-13 (matz) * string.c (str_gsub): should copy encoding to the result.
* sprintf.c (rb_str_format): ditto. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14212 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-13 (matz) * string.c (rb_str_split_m): need not check encoding if regexp
is empty. * string.c (rb_str_justify): associate encoding of original to the result. * string.c (rb_str_chomp_bang): need to check encoding of record separator. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14211 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-12 (akr) * re.c, regerror.c, string.c, parse.y, ruby.c, file.c:
use capital letter for \xHH notation. [ruby-dev:32511] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14202 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-12 (nobu) * re.c (rb_reg_regsub): should copy encoding.
* string.c (rb_str_sub_bang, str_gsub): should check and copy encoding to be replaced. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14197 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-10 (nobu) * string.c (rb_str_tmp_new): creates hidden temporary buffer.
* transcode.c (transcoding): added a pointer to function to flush. * transcode.c (transcode_loop): do not use string internal. [ruby-dev:32512] * transcode.c (str_transcode): allow Encoding objects. * transcode_data.h (BYTE_LOOKUP): use actual struct name. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14176 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-10 (nobu) * string.c (rb_str_insert): should not add length in bytes to index in
chars. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14174 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-10 (matz) * transcode.c: new file to provide encoding conversion features.
code contributed by Martin Duerst. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14172 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-10 (nobu) * re.c (rb_reg_search): return byte offset. [ruby-dev:32452]
* re.c (rb_reg_match, rb_reg_match2, rb_reg_match_m): convert byte offset to char index. * string.c (rb_str_index): return byte offset. [ruby-dev:32472] * string.c (rb_str_split_m): calculate in byte offset. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14171 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
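The distinction between byte offsets and character indexes in the entry above can be seen from the Ruby side. The sketch below assumes current Ruby behavior, not this exact revision, and uses a made-up string; it converts a character index into the matching byte offset.

```ruby
# Hypothetical sample: three 3-byte UTF-8 characters, a space, then "abc".
s = "\u65E5\u672C\u8A9E abc"

char_index  = s.index("abc")              # position counted in characters
byte_offset = s[0, char_index].bytesize   # same position counted in bytes
found       = s.byteslice(byte_offset, 3) # slice by bytes to recover the match

puts char_index
puts byte_offset
puts found
```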
2007-12-09 (akr) * re.c (rb_reg_expr_str): use \xHH instead of \OOO.
* regerror.c (to_ascii): ditto. (onig_snprintf_with_pattern): ditto. (onig_snprintf_with_pattern): ditto. * string.c (rb_str_inspect): ditto. (rb_str_dump): ditto. * parse.y (parser_yylex): ditto. * ruby.c (proc_options): ditto. * file.c (rb_f_test): ditto. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14164 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-09 (nobu) * string.c (tr_find): returns true if no characters to be removed is
specified. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14151 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-09 (nobu) * string.c (tr_trans): get rid of segfaults when has multibytes but
source sets have no multibytes. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14148 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-08 (akr) * encoding.c (rb_enc_mbclen): make it never fail.
(rb_enc_nth): don't check the return value of rb_enc_mbclen. (rb_enc_strlen): ditto. (rb_enc_precise_mbclen): return needmore(1) if e <= p. (rb_enc_get_ascii): new function for extracting ASCII character. * include/ruby/encoding.h (rb_enc_get_ascii): declared. * include/ruby/regex.h (ismbchar): removed. * re.c (rb_reg_expr_str): use rb_enc_get_ascii. (unescape_escaped_nonascii): use rb_enc_precise_mbclen to determine the termination of escaped non-ASCII character. (unescape_nonascii): use rb_enc_precise_mbclen. (rb_reg_quote): use rb_enc_get_ascii. (rb_reg_regsub): use rb_enc_get_ascii. * string.c (rb_str_reverse) don't check the return value of rb_enc_mbclen. (rb_str_split_m): don't call rb_enc_mbclen with e <= p. * parse.y (is_identchar): use ISASCII. (parser_ismbchar): removed. (parser_precise_mbclen): new macro. (parser_isascii): new macro. (parser_tokadd_mbchar): use parser_precise_mbclen to check invalid character precisely. (parser_tokadd_string): use parser_isascii. (parser_yylex): ditto. (is_special_global_name): don't call is_identchar with e <= p. (rb_enc_symname_p): ditto. [ruby-dev:32455] * ext/tk/sample/tkextlib/vu/canvSticker2.rb: remove coding cookie because the encoding is not UTF-8. [ruby-dev:32475] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14131 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-06 (akr) * encoding.c (rb_enc_precise_mbclen): new function for mbclen with
validation. * include/ruby/encoding.h (rb_enc_precise_mbclen): declared. (MBCLEN_CHARFOUND): new macro. (MBCLEN_INVALID): new macro. (MBCLEN_NEEDMORE): new macro. * include/ruby/oniguruma.h (OnigEncodingTypeST): replace mbc_enc_len by precise_mbc_enc_len. (ONIGENC_PRECISE_MBC_ENC_LEN): new macro. (ONIGENC_CONSTRUCT_MBCLEN_CHARFOUND): new macro. (ONIGENC_CONSTRUCT_MBCLEN_INVALID): new macro. (ONIGENC_CONSTRUCT_MBCLEN_NEEDMORE): new macro. (ONIGENC_MBCLEN_CHARFOUND): new macro. (ONIGENC_MBCLEN_INVALID): new macro. (ONIGENC_MBCLEN_NEEDMORE): new macro. (ONIGENC_MBC_ENC_LEN): use ONIGENC_PRECISE_MBC_ENC_LEN. * enc/euc_jp.c: validation implemented. * enc/sjis.c: ditto. * enc/utf8.c: ditto. * string.c (rb_str_inspect): use rb_enc_precise_mbclen for invalid encoding. (rb_str_valid_encoding_p): new method String#valid_encoding?. * io.c (rb_io_getc): use rb_enc_precise_mbclen. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14119 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
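The String#valid_encoding? method introduced in the entry above can be exercised as follows. This is a minimal sketch that assumes the behavior of current Ruby releases; the byte sequences are illustrative only.

```ruby
# A well-formed UTF-8 string ("caf" followed by U+00E9).
ok = "caf\u00E9"

# A truncated 3-byte UTF-8 sequence: built as binary, then tagged as UTF-8.
broken = "\xE3\x81".b.force_encoding(Encoding::UTF_8)

puts ok.valid_encoding?
puts broken.valid_encoding?
puts broken.inspect   # invalid bytes are shown with \xHH escapes
```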
2007-11-27 (akr) * include/ruby/encoding.h, encoding.c, re.c, string.c, parse.y:
rename ENC_CODERANGE_SINGLE to ENC_CODERANGE_7BIT. rename ENC_CODERANGE_MULTI to ENC_CODERANGE_8BIT. Because single byte 8bit character, such as Shift_JIS 1byte katakana, is represented by ENC_CODERANGE_MULTI even if it is not multi byte. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14027 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-11-25 (akr) * include/ruby/encoding.h (rb_enc_str_asciionly_p): declared.
(rb_enc_str_asciicompat_p): defined. * re.c (rb_reg_initialize_str): use rb_enc_str_asciionly_p. (rb_reg_quote): return ascii-8bit string if the argument is ascii-only to generate encoding generic regexp if possible. (rb_reg_s_union): fix encoding handling. [ruby-dev:32094] * string.c (rb_enc_str_asciionly_p): defined. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14013 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-11-13 (ko1) * include/ruby/ruby.h: introduce 2 macros:
RFLOAT_VALUE(v), DOUBLE2NUM(dbl). Rename RFloat#value -> RFloat#double_value. Do not touch RFloat#double_value directly. * bignum.c, insns.def, marshal.c, math.c, numeric.c, object.c, pack.c, parse.y, process.c, random.c, sprintf.c, string.c, time.c: apply above changes. * ext/dl/mkcallback.rb, ext/json/ext/generator/generator.c: ditto. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13913 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-11-10 (akr) * string.c (tr_trans): cast to unsigned char after dereference
a pointer to a char to avoid SEGV with "\377".tr("a", "b") on FreeBSD/amd64. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13872 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-11-09 (nobu) * string.c (rb_str_squeeze_bang): initialize squeezing table if no
arguments given. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13851 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-11-07 (davidflanagan) * string.c (tr_setup_table, tr_trans): fix test failures in
test/ruby/test_string.rb git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13834 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-11-03 (matz) * string.c (tr_setup_table): use C array for characters that fit
in a byte to gain performance. * string.c (rb_str_delete_bang): ditto. * string.c (rb_str_squeeze_bang): ditto. * string.c (rb_str_count): ditto. * string.c (tr_trans): ditto. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13812 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-10-29 (nobu) * string.c (rb_str_substr): performance improvement. [ruby-dev:31806]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13791 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-10-16 (nobu) * string.c (rb_str_ord): use encoding.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13726 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-10-16 (nobu) * string.c (rb_str_new4): should copy encoding. a patch from NARUSE,
Yui <naruse AT airemix.com>. [ruby-dev:32076] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13714 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-10-13 (nobu) * encoding.c (rb_cEncoding): new Encoding class.
* encoding.c (rb_to_encoding, rb_to_encoding_index): helper functions. * encoding.c (rb_obj_encoding): return Encoding object now. * gc.c (garbage_collect): mark Encoding objects. * string.c (rb_str_force_encoding): accept Encoding object as well as encoding name. * include/ruby/encoding.h (rb_to_encoding_index, rb_to_encoding): prototypes. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13692 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
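The rb_str_force_encoding change above corresponds to String#force_encoding accepting either an encoding name or an Encoding object. A small sketch, assuming the behavior of current Ruby releases (the sample bytes and the bogus encoding name are made up):

```ruby
# Three raw bytes that happen to form one UTF-8 character (U+3042).
raw = "\xE3\x81\x82".b                              # tagged as binary (ASCII-8BIT)

by_name   = raw.dup.force_encoding("UTF-8")         # encoding given by name
by_object = raw.dup.force_encoding(Encoding::UTF_8) # encoding given as an Encoding object

# force_encoding only changes the tag; the bytes themselves are untouched.
same_bytes = (by_name.bytes == raw.bytes)

# An unknown encoding name is rejected.
rejected = begin
  String.new("x").force_encoding("no-such-encoding")
  false
rescue ArgumentError
  true
end

puts raw.length       # counted in bytes for a binary string
puts by_name.length   # counted in characters once tagged as UTF-8
```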
2007-10-10 (nobu) * string.c (rb_enc_str_coderange): fixed check for non-ascii.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13669 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-10-06 (matz) * string.c (rb_str_to_i): update RDoc since base can be any value
between 2 and 36. [ruby-talk:272879] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13645 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
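The base range mentioned in the RDoc update above can be illustrated with String#to_i. This sketch assumes the behavior of current Ruby releases; the inputs are arbitrary examples.

```ruby
binary = "1010".to_i(2)   # base 2
hex    = "ff".to_i(16)    # base 16
base36 = "z".to_i(36)     # base 36 uses digits 0-9 and letters a-z

# A base above 36 is rejected with ArgumentError.
rejected = begin
  "10".to_i(37)
  false
rescue ArgumentError
  true
end

puts binary
puts hex
puts base36
```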
2007-10-06 (nobu) * encoding.c (rb_enc_register): returns new index or -1 if failed.
* encoding.c (rb_enc_alias): check if original name is registered. * encoding.c (rb_enc_init): register in same order as kcode options in re.c. added new aliases. * string.c (rb_str_force_encoding): check if valid encoding name. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13643 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-10-06 (nobu) * insns.def (opt_eq): get rid of gcc bug.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13641 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-10-04 (matz) revert rb_memcmp() change to pacify GCC optimizer
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13623 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-10-04 (matz) * re.c (rb_memcmp): no longer useful without ruby_ignorecase.
* re.c (rb_reg_prepare_re): revert recompile condition. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13622 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-10-04 (matz) * re.c (ignorecase_setter): change warning message.
* re.c (ignorecase_getter): now gives warning. * string.c (rb_str_cmp_m): update RDoc document. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13620 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-10-04 (nobu) * encoding.c (rb_obj_encoding): returns encoding of the given object.
* re.c (Init_Regexp): new method Regexp#encoding. * string.c (str_encoding): moved to encoding.c git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13613 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-09-30 (nobu) * string.c (rb_str_append): always set encoding, and coderange
cache bits. * include/ruby/encoding.h (ENC_CODERANGE_SET): fixed a bug not to set cache bits. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13578 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-09-29 (matz) * array.c (rb_ary_combination): new method to give all combination
of elements from an array. [ruby-list:42671] * array.c (rb_ary_product): a new method to get all combinations of elements from two arrays. can be extended to combinations of n-arrays, e.g. a.product(b,c,d). anyone volunteer? * array.c (rb_ary_permutation): empty function body to calculate permutations of array elements. need volunteer. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13568 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-09-28 (nobu) * encoding.c (rb_enc_alias): allow encodings multiple aliases.
* encoding.c (rb_enc_find_index): search the encoding which has the given name and return its index if found, or -1. * st.c (type_strcasehash): case-insensitive string hash type. * string.c (rb_str_force_encoding): force encoding of self. this name comes from [ruby-dev:31894] by Martin Duerst. [ruby-dev:31744] * include/ruby/encoding.h (rb_enc_find_index, rb_enc_associate_index): prototyped. * include/ruby/encoding.h (rb_enc_isctype): direct interface to ctype. * include/ruby/st.h (st_init_strcasetable): prototyped. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13556 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-09-28 (matz) * string.c (rb_str_comparable): need not check asciicompat here.
* encoding.c (rb_enc_check): ditto. * string.c (rb_enc_str_coderange): tuned a bit; no broken check. * encoding.c (rb_enc_check): new encoding comparison criteria. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13547 b2dd03c8-39d4-4d8f-98ff-823fe69b080e