ruby.git - The Ruby Programming Language

Age	Commit message (Collapse)	Author
2007-12-09	* re.c (match_backref_number): new function for converting a backref	akr
	name/number to an integer. (match_offset): use match_backref_number. (match_begin): ditto. (match_end): ditto. (name_to_backref_number): raise IndexError instead of RuntimeError. (match_inspect): show capture index. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14158 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-09	* re.c (append_utf8): check unicode range.	akr
	git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14154 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-08	* re.c (rb_reg_check_preprocess): new function for validating regexp	akr
	fragment. * parse.y (regexp): invoke reg_fragment_check. (reg_fragment_check): defined. (reg_fragment_check_gen): defined. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14133 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-08	* encoding.c (rb_enc_mbclen): make it never fail.	akr
	(rb_enc_nth): don't check the return value of rb_enc_mbclen. (rb_enc_strlen): ditto. (rb_enc_precise_mbclen): return needmore(1) if e <= p. (rb_enc_get_ascii): new function for extracting ASCII character. * include/ruby/encoding.h (rb_enc_get_ascii): declared. * include/ruby/regex.h (ismbchar): removed. * re.c (rb_reg_expr_str): use rb_enc_get_ascii. (unescape_escaped_nonascii): use rb_enc_precise_mbclen to determine the termination of escaped non-ASCII character. (unescape_nonascii): use rb_enc_precise_mbclen. (rb_reg_quote): use rb_enc_get_ascii. (rb_reg_regsub): use rb_enc_get_ascii. * string.c (rb_str_reverse) don't check the return value of rb_enc_mbclen. (rb_str_split_m): don't call rb_enc_mbclen with e <= p. * parse.y (is_identchar): use ISASCII. (parser_ismbchar): removed. (parser_precise_mbclen): new macro. (parser_isascii): new macro. (parser_tokadd_mbchar): use parser_precise_mbclen to check invalid character precisely. (parser_tokadd_string): use parser_isascii. (parser_yylex): ditto. (is_special_global_name): don't call is_identchar with e <= p. (rb_enc_symname_p): ditto. [ruby-dev:32455] * ext/tk/sample/tkextlib/vu/canvSticker2.rb: remove coding cookie because the encoding is not UTF-8. [ruby-dev:32475] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14131 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-02	fix Regexp#inspect document.	akr
	git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14088 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-02	document MatchData#inspect.	akr
	git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14087 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-02	* re.c (unescape_escaped_nonascii): fix mbclen argument.	akr
	git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14084 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-02	s/unicode/Unicode/ in error messages.	akr
	git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14078 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-12-01	* include/ruby/intern.h (rb_uv_to_utf8): declared.	akr
	* re.c (rb_reg_preprocess): new function for dynamic regexp with \u{} such as Regexp.new("\\u{6666}"). (rb_reg_prepare_re): preprocess regexp for recompiling. (read_escaped_byte): new function. (unescape_escaped_nonascii): new function. (append_utf8): new function. (unescape_unicode_list): new function. (unescape_unicode_bmp): new function. (unescape_nonascii): new function. (rb_reg_initialize): preprocess regexp. * pack.c (rb_uv_to_utf8): renamed from uv_to_utf8. * parse.y (STR_NEW3): take func instead of has8 and hasmb. (parser_str_new): use default coderange mechanism except for regexp. (parser_tokadd_utf8): copy regexp source as-is. (parser_read_escape): UTF-8 stuff removed. (parser_tokadd_escape): has8bit and hasmb removed. (parser_tokadd_string): fix 8-bit single byte character with \u. (parser_parse_string): has8bit and hasmb removed. (parser_here_document): has8bit and hasmb removed. (parser_yylex): call parser_tokadd_utf8 instead of read_escape for UTF-8 character. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14072 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-11-27	* include/ruby/encoding.h, encoding.c, re.c, string.c, parse.y:	akr
	rename ENC_CODERANGE_SINGLE to ENC_CODERANGE_7BIT. rename ENC_CODERANGE_MULTI to ENC_CODERANGE_8BIT. Because single byte 8bit character, such as Shift_JIS 1byte katakana, is represented by ENC_CODERANGE_MULTI even if it is not multi byte. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14027 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-11-26	* re.c (Init_Regexp): new method Regexp#fixed_encoding?	akr
	git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14021 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-11-26	* re.c (rb_reg_fixed_encoding_p): extracted from rb_reg_prepare_re and	akr
	rb_reg_s_union. (rb_reg_s_union): refactored. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14018 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-11-25	* include/ruby/encoding.h (rb_enc_str_asciionly_p): declared.	akr
	(rb_enc_str_asciicompat_p): defined. * re.c (rb_reg_initialize_str): use rb_enc_str_asciionly_p. (rb_reg_quote): return ascii-8bit string if the argument is ascii-only to generate encoding generic regexp if possible. (rb_reg_s_union): fix encoding handling. [ruby-dev:32094] * string.c (rb_enc_str_asciionly_p): defined. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14013 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-11-23	* re.c (REG_CASESTATE): unused macro removed.	akr
	(rb_reg_prepare_re): check encoding difference. (rb_reg_initialize): check 8bit byte. * parse.y (parser_tokadd_escape): fix has8bit. [ruby-dev:32113] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14002 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-11-23	* re.c (match_begin): should return offset by character.	matz
	[ruby-dev:32331] * re.c (match_end): ditto. * re.c (rb_reg_search): ditto. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13999 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-11-04	* re.c (rb_reg_quote): quote \v as well.	akr
	git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13818 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-11-04	* re.c (rb_reg_initialize_m): use StringValuePtr instead of	akr
	StringValueCStr because \0 exists when Regexp.new("\0"). git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13817 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-10-19	* parse.y (parser_regx_options, reg_compile_gen): relaxened encoding	nobu
	matching rule. * re.c (rb_reg_initialize): always set encoding of Regexp. * re.c (rb_reg_initialize_str): fix enconding for non 7bit-clean strings. * re.c (rb_reg_initialize_m): use ascii encoding for 'n' option. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13743 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-10-17	* re.c (rb_reg_s_union): the last check was not complete.	matz
	git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13733 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-10-17	* encoding.c (rb_enc_from_encoding, rb_enc_register): associate index	nobu
	to self. * encoding.c (enc_capable): Encoding objects are encoding capable. * re.c (rb_reg_s_union): check if encoding matching by exact encoding objects. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13732 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-10-16	* re.c (rb_reg_desc): set encoding.	nobu
	* re.c (rb_reg_s_union): check encodings. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13728 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-10-16	* re.c (rb_reg_initialize_m): allow binary encoding option.	nobu
	[ruby-dev:32083] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13725 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-10-16	* re.c (rb_reg_s_union): check for encoding of original object.	nobu
	git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13723 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-10-16	* parse.y (parser_regx_options): check if regexp encoding option	nobu
	matches to current encoding. * re.c (char_to_option, rb_char_to_option_kcode): 'n' is not kcode option now. * re.c (rb_reg_to_s, rb_reg_error_desc): copy encoding rather than append as an option. * re.c (make_regexp, rb_reg_prepare_re): use encoding of Regexp and String instead of kcode. * re.c (rb_reg_initialize): set fixed option if none is set. * re.c (rb_reg_regcomp): ditto. * re.c (rb_reg_equal): check if encodings are equal. * re.c (rb_reg_initialize_m): encoding option is obsolete. * re.c (rb_kcode, rb_get_kcode, rb_set_kcode): removed. * re.c (Init_Regexp): removed Regexp#kcode method. * ruby.c (proc_options): allow long encoding name. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13717 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-10-16	* re.c (rb_reg_s_union): encoding of all regexp objects should	matz
	match. [ruby-dev:32076] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13716 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-10-12	* re.c (match_values_at): make #select to be alias to #values_at	matz
	to adapt RDoc description. [ruby-core:12588] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13683 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-10-10	* re.c (rb_reg_s_quote): no longer takes optional second argument	matz
	that has never been documented. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13671 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-10-08	fix rdoc position of Regexp.union.	akr
	git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13658 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-10-06	add an example for Regexp.union document.	akr
	git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13642 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-10-06	* insns.def (opt_eq): get rid of gcc bug.	nobu
	git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13641 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-10-05	* include/ruby/defines.h: no longer provide DEFAULT_KCODE.	matz
	git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13640 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-10-05	* re.c (rb_reg_s_union_m): Regexp.union accepts single argument which	akr
	is an array of patterns. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13638 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-10-04	replace rb_memcicmp()	matz
	git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13624 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-10-04	revert rb_memcmp() change to pacify GCC optimizer	matz
	git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13623 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-10-04	* re.c (rb_memcmp): no longer useful without ruby_ignorecase.	matz
	* re.c (rb_reg_prepare_re): revert recompile condition. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13622 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-10-04	* re.c (kcode_setter): restore erroneously removed setter.	matz
	git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13621 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-10-04	* re.c (ignorecase_setter): change warning message.	matz
	* re.c (ignorecase_getter): now gives warning. * string.c (rb_str_cmp_m): update RDoc document. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13620 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-10-04	* re.c (Init_Regexp): remove obsolete const alias: MatchingData.	matz
	git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13618 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-10-04	* re.c (kcode_setter): Perl-ish global variable `$=' no longer	matz
	effective. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13616 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-10-04	* encoding.c (rb_obj_encoding): returns encoding of the given object.	nobu
	* re.c (Init_Regexp): new method Regexp#encoding. * string.c (str_encoding): moved to encoding.c git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13613 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-09-29	* re.c (Init_Regexp): test DEFAULT_KCODE in C code because	akr
	KCODE_EUC, etc are enum. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13571 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-09-20	* re.c (rb_reg_match_m): evaluate a block if match. it would make	matz
	condition statement much shorter, if no else clause is needed. * string.c (rb_str_match_m): ditto. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13475 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-09-06	* array.c (rb_ary_cycle): typo in rdoc. a patch from Yugui	matz
	<yugui@yugui.sakura.ne.jp>. [ruby-dev:31748] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13348 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-08-29	* string.c (str_gsub): should not use mbclen2() which has broken API.	matz
	* re.c: remove rb_reg_mbclen2(). git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13308 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-08-29	* re.c (rb_reg_mbclen2): suppress a warning.	nobu
	git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13300 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-08-28	* string.c (rb_str_subseq): retrieve substring based on byte offset.	matz
	* string.c (rb_str_rindex_m): was confusing character offset and byte offset. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13295 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-08-25	* parse.y, re.c: re-applied revision 13092.	nobu
	git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13267 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-08-25	* encoding.c: provide basic features for M17N.	matz
	* parse.y: encoding aware parsing. * parse.y (pragma_encoding): encoding specification pragma. * parse.y (rb_intern3): encoding specified symbols. * string.c (rb_str_length): length based on characters. for older behavior, bytesize method added. * string.c (rb_str_index_m): index based on characters. rindex as well. * string.c (succ_char): encoding aware succeeding string. * string.c (rb_str_reverse): reverse based on characters. * string.c (rb_str_inspect): encoding aware string description. * string.c (rb_str_upcase_bang): encoding aware case conversion. downcase, capitalize, swapcase as well. * string.c (rb_str_tr_bang): tr based on characters. delete, squeeze, tr_s, count as well. * string.c (rb_str_split_m): split based on characters. * string.c (rb_str_each_line): encoding aware each_line. * string.c (rb_str_each_char): added. iteration based on characters. * string.c (rb_str_strip_bang): encoding aware whitespace stripping. lstrip, rstrip as well. * string.c (rb_str_justify): encoding aware justifying (ljust, rjust, center). * string.c (str_encoding): get encoding attribute from a string. * re.c (rb_reg_initialize): encoding aware regular expression * sprintf.c (rb_str_format): formatting (i.e. length count) based on characters. * io.c (rb_io_getc): getc to return one-character string. for older behavior, getbyte method added. * ext/stringio/stringio.c (strio_getc): ditto. * io.c (rb_io_ungetc): allow pushing arbitrary string at the current reading point. * ext/stringio/stringio.c (strio_ungetc): ditto. * ext/strscan/strscan.c: encoding support. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13261 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-08-25	* array.c (rb_ary_s_try_convert): more document description.	matz
	* re.c (rb_reg_s_try_convert): typo fixed. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13256 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-08-24	* array.c (rb_ary_s_try_convert): a new class method to convert	matz
	object or nil if it's not target-type. this mechanism is used to convert types in the C implemented methods. * hash.c (rb_hash_s_try_convert): ditto. * io.c (rb_io_s_try_convert): ditto. * re.c (rb_reg_s_try_convert): ditto. * string.c (rb_str_s_try_convert): ditto. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13251 b2dd03c8-39d4-4d8f-98ff-823fe69b080e