path: root/enc/unicode.c
Age | Commit message | Author
2014-05-30 | case-folding.rb: perfect hash for case unfolding1 | nobu
* enc/unicode/case-folding.rb (lookup_hash): make perfect hash to lookup case unfolding table 1.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@46270 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2014-05-30 | case-folding.rb: perfect hash for case folding | nobu
* enc/unicode/case-folding.rb (lookup_hash): make perfect hash to lookup case folding table.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@46269 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2014-05-30 | case-folding.rb: merge tables | nobu
* enc/unicode/case-folding.rb (print_table): merge non-locale and locale tables, and reduce initializing loops.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@46268 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2014-05-23 | enc/unicode.c: lookup functions | nobu
* enc/unicode.c (onigenc_unicode_{fold,unfold{1,2,3}}_lookup): abstract lookup functions.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@46057 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2014-05-23 | enc/unicode.c: constify | nobu
* enc/unicode.c (code{2,3}_{cmp,hash}): constify and adjust argument types.
* enc/unicode.c (onigenc_unicode_fold_lookup): constify.
* enc/unicode.c (onigenc_unicode_apply_all_case_fold): ditto.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@46056 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2012-02-29 | * regparse.c (is_onechar_cclass): optimize character class | naruse
Merge Onigmo 27278c12e6674043cc8affca6507e20e119a86ee.
* regparse.c (is_onechar_cclass): [bug] unexpected match occurs when a char class contains no char
* enc/unicode.c (init_case_fold_table): define the sizes of case folding tables in casefold.h
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@34860 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2012-02-17 | * Merge Onigmo-5.13.1. [ruby-dev:45057] [Feature #5820] | naruse
https://github.com/k-takata/Onigmo
cp reg{comp,enc,error,exec,parse,syntax}.c reg{enc,int,parse}.h
cp oniguruma.h
cp tool/enc-unicode.rb
cp -r enc/
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@34663 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2011-11-20 | * enc/unicode.c (PROPERTY_NAME_MAX_SIZE): +1. | naruse
reported by Ken Takata. [ruby-dev:44894][Bug #5652]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@33797 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2011-05-15 | * remove trailing spaces. | nobu
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@31573 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2010-10-03 | * enc/unicode.c (onigenc_unicode_property_name_to_ctype): | naruse
remove useless assignment.
* vm.c (vm_make_proc_from_block): ditto.
* variable.c (rb_ivar_count): ditto.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@29405 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2010-03-01 | * include/ruby/oniguruma.h: updated to follow Oniguruma 5.9.2. | matz
* re.c (make_regexp): use onig_new() instead of onig_alloc_init().
* re.c (rb_reg_to_s): ditto.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@26791 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-09-10 | * unicode.c (onigenc_unicode_property_name_to_ctype): | naruse
ignore case of properties.
* tool/enc-unicode.rb: downcase properties list.
* enc/unicode/name2ctype.h, enc/unicode/name2ctype.h.blt, enc/unicode/name2ctype.kwd, enc/unicode/name2ctype.src: follow above.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@24836 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-09-08 | * include/ruby/st.h (st_hash_func): use st_index_t. | nobu
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@24792 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-08-26 | * unicode.c (PROPERTY_NAME_MAX_SIZE): use MAX_WORD_LENGTH. | naruse
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@24677 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-08-26 | * enc/unicode.c (onigenc_unicode_mbc_case_fold): balanced braces. | nobu
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@24658 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-08-25 | Update Oniguruma's UnicodeData to 5.1. | naruse
* tool/enc-unicode.rb: added for generate name2ctype.kwd. contributed by Run Paint Run Run [ruby-core:24775]
  use like following:
    ruby19 tool/enc-unicode.rb enc/unicode/UnicodeData.txt \
      enc/unicode/Scripts.txt > enc/unicode/name2ctype.kwd
* enc/unicode.c (CodeRanges): move definitions to name2ctype.h.
* enc/unicode/name2ctype.h.blt, enc/unicode/name2ctype.kwd, enc/unicode/name2ctype.src: updated to v5.1.
* enc/unicode/UnicodeData.txt, enc/unicode/Scripts.txt: added v5.1.
* Makefile.in: add rule to generate name2ctype.kwd from UnicodeData.txt and Scripts.txt.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@24651 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-08-21 | * enc/unicode/name2ctype.h: split from enc/unicode.c and made a | nobu
perfect hash.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@24613 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2009-08-19 | * enc/unicode.c (CodeRanges): initialized statically. | nobu
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@24582 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-09-18 | * grapheme cluster implementation reverted. [ruby-dev:36375] | akr
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19417 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-09-16 | * include/ruby/oniguruma.h (OnigEncodingTypeST): add precise_ret | akr
argument for mbc_to_code.
  (ONIGENC_MBC_TO_CODE): provide NULL for precise_ret.
  (ONIGENC_MBC_PRECISE_CODEPOINT): defined.
* include/ruby/encoding.h (rb_enc_mbc_precise_codepoint): defined.
* regenc.h (onigenc_single_byte_mbc_to_code): precise_ret argument added.
  (onigenc_mbn_mbc_to_code): ditto.
* regenc.c (onigenc_single_byte_mbc_to_code): precise_ret argument added.
  (onigenc_mbn_mbc_to_code): ditto.
* string.c (count_utf8_lead_bytes_with_word): removed.
  (str_utf8_nth): removed.
  (str_utf8_offset): removed.
  (str_strlen): UTF-8 codepoint oriented optimization removed.
  (rb_str_substr): ditto.
  (enc_succ_char): use rb_enc_mbc_precise_codepoint.
  (enc_pred_char): ditto.
  (rb_str_succ): ditto.
* encoding.c (rb_enc_ascget): check length with rb_enc_mbc_precise_codepoint.
  (rb_enc_codepoint): use rb_enc_mbc_precise_codepoint.
* regexec.c (string_cmp_ic): add text_end argument.
  (match_at): check end of character after exact string matches.
* enc/utf_8.c (graphme_table): defined for extended graphme cluster boundary.
  (grapheme_cmp): defined.
  (get_grapheme_properties): defined.
  (grapheme_boundary_p): defined.
  (MAX_BYTES_LENGTH): defined.
  (comb_char_enc_len): defined.
  (mbc_to_code0): extracted from mbc_to_code.
  (mbc_to_code): use mbc_to_code0.
  (left_adjust_combchar_head): defined.
  (utf_8): use a extended graphme cluster as a unit.
* enc/unicode.c (onigenc_unicode_mbc_case_fold): use ONIGENC_MBC_PRECISE_CODEPOINT to extract codepoints.
  (onigenc_unicode_get_case_fold_codes_by_str): ditto.
* enc/euc_jp.c (mbc_to_code): follow mbc_to_code field change. use onigenc_mbn_mbc_to_code.
* enc/shift_jis.c (mbc_to_code): ditto.
* enc/emacs_mule.c (mbc_to_code): ditto.
* enc/gbk.c (gbk_mbc_to_code): follow mbc_to_code field and onigenc_mbn_mbc_to_code change.
* enc/cp949.c (cp949_mbc_to_code): ditto.
* enc/big5.c (big5_mbc_to_code): ditto.
* enc/euc_tw.c (euctw_mbc_to_code): ditto.
* enc/euc_kr.c (euckr_mbc_to_code): ditto.
* enc/gb18030.c (gb18030_mbc_to_code): ditto.
* enc/utf_32be.c (utf32be_mbc_to_code): follow mbc_to_code field change.
* enc/utf_16be.c (utf16be_mbc_to_code): ditto.
* enc/utf_32le.c (utf32le_mbc_to_code): ditto.
* enc/utf_16le.c (utf16le_mbc_to_code): ditto.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19389 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-06-17 | * enc/euc_jp.c (property_name_to_ctype): core dumped when sizeof(int) | mame
differs from sizeof(long).
* enc/shift_jis.c (property_name_to_ctype): ditto.
* enc/unicode.c (onigenc_unicode_property_name_to_ctype): ditto.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@17381 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-19 | * enc/koi8_u.c: added. | naruse
* regenc.c, enc/utf_8.c, enc/unicode.c, enc/gb18030.c: add ARG_UNUSED.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15130 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-08 | * regenc.c (onigenc_strlen_null, onigenc_str_bytelen_null): suppressed | nobu
warnings.
* regenc.h, enc/unicode.c (onigenc_unicode_ctype_code_range): added encoding argument.
* enc/utf{16,32}_{be,le}.c: added init functions.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14946 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2008-01-03 | * include/ruby/oniguruma.h: Oniguruma 1.9.1 merged. | matz
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14874 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-10-10 | * Makefile.in, */Makefile.sub (VPATH): add enc directory. | nobu
* common.mk (ENCOBJS): encoding objects.
* enc: directory for encodings.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13675 b2dd03c8-39d4-4d8f-98ff-823fe69b080e