path: root/enc/unicode.c
Age  Commit message  Author
2018-10-16  revert r65091, r65090 because ci fails  (duerst)
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65093 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-10-16  update to Unicode 11.0.0 (basic step, not complete yet)  (duerst)
- common.mk: Change Unicode version to 11.0.0
- enc/unicode/case-folding.rb, enc/unicode.c: Initial changes to deal with Georgian Mtavruli. This should bring us up to the same level as e.g. Python 3.7, by following the Unicode tables exactly. But it will produce undesirable (mixed-case) results for String#capitalize. This will be addressed in a later commit.
- enc/unicode/11.0.0, enc/unicode/11.0.0/casefold.h, enc/unicode/name2ctype.h: Add generated files.
- lib/unicode_normalize/tables.rb: Updated table.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65091 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-10-16  Removed data for old Unicode [ci skip]  (nobu)
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65088 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-10-15  unicode.c: moved additional GCB ranges  (nobu)
* enc/unicode.c: moved additional Grapheme Cluster Break ranges which depend on the Unicode version.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65087 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-10-15  regparse.c: Suppress duplicated range warning by mere \X  (nobu)
* regparse.c (node_extended_grapheme_cluster): as Unicode 10 has added Grapheme_Cluster_Break properties to some characters, remove duplicated ranges for Unicode 9.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65086 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-12-10  Merge Onigmo 6.0.0  (naruse)
* https://github.com/k-takata/Onigmo/blob/Onigmo-6.0.0/HISTORY
* fix for ruby 2.4: https://github.com/k-takata/Onigmo/pull/78
* suppress warning: https://github.com/k-takata/Onigmo/pull/79
* include/ruby/oniguruma.h: include onigmo.h.
* template/encdb.h.tmpl: ignore duplicated definition of EUC-CN in enc/euc_kr.c. It is defined in enc/gb2312.c with CRuby macro.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57045 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-12-04  remove special processing for U+03B9/U+03BC/U+A64B  (duerst)
* enc/unicode.c: Remove special processing for U+03B9/U+03BC/U+A64B (GREEK SMALL LETTERs IOTA/MU, CYRILLIC SMALL LETTER MONOGRAPH UK) from onigenc_unicode_case_map and simplify code.
* enc/unicode/case-folding.rb: Remove check for U+03B9/U+03BC/U+A64B.
This and the previous few related commits make sure that we won't hit the equivalent of bug #12990 anymore for future updates of Unicode versions.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56976 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-12-01  constify CaseMappingSpecials  (nobu)
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56951 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-11-30  fix uppercasing for U+A64B, CYRILLIC SMALL LETTER MONOGRAPH UK  (duerst)
* enc/unicode.c: Add U+A64B to the special cases 03B9 and 03BC at the end of onigenc_unicode_case_map (Bug #12990).
* enc/unicode/case-folding.rb: Add U+A64B to the special cases 03B9 and 03BC. Add a comment pointing to enc/unicode.c. Change warnings to exceptions for unpredicted cases, because this would have been more easily noticed (the warning was not noticed when upgrading to Unicode 9.0.0).
* test/ruby/enc/test_case_comprehensive.rb: Remove temporary exclusion of U+A64B from testing.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56941 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
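The uppercasing behaviour that this entry is concerned with can be observed from Ruby. The following is a minimal sketch, assuming Ruby 2.4 or later (Unicode 9.0.0 data or newer); the helper name `upcase_of` is hypothetical and not part of Ruby or of enc/unicode.c.

```ruby
# Hypothetical helper: upcase a single code point and return the
# code points of the result.
def upcase_of(codepoint)
  [codepoint].pack("U").upcase.codepoints
end

# The three code points named in the commit message.
p upcase_of(0x03B9)  # GREEK SMALL LETTER IOTA
p upcase_of(0x03BC)  # GREEK SMALL LETTER MU
p upcase_of(0xA64B)  # CYRILLIC SMALL LETTER MONOGRAPH UK
```

Each call should yield the single corresponding capital letter (U+0399, U+039C and U+A64A respectively).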
2016-07-24  * regenc.h/c, include/ruby/oniguruma.h, enc/ascii.c, big5.c, cp949.c, emacs_mule.c, euc_jp.c, euc_kr.c, euc_tw.c, gb18030.c, gbk.c, iso_8859_1|2|3|4|5|6|7|8|9|10|11|13|14|15|16.c, koi8_r.c, koi8_u.c, shift_jis.c, unicode.c, us_ascii.c, utf_16|32be|le.c, utf_8.c, windows_1250|51|52|53|54|57.c, windows_31j.c: Remove conditional compilation macro ONIG_CASE_MAPPING. [Feature #12386].  (duerst)
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55740 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-07-17  Move generated headers to unicode data directory  (nobu)
* common.mk, enc/depend (casefold.h, name2ctype.h): move to unicode data directory per version.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55701 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-06-02  * string.c: Raise ArgumentError when invalid string is detected in case mapping methods.  (duerst)
* enc/unicode.c: Check for invalid string and signal with negative length value.
* test/ruby/enc/test_case_mapping.rb: Add tests for above.
* test/ruby/test_m17n_comb.rb: Add a message to clarify test failure.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55253 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
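From the Ruby side, the change described in this entry means that case mapping a string with an invalid byte sequence raises instead of producing output. A minimal sketch, assuming a Ruby version that includes this change; the helper name `case_mapping_rejects?` is hypothetical.

```ruby
# Hypothetical helper: true if String#upcase refuses the string
# with ArgumentError, false if it maps it normally.
def case_mapping_rejects?(str)
  str.upcase
  false
rescue ArgumentError
  true
end

# The byte 0xFF never occurs in well-formed UTF-8.
invalid = "\xFF".dup.force_encoding("UTF-8")

p invalid.valid_encoding?
p case_mapping_rejects?(invalid)
p case_mapping_rejects?("abc")
```

Checking `valid_encoding?` before calling a case mapping method avoids the exception when the input comes from an untrusted source.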
2016-05-25  * enc/unicode.c: Handle DOTLESS_i by hand because it isn't involved in folding.  (duerst)
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55164 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-05-24  * enc/unicode.c: Fix flag error for switch from titlecase to lowercase.  (duerst)
* test/ruby/enc/test_case_mapping.rb: Tests for above error.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55153 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-05-16  * enc/unicode.h: Additional uses of ONIG_CASE_MAPPING compilation switch  (duerst)
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55020 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-05-16  * append newline at EOF.  (svn)
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55019 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-05-16  * include/ruby/oniguruma.h: Introducing ONIG_CASE_MAPPING compilation switch  (duerst)
* include/ruby/oniguruma.h, enc/unicode.h: Using ONIG_CASE_MAPPING compilation switch
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55018 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-04-01  * enc/unicode/case-folding.rb, casefold.h: Data generation to implement swapcase functionality for titlecase characters.  (duerst)
Swapcase isn't defined by Unicode, because the purpose/usage of swapcase is unclear anyway. The implementation follows a proposal from Nobu, swapping the case of each component of a titlecase character individually. This means that the titlecase characters have to be decomposed.
* enc/unicode.c: Code using the above data.
* test/ruby/enc/test_case_mapping.rb: Tests for the above.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54469 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
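The per-component swapping described in this entry can be illustrated with U+01F2 (LATIN CAPITAL LETTER D WITH SMALL LETTER Z), a titlecase character whose components are a capital D and a small z. A minimal sketch, assuming a Ruby version that includes this change; the constant name `TITLECASE_DZ` is only illustrative.

```ruby
# U+01F2 is a single titlecase character that looks like "Dz".
TITLECASE_DZ = "\u01F2"

swapped = TITLECASE_DZ.swapcase

# Swapping each component individually turns "D" + "z" into "d" + "Z",
# so the result is expected to be two separate ASCII characters.
p swapped
p swapped.codepoints.map { |c| format("U+%04X", c) }
```

Because the result is built from the decomposed components, the output is two characters long even though the input is a single character.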
2016-03-29  fix a typo [ci skip]  (kazu)
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54400 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-03-29  * enc/unicode/case-folding.rb, casefold.h: Tweaked handling of 6 special cases in CaseUnfold_11_Table.  (duerst)
* enc/unicode.c: Adjustments for above.
* test/ruby/enc/test_case_mapping.rb: Tests for the above: Some tests in test_titlecase activated; test_greek added. A test in test_cherokee fixed.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54383 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-03-29  * enc/unicode.c: Cleaned up some comments.  (duerst)
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54349 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-03-29  * enc/unicode/case-folding.rb, casefold.h: Removing data for idempotent titlecasing.  (duerst)
* enc/unicode.c: Adjust code to data removal.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54347 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-03-28  * enc/unicode.c: Refactoring in preparation for data reduction for titlecase.  (duerst)
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54313 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-03-28  * enc/unicode.c: Minor refactoring for I WITH DOT ABOVE.  (duerst)
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54312 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-03-28  * enc/unicode.c: Removed code now covered by data from table.  (duerst)
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54311 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-03-28  * enc/unicode.c: Adding comments. [ci skip]  (duerst)
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54310 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-03-22  * include/ruby/oniguruma.h: Additional flag for characters that are titlecase.  (duerst)
* enc/unicode/case-folding.rb, casefold.h: Using above flag in data.
* enc/unicode.c: Marking capitalized character as unmodified if it is already titlecase.
* test/ruby/enc/test_case_mapping.rb: Tests for above functionality.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54229 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
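The "unmodified" marking in this entry is visible through `String#capitalize!`, which returns `nil` when it makes no change. A minimal sketch using the DZ digraph family (U+01F1 uppercase, U+01F2 titlecase, U+01F3 lowercase), assuming a Ruby version that includes this change; the constant names are only illustrative.

```ruby
UPPER_DZ = "\u01F1"  # LATIN CAPITAL LETTER DZ
TITLE_DZ = "\u01F2"  # LATIN CAPITAL LETTER D WITH SMALL LETTER Z
LOWER_DZ = "\u01F3"  # LATIN SMALL LETTER DZ

# Capitalizing the uppercase or lowercase form gives the titlecase form.
p UPPER_DZ.capitalize == TITLE_DZ
p LOWER_DZ.capitalize == TITLE_DZ

# A string that is already titlecase is left alone, and the bang
# variant reports that by returning nil.
already_title = TITLE_DZ.dup
p already_title.capitalize!
p already_title == TITLE_DZ
```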
2016-03-17  * enc/unicode.c: Fixed two macro definitions.  (duerst)
* test/ruby/enc/test_case_mapping.rb: Test cases that detected the above bugs.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54140 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-03-15  * enc/unicode.c: Eliminating common code.  (duerst)
(with Kimihito Matsui)
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54118 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-03-15  * enc/unicode.c: Expansion of some code repetition in preparation for elimination of common code pieces.  (duerst)
(with Kimihito Matsui)
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54117 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-03-15  * remove trailing spaces.  (svn)
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54113 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-03-15  * enc/unicode.c: Additional macros and code to use mapping data in CaseMappingSpecials array.  (duerst)
(with Kimihito Matsui)
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54112 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-03-14  * include/ruby/oniguruma.h, enc/unicode.c: Adjusting flag assignments and macros to work with unified CaseMappingSpecials array.  (duerst)
(with Kimihito Matsui)
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54101 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-03-12  unicode.c: off-by-one error  (nobu)
* enc/unicode.c (CodePointListValidP): fix off-by-one error.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54091 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-03-12  unicode.c: boundary check  (nobu)
* enc/unicode.c (CodePointListValidP): add pathological boundary check, for gcc 4.9.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54090 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-03-11  * enc/unicode/case-folding.rb, casefold.h: Streamlining approach to case mapping data not available from case folding by unifying all three cases (special title, special upper, special lower).  (duerst)
* enc/unicode.c: Adjust macro names for above (macros are currently inactive).
(with Kimihito Matsui)
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54085 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-02-24  * include/ruby/oniguruma.h: Rearranging flag assignments and making space for titlecase indices; adding additional macros to add or extract titlecase index; adding comments for better documentation.  (duerst)
* enc/unicode.c: Moving some macros to include/ruby/oniguruma.h; activating use of titlecase indices.
(with Kimihito Matsui)
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53915 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-02-23  * enc/unicode/case-folding.rb, casefold.h: Outputting actual titlecase data (new table, with indices from other tables).  (duerst)
* enc/unicode.c: Ignoring titlecase data indices for the moment.
(with Kimihito Matsui)
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53906 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-02-19  * enc/unicode.c: Activated use of case mapping data in CaseUnfold_11 array.  (duerst)
(with Kimihito Matsui)
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53870 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-02-08  * string.c, enc/unicode.c: Disassociating ONIGENC_CASE_FOLD flag from ONIGENC_CASE_DOWNCASE.  (duerst)
(with Kimihito Matsui)
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53778 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-02-08  unicode.c: magic numbers  (nobu)
* enc/unicode.c (I_WITH_DOT_ABOVE, DOTLESS_i, DOT_ABOVE): name magic numbers.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53776 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
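The three names in this entry refer to the characters involved in Turkic dotted and dotless i handling. The sketch below mirrors the names as Ruby constants and shows how the characters relate under case mapping. The code point values are the standard Unicode ones and are assumed, not confirmed by this log, to be what the C macros hold.

```ruby
I_WITH_DOT_ABOVE = "\u0130"  # LATIN CAPITAL LETTER I WITH DOT ABOVE
DOTLESS_I        = "\u0131"  # LATIN SMALL LETTER DOTLESS I
DOT_ABOVE        = "\u0307"  # COMBINING DOT ABOVE

# With the :turkic option, I/i pair up with the dotless and dotted forms.
p "I".downcase(:turkic) == DOTLESS_I
p "i".upcase(:turkic) == I_WITH_DOT_ABOVE

# Without :turkic, lowercasing capital I with dot above keeps the dot
# as a separate combining character.
p I_WITH_DOT_ABOVE.downcase == "i" + DOT_ABOVE
```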
2016-02-08  * enc/unicode.c: Shortened macros for enc/unicode/casefold.h to single-letter; use flags in casefold.h for logic.  (duerst)
* enc/unicode/case-folding.rb: Added flag for case folding. Changed parameter passing.
* enc/unicode/casefold.h: New flags added.
(with Kimihito Matsui)
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53775 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-02-07  * common.mk: Added two more precondition files for enc/unicode/casefold.h  (duerst)
* enc/unicode.c: Added shortening macros for enc/unicode/casefold.h
* enc/unicode/case-folding.rb: Fixed file encoding for CaseFolding.txt to ASCII-8BIT (should fix some ci errors). Clarified usage. Created class MapItem. Partially implemented class CaseMapping.
(with Kimihito Matsui)
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53767 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-02-06  * test/ruby/enc/test_regex_casefold.rb: Added data-based testing for String#downcase :fold.  (duerst)
* enc/unicode.c: Fixed a range error (lowest non-ASCII character affected by case operations is U+00B5, MICRO SIGN)
* test/ruby/enc/test_case_mapping.rb: Explicit test for case folding of MICRO SIGN to Greek mu.
(with Kimihito Matsui)
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53749 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
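The MICRO SIGN case mentioned in this entry is a convenient boundary example: it is the lowest non-ASCII code point affected by case operations, and folding changes it even though plain lowercasing does not. A minimal sketch, assuming Ruby 2.4 or later; the constant names are only illustrative.

```ruby
MICRO_SIGN     = "\u00B5"  # MICRO SIGN
GREEK_SMALL_MU = "\u03BC"  # GREEK SMALL LETTER MU
GREEK_CAPITAL_MU = "\u039C"  # GREEK CAPITAL LETTER MU

# Case folding maps MICRO SIGN to Greek small mu.
p MICRO_SIGN.downcase(:fold) == GREEK_SMALL_MU

# Plain lowercasing leaves it alone, since it is already lowercase.
p MICRO_SIGN.downcase == MICRO_SIGN

# Uppercasing goes to Greek capital mu.
p MICRO_SIGN.upcase == GREEK_CAPITAL_MU
```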
2016-02-06  * enc/unicode.c, test/ruby/enc/test_case_mapping.rb: Implemented :fold option for String#downcase by using case folding data from regular expression engine, and added a few simple tests.  (duerst)
(with Kimihito Matsui)
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53747 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
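A typical use of the `:fold` option is caseless comparison, where two strings should match even if their case variants differ in length. A minimal sketch, assuming Ruby 2.4 or later; the helper name `same_ignoring_case?` is hypothetical.

```ruby
# Hypothetical helper: compare two strings after Unicode case folding.
def same_ignoring_case?(a, b)
  a.downcase(:fold) == b.downcase(:fold)
end

# U+00DF (LATIN SMALL LETTER SHARP S) folds to "ss".
p "\u00DF".downcase(:fold)
p same_ignoring_case?("Stra\u00DFe", "STRASSE")

# Plain downcase keeps the sharp s, so the same comparison fails.
p "Stra\u00DFe".downcase == "STRASSE".downcase
```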
2016-02-04  * enc/unicode.c: Activated :ascii flag for ASCII-only case conversion  (duerst)
(with Kimihito Matsui)
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53740 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
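The `:ascii` option restricts case conversion to the letters A-Z and a-z and leaves every other character untouched. A minimal sketch, assuming Ruby 2.4 or later; the constant name `MIXED` is only illustrative.

```ruby
# "\u{C4}" is LATIN CAPITAL LETTER A WITH DIAERESIS.
MIXED = "\u{C4}bC"

# Full Unicode mapping lowercases the non-ASCII letter too.
p MIXED.downcase == "\u{E4}bc"

# ASCII-only mapping touches just the "C".
p MIXED.downcase(:ascii) == "\u{C4}bc"
```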
2016-01-27  * enc/unicode.c: Fixed bit mask in macro OnigCodePointCount  (duerst)
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53670 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-01-27  * enc/unicode.c: Protect code point count by macro, in order to be able to use the remaining bits for flags.  (duerst)
(with Kimihito Matsui)
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53669 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
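The idea of sharing one integer between a small count and a set of flags can be sketched as follows. This is an illustration in Ruby rather than the C macro itself; the field width, the flag names and the helper names are all hypothetical and are not taken from enc/unicode.c or oniguruma.h.

```ruby
# Hypothetical layout: the low COUNT_WIDTH bits hold the code point
# count, and every bit above them is available for flags.
COUNT_WIDTH = 3
COUNT_MASK  = (1 << COUNT_WIDTH) - 1

FLAG_A = 1 << COUNT_WIDTH        # hypothetical flag
FLAG_B = 1 << (COUNT_WIDTH + 1)  # hypothetical flag

# Extract the count; without the mask, any flag bit would inflate it.
def code_point_count(n)
  n & COUNT_MASK
end

# Extract the flags by clearing the count bits.
def code_point_flags(n)
  n & ~COUNT_MASK
end

packed = 2 | FLAG_A | FLAG_B
p code_point_count(packed)
p code_point_flags(packed) == (FLAG_A | FLAG_B)
```

Routing every read of the count through one accessor is what lets the remaining bits be reused for flags without touching each call site.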
2016-01-17  * enc/unicode.c: Fixed a logical error and some comments.  (duerst)
* test/ruby/enc/test_case_mapping.rb: Made tests more general.
(with Kimihito Matsui)
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53564 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-01-17  get rid of non-ascii chars  (nobu)
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53563 b2dd03c8-39d4-4d8f-98ff-823fe69b080e