path: root/test/ruby/enc/test_case_mapping.rb
Age  Author  Commit message
2019-05-17  Nobuyoshi Nakada
Test to disable ASCII-only optimization
Examples of why the ASCII-only optimization cannot be applied to multi-byte encodings which have 7-bit trailing bytes. Suggested by @duerst at https://github.com/ruby/ruby/pull/2187#issuecomment-492949218
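A minimal sketch of the problem this test guards against. It assumes Shift_JIS as the multi-byte encoding and uses the byte pair 0x83 0x41, a single character whose trailing byte coincides with ASCII "A"; the encoding and the bytes are chosen here for illustration and are not taken from the commit.

```ruby
# One Shift_JIS character (two bytes); the trailing byte 0x41 equals ASCII "A".
sjis = [0x83, 0x41].pack("C*").force_encoding(Encoding::Shift_JIS)

# A naive byte-wise ASCII downcase would rewrite 0x41 to 0x61 and
# silently change the character.
naive = sjis.bytes.map { |b| (0x41..0x5A).cover?(b) ? b + 0x20 : b }

# String#downcase has to walk the string character by character instead,
# so the bytes are expected to come back unchanged.
mapped = sjis.downcase

puts "characters:        #{sjis.length}"
puts "downcase intact:   #{mapped.bytes == sjis.bytes}"
puts "naive bytes differ: #{naive != sjis.bytes}"
```

The byte-wise variant is only there to show what a shortcut that ignores character boundaries would do.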
2018-12-10  duerst
add a test to make sure some unassigned codepoints do not get converted
In test/ruby/enc/test_case_mapping.rb, add a test to make sure the unassigned codepoints in the Georgian MTAVRULI range (U+1CBB, U+1CBC) do not get converted to unrelated codepoints by String#capitalize. (It turns out that this test was not strictly necessary, because unassigned codepoints are already excluded by the fact that they are not found in the onigenc_unicode_fold_lookup table. So this test only serves to check against future regressions.)
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66314 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-12-09  duerst
implement special behavior for Georgian for String#capitalize
The modern Georgian script is special in that it has an 'uppercase' variant called MTAVRULI which can be used for emphasis of whole words, for screamy headlines, and so on. However, in contrast to all other bicameral scripts, there is no usage of capitalizing the first letter in a word or a sentence. Words with mixed capitalization are not used at all.
We therefore implement special behavior for String#capitalize. Formally, we define String#capitalize as first applying String#downcase for the whole string, then using titlecase on the first letter. Because Georgian defines titlecase as the identity function both for MTAVRULI ('uppercase') and Mkhedruli (lowercase), this results in String#capitalize being equivalent to String#downcase for Georgian. This avoids undesirable mixed case.
* enc/unicode.c: Actual implementation
* string.c: Add mention of this special case for documentation
* test/ruby/enc/test_case_mapping.rb: Add two tests, a general one that uses String#capitalize on some (including nonsensical) combinations of MTAVRULI and Mkhedruli, and a canary test to detect the potential assignment of characters to the currently open slots (holes) at U+1CBB and U+1CBC.
* test/ruby/enc/test_case_comprehensive.rb: Tweak generation of expectation data.
Together with r65933, this closes issue #14839.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66300 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
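The behavior described in this commit can be sketched as follows. The sample strings are chosen here for illustration, and the sketch assumes a Ruby built with Unicode 11 or later data (Ruby 2.6 and newer), since MTAVRULI was not assigned before that.

```ruby
mtavruli  = "\u{1C90}\u{1C91}\u{1C92}"  # Georgian 'uppercase' letters AN, BAN, GAN
mkhedruli = "\u{10D0}\u{10D1}\u{10D2}"  # the corresponding lowercase letters

# Ordinary bicameral script: downcase everything, then titlecase the first letter.
latin = "hELLO".capitalize

# Georgian: titlecase is the identity, so capitalize behaves like downcase.
georgian = mtavruli.capitalize

# A nonsensical mix of MTAVRULI and Mkhedruli is normalized to lowercase as well.
mixed = "\u{1C90}\u{10D1}\u{1C92}".capitalize

puts latin
puts georgian == mtavruli.downcase
puts mixed == mkhedruli
```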
2016-12-03  duerst
add tests against regressions for upcoming codepoint reordering in unfolding table
* test/ruby/enc/test_case_mapping.rb: Add method test_reorder_unfold to test against problems when reordering codepoints in some entries in CaseUnfold_11_Type CaseUnfold_11_Table.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56968 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-06-06  duerst
* test/ruby/enc/test_case_mapping.rb: Remove :lithuanian guard for Unicode case mapping.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55292 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-06-02  duerst
* string.c: Raise ArgumentError when invalid string is detected in case mapping methods.
* enc/unicode.c: Check for invalid string and signal with negative length value.
* test/ruby/enc/test_case_mapping.rb: Add tests for above.
* test/ruby/test_m17n_comb.rb: Add a message to clarify test failure.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55253 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
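A short sketch of the error behavior described here. The lone byte 0xEB is an arbitrary example of an invalid UTF-8 sequence chosen for illustration; the exact error message is not assumed.

```ruby
# A lone lead byte is not a complete UTF-8 sequence.
invalid = "\xEB".dup.force_encoding(Encoding::UTF_8)

outcome =
  begin
    invalid.upcase
    :no_error
  rescue ArgumentError
    :argument_error
  end

puts invalid.valid_encoding?
puts outcome
```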
2016-05-24  duerst
* enc/unicode.c: Fix flag error for switch from titlecase to lowercase.
* test/ruby/enc/test_case_mapping.rb: Tests for above error.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55153 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-04-01  duerst
* enc/unicode/case-folding.rb, casefold.h: Data generation to implement swapcase functionality for titlecase characters. Swapcase isn't defined by Unicode, because the purpose/usage of swapcase is unclear anyway. The implementation follows a proposal from Nobu, swapping the case of each component of a titlecase character individually. This means that the titlecase characters have to be decomposed.
* enc/unicode.c: Code using the above data.
* test/ruby/enc/test_case_mapping.rb: Tests for the above.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54469 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
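To make the decomposition concrete, here is a small sketch using U+01F2 (LATIN CAPITAL LETTER D WITH SMALL LETTER Z), a titlecase character picked here as an example. The result is printed rather than stated, since it depends on the case mapping data of the Ruby in use.

```ruby
titlecase_dz = "\u{01F2}"  # one codepoint: capital D fused with small z

# Each component has its case swapped individually, so the single
# titlecase codepoint is expected to decompose into two letters.
swapped = titlecase_dz.swapcase

puts swapped.inspect
puts swapped.codepoints.map { |cp| format("U+%04X", cp) }.join(" ")
```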
2016-03-29  duerst
* enc/unicode/case-folding.rb, casefold.h: Tweaked handling of 6 special cases in CaseUnfold_11_Table.
* enc/unicode.c: Adjustments for above.
* test/ruby/enc/test_case_mapping.rb: Tests for the above: Some tests in test_titlecase activated; test_greek added. A test in test_cherokee fixed.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54383 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-03-25  duerst
* test/ruby/enc/test_case_mapping.rb: Additional tests for title case; some not yet activated.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54259 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-03-22  duerst
* include/ruby/oniguruma.h: Additional flag for characters that are titlecase.
* enc/unicode/case-folding.rb, casefold.h: Using above flag in data.
* enc/unicode.c: Marking capitalized character as unmodified if it is already titlecase.
* test/ruby/enc/test_case_mapping.rb: Tests for above functionality.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54229 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
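A sketch of what "unmodified if it is already titlecase" means at the String level, using the DZ digraph family (U+01F1, U+01F2, U+01F3) as an example chosen here. It relies on the bang method String#capitalize!, which returns nil when it made no change.

```ruby
upper = "\u{01F1}"  # DZ, uppercase digraph
title = "\u{01F2}"  # Dz, titlecase digraph
lower = "\u{01F3}"  # dz, lowercase digraph

# Both the uppercase and the lowercase form capitalize to the titlecase form.
from_upper = upper.capitalize
from_lower = lower.capitalize

# A string that is already titlecase should be reported as unchanged.
unchanged = title.dup.capitalize!.nil?

puts from_upper == title
puts from_lower == title
puts unchanged
```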
2016-03-17  duerst
* enc/unicode.c: Fixed two macro definitions.
* test/ruby/enc/test_case_mapping.rb: Test cases that detected the above bugs.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54140 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-03-16  duerst
* test/ruby/enc/test_case_mapping.rb: Fixed and activated a test for Cherokee.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54127 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-03-16  duerst
* test/ruby/enc/test_case_mapping.rb: Fixed a logical error.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54125 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-03-16  duerst
* test/ruby/enc/test_case_mapping.rb: Adding tests for Cherokee. One test not yet working. (with Kimihito Matsui)
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54124 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-03-16  duerst
* test/ruby/enc/test_case_mapping.rb: Adding tests for actual Unicode case mapping. Fixing some aliasing issues. (with Kimihito Matsui)
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54123 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-02-06  duerst
* test/ruby/enc/test_regex_casefold.rb: Added data-based testing for String#downcase :fold.
* enc/unicode.c: Fixed a range error (lowest non-ASCII character affected by case operations is U+00B5, MICRO SIGN)
* test/ruby/enc/test_case_mapping.rb: Explicit test for case folding of MICRO SIGN to Greek mu. (with Kimihito Matsui)
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53749 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
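The difference between plain downcasing and case folding can be sketched with the MICRO SIGN mentioned above. The "Straße" example is an additional illustration chosen here, not part of the commit.

```ruby
micro = "\u{00B5}"  # MICRO SIGN
mu    = "\u{03BC}"  # GREEK SMALL LETTER MU

# MICRO SIGN is already lowercase, so a plain downcase leaves it alone.
plain_micro = micro.downcase

# Case folding maps it to Greek mu so that both compare equal after folding.
folded_micro = micro.downcase(:fold)

# Folding may also expand a character, as with the German sharp s.
folded_sharp_s = "Stra\u{00DF}e".downcase(:fold)

puts plain_micro == micro
puts folded_micro == mu
puts folded_sharp_s
```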
2016-02-06  duerst
* enc/unicode.c, test/ruby/enc/test_case_mapping.rb: Implemented :fold option for String#downcase by using case folding data from regular expression engine, and added a few simple tests. (with Kimihito Matsui)
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53747 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-02-05  duerst
* test/ruby/enc/test_case_mapping.rb: added tests for :ascii option. (with Kimihito Matsui)
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53746 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
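A brief sketch of what the :ascii option does, with a sample string chosen here for illustration: only the letters A-Z and a-z are mapped, and everything else is left untouched.

```ruby
text = "\u{00C4}BC \u{00D6}"  # "ÄBC Ö"

# With :ascii, the umlauted capitals stay as they are.
ascii_only = text.downcase(:ascii)

# Without an option, full Unicode case mapping is applied.
full = text.downcase

puts ascii_only
puts full
```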
2016-01-17  duerst
* string.c: Any kind of option is now taking the new code path for upcase/downcase/capitalize/swapcase. :lithuanian can be used for testing if no specific option is desired.
* test/ruby/enc/test_case_mapping.rb: Adjusted to above. (with Kimihito Matsui)
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53565 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-01-17  duerst
* enc/unicode.c: Fixed a logical error and some comments.
* test/ruby/enc/test_case_mapping.rb: Made tests more general. (with Kimihito Matsui)
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53564 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-01-17  duerst
* enc/unicode.c: Removed artificial expansion for Turkic, added hand-coded support for Turkic, fixed logic for swapcase.
* string.c: Made use of new case mapping code possible from upcase, capitalize, and swapcase (with :lithuanian as a guard).
* test/ruby/enc/test_case_mapping.rb: Adjusted for above. (with Kimihito Matsui)
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53562 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
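The Turkic handling concerns the dotted and dotless i. A minimal sketch, assuming a current Ruby where the :turkic option is available without any guard:

```ruby
dotless_i    = "\u{0131}"  # LATIN SMALL LETTER DOTLESS I
dotted_cap_i = "\u{0130}"  # LATIN CAPITAL LETTER I WITH DOT ABOVE

# Default mapping pairs I with i.
default_down = "I".downcase

# Turkic mapping pairs I with dotless i, and i with dotted capital I.
turkic_down = "I".downcase(:turkic)
turkic_up   = "i".upcase(:turkic)

puts default_down
puts turkic_down == dotless_i
puts turkic_up == dotted_cap_i
```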
2016-01-16  duerst
* enc/unicode.c: Artificial mapping to test buffer expansion code.
* string.c: Fixed buffer expansion logic.
* test/ruby/enc/test_case_mapping.rb: Tests for above. (with Kimihito Matsui)
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53554 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
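Buffer expansion matters because a case-mapped string can be longer than its input. The artificial mapping from this commit was later removed, so the sketch below uses a real expanding mapping instead (sharp s upcasing to "SS"), an example chosen here that assumes a Ruby with full Unicode case mapping (2.4 or newer).

```ruby
# 1000 sharp s characters; each one upcases to two letters.
input  = "\u{00DF}" * 1000
output = input.upcase

puts input.length
puts output.length
puts output.bytesize
```

Any output buffer sized from the input alone would be too small here, which is why the mapping code has to be able to grow it.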
2016-01-16  duerst
* string.c, enc/unicode.c: New code path as a preparation for Unicode-wide case mapping. The code path is currently guarded by the :lithuanian option to avoid accidental problems in daily use.
* test/ruby/enc/test_case_mapping.rb: Test for above.
* string.c: function 'check_case_options': fixed logical errors
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53548 b2dd03c8-39d4-4d8f-98ff-823fe69b080e