ruby.git/test/ruby/enc, branch ruby_2_7

Removed excess spaces

2019-06-28T07:02:47+00:00

Fixed name conflict between helper classes

2019-06-28T07:02:03+00:00

Add new encoding CESU-8 [Feature #15931]

2019-06-24T03:58:33+00:00

Test to disable ASCII-only optimization

2019-05-17T01:05:57+00:00

Examples why ASCII-only optimization cannot apply multi-byte
encodings which have 7-bit trailing bytes.

Suggested by @duerst at https://github.com/ruby/ruby/pull/2187#issuecomment-492949218

add a test to make sure some unassigned codepoints do not get converted

2018-12-10T23:12:12+00:00

In test/ruby/enc/test_case_mapping.rb, add a test to make sure the
unassigned codepoints in the Georgian MTAVRULI range (U+1CBB, U+1CBC)
do not get converted to unrelated codepoints by String#capitalize.
(It turns out that this test was not strictly necessary, because
unassigned codepoints are already excluded by the fact that they are
not found in the onigenc_unicode_fold_lookup table. So this test only
serves to check against future regressions.)

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66314 b2dd03c8-39d4-4d8f-98ff-823fe69b080e

implement special behavior for Georgian for String#capitalize

2018-12-09T23:14:29+00:00

The modern Georgian script is special in that it has an 'uppercase'
variant called MTAVRULI which can be used for emphasis of whole words,
for screamy headlines, and so on. However, in contrast to all other
bicameral scripts, there is no usage of capitalizing the first letter
in a word or a sentence. Words with mixed capitalization are not used
at all.

We therefore implement special behavior for String#capitalize. Formally,
we define String#capitalize as first applying String#downcase for the
whole string, then using titlecase on the first letter. Because Georgian
defines titlecase as the identity function both for MTAVRULI ('uppercase')
and Mkhedruli (lowercase), this results in String#capitalize being
equivalent to String#downcase for Georgian. This avoids undesirable
mixed case.

* enc/unicode.c: Actual implementation

* string.c: Add mention of this special case for documentation

* test/ruby/enc/test_case_mapping.rb: Add two tests, a general one
  that uses String#capitalize on some (including nonsensical)
  combinations of MTAVRULI and Mkhedruli, and a canary test to
  detect the potential assignment of characters to the currently
  open slots (holes) at U+1CBB and U+1CBC.

* test/ruby/enc/test_case_comprehensive.rb: Tweak generation of
  expectation data.

Together with r65933, this closes issue #14839.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66300 b2dd03c8-39d4-4d8f-98ff-823fe69b080e

replace hardcoded emoji version by RbConfig::CONFIG['UNICODE_EMOJI_VERSION']

2018-12-07T09:01:13+00:00

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66271 b2dd03c8-39d4-4d8f-98ff-823fe69b080e

update to Unicode 11.0.0 (main step, not complete yet)

2018-12-05T08:10:24+00:00

- common.mk: Change Unicode version to 11.0.0, and Emoji version to 11.0
- test/ruby/enc/test_emoji_breaks.rb: update hard-coded Emoji version
- enc/unicode/11.0.0, enc/unicode/11.0.0/casefold.h, enc/unicode/name2ctype.h:
  Add generated files. Files for Unicode 10.0.0 will be removed once we are
  sure 11.0.0 works.
- lib/unicode_normalize/tables.rb: Updated table.
- regparse.c: Almost completely reimplement grapheme cluster detection in
  function node_extended_grapheme_cluster().


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66213 b2dd03c8-39d4-4d8f-98ff-823fe69b080e

exclude skin tones as second component in TestEmojiBreaks#test_mixed_emoji

2018-12-04T06:31:40+00:00

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66185 b2dd03c8-39d4-4d8f-98ff-823fe69b080e

change embedding character in TestEmojiBreaks#test_embedded_emoji

2018-12-04T04:11:51+00:00

In test/ruby/enc/test_emoji_breaks.rb, in method
TestEmojiBreaks#test_embedded_emoji, change the surrounding characters
from A/Z to the more neutral \t in preparation for upgrade to Unicode 11.0.0.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66180 b2dd03c8-39d4-4d8f-98ff-823fe69b080e