<feed xmlns='http://www.w3.org/2005/Atom'>
<title>ruby.git/test/ruby/enc, branch ruby_2_7</title>
<subtitle>The Ruby Programming Language</subtitle>
<link rel='alternate' type='text/html' href='https://git.ruby-lang.org/ruby.git/'/>
<entry>
<title>Removed excess spaces</title>
<updated>2019-06-28T07:02:47+00:00</updated>
<author>
<name>Nobuyoshi Nakada</name>
<email>nobu@ruby-lang.org</email>
</author>
<published>2019-06-28T07:02:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.ruby-lang.org/ruby.git/commit/?id=e9bce55c12f87d783c651d415b6b79beeeb79737'/>
<id>e9bce55c12f87d783c651d415b6b79beeeb79737</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>Fixed name conflict between helper classes</title>
<updated>2019-06-28T07:02:03+00:00</updated>
<author>
<name>Nobuyoshi Nakada</name>
<email>nobu@ruby-lang.org</email>
</author>
<published>2019-06-28T07:02:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.ruby-lang.org/ruby.git/commit/?id=566e6b0546633f3da4f868c3a217bc3167008fdf'/>
<id>566e6b0546633f3da4f868c3a217bc3167008fdf</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>Add new encoding CESU-8 [Feature #15931]</title>
<updated>2019-06-24T03:58:33+00:00</updated>
<author>
<name>NARUSE, Yui</name>
<email>naruse@airemix.jp</email>
</author>
<published>2019-06-16T23:50:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.ruby-lang.org/ruby.git/commit/?id=7f64a0b4db7ee27a04579236950d88301c7bcabb'/>
<id>7f64a0b4db7ee27a04579236950d88301c7bcabb</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>Test to disable ASCII-only optimization</title>
<updated>2019-05-17T01:05:57+00:00</updated>
<author>
<name>Nobuyoshi Nakada</name>
<email>nobu@ruby-lang.org</email>
</author>
<published>2019-05-17T00:53:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.ruby-lang.org/ruby.git/commit/?id=0971cab4d0cb730292461d16ac72c430aa23cc10'/>
<id>0971cab4d0cb730292461d16ac72c430aa23cc10</id>
<content type='text'>
Examples of why ASCII-only optimization cannot be applied to multi-byte
encodings that have 7-bit trailing bytes.
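
As a rough illustration of the point above (not the test added by this
commit), the sketch below assumes Shift_JIS as the example encoding and
uses the byte pair 0x95 0x5C, a single two-byte character whose trailing
byte has the same value as an ASCII backslash. The variable names are
made up for this sketch.

```ruby
# Sketch only: Shift_JIS is an assumed example of a multi-byte encoding
# whose trailing bytes can fall in the 7-bit range.
sjis = [0x95, 0x5C].pack("C*").force_encoding(Encoding::Shift_JIS)

sjis_valid      = sjis.valid_encoding?  # the two bytes form one valid character
sjis_length     = sjis.length           # counted in characters, not bytes
sjis_ascii_only = sjis.ascii_only?      # the lead byte 0x95 is not ASCII
trailing_byte   = sjis.bytes.last       # 0x5C, same value as an ASCII backslash

puts "valid=#{sjis_valid} length=#{sjis_length} ascii_only=#{sjis_ascii_only}"
puts format("trailing byte: 0x%02X", trailing_byte)
```

A byte-wise scan for an ASCII character could match that trailing byte
even though the string contains no such character, which is why a
shortcut meant for ASCII-only data is not safe here.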

Suggested by @duerst at https://github.com/ruby/ruby/pull/2187#issuecomment-492949218
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Examples of why ASCII-only optimization cannot be applied to multi-byte
encodings that have 7-bit trailing bytes.

Suggested by @duerst at https://github.com/ruby/ruby/pull/2187#issuecomment-492949218
</pre>
</div>
</content>
</entry>
<entry>
<title>add a test to make sure some unassigned codepoints do not get converted</title>
<updated>2018-12-10T23:12:12+00:00</updated>
<author>
<name>duerst</name>
<email>duerst@b2dd03c8-39d4-4d8f-98ff-823fe69b080e</email>
</author>
<published>2018-12-10T23:12:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.ruby-lang.org/ruby.git/commit/?id=dff1e89bfb77e5d57bc56312364ac2036f3f7a99'/>
<id>dff1e89bfb77e5d57bc56312364ac2036f3f7a99</id>
<content type='text'>
In test/ruby/enc/test_case_mapping.rb, add a test to make sure the
unassigned codepoints in the Georgian MTAVRULI range (U+1CBB, U+1CBC)
do not get converted to unrelated codepoints by String#capitalize.
(It turns out that this test was not strictly necessary, because
unassigned codepoints are already excluded by the fact that they are
not found in the onigenc_unicode_fold_lookup table. So this test only
serves to check against future regressions.)
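
The check described above can be sketched roughly as follows. This is an
illustration rather than the test from test_case_mapping.rb; the
variable names are hypothetical, and the result assumes a Ruby whose
Unicode tables still leave U+1CBB and U+1CBC unassigned.

```ruby
# U+1CBB and U+1CBC are the two unassigned codepoints named in the
# message above.
holes = ["\u1CBB", "\u1CBC"]

# An unassigned codepoint has no case mapping, so String#capitalize
# is expected to return it unchanged.
holes_unchanged = holes.all? { |ch| ch.capitalize == ch }

puts "unassigned codepoints left unchanged: #{holes_unchanged}"
```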

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66314 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In test/ruby/enc/test_case_mapping.rb, add a test to make sure the
unassigned codepoints in the Georgian MTAVRULI range (U+1CBB, U+1CBC)
do not get converted to unrelated codepoints by String#capitalize.
(It turns out that this test was not strictly necessary, because
unassigned codepoints are already excluded by the fact that they are
not found in the onigenc_unicode_fold_lookup table. So this test only
serves to check against future regressions.)

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66314 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
</pre>
</div>
</content>
</entry>
<entry>
<title>implement special behavior for Georgian for String#capitalize</title>
<updated>2018-12-09T23:14:29+00:00</updated>
<author>
<name>duerst</name>
<email>duerst@b2dd03c8-39d4-4d8f-98ff-823fe69b080e</email>
</author>
<published>2018-12-09T23:14:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.ruby-lang.org/ruby.git/commit/?id=3628eae2e754a7489feebc6f41371d42d2efcf3c'/>
<id>3628eae2e754a7489feebc6f41371d42d2efcf3c</id>
<content type='text'>
The modern Georgian script is special in that it has an 'uppercase'
variant called MTAVRULI which can be used for emphasis of whole words,
for screamy headlines, and so on. However, in contrast to all other
bicameral scripts, there is no usage of capitalizing the first letter
in a word or a sentence. Words with mixed capitalization are not used
at all.

We therefore implement special behavior for String#capitalize. Formally,
we define String#capitalize as first applying String#downcase for the
whole string, then using titlecase on the first letter. Because Georgian
defines titlecase as the identity function both for MTAVRULI ('uppercase')
and Mkhedruli (lowercase), this results in String#capitalize being
equivalent to String#downcase for Georgian. This avoids undesirable
mixed case.
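
A minimal sketch of the resulting behavior, assuming Ruby 2.6 or later
(built against Unicode 11.0.0 or newer). The sample word and the
variable names are chosen for this illustration and are not taken from
the tests listed below.

```ruby
# The same Georgian word written in MTAVRULI ('uppercase') and in
# Mkhedruli (lowercase), given as escapes to avoid font problems.
mtavruli  = "\u1C97\u1C91\u1C98\u1C9A\u1C98\u1CA1\u1C98"
mkhedruli = "\u10D7\u10D1\u10D8\u10DA\u10D8\u10E1\u10D8"

# A nonsensical mixed-case variant: two Mkhedruli letters followed by
# MTAVRULI letters.
mixed = mkhedruli[0, 2] + mtavruli[2..-1]

# For Georgian, String#capitalize is equivalent to String#downcase,
# so each input should come back entirely in Mkhedruli.
results = [mtavruli, mkhedruli, mixed].map(&:capitalize)

puts results.all? { |s| s == mkhedruli }
puts mtavruli.capitalize == mtavruli.downcase
```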

* enc/unicode.c: Actual implementation

* string.c: Add mention of this special case for documentation

* test/ruby/enc/test_case_mapping.rb: Add two tests, a general one
  that uses String#capitalize on some (including nonsensical)
  combinations of MTAVRULI and Mkhedruli, and a canary test to
  detect the potential assignment of characters to the currently
  open slots (holes) at U+1CBB and U+1CBC.

* test/ruby/enc/test_case_comprehensive.rb: Tweak generation of
  expectation data.

Together with r65933, this closes issue #14839.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66300 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The modern Georgian script is special in that it has an 'uppercase'
variant called MTAVRULI which can be used for emphasis of whole words,
for screamy headlines, and so on. However, in contrast to all other
bicameral scripts, there is no usage of capitalizing the first letter
in a word or a sentence. Words with mixed capitalization are not used
at all.

We therefore implement special behavior for String#capitalize. Formally,
we define String#capitalize as first applying String#downcase for the
whole string, then using titlecase on the first letter. Because Georgian
defines titlecase as the identity function both for MTAVRULI ('uppercase')
and Mkhedruli (lowercase), this results in String#capitalize being
equivalent to String#downcase for Georgian. This avoids undesirable
mixed case.

* enc/unicode.c: Actual implementation

* string.c: Add mention of this special case for documentation

* test/ruby/enc/test_case_mapping.rb: Add two tests, a general one
  that uses String#capitalize on some (including nonsensical)
  combinations of MTAVRULI and Mkhedruli, and a canary test to
  detect the potential assignment of characters to the currently
  open slots (holes) at U+1CBB and U+1CBC.

* test/ruby/enc/test_case_comprehensive.rb: Tweak generation of
  expectation data.

Together with r65933, this closes issue #14839.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66300 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
</pre>
</div>
</content>
</entry>
<entry>
<title>replace hardcoded emoji version by RbConfig::CONFIG['UNICODE_EMOJI_VERSION']</title>
<updated>2018-12-07T09:01:13+00:00</updated>
<author>
<name>duerst</name>
<email>duerst@b2dd03c8-39d4-4d8f-98ff-823fe69b080e</email>
</author>
<published>2018-12-07T09:01:13+00:00</published>
<link rel='alternate' type='text/html' href='https://git.ruby-lang.org/ruby.git/commit/?id=a3798256c798255d30c365f689dc12c1eeeb40c3'/>
<id>a3798256c798255d30c365f689dc12c1eeeb40c3</id>
<content type='text'>
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66271 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66271 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
</pre>
</div>
</content>
</entry>
<entry>
<title>update to Unicode 11.0.0 (main step, not complete yet)</title>
<updated>2018-12-05T08:10:24+00:00</updated>
<author>
<name>duerst</name>
<email>duerst@b2dd03c8-39d4-4d8f-98ff-823fe69b080e</email>
</author>
<published>2018-12-05T08:10:24+00:00</published>
<link rel='alternate' type='text/html' href='https://git.ruby-lang.org/ruby.git/commit/?id=66a6073859ac6ae2143a9d72162efedece7e1348'/>
<id>66a6073859ac6ae2143a9d72162efedece7e1348</id>
<content type='text'>
- common.mk: Change Unicode version to 11.0.0, and Emoji version to 11.0
- test/ruby/enc/test_emoji_breaks.rb: update hard-coded Emoji version
- enc/unicode/11.0.0, enc/unicode/11.0.0/casefold.h, enc/unicode/name2ctype.h:
  Add generated files. Files for Unicode 10.0.0 will be removed once we are
  sure 11.0.0 works.
- lib/unicode_normalize/tables.rb: Updated table.
- regparse.c: Almost completely reimplement grapheme cluster detection in
  function node_extended_grapheme_cluster().


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66213 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
- common.mk: Change Unicode version to 11.0.0, and Emoji version to 11.0
- test/ruby/enc/test_emoji_breaks.rb: update hard-coded Emoji version
- enc/unicode/11.0.0, enc/unicode/11.0.0/casefold.h, enc/unicode/name2ctype.h:
  Add generated files. Files for Unicode 10.0.0 will be removed once we are
  sure 11.0.0 works.
- lib/unicode_normalize/tables.rb: Updated table.
- regparse.c: Almost completely reimplement grapheme cluster detection in
  function node_extended_grapheme_cluster().


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66213 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
</pre>
</div>
</content>
</entry>
<entry>
<title>exclude skin tones as second component in TestEmojiBreaks#test_mixed_emoji</title>
<updated>2018-12-04T06:31:40+00:00</updated>
<author>
<name>duerst</name>
<email>duerst@b2dd03c8-39d4-4d8f-98ff-823fe69b080e</email>
</author>
<published>2018-12-04T06:31:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.ruby-lang.org/ruby.git/commit/?id=84d679794f56de3cc286aa988844a15df2c0cdf5'/>
<id>84d679794f56de3cc286aa988844a15df2c0cdf5</id>
<content type='text'>
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66185 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66185 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
</pre>
</div>
</content>
</entry>
<entry>
<title>change embedding character in TestEmojiBreaks#test_embedded_emoji</title>
<updated>2018-12-04T04:11:51+00:00</updated>
<author>
<name>duerst</name>
<email>duerst@b2dd03c8-39d4-4d8f-98ff-823fe69b080e</email>
</author>
<published>2018-12-04T04:11:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.ruby-lang.org/ruby.git/commit/?id=db9f1efc4cb93401ce83356903f83caaf26879c6'/>
<id>db9f1efc4cb93401ce83356903f83caaf26879c6</id>
<content type='text'>
In test/ruby/enc/test_emoji_breaks.rb, in method
TestEmojiBreaks#test_embedded_emoji, change the surrounding characters
from A/Z to the more neutral \t in preparation for upgrade to Unicode 11.0.0.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66180 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In test/ruby/enc/test_emoji_breaks.rb, in method
TestEmojiBreaks#test_embedded_emoji, change the surrounding characters
from A/Z to the more neutral \t in preparation for upgrade to Unicode 11.0.0.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66180 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
</pre>
</div>
</content>
</entry>
</feed>
