Age | Commit message (Collapse) | Author |
|
In preparation for https://bugs.ruby-lang.org/issues/20205,
make the test suite compatible with frozen string literals,
which makes the transition easier.
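A minimal sketch of the kind of change this implies, assuming a hypothetical helper `build_label` that is not part of the test suite: code that appends to a string must start from an explicitly mutable buffer and must not modify a frozen input.

```ruby
# Hypothetical helper for illustration; not taken from the Ruby test suite.
def build_label(parts)
  buf = +""                         # unary plus yields a mutable string
  parts.each { |part| buf << part } # appending mutates only the buffer
  buf
end

frozen = "emoji".freeze             # stands in for a frozen string literal
label = build_label([frozen, "-", "breaks"])
puts label
```

The input string stays frozen and untouched; only the freshly created buffer is modified.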
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/7510
Merged-By: nobu <nobu@ruby-lang.org>
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/7387
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/7387
|
|
The change in the Unicode emoji file header took place at
version 14.0.0, but is needed only from version 15.0.0
because in version 14.0.0, another check is still active.
|
|
Should fix issues with parallel testing sometimes not running all
tests.
This diff is best viewed with whitespace changes ignored.
Fixes [Bug #18731]
Notes:
Merged: https://github.com/ruby/ruby/pull/5839
|
|
emoji-variation-sequences.txt"
This reverts commit 48f1e8c5d85043e6adb8e93c94532daa201d42e9.
|
|
This reverts commit fc6e4ce62bfa95b6a0d4d4898e1128c1fce4db8a.
|
|
After `make update-unicode`, some tests failed with the error below
because of a header mismatch:
* `RbConfig::CONFIG['UNICODE_EMOJI_VERSION']` => 14.0
* the header line is `# emoji-variation-sequences-14.0.0.txt`
So the trailing `.0` in the header causes the mismatch.
This patch allows an additional `.0` in the header line.
Please revert this patch once a correct patch is merged.
```
1) Error:
TestEmojiBreaks#test_embedded_emoji:
RuntimeError: File Name Mismatch: line: # emoji-variation-sequences-14.0.0.txt, expected filename: emoji-variation-sequences.txt
/tmp/ruby/v3/src/trunk/test/ruby/enc/test_emoji_breaks.rb:88:in `block (2 levels) in read_data'
/tmp/ruby/v3/src/trunk/test/ruby/enc/test_emoji_breaks.rb:82:in `foreach'
/tmp/ruby/v3/src/trunk/test/ruby/enc/test_emoji_breaks.rb:82:in `block in read_data'
/tmp/ruby/v3/src/trunk/test/ruby/enc/test_emoji_breaks.rb:79:in `each'
/tmp/ruby/v3/src/trunk/test/ruby/enc/test_emoji_breaks.rb:79:in `read_data'
/tmp/ruby/v3/src/trunk/test/ruby/enc/test_emoji_breaks.rb:111:in `all_tests'
/tmp/ruby/v3/src/trunk/test/ruby/enc/test_emoji_breaks.rb:127:in `test_embedded_emoji'
2) Error:
TestEmojiBreaks#test_mixed_emoji:
RuntimeError: File Name Mismatch: line: # emoji-variation-sequences-14.0.0.txt, expected filename: emoji-variation-sequences.txt
/tmp/ruby/v3/src/trunk/test/ruby/enc/test_emoji_breaks.rb:88:in `block (2 levels) in read_data'
/tmp/ruby/v3/src/trunk/test/ruby/enc/test_emoji_breaks.rb:82:in `foreach'
/tmp/ruby/v3/src/trunk/test/ruby/enc/test_emoji_breaks.rb:82:in `block in read_data'
/tmp/ruby/v3/src/trunk/test/ruby/enc/test_emoji_breaks.rb:79:in `each'
/tmp/ruby/v3/src/trunk/test/ruby/enc/test_emoji_breaks.rb:79:in `read_data'
/tmp/ruby/v3/src/trunk/test/ruby/enc/test_emoji_breaks.rb:111:in `all_tests'
/tmp/ruby/v3/src/trunk/test/ruby/enc/test_emoji_breaks.rb:139:in `test_mixed_emoji'
3) Error:
TestEmojiBreaks#test_single_emoji:
RuntimeError: File Name Mismatch: line: # emoji-variation-sequences-14.0.0.txt, expected filename: emoji-variation-sequences.txt
/tmp/ruby/v3/src/trunk/test/ruby/enc/test_emoji_breaks.rb:88:in `block (2 levels) in read_data'
/tmp/ruby/v3/src/trunk/test/ruby/enc/test_emoji_breaks.rb:82:in `foreach'
/tmp/ruby/v3/src/trunk/test/ruby/enc/test_emoji_breaks.rb:82:in `block in read_data'
/tmp/ruby/v3/src/trunk/test/ruby/enc/test_emoji_breaks.rb:79:in `each'
/tmp/ruby/v3/src/trunk/test/ruby/enc/test_emoji_breaks.rb:79:in `read_data'
/tmp/ruby/v3/src/trunk/test/ruby/enc/test_emoji_breaks.rb:111:in `all_tests'
/tmp/ruby/v3/src/trunk/test/ruby/enc/test_emoji_breaks.rb:117:in `test_single_emoji'
```
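The tolerant header check described in this message could look roughly like the following. This is a sketch only: `header_matches?` and its parameters are hypothetical names, and the actual patch in test/ruby/enc/test_emoji_breaks.rb may be written differently.

```ruby
# Hypothetical helper; accepts "<basename>-<version>.txt" with an optional extra ".0".
def header_matches?(line, basename, version)
  pattern = /\A# #{Regexp.escape(basename)}-#{Regexp.escape(version)}(?:\.0)?\.txt\z/
  pattern.match?(line.chomp)
end

strict  = header_matches?("# emoji-variation-sequences-14.0.txt",
                          "emoji-variation-sequences", "14.0")
relaxed = header_matches?("# emoji-variation-sequences-14.0.0.txt",
                          "emoji-variation-sequences", "14.0")
wrong   = header_matches?("# emoji-variation-sequences-13.0.0.txt",
                          "emoji-variation-sequences", "14.0")
```

Here `strict` and `relaxed` are both true, while `wrong` is false because the version itself differs.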
|
|
http://ci.rvm.jp/results/trunk-no-mjit@phosphorus-docker/3870646
```
1) Error:
TestEmojiBreaks#test_single_emoji:
RuntimeError: File Name Mismatch: line: # emoji-variation-sequences-14.0.0.txt, expected filename: emoji-variation-sequences.txt
/tmp/ruby/v3/src/trunk-no-mjit/test/ruby/enc/test_emoji_breaks.rb:84:in `block (2 levels) in read_data'
/tmp/ruby/v3/src/trunk-no-mjit/test/ruby/enc/test_emoji_breaks.rb:82:in `foreach'
/tmp/ruby/v3/src/trunk-no-mjit/test/ruby/enc/test_emoji_breaks.rb:82:in `block in read_data'
/tmp/ruby/v3/src/trunk-no-mjit/test/ruby/enc/test_emoji_breaks.rb:79:in `each'
/tmp/ruby/v3/src/trunk-no-mjit/test/ruby/enc/test_emoji_breaks.rb:79:in `read_data'
/tmp/ruby/v3/src/trunk-no-mjit/test/ruby/enc/test_emoji_breaks.rb:105:in `all_tests'
/tmp/ruby/v3/src/trunk-no-mjit/test/ruby/enc/test_emoji_breaks.rb:111:in `test_single_emoji'
2) Error:
TestEmojiBreaks#test_mixed_emoji:
RuntimeError: File Name Mismatch: line: # emoji-variation-sequences-14.0.0.txt, expected filename: emoji-variation-sequences.txt
/tmp/ruby/v3/src/trunk-no-mjit/test/ruby/enc/test_emoji_breaks.rb:84:in `block (2 levels) in read_data'
/tmp/ruby/v3/src/trunk-no-mjit/test/ruby/enc/test_emoji_breaks.rb:82:in `foreach'
/tmp/ruby/v3/src/trunk-no-mjit/test/ruby/enc/test_emoji_breaks.rb:82:in `block in read_data'
/tmp/ruby/v3/src/trunk-no-mjit/test/ruby/enc/test_emoji_breaks.rb:79:in `each'
/tmp/ruby/v3/src/trunk-no-mjit/test/ruby/enc/test_emoji_breaks.rb:79:in `read_data'
/tmp/ruby/v3/src/trunk-no-mjit/test/ruby/enc/test_emoji_breaks.rb:105:in `all_tests'
/tmp/ruby/v3/src/trunk-no-mjit/test/ruby/enc/test_emoji_breaks.rb:133:in `test_mixed_emoji'
3) Error:
TestEmojiBreaks#test_embedded_emoji:
RuntimeError: File Name Mismatch: line: # emoji-variation-sequences-14.0.0.txt, expected filename: emoji-variation-sequences.txt
/tmp/ruby/v3/src/trunk-no-mjit/test/ruby/enc/test_emoji_breaks.rb:84:in `block (2 levels) in read_data'
/tmp/ruby/v3/src/trunk-no-mjit/test/ruby/enc/test_emoji_breaks.rb:82:in `foreach'
/tmp/ruby/v3/src/trunk-no-mjit/test/ruby/enc/test_emoji_breaks.rb:82:in `block in read_data'
/tmp/ruby/v3/src/trunk-no-mjit/test/ruby/enc/test_emoji_breaks.rb:79:in `each'
/tmp/ruby/v3/src/trunk-no-mjit/test/ruby/enc/test_emoji_breaks.rb:79:in `read_data'
/tmp/ruby/v3/src/trunk-no-mjit/test/ruby/enc/test_emoji_breaks.rb:105:in `all_tests'
/tmp/ruby/v3/src/trunk-no-mjit/test/ruby/enc/test_emoji_breaks.rb:121:in `test_embedded_emoji'
make: *** [uncommon.mk:823: yes-test-all] Error 3
```
|
|
|
|
The emoji data in emoji-variation-sequences.txt was not used
in test/ruby/enc/test_emoji_breaks.rb, for unknown reasons.
It turned out that the format of each of the emoji data/test files
is slightly different, and that we didn't take into account that
empty fields after a semicolon, as present in
emoji-variation-sequences.txt, led to fewer fields than expected
when using split.
This addresses issue #18027.
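The split behavior behind this can be reproduced in isolation. The sample line below is modeled on the general shape of emoji-variation-sequences.txt (comment already stripped) and is an assumption, not a quote from the data file.

```ruby
# A data line whose last field, after the final semicolon, is empty.
data = "0023 FE0E  ; text style;"

default_fields = data.split(";")      # trailing empty field is dropped
all_fields     = data.split(";", -1)  # negative limit keeps trailing empty fields

puts default_fields.size
puts all_fields.size
```

With the default limit the line yields two fields; with a limit of `-1` it yields three, the last one being an empty string.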
|
|
Detect Unicode ranges and loop over them.
This fixes issue #18028.
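A minimal sketch of range detection, assuming fields use the `XXXX..YYYY` notation common in Unicode data files; `expand_codepoints` is a hypothetical name and not the method used in the test file.

```ruby
# Hypothetical helper: turns "1F3FB..1F3FF" or "231A" into a list of codepoints.
def expand_codepoints(field)
  first, last = field.strip.split("..").map { |hex| hex.to_i(16) }
  (first..(last || first)).to_a       # a single codepoint is a range of one
end

range_points  = expand_codepoints("1F3FB..1F3FF")
single_points = expand_codepoints("231A")

range_points.each { |cp| puts format("U+%04X", cp) }
```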
|
|
Deal with the issue that the emoji files in emoji/13.1 have Unicode
Emoji version 13.1, but at the same time the files in 13.0.0/ucd/emoji
are still at Emoji version 13.0. Specifically:
- Add a version attribute to TestEmojiBreaks::BreakFile
- Take the version for emoji-variation-sequences.txt from the Unicode
version, removing the last two characters.
- Improve information in exceptions for file name and version mismatches.
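The version derivation in the list above can be sketched as follows. `BreakFileSketch` and its fields are hypothetical stand-ins; the real TestEmojiBreaks::BreakFile may look different.

```ruby
# Hypothetical stand-in for a break-file description with a version attribute.
BreakFileSketch = Struct.new(:basename, :version) do
  def filename
    "#{basename}.txt"
  end
end

unicode_version   = "13.0.0"
variation_version = unicode_version[0..-3]   # drop the last two characters

file = BreakFileSketch.new("emoji-variation-sequences", variation_version)
puts "#{file.filename} (version #{file.version})"
```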
|
|
- Add UNICODE_VERSION,... to deal with new location of some
of the emoji-related data files.
- Introduce class BreakFile to handle various file properties.
- Adapt main code to use BreakFile.
|
|
should not mutate test data.
|
|
Examples of why the ASCII-only optimization cannot be applied to
multi-byte encodings that have 7-bit trailing bytes.
Suggested by @duerst at https://github.com/ruby/ruby/pull/2187#issuecomment-492949218
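One well-known instance of a 7-bit trailing byte, shown here as an illustration chosen for this note rather than taken from the commit: in Shift_JIS the second byte of U+8868 equals the byte for a backslash.

```ruby
kanji     = "\u8868".encode("Shift_JIS")   # two-byte character 0x95 0x5C
backslash = "\\".encode("Shift_JIS")       # single byte 0x5C

puts kanji.bytes.map { |b| format("%02X", b) }.join(" ")

# A byte-wise search sees the trailing byte as a backslash ...
bytewise_hit  = kanji.b.include?(backslash.b)
# ... while an encoding-aware search respects character boundaries.
character_hit = kanji.include?(backslash)
```

The byte-wise search reports a match although the string contains no backslash character, which is why a shortcut that only looks at 7-bit bytes is unsafe for such encodings.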
|
|
In test/ruby/enc/test_case_mapping.rb, add a test to make sure the
unassigned codepoints in the Georgian MTAVRULI range (U+1CBB, U+1CBC)
do not get converted to unrelated codepoints by String#capitalize.
(It turns out that this test was not strictly necessary, because
unassigned codepoints are already excluded by the fact that they are
not found in the onigenc_unicode_fold_lookup table. So this test only
serves to check against future regressions.)
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66314 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
The modern Georgian script is special in that it has an 'uppercase'
variant called MTAVRULI which can be used for emphasis of whole words,
for screamy headlines, and so on. However, in contrast to all other
bicameral scripts, there is no usage of capitalizing the first letter
in a word or a sentence. Words with mixed capitalization are not used
at all.
We therefore implement special behavior for String#capitalize. Formally,
we define String#capitalize as first applying String#downcase for the
whole string, then using titlecase on the first letter. Because Georgian
defines titlecase as the identity function both for MTAVRULI ('uppercase')
and Mkhedruli (lowercase), this results in String#capitalize being
equivalent to String#downcase for Georgian. This avoids undesirable
mixed case.
* enc/unicode.c: Actual implementation
* string.c: Add mention of this special case for documentation
* test/ruby/enc/test_case_mapping.rb: Add two tests, a general one
that uses String#capitalize on some (including nonsensical)
combinations of MTAVRULI and Mkhedruli, and a canary test to
detect the potential assignment of characters to the currently
open slots (holes) at U+1CBB and U+1CBC.
* test/ruby/enc/test_case_comprehensive.rb: Tweak generation of
expectation data.
Together with r65933, this closes issue #14839.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66300 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
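The behavior described above can be checked with a short script. It assumes a Ruby version that includes this change (Unicode 11.0.0 support); the sample word is chosen for illustration.

```ruby
mtavruli  = "\u1C97\u1C91\u1C98\u1C9A\u1C98\u1CA1\u1C98"  # 'uppercase' form
mkhedruli = "\u10D7\u10D1\u10D8\u10DA\u10D8\u10E1\u10D8"  # lowercase form

# For Georgian, capitalize is equivalent to downcase: no mixed case results.
puts mtavruli.capitalize == mtavruli.downcase
puts mkhedruli.capitalize == mkhedruli
```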
|
|
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66271 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
- common.mk: Change Unicode version to 11.0.0, and Emoji version to 11.0
- test/ruby/enc/test_emoji_breaks.rb: update hard-coded Emoji version
- enc/unicode/11.0.0, enc/unicode/11.0.0/casefold.h, enc/unicode/name2ctype.h:
Add generated files. Files for Unicode 10.0.0 will be removed once we are
sure 11.0.0 works.
- lib/unicode_normalize/tables.rb: Updated table.
- regparse.c: Almost completely reimplement grapheme cluster detection in
function node_extended_grapheme_cluster().
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66213 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66185 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
In test/ruby/enc/test_emoji_breaks.rb, in method
TestEmojiBreaks#test_embedded_emoji, change the surrounding characters
from A/Z to the more neutral \t in preparation for upgrade to Unicode 11.0.0.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66180 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
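A small sketch of the embedded case, assuming a thumbs-up emoji with a skin tone modifier as sample data (the actual test iterates over the Unicode data files instead):

```ruby
emoji = "\u{1F44D}\u{1F3FD}"          # emoji base plus skin tone modifier
text  = "\t#{emoji}\t"                # tabs as neutral surrounding characters

clusters = text.each_grapheme_cluster.to_a
puts clusters.size
```

The tab is a control character, so there is always a break before and after it, and the emoji sequence in the middle is expected to stay together as one cluster.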
|
|
enc/unicode.c: - Add U+1F93C (WRESTLERS), U+1F9DE (GENIE), and U+1F9DF
to onigenc_unicode_GCB_ranges_E_Base.
- Add comments with character names.
test/ruby/enc/test_emoji_breaks.rb: Activate tests for genie/zombie/wrestlers.
This closes issue #15343.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66133 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66010 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
Add file test/ruby/enc/test_emoji_breaks.rb to test String#each_grapheme_cluster
test data provided by Unicode (at https://www.unicode.org/Public/emoji/#{EMOJI_VERSION}/).
Lines containing emoji for genies, zombies, and wrestling are ignored
because there seems to be a bug (#15343) in the implementation.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65990 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65958 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
Add file test/ruby/enc/test_grapheme_breaks.rb to test String#each_grapheme_cluster
and \X extended grapheme cluster matcher in regular expressions against test data
provided by Unicode (ucd/auxiliary/GraphemeBreakTest.txt).
Some lines in the data file are ignored, as follows:
- Lines with a surrogate, because Ruby doesn't handle these
- The case of "\r\n", because there is a bug (#15337) in the implementation
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65955 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
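Both mechanisms named in this message can be compared directly. The sample string is an assumption made for this sketch and is not taken from GraphemeBreakTest.txt.

```ruby
# "e" + combining acute accent, a pair of regional indicators, then "x".
text = "e\u0301\u{1F1EF}\u{1F1F5}x"

by_method = text.each_grapheme_cluster.to_a
by_regexp = text.scan(/\X/)

puts by_method.size
puts by_method == by_regexp
```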
|
|
* test/ruby/enc/test_regex_casefold.rb: fix the search for the Unicode
data directory, as in test_case_comprehensive.rb.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@61417 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
* test/ruby/enc/test_case_comprehensive.rb: search the ucd directory
first if it exists.
* test/ruby/enc/test_regex_casefold.rb: ditto.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@61415 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
* enc/utf_32be.c (utf32be_mbc_enc_len): check arguments precisely.
[ruby-core:79966] [Bug #13292]
* enc/utf_32le.c (utf32le_mbc_enc_len): ditto.
* regenc.h (UNICODE_VALID_CODEPOINT_P): predicate for valid
Unicode codepoints.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57816 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
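Seen from Ruby, the stricter check shows up in `String#valid_encoding?`. The helper `utf32be` below is a hypothetical convenience for this sketch, and a Ruby version that includes the fix is assumed.

```ruby
# Hypothetical helper: packs one codepoint as four big-endian bytes.
def utf32be(codepoint)
  [codepoint].pack("N").force_encoding("UTF-32BE")
end

valid_emoji  = utf32be(0x1F600).valid_encoding?    # ordinary codepoint
surrogate    = utf32be(0xD800).valid_encoding?     # surrogate, not a character
out_of_range = utf32be(0x110000).valid_encoding?   # beyond U+10FFFF

puts [valid_emoji, surrogate, out_of_range].inspect
```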
|
|
* test/ruby/enc/test_utf16.rb (test_utf16be_valid_encoding):
assert all data and use assert_predicate.
* test/ruby/enc/test_utf16.rb (test_utf16le_valid_encoding):
ditto.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57815 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
table
* test/ruby/enc/test_case_mapping.rb: Add method test_reorder_unfold to test against
problems when reordering codepoints in some entries in
CaseUnfold_11_Type CaseUnfold_11_Table.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56968 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
* test/ruby/enc/test_case_comprehensive.rb: Change test class name from
TestComprehensiveCaseFold to TestComprehensiveCaseMapping because the
tests are about mapping in general, not only folding
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56966 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
* enc/unicode.c: Add U+A64B to the special cases 03B9 and 03BC
at the end of onigenc_unicode_case_map (Bug #12990).
* enc/unicode/case-folding.rb: Add U+A64B to the special cases
03B9 and 03BC. Add a comment pointing to enc/unicode.c.
Change warnings to exceptions for unpredicted cases,
because an exception would have been noticed more easily
(the warning was not noticed when upgrading to Unicode 9.0.0).
* test/ruby/enc/test_case_comprehensive.rb: Remove temporary
exclusion of U+A64B from testing.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56941 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
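The mappings involved can be spot-checked as follows. This is a sketch assuming a Ruby version with Unicode 9.0.0 or later; it is not the comprehensive test itself.

```ruby
small_uk     = "\uA64B"   # CYRILLIC SMALL LETTER MONOGRAPH UK
capital_uk   = "\uA64A"   # CYRILLIC CAPITAL LETTER MONOGRAPH UK
unblended_uk = "\u1C88"   # CYRILLIC SMALL LETTER UNBLENDED UK (new in Unicode 9.0.0)

puts small_uk.upcase == capital_uk
puts capital_uk.downcase == small_uk
puts unblended_uk.upcase == capital_uk
```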
|
|
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56937 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
* test/ruby/enc/test_case_comprehensive.rb: fix test condition,
add a temporary check for U+A64B, the only character where the tests
currently fail. (Bug #12990)
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56924 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
Implement non-ASCII case conversion for Windows-1254.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56433 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
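A quick illustration of what non-ASCII case conversion means for this encoding, using c with cedilla as an arbitrarily chosen sample character:

```ruby
small   = "\u00E7".encode("Windows-1254")   # byte 0xE7
capital = small.upcase                      # converted within Windows-1254

puts capital.encoding
puts capital.bytes.map { |b| format("%02X", b) }.join
```

The result stays in Windows-1254; no round trip through UTF-8 is needed for the conversion.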
|
|
* test/ruby/enc/test_regex_casefold.rb (setup): skip with an error
message if CaseFolding.txt is not present, instead of printing
the message, which causes an unknown command error in parallel testing.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56017 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
Implement non-ASCII case conversion for ISO-8859-2, by Yushiro Ishii.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55775 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
Implement non-ASCII case conversion for Windows-1257, by Sho Koike.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55752 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
Implement non-ASCII case conversion for Windows-1250, by Sho Koike.
* ChangeLog: Fixed order of previous two entries.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55751 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
Implement non-ASCII case conversion for Windows-1251, by Shunsuke Sato.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55750 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
Implement non-ASCII case conversion for Windows-1251, by Shunsuke Sato.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55749 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55747 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|