author | duerst <duerst@b2dd03c8-39d4-4d8f-98ff-823fe69b080e> | 2018-07-28 09:44:33 +0000 |
---|---|---|
committer | duerst <duerst@b2dd03c8-39d4-4d8f-98ff-823fe69b080e> | 2018-07-28 09:44:33 +0000 |
commit | a7acec675051f8ed49bbc3ab992ac668e5c29fcf (patch) | |
tree | 9254e6a563212f767b0c5f2bf8f1f3b5739dc4e1 /test | |
parent | 9eb6304aa944183fa0e60a30a3c41a23a4ae1917 (diff) |
fix range check for Hangul jamo trailers in Unicode normalization
* lib/unicode_normalize/normalize.rb: Fix the range check for trailing
Hangul jamo characters in Unicode normalization. Unlike LBASE and
VBASE, which are actual leading and vowel jamo characters,
a value equal to TBASE expresses the absence of a trailing jamo.
This fix is technically correct, but there was no bug because
the regular expressions in lib/unicode_normalize/tables.rb
eliminate jamos equal to TBASE from normalization processing.
* test/test_unicode_normalize.rb: Add preventive test
test_no_trailing_jamo based on
https://github.com/python/cpython/commit/d134809cd3764c6a634eab7bb8995e3e2eff14d5
just in case we ever get a regression.
This closes issue #14934, thanks to MaLin (Lin Ma) for reporting.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@64087 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
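
The range check described in the commit message can be illustrated with a minimal sketch of Hangul composition. This is not the code in lib/unicode_normalize/normalize.rb; the helper name compose_hangul is hypothetical, and the constants are the ones given for Hangul syllable composition in the Unicode Standard. The point to note is the strict lower bound on the trailer index: an offset of 0 from TBASE means "no trailing jamo", so U+11A7 must not be composed.

```ruby
# Constants from the Hangul syllable composition algorithm in the Unicode Standard.
SBASE  = 0xAC00 # first precomposed Hangul syllable
LBASE  = 0x1100 # first leading jamo (an actual character)
VBASE  = 0x1161 # first vowel jamo (an actual character)
TBASE  = 0x11A7 # one BEFORE the first trailing jamo: index 0 means "no trailer"
LCOUNT = 19
VCOUNT = 21
TCOUNT = 28
SCOUNT = LCOUNT * VCOUNT * TCOUNT

# Hypothetical helper for illustration only: composes jamo sequences in an
# array of code points into precomposed syllables.
def compose_hangul(codepoints)
  result = []
  codepoints.each do |cp|
    last = result.last
    if last
      # Leading jamo + vowel jamo => LV syllable.
      l_index = last - LBASE
      v_index = cp - VBASE
      if l_index.between?(0, LCOUNT - 1) && v_index.between?(0, VCOUNT - 1)
        result[-1] = SBASE + (l_index * VCOUNT + v_index) * TCOUNT
        next
      end

      # LV syllable + trailing jamo => LVT syllable.
      s_index = last - SBASE
      t_index = cp - TBASE
      lv_syllable = s_index.between?(0, SCOUNT - 1) && (s_index % TCOUNT).zero?
      # Strict lower bound: t_index == 0 (cp == TBASE) is not a trailing jamo.
      if lv_syllable && t_index > 0 && t_index < TCOUNT
        result[-1] = last + t_index
        next
      end
    end
    result << cp
  end
  result
end
```

With this sketch, [0x1100, 0x1175, 0x11A7] composes only the leading and vowel jamo and leaves U+11A7 untouched, which mirrors the second assertion in test_no_trailing_jamo. A check written as t_index >= 0 would instead absorb U+11A7 into the syllable and silently drop it.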
Diffstat (limited to 'test')
-rw-r--r-- | test/test_unicode_normalize.rb | 7 |
1 files changed, 7 insertions, 0 deletions
diff --git a/test/test_unicode_normalize.rb b/test/test_unicode_normalize.rb
index 0fc84343d0..fbd979206a 100644
--- a/test/test_unicode_normalize.rb
+++ b/test/test_unicode_normalize.rb
@@ -167,6 +167,13 @@ class TestUnicodeNormalize
     assert_equal "\u1100\u1161\u11A8", "\uAC00\u11A8".unicode_normalize(:nfd)
   end
 
+  # preventive tests for (non-)bug #14934
+  def test_no_trailing_jamo
+    assert_equal "\u1100\u1176\u11a8", "\u1100\u1176\u11a8".unicode_normalize(:nfc)
+    assert_equal "\uae30\u11a7", "\u1100\u1175\u11a7".unicode_normalize(:nfc)
+    assert_equal "\uae30\u11c3", "\u1100\u1175\u11c3".unicode_normalize(:nfc)
+  end
+
   def test_hangul_plus_accents
     assert_equal "\uAC00\u0323\u0300", "\uAC00\u0300\u0323".unicode_normalize(:nfc)
     assert_equal "\uAC00\u0323\u0300", "\u1100\u1161\u0300\u0323".unicode_normalize(:nfc)