| author | BurdetteLamar <burdettelamar@yahoo.com> | 2025-11-16 19:46:03 +0000 |
|---|---|---|
| committer | Peter Zhu <peter@peterzhu.ca> | 2025-11-18 18:57:29 -0800 |
| commit | 1443f89d6942e19516a0fb10d25876021202ec5e (patch) | |
| tree | 40689c9d0fa56255374f1e81f6d6174c9ecb7244 /doc | |
| parent | 319001192d59bc57923ba3838eb83685cb3af014 (diff) | |
[DOC] Tweaks for String#unicode_normalize
Diffstat (limited to 'doc')
| -rw-r--r-- | doc/string/unicode_normalize.rdoc | 28 |
1 files changed, 28 insertions, 0 deletions
diff --git a/doc/string/unicode_normalize.rdoc b/doc/string/unicode_normalize.rdoc
new file mode 100644
index 0000000000..5f733c0fb8
--- /dev/null
+++ b/doc/string/unicode_normalize.rdoc
@@ -0,0 +1,28 @@
+Returns a copy of +self+ with
+{Unicode normalization}[https://unicode.org/reports/tr15] applied.
+
+Argument +form+ must be one of the following symbols
+(see {Unicode normalization forms}[https://unicode.org/reports/tr15/#Norm_Forms]):
+
+- +:nfc+: Canonical decomposition, followed by canonical composition.
+- +:nfd+: Canonical decomposition.
+- +:nfkc+: Compatibility decomposition, followed by canonical composition.
+- +:nfkd+: Compatibility decomposition.
+
+The encoding of +self+ must be one of:
+
+- <tt>Encoding::UTF_8</tt>.
+- <tt>Encoding::UTF_16BE</tt>.
+- <tt>Encoding::UTF_16LE</tt>.
+- <tt>Encoding::UTF_32BE</tt>.
+- <tt>Encoding::UTF_32LE</tt>.
+- <tt>Encoding::GB18030</tt>.
+- <tt>Encoding::UCS_2BE</tt>.
+- <tt>Encoding::UCS_4BE</tt>.
+
+Examples:
+
+  "a\u0300".unicode_normalize       # => "à" # Lowercase 'a' with grave accent.
+  "a\u0300".unicode_normalize(:nfd) # => "à" # Same.
+
+Related: see {Converting to New String}[rdoc-ref:String@Converting+to+New+String].
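The two examples in the new doc render identically, so the difference between the forms is easy to miss. The sketch below is not part of the commit. It is a small illustration, with sample strings and variable names chosen here as assumptions, that inspects codepoints to show what each form actually produces.

```ruby
# Illustrative sketch only; the sample strings and variable names are
# assumptions, not taken from the commit.

composed   = "\u00E9"   # "é" as one precomposed codepoint
decomposed = "e\u0301"  # "e" followed by COMBINING ACUTE ACCENT

# :nfc is the default form; it composes the pair into one codepoint.
nfc = decomposed.unicode_normalize
# :nfd splits the precomposed character back into base + combining mark.
nfd = composed.unicode_normalize(:nfd)

puts nfc.codepoints.inspect  # one codepoint: U+00E9
puts nfd.codepoints.inspect  # two codepoints: U+0065, U+0301

# Compatibility forms also replace compatibility characters such as the
# "fi" ligature (U+FB01); canonical forms leave them untouched.
ligature = "\uFB01"
nfkc = ligature.unicode_normalize(:nfkc)
nfc_ligature = ligature.unicode_normalize(:nfc)

puts nfkc                          # plain ASCII "fi"
puts nfc_ligature == ligature      # unchanged under canonical normalization

# String#unicode_normalized? checks a string against a form without copying.
puts decomposed.unicode_normalized?        # not in NFC
puts decomposed.unicode_normalized?(:nfd)  # already in NFD
```

Comparing `codepoints` rather than the printed strings is what makes the forms distinguishable, since composed and decomposed text usually display the same glyph.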
