author BurdetteLamar <burdettelamar@yahoo.com> 2025-11-16 19:46:03 +0000
committer Peter Zhu <peter@peterzhu.ca> 2025-11-18 18:57:29 -0800
commit 1443f89d6942e19516a0fb10d25876021202ec5e (patch)
tree 40689c9d0fa56255374f1e81f6d6174c9ecb7244 /doc
parent 319001192d59bc57923ba3838eb83685cb3af014 (diff)
[DOC] Tweaks for String#unicode_normalize
Diffstat (limited to 'doc')
-rw-r--r-- doc/string/unicode_normalize.rdoc 28
1 files changed, 28 insertions, 0 deletions
diff --git a/doc/string/unicode_normalize.rdoc b/doc/string/unicode_normalize.rdoc
new file mode 100644
index 0000000000..5f733c0fb8
--- /dev/null
+++ b/doc/string/unicode_normalize.rdoc
@@ -0,0 +1,28 @@
+Returns a copy of +self+ with
+{Unicode normalization}[https://unicode.org/reports/tr15] applied.
+
+Argument +form+ must be one of the following symbols
+(see {Unicode normalization forms}[https://unicode.org/reports/tr15/#Norm_Forms]):
+
+- +:nfc+: Canonical decomposition, followed by canonical composition.
+- +:nfd+: Canonical decomposition.
+- +:nfkc+: Compatibility decomposition, followed by canonical composition.
+- +:nfkd+: Compatibility decomposition.
+
+The encoding of +self+ must be one of:
+
+- <tt>Encoding::UTF_8</tt>.
+- <tt>Encoding::UTF_16BE</tt>.
+- <tt>Encoding::UTF_16LE</tt>.
+- <tt>Encoding::UTF_32BE</tt>.
+- <tt>Encoding::UTF_32LE</tt>.
+- <tt>Encoding::GB18030</tt>.
+- <tt>Encoding::UCS_2BE</tt>.
+- <tt>Encoding::UCS_4BE</tt>.
+
+Examples:
+
+ "a\u0300".unicode_normalize # => "à" # Lowercase 'a' with grave accent.
+ "a\u0300".unicode_normalize(:nfd) # => "à" # Displays the same, but stays decomposed ('a' + U+0300).
+
+Related: see {Converting to New String}[rdoc-ref:String@Converting+to+New+String].
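
Note on the examples in this patch: both results render as the same glyph, so the difference between the forms is easier to see through code points. The following is a minimal sketch, not part of the committed file; the variable names are hypothetical, and the ligature U+FB01 is an illustrative character chosen here to show compatibility decomposition.

```ruby
# Illustrative sketch only; variable names are hypothetical.
decomposed = "a\u0300" # 'a' followed by U+0300 COMBINING GRAVE ACCENT

nfc = decomposed.unicode_normalize       # form defaults to :nfc
nfd = decomposed.unicode_normalize(:nfd)

# The two results look alike when printed but differ in code points.
nfc_codepoints = nfc.codepoints # composed: single code point U+00E0
nfd_codepoints = nfd.codepoints # decomposed: U+0061 followed by U+0300

# Compatibility forms additionally replace compatibility characters,
# such as the 'fi' ligature (U+FB01), with their plain equivalents.
ligature = "\uFB01"
canonical     = ligature.unicode_normalize(:nfc)  # ligature is kept
compatibility = ligature.unicode_normalize(:nfkc) # becomes "fi"

# A string in another supported Unicode encoding keeps that encoding.
utf16 = decomposed.encode(Encoding::UTF_16LE).unicode_normalize
utf16_encoding = utf16.encoding
```

Comparing `nfc == nfd` returns `false` here, which is why normalizing both sides to the same form before comparing strings is the usual practice.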