Returns an array of the grapheme clusters in +self+
(see {Unicode Grapheme Cluster Boundaries}[https://www.unicode.org/reports/tr29/#Grapheme_Cluster_Boundaries]):

  s = "ä-pqr-b̈-xyz-c̈"
  s.size                   # => 16
  s.bytesize               # => 19
  s.grapheme_clusters.size # => 13
  s.grapheme_clusters
  # => ["ä", "-", "p", "q", "r", "-", "b̈", "-", "x", "y", "z", "-", "c̈"]
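
The combining marks in the literal above are hard to see in source text. As a sketch, the same string can be written with explicit escapes, assuming each accented letter is a base letter followed by U+0308 (combining diaeresis), which is what the counts above imply; the variable name +explicit+ is illustrative only:

```ruby
# Same string, with each combining diaeresis (U+0308) spelled out.
explicit = "a\u0308-pqr-b\u0308-xyz-c\u0308"
explicit.size                    # => 16  # 13 clusters plus 3 combining marks.
explicit.bytesize                # => 19  # U+0308 takes two bytes in UTF-8.
explicit.grapheme_clusters.size  # => 13
```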

A single grapheme cluster may consist of several characters, and therefore several bytes:

  s = "ä"
  s.grapheme_clusters             # => ["ä"]           # One grapheme cluster.
  s.bytes                         # => [97, 204, 136]  # Three bytes.
  s.chars                         # => ["a", "̈"]       # Two characters.
  s.chars.map {|char| char.ord }  # => [97, 776]       # Their codepoints.
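
One place the distinction matters is reversing text. The following is a minimal sketch, not part of the method itself: String#reverse works character by character, so it can detach a combining mark from its base letter, whereas reversing the array of grapheme clusters keeps each cluster intact. The variable names are illustrative only:

```ruby
word = "a\u0308b"                                # Decomposed "ä" followed by "b".
by_char    = word.reverse                        # => "b\u0308a"  # Mark now follows "b".
by_cluster = word.grapheme_clusters.reverse.join # => "ba\u0308"  # Mark stays with "a".
```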

Related: see {Converting to Non-String}[rdoc-ref:String@Converting+to+Non--5CString].