summaryrefslogtreecommitdiff
path: root/doc/string/split.rdoc
blob: 5ab065093bb8d8e4cb8b4110f39eb89a2579d087 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
Returns an array of substrings of +self+
that are the result of splitting +self+
at each occurrence of the given field separator +field_sep+.

When +field_sep+ is <tt>$;</tt>:

- If <tt>$;</tt> is +nil+ (its default value),
  the split occurs just as if +field_sep+ were given as a space character
  (see below).

- If <tt>$;</tt> is a string,
  the split occurs just as if +field_sep+ were given as that string
  (see below).

When +field_sep+ is <tt>' '</tt> and +limit+ is +nil+,
the split occurs at each sequence of whitespace:

  'abc def ghi'.split(' ')         => ["abc", "def", "ghi"]
  "abc \n\tdef\t\n  ghi".split(' ') # => ["abc", "def", "ghi"]
  'abc  def   ghi'.split(' ')      => ["abc", "def", "ghi"]
  ''.split(' ')                    => []

When +field_sep+ is a string different from <tt>' '</tt>
and +limit+ is +nil+,
the split occurs at each occurrence of +field_sep+;
trailing empty substrings are not returned:

  'abracadabra'.split('ab')  => ["", "racad", "ra"]
  'aaabcdaaa'.split('a')     => ["", "", "", "bcd"]
  ''.split('a')              => []
  '3.14159'.split('1')       => ["3.", "4", "59"]
  '!@#$%^$&*($)_+'.split('$') # => ["!@#", "%^", "&*(", ")_+"]
  'тест'.split('т')          => ["", "ес"]
  'こんにちは'.split('に')     => ["こん", "ちは"]

When +field_sep+ is a Regexp and +limit+ is +nil+,
the split occurs at each occurrence of a match;
trailing empty substrings are not returned:

  'abracadabra'.split(/ab/) # => ["", "racad", "ra"]
  'aaabcdaaa'.split(/a/)   => ["", "", "", "bcd"]
  'aaabcdaaa'.split(//)    => ["a", "a", "a", "b", "c", "d", "a", "a", "a"]
  '1 + 1 == 2'.split(/\W+/) # => ["1", "1", "2"]

If the \Regexp contains groups, their matches are also included
in the returned array:

  '1:2:3'.split(/(:)()()/, 2) # => ["1", ":", "", "", "2:3"]

As seen above, if +limit+ is +nil+,
trailing empty substrings are not returned;
the same is true if +limit+ is zero:

  'aaabcdaaa'.split('a')   => ["", "", "", "bcd"]
  'aaabcdaaa'.split('a', 0) # => ["", "", "", "bcd"]

If +limit+ is positive integer +n+, no more than <tt>n - 1-</tt>
splits occur, so that at most +n+ substrings are returned,
and trailing empty substrings are included:

  'aaabcdaaa'.split('a', 1) # => ["aaabcdaaa"]
  'aaabcdaaa'.split('a', 2) # => ["", "aabcdaaa"]
  'aaabcdaaa'.split('a', 5) # => ["", "", "", "bcd", "aa"]
  'aaabcdaaa'.split('a', 7) # => ["", "", "", "bcd", "", "", ""]
  'aaabcdaaa'.split('a', 8) # => ["", "", "", "bcd", "", "", ""]

Note that if +field_sep+ is a \Regexp containing groups,
their matches are in the returned array, but do not count toward the limit.

If +limit+ is negative, it behaves the same as if +limit+ was +nil+,
meaning that there is no limit,
and trailing empty substrings are included:

  'aaabcdaaa'.split('a', -1) # => ["", "", "", "bcd", "", "", ""]

If a block is given, it is called with each substring:

  'abc def ghi'.split(' ') {|substring| p substring }

Output:

  "abc"
  "def"
  "ghi"

Related: String#partition, String#rpartition.