= Racc Grammar File Reference

== Global Structure

== Class Block and User Code Block

There are two blocks at the top level:
the 'class' block and the 'user code' block. The 'user code' block MUST
be placed after the 'class' block.

== Comment

You can insert comments almost anywhere. Two comment styles are supported:
Ruby style (#.....) and C style (/*......*/).

== Class Block

The class block is formed like this:

  class CLASS_NAME
    [precedence table]
    [token declarations]
    [expected number of S/R conflicts]
    [options]
    [semantic value conversion]
    [start rule]
  rule
    GRAMMARS

CLASS_NAME is the name of the parser class to be generated.

If CLASS_NAME includes '::', Racc outputs module clauses.
For example, writing "class M::C" produces the code below:

  module M
    class C
      :
      :
    end
  end
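
The mapping from a qualified name to nested clauses can be sketched in plain
Ruby. This is only an illustration: nested_definition is a hypothetical helper,
not part of Racc, and a real generated file contains much more than this
skeleton.

```ruby
# Hypothetical helper (not part of Racc): builds the nested
# module/class skeleton for a '::'-qualified class name.
def nested_definition(class_name, body = "# parser body")
  *modules, klass = class_name.split("::")
  lines = []
  # One "module" line per enclosing namespace, indented by depth.
  modules.each_with_index do |mod, depth|
    lines << "#{'  ' * depth}module #{mod}"
  end
  depth = modules.size
  lines << "#{'  ' * depth}class #{klass}"
  lines << "#{'  ' * (depth + 1)}#{body}"
  lines << "#{'  ' * depth}end"
  # Close the modules from the innermost to the outermost.
  (modules.size - 1).downto(0) do |d|
    lines << "#{'  ' * d}end"
  end
  lines.join("\n")
end

puts nested_definition("M::C")
```

A name without '::' yields a plain class definition with no module wrapper.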

== Grammar Block

The grammar block describes the grammar that the parser is able
to understand.  The syntax is:

  (token): (token) (token) (token).... (action)

  (token): (token) (token) (token).... (action)
         | (token) (token) (token).... (action)
         | (token) (token) (token).... (action)

(action) is code that is executed when its (token)s are found.
It is a Ruby code block surrounded by braces:

  { print val[0]
    puts val[1] }

Note that you cannot use '%' strings, here documents, or '%r' regexps in actions.

Actions can be omitted.
When an action is omitted, '' (empty string) is used as the action.

The return value of an action becomes the value of the left-hand side ($$).
It is the value of result, or the value returned by a "return" statement.

Here is an example of a whole grammar block.

  rule
    goal: definition rules source { result = val }

    definition: /* none */   { result = [] }
      | definition startdesig  { result[0] = val[1] }
      | definition
               precrule   # this line continues from the line above
        {
          result[1] = val[1]
        }

    startdesig: START TOKEN

You can use the following special local variables in actions.

* result ($$)

The value of the left-hand side (lhs). The default value is val[0].

* val ($1,$2,$3...)

An array of the values of the right-hand side (rhs).

* _values (...$-2,$-1,$0)

A stack of values.
DO NOT MODIFY this stack unless you know what you are doing.
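
As a rough mental model, an action behaves like a method that receives val,
_values and result, with result preset to val[0], and whose result becomes the
value of the left-hand side. The sketch below imitates that in plain Ruby for a
hypothetical rule "exp: exp '+' exp { result += val[2] }". All method names are
made up for illustration, and the code Racc actually generates may look
different.

```ruby
# Hypothetical stand-in for the action of:
#   exp: exp '+' exp { result += val[2] }
def reduce_add(val, _values, result)
  result += val[2]   # the action body
  result             # becomes the value of the left-hand side
end

# Hypothetical stand-in for a rule whose action is omitted:
# result keeps its default value.
def reduce_default(val, _values, result)
  result
end

# Imitates one reduction: result starts out as val[0].
def run_action(name, val, values_stack = [])
  send(name, val, values_stack, val[0])
end

p run_action(:reduce_add, [1, '+', 2])    # prints 3
p run_action(:reduce_default, [:a, :b])   # prints :a
```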

== Operator Precedence

This feature corresponds to the operator precedence declarations in yacc.
Write the block like this:

  prechigh
    nonassoc '++'
    left     '*' '/'
    left     '+' '-'
    right    '='
  preclow

`right' is yacc's %right, `left' is yacc's %left, and `nonassoc' is yacc's %nonassoc.

`=' + (symbol) means yacc's %prec:

  prechigh
    nonassoc UMINUS
    left '*' '/'
    left '+' '-'
  preclow

  rule
    exp: exp '*' exp
       | exp '-' exp
       | '-' exp       =UMINUS   # equivalent to "%prec UMINUS"
           :
           :
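
The effect of a precedence table can be sketched with the usual yacc-style rule
for resolving a shift/reduce conflict: compare the precedence of the rule being
reduced with that of the lookahead token; the higher one wins, and on a tie the
associativity decides. The code below is a plain Ruby illustration under that
assumption. PRECEDENCE, precedence_of and resolve are hypothetical names, the
table merges the two examples above, and this is not Racc's internal code.

```ruby
# Table listed from high to low, as between prechigh and preclow.
PRECEDENCE = [
  [:nonassoc, ['UMINUS']],
  [:left,     ['*', '/']],
  [:left,     ['+', '-']],
  [:right,    ['=']],
].freeze

# Returns [level, assoc] for a symbol; a larger level binds tighter.
def precedence_of(symbol)
  PRECEDENCE.each_with_index do |(assoc, symbols), index|
    return [PRECEDENCE.size - index, assoc] if symbols.include?(symbol)
  end
  nil
end

# Decides between reducing a rule whose precedence comes from
# rule_symbol and shifting the lookahead token.
def resolve(rule_symbol, lookahead)
  rule_level, = precedence_of(rule_symbol)
  token_level, assoc = precedence_of(lookahead)
  return :conflict if rule_level.nil? || token_level.nil?
  return :reduce if rule_level > token_level
  return :shift  if rule_level < token_level
  case assoc
  when :left  then :reduce
  when :right then :shift
  else :error  # nonassoc: the construct is rejected
  end
end

p resolve('*', '+')       # :reduce, the multiplication is grouped first
p resolve('+', '*')       # :shift
p resolve('-', '-')       # :reduce, left associative
p resolve('=', '=')       # :shift, right associative
p resolve('UMINUS', '*')  # :reduce, the =UMINUS rule binds tighter than '*'
```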

== expect

Racc supports Bison's "expect" directive.

  # Example

  class MyParser
    expect 3
  rule
      :
      :

This directive declares the "expected" number of shift/reduce conflicts.
If the "expected" number equals the actual number of conflicts,
Racc does not print a conflict warning message.

== Declaring Tokens

By declaring tokens, you can avoid many needless bugs.
If a declared token does not exist, or an existing token is not declared,
Racc outputs warnings.  The declaration syntax is:

  token TOKEN_NAME AND_IS_THIS
        ALSO_THIS_IS AGAIN_AND_AGAIN THIS_IS_LAST
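
The kind of check this enables can be sketched as a comparison between the
declared tokens and the terminals the rules actually use, that is, symbols that
appear on a right-hand side but never on a left-hand side. This is a plain Ruby
illustration with hypothetical names and made-up rules; Racc's own check and
its warning text may differ.

```ruby
# Hypothetical grammar: left-hand side => list of right-hand sides.
RULES = {
  'stmt' => [['IDENT', '=', 'expr']],
  'expr' => [['NUMBER'], ['IDENT'], ['expr', 'PLUS', 'expr']],
}.freeze

# Declared tokens; note the misspelled PLUSS.
DECLARED = ['IDENT', 'NUMBER', 'PLUSS'].freeze

# Reports mismatches between declared tokens and used terminals.
# Quoted tokens such as '=' are skipped, on the assumption that
# only named tokens are declared.
def token_warnings(rules, declared)
  used = rules.values.flatten.uniq - rules.keys
  named = used.grep(/\A[A-Za-z_]\w*\z/)
  (declared - named).map { |t| "declared but not used: #{t}" } +
    (named - declared).map { |t| "used but not declared: #{t}" }
end

puts token_warnings(RULES, DECLARED)
```

Here the misspelling shows up twice, once as an unused declaration (PLUSS) and
once as an undeclared token (PLUS), which is what makes the typo easy to spot.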

== Options

You can write options for the racc command in your racc file.

  options OPTION OPTION ...

Options are:

* omit_action_call

Omit calls to empty actions, or not.

* result_var

Use, or do not use, the local variable "result".

You can use the 'no_' prefix to invert the meaning of an option.

== Converting Token Symbol

Token symbols are, by default,

  * naked token string in racc file (TOK, XFILE, this_is_token, ...)
    --> symbol (:TOK, :XFILE, :this_is_token, ...)
  * quoted string (':', '.', '(', ...)
    --> same string (':', '.', '(', ...)
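
In practice this means that the scanner you write hands the parser pairs of
token symbol and value that follow these defaults. Below is a minimal tokenizer
sketch in plain Ruby. tokenize is a hypothetical helper, NUMBER and IDENT are
made-up token names, and the final pair assumes Racc's convention that a false
token symbol marks the end of input.

```ruby
# Hypothetical scanner: turns input such as "x + 12" into
# [token_symbol, value] pairs.
def tokenize(source)
  tokens = []
  rest = source
  until rest.empty?
    case rest
    when /\A\s+/
      # skip whitespace
    when /\A\d+/
      tokens << [:NUMBER, $&.to_i]   # naked token NUMBER --> :NUMBER
    when /\A[a-z_]\w*/
      tokens << [:IDENT, $&]         # naked token IDENT --> :IDENT
    when /\A./
      tokens << [$&, $&]             # quoted token '+' --> "+"
    end
    rest = $'                        # continue after the matched text
  end
  tokens << [false, '$end']          # assumed end-of-input marker
end

p tokenize("x + 12")
```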

You can change this default with a "convert" block.
Here is an example:

  convert
    PLUS 'PlusClass'      # Use PlusClass as the symbol for `PLUS'
    MIN  'MinusClass'     # Use MinusClass as the symbol for `MIN'
  end

Almost any Ruby value can be used as a token symbol,
except 'false' and 'nil'.  These cause unexpected parse errors.

If you want to use a String as a token symbol, special care is required.
For example:

  convert
    class '"cls"'            # in code, "cls"
    PLUS '"plus\n"'          # in code, "plus\n"
    MIN  "\"minus#{val}\""   # in code, \"minus#{val}\"
  end

== Start Rule

This corresponds to '%start' in yacc. It changes the start rule.

  start real_target

I expect this statement will rarely, if ever, be needed.

== User Code Block

"User Code Block" is Ruby source code which is copied to the output.
There are three user code blocks: "header", "inner", and "footer".

The format of user code is like this:

  ---- header
    ruby statement
    ruby statement
    ruby statement

  ---- inner
    ruby statement
       :
       :

If a line begins with four '-' characters,
Racc treats it as the beginning of a user code block.
The name of a user code block must be one word.
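
The splitting rule can be sketched in plain Ruby as follows.
split_user_code is a hypothetical helper written for illustration, not Racc's
scanner, and it assumes that the block name is the first word after the four
dashes.

```ruby
# Hypothetical helper: separates the class block from the user
# code blocks of a grammar file given as a string.
def split_user_code(source)
  class_block = +""
  user_code = {}
  current = nil
  source.each_line do |line|
    if (m = /\A----\s*(\w+)/.match(line))
      current = m[1]             # a new user code block starts here
      user_code[current] = +""
    elsif current
      user_code[current] << line # inside a user code block
    else
      class_block << line        # still inside the class block
    end
  end
  [class_block, user_code]
end

grammar = <<~GRAMMAR
  class MyParser
  rule
    goal: TOKEN
  ---- header
  require 'strscan'
  ---- inner
  def next_token
    @tokens.shift
  end
GRAMMAR

class_block, user_code = split_user_code(grammar)
puts class_block
p user_code.keys   # ["header", "inner"]
```

Everything before the first marker line belongs to the class block, which is
why the user code blocks have to come after it.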