<feed xmlns='http://www.w3.org/2005/Atom'>
<title>ruby.git/internal/ruby_parser.h, branch v4.0.2</title>
<subtitle>The Ruby Programming Language</subtitle>
<link rel='alternate' type='text/html' href='https://git.ruby-lang.org/ruby.git/'/>
<entry>
<title>Change return value of `gets` function to be `rb_parser_string_t *` instead of `VALUE`</title>
<updated>2024-05-04T02:59:10+00:00</updated>
<author>
<name>yui-knk</name>
<email>spiketeika@gmail.com</email>
</author>
<published>2024-04-25T07:09:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.ruby-lang.org/ruby.git/commit/?id=cf74ff714aa795a82cc0ea46e26937efcedfaa45'/>
<id>cf74ff714aa795a82cc0ea46e26937efcedfaa45</id>
<content type='text'>
This change reduces parser's dependency on ruby object.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This change reduces parser's dependency on ruby object.
</pre>
</div>
</content>
</entry>
<entry>
<title>Rename `vast` to `ast_value`</title>
<updated>2024-05-03T03:40:35+00:00</updated>
<author>
<name>yui-knk</name>
<email>spiketeika@gmail.com</email>
</author>
<published>2024-05-02T23:57:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.ruby-lang.org/ruby.git/commit/?id=899d9f79dde0e2dbb2da3a6ec7c1cbf1023cc56d'/>
<id>899d9f79dde0e2dbb2da3a6ec7c1cbf1023cc56d</id>
<content type='text'>
There is an English word "vast".
This commit changes the name to be more clear name to avoid confusion.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
There is an English word "vast".
This commit changes the name to be more clear name to avoid confusion.
</pre>
</div>
</content>
</entry>
<entry>
<title>Add line_count field to rb_ast_body_t</title>
<updated>2024-04-27T03:08:26+00:00</updated>
<author>
<name>HASUMI Hitoshi</name>
<email>hasumikin@gmail.com</email>
</author>
<published>2024-04-26T12:43:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.ruby-lang.org/ruby.git/commit/?id=55a402bb759597487905a94d5b1bbff4a672499c'/>
<id>55a402bb759597487905a94d5b1bbff4a672499c</id>
<content type='text'>
This patch adds `int line_count` field to `rb_ast_body_t` structure.
Instead, we no longer cast `script_lines` to Fixnum.

## Background

Ref https://github.com/ruby/ruby/pull/10618

In the PR above, we have decoupled IMEMO from `rb_ast_t`.
This means we could lift the five-words-restriction of the structure
that forced us to unionize `rb_ast_t *` and `FIXNUM` in one field.

## Relating refactor

- Remove the second parameter of `rb_ruby_ast_new()` function

## Attention

I will remove a code that assigns -1 to line_count, in `rb_binding_add_dynavars()`
of vm.c, because I don't think it is necessary.
But I will make another PR for this so that we can atomically revert
in case I was wrong (See the comment on the code)
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch adds `int line_count` field to `rb_ast_body_t` structure.
Instead, we no longer cast `script_lines` to Fixnum.

## Background

Ref https://github.com/ruby/ruby/pull/10618

In the PR above, we have decoupled IMEMO from `rb_ast_t`.
This means we could lift the five-words-restriction of the structure
that forced us to unionize `rb_ast_t *` and `FIXNUM` in one field.

## Relating refactor

- Remove the second parameter of `rb_ruby_ast_new()` function

## Attention

I will remove a code that assigns -1 to line_count, in `rb_binding_add_dynavars()`
of vm.c, because I don't think it is necessary.
But I will make another PR for this so that we can atomically revert
in case I was wrong (See the comment on the code)
</pre>
</div>
</content>
</entry>
<entry>
<title>Set `SCRIPT_LINES__` outside of parser</title>
<updated>2024-04-26T11:34:49+00:00</updated>
<author>
<name>yui-knk</name>
<email>spiketeika@gmail.com</email>
</author>
<published>2024-04-26T09:38:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.ruby-lang.org/ruby.git/commit/?id=140c59c633b554d437ebcc0931bcc7f4fc5af235'/>
<id>140c59c633b554d437ebcc0931bcc7f4fc5af235</id>
<content type='text'>
Parser should not depend on functions defiend on "ruby_parser.c".
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Parser should not depend on functions defiend on "ruby_parser.c".
</pre>
</div>
</content>
</entry>
<entry>
<title>[Universal parser] Decouple IMEMO from rb_ast_t</title>
<updated>2024-04-26T02:21:08+00:00</updated>
<author>
<name>HASUMI Hitoshi</name>
<email>hasumikin@gmail.com</email>
</author>
<published>2024-04-16T09:42:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.ruby-lang.org/ruby.git/commit/?id=2244c58b009c31da60ad108d6cbccf99d97a5e29'/>
<id>2244c58b009c31da60ad108d6cbccf99d97a5e29</id>
<content type='text'>
This patch removes the `VALUE flags` member from the `rb_ast_t` structure making `rb_ast_t` no longer an IMEMO object.

## Background

We are trying to make the Ruby parser generated from parse.y a universal parser that can be used by other implementations such as mruby.
To achieve this, it is necessary to exclude VALUE and IMEMO from parse.y, AST, and NODE.

## Summary (file by file)

- `rubyparser.h`
  - Remove the `VALUE flags` member from `rb_ast_t`
- `ruby_parser.c` and `internal/ruby_parser.h`
  - Use TypedData_Make_Struct VALUE which wraps `rb_ast_t` `in ast_alloc()` so that GC can manage it
    - You can retrieve `rb_ast_t` from the VALUE by `rb_ruby_ast_data_get()`
  - Change the return type of `rb_parser_compile_XXXX()` functions from `rb_ast_t *` to `VALUE`
  - rb_ruby_ast_new() which internally `calls ast_alloc()` is to create VALUE vast outside ruby_parser.c
- `iseq.c` and `vm_core.h`
  - Amend the first parameter of `rb_iseq_new_XXXX()` functions from `rb_ast_body_t *` to `VALUE`
  - This keeps the VALUE of AST on the machine stack to prevent being removed by GC
- `ast.c`
  - Almost all change is replacement `rb_ast_t *ast` with `VALUE vast` (sorry for the big diff)
  - Fix `node_memsize()`
    - Now it includes `rb_ast_local_table_link`, `tokens` and script_lines
- `compile.c`, `load.c`, `node.c`, `parse.y`, `proc.c`, `ruby.c`, `template/prelude.c.tmpl`, `vm.c` and `vm_eval.c`
  - Follow-up due to the above changes
- `imemo.{c|h}`
  - If an object with `imemo_ast` appears, considers it a bug

Co-authored-by: Nobuyoshi Nakada &lt;nobu@ruby-lang.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch removes the `VALUE flags` member from the `rb_ast_t` structure making `rb_ast_t` no longer an IMEMO object.

## Background

We are trying to make the Ruby parser generated from parse.y a universal parser that can be used by other implementations such as mruby.
To achieve this, it is necessary to exclude VALUE and IMEMO from parse.y, AST, and NODE.

## Summary (file by file)

- `rubyparser.h`
  - Remove the `VALUE flags` member from `rb_ast_t`
- `ruby_parser.c` and `internal/ruby_parser.h`
  - Use TypedData_Make_Struct VALUE which wraps `rb_ast_t` `in ast_alloc()` so that GC can manage it
    - You can retrieve `rb_ast_t` from the VALUE by `rb_ruby_ast_data_get()`
  - Change the return type of `rb_parser_compile_XXXX()` functions from `rb_ast_t *` to `VALUE`
  - rb_ruby_ast_new() which internally `calls ast_alloc()` is to create VALUE vast outside ruby_parser.c
- `iseq.c` and `vm_core.h`
  - Amend the first parameter of `rb_iseq_new_XXXX()` functions from `rb_ast_body_t *` to `VALUE`
  - This keeps the VALUE of AST on the machine stack to prevent being removed by GC
- `ast.c`
  - Almost all change is replacement `rb_ast_t *ast` with `VALUE vast` (sorry for the big diff)
  - Fix `node_memsize()`
    - Now it includes `rb_ast_local_table_link`, `tokens` and script_lines
- `compile.c`, `load.c`, `node.c`, `parse.y`, `proc.c`, `ruby.c`, `template/prelude.c.tmpl`, `vm.c` and `vm_eval.c`
  - Follow-up due to the above changes
- `imemo.{c|h}`
  - If an object with `imemo_ast` appears, considers it a bug

Co-authored-by: Nobuyoshi Nakada &lt;nobu@ruby-lang.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Refactor parser compile functions</title>
<updated>2024-04-22T22:20:22+00:00</updated>
<author>
<name>yui-knk</name>
<email>spiketeika@gmail.com</email>
</author>
<published>2024-04-21T00:54:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.ruby-lang.org/ruby.git/commit/?id=2992e1074adf86ed6c06ba1750648a35d877001a'/>
<id>2992e1074adf86ed6c06ba1750648a35d877001a</id>
<content type='text'>
Refactor parser compile functions to reduce the dependence
on ruby functions.
This commit includes these changes

1. Refactor `gets`, `input` and `gets_` of `parser_params`

Parser needs two different data structure to get next line, function (`gets`) and input data (`input`).
However `gets_` is used for both function (`call`) and input data (`ptr`).
`call` is used for managing general callback function when `rb_ruby_parser_compile_generic` is used.
`ptr` is used for managing the current pointer on String when `parser_compile_string` is used.
This commit changes parser to used only `gets` and `input` then removes `gets_`.

2. Move parser_compile functions and `gets` functions from parse.y to ruby_parser.c

This change reduces the dependence on ruby functions from parser.

3. Change ruby_parser and ripper to take care of `VALUE input` GC mark

Move the responsibility of calling `rb_gc_mark` for `VALUE input` from parser to ruby_parser and ripper.
`input` is arbitrary data pointer from the viewpoint of parser.

4. Introduce rb_parser_compile_array function

Caller of `rb_parser_compile_generic` needs to take care about GC because ruby_parser doesn’t know
about the detail of `lex_gets` and `input`.
Introduce `rb_parser_compile_array` to reduce the complexity of ast.c.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Refactor parser compile functions to reduce the dependence
on ruby functions.
This commit includes these changes

1. Refactor `gets`, `input` and `gets_` of `parser_params`

Parser needs two different data structure to get next line, function (`gets`) and input data (`input`).
However `gets_` is used for both function (`call`) and input data (`ptr`).
`call` is used for managing general callback function when `rb_ruby_parser_compile_generic` is used.
`ptr` is used for managing the current pointer on String when `parser_compile_string` is used.
This commit changes parser to used only `gets` and `input` then removes `gets_`.

2. Move parser_compile functions and `gets` functions from parse.y to ruby_parser.c

This change reduces the dependence on ruby functions from parser.

3. Change ruby_parser and ripper to take care of `VALUE input` GC mark

Move the responsibility of calling `rb_gc_mark` for `VALUE input` from parser to ruby_parser and ripper.
`input` is arbitrary data pointer from the viewpoint of parser.

4. Introduce rb_parser_compile_array function

Caller of `rb_parser_compile_generic` needs to take care about GC because ruby_parser doesn’t know
about the detail of `lex_gets` and `input`.
Introduce `rb_parser_compile_array` to reduce the complexity of ast.c.
</pre>
</div>
</content>
</entry>
<entry>
<title>Remove unused function</title>
<updated>2024-04-20T10:30:26+00:00</updated>
<author>
<name>yui-knk</name>
<email>spiketeika@gmail.com</email>
</author>
<published>2024-04-20T02:15:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.ruby-lang.org/ruby.git/commit/?id=af169472c791c1ed97f8e4397c0b8628dc6a6c50'/>
<id>af169472c791c1ed97f8e4397c0b8628dc6a6c50</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>[Universal parser] DeVALUE of p-&gt;debug_lines and ast-&gt;body.script_lines</title>
<updated>2024-04-15T11:51:54+00:00</updated>
<author>
<name>HASUMI Hitoshi</name>
<email>hasumikin@gmail.com</email>
</author>
<published>2024-03-28T01:26:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.ruby-lang.org/ruby.git/commit/?id=9b1e97b211565b605b8eb7fab277efe117fe2604'/>
<id>9b1e97b211565b605b8eb7fab277efe117fe2604</id>
<content type='text'>
This patch is part of universal parser work.

## Summary
- Decouple VALUE from members below:
  - `(struct parser_params *)-&gt;debug_lines`
  - `(rb_ast_t *)-&gt;body.script_lines`
- Instead, they are now `rb_parser_ary_t *`
  - They can also be a `(VALUE)FIXNUM` as before to hold line count
- `ISEQ_BODY(iseq)-&gt;variable.script_lines` remains VALUE
  - In order to do this,
  - Add `VALUE script_lines` param to `rb_iseq_new_with_opt()`
  - Introduce `rb_parser_build_script_lines_from()` to convert `rb_parser_ary_t *` into `VALUE`

## Other details
- Extend `rb_parser_ary_t *`. It previously could only store `rb_parser_ast_token *`, now can store script_lines, too
- Change tactics of building the top-level `SCRIPT_LINES__` in `yycompile0()`
  - Before: While parsing, each line of the script is added to `SCRIPT_LINES__[path]`
  - After: After `yyparse(p)`, `SCRIPT_LINES__[path]` will be built from `p-&gt;debug_lines`
- Remove the second parameter of `rb_parser_set_script_lines()` to make it simple
- Introduce `script_lines_free()` to be called from `rb_ast_free()` because the GC no longer takes care of the script_lines
- Introduce `rb_parser_string_deep_copy()` in parse.y to maintain script_lines when `rb_ruby_parser_free()` called
  - With regard to this, please see *Future tasks* below

## Future tasks
- Decouple IMEMO from `rb_ast_t *`
  - This lifts the five-members-restriction of Ruby object,
  - So we will be able to move the ownership of the `lex.string_buffer` from parser to AST
  - Then we remove `rb_parser_string_deep_copy()` to make the whole thing simple
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch is part of universal parser work.

## Summary
- Decouple VALUE from members below:
  - `(struct parser_params *)-&gt;debug_lines`
  - `(rb_ast_t *)-&gt;body.script_lines`
- Instead, they are now `rb_parser_ary_t *`
  - They can also be a `(VALUE)FIXNUM` as before to hold line count
- `ISEQ_BODY(iseq)-&gt;variable.script_lines` remains VALUE
  - In order to do this,
  - Add `VALUE script_lines` param to `rb_iseq_new_with_opt()`
  - Introduce `rb_parser_build_script_lines_from()` to convert `rb_parser_ary_t *` into `VALUE`

## Other details
- Extend `rb_parser_ary_t *`. It previously could only store `rb_parser_ast_token *`, now can store script_lines, too
- Change tactics of building the top-level `SCRIPT_LINES__` in `yycompile0()`
  - Before: While parsing, each line of the script is added to `SCRIPT_LINES__[path]`
  - After: After `yyparse(p)`, `SCRIPT_LINES__[path]` will be built from `p-&gt;debug_lines`
- Remove the second parameter of `rb_parser_set_script_lines()` to make it simple
- Introduce `script_lines_free()` to be called from `rb_ast_free()` because the GC no longer takes care of the script_lines
- Introduce `rb_parser_string_deep_copy()` in parse.y to maintain script_lines when `rb_ruby_parser_free()` called
  - With regard to this, please see *Future tasks* below

## Future tasks
- Decouple IMEMO from `rb_ast_t *`
  - This lifts the five-members-restriction of Ruby object,
  - So we will be able to move the ownership of the `lex.string_buffer` from parser to AST
  - Then we remove `rb_parser_string_deep_copy()` to make the whole thing simple
</pre>
</div>
</content>
</entry>
<entry>
<title>compile.c: use rb_enc_interned_str to reduce allocations</title>
<updated>2024-04-11T07:04:31+00:00</updated>
<author>
<name>Jean Boussier</name>
<email>byroot@ruby-lang.org</email>
</author>
<published>2024-04-10T08:50:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.ruby-lang.org/ruby.git/commit/?id=1b830740ba8371c4bcfdfc6eb2cb7e0ae81a84e0'/>
<id>1b830740ba8371c4bcfdfc6eb2cb7e0ae81a84e0</id>
<content type='text'>
The `rb_fstring(rb_enc_str_new())` pattern is inneficient because:

- It passes a mutable string to `rb_fstring` so if it has to be interned
  it will first be duped.
- It an equivalent interned string already exists, we allocated the string
  for nothing.

With `rb_enc_interned_str` we either directly get the pre-existing string
with 0 allocations, or efficiently directly intern the one we create
without first duping it.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The `rb_fstring(rb_enc_str_new())` pattern is inneficient because:

- It passes a mutable string to `rb_fstring` so if it has to be interned
  it will first be duped.
- It an equivalent interned string already exists, we allocated the string
  for nothing.

With `rb_enc_interned_str` we either directly get the pre-existing string
with 0 allocations, or efficiently directly intern the one we create
without first duping it.
</pre>
</div>
</content>
</entry>
<entry>
<title>Move shareable_constant_value logic from parse.y to compile.c</title>
<updated>2024-04-03T23:44:10+00:00</updated>
<author>
<name>yui-knk</name>
<email>spiketeika@gmail.com</email>
</author>
<published>2024-04-03T10:28:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.ruby-lang.org/ruby.git/commit/?id=60567731051885acf38a3f91899f0d6d62d4898b'/>
<id>60567731051885acf38a3f91899f0d6d62d4898b</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
</feed>
