summaryrefslogtreecommitdiff
path: root/parse.y
AgeCommit message (Collapse)Author
2024-04-26[Universal parser] Decouple IMEMO from rb_ast_tHASUMI Hitoshi
This patch removes the `VALUE flags` member from the `rb_ast_t` structure making `rb_ast_t` no longer an IMEMO object. ## Background We are trying to make the Ruby parser generated from parse.y a universal parser that can be used by other implementations such as mruby. To achieve this, it is necessary to exclude VALUE and IMEMO from parse.y, AST, and NODE. ## Summary (file by file) - `rubyparser.h` - Remove the `VALUE flags` member from `rb_ast_t` - `ruby_parser.c` and `internal/ruby_parser.h` - Use TypedData_Make_Struct VALUE which wraps `rb_ast_t` `in ast_alloc()` so that GC can manage it - You can retrieve `rb_ast_t` from the VALUE by `rb_ruby_ast_data_get()` - Change the return type of `rb_parser_compile_XXXX()` functions from `rb_ast_t *` to `VALUE` - rb_ruby_ast_new() which internally `calls ast_alloc()` is to create VALUE vast outside ruby_parser.c - `iseq.c` and `vm_core.h` - Amend the first parameter of `rb_iseq_new_XXXX()` functions from `rb_ast_body_t *` to `VALUE` - This keeps the VALUE of AST on the machine stack to prevent being removed by GC - `ast.c` - Almost all change is replacement `rb_ast_t *ast` with `VALUE vast` (sorry for the big diff) - Fix `node_memsize()` - Now it includes `rb_ast_local_table_link`, `tokens` and script_lines - `compile.c`, `load.c`, `node.c`, `parse.y`, `proc.c`, `ruby.c`, `template/prelude.c.tmpl`, `vm.c` and `vm_eval.c` - Follow-up due to the above changes - `imemo.{c|h}` - If an object with `imemo_ast` appears, considers it a bug Co-authored-by: Nobuyoshi Nakada <nobu@ruby-lang.org>
2024-04-25false is not a pointer卜部昌平
This function returned VALUE before. False made sense back then. Now that it returns a pointer. NULL should be used instead.
2024-04-23Move encoding object conversion outside of parseryui-knk
Reduce the parser's dependence on `VALUE` and `rb_enc_from_encoding`.
2024-04-23Refactor parser compile functionsyui-knk
Refactor parser compile functions to reduce the dependence on ruby functions. This commit includes these changes 1. Refactor `gets`, `input` and `gets_` of `parser_params` Parser needs two different data structure to get next line, function (`gets`) and input data (`input`). However `gets_` is used for both function (`call`) and input data (`ptr`). `call` is used for managing general callback function when `rb_ruby_parser_compile_generic` is used. `ptr` is used for managing the current pointer on String when `parser_compile_string` is used. This commit changes parser to used only `gets` and `input` then removes `gets_`. 2. Move parser_compile functions and `gets` functions from parse.y to ruby_parser.c This change reduces the dependence on ruby functions from parser. 3. Change ruby_parser and ripper to take care of `VALUE input` GC mark Move the responsibility of calling `rb_gc_mark` for `VALUE input` from parser to ruby_parser and ripper. `input` is arbitrary data pointer from the viewpoint of parser. 4. Introduce rb_parser_compile_array function Caller of `rb_parser_compile_generic` needs to take care about GC because ruby_parser doesn’t know about the detail of `lex_gets` and `input`. Introduce `rb_parser_compile_array` to reduce the complexity of ast.c.
2024-04-22Expand `Qnone` and `Qnull` macrosyui-knk
In the past, `Qnone` and `Qnull` had different values in ripper context like below. However 89cfc152071 removes the usage in ripper context, then expand the macro. ``` #ifndef RIPPER # define Qnone 0 # define Qnull 0 #else # define Qnone Qnil # define Qnull Qundef #endif ```
2024-04-20Parser and universal parser share wrapper functionsyui-knk
2024-04-15[Universal parser] DeVALUE of p->debug_lines and ast->body.script_linesHASUMI Hitoshi
This patch is part of universal parser work. ## Summary - Decouple VALUE from members below: - `(struct parser_params *)->debug_lines` - `(rb_ast_t *)->body.script_lines` - Instead, they are now `rb_parser_ary_t *` - They can also be a `(VALUE)FIXNUM` as before to hold line count - `ISEQ_BODY(iseq)->variable.script_lines` remains VALUE - In order to do this, - Add `VALUE script_lines` param to `rb_iseq_new_with_opt()` - Introduce `rb_parser_build_script_lines_from()` to convert `rb_parser_ary_t *` into `VALUE` ## Other details - Extend `rb_parser_ary_t *`. It previously could only store `rb_parser_ast_token *`, now can store script_lines, too - Change tactics of building the top-level `SCRIPT_LINES__` in `yycompile0()` - Before: While parsing, each line of the script is added to `SCRIPT_LINES__[path]` - After: After `yyparse(p)`, `SCRIPT_LINES__[path]` will be built from `p->debug_lines` - Remove the second parameter of `rb_parser_set_script_lines()` to make it simple - Introduce `script_lines_free()` to be called from `rb_ast_free()` because the GC no longer takes care of the script_lines - Introduce `rb_parser_string_deep_copy()` in parse.y to maintain script_lines when `rb_ruby_parser_free()` called - With regard to this, please see *Future tasks* below ## Future tasks - Decouple IMEMO from `rb_ast_t *` - This lifts the five-members-restriction of Ruby object, - So we will be able to move the ownership of the `lex.string_buffer` from parser to AST - Then we remove `rb_parser_string_deep_copy()` to make the whole thing simple
2024-04-15Emit `warn` event for duplicated hash keys on ripperyui-knk
Need to use `rb_warn` macro instead of calling `rb_compile_warn` directly to emit `warn` event on ripper.
2024-04-14Use `rb_parser_string_t *` for `delayed.token`yui-knk
2024-04-12[Bug #20423] Disallow anonymous block within argument forwardingNobuyoshi Nakada
2024-04-11compile.c: use rb_enc_interned_str to reduce allocationsJean Boussier
The `rb_fstring(rb_enc_str_new())` pattern is inneficient because: - It passes a mutable string to `rb_fstring` so if it has to be interned it will first be duped. - It an equivalent interned string already exists, we allocated the string for nothing. With `rb_enc_interned_str` we either directly get the pre-existing string with 0 allocations, or efficiently directly intern the one we create without first duping it.
2024-04-11[Bug #20417] Block local variables do not need to warn about unusedNobuyoshi Nakada
2024-04-11Remove unused AREF macroS-H-GAMELINKS
2024-04-11Fix segv when parsing `command` by ripperyui-knk
89cfc152071 made this event dispatch to pass `Qundef` to user defined callback method by mistake. This commit fix it to be `nil`.
2024-04-07Remove unused macroyui-knk
2024-04-07Fix ripper to dispatch warning event for duplicated when clauseyui-knk
Need to separate `check_literal_when` function for parser and ripper otherwise warning event is not dispatched because parser `rb_warning1` is used in ripper.
2024-04-06Remove redundant conversion between int and objectyui-knk
2024-04-06Fix a variable nameyui-knk
The first argument of `WARN_SPACE_CHAR` is always `c2` in caller side, so `c` equals to `c2`.
2024-04-05Make `nd_plen` to be int to reduce `rb_long2int` usageyui-knk
2024-04-05Remove unused functions from `struct rb_parser_config_struct`yui-knk
2024-04-04NODE_LIT is not used anymoreyui-knk
2024-04-04Move shareable_constant_value logic from parse.y to compile.cyui-knk
2024-04-02Remove `rb_imemo_tmpbuf_t` from parseryui-knk
No parser semantic value types are `VALUE` then no need to use imemo for managing semantic value stack anymore.
2024-04-02[Feature #20331] Simplify parser warnings for hash keys duplication and when ↵yui-knk
clause duplication This commit simplifies warnings for hash keys duplication and when clause duplication, based on the discussion of https://bugs.ruby-lang.org/issues/20331. Warnings are reported only when strings are same to ohters.
2024-04-02Remove VALUE from `struct rb_strterm_struct`yui-knk
In the past, it was imemo. However a075c55 changed it. Therefore no need to use `VALUE` for the first field.
2024-03-28[Bug #20398] Terminate token buffer at invalid octal numberNobuyoshi Nakada
2024-03-26[Bug #20392] Block arguments duplication check at `super`Nobuyoshi Nakada
2024-03-21Fix Ripper memory allocation size when enabled Universal ParserS-H-GAMELINKS
The size of `struct parser_params` is 8 bytes difference in `ripper_s_allocate` and `rb_ruby_parser_allocate` when the universal parser is enabled. This causes a situation where `*r->p` is not fully initialized in `ripper_s_allocate` as shown below. ```console (gdb) p *r->p $2 = {heap = 0x0, lval = 0x0, yylloc = 0x0, lex = {strterm = 0x0, gets = 0x0, input = 0, string_buffer = {head = 0x0, last = 0x0}, lastlin e = 0x0, nextline = 0x0, pbeg = 0x0, pcur = 0x0, pend = 0x0, ptok = 0x0, gets_ = {ptr = 0, call = 0x0}, state = EXPR_NONE, paren_nest = 0, lpar _seen = 0, debug = 0, has_shebang = 0, token_seen = 0, token_info_enabled = 0, error_p = 0, cr_seen = 0, value = 0, result = 0, parsing_thread = 0, s_value = 0, s_lvalue = 0, s_value_stack = 2097} ```` This seems to cause `double free or corruption (!prev)` and SEGV. So, fixing this by introduce `rb_ripper_parser_params_allocate` and `rb_ruby_parser_config` functions for Ripper, and `struct parser_params` same size is returned.
2024-03-21Fix unexpected node bug for `shareable_constant_value: literal`yui-knk
[Bug #20339] [Bug #20341] `const_decl_path` changes the value of `NODE **dest`, LHS of an assignment, with `NODE_LIT` created by `const_decl_path`. `shareable_literal_constant` calls `const_decl_path` via `ensure_shareable_node` multiple times if RHS of an assignment is array or hash. This means `NODE **dest` argument of `const_decl_path` can be `NODE_LIT` from the second time then causes `[BUG] unexpected node: NODE_LIT` in `rb_node_const_decl_val`. This commit change to not update `NODE **dest` in `const_decl_path` to fix the issue.
2024-03-17[Bug #20218] Reject keyword arguments in indexNobuyoshi Nakada
2024-03-17[Bug #19918] Reject block passing in indexNobuyoshi Nakada
2024-03-12Remove unused function in parse.yPeter Zhu
parse.y:2624:1: warning: unused function 'rb_parser_ary_new'
2024-03-12[Universal Parser] Reduce dependence on RArray in parse.yHASUMI Hitoshi
- Introduce `rb_parser_ary_t` structure to partly eliminate RArray from parse.y - In this patch, `parser_params->tokens` and `parser_params->ast->node_buffer->tokens` are now `rb_parser_ary_t *` - Instead, `ast_node_all_tokens()` internally creates a Ruby Array object from the `rb_parser_ary_t` - Also, delete `rb_ast_tokens()` and `rb_ast_set_tokens()` in node.c - Implement `rb_parser_str_escape()` - This is a port of the `rb_str_escape()` function in string.c - `rb_parser_str_escape()` does not depend on `VALUE` (RString) - Instead, it uses `rb_parser_stirng_t *` - This function works when --dump=y option passed - Because WIP of the universal parser, similar functions like `rb_parser_tokens_free()` exist in both node.c and parse.y. Refactoring them may be needed in some way in the future - Although we considered redesigning the structure: `ast->node_buffer->tokens` into `ast->tokens`, we leave it as it is because `rb_ast_t` is an imemo. (We will address it in the future)
2024-02-27Constify `literal_type` unless universal parserNobuyoshi Nakada
2024-02-27Use `'\n'?` instead of `opt_nl`ydah
2024-02-23[Bug #20234] Fix segv when parsing begin statement in method definitioneileencodes
In a method definition, the `begin` may not have an `nd_body`. When that happens we get a null expr back from `last_expr_node` which causes a segv for the following examples: ```ruby def (begin;end).foo; end def (begin;else;end).foo; end def (begin;ensure;else;end).foo; end ``` In addition, I've added tests for other cases that weren't causing a segv but appeared untested.` Fixes https://bugs.ruby-lang.org/issues/20234
2024-02-23Use rb_str_to_interned_str in parse.yPeter Zhu
This commit changes rb_fstring to rb_str_to_interned_str in parse.y. rb_fstring is private so it shouldn't be used by ripper.
2024-02-23[Bug #20295] Fix SEGV when parsing invalid regexpyui-knk
2024-02-22Use `rb_encoding *` as literal hash of NODE_ENCODINGyui-knk
This reduces dependency on VALUE.
2024-02-22Use `terms?` instead of `opt_terms`ydah
2024-02-21Remove not used universal parser macros and functionsyui-knk
2024-02-21`rb_parser_warn_duplicate_keys` doesn't need to take care of NODE_LIT anymoreyui-knk
NODE_LIT is created only for `shareable_constant_value`. This means hash key node is never NODE_LIT.
2024-02-21Remove hack for ripper.y generationyui-knk
Before Rearchitect Ripper (89cfc15), parser and ripper used different semantic value data type for same symbols. "ext/ripper/tools/preproc.rb" replaced these types when it generated ripper.y. Starting the line with other than `%token` suppressed the type replacement. However, after Rearchitect Ripper, both parser and ripper use same semantic value data type. Therefore these comments are not needed anymore.
2024-02-21Introduce NODE_REGX to manage regexp literalyui-knk
2024-02-20Move ripper_validate_object to ripper_init.c.tmplyui-knk
2024-02-20Suppress unused function warning for UNIVERSAL_PARSER buildyui-knk
Suppress the warning: ``` parse.y:2221:1: warning: unused function 'rb_parser_str_hash' [-Wunused-function] 2221 | rb_parser_str_hash(rb_parser_string_t *str) | ^~~~~~~~~~~~~~~~~~ ```
2024-02-20Workaround for `Prism::ParseTest#test_filepath` for ↵yui-knk
"unparser/corpus/literal/def.txt" See the discussion on https://github.com/ruby/ruby/pull/9923
2024-02-20[Feature #20257] Rearchitect Ripperyui-knk
Introduce another semantic value stack for Ripper so that Ripper can manage both Node and Ruby Object separately. This rearchitectutre of Ripper solves these issues. Therefore adding test cases for them. * [Bug 10436] https://bugs.ruby-lang.org/issues/10436 * [Bug 18988] https://bugs.ruby-lang.org/issues/18988 * [Bug 20055] https://bugs.ruby-lang.org/issues/20055 Checked the differences of `Ripper.sexp` for files under `/test/ruby` are only on test_pattern_matching.rb. The differences comes from the differences between `new_hash_pattern_tail` functions between parser and Ripper. Ripper `new_hash_pattern_tail` didn’t call `assignable` then `kw_rest_arg` wasn’t marked as local variable. This is also fixed by this commit. ``` --- a/./tmp/before/test_pattern_matching.rb +++ b/./tmp/after/test_pattern_matching.rb @@ -3607,7 +3607,7 @@ [:in, [:hshptn, nil, [], [:var_field, [:@ident, “a”, [984, 13]]]], [[:binary, - [:vcall, [:@ident, “a”, [985, 10]]], + [:var_ref, [:@ident, “a”, [985, 10]]], :==, [:hash, nil]]], nil]]], @@ -3662,7 +3662,7 @@ [:in, [:hshptn, nil, [], [:var_field, [:@ident, “a”, [993, 13]]]], [[:binary, - [:vcall, [:@ident, “a”, [994, 10]]], + [:var_ref, [:@ident, “a”, [994, 10]]], :==, [:hash, [:assoclist_from_args, @@ -3813,7 +3813,7 @@ [:command, [:@ident, “raise”, [1022, 10]], [:args_add_block, - [[:vcall, [:@ident, “b”, [1022, 16]]]], + [[:var_ref, [:@ident, “b”, [1022, 16]]]], false]]], [:else, [[:var_ref, [:@kw, “true”, [1024, 10]]]]]]]], nil, @@ -3876,7 +3876,7 @@ [:@int, “0”, [1033, 15]]], :“&&“, [:binary, - [:vcall, [:@ident, “b”, [1033, 20]]], + [:var_ref, [:@ident, “b”, [1033, 20]]], :==, [:hash, nil]]]], nil]]], @@ -3946,7 +3946,7 @@ [:@int, “0”, [1042, 15]]], :“&&“, [:binary, - [:vcall, [:@ident, “b”, [1042, 20]]], + [:var_ref, [:@ident, “b”, [1042, 20]]], :==, [:hash, [:assoclist_from_args, @@ -5206,7 +5206,7 @@ [[:assoc_new, [:@label, “c:“, [1352, 22]], [:@int, “0”, [1352, 25]]]]]], - [:vcall, [:@ident, “r”, [1352, 29]]]], + [:var_ref, [:@ident, “r”, [1352, 29]]]], false]]], [:binary, [:call, @@ -5299,7 +5299,7 @@ [:assoc_new, [:@label, “c:“, [1367, 34]], [:@int, “0”, [1367, 37]]]]]], - [:vcall, [:@ident, “r”, [1367, 41]]]], + [:var_ref, [:@ident, “r”, [1367, 41]]]], false]]], [:binary, [:call, @@ -5931,7 +5931,7 @@ [:in, [:hshptn, nil, [], [:var_field, [:@ident, “r”, [1533, 11]]]], [[:binary, - [:vcall, [:@ident, “r”, [1534, 8]]], + [:var_ref, [:@ident, “r”, [1534, 8]]], :==, [:hash, [:assoclist_from_args, ```
2024-02-19[Bug #20280] Check by `rb_parser_enc_str_coderange`Nobuyoshi Nakada
Co-authored-by: Yuichiro Kaneko <spiketeika@gmail.com>
2024-02-19[Bug #20280] Raise SyntaxError on invalid encoding symbolNobuyoshi Nakada