summaryrefslogtreecommitdiff
path: root/ext/ripper
AgeCommit message (Collapse)Author
2024-09-02merge revision(s) 97449338d6cb42d9dd7c9ca61550616e7e6b6ef6: [Backport #20649]Takashi Kokubun
[Bug #20649] Allow `nil` as 2nd argument of `assign_error` Fallback to the last token element in that case, for the backward compatibilities.
2023-12-10Change the semantics of rb_postponed_job_registerKJ Tsanaktsidis
Our current implementation of rb_postponed_job_register suffers from some safety issues that can lead to interpreter crashes (see bug #1991). Essentially, the issue is that jobs can be called with the wrong arguments. We made two attempts to fix this whilst keeping the promised semantics, but: * The first one involved masking/unmasking when flushing jobs, which was believed to be too expensive * The second one involved a lock-free, multi-producer, single-consumer ringbuffer, which was too complex The critical insight behind this third solution is that essentially the only user of these APIs are a) internal, or b) profiling gems. For a), none of the usages actually require variable data; they will work just fine with the preregistration interface. For b), generally profiling gems only call a single callback with a single piece of data (which is actually usually just zero) for the life of the program. The ringbuffer is complex because it needs to support multi-word inserts of job & data (which can't be atomic); but nobody actually even needs that functionality, really. So, this comit: * Introduces a pre-registration API for jobs, with a GVL-requiring rb_postponed_job_prereigster, which returns a handle which can be used with an async-signal-safe rb_postponed_job_trigger. * Deprecates rb_postponed_job_register (and re-implements it on top of the preregister function for compatability) * Moves all the internal usages of postponed job register pre-registration
2023-10-20Stop creating ripper.h because it's not usedyui-knk
2023-10-10ripper: Support member references in the DSLNobuyoshi Nakada
2023-10-01Use rb_node_opt_arg_t and rb_node_kw_arg_t instead of NODEyui-knk
2023-09-30Extract `ripper_parser_params`Nobuyoshi Nakada
2023-09-28Change RNode structure from union to structyui-knk
All kind of AST nodes use same struct RNode, which has u1, u2, u3 union members for holding different kind of data. This has two problems. 1. Low flexibility of data structure Some nodes, for example NODE_TRUE, don’t use u1, u2, u3. On the other hand, NODE_OP_ASGN2 needs more than three union members. However they use same structure definition, need to allocate three union members for NODE_TRUE and need to separate NODE_OP_ASGN2 into another node. This change removes the restriction so make it possible to change data structure by each node type. 2. No compile time check for union member access It’s developer’s responsibility for using correct member for each node type when it’s union. This change clarifies which node has which type of fields and enables compile time check. This commit also changes node_buffer_elem_struct buf management to handle different size data with alignment.
2023-09-25ripper: Support named references in the DSLNobuyoshi Nakada
2023-09-17ripper: Preprocess ripper-dispatchable types onlyNobuyoshi Nakada
Keep the other types, which not having setter macros for ripper.
2023-09-10Set ripper_init.c.tmpl to C mode [ci skip]Nobuyoshi Nakada
2023-08-25include missing header卜部昌平
Notes: Merged: https://github.com/ruby/ruby/pull/8274
2023-08-25tool/update-deps --fix卜部昌平
Notes: Merged: https://github.com/ruby/ruby/pull/8274
2023-07-16Fix `#line` directive filename of ripper.cyui-knk
Before: ```c /* First part of user prologue. */ #line 14 "parse.y" ``` After: ```c /* First part of user prologue. */ #line 14 "ripper.y" ``` Notes: Merged: https://github.com/ruby/ruby/pull/8083
2023-07-16Fix null pointer access in Ripper#initializeNobuyoshi Nakada
In `rb_ruby_ripper_parser_allocate`, `r->p` is NULL between creating `self` and `parser_params` assignment. As GC can happen there, the typed-data functions for it need to consider the case. Notes: Merged: https://github.com/ruby/ruby/pull/8085
2023-07-15Use functions defined by parser_st.c to reduce dependency on st.cyui-knk
Notes: Merged: https://github.com/ruby/ruby/pull/8057
2023-07-09Include ripper.h into `$distcleanfiles`yui-knk
Notes: Merged: https://github.com/ruby/ruby/pull/8046
2023-06-29More dependencies for ripperNobuyoshi Nakada
Notes: Merged: https://github.com/ruby/ruby/pull/7999
2023-06-28Fix memory leak in RipperPeter Zhu
The following script leaks memory in Ripper: ```ruby require "ripper" 20.times do 100_000.times do Ripper.parse("") end puts `ps -o rss= -p #{$$}` end ``` Notes: Merged: https://github.com/ruby/ruby/pull/7985
2023-06-12Add missing dependenciesNobuyoshi Nakada
2023-06-12[Feature #19719] Universal Parseryui-knk
Introduce Universal Parser mode for the parser. This commit includes these changes: * Introduce `UNIVERSAL_PARSER` macro. All of CRuby related functions are passed via `struct rb_parser_config_struct` when this macro is enabled. * Add CI task with 'cppflags=-DUNIVERSAL_PARSER' for ubuntu. Notes: Merged: https://github.com/ruby/ruby/pull/7927
2023-06-03Ripper does not depend on Bison [ci skip]yui-knk
It also uses Lrama then no dependency on Bison. Notes: Merged: https://github.com/ruby/ruby/pull/7896
2023-06-02No need to define "BISON" on extconf.rbyui-knk
"BISON" is defined in "ext/ripper/depend". Notes: Merged: https://github.com/ruby/ruby/pull/7878
2023-05-15Process parse.y without temporary filesNobuyoshi Nakada
Notes: Merged: https://github.com/ruby/ruby/pull/7817
2023-05-14Add user argument to some macros used by bisonNobuyoshi Nakada
Notes: Merged: https://github.com/ruby/ruby/pull/7807
2023-05-14Preprocess input parse.y from stdinNobuyoshi Nakada
Notes: Merged: https://github.com/ruby/ruby/pull/7807
2023-05-12Use Lrama LALR parser generator instead of Bisonv3_3_0_preview1Yuichiro Kaneko
https://bugs.ruby-lang.org/issues/19637 Co-authored-by: Nobuyoshi Nakada <nobu@ruby-lang.org> Notes: Merged: https://github.com/ruby/ruby/pull/7798 Merged-By: yui-knk <spiketeika@gmail.com>
2023-04-06Update VPATH for socket, & dependenciesMatt Valentine-House
The socket extensions rubysocket.h pulls in the "private" include/gc.h, which now depends on vm_core.h. vm_core.h pulls in id.h when tool/update-deps generates the dependencies for the makefiles, it generates the line for id.h to be based on VPATH, which is configured in the extconf.rb for each of the extensions. By default VPATH does not include the actual source directory of the current Ruby so the dependency fails to resolve and linking fails. We need to append the topdir and top_srcdir to VPATH to have the dependancy picked up correctly (and I believe we need both of these to cope with in-tree and out-of-tree builds). I copied this from the approach taken in https://github.com/ruby/ruby/blob/master/ext/objspace/extconf.rb#L3 Notes: Merged: https://github.com/ruby/ruby/pull/7393
2023-02-28Update the depend filesMatt Valentine-House
Notes: Merged: https://github.com/ruby/ruby/pull/7310
2023-02-27Remove intern/gc.h from Make depsMatt Valentine-House
Notes: Merged: https://github.com/ruby/ruby/pull/7330
2023-02-08Extract include/ruby/internal/attr/packed_struct.hNobuyoshi Nakada
Split `PACKED_STRUCT` and `PACKED_STRUCT_UNALIGNED` macros into the macros bellow: * `RBIMPL_ATTR_PACKED_STRUCT_BEGIN` * `RBIMPL_ATTR_PACKED_STRUCT_END` * `RBIMPL_ATTR_PACKED_STRUCT_UNALIGNED_BEGIN` * `RBIMPL_ATTR_PACKED_STRUCT_UNALIGNED_END` Notes: Merged: https://github.com/ruby/ruby/pull/7268
2023-02-02[Bug #19399] Parsing invalid heredoc inside block parameterNobuyoshi Nakada
Although this is of course invalid as Ruby code, allow to just parse and tokenize. Notes: Merged: https://github.com/ruby/ruby/pull/7229
2022-12-02Introduce encoding check macroS-H-GAMELINKS
Notes: Merged: https://github.com/ruby/ruby/pull/6700
2022-11-21Enhance keep_tokens option for RubyVM::AbstractSyntaxTree parsing methodsyui-knk
Implementation for Language Server Protocol (LSP) sometimes needs token information. For example both `m(1)` and `m(1, )` has same AST structure other than node locations then it's impossible to check the existence of `,` from AST. However in later case, it might be better to suggest variables list for the second argument. Token information is important for such case. This commit adds these methods. * Add `keep_tokens` option for `RubyVM::AbstractSyntaxTree.parse`, `.parse_file` and `.of` * Add `RubyVM::AbstractSyntaxTree::Node#tokens` which returns tokens for the node including tokens for descendants nodes. * Add `RubyVM::AbstractSyntaxTree::Node#all_tokens` which returns all tokens for the input script regardless the receiver node. [Feature #19070] Impacts on memory usage and performance are below: Memory usage: ``` $ cat test.rb root = RubyVM::AbstractSyntaxTree.parse_file(File.expand_path('../test/ruby/test_keyword.rb', __FILE__), keep_tokens: true) $ /usr/bin/time -f %Mkb /usr/local/bin/ruby -v ruby 3.2.0dev (2022-11-19T09:41:54Z 19070-keep_tokens d3af1b8057) [x86_64-linux] 11408kb # keep_tokens :false $ /usr/bin/time -f %Mkb /usr/local/bin/ruby test.rb 17508kb # keep_tokens :true $ /usr/bin/time -f %Mkb /usr/local/bin/ruby test.rb 30960kb ``` Performance: ``` $ cat ../ast_keep_tokens.yml prelude: | src = <<~SRC module M class C def m1(a, b) 1 + a + b end end end SRC benchmark: without_keep_tokens: | RubyVM::AbstractSyntaxTree.parse(src, keep_tokens: false) with_keep_tokens: | RubyVM::AbstractSyntaxTree.parse(src, keep_tokens: true) $ make benchmark COMPARE_RUBY="./ruby" ARGS=../ast_keep_tokens.yml /home/kaneko.y/.rbenv/shims/ruby --disable=gems -rrubygems -I../benchmark/lib ../benchmark/benchmark-driver/exe/benchmark-driver \ --executables="compare-ruby::./ruby -I.ext/common --disable-gem" \ --executables="built-ruby::./miniruby -I../lib -I. -I.ext/common ../tool/runruby.rb --extout=.ext -- --disable-gems --disable-gem" \ --output=markdown --output-compare -v ../ast_keep_tokens.yml compare-ruby: ruby 3.2.0dev (2022-11-19T09:41:54Z 19070-keep_tokens d3af1b8057) [x86_64-linux] built-ruby: ruby 3.2.0dev (2022-11-19T09:41:54Z 19070-keep_tokens d3af1b8057) [x86_64-linux] warming up.. | |compare-ruby|built-ruby| |:--------------------|-----------:|---------:| |without_keep_tokens | 21.659k| 21.303k| | | 1.02x| -| |with_keep_tokens | 6.220k| 5.691k| | | 1.09x| -| ``` Notes: Merged: https://github.com/ruby/ruby/pull/6770
2022-11-10Transition shape when object's capacity changesJemma Issroff
This commit adds a `capacity` field to shapes, and adds shape transitions whenever an object's capacity changes. Objects which are allocated out of a bigger size pool will also make a transition from the root shape to the shape with the correct capacity for their size pool when they are allocated. This commit will allow us to remove numiv from objects completely, and will also mean we can guarantee that if two objects share shapes, their IVs are in the same positions (an embedded and extended object cannot share shapes). This will enable us to implement ivar sets in YJIT using object shapes. Co-Authored-By: Aaron Patterson <tenderlove@ruby-lang.org> Notes: Merged: https://github.com/ruby/ruby/pull/6699
2022-11-10Preprocess for older bison is no longer neededNobuyoshi Nakada
Notes: Merged: https://github.com/ruby/ruby/pull/6701
2022-11-08Set default %printer for NODE ntermsyui-knk
Before: ``` Reducing stack by rule 639 (line 5062): $1 = token "integer literal" (1.0-1.1: 1) -> $$ = nterm simple_numeric (1.0-1.1: ) ``` After: ``` Reducing stack by rule 641 (line 5078): $1 = token "integer literal" (1.0-1.1: 1) -> $$ = nterm simple_numeric (1.0-1.1: NODE_LIT) ``` `"<*>"` is supported by Bison 2.3b (2008-05-27) or later. https://git.savannah.gnu.org/cgit/bison.git/commit/?id=12e3584054c16ab255672c07af0ffc7bb220e8bc Therefore developers need to install Bison 2.3b+ to build ruby from source codes if their Bison is older. Minimum version requirement for Bison is changed to 3.0. See: https://bugs.ruby-lang.org/issues/19068 [Feature #19068] Notes: Merged: https://github.com/ruby/ruby/pull/6579
2022-09-08Process token IDs from id.def without id.hNobuyoshi Nakada
Fixes id.h error during updating ripper.c by `make after-update`. While it used to update id.h in the build directory, but was trying to update ripper.c in the source directory. In principle, files in the source directory can or should not depend on files in the build directory.
2022-02-22[Feature #18249] Update dependenciesPeter Zhu
Notes: Merged: https://github.com/ruby/ruby/pull/5474
2021-12-09ext/ripper/lib/ripper/lexer.rb: Do not deprecate Ripper::Lexer::State#[]Yusuke Endoh
The old code of IRB still uses this method. The warning is noisy on rails console. In principle, Ruby 3.1 deprecates nothing, so let's avoid the deprecation for the while. I think It is not so hard to continue to maintain it as it is a trivial shim. https://github.com/ruby/ruby/pull/5093 Notes: Merged: https://github.com/ruby/ruby/pull/5219
2021-12-02Define Ripper::Lexer::Elem#to_sNobuyoshi Nakada
Alias `#inspect` as `#to_s` also in the new `Ripper::Lexer::Elem` class, so that `puts Ripper::Lexer.new(code).scan` shows the attributes.
2021-12-02Deprecate `Lexer::Elem#[]` and `Lexer::State#[]`schneems
Discussed in https://github.com/ruby/ruby/pull/5093#issuecomment-964426481. > it would be enough to mimic only [] for almost all cases This adds back the `Lexer::Elem#[]` and `Lexer::State#[]` and adds deprecation warnings for them.
2021-12-02Only iterate Lexer heredoc arraysschneems
The last element in the `@buf` may be either an array or an `Elem`. In the case it is an `Elem` we iterate over every element, when we do not need to. This check guards that case by ensuring that we only iterate over an array of elements. Notes: Merged: https://github.com/ruby/ruby/pull/5093
2021-12-02~1.10x faster Change Ripper.lex structs to classesschneems
## Concept I am proposing we replace the Struct implementation of data structures inside of ripper with real classes. This will improve performance and the implementation is not meaningfully more complicated. ## Example Struct versus class comparison: ```ruby Elem = Struct.new(:pos, :event, :tok, :state, :message) do def initialize(pos, event, tok, state, message = nil) super(pos, event, tok, State.new(state), message) end # ... def to_a a = super a.pop unless a.empty? a end end class ElemClass attr_accessor :pos, :event, :tok, :state, :message def initialize(pos, event, tok, state, message = nil) @pos = pos @event = event @tok = tok @state = State.new(state) @message = message end def to_a if @message [@pos, @event, @tok, @state, @message] else [@pos, @event, @tok, @state] end end end # stub state class creation for now class State; def initialize(val); end; end ``` ## MicroBenchmark creation ```ruby require 'benchmark/ips' require 'ripper' pos = [1, 2] event = :on_nl tok = "\n".freeze state = Ripper::EXPR_BEG Benchmark.ips do |x| x.report("struct") { Elem.new(pos, event, tok, state) } x.report("class ") { ElemClass.new(pos, event, tok, state) } x.compare! end; nil ``` Gives ~1.2x faster creation: ``` Warming up -------------------------------------- struct 263.983k i/100ms class 303.367k i/100ms Calculating ------------------------------------- struct 2.638M (± 5.9%) i/s - 13.199M in 5.023460s class 3.171M (± 4.6%) i/s - 16.078M in 5.082369s Comparison: class : 3170690.2 i/s struct: 2638493.5 i/s - 1.20x (± 0.00) slower ``` ## MicroBenchmark `to_a` (Called by Ripper.lex for every element) ```ruby require 'benchmark/ips' require 'ripper' pos = [1, 2] event = :on_nl tok = "\n".freeze state = Ripper::EXPR_BEG struct = Elem.new(pos, event, tok, state) from_class = ElemClass.new(pos, event, tok, state) Benchmark.ips do |x| x.report("struct") { struct.to_a } x.report("class ") { from_class.to_a } x.compare! end; nil ``` Gives 1.46x faster `to_a`: ``` Warming up -------------------------------------- struct 612.094k i/100ms class 893.233k i/100ms Calculating ------------------------------------- struct 6.121M (± 5.4%) i/s - 30.605M in 5.015851s class 8.931M (± 7.9%) i/s - 44.662M in 5.039733s Comparison: class : 8930619.0 i/s struct: 6121358.9 i/s - 1.46x (± 0.00) slower ``` ## MicroBenchmark data access ```ruby require 'benchmark/ips' require 'ripper' pos = [1, 2] event = :on_nl tok = "\n".freeze state = Ripper::EXPR_BEG struct = Elem.new(pos, event, tok, state) from_class = ElemClass.new(pos, event, tok, state) Benchmark.ips do |x| x.report("struct") { struct.pos[1] } x.report("class ") { from_class.pos[1] } x.compare! end; nil ``` Gives ~1.17x faster data access: ``` Warming up -------------------------------------- struct 1.694M i/100ms class 1.868M i/100ms Calculating ------------------------------------- struct 16.149M (± 6.8%) i/s - 81.318M in 5.060633s class 18.886M (± 2.9%) i/s - 95.262M in 5.048359s Comparison: class : 18885669.6 i/s struct: 16149255.8 i/s - 1.17x (± 0.00) slower ``` ## Full benchmark integration of this inside of Ripper.lex Inside of this repo with this commit ``` $ cd ext/ripper $ make $ cat test.rb file = File.join(__dir__, "../../array.rb") source = File.read(file) bench = Benchmark.measure do 10_000.times.each do Ripper.lex(source) end end puts bench ``` Then execute with and without this change 50 times: ``` rm new.txt rm old.txt for i in {0..50} do `ruby -Ilib -rripper -rbenchmark ./test.rb >> new.txt` `ruby -rripper -rbenchmark ./test.rb >> old.txt` done ``` I used derailed benchmarks internals to compare the results: ``` dir = Pathname(".") branch_info = {} branch_info["old"] = { desc: "Struct lex", time: Time.now, file: dir.join("old.txt"), name: "old" } branch_info["new"] = { desc: "Class lex", time: Time.now, file: dir.join("new.txt"), name: "new" } stats = DerailedBenchmarks::StatsFromDir.new(branch_info) stats.call.banner ``` Which gave us: ``` ❤️ ❤️ ❤️ (Statistically Significant) ❤️ ❤️ ❤️ [new] (3.3139 seconds) "Class lex" ref: "new" FASTER 🚀🚀🚀 by: 1.1046x [older/newer] 9.4700% [(older - newer) / older * 100] [old] (3.6606 seconds) "Struct lex" ref: "old" Iterations per sample: Samples: 51 Test type: Kolmogorov Smirnov Confidence level: 99.0 % Is significant? (max > critical): true D critical: 0.30049534876137013 D max: 0.9607843137254902 Histograms (time ranges are in seconds): [new] description: [old] description: "Class lex" "Struct lex" ┌ ┐ ┌ ┐ [3.0, 3.3) ┤▇ 1 [3.0, 3.3) ┤ 0 [3.3, 3.6) ┤▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 47 [3.3, 3.6) ┤ 0 [3.5, 3.8) ┤▇▇ 2 [3.5, 3.8) ┤▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 46 [3.8, 4.1) ┤▇ 1 [3.8, 4.1) ┤▇▇▇ 4 [4.0, 4.3) ┤ 0 [4.0, 4.3) ┤ 0 [4.3, 4.6) ┤ 0 [4.3, 4.6) ┤▇ 1 └ ┘ └ ┘ # of runs in range # of runs in range ``` To sum this up, the "new" version of this code (using real classes instead of structs) is 10% faster across 50 runs with a statistical significance confidence level of 99%. Histograms are for visual checksum. Notes: Merged: https://github.com/ruby/ruby/pull/5093
2021-11-25Keep the generated source files when clean [Bug #18363]Nobuyoshi Nakada
Notes: Merged: https://github.com/ruby/ruby/pull/5174
2021-11-21Update dependenciesNobuyoshi Nakada
2021-10-05ruby tool/update-deps --fix卜部昌平
Notes: Merged: https://github.com/ruby/ruby/pull/4909
2021-04-13dependency updates卜部昌平
Notes: Merged: https://github.com/ruby/ruby/pull/4371
2021-02-19ripper: fix a bug of Ripper::Lexer with syntax error and heredoc [Bug #17644]Shugo Maeda
2021-01-17Fix Ripper with heredoc.manga_osyo
Notes: Merged: https://github.com/ruby/ruby/pull/4083
2021-01-04ripper: call #pretty_print on also `state`Nobuyoshi Nakada