summaryrefslogtreecommitdiff
path: root/ext/ripper/lib/ripper
diff options
context:
space:
mode:
authorschneems <richard.schneeman+foo@gmail.com>2021-11-07 13:57:24 -0600
committerYusuke Endoh <mame@ruby-lang.org>2021-12-02 15:55:42 +0900
commit3f74eaa7a83e42b31c219a534ec5330e511d2921 (patch)
tree1d1617acb0e7483fa5bc5d2a68a33dcb634cfe28 /ext/ripper/lib/ripper
parent6721ce1cc4035fe5508c13d0151f748e8b8e8cf1 (diff)
~1.10x faster Change Ripper.lex structs to classes
## Concept I am proposing we replace the Struct implementation of data structures inside of ripper with real classes. This will improve performance and the implementation is not meaningfully more complicated. ## Example Struct versus class comparison: ```ruby Elem = Struct.new(:pos, :event, :tok, :state, :message) do def initialize(pos, event, tok, state, message = nil) super(pos, event, tok, State.new(state), message) end # ... def to_a a = super a.pop unless a.empty? a end end class ElemClass attr_accessor :pos, :event, :tok, :state, :message def initialize(pos, event, tok, state, message = nil) @pos = pos @event = event @tok = tok @state = State.new(state) @message = message end def to_a if @message [@pos, @event, @tok, @state, @message] else [@pos, @event, @tok, @state] end end end # stub state class creation for now class State; def initialize(val); end; end ``` ## MicroBenchmark creation ```ruby require 'benchmark/ips' require 'ripper' pos = [1, 2] event = :on_nl tok = "\n".freeze state = Ripper::EXPR_BEG Benchmark.ips do |x| x.report("struct") { Elem.new(pos, event, tok, state) } x.report("class ") { ElemClass.new(pos, event, tok, state) } x.compare! end; nil ``` Gives ~1.2x faster creation: ``` Warming up -------------------------------------- struct 263.983k i/100ms class 303.367k i/100ms Calculating ------------------------------------- struct 2.638M (± 5.9%) i/s - 13.199M in 5.023460s class 3.171M (± 4.6%) i/s - 16.078M in 5.082369s Comparison: class : 3170690.2 i/s struct: 2638493.5 i/s - 1.20x (± 0.00) slower ``` ## MicroBenchmark `to_a` (Called by Ripper.lex for every element) ```ruby require 'benchmark/ips' require 'ripper' pos = [1, 2] event = :on_nl tok = "\n".freeze state = Ripper::EXPR_BEG struct = Elem.new(pos, event, tok, state) from_class = ElemClass.new(pos, event, tok, state) Benchmark.ips do |x| x.report("struct") { struct.to_a } x.report("class ") { from_class.to_a } x.compare! end; nil ``` Gives 1.46x faster `to_a`: ``` Warming up -------------------------------------- struct 612.094k i/100ms class 893.233k i/100ms Calculating ------------------------------------- struct 6.121M (± 5.4%) i/s - 30.605M in 5.015851s class 8.931M (± 7.9%) i/s - 44.662M in 5.039733s Comparison: class : 8930619.0 i/s struct: 6121358.9 i/s - 1.46x (± 0.00) slower ``` ## MicroBenchmark data access ```ruby require 'benchmark/ips' require 'ripper' pos = [1, 2] event = :on_nl tok = "\n".freeze state = Ripper::EXPR_BEG struct = Elem.new(pos, event, tok, state) from_class = ElemClass.new(pos, event, tok, state) Benchmark.ips do |x| x.report("struct") { struct.pos[1] } x.report("class ") { from_class.pos[1] } x.compare! end; nil ``` Gives ~1.17x faster data access: ``` Warming up -------------------------------------- struct 1.694M i/100ms class 1.868M i/100ms Calculating ------------------------------------- struct 16.149M (± 6.8%) i/s - 81.318M in 5.060633s class 18.886M (± 2.9%) i/s - 95.262M in 5.048359s Comparison: class : 18885669.6 i/s struct: 16149255.8 i/s - 1.17x (± 0.00) slower ``` ## Full benchmark integration of this inside of Ripper.lex Inside of this repo with this commit ``` $ cd ext/ripper $ make $ cat test.rb file = File.join(__dir__, "../../array.rb") source = File.read(file) bench = Benchmark.measure do 10_000.times.each do Ripper.lex(source) end end puts bench ``` Then execute with and without this change 50 times: ``` rm new.txt rm old.txt for i in {0..50} do `ruby -Ilib -rripper -rbenchmark ./test.rb >> new.txt` `ruby -rripper -rbenchmark ./test.rb >> old.txt` done ``` I used derailed benchmarks internals to compare the results: ``` dir = Pathname(".") branch_info = {} branch_info["old"] = { desc: "Struct lex", time: Time.now, file: dir.join("old.txt"), name: "old" } branch_info["new"] = { desc: "Class lex", time: Time.now, file: dir.join("new.txt"), name: "new" } stats = DerailedBenchmarks::StatsFromDir.new(branch_info) stats.call.banner ``` Which gave us: ``` ❤️ ❤️ ❤️ (Statistically Significant) ❤️ ❤️ ❤️ [new] (3.3139 seconds) "Class lex" ref: "new" FASTER 🚀🚀🚀 by: 1.1046x [older/newer] 9.4700% [(older - newer) / older * 100] [old] (3.6606 seconds) "Struct lex" ref: "old" Iterations per sample: Samples: 51 Test type: Kolmogorov Smirnov Confidence level: 99.0 % Is significant? (max > critical): true D critical: 0.30049534876137013 D max: 0.9607843137254902 Histograms (time ranges are in seconds): [new] description: [old] description: "Class lex" "Struct lex" ┌ ┐ ┌ ┐ [3.0, 3.3) ┤▇ 1 [3.0, 3.3) ┤ 0 [3.3, 3.6) ┤▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 47 [3.3, 3.6) ┤ 0 [3.5, 3.8) ┤▇▇ 2 [3.5, 3.8) ┤▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 46 [3.8, 4.1) ┤▇ 1 [3.8, 4.1) ┤▇▇▇ 4 [4.0, 4.3) ┤ 0 [4.0, 4.3) ┤ 0 [4.3, 4.6) ┤ 0 [4.3, 4.6) ┤▇ 1 └ ┘ └ ┘ # of runs in range # of runs in range ``` To sum this up, the "new" version of this code (using real classes instead of structs) is 10% faster across 50 runs with a statistical significance confidence level of 99%. Histograms are for visual checksum.
Notes
Notes: Merged: https://github.com/ruby/ruby/pull/5093
Diffstat (limited to 'ext/ripper/lib/ripper')
-rw-r--r--ext/ripper/lib/ripper/lexer.rb30
1 files changed, 22 insertions, 8 deletions
diff --git a/ext/ripper/lib/ripper/lexer.rb b/ext/ripper/lib/ripper/lexer.rb
index f6051c6341..92ee4ef7df 100644
--- a/ext/ripper/lib/ripper/lexer.rb
+++ b/ext/ripper/lib/ripper/lexer.rb
@@ -53,10 +53,16 @@ class Ripper
end
class Lexer < ::Ripper #:nodoc: internal use only
- State = Struct.new(:to_int, :to_s) do
+ class State
+ attr_reader :to_int, :to_s
+
+ def initialize(i)
+ @to_int = i
+ @to_s = Ripper.lex_state_name(i)
+ freeze
+ end
+
alias to_i to_int
- def initialize(i) super(i, Ripper.lex_state_name(i)).freeze end
- # def inspect; "#<#{self.class}: #{self}>" end
alias inspect to_s
def pretty_print(q) q.text(to_s) end
def ==(i) super or to_int == i end
@@ -67,9 +73,15 @@ class Ripper
def nobits?(i) to_int.nobits?(i) end
end
- Elem = Struct.new(:pos, :event, :tok, :state, :message) do
+ class Elem
+ attr_accessor :pos, :event, :tok, :state, :message
+
def initialize(pos, event, tok, state, message = nil)
- super(pos, event, tok, State.new(state), message)
+ @pos = pos
+ @event = event
+ @tok = tok
+ @state = State.new(state)
+ @message = message
end
def inspect
@@ -94,9 +106,11 @@ class Ripper
end
def to_a
- a = super
- a.pop unless a.last
- a
+ if @message
+ [@pos, @event, @tok, @state, @message]
+ else
+ [@pos, @event, @tok, @state]
+ end
end
end