author | Eileen M. Uchitelle <eileencodes@users.noreply.github.com> | 2022-07-12 16:40:49 -0400 |
---|---|---|
committer | GitHub <noreply@github.com> | 2022-07-12 16:40:49 -0400 |
commit | 59c6b7b7abefdf8bc93d7117a3893d581f3a6c90 (patch) | |
tree | 9cb8840687d2f9f8b0fee59495cfe81a09daa078 /yjit.rb | |
parent | 8c1808151f4c1b44e8b0fe935c571f05b2641b8b (diff) |
Speed up --yjit-trace-exits code (#6106)
In a small script the cost of this feature is barely noticeable, but
on Rails the slowdown is obvious. This PR aims to
speed up two parts of the functionality.
1) The Rust exit recording code
Instead of adding all samples as we see them to the yjit_raw_samples and
yjit_line_samples, we can increment the counter on the ones we've seen
before. This will be faster on traces where we are hitting the same
stack often. In a crude measurement of booting just the Active Record
base test (`test/cases/base_test.rb`), we found that this improved the
boot time by 1 second.
This also results in a smaller Marshal dump file, which sped up the test
boot time by 4 seconds with trace exits on.
2) The Ruby parsing code
Previously we were allocating new arrays using `shift` and
`each_with_index`. This change avoids allocating new arrays by using an
index. This change saves the most time: 11 seconds.
Before this change the test boot took 62 seconds; afterwards it took 47
seconds. This is still too long, but it's a step closer to faster
functionality. Next we're going to tackle allowing you to collect trace
exits for a specific instruction. There is also some potential slowness
in the GC code that I'd like to take a second look at.
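The counting idea from part 1 can be sketched in Ruby. This is an illustration only, not the Rust code from this PR: `ExitRecorder`, `record`, and the lookup hash are hypothetical names, and how the real recorder decides that a stack has been "seen before" is an assumption here. The flat layout follows the `[length, frames..., instruction, count]` comment in the diff.

```ruby
# Hypothetical sketch; YJIT's real exit recorder is written in Rust.
class ExitRecorder
  attr_reader :raw_samples

  def initialize
    # Flat buffer: [length, frame_1, ..., frame_n, instruction, count, ...]
    @raw_samples = []
    # Assumed helper: maps a stack (plus its exit instruction) to the
    # position of that record's count slot in @raw_samples.
    @count_index = {}
  end

  def record(frames, instruction)
    key = [frames.dup.freeze, instruction]
    if (idx = @count_index[key])
      # Seen before: bump the counter instead of appending a new record.
      @raw_samples[idx] += 1
    else
      @raw_samples << frames.length
      @raw_samples.concat(frames)
      @raw_samples << instruction << 1
      @count_index[key] = @raw_samples.length - 1
    end
  end
end

recorder = ExitRecorder.new
recorder.record([10, 11], 7)
recorder.record([10, 11], 7) # same stack again: only the count changes
recorder.record([12], 3)
p recorder.raw_samples
```

Because repeated stacks collapse into a single record with a larger count, the buffer (and anything serialized from it) stays smaller when the same exit is hit often.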
Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org>
Notes:
Merged-By: maximecb <maximecb@ruby-lang.org>
Diffstat (limited to 'yjit.rb')
-rw-r--r-- | yjit.rb | 45 |
1 files changed, 26 insertions, 19 deletions
@@ -41,16 +41,11 @@ module RubyVM::YJIT
       frames = results[:frames].dup
       samples_count = 0
 
-      frames.each do |frame_id, frame|
-        frame[:samples] = 0
-        frame[:edges] = {}
-      end
-
       # Loop through the instructions and set the frame hash with the data.
       # We use nonexistent.def for the file name, otherwise insns.def will be displayed
       # and that information isn't useful in this context.
       RubyVM::INSTRUCTION_NAMES.each_with_index do |name, frame_id|
-        frame_hash = { samples: 0, total_samples: 0, edges: {}, name: name, file: "nonexistent.def", line: nil }
+        frame_hash = { samples: 0, total_samples: 0, edges: {}, name: name, file: "nonexistent.def", line: nil, lines: {} }
         results[:frames][frame_id] = frame_hash
         frames[frame_id] = frame_hash
       end
@@ -58,12 +53,22 @@ module RubyVM::YJIT
       # Loop through the raw_samples and build the hashes for StackProf.
       # The loop is based off an example in the StackProf documentation and therefore
       # this functionality can only work with that library.
-      while raw_samples.length > 0
-        stack_trace = raw_samples.shift(raw_samples.shift + 1)
-        lines = line_samples.shift(line_samples.shift + 1)
+      #
+      # Raw Samples:
+      # [ length, frame1, frame2, frameN, ..., instruction, count
+      #
+      # Line Samples
+      # [ length, line_1, line_2, line_n, ..., dummy value, count
+      i = 0
+      while i < raw_samples.length
+        stack_length = raw_samples[i] + 1
+        i += 1 # consume the stack length
+
         prev_frame_id = nil
+        stack_length.times do |idx|
+          idx += i
+          frame_id = raw_samples[idx]
 
-        stack_trace.each_with_index do |frame_id, idx|
           if prev_frame_id
             prev_frame = frames[prev_frame_id]
             prev_frame[:edges][frame_id] ||= 0
@@ -71,26 +76,28 @@ module RubyVM::YJIT
           end
 
           frame_info = frames[frame_id]
-          frame_info[:total_samples] ||= 0
           frame_info[:total_samples] += 1
 
-          frame_info[:lines] ||= {}
-          frame_info[:lines][lines[idx]] ||= [0, 0]
-          frame_info[:lines][lines[idx]][0] += 1
+          frame_info[:lines][line_samples[idx]] ||= [0, 0]
+          frame_info[:lines][line_samples[idx]][0] += 1
 
           prev_frame_id = frame_id
         end
 
-        top_frame_id = stack_trace.last
+        i += stack_length # consume the stack
+
+        top_frame_id = prev_frame_id
         top_frame_line = 1
 
-        frames[top_frame_id][:samples] += 1
+        sample_count = raw_samples[i]
+
+        frames[top_frame_id][:samples] += sample_count
         frames[top_frame_id][:lines] ||= {}
         frames[top_frame_id][:lines][top_frame_line] ||= [0, 0]
-        frames[top_frame_id][:lines][top_frame_line][1] += 1
+        frames[top_frame_id][:lines][top_frame_line][1] += sample_count
 
-        samples_count += raw_samples.shift
-        line_samples.shift
+        samples_count += sample_count
+        i += 1
       end
 
       results[:samples] = samples_count
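
The index-based walk in the hunks above can be tried in isolation with a simplified sketch. `tally_samples` is a hypothetical helper, not part of `yjit.rb`; it assumes the `[length, frame_1, ..., frame_n, instruction, count]` layout described in the diff comment and only tallies the top-of-stack entry, leaving out the edge and line bookkeeping.

```ruby
# Hypothetical helper illustrating the cursor-style traversal; it reads
# the flat buffer in place rather than slicing records off with `shift`.
def tally_samples(raw_samples)
  top_counts = Hash.new(0) # top-of-stack entry => number of samples
  total = 0
  i = 0
  while i < raw_samples.length
    stack_length = raw_samples[i] + 1 # frames plus the trailing instruction
    i += 1                            # consume the stack length
    top = raw_samples[i + stack_length - 1]
    i += stack_length                 # consume the stack
    count = raw_samples[i]
    i += 1                            # consume the count
    top_counts[top] += count
    total += count
  end
  [top_counts, total]
end

# Two records: frames [10, 11] exiting on 7 (seen twice),
# and frame [12] exiting on 3 (seen once).
top_counts, total = tally_samples([2, 10, 11, 7, 2, 1, 12, 3, 1])
p top_counts
p total
```

Each record is read through a moving index, so no per-record array is allocated, and the stored count is added in one step rather than one sample at a time.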