path: root/yjit/src
Age | Commit message | Author
2023-02-01 | use correct svar (#7225) | Koichi Sasada
* use correct svar

Without this patch, the svar location used is that of the "nearest Ruby frame". That is almost always correct, but it is wrong when the `each` method is written in Ruby.

```ruby
class C
  include Enumerable
  def each
    %w(bar baz).each{|e| yield e}
  end
end

C.new.grep(/(b.)/){|e| p [$1, e]}
```

This patch fixes the issue by traversing the ifunc's cfp. Note that if the cfp doesn't specify this Thread's cfp stack, the reserved svar location (`ec->root_svar`) is used.

* make yjit-bindgen

Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
Notes: Merged-By: k0kubun <takashikkbn@gmail.com>
2023-02-01 | Remove whitespace | Maxime Chevalier-Boisvert
2023-01-31 | YJIT: Handle splat with opt more fully (#7209) | Jimmy Miller
* YJIT: Handle splat with opt more fully
* Update yjit/src/codegen.rs

Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>
Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2023-01-31 | YJIT: Fix BorrowMutError on BOP invalidation (#7212) | Takashi Kokubun
Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2023-01-31 | YJIT: Group unimplemented method types together | Alan Wu
Grouping these together helps with finding all of the unimplemented method types. Previously they were interleaved with other match arms, both long and short.
Notes: Merged: https://github.com/ruby/ruby/pull/7210
2023-01-31 | YJIT: Implement codegen for Kernel#block_given? (#7202) | Takashi Kokubun
Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2023-01-30 | YJIT: Add splat optimized_send (#7167) | Jimmy Miller
Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2023-01-30 | YJIT: Initial implementation of splat with optional params (#7166) | Jimmy Miller
Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2023-01-30 | YJIT: Fix BorrowMutError on GC.compact (#7176) | Takashi Kokubun
YJIT: Fix BorrowMutError Notes: Merged-By: k0kubun <takashikkbn@gmail.com>
2023-01-30 | YJIT: Skip defer_compilation for fixnums if possible (#7168) | Takashi Kokubun
* YJIT: Skip defer_compilation for fixnums if possible
* YJIT: It should be Some(false)
* YJIT: Define two_fixnums_on_stack on Context

Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2023-01-30 | YJIT: Inline return address callback (#7198) | Alan Wu
This makes it so that the generator and the output code read in the same order. I think it reads better this way. Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2023-01-20 | YJIT: Avoid BorrowError on GC.compact (#7164) | Takashi Kokubun
Notes: Merged-By: k0kubun <takashikkbn@gmail.com>
2023-01-20 | YJIT: get rid of unneeded `.into()` | Jimmy Miller
Notes: Merged: https://github.com/ruby/ruby/pull/7158 Merged-By: XrXr
2023-01-19 | YJIT: Refactor side_exits | Jimmy Miller
Notes: Merged: https://github.com/ruby/ruby/pull/7155
2023-01-19 | YJIT: Remove duplicated information in BranchTarget (#7151) | Takashi Kokubun
Note: On the new code of yjit/src/core.rs:2178, we no longer leave the state `.block=None` but `.address=Some...`, which might be important. We assume it's actually not needed and take a risk here to minimize heap allocations, but in case it turns out to be necessary, we could signal/resurrect that state by introducing a new BranchTarget (or BranchShape) variant dedicated to it. Notes: Merged-By: k0kubun <takashikkbn@gmail.com>
2023-01-19 | Implement splat for cfuncs. Split exit cases to better capture where we are exiting (#6929) | Jimmy Miller
YJIT: Implement splat for cfuncs. Split exit cases

This also implements a new check for ruby2keywords as the last argument of a splat. This does mean that we generate more code, but in the actual benchmarks where we gained speed from this (binarytrees) I don't see any significant slowdown.

I did have to struggle with the register allocator here to find code that didn't allocate too many registers. It's a bit hard when everything is implicit. But I think I got to the minimal amount of copying given our current allocation strategy.

Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2023-01-18 | YJIT: Use .as_side_exit() for jumps to counted exits | Alan Wu
Fewer cycles running nops when these jumps are not taken. Fixing all these so when they get copy pasted in the future we save on padding. Notes: Merged: https://github.com/ruby/ruby/pull/7150
2023-01-18 | YJIT: implement codegen for `String#empty?` (#7148) | Maxime Chevalier-Boisvert
YJIT: implement codegen for String#empty? Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2023-01-18 | Add stats so we can keep track of x86 rel32 vs register calls (#7142) | Maxime Chevalier-Boisvert
* Add stats so we can keep track of x86 rel32 vs register calls
  To know if we get that "prime real estate" as Alan put it.
* Fix bug pointed out by Alan

Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2023-01-13 | YJIT: Use SIZEOF_VALUE_I32 instead of `... as i32` | Alan Wu
Shorter, and easier to parse without parentheses. Notes: Merged: https://github.com/ruby/ruby/pull/7121
2023-01-13 | YJIT: Factor out VALUE_BITS = (8 * SIZE_OF_VALUE as u8) | Alan Wu
Using a constant shows intention better and is less noisy. It always took me a second to parse the long expression. Notes: Merged: https://github.com/ruby/ruby/pull/7121
2023-01-12 | Enable `clippy` checks for yjit in CI (#7093) | Ian Ker-Seymer
* Add job to check clippy lints in CI
* Address all remaining clippy lints
* Check lints on arm64 as well
* Apply latest clippy lints
* Do not exit 0 on clippy warnings

Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2023-01-12 | Strip trailing spaces [ci skip] | Nobuyoshi Nakada
2023-01-11 | YJIT: Add a few asm comments (#7105) | Takashi Kokubun
* YJIT: Add a few asm comments
* YJIT: Clarify exiting insns
* YJIT: Fix cargo test

Notes: Merged-By: k0kubun <takashikkbn@gmail.com>
2023-01-10 | Differentiate T_ARRAY and array subclasses (#7091) | Aaron Patterson
* Differentiate T_ARRAY and array subclasses

  This commit teaches the YJIT context the difference between Arrays (objects with type T_ARRAY and class rb_cArray) vs Array subclasses (objects with type T_ARRAY but _not_ class rb_cArray). It uses this information to reduce the number of guards emitted when using `jit_guard_known_klass` with rb_cArray, notably opt_aref.

* Update yjit/src/core.rs

Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>
Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
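A minimal Ruby sketch of why the distinction matters, assuming a hypothetical `ReversedArray` class that is not part of the commit: both objects are arrays internally, but only an exact `Array` can be assumed to use the built-in `[]`.

```ruby
# Hypothetical subclass for illustration: still an array internally,
# but `[]` is overridden, so it cannot be treated like a plain Array.
class ReversedArray < Array
  def [](index)
    super(-index - 1) # index from the end instead of the start
  end
end

plain  = Array.new([10, 20, 30])
custom = ReversedArray.new([10, 20, 30])

plain_first  = plain[0]   # built-in Array#[]
custom_first = custom[0]  # dispatches to ReversedArray#[]
```

Here `custom.is_a?(Array)` is true while `custom.class` is not `Array`, which is the case the commit message describes as "type T_ARRAY but not class rb_cArray".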
2023-01-10 | YJIT: Save PC and SP before calling leaf builtins (#7090) | Alan Wu
Previously, we did not update `cfp->sp` before calling the C function of ISEQs marked with `Primitive.attr! "inline"` (leaf builtins). This caused the GC to miss temporary values on the stack in case the function allocates and triggers a GC run. Right now, there are only a few leaf builtins in numeric.rb on Integer methods such as `Integer#~`. Since these methods only allocate when operating on big numbers, we missed this issue.

Fix by saving PC and SP before calling the functions -- our usual protocol for calling C functions that may allocate on the GC heap.

[Bug #19316]

Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
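As a rough illustration of the two cases mentioned for `Integer#~` (the variable names are made up for this sketch, and it only shows the Ruby-level behaviour, not the YJIT internals):

```ruby
# Small operand: the result is an immediate integer, no heap allocation needed.
small = ~5

# Large operand: 1 << 64 does not fit in a machine word, so both the operand
# and the result are heap-allocated big integers. This is the kind of call
# that can allocate.
big = ~(1 << 64)
```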
2023-01-10 | YJIT: Fix a compilation warning with release build (#7092) | Takashi Kokubun
    warning: unused variable: `start_addr`
       --> ../yjit/src/asm/mod.rs:359:39
        |
    359 |     pub fn remove_comments(&mut self, start_addr: CodePtr, end_addr: CodePtr) {
        |                                       ^^^^^^^^^^ help: if this is intentional, prefix it with an underscore: `_start_addr`
        |
        = note: `#[warn(unused_variables)]` on by default

    warning: unused variable: `end_addr`
       --> ../yjit/src/asm/mod.rs:359:60
        |
    359 |     pub fn remove_comments(&mut self, start_addr: CodePtr, end_addr: CodePtr) {
        |

Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2023-01-09 | YJIT: Remove old comments for regenerated branches (#7083) | Takashi Kokubun
Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2023-01-06 | YJIT: Make iseq_get_location consistent with iseq.c (#7074) | Takashi Kokubun
* YJIT: Make iseq_get_location consistent with iseq.c
* YJIT: Call it "YJIT entry point"

Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>
Notes: Merged-By: k0kubun <takashikkbn@gmail.com>
2023-01-06 | YJIT: Colorize outlined code differently on --yjit-dump-disasm (#7073) | Takashi Kokubun
* YJIT: Colorize outlined code differently on --yjit-dump-disasm
* YJIT: Reduce the number of escape sequences

Notes: Merged-By: k0kubun <takashikkbn@gmail.com>
2023-01-05 | Use a different name for megamorphic setivar exits | Aaron Patterson
We should differentiate between set and get for megamorphic exits. This patch fixes the megamorphic exit name in gen_setinstancevariable so that we can tell the difference between megamorphic get / set sites Notes: Merged: https://github.com/ruby/ruby/pull/7072
2023-01-03 | YJIT: Dump spill error to stderr [ci skip] | Alan Wu
Since the panic message is in stderr, better to use the same stream in case stdout and stderr are not synced due to IO redirection.
2023-01-03 | YJIT: Fix `yield` into block with >=30 locals on ARM | Alan Wu
It's a register spill issue. Fix by moving the Qnil fill snippet to after registers are released. [Bug #19299] Notes: Merged: https://github.com/ruby/ruby/pull/7059
2022-12-23 | MJIT: Export fewer shape functions (#7007) | Takashi Kokubun
Notes: Merged-By: k0kubun <takashikkbn@gmail.com>
2022-12-17 | Use a BOP for Hash#default | John Hawthorn
On a hash miss we need to call `default` if it is redefined, in order to return the default value to be used. Previously we checked this with rb_method_basic_definition_p, which avoids the method call but requires a method lookup.

This commit replaces the previous check with BASIC_OP_UNREDEFINED_P and a new BOP_DEFAULT. We still need to fall back to rb_method_basic_definition_p when called on a subclass of Hash.

| |compare-ruby|built-ruby|
|:---------------|-----------:|---------:|
|hash_aref_miss | 2.692| 3.531|
| | -| 1.31x|

Co-authored-by: Daniel Colson <danieljamescolson@gmail.com>
Co-authored-by: "Ian C. Anderson" <ian@iancanderson.com>
Co-authored-by: Jack McCracken <me@jackmc.xyz>
Notes: Merged: https://github.com/ruby/ruby/pull/6945
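The behaviour that this check has to preserve can be sketched in plain Ruby. The `DefaultingHash` class below is hypothetical and only illustrates that a miss in `Hash#[]` calls a redefined `default`; it says nothing about how the check is implemented.

```ruby
# Hypothetical Hash subclass that redefines `default`.
class DefaultingHash < Hash
  def default(key = nil)
    "missing: #{key.inspect}"
  end
end

h = DefaultingHash.new
h[:a] = 1

hit  = h[:a]  # key present: the stored value is returned
miss = h[:b]  # key absent: Hash#[] calls the redefined `default`
```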
2022-12-15 | YJIT: Fix `obj.send(:call)` | Alan Wu
All the method call types need to handle argument shifting in case they're called by `.send`, and we weren't handling that in `OPTIMIZED_METHOD_TYPE_CALL`. Lack of shifting caused the stack size assertion in gen_leave() to fail. Discovered by Rails CI: https://buildkite.com/rails/rails/builds/91705#018516c4-f8f8-469e-bc2d-ddeb25ca8317/1920-2067 Diagnosed with help from `@eileencodes` and `@k0kubun`. Notes: Merged: https://github.com/ruby/ruby/pull/6943 Merged-By: XrXr
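For reference, the call pattern in question looks like this in Ruby (a minimal sketch; `add` is just an example proc). With `.send`, the method name is passed as the first argument, so the remaining arguments have to be shifted before the target method sees them.

```ruby
add = proc { |a, b| a + b }

direct   = add.call(1, 2)         # ordinary Proc#call
via_send = add.send(:call, 1, 2)  # same call routed through `send`
```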
2022-12-15 | Move definition of SIZE_POOL_COUNT back to gc.h | Peter Zhu
SIZE_POOL_COUNT is a GC macro, it should belong in gc.h and not shape.h. SIZE_POOL_COUNT doesn't depend on shape.h so we can have shape.h depend on gc.h. Co-Authored-By: Matt Valentine-House <matt@eightbitraptor.com> Notes: Merged: https://github.com/ruby/ruby/pull/6940
2022-12-15 | YJIT: Fix code GC freeing stubs with a trampoline (#6937) | Alan Wu
Stubs we generate for invalidation don't necessarily co-locate with the code that jumps to the stub. Since we rely on co-location to keep stubs alive as they are in the outlined code block, it used to be possible for code GC inside branch_stub_hit() to free the stub that's its direct caller, leading us to return to freed code after.

Stubs used to look like:

```
mov arg0, branch_ptr
mov arg1, target_idx
mov arg2, ec
call branch_stub_hit
jmp return_reg
```

Since the call and the jump after the call are the same for all stubs, we can extract them and use a static trampoline for them. That makes branch_stub_hit() always return to static code.

Stubs now look like:

```
mov arg0, branch_ptr
mov arg1, target_idx
jmp trampoline
```

Where the trampoline is:

```
mov arg2, ec
call branch_stub_hit
jmp return_reg
```

Code GC can now free stubs without problems since we'll always return to the trampoline, which we generate once on boot and which lives forever.

This might save a small bit of memory due to factoring out the static part of stubs, but it's probably minor.

[Bug #19234]

Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2022-12-15 | Transition complex objects to "too complex" shape | Jemma Issroff
When an object becomes "too complex" (in other words it has too many variations in the shape tree), we transition it to use a "too complex" shape and use a hash for storing instance variables.

Without this patch, there were rare cases where shape tree growth could "explode" and cause performance degradation on what would otherwise have been cached fast paths.

This patch puts a limit on shape tree growth, and gracefully degrades in the rare case where there could be a factorial growth in the shape tree.

For example:

```ruby
class NG; end
HUGE_NUMBER.times do
  NG.new.instance_variable_set(:"@unique_ivar_#{_1}", 1)
end
```

We consider objects to be "too complex" when the object's class has more than SHAPE_MAX_VARIATIONS (currently 8) leaf nodes in the shape tree and the object introduces a new variation (a new leaf node) associated with that class.

For example, new variations on instances of the following class would be considered "too complex" because those instances create more than 8 leaves in the shape tree:

```ruby
class Foo; end
9.times { Foo.new.instance_variable_set(:"@uniq_#{_1}", 1) }
```

However, the following class is *not* too complex because it only has one leaf in the shape tree:

```ruby
class Foo
  def initialize
    @a = @b = @c = @d = @e = @f = @g = @h = @i = nil
  end
end
9.times { Foo.new }
```

This case is rare, so we don't expect this change to impact performance of most applications, but it needs to be handled.

Co-Authored-By: Aaron Patterson <tenderlove@ruby-lang.org>
Notes: Merged: https://github.com/ruby/ruby/pull/6931
2022-12-14 | YJIT: Remove duplicate call to jit_prepare_routine_call() | Alan Wu
It's idempotent. Notes: Merged: https://github.com/ruby/ruby/pull/6930
2022-12-13 | YJIT: Change the default mem size to 64MiB (#6912) | Takashi Kokubun
* YJIT: Change the default mem size to 64MiB
* Also update ruby --help

Co-authored-by: Alan Wu <XrXr@users.noreply.github.com>
Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2022-12-12 | YJIT: Implement opt_newarray_max instruction (#6893) | Takashi Kokubun
Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2022-12-09 | YJIT: Split send_iseq_complex_callee exit reasons (#6895) | Takashi Kokubun
Notes: Merged-By: k0kubun <takashikkbn@gmail.com>
2022-12-09 | YJIT: implement `getconstant` YARV instruction (#6884) | Maxime Chevalier-Boisvert
* YJIT: implement getconstant YARV instruction
* Constant id is not a pointer
* Stack operands must be read after jit_prepare_routine_call

Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
Notes: Merged-By: k0kubun <takashikkbn@gmail.com>
2022-12-08 | YJIT: Upgrade bindgen to stabilize and reduce output | Alan Wu
The new version has an option to merge everything into a big `extern "C"` block and it's nicer. More importantly, this upgrade fixes an issue where Ubuntu with Clang 12 and macOS with Clang 14 gave a one line diff for `rb_shape_t`. It was slightly annoying because we use macOS locally. Notes: Merged: https://github.com/ruby/ruby/pull/6887
2022-12-08 | YJIT: Drop Copy trait from Context (#6889) | Takashi Kokubun
Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2022-12-08 | YJIT: implement opt_newarray_min YARV instruction (#6888) | Maxime Chevalier-Boisvert
Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2022-12-08 | Introduce `IO.new(..., path:)` and promote `File#path` to `IO#path`. (#6867) | Samuel Williams
Notes: Merged-By: ioquatix <samuel@codeotaku.com>
2022-12-06 | Set max_iv_count (used for object shapes) based on inline caches | Jemma Issroff
With this change, we're storing the iv name on an inline cache on setinstancevariable instructions. This allows us to check the inline cache to count instance variables set in initialize and give us an estimate of iv capacity for an object. For the purpose of estimating the number of instance variables required for an object, we're assuming that all initialize methods will call `super`. This change allows us to estimate the number of instance variables required without disassembling instruction sequences. Co-Authored-By: Aaron Patterson <tenderlove@ruby-lang.org> Notes: Merged: https://github.com/ruby/ruby/pull/6870
2022-12-06 | Introduce BOP_CMP for optimized comparison | Daniel Colson
Prior to this commit the `OPTIMIZED_CMP` macro relied on a method lookup to determine whether `<=>` was overridden. The result of the lookup was cached, but only for the duration of the specific method that initialized the cmp_opt_data cache structure.

With this method lookup, `[x,y].max` is slower than doing `x > y ? x : y` even though there's an optimized instruction for "new array max". (John noticed that somebody had proposed a micro-optimization based on this fact in https://github.com/mastodon/mastodon/pull/19903.)

```rb
require 'benchmark/ips'

a, b = 1, 2
Benchmark.ips do |bm|
  bm.report('conditional') { a > b ? a : b }
  bm.report('method') { [a, b].max }
  bm.compare!
end
```

Before:

```
Comparison:
         conditional: 22603733.2 i/s
              method: 19820412.7 i/s - 1.14x (± 0.00) slower
```

This commit replaces the method lookup with a new CMP basic op, which gives the examples above equivalent performance.

After:

```
Comparison:
              method: 24022466.5 i/s
         conditional: 23851094.2 i/s - same-ish: difference falls within error
```

Relevant benchmarks show an improvement to Array#max and Array#min when not using the optimized newarray_max instruction as well. They are noticeably faster for small arrays with the relevant types, and the same or maybe a touch faster on larger arrays.

```
$ make benchmark COMPARE_RUBY=<master@5958c305> ITEM=array_min
$ make benchmark COMPARE_RUBY=<master@5958c305> ITEM=array_max
```

The benchmarks added in this commit also look generally improved.

Co-authored-by: John Hawthorn <jhawthorn@github.com>
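The two forms being benchmarked are meant to give the same answer whenever `<=>` is defined consistently. A small sketch, assuming a hypothetical `Version` struct with a user-defined `<=>` (not part of the commit):

```ruby
# Hypothetical value type with a custom `<=>`, used only for illustration.
Version = Struct.new(:major, :minor) do
  include Comparable

  def <=>(other)
    [major, minor] <=> [other.major, other.minor]
  end
end

a = Version.new(1, 2)
b = Version.new(1, 10)

by_conditional = a > b ? a : b  # Comparable#> goes through <=>
by_method      = [a, b].max     # Array#max also goes through <=>
```

Because a class may define or redefine `<=>`, any fast path for comparison has to know whether the built-in definition is still in effect, which is what the lookup (and now the basic op) determines.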