path: root/yjit/src
Age Commit message Author
2024-06-17YJIT: Fix an unused field warning in `DumpDisasm`.Kevin Menard
2024-06-17YJIT: `--yjit-dump-disasm=dir`: Hold descriptor for dump fileAlan Wu
This mainly aims to make `--yjit-dump-disasm=<relative_path>` more usable. Previously, it crashed if the program did chdir(2), since it opened the dump file every time when appending. Tested with: ./miniruby --yjit-dump-disasm=. --yjit-call-threshold=1 -e 'Dir.chdir("/") {}' And the `lobsters` benchmark.
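The idea of holding the descriptor can be sketched as follows. This is a minimal illustration, not YJIT's actual code: `DumpFile`, `open`, and `append` are hypothetical names. The file is opened once, so a relative path is resolved only at that moment and later `chdir` calls do not affect appends.

```rust
use std::fs::{File, OpenOptions};
use std::io::{self, Write};
use std::path::Path;

/// Hypothetical stand-in for a disassembly dump sink (not YJIT's real type).
pub struct DumpFile {
    // Opened once; later writes never re-resolve the path.
    file: File,
}

impl DumpFile {
    /// Open (or create) the dump file in append mode. A relative `path`
    /// is resolved against the working directory at this moment only.
    pub fn open(path: &Path) -> io::Result<Self> {
        let file = OpenOptions::new().create(true).append(true).open(path)?;
        Ok(DumpFile { file })
    }

    /// Append text through the held descriptor.
    pub fn append(&mut self, text: &str) -> io::Result<()> {
        self.file.write_all(text.as_bytes())
    }
}
```

Reopening the path on every append would instead resolve it against whatever the working directory happens to be at that time, which is the failure mode the commit message describes.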
2024-06-13YJIT: Delete otherwise-empty defer_compilation() blocksAlan Wu
Calls to defer_compilation() leave behind a stub and a `struct Block` that we retain. If the block is empty, it only exists to hold the `struct Branch` that the stub needs. This patch transplants the branch out of the empty block into the newly generated block when the defer_compilation() stub is hit, and deletes the empty block to save memory. To assist the transplantation, `Block::outgoing` is now a `MutableBranchList`, and `Branch::Block` is now in a `Cell`. These types don't incur a size cost. On the `lobsters` benchmark, `yjit_alloc_size` is roughly 98% of what it was before the change. Co-authored-by: Kevin Menard <kevin.menard@shopify.com> Co-authored-by: Randy Stauner <randy@r4s6.net> Co-authored-by: Maxime Chevalier-Boisvert <maxime.chevalierboisvert@shopify.com>
2024-06-12YJIT: add context cache hits stat (#10979)Maxime Chevalier-Boisvert
* YJIT: add context cache hits stat This stat should make more sense when it comes to interpreting the effectiveness of the cache on large deployed apps.
2024-06-11YJIT: Make num_contexts_encoded a default counterTakashi Kokubun
2024-06-11YJIT: add context cache size stat, lazily allocate cacheMaxime Chevalier-Boisvert
* YJIT: add context cache size stat * Allocate the context cache in a box so CRuby doesn't pay overhead * Add an extra debug assertion
2024-06-07YJIT: implement cache for recently encoded/decoded contexts (#10938)Maxime Chevalier-Boisvert
* YJIT: implement cache for recently encoded/decoded contexts * Increase cache size to 512
2024-06-07YJIT: implement variable-length context encoding scheme (#10888)Maxime Chevalier-Boisvert
* Implement BitVector data structure for variable-length context encoding * Rename method to make intent clearer * Rename write_uint => push_uint to make intent clearer * Implement debug trait for BitVector * Fix bug in BitVector::read_uint_at(), enable more tests * Add one more test for good measure * Start sketching Context::encode() * Progress on variable length context encoding * Add tests. Fix bug. * Encode stack state * Add comments. Try to estimate context encoding size. * More compact encoding for stack size * Commit before rebase * Change Context::encode() to take a BitVector as input * Refactor BitVector::read_uint(), add helper read functions * Implement Context::decode() function. Add test. * Fix bug, add tests * Rename methods * Add Context::encode() and decode() methods using global data * Make encode and decode methods use u32 indices * Refactor YJIT to use variable-length context encoding * Tag functions as allow unused * Add a simple caching mechanism and stats for bytes per context etc * Add comments, fix formatting * Grow vector of bytes by 1.2x instead of 2x * Add debug assert to check round-trip encoding-decoding * Take some rustfmt formatting * Add decoded_from field to Context to reuse previous encodings * Remove old context stats * Re-add stack_size assert * Disable decoded_from optimization for now
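A bit vector with `push_uint` and `read_uint_at`, as named in the list above, could look roughly like the sketch below. Only the type and the two method names come from the commit message; the signatures, the least-significant-bit-first layout, and the `num_bits` accessor are assumptions, and the real implementation may differ.

```rust
/// Minimal bit vector sketch; not YJIT's actual implementation.
#[derive(Default, Debug)]
pub struct BitVector {
    bytes: Vec<u8>,
    num_bits: usize,
}

impl BitVector {
    pub fn new() -> Self {
        Self::default()
    }

    /// Total number of bits stored so far.
    pub fn num_bits(&self) -> usize {
        self.num_bits
    }

    /// Append the low `num_bits` bits of `val`, least significant bit first.
    /// Returns the bit index at which the value starts.
    pub fn push_uint(&mut self, val: u64, num_bits: usize) -> usize {
        assert!(num_bits <= 64);
        let start = self.num_bits;
        for i in 0..num_bits {
            let bit = ((val >> i) & 1) as u8;
            let pos = self.num_bits;
            // Grow the backing storage one byte at a time as needed.
            if pos / 8 == self.bytes.len() {
                self.bytes.push(0);
            }
            self.bytes[pos / 8] |= bit << (pos % 8);
            self.num_bits += 1;
        }
        start
    }

    /// Read `num_bits` bits starting at bit index `bit_idx`.
    pub fn read_uint_at(&self, bit_idx: usize, num_bits: usize) -> u64 {
        assert!(num_bits <= 64 && bit_idx + num_bits <= self.num_bits);
        let mut val = 0u64;
        for i in 0..num_bits {
            let pos = bit_idx + i;
            let bit = (self.bytes[pos / 8] >> (pos % 8)) & 1;
            val |= (bit as u64) << i;
        }
        val
    }
}
```

Because fields are packed at bit granularity, a small value such as a stack size can take a few bits instead of a full byte or word, which is what makes a variable-length encoding compact.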
2024-06-05Don't add `+YJIT` to `RUBY_DESCRIPTION` until it's actually enabledJean Boussier
If you start Ruby with `--yjit-disable`, the `+YJIT` shouldn't be added until `RubyVM::YJIT.enable` is actually called. Otherwise it's confusing in crash reports etc.
2024-06-04Do not emit shape transition warnings when YJIT is compilingJean Boussier
[Bug #20522] If `Warning.warn` is redefined in Ruby, emitting a warning would invoke Ruby code, which can't safely be done when YJIT is compiling.
2024-06-04YJIT: Fix getconstant exits after opt_ltlt fusion (#10903)Takashi Kokubun
Co-authored-by: Alan Wu <alansi.xingwu@shopify.com>
2024-05-31YJIT: Fix out of bounds access when splatting empty arrayAlan Wu
Previously, we read the last element of the array even when the array was empty, doing an out-of-bounds access. This sometimes caused a SEGV. [Bug #20496]
2024-05-29YJIT: Fix a warning from nightly rustAlan Wu
No plan about migrating to the 2024 edition yet (it's not even available yet), but this is a simple enough suggestion so we can just take it. ``` warning: this method call resolves to `<&Box<[T]> as IntoIterator>::into_iter` (due to backwards compatibility), but will resolve to `<Box<[T]> as IntoIterator>::into_iter` in Rust 2024 --> ../yjit/src/core.rs:1003:49 | 1003 | formatter.debug_list().entries(branches.into_iter()).finish() | ^^^^^^^^^ | = warning: this changes meaning in Rust 2024 = note: `#[warn(boxed_slice_into_iter)]` on by default help: use `.iter()` instead of `.into_iter()` to avoid ambiguity | 1003 | formatter.debug_list().entries(branches.iter()).finish() | ~~~~ help: or use `IntoIterator::into_iter(..)` instead of `.into_iter()` to explicitly iterate by value | 1003 | formatter.debug_list().entries(IntoIterator::into_iter(branches)).finish() | ++++++++++++++++++++++++ ~ ```
2024-05-28YJIT: limit size of call count stats dict (#10858)Maxime Chevalier-Boisvert
* YJIT: limit size of call count stats dict Someone reported that logs were getting bloated because the ISEQ and C call count dicts were huge, since they include all of the call sites. I wrote code on the Rust side to limit the size of the dict to avoid this problem. The size limit is hardcoded at 20, but I figure this is probably fine? * Fix bug reported by Kokubun.
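One way to cap such a dict is sketched below. The commit message only states that the size is limited to 20; keeping the entries with the highest counts is an assumption here, and `top_call_counts` and `MAX_ENTRIES` are hypothetical names.

```rust
use std::collections::HashMap;

/// The commit message says the limit is hardcoded at 20.
pub const MAX_ENTRIES: usize = 20;

/// Return at most `limit` (name, count) pairs, highest counts first.
/// Keeping the most frequent entries is an assumption of this sketch.
pub fn top_call_counts(counts: &HashMap<String, u64>, limit: usize) -> Vec<(String, u64)> {
    let mut entries: Vec<(String, u64)> = counts
        .iter()
        .map(|(name, count)| (name.clone(), *count))
        .collect();
    // Sort by descending count, then by name so the output is deterministic.
    entries.sort_by(|a, b| b.1.cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    entries.truncate(limit);
    entries
}
```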
2024-05-28Stop marking chilled strings as frozenÉtienne Barrié
They were initially made frozen to avoid false positives for cases such as: str = str.dup if str.frozen? But this may cause bugs and is generally confusing for users. [Feature #20205] Co-authored-by: Jean Boussier <byroot@ruby-lang.org>
2024-05-23Introduce a specialized instruction for Array#packNobuyoshi Nakada
Instructions for this code: ```ruby # frozen_string_literal: true [a].pack("C") ``` Before this commit: ``` == disasm: #<ISeq:<main>@test.rb:1 (1,0)-(3,13)> 0000 putself ( 3)[Li] 0001 opt_send_without_block <calldata!mid:a, argc:0, FCALL|VCALL|ARGS_SIMPLE> 0003 newarray 1 0005 putobject "C" 0007 opt_send_without_block <calldata!mid:pack, argc:1, ARGS_SIMPLE> 0009 leave ``` After this commit: ``` == disasm: #<ISeq:<main>@test.rb:1 (1,0)-(3,13)> 0000 putself ( 3)[Li] 0001 opt_send_without_block <calldata!mid:a, argc:0, FCALL|VCALL|ARGS_SIMPLE> 0003 putobject "C" 0005 opt_newarray_send 2, :pack 0008 leave ``` Co-authored-by: Maxime Chevalier-Boisvert <maxime.chevalierboisvert@shopify.com> Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org>
2024-05-06YJIT: Fix comment and counter in rb_yjit_invalidate_ep_is_bp() (#10722)Alan Wu
`mem::take` substitutes an empty instance which makes `jit.ep_is_bp()` return false.
2024-05-01YJIT: Fix `Struct` accessors not firing tracing events (#10690)Alan Wu
* YJIT: Fix `Struct` accessors not firing tracing events Reading and writing to structs should fire `c_call` and `c_return`, but YJIT wasn't correctly dropping those calls when tracing. This has been missing since this functionality was added in 3081c83169c, but the added test only fails when run in isolation with `--yjit-call-threshold=1`. The test sometimes failed on CI. * RJIT: YJIT: Fix `Struct` readers not firing tracing events Same issue as YJIT, but it looks like RJIT doesn't support writing to structs, so only reading needs changing.
2024-04-29YJIT: Remove CString allocation when using `src_loc!()`Alan Wu
Since we often take the VM lock as the first thing we do when entering YJIT, and that needs a `src_loc!()`, this removes an allocation from that. The main trick here is `concat!(file!(), '\0')` to get a C string statically baked into the binary.
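The `concat!(file!(), '\0')` trick can be illustrated in isolation. The sketch below is not the actual `src_loc!()` macro: `file_cstr!` and `current_file` are hypothetical names, and the conversion to `&CStr` is just one way to consume the NUL-terminated literal.

```rust
/// Hypothetical macro: yields a `&'static CStr` naming the current source file.
/// `concat!` appends the NUL terminator at compile time, so no `CString`
/// has to be allocated at run time.
macro_rules! file_cstr {
    () => {
        ::std::ffi::CStr::from_bytes_with_nul(concat!(file!(), '\0').as_bytes())
            .expect("file!() should not contain interior NUL bytes")
    };
}

/// Example caller; the returned string lives in the binary's static data.
pub fn current_file() -> &'static ::std::ffi::CStr {
    file_cstr!()
}
```

A pointer obtained with `current_file().as_ptr()` can be handed to C code directly, since the terminator is already part of the literal.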
2024-04-29YJIT: Take VM lock when invalidatingAlan Wu
We need the lock to patch code safely. This might fix some Ractor related crashes seen on CI.
2024-04-29YJIT: Expand codegen for `TrueClass#===` to `FalseClass` and `NilClass` (#10679)Randy Stauner
2024-04-29YJIT: Add specialized codegen function for `TrueClass#===` (#10640)Randy Stauner
* YJIT: Add specialized codegen function for `TrueClass#===` TrueClass#=== is currently number 10 in the most frequent C calls list of the lobsters benchmark. ``` require "benchmark/ips" def wrap true === true true === false true === :x end Benchmark.ips do |x| x.report(:wrap) do wrap end end ``` ``` before Warming up -------------------------------------- wrap 1.791M i/100ms Calculating ------------------------------------- wrap 17.806M (± 1.0%) i/s - 89.544M in 5.029363s after Warming up -------------------------------------- wrap 4.024M i/100ms Calculating ------------------------------------- wrap 40.149M (± 1.1%) i/s - 201.223M in 5.012527s ``` Co-authored-by: Maxime Chevalier-Boisvert <maxime.chevalierboisvert@shopify.com> Co-authored-by: Takashi Kokubun (k0kubun) <takashikkbn@gmail.com> Co-authored-by: Kevin Menard <kevin.menard@shopify.com> Co-authored-by: Alan Wu <XrXr@users.noreply.github.com> * Fix the new test for RJIT --------- Co-authored-by: Maxime Chevalier-Boisvert <maxime.chevalierboisvert@shopify.com> Co-authored-by: Takashi Kokubun (k0kubun) <takashikkbn@gmail.com> Co-authored-by: Kevin Menard <kevin.menard@shopify.com> Co-authored-by: Alan Wu <XrXr@users.noreply.github.com>
2024-04-26Revert "YJIT: Try splitting getlocal/setlocal blocks (#10648)"Takashi Kokubun
This reverts commit ab228bd0844758a1c444e39030c153874adf9120.
2024-04-26YJIT: Correct signature of rb_yjit_root_mark()Alan Wu
Even though unused, it's supposed to take a pointer like the C side expects.
2024-04-26YJIT: Fix reference update for `Invariants::no_ep_escape_iseqs`Alan Wu
Previously, the update was done in the ISEQ callback. That effectively never updated anything because the callback itself is given an intact reference, so it could update its content, and `rb_gc_location(iseq)` never returned a new address. Update the whole table once in the YJIT root instead.
2024-04-26YJIT: Try splitting getlocal/setlocal blocks (#10648)Takashi Kokubun
2024-04-25YJIT: Relax `--yjit-verify-ctx` after singleton class creationAlan Wu
Types like `Type::CString` really only assert that at one point the object had its class field equal to `String`. Once a singleton class is created for any strings, the type makes no assertion about any class field anymore, and becomes the same as `Type::TString`. Previously, the `--yjit-verify-ctx` option wasn't allowing objects of this kind that have singleton classes to pass verification even though the code generators handle it just fine. Found through `ruby/spec`.
2024-04-25YJIT: Optimize local variables when EP == BP (take 2) (#10607)Takashi Kokubun
* Revert "Revert "YJIT: Optimize local variables when EP == BP" (#10584)" This reverts commit c8783441952217c18e523749c821f82cd7e5d222. * YJIT: Take care of GC references in ISEQ invariants Co-authored-by: Alan Wu <alansi.xingwu@shopify.com> --------- Co-authored-by: Alan Wu <alansi.xingwu@shopify.com>
2024-04-24YJIT: Add a specialized codegen function for `Class#superclass`. (#10613)Kevin Menard
Add a specialized codegen function for `Class#superclass`. Co-authored-by: Maxime Chevalier-Boisvert <maxime.chevalierboisvert@shopify.com> Co-authored-by: Takashi Kokubun (k0kubun) <takashikkbn@gmail.com> Co-authored-by: Randy Stauner <randy.stauner@shopify.com> Co-authored-by: Alan Wu <XrXr@users.noreply.github.com>
2024-04-22YJIT: Fix shrinking block with assumption too much (#10585)Alan Wu
* YJIT: Fix shrinking block with assumption too much Under very specific circumstances, discovered by a test case in `ruby/spec`, an `expandarray` block can contain just a branch and carry a method lookup assumption. Previously, when we regenerated the branch, we allowed it to shrink to empty, since we put the code at the jump target immediately after it. That was incorrect and caused a crash while the block is invalidated, since that left no room to patch in an exit. When regenerating a branch that makes up a block entirely, and the block could be invalidated, we need to ensure there is room for invalidation. When there is code before the branch, it should act as padding, so we don't need to worry about those cases. * skip on RJIT
2024-04-19Revert "YJIT: Optimize local variables when EP == BP" (#10584)Alan Wu
This reverts commit 4cc58ea0b865f2fd20f1e881ddbd4c4fab0b072c. Since the change landed, call-threshold=1 CI runs have been timing out. There have also been `verify-ctx` violations. Revert for now while we debug.
2024-04-18chore: remove repetitive words (#10573)careworry
Signed-off-by: careworry <worrycare@outlook.com>
2024-04-18YJIT: Fix canary crash with Array#<< (#10568)Alan Wu
Previously, we got "We are killing the stack canary set by opt_ltlt" from `$ ./miniruby --yjit-call-threshold=1 -e 'a = [].freeze; a << 1'` Found by running ruby-spec with yjit-call-threshold=1.
2024-04-17YJIT: A64: Use CBZ/CBNZ to check for zeroAlan Wu
* YJIT: A64: Add CBZ and CBNZ encoding functions * YJIT: A64: Use CBZ/CBNZ to check for zero Instead of emitting `cmp x0, #0` plus `b.z #target`, A64 offers Compare and Branch on Zero for us to just do `cbz x0, #target`. This commit utilizes that and the related CBNZ instruction when appropriate. We check for zero most commonly in interrupt checks: ```diff # Insn: 0003 leave (stack_size: 1) # RUBY_VM_CHECK_INTS(ec) ldur w11, [x20, #0x20] -tst w11, w11 -b.ne #0x109002164 +cbnz w11, #0x1049021d0 ``` * fix copy paste error Co-authored-by: Randy Stauner <randy@r4s6.net> --------- Co-authored-by: Randy Stauner <randy@r4s6.net>
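For reference, the A64 encoding of CBZ/CBNZ can be sketched as a standalone function. This is not YJIT's encoder: `encode_cb` and its parameters are hypothetical. The bit layout follows the Arm architecture: `sf` in bit 31, the fixed pattern `011010` in bits 30 to 25, `op` in bit 24 (0 for CBZ, 1 for CBNZ), a signed 19-bit word offset in bits 23 to 5, and the register in bits 4 to 0.

```rust
/// Encode `cbz`/`cbnz` with a PC-relative byte offset.
/// Returns `None` if the register number or offset cannot be encoded.
pub fn encode_cb(rt: u8, byte_offset: i32, branch_if_nonzero: bool, is_64bit: bool) -> Option<u32> {
    // The target must be 4-byte aligned; the register field is 5 bits wide.
    if rt > 31 || byte_offset % 4 != 0 {
        return None;
    }
    // The offset is stored in units of instructions as a signed 19-bit field.
    let words = byte_offset / 4;
    if !(-(1 << 18)..(1 << 18)).contains(&words) {
        return None;
    }
    let imm19 = (words as u32) & 0x7ffff;
    let sf = is_64bit as u32;
    let op = branch_if_nonzero as u32;
    Some((sf << 31) | 0x3400_0000 | (op << 24) | (imm19 << 5) | rt as u32)
}
```

For example, `encode_cb(0, 8, false, true)` corresponds to `cbz x0` with a target two instructions ahead. The limited range of the 19-bit field (about ±1 MiB) is one reason an encoder might only use this form "when appropriate".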
2024-04-17YJIT: Optimize local variables when EP == BP (#10487)Takashi Kokubun
2024-04-16YJIT: End send fallback blocks (#10539)Takashi Kokubun
2024-04-15YJIT: A64: Avoid intermediate register in `opt_and` and friends (#10509)Alan Wu
Same idea as the x64 equivalent in c2622b52536c5, removing the register shuffle coming from the pop two, push one stack motion these VM instructions perform. ``` # Insn: 0004 opt_or (stack_size: 2) - orr x11, x1, x9 - mov x1, x11 + orr x1, x1, x9 ```
2024-04-11YJIT: x64: Remove register shuffle with `opt_and` and friends (#10498)Alan Wu
This is best understood by looking at the change to the output: ```diff # Insn: 0002 opt_and (stack_size: 2) - mov rax, rsi - and rax, rdi - mov rsi, rax + and rsi, rdi ``` It's a bit awkward to match against due to how stack operands are lowered, but hey, it's nice to save the 2 unnecessary MOVs.
2024-04-10Fix a typo in a commentTakashi Kokubun
2024-04-03YJIT: Let sp_opnd take the number of slots (#10442)Takashi Kokubun
2024-04-03YJIT: Suppress warn(static_mut_refs) (#10440)Takashi Kokubun
2024-04-02YJIT: A64: Use ADDS/SUBS/CMP (immediate) when possible (#10402)Alan Wu
* YJIT: A64: Use ADDS/SUBS/CMP (immediate) when possible We were loading 1 into a register and then doing ADDS/SUBS previously. That was particularly bad since those come up in fixnum operations. ```diff # integer left shift with rhs=1 - mov x11, #1 - subs x11, x1, x11 + subs x11, x1, #1 lsl x12, x11, #1 asr x13, x12, #1 cmp x13, x11 - b.ne #0x106ab60f8 - mov x11, #1 - adds x12, x12, x11 + b.ne #0x10903a0f8 + adds x12, x12, #1 mov x1, x12 ``` Note that it's fine to cast between i64 and u64 since the bit pattern is preserved, and the add/sub themselves don't care about the signedness of the operands. CMP is just another mnemonic for SUBS. * YJIT: A64: Split asm.mul() with immediates properly There is in fact no MUL on A64 that takes an immediate, so this instruction was using the wrong split method. No current usages of this form in YJIT. --------- Co-authored-by: Maxime Chevalier-Boisvert <maxime.chevalierboisvert@shopify.com>
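The note about casting between i64 and u64 can be checked directly in Rust. The sketch below uses hypothetical helper names; it shows that two's complement addition and subtraction produce the same bit pattern whether the operands are viewed as signed or unsigned.

```rust
/// Add two signed values by going through their unsigned bit patterns.
pub fn add_via_unsigned(a: i64, b: i64) -> i64 {
    // `as` between i64 and u64 reinterprets the bits without changing them.
    (a as u64).wrapping_add(b as u64) as i64
}

/// Subtract two signed values by going through their unsigned bit patterns.
pub fn sub_via_unsigned(a: i64, b: i64) -> i64 {
    (a as u64).wrapping_sub(b as u64) as i64
}
```

Both helpers agree with `i64::wrapping_add` and `i64::wrapping_sub` for all inputs, which is why an immediate can be stored as either signed or unsigned before being emitted.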
2024-03-28YJIT: Optimize putobject+opt_ltlt for integersAlan Wu
In `jit_rb_int_lshift()`, we guard against the right hand side changing since we want to avoid generating variable length shifts. When control reaches a `putobject` and `opt_ltlt` pair, though, we know that the right hand side never changes. This commit detects this situation and substitutes an implementation that does not guard against the right hand side changing, saving that work. Deleted some `putobject` Rust tests since they aren't that valuable and cause linking issues. Nice boost to `optcarrot` and `protoboeuf`: ``` ---------- ------------------ bench yjit-pre/yjit-post optcarrot 1.09 protoboeuf 1.12 ---------- ------------------ ```
2024-03-28YJIT: add iseq_alloc_count to stats (#10398)Maxime Chevalier-Boisvert
* YJIT: add iseq_alloc_count to stats * Remove an empty line --------- Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
2024-03-25YJIT: Inline simple getlocal+leave iseqsAlan Wu
This mainly targets things like `T.unsafe()` from Sorbet, which is just an identity function at runtime and only a hint for the static checker. Only deal with simple caller and callees (no keywords and splat etc.). Co-authored-by: Takashi Kokubun (k0kubun) <takashikkbn@gmail.com>
2024-03-25YJIT: Propagate Array, Hash, and String classes (#10323)Takashi Kokubun
2024-03-20YJIT: Get rid of Type::TProc (#10287)Takashi Kokubun
2024-03-19Implement chilled stringsÉtienne Barrié
[Feature #20205] As a path toward enabling frozen string literals by default in the future, this commit introduces "chilled strings". From a user perspective chilled strings pretend to be frozen, but on the first attempt to mutate them, they lose their frozen status and emit a warning rather than raising a `FrozenError`. Implementation-wise, `rb_compile_option_struct.frozen_string_literal` is no longer a boolean but a tri-state of `enabled/disabled/unset`. When code is compiled with frozen string literals neither explicitly enabled nor disabled, string literals are compiled with a new `putchilledstring` instruction. This instruction is identical to `putstring` except it marks the String with the `STR_CHILLED (FL_USER3)` and `FL_FREEZE` flags. Chilled strings have the `FL_FREEZE` flag so as to minimize the need to check for chilled strings across the codebase, and to improve compatibility with C extensions. Notes: - `String#freeze`: clears the chilled flag. - `String#-@`: acts as if the string was mutable. - `String#+@`: acts as if the string was mutable. - `String#clone`: copies the chilled flag. Co-authored-by: Jean Boussier <byroot@ruby-lang.org>
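The user-visible behavior described in this commit message can be modeled as a small state machine. The sketch below is a toy model in Rust, not CRuby code: `Frozenness`, `ModelString`, and the warning text are all hypothetical, and it reflects the semantics as of this commit only (a later entry in this log stops marking chilled strings as frozen).

```rust
/// Toy model of the three states a string literal can be in.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Frozenness {
    Mutable,
    Chilled,
    Frozen,
}

/// Hypothetical stand-in for a Ruby String object.
pub struct ModelString {
    pub buf: String,
    pub state: Frozenness,
}

impl ModelString {
    /// As of this commit, chilled strings pretend to be frozen.
    pub fn is_frozen(&self) -> bool {
        self.state != Frozenness::Mutable
    }

    /// `String#freeze` clears the chilled flag and really freezes.
    pub fn freeze(&mut self) {
        self.state = Frozenness::Frozen;
    }

    /// Mutation: frozen strings fail, chilled strings warn once and thaw.
    pub fn push_str(&mut self, s: &str, warnings: &mut Vec<String>) -> Result<(), String> {
        match self.state {
            Frozenness::Frozen => return Err("FrozenError".to_string()),
            Frozenness::Chilled => {
                // Placeholder warning text, not the real message.
                warnings.push("chilled string literal was mutated".to_string());
                self.state = Frozenness::Mutable;
            }
            Frozenness::Mutable => {}
        }
        self.buf.push_str(s);
        Ok(())
    }
}
```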
2024-03-18YJIT: Support arity=-2 cfuncs (#10268)Alan Wu
This type of cfunc shows up as consuming a lot of cycles in profiles of the lobsters benchmark, even though in the stats they don't happen that frequently. Might be a bug in the profiling, but these calls are not too bad to support, so might as well do it. Co-authored-by: Maxime Chevalier-Boisvert <maxime.chevalierboisvert@shopify.com>
2024-03-13YJIT: Fallback cfunc varg splat for ruby2_keywords (#10226)Takashi Kokubun