summaryrefslogtreecommitdiff
path: root/yjit
AgeCommit message (Collapse)Author
2024-04-29YJIT: Remove CString allocation when using `src_loc!()`Alan Wu
Since we often take the VM lock as the first thing we do when entering YJIT, and that needs a `src_loc!()`, this removes a allocation from that. The main trick here is `concat!(file!(), '\0')` to get a C string statically baked into the binary.
2024-04-29YJIT: Take VM lock when invalidatingAlan Wu
We need the lock to patch code safely. This might fix some Ractor related crashes seen on CI.
2024-04-29YJIT: Expand codegen for `TrueClass#===` to `FalseClass` and `NilClass` (#10679)Randy Stauner
2024-04-29YJIT: Add specialized codegen function for `TrueClass#===` (#10640)Randy Stauner
* YJIT: Add specialized codegen function for `TrueClass#===` TrueClass#=== is currently number 10 in the most frequent C calls list of the lobsters benchmark. ``` require "benchmark/ips" def wrap true === true true === false true === :x end Benchmark.ips do |x| x.report(:wrap) do wrap end end ``` ``` before Warming up -------------------------------------- wrap 1.791M i/100ms Calculating ------------------------------------- wrap 17.806M (± 1.0%) i/s - 89.544M in 5.029363s after Warming up -------------------------------------- wrap 4.024M i/100ms Calculating ------------------------------------- wrap 40.149M (± 1.1%) i/s - 201.223M in 5.012527s ``` Co-authored-by: Maxime Chevalier-Boisvert <maxime.chevalierboisvert@shopify.com> Co-authored-by: Takashi Kokubun (k0kubun) <takashikkbn@gmail.com> Co-authored-by: Kevin Menard <kevin.menard@shopify.com> Co-authored-by: Alan Wu <XrXr@users.noreply.github.com> * Fix the new test for RJIT --------- Co-authored-by: Maxime Chevalier-Boisvert <maxime.chevalierboisvert@shopify.com> Co-authored-by: Takashi Kokubun (k0kubun) <takashikkbn@gmail.com> Co-authored-by: Kevin Menard <kevin.menard@shopify.com> Co-authored-by: Alan Wu <XrXr@users.noreply.github.com>
2024-04-26Revert "YJIT: Try splitting getlocal/setlocal blocks (#10648)"Takashi Kokubun
This reverts commit ab228bd0844758a1c444e39030c153874adf9120.
2024-04-26YJIT: Correct signature of rb_yjit_root_mark()Alan Wu
Even though unused, it's supposed to take a pointer like the C side expects.
2024-04-26YJIT: Fix reference update for `Invariants::no_ep_escape_iseqs`Alan Wu
Previously, the update was done in the ISEQ callback. That effectively never updated anything because the callback itself is given an intact reference, so it could update its content, and `rb_gc_location(iseq)` never returned a new address. Update the whole table once in the YJIT root instead.
2024-04-26YJIT: Try splitting getlocal/setlocal blocks (#10648)Takashi Kokubun
2024-04-25YJIT: Relax `--yjit-verify-ctx` after singleton class creationAlan Wu
Types like `Type::CString` really only assert that at one point the object had its class field equal to `String`. Once a singleton class is created for any strings, the type makes no assertion about any class field anymore, and becomes the same as `Type::TString`. Previously, the `--yjit-verify-ctx` option wasn't allowing objects of these kind that have have singleton classes to pass verification even though the code generators handle it just fine. Found through `ruby/spec`.
2024-04-25YJIT: Optimize local variables when EP == BP (take 2) (#10607)Takashi Kokubun
* Revert "Revert "YJIT: Optimize local variables when EP == BP" (#10584)" This reverts commit c8783441952217c18e523749c821f82cd7e5d222. * YJIT: Take care of GC references in ISEQ invariants Co-authored-by: Alan Wu <alansi.xingwu@shopify.com> --------- Co-authored-by: Alan Wu <alansi.xingwu@shopify.com>
2024-04-24YJIT: Add a specialized codegen function for `Class#superclass`. (#10613)Kevin Menard
Add a specialized codegen function for `Class#superclass`. Co-authored-by: Maxime Chevalier-Boisvert <maxime.chevalierboisvert@shopify.com> Co-authored-by: Takashi Kokubun (k0kubun) <takashikkbn@gmail.com> Co-authored-by: Randy Stauner <randy.stauner@shopify.com> Co-authored-by: Alan Wu <XrXr@users.noreply.github.com>
2024-04-22YJIT: Fix shrinking block with assumption too much (#10585)Alan Wu
* YJIT: Fix shrinking block with assumption too much Under the very specific circumstances, discovered by a test case in `ruby/spec`, an `expandarray` block can contain just a branch and carry a method lookup assumption. Previously, when we regenerated the branch, we allowed it to shrink to empty, since we put the code at the jump target immediately after it. That was incorrect and caused a crash while the block is invalidated, since that left no room to patch in an exit. When regenerating a branch that makes up a block entirely, and the block could be invalidated, we need to ensure there is room for invalidation. When there is code before the branch, they should act as padding, so we don't need to worry about those cases. * skip on RJIT
2024-04-19Revert "YJIT: Optimize local variables when EP == BP" (#10584)Alan Wu
This reverts commit 4cc58ea0b865f2fd20f1e881ddbd4c4fab0b072c. Since the change landed call-threshold=1 CI runs have been timing out. There has also been `verify-ctx` violations. Revert for now while we debug.
2024-04-18chore: remove repetitive words (#10573)careworry
Signed-off-by: careworry <worrycare@outlook.com>
2024-04-18YJIT: Fix canary crash with Array#<< (#10568)Alan Wu
Previously, we got "We are killing the stack canary set by opt_ltlt" from `$./miniruby --yjit-call-threshold=1 -e 'a = [].freeze; a << 1'` Found by running ruby-spec with yjit-call-threshold=1.
2024-04-17YJIT: A64: Use CBZ/CBNZ to check for zeroAlan Wu
* YJIT: A64: Add CBZ and CBNZ encoding functions * YJIT: A64: Use CBZ/CBNZ to check for zero Instead of emitting `cmp x0, #0` plus `b.z #target`, A64 offers Compare and Branch on Zero for us to just do `cbz x0, #target`. This commit utilizes that and the related CBNZ instruction when appropriate. We check for zero most commonly in interrupt checks: ```diff # Insn: 0003 leave (stack_size: 1) # RUBY_VM_CHECK_INTS(ec) ldur w11, [x20, #0x20] -tst w11, w11 -b.ne #0x109002164 +cbnz w11, #0x1049021d0 ``` * fix copy paste error Co-authored-by: Randy Stauner <randy@r4s6.net> --------- Co-authored-by: Randy Stauner <randy@r4s6.net>
2024-04-17YJIT: Optimize local variables when EP == BP (#10487)Takashi Kokubun
2024-04-16YJIT: End send fallback blocks (#10539)Takashi Kokubun
2024-04-15YJIT: A64: Avoid intermediate register in `opt_and` and friends (#10509)Alan Wu
Same idea as the x64 equivalent in c2622b52536c5, removing the register shuffle coming from the pop two, push one stack motion these VM instructions perform. ``` # Insn: 0004 opt_or (stack_size: 2) - orr x11, x1, x9 - mov x1, x11 + orr x1, x1, x9 ```
2024-04-11YJIT: x64: Remove register shuffle with `opt_and` and friends (#10498)Alan Wu
This is best understood by looking at the change to the output: ```diff # Insn: 0002 opt_and (stack_size: 2) - mov rax, rsi - and rax, rdi - mov rsi, rax + and rsi, rdi ``` It's a bit awkward to match against due to how stack operands are lowered, but hey, it's nice to save the 2 unnecessary MOVs.
2024-04-10Fix a typo in a commentTakashi Kokubun
2024-04-03YJIT: Let sp_opnd take the number of slots (#10442)Takashi Kokubun
2024-04-03YJIT: Suppress warn(static_mut_refs) (#10440)Takashi Kokubun
2024-04-02YJIT: A64: Use ADDS/SUBS/CMP (immediate) when possible (#10402)Alan Wu
* YJIT: A64: Use ADDS/SUBS/CMP (immediate) when possible We were loading 1 into a register and then doing ADDS/SUBS previously. That was particularly bad since those come up in fixnum operations. ```diff # integer left shift with rhs=1 - mov x11, #1 - subs x11, x1, x11 + subs x11, x1, #1 lsl x12, x11, #1 asr x13, x12, #1 cmp x13, x11 - b.ne #0x106ab60f8 - mov x11, #1 - adds x12, x12, x11 + b.ne #0x10903a0f8 + adds x12, x12, #1 mov x1, x12 ``` Note that it's fine to cast between i64 and u64 since the bit pattern is preserved, and the add/sub themselves don't care about the signedness of the operands. CMP is just another mnemonic for SUBS. * YJIT: A64: Split asm.mul() with immediates properly There is in fact no MUL on A64 that takes an immediate, so this instruction was using the wrong split method. No current usages of this form in YJIT. --------- Co-authored-by: Maxime Chevalier-Boisvert <maxime.chevalierboisvert@shopify.com>
2024-03-28YJIT: Optimize putobject+opt_ltlt for integersAlan Wu
In `jit_rb_int_lshift()`, we guard against the right hand side changing since we want to avoid generating variable length shifts. When control reaches a `putobject` and `opt_ltlt` pair, though, we know that the right hand side never changes. This commit detects this situation and substitutes an implementation that does not guard against the right hand side changing, saving that work. Deleted some `putobject` Rust tests since they aren't that valuable and cause linking issues. Nice boost to `optcarrot` and `protoboeuf`: ``` ---------- ------------------ bench yjit-pre/yjit-post optcarrot 1.09 protoboeuf 1.12 ---------- ------------------ ```
2024-03-28YJIT: add iseq_alloc_count to stats (#10398)Maxime Chevalier-Boisvert
* YJIT: add iseq_alloc_count to stats * Remove an empty line --------- Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
2024-03-25YJIT: Inline simple getlocal+leave iseqsAlan Wu
This mainly targets things like `T.unsafe()` from Sorbet, which is just an identity function at runtime and only a hint for the static checker. Only deal with simple caller and callees (no keywords and splat etc.). Co-authored-by: Takashi Kokubun (k0kubun) <takashikkbn@gmail.com>
2024-03-25YJIT: Propagate Array, Hash, and String classes (#10323)Takashi Kokubun
2024-03-22Propagate jobserver FDs to `cargo` and `rustc` [ci skip]Nobuyoshi Nakada
https://doc.rust-lang.org/cargo/reference/build-scripts.html#jobserver > Cargo and `rustc` use the jobserver protocol, developed for GNU > make, to coordinate concurrency across processes. Older GNU make used FDs to connect to the jobserver, and needs an explicit use of `$(MAKE)` variable or `+` prefix to pass the FDs.
2024-03-20YJIT: Get rid of Type::TProc (#10287)Takashi Kokubun
2024-03-19Implement chilled stringsÉtienne Barrié
[Feature #20205] As a path toward enabling frozen string literals by default in the future, this commit introduce "chilled strings". From a user perspective chilled strings pretend to be frozen, but on the first attempt to mutate them, they lose their frozen status and emit a warning rather than to raise a `FrozenError`. Implementation wise, `rb_compile_option_struct.frozen_string_literal` is no longer a boolean but a tri-state of `enabled/disabled/unset`. When code is compiled with frozen string literals neither explictly enabled or disabled, string literals are compiled with a new `putchilledstring` instruction. This instruction is identical to `putstring` except it marks the String with the `STR_CHILLED (FL_USER3)` and `FL_FREEZE` flags. Chilled strings have the `FL_FREEZE` flag as to minimize the need to check for chilled strings across the codebase, and to improve compatibility with C extensions. Notes: - `String#freeze`: clears the chilled flag. - `String#-@`: acts as if the string was mutable. - `String#+@`: acts as if the string was mutable. - `String#clone`: copies the chilled flag. Co-authored-by: Jean Boussier <byroot@ruby-lang.org>
2024-03-18YJIT: Support arity=-2 cfuncs (#10268)Alan Wu
This type of cfuncs shows up as consume a lot of cycles in profiles of the lobsters benchmark, even though in the stats they don't happen that frequently. Might be a bug in the profiling, but these calls are not too bad to support, so might as well do it. Co-authored-by: Maxime Chevalier-Boisvert <maxime.chevalierboisvert@shopify.com>
2024-03-13YJIT: Fallback cfunc varg splat for ruby2_keywords (#10226)Takashi Kokubun
2024-03-13Update cruby_bindings.inc.rsPeter Zhu
2024-03-12Revisions for #10198Takashi Kokubun
This fixes some inconsistencies introduced by that PR.
2024-03-06YJIT: String#getbyte codegen (#10188)Maxime Chevalier-Boisvert
* WIP getbyte implementation * WIP String#getbyte implementation * Fix whitespace in stats.rs * fix? * Fix whitespace, add comment --------- Co-authored-by: Aaron Patterson <aaron.patterson@shopify.com>
2024-03-06Move FL_SINGLETON to FL_USER1Jean Boussier
This frees FL_USER0 on both T_MODULE and T_CLASS. Note: prior to this, FL_SINGLETON was never set on T_MODULE, so checking for `FL_SINGLETON` without first checking that `FL_TYPE` was `T_CLASS` was valid. That's no longer the case.
2024-03-05[DOC] fix some commentscui fliter
Signed-off-by: cui fliter <imcusg@gmail.com>
2024-03-01Update bindgen for YJIT and RJITTakashi Kokubun
2024-03-01Correctly set anon_kwrest flag for def f(b: 1, **)Jeremy Evans
In cases where a method accepts both keywords and an anonymous keyword splat, the method was not marked as taking an anonymous keyword splat. Fix that in the compiler. Doing that broke handling of nil keyword splats in yjit, so update yjit to handle that. Add a test to check that calling a method that accepts both a keyword argument and an anonymous keyword splat does not modify a passed keyword splat hash. Move the anon_kwrest check from setup_parameters_complex to ignore_keyword_hash_p, and only use it if the keyword hash is already a hash. This should speed things up slightly as it avoids a check previously used for all callers of setup_parameters_complex. Co-authored-by: Nobuyoshi Nakada <nobu@ruby-lang.org>
2024-03-01YJIT: No need to set cfp->sp when setting escaped localsAlan Wu
While writing to the env object can add it to the remember set, it shouldn't trigger a GC run.
2024-02-29YJIT: Support inlining putself (#10137)Takashi Kokubun
2024-02-28YJIT: Squash canary before falling backAlan Wu
Recent flaky canary-related CI failures have all happened while trying to fall back. It's unclear what is leaving the canary on the stack and causing gen_send_dynamic() to falsely assume that it should be leaf, and this patch isn't going to help us find the source. One source I found is Array#<< with a frozen array, but it's unclear if that's what's causing the CI failures. I'm somewhat afraid to add a canary check to rb_longjmp() since that might introduce more flaky failures, and maybe ones unrelated to YJIT. See: https://github.com/ruby/ruby/actions/runs/8083502532/job/22086714152 See: https://github.com/ruby/ruby/actions/runs/8066858522/job/22035963315
2024-02-28YJIT: Reject keywords hash in -1 arity cfunc splat supportAlan Wu
`test_keyword.rb` caught this issue. Just need to run with `threshold=1`
2024-02-27YJIT: Support splat with C methods with -1 arityAlan Wu
Usually we deal with splats by speculating that they're of a specific size. In this case, the C method takes a pointer and a length, so we can support changing sizes just fine.
2024-02-25Bump capstone from 0.11.0 to 0.12.0 in /yjit (#10094)dependabot[bot]
Bumps [capstone](https://github.com/capstone-rust/capstone-rs) from 0.11.0 to 0.12.0. - [Release notes](https://github.com/capstone-rust/capstone-rs/releases) - [Changelog](https://github.com/capstone-rust/capstone-rs/blob/master/CHANGELOG.md) - [Commits](https://github.com/capstone-rust/capstone-rs/compare/capstone-v0.11.0...capstone-v0.12.0) --- updated-dependencies: - dependency-name: capstone dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-02-23Assert running_iseq before using itTakashi Kokubun
When running_iseq happens to be 0, it's better to fail on the assertion rather than referencing the null pointer.
2024-02-23[DOC] Fix a typoTakashi Kokubun
2024-02-23YJIT: Lazily push a frame for specialized C funcs (#10080)Takashi Kokubun
* YJIT: Lazily push a frame for specialized C funcs Co-authored-by: Maxime Chevalier-Boisvert <maxime.chevalierboisvert@shopify.com> * Fix a comment on pc_to_cfunc * Rename rb_yjit_check_pc to rb_yjit_lazy_push_frame * Rename it to jit_prepare_lazy_frame_call * Fix a typo * Optimize String#getbyte as well * Optimize String#byteslice as well --------- Co-authored-by: Maxime Chevalier-Boisvert <maxime.chevalierboisvert@shopify.com>
2024-02-22YJIT: Optimize attr_writer (#9986)Takashi Kokubun
* YJIT: Optimize attr_writer * Comment about StackOpnd vs SelfOpnd