summaryrefslogtreecommitdiff
path: root/yjit
AgeCommit message (Collapse)Author
19 hoursZJIT: Optimize common `invokesuper` cases (#15816)Kevin Menard
* ZJIT: Profile `invokesuper` instructions * ZJIT: Introduce the `InvokeSuperDirect` HIR instruction The new instruction is an optimized version of `InvokeSuper` when we know the `super` target is an ISEQ. * ZJIT: Expand definition of unspecializable to more complex cases * ZJIT: Ensure `invokesuper` optimization works when the inheritance hierarchy is modified * ZJIT: Simplify `invokesuper` specialization to most common case Looking at ruby-bench, most `super` calls don't pass a block, which means we can use the already optimized `SendWithoutBlockDirect`. * ZJIT: Track `super` method entries directly to avoid GC issues Because the method entry isn't typed as a `VALUE`, we set up barriers on its `VALUE` fields. But, that was insufficient as the method entry itself could be collected in certain cases, resulting in dangling objects. Now we track the method entry as a `VALUE` and can more naturally mark it and its children. * ZJIT: Optimize `super` calls with simple argument forms * ZJIT: Report the reason why we can't optimize an `invokesuper` instance * ZJIT: Revise send fallback reasons for `super` calls * ZJIT: Assert `super` calls are `FCALL` and don't need visibily checks
21 hoursYJIT: A64: In CPopAll, pop into the register before using MSRAlan Wu
Or else we put garbage into the flags.
21 hoursYJIT: Properly preserve register mapping in cpush_all() and cpop_all()Alan Wu
Previously, cpop_all() did not in fact restore the register mapping state since it was effectively doing a no-op `self.ctx.set_reg_mapping(self.ctx.get_reg_mapping())`. This desync in bookkeeping led to issues with the --yjit-dump-insns option because print_str() used to use cpush_all() and cpop_all().
21 hoursYJIT: Fix --yjit-dump-insns by removing {cpush,cpop}_all() in printersAlan Wu
cpush_all() and cpop_all() in theory enabled these `print_*` utilities to work in more spots, but with automatically spilling in asm.ccall(), the benefits are now limited. They also have a bug at the moment. Stop using them to dodge the bug.
6 daysYJIT: Add frozen guard for struct aset (#15835)Max Bernstein
We used to just skip this check (oops), but we should not allow modifying frozen objects.
6 daysYJIT: gen_struct_aset check for frozen statusJean Boussier
2025-12-26Remove taintedness/trustedness enums/macros deprecated for 4 yearsNobuyoshi Nakada
2025-12-25Implement declaring weak referencesPeter Zhu
[Feature #21084] # Summary The current way of marking weak references uses `rb_gc_mark_weak(VALUE *ptr)`. This presents challenges because Ruby's GC is incremental, meaning that if the `ptr` changes (e.g. realloc'd or free'd), then we could have an invalid memory access. This also overwrites `*ptr = Qundef` if `*ptr` is dead, which prevents any cleanup to be run (e.g. freeing memory or deleting entries from hash tables). This ticket proposes `rb_gc_declare_weak_references` which declares that an object has weak references and calls a cleanup function after marking, allowing the object to clean up any memory for dead objects. # Introduction In [[Feature #19783]](https://bugs.ruby-lang.org/issues/19783), I introduced an API allowing objects to mark weak references, the function signature looks like this: ```c void rb_gc_mark_weak(VALUE *ptr); ``` `rb_gc_mark_weak` is called during the marking phase of the GC to specify that the memory at `ptr` holds a pointer to a Ruby object that is weakly referenced. `rb_gc_mark_weak` appends this pointer to a list that is processed after the marking phase of the GC. If the object at `*ptr` is no longer alive, then it overwrites the object reference with a special value (`*ptr = Qundef`). However, this API resulted in two challenges: 1. Ruby's default GC is incremental, which means that the GC is not ran in one phase, but rather split into chunks of work that interleaves with Ruby execution. The `ptr` passed into `rb_gc_mark_weak` could be on the malloc heap, and that memory could be realloc'd or even free'd. We had to use workarounds such as `rb_gc_remove_weak` to ensure that there were no illegal memory accesses. This made `rb_gc_mark_weak` difficult to use, impacted runtime performance, and increased memory usage. 2. When an object dies, `rb_gc_mark_weak` only overwites the reference with `Qundef`. This means that if we want to do any cleanup (e.g. free a piece of memory or delete a hash table entry), we could not do that and had to defer this process elsewhere (e.g. during marking or runtime). In this ticket, I'm proposing a new API for weak references. Instead of an object marking its weak references during the marking phase, the object declares that it has weak references using the `rb_gc_declare_weak_references` function. This declaration occurs during runtime (e.g. after the object has been created) rather than during GC. After an object declares that it has weak references, it will have its callback function called after marking as long as that object is alive. This callback function can then call a special function `rb_gc_handle_weak_references_alive_p` to determine whether its references are alive. This will allow the callback function to do whatever it wants on the object, allowing it to perform any cleanup work it needs. This significantly simplifies the code for `ObjectSpace::WeakMap` and `ObjectSpace::WeakKeyMap` because it no longer needs to have the workarounds for the limitations of `rb_gc_mark_weak`. # Performance The performance results below demonstrate that `ObjectSpace::WeakMap#[]=` is now about 60% faster because the implementation has been simplified and the number of allocations has been reduced. We can see that there is not a significant impact on the performance of `ObjectSpace::WeakMap#[]`. Base: ``` ObjectSpace::WeakMap#[]= 4.620M (± 6.4%) i/s (216.44 ns/i) - 23.342M in 5.072149s ObjectSpace::WeakMap#[] 30.967M (± 1.9%) i/s (32.29 ns/i) - 154.998M in 5.007157s ``` Branch: ``` ObjectSpace::WeakMap#[]= 7.336M (± 2.8%) i/s (136.31 ns/i) - 36.755M in 5.013983s ObjectSpace::WeakMap#[] 30.902M (± 5.4%) i/s (32.36 ns/i) - 155.901M in 5.064060s ``` Code: ``` require "bundler/inline" gemfile do source "https://rubygems.org" gem "benchmark-ips" end wmap = ObjectSpace::WeakMap.new key = Object.new val = Object.new wmap[key] = val Benchmark.ips do |x| x.report("ObjectSpace::WeakMap#[]=") do |times| i = 0 while i < times wmap[Object.new] = Object.new i += 1 end end x.report("ObjectSpace::WeakMap#[]") do |times| i = 0 while i < times wmap[key] wmap[val] # does not exist i += 1 end end end ``` # Alternative designs Currently, `rb_gc_declare_weak_references` is designed to be an internal-only API. This allows us to assume the object types that call `rb_gc_declare_weak_references`. In the future, if we want to open up this API to third parties, we may want to change this function to something like: ```c void rb_gc_add_cleaner(VALUE obj, void (*callback)(VALUE obj)); ``` This will allow the third party to implement a custom `callback` that gets called after the marking phase of GC to clean up any dead references. I chose not to implement this design because it is less efficient as we would need to store a mapping from `obj` to `callback`, which requires extra memory.
2025-12-18JIT: Move EC offsets to jit_bindgen_constantsJohn Hawthorn
Co-authored-by: Alan Wu <alanwu@ruby-lang.org>
2025-12-18Co-authored-by: Luke Gruber <luke.gru@gmail.com>John Hawthorn
Co-authored-by: Alan Wu <alanwu@ruby-lang.org> YJIT: Support calling bmethods in Ractors Co-authored-by: Luke Gruber <luke.gru@gmail.com> Suggestion from Alan
2025-12-18YJIT: Support calling bmethods in RactorsJohn Hawthorn
Co-authored-by: Luke Gruber <luke.gru@gmail.com>
2025-12-17JITs: Pass down GNU make jobserver resources when appropriateAlan Wu
To fix warnings from rustc on e.g. Make 4.3, which is in Ubuntu 24.04: > warning: failed to connect to jobserver from environment variable
2025-12-16YJIT: Print `Rc` strong and weak count on assert failureAlan Wu
For <https://bugs.ruby-lang.org/issues/21716>, the panic is looking like some sort of third party memory corruption, with YJIT taking the fall. At the point of this assert, the assembler has dropped, so there's nothing in YJIT's code other than JITState that could be holding on to these transient `PendingBranchRef`. The strong count being more than a handful or the weak count is non-zero shows that someone in the process (likely some native extension) corrupted the Rc's counts.
2025-12-15YJIT: Bail out if proc would be stored above stack topRandy Stauner
Fixes [Bug #21266].
2025-12-12YJIT: Fix panic from overly loose filtering in identity method inliningAlan Wu
Credits to @rwstauner for noticing this issue in GH-15533.
2025-12-12YJIT: Add missing local variable type update for fallback setlocal blocksAlan Wu
Previously, the chain_depth>0 version of setlocal blocks did not update the type of the local variable in the context. This can leave the context with stale type information and trigger panics like in [Bug #21772] or lead to miscompilation. To trigger the issue, YJIT needs to see the same ISEQ before and after environment escape and have tracked type info before the escape. To trigger in ISEQs that do not send with a block, it probably requires Kernel#binding or the use of include/ruby/debug.h APIs.
2025-12-10JITs: Drop cargo and use just rustc for release combo buildAlan Wu
So we don't expose builders to network flakiness which cannot be worked around using cargo's --offline flag.
2025-12-10YJIT: For rustc build, remove cargo touch(1) workaroundAlan Wu
2025-12-09ZJIT: Add codegen for FixnumDiv (#15452)Abrar Habib
Fixes https://github.com/Shopify/ruby/issues/902 This pull request adds code generation for dividing fixnums. Testing confirms the normal case, flooring, and side-exiting on division by zero.
2025-12-05YJIT: Fix duplicate make rule warning in combo buildAlan Wu
2025-12-05YJIT: Fix including stats for ZJIT instructions when ZJIT not in buildAlan Wu
2025-12-05JITs: Update bindings to include interpreter zjit_ opcodesAlan Wu
Mostly YJIT. ZJIT already has the right bindings and this just tweaks the CI configuration.
2025-12-03YJIT: Pass class and shape ID directly instead of objectMax Bernstein
2025-12-01Add BOP_GTGTMax Bernstein
This will help JITs (and maybe later the interpreter) optimize Integer#>>.
2025-12-01ZJIT: Specialize String#<< with FixnumMax Bernstein
Append a codepoint.
2025-11-26YJIT: Abort expandarray optimization if method_missing is definedRandy Stauner
Fixes: [Bug #21707] [AW: rewrote comments] Co-authored-by: Alan Wu <alanwu@ruby-lang.org>
2025-11-20Name the `iseq->body->param` struct and update bindings for JITsAlan Wu
This will make reading the parameters nicer for the JITs. Should be no-op for the C side.
2025-11-18Extract `KW_SPECIFIED_BITS_MAX` for JITs (GH-15039)Jacob
Rename to `VM_KW_SPECIFIED_BITS_MAX` now that it's in `vm_core.h`.
2025-11-18YJIT: omit single ractor mode assumption for `proc#call` (#15092)Luke Gruber
The comptime receiver, which is a proc, is either shareable or from this ractor so we don't need to assume single-ractor mode. We should never get the "defined with an un-shareable Proc in a different ractor" error.
2025-11-14YJIT: Fix stack handling in rb_str_dupJohn Hawthorn
Previously because we did a stack_push before ccall, in some cases we could end up pushing an uninitialized value to the VM stack when spilling regs as part of the ccall. Co-authored-by: Luke Gruber <luke.gru@gmail.com>
2025-11-07Follow renaming from Namespace to Ruby::BoxSatoshi Tagomori
2025-10-28YJIT, ZJIT: Fix unnecessary `use` of macrosTakashi Kokubun
https://github.com/ruby/ruby/actions/runs/18887695798/job/53907237061?pr=14975
2025-10-22ZJIT: Fetch Primitive.attr!(leaf) for InvokeBuiltinMax Bernstein
Fix https://github.com/Shopify/ruby/issues/670
2025-10-22YJIT: Buffer writes to the perf mapAlan Wu
2025-10-22ZJIT: Inline String#==, String#===Max Bernstein
2025-10-21YJIT: ZJIT: Extract common bindings to jit.c and remove unnamed enums.Alan Wu
The type name bindgen picks for anonymous enums creates desync issues on the bindgen CI checks.
2025-10-20ZJIT: Implement codegen for FixnumMod (#14857)Max Bernstein
This is mostly to see what happens to the loops-times benchmark.
2025-10-20ZJIT: Implement expandarray (#14847)Max Bernstein
Only support the simple case: no splat or rest. lobsters before: <details> ``` ***ZJIT: Printing ZJIT statistics on exit*** Top-20 not inlined C methods (60.5% of total 11,039,954): Kernel#is_a?: 1,030,769 ( 9.3%) String#<<: 851,954 ( 7.7%) Hash#[]=: 742,941 ( 6.7%) Regexp#match?: 399,894 ( 3.6%) String#empty?: 353,775 ( 3.2%) Hash#key?: 349,147 ( 3.2%) String#start_with?: 334,961 ( 3.0%) Kernel#respond_to?: 316,528 ( 2.9%) ObjectSpace::WeakKeyMap#[]: 238,978 ( 2.2%) TrueClass#===: 235,771 ( 2.1%) FalseClass#===: 231,144 ( 2.1%) Array#include?: 211,385 ( 1.9%) Hash#fetch: 204,702 ( 1.9%) Kernel#block_given?: 181,797 ( 1.6%) Kernel#dup: 179,341 ( 1.6%) BasicObject#!=: 175,997 ( 1.6%) Class#new: 168,079 ( 1.5%) Kernel#kind_of?: 165,600 ( 1.5%) String#==: 157,735 ( 1.4%) Module#clock_gettime: 144,992 ( 1.3%) Top-20 not annotated C methods (61.4% of total 11,202,087): Kernel#is_a?: 1,212,660 (10.8%) String#<<: 851,954 ( 7.6%) Hash#[]=: 743,120 ( 6.6%) Regexp#match?: 399,894 ( 3.6%) String#empty?: 361,013 ( 3.2%) Hash#key?: 349,147 ( 3.1%) String#start_with?: 334,961 ( 3.0%) Kernel#respond_to?: 316,528 ( 2.8%) ObjectSpace::WeakKeyMap#[]: 238,978 ( 2.1%) TrueClass#===: 235,771 ( 2.1%) FalseClass#===: 231,144 ( 2.1%) Array#include?: 211,385 ( 1.9%) Hash#fetch: 204,702 ( 1.8%) Kernel#block_given?: 191,666 ( 1.7%) Kernel#dup: 179,348 ( 1.6%) BasicObject#!=: 176,181 ( 1.6%) Class#new: 168,079 ( 1.5%) Kernel#kind_of?: 165,634 ( 1.5%) String#==: 163,667 ( 1.5%) Module#clock_gettime: 144,992 ( 1.3%) Top-2 not optimized method types for send (100.0% of total 72,318): cfunc: 48,055 (66.4%) iseq: 24,263 (33.6%) Top-6 not optimized method types for send_without_block (100.0% of total 4,523,682): iseq: 2,271,936 (50.2%) bmethod: 985,636 (21.8%) optimized: 949,703 (21.0%) alias: 310,747 ( 6.9%) null: 5,106 ( 0.1%) cfunc: 554 ( 0.0%) Top-13 not optimized instructions (100.0% of total 4,293,171): invokesuper: 2,373,404 (55.3%) invokeblock: 811,926 (18.9%) sendforward: 505,452 (11.8%) opt_eq: 451,754 (10.5%) opt_plus: 74,404 ( 1.7%) opt_minus: 36,228 ( 0.8%) opt_send_without_block: 21,792 ( 0.5%) opt_neq: 7,231 ( 0.2%) opt_mult: 6,752 ( 0.2%) opt_or: 3,753 ( 0.1%) opt_lt: 348 ( 0.0%) opt_ge: 91 ( 0.0%) opt_gt: 36 ( 0.0%) Top-9 send fallback reasons (100.0% of total 25,530,724): send_without_block_polymorphic: 9,722,491 (38.1%) send_no_profiles: 5,894,788 (23.1%) send_without_block_not_optimized_method_type: 4,523,682 (17.7%) not_optimized_instruction: 4,293,171 (16.8%) send_without_block_no_profiles: 998,746 ( 3.9%) send_not_optimized_method_type: 72,318 ( 0.3%) send_without_block_cfunc_array_variadic: 15,134 ( 0.1%) obj_to_string_not_string: 9,765 ( 0.0%) send_without_block_direct_too_many_args: 629 ( 0.0%) Top-9 unhandled YARV insns (100.0% of total 690,950): expandarray: 328,490 (47.5%) checkkeyword: 190,694 (27.6%) getclassvariable: 59,901 ( 8.7%) invokesuperforward: 49,503 ( 7.2%) getblockparam: 49,119 ( 7.1%) opt_duparray_send: 11,978 ( 1.7%) getconstant: 952 ( 0.1%) checkmatch: 290 ( 0.0%) once: 23 ( 0.0%) Top-3 compile error reasons (100.0% of total 3,718,636): register_spill_on_alloc: 3,418,255 (91.9%) register_spill_on_ccall: 182,018 ( 4.9%) exception_handler: 118,363 ( 3.2%) Top-14 side exit reasons (100.0% of total 10,860,385): compile_error: 3,718,636 (34.2%) guard_type_failure: 2,638,926 (24.3%) guard_shape_failure: 1,917,209 (17.7%) unhandled_yarv_insn: 690,950 ( 6.4%) block_param_proxy_not_iseq_or_ifunc: 535,789 ( 4.9%) unhandled_kwarg: 455,347 ( 4.2%) patchpoint: 370,476 ( 3.4%) unknown_newarray_send: 314,786 ( 2.9%) unhandled_splat: 122,071 ( 1.1%) unhandled_hir_insn: 76,397 ( 0.7%) block_param_proxy_modified: 19,193 ( 0.2%) obj_to_string_fallback: 566 ( 0.0%) guard_type_not_failure: 22 ( 0.0%) interrupt: 17 ( 0.0%) send_count: 62,244,604 dynamic_send_count: 25,530,724 (41.0%) optimized_send_count: 36,713,880 (59.0%) iseq_optimized_send_count: 18,587,512 (29.9%) inline_cfunc_optimized_send_count: 7,086,414 (11.4%) non_variadic_cfunc_optimized_send_count: 8,375,754 (13.5%) variadic_cfunc_optimized_send_count: 2,664,200 ( 4.3%) dynamic_getivar_count: 7,365,995 dynamic_setivar_count: 7,245,005 compiled_iseq_count: 4,796 failed_iseq_count: 447 compile_time: 814ms profile_time: 9ms gc_time: 9ms invalidation_time: 72ms vm_write_pc_count: 64,156,223 vm_write_sp_count: 62,812,449 vm_write_locals_count: 62,812,449 vm_write_stack_count: 62,812,449 vm_write_to_parent_iseq_local_count: 292,458 vm_read_from_parent_iseq_local_count: 6,599,701 code_region_bytes: 22,953,984 side_exit_count: 10,860,385 total_insn_count: 517,606,340 vm_insn_count: 162,979,530 zjit_insn_count: 354,626,810 ratio_in_zjit: 68.5% ``` </details> lobsters after: <details> ``` ***ZJIT: Printing ZJIT statistics on exit*** Top-20 not inlined C methods (59.9% of total 11,291,815): Kernel#is_a?: 1,046,269 ( 9.3%) String#<<: 851,954 ( 7.5%) Hash#[]=: 743,274 ( 6.6%) Regexp#match?: 399,894 ( 3.5%) String#empty?: 353,775 ( 3.1%) Hash#key?: 349,147 ( 3.1%) String#start_with?: 334,961 ( 3.0%) Kernel#respond_to?: 316,502 ( 2.8%) ObjectSpace::WeakKeyMap#[]: 238,978 ( 2.1%) TrueClass#===: 235,771 ( 2.1%) FalseClass#===: 231,144 ( 2.0%) String#sub!: 219,579 ( 1.9%) Array#include?: 211,385 ( 1.9%) Hash#fetch: 204,702 ( 1.8%) Kernel#block_given?: 181,797 ( 1.6%) Kernel#dup: 179,341 ( 1.6%) BasicObject#!=: 175,997 ( 1.6%) Class#new: 168,079 ( 1.5%) Kernel#kind_of?: 165,600 ( 1.5%) String#==: 157,742 ( 1.4%) Top-20 not annotated C methods (60.9% of total 11,466,928): Kernel#is_a?: 1,239,923 (10.8%) String#<<: 851,954 ( 7.4%) Hash#[]=: 743,453 ( 6.5%) Regexp#match?: 399,894 ( 3.5%) String#empty?: 361,013 ( 3.1%) Hash#key?: 349,147 ( 3.0%) String#start_with?: 334,961 ( 2.9%) Kernel#respond_to?: 316,502 ( 2.8%) ObjectSpace::WeakKeyMap#[]: 238,978 ( 2.1%) TrueClass#===: 235,771 ( 2.1%) FalseClass#===: 231,144 ( 2.0%) String#sub!: 219,579 ( 1.9%) Array#include?: 211,385 ( 1.8%) Hash#fetch: 204,702 ( 1.8%) Kernel#block_given?: 191,666 ( 1.7%) Kernel#dup: 179,348 ( 1.6%) BasicObject#!=: 176,181 ( 1.5%) Class#new: 168,079 ( 1.5%) Kernel#kind_of?: 165,634 ( 1.4%) String#==: 163,674 ( 1.4%) Top-2 not optimized method types for send (100.0% of total 72,318): cfunc: 48,055 (66.4%) iseq: 24,263 (33.6%) Top-6 not optimized method types for send_without_block (100.0% of total 4,524,016): iseq: 2,272,269 (50.2%) bmethod: 985,636 (21.8%) optimized: 949,704 (21.0%) alias: 310,747 ( 6.9%) null: 5,106 ( 0.1%) cfunc: 554 ( 0.0%) Top-13 not optimized instructions (100.0% of total 4,294,241): invokesuper: 2,375,446 (55.3%) invokeblock: 810,955 (18.9%) sendforward: 505,451 (11.8%) opt_eq: 451,754 (10.5%) opt_plus: 74,404 ( 1.7%) opt_minus: 36,228 ( 0.8%) opt_send_without_block: 21,792 ( 0.5%) opt_neq: 7,231 ( 0.2%) opt_mult: 6,752 ( 0.2%) opt_or: 3,753 ( 0.1%) opt_lt: 348 ( 0.0%) opt_ge: 91 ( 0.0%) opt_gt: 36 ( 0.0%) Top-9 send fallback reasons (100.0% of total 25,534,542): send_without_block_polymorphic: 9,723,469 (38.1%) send_no_profiles: 5,896,023 (23.1%) send_without_block_not_optimized_method_type: 4,524,016 (17.7%) not_optimized_instruction: 4,294,241 (16.8%) send_without_block_no_profiles: 998,947 ( 3.9%) send_not_optimized_method_type: 72,318 ( 0.3%) send_without_block_cfunc_array_variadic: 15,134 ( 0.1%) obj_to_string_not_string: 9,765 ( 0.0%) send_without_block_direct_too_many_args: 629 ( 0.0%) Top-8 unhandled YARV insns (100.0% of total 362,460): checkkeyword: 190,694 (52.6%) getclassvariable: 59,901 (16.5%) invokesuperforward: 49,503 (13.7%) getblockparam: 49,119 (13.6%) opt_duparray_send: 11,978 ( 3.3%) getconstant: 952 ( 0.3%) checkmatch: 290 ( 0.1%) once: 23 ( 0.0%) Top-3 compile error reasons (100.0% of total 3,798,744): register_spill_on_alloc: 3,495,669 (92.0%) register_spill_on_ccall: 184,712 ( 4.9%) exception_handler: 118,363 ( 3.1%) Top-15 side exit reasons (100.0% of total 10,637,319): compile_error: 3,798,744 (35.7%) guard_type_failure: 2,655,504 (25.0%) guard_shape_failure: 1,917,217 (18.0%) block_param_proxy_not_iseq_or_ifunc: 535,789 ( 5.0%) unhandled_kwarg: 455,492 ( 4.3%) patchpoint: 370,478 ( 3.5%) unhandled_yarv_insn: 362,460 ( 3.4%) unknown_newarray_send: 314,786 ( 3.0%) unhandled_splat: 122,071 ( 1.1%) unhandled_hir_insn: 83,066 ( 0.8%) block_param_proxy_modified: 19,193 ( 0.2%) guard_int_equals_failure: 1,914 ( 0.0%) obj_to_string_fallback: 566 ( 0.0%) guard_type_not_failure: 22 ( 0.0%) interrupt: 17 ( 0.0%) send_count: 62,495,067 dynamic_send_count: 25,534,542 (40.9%) optimized_send_count: 36,960,525 (59.1%) iseq_optimized_send_count: 18,582,072 (29.7%) inline_cfunc_optimized_send_count: 7,086,638 (11.3%) non_variadic_cfunc_optimized_send_count: 8,392,657 (13.4%) variadic_cfunc_optimized_send_count: 2,899,158 ( 4.6%) dynamic_getivar_count: 7,365,994 dynamic_setivar_count: 7,248,500 compiled_iseq_count: 4,780 failed_iseq_count: 463 compile_time: 816ms profile_time: 9ms gc_time: 11ms invalidation_time: 70ms vm_write_pc_count: 64,363,541 vm_write_sp_count: 63,022,221 vm_write_locals_count: 63,022,221 vm_write_stack_count: 63,022,221 vm_write_to_parent_iseq_local_count: 292,458 vm_read_from_parent_iseq_local_count: 6,850,977 code_region_bytes: 23,019,520 side_exit_count: 10,637,319 total_insn_count: 517,303,190 vm_insn_count: 160,562,103 zjit_insn_count: 356,741,087 ratio_in_zjit: 69.0% ``` </details> railsbench before: <details> ``` ***ZJIT: Printing ZJIT statistics on exit*** Top-20 not inlined C methods (66.1% of total 25,524,934): Hash#[]=: 1,700,237 ( 6.7%) String#getbyte: 1,572,123 ( 6.2%) String#<<: 1,494,022 ( 5.9%) Kernel#is_a?: 1,429,930 ( 5.6%) String#empty?: 1,370,323 ( 5.4%) Regexp#match?: 1,235,067 ( 4.8%) Kernel#respond_to?: 1,198,251 ( 4.7%) Hash#key?: 1,087,406 ( 4.3%) String#setbyte: 810,022 ( 3.2%) Integer#^: 766,624 ( 3.0%) Kernel#block_given?: 603,613 ( 2.4%) String#==: 590,409 ( 2.3%) Class#new: 506,216 ( 2.0%) Hash#delete: 455,288 ( 1.8%) BasicObject#!=: 428,771 ( 1.7%) Hash#fetch: 408,621 ( 1.6%) String#ascii_only?: 373,915 ( 1.5%) ObjectSpace::WeakKeyMap#[]: 287,957 ( 1.1%) NilClass#===: 277,244 ( 1.1%) Kernel#Array: 269,590 ( 1.1%) Top-20 not annotated C methods (66.8% of total 25,392,654): Hash#[]=: 1,700,416 ( 6.7%) String#getbyte: 1,572,123 ( 6.2%) Kernel#is_a?: 1,515,672 ( 6.0%) String#<<: 1,494,022 ( 5.9%) String#empty?: 1,370,478 ( 5.4%) Regexp#match?: 1,235,067 ( 4.9%) Kernel#respond_to?: 1,198,251 ( 4.7%) Hash#key?: 1,087,406 ( 4.3%) String#setbyte: 810,022 ( 3.2%) Integer#^: 766,624 ( 3.0%) Kernel#block_given?: 603,613 ( 2.4%) String#==: 601,115 ( 2.4%) Class#new: 506,216 ( 2.0%) Hash#delete: 455,288 ( 1.8%) BasicObject#!=: 428,876 ( 1.7%) Hash#fetch: 408,621 ( 1.6%) String#ascii_only?: 373,915 ( 1.5%) ObjectSpace::WeakKeyMap#[]: 287,957 ( 1.1%) NilClass#===: 277,244 ( 1.1%) Kernel#Array: 269,590 ( 1.1%) Top-2 not optimized method types for send (100.0% of total 186,159): iseq: 112,747 (60.6%) cfunc: 73,412 (39.4%) Top-6 not optimized method types for send_without_block (100.0% of total 8,142,248): iseq: 3,464,671 (42.6%) optimized: 2,632,884 (32.3%) bmethod: 1,290,701 (15.9%) alias: 706,020 ( 8.7%) null: 47,942 ( 0.6%) cfunc: 30 ( 0.0%) Top-11 not optimized instructions (100.0% of total 8,394,873): invokesuper: 5,602,274 (66.7%) invokeblock: 1,764,936 (21.0%) sendforward: 551,832 ( 6.6%) opt_eq: 441,959 ( 5.3%) opt_plus: 31,635 ( 0.4%) opt_send_without_block: 1,163 ( 0.0%) opt_lt: 372 ( 0.0%) opt_mult: 251 ( 0.0%) opt_ge: 193 ( 0.0%) opt_neq: 149 ( 0.0%) opt_or: 109 ( 0.0%) Top-8 send fallback reasons (100.0% of total 40,748,753): send_without_block_polymorphic: 12,933,923 (31.7%) send_no_profiles: 9,033,636 (22.2%) not_optimized_instruction: 8,394,873 (20.6%) send_without_block_not_optimized_method_type: 8,142,248 (20.0%) send_without_block_no_profiles: 1,839,228 ( 4.5%) send_without_block_cfunc_array_variadic: 215,046 ( 0.5%) send_not_optimized_method_type: 186,159 ( 0.5%) obj_to_string_not_string: 3,640 ( 0.0%) Top-9 unhandled YARV insns (100.0% of total 1,604,456): getclassvariable: 458,136 (28.6%) getblockparam: 455,921 (28.4%) checkkeyword: 265,425 (16.5%) invokesuperforward: 239,383 (14.9%) expandarray: 137,305 ( 8.6%) getconstant: 48,100 ( 3.0%) checkmatch: 149 ( 0.0%) once: 23 ( 0.0%) opt_duparray_send: 14 ( 0.0%) Top-3 compile error reasons (100.0% of total 5,570,130): register_spill_on_alloc: 4,994,130 (89.7%) exception_handler: 356,784 ( 6.4%) register_spill_on_ccall: 219,216 ( 3.9%) Top-13 side exit reasons (100.0% of total 12,412,181): compile_error: 5,570,130 (44.9%) unhandled_yarv_insn: 1,604,456 (12.9%) guard_shape_failure: 1,462,872 (11.8%) guard_type_failure: 845,891 ( 6.8%) block_param_proxy_not_iseq_or_ifunc: 765,968 ( 6.2%) unhandled_kwarg: 658,341 ( 5.3%) patchpoint: 504,437 ( 4.1%) unhandled_splat: 446,990 ( 3.6%) unknown_newarray_send: 332,740 ( 2.7%) unhandled_hir_insn: 160,205 ( 1.3%) block_param_proxy_modified: 59,589 ( 0.5%) obj_to_string_fallback: 553 ( 0.0%) interrupt: 9 ( 0.0%) send_count: 119,067,587 dynamic_send_count: 40,748,753 (34.2%) optimized_send_count: 78,318,834 (65.8%) iseq_optimized_send_count: 39,936,542 (33.5%) inline_cfunc_optimized_send_count: 12,857,358 (10.8%) non_variadic_cfunc_optimized_send_count: 19,722,584 (16.6%) variadic_cfunc_optimized_send_count: 5,802,350 ( 4.9%) dynamic_getivar_count: 10,980,323 dynamic_setivar_count: 12,962,726 compiled_iseq_count: 2,531 failed_iseq_count: 245 compile_time: 414ms profile_time: 21ms gc_time: 33ms invalidation_time: 5ms vm_write_pc_count: 129,093,714 vm_write_sp_count: 126,023,084 vm_write_locals_count: 126,023,084 vm_write_stack_count: 126,023,084 vm_write_to_parent_iseq_local_count: 385,461 vm_read_from_parent_iseq_local_count: 11,266,484 code_region_bytes: 12,156,928 side_exit_count: 12,412,181 total_insn_count: 866,780,158 vm_insn_count: 216,821,134 zjit_insn_count: 649,959,024 ratio_in_zjit: 75.0% ``` </details> railsbench after: <details> ``` ***ZJIT: Printing ZJIT statistics on exit*** Top-20 not inlined C methods (66.0% of total 25,597,895): Hash#[]=: 1,724,042 ( 6.7%) String#getbyte: 1,572,123 ( 6.1%) String#<<: 1,494,022 ( 5.8%) Kernel#is_a?: 1,429,946 ( 5.6%) String#empty?: 1,370,323 ( 5.4%) Regexp#match?: 1,235,067 ( 4.8%) Kernel#respond_to?: 1,198,251 ( 4.7%) Hash#key?: 1,087,406 ( 4.2%) String#setbyte: 810,022 ( 3.2%) Integer#^: 766,624 ( 3.0%) Kernel#block_given?: 603,613 ( 2.4%) String#==: 590,699 ( 2.3%) Class#new: 506,216 ( 2.0%) Hash#delete: 455,288 ( 1.8%) BasicObject#!=: 428,771 ( 1.7%) Hash#fetch: 408,621 ( 1.6%) String#ascii_only?: 373,915 ( 1.5%) ObjectSpace::WeakKeyMap#[]: 287,957 ( 1.1%) NilClass#===: 277,244 ( 1.1%) Kernel#Array: 269,590 ( 1.1%) Top-20 not annotated C methods (66.7% of total 25,465,615): Hash#[]=: 1,724,221 ( 6.8%) String#getbyte: 1,572,123 ( 6.2%) Kernel#is_a?: 1,515,688 ( 6.0%) String#<<: 1,494,022 ( 5.9%) String#empty?: 1,370,478 ( 5.4%) Regexp#match?: 1,235,067 ( 4.8%) Kernel#respond_to?: 1,198,251 ( 4.7%) Hash#key?: 1,087,406 ( 4.3%) String#setbyte: 810,022 ( 3.2%) Integer#^: 766,624 ( 3.0%) Kernel#block_given?: 603,613 ( 2.4%) String#==: 601,405 ( 2.4%) Class#new: 506,216 ( 2.0%) Hash#delete: 455,288 ( 1.8%) BasicObject#!=: 428,876 ( 1.7%) Hash#fetch: 408,621 ( 1.6%) String#ascii_only?: 373,915 ( 1.5%) ObjectSpace::WeakKeyMap#[]: 287,957 ( 1.1%) NilClass#===: 277,244 ( 1.1%) Kernel#Array: 269,590 ( 1.1%) Top-2 not optimized method types for send (100.0% of total 186,159): iseq: 112,747 (60.6%) cfunc: 73,412 (39.4%) Top-6 not optimized method types for send_without_block (100.0% of total 8,142,248): iseq: 3,464,671 (42.6%) optimized: 2,632,884 (32.3%) bmethod: 1,290,701 (15.9%) alias: 706,020 ( 8.7%) null: 47,942 ( 0.6%) cfunc: 30 ( 0.0%) Top-11 not optimized instructions (100.0% of total 8,442,456): invokesuper: 5,649,857 (66.9%) invokeblock: 1,764,936 (20.9%) sendforward: 551,832 ( 6.5%) opt_eq: 441,959 ( 5.2%) opt_plus: 31,635 ( 0.4%) opt_send_without_block: 1,163 ( 0.0%) opt_lt: 372 ( 0.0%) opt_mult: 251 ( 0.0%) opt_ge: 193 ( 0.0%) opt_neq: 149 ( 0.0%) opt_or: 109 ( 0.0%) Top-8 send fallback reasons (100.0% of total 40,796,314): send_without_block_polymorphic: 12,933,921 (31.7%) send_no_profiles: 9,033,616 (22.1%) not_optimized_instruction: 8,442,456 (20.7%) send_without_block_not_optimized_method_type: 8,142,248 (20.0%) send_without_block_no_profiles: 1,839,228 ( 4.5%) send_without_block_cfunc_array_variadic: 215,046 ( 0.5%) send_not_optimized_method_type: 186,159 ( 0.5%) obj_to_string_not_string: 3,640 ( 0.0%) Top-8 unhandled YARV insns (100.0% of total 1,467,151): getclassvariable: 458,136 (31.2%) getblockparam: 455,921 (31.1%) checkkeyword: 265,425 (18.1%) invokesuperforward: 239,383 (16.3%) getconstant: 48,100 ( 3.3%) checkmatch: 149 ( 0.0%) once: 23 ( 0.0%) opt_duparray_send: 14 ( 0.0%) Top-3 compile error reasons (100.0% of total 5,825,923): register_spill_on_alloc: 5,225,940 (89.7%) exception_handler: 356,784 ( 6.1%) register_spill_on_ccall: 243,199 ( 4.2%) Top-13 side exit reasons (100.0% of total 12,530,763): compile_error: 5,825,923 (46.5%) unhandled_yarv_insn: 1,467,151 (11.7%) guard_shape_failure: 1,462,876 (11.7%) guard_type_failure: 845,913 ( 6.8%) block_param_proxy_not_iseq_or_ifunc: 765,968 ( 6.1%) unhandled_kwarg: 658,341 ( 5.3%) patchpoint: 504,437 ( 4.0%) unhandled_splat: 446,990 ( 3.6%) unknown_newarray_send: 332,740 ( 2.7%) unhandled_hir_insn: 160,273 ( 1.3%) block_param_proxy_modified: 59,589 ( 0.5%) obj_to_string_fallback: 553 ( 0.0%) interrupt: 9 ( 0.0%) send_count: 119,163,569 dynamic_send_count: 40,796,314 (34.2%) optimized_send_count: 78,367,255 (65.8%) iseq_optimized_send_count: 39,911,967 (33.5%) inline_cfunc_optimized_send_count: 12,857,393 (10.8%) non_variadic_cfunc_optimized_send_count: 19,770,401 (16.6%) variadic_cfunc_optimized_send_count: 5,827,494 ( 4.9%) dynamic_getivar_count: 10,980,323 dynamic_setivar_count: 12,986,381 compiled_iseq_count: 2,523 failed_iseq_count: 252 compile_time: 420ms profile_time: 21ms gc_time: 30ms invalidation_time: 4ms vm_write_pc_count: 128,973,665 vm_write_sp_count: 125,926,968 vm_write_locals_count: 125,926,968 vm_write_stack_count: 125,926,968 vm_write_to_parent_iseq_local_count: 385,752 vm_read_from_parent_iseq_local_count: 11,267,766 code_region_bytes: 12,189,696 side_exit_count: 12,530,763 total_insn_count: 866,667,490 vm_insn_count: 217,813,201 zjit_insn_count: 648,854,289 ratio_in_zjit: 74.9% ``` </details>
2025-10-14YJIT: Use `mem::take` over `drain(..).collect()`Alan Wu
2025-10-14[DOC] Fix minor typos in YJIT comments (#14829)Vincent Lin
[DOC] Fix typos in YJIT core
2025-10-12YJIT: ZJIT: Fix rustdoc dead linksAlan Wu
2025-10-12YJIT: Fix unused warning from `cargo test`Alan Wu
2025-10-02YJIT: Prevent making a branch from a dead block to a live blockAlan Wu
I'm seeing some memory corruption in the wild on blocks in `IseqPayload::dead_blocks`. While I unfortunately can't recreate the issue, (For all I know, it could be some external code corrupting YJIT's memory.) establishing a link between dead blocks and live blocks seems fishy enough that we ought to prevent it. When it did happen, it might've had bad interacts with Code GC and the optimization to immediately free empty blocks.
2025-09-30ZJIT: Add --zjit-trace-exits (#14640)Aiden Fox Ivey
Add side exit tracing functionality for ZJIT
2025-09-29YJIT: respect the code in master branchTakashi Kokubun
* Originally, k0kubun added a change to respect namespace in gen_block_given https://github.com/ruby/ruby/pull/13454/commits/d129669b1729b9570da7958394ea594031e79f59 * Just after the change, XrXr proposes a change on master and he says it can fix the problem around block_given?, and it has been merged into the master https://github.com/ruby/ruby/pull/14208 * tagomoris respect the commit on master then will see if it works
2025-09-29Add and fix dependenciesSatoshi Tagomori
2025-09-22YJIT: Pass iseq pointer to get/set classvariable functions (#14625)Stan Lo
* YJIT: Pass iseq pointer to get/set classvariable functions Since we already have the iseq pointer, we can actually save one memory read by passing it directly. We need to wrap the iseq in a VALUE so it can be marked correctly by GC. * YJIT: Fix missing GC marking when passing iseq to rb_vm_setinstancevariable Without wrapping the iseq in a `Operand::Value`, the iseq would not be marked correctly by GC and when compacting the heap, the iseq would be lost and cause a crash.
2025-09-12ZJIT: Share more code with YJIT in jit.c (#14520)Takashi Kokubun
* ZJIT: Share more code with YJIT in jit.c * Fix ZJIT references to JIT
2025-09-11ZJIT: Add support for stats_allocatorAiden Fox Ivey
* Using the shared jit crate, support for a single global_allocator can function * Solves --zjit-mem-size
2025-09-11ZJIT: Move jit.rs to ruby.rs and create a shared crate `jit`Aiden Fox Ivey
* ruby.rs should hold the main entrypoint to YJIT and ZJIT * The crate jit will hold code shared between them