summaryrefslogtreecommitdiff
path: root/yjit/src/codegen.rs
AgeCommit message (Collapse)Author
17 hoursYJIT: Properly preserve register mapping in cpush_all() and cpop_all()Alan Wu
Previously, cpop_all() did not in fact restore the register mapping state since it was effectively doing a no-op `self.ctx.set_reg_mapping(self.ctx.get_reg_mapping())`. This desync in bookkeeping led to issues with the --yjit-dump-insns option because print_str() used to use cpush_all() and cpop_all().
6 daysYJIT: Add frozen guard for struct aset (#15835)Max Bernstein
We used to just skip this check (oops), but we should not allow modifying frozen objects.
6 daysYJIT: gen_struct_aset check for frozen statusJean Boussier
2025-12-18JIT: Move EC offsets to jit_bindgen_constantsJohn Hawthorn
Co-authored-by: Alan Wu <alanwu@ruby-lang.org>
2025-12-18Co-authored-by: Luke Gruber <luke.gru@gmail.com>John Hawthorn
Co-authored-by: Alan Wu <alanwu@ruby-lang.org> YJIT: Support calling bmethods in Ractors Co-authored-by: Luke Gruber <luke.gru@gmail.com> Suggestion from Alan
2025-12-18YJIT: Support calling bmethods in RactorsJohn Hawthorn
Co-authored-by: Luke Gruber <luke.gru@gmail.com>
2025-12-15YJIT: Bail out if proc would be stored above stack topRandy Stauner
Fixes [Bug #21266].
2025-12-12YJIT: Fix panic from overly loose filtering in identity method inliningAlan Wu
Credits to @rwstauner for noticing this issue in GH-15533.
2025-12-12YJIT: Add missing local variable type update for fallback setlocal blocksAlan Wu
Previously, the chain_depth>0 version of setlocal blocks did not update the type of the local variable in the context. This can leave the context with stale type information and trigger panics like in [Bug #21772] or lead to miscompilation. To trigger the issue, YJIT needs to see the same ISEQ before and after environment escape and have tracked type info before the escape. To trigger in ISEQs that do not send with a block, it probably requires Kernel#binding or the use of include/ruby/debug.h APIs.
2025-12-03YJIT: Pass class and shape ID directly instead of objectMax Bernstein
2025-12-01ZJIT: Specialize String#<< with FixnumMax Bernstein
Append a codepoint.
2025-11-26YJIT: Abort expandarray optimization if method_missing is definedRandy Stauner
Fixes: [Bug #21707] [AW: rewrote comments] Co-authored-by: Alan Wu <alanwu@ruby-lang.org>
2025-11-18Extract `KW_SPECIFIED_BITS_MAX` for JITs (GH-15039)Jacob
Rename to `VM_KW_SPECIFIED_BITS_MAX` now that it's in `vm_core.h`.
2025-11-18YJIT: omit single ractor mode assumption for `proc#call` (#15092)Luke Gruber
The comptime receiver, which is a proc, is either shareable or from this ractor so we don't need to assume single-ractor mode. We should never get the "defined with an un-shareable Proc in a different ractor" error.
2025-11-14YJIT: Fix stack handling in rb_str_dupJohn Hawthorn
Previously because we did a stack_push before ccall, in some cases we could end up pushing an uninitialized value to the VM stack when spilling regs as part of the ccall. Co-authored-by: Luke Gruber <luke.gru@gmail.com>
2025-10-22ZJIT: Fetch Primitive.attr!(leaf) for InvokeBuiltinMax Bernstein
Fix https://github.com/Shopify/ruby/issues/670
2025-10-22YJIT: Buffer writes to the perf mapAlan Wu
2025-10-21YJIT: ZJIT: Extract common bindings to jit.c and remove unnamed enums.Alan Wu
The type name bindgen picks for anonymous enums creates desync issues on the bindgen CI checks.
2025-10-20ZJIT: Implement expandarray (#14847)Max Bernstein
Only support the simple case: no splat or rest. lobsters before: <details> ``` ***ZJIT: Printing ZJIT statistics on exit*** Top-20 not inlined C methods (60.5% of total 11,039,954): Kernel#is_a?: 1,030,769 ( 9.3%) String#<<: 851,954 ( 7.7%) Hash#[]=: 742,941 ( 6.7%) Regexp#match?: 399,894 ( 3.6%) String#empty?: 353,775 ( 3.2%) Hash#key?: 349,147 ( 3.2%) String#start_with?: 334,961 ( 3.0%) Kernel#respond_to?: 316,528 ( 2.9%) ObjectSpace::WeakKeyMap#[]: 238,978 ( 2.2%) TrueClass#===: 235,771 ( 2.1%) FalseClass#===: 231,144 ( 2.1%) Array#include?: 211,385 ( 1.9%) Hash#fetch: 204,702 ( 1.9%) Kernel#block_given?: 181,797 ( 1.6%) Kernel#dup: 179,341 ( 1.6%) BasicObject#!=: 175,997 ( 1.6%) Class#new: 168,079 ( 1.5%) Kernel#kind_of?: 165,600 ( 1.5%) String#==: 157,735 ( 1.4%) Module#clock_gettime: 144,992 ( 1.3%) Top-20 not annotated C methods (61.4% of total 11,202,087): Kernel#is_a?: 1,212,660 (10.8%) String#<<: 851,954 ( 7.6%) Hash#[]=: 743,120 ( 6.6%) Regexp#match?: 399,894 ( 3.6%) String#empty?: 361,013 ( 3.2%) Hash#key?: 349,147 ( 3.1%) String#start_with?: 334,961 ( 3.0%) Kernel#respond_to?: 316,528 ( 2.8%) ObjectSpace::WeakKeyMap#[]: 238,978 ( 2.1%) TrueClass#===: 235,771 ( 2.1%) FalseClass#===: 231,144 ( 2.1%) Array#include?: 211,385 ( 1.9%) Hash#fetch: 204,702 ( 1.8%) Kernel#block_given?: 191,666 ( 1.7%) Kernel#dup: 179,348 ( 1.6%) BasicObject#!=: 176,181 ( 1.6%) Class#new: 168,079 ( 1.5%) Kernel#kind_of?: 165,634 ( 1.5%) String#==: 163,667 ( 1.5%) Module#clock_gettime: 144,992 ( 1.3%) Top-2 not optimized method types for send (100.0% of total 72,318): cfunc: 48,055 (66.4%) iseq: 24,263 (33.6%) Top-6 not optimized method types for send_without_block (100.0% of total 4,523,682): iseq: 2,271,936 (50.2%) bmethod: 985,636 (21.8%) optimized: 949,703 (21.0%) alias: 310,747 ( 6.9%) null: 5,106 ( 0.1%) cfunc: 554 ( 0.0%) Top-13 not optimized instructions (100.0% of total 4,293,171): invokesuper: 2,373,404 (55.3%) invokeblock: 811,926 (18.9%) sendforward: 505,452 (11.8%) opt_eq: 451,754 (10.5%) opt_plus: 74,404 ( 1.7%) opt_minus: 36,228 ( 0.8%) opt_send_without_block: 21,792 ( 0.5%) opt_neq: 7,231 ( 0.2%) opt_mult: 6,752 ( 0.2%) opt_or: 3,753 ( 0.1%) opt_lt: 348 ( 0.0%) opt_ge: 91 ( 0.0%) opt_gt: 36 ( 0.0%) Top-9 send fallback reasons (100.0% of total 25,530,724): send_without_block_polymorphic: 9,722,491 (38.1%) send_no_profiles: 5,894,788 (23.1%) send_without_block_not_optimized_method_type: 4,523,682 (17.7%) not_optimized_instruction: 4,293,171 (16.8%) send_without_block_no_profiles: 998,746 ( 3.9%) send_not_optimized_method_type: 72,318 ( 0.3%) send_without_block_cfunc_array_variadic: 15,134 ( 0.1%) obj_to_string_not_string: 9,765 ( 0.0%) send_without_block_direct_too_many_args: 629 ( 0.0%) Top-9 unhandled YARV insns (100.0% of total 690,950): expandarray: 328,490 (47.5%) checkkeyword: 190,694 (27.6%) getclassvariable: 59,901 ( 8.7%) invokesuperforward: 49,503 ( 7.2%) getblockparam: 49,119 ( 7.1%) opt_duparray_send: 11,978 ( 1.7%) getconstant: 952 ( 0.1%) checkmatch: 290 ( 0.0%) once: 23 ( 0.0%) Top-3 compile error reasons (100.0% of total 3,718,636): register_spill_on_alloc: 3,418,255 (91.9%) register_spill_on_ccall: 182,018 ( 4.9%) exception_handler: 118,363 ( 3.2%) Top-14 side exit reasons (100.0% of total 10,860,385): compile_error: 3,718,636 (34.2%) guard_type_failure: 2,638,926 (24.3%) guard_shape_failure: 1,917,209 (17.7%) unhandled_yarv_insn: 690,950 ( 6.4%) block_param_proxy_not_iseq_or_ifunc: 535,789 ( 4.9%) unhandled_kwarg: 455,347 ( 4.2%) patchpoint: 370,476 ( 3.4%) unknown_newarray_send: 314,786 ( 2.9%) unhandled_splat: 122,071 ( 1.1%) unhandled_hir_insn: 76,397 ( 0.7%) block_param_proxy_modified: 19,193 ( 0.2%) obj_to_string_fallback: 566 ( 0.0%) guard_type_not_failure: 22 ( 0.0%) interrupt: 17 ( 0.0%) send_count: 62,244,604 dynamic_send_count: 25,530,724 (41.0%) optimized_send_count: 36,713,880 (59.0%) iseq_optimized_send_count: 18,587,512 (29.9%) inline_cfunc_optimized_send_count: 7,086,414 (11.4%) non_variadic_cfunc_optimized_send_count: 8,375,754 (13.5%) variadic_cfunc_optimized_send_count: 2,664,200 ( 4.3%) dynamic_getivar_count: 7,365,995 dynamic_setivar_count: 7,245,005 compiled_iseq_count: 4,796 failed_iseq_count: 447 compile_time: 814ms profile_time: 9ms gc_time: 9ms invalidation_time: 72ms vm_write_pc_count: 64,156,223 vm_write_sp_count: 62,812,449 vm_write_locals_count: 62,812,449 vm_write_stack_count: 62,812,449 vm_write_to_parent_iseq_local_count: 292,458 vm_read_from_parent_iseq_local_count: 6,599,701 code_region_bytes: 22,953,984 side_exit_count: 10,860,385 total_insn_count: 517,606,340 vm_insn_count: 162,979,530 zjit_insn_count: 354,626,810 ratio_in_zjit: 68.5% ``` </details> lobsters after: <details> ``` ***ZJIT: Printing ZJIT statistics on exit*** Top-20 not inlined C methods (59.9% of total 11,291,815): Kernel#is_a?: 1,046,269 ( 9.3%) String#<<: 851,954 ( 7.5%) Hash#[]=: 743,274 ( 6.6%) Regexp#match?: 399,894 ( 3.5%) String#empty?: 353,775 ( 3.1%) Hash#key?: 349,147 ( 3.1%) String#start_with?: 334,961 ( 3.0%) Kernel#respond_to?: 316,502 ( 2.8%) ObjectSpace::WeakKeyMap#[]: 238,978 ( 2.1%) TrueClass#===: 235,771 ( 2.1%) FalseClass#===: 231,144 ( 2.0%) String#sub!: 219,579 ( 1.9%) Array#include?: 211,385 ( 1.9%) Hash#fetch: 204,702 ( 1.8%) Kernel#block_given?: 181,797 ( 1.6%) Kernel#dup: 179,341 ( 1.6%) BasicObject#!=: 175,997 ( 1.6%) Class#new: 168,079 ( 1.5%) Kernel#kind_of?: 165,600 ( 1.5%) String#==: 157,742 ( 1.4%) Top-20 not annotated C methods (60.9% of total 11,466,928): Kernel#is_a?: 1,239,923 (10.8%) String#<<: 851,954 ( 7.4%) Hash#[]=: 743,453 ( 6.5%) Regexp#match?: 399,894 ( 3.5%) String#empty?: 361,013 ( 3.1%) Hash#key?: 349,147 ( 3.0%) String#start_with?: 334,961 ( 2.9%) Kernel#respond_to?: 316,502 ( 2.8%) ObjectSpace::WeakKeyMap#[]: 238,978 ( 2.1%) TrueClass#===: 235,771 ( 2.1%) FalseClass#===: 231,144 ( 2.0%) String#sub!: 219,579 ( 1.9%) Array#include?: 211,385 ( 1.8%) Hash#fetch: 204,702 ( 1.8%) Kernel#block_given?: 191,666 ( 1.7%) Kernel#dup: 179,348 ( 1.6%) BasicObject#!=: 176,181 ( 1.5%) Class#new: 168,079 ( 1.5%) Kernel#kind_of?: 165,634 ( 1.4%) String#==: 163,674 ( 1.4%) Top-2 not optimized method types for send (100.0% of total 72,318): cfunc: 48,055 (66.4%) iseq: 24,263 (33.6%) Top-6 not optimized method types for send_without_block (100.0% of total 4,524,016): iseq: 2,272,269 (50.2%) bmethod: 985,636 (21.8%) optimized: 949,704 (21.0%) alias: 310,747 ( 6.9%) null: 5,106 ( 0.1%) cfunc: 554 ( 0.0%) Top-13 not optimized instructions (100.0% of total 4,294,241): invokesuper: 2,375,446 (55.3%) invokeblock: 810,955 (18.9%) sendforward: 505,451 (11.8%) opt_eq: 451,754 (10.5%) opt_plus: 74,404 ( 1.7%) opt_minus: 36,228 ( 0.8%) opt_send_without_block: 21,792 ( 0.5%) opt_neq: 7,231 ( 0.2%) opt_mult: 6,752 ( 0.2%) opt_or: 3,753 ( 0.1%) opt_lt: 348 ( 0.0%) opt_ge: 91 ( 0.0%) opt_gt: 36 ( 0.0%) Top-9 send fallback reasons (100.0% of total 25,534,542): send_without_block_polymorphic: 9,723,469 (38.1%) send_no_profiles: 5,896,023 (23.1%) send_without_block_not_optimized_method_type: 4,524,016 (17.7%) not_optimized_instruction: 4,294,241 (16.8%) send_without_block_no_profiles: 998,947 ( 3.9%) send_not_optimized_method_type: 72,318 ( 0.3%) send_without_block_cfunc_array_variadic: 15,134 ( 0.1%) obj_to_string_not_string: 9,765 ( 0.0%) send_without_block_direct_too_many_args: 629 ( 0.0%) Top-8 unhandled YARV insns (100.0% of total 362,460): checkkeyword: 190,694 (52.6%) getclassvariable: 59,901 (16.5%) invokesuperforward: 49,503 (13.7%) getblockparam: 49,119 (13.6%) opt_duparray_send: 11,978 ( 3.3%) getconstant: 952 ( 0.3%) checkmatch: 290 ( 0.1%) once: 23 ( 0.0%) Top-3 compile error reasons (100.0% of total 3,798,744): register_spill_on_alloc: 3,495,669 (92.0%) register_spill_on_ccall: 184,712 ( 4.9%) exception_handler: 118,363 ( 3.1%) Top-15 side exit reasons (100.0% of total 10,637,319): compile_error: 3,798,744 (35.7%) guard_type_failure: 2,655,504 (25.0%) guard_shape_failure: 1,917,217 (18.0%) block_param_proxy_not_iseq_or_ifunc: 535,789 ( 5.0%) unhandled_kwarg: 455,492 ( 4.3%) patchpoint: 370,478 ( 3.5%) unhandled_yarv_insn: 362,460 ( 3.4%) unknown_newarray_send: 314,786 ( 3.0%) unhandled_splat: 122,071 ( 1.1%) unhandled_hir_insn: 83,066 ( 0.8%) block_param_proxy_modified: 19,193 ( 0.2%) guard_int_equals_failure: 1,914 ( 0.0%) obj_to_string_fallback: 566 ( 0.0%) guard_type_not_failure: 22 ( 0.0%) interrupt: 17 ( 0.0%) send_count: 62,495,067 dynamic_send_count: 25,534,542 (40.9%) optimized_send_count: 36,960,525 (59.1%) iseq_optimized_send_count: 18,582,072 (29.7%) inline_cfunc_optimized_send_count: 7,086,638 (11.3%) non_variadic_cfunc_optimized_send_count: 8,392,657 (13.4%) variadic_cfunc_optimized_send_count: 2,899,158 ( 4.6%) dynamic_getivar_count: 7,365,994 dynamic_setivar_count: 7,248,500 compiled_iseq_count: 4,780 failed_iseq_count: 463 compile_time: 816ms profile_time: 9ms gc_time: 11ms invalidation_time: 70ms vm_write_pc_count: 64,363,541 vm_write_sp_count: 63,022,221 vm_write_locals_count: 63,022,221 vm_write_stack_count: 63,022,221 vm_write_to_parent_iseq_local_count: 292,458 vm_read_from_parent_iseq_local_count: 6,850,977 code_region_bytes: 23,019,520 side_exit_count: 10,637,319 total_insn_count: 517,303,190 vm_insn_count: 160,562,103 zjit_insn_count: 356,741,087 ratio_in_zjit: 69.0% ``` </details> railsbench before: <details> ``` ***ZJIT: Printing ZJIT statistics on exit*** Top-20 not inlined C methods (66.1% of total 25,524,934): Hash#[]=: 1,700,237 ( 6.7%) String#getbyte: 1,572,123 ( 6.2%) String#<<: 1,494,022 ( 5.9%) Kernel#is_a?: 1,429,930 ( 5.6%) String#empty?: 1,370,323 ( 5.4%) Regexp#match?: 1,235,067 ( 4.8%) Kernel#respond_to?: 1,198,251 ( 4.7%) Hash#key?: 1,087,406 ( 4.3%) String#setbyte: 810,022 ( 3.2%) Integer#^: 766,624 ( 3.0%) Kernel#block_given?: 603,613 ( 2.4%) String#==: 590,409 ( 2.3%) Class#new: 506,216 ( 2.0%) Hash#delete: 455,288 ( 1.8%) BasicObject#!=: 428,771 ( 1.7%) Hash#fetch: 408,621 ( 1.6%) String#ascii_only?: 373,915 ( 1.5%) ObjectSpace::WeakKeyMap#[]: 287,957 ( 1.1%) NilClass#===: 277,244 ( 1.1%) Kernel#Array: 269,590 ( 1.1%) Top-20 not annotated C methods (66.8% of total 25,392,654): Hash#[]=: 1,700,416 ( 6.7%) String#getbyte: 1,572,123 ( 6.2%) Kernel#is_a?: 1,515,672 ( 6.0%) String#<<: 1,494,022 ( 5.9%) String#empty?: 1,370,478 ( 5.4%) Regexp#match?: 1,235,067 ( 4.9%) Kernel#respond_to?: 1,198,251 ( 4.7%) Hash#key?: 1,087,406 ( 4.3%) String#setbyte: 810,022 ( 3.2%) Integer#^: 766,624 ( 3.0%) Kernel#block_given?: 603,613 ( 2.4%) String#==: 601,115 ( 2.4%) Class#new: 506,216 ( 2.0%) Hash#delete: 455,288 ( 1.8%) BasicObject#!=: 428,876 ( 1.7%) Hash#fetch: 408,621 ( 1.6%) String#ascii_only?: 373,915 ( 1.5%) ObjectSpace::WeakKeyMap#[]: 287,957 ( 1.1%) NilClass#===: 277,244 ( 1.1%) Kernel#Array: 269,590 ( 1.1%) Top-2 not optimized method types for send (100.0% of total 186,159): iseq: 112,747 (60.6%) cfunc: 73,412 (39.4%) Top-6 not optimized method types for send_without_block (100.0% of total 8,142,248): iseq: 3,464,671 (42.6%) optimized: 2,632,884 (32.3%) bmethod: 1,290,701 (15.9%) alias: 706,020 ( 8.7%) null: 47,942 ( 0.6%) cfunc: 30 ( 0.0%) Top-11 not optimized instructions (100.0% of total 8,394,873): invokesuper: 5,602,274 (66.7%) invokeblock: 1,764,936 (21.0%) sendforward: 551,832 ( 6.6%) opt_eq: 441,959 ( 5.3%) opt_plus: 31,635 ( 0.4%) opt_send_without_block: 1,163 ( 0.0%) opt_lt: 372 ( 0.0%) opt_mult: 251 ( 0.0%) opt_ge: 193 ( 0.0%) opt_neq: 149 ( 0.0%) opt_or: 109 ( 0.0%) Top-8 send fallback reasons (100.0% of total 40,748,753): send_without_block_polymorphic: 12,933,923 (31.7%) send_no_profiles: 9,033,636 (22.2%) not_optimized_instruction: 8,394,873 (20.6%) send_without_block_not_optimized_method_type: 8,142,248 (20.0%) send_without_block_no_profiles: 1,839,228 ( 4.5%) send_without_block_cfunc_array_variadic: 215,046 ( 0.5%) send_not_optimized_method_type: 186,159 ( 0.5%) obj_to_string_not_string: 3,640 ( 0.0%) Top-9 unhandled YARV insns (100.0% of total 1,604,456): getclassvariable: 458,136 (28.6%) getblockparam: 455,921 (28.4%) checkkeyword: 265,425 (16.5%) invokesuperforward: 239,383 (14.9%) expandarray: 137,305 ( 8.6%) getconstant: 48,100 ( 3.0%) checkmatch: 149 ( 0.0%) once: 23 ( 0.0%) opt_duparray_send: 14 ( 0.0%) Top-3 compile error reasons (100.0% of total 5,570,130): register_spill_on_alloc: 4,994,130 (89.7%) exception_handler: 356,784 ( 6.4%) register_spill_on_ccall: 219,216 ( 3.9%) Top-13 side exit reasons (100.0% of total 12,412,181): compile_error: 5,570,130 (44.9%) unhandled_yarv_insn: 1,604,456 (12.9%) guard_shape_failure: 1,462,872 (11.8%) guard_type_failure: 845,891 ( 6.8%) block_param_proxy_not_iseq_or_ifunc: 765,968 ( 6.2%) unhandled_kwarg: 658,341 ( 5.3%) patchpoint: 504,437 ( 4.1%) unhandled_splat: 446,990 ( 3.6%) unknown_newarray_send: 332,740 ( 2.7%) unhandled_hir_insn: 160,205 ( 1.3%) block_param_proxy_modified: 59,589 ( 0.5%) obj_to_string_fallback: 553 ( 0.0%) interrupt: 9 ( 0.0%) send_count: 119,067,587 dynamic_send_count: 40,748,753 (34.2%) optimized_send_count: 78,318,834 (65.8%) iseq_optimized_send_count: 39,936,542 (33.5%) inline_cfunc_optimized_send_count: 12,857,358 (10.8%) non_variadic_cfunc_optimized_send_count: 19,722,584 (16.6%) variadic_cfunc_optimized_send_count: 5,802,350 ( 4.9%) dynamic_getivar_count: 10,980,323 dynamic_setivar_count: 12,962,726 compiled_iseq_count: 2,531 failed_iseq_count: 245 compile_time: 414ms profile_time: 21ms gc_time: 33ms invalidation_time: 5ms vm_write_pc_count: 129,093,714 vm_write_sp_count: 126,023,084 vm_write_locals_count: 126,023,084 vm_write_stack_count: 126,023,084 vm_write_to_parent_iseq_local_count: 385,461 vm_read_from_parent_iseq_local_count: 11,266,484 code_region_bytes: 12,156,928 side_exit_count: 12,412,181 total_insn_count: 866,780,158 vm_insn_count: 216,821,134 zjit_insn_count: 649,959,024 ratio_in_zjit: 75.0% ``` </details> railsbench after: <details> ``` ***ZJIT: Printing ZJIT statistics on exit*** Top-20 not inlined C methods (66.0% of total 25,597,895): Hash#[]=: 1,724,042 ( 6.7%) String#getbyte: 1,572,123 ( 6.1%) String#<<: 1,494,022 ( 5.8%) Kernel#is_a?: 1,429,946 ( 5.6%) String#empty?: 1,370,323 ( 5.4%) Regexp#match?: 1,235,067 ( 4.8%) Kernel#respond_to?: 1,198,251 ( 4.7%) Hash#key?: 1,087,406 ( 4.2%) String#setbyte: 810,022 ( 3.2%) Integer#^: 766,624 ( 3.0%) Kernel#block_given?: 603,613 ( 2.4%) String#==: 590,699 ( 2.3%) Class#new: 506,216 ( 2.0%) Hash#delete: 455,288 ( 1.8%) BasicObject#!=: 428,771 ( 1.7%) Hash#fetch: 408,621 ( 1.6%) String#ascii_only?: 373,915 ( 1.5%) ObjectSpace::WeakKeyMap#[]: 287,957 ( 1.1%) NilClass#===: 277,244 ( 1.1%) Kernel#Array: 269,590 ( 1.1%) Top-20 not annotated C methods (66.7% of total 25,465,615): Hash#[]=: 1,724,221 ( 6.8%) String#getbyte: 1,572,123 ( 6.2%) Kernel#is_a?: 1,515,688 ( 6.0%) String#<<: 1,494,022 ( 5.9%) String#empty?: 1,370,478 ( 5.4%) Regexp#match?: 1,235,067 ( 4.8%) Kernel#respond_to?: 1,198,251 ( 4.7%) Hash#key?: 1,087,406 ( 4.3%) String#setbyte: 810,022 ( 3.2%) Integer#^: 766,624 ( 3.0%) Kernel#block_given?: 603,613 ( 2.4%) String#==: 601,405 ( 2.4%) Class#new: 506,216 ( 2.0%) Hash#delete: 455,288 ( 1.8%) BasicObject#!=: 428,876 ( 1.7%) Hash#fetch: 408,621 ( 1.6%) String#ascii_only?: 373,915 ( 1.5%) ObjectSpace::WeakKeyMap#[]: 287,957 ( 1.1%) NilClass#===: 277,244 ( 1.1%) Kernel#Array: 269,590 ( 1.1%) Top-2 not optimized method types for send (100.0% of total 186,159): iseq: 112,747 (60.6%) cfunc: 73,412 (39.4%) Top-6 not optimized method types for send_without_block (100.0% of total 8,142,248): iseq: 3,464,671 (42.6%) optimized: 2,632,884 (32.3%) bmethod: 1,290,701 (15.9%) alias: 706,020 ( 8.7%) null: 47,942 ( 0.6%) cfunc: 30 ( 0.0%) Top-11 not optimized instructions (100.0% of total 8,442,456): invokesuper: 5,649,857 (66.9%) invokeblock: 1,764,936 (20.9%) sendforward: 551,832 ( 6.5%) opt_eq: 441,959 ( 5.2%) opt_plus: 31,635 ( 0.4%) opt_send_without_block: 1,163 ( 0.0%) opt_lt: 372 ( 0.0%) opt_mult: 251 ( 0.0%) opt_ge: 193 ( 0.0%) opt_neq: 149 ( 0.0%) opt_or: 109 ( 0.0%) Top-8 send fallback reasons (100.0% of total 40,796,314): send_without_block_polymorphic: 12,933,921 (31.7%) send_no_profiles: 9,033,616 (22.1%) not_optimized_instruction: 8,442,456 (20.7%) send_without_block_not_optimized_method_type: 8,142,248 (20.0%) send_without_block_no_profiles: 1,839,228 ( 4.5%) send_without_block_cfunc_array_variadic: 215,046 ( 0.5%) send_not_optimized_method_type: 186,159 ( 0.5%) obj_to_string_not_string: 3,640 ( 0.0%) Top-8 unhandled YARV insns (100.0% of total 1,467,151): getclassvariable: 458,136 (31.2%) getblockparam: 455,921 (31.1%) checkkeyword: 265,425 (18.1%) invokesuperforward: 239,383 (16.3%) getconstant: 48,100 ( 3.3%) checkmatch: 149 ( 0.0%) once: 23 ( 0.0%) opt_duparray_send: 14 ( 0.0%) Top-3 compile error reasons (100.0% of total 5,825,923): register_spill_on_alloc: 5,225,940 (89.7%) exception_handler: 356,784 ( 6.1%) register_spill_on_ccall: 243,199 ( 4.2%) Top-13 side exit reasons (100.0% of total 12,530,763): compile_error: 5,825,923 (46.5%) unhandled_yarv_insn: 1,467,151 (11.7%) guard_shape_failure: 1,462,876 (11.7%) guard_type_failure: 845,913 ( 6.8%) block_param_proxy_not_iseq_or_ifunc: 765,968 ( 6.1%) unhandled_kwarg: 658,341 ( 5.3%) patchpoint: 504,437 ( 4.0%) unhandled_splat: 446,990 ( 3.6%) unknown_newarray_send: 332,740 ( 2.7%) unhandled_hir_insn: 160,273 ( 1.3%) block_param_proxy_modified: 59,589 ( 0.5%) obj_to_string_fallback: 553 ( 0.0%) interrupt: 9 ( 0.0%) send_count: 119,163,569 dynamic_send_count: 40,796,314 (34.2%) optimized_send_count: 78,367,255 (65.8%) iseq_optimized_send_count: 39,911,967 (33.5%) inline_cfunc_optimized_send_count: 12,857,393 (10.8%) non_variadic_cfunc_optimized_send_count: 19,770,401 (16.6%) variadic_cfunc_optimized_send_count: 5,827,494 ( 4.9%) dynamic_getivar_count: 10,980,323 dynamic_setivar_count: 12,986,381 compiled_iseq_count: 2,523 failed_iseq_count: 252 compile_time: 420ms profile_time: 21ms gc_time: 30ms invalidation_time: 4ms vm_write_pc_count: 128,973,665 vm_write_sp_count: 125,926,968 vm_write_locals_count: 125,926,968 vm_write_stack_count: 125,926,968 vm_write_to_parent_iseq_local_count: 385,752 vm_read_from_parent_iseq_local_count: 11,267,766 code_region_bytes: 12,189,696 side_exit_count: 12,530,763 total_insn_count: 866,667,490 vm_insn_count: 217,813,201 zjit_insn_count: 648,854,289 ratio_in_zjit: 74.9% ``` </details>
2025-09-29YJIT: respect the code in master branchTakashi Kokubun
* Originally, k0kubun added a change to respect namespace in gen_block_given https://github.com/ruby/ruby/pull/13454/commits/d129669b1729b9570da7958394ea594031e79f59 * Just after the change, XrXr proposes a change on master and he says it can fix the problem around block_given?, and it has been merged into the master https://github.com/ruby/ruby/pull/14208 * tagomoris respect the commit on master then will see if it works
2025-09-22YJIT: Pass iseq pointer to get/set classvariable functions (#14625)Stan Lo
* YJIT: Pass iseq pointer to get/set classvariable functions Since we already have the iseq pointer, we can actually save one memory read by passing it directly. We need to wrap the iseq in a VALUE so it can be marked correctly by GC. * YJIT: Fix missing GC marking when passing iseq to rb_vm_setinstancevariable Without wrapping the iseq in a `Operand::Value`, the iseq would not be marked correctly by GC and when compacting the heap, the iseq would be lost and cause a crash.
2025-09-12ZJIT: Share more code with YJIT in jit.c (#14520)Takashi Kokubun
* ZJIT: Share more code with YJIT in jit.c * Fix ZJIT references to JIT
2025-09-10YJIT: Print more disassembly in release buildsAlan Wu
These `#[cfg(feature = "disasm")]` were unnecessary and we can provide the information like ruby source location regardless of the availability of capstone.
2025-09-10YJIT: Remove dead code: `asm_comment!` checks `--yjit-dump-disasm`Alan Wu
2025-09-10YJIT: Stop sharing rb_vm_invokesuper among different instructions (#14492)Takashi Kokubun
2025-09-08YJIT: Add more information to an assert message in jit_guard_known_class ↵Takashi Kokubun
(#14480) * YJIT: Add more information to an assert message in jit_guard_known_class * Say "should be a heap object" instead Co-authored-by: Alan Wu <XrXr@users.noreply.github.com> --------- Co-authored-by: Alan Wu <XrXr@users.noreply.github.com>
2025-08-29YJIT: Stop sharing rb_vm_send among different instructions (#14393)Takashi Kokubun
2025-08-29ZJIT: Specialize monomorphic GetIvar (#14388)Max Bernstein
Specialize monomorphic `GetIvar` into: * `GuardType(HeapObject)` * `GuardShape` * `LoadIvarEmbedded` or `LoadIvarExtended` This requires profiling self for `getinstancevariable` (it's not on the operand stack). This also optimizes `GetIvar`s that happen as a result of inlining `attr_reader` and `attr_accessor`. Also move some (newly) shared JIT helpers into jit.c.
2025-08-29YJIT: rb_ivar_get_at skip ractor checksJean Boussier
Using `assume_single_ractor_mode` we can skip all ractor safety checks if we're in single ractor mode. ``` compare-ruby: ruby 3.5.0dev (2025-08-27T14:58:58Z merge-vm-setivar-d.. 5b749d8e53) +YJIT +PRISM [arm64-darwin24] built-ruby: ruby 3.5.0dev (2025-08-28T21:23:38Z yjit-get-exivar 3cc21b76d4) +YJIT +PRISM [arm64-darwin24] | |compare-ruby|built-ruby| |:--------------------------|-----------:|---------:| |vm_ivar_get_on_obj | 975.981| 975.772| | | 1.00x| -| |vm_ivar_get_on_class | 136.214| 470.912| | | -| 3.46x| |vm_ivar_get_on_generic | 148.315| 299.122| | | -| 2.02x| ```
2025-08-29YJIT: rb_ivar_get_at assume leaf-call when single ractorJean Boussier
The only exception it could raise is if we're in multi ractor mode.
2025-08-29YJIT: getinstancevariable cache indexes for types other than T_OBJECTJean Boussier
While accessing the ivars of other types is too complicated to realistically generate the ASM for it, we can at least provide the ivar index as to not have to lookup the shape tree every time. ``` compare-ruby: ruby 3.5.0dev (2025-08-27T14:58:58Z merge-vm-setivar-d.. 5b749d8e53) +YJIT +PRISM [arm64-darwin24] built-ruby: ruby 3.5.0dev (2025-08-28T17:58:32Z yjit-get-exivar efaa8c9b09) +YJIT +PRISM [arm64-darwin24] | |compare-ruby|built-ruby| |:--------------------------|-----------:|---------:| |vm_ivar_get_on_obj | 930.458| 936.865| | | -| 1.01x| |vm_ivar_get_on_class | 134.471| 431.622| | | -| 3.21x| |vm_ivar_get_on_generic | 146.679| 284.408| | | -| 1.94x| ``` Co-Authored-By: Aaron Patterson <tenderlove@ruby-lang.org>
2025-08-28YJIT simplify gen_get_iver and gen_set_ivarJean Boussier
The `shape_id` now includes 3 bits for the `heap_id`. It is always non-zero for `T_OBJECT` and always zero for all other types. Hence all these allocator checks are no longer necessary.
2025-08-27Replace ROBJECT_EMBED by ROBJECT_HEAPJean Boussier
The embed layout is way more common than the heap one, especially since WVA. I think it makes for more readable code to inverse the flag.
2025-08-26Remove `opt_aref_with` and `opt_aset_with`Aaron Patterson
When these instructions were introduced it was common to read from a hash with mutable string literals. However, these days, I think these instructions are fairly rare. I tested this with the lobsters benchmark, and saw no difference in speed. In order to be sure, I tracked down every use of this instruction in the lobsters benchmark, and there were only 4 places where it was used. Additionally, this patch fixes a case where "chilled strings" should emit a warning but they don't. ```ruby class Foo def self.[](x)= x.gsub!(/hello/, "hi") end Foo["hello world"] ``` Removing these instructions shows this warning: ``` > ./miniruby -vw test.rb ruby 3.5.0dev (2025-08-25T21:36:50Z rm-opt_aref_with dca08e286c) +PRISM [arm64-darwin24] test.rb:2: warning: literal string will be frozen in the future (run with --debug-frozen-string-literal for more information) ``` [Feature #21553]
2025-08-20YJIT: Improve locals names (#14285)Stan Lo
2025-08-18Don't allow looking at the shape ID of immediates (#14266)Max Bernstein
It only makes sense for heap objects.
2025-08-14YJIT: Fix `defined?(yield)` and `block_given?` at top levelAlan Wu
Previously, YJIT returned truthy for the block given query at the top level. That's incorrect because the top level script never receives a block, and `yield` is a syntax error there. Inside methods, the number of hops to get from `iseq` to `iseq->body->local_iseq` is the same as the number of `VM_ENV_PREV_EP(ep)` hops to get to an environment with `VM_ENV_FLAG_LOCAL`. YJIT and the interpreter both rely on this as can be seen in get_lvar_level(). However, this identity does not hold for the top level frame because of vm_set_eval_stack(), which sets up `TOPLEVEL_BINDING`. Since only methods can take a block that `yield` goes to, have ISEQs that are the child of a non-method ISEQ return falsy for the block given query. This fixes the issue for the top level script and is an optimization for non-method contexts such as inside `ISEQ_TYPE_CLASS`.
2025-07-16YJIT: Side-exit on String#dup when it's not leaf (#13921)Takashi Kokubun
* YJIT: Side-exit on String#dup when it's not leaf * Use an enum instead of a macro for bindgen
2025-07-14YJIT: Move RefCell one level downKunshan Wang
This is the second part of making YJIT work with parallel GC. During GC, `rb_yjit_iseq_mark` and `rb_yjit_iseq_update_references` need to resolve offsets in `Block::gc_obj_offsets` into absolute addresses before reading or updating the fields. This needs the base address stored in `VirtualMemory::region_start` which was previously behind a `RefCell`. When multiple GC threads scan multiple iseq simultaneously (which is possible for some GC modules such as MMTk), it will panic because the `RefCell` is already borrowed. We notice that some fields of `VirtualMemory`, such as `region_start`, are never modified once `VirtualMemory` is constructed. We change the type of the field `CodeBlock::mem_block` from `Rc<RefCell<T>>` to `Rc<T>`, and push the `RefCell` into `VirtualMemory`. We extract mutable fields of `VirtualMemory` into a dedicated struct `VirtualMemoryMut`, and store them in a field `VirtualMemory::mutable` which is a `RefCell<VirtualMemoryMut>`. After this change, methods that access immutable fields in `VirtualMemory`, particularly `base_ptr()` which reads `region_start`, will no longer need to borrow any `RefCell`. Methods that access mutable fields will need to borrow `VirtualMemory::mutable`, but the number of borrowing operations becomes strictly fewer than before because borrowing operations previously done in callers (such as `CodeBlock::write_mem`) are moved into methods of `VirtualMemory` (such as `VirtualMemory::write_bytes`).
2025-06-13Add SHAPE_ID_HAS_IVAR_MASK for quick ivar checkJean Boussier
This allow checking if an object has ivars with just a shape_id mask. Notes: Merged: https://github.com/ruby/ruby/pull/13606
2025-06-12Get rid of `rb_shape_lookup`Jean Boussier
Notes: Merged: https://github.com/ruby/ruby/pull/13596
2025-06-05Refactor raw accesses to rb_shape_t.capacityJean Boussier
Notes: Merged: https://github.com/ruby/ruby/pull/13524
2025-06-04Get rid of TOO_COMPLEX shape typeJean Boussier
Instead it's now a `shape_id` flag. This allows to check if an object is complex without having to chase the `rb_shape_t` pointer. Notes: Merged: https://github.com/ruby/ruby/pull/13511
2025-06-03Use all 32bits of `shape_id_t` on all platformsJean Boussier
Followup: https://github.com/ruby/ruby/pull/13341 / [Feature #21353] Even thought `shape_id_t` has been make 32bits, we were still limited to use only the lower 16 bits because they had to fit alongside `attr_index_t` inside a `uintptr_t` in inline caches. By enlarging inline caches we can unlock the full 32bits on all platforms, allowing to use these extra bits for tagging. Notes: Merged: https://github.com/ruby/ruby/pull/13500
2025-05-27Refactor `rb_shape_too_complex_p` to take a `shape_id_t`.Jean Boussier
Notes: Merged: https://github.com/ruby/ruby/pull/13450
2025-05-27Refactor `rb_shape_get_iv_index` to take a `shape_id_t`Jean Boussier
Further reduce exposure of `rb_shape_t`. Notes: Merged: https://github.com/ruby/ruby/pull/13450
2025-05-15YJIT: handle opt_aset_withJean Boussier
``` # frozen_string_ltieral: true hash["literal"] = value ``` Notes: Merged: https://github.com/ruby/ruby/pull/13342
2025-05-12YJIT: Split the block on optimized getlocal/setlocal (#13282)Takashi Kokubun
Notes: Merged-By: k0kubun <takashikkbn@gmail.com>
2025-05-09Rename `RB_OBJ_SHAPE` -> `rb_obj_shape`Jean Boussier
As well as `RB_OBJ_SHAPE_ID` -> `rb_obj_shape_id` and `RSHAPE` is now a simple alias for `rb_shape_lookup`. I tried to turn all these into `static inline` but I'm having trouble with `RUBY_EXTERN rb_shape_tree_t *rb_shape_tree_ptr;` not being exposed as I'd expect. Notes: Merged: https://github.com/ruby/ruby/pull/13283
2025-05-09Rename `rb_shape_get_shape_id` -> `RB_OBJ_SHAPE_ID`Jean Boussier
And `rb_shape_get_shape` -> `RB_OBJ_SHAPE`. Notes: Merged: https://github.com/ruby/ruby/pull/13283