path: root/zjit/src/profile.rs
Age | Commit message | Author
33 hours | ZJIT: Optimize common `invokesuper` cases (#15816) | Kevin Menard
* ZJIT: Profile `invokesuper` instructions
* ZJIT: Introduce the `InvokeSuperDirect` HIR instruction
  The new instruction is an optimized version of `InvokeSuper` when we know the `super` target is an ISEQ.
* ZJIT: Expand definition of unspecializable to more complex cases
* ZJIT: Ensure `invokesuper` optimization works when the inheritance hierarchy is modified
* ZJIT: Simplify `invokesuper` specialization to most common case
  Looking at ruby-bench, most `super` calls don't pass a block, which means we can use the already optimized `SendWithoutBlockDirect`.
* ZJIT: Track `super` method entries directly to avoid GC issues
  Because the method entry isn't typed as a `VALUE`, we set up barriers on its `VALUE` fields. But, that was insufficient as the method entry itself could be collected in certain cases, resulting in dangling objects. Now we track the method entry as a `VALUE` and can more naturally mark it and its children.
* ZJIT: Optimize `super` calls with simple argument forms
* ZJIT: Report the reason why we can't optimize an `invokesuper` instance
* ZJIT: Revise send fallback reasons for `super` calls
* ZJIT: Assert `super` calls are `FCALL` and don't need visibility checks
2025-12-16 | ZJIT: Add a VALUE#write_barrier helper method to deduplicate logic | Benoit Daloze
2025-12-16 | ZJIT: Guard other calls to rb_gc_writebarrier() with a !special_const_p() check | Benoit Daloze
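The two entries above describe a helper that wraps the write barrier and skips it for special constants. A minimal sketch of that guard pattern follows. Everything in it is a hypothetical stand-in: `Value`, `ToyGc`, the toy tag encoding, and the `write_barrier(self, gc, owner)` signature are assumptions made for illustration and are not taken from ZJIT or CRuby.

```rust
/// Hypothetical stand-in for a Ruby VALUE: one machine word that is either a
/// heap pointer or an immediate ("special constant").
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Value(pub usize);

/// Toy encoding (an assumption, not CRuby's real layout): any of the low
/// three bits set means "immediate", and 0 stands for `false`.
const TOY_IMMEDIATE_MASK: usize = 0x07;

/// Toy collector that only records which (owner, child) edges were reported.
#[derive(Default)]
pub struct ToyGc {
    pub remembered: Vec<(Value, Value)>,
}

impl ToyGc {
    /// Stand-in for the real barrier call; it just logs the edge.
    fn raw_write_barrier(&mut self, owner: Value, child: Value) {
        self.remembered.push((owner, child));
    }
}

impl Value {
    /// True when the value is not a heap object and so needs no barrier.
    pub fn special_const_p(self) -> bool {
        self.0 == 0 || self.0 & TOY_IMMEDIATE_MASK != 0
    }

    /// Single place that applies the guard, so call sites cannot forget it.
    pub fn write_barrier(self, gc: &mut ToyGc, owner: Value) {
        if !self.special_const_p() {
            gc.raw_write_barrier(owner, self);
        }
    }
}
```

With this shape, a call site stores a profiled value and then calls `value.write_barrier(&mut gc, owner)` unconditionally; immediates fall through without touching the collector.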
2025-11-25 | ZJIT: Specialize setinstancevariable when ivar is already in shape (#15290) | Max Bernstein
Don't support shape transitions for now.
2025-11-21 | ZJIT: Specialize monomorphic DefinedIvar (#15281) | Max Bernstein
This lets us constant-fold common monomorphic cases.
2025-11-19 | ZJIT: Fix assertion failure when profiling VM_BLOCK_HANDLER_NONE | Alan Wu
As can be seen in vm_block_handler_verify(), VM_BLOCK_HANDLER_NONE is not a valid argument for vm_block_handler(). Store nil in the profiler when seen instead of crashing.
2025-11-07 | ZJIT: Carve out IseqPayload into a separate module (#15098) | Takashi Kokubun
2025-11-06 | ZJIT: Untag block handler (#15085) | Max Bernstein
Storing the tagged block handler in profiles is not GC-safe (nice catch, Kokubun). Store the untagged block handler instead. Fix bug in https://github.com/ruby/ruby/pull/15051
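To illustrate what "untagging" means here, the sketch below masks off the low tag bits of a pointer-sized word before it is stored, so that what lands in the profile is the plain object address. The tag values and the function name are assumptions chosen for the example, loosely modeled on low-bit pointer tagging in general; they are not copied from the ZJIT or CRuby sources.

```rust
/// Hypothetical tag layout: the low two bits of a block handler word say
/// which kind of captured block it points at.
const TAG_MASK: usize = 0x03;
const ISEQ_TAG: usize = 0x01; // assumed tag for an ISEQ block
const IFUNC_TAG: usize = 0x03; // assumed tag for an ifunc block

/// Return the word with its tag bits cleared when it carries one of the two
/// assumed tags; pass every other word through unchanged.
pub fn untag_block_handler(block_handler: usize) -> usize {
    match block_handler & TAG_MASK {
        ISEQ_TAG | IFUNC_TAG => block_handler & !TAG_MASK,
        _ => block_handler,
    }
}
```

The point of the design is that a collector only recognizes real object addresses. A tagged word such as `0x1001` does not look like the object at `0x1000`, so it could be neither marked nor updated if the object moves.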
2025-11-05 | ZJIT: Profile specific objects for invokeblock (#15051) | Max Bernstein
I made a special kind of `ProfiledType` that looks at specific objects, not just their classes/shapes (https://github.com/ruby/ruby/pull/15051). Then I profiled some of our benchmarks.

For lobsters:

```
Top-6 invokeblock handler (100.0% of total 1,064,155):
  megamorphic: 494,931 (46.5%)
  monomorphic_iseq: 337,171 (31.7%)
  polymorphic: 113,381 (10.7%)
  monomorphic_ifunc: 52,260 ( 4.9%)
  monomorphic_other: 38,970 ( 3.7%)
  no_profiles: 27,442 ( 2.6%)
```

For railsbench:

```
Top-6 invokeblock handler (100.0% of total 2,529,104):
  monomorphic_iseq: 834,452 (33.0%)
  megamorphic: 818,347 (32.4%)
  polymorphic: 632,273 (25.0%)
  monomorphic_ifunc: 224,243 ( 8.9%)
  monomorphic_other: 19,595 ( 0.8%)
  no_profiles: 194 ( 0.0%)
```

For shipit:

```
Top-6 invokeblock handler (100.0% of total 2,104,148):
  megamorphic: 1,269,889 (60.4%)
  polymorphic: 411,475 (19.6%)
  no_profiles: 173,367 ( 8.2%)
  monomorphic_other: 118,619 ( 5.6%)
  monomorphic_iseq: 84,891 ( 4.0%)
  monomorphic_ifunc: 45,907 ( 2.2%)
```

Seems like a monomorphic case for a specific ISEQ actually isn't a bad way of going about this, at least to start...
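For readers unfamiliar with the bucket names in these stats, here is a rough sketch of how a call site's observations could be sorted into `no_profiles`, monomorphic, polymorphic, and megamorphic. The `Distribution` type, the `MAX_BUCKETS` limit of 4, and the use of a plain `usize` as the object identity are all assumptions for illustration; the real profiler may count and classify differently.

```rust
/// Assumed bucket limit: more distinct objects than this counts as megamorphic.
const MAX_BUCKETS: usize = 4;

#[derive(Debug, PartialEq, Eq)]
pub enum Kind {
    NoProfiles,
    Monomorphic,
    Polymorphic,
    Megamorphic,
}

/// Per-call-site record of which specific objects were seen, and how often.
#[derive(Default)]
pub struct Distribution {
    buckets: Vec<(usize, u64)>, // (object identity, hit count)
    overflowed: bool,           // set once a new object no longer fits
}

impl Distribution {
    /// Record one observation of `object` at this call site.
    pub fn observe(&mut self, object: usize) {
        if let Some(entry) = self.buckets.iter_mut().find(|entry| entry.0 == object) {
            entry.1 += 1;
            return;
        }
        if self.buckets.len() < MAX_BUCKETS {
            self.buckets.push((object, 1));
        } else {
            self.overflowed = true;
        }
    }

    /// Classify the call site from what has been observed so far.
    pub fn kind(&self) -> Kind {
        match (self.overflowed, self.buckets.len()) {
            (true, _) => Kind::Megamorphic,
            (false, 0) => Kind::NoProfiles,
            (false, 1) => Kind::Monomorphic,
            (false, _) => Kind::Polymorphic,
        }
    }
}
```

A call site that only ever sees one block object stays `Monomorphic` however many times it runs, which is the case the commit message proposes to specialize first.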
2025-10-30 | ZJIT: Inline struct aref | Max Bernstein
2025-10-20 | ZJIT: Optimize send with block into CCallWithFrame (#14863) | Stan Lo
Since `Send` has a block iseq, I updated `CCallWithFrame` to take an optional `blockiseq` as well, and then generate `CCallWithFrame` for `Send` when the condition is right.

## Stats

`liquid-render` Benchmark

| Metric               | Before             | After              | Change              |
|----------------------|--------------------|--------------------|---------------------|
| send_no_profiles     | 3,209,418 (34.1%)  | 4,119 (0.1%)       | -3,205,299 (-99.9%) |
| dynamic_send_count   | 9,410,758 (23.1%)  | 6,459,678 (15.9%)  | -2,951,080 (-31.4%) |
| optimized_send_count | 31,269,388 (76.9%) | 34,220,474 (84.1%) | +2,951,086 (+9.4%)  |

`lobsters` Benchmark

| Metric               | Before     | After      | Change              |
|----------------------|------------|------------|---------------------|
| send_no_profiles     | 10,769,052 | 2,902,865  | -7,866,187 (-73.0%) |
| dynamic_send_count   | 45,673,185 | 42,880,160 | -2,793,025 (-6.1%)  |
| optimized_send_count | 75,142,407 | 78,378,514 | +3,236,107 (+4.3%)  |

### `liquid-render` Before

<details>

```
Average of last 22, non-warmup iters: 262ms ***ZJIT: Printing ZJIT statistics on exit*** Top-20 not inlined C methods (96.9% of total 10,370,809): Kernel#respond_to?: 5,069,204 (48.9%) Hash#key?: 2,394,488 (23.1%) Set#include?: 778,429 ( 7.5%) String#===: 326,134 ( 3.1%) String#<<: 203,231 ( 2.0%) Integer#<<: 166,768 ( 1.6%) Kernel#is_a?: 164,272 ( 1.6%) Kernel#format: 124,262 ( 1.2%) Integer#/: 124,262 ( 1.2%) Array#<<: 115,325 ( 1.1%) Regexp.last_match: 94,862 ( 0.9%) Hash#[]=: 88,485 ( 0.9%) String#start_with?: 55,933 ( 0.5%) CGI::EscapeExt#escapeHTML: 55,471 ( 0.5%) Array#shift: 55,298 ( 0.5%) Regexp#===: 48,928 ( 0.5%) String#=~: 48,477 ( 0.5%) Array#unshift: 47,331 ( 0.5%) String#empty?: 42,870 ( 0.4%) Array#push: 41,215 ( 0.4%) Top-20 not annotated C methods (97.1% of total 10,394,421): Kernel#respond_to?: 5,069,204 (48.8%) Hash#key?: 2,394,488 (23.0%) Set#include?: 778,429 ( 7.5%) String#===: 326,134 ( 3.1%) Kernel#is_a?: 208,664 ( 2.0%) String#<<: 203,231 ( 2.0%) Integer#<<: 166,768 ( 1.6%) Integer#/: 
124,262 ( 1.2%) Kernel#format: 124,262 ( 1.2%) Array#<<: 115,325 ( 1.1%) Regexp.last_match: 94,862 ( 0.9%) Hash#[]=: 88,485 ( 0.9%) String#start_with?: 55,933 ( 0.5%) CGI::EscapeExt#escapeHTML: 55,471 ( 0.5%) Array#shift: 55,298 ( 0.5%) Regexp#===: 48,928 ( 0.5%) String#=~: 48,477 ( 0.5%) Array#unshift: 47,331 ( 0.5%) String#empty?: 42,870 ( 0.4%) Array#push: 41,215 ( 0.4%) Top-2 not optimized method types for send (100.0% of total 2,382): cfunc: 1,196 (50.2%) iseq: 1,186 (49.8%) Top-4 not optimized method types for send_without_block (100.0% of total 2,561,006): iseq: 2,442,091 (95.4%) optimized: 118,882 ( 4.6%) alias: 20 ( 0.0%) null: 13 ( 0.0%) Top-9 not optimized instructions (100.0% of total 685,128): invokeblock: 227,376 (33.2%) opt_neq: 166,471 (24.3%) opt_and: 166,471 (24.3%) opt_eq: 66,721 ( 9.7%) invokesuper: 39,363 ( 5.7%) opt_le: 16,278 ( 2.4%) opt_minus: 1,574 ( 0.2%) opt_send_without_block: 772 ( 0.1%) opt_or: 102 ( 0.0%) Top-8 send fallback reasons (100.0% of total 9,410,758): send_no_profiles: 3,209,418 (34.1%) send_without_block_polymorphic: 2,858,558 (30.4%) send_without_block_not_optimized_method_type: 2,561,006 (27.2%) not_optimized_instruction: 685,128 ( 7.3%) send_without_block_no_profiles: 91,913 ( 1.0%) send_not_optimized_method_type: 2,382 ( 0.0%) obj_to_string_not_string: 2,352 ( 0.0%) send_without_block_cfunc_array_variadic: 1 ( 0.0%) Top-3 unhandled YARV insns (100.0% of total 83,682): getclassvariable: 83,431 (99.7%) once: 137 ( 0.2%) getconstant: 114 ( 0.1%) Top-3 compile error reasons (100.0% of total 5,431,910): register_spill_on_alloc: 4,665,393 (85.9%) exception_handler: 766,347 (14.1%) register_spill_on_ccall: 170 ( 0.0%) Top-11 side exit reasons (100.0% of total 14,635,508): compile_error: 5,431,910 (37.1%) guard_shape_failure: 3,436,341 (23.5%) guard_type_failure: 2,545,791 (17.4%) unhandled_splat: 2,162,907 (14.8%) unhandled_kwarg: 952,568 ( 6.5%) unhandled_yarv_insn: 83,682 ( 0.6%) unhandled_hir_insn: 19,112 ( 0.1%) 
patchpoint_stable_constant_names: 1,608 ( 0.0%) obj_to_string_fallback: 902 ( 0.0%) patchpoint_method_redefined: 599 ( 0.0%) block_param_proxy_not_iseq_or_ifunc: 88 ( 0.0%) send_count: 40,680,153 dynamic_send_count: 9,410,758 (23.1%) optimized_send_count: 31,269,395 (76.9%) iseq_optimized_send_count: 13,886,902 (34.1%) inline_cfunc_optimized_send_count: 7,011,684 (17.2%) non_variadic_cfunc_optimized_send_count: 4,670,333 (11.5%) variadic_cfunc_optimized_send_count: 5,700,476 (14.0%) dynamic_getivar_count: 1,144,613 dynamic_setivar_count: 950,830 compiled_iseq_count: 402 failed_iseq_count: 48 compile_time: 976ms profile_time: 3,223ms gc_time: 22ms invalidation_time: 0ms vm_write_pc_count: 37,744,491 vm_write_sp_count: 37,511,865 vm_write_locals_count: 37,511,865 vm_write_stack_count: 37,511,865 vm_write_to_parent_iseq_local_count: 558,177 vm_read_from_parent_iseq_local_count: 14,317,032 code_region_bytes: 2,211,840 side_exit_count: 14,635,508 total_insn_count: 476,097,972 vm_insn_count: 253,795,154 zjit_insn_count: 222,302,818 ratio_in_zjit: 46.7% ``` </details> ### `liquid-render` After <details> ``` Average of last 21, non-warmup iters: 272ms ***ZJIT: Printing ZJIT statistics on exit*** Top-20 not inlined C methods (96.8% of total 10,093,966): Kernel#respond_to?: 4,932,224 (48.9%) Hash#key?: 2,329,928 (23.1%) Set#include?: 757,389 ( 7.5%) String#===: 317,494 ( 3.1%) String#<<: 197,831 ( 2.0%) Integer#<<: 162,268 ( 1.6%) Kernel#is_a?: 159,892 ( 1.6%) Kernel#format: 120,902 ( 1.2%) Integer#/: 120,902 ( 1.2%) Array#<<: 112,225 ( 1.1%) Regexp.last_match: 92,382 ( 0.9%) Hash#[]=: 86,145 ( 0.9%) String#start_with?: 54,953 ( 0.5%) Array#shift: 54,038 ( 0.5%) CGI::EscapeExt#escapeHTML: 53,971 ( 0.5%) Regexp#===: 47,848 ( 0.5%) String#=~: 47,237 ( 0.5%) Array#unshift: 46,051 ( 0.5%) String#empty?: 41,750 ( 0.4%) Array#push: 40,115 ( 0.4%) Top-20 not annotated C methods (97.1% of total 10,116,938): Kernel#respond_to?: 4,932,224 (48.8%) Hash#key?: 2,329,928 (23.0%) 
Set#include?: 757,389 ( 7.5%) String#===: 317,494 ( 3.1%) Kernel#is_a?: 203,084 ( 2.0%) String#<<: 197,831 ( 2.0%) Integer#<<: 162,268 ( 1.6%) Kernel#format: 120,902 ( 1.2%) Integer#/: 120,902 ( 1.2%) Array#<<: 112,225 ( 1.1%) Regexp.last_match: 92,382 ( 0.9%) Hash#[]=: 86,145 ( 0.9%) String#start_with?: 54,953 ( 0.5%) Array#shift: 54,038 ( 0.5%) CGI::EscapeExt#escapeHTML: 53,971 ( 0.5%) Regexp#===: 47,848 ( 0.5%) String#=~: 47,237 ( 0.5%) Array#unshift: 46,051 ( 0.5%) String#empty?: 41,750 ( 0.4%) Array#push: 40,115 ( 0.4%) Top-2 not optimized method types for send (100.0% of total 182,938): iseq: 178,414 (97.5%) cfunc: 4,524 ( 2.5%) Top-4 not optimized method types for send_without_block (100.0% of total 2,492,246): iseq: 2,376,511 (95.4%) optimized: 115,702 ( 4.6%) alias: 20 ( 0.0%) null: 13 ( 0.0%) Top-9 not optimized instructions (100.0% of total 667,727): invokeblock: 221,375 (33.2%) opt_neq: 161,971 (24.3%) opt_and: 161,971 (24.3%) opt_eq: 64,921 ( 9.7%) invokesuper: 39,243 ( 5.9%) opt_le: 15,838 ( 2.4%) opt_minus: 1,534 ( 0.2%) opt_send_without_block: 772 ( 0.1%) opt_or: 102 ( 0.0%) Top-9 send fallback reasons (100.0% of total 6,287,956): send_without_block_polymorphic: 2,782,058 (44.2%) send_without_block_not_optimized_method_type: 2,492,246 (39.6%) not_optimized_instruction: 667,727 (10.6%) send_not_optimized_method_type: 182,938 ( 2.9%) send_without_block_no_profiles: 89,613 ( 1.4%) send_polymorphic: 66,962 ( 1.1%) send_no_profiles: 4,059 ( 0.1%) obj_to_string_not_string: 2,352 ( 0.0%) send_without_block_cfunc_array_variadic: 1 ( 0.0%) Top-3 unhandled YARV insns (100.0% of total 81,482): getclassvariable: 81,231 (99.7%) once: 137 ( 0.2%) getconstant: 114 ( 0.1%) Top-3 compile error reasons (100.0% of total 5,286,310): register_spill_on_alloc: 4,540,413 (85.9%) exception_handler: 745,727 (14.1%) register_spill_on_ccall: 170 ( 0.0%) Top-12 side exit reasons (100.0% of total 14,244,881): compile_error: 5,286,310 (37.1%) guard_shape_failure: 3,346,873 
(23.5%) guard_type_failure: 2,477,071 (17.4%) unhandled_splat: 2,104,447 (14.8%) unhandled_kwarg: 926,828 ( 6.5%) unhandled_yarv_insn: 81,482 ( 0.6%) unhandled_hir_insn: 18,672 ( 0.1%) patchpoint_stable_constant_names: 1,608 ( 0.0%) obj_to_string_fallback: 902 ( 0.0%) patchpoint_method_redefined: 599 ( 0.0%) block_param_proxy_not_iseq_or_ifunc: 88 ( 0.0%) interrupt: 1 ( 0.0%) send_count: 39,591,410 dynamic_send_count: 6,287,956 (15.9%) optimized_send_count: 33,303,454 (84.1%) iseq_optimized_send_count: 13,514,283 (34.1%) inline_cfunc_optimized_send_count: 6,823,745 (17.2%) non_variadic_cfunc_optimized_send_count: 7,417,432 (18.7%) variadic_cfunc_optimized_send_count: 5,547,994 (14.0%) dynamic_getivar_count: 1,110,647 dynamic_setivar_count: 927,309 compiled_iseq_count: 403 failed_iseq_count: 48 compile_time: 968ms profile_time: 3,547ms gc_time: 22ms invalidation_time: 0ms vm_write_pc_count: 36,735,108 vm_write_sp_count: 36,508,262 vm_write_locals_count: 36,508,262 vm_write_stack_count: 36,508,262 vm_write_to_parent_iseq_local_count: 543,097 vm_read_from_parent_iseq_local_count: 13,930,672 code_region_bytes: 2,228,224 side_exit_count: 14,244,881 total_insn_count: 463,357,969 vm_insn_count: 247,003,727 zjit_insn_count: 216,354,242 ratio_in_zjit: 46.7% ``` </details> ### `lobsters` Before <details> ``` Average of last 10, non-warmup iters: 898ms ***ZJIT: Printing ZJIT statistics on exit*** Top-20 not inlined C methods (61.3% of total 19,495,906): String#<<: 1,764,437 ( 9.1%) Kernel#is_a?: 1,615,120 ( 8.3%) Hash#[]=: 1,159,455 ( 5.9%) Regexp#match?: 777,496 ( 4.0%) String#empty?: 722,953 ( 3.7%) Hash#key?: 685,258 ( 3.5%) Kernel#respond_to?: 602,017 ( 3.1%) TrueClass#===: 447,671 ( 2.3%) FalseClass#===: 439,276 ( 2.3%) Array#include?: 426,758 ( 2.2%) Kernel#block_given?: 405,271 ( 2.1%) Hash#fetch: 382,302 ( 2.0%) ObjectSpace::WeakKeyMap#[]: 356,654 ( 1.8%) String#start_with?: 353,793 ( 1.8%) Kernel#kind_of?: 340,341 ( 1.7%) Kernel#dup: 328,162 ( 1.7%) String.new: 
306,667 ( 1.6%) String#==: 287,549 ( 1.5%) BasicObject#!=: 284,642 ( 1.5%) String#length: 256,070 ( 1.3%) Top-20 not annotated C methods (62.4% of total 19,796,172): Kernel#is_a?: 1,993,676 (10.1%) String#<<: 1,764,437 ( 8.9%) Hash#[]=: 1,159,634 ( 5.9%) Regexp#match?: 777,496 ( 3.9%) String#empty?: 738,030 ( 3.7%) Hash#key?: 685,258 ( 3.5%) Kernel#respond_to?: 602,017 ( 3.0%) TrueClass#===: 447,671 ( 2.3%) FalseClass#===: 439,276 ( 2.2%) Array#include?: 426,758 ( 2.2%) Kernel#block_given?: 425,813 ( 2.2%) Hash#fetch: 382,302 ( 1.9%) ObjectSpace::WeakKeyMap#[]: 356,654 ( 1.8%) String#start_with?: 353,793 ( 1.8%) Kernel#kind_of?: 340,375 ( 1.7%) Kernel#dup: 328,169 ( 1.7%) String.new: 306,667 ( 1.5%) String#==: 293,520 ( 1.5%) BasicObject#!=: 284,825 ( 1.4%) String#length: 256,070 ( 1.3%) Top-2 not optimized method types for send (100.0% of total 115,007): cfunc: 76,172 (66.2%) iseq: 38,835 (33.8%) Top-6 not optimized method types for send_without_block (100.0% of total 8,003,641): iseq: 3,999,211 (50.0%) bmethod: 1,750,271 (21.9%) optimized: 1,653,426 (20.7%) alias: 591,342 ( 7.4%) null: 8,174 ( 0.1%) cfunc: 1,217 ( 0.0%) Top-13 not optimized instructions (100.0% of total 7,590,826): invokesuper: 4,335,446 (57.1%) invokeblock: 1,329,215 (17.5%) sendforward: 841,463 (11.1%) opt_eq: 810,614 (10.7%) opt_plus: 141,773 ( 1.9%) opt_minus: 52,270 ( 0.7%) opt_send_without_block: 43,248 ( 0.6%) opt_neq: 15,047 ( 0.2%) opt_mult: 13,824 ( 0.2%) opt_or: 7,451 ( 0.1%) opt_lt: 348 ( 0.0%) opt_ge: 91 ( 0.0%) opt_gt: 36 ( 0.0%) Top-9 send fallback reasons (100.0% of total 45,673,212): send_without_block_polymorphic: 17,390,335 (38.1%) send_no_profiles: 10,769,053 (23.6%) send_without_block_not_optimized_method_type: 8,003,641 (17.5%) not_optimized_instruction: 7,590,826 (16.6%) send_without_block_no_profiles: 1,757,109 ( 3.8%) send_not_optimized_method_type: 115,007 ( 0.3%) send_without_block_cfunc_array_variadic: 31,149 ( 0.1%) obj_to_string_not_string: 15,518 ( 0.0%) 
send_without_block_direct_too_many_args: 574 ( 0.0%) Top-9 unhandled YARV insns (100.0% of total 1,242,228): expandarray: 622,203 (50.1%) checkkeyword: 316,111 (25.4%) getclassvariable: 120,540 ( 9.7%) getblockparam: 88,480 ( 7.1%) invokesuperforward: 78,842 ( 6.3%) opt_duparray_send: 14,149 ( 1.1%) getconstant: 1,588 ( 0.1%) checkmatch: 288 ( 0.0%) once: 27 ( 0.0%) Top-3 compile error reasons (100.0% of total 6,769,693): register_spill_on_alloc: 6,188,305 (91.4%) register_spill_on_ccall: 347,108 ( 5.1%) exception_handler: 234,280 ( 3.5%) Top-17 side exit reasons (100.0% of total 20,142,827): compile_error: 6,769,693 (33.6%) guard_type_failure: 5,169,050 (25.7%) guard_shape_failure: 3,726,362 (18.5%) unhandled_yarv_insn: 1,242,228 ( 6.2%) block_param_proxy_not_iseq_or_ifunc: 984,480 ( 4.9%) unhandled_kwarg: 800,154 ( 4.0%) unknown_newarray_send: 539,317 ( 2.7%) patchpoint_stable_constant_names: 340,283 ( 1.7%) unhandled_splat: 229,440 ( 1.1%) unhandled_hir_insn: 147,351 ( 0.7%) patchpoint_no_singleton_class: 128,856 ( 0.6%) patchpoint_method_redefined: 32,718 ( 0.2%) block_param_proxy_modified: 25,274 ( 0.1%) patchpoint_no_ep_escape: 7,559 ( 0.0%) obj_to_string_fallback: 24 ( 0.0%) guard_type_not_failure: 22 ( 0.0%) interrupt: 16 ( 0.0%) send_count: 120,815,640 dynamic_send_count: 45,673,212 (37.8%) optimized_send_count: 75,142,428 (62.2%) iseq_optimized_send_count: 32,188,039 (26.6%) inline_cfunc_optimized_send_count: 23,458,483 (19.4%) non_variadic_cfunc_optimized_send_count: 14,809,797 (12.3%) variadic_cfunc_optimized_send_count: 4,686,109 ( 3.9%) dynamic_getivar_count: 13,023,437 dynamic_setivar_count: 12,311,158 compiled_iseq_count: 4,806 failed_iseq_count: 466 compile_time: 8,943ms profile_time: 99ms gc_time: 45ms invalidation_time: 239ms vm_write_pc_count: 113,652,291 vm_write_sp_count: 111,209,623 vm_write_locals_count: 111,209,623 vm_write_stack_count: 111,209,623 vm_write_to_parent_iseq_local_count: 516,800 vm_read_from_parent_iseq_local_count: 11,225,587 
code_region_bytes: 22,609,920 side_exit_count: 20,142,827 total_insn_count: 926,088,942 vm_insn_count: 297,636,255 zjit_insn_count: 628,452,687 ratio_in_zjit: 67.9% ``` </details> ### `lobsters` After <details> ``` Average of last 10, non-warmup iters: 919ms ***ZJIT: Printing ZJIT statistics on exit*** Top-20 not inlined C methods (61.3% of total 19,495,868): String#<<: 1,764,437 ( 9.1%) Kernel#is_a?: 1,615,110 ( 8.3%) Hash#[]=: 1,159,455 ( 5.9%) Regexp#match?: 777,496 ( 4.0%) String#empty?: 722,953 ( 3.7%) Hash#key?: 685,258 ( 3.5%) Kernel#respond_to?: 602,016 ( 3.1%) TrueClass#===: 447,671 ( 2.3%) FalseClass#===: 439,276 ( 2.3%) Array#include?: 426,758 ( 2.2%) Kernel#block_given?: 405,271 ( 2.1%) Hash#fetch: 382,302 ( 2.0%) ObjectSpace::WeakKeyMap#[]: 356,654 ( 1.8%) String#start_with?: 353,793 ( 1.8%) Kernel#kind_of?: 340,341 ( 1.7%) Kernel#dup: 328,162 ( 1.7%) String.new: 306,667 ( 1.6%) String#==: 287,545 ( 1.5%) BasicObject#!=: 284,642 ( 1.5%) String#length: 256,070 ( 1.3%) Top-20 not annotated C methods (62.4% of total 19,796,134): Kernel#is_a?: 1,993,666 (10.1%) String#<<: 1,764,437 ( 8.9%) Hash#[]=: 1,159,634 ( 5.9%) Regexp#match?: 777,496 ( 3.9%) String#empty?: 738,030 ( 3.7%) Hash#key?: 685,258 ( 3.5%) Kernel#respond_to?: 602,016 ( 3.0%) TrueClass#===: 447,671 ( 2.3%) FalseClass#===: 439,276 ( 2.2%) Array#include?: 426,758 ( 2.2%) Kernel#block_given?: 425,813 ( 2.2%) Hash#fetch: 382,302 ( 1.9%) ObjectSpace::WeakKeyMap#[]: 356,654 ( 1.8%) String#start_with?: 353,793 ( 1.8%) Kernel#kind_of?: 340,375 ( 1.7%) Kernel#dup: 328,169 ( 1.7%) String.new: 306,667 ( 1.5%) String#==: 293,516 ( 1.5%) BasicObject#!=: 284,825 ( 1.4%) String#length: 256,070 ( 1.3%) Top-4 not optimized method types for send (100.0% of total 4,749,678): iseq: 2,563,391 (54.0%) cfunc: 2,064,888 (43.5%) alias: 118,577 ( 2.5%) null: 2,822 ( 0.1%) Top-6 not optimized method types for send_without_block (100.0% of total 8,003,641): iseq: 3,999,211 (50.0%) bmethod: 1,750,271 (21.9%) optimized: 
1,653,426 (20.7%) alias: 591,342 ( 7.4%) null: 8,174 ( 0.1%) cfunc: 1,217 ( 0.0%) Top-13 not optimized instructions (100.0% of total 7,590,818): invokesuper: 4,335,442 (57.1%) invokeblock: 1,329,215 (17.5%) sendforward: 841,463 (11.1%) opt_eq: 810,610 (10.7%) opt_plus: 141,773 ( 1.9%) opt_minus: 52,270 ( 0.7%) opt_send_without_block: 43,248 ( 0.6%) opt_neq: 15,047 ( 0.2%) opt_mult: 13,824 ( 0.2%) opt_or: 7,451 ( 0.1%) opt_lt: 348 ( 0.0%) opt_ge: 91 ( 0.0%) opt_gt: 36 ( 0.0%) Top-10 send fallback reasons (100.0% of total 43,152,037): send_without_block_polymorphic: 17,390,322 (40.3%) send_without_block_not_optimized_method_type: 8,003,641 (18.5%) not_optimized_instruction: 7,590,818 (17.6%) send_not_optimized_method_type: 4,749,678 (11.0%) send_no_profiles: 2,893,666 ( 6.7%) send_without_block_no_profiles: 1,757,109 ( 4.1%) send_polymorphic: 719,562 ( 1.7%) send_without_block_cfunc_array_variadic: 31,149 ( 0.1%) obj_to_string_not_string: 15,518 ( 0.0%) send_without_block_direct_too_many_args: 574 ( 0.0%) Top-9 unhandled YARV insns (100.0% of total 1,242,215): expandarray: 622,203 (50.1%) checkkeyword: 316,111 (25.4%) getclassvariable: 120,540 ( 9.7%) getblockparam: 88,467 ( 7.1%) invokesuperforward: 78,842 ( 6.3%) opt_duparray_send: 14,149 ( 1.1%) getconstant: 1,588 ( 0.1%) checkmatch: 288 ( 0.0%) once: 27 ( 0.0%) Top-3 compile error reasons (100.0% of total 6,769,688): register_spill_on_alloc: 6,188,305 (91.4%) register_spill_on_ccall: 347,108 ( 5.1%) exception_handler: 234,275 ( 3.5%) Top-17 side exit reasons (100.0% of total 20,144,372): compile_error: 6,769,688 (33.6%) guard_type_failure: 5,169,204 (25.7%) guard_shape_failure: 3,726,374 (18.5%) unhandled_yarv_insn: 1,242,215 ( 6.2%) block_param_proxy_not_iseq_or_ifunc: 984,480 ( 4.9%) unhandled_kwarg: 800,154 ( 4.0%) unknown_newarray_send: 539,317 ( 2.7%) patchpoint_stable_constant_names: 340,283 ( 1.7%) unhandled_splat: 229,440 ( 1.1%) unhandled_hir_insn: 147,351 ( 0.7%) patchpoint_no_singleton_class: 130,252 ( 
0.6%) patchpoint_method_redefined: 32,716 ( 0.2%) block_param_proxy_modified: 25,274 ( 0.1%) patchpoint_no_ep_escape: 7,559 ( 0.0%) obj_to_string_fallback: 24 ( 0.0%) guard_type_not_failure: 22 ( 0.0%) interrupt: 19 ( 0.0%) send_count: 120,812,030 dynamic_send_count: 43,152,037 (35.7%) optimized_send_count: 77,659,993 (64.3%) iseq_optimized_send_count: 32,187,900 (26.6%) inline_cfunc_optimized_send_count: 23,458,491 (19.4%) non_variadic_cfunc_optimized_send_count: 17,327,499 (14.3%) variadic_cfunc_optimized_send_count: 4,686,103 ( 3.9%) dynamic_getivar_count: 13,023,424 dynamic_setivar_count: 12,310,991 compiled_iseq_count: 4,806 failed_iseq_count: 466 compile_time: 9,012ms profile_time: 104ms gc_time: 44ms invalidation_time: 239ms vm_write_pc_count: 113,648,665 vm_write_sp_count: 111,205,997 vm_write_locals_count: 111,205,997 vm_write_stack_count: 111,205,997 vm_write_to_parent_iseq_local_count: 516,800 vm_read_from_parent_iseq_local_count: 11,225,587 code_region_bytes: 23,052,288 side_exit_count: 20,144,372 total_insn_count: 926,090,214 vm_insn_count: 297,647,811 zjit_insn_count: 628,442,403 ratio_in_zjit: 67.9% ``` </details>
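The "Change" column in the summary tables of this entry is the signed difference between the two runs, followed by that difference as a percentage of the "Before" value. A small sketch of the arithmetic follows; the function names are made up for this example, and the output omits the thousands separators used in the tables.

```rust
/// Signed difference `after - before`.
pub fn delta(before: u64, after: u64) -> i64 {
    after as i64 - before as i64
}

/// Change relative to `before`, in percent.
pub fn percent_change(before: u64, after: u64) -> f64 {
    delta(before, after) as f64 / before as f64 * 100.0
}

/// Render the change as "<signed delta> (<signed percent>%)".
pub fn format_change(before: u64, after: u64) -> String {
    format!(
        "{:+} ({:+.1}%)",
        delta(before, after),
        percent_change(before, after)
    )
}
```

For the `liquid-render` row `send_no_profiles`, `format_change(3_209_418, 4_119)` gives `-3205299 (-99.9%)`, matching the table's -3,205,299 (-99.9%).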
2025-10-15 | ZJIT: Profile opt_succ and inline Integer#succ for Fixnum (#14846) | Max Bernstein
This is only really called a lot in the benchmark harness, as far as I can tell.
2025-10-14 | ZJIT: Profile opt_size, opt_length, opt_regexpmatch2 (#14837) | Max Bernstein
These bring `send_without_block_no_profiles` numbers down more. On lobsters: Before: send_without_block_no_profiles: 1,293,375 After: send_without_block_no_profiles: 998,724 all stats before: ``` ***ZJIT: Printing ZJIT statistics on exit*** Top-20 not inlined C methods (71.1% of total 15,575,335): Hash#[]: 4,519,774 (29.0%) Kernel#is_a?: 1,030,758 ( 6.6%) String#<<: 851,929 ( 5.5%) Hash#[]=: 742,941 ( 4.8%) Regexp#match?: 399,889 ( 2.6%) String#empty?: 353,775 ( 2.3%) Hash#key?: 349,129 ( 2.2%) String#start_with?: 334,961 ( 2.2%) Kernel#respond_to?: 316,527 ( 2.0%) ObjectSpace::WeakKeyMap#[]: 238,978 ( 1.5%) TrueClass#===: 235,771 ( 1.5%) FalseClass#===: 231,144 ( 1.5%) Array#include?: 211,381 ( 1.4%) Hash#fetch: 204,702 ( 1.3%) Kernel#block_given?: 181,792 ( 1.2%) ActiveSupport::OrderedOptions#_get: 181,272 ( 1.2%) Kernel#dup: 179,340 ( 1.2%) BasicObject#!=: 175,997 ( 1.1%) Class#new: 168,078 ( 1.1%) Kernel#kind_of?: 165,600 ( 1.1%) Top-20 not annotated C methods (71.6% of total 15,737,478): Hash#[]: 4,519,784 (28.7%) Kernel#is_a?: 1,212,649 ( 7.7%) String#<<: 851,929 ( 5.4%) Hash#[]=: 743,120 ( 4.7%) Regexp#match?: 399,889 ( 2.5%) String#empty?: 361,013 ( 2.3%) Hash#key?: 349,129 ( 2.2%) String#start_with?: 334,961 ( 2.1%) Kernel#respond_to?: 316,527 ( 2.0%) ObjectSpace::WeakKeyMap#[]: 238,978 ( 1.5%) TrueClass#===: 235,771 ( 1.5%) FalseClass#===: 231,144 ( 1.5%) Array#include?: 211,381 ( 1.3%) Hash#fetch: 204,702 ( 1.3%) Kernel#block_given?: 191,661 ( 1.2%) ActiveSupport::OrderedOptions#_get: 181,272 ( 1.2%) Kernel#dup: 179,347 ( 1.1%) BasicObject#!=: 176,181 ( 1.1%) Class#new: 168,078 ( 1.1%) Kernel#kind_of?: 165,634 ( 1.1%) Top-2 not optimized method types for send (100.0% of total 72,318): cfunc: 48,055 (66.4%) iseq: 24,263 (33.6%) Top-6 not optimized method types for send_without_block (100.0% of total 4,523,648): iseq: 2,271,904 (50.2%) bmethod: 985,636 (21.8%) optimized: 949,702 (21.0%) alias: 310,746 ( 6.9%) null: 5,106 ( 0.1%) cfunc: 554 ( 0.0%) Top-13 
not optimized instructions (100.0% of total 4,293,096): invokesuper: 2,373,391 (55.3%) invokeblock: 811,872 (18.9%) sendforward: 505,448 (11.8%) opt_eq: 451,754 (10.5%) opt_plus: 74,403 ( 1.7%) opt_minus: 36,225 ( 0.8%) opt_send_without_block: 21,792 ( 0.5%) opt_neq: 7,231 ( 0.2%) opt_mult: 6,752 ( 0.2%) opt_or: 3,753 ( 0.1%) opt_lt: 348 ( 0.0%) opt_ge: 91 ( 0.0%) opt_gt: 36 ( 0.0%) Top-9 send fallback reasons (100.0% of total 25,824,463): send_without_block_polymorphic: 9,721,727 (37.6%) send_no_profiles: 5,894,760 (22.8%) send_without_block_not_optimized_method_type: 4,523,648 (17.5%) not_optimized_instruction: 4,293,096 (16.6%) send_without_block_no_profiles: 1,293,386 ( 5.0%) send_not_optimized_method_type: 72,318 ( 0.3%) send_without_block_cfunc_array_variadic: 15,134 ( 0.1%) obj_to_string_not_string: 9,765 ( 0.0%) send_without_block_direct_too_many_args: 629 ( 0.0%) Top-9 unhandled YARV insns (100.0% of total 690,482): expandarray: 328,490 (47.6%) checkkeyword: 190,694 (27.6%) getclassvariable: 59,901 ( 8.7%) invokesuperforward: 49,503 ( 7.2%) getblockparam: 48,651 ( 7.0%) opt_duparray_send: 11,978 ( 1.7%) getconstant: 952 ( 0.1%) checkmatch: 290 ( 0.0%) once: 23 ( 0.0%) Top-3 compile error reasons (100.0% of total 3,752,502): register_spill_on_alloc: 3,457,791 (92.1%) register_spill_on_ccall: 176,348 ( 4.7%) exception_handler: 118,363 ( 3.2%) Top-14 side exit reasons (100.0% of total 10,860,787): compile_error: 3,752,502 (34.6%) guard_type_failure: 2,638,903 (24.3%) guard_shape_failure: 1,917,195 (17.7%) unhandled_yarv_insn: 690,482 ( 6.4%) block_param_proxy_not_iseq_or_ifunc: 535,787 ( 4.9%) unhandled_kwarg: 421,943 ( 3.9%) patchpoint: 370,449 ( 3.4%) unknown_newarray_send: 314,785 ( 2.9%) unhandled_splat: 122,060 ( 1.1%) unhandled_hir_insn: 76,396 ( 0.7%) block_param_proxy_modified: 19,193 ( 0.2%) obj_to_string_fallback: 566 ( 0.0%) interrupt: 504 ( 0.0%) guard_type_not_failure: 22 ( 0.0%) send_count: 66,945,801 dynamic_send_count: 25,824,463 (38.6%) 
optimized_send_count: 41,121,338 (61.4%) iseq_optimized_send_count: 18,587,368 (27.8%) inline_cfunc_optimized_send_count: 6,958,635 (10.4%) non_variadic_cfunc_optimized_send_count: 12,911,155 (19.3%) variadic_cfunc_optimized_send_count: 2,664,180 ( 4.0%) dynamic_getivar_count: 7,365,975 dynamic_setivar_count: 7,245,897 compiled_iseq_count: 4,794 failed_iseq_count: 450 compile_time: 760ms profile_time: 9ms gc_time: 8ms invalidation_time: 55ms vm_write_pc_count: 64,284,053 vm_write_sp_count: 62,940,297 vm_write_locals_count: 62,940,297 vm_write_stack_count: 62,940,297 vm_write_to_parent_iseq_local_count: 292,446 vm_read_from_parent_iseq_local_count: 6,470,923 code_region_bytes: 23,019,520 side_exit_count: 10,860,787 total_insn_count: 517,576,320 vm_insn_count: 163,188,910 zjit_insn_count: 354,387,410 ratio_in_zjit: 68.5% ``` all stats after: ``` ***ZJIT: Printing ZJIT statistics on exit*** Top-20 not inlined C methods (70.4% of total 15,740,856): Hash#[]: 4,519,792 (28.7%) Kernel#is_a?: 1,030,776 ( 6.5%) String#<<: 851,940 ( 5.4%) Hash#[]=: 742,914 ( 4.7%) Regexp#match?: 399,887 ( 2.5%) String#empty?: 353,775 ( 2.2%) Hash#key?: 349,139 ( 2.2%) String#start_with?: 334,961 ( 2.1%) Kernel#respond_to?: 316,529 ( 2.0%) ObjectSpace::WeakKeyMap#[]: 238,978 ( 1.5%) TrueClass#===: 235,771 ( 1.5%) FalseClass#===: 231,144 ( 1.5%) Array#include?: 211,381 ( 1.3%) Hash#fetch: 204,702 ( 1.3%) Kernel#block_given?: 181,788 ( 1.2%) ActiveSupport::OrderedOptions#_get: 181,272 ( 1.2%) Kernel#dup: 179,341 ( 1.1%) BasicObject#!=: 175,996 ( 1.1%) Class#new: 168,079 ( 1.1%) Kernel#kind_of?: 165,600 ( 1.1%) Top-20 not annotated C methods (70.9% of total 15,902,999): Hash#[]: 4,519,802 (28.4%) Kernel#is_a?: 1,212,667 ( 7.6%) String#<<: 851,940 ( 5.4%) Hash#[]=: 743,093 ( 4.7%) Regexp#match?: 399,887 ( 2.5%) String#empty?: 361,013 ( 2.3%) Hash#key?: 349,139 ( 2.2%) String#start_with?: 334,961 ( 2.1%) Kernel#respond_to?: 316,529 ( 2.0%) ObjectSpace::WeakKeyMap#[]: 238,978 ( 1.5%) TrueClass#===: 
235,771 ( 1.5%) FalseClass#===: 231,144 ( 1.5%) Array#include?: 211,381 ( 1.3%) Hash#fetch: 204,702 ( 1.3%) Kernel#block_given?: 191,657 ( 1.2%) ActiveSupport::OrderedOptions#_get: 181,272 ( 1.1%) Kernel#dup: 179,348 ( 1.1%) BasicObject#!=: 176,180 ( 1.1%) Class#new: 168,079 ( 1.1%) Kernel#kind_of?: 165,634 ( 1.0%) Top-2 not optimized method types for send (100.0% of total 72,318): cfunc: 48,055 (66.4%) iseq: 24,263 (33.6%) Top-6 not optimized method types for send_without_block (100.0% of total 4,523,637): iseq: 2,271,900 (50.2%) bmethod: 985,636 (21.8%) optimized: 949,695 (21.0%) alias: 310,746 ( 6.9%) null: 5,106 ( 0.1%) cfunc: 554 ( 0.0%) Top-13 not optimized instructions (100.0% of total 4,293,128): invokesuper: 2,373,401 (55.3%) invokeblock: 811,890 (18.9%) sendforward: 505,449 (11.8%) opt_eq: 451,754 (10.5%) opt_plus: 74,403 ( 1.7%) opt_minus: 36,228 ( 0.8%) opt_send_without_block: 21,792 ( 0.5%) opt_neq: 7,231 ( 0.2%) opt_mult: 6,752 ( 0.2%) opt_or: 3,753 ( 0.1%) opt_lt: 348 ( 0.0%) opt_ge: 91 ( 0.0%) opt_gt: 36 ( 0.0%) Top-9 send fallback reasons (100.0% of total 25,530,605): send_without_block_polymorphic: 9,722,499 (38.1%) send_no_profiles: 5,894,763 (23.1%) send_without_block_not_optimized_method_type: 4,523,637 (17.7%) not_optimized_instruction: 4,293,128 (16.8%) send_without_block_no_profiles: 998,732 ( 3.9%) send_not_optimized_method_type: 72,318 ( 0.3%) send_without_block_cfunc_array_variadic: 15,134 ( 0.1%) obj_to_string_not_string: 9,765 ( 0.0%) send_without_block_direct_too_many_args: 629 ( 0.0%) Top-9 unhandled YARV insns (100.0% of total 690,482): expandarray: 328,490 (47.6%) checkkeyword: 190,694 (27.6%) getclassvariable: 59,901 ( 8.7%) invokesuperforward: 49,503 ( 7.2%) getblockparam: 48,651 ( 7.0%) opt_duparray_send: 11,978 ( 1.7%) getconstant: 952 ( 0.1%) checkmatch: 290 ( 0.0%) once: 23 ( 0.0%) Top-3 compile error reasons (100.0% of total 3,752,500): register_spill_on_alloc: 3,457,792 (92.1%) register_spill_on_ccall: 176,348 ( 4.7%) 
exception_handler: 118,360 ( 3.2%) Top-14 side exit reasons (100.0% of total 10,860,797): compile_error: 3,752,500 (34.6%) guard_type_failure: 2,638,909 (24.3%) guard_shape_failure: 1,917,203 (17.7%) unhandled_yarv_insn: 690,482 ( 6.4%) block_param_proxy_not_iseq_or_ifunc: 535,784 ( 4.9%) unhandled_kwarg: 421,947 ( 3.9%) patchpoint: 370,474 ( 3.4%) unknown_newarray_send: 314,786 ( 2.9%) unhandled_splat: 122,067 ( 1.1%) unhandled_hir_insn: 76,395 ( 0.7%) block_param_proxy_modified: 19,193 ( 0.2%) obj_to_string_fallback: 566 ( 0.0%) interrupt: 469 ( 0.0%) guard_type_not_failure: 22 ( 0.0%) send_count: 66,945,326 dynamic_send_count: 25,530,605 (38.1%) optimized_send_count: 41,414,721 (61.9%) iseq_optimized_send_count: 18,587,439 (27.8%) inline_cfunc_optimized_send_count: 7,086,426 (10.6%) non_variadic_cfunc_optimized_send_count: 13,076,682 (19.5%) variadic_cfunc_optimized_send_count: 2,664,174 ( 4.0%) dynamic_getivar_count: 7,365,985 dynamic_setivar_count: 7,245,954 compiled_iseq_count: 4,794 failed_iseq_count: 450 compile_time: 748ms profile_time: 9ms gc_time: 8ms invalidation_time: 58ms vm_write_pc_count: 64,155,801 vm_write_sp_count: 62,812,041 vm_write_locals_count: 62,812,041 vm_write_stack_count: 62,812,041 vm_write_to_parent_iseq_local_count: 292,448 vm_read_from_parent_iseq_local_count: 6,470,939 code_region_bytes: 23,052,288 side_exit_count: 10,860,797 total_insn_count: 517,576,915 vm_insn_count: 163,192,099 zjit_insn_count: 354,384,816 ratio_in_zjit: 68.5% ```
2025-10-14ZJIT: Profile opt_ltlt and opt_aset (#14834)Max Bernstein
These bring `send_without_block_no_profiles` numbers down dramatically. On lobsters: Before: send_without_block_no_profiles: 3,466,375 After: send_without_block_no_profiles: 1,293,375 all stats before: ``` ***ZJIT: Printing ZJIT statistics on exit*** Top-20 not inlined C methods (70.4% of total 14,174,061): Hash#[]: 4,519,771 (31.9%) Kernel#is_a?: 1,030,757 ( 7.3%) Regexp#match?: 399,885 ( 2.8%) String#empty?: 353,775 ( 2.5%) Hash#key?: 349,125 ( 2.5%) Hash#[]=: 344,348 ( 2.4%) String#start_with?: 334,961 ( 2.4%) Kernel#respond_to?: 316,527 ( 2.2%) ObjectSpace::WeakKeyMap#[]: 238,978 ( 1.7%) TrueClass#===: 235,770 ( 1.7%) FalseClass#===: 231,143 ( 1.6%) Array#include?: 211,383 ( 1.5%) Hash#fetch: 204,702 ( 1.4%) Kernel#block_given?: 181,793 ( 1.3%) ActiveSupport::OrderedOptions#_get: 181,272 ( 1.3%) Kernel#dup: 179,341 ( 1.3%) BasicObject#!=: 175,996 ( 1.2%) Class#new: 168,079 ( 1.2%) Kernel#kind_of?: 165,600 ( 1.2%) String#==: 157,734 ( 1.1%) Top-20 not annotated C methods (71.1% of total 14,336,035): Hash#[]: 4,519,781 (31.5%) Kernel#is_a?: 1,212,647 ( 8.5%) Regexp#match?: 399,885 ( 2.8%) String#empty?: 361,013 ( 2.5%) Hash#key?: 349,125 ( 2.4%) Hash#[]=: 344,348 ( 2.4%) String#start_with?: 334,961 ( 2.3%) Kernel#respond_to?: 316,527 ( 2.2%) ObjectSpace::WeakKeyMap#[]: 238,978 ( 1.7%) TrueClass#===: 235,770 ( 1.6%) FalseClass#===: 231,143 ( 1.6%) Array#include?: 211,383 ( 1.5%) Hash#fetch: 204,702 ( 1.4%) Kernel#block_given?: 191,662 ( 1.3%) ActiveSupport::OrderedOptions#_get: 181,272 ( 1.3%) Kernel#dup: 179,348 ( 1.3%) BasicObject#!=: 176,180 ( 1.2%) Class#new: 168,079 ( 1.2%) Kernel#kind_of?: 165,634 ( 1.2%) String#==: 163,666 ( 1.1%) Top-2 not optimized method types for send (100.0% of total 72,318): cfunc: 48,055 (66.4%) iseq: 24,263 (33.6%) Top-6 not optimized method types for send_without_block (100.0% of total 4,536,895): iseq: 2,281,897 (50.3%) bmethod: 985,679 (21.7%) optimized: 952,914 (21.0%) alias: 310,745 ( 6.8%) null: 5,106 ( 0.1%) cfunc: 554 ( 
0.0%) Top-13 not optimized instructions (100.0% of total 4,293,123): invokesuper: 2,373,396 (55.3%) invokeblock: 811,891 (18.9%) sendforward: 505,449 (11.8%) opt_eq: 451,754 (10.5%) opt_plus: 74,403 ( 1.7%) opt_minus: 36,227 ( 0.8%) opt_send_without_block: 21,792 ( 0.5%) opt_neq: 7,231 ( 0.2%) opt_mult: 6,752 ( 0.2%) opt_or: 3,753 ( 0.1%) opt_lt: 348 ( 0.0%) opt_ge: 91 ( 0.0%) opt_gt: 36 ( 0.0%) Top-9 send fallback reasons (100.0% of total 27,795,022): send_without_block_polymorphic: 9,505,835 (34.2%) send_no_profiles: 5,894,763 (21.2%) send_without_block_not_optimized_method_type: 4,536,895 (16.3%) not_optimized_instruction: 4,293,123 (15.4%) send_without_block_no_profiles: 3,466,407 (12.5%) send_not_optimized_method_type: 72,318 ( 0.3%) send_without_block_cfunc_array_variadic: 15,134 ( 0.1%) obj_to_string_not_string: 9,918 ( 0.0%) send_without_block_direct_too_many_args: 629 ( 0.0%) Top-9 unhandled YARV insns (100.0% of total 690,482): expandarray: 328,490 (47.6%) checkkeyword: 190,694 (27.6%) getclassvariable: 59,901 ( 8.7%) invokesuperforward: 49,503 ( 7.2%) getblockparam: 48,651 ( 7.0%) opt_duparray_send: 11,978 ( 1.7%) getconstant: 952 ( 0.1%) checkmatch: 290 ( 0.0%) once: 23 ( 0.0%) Top-3 compile error reasons (100.0% of total 3,752,391): register_spill_on_alloc: 3,457,680 (92.1%) register_spill_on_ccall: 176,348 ( 4.7%) exception_handler: 118,363 ( 3.2%) Top-14 side exit reasons (100.0% of total 10,852,021): compile_error: 3,752,391 (34.6%) guard_type_failure: 2,630,877 (24.2%) guard_shape_failure: 1,917,208 (17.7%) unhandled_yarv_insn: 690,482 ( 6.4%) block_param_proxy_not_iseq_or_ifunc: 535,784 ( 4.9%) unhandled_kwarg: 421,989 ( 3.9%) patchpoint: 369,799 ( 3.4%) unknown_newarray_send: 314,786 ( 2.9%) unhandled_splat: 122,062 ( 1.1%) unhandled_hir_insn: 76,394 ( 0.7%) block_param_proxy_modified: 19,193 ( 0.2%) obj_to_string_fallback: 566 ( 0.0%) interrupt: 468 ( 0.0%) guard_type_not_failure: 22 ( 0.0%) send_count: 66,989,407 dynamic_send_count: 27,795,022 
(41.5%) optimized_send_count: 39,194,385 (58.5%) iseq_optimized_send_count: 18,060,194 (27.0%) inline_cfunc_optimized_send_count: 6,960,130 (10.4%) non_variadic_cfunc_optimized_send_count: 11,523,682 (17.2%) variadic_cfunc_optimized_send_count: 2,650,379 ( 4.0%) dynamic_getivar_count: 7,365,982 dynamic_setivar_count: 7,245,929 compiled_iseq_count: 4,795 failed_iseq_count: 449 compile_time: 846ms profile_time: 12ms gc_time: 9ms invalidation_time: 61ms vm_write_pc_count: 64,326,442 vm_write_sp_count: 62,982,524 vm_write_locals_count: 62,982,524 vm_write_stack_count: 62,982,524 vm_write_to_parent_iseq_local_count: 292,448 vm_read_from_parent_iseq_local_count: 6,471,353 code_region_bytes: 22,708,224 side_exit_count: 10,852,021 total_insn_count: 517,550,288 vm_insn_count: 162,946,459 zjit_insn_count: 354,603,829 ratio_in_zjit: 68.5% ``` all stats after: ``` ***ZJIT: Printing ZJIT statistics on exit*** Top-20 not inlined C methods (71.1% of total 15,575,343): Hash#[]: 4,519,778 (29.0%) Kernel#is_a?: 1,030,758 ( 6.6%) String#<<: 851,931 ( 5.5%) Hash#[]=: 742,938 ( 4.8%) Regexp#match?: 399,886 ( 2.6%) String#empty?: 353,775 ( 2.3%) Hash#key?: 349,127 ( 2.2%) String#start_with?: 334,961 ( 2.2%) Kernel#respond_to?: 316,529 ( 2.0%) ObjectSpace::WeakKeyMap#[]: 238,978 ( 1.5%) TrueClass#===: 235,771 ( 1.5%) FalseClass#===: 231,144 ( 1.5%) Array#include?: 211,380 ( 1.4%) Hash#fetch: 204,701 ( 1.3%) Kernel#block_given?: 181,792 ( 1.2%) ActiveSupport::OrderedOptions#_get: 181,272 ( 1.2%) Kernel#dup: 179,341 ( 1.2%) BasicObject#!=: 175,997 ( 1.1%) Class#new: 168,079 ( 1.1%) Kernel#kind_of?: 165,600 ( 1.1%) Top-20 not annotated C methods (71.6% of total 15,737,486): Hash#[]: 4,519,788 (28.7%) Kernel#is_a?: 1,212,649 ( 7.7%) String#<<: 851,931 ( 5.4%) Hash#[]=: 743,117 ( 4.7%) Regexp#match?: 399,886 ( 2.5%) String#empty?: 361,013 ( 2.3%) Hash#key?: 349,127 ( 2.2%) String#start_with?: 334,961 ( 2.1%) Kernel#respond_to?: 316,529 ( 2.0%) ObjectSpace::WeakKeyMap#[]: 238,978 ( 1.5%) 
TrueClass#===: 235,771 ( 1.5%) FalseClass#===: 231,144 ( 1.5%) Array#include?: 211,380 ( 1.3%) Hash#fetch: 204,701 ( 1.3%) Kernel#block_given?: 191,661 ( 1.2%) ActiveSupport::OrderedOptions#_get: 181,272 ( 1.2%) Kernel#dup: 179,348 ( 1.1%) BasicObject#!=: 176,181 ( 1.1%) Class#new: 168,079 ( 1.1%) Kernel#kind_of?: 165,634 ( 1.1%) Top-2 not optimized method types for send (100.0% of total 72,318): cfunc: 48,055 (66.4%) iseq: 24,263 (33.6%) Top-6 not optimized method types for send_without_block (100.0% of total 4,523,650): iseq: 2,271,911 (50.2%) bmethod: 985,636 (21.8%) optimized: 949,696 (21.0%) alias: 310,747 ( 6.9%) null: 5,106 ( 0.1%) cfunc: 554 ( 0.0%) Top-13 not optimized instructions (100.0% of total 4,293,126): invokesuper: 2,373,395 (55.3%) invokeblock: 811,894 (18.9%) sendforward: 505,449 (11.8%) opt_eq: 451,754 (10.5%) opt_plus: 74,403 ( 1.7%) opt_minus: 36,228 ( 0.8%) opt_send_without_block: 21,792 ( 0.5%) opt_neq: 7,231 ( 0.2%) opt_mult: 6,752 ( 0.2%) opt_or: 3,753 ( 0.1%) opt_lt: 348 ( 0.0%) opt_ge: 91 ( 0.0%) opt_gt: 36 ( 0.0%) Top-9 send fallback reasons (100.0% of total 25,824,512): send_without_block_polymorphic: 9,721,725 (37.6%) send_no_profiles: 5,894,761 (22.8%) send_without_block_not_optimized_method_type: 4,523,650 (17.5%) not_optimized_instruction: 4,293,126 (16.6%) send_without_block_no_profiles: 1,293,404 ( 5.0%) send_not_optimized_method_type: 72,318 ( 0.3%) send_without_block_cfunc_array_variadic: 15,134 ( 0.1%) obj_to_string_not_string: 9,765 ( 0.0%) send_without_block_direct_too_many_args: 629 ( 0.0%) Top-9 unhandled YARV insns (100.0% of total 690,482): expandarray: 328,490 (47.6%) checkkeyword: 190,694 (27.6%) getclassvariable: 59,901 ( 8.7%) invokesuperforward: 49,503 ( 7.2%) getblockparam: 48,651 ( 7.0%) opt_duparray_send: 11,978 ( 1.7%) getconstant: 952 ( 0.1%) checkmatch: 290 ( 0.0%) once: 23 ( 0.0%) Top-3 compile error reasons (100.0% of total 3,752,504): register_spill_on_alloc: 3,457,793 (92.1%) register_spill_on_ccall: 
176,348 ( 4.7%) exception_handler: 118,363 ( 3.2%) Top-14 side exit reasons (100.0% of total 10,860,754): compile_error: 3,752,504 (34.6%) guard_type_failure: 2,638,901 (24.3%) guard_shape_failure: 1,917,198 (17.7%) unhandled_yarv_insn: 690,482 ( 6.4%) block_param_proxy_not_iseq_or_ifunc: 535,785 ( 4.9%) unhandled_kwarg: 421,947 ( 3.9%) patchpoint: 370,447 ( 3.4%) unknown_newarray_send: 314,786 ( 2.9%) unhandled_splat: 122,065 ( 1.1%) unhandled_hir_insn: 76,395 ( 0.7%) block_param_proxy_modified: 19,193 ( 0.2%) obj_to_string_fallback: 566 ( 0.0%) interrupt: 463 ( 0.0%) guard_type_not_failure: 22 ( 0.0%) send_count: 66,945,926 dynamic_send_count: 25,824,512 (38.6%) optimized_send_count: 41,121,414 (61.4%) iseq_optimized_send_count: 18,587,430 (27.8%) inline_cfunc_optimized_send_count: 6,958,641 (10.4%) non_variadic_cfunc_optimized_send_count: 12,911,166 (19.3%) variadic_cfunc_optimized_send_count: 2,664,177 ( 4.0%) dynamic_getivar_count: 7,365,985 dynamic_setivar_count: 7,245,942 compiled_iseq_count: 4,794 failed_iseq_count: 450 compile_time: 852ms profile_time: 13ms gc_time: 11ms invalidation_time: 63ms vm_write_pc_count: 64,284,194 vm_write_sp_count: 62,940,427 vm_write_locals_count: 62,940,427 vm_write_stack_count: 62,940,427 vm_write_to_parent_iseq_local_count: 292,447 vm_read_from_parent_iseq_local_count: 6,470,931 code_region_bytes: 23,019,520 side_exit_count: 10,860,754 total_insn_count: 517,576,267 vm_insn_count: 163,188,187 zjit_insn_count: 354,388,080 ratio_in_zjit: 68.5% ```
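As a quick check on the headline numbers in this message, the drop in `send_without_block_no_profiles` from 3,466,375 to 1,293,375 is 2,173,000 fewer fallbacks, a bit under 63%. The helper below is only an illustrative sketch for this arithmetic; `percent_reduction` is a made-up name and not part of ZJIT.

```rust
/// Percentage drop from `before` to `after`, both raw counter values.
/// Hypothetical helper for illustration; not part of ZJIT's stats code.
pub fn percent_reduction(before: u64, after: u64) -> f64 {
    if before == 0 {
        // Avoid dividing by zero when the counter was never hit.
        return 0.0;
    }
    // saturating_sub keeps the result at 0 if the counter went up instead.
    before.saturating_sub(after) as f64 / before as f64 * 100.0
}
```

Calling `percent_reduction(3_466_375, 1_293_375)` gives the relative improvement quoted above.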
2025-10-09ZJIT: Profile opt_aref (#14778)Aiden Fox Ivey
* ZJIT: Profile opt_aref
* ZJIT: Add test for opt_aref
* ZJIT: Move test and add hash opt test
* ZJIT: Update zjit bindgen
* ZJIT: Add inspect calls to opt_aref tests
2025-10-08ZJIT: Use type alias for num-profile and call-threshold's types (#14777)Stan Lo
Co-authored-by: Alan Wu <XrXr@users.noreply.github.com>
2025-10-01ZJIT: Allow higher profile num (#14698)Stan Lo
When we investigate guard failure issues, we sometimes need to use profile num around 100k (e.g. `lobsters` in ruby-bench). This change is to allow that.
2025-09-17ZJIT: Revert documentation indent (#14580)Aiden Fox Ivey
2025-09-09ZJIT: Optimize `ObjToString` with type guards (#14469)André Luiz Tiago Soares
* failing test for ObjToString optimization with GuardType
* profile ObjToString receiver and rewrite with guard
* adjust integration tests for objtostring type guard optimization
* Implement new GuardTypeNot HIR; objtostring sends to_s directly on profiled nonstrings
* codegen for GuardTypeNot
* typo fixes
* better name for tests; fix side exit reason for GuardTypeNot
* revert accidental change
* make bindgen
* Fix is_string to identify subclasses of String; fix codegen for identifying if val is String
2025-09-03ZJIT: Ensure `clippy` passes and silence unnecessary warnings (#14439)Aiden Fox Ivey
2025-09-03ZJIT: Add missing module doc commentsAiden Fox Ivey
2025-08-29ZJIT: Specialize monomorphic GetIvar (#14388)Max Bernstein
Specialize monomorphic `GetIvar` into:

* `GuardType(HeapObject)`
* `GuardShape`
* `LoadIvarEmbedded` or `LoadIvarExtended`

This requires profiling self for `getinstancevariable` (it's not on the operand stack).

This also optimizes `GetIvar`s that happen as a result of inlining `attr_reader` and `attr_accessor`.

Also move some (newly) shared JIT helpers into jit.c.
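The rewrite described in this message can be pictured roughly as follows. This is a simplified sketch, not ZJIT's actual HIR: the `Insn` enum, the `SelfProfile` struct, its `is_t_object` and `is_embedded` fields, and `specialize_getivar` are all hypothetical names chosen to mirror the wording of the commit message.

```rust
/// Hypothetical, simplified HIR instructions; names only mirror the commit message.
#[derive(Debug, PartialEq)]
pub enum Insn {
    /// Stand-in for `GuardType(HeapObject)`.
    GuardHeapObject,
    GuardShape(u32),
    LoadIvarEmbedded { index: usize },
    LoadIvarExtended { index: usize },
    /// Generic, unspecialized instance variable lookup.
    GetIvar,
}

/// Hypothetical summary of what the profiler recorded for `self`
/// at a monomorphic `getinstancevariable` site.
pub struct SelfProfile {
    pub shape: u32,
    pub is_t_object: bool,
    pub is_embedded: bool,
}

/// Pick the instruction sequence for one ivar read.
/// `None` means there is no usable monomorphic profile.
pub fn specialize_getivar(profile: Option<&SelfProfile>, index: usize) -> Vec<Insn> {
    match profile {
        Some(p) if p.is_t_object => {
            // The embedded bit decides where the ivar lives.
            let load = if p.is_embedded {
                Insn::LoadIvarEmbedded { index }
            } else {
                Insn::LoadIvarExtended { index }
            };
            vec![Insn::GuardHeapObject, Insn::GuardShape(p.shape), load]
        }
        // Anything else keeps the generic instruction.
        _ => vec![Insn::GetIvar],
    }
}
```

The two guards come first so that the load only runs when the object still has the profiled shape; a failed guard would side-exit to the interpreter.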
2025-08-28ZJIT: Track if object is a T_OBJECTMax Bernstein
We will (for now) only cache ivar reads from T_OBJECTs.
2025-08-28ZJIT: Track object embedded bitMax Bernstein
This lets us know where to look for an ivar: in the object or indirect elsewhere in the heap.
2025-08-27ZJIT: Specialize some Sends (#14363)Max Bernstein
* ZJIT: Profile and specialize Array#empty?
* ZJIT: Specialize BasicObject#==
* ZJIT: Specialize Hash#empty?
* ZJIT: Specialize BasicObject#!

Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
2025-08-18Don't allow looking at the shape ID of immediates (#14266)Max Bernstein
It only makes sense for heap objects.
2025-08-11ZJIT: Add compile/profile/GC/invalidation time stats (#14158)Takashi Kokubun
Co-authored-by: Stan Lo <stan001212@gmail.com>
2025-08-06ZJIT: Implement SingleRactorMode invalidation (#14121)Stan Lo
* ZJIT: Implement SingleRactorMode invalidation
* ZJIT: Add macro for compiling jumps
* ZJIT: Fix typo in comment
* YJIT: Fix typo in comment
* ZJIT: Avoid using unexported types in zjit.h

`enum ruby_vminsn_type` is declared in `insns.inc` and is not exported. Using it in `zjit.h` would cause build errors when the file including it doesn't include `insns.inc`.
2025-08-05ZJIT: Profile type+shape distributions (#13901)Max Bernstein
ZJIT uses the interpreter to take type profiles of what objects pass through the code. It stores a compressed record of the history per opcode for the opcodes we select.

Before this change, we re-used the HIR Type data-structure, a shallow type lattice, to store historical type information. This was quick for bringup but is quite lossy as profiles go: we get one bit per built-in type seen, and if we see a non-built-in type in addition, we end up with BasicObject. Not very helpful. Additionally, it does not give us any notion of cardinality: how many of each type did we see?

This change brings with it a much more interesting slice of type history: a histogram. A Distribution holds a record of the top-N (where N is fixed at Ruby compile-time) `(Class, ShapeId)` pairs and their counts. It also holds an *other* count in case we see more than N pairs.

Using this distribution, we can make more informed decisions about when we should use type information. We can determine if we are strictly monomorphic, very nearly monomorphic, or something else. Maybe the call-site is polymorphic, so we should have a polymorphic inline cache. Exciting stuff.

I also plumb this new distribution into the HIR part of the compilation pipeline.
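A minimal sketch of the histogram idea, under stated assumptions: `ClassId` and `ShapeId` are hypothetical stand-ins for the real VM types, the method names are invented for illustration, and this version simply keeps the first N distinct pairs it sees rather than reproducing whatever ordering the real `Distribution` maintains.

```rust
/// Hypothetical stand-ins for the VM's class and shape identifiers.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct ClassId(pub u32);
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct ShapeId(pub u32);

/// Fixed-capacity histogram of observed `(class, shape)` pairs.
/// N is a compile-time constant, as in the commit message.
#[derive(Debug)]
pub struct Distribution<const N: usize> {
    buckets: [Option<(ClassId, ShapeId)>; N],
    counts: [usize; N],
    /// Observations that did not fit into any of the N buckets.
    other: usize,
}

impl<const N: usize> Distribution<N> {
    pub fn new() -> Self {
        Self { buckets: [None; N], counts: [0; N], other: 0 }
    }

    /// Record one observation. Buckets fill left to right and are never
    /// removed, so the first empty slot means the pair has not been seen yet.
    pub fn observe(&mut self, class: ClassId, shape: ShapeId) {
        let key = (class, shape);
        for i in 0..N {
            match self.buckets[i] {
                Some(k) if k == key => {
                    self.counts[i] += 1;
                    return;
                }
                None => {
                    self.buckets[i] = Some(key);
                    self.counts[i] = 1;
                    return;
                }
                Some(_) => {}
            }
        }
        self.other += 1;
    }

    fn distinct(&self) -> usize {
        self.buckets.iter().filter(|b| b.is_some()).count()
    }

    pub fn other(&self) -> usize {
        self.other
    }

    /// Exactly one pair seen and nothing overflowed.
    pub fn is_monomorphic(&self) -> bool {
        self.other == 0 && self.distinct() == 1
    }

    /// A small, fully recorded set of pairs: a candidate for a
    /// polymorphic inline cache.
    pub fn is_polymorphic(&self) -> bool {
        self.other == 0 && self.distinct() > 1
    }
}
```

Unlike a one-bit-per-type lattice, the counts make it possible to distinguish "always this shape" from "this shape plus a handful of stragglers", which is the distinction the message calls very nearly monomorphic.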
2025-07-17ZJIT: Precise GC writebarriersJohn Hawthorn
This issues writebarriers for objects added via gc_offsets or by profiling. This may be slower than writebarrier_remember, but we would like it to be more debuggable.

Co-authored-by: Max Bernstein <ruby@bernsteinbear.com>
Co-authored-by: Stan Lo <stan001212@gmail.com>
2025-07-16ZJIT: Add missing write barrier in profiling (GH-13922)Alan Wu
Fixes `TestZJIT::test_require_rubygems`. It was crashing locally due to false collection of a live object. See <https://alanwu.space/post/write-barrier/>.

Co-authored-by: Max Bernstein <max@bernsteinbear.com>
Co-authored-by: Takashi Kokubun <takashi.kokubun@shopify.com>
Co-authored-by: Stan Lo <stan.lo@shopify.com>
2025-07-16ZJIT: Remove dead have_two_fixnums function (#13913)Max Bernstein
2025-07-16ZJIT: Profile each instruction at most num_profiles times (#13903)Takashi Kokubun
* ZJIT: Profile each instruction at most num_profiles times
* Use saturating_add for num_profiles
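A capped profiling counter could look roughly like the sketch below. `ProfileCounter`, its fields, and the `NumProfiles` alias are hypothetical names for illustration, and `u8` is an assumed width, not necessarily the one ZJIT uses.

```rust
/// Assumed counter width for this sketch; a type alias makes it easy to widen.
pub type NumProfiles = u8;

/// Hypothetical per-instruction profiling budget.
pub struct ProfileCounter {
    /// How many times this instruction has been profiled so far.
    num_profiles: NumProfiles,
    /// Maximum number of profiles to take.
    limit: NumProfiles,
}

impl ProfileCounter {
    pub fn new(limit: NumProfiles) -> Self {
        Self { num_profiles: 0, limit }
    }

    /// Returns true if the caller should profile this execution,
    /// and counts it against the budget.
    pub fn record(&mut self) -> bool {
        if self.num_profiles >= self.limit {
            return false;
        }
        // saturating_add cannot wrap around to 0, even at the type's maximum.
        self.num_profiles = self.num_profiles.saturating_add(1);
        true
    }

    pub fn count(&self) -> NumProfiles {
        self.num_profiles
    }
}
```

With a wrapping add, a counter at its maximum value would roll over to zero and the instruction would start being profiled again; saturation keeps the counter pinned at the top.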
2025-07-11ZJIT: Mark objects baked in JIT code (#13862)Takashi Kokubun
2025-07-11ZJIT: Use Vec instead of HashMap for profiling (#13809)Max Bernstein
This is notably faster: no need to hash indices.

Before:

```
plum% samply record ~/.rubies/ruby-zjit/bin/ruby --zjit benchmarks/getivar.rb
ruby 3.5.0dev (2025-07-10T14:40:49Z master 51252ef8d7) +ZJIT dev +PRISM [arm64-darwin24]
itr: time
#1: 5311ms
#2: 49ms
#3: 49ms
#4: 48ms
```

After:

```
plum% samply record ~/.rubies/ruby-zjit/bin/ruby --zjit benchmarks/getivar.rb
ruby 3.5.0dev (2025-07-10T15:09:06Z mb-benchmark-compile 42ffd3c1ee) +ZJIT dev +PRISM [arm64-darwin24]
itr: time
#1: 1332ms
#2: 49ms
#3: 48ms
#4: 48ms
```
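The idea of replacing a hash map keyed by instruction index with a plain vector can be sketched as follows. `Profiles`, `Entry`, and the method names are hypothetical and only illustrate the data-structure choice, assuming the number of instruction slots is known when the table is created.

```rust
/// Hypothetical per-instruction profile payload.
#[derive(Clone, Debug, Default, PartialEq)]
pub struct Entry {
    pub hits: u32,
}

/// Profile table indexed directly by instruction index.
/// Lookup is a bounds check plus an offset, with no hashing involved.
pub struct Profiles {
    by_insn_idx: Vec<Option<Entry>>,
}

impl Profiles {
    /// `iseq_size` is the number of instruction slots to reserve.
    pub fn new(iseq_size: usize) -> Self {
        Self { by_insn_idx: vec![None; iseq_size] }
    }

    /// Get or create the entry for an instruction.
    /// Panics if `insn_idx` is out of range, like any slice index.
    pub fn entry_mut(&mut self, insn_idx: usize) -> &mut Entry {
        self.by_insn_idx[insn_idx].get_or_insert_with(Entry::default)
    }

    /// Read-only lookup; returns None for unprofiled or out-of-range indices.
    pub fn get(&self, insn_idx: usize) -> Option<&Entry> {
        self.by_insn_idx.get(insn_idx).and_then(|e| e.as_ref())
    }
}
```

The trade-off is memory: the vector reserves a slot for every instruction index, including those that are never profiled, in exchange for constant-time access without hashing.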
2025-07-09ZJIT: Mark profiled objects when marking ISEQ (#13784)Takashi Kokubun
2025-07-09ZJIT: Profile `opt_and` and `opt_or` instructionsStan Lo
2025-07-08ZJIT: Profile `nil?` callsStan Lo
This allows ZJIT to profile `nil?` calls and create type guards for its receiver.

- Add `zjit_profile` to `opt_nil_p` insn
- Start profiling `opt_nil_p` calls
- Use `runtime_exact_ruby_class` instead of `exact_ruby_class` to determine the profiled receiver class
2025-04-18Implement JIT-to-JIT calls (https://github.com/Shopify/zjit/pull/109)Takashi Kokubun
* Implement JIT-to-JIT calls
* Use a closer dummy address for Arm64
* Revert an obsoleted change
* Revert a few more obsoleted changes
* Fix outdated comments
* Explain PosMarkers for CCall
* s/JIT code/machine code/
* Get rid of ParallelMov

Notes: Merged: https://github.com/ruby/ruby/pull/13131
2025-04-18Add profiling for opt_send_without_blockAlan Wu
Split out from the CCall changes since we discussed during pairing that this is useful to unblock some other changes. No tests since no one consumes this profiling data yet.

Notes: Merged: https://github.com/ruby/ruby/pull/13131
2025-04-18Implement Insn::Param using the SP register (https://github.com/Shopify/zjit/pull/39)Takashi Kokubun
Notes: Merged: https://github.com/ruby/ruby/pull/13131
2025-04-18Rename Top to Any and Bottom to EmptyMax Bernstein
Top/Bottom can be unintuitive or ambiguous.

Notes: Merged: https://github.com/ruby/ruby/pull/13131
2025-04-18Profile instructions for fixnum arithmetic (https://github.com/Shopify/zjit/pull/24)Takashi Kokubun
* Profile instructions for fixnum arithmetic
* Drop PartialEq from Type
* Do not push PatchPoint onto the stack
* Avoid pushing the output of the guards
* Pop operands after guards
* Test HIR from profiled runs
* Implement Display for new instructions
* Drop unused FIXNUM_BITS
* Use a Rust function to split lines
* Use Display for GuardType operands

Co-authored-by: Max Bernstein <max@bernsteinbear.com>

* Fix tests with Display-ed values

---------

Co-authored-by: Max Bernstein <max@bernsteinbear.com>

Notes: Merged: https://github.com/ruby/ruby/pull/13131
2025-04-18Add zjit_* instructions to profile the interpreter (https://github.com/Shopify/zjit/pull/16)Takashi Kokubun
* Add zjit_* instructions to profile the interpreter
* Rename FixnumPlus to FixnumAdd
* Update a comment about Invalidate
* Rename Guard to GuardType
* Rename Invalidate to PatchPoint
* Drop unneeded debug!()
* Plan on profiling the types
* Use the output of GuardType as type refined outputs

Notes: Merged: https://github.com/ruby/ruby/pull/13131