path: root/zjit/src/stats.rs
Age  Commit message  Author
28 hoursZJIT: Optimize common `invokesuper` cases (#15816)Kevin Menard
* ZJIT: Profile `invokesuper` instructions
* ZJIT: Introduce the `InvokeSuperDirect` HIR instruction
  The new instruction is an optimized version of `InvokeSuper` when we know the `super` target is an ISEQ.
* ZJIT: Expand definition of unspecializable to more complex cases
* ZJIT: Ensure `invokesuper` optimization works when the inheritance hierarchy is modified
* ZJIT: Simplify `invokesuper` specialization to most common case
  Looking at ruby-bench, most `super` calls don't pass a block, which means we can use the already optimized `SendWithoutBlockDirect`.
* ZJIT: Track `super` method entries directly to avoid GC issues
  Because the method entry isn't typed as a `VALUE`, we set up barriers on its `VALUE` fields. But that was insufficient, as the method entry itself could be collected in certain cases, resulting in dangling objects. Now we track the method entry as a `VALUE` and can more naturally mark it and its children.
* ZJIT: Optimize `super` calls with simple argument forms
* ZJIT: Report the reason why we can't optimize an `invokesuper` instance
* ZJIT: Revise send fallback reasons for `super` calls
* ZJIT: Assert `super` calls are `FCALL` and don't need visibility checks
30 hoursZJIT: Add assume_no_singleton_classes to avoid invalidation loops (#15871)Max Bernstein
Make sure we check if we have seen a singleton for this class before assuming we have not. Port the API from YJIT.
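The ordering described above (consult the "already seen" record first, only then register the assumption) can be sketched in a few lines of Rust. This is an illustration only: `Invariants`, `ClassId`, and both set fields are hypothetical names, and only the method name `assume_no_singleton_classes` comes from the commit title. The real ZJIT/YJIT bookkeeping is not shown on this page.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for a reference to a Ruby class.
type ClassId = usize;

/// Toy invariant tracker (an assumption of this sketch, not ZJIT's real type).
#[derive(Default)]
struct Invariants {
    /// Classes for which a singleton class has already been observed.
    seen_singleton: HashSet<ClassId>,
    /// Classes that compiled code currently assumes have no singleton class.
    assumed_no_singleton: HashSet<ClassId>,
}

impl Invariants {
    /// Record that a singleton class was created for `klass`.
    /// Returns true when compiled code relied on the opposite assumption
    /// and would therefore need to be invalidated.
    fn on_singleton_created(&mut self, klass: ClassId) -> bool {
        self.seen_singleton.insert(klass);
        self.assumed_no_singleton.remove(&klass)
    }

    /// Try to assume that `klass` has no singleton class.
    /// Refuses when one was seen before, so the compiler can emit a
    /// slower path instead of an assumption that is invalidated again.
    fn assume_no_singleton_classes(&mut self, klass: ClassId) -> bool {
        if self.seen_singleton.contains(&klass) {
            return false;
        }
        self.assumed_no_singleton.insert(klass);
        true
    }
}
```

Without the `seen_singleton` check, a class that already has a singleton would be assumed clean on every recompile and invalidated again, which is the loop the commit title refers to.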
7 daysZJIT: Replace GuardShape with LoadField+GuardBitEquals (#15821)Max Bernstein
GuardShape is just load+guard, so use the existing HIR instructions for load+guard. Probably makes future analysis slightly easier.
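As a rough picture of the rewrite, the sketch below lowers one shape guard into a load followed by a bit-equality guard. The instruction names mirror the commit message, but the enum layout, the `dst` value id, and the `"shape_id"` field name are assumptions made for the example and are not taken from the ZJIT source.

```rust
/// Toy HIR instructions; field layout is hypothetical.
#[derive(Debug, Clone, PartialEq)]
enum Insn {
    /// Load `field` from object `obj` into value `dst`.
    LoadField { dst: usize, obj: usize, field: &'static str },
    /// Side-exit unless value `val` is bit-for-bit equal to `expected`.
    GuardBitEquals { val: usize, expected: u64 },
}

/// Express "guard that `obj` has shape `shape`" with the two generic
/// instructions. `fresh_id` is the value id given to the loaded shape.
fn lower_guard_shape(obj: usize, shape: u32, fresh_id: usize) -> [Insn; 2] {
    [
        Insn::LoadField { dst: fresh_id, obj, field: "shape_id" },
        Insn::GuardBitEquals { val: fresh_id, expected: u64::from(shape) },
    ]
}
```

Once the guard is spelled this way, any pass that already understands loads and bit-equality guards applies to shape checks without a special case, which is presumably what "makes future analysis slightly easier" refers to.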
8 daysZJIT: Add ArrayAset instruction to HIR (#15747)Nozomi Hijikata
Inline `Array#[]=` into `ArrayAset`.
2025-12-11ZJIT: Check method visibility when optimizing sends (#15501)Max Bernstein
Fix https://github.com/Shopify/ruby/issues/874
2025-12-10ZJIT: Re-compile ISEQs invalidated by PatchPoint (#15459)Takashi Kokubun
2025-12-10ZJIT: Use inline format args (#15482)Alex Rocha
2025-12-09ZJIT: Handle caller_kwarg in direct send when all keyword params are requiredRandy Stauner
2025-12-09ZJIT: Add codegen for FixnumDiv (#15452)Abrar Habib
Fixes https://github.com/Shopify/ruby/issues/902 This pull request adds code generation for dividing fixnums. Testing confirms the normal case, flooring, and side-exiting on division by zero.
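The three behaviours named above (normal case, flooring, side exit on division by zero) can be modelled in plain Rust. This is a semantic sketch, not the generated machine code: `fixnum_div` is a hypothetical helper, `i64` stands in for a fixnum, and `None` stands in for a side exit to the interpreter.

```rust
/// Floor division with Ruby's `Integer#/` rounding; `None` models a side exit.
fn fixnum_div(lhs: i64, rhs: i64) -> Option<i64> {
    if rhs == 0 {
        // Division by zero: leave JIT code so the interpreter can raise.
        return None;
    }
    // `checked_div` also rejects the one overflowing case, i64::MIN / -1.
    let quotient = lhs.checked_div(rhs)?;
    // Rust truncates toward zero; Ruby rounds toward negative infinity.
    // Adjust when the signs differ and the division is inexact.
    if lhs % rhs != 0 && ((lhs < 0) != (rhs < 0)) {
        Some(quotient - 1)
    } else {
        Some(quotient)
    }
}
```

For example, `fixnum_div(-7, 2)` yields `Some(-4)`, matching Ruby's `-7 / 2`, while `fixnum_div(1, 0)` yields `None`.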
2025-12-03ZJIT: Optimize setivar with shape transition (#15375)Max Bernstein
Since we do a decent job of pre-sizing objects, don't handle the case where we would need to re-size an object. Also don't handle too-complex shapes.

lobsters stats before:

```
Top-20 calls to C functions from JIT code (79.4% of total 90,051,140):
  rb_vm_opt_send_without_block: 19,762,433 (21.9%)
  rb_vm_setinstancevariable: 7,698,314 ( 8.5%)
  rb_hash_aref: 6,767,461 ( 7.5%)
  rb_vm_env_write: 5,373,080 ( 6.0%)
  rb_vm_send: 5,049,229 ( 5.6%)
  rb_vm_getinstancevariable: 4,535,259 ( 5.0%)
  rb_obj_is_kind_of: 3,746,306 ( 4.2%)
  rb_ivar_get_at_no_ractor_check: 3,745,237 ( 4.2%)
  rb_vm_invokesuper: 3,037,467 ( 3.4%)
  rb_ary_entry: 2,351,983 ( 2.6%)
  rb_vm_opt_getconstant_path: 1,344,740 ( 1.5%)
  rb_vm_invokeblock: 1,184,474 ( 1.3%)
  Hash#[]=: 1,064,288 ( 1.2%)
  rb_gc_writebarrier: 1,006,972 ( 1.1%)
  rb_ec_ary_new_from_values: 902,687 ( 1.0%)
  fetch: 898,667 ( 1.0%)
  rb_str_buf_append: 833,787 ( 0.9%)
  rb_class_allocate_instance: 822,024 ( 0.9%)
  Hash#fetch: 699,580 ( 0.8%)
  _bi20: 682,068 ( 0.8%)
Top-4 setivar fallback reasons (100.0% of total 7,732,326):
  shape_transition: 6,032,109 (78.0%)
  not_monomorphic: 1,469,300 (19.0%)
  not_t_object: 172,636 ( 2.2%)
  too_complex: 58,281 ( 0.8%)
```

lobsters stats after:

```
Top-20 calls to C functions from JIT code (79.0% of total 88,322,656):
  rb_vm_opt_send_without_block: 19,777,880 (22.4%)
  rb_hash_aref: 6,771,589 ( 7.7%)
  rb_vm_env_write: 5,372,789 ( 6.1%)
  rb_gc_writebarrier: 5,195,527 ( 5.9%)
  rb_vm_send: 5,049,145 ( 5.7%)
  rb_vm_getinstancevariable: 4,538,485 ( 5.1%)
  rb_obj_is_kind_of: 3,746,241 ( 4.2%)
  rb_ivar_get_at_no_ractor_check: 3,745,172 ( 4.2%)
  rb_vm_invokesuper: 3,037,157 ( 3.4%)
  rb_ary_entry: 2,351,968 ( 2.7%)
  rb_vm_setinstancevariable: 1,703,337 ( 1.9%)
  rb_vm_opt_getconstant_path: 1,344,730 ( 1.5%)
  rb_vm_invokeblock: 1,184,290 ( 1.3%)
  Hash#[]=: 1,061,868 ( 1.2%)
  rb_ec_ary_new_from_values: 902,666 ( 1.0%)
  fetch: 898,666 ( 1.0%)
  rb_str_buf_append: 833,784 ( 0.9%)
  rb_class_allocate_instance: 821,778 ( 0.9%)
  Hash#fetch: 755,913 ( 0.9%)
Top-4 setivar fallback reasons (100.0% of total 1,703,337):
  not_monomorphic: 1,472,405 (86.4%)
  not_t_object: 172,629 (10.1%)
  too_complex: 58,281 ( 3.4%)
  new_shape_needs_extension: 22 ( 0.0%)
```

I also noticed that primitive printing in HIR was broken so I fixed that.

Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org>
2025-11-26ZJIT: Count fallback reasons for set/get/definedivar (#15324)Max Bernstein
lobsters:

```
Top-4 setivar fallback reasons (100.0% of total 7,789,008):
  shape_transition: 6,074,085 (78.0%)
  not_monomorphic: 1,484,013 (19.1%)
  not_t_object: 172,629 ( 2.2%)
  too_complex: 58,281 ( 0.7%)
Top-3 getivar fallback reasons (100.0% of total 9,348,832):
  not_t_object: 4,658,833 (49.8%)
  not_monomorphic: 4,542,316 (48.6%)
  too_complex: 147,683 ( 1.6%)
Top-3 definedivar fallback reasons (100.0% of total 366,383):
  not_monomorphic: 361,389 (98.6%)
  too_complex: 3,062 ( 0.8%)
  not_t_object: 1,932 ( 0.5%)
```

railsbench:

```
Top-3 setivar fallback reasons (100.0% of total 15,119,057):
  shape_transition: 13,760,763 (91.0%)
  not_monomorphic: 982,368 ( 6.5%)
  not_t_object: 375,926 ( 2.5%)
Top-2 getivar fallback reasons (100.0% of total 14,438,747):
  not_t_object: 7,643,870 (52.9%)
  not_monomorphic: 6,794,877 (47.1%)
Top-2 definedivar fallback reasons (100.0% of total 209,613):
  not_monomorphic: 209,526 (100.0%)
  not_t_object: 87 ( 0.0%)
```

shipit:

```
Top-3 setivar fallback reasons (100.0% of total 14,516,254):
  shape_transition: 8,613,512 (59.3%)
  not_monomorphic: 5,761,398 (39.7%)
  not_t_object: 141,344 ( 1.0%)
Top-2 getivar fallback reasons (100.0% of total 21,016,444):
  not_monomorphic: 11,313,482 (53.8%)
  not_t_object: 9,702,962 (46.2%)
Top-2 definedivar fallback reasons (100.0% of total 290,382):
  not_monomorphic: 287,755 (99.1%)
  not_t_object: 2,627 ( 0.9%)
```
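The "Top-N ... (P% of total T)" listings in these commit messages all share one shape: sort the counters, keep the largest N, and print each count with its share of the grand total. A minimal Rust sketch of that report follows. `top_n_report` is a hypothetical helper written for this page, not the printer ZJIT actually uses, and it omits the thousands separators and column alignment of the real output.

```rust
/// Render counters as a "Top-N" listing with per-entry percentages.
fn top_n_report(title: &str, counters: &[(&str, u64)], n: usize) -> String {
    // Percentages are relative to the total over all counters, not just the top N.
    let total: u64 = counters.iter().map(|&(_, count)| count).sum();
    let mut sorted: Vec<(&str, u64)> = counters.to_vec();
    sorted.sort_by(|a, b| b.1.cmp(&a.1)); // largest count first
    sorted.truncate(n);
    let shown: u64 = sorted.iter().map(|&(_, count)| count).sum();
    let pct = |part: u64| {
        if total == 0 { 0.0 } else { 100.0 * part as f64 / total as f64 }
    };
    let mut out = format!(
        "Top-{} {} ({:.1}% of total {}):\n",
        sorted.len(), title, pct(shown), total
    );
    for (name, count) in &sorted {
        out.push_str(&format!("  {}: {} ({:.1}%)\n", name, count, pct(*count)));
    }
    out
}
```

Feeding it the four lobsters setivar counters above reproduces the same percentages (78.0%, 19.1%, 2.2%, 0.7%) and the same total of 7,789,008, printed without separators.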
2025-11-21ZJIT: Inline Integer#<< for constant rhs (#15258)Max Bernstein
This is good for protoboeuf and other binary parsing
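When the shift amount is a compile-time constant, the only runtime question left is whether the result still fits in a fixnum. The sketch below models that check. It assumes a 64-bit CRuby build, where fixnums are 63-bit signed integers; `fixnum_lshift_const` is a hypothetical name, and `None` stands for "fall back to the generic path". How ZJIT actually performs the overflow check is not described on this page.

```rust
/// Fixnum range on a 64-bit build: 63-bit signed integers.
const FIXNUM_MAX: i64 = (1 << 62) - 1;
const FIXNUM_MIN: i64 = -(1 << 62);

/// `lhs << shift` for a known, non-negative `shift`; `None` when the
/// result would not fit in a fixnum.
fn fixnum_lshift_const(lhs: i64, shift: u32) -> Option<i64> {
    if shift >= 63 {
        // Too large to specialize in this sketch.
        return None;
    }
    // Widen first so the shift itself cannot overflow.
    let wide = i128::from(lhs) << shift;
    if wide < i128::from(FIXNUM_MIN) || wide > i128::from(FIXNUM_MAX) {
        None
    } else {
        Some(wide as i64)
    }
}
```

Binary parsers such as the protobuf decoding mentioned above tend to shift by small literal amounts (7, 8, 14, ...), which is the case this kind of specialization targets.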
2025-11-19ZJIT: Count all calls to C functions from generated code (#15240)Max Bernstein
lobsters: ``` Top-20 calls to C functions from JIT code (79.9% of total 97,004,883): rb_vm_opt_send_without_block: 19,874,212 (20.5%) rb_vm_setinstancevariable: 9,774,841 (10.1%) rb_ivar_get: 9,358,866 ( 9.6%) rb_hash_aref: 6,828,948 ( 7.0%) rb_vm_send: 6,441,551 ( 6.6%) rb_vm_env_write: 5,375,989 ( 5.5%) rb_vm_invokesuper: 3,037,836 ( 3.1%) Module#===: 2,562,446 ( 2.6%) rb_ary_entry: 2,354,546 ( 2.4%) Kernel#is_a?: 1,424,092 ( 1.5%) rb_vm_opt_getconstant_path: 1,344,923 ( 1.4%) Thread.current: 1,300,822 ( 1.3%) rb_zjit_defined_ivar: 1,222,613 ( 1.3%) rb_vm_invokeblock: 1,184,555 ( 1.2%) Hash#[]=: 1,061,969 ( 1.1%) rb_ary_push: 1,024,987 ( 1.1%) rb_ary_new_capa: 904,003 ( 0.9%) rb_str_buf_append: 833,782 ( 0.9%) rb_class_allocate_instance: 822,626 ( 0.8%) Hash#fetch: 755,913 ( 0.8%) ``` railsbench: ``` Top-20 calls to C functions from JIT code (74.8% of total 189,170,268): rb_vm_opt_send_without_block: 29,870,307 (15.8%) rb_vm_setinstancevariable: 17,631,199 ( 9.3%) rb_hash_aref: 16,928,890 ( 8.9%) rb_ivar_get: 14,441,240 ( 7.6%) rb_vm_env_write: 11,571,001 ( 6.1%) rb_vm_send: 11,153,457 ( 5.9%) rb_vm_invokesuper: 7,568,267 ( 4.0%) Module#===: 6,065,923 ( 3.2%) Hash#[]=: 2,842,990 ( 1.5%) rb_ary_entry: 2,766,125 ( 1.5%) rb_ary_push: 2,722,079 ( 1.4%) rb_vm_invokeblock: 2,594,398 ( 1.4%) Thread.current: 2,560,129 ( 1.4%) rb_str_getbyte: 1,965,627 ( 1.0%) Kernel#is_a?: 1,961,815 ( 1.0%) rb_vm_opt_getconstant_path: 1,863,678 ( 1.0%) rb_hash_new_with_size: 1,796,456 ( 0.9%) rb_class_allocate_instance: 1,785,043 ( 0.9%) String#empty?: 1,713,414 ( 0.9%) rb_ary_new_capa: 1,678,834 ( 0.9%) ``` shipit: ``` Top-20 calls to C functions from JIT code (83.4% of total 182,402,821): rb_vm_opt_send_without_block: 45,753,484 (25.1%) rb_ivar_get: 21,020,650 (11.5%) rb_vm_setinstancevariable: 17,528,603 ( 9.6%) rb_hash_aref: 11,892,856 ( 6.5%) rb_vm_send: 11,723,471 ( 6.4%) rb_vm_env_write: 10,434,452 ( 5.7%) Module#===: 4,225,048 ( 2.3%) rb_vm_invokesuper: 3,705,906 ( 2.0%) 
Thread.current: 3,337,603 ( 1.8%) rb_ary_entry: 3,114,378 ( 1.7%) Hash#[]=: 2,509,912 ( 1.4%) Array#empty?: 2,282,994 ( 1.3%) rb_vm_invokeblock: 2,210,511 ( 1.2%) Hash#fetch: 2,017,960 ( 1.1%) _bi20: 1,975,147 ( 1.1%) rb_zjit_defined_ivar: 1,897,127 ( 1.0%) rb_vm_opt_getconstant_path: 1,813,294 ( 1.0%) rb_ary_new_capa: 1,615,406 ( 0.9%) Kernel#is_a?: 1,567,854 ( 0.9%) rb_class_allocate_instance: 1,560,035 ( 0.9%) ``` Thanks to @eregon for the idea. Co-authored-by: Jacob Denbeaux <jacob.denbeaux@shopify.com> Co-authored-by: Alan Wu <XrXr@users.noreply.github.com>
2025-11-14ZJIT: Support JIT-to-JIT calls to callees with optional parametersAlan Wu
* Correct JIT entry points for optionals so each optional starts with nil before its initialization routine runs. Establish that `jit_entry_points[filled_opts_num]` gives the appropriate entry point
* Correct the number of HIR block parameters for each JIT entry point
* Entry points that share the same ISEQ PC get separate entries since they start with different state. No more deduplication.
* Reject post parameters. This was hidden behind the check for optionals.
* Make sure to visit every BB in iseq_to_hir(). Some weren't visited when the initialization routine for an optional terminates the block in a `SideExit`.

Remove the now impossible `FailedOptionalArguments`.
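The indexing rule above, `jit_entry_points[filled_opts_num]`, can be illustrated with a small model: the caller's argument count determines how many optionals are already filled, and every unfilled optional starts as nil. The function names and the `Slot` type below are hypothetical, with `None` playing the role of Ruby's nil. The real entry points are machine-code addresses, which this sketch does not attempt to represent.

```rust
/// A parameter slot; `None` stands in for Ruby's nil.
type Slot = Option<i64>;

/// Index into the callee's entry-point table for a call passing `argc`
/// positional arguments, or `None` when the count does not match.
fn jit_entry_index(required: usize, optional: usize, argc: usize) -> Option<usize> {
    if argc < required || argc > required + optional {
        None
    } else {
        // Number of optionals the caller filled in.
        Some(argc - required)
    }
}

/// Initial parameter values at that entry point: the passed arguments,
/// then nil for each optional whose initializer has yet to run.
fn initial_params(required: usize, optional: usize, args: &[i64]) -> Option<Vec<Slot>> {
    jit_entry_index(required, optional, args.len())?;
    let mut params: Vec<Slot> = args.iter().copied().map(Some).collect();
    params.resize(required + optional, None);
    Some(params)
}
```

For a callee shaped like `def m(a, b = 1, c = 2)`, a call with two arguments selects index 1 and starts with `[Some(a), Some(b), None]`, leaving only `c`'s initializer to run.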
2025-11-14ZJIT: Check argument count matches callee's parametersAlan Wu
2025-11-14ZJIT: Break out CFunc send fallback stats (#15193)Max Bernstein
lobsters before: ``` Top-14 instructions with uncategorized fallback reason (100.0% of total 5,583,226): invokesuper: 3,039,693 (54.4%) invokeblock: 1,181,433 (21.2%) sendforward: 572,612 (10.3%) opt_eq: 464,760 ( 8.3%) opt_plus: 169,904 ( 3.0%) opt_minus: 77,487 ( 1.4%) opt_send_without_block: 42,264 ( 0.8%) opt_gt: 12,263 ( 0.2%) opt_neq: 9,033 ( 0.2%) opt_mult: 8,384 ( 0.2%) opt_or: 4,792 ( 0.1%) opt_lt: 404 ( 0.0%) opt_and: 160 ( 0.0%) opt_ge: 37 ( 0.0%) Top-15 send fallback reasons (100.0% of total 33,316,627): send_without_block_polymorphic: 12,847,877 (38.6%) uncategorized: 5,583,226 (16.8%) one_or_more_complex_arg_pass: 4,504,446 (13.5%) send_not_optimized_method_type: 3,773,513 (11.3%) send_without_block_no_profiles: 2,663,575 ( 8.0%) send_no_profiles: 2,206,479 ( 6.6%) send_without_block_not_optimized_method_type_optimized: 742,574 ( 2.2%) send_polymorphic: 467,750 ( 1.4%) send_without_block_megamorphic: 428,364 ( 1.3%) send_without_block_direct_too_many_args: 33,097 ( 0.1%) send_without_block_cfunc_array_variadic: 22,255 ( 0.1%) obj_to_string_not_string: 19,435 ( 0.1%) send_megamorphic: 17,153 ( 0.1%) send_without_block_not_optimized_method_type: 5,922 ( 0.0%) ccall_with_frame_too_many_args: 961 ( 0.0%) ``` lobsters after: ``` Top-4 instructions with uncategorized fallback reason (100.0% of total 4,835,995): invokesuper: 3,039,692 (62.9%) invokeblock: 1,181,427 (24.4%) sendforward: 572,612 (11.8%) opt_send_without_block: 42,264 ( 0.9%) Top-17 send fallback reasons (100.0% of total 33,316,645): send_without_block_polymorphic: 12,847,879 (38.6%) uncategorized: 4,835,995 (14.5%) one_or_more_complex_arg_pass: 4,502,767 (13.5%) send_without_block_no_profiles: 2,663,578 ( 8.0%) send_not_optimized_method_type: 2,381,743 ( 7.1%) send_no_profiles: 2,206,481 ( 6.6%) send_cfunc_variadic: 1,391,775 ( 4.2%) send_without_block_operands_not_fixnum: 747,228 ( 2.2%) send_without_block_not_optimized_method_type_optimized: 742,574 ( 2.2%) send_polymorphic: 467,750 ( 1.4%) 
send_without_block_megamorphic: 428,364 ( 1.3%) send_without_block_direct_too_many_args: 33,097 ( 0.1%) send_without_block_cfunc_array_variadic: 22,255 ( 0.1%) obj_to_string_not_string: 19,440 ( 0.1%) send_megamorphic: 17,153 ( 0.1%) send_without_block_not_optimized_method_type: 7,605 ( 0.0%) ccall_with_frame_too_many_args: 961 ( 0.0%) ```
2025-11-12ZJIT: Revert patch_point_count counter (#15160)Takashi Kokubun
2025-11-10ZJIT: Rename things so that they aren't named "not_optimized_optimized" (#15135)Randy Stauner
These refer to "OptimizedMethodType" which is a subcategory of "MethodType::Optimized" so name them after the latter to avoid "not_optimized_optimized".
2025-11-10ZJIT: Split unhandled_hir_insn and unknown_newarray_send stats (#15127)Takashi Kokubun
2025-11-10ZJIT: Rename not_optimized_instruction to uncategorized_instruction (#15130)Randy Stauner
Make it more obvious that this hasn't been handled and could be broken down more.
2025-11-10ZJIT: Add patch_point_count stat (#15100)Takashi Kokubun
2025-11-10ZJIT: handle megamorphic and skewed megamorphic profiling resultsStan Lo
2025-11-07ZJIT: Specialize String#setbyte for fixnum case (#14927)Aiden Fox Ivey
2025-11-07ZJIT: Add compilation for checkkeyword (#14764)Jacob
<details> <summary>Before</summary> <br> ``` **ZJIT: Printing ZJIT statistics on exit*** Top-20 not inlined C methods (64.0% of total 3,683,424): Kernel#is_a?: 427,127 (11.6%) Hash#[]=: 426,276 (11.6%) String#start_with?: 336,245 ( 9.1%) ObjectSpace::WeakKeyMap#[]: 139,406 ( 3.8%) Hash#fetch: 127,291 ( 3.5%) String#hash: 79,259 ( 2.2%) Process.clock_gettime: 74,658 ( 2.0%) Array#any?: 74,441 ( 2.0%) Integer#==: 71,067 ( 1.9%) Kernel#dup: 68,058 ( 1.8%) Hash#key?: 62,306 ( 1.7%) Regexp#match?: 62,247 ( 1.7%) SQLite3::Statement#step: 61,172 ( 1.7%) SQLite3::Statement#done?: 61,172 ( 1.7%) Kernel#Array: 55,015 ( 1.5%) Integer#<=>: 49,127 ( 1.3%) String.new: 48,363 ( 1.3%) IO#read: 47,753 ( 1.3%) Array#include?: 43,307 ( 1.2%) Struct#initialize: 42,650 ( 1.2%) Top-3 not optimized method types for send (100.0% of total 1,022,743): iseq: 736,483 (72.0%) cfunc: 286,174 (28.0%) null: 86 ( 0.0%) Top-6 not optimized method types for send_without_block (100.0% of total 189,556): optimized_call: 115,966 (61.2%) optimized_send: 36,767 (19.4%) optimized_struct_aset: 33,788 (17.8%) null: 2,521 ( 1.3%) optimized_block_call: 510 ( 0.3%) cfunc: 4 ( 0.0%) Top-13 not optimized instructions (100.0% of total 1,648,882): invokesuper: 697,471 (42.3%) invokeblock: 496,687 (30.1%) sendforward: 221,094 (13.4%) opt_eq: 147,620 ( 9.0%) opt_minus: 40,865 ( 2.5%) opt_plus: 22,912 ( 1.4%) opt_send_without_block: 18,932 ( 1.1%) opt_gt: 867 ( 0.1%) opt_mult: 768 ( 0.0%) opt_neq: 654 ( 0.0%) opt_or: 508 ( 0.0%) opt_lt: 359 ( 0.0%) opt_ge: 145 ( 0.0%) Top-13 send fallback reasons (100.0% of total 8,308,826): send_without_block_polymorphic: 3,174,975 (38.2%) not_optimized_instruction: 1,648,882 (19.8%) fancy_call_feature: 1,072,807 (12.9%) send_not_optimized_method_type: 1,022,743 (12.3%) send_no_profiles: 599,715 ( 7.2%) send_without_block_no_profiles: 486,108 ( 5.9%) send_without_block_not_optimized_optimized_method_type: 187,031 ( 2.3%) send_polymorphic: 101,834 ( 1.2%) obj_to_string_not_string: 
7,610 ( 0.1%) send_without_block_not_optimized_method_type: 2,525 ( 0.0%) send_without_block_direct_too_many_args: 2,369 ( 0.0%) send_without_block_cfunc_array_variadic: 2,190 ( 0.0%) ccall_with_frame_too_many_args: 37 ( 0.0%) Top-8 popular unsupported argument-parameter features (100.0% of total 1,209,121): param_opt: 583,595 (48.3%) param_forwardable: 178,162 (14.7%) param_block: 162,689 (13.5%) param_kw: 150,575 (12.5%) param_rest: 90,091 ( 7.5%) param_kwrest: 33,791 ( 2.8%) caller_splat: 10,214 ( 0.8%) caller_kw_splat: 4 ( 0.0%) Top-7 unhandled YARV insns (100.0% of total 128,032): checkkeyword: 88,698 (69.3%) invokesuperforward: 22,296 (17.4%) getblockparam: 16,292 (12.7%) getconstant: 336 ( 0.3%) checkmatch: 290 ( 0.2%) setblockparam: 101 ( 0.1%) once: 19 ( 0.0%) Top-1 compile error reasons (100.0% of total 21,283): exception_handler: 21,283 (100.0%) Top-18 side exit reasons (100.0% of total 2,335,562): guard_type_failure: 677,930 (29.0%) guard_shape_failure: 410,183 (17.6%) unhandled_kwarg: 235,100 (10.1%) patchpoint_stable_constant_names: 206,172 ( 8.8%) block_param_proxy_not_iseq_or_ifunc: 199,931 ( 8.6%) patchpoint_no_singleton_class: 188,359 ( 8.1%) unhandled_yarv_insn: 128,032 ( 5.5%) unknown_newarray_send: 124,805 ( 5.3%) patchpoint_method_redefined: 73,062 ( 3.1%) unhandled_hir_insn: 56,688 ( 2.4%) compile_error: 21,283 ( 0.9%) block_param_proxy_modified: 11,647 ( 0.5%) fixnum_mult_overflow: 954 ( 0.0%) patchpoint_no_ep_escape: 813 ( 0.0%) guard_bit_equals_failure: 316 ( 0.0%) obj_to_string_fallback: 230 ( 0.0%) interrupt: 35 ( 0.0%) guard_type_not_failure: 22 ( 0.0%) send_count: 26,775,579 dynamic_send_count: 8,308,826 (31.0%) optimized_send_count: 18,466,753 (69.0%) iseq_optimized_send_count: 7,611,729 (28.4%) inline_cfunc_optimized_send_count: 5,935,290 (22.2%) inline_iseq_optimized_send_count: 657,555 ( 2.5%) non_variadic_cfunc_optimized_send_count: 3,169,054 (11.8%) variadic_cfunc_optimized_send_count: 1,093,125 ( 4.1%) dynamic_getivar_count: 
2,793,635 dynamic_setivar_count: 3,040,844 compiled_iseq_count: 4,496 failed_iseq_count: 0 compile_time: 915ms profile_time: 6ms gc_time: 6ms invalidation_time: 20ms vm_write_pc_count: 26,857,114 vm_write_sp_count: 25,770,558 vm_write_locals_count: 25,770,558 vm_write_stack_count: 25,770,558 vm_write_to_parent_iseq_local_count: 106,036 vm_read_from_parent_iseq_local_count: 3,213,992 guard_type_count: 27,683,170 guard_type_exit_ratio: 2.4% code_region_bytes: 32,178,176 side_exit_count: 2,335,562 total_insn_count: 170,714,077 vm_insn_count: 28,999,194 zjit_insn_count: 141,714,883 ratio_in_zjit: 83.0% ``` </details> <details> <summary>After</summary> <br> ``` **ZJIT: Printing ZJIT statistics on exit*** Top-20 not inlined C methods (63.9% of total 3,686,703): Kernel#is_a?: 427,123 (11.6%) Hash#[]=: 426,276 (11.6%) String#start_with?: 336,245 ( 9.1%) ObjectSpace::WeakKeyMap#[]: 139,406 ( 3.8%) Hash#fetch: 127,291 ( 3.5%) String#hash: 79,259 ( 2.1%) Process.clock_gettime: 74,658 ( 2.0%) Array#any?: 74,441 ( 2.0%) Integer#==: 71,067 ( 1.9%) Kernel#dup: 68,058 ( 1.8%) Regexp#match?: 62,336 ( 1.7%) Hash#key?: 62,306 ( 1.7%) SQLite3::Statement#step: 61,172 ( 1.7%) SQLite3::Statement#done?: 61,172 ( 1.7%) Kernel#Array: 55,048 ( 1.5%) Integer#<=>: 49,127 ( 1.3%) String.new: 48,363 ( 1.3%) IO#read: 47,753 ( 1.3%) Array#include?: 43,309 ( 1.2%) Struct#initialize: 42,650 ( 1.2%) Top-3 not optimized method types for send (100.0% of total 1,026,413): iseq: 737,496 (71.9%) cfunc: 288,831 (28.1%) null: 86 ( 0.0%) Top-6 not optimized method types for send_without_block (100.0% of total 189,556): optimized_call: 115,966 (61.2%) optimized_send: 36,767 (19.4%) optimized_struct_aset: 33,788 (17.8%) null: 2,521 ( 1.3%) optimized_block_call: 510 ( 0.3%) cfunc: 4 ( 0.0%) Top-13 not optimized instructions (100.0% of total 1,648,949): invokesuper: 697,452 (42.3%) invokeblock: 496,687 (30.1%) sendforward: 221,094 (13.4%) opt_eq: 147,620 ( 9.0%) opt_minus: 40,863 ( 2.5%) opt_plus: 22,912 ( 1.4%) 
opt_send_without_block: 19,020 ( 1.2%) opt_gt: 867 ( 0.1%) opt_mult: 768 ( 0.0%) opt_neq: 654 ( 0.0%) opt_or: 508 ( 0.0%) opt_lt: 359 ( 0.0%) opt_ge: 145 ( 0.0%) Top-13 send fallback reasons (100.0% of total 8,318,975): send_without_block_polymorphic: 3,177,471 (38.2%) not_optimized_instruction: 1,648,949 (19.8%) fancy_call_feature: 1,075,143 (12.9%) send_not_optimized_method_type: 1,026,413 (12.3%) send_no_profiles: 599,748 ( 7.2%) send_without_block_no_profiles: 486,190 ( 5.8%) send_without_block_not_optimized_optimized_method_type: 187,031 ( 2.2%) send_polymorphic: 102,497 ( 1.2%) obj_to_string_not_string: 8,412 ( 0.1%) send_without_block_not_optimized_method_type: 2,525 ( 0.0%) send_without_block_direct_too_many_args: 2,369 ( 0.0%) send_without_block_cfunc_array_variadic: 2,190 ( 0.0%) ccall_with_frame_too_many_args: 37 ( 0.0%) Top-8 popular unsupported argument-parameter features (100.0% of total 1,211,457): param_opt: 584,073 (48.2%) param_forwardable: 178,907 (14.8%) param_block: 162,689 (13.4%) param_kw: 151,688 (12.5%) param_rest: 90,091 ( 7.4%) param_kwrest: 33,791 ( 2.8%) caller_splat: 10,214 ( 0.8%) caller_kw_splat: 4 ( 0.0%) Top-6 unhandled YARV insns (100.0% of total 39,334): invokesuperforward: 22,296 (56.7%) getblockparam: 16,292 (41.4%) getconstant: 336 ( 0.9%) checkmatch: 290 ( 0.7%) setblockparam: 101 ( 0.3%) once: 19 ( 0.0%) Top-1 compile error reasons (100.0% of total 21,283): exception_handler: 21,283 (100.0%) Top-18 side exit reasons (100.0% of total 2,253,541): guard_type_failure: 682,695 (30.3%) guard_shape_failure: 410,183 (18.2%) unhandled_kwarg: 236,780 (10.5%) patchpoint_stable_constant_names: 206,310 ( 9.2%) block_param_proxy_not_iseq_or_ifunc: 199,931 ( 8.9%) patchpoint_no_singleton_class: 188,438 ( 8.4%) unknown_newarray_send: 124,805 ( 5.5%) patchpoint_method_redefined: 73,056 ( 3.2%) unhandled_hir_insn: 56,686 ( 2.5%) unhandled_yarv_insn: 39,334 ( 1.7%) compile_error: 21,283 ( 0.9%) block_param_proxy_modified: 11,647 ( 0.5%) 
fixnum_mult_overflow: 954 ( 0.0%) patchpoint_no_ep_escape: 813 ( 0.0%) guard_bit_equals_failure: 316 ( 0.0%) obj_to_string_fallback: 230 ( 0.0%) interrupt: 58 ( 0.0%) guard_type_not_failure: 22 ( 0.0%) send_count: 27,032,751 dynamic_send_count: 8,318,975 (30.8%) optimized_send_count: 18,713,776 (69.2%) iseq_optimized_send_count: 7,809,698 (28.9%) inline_cfunc_optimized_send_count: 5,980,083 (22.1%) inline_iseq_optimized_send_count: 657,677 ( 2.4%) non_variadic_cfunc_optimized_send_count: 3,170,381 (11.7%) variadic_cfunc_optimized_send_count: 1,095,937 ( 4.1%) dynamic_getivar_count: 2,793,987 dynamic_setivar_count: 3,350,905 compiled_iseq_count: 4,498 failed_iseq_count: 0 compile_time: 884ms profile_time: 6ms gc_time: 6ms invalidation_time: 19ms vm_write_pc_count: 27,417,915 vm_write_sp_count: 26,327,928 vm_write_locals_count: 26,327,928 vm_write_stack_count: 26,327,928 vm_write_to_parent_iseq_local_count: 106,036 vm_read_from_parent_iseq_local_count: 3,213,992 guard_type_count: 27,937,831 guard_type_exit_ratio: 2.4% code_region_bytes: 32,571,392 side_exit_count: 2,253,541 total_insn_count: 170,630,429 vm_insn_count: 26,617,244 zjit_insn_count: 144,013,185 ratio_in_zjit: 84.4% ``` </details>
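Two of the derived figures in these dumps can be re-computed from the raw counters. The sketch below does so under two assumptions that the dumps are consistent with but do not state outright: that `total_insn_count` is `vm_insn_count + zjit_insn_count`, and that `guard_type_exit_ratio` is the `guard_type_failure` side-exit count divided by `guard_type_count`. The function names are hypothetical.

```rust
/// Share of `part` in `whole` as a percentage; 0.0 when `whole` is zero.
fn percent(part: u64, whole: u64) -> f64 {
    if whole == 0 { 0.0 } else { 100.0 * part as f64 / whole as f64 }
}

/// ratio_in_zjit, assuming total = VM instructions + ZJIT instructions.
fn ratio_in_zjit(vm_insn_count: u64, zjit_insn_count: u64) -> f64 {
    percent(zjit_insn_count, vm_insn_count + zjit_insn_count)
}

/// guard_type_exit_ratio, assuming failures / executed guards.
fn guard_type_exit_ratio(guard_type_failure: u64, guard_type_count: u64) -> f64 {
    percent(guard_type_failure, guard_type_count)
}
```

With the "Before" counters (28,999,194 VM and 141,714,883 ZJIT instructions) this gives 83.0%, and with the "After" counters (26,617,244 and 144,013,185) it gives 84.4%, matching the printed `ratio_in_zjit` values. The guard ratio comes out at 2.4% in both runs, also matching.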
2025-11-06ZJIT: Remove obsolete register spill counters (#15089)Takashi Kokubun
2025-11-05ZJIT: Profile specific objects for invokeblock (#15051)Max Bernstein
I made a special kind of `ProfiledType` that looks at specific objects, not just their classes/shapes (https://github.com/ruby/ruby/pull/15051). Then I profiled some of our benchmarks.

For lobsters:

```
Top-6 invokeblock handler (100.0% of total 1,064,155):
  megamorphic: 494,931 (46.5%)
  monomorphic_iseq: 337,171 (31.7%)
  polymorphic: 113,381 (10.7%)
  monomorphic_ifunc: 52,260 ( 4.9%)
  monomorphic_other: 38,970 ( 3.7%)
  no_profiles: 27,442 ( 2.6%)
```

For railsbench:

```
Top-6 invokeblock handler (100.0% of total 2,529,104):
  monomorphic_iseq: 834,452 (33.0%)
  megamorphic: 818,347 (32.4%)
  polymorphic: 632,273 (25.0%)
  monomorphic_ifunc: 224,243 ( 8.9%)
  monomorphic_other: 19,595 ( 0.8%)
  no_profiles: 194 ( 0.0%)
```

For shipit:

```
Top-6 invokeblock handler (100.0% of total 2,104,148):
  megamorphic: 1,269,889 (60.4%)
  polymorphic: 411,475 (19.6%)
  no_profiles: 173,367 ( 8.2%)
  monomorphic_other: 118,619 ( 5.6%)
  monomorphic_iseq: 84,891 ( 4.0%)
  monomorphic_ifunc: 45,907 ( 2.2%)
```

Seems like a monomorphic case for a specific ISEQ actually isn't a bad way of going about this, at least to start...
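The six bucket names in these listings suggest a classification by how many distinct block handlers a call site saw and, for the single-handler case, what kind of handler it was. The sketch below shows one way such a bucketing could look. The `HandlerKind` enum and especially `POLYMORPHIC_LIMIT` are assumptions made for illustration; the page does not say where ZJIT draws the line between polymorphic and megamorphic.

```rust
/// Kind of block handler observed at an `invokeblock` site (hypothetical).
#[derive(Debug, Clone, Copy, PartialEq)]
enum HandlerKind {
    Iseq,
    Ifunc,
    Other,
}

/// Assumed cut-off between "polymorphic" and "megamorphic".
const POLYMORPHIC_LIMIT: usize = 4;

/// Map the distinct handlers seen at one call site to a counter name.
fn classify(distinct_handlers: &[HandlerKind]) -> &'static str {
    match distinct_handlers {
        [] => "no_profiles",
        [HandlerKind::Iseq] => "monomorphic_iseq",
        [HandlerKind::Ifunc] => "monomorphic_ifunc",
        [HandlerKind::Other] => "monomorphic_other",
        many if many.len() <= POLYMORPHIC_LIMIT => "polymorphic",
        _ => "megamorphic",
    }
}
```

Only the `monomorphic_iseq` bucket identifies a single known ISEQ, which is why the closing remark above singles it out as a starting point.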
2025-11-05ZJIT: Add zjit_alloc_bytes and total_mem_bytes stats (#15059)Takashi Kokubun
2025-11-05ZJIT: Track guard shape exit ratio (#15052)Randy Stauner
New ZJIT stats excerpt from liquid-runtime:

```
vm_read_from_parent_iseq_local_count: 10,909,753
guard_type_count: 45,109,441
guard_type_exit_ratio: 4.3%
guard_shape_count: 15,272,133
guard_shape_exit_ratio: 20.1%
code_region_bytes: 3,899,392
```

lobsters:

```
guard_type_count: 71,765,580
guard_type_exit_ratio: 4.3%
guard_shape_count: 21,872,560
guard_shape_exit_ratio: 8.0%
```

railsbench:

```
guard_type_count: 117,661,124
guard_type_exit_ratio: 0.7%
guard_shape_count: 28,032,665
guard_shape_exit_ratio: 5.1%
```

shipit:

```
guard_type_count: 106,195,615
guard_type_exit_ratio: 3.5%
guard_shape_count: 33,672,673
guard_shape_exit_ratio: 10.1%
```
2025-11-04ZJIT: Fallback counter rename: s/fancy/complex/Alan Wu
Kokubun brought up that "complex" is a more fitting name for what these counters count. Thanks!

Also:
- make the SendFallbackReason enum name consistent with the counter name
- rewrite the printout prompt in zjit.rb
2025-11-03ZJIT: Inline String#bytesize (#15033)Max Leopold
Inline the `String#bytesize` function and remove the C call.
2025-11-03ZJIT: Implement include_p for opt_(new|dup)array_send YARV insns (#14885)Randy Stauner
These just call to the C functions that do the optimized test but this avoids the side exit. See https://github.com/ruby/ruby/pull/12123 for the original CRuby/YJIT implementation.
2025-10-30ZJIT: Count unsupported fancy caller side featuresAlan Wu
These count caller-side features we don't support. But because we side exit when we see them through unhandled_call_type(), these new counters currently don't trigger.
2025-10-30ZJIT: Unsupported call feature accounting, and new `send_fallback_fancy_call_feature`Alan Wu
When we fall back because the callee has an unsupported signature, it was a little inaccurate to use `send_fallback_send_not_optimized_method_type`, since we do support the method type in other situations. Add a new `send_fallback_fancy_call_feature` for these situations. Also add `send_fallback_bmethod_non_iseq_proc` so we can stop using `not_optimized_method_type` completely for bmethods. Add accompanying `fancy_arg_pass_*` counters. These don't sum to the number of unoptimized calls that run, but they establish the level of support the optimizer provides for a given workload.
2025-10-30ZJIT: Split out optimized method types in stats (#15002)Max Bernstein
We can see send/block call/struct aref/... e.g. on lobsters:

```
Top-9 not optimized method types for send_without_block (100.0% of total 3,133,812):
  iseq: 2,004,557 (64.0%)
  optimized_struct_aref: 496,232 (15.8%)
  alias: 268,579 ( 8.6%)
  optimized_call: 224,883 ( 7.2%)
  optimized_send: 120,531 ( 3.8%)
  bmethod: 12,011 ( 0.4%)
  null: 4,636 ( 0.1%)
  optimized_block_call: 1,930 ( 0.1%)
  cfunc: 453 ( 0.0%)
```

railsbench:

```
Top-8 not optimized method types for send_without_block (100.0% of total 5,735,608):
  iseq: 2,854,551 (49.8%)
  optimized_struct_aref: 871,459 (15.2%)
  optimized_call: 862,185 (15.0%)
  alias: 588,486 (10.3%)
  optimized_send: 482,171 ( 8.4%)
  null: 39,942 ( 0.7%)
  bmethod: 36,784 ( 0.6%)
  cfunc: 30 ( 0.0%)
```

shipit:

```
Top-10 not optimized method types for send_without_block (100.0% of total 4,844,304):
  iseq: 2,881,206 (59.5%)
  optimized_struct_aref: 1,158,935 (23.9%)
  optimized_call: 472,898 ( 9.8%)
  alias: 208,010 ( 4.3%)
  optimized_send: 55,479 ( 1.1%)
  null: 47,273 ( 1.0%)
  bmethod: 12,608 ( 0.3%)
  optimized_block_call: 7,860 ( 0.2%)
  cfunc: 31 ( 0.0%)
  optimized_struct_aset: 4 ( 0.0%)
```
2025-10-29ZJIT: Add type checker to HIR (#14978)Max Bernstein
Allow instructions to constrain their operands' input types to avoid accidentally creating invalid HIR.
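To make the idea concrete, here is a toy checker in which each instruction declares the type it requires of every operand and a verifier compares that against the known type of each value. Everything in it (the three-element `Ty` lattice, the two instructions, the function names) is invented for illustration and is much simpler than ZJIT's actual HIR type system.

```rust
/// Toy type lattice: `Any` accepts every value.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Ty {
    Fixnum,
    Array,
    Any,
}

impl Ty {
    /// Whether a value of type `self` satisfies the `required` type.
    fn fits(self, required: Ty) -> bool {
        required == Ty::Any || self == required
    }
}

/// Toy instructions; operands are indices into a table of value types.
#[derive(Debug)]
enum Insn {
    FixnumAdd { left: usize, right: usize },
    ArrayLength { array: usize },
}

impl Insn {
    /// (operand id, required type) pairs declared by the instruction.
    fn operand_constraints(&self) -> Vec<(usize, Ty)> {
        match *self {
            Insn::FixnumAdd { left, right } => vec![(left, Ty::Fixnum), (right, Ty::Fixnum)],
            Insn::ArrayLength { array } => vec![(array, Ty::Array)],
        }
    }
}

/// Reject an instruction whose operands do not meet its constraints.
fn check(insn: &Insn, types: &[Ty]) -> Result<(), String> {
    for (operand, required) in insn.operand_constraints() {
        let actual = *types
            .get(operand)
            .ok_or_else(|| format!("v{operand} is undefined"))?;
        if !actual.fits(required) {
            return Err(format!("{insn:?}: v{operand} is {actual:?}, expected {required:?}"));
        }
    }
    Ok(())
}
```

Running such a check after each optimization pass catches a rewrite that, say, feeds an array into `FixnumAdd` at compile time instead of letting it miscompile.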
2025-10-28ZJIT: Count GuardType instructionsMax Bernstein
We can measure how many we can remove by adding type information to C functions, etc.
2025-10-28ZJIT: Specialize Array#pop for no argument case (#14933)Aiden Fox Ivey
Fixes https://github.com/Shopify/ruby/issues/814 This change specializes the case of calling `Array#pop` on a non-frozen array with no arguments. `Array#pop` exists in the non-inlined C function list in the ZJIT SFR performance burndown list. If it is helpful in the future, this patch could be extended to support the case where an argument is provided, but this initial work seeks to elide the Ruby frame normally pushed in the case of `Array#pop` without an argument.
2025-10-22ZJIT: Inline simple SendWithoutBlockDirect (#14888)Max Bernstein
Copy the YJIT simple inliner except for the kwargs bit. It works great!
2025-10-21ZJIT: Issue `SendWithoutBlockDirect` to `VM_METHOD_TYPE_BMETHOD`Alan Wu
This helps ZJIT optimize ~300,000 more sends in ruby-bench's lobsters

Top-6 not optimized method types for send_without_block
    Before                        After
    iseq:      713,899 (48.0%)    iseq:      725,668 (62.4%)
    optimized: 359,864 (24.2%)    optimized: 359,940 (31.0%)
    bmethod:   339,040 (22.8%)    alias:      73,541 ( 6.3%)
    alias:      73,392 ( 4.9%)    null:        2,521 ( 0.2%)
    null:        2,521 ( 0.2%)    bmethod:       979 ( 0.1%)
    cfunc:           4 ( 0.0%)    cfunc:           4 ( 0.0%)
2025-10-20ZJIT: Implement codegen for FixnumMod (#14857)Max Bernstein
This is mostly to see what happens to the loops-times benchmark.
2025-10-20ZJIT: Implement expandarray (#14847)Max Bernstein
Only support the simple case: no splat or rest. lobsters before: <details> ``` ***ZJIT: Printing ZJIT statistics on exit*** Top-20 not inlined C methods (60.5% of total 11,039,954): Kernel#is_a?: 1,030,769 ( 9.3%) String#<<: 851,954 ( 7.7%) Hash#[]=: 742,941 ( 6.7%) Regexp#match?: 399,894 ( 3.6%) String#empty?: 353,775 ( 3.2%) Hash#key?: 349,147 ( 3.2%) String#start_with?: 334,961 ( 3.0%) Kernel#respond_to?: 316,528 ( 2.9%) ObjectSpace::WeakKeyMap#[]: 238,978 ( 2.2%) TrueClass#===: 235,771 ( 2.1%) FalseClass#===: 231,144 ( 2.1%) Array#include?: 211,385 ( 1.9%) Hash#fetch: 204,702 ( 1.9%) Kernel#block_given?: 181,797 ( 1.6%) Kernel#dup: 179,341 ( 1.6%) BasicObject#!=: 175,997 ( 1.6%) Class#new: 168,079 ( 1.5%) Kernel#kind_of?: 165,600 ( 1.5%) String#==: 157,735 ( 1.4%) Module#clock_gettime: 144,992 ( 1.3%) Top-20 not annotated C methods (61.4% of total 11,202,087): Kernel#is_a?: 1,212,660 (10.8%) String#<<: 851,954 ( 7.6%) Hash#[]=: 743,120 ( 6.6%) Regexp#match?: 399,894 ( 3.6%) String#empty?: 361,013 ( 3.2%) Hash#key?: 349,147 ( 3.1%) String#start_with?: 334,961 ( 3.0%) Kernel#respond_to?: 316,528 ( 2.8%) ObjectSpace::WeakKeyMap#[]: 238,978 ( 2.1%) TrueClass#===: 235,771 ( 2.1%) FalseClass#===: 231,144 ( 2.1%) Array#include?: 211,385 ( 1.9%) Hash#fetch: 204,702 ( 1.8%) Kernel#block_given?: 191,666 ( 1.7%) Kernel#dup: 179,348 ( 1.6%) BasicObject#!=: 176,181 ( 1.6%) Class#new: 168,079 ( 1.5%) Kernel#kind_of?: 165,634 ( 1.5%) String#==: 163,667 ( 1.5%) Module#clock_gettime: 144,992 ( 1.3%) Top-2 not optimized method types for send (100.0% of total 72,318): cfunc: 48,055 (66.4%) iseq: 24,263 (33.6%) Top-6 not optimized method types for send_without_block (100.0% of total 4,523,682): iseq: 2,271,936 (50.2%) bmethod: 985,636 (21.8%) optimized: 949,703 (21.0%) alias: 310,747 ( 6.9%) null: 5,106 ( 0.1%) cfunc: 554 ( 0.0%) Top-13 not optimized instructions (100.0% of total 4,293,171): invokesuper: 2,373,404 (55.3%) invokeblock: 811,926 (18.9%) sendforward: 505,452 (11.8%) 
opt_eq: 451,754 (10.5%) opt_plus: 74,404 ( 1.7%) opt_minus: 36,228 ( 0.8%) opt_send_without_block: 21,792 ( 0.5%) opt_neq: 7,231 ( 0.2%) opt_mult: 6,752 ( 0.2%) opt_or: 3,753 ( 0.1%) opt_lt: 348 ( 0.0%) opt_ge: 91 ( 0.0%) opt_gt: 36 ( 0.0%) Top-9 send fallback reasons (100.0% of total 25,530,724): send_without_block_polymorphic: 9,722,491 (38.1%) send_no_profiles: 5,894,788 (23.1%) send_without_block_not_optimized_method_type: 4,523,682 (17.7%) not_optimized_instruction: 4,293,171 (16.8%) send_without_block_no_profiles: 998,746 ( 3.9%) send_not_optimized_method_type: 72,318 ( 0.3%) send_without_block_cfunc_array_variadic: 15,134 ( 0.1%) obj_to_string_not_string: 9,765 ( 0.0%) send_without_block_direct_too_many_args: 629 ( 0.0%) Top-9 unhandled YARV insns (100.0% of total 690,950): expandarray: 328,490 (47.5%) checkkeyword: 190,694 (27.6%) getclassvariable: 59,901 ( 8.7%) invokesuperforward: 49,503 ( 7.2%) getblockparam: 49,119 ( 7.1%) opt_duparray_send: 11,978 ( 1.7%) getconstant: 952 ( 0.1%) checkmatch: 290 ( 0.0%) once: 23 ( 0.0%) Top-3 compile error reasons (100.0% of total 3,718,636): register_spill_on_alloc: 3,418,255 (91.9%) register_spill_on_ccall: 182,018 ( 4.9%) exception_handler: 118,363 ( 3.2%) Top-14 side exit reasons (100.0% of total 10,860,385): compile_error: 3,718,636 (34.2%) guard_type_failure: 2,638,926 (24.3%) guard_shape_failure: 1,917,209 (17.7%) unhandled_yarv_insn: 690,950 ( 6.4%) block_param_proxy_not_iseq_or_ifunc: 535,789 ( 4.9%) unhandled_kwarg: 455,347 ( 4.2%) patchpoint: 370,476 ( 3.4%) unknown_newarray_send: 314,786 ( 2.9%) unhandled_splat: 122,071 ( 1.1%) unhandled_hir_insn: 76,397 ( 0.7%) block_param_proxy_modified: 19,193 ( 0.2%) obj_to_string_fallback: 566 ( 0.0%) guard_type_not_failure: 22 ( 0.0%) interrupt: 17 ( 0.0%) send_count: 62,244,604 dynamic_send_count: 25,530,724 (41.0%) optimized_send_count: 36,713,880 (59.0%) iseq_optimized_send_count: 18,587,512 (29.9%) inline_cfunc_optimized_send_count: 7,086,414 (11.4%) 
non_variadic_cfunc_optimized_send_count: 8,375,754 (13.5%) variadic_cfunc_optimized_send_count: 2,664,200 ( 4.3%) dynamic_getivar_count: 7,365,995 dynamic_setivar_count: 7,245,005 compiled_iseq_count: 4,796 failed_iseq_count: 447 compile_time: 814ms profile_time: 9ms gc_time: 9ms invalidation_time: 72ms vm_write_pc_count: 64,156,223 vm_write_sp_count: 62,812,449 vm_write_locals_count: 62,812,449 vm_write_stack_count: 62,812,449 vm_write_to_parent_iseq_local_count: 292,458 vm_read_from_parent_iseq_local_count: 6,599,701 code_region_bytes: 22,953,984 side_exit_count: 10,860,385 total_insn_count: 517,606,340 vm_insn_count: 162,979,530 zjit_insn_count: 354,626,810 ratio_in_zjit: 68.5% ``` </details> lobsters after: <details> ``` ***ZJIT: Printing ZJIT statistics on exit*** Top-20 not inlined C methods (59.9% of total 11,291,815): Kernel#is_a?: 1,046,269 ( 9.3%) String#<<: 851,954 ( 7.5%) Hash#[]=: 743,274 ( 6.6%) Regexp#match?: 399,894 ( 3.5%) String#empty?: 353,775 ( 3.1%) Hash#key?: 349,147 ( 3.1%) String#start_with?: 334,961 ( 3.0%) Kernel#respond_to?: 316,502 ( 2.8%) ObjectSpace::WeakKeyMap#[]: 238,978 ( 2.1%) TrueClass#===: 235,771 ( 2.1%) FalseClass#===: 231,144 ( 2.0%) String#sub!: 219,579 ( 1.9%) Array#include?: 211,385 ( 1.9%) Hash#fetch: 204,702 ( 1.8%) Kernel#block_given?: 181,797 ( 1.6%) Kernel#dup: 179,341 ( 1.6%) BasicObject#!=: 175,997 ( 1.6%) Class#new: 168,079 ( 1.5%) Kernel#kind_of?: 165,600 ( 1.5%) String#==: 157,742 ( 1.4%) Top-20 not annotated C methods (60.9% of total 11,466,928): Kernel#is_a?: 1,239,923 (10.8%) String#<<: 851,954 ( 7.4%) Hash#[]=: 743,453 ( 6.5%) Regexp#match?: 399,894 ( 3.5%) String#empty?: 361,013 ( 3.1%) Hash#key?: 349,147 ( 3.0%) String#start_with?: 334,961 ( 2.9%) Kernel#respond_to?: 316,502 ( 2.8%) ObjectSpace::WeakKeyMap#[]: 238,978 ( 2.1%) TrueClass#===: 235,771 ( 2.1%) FalseClass#===: 231,144 ( 2.0%) String#sub!: 219,579 ( 1.9%) Array#include?: 211,385 ( 1.8%) Hash#fetch: 204,702 ( 1.8%) Kernel#block_given?: 191,666 ( 
1.7%) Kernel#dup: 179,348 ( 1.6%) BasicObject#!=: 176,181 ( 1.5%) Class#new: 168,079 ( 1.5%) Kernel#kind_of?: 165,634 ( 1.4%) String#==: 163,674 ( 1.4%) Top-2 not optimized method types for send (100.0% of total 72,318): cfunc: 48,055 (66.4%) iseq: 24,263 (33.6%) Top-6 not optimized method types for send_without_block (100.0% of total 4,524,016): iseq: 2,272,269 (50.2%) bmethod: 985,636 (21.8%) optimized: 949,704 (21.0%) alias: 310,747 ( 6.9%) null: 5,106 ( 0.1%) cfunc: 554 ( 0.0%) Top-13 not optimized instructions (100.0% of total 4,294,241): invokesuper: 2,375,446 (55.3%) invokeblock: 810,955 (18.9%) sendforward: 505,451 (11.8%) opt_eq: 451,754 (10.5%) opt_plus: 74,404 ( 1.7%) opt_minus: 36,228 ( 0.8%) opt_send_without_block: 21,792 ( 0.5%) opt_neq: 7,231 ( 0.2%) opt_mult: 6,752 ( 0.2%) opt_or: 3,753 ( 0.1%) opt_lt: 348 ( 0.0%) opt_ge: 91 ( 0.0%) opt_gt: 36 ( 0.0%) Top-9 send fallback reasons (100.0% of total 25,534,542): send_without_block_polymorphic: 9,723,469 (38.1%) send_no_profiles: 5,896,023 (23.1%) send_without_block_not_optimized_method_type: 4,524,016 (17.7%) not_optimized_instruction: 4,294,241 (16.8%) send_without_block_no_profiles: 998,947 ( 3.9%) send_not_optimized_method_type: 72,318 ( 0.3%) send_without_block_cfunc_array_variadic: 15,134 ( 0.1%) obj_to_string_not_string: 9,765 ( 0.0%) send_without_block_direct_too_many_args: 629 ( 0.0%) Top-8 unhandled YARV insns (100.0% of total 362,460): checkkeyword: 190,694 (52.6%) getclassvariable: 59,901 (16.5%) invokesuperforward: 49,503 (13.7%) getblockparam: 49,119 (13.6%) opt_duparray_send: 11,978 ( 3.3%) getconstant: 952 ( 0.3%) checkmatch: 290 ( 0.1%) once: 23 ( 0.0%) Top-3 compile error reasons (100.0% of total 3,798,744): register_spill_on_alloc: 3,495,669 (92.0%) register_spill_on_ccall: 184,712 ( 4.9%) exception_handler: 118,363 ( 3.1%) Top-15 side exit reasons (100.0% of total 10,637,319): compile_error: 3,798,744 (35.7%) guard_type_failure: 2,655,504 (25.0%) guard_shape_failure: 1,917,217 (18.0%) 
block_param_proxy_not_iseq_or_ifunc: 535,789 ( 5.0%) unhandled_kwarg: 455,492 ( 4.3%) patchpoint: 370,478 ( 3.5%) unhandled_yarv_insn: 362,460 ( 3.4%) unknown_newarray_send: 314,786 ( 3.0%) unhandled_splat: 122,071 ( 1.1%) unhandled_hir_insn: 83,066 ( 0.8%) block_param_proxy_modified: 19,193 ( 0.2%) guard_int_equals_failure: 1,914 ( 0.0%) obj_to_string_fallback: 566 ( 0.0%) guard_type_not_failure: 22 ( 0.0%) interrupt: 17 ( 0.0%) send_count: 62,495,067 dynamic_send_count: 25,534,542 (40.9%) optimized_send_count: 36,960,525 (59.1%) iseq_optimized_send_count: 18,582,072 (29.7%) inline_cfunc_optimized_send_count: 7,086,638 (11.3%) non_variadic_cfunc_optimized_send_count: 8,392,657 (13.4%) variadic_cfunc_optimized_send_count: 2,899,158 ( 4.6%) dynamic_getivar_count: 7,365,994 dynamic_setivar_count: 7,248,500 compiled_iseq_count: 4,780 failed_iseq_count: 463 compile_time: 816ms profile_time: 9ms gc_time: 11ms invalidation_time: 70ms vm_write_pc_count: 64,363,541 vm_write_sp_count: 63,022,221 vm_write_locals_count: 63,022,221 vm_write_stack_count: 63,022,221 vm_write_to_parent_iseq_local_count: 292,458 vm_read_from_parent_iseq_local_count: 6,850,977 code_region_bytes: 23,019,520 side_exit_count: 10,637,319 total_insn_count: 517,303,190 vm_insn_count: 160,562,103 zjit_insn_count: 356,741,087 ratio_in_zjit: 69.0% ``` </details> railsbench before: <details> ``` ***ZJIT: Printing ZJIT statistics on exit*** Top-20 not inlined C methods (66.1% of total 25,524,934): Hash#[]=: 1,700,237 ( 6.7%) String#getbyte: 1,572,123 ( 6.2%) String#<<: 1,494,022 ( 5.9%) Kernel#is_a?: 1,429,930 ( 5.6%) String#empty?: 1,370,323 ( 5.4%) Regexp#match?: 1,235,067 ( 4.8%) Kernel#respond_to?: 1,198,251 ( 4.7%) Hash#key?: 1,087,406 ( 4.3%) String#setbyte: 810,022 ( 3.2%) Integer#^: 766,624 ( 3.0%) Kernel#block_given?: 603,613 ( 2.4%) String#==: 590,409 ( 2.3%) Class#new: 506,216 ( 2.0%) Hash#delete: 455,288 ( 1.8%) BasicObject#!=: 428,771 ( 1.7%) Hash#fetch: 408,621 ( 1.6%) String#ascii_only?: 
373,915 ( 1.5%) ObjectSpace::WeakKeyMap#[]: 287,957 ( 1.1%) NilClass#===: 277,244 ( 1.1%) Kernel#Array: 269,590 ( 1.1%) Top-20 not annotated C methods (66.8% of total 25,392,654): Hash#[]=: 1,700,416 ( 6.7%) String#getbyte: 1,572,123 ( 6.2%) Kernel#is_a?: 1,515,672 ( 6.0%) String#<<: 1,494,022 ( 5.9%) String#empty?: 1,370,478 ( 5.4%) Regexp#match?: 1,235,067 ( 4.9%) Kernel#respond_to?: 1,198,251 ( 4.7%) Hash#key?: 1,087,406 ( 4.3%) String#setbyte: 810,022 ( 3.2%) Integer#^: 766,624 ( 3.0%) Kernel#block_given?: 603,613 ( 2.4%) String#==: 601,115 ( 2.4%) Class#new: 506,216 ( 2.0%) Hash#delete: 455,288 ( 1.8%) BasicObject#!=: 428,876 ( 1.7%) Hash#fetch: 408,621 ( 1.6%) String#ascii_only?: 373,915 ( 1.5%) ObjectSpace::WeakKeyMap#[]: 287,957 ( 1.1%) NilClass#===: 277,244 ( 1.1%) Kernel#Array: 269,590 ( 1.1%) Top-2 not optimized method types for send (100.0% of total 186,159): iseq: 112,747 (60.6%) cfunc: 73,412 (39.4%) Top-6 not optimized method types for send_without_block (100.0% of total 8,142,248): iseq: 3,464,671 (42.6%) optimized: 2,632,884 (32.3%) bmethod: 1,290,701 (15.9%) alias: 706,020 ( 8.7%) null: 47,942 ( 0.6%) cfunc: 30 ( 0.0%) Top-11 not optimized instructions (100.0% of total 8,394,873): invokesuper: 5,602,274 (66.7%) invokeblock: 1,764,936 (21.0%) sendforward: 551,832 ( 6.6%) opt_eq: 441,959 ( 5.3%) opt_plus: 31,635 ( 0.4%) opt_send_without_block: 1,163 ( 0.0%) opt_lt: 372 ( 0.0%) opt_mult: 251 ( 0.0%) opt_ge: 193 ( 0.0%) opt_neq: 149 ( 0.0%) opt_or: 109 ( 0.0%) Top-8 send fallback reasons (100.0% of total 40,748,753): send_without_block_polymorphic: 12,933,923 (31.7%) send_no_profiles: 9,033,636 (22.2%) not_optimized_instruction: 8,394,873 (20.6%) send_without_block_not_optimized_method_type: 8,142,248 (20.0%) send_without_block_no_profiles: 1,839,228 ( 4.5%) send_without_block_cfunc_array_variadic: 215,046 ( 0.5%) send_not_optimized_method_type: 186,159 ( 0.5%) obj_to_string_not_string: 3,640 ( 0.0%) Top-9 unhandled YARV insns (100.0% of total 
1,604,456): getclassvariable: 458,136 (28.6%) getblockparam: 455,921 (28.4%) checkkeyword: 265,425 (16.5%) invokesuperforward: 239,383 (14.9%) expandarray: 137,305 ( 8.6%) getconstant: 48,100 ( 3.0%) checkmatch: 149 ( 0.0%) once: 23 ( 0.0%) opt_duparray_send: 14 ( 0.0%) Top-3 compile error reasons (100.0% of total 5,570,130): register_spill_on_alloc: 4,994,130 (89.7%) exception_handler: 356,784 ( 6.4%) register_spill_on_ccall: 219,216 ( 3.9%) Top-13 side exit reasons (100.0% of total 12,412,181): compile_error: 5,570,130 (44.9%) unhandled_yarv_insn: 1,604,456 (12.9%) guard_shape_failure: 1,462,872 (11.8%) guard_type_failure: 845,891 ( 6.8%) block_param_proxy_not_iseq_or_ifunc: 765,968 ( 6.2%) unhandled_kwarg: 658,341 ( 5.3%) patchpoint: 504,437 ( 4.1%) unhandled_splat: 446,990 ( 3.6%) unknown_newarray_send: 332,740 ( 2.7%) unhandled_hir_insn: 160,205 ( 1.3%) block_param_proxy_modified: 59,589 ( 0.5%) obj_to_string_fallback: 553 ( 0.0%) interrupt: 9 ( 0.0%) send_count: 119,067,587 dynamic_send_count: 40,748,753 (34.2%) optimized_send_count: 78,318,834 (65.8%) iseq_optimized_send_count: 39,936,542 (33.5%) inline_cfunc_optimized_send_count: 12,857,358 (10.8%) non_variadic_cfunc_optimized_send_count: 19,722,584 (16.6%) variadic_cfunc_optimized_send_count: 5,802,350 ( 4.9%) dynamic_getivar_count: 10,980,323 dynamic_setivar_count: 12,962,726 compiled_iseq_count: 2,531 failed_iseq_count: 245 compile_time: 414ms profile_time: 21ms gc_time: 33ms invalidation_time: 5ms vm_write_pc_count: 129,093,714 vm_write_sp_count: 126,023,084 vm_write_locals_count: 126,023,084 vm_write_stack_count: 126,023,084 vm_write_to_parent_iseq_local_count: 385,461 vm_read_from_parent_iseq_local_count: 11,266,484 code_region_bytes: 12,156,928 side_exit_count: 12,412,181 total_insn_count: 866,780,158 vm_insn_count: 216,821,134 zjit_insn_count: 649,959,024 ratio_in_zjit: 75.0% ``` </details> railsbench after: <details> ``` ***ZJIT: Printing ZJIT statistics on exit*** Top-20 not inlined C methods 
(66.0% of total 25,597,895): Hash#[]=: 1,724,042 ( 6.7%) String#getbyte: 1,572,123 ( 6.1%) String#<<: 1,494,022 ( 5.8%) Kernel#is_a?: 1,429,946 ( 5.6%) String#empty?: 1,370,323 ( 5.4%) Regexp#match?: 1,235,067 ( 4.8%) Kernel#respond_to?: 1,198,251 ( 4.7%) Hash#key?: 1,087,406 ( 4.2%) String#setbyte: 810,022 ( 3.2%) Integer#^: 766,624 ( 3.0%) Kernel#block_given?: 603,613 ( 2.4%) String#==: 590,699 ( 2.3%) Class#new: 506,216 ( 2.0%) Hash#delete: 455,288 ( 1.8%) BasicObject#!=: 428,771 ( 1.7%) Hash#fetch: 408,621 ( 1.6%) String#ascii_only?: 373,915 ( 1.5%) ObjectSpace::WeakKeyMap#[]: 287,957 ( 1.1%) NilClass#===: 277,244 ( 1.1%) Kernel#Array: 269,590 ( 1.1%) Top-20 not annotated C methods (66.7% of total 25,465,615): Hash#[]=: 1,724,221 ( 6.8%) String#getbyte: 1,572,123 ( 6.2%) Kernel#is_a?: 1,515,688 ( 6.0%) String#<<: 1,494,022 ( 5.9%) String#empty?: 1,370,478 ( 5.4%) Regexp#match?: 1,235,067 ( 4.8%) Kernel#respond_to?: 1,198,251 ( 4.7%) Hash#key?: 1,087,406 ( 4.3%) String#setbyte: 810,022 ( 3.2%) Integer#^: 766,624 ( 3.0%) Kernel#block_given?: 603,613 ( 2.4%) String#==: 601,405 ( 2.4%) Class#new: 506,216 ( 2.0%) Hash#delete: 455,288 ( 1.8%) BasicObject#!=: 428,876 ( 1.7%) Hash#fetch: 408,621 ( 1.6%) String#ascii_only?: 373,915 ( 1.5%) ObjectSpace::WeakKeyMap#[]: 287,957 ( 1.1%) NilClass#===: 277,244 ( 1.1%) Kernel#Array: 269,590 ( 1.1%) Top-2 not optimized method types for send (100.0% of total 186,159): iseq: 112,747 (60.6%) cfunc: 73,412 (39.4%) Top-6 not optimized method types for send_without_block (100.0% of total 8,142,248): iseq: 3,464,671 (42.6%) optimized: 2,632,884 (32.3%) bmethod: 1,290,701 (15.9%) alias: 706,020 ( 8.7%) null: 47,942 ( 0.6%) cfunc: 30 ( 0.0%) Top-11 not optimized instructions (100.0% of total 8,442,456): invokesuper: 5,649,857 (66.9%) invokeblock: 1,764,936 (20.9%) sendforward: 551,832 ( 6.5%) opt_eq: 441,959 ( 5.2%) opt_plus: 31,635 ( 0.4%) opt_send_without_block: 1,163 ( 0.0%) opt_lt: 372 ( 0.0%) opt_mult: 251 ( 0.0%) opt_ge: 193 ( 
0.0%) opt_neq: 149 ( 0.0%) opt_or: 109 ( 0.0%) Top-8 send fallback reasons (100.0% of total 40,796,314): send_without_block_polymorphic: 12,933,921 (31.7%) send_no_profiles: 9,033,616 (22.1%) not_optimized_instruction: 8,442,456 (20.7%) send_without_block_not_optimized_method_type: 8,142,248 (20.0%) send_without_block_no_profiles: 1,839,228 ( 4.5%) send_without_block_cfunc_array_variadic: 215,046 ( 0.5%) send_not_optimized_method_type: 186,159 ( 0.5%) obj_to_string_not_string: 3,640 ( 0.0%) Top-8 unhandled YARV insns (100.0% of total 1,467,151): getclassvariable: 458,136 (31.2%) getblockparam: 455,921 (31.1%) checkkeyword: 265,425 (18.1%) invokesuperforward: 239,383 (16.3%) getconstant: 48,100 ( 3.3%) checkmatch: 149 ( 0.0%) once: 23 ( 0.0%) opt_duparray_send: 14 ( 0.0%) Top-3 compile error reasons (100.0% of total 5,825,923): register_spill_on_alloc: 5,225,940 (89.7%) exception_handler: 356,784 ( 6.1%) register_spill_on_ccall: 243,199 ( 4.2%) Top-13 side exit reasons (100.0% of total 12,530,763): compile_error: 5,825,923 (46.5%) unhandled_yarv_insn: 1,467,151 (11.7%) guard_shape_failure: 1,462,876 (11.7%) guard_type_failure: 845,913 ( 6.8%) block_param_proxy_not_iseq_or_ifunc: 765,968 ( 6.1%) unhandled_kwarg: 658,341 ( 5.3%) patchpoint: 504,437 ( 4.0%) unhandled_splat: 446,990 ( 3.6%) unknown_newarray_send: 332,740 ( 2.7%) unhandled_hir_insn: 160,273 ( 1.3%) block_param_proxy_modified: 59,589 ( 0.5%) obj_to_string_fallback: 553 ( 0.0%) interrupt: 9 ( 0.0%) send_count: 119,163,569 dynamic_send_count: 40,796,314 (34.2%) optimized_send_count: 78,367,255 (65.8%) iseq_optimized_send_count: 39,911,967 (33.5%) inline_cfunc_optimized_send_count: 12,857,393 (10.8%) non_variadic_cfunc_optimized_send_count: 19,770,401 (16.6%) variadic_cfunc_optimized_send_count: 5,827,494 ( 4.9%) dynamic_getivar_count: 10,980,323 dynamic_setivar_count: 12,986,381 compiled_iseq_count: 2,523 failed_iseq_count: 252 compile_time: 420ms profile_time: 21ms gc_time: 30ms invalidation_time: 4ms 
vm_write_pc_count: 128,973,665 vm_write_sp_count: 125,926,968 vm_write_locals_count: 125,926,968 vm_write_stack_count: 125,926,968 vm_write_to_parent_iseq_local_count: 385,752 vm_read_from_parent_iseq_local_count: 11,267,766 code_region_bytes: 12,189,696 side_exit_count: 12,530,763 total_insn_count: 866,667,490 vm_insn_count: 217,813,201 zjit_insn_count: 648,854,289 ratio_in_zjit: 74.9% ``` </details>
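The percentages in the dumps above look like plain part-over-total ratios rounded to one decimal place: `ratio_in_zjit` matches `zjit_insn_count / total_insn_count`, and the figure beside `dynamic_send_count` matches `dynamic_send_count / send_count`. A minimal sketch that reproduces them from the raw counters follows. The helper names `percent` and `format_percent` are hypothetical and are not taken from ZJIT's own stats printer.

```rust
// Illustrative helpers only; these names are assumptions, not ZJIT's API.

/// Share of `part` in `total`, as a percentage. Returns 0.0 for an empty total.
fn percent(part: u64, total: u64) -> f64 {
    if total == 0 {
        0.0
    } else {
        part as f64 * 100.0 / total as f64
    }
}

/// Format the share with one decimal place, as the dumps do.
fn format_percent(part: u64, total: u64) -> String {
    format!("{:.1}%", percent(part, total))
}

fn main() {
    // Counters copied from the "lobsters before" dump.
    let zjit_insn_count = 354_626_810;
    let total_insn_count = 517_606_340;
    let dynamic_send_count = 25_530_724;
    let send_count = 62_244_604;

    println!("ratio_in_zjit: {}", format_percent(zjit_insn_count, total_insn_count));
    println!("dynamic sends: {}", format_percent(dynamic_send_count, send_count));
}
```

Running the same arithmetic on the other three dumps is a quick way to check that a before/after comparison uses the same denominators.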
2025-10-16ZJIT: Break out patchpoint exit reasons (#14858)Max Bernstein
We have a lot of patchpoint exits on some applications and this helps pin down why.
2025-10-15ZJIT: Add trace exit counter (#14831)Aiden Fox Ivey
2025-10-11ZJIT: Count unoptimized `Send` (#14801)Stan Lo
* ZJIT: Count unoptimized `Send`
  This includes `Send` in `send fallback reasons` to guide future optimizations.
* ZJIT: Create dedicated def_type counter for Send
2025-10-09ZJIT: Get stats for which C functions are not annotatedMax Bernstein
2025-10-06ZJIT: reduce string allocation in the Counter::name() (#14743)Hoa Nguyen
The Counter::name() method creates a new String on every call, so each call allocates memory and copies the string. Returning &'static str instead would reduce memory pressure. The change is safe because it introduces no breaking changes to the API.
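The difference between the two return types can be sketched as follows. This is a minimal illustration under stated assumptions: the `Counter` enum, its variants, and the `name_owned` method are hypothetical stand-ins, not the definitions in `zjit/src/stats.rs`.

```rust
// Hypothetical counter enum; the variant names are illustrative only.
#[derive(Clone, Copy, Debug)]
enum Counter {
    SendCount,
    SideExitCount,
}

impl Counter {
    /// Allocating version: builds a fresh heap-allocated String on every call.
    fn name_owned(&self) -> String {
        match self {
            Counter::SendCount => "send_count".to_string(),
            Counter::SideExitCount => "side_exit_count".to_string(),
        }
    }

    /// Borrowing version: returns a string literal stored in the binary,
    /// so no allocation or copy happens at call time.
    fn name(&self) -> &'static str {
        match self {
            Counter::SendCount => "send_count",
            Counter::SideExitCount => "side_exit_count",
        }
    }
}

fn main() {
    for counter in [Counter::SendCount, Counter::SideExitCount] {
        // Both versions yield the same text; only the allocation differs.
        println!("{} / {}", counter.name(), counter.name_owned());
    }
}
```

Callers that only print or compare the name work unchanged with `&'static str`, and a caller that needs ownership can still call `.to_string()` on the result.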
2025-10-03ZJIT: Count CCallWithFrame as optimized_send_count (#14722)Takashi Kokubun
2025-10-03ZJIT: Add HIR for calling Cfunc with frame (#14661)Stan Lo
* ZJIT: Add HIR for CCallWithFrame
* ZJIT: Update stats to count not inlined cfunc calls
* ZJIT: Stops optimizing SendWithoutBlock when TracePoint is activated
* ZJIT: Fallback to SendWithoutBlock when CCallWithFrame has too many args
* ZJIT: Rename cfun -> cfunc
2025-10-02ZJIT: Enable sample rate for side exit tracing (#14696)Aiden Fox Ivey
2025-09-30ZJIT: Add more *_send_count stats (#14689)Takashi Kokubun