path: root/zjit/src/cruby.rs
Age | Commit message | Author
2025-12-18 | JIT: Move EC offsets to jit_bindgen_constants | John Hawthorn
Co-authored-by: Alan Wu <alanwu@ruby-lang.org>
2025-12-16 | ZJIT: Add a VALUE#write_barrier helper method to deduplicate logic | Benoit Daloze
2025-12-10 | ZJIT: Use inline format args (#15482) | Alex Rocha
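For readers unfamiliar with the term, "inline format args" refers to Rust's captured identifiers in format strings. The following is a minimal sketch of the before and after styles; the function and variable names are hypothetical and are not taken from `cruby.rs`.

```rust
/// Older style: positional arguments passed after the format string.
fn positional(name: &str, idx: usize) -> String {
    format!("{}#{}", name, idx)
}

/// Inline style: identifiers in scope are captured directly inside the braces.
fn inline(name: &str, idx: usize) -> String {
    format!("{name}#{idx}")
}
```

Both functions produce the same string; the inline form only removes the separate argument list.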
2025-12-05 | ZJIT: Avoid binding to `rb_iseq_constant_body` | Alan Wu
Its definition changes depending on e.g. whether there is YJIT in the build.
2025-12-03 | ZJIT: Optimize setivar with shape transition (#15375) | Max Bernstein
Since we do a decent job of pre-sizing objects, don't handle the case where we would need to re-size an object. Also don't handle too-complex shapes.

lobsters stats before:
```
Top-20 calls to C functions from JIT code (79.4% of total 90,051,140):
  rb_vm_opt_send_without_block: 19,762,433 (21.9%)
  rb_vm_setinstancevariable: 7,698,314 ( 8.5%)
  rb_hash_aref: 6,767,461 ( 7.5%)
  rb_vm_env_write: 5,373,080 ( 6.0%)
  rb_vm_send: 5,049,229 ( 5.6%)
  rb_vm_getinstancevariable: 4,535,259 ( 5.0%)
  rb_obj_is_kind_of: 3,746,306 ( 4.2%)
  rb_ivar_get_at_no_ractor_check: 3,745,237 ( 4.2%)
  rb_vm_invokesuper: 3,037,467 ( 3.4%)
  rb_ary_entry: 2,351,983 ( 2.6%)
  rb_vm_opt_getconstant_path: 1,344,740 ( 1.5%)
  rb_vm_invokeblock: 1,184,474 ( 1.3%)
  Hash#[]=: 1,064,288 ( 1.2%)
  rb_gc_writebarrier: 1,006,972 ( 1.1%)
  rb_ec_ary_new_from_values: 902,687 ( 1.0%)
  fetch: 898,667 ( 1.0%)
  rb_str_buf_append: 833,787 ( 0.9%)
  rb_class_allocate_instance: 822,024 ( 0.9%)
  Hash#fetch: 699,580 ( 0.8%)
  _bi20: 682,068 ( 0.8%)
Top-4 setivar fallback reasons (100.0% of total 7,732,326):
  shape_transition: 6,032,109 (78.0%)
  not_monomorphic: 1,469,300 (19.0%)
  not_t_object: 172,636 ( 2.2%)
  too_complex: 58,281 ( 0.8%)
```

lobsters stats after:
```
Top-20 calls to C functions from JIT code (79.0% of total 88,322,656):
  rb_vm_opt_send_without_block: 19,777,880 (22.4%)
  rb_hash_aref: 6,771,589 ( 7.7%)
  rb_vm_env_write: 5,372,789 ( 6.1%)
  rb_gc_writebarrier: 5,195,527 ( 5.9%)
  rb_vm_send: 5,049,145 ( 5.7%)
  rb_vm_getinstancevariable: 4,538,485 ( 5.1%)
  rb_obj_is_kind_of: 3,746,241 ( 4.2%)
  rb_ivar_get_at_no_ractor_check: 3,745,172 ( 4.2%)
  rb_vm_invokesuper: 3,037,157 ( 3.4%)
  rb_ary_entry: 2,351,968 ( 2.7%)
  rb_vm_setinstancevariable: 1,703,337 ( 1.9%)
  rb_vm_opt_getconstant_path: 1,344,730 ( 1.5%)
  rb_vm_invokeblock: 1,184,290 ( 1.3%)
  Hash#[]=: 1,061,868 ( 1.2%)
  rb_ec_ary_new_from_values: 902,666 ( 1.0%)
  fetch: 898,666 ( 1.0%)
  rb_str_buf_append: 833,784 ( 0.9%)
  rb_class_allocate_instance: 821,778 ( 0.9%)
  Hash#fetch: 755,913 ( 0.9%)
Top-4 setivar fallback reasons (100.0% of total 1,703,337):
  not_monomorphic: 1,472,405 (86.4%)
  not_t_object: 172,629 (10.1%)
  too_complex: 58,281 ( 3.4%)
  new_shape_needs_extension: 22 ( 0.0%)
```

I also noticed that primitive printing in HIR was broken so I fixed that.

Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org>
2025-12-02 | ZJIT: Optimize GetIvar for non-T_OBJECT | Benoit Daloze
* All Invariant::SingleRactorMode PatchPoints are replaced by assume_single_ractor_mode() to fix https://github.com/Shopify/ruby/issues/875 for SingleRactorMode patchpoints.
2025-12-01 | ZJIT: Specialize String#<< with Fixnum | Max Bernstein
Append a codepoint.
2025-11-26 | ZJIT: Remove dead unnecessary_transmutes allow | Stan Lo
```
warning: unknown lint: `unnecessary_transmutes`
   --> zjit/src/cruby.rs:107:9
    |
107 |         #[allow(unnecessary_transmutes)] // https://github.com/rust-lang/rust-bindgen/issues/2807
    |                 ^^^^^^^^^^^^^^^^^^^^^^
    |
    = note: `#[warn(unknown_lints)]` on by default
```
2025-11-25 | ZJIT: Specialize setinstancevariable when ivar is already in shape (#15290) | Max Bernstein
Don't support shape transitions for now.
2025-11-21 | ZJIT: Inline Thread.current (#15272) | Max Bernstein
Add `LoadEC` then it's just two `LoadField`.
2025-11-20 | ZJIT: Read `iseq->body->param` directly instead of through FFI | Alan Wu
Going through a call to a C function just to read a bitfield was a little extreme. We did it to be super conservative since bitfields have historically been the trigger of many bugs and surprises. Let's try directly accessing them with code from rust-bindgen. If this ends up causing issues, we can use the FFI approach behind nicer wrappers. In any case, directly access regular struct fields such as `lead_num` and `opt_num` to remove boilerplate.
2025-11-20 | Name the `iseq->body->param` struct and update bindings for JITs | Alan Wu
This will make reading the parameters nicer for the JITs. Should be no-op for the C side.
2025-11-20 | ZJIT: Put optional interpreter cache on both GetIvar and SetIvar | Max Bernstein
2025-11-19 | ZJIT: Count all calls to C functions from generated code (#15240) | Max Bernstein
lobsters:
```
Top-20 calls to C functions from JIT code (79.9% of total 97,004,883):
  rb_vm_opt_send_without_block: 19,874,212 (20.5%)
  rb_vm_setinstancevariable: 9,774,841 (10.1%)
  rb_ivar_get: 9,358,866 ( 9.6%)
  rb_hash_aref: 6,828,948 ( 7.0%)
  rb_vm_send: 6,441,551 ( 6.6%)
  rb_vm_env_write: 5,375,989 ( 5.5%)
  rb_vm_invokesuper: 3,037,836 ( 3.1%)
  Module#===: 2,562,446 ( 2.6%)
  rb_ary_entry: 2,354,546 ( 2.4%)
  Kernel#is_a?: 1,424,092 ( 1.5%)
  rb_vm_opt_getconstant_path: 1,344,923 ( 1.4%)
  Thread.current: 1,300,822 ( 1.3%)
  rb_zjit_defined_ivar: 1,222,613 ( 1.3%)
  rb_vm_invokeblock: 1,184,555 ( 1.2%)
  Hash#[]=: 1,061,969 ( 1.1%)
  rb_ary_push: 1,024,987 ( 1.1%)
  rb_ary_new_capa: 904,003 ( 0.9%)
  rb_str_buf_append: 833,782 ( 0.9%)
  rb_class_allocate_instance: 822,626 ( 0.8%)
  Hash#fetch: 755,913 ( 0.8%)
```

railsbench:
```
Top-20 calls to C functions from JIT code (74.8% of total 189,170,268):
  rb_vm_opt_send_without_block: 29,870,307 (15.8%)
  rb_vm_setinstancevariable: 17,631,199 ( 9.3%)
  rb_hash_aref: 16,928,890 ( 8.9%)
  rb_ivar_get: 14,441,240 ( 7.6%)
  rb_vm_env_write: 11,571,001 ( 6.1%)
  rb_vm_send: 11,153,457 ( 5.9%)
  rb_vm_invokesuper: 7,568,267 ( 4.0%)
  Module#===: 6,065,923 ( 3.2%)
  Hash#[]=: 2,842,990 ( 1.5%)
  rb_ary_entry: 2,766,125 ( 1.5%)
  rb_ary_push: 2,722,079 ( 1.4%)
  rb_vm_invokeblock: 2,594,398 ( 1.4%)
  Thread.current: 2,560,129 ( 1.4%)
  rb_str_getbyte: 1,965,627 ( 1.0%)
  Kernel#is_a?: 1,961,815 ( 1.0%)
  rb_vm_opt_getconstant_path: 1,863,678 ( 1.0%)
  rb_hash_new_with_size: 1,796,456 ( 0.9%)
  rb_class_allocate_instance: 1,785,043 ( 0.9%)
  String#empty?: 1,713,414 ( 0.9%)
  rb_ary_new_capa: 1,678,834 ( 0.9%)
```

shipit:
```
Top-20 calls to C functions from JIT code (83.4% of total 182,402,821):
  rb_vm_opt_send_without_block: 45,753,484 (25.1%)
  rb_ivar_get: 21,020,650 (11.5%)
  rb_vm_setinstancevariable: 17,528,603 ( 9.6%)
  rb_hash_aref: 11,892,856 ( 6.5%)
  rb_vm_send: 11,723,471 ( 6.4%)
  rb_vm_env_write: 10,434,452 ( 5.7%)
  Module#===: 4,225,048 ( 2.3%)
  rb_vm_invokesuper: 3,705,906 ( 2.0%)
  Thread.current: 3,337,603 ( 1.8%)
  rb_ary_entry: 3,114,378 ( 1.7%)
  Hash#[]=: 2,509,912 ( 1.4%)
  Array#empty?: 2,282,994 ( 1.3%)
  rb_vm_invokeblock: 2,210,511 ( 1.2%)
  Hash#fetch: 2,017,960 ( 1.1%)
  _bi20: 1,975,147 ( 1.1%)
  rb_zjit_defined_ivar: 1,897,127 ( 1.0%)
  rb_vm_opt_getconstant_path: 1,813,294 ( 1.0%)
  rb_ary_new_capa: 1,615,406 ( 0.9%)
  Kernel#is_a?: 1,567,854 ( 0.9%)
  rb_class_allocate_instance: 1,560,035 ( 0.9%)
```

Thanks to @eregon for the idea.

Co-authored-by: Jacob Denbeaux <jacob.denbeaux@shopify.com>
Co-authored-by: Alan Wu <XrXr@users.noreply.github.com>
2025-11-04 | ZJIT: Use a shared trampoline across all ISEQs (#15042) | Takashi Kokubun
2025-11-03 | ZJIT: Inline String#bytesize (#15033) | Max Leopold
Inline the `String#bytesize` function and remove the C call.
2025-10-30 | ZJIT: Inline struct aref | Max Bernstein
2025-10-22 | ZJIT: Fetch Primitive.attr!(leaf) for InvokeBuiltin | Max Bernstein
Fix https://github.com/Shopify/ruby/issues/670
2025-10-22 | ZJIT: Inline String#==, String#=== | Max Bernstein
2025-10-22 | ZJIT: Inline Fixnum#^ | Benoit Daloze
* Handled in cruby_methods.rs because there is no basic operation for Fixnum#^.
2025-10-21 | ZJIT: Issue `SendWithoutBlockDirect` to `VM_METHOD_TYPE_BMETHOD` | Alan Wu
This helps ZJIT optimize ~300,000 more sends in ruby-bench's lobsters.

Top-6 not optimized method types for send_without_block

    Before                         After
    iseq:      713,899 (48.0%)     iseq:      725,668 (62.4%)
    optimized: 359,864 (24.2%)     optimized: 359,940 (31.0%)
    bmethod:   339,040 (22.8%)     alias:      73,541 ( 6.3%)
    alias:      73,392 ( 4.9%)     null:        2,521 ( 0.2%)
    null:        2,521 ( 0.2%)     bmethod:       979 ( 0.1%)
    cfunc:           4 ( 0.0%)     cfunc:           4 ( 0.0%)
2025-10-21 | ZJIT: Fix binding to `INVALID_SHAPE_ID` under `-std=c99 -pedantic` | Alan Wu
```
/src/jit.c:19:5: error: ISO C restricts enumerator values to range of 'int' (4294967295 is too large) [-Werror,-Wpedantic]
   19 |     RB_INVALID_SHAPE_ID = INVALID_SHAPE_ID,
      |     ^                     ~~~~~~~~~~~~~~~~
```
2025-10-20 | ZJIT: Implement codegen for FixnumMod (#14857) | Max Bernstein
This is mostly to see what happens to the loops-times benchmark.
2025-10-16 | ZJIT: Inline String#getbyte (#14842) | Max Bernstein
2025-10-15 | ZJIT: Use rb_gc_disable() over rb_gc_disable_no_rest() | Alan Wu
no_rest() trips an assert inside the GC when we allocate with the GC disabled this way:

    (gc_continue) ../src/gc/default/default.c:2029
    (newobj_cache_miss+0x128) [0x105040048] ../src/gc/default/default.c:2370
    (rb_gc_impl_new_obj+0x7c) [0x105036374] ../src/gc/default/default.c:2482
    (newobj_of) ../src/gc.c:995
    (rb_method_entry_alloc+0x40) [0x1051e6c64] ../src/vm_method.c:1102
    (rb_method_entry_complement_defined_class) ../src/vm_method.c:1180
    (prepare_callable_method_entry+0x14c) [0x1051e87b8] ../src/vm_method.c:1728
    (callable_method_entry_or_negative+0x1e8) [0x1051e809c] ../src/vm_method.c:1874

It tries to continue the GC because it was out of space. Looks like it's not safe to allocate new objects after using rb_gc_disable_no_rest(); existing usages use it for malloc calls.
2025-10-15 | ZJIT: Never yield to the GC while compiling | Alan Wu
This fixes a reliable "ZJIT saw a dead object" repro on my machine, and should fix the flaky ones on CI. The code for disabling the GC is the same as the code in newobj_of(). See: https://github.com/ruby/ruby/actions/runs/18511676257/job/52753782036
2025-10-14 | ZJIT: Include GC object dump when seeing dead objects | Alan Wu
Strictly more info than just the builtin_type from `assert_ne!`.

Old:

    assertion `left != right` failed: ZJIT should only see live objects
      left: 0
     right: 0

New:

    ZJIT saw a dead object. T_type=0, out-of-heap:0x0000000110d4bb40

Also, the new `VALUE::obj_info` is more flexible for print debugging than the dump_info() it replaces. It now allows you to use it as part of a `format!` string instead of always printing to stderr for you.
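One common way to make a debugging helper usable inside `format!` is to return a value that implements `Display`. The sketch below illustrates that pattern only; `Value` and `ObjInfo` are hypothetical stand-ins, and the real `VALUE::obj_info` obtains its text from the GC rather than printing an address.

```rust
use std::fmt;

/// Hypothetical stand-in for ZJIT's `VALUE`; here it only wraps an address.
#[derive(Clone, Copy)]
struct Value(usize);

/// Hypothetical wrapper that renders object info lazily when formatted.
struct ObjInfo(Value);

impl Value {
    /// Returns a `Display` value instead of printing to stderr directly.
    fn obj_info(self) -> ObjInfo {
        ObjInfo(self)
    }
}

impl fmt::Display for ObjInfo {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Assumption: a real implementation would emit a GC object dump here.
        write!(f, "VALUE:{:#018x}", (self.0).0)
    }
}
```

Because the result implements `Display`, a caller can write `format!("ZJIT saw a dead object. {}", v.obj_info())` or pass it to `panic!` or `eprintln!` as needed.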
2025-10-02 | ZJIT: Add `NoSingletonClass` patch point (#14680) | Stan Lo
* ZJIT: Add NoSingletonClass patch point

  This patch point makes sure that when the object has a singleton class, the JIT code is invalidated. As of now, this is only needed for C call optimization. In YJIT, the singleton class guard only applies to Array, Hash, and String. But in ZJIT, we may optimize C calls from gems (e.g. `sqlite3`). So the patch point needs to be applied to a broader range of classes.
* ZJIT: Only generate NoSingletonClass guard when the type can have singleton class
* ZJIT: Update or forget NoSingletonClass patch point when needed
2025-09-29 | ZJIT: Incorporate parameter loads into HIR (#14659) | Takashi Kokubun
2025-09-25 | ZJIT: Compile ISEQ with optional arguments (#14653) | Takashi Kokubun
2025-09-23 | ZJIT: Allow testing JIT code on zjit-test (#14639) | Takashi Kokubun
* ZJIT: Allow testing JIT code on zjit-test
* Resurrect TestingAllocator tests
2025-09-19 | ZJIT: Expand the list of safe allocators | Max Bernstein
It's not just the default allocator; other allocators are also leaf.
2025-09-05 | ZJIT: Stop optimizing toplevel locals (#14458) | Takashi Kokubun
2025-09-03 | ZJIT: Ensure `clippy` passes and silence unnecessary warnings (#14439) | Aiden Fox Ivey
2025-09-02 | ZJIT: Clear jit entry from iseqs after TracePoint activation (#14407) | Stan Lo
ZJIT: Remove JITed code after TracePoint is enabled
2025-09-02 | ZJIT: Bump default --zjit-call-threshold to 30 (#14410) | Takashi Kokubun
2025-08-29 | Add rb_jit_vm_unlock and share it in ZJIT and YJIT | Stan Lo
2025-08-29 | Add rb_jit_vm_lock_then_barrier and share it in ZJIT and YJIT | Stan Lo
2025-08-29 | ZJIT: Specialize monomorphic GetIvar (#14388) | Max Bernstein
Specialize monomorphic `GetIvar` into:
* `GuardType(HeapObject)`
* `GuardShape`
* `LoadIvarEmbedded` or `LoadIvarExtended`

This requires profiling self for `getinstancevariable` (it's not on the operand stack).

This also optimizes `GetIvar`s that happen as a result of inlining `attr_reader` and `attr_accessor`.

Also move some (newly) shared JIT helpers into jit.c.
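The rewrite described above can be sketched as a small function that maps a profile to an instruction sequence. This is an illustration under assumptions, not ZJIT's code: the `Insn` enum is a simplified stand-in using the instruction names from the commit message, and `Profile` with its fields is hypothetical.

```rust
/// Simplified stand-ins for the HIR instructions named in the commit message.
#[derive(Debug, PartialEq)]
enum Insn {
    /// Generic fallback that calls into the VM.
    GetIvar,
    GuardTypeHeapObject,
    GuardShape(u32),
    LoadIvarEmbedded { index: usize },
    LoadIvarExtended { index: usize },
}

/// Hypothetical summary of what the interpreter profiled for `self`.
struct Profile {
    shape_id: u32,
    monomorphic: bool,
    embedded: bool,
    ivar_index: usize,
}

/// Emits guards plus a direct load when the site is monomorphic,
/// otherwise keeps the generic instruction.
fn specialize_getivar(profile: Option<&Profile>) -> Vec<Insn> {
    match profile {
        Some(p) if p.monomorphic => {
            let load = if p.embedded {
                Insn::LoadIvarEmbedded { index: p.ivar_index }
            } else {
                Insn::LoadIvarExtended { index: p.ivar_index }
            };
            vec![Insn::GuardTypeHeapObject, Insn::GuardShape(p.shape_id), load]
        }
        _ => vec![Insn::GetIvar],
    }
}
```

The guards come first so that the load only runs once the object is known to be a heap object with the expected shape; if either guard fails, execution would leave the specialized path.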
2025-08-27 | Replace ROBJECT_EMBED by ROBJECT_HEAP | Jean Boussier
The embed layout is way more common than the heap one, especially since WVA. I think it makes for more readable code to invert the flag.
2025-08-21 | Remove unused SPECIAL_CONST_SHAPE_ID | Étienne Barrié
Its usage was removed in 306d50811dd060d876d1eb364a0d5e6106f5e4f1.
2025-08-13 | ZJIT: Fix `ObjToString` rewrite (#14196) | Stan Lo
ZJIT: Fix ObjToString rewrite

Currently, the rewrite for `ObjToString` always replaces it with a `SendWithoutBlock(to_s)` instruction when the receiver is not a string literal. This is incorrect because it calls `to_s` on the receiver even if it's already a string.

This change fixes it by:
- Avoiding the `SendWithoutBlock(to_s)` rewrite
- Implementing codegen for `ObjToString`
2025-08-05 | ZJIT: Profile type+shape distributions (#13901) | Max Bernstein
ZJIT uses the interpreter to take type profiles of what objects pass through the code. It stores a compressed record of the history per opcode for the opcodes we select.

Before this change, we re-used the HIR Type data-structure, a shallow type lattice, to store historical type information. This was quick for bringup but is quite lossy as profiles go: we get one bit per built-in type seen, and if we see a non-built-in type in addition, we end up with BasicObject. Not very helpful. Additionally, it does not give us any notion of cardinality: how many of each type did we see?

This change brings with it a much more interesting slice of type history: a histogram. A Distribution holds a record of the top-N (where N is fixed at Ruby compile-time) `(Class, ShapeId)` pairs and their counts. It also holds an *other* count in case we see more than N pairs.

Using this distribution, we can make more informed decisions about when we should use type information. We can determine if we are strictly monomorphic, very nearly monomorphic, or something else. Maybe the call-site is polymorphic, so we should have a polymorphic inline cache. Exciting stuff.

I also plumb this new distribution into the HIR part of the compilation pipeline.
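A bounded histogram with an overflow counter, as described above, might look like the following sketch. It is not ZJIT's `Distribution`: the value of `N`, the `ClassShape` stand-in, and the first-come slot policy are all assumptions made for illustration.

```rust
/// Number of distinct pairs tracked. Assumption: 4 is arbitrary;
/// the commit only says N is fixed at Ruby compile-time.
const N: usize = 4;

/// Hypothetical stand-in for a `(Class, ShapeId)` pair.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct ClassShape {
    class: u64,
    shape: u32,
}

/// Fixed-size histogram with an overflow count for pairs that do not fit.
struct Distribution {
    buckets: [Option<(ClassShape, u64)>; N],
    other: u64,
}

impl Distribution {
    fn new() -> Self {
        Self { buckets: [None; N], other: 0 }
    }

    /// Records one observation. Slots are claimed first-come in this sketch.
    fn observe(&mut self, item: ClassShape) {
        // Bump the count if the pair already has a bucket.
        for slot in self.buckets.iter_mut() {
            if let Some((key, count)) = slot {
                if *key == item {
                    *count += 1;
                    return;
                }
            }
        }
        // Otherwise claim the first free bucket.
        for slot in self.buckets.iter_mut() {
            if slot.is_none() {
                *slot = Some((item, 1));
                return;
            }
        }
        // No room left: fold the observation into the overflow count.
        self.other += 1;
    }

    /// Strictly monomorphic: exactly one pair seen and nothing overflowed.
    fn is_monomorphic(&self) -> bool {
        self.other == 0 && self.buckets.iter().flatten().count() == 1
    }
}
```

With counts available, a "very nearly monomorphic" check could compare the largest bucket against the total; that threshold is a policy decision this sketch leaves out.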
2025-07-31 | ZJIT: Add the ISEQ name to Block asm comments (#14070) | Takashi Kokubun
2025-07-30 | ZJIT: Get rid of CallInfo | Max Bernstein
2025-07-30 | ZJIT: Prepare for sharing JIT hooks with ZJIT (#14044) | Takashi Kokubun
2025-07-28 | ZJIT: Support invalidating constant patch points (#13998) | Stan Lo
2025-07-11 | ZJIT: Gracefully handle iseq_name with NULL ISEQ | Max Bernstein
2025-07-10 | ZJIT: Print a message about ZJIT_RB_BUG when unused (#13852) | Takashi Kokubun
2025-07-08 | ZJIT: Support inference of ModuleExact type | Stan Lo