path: root/yjit/src
Age  Commit message  Author
2025-08-29  ZJIT: Specialize monomorphic GetIvar (#14388)  Max Bernstein
Specialize monomorphic `GetIvar` into:

* `GuardType(HeapObject)`
* `GuardShape`
* `LoadIvarEmbedded` or `LoadIvarExtended`

This requires profiling self for `getinstancevariable` (it's not on the operand stack).

This also optimizes `GetIvar`s that happen as a result of inlining `attr_reader` and `attr_accessor`.

Also move some (newly) shared JIT helpers into jit.c.
2025-08-29  YJIT: rb_ivar_get_at skip ractor checks  Jean Boussier
Using `assume_single_ractor_mode` we can skip all ractor safety checks if we're in single ractor mode.

```
compare-ruby: ruby 3.5.0dev (2025-08-27T14:58:58Z merge-vm-setivar-d.. 5b749d8e53) +YJIT +PRISM [arm64-darwin24]
built-ruby: ruby 3.5.0dev (2025-08-28T21:23:38Z yjit-get-exivar 3cc21b76d4) +YJIT +PRISM [arm64-darwin24]

|                           |compare-ruby|built-ruby|
|:--------------------------|-----------:|---------:|
|vm_ivar_get_on_obj         |     975.981|   975.772|
|                           |       1.00x|         -|
|vm_ivar_get_on_class       |     136.214|   470.912|
|                           |           -|     3.46x|
|vm_ivar_get_on_generic     |     148.315|   299.122|
|                           |           -|     2.02x|
```
2025-08-29  YJIT: rb_ivar_get_at assume leaf-call when single ractor  Jean Boussier
The only exception it could raise is if we're in multi ractor mode.
2025-08-29  YJIT: getinstancevariable cache indexes for types other than T_OBJECT  Jean Boussier
While accessing the ivars of other types is too complicated to realistically generate the ASM for it, we can at least provide the ivar index so that we don't have to look up the shape tree every time.

```
compare-ruby: ruby 3.5.0dev (2025-08-27T14:58:58Z merge-vm-setivar-d.. 5b749d8e53) +YJIT +PRISM [arm64-darwin24]
built-ruby: ruby 3.5.0dev (2025-08-28T17:58:32Z yjit-get-exivar efaa8c9b09) +YJIT +PRISM [arm64-darwin24]

|                           |compare-ruby|built-ruby|
|:--------------------------|-----------:|---------:|
|vm_ivar_get_on_obj         |     930.458|   936.865|
|                           |           -|     1.01x|
|vm_ivar_get_on_class       |     134.471|   431.622|
|                           |           -|     3.21x|
|vm_ivar_get_on_generic     |     146.679|   284.408|
|                           |           -|     1.94x|
```

Co-Authored-By: Aaron Patterson <tenderlove@ruby-lang.org>
2025-08-28  YJIT simplify gen_get_ivar and gen_set_ivar  Jean Boussier
The `shape_id` now includes 3 bits for the `heap_id`. It is always non-zero for `T_OBJECT` and always zero for all other types. Hence all these allocator checks are no longer necessary.
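The following is a minimal sketch of why a check on the `shape_id` alone can stand in for a separate type check. The bit offset and the names `HEAP_ID_OFFSET`, `HEAP_ID_MASK`, `heap_id` and `is_t_object_shape` are assumptions made for illustration; only the 3-bit width comes from the commit message, and the actual layout is defined in CRuby's shape code.

```rust
/// Hypothetical layout: the low bits index the shape tree and a 3-bit
/// `heap_id` field sits above them. The offset is an assumption.
type ShapeId = u32;

const HEAP_ID_OFFSET: u32 = 19; // assumed position of the field
const HEAP_ID_BITS: u32 = 3; // width stated in the commit message
const HEAP_ID_MASK: ShapeId = ((1 << HEAP_ID_BITS) - 1) << HEAP_ID_OFFSET;

/// Extract the 3-bit heap_id field from a shape_id.
fn heap_id(shape_id: ShapeId) -> u32 {
    (shape_id & HEAP_ID_MASK) >> HEAP_ID_OFFSET
}

/// Per the commit message, heap_id is non-zero only for T_OBJECT, so the
/// shape_id by itself tells T_OBJECT apart from every other type.
fn is_t_object_shape(shape_id: ShapeId) -> bool {
    heap_id(shape_id) != 0
}
```

Under this assumed layout, two objects with the same position in the shape tree but different types can never share a `shape_id`, so a guard that compares the full `shape_id` implicitly guards the type as well.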
2025-08-27  ZJIT: Implement side exit stats (#14357)  Takashi Kokubun
2025-08-27  Replace ROBJECT_EMBED by ROBJECT_HEAP  Jean Boussier
The embed layout is way more common than the heap one, especially since VWA. I think it makes for more readable code to invert the flag.
2025-08-26  Remove `opt_aref_with` and `opt_aset_with`  Aaron Patterson
When these instructions were introduced it was common to read from a hash with mutable string literals. However, these days, I think these instructions are fairly rare.

I tested this with the lobsters benchmark, and saw no difference in speed. In order to be sure, I tracked down every use of this instruction in the lobsters benchmark, and there were only 4 places where it was used.

Additionally, this patch fixes a case where "chilled strings" should emit a warning but they don't.

```ruby
class Foo
  def self.[](x)= x.gsub!(/hello/, "hi")
end

Foo["hello world"]
```

Removing these instructions shows this warning:

```
> ./miniruby -vw test.rb
ruby 3.5.0dev (2025-08-25T21:36:50Z rm-opt_aref_with dca08e286c) +PRISM [arm64-darwin24]
test.rb:2: warning: literal string will be frozen in the future (run with --debug-frozen-string-literal for more information)
```

[Feature #21553]
2025-08-26  Ensure T_OBJECT and T_IMEMO/fields have identical layout  Jean Boussier
2025-08-20  YJIT: Improve locals names (#14285)  Stan Lo
2025-08-18  Don't allow looking at the shape ID of immediates (#14266)  Max Bernstein
It only makes sense for heap objects.
2025-08-14  YJIT: Fix `defined?(yield)` and `block_given?` at top level  Alan Wu
Previously, YJIT returned truthy for the block given query at the top level. That's incorrect because the top level script never receives a block, and `yield` is a syntax error there.

Inside methods, the number of hops to get from `iseq` to `iseq->body->local_iseq` is the same as the number of `VM_ENV_PREV_EP(ep)` hops to get to an environment with `VM_ENV_FLAG_LOCAL`. YJIT and the interpreter both rely on this as can be seen in get_lvar_level(). However, this identity does not hold for the top level frame because of vm_set_eval_stack(), which sets up `TOPLEVEL_BINDING`.

Since only methods can take a block that `yield` goes to, have ISEQs that are the child of a non-method ISEQ return falsy for the block given query. This fixes the issue for the top level script and is an optimization for non-method contexts such as inside `ISEQ_TYPE_CLASS`.
2025-08-11  YJIT: Fix `mismatched_lifetime_syntaxes`, new in Rust 1.89.0  Alan Wu
2025-08-06  ZJIT: Implement SingleRactorMode invalidation (#14121)  Stan Lo
* ZJIT: Implement SingleRactorMode invalidation
* ZJIT: Add macro for compiling jumps
* ZJIT: Fix typo in comment
* YJIT: Fix typo in comment
* ZJIT: Avoid using unexported types in zjit.h

  `enum ruby_vminsn_type` is declared in `insns.inc` and is not exported. Using it in `zjit.h` would cause build errors when the file including it doesn't include `insns.inc`.
2025-07-31  ZJIT: Stub JIT-to-JIT calls (#14052)  Takashi Kokubun
2025-07-30  ZJIT: Prepare for sharing JIT hooks with ZJIT (#14044)  Takashi Kokubun
2025-07-29  Get rid of imemo_ast  Jean Boussier
It has been marked as obsolete for a while and I see no reason to keep it.
2025-07-28  YJIT: Call YJIT hooks before enabling YJIT (#14032)  Takashi Kokubun
2025-07-24  YJIT: Use raw memory write to update pointers in code  Kunshan Wang
Because we have set all code memory to writable before the reference updating phase, we can use raw memory writes directly.
2025-07-24  Remove unused imemo_parser_strterm  Peter Zhu
2025-07-16  YJIT: Side-exit on String#dup when it's not leaf (#13921)  Takashi Kokubun
* YJIT: Side-exit on String#dup when it's not leaf
* Use an enum instead of a macro for bindgen
2025-07-14  YJIT: Move RefCell one level down  Kunshan Wang
This is the second part of making YJIT work with parallel GC.

During GC, `rb_yjit_iseq_mark` and `rb_yjit_iseq_update_references` need to resolve offsets in `Block::gc_obj_offsets` into absolute addresses before reading or updating the fields. This needs the base address stored in `VirtualMemory::region_start` which was previously behind a `RefCell`. When multiple GC threads scan multiple iseq simultaneously (which is possible for some GC modules such as MMTk), it will panic because the `RefCell` is already borrowed.

We notice that some fields of `VirtualMemory`, such as `region_start`, are never modified once `VirtualMemory` is constructed. We change the type of the field `CodeBlock::mem_block` from `Rc<RefCell<T>>` to `Rc<T>`, and push the `RefCell` into `VirtualMemory`. We extract mutable fields of `VirtualMemory` into a dedicated struct `VirtualMemoryMut`, and store them in a field `VirtualMemory::mutable` which is a `RefCell<VirtualMemoryMut>`.

After this change, methods that access immutable fields in `VirtualMemory`, particularly `base_ptr()` which reads `region_start`, will no longer need to borrow any `RefCell`. Methods that access mutable fields will need to borrow `VirtualMemory::mutable`, but the number of borrowing operations becomes strictly fewer than before because borrowing operations previously done in callers (such as `CodeBlock::write_mem`) are moved into methods of `VirtualMemory` (such as `VirtualMemory::write_bytes`).
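The restructuring described in this message can be sketched as follows. This is a simplified illustration, not YJIT's actual definitions: `region_start` is modelled as a plain `usize`, and the `mapped_bytes` field and the `grow` and `mapped_bytes` methods are hypothetical stand-ins for the real mutable state and its accessors.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Mutable state, isolated behind its own RefCell.
struct VirtualMemoryMut {
    mapped_bytes: usize, // hypothetical mutable field
}

struct VirtualMemory {
    /// Never modified after construction, so it needs no RefCell.
    region_start: usize,
    mutable: RefCell<VirtualMemoryMut>,
}

impl VirtualMemory {
    fn new(region_start: usize) -> Self {
        VirtualMemory {
            region_start,
            mutable: RefCell::new(VirtualMemoryMut { mapped_bytes: 0 }),
        }
    }

    /// Reads only the immutable field: no borrow, so no borrow panic.
    fn base_ptr(&self) -> usize {
        self.region_start
    }

    /// The borrow is taken inside the method and released on return,
    /// instead of being taken by every caller.
    fn grow(&self, len: usize) {
        self.mutable.borrow_mut().mapped_bytes += len;
    }

    fn mapped_bytes(&self) -> usize {
        self.mutable.borrow().mapped_bytes
    }
}

struct CodeBlock {
    /// Was `Rc<RefCell<VirtualMemory>>` before the change.
    mem_block: Rc<VirtualMemory>,
}
```

With this shape, `base_ptr()` keeps working even while some other code holds a mutable borrow of `VirtualMemory::mutable`, which is the property the GC marking and reference-updating paths rely on.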
2025-07-14  YJIT: Set code mem permissions in bulk  Kunshan Wang
Some GC modules, notably MMTk, support parallel GC, i.e. multiple GC threads work in parallel during a GC. Currently, when two GC threads scan two iseq objects simultaneously when YJIT is enabled, both threads will attempt to borrow `CodeBlock::mem_block`, which will result in panic.

This commit makes one part of the change. We now set the YJIT code memory to writable in bulk before the reference-updating phase, and reset it to executable in bulk after the reference-updating phase.

Previously, YJIT lazily set memory pages writable while updating object references embedded in JIT-compiled machine code, and set the memory back to executable by calling `mark_all_executable`. This approach is inherently unfriendly to parallel GC because (1) it borrows `CodeBlock::mem_block`, and (2) it sets the whole `CodeBlock` as executable which races with other GC threads that are updating other iseq objects. It also has performance overhead due to the frequent invocation of system calls.

We now set the permission of all the code memory in bulk before and after the reference updating phase. Multiple GC threads can now perform raw memory writes in parallel. We should also see performance improvement during moving GC because of the reduced number of `mprotect` system calls.
2025-07-09  ZJIT: Mark profiled objects when marking ISEQ (#13784)  Takashi Kokubun
2025-06-28  ZJIT: Codegen for `defined?(yield)`  Alan Wu
Lots of stdlib methods such as Integer#times and Kernel#then use this, so at least this will make writing tests slightly easier.
2025-06-23  ZJIT: Optimize frozen array aref (#13666)  Max Bernstein
If we have a frozen array `[..., a, ...]` and a compile-time fixnum index `i`, we can do the array load at compile-time.
2025-06-17  Rename `imemo_class_fields` -> `imemo_fields`  Jean Boussier
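As an illustration of the folding described here, the sketch below uses a tiny made-up IR. The `Insn` enum and `fold_frozen_aref` are hypothetical and far simpler than ZJIT's real HIR; elements are modelled as plain integers.

```rust
/// Hypothetical, heavily simplified IR; not ZJIT's actual HIR types.
#[derive(Debug, Clone, PartialEq)]
enum Insn {
    Const(i64),
    /// Load element `index` of `array`; `frozen` records whether the
    /// array is known to be frozen at compile time.
    ArrayAref { array: Vec<i64>, frozen: bool, index: i64 },
}

/// Replace a load from a frozen array at a known index with a constant.
fn fold_frozen_aref(insn: Insn) -> Insn {
    if let Insn::ArrayAref { array, frozen: true, index } = &insn {
        // Ruby allows negative indexes that count from the end.
        let len = array.len() as i64;
        let idx = if *index < 0 { index + len } else { *index };
        if idx >= 0 && idx < len {
            return Insn::Const(array[idx as usize]);
        }
        // Out-of-range loads are left alone in this sketch.
    }
    // Not frozen, or not foldable: keep the runtime load.
    insn
}
```

The fold is only valid because a frozen array cannot change between compile time and run time; an unfrozen array must keep its runtime load.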
Notes: Merged: https://github.com/ruby/ruby/pull/13626
2025-06-13  Get rid of FL_EXIVAR  Jean Boussier
Now that the shape_id gives us all the same information, it's no longer needed.
Notes: Merged: https://github.com/ruby/ruby/pull/13612
2025-06-13  Add SHAPE_ID_HAS_IVAR_MASK for quick ivar check  Jean Boussier
This allows checking if an object has ivars with just a shape_id mask.
Notes: Merged: https://github.com/ruby/ruby/pull/13606
2025-06-12  Get rid of `rb_shape_lookup`  Jean Boussier
Notes: Merged: https://github.com/ruby/ruby/pull/13596
2025-06-12  Turn `rb_classext_t.fields` into a T_IMEMO/class_fields  Jean Boussier
This behaves almost exactly like a T_OBJECT; the layout is entirely compatible.

This aims to solve two problems.

First, it solves the problem of namespaced classes having a single `shape_id`. Now each namespaced classext has an object that can hold the namespace specific shape.

Second, it opens the door to later making class instance variable writes atomic, and hence being able to read class variables without locking the VM. In the future, in multi-ractor mode, we can do the write on a copy of the `fields_obj` and then atomically swap it.

Considerations:

- Right now the `RClass` shape_id is always synchronized, but with namespaces we should likely mark classes that have multiple namespaces with a specific shape flag.

Notes: Merged: https://github.com/ruby/ruby/pull/13411
2025-06-11  YJIT: x86: Fix panic writing 32-bit number with top bit set  Alan Wu
Previously, `asm.mov(m32, imm32)` panicked when `imm32 > 0x80000000`. It attempted to split imm32 into a register before doing the store, but then the register size didn't match the destination size.

Instead of splitting, use the `MOV r/m32, imm32` form which works for all 32-bit values. Adjust asserts that assumed that all forms undergo sign extension, which is not true for this case.

See: 54edc930f9f0a658da45cfcef46648d1b6f82467
Notes: Merged: https://github.com/ruby/ruby/pull/13576
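The distinction between the two store forms can be sketched as a predicate on the immediate. The helper `fits_imm32` below is hypothetical and is not YJIT's backend code; it only encodes the x86-64 rule that `MOV r/m64, imm32` sign-extends its immediate while `MOV r/m32, imm32` stores the 32 bits as given.

```rust
/// Can `imm` be encoded as the imm32 operand of a store to a destination
/// that is `dest_bits` wide, without going through a scratch register?
fn fits_imm32(imm: u64, dest_bits: u8) -> bool {
    match dest_bits {
        // `MOV r/m32, imm32` writes the 32 bits verbatim, so any value
        // that fits in 32 bits works, including ones with the top bit set.
        32 => imm <= u32::MAX as u64,
        // `MOV r/m64, imm32` sign-extends, so the value must survive a
        // round trip through i32.
        64 => imm as i64 == (imm as i32) as i64,
        // Other widths are out of scope for this sketch.
        _ => false,
    }
}
```

For example, `0x8000_0001` is accepted for a 32-bit destination but rejected for a 64-bit one, because sign extension would turn it into `0xffff_ffff_8000_0001`.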
2025-06-07  Get rid of rb_shape_t.heap_id  Jean Boussier
Notes: Merged: https://github.com/ruby/ruby/pull/13556
2025-06-05  Refactor raw accesses to rb_shape_t.capacity  Jean Boussier
Notes: Merged: https://github.com/ruby/ruby/pull/13524
2025-06-05  Get rid of `rb_shape_t.flags`  Jean Boussier
Now all flags are only in the `shape_id_t`, and can all be checked without needing to dereference a pointer.
Notes: Merged: https://github.com/ruby/ruby/pull/13515
2025-06-04  Get rid of TOO_COMPLEX shape type  Jean Boussier
Instead it's now a `shape_id` flag. This allows checking whether an object is complex without having to chase the `rb_shape_t` pointer.
Notes: Merged: https://github.com/ruby/ruby/pull/13511
2025-06-03  Use all 32bits of `shape_id_t` on all platforms  Jean Boussier
Followup: https://github.com/ruby/ruby/pull/13341 / [Feature #21353]

Even though `shape_id_t` has been made 32 bits, we were still limited to using only the lower 16 bits because they had to fit alongside `attr_index_t` inside a `uintptr_t` in inline caches. By enlarging inline caches we can unlock the full 32 bits on all platforms, allowing these extra bits to be used for tagging.

Notes: Merged: https://github.com/ruby/ruby/pull/13500
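To make the size constraint concrete, here is a minimal sketch of packing a shape ID and an attribute index into a single cache word. The 64-bit word, the 32-bit width chosen for `AttrIndex`, and the `pack`/`unpack` helpers are assumptions for illustration only; the commit message does not describe the enlarged cache layout in detail.

```rust
type ShapeId = u32;
type AttrIndex = u32; // width is an assumption made for this sketch

/// Pack both values into one 64-bit word: shape_id in the high half,
/// attr_index in the low half. Neither value loses any bits.
fn pack(shape_id: ShapeId, index: AttrIndex) -> u64 {
    ((shape_id as u64) << 32) | index as u64
}

/// Split a packed word back into its two halves.
fn unpack(word: u64) -> (ShapeId, AttrIndex) {
    ((word >> 32) as ShapeId, word as AttrIndex)
}
```

A word the size of a 32-bit `uintptr_t` would leave only 16 bits per value, which is the limit the message refers to; a wider cache entry lets the upper bits of the shape ID carry tags.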
2025-06-02  shape.c: Implement a lock-free version of get_next_shape_internal  Jean Boussier
Whenever we run into an inline cache miss when we try to set an ivar, we may need to take the global lock, just to be able to look up inside `shape->edges`.

To solve that, when we're in multi-ractor mode, we can treat the `shape->edges` as immutable. When we need to add a new edge, we first copy the table, and then replace it with CAS.

This increases memory allocations, however we expect that creating new transitions becomes increasingly rare over time.

```ruby
class A
  def initialize(bool)
    @a = 1
    if bool
      @b = 2
    else
      @c = 3
    end
  end

  def test
    @d = 4
  end
end

def bench(iterations)
  i = iterations
  while i > 0
    A.new(true).test
    A.new(false).test
    i -= 1
  end
end

if ARGV.first == "ractor"
  ractors = 8.times.map do
    Ractor.new do
      bench(20_000_000 / 8)
    end
  end
  ractors.each(&:take)
else
  bench(20_000_000)
end
```

The above benchmark takes 27 seconds in Ractor mode on Ruby 3.4, and only 1.7s with this branch.

Co-Authored-By: Étienne Barrié <etienne.barrie@gmail.com>
Notes: Merged: https://github.com/ruby/ruby/pull/13441
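The copy-then-CAS scheme described in this message can be sketched with an atomic pointer to an immutable table. This is an illustrative model, not the C implementation in shape.c: the `Shape` and `Edges` types and the `get_or_add_edge` method are hypothetical, and superseded tables are simply leaked where a real VM would rely on its garbage collector.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicPtr, Ordering};

/// Edge table of a shape: ivar name -> child shape id. A table is never
/// mutated once it has been published.
type Edges = HashMap<&'static str, u32>;

struct Shape {
    edges: AtomicPtr<Edges>,
}

impl Shape {
    fn new() -> Self {
        Shape { edges: AtomicPtr::new(Box::into_raw(Box::new(Edges::new()))) }
    }

    /// Lock-free read of the currently published table.
    fn lookup(&self, name: &str) -> Option<u32> {
        // Safe here because published tables are never freed or mutated.
        let table = unsafe { &*self.edges.load(Ordering::Acquire) };
        table.get(name).copied()
    }

    /// Return the existing child for `name`, or publish a copy of the
    /// table that also contains `child`.
    fn get_or_add_edge(&self, name: &'static str, child: u32) -> u32 {
        loop {
            let old = self.edges.load(Ordering::Acquire);
            let old_table = unsafe { &*old };
            if let Some(&existing) = old_table.get(name) {
                return existing;
            }
            // Copy the table, add the edge, then try to swap it in.
            let mut copy = old_table.clone();
            copy.insert(name, child);
            let new = Box::into_raw(Box::new(copy));
            match self.edges.compare_exchange(old, new, Ordering::AcqRel, Ordering::Acquire) {
                // Published. The old table is deliberately leaked in this
                // sketch so that concurrent readers stay valid.
                Ok(_) => return child,
                // Another thread published first: free our copy, retry.
                Err(_) => unsafe { drop(Box::from_raw(new)) },
            }
        }
    }
}
```

Each new edge costs a full table copy, which matches the memory trade-off noted in the message; the cost is acceptable only if new transitions become rare once the program has warmed up.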
2025-05-28  Use flag for RCLASS_IS_INITIALIZED  John Hawthorn
Previously we used a flag to set whether a module was uninitialized. When checking whether a class was initialized, we first had to check that it had a non-zero superclass, as well as that it wasn't BasicObject.

With the advent of namespaces, RCLASS_SUPER is now an expensive operation, and though we could just check for the prime superclass, we might as well take this opportunity to use a flag so that we can perform the initialized check with as few instructions as possible.

It's possible in the future that we could prevent uninitialized classes from being available to the user, but currently there are a few ways to do that.

Notes: Merged: https://github.com/ruby/ruby/pull/13443
2025-05-27  Refactor `rb_shape_too_complex_p` to take a `shape_id_t`.  Jean Boussier
Notes: Merged: https://github.com/ruby/ruby/pull/13450
2025-05-27  Refactor `rb_shape_get_iv_index` to take a `shape_id_t`  Jean Boussier
Further reduce exposure of `rb_shape_t`.
Notes: Merged: https://github.com/ruby/ruby/pull/13450
2025-05-27  Get rid of `rb_shape_id(rb_shape_t *)`  Jean Boussier
We should avoid conversions from `rb_shape_t *` into `shape_id_t` outside of `shape.c` as the short term goal is to have `shape_id_t` contain tags.
Notes: Merged: https://github.com/ruby/ruby/pull/13448
2025-05-15  YJIT: handle opt_aset_with  Jean Boussier
```
# frozen_string_literal: true
hash["literal"] = value
```

Notes: Merged: https://github.com/ruby/ruby/pull/13342
2025-05-12  YJIT: Split the block on optimized getlocal/setlocal (#13282)  Takashi Kokubun
Notes: Merged-By: k0kubun <takashikkbn@gmail.com>
2025-05-11  Add yjit/zjit bindings for adding namespace  Satoshi Tagomori
2025-05-09  Rename `RB_OBJ_SHAPE` -> `rb_obj_shape`  Jean Boussier
As well as `RB_OBJ_SHAPE_ID` -> `rb_obj_shape_id` and `RSHAPE` is now a simple alias for `rb_shape_lookup`.

I tried to turn all these into `static inline` but I'm having trouble with `RUBY_EXTERN rb_shape_tree_t *rb_shape_tree_ptr;` not being exposed as I'd expect.

Notes: Merged: https://github.com/ruby/ruby/pull/13283
2025-05-09  Rename `rb_shape_get_shape_id` -> `RB_OBJ_SHAPE_ID`  Jean Boussier
And `rb_shape_get_shape` -> `RB_OBJ_SHAPE`.
Notes: Merged: https://github.com/ruby/ruby/pull/13283
2025-05-09  Refactor `rb_shape_get_next` to return an ID  Jean Boussier
Also rename it, and change parameters to be consistent with other transition functions.
Notes: Merged: https://github.com/ruby/ruby/pull/13283
2025-05-09  Rename `rb_shape_obj_too_complex` -> `rb_shape_obj_too_complex_p`  Jean Boussier
Notes: Merged: https://github.com/ruby/ruby/pull/13283
2025-05-09  Rename `rb_shape_get_shape_by_id` -> `RSHAPE`  Jean Boussier
Notes: Merged: https://github.com/ruby/ruby/pull/13283