| Age | Commit message (Collapse) | Author |
|
This will help JITs (and maybe later the interpreter) optimize
Integer#>>.
|
|
Append a codepoint.
|
|
This otherwise would miss annotations of C methods.
|
|
Use actual receiver type. This gives us better method lookup.
|
|
|
|
ZJIT: Standardize C call related insn fields
- Add `recv` field to `CCall` and `CCallWithFrame` so now all method dispatch
related instructions have `recv` field, separate from `args` field.
This ensures consistent pointer arithmetic when generating code for these
instructions.
- Standardize `recv` field's display position in send related instructions.
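
The benefit of a dedicated `recv` field can be illustrated with a minimal sketch. The types below are simplified, hypothetical stand-ins and not ZJIT's actual definitions; the only point is that `args` no longer needs an off-by-one adjustment for the receiver.

```rust
/// Hypothetical stand-in for an HIR instruction id (assumption, not the real type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct InsnId(pub usize);

/// Simplified instruction enum: every call-like variant carries `recv`
/// separately from `args`, so `args` holds only the explicit arguments.
#[derive(Debug, Clone, PartialEq)]
pub enum Insn {
    CCall { recv: InsnId, args: Vec<InsnId> },
    CCallWithFrame { recv: InsnId, args: Vec<InsnId> },
}

impl Insn {
    /// The receiver is found the same way for every variant.
    pub fn recv(&self) -> InsnId {
        match self {
            Insn::CCall { recv, .. } | Insn::CCallWithFrame { recv, .. } => *recv,
        }
    }

    /// Argument count excluding the receiver; no `- 1` needed.
    pub fn argc(&self) -> usize {
        match self {
            Insn::CCall { args, .. } | Insn::CCallWithFrame { args, .. } => args.len(),
        }
    }
}
```

With this shape, code that computes argument offsets can treat `args.len()` as the explicit argument count for every call-like instruction.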
|
|
ZJIT: Optimize variadic cfunc Send calls into CCallVariadic
|
|
lobsters:
```
Top-4 setivar fallback reasons (100.0% of total 7,789,008):
shape_transition: 6,074,085 (78.0%)
not_monomorphic: 1,484,013 (19.1%)
not_t_object: 172,629 ( 2.2%)
too_complex: 58,281 ( 0.7%)
Top-3 getivar fallback reasons (100.0% of total 9,348,832):
not_t_object: 4,658,833 (49.8%)
not_monomorphic: 4,542,316 (48.6%)
too_complex: 147,683 ( 1.6%)
Top-3 definedivar fallback reasons (100.0% of total 366,383):
not_monomorphic: 361,389 (98.6%)
too_complex: 3,062 ( 0.8%)
not_t_object: 1,932 ( 0.5%)
```
railsbench:
```
Top-3 setivar fallback reasons (100.0% of total 15,119,057):
shape_transition: 13,760,763 (91.0%)
not_monomorphic: 982,368 ( 6.5%)
not_t_object: 375,926 ( 2.5%)
Top-2 getivar fallback reasons (100.0% of total 14,438,747):
not_t_object: 7,643,870 (52.9%)
not_monomorphic: 6,794,877 (47.1%)
Top-2 definedivar fallback reasons (100.0% of total 209,613):
not_monomorphic: 209,526 (100.0%)
not_t_object: 87 ( 0.0%)
```
shipit:
```
Top-3 setivar fallback reasons (100.0% of total 14,516,254):
shape_transition: 8,613,512 (59.3%)
not_monomorphic: 5,761,398 (39.7%)
not_t_object: 141,344 ( 1.0%)
Top-2 getivar fallback reasons (100.0% of total 21,016,444):
not_monomorphic: 11,313,482 (53.8%)
not_t_object: 9,702,962 (46.2%)
Top-2 definedivar fallback reasons (100.0% of total 290,382):
not_monomorphic: 287,755 (99.1%)
not_t_object: 2,627 ( 0.9%)
```
|
|
```
warning: unknown lint: `unnecessary_transmutes`
--> zjit/src/cruby.rs:107:9
|
107 | #[allow(unnecessary_transmutes)] // https://github.com/rust-lang/rust-bindgen/issues/2807
| ^^^^^^^^^^^^^^^^^^^^^^
|
= note: `#[warn(unknown_lints)]` on by default
```
|
|
Don't support shape transitions for now.
|
|
JIT-to-JIT sends don't blit locals to nil in the callee's
EP memory region because HIR is aware of this initial state and
memory ops are only done when necessary. Previously, we
read from this uninitialized memory by emitting `GetLocal` in e.g. BBs
that are immediate successors of an entry point.
The entry points set up the frame state properly, and we also reload
locals if necessary after an operation that potentially makes the
environment escape. So, listen to the frame state when it's supposed to
be up-to-date (`!local_inval`).
|
|
|
|
No sense calling a C function.
|
|
|
|
|
|
|
|
This is good for protoboeuf and other binary parsing
|
|
Add `LoadEC` then it's just two `LoadField`.
|
|
|
|
Fix https://github.com/Shopify/ruby/issues/876
|
|
This lets us constant-fold common monomorphic cases.
|
|
Don't emit a CCall.
|
|
Going through a call to a C function just to read a bitfield was a
little extreme. We did it to be super conservative since bitfields
have historically been the trigger of many bugs and surprises. Let's
try directly accessing them with code from rust-bindgen. If this
ends up causing issues, we can use the FFI approach behind nicer
wrappers.
In any case, directly access regular struct fields such as `lead_num`
and `opt_num` to remove boilerplate.
|
|
This will make reading the parameters nicer for the JITs. Should be
no-op for the C side.
|
|
|
|
|
|
|
|
|
|
Fixes https://github.com/Shopify/ruby/issues/877
When originally writing the Iongraph support PR, I didn't consider that the successor or predecessor sets could contain duplicates. This change prevents duplicates from appearing.
I don't think duplicates interfere with the underlying Iongraph implementation, but they don't really make sense.
I think duplicates arise when a block contains multiple jump instructions that target the same basic block.
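
A minimal sketch of this kind of deduplication, assuming a hypothetical `BlockId` type and a flat list of jump targets; it is not the code from the actual fix, only one way to keep first-seen order while dropping repeats.

```rust
use std::collections::HashSet;

/// Hypothetical basic block identifier (assumption for illustration).
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct BlockId(pub usize);

/// Collect jump targets in first-seen order, dropping repeated targets.
/// Several jumps in one block may point at the same successor.
pub fn unique_successors(jump_targets: &[BlockId]) -> Vec<BlockId> {
    let mut seen = HashSet::new();
    let mut successors = Vec::new();
    for &target in jump_targets {
        // `insert` returns false when the target was already recorded.
        if seen.insert(target) {
            successors.push(target);
        }
    }
    successors
}
```

Preserving insertion order keeps the dumped graph stable between runs, which a plain `HashSet` iteration would not guarantee.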
|
|
|
|
|
|
As can be seen in vm_block_handler_verify(), VM_BLOCK_HANDLER_NONE is
not a valid argument for vm_block_handler(). Store nil in the profiler
when seen instead of crashing.
|
|
## Components
This PR adds functionality to visualize HIR using the [Iongraph](https://spidermonkey.dev/blog/2025/10/28/iongraph-web.html) tool first created for use with SpiderMonkey.
## Justification
Iongraph's viewer is (as mentioned in the article above) a few notches above graphviz for viewing large CFGs. It also makes it easy to inspect different compiler optimization passes and multiple functions in the same browser window. Since SpiderMonkey uses this format, it may be beneficial to use it for our own JIT development.
The requirement for JSON is downstream from that of the Iongraph format. As for writing the implementation myself, ZJIT leans towards having fewer dependencies, so this is the preferred approach.
## How does it look?
<img width="902" height="957" alt="image" src="https://github.com/user-attachments/assets/e4e0991b-572a-41fd-9fed-1215bd1926c3" />
<img width="770" height="624" alt="image" src="https://github.com/user-attachments/assets/01398373-1f75-46b8-b1aa-7f5d4cbca6b8" />
Right now, it's aesthetically minimal, but is fairly robust.
## Functionality
Using `--zjit-dump-hir-iongraph` will dump all compiled functions into a directory named `/tmp/zjit-iongraph-{PROCESS_PID}`. Each file will be named `func_{ZJIT_FUNC_NAME}.json`. In order to use them in the Iongraph viewer, you'll need to use `jq` to collate them into a single file. An example invocation of `jq` is shown below for reference. As far as I understand, the name of the output file does not matter.
`jq --slurp --null-input '.functions=inputs | .version=2' /tmp/zjit-iongraph-{PROCESS_PID}/func*.json > ~/Downloads/foo.json`
From there, you can use https://mozilla-spidermonkey.github.io/iongraph/ to view your trace.
### Caveats
- The upstream Iongraph viewer doesn't allow you to click arguments to an instruction to find the instruction that they originate from when using the format that this PR generates. (I have made a small fork at https://github.com/aidenfoxivey/iongraph that fixes that functionality via https://github.com/aidenfoxivey/iongraph/commit/9e9c29b41c4dbb35cf66cb6161e5b19c8b796379.patch)
- The upstream Iongraph viewer can sometimes show "exiting edges" in the CFG as detached from the box representing their basic block.
<img width="1814" height="762" alt="image" src="https://github.com/user-attachments/assets/afbbaa16-332f-498f-849e-11c69a8cb0cc" />
(Image courtesy of @tekknolagi)
This is because the original tool was (to our understanding) written for an SSA format that does not use extended basic blocks. (Extended basic blocks let you put a jump instruction, conditional or otherwise, anywhere in the basic block.) This means that our format may generate more outgoing edges than the viewer is written to handle.
|
|
|
|
lobsters:
```
Top-20 calls to C functions from JIT code (79.9% of total 97,004,883):
rb_vm_opt_send_without_block: 19,874,212 (20.5%)
rb_vm_setinstancevariable: 9,774,841 (10.1%)
rb_ivar_get: 9,358,866 ( 9.6%)
rb_hash_aref: 6,828,948 ( 7.0%)
rb_vm_send: 6,441,551 ( 6.6%)
rb_vm_env_write: 5,375,989 ( 5.5%)
rb_vm_invokesuper: 3,037,836 ( 3.1%)
Module#===: 2,562,446 ( 2.6%)
rb_ary_entry: 2,354,546 ( 2.4%)
Kernel#is_a?: 1,424,092 ( 1.5%)
rb_vm_opt_getconstant_path: 1,344,923 ( 1.4%)
Thread.current: 1,300,822 ( 1.3%)
rb_zjit_defined_ivar: 1,222,613 ( 1.3%)
rb_vm_invokeblock: 1,184,555 ( 1.2%)
Hash#[]=: 1,061,969 ( 1.1%)
rb_ary_push: 1,024,987 ( 1.1%)
rb_ary_new_capa: 904,003 ( 0.9%)
rb_str_buf_append: 833,782 ( 0.9%)
rb_class_allocate_instance: 822,626 ( 0.8%)
Hash#fetch: 755,913 ( 0.8%)
```
railsbench:
```
Top-20 calls to C functions from JIT code (74.8% of total 189,170,268):
rb_vm_opt_send_without_block: 29,870,307 (15.8%)
rb_vm_setinstancevariable: 17,631,199 ( 9.3%)
rb_hash_aref: 16,928,890 ( 8.9%)
rb_ivar_get: 14,441,240 ( 7.6%)
rb_vm_env_write: 11,571,001 ( 6.1%)
rb_vm_send: 11,153,457 ( 5.9%)
rb_vm_invokesuper: 7,568,267 ( 4.0%)
Module#===: 6,065,923 ( 3.2%)
Hash#[]=: 2,842,990 ( 1.5%)
rb_ary_entry: 2,766,125 ( 1.5%)
rb_ary_push: 2,722,079 ( 1.4%)
rb_vm_invokeblock: 2,594,398 ( 1.4%)
Thread.current: 2,560,129 ( 1.4%)
rb_str_getbyte: 1,965,627 ( 1.0%)
Kernel#is_a?: 1,961,815 ( 1.0%)
rb_vm_opt_getconstant_path: 1,863,678 ( 1.0%)
rb_hash_new_with_size: 1,796,456 ( 0.9%)
rb_class_allocate_instance: 1,785,043 ( 0.9%)
String#empty?: 1,713,414 ( 0.9%)
rb_ary_new_capa: 1,678,834 ( 0.9%)
```
shipit:
```
Top-20 calls to C functions from JIT code (83.4% of total 182,402,821):
rb_vm_opt_send_without_block: 45,753,484 (25.1%)
rb_ivar_get: 21,020,650 (11.5%)
rb_vm_setinstancevariable: 17,528,603 ( 9.6%)
rb_hash_aref: 11,892,856 ( 6.5%)
rb_vm_send: 11,723,471 ( 6.4%)
rb_vm_env_write: 10,434,452 ( 5.7%)
Module#===: 4,225,048 ( 2.3%)
rb_vm_invokesuper: 3,705,906 ( 2.0%)
Thread.current: 3,337,603 ( 1.8%)
rb_ary_entry: 3,114,378 ( 1.7%)
Hash#[]=: 2,509,912 ( 1.4%)
Array#empty?: 2,282,994 ( 1.3%)
rb_vm_invokeblock: 2,210,511 ( 1.2%)
Hash#fetch: 2,017,960 ( 1.1%)
_bi20: 1,975,147 ( 1.1%)
rb_zjit_defined_ivar: 1,897,127 ( 1.0%)
rb_vm_opt_getconstant_path: 1,813,294 ( 1.0%)
rb_ary_new_capa: 1,615,406 ( 0.9%)
Kernel#is_a?: 1,567,854 ( 0.9%)
rb_class_allocate_instance: 1,560,035 ( 0.9%)
```
Thanks to @eregon for the idea.
Co-authored-by: Jacob Denbeaux <jacob.denbeaux@shopify.com>
Co-authored-by: Alan Wu <XrXr@users.noreply.github.com>
|
|
|
|
The name is contradictory now, and we have other tests testing the same thing.
|
|
Rename to `VM_KW_SPECIFIED_BITS_MAX` now that it's in `vm_core.h`.
|
|
|
|
* When writing to an object, the receiver should be checked if it's frozen,
not the value, so this avoids an error-prone autocomplete.
|
|
* Add Insn::StoreField and Insn::WriteBarrier
|
|
This implements Shopify#854:
- Splits boot-time and enable-time initialization,
tracks progress with `InitializationState` enum
- Introduces `RubyVM::ZJIT.enable` Ruby method for
enabling the JIT lazily, if not already enabled
- Introduces `--zjit-disable` flag, which can be
used alongside the other `--zjit-*` flags but
prevents enabling the JIT at boot time
- Adds ZJIT infra to support JIT hooks, but this
is not currently exercised (Shopify/ruby#667)
Left for future enhancements:
- Support kwargs for overriding the CLI flags in
`RubyVM::ZJIT.enable`
Closes Shopify#854
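
The split between boot-time and enable-time initialization can be sketched as a small state machine. Only the `InitializationState` name comes from the message above; the variant names, the `Jit` struct, and the boolean return value of `enable` are hypothetical and chosen for illustration.

```rust
/// Progress of JIT initialization. Variant names are assumptions.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum InitializationState {
    Uninitialized,
    /// Boot-time setup is done, but the JIT is not compiling yet.
    Booted,
    /// Enable-time setup is done; the JIT is active.
    Enabled,
}

/// Hypothetical holder for the initialization state.
pub struct Jit {
    state: InitializationState,
}

impl Jit {
    pub fn new() -> Self {
        Jit { state: InitializationState::Uninitialized }
    }

    pub fn state(&self) -> InitializationState {
        self.state
    }

    /// Boot-time step. `disable_at_boot` models a flag like `--zjit-disable`:
    /// options are processed, but enabling is deferred.
    pub fn boot(&mut self, disable_at_boot: bool) {
        if self.state == InitializationState::Uninitialized {
            self.state = InitializationState::Booted;
            if !disable_at_boot {
                self.enable();
            }
        }
    }

    /// Enable-time step, safe to call repeatedly.
    /// Returns true only when this call performed the transition.
    pub fn enable(&mut self) -> bool {
        match self.state {
            InitializationState::Booted => {
                self.state = InitializationState::Enabled;
                true
            }
            InitializationState::Uninitialized | InitializationState::Enabled => false,
        }
    }
}
```

Tracking the state explicitly makes a lazy enable call idempotent: a second call observes `Enabled` and does nothing.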
|
|
* This can catch subtle errors early, so avoid a fallback case and
handle every instruction explicitly.
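
A minimal sketch of the idea, using a hypothetical three-variant instruction enum rather than the real HIR: without a `_ =>` arm, adding a new variant makes the `match` fail to compile until the new case is handled.

```rust
/// Hypothetical, simplified instruction set (assumption for illustration).
pub enum Insn {
    Const(i64),
    Add(usize, usize),
    Return(usize),
}

/// Whether an instruction has an effect beyond producing a value.
pub fn has_effects(insn: &Insn) -> bool {
    // Deliberately no wildcard arm: the compiler enforces that every
    // variant is listed, so a new variant cannot silently fall through.
    match insn {
        Insn::Const(_) => false,
        Insn::Add(_, _) => false,
        Insn::Return(_) => true,
    }
}
```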
|
|
Make it easier to see what happens when one is changed.
|
|
* Correct JIT entry points for optionals so each optional starts with nil
before its initialization routine runs. Establish that
`jit_entry_points[filled_opts_num]` gives the appropriate entry point
* Correct number of HIR block parameters for each JIT entry point
* Entry points that share the same ISEQ PC get separate entries since
they start with different state. No more deduplication.
* Reject post parameters. This was hidden behind the check for optionals.
* Make sure to visit every BB in iseq_to_hir(). Some weren't visited
when the initialization routine for an optional terminates the block
in a `SideExit`. Remove the now impossible `FailedOptionalArguments`.
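
The indexing convention for `jit_entry_points[filled_opts_num]` can be sketched as follows. The function name and signature are hypothetical; `lead_num` and `opt_num` are used here as the counts of required and optional parameters, and the sketch assumes a table with `opt_num + 1` entries and no other parameter kinds.

```rust
/// Compute the index into a hypothetical `jit_entry_points` table for a
/// call passing `argc` positional arguments. Returns `None` when `argc`
/// is outside the range the method accepts.
pub fn entry_point_index(lead_num: usize, opt_num: usize, argc: usize) -> Option<usize> {
    // Arguments beyond the required ones fill optionals left to right.
    let filled_opts_num = argc.checked_sub(lead_num)?;
    // Index 0 means no optional was supplied; `opt_num` means all were.
    (filled_opts_num <= opt_num).then_some(filled_opts_num)
}
```

For a method with one required and two optional parameters, a call with a single argument would select entry 0, where both optionals still need their initialization routines to run.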
|
|
|
|
|
|
|
|
|
|
|