| Age | Commit message (Collapse) | Author |
|
Co-authored-by: Alan Wu <alanwu@ruby-lang.org>
|
|
|
|
|
|
Its definition changes depending on e.g. whether there is YJIT in the
build.
|
|
Since we do a decent job of pre-sizing objects, don't handle the case where we would need to re-size an object. Also don't handle too-complex shapes.
lobsters stats before:
```
Top-20 calls to C functions from JIT code (79.4% of total 90,051,140):
rb_vm_opt_send_without_block: 19,762,433 (21.9%)
rb_vm_setinstancevariable: 7,698,314 ( 8.5%)
rb_hash_aref: 6,767,461 ( 7.5%)
rb_vm_env_write: 5,373,080 ( 6.0%)
rb_vm_send: 5,049,229 ( 5.6%)
rb_vm_getinstancevariable: 4,535,259 ( 5.0%)
rb_obj_is_kind_of: 3,746,306 ( 4.2%)
rb_ivar_get_at_no_ractor_check: 3,745,237 ( 4.2%)
rb_vm_invokesuper: 3,037,467 ( 3.4%)
rb_ary_entry: 2,351,983 ( 2.6%)
rb_vm_opt_getconstant_path: 1,344,740 ( 1.5%)
rb_vm_invokeblock: 1,184,474 ( 1.3%)
Hash#[]=: 1,064,288 ( 1.2%)
rb_gc_writebarrier: 1,006,972 ( 1.1%)
rb_ec_ary_new_from_values: 902,687 ( 1.0%)
fetch: 898,667 ( 1.0%)
rb_str_buf_append: 833,787 ( 0.9%)
rb_class_allocate_instance: 822,024 ( 0.9%)
Hash#fetch: 699,580 ( 0.8%)
_bi20: 682,068 ( 0.8%)
Top-4 setivar fallback reasons (100.0% of total 7,732,326):
shape_transition: 6,032,109 (78.0%)
not_monomorphic: 1,469,300 (19.0%)
not_t_object: 172,636 ( 2.2%)
too_complex: 58,281 ( 0.8%)
```
lobsters stats after:
```
Top-20 calls to C functions from JIT code (79.0% of total 88,322,656):
rb_vm_opt_send_without_block: 19,777,880 (22.4%)
rb_hash_aref: 6,771,589 ( 7.7%)
rb_vm_env_write: 5,372,789 ( 6.1%)
rb_gc_writebarrier: 5,195,527 ( 5.9%)
rb_vm_send: 5,049,145 ( 5.7%)
rb_vm_getinstancevariable: 4,538,485 ( 5.1%)
rb_obj_is_kind_of: 3,746,241 ( 4.2%)
rb_ivar_get_at_no_ractor_check: 3,745,172 ( 4.2%)
rb_vm_invokesuper: 3,037,157 ( 3.4%)
rb_ary_entry: 2,351,968 ( 2.7%)
rb_vm_setinstancevariable: 1,703,337 ( 1.9%)
rb_vm_opt_getconstant_path: 1,344,730 ( 1.5%)
rb_vm_invokeblock: 1,184,290 ( 1.3%)
Hash#[]=: 1,061,868 ( 1.2%)
rb_ec_ary_new_from_values: 902,666 ( 1.0%)
fetch: 898,666 ( 1.0%)
rb_str_buf_append: 833,784 ( 0.9%)
rb_class_allocate_instance: 821,778 ( 0.9%)
Hash#fetch: 755,913 ( 0.9%)
Top-4 setivar fallback reasons (100.0% of total 1,703,337):
not_monomorphic: 1,472,405 (86.4%)
not_t_object: 172,629 (10.1%)
too_complex: 58,281 ( 3.4%)
new_shape_needs_extension: 22 ( 0.0%)
```
I also noticed that primitive printing in HIR was broken so I fixed that.
Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org>
|
|
* Replace every Invariant::SingleRactorMode PatchPoint with a call to
assume_single_ractor_mode(). This fixes https://github.com/Shopify/ruby/issues/875
for SingleRactorMode patchpoints.
|
|
Append a codepoint.
|
|
```
warning: unknown lint: `unnecessary_transmutes`
--> zjit/src/cruby.rs:107:9
|
107 | #[allow(unnecessary_transmutes)] // https://github.com/rust-lang/rust-bindgen/issues/2807
| ^^^^^^^^^^^^^^^^^^^^^^
|
= note: `#[warn(unknown_lints)]` on by default
```
|
|
Don't support shape transitions for now.
|
|
Add `LoadEC` then it's just two `LoadField`.
|
|
Going through a call to a C function just to read a bitfield was a
little extreme. We did it to be super conservative since bitfields
have historically been the trigger of many bugs and surprises. Let's
try directly accessing them with code from rust-bindgen. If this
ends up causing issues, we can use the FFI approach behind nicer
wrappers.
In any case, directly access regular struct fields such as `lead_num`
and `opt_num` to remove boilerplate.
|
|
This will make reading the parameters nicer for the JITs. Should be
no-op for the C side.
|
|
|
|
lobsters:
```
Top-20 calls to C functions from JIT code (79.9% of total 97,004,883):
rb_vm_opt_send_without_block: 19,874,212 (20.5%)
rb_vm_setinstancevariable: 9,774,841 (10.1%)
rb_ivar_get: 9,358,866 ( 9.6%)
rb_hash_aref: 6,828,948 ( 7.0%)
rb_vm_send: 6,441,551 ( 6.6%)
rb_vm_env_write: 5,375,989 ( 5.5%)
rb_vm_invokesuper: 3,037,836 ( 3.1%)
Module#===: 2,562,446 ( 2.6%)
rb_ary_entry: 2,354,546 ( 2.4%)
Kernel#is_a?: 1,424,092 ( 1.5%)
rb_vm_opt_getconstant_path: 1,344,923 ( 1.4%)
Thread.current: 1,300,822 ( 1.3%)
rb_zjit_defined_ivar: 1,222,613 ( 1.3%)
rb_vm_invokeblock: 1,184,555 ( 1.2%)
Hash#[]=: 1,061,969 ( 1.1%)
rb_ary_push: 1,024,987 ( 1.1%)
rb_ary_new_capa: 904,003 ( 0.9%)
rb_str_buf_append: 833,782 ( 0.9%)
rb_class_allocate_instance: 822,626 ( 0.8%)
Hash#fetch: 755,913 ( 0.8%)
```
railsbench:
```
Top-20 calls to C functions from JIT code (74.8% of total 189,170,268):
rb_vm_opt_send_without_block: 29,870,307 (15.8%)
rb_vm_setinstancevariable: 17,631,199 ( 9.3%)
rb_hash_aref: 16,928,890 ( 8.9%)
rb_ivar_get: 14,441,240 ( 7.6%)
rb_vm_env_write: 11,571,001 ( 6.1%)
rb_vm_send: 11,153,457 ( 5.9%)
rb_vm_invokesuper: 7,568,267 ( 4.0%)
Module#===: 6,065,923 ( 3.2%)
Hash#[]=: 2,842,990 ( 1.5%)
rb_ary_entry: 2,766,125 ( 1.5%)
rb_ary_push: 2,722,079 ( 1.4%)
rb_vm_invokeblock: 2,594,398 ( 1.4%)
Thread.current: 2,560,129 ( 1.4%)
rb_str_getbyte: 1,965,627 ( 1.0%)
Kernel#is_a?: 1,961,815 ( 1.0%)
rb_vm_opt_getconstant_path: 1,863,678 ( 1.0%)
rb_hash_new_with_size: 1,796,456 ( 0.9%)
rb_class_allocate_instance: 1,785,043 ( 0.9%)
String#empty?: 1,713,414 ( 0.9%)
rb_ary_new_capa: 1,678,834 ( 0.9%)
```
shipit:
```
Top-20 calls to C functions from JIT code (83.4% of total 182,402,821):
rb_vm_opt_send_without_block: 45,753,484 (25.1%)
rb_ivar_get: 21,020,650 (11.5%)
rb_vm_setinstancevariable: 17,528,603 ( 9.6%)
rb_hash_aref: 11,892,856 ( 6.5%)
rb_vm_send: 11,723,471 ( 6.4%)
rb_vm_env_write: 10,434,452 ( 5.7%)
Module#===: 4,225,048 ( 2.3%)
rb_vm_invokesuper: 3,705,906 ( 2.0%)
Thread.current: 3,337,603 ( 1.8%)
rb_ary_entry: 3,114,378 ( 1.7%)
Hash#[]=: 2,509,912 ( 1.4%)
Array#empty?: 2,282,994 ( 1.3%)
rb_vm_invokeblock: 2,210,511 ( 1.2%)
Hash#fetch: 2,017,960 ( 1.1%)
_bi20: 1,975,147 ( 1.1%)
rb_zjit_defined_ivar: 1,897,127 ( 1.0%)
rb_vm_opt_getconstant_path: 1,813,294 ( 1.0%)
rb_ary_new_capa: 1,615,406 ( 0.9%)
Kernel#is_a?: 1,567,854 ( 0.9%)
rb_class_allocate_instance: 1,560,035 ( 0.9%)
```
Thanks to @eregon for the idea.
Co-authored-by: Jacob Denbeaux <jacob.denbeaux@shopify.com>
Co-authored-by: Alan Wu <XrXr@users.noreply.github.com>
|
|
|
|
Inline the `String#bytesize` function and remove the C call.
|
|
|
|
Fix https://github.com/Shopify/ruby/issues/670
|
|
|
|
* Handled in cruby_methods.rs because there is no basic operation for Fixnum#^.
|
|
This helps ZJIT optimize ~300,000 more sends in ruby-bench's lobsters benchmark.
Top-6 not optimized method types for send_without_block:
Before After
iseq: 713,899 (48.0%) iseq: 725,668 (62.4%)
optimized: 359,864 (24.2%) optimized: 359,940 (31.0%)
bmethod: 339,040 (22.8%) alias: 73,541 ( 6.3%)
alias: 73,392 ( 4.9%) null: 2,521 ( 0.2%)
null: 2,521 ( 0.2%) bmethod: 979 ( 0.1%)
cfunc: 4 ( 0.0%) cfunc: 4 ( 0.0%)
|
|
```
/src/jit.c:19:5: error: ISO C restricts enumerator values to range of 'int' (4294967295 is too large) [-Werror,-Wpedantic]
19 | RB_INVALID_SHAPE_ID = INVALID_SHAPE_ID,
| ^ ~~~~~~~~~~~~~~~~
```
|
|
This is mostly to see what happens to the loops-times benchmark.
|
|
|
|
no_rest() trips an assert inside the GC when we allocate with the GC
disabled this way:
(gc_continue) ../src/gc/default/default.c:2029
(newobj_cache_miss+0x128) [0x105040048] ../src/gc/default/default.c:2370
(rb_gc_impl_new_obj+0x7c) [0x105036374] ../src/gc/default/default.c:2482
(newobj_of) ../src/gc.c:995
(rb_method_entry_alloc+0x40) [0x1051e6c64] ../src/vm_method.c:1102
(rb_method_entry_complement_defined_class) ../src/vm_method.c:1180
(prepare_callable_method_entry+0x14c) [0x1051e87b8] ../src/vm_method.c:1728
(callable_method_entry_or_negative+0x1e8) [0x1051e809c] ../src/vm_method.c:1874
It tries to continue the GC because it was out of space. It looks like
it's not safe to allocate new objects after calling
rb_gc_disable_no_rest(); existing usages only use it around malloc calls.
|
|
This fixes a reliable "ZJIT saw a dead object" repro on my machine, and should
fix the flaky ones on CI. The code for disabling the GC is the same as
the code in newobj_of().
See: https://github.com/ruby/ruby/actions/runs/18511676257/job/52753782036
|
|
Strictly more info than just the builtin_type from `assert_ne!`.
Old:
assertion `left != right` failed: ZJIT should only see live objects
left: 0
right: 0
New:
ZJIT saw a dead object. T_type=0, out-of-heap:0x0000000110d4bb40
Also, the new `VALUE::obj_info` is more flexible for print debugging than the
dump_info() it replaces. It now allows you to use it as part of a `format!`
string instead of always printing to stderr for you.
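
A minimal sketch of why returning a displayable value composes better than printing to stderr. The `Value` and `ObjInfo` types below are hypothetical stand-ins, not the actual ZJIT definitions, and the type tag is hard-coded rather than read from an object header:

```rust
/// Hypothetical stand-in for a Ruby VALUE (just an address here).
#[derive(Clone, Copy)]
pub struct Value(pub u64);

/// Hypothetical summary of an object, built without printing anything.
pub struct ObjInfo {
    builtin_type: u8,
    addr: u64,
}

impl Value {
    /// Returns a description the caller can embed wherever it likes.
    pub fn obj_info(self) -> ObjInfo {
        // Assumption: a real implementation would inspect the object.
        // The sketch always reports type 0.
        ObjInfo { builtin_type: 0, addr: self.0 }
    }
}

impl std::fmt::Display for ObjInfo {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        // Width 18 covers the "0x" prefix plus 16 hex digits.
        write!(f, "T_type={}, addr:{:#018x}", self.builtin_type, self.addr)
    }
}

/// Builds an assertion message around the object description.
pub fn dead_object_message(obj: Value) -> String {
    format!("ZJIT saw a dead object. {}", obj.obj_info())
}
```

Because `obj_info` returns a value implementing `Display`, the same description can go into `format!`, `panic!`, or a log line, whichever the call site needs.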
|
|
* ZJIT: Add NoSingletonClass patch point
This patch point makes sure that when the object has a singleton class,
the JIT code is invalidated. As of now, this is only needed for C call
optimization.
In YJIT, the singleton class guard only applies to Array, Hash, and String.
But in ZJIT, we may optimize C calls from gems (e.g. `sqlite3`). So the
patch point needs to be applied to a broader range of classes.
* ZJIT: Only generate NoSingletonClass guard when the type can have singleton class
* ZJIT: Update or forget NoSingletonClass patch point when needed
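
The bookkeeping behind such a patch point can be sketched roughly as follows. Everything here is hypothetical (the type name, the use of plain integers for class addresses and patch point ids, and the assumption that "update" means a class moved and "forget" means a class was freed); it only illustrates the idea of invalidating code when an assumption stops holding:

```rust
/// Hypothetical registry: class address -> patch points assuming that
/// instances of the class have no singleton class.
pub struct NoSingletonClassPatchPoints {
    by_class: std::collections::HashMap<usize, std::collections::BTreeSet<usize>>,
}

impl NoSingletonClassPatchPoints {
    pub fn new() -> Self {
        Self { by_class: std::collections::HashMap::new() }
    }

    /// Record that `patch_point` relies on `class` having no singleton class.
    pub fn track(&mut self, class: usize, patch_point: usize) {
        self.by_class.entry(class).or_default().insert(patch_point);
    }

    /// A singleton class was created: return the patch points to invalidate
    /// (in ascending order) and stop tracking them.
    pub fn on_singleton_class_created(&mut self, class: usize) -> Vec<usize> {
        self.by_class
            .remove(&class)
            .map(|set| set.into_iter().collect())
            .unwrap_or_default()
    }

    /// Assumed "update" case: the class object moved to a new address.
    pub fn on_class_moved(&mut self, old: usize, new: usize) {
        if let Some(set) = self.by_class.remove(&old) {
            self.by_class.entry(new).or_default().extend(set);
        }
    }

    /// Assumed "forget" case: the class was freed, so nothing can fire.
    pub fn on_class_freed(&mut self, class: usize) {
        self.by_class.remove(&class);
    }
}
```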
|
|
|
|
|
|
* ZJIT: Allow testing JIT code on zjit-test
* Resurrect TestingAllocator tests
|
|
It's not just the default allocator; other allocators are also leaf.
|
|
|
|
|
|
ZJIT: Remove JITed code after TracePoint is enabled
|
|
|
|
|
|
|
|
Specialize monomorphic `GetIvar` into:
* `GuardType(HeapObject)`
* `GuardShape`
* `LoadIvarEmbedded` or `LoadIvarExtended`
This requires profiling self for `getinstancevariable` (it's not on the operand
stack).
This also optimizes `GetIvar`s that happen as a result of inlining
`attr_reader` and `attr_accessor`.
Also move some (newly) shared JIT helpers into jit.c.
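
The guard-then-load sequence can be modeled with a toy interpreter. This is only a sketch under a made-up object model: `Value`, `Fields`, `Load`, `Outcome`, and `SpecializedGetIvar` are hypothetical names, and real JIT code emits machine instructions instead of matching on enums:

```rust
/// Toy ivar storage; both variants are hypothetical.
#[derive(Debug)]
pub enum Fields {
    Embedded(Vec<i64>), // ivars stored inline in the object slot
    Extended(Vec<i64>), // ivars stored in a separate buffer
}

#[derive(Debug)]
pub enum Value {
    Fixnum(i64),
    Heap { shape_id: u32, fields: Fields },
}

#[derive(Debug, PartialEq, Eq)]
pub enum Outcome {
    Loaded(i64),
    SideExit(&'static str),
}

/// Which load the profile selected, and at which index.
#[derive(Debug, Clone, Copy)]
pub enum Load {
    Embedded(usize),
    Extended(usize),
}

pub struct SpecializedGetIvar {
    pub expected_shape: u32,
    pub load: Load,
}

impl SpecializedGetIvar {
    pub fn run(&self, recv: &Value) -> Outcome {
        // GuardType(HeapObject)
        let (shape_id, fields) = match recv {
            Value::Heap { shape_id, fields } => (*shape_id, fields),
            _ => return Outcome::SideExit("not a heap object"),
        };
        // GuardShape
        if shape_id != self.expected_shape {
            return Outcome::SideExit("shape mismatch");
        }
        // LoadIvarEmbedded or LoadIvarExtended
        let slot = match (self.load, fields) {
            (Load::Embedded(i), Fields::Embedded(v)) => v.get(i),
            (Load::Extended(i), Fields::Extended(v)) => v.get(i),
            _ => None,
        };
        match slot {
            Some(&val) => Outcome::Loaded(val),
            None => Outcome::SideExit("layout mismatch"),
        }
    }
}
```

Once both guards pass, the load is a plain indexed read, which is what makes the monomorphic case cheap compared with a generic lookup.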
|
|
The embed layout is way more common than the heap one,
especially since VWA.
I think inverting the flag makes for more readable code.
|
|
Its usage was removed in 306d50811dd060d876d1eb364a0d5e6106f5e4f1.
|
|
ZJIT: Fix ObjToString rewrite
Currently, the rewrite for `ObjToString` always replaces it with a
`SendWithoutBlock(to_s)` instruction when the receiver is not a
string literal. This is incorrect because it calls `to_s` on the
receiver even if it's already a string.
This change fixes it by:
- Avoiding the `SendWithoutBlock(to_s)` rewrite
- Implementing codegen for `ObjToString`
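
To illustrate the difference in behavior, here is a toy model, not the actual codegen. `Value`, `Counters`, and both functions are hypothetical; the counter stands in for the observable side effect of dispatching `to_s`:

```rust
/// Toy receiver: either already a string or something else.
pub enum Value {
    Str(String),
    Int(i64),
}

/// Counts how many `to_s` dispatches happened.
pub struct Counters {
    pub to_s_sends: u32,
}

/// Stand-in for a real method dispatch of `to_s`.
fn send_to_s(v: &Value, c: &mut Counters) -> String {
    c.to_s_sends += 1;
    match v {
        Value::Str(s) => s.clone(),
        Value::Int(n) => n.to_string(),
    }
}

/// The old rewrite: always dispatch, even for strings.
pub fn obj_to_string_always_send(v: &Value, c: &mut Counters) -> String {
    send_to_s(v, c)
}

/// The intended semantics: strings pass through untouched, and only
/// non-strings pay for a `to_s` dispatch.
pub fn obj_to_string(v: &Value, c: &mut Counters) -> String {
    match v {
        Value::Str(s) => s.clone(),
        other => send_to_s(other, c),
    }
}
```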
|
|
ZJIT uses the interpreter to take type profiles of what objects pass through
the code. It stores a compressed record of the history per opcode for the
opcodes we select.
Before this change, we re-used the HIR Type data-structure, a shallow type
lattice, to store historical type information. This was quick for bringup but
is quite lossy as profiles go: we get one bit per built-in type seen, and if we
see a non-built-in type in addition, we end up with BasicObject. Not very
helpful. Additionally, it does not give us any notion of cardinality: how many
of each type did we see?
This change brings with it a much more interesting slice of type history: a
histogram. A Distribution holds a record of the top-N (where N is fixed at Ruby
compile-time) `(Class, ShapeId)` pairs and their counts. It also holds an
*other* count in case we see more than N pairs.
Using this distribution, we can make more informed decisions about when we
should use type information. We can determine if we are strictly monomorphic,
very nearly monomorphic, or something else. Maybe the call-site is polymorphic,
so we should have a polymorphic inline cache. Exciting stuff.
I also plumb this new distribution into the HIR part of the compilation
pipeline.
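
A rough sketch of such a histogram, to make the idea concrete. This is not the ZJIT implementation: `ProfiledType`, `Kind`, the integer stand-ins for class and shape, and the 90% cutoff for "nearly monomorphic" are all assumptions chosen for illustration:

```rust
/// Hypothetical stand-in for a `(Class, ShapeId)` pair.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct ProfiledType {
    pub class: usize,
    pub shape: u32,
}

#[derive(Debug, PartialEq, Eq)]
pub enum Kind {
    Empty,
    Monomorphic,
    NearlyMonomorphic,
    Polymorphic,
}

/// Top-N histogram with an overflow counter; N is fixed at compile time.
#[derive(Debug)]
pub struct Distribution<const N: usize> {
    buckets: [Option<ProfiledType>; N],
    counts: [u64; N],
    other: u64,
}

impl<const N: usize> Distribution<N> {
    pub fn new() -> Self {
        Self { buckets: [None; N], counts: [0; N], other: 0 }
    }

    /// Record one observation. Buckets fill left to right, so reaching an
    /// empty slot means the pair has not been seen before.
    pub fn observe(&mut self, ty: ProfiledType) {
        for i in 0..N {
            let slot = self.buckets[i];
            match slot {
                Some(seen) if seen == ty => {
                    self.counts[i] += 1;
                    return;
                }
                None => {
                    self.buckets[i] = Some(ty);
                    self.counts[i] = 1;
                    return;
                }
                Some(_) => {}
            }
        }
        self.other += 1; // more than N distinct pairs were seen
    }

    pub fn other(&self) -> u64 {
        self.other
    }

    pub fn kind(&self) -> Kind {
        let total = self.counts.iter().sum::<u64>() + self.other;
        let top = self.counts.iter().copied().max().unwrap_or(0);
        let distinct = self.buckets.iter().filter(|b| b.is_some()).count();
        if total == 0 {
            Kind::Empty
        } else if distinct == 1 && self.other == 0 {
            Kind::Monomorphic
        } else if top * 10 >= total * 9 {
            // Assumed threshold: one pair covers at least 90% of samples.
            Kind::NearlyMonomorphic
        } else {
            Kind::Polymorphic
        }
    }
}
```

Unlike a one-bit-per-type lattice, the counts let a compiler distinguish a site that saw a second type once from a site that is evenly split between two types.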
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|