| Age | Commit message (Collapse) | Author |
|
YJIT: Fix potential infinite loop when OOM (GH-13186)
Avoid generating an infinite loop in the case where:
1. Block `first` is adjacent to block `second`, and the branch from `first` to
`second` is a fallthrough, and
2. Block `second` immediately exits to the interpreter, and
3. Block `second` is invalidated and YJIT is OOM
While pondering how to fix this, I think I've stumbled on another related edge case:
1. Block `incoming_one` and `incoming_two` both branch to block `second`. Block
`incoming_one` has a fallthrough
2. Block `second` immediately exits to the interpreter (so it starts with its exit)
3. When Block `second` is invalidated, the incoming fallthrough branch from
`incoming_one` might be rewritten first, which overwrites the start of block
`second` with a jump to a new branch stub.
4. YJIT runs of out memory
5. The incoming branch from `incoming_two` is then rewritten, but because we're
OOM we can't generate a new stub, so we use `second`'s exit as the branch
target. However `second`'s exit was already overwritten with a jump to the
branch stub for `incoming_one`, so `incoming_two` will end up jumping to
`incoming_one`'s branch stub.
Fixes [Bug #21257]
|
|
Previously, if the method ID argument happens to be on one below the top
of the stack, we didn't overwrite the type of the stack slot, which
leaves an incorrect type for the stack slot. The included script tripped
asserts both with and without --yjit-verify-ctx.
|
|
|
|
YJIT: Avoid register allocation conflict
with a higher stack_idx
|
|
Like in the example given in delayed_deallocation(), stubs can be hit
even if the block housing it is invalidated. Mark them so we don't
work with invalidate ISeqs when hitting these stubs.
|
|
|
|
|
|
* WIP context refactoring
* Refactor to remove Context.temp_mapping
|
|
|
|
|
|
|
|
Co-authored-by: Alan Wu <alansi.xingwu@shopify.com>
Co-authored-by: Maxime Chevalier-Boisvert <maxime.chevalierboisvert@shopify.com>
|
|
We've long had a size restriction on the code memory region such that a
u32 could refer to everything. This commit capitalizes on this
restriction by shrinking the size of `CodePtr` to be 4 bytes from 8.
To derive a full raw pointer from a `CodePtr`, one needs a base pointer.
Both `CodeBlock` and `VirtualMemory` can be used for this purpose. The
base pointer is readily available everywhere, except for in the case of
the `jit_return` "branch". Generalize lea_label() to lea_jump_target()
in the IR to delay deriving the `jit_return` address until `compile()`,
when the base pointer is available.
On railsbench, this yields roughly a 1% reduction to `yjit_alloc_size`
(58,397,765 to 57,742,248).
|
|
|
|
So that we get a reminder to check CodeBlock::has_dropped_bytes().
Internally, asm.compile() already checks it, and this patch just
propagates it out to the caller with a `#[must_use]`.
Code GC logic moved out one level in entry_stub_hit(), so the body
can freely use `?`
|
|
* Port call threshold logic from Rust to C for performance
* Prefix global/field names with yjit_
* Fix linker error
* Fix preprocessor condition for rb_yjit_threshold_hit
* Fix third linker issue
* Exclude yjit_calls_at_interv from RJIT bindgen
---------
Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
|
|
* Shink local types from 8 to 4 bytes, context from 21 to 17 bytes
Use repr(packed)
* Add comment about Type being limited to 4 bits
|
|
|
|
Previously, at the end of `leave` we did
`*caller_cfp->sp = return_value`, like the interpreter.
With future changes that leaves the SP field uninitialized for C frames,
this will become problematic. For cases like returning from
`rb_funcall()`, the return value was written above the stack and
never read anyway (callers use the copy in the return register).
Leave the return value in a register at the end of `leave` and have the
code at `cfp->jit_return` decide what to do with it. This avoids the
unnecessary memory write mentioned above. For JIT-to-JIT returns, it goes
through `asm.stack_push()` and benefits from register allocation for
stack temporaries.
Mostly flat on benchmarks, with maybe some marginal speed improvements.
Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
|
|
Co-authored-by: Alan Wu <XrXr@users.noreply.github.com>
|
|
* YJIT: Add counter to measure how often we compile "cold" ISEQs (#535)
Fix counter name in DEFAULT_COUNTERS
YJIT: add --yjit-cold-threshold, don't compile cold ISEQs
YJIT: increase default cold threshold to 200_000
Remove rb_yjit_call_threshold()
Remove conflict markers
Fix compilation errors
Threshold 1 should compile immediately
Debug deadlock issue with test_ractor
Fix call threshold issue with tests
* Revert exception threshold logic. Document option in yjid.md
* (void) for 0 parameter functions in C99
* Rename iseq_entry_cold => cold_iseq_entry
* Document --yjit-cold-threshold in ruby.c
* Update doc/yjit/yjit.md
Co-authored-by: Jean byroot Boussier <jean.boussier+github@shopify.com>
* Shorten help string to appease test
* Address bug found by Kokubun. Reorder logic.
---------
Co-authored-by: Alan Wu <XrXr@users.noreply.github.com>
Co-authored-by: Jean byroot Boussier <jean.boussier+github@shopify.com>
|
|
* YJIT: Chain-guard opt_mult overflow
* YJIT: Support regenerating Jo after Mul
|
|
* YJIT: Avoid creating a vector in get_temp_regs()
Co-authored-by: Alan Wu <alansi.xingwu@shopify.com>
* Remove unused import
---------
Co-authored-by: Alan Wu <alansi.xingwu@shopify.com>
Co-authored-by: Alan Wu <XrXr@users.noreply.github.com>
|
|
* YJIT: Skip Insn::Comment and format!
if disasm is disabled
Co-authored-by: Alan Wu <alansi.xingwu@shopify.com>
* YJIT: Get rid of asm.comment
---------
Co-authored-by: Alan Wu <alansi.xingwu@shopify.com>
Notes:
Merged-By: k0kubun <takashikkbn@gmail.com>
|
|
Previously, TestStack#test_machine_stack_size failed pretty consistently
on ARM64 macOS, with Rust code and part of the interpreter used for
per-instruction fallback (rb_vm_invokeblock() and friends) touching the
stack guard page and crashing with SEGV. I've also seen the same test
fail on x64 Linux, though with a different symptom.
Notes:
Merged: https://github.com/ruby/ruby/pull/8443
Merged-By: XrXr
|
|
* YJIT: Add compilation time counter
* YJIT: Use Instant instead
Notes:
Merged-By: maximecb <maximecb@ruby-lang.org>
|
|
TempMapping (#8321)
* YJIT: merge tempmapping and temp types into a single-byte encoding
YJIT: refactor to shrink Context by 8 bytes
* Add tests, fix bug in TempMapping::map_to_local()
* Update yjit/src/core.rs
Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
* Update yjit/src/core.rs
Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
* Fewer transmutes where `as` would suffice. Also repr(u8)
* Update yjit/src/core.rs
Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
* Update yjit/src/core.rs
Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
* Update yjit/src/core.rs
Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
---------
Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
Co-authored-by: Alan Wu <XrXr@users.noreply.github.com>
Notes:
Merged-By: maximecb <maximecb@ruby-lang.org>
|
|
These types are essentially claims about what `RBASIC_CLASS(obj)`
returns. The field changes with singleton class creation, but we didn't
consider so previously and elided guards where we actually needed them.
Found running ruby/spec with --yjit-verify-ctx. The assertion interface
makes extensive use of singleton classes.
Notes:
Merged: https://github.com/ruby/ruby/pull/8299
|
|
Rack uses this. Speculate that the `obj` in `the_call(&obj)`
will be a proc when the compile-time sample is a proc.
Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
Co-authored-by: Maxime Chevalier-Boisvert <maxime.chevalierboisvert@shopify.com>
Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org>
Notes:
Merged: https://github.com/ruby/ruby/pull/8117
Merged-By: XrXr
|
|
Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>
Notes:
Merged-By: k0kubun <takashikkbn@gmail.com>
|
|
Avoid generating long dispatch chains for all array lengths seen.
Notes:
Merged-By: maximecb <maximecb@ruby-lang.org>
|
|
Notes:
Merged-By: maximecb <maximecb@ruby-lang.org>
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/8067
Merged-By: nobu <nobu@ruby-lang.org>
|
|
* YJIT: add new stats counter for compiled ISEQ entry points
* Update yjit.rb
Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
---------
Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
Notes:
Merged-By: maximecb <maximecb@ruby-lang.org>
|
|
* YJIT: Remove Insn::RegTemps
* Update a comment
Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>
---------
Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>
Notes:
Merged-By: k0kubun <takashikkbn@gmail.com>
|
|
* YJIT: Introduce Target::SideExit
* YJIT: Obviate Insn::SideExitContext
* YJIT: Avoid cloning a Context for each insn
Notes:
Merged-By: maximecb <maximecb@ruby-lang.org>
|
|
Notes:
Merged-By: maximecb <maximecb@ruby-lang.org>
|
|
* YJIT: Let Assembler own Context
* Update a comment
Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>
---------
Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>
Notes:
Merged-By: k0kubun <takashikkbn@gmail.com>
|
|
Encourages commenting about soundness of `unsafe` usages.
Notes:
Merged: https://github.com/ruby/ruby/pull/7657
|
|
This reverts commit 9e678cdbd054f78576a8f21b3f97cccc395ade22.
Without the `unsafe` annotations, the SAFETY comments make less sense.
I want to keep the SAFETY comments.
Notes:
Merged: https://github.com/ruby/ruby/pull/7657
|
|
Notes:
Merged-By: maximecb <maximecb@ruby-lang.org>
|
|
Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>
Notes:
Merged-By: k0kubun <takashikkbn@gmail.com>
|
|
Notes:
Merged-By: maximecb <maximecb@ruby-lang.org>
|
|
* YJIT: Generate side exits late as possible
* YJIT: s/for_stack_size/with_stack_size/
* YJIT: s/get_counter/exit_counter/
Notes:
Merged-By: k0kubun <takashikkbn@gmail.com>
|
|
Making overlapping `&mut`s triggers Undefined Bahavior. This function
previously had them through `cb` and `ocb` aliasing with `self` or live
references in the caller.
To fix the overlap, take `ocb` as a parameter and don't use `get_inline_cb()`
in the body of the function.
Notes:
Merged: https://github.com/ruby/ruby/pull/7611
|
|
This assert would've caught a bug I wrote while developing
ruby/ruby#7443 so I figured it would be good to commit it
as it could be helpful in the future.
Notes:
Merged: https://github.com/ruby/ruby/pull/7558
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/7535
Merged-By: k0kubun <takashikkbn@gmail.com>
|
|
`Rc` and `RefCell` both incur runtime space costs.
In addition, `RefCell` has given us some headaches with the
non obvious borrow panics it likes to throw out. The latest
one started with 7fd53eeb46db261bbc20025cdab70096245a5cbe
and is yet to be resolved.
Since we already rely on the GC to properly reclaim memory for `Block`
and `Branch`, we might as well stop paying the overhead of `Rc` and
`RefCell`. The `RefCell` panics go away with this change, too.
On 25 iterations of `railsbench` with a stats build I got
`yjit_alloc_size: 8,386,129 => 7,348,637`, with the new memory size 87.6%
of the status quo. This makes the metadata and machine code size roughly
line up one-to-one.
The general idea here is to use `&` shared references with
[interior mutability][1] with `Cell`, which doesn't take any extra
space. The `noalias` requirement that `&mut` imposes is way too hard to
meet and verify. Imagine replacing places where we would've gotten
`BorrowError` from `RefCell` with Rust/LLVM miscompiling us due to aliasing
violations. With shared references, we don't have to think about subtle
cases like the GC _sometimes_ calling the mark callback while codegen
has an aliasing reference in a stack frame below. We mostly only need to
worry about liveness, with which the GC already helps.
There is now a clean split between blocks and branches that are not yet
fully constructed and ones that are "in-service", so to speak. Working
with `PendingBranch` and `JITState` don't really involve `unsafe` stuff.
This change allows `Branch` and `Block` to not have as many optional
fields as many of them are only optional during compilation. Fields that
change post-compilation are wrapped in `Cell` to facilitate mutation
through shared references.
I do some `unsafe` dances here. I've included just a couple tests to run
with Miri (`cargo +nightly miri test miri`). We can add more Miri tests
if desired.
[1]: https://doc.rust-lang.org/std/cell/struct.UnsafeCell.html
Notes:
Merged: https://github.com/ruby/ruby/pull/7443
|
|
Notes:
Merged-By: maximecb <maximecb@ruby-lang.org>
|
|
Somewhat important because having the lock is a key part of the
soundness reasoning for the `unsafe` usage here.
Notes:
Merged: https://github.com/ruby/ruby/pull/7530
|