| Age | Commit message (Collapse) | Author |
|
|
|
31e67a476f2262e01a0829e8ab5e6d8a97e0724e,0b95cbcbde8875effdbcbb676cb0a7f751a1d4c1: [Backport #19601]
YJIT: Fix false object collection when setting ivar
Previously, setinstancevariable could generate code that calls
`rb_ensure_iv_list_size()` without first updating `cfp->sp`. This means
in the event that a GC start from within said routine the top few
objects would not be marked, causing them to be falsly collected.
Call `jit_prepare_routine_call()` first.
[Bug #19601]
---
bootstraptest/test_yjit.rb | 20 ++++++++++++++++++++
yjit/src/codegen.rs | 5 +++++
2 files changed, 25 insertions(+)
YJIT: Remove duplicate `asm.spill_temps()`
`jit_prepare_routine_call()` calls it, and there is another call above on line 2302.
Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
---
yjit/src/codegen.rs | 1 -
1 file changed, 1 deletion(-)
|
|
|
|
YJIT: Fix missing argc check in known cfuncs
Previously we were missing a compile-time check that the known cfuncs
receive the correct number of arguments.
We noticied this because in particular when using ARGS_SPLAT, which also
wasn't checked, YJIT would crash on code which was otherwise correct
(didn't raise exceptions in the VM).
This still supports vararg (argc == -1) cfuncs. I added an additional
assertion that when we use the specialized codegen for one of these
known functions that the argc are popped off the stack correctly, which
should help ensure they're implemented correctly (previously the crash
was usually observed on a future `leave` insn).
[Bug #19595]
---
bootstraptest/test_yjit.rb | 32 ++++++++++++++++++++++++++++++++
yjit/src/codegen.rs | 4 +++-
2 files changed, 35 insertions(+), 1 deletion(-)
|
|
33edcc112081f96856d52e73253d73c97a5c4a3c,b4e438d8aabaf4bba2b27f374c787543fae07c58: [Backport #19483]
YJIT: Protect strings from GC on String#<< (#7466)
Fix https://github.com/Shopify/yjit/issues/310
[Bug #19483]
Co-authored-by: Maxime Chevalier-Boisvert <maxime.chevalierboisvert@shopify.com>
Co-authored-by: Jimmy Miller <jimmy.miller@shopify.com>
---
yjit/src/codegen.rs | 3 +++
1 file changed, 3 insertions(+)
YJIT: Save PC on rb_str_concat (#7586)
[Bug #19483]
Co-authored-by: Alan Wu <alansi.xingwu@shopify.com>
---
test/ruby/test_yjit.rb | 19 +++++++++++++++++++
yjit/src/codegen.rs | 6 ++++--
2 files changed, 23 insertions(+), 2 deletions(-)
|
|
YJIT: Generate Block::entry_exit with block entry PC
Previously, when Block::entry_exit is requested from any instruction
that is not the first one in the block, we generated the exit with an
incorrect PC. We should always be using the PC for the entry of the
block for Block::entry_exit.
It was a simple typo. The bug was [introduced][1] while we were
refactoring to use the current backend. Later, we had a chance to spot
this issue while [preparing][2] to enable unused variable warnings, but
didn't spot the issue.
Fixes [Bug #19463]
[1]: 27fcab995e6dde19deb91dc6e291bdb72100af68
[2]: 31461c7e0eab4963ccc8649ea8ebf27979132c0c
---
test/ruby/test_yjit.rb | 41 +++++++++++++++++++++++++++++++++++++++++
yjit/src/codegen.rs | 4 ++--
2 files changed, 43 insertions(+), 2 deletions(-)
|
|
YJIT: Fix autosplat miscomp for blocks with optionals
When passing an array as the sole argument to `yield`, and the yieldee
takes more than 1 optional parameter, the array is expanded similar
to `*array` splat calls. This is called "autosplat" in
`setup_parameters_complex()`.
Previously, YJIT did not detect this autosplat condition. It passed the
array without expanding it, deviating from interpreter behavior.
Detect this conditon and refuse to compile it.
|
|
YJIT: Detect and reject `send(:alias_for_send, :foo)`
Previously, YJIT failed to put the stack into the correct shape when
`BasicObject#send` calls an alias method for the send method itself.
This can manifest as strange `NoMethodError`s in the final non-send
receiver, as [seen][1] with the kt-paperclip gem. I also found a case
where it makes YJIT fail the stack size assertion while compiling
`leave`.
YJIT's `BasicObject#__send__` implementation already rejects sends to
`send`, but didn't detect sends to aliases of `send`. Adjust the
detection and reject these cases.
Fixes [Bug #19464]
[1]: https://github.com/Shopify/yjit/issues/306
---
test/ruby/test_yjit.rb | 20 ++++++++++++++++++++
yjit/src/codegen.rs | 25 ++++++++++---------------
2 files changed, 30 insertions(+), 15 deletions(-)
|
|
c178926fbe879045fa711444a1fd9e906af23e3b,a4b7ec12298c78392797e5ba7704076550e4f100: [Backport #19444]
YJIT: jit_prepare_routine_call() for String#+@ missing
We saw SEGVs due to this when running with StackProf, which needs a
correct PC for RUBY_INTERNAL_EVENT_NEWOBJ, the same event used for
ObjectSpace allocation tracing.
[Bug #19444]
---
test/ruby/test_yjit.rb | 27 +++++++++++++++++++++++++++
yjit/src/codegen.rs | 5 ++++-
2 files changed, 31 insertions(+), 1 deletion(-)
YJIT: Fix false assumption that String#+@ => ::String
Could return a subclass.
[Bug #19444]
---
test/ruby/test_yjit.rb | 17 +++++++++++++++++
yjit/src/codegen.rs | 10 +++++++---
2 files changed, 24 insertions(+), 3 deletions(-)
|
|
b78f871d838c168789648738e5c67b071beb8a19,ecd0cdaf820af789f355f1a18c31d6adfe8aad94: [Backport #19400]
YJIT: Use the system page size when the code page size is too small
(#7267)
Previously on ARM64 Linux systems that use 64 KiB pages
(`CONFIG_ARM64_64K_PAGES=y`), YJIT was panicking on boot due to a failed
assertion.
The assertion was making sure that code GC can free the last code page
that YJIT manages without freeing unrelated memory. YJIT prefers picking
16 KiB as the granularity at which to free code memory, but when the
system can only free at 64 KiB granularity, that is not possible.
The fix is to use the system page size as the code page size when the
system page size is 64 KiB. Continue to use 16 KiB as the code page size
on common systems that use 16/4 KiB pages.
Add asserts to code_gc() and free_page() about code GC's assumptions.
Fixes [Bug #19400]
---
yjit/src/asm/mod.rs | 78 ++++++++++++++++++++++++++++++++------------------
yjit/src/codegen.rs | 2 --
yjit/src/virtualmem.rs | 13 +++++++++
3 files changed, 63 insertions(+), 30 deletions(-)
YJIT: Fix assertion for partially mapped last pages (#7337)
Follows up [Bug #19400]
---
test/ruby/test_yjit.rb | 19 +++++++++++++++++++
yjit/src/asm/mod.rs | 2 +-
yjit/src/virtualmem.rs | 18 +++++++++++++-----
3 files changed, 33 insertions(+), 6 deletions(-)
|
|
[PATCH 1/4] YJIT: Move CodegenGlobals::freed_pages into an Rc
This allows for supplying a freed_pages vec in Rust tests. We need it so we
can test scenarios that occur after code GC.
---
yjit/src/asm/mod.rs | 48 +++++++++++++++++++++++++++++++++------------
yjit/src/codegen.rs | 16 ++++-----------
2 files changed, 39 insertions(+), 25 deletions(-)
Subject: [PATCH 2/4] YJIT: other_cb is None in tests
Since the other cb is in CodegenGlobals, and we want Rust tests to be
self-contained.
---
yjit/src/asm/mod.rs | 1 +
1 file changed, 1 insertion(+)
Subject: [PATCH 3/4] YJIT: ARM64: Move functions out of arm64_emit()
---
yjit/src/backend/arm64/mod.rs | 180 +++++++++++++++++-----------------
1 file changed, 90 insertions(+), 90 deletions(-)
Subject: [PATCH 4/4] YJIT: ARM64: Fix long jumps to labels
Previously, with Code GC, YJIT panicked while trying to emit a B.cond
instruction with an offset that is not encodable in 19 bits. This only
happens when the code in an assembler instance straddles two pages.
To fix this, when we detect that a jump to a label can land on a
different page, we switch to a fresh new page and regenerate all the
code in the assembler there. We still assume that no one assembler has
so much code that it wouldn't fit inside a fresh new page.
[Bug #19385]
---
yjit/src/backend/arm64/mod.rs | 65 ++++++++++++++++++++++++++++++++---
1 file changed, 60 insertions(+), 5 deletions(-)
|
|
YJIT: Save PC and SP before calling leaf builtins (#7090)
Previously, we did not update `cfp->sp` before calling the C function of
ISEQs marked with `Primitive.attr! "inline"` (leaf builtins). This
caused the GC to miss temporary values on the stack in case the function
allocates and triggers a GC run. Right now, there is only a few leaf
builtins in numeric.rb on Integer methods such as `Integer#~`. Since
these methods only allocate when operating on big numbers, we missed
this issue.
Fix by saving PC and SP before calling the functions -- our usual
protocol for calling C functions that may allocate on the GC heap.
[Bug #19316]
---
test/ruby/test_yjit.rb | 16 ++++++++++++++++
yjit/src/codegen.rs | 4 ++++
2 files changed, 20 insertions(+)
|
|
YJIT: Fix `yield` into block with >=30 locals on ARM
It's a register spill issue. Fix by moving the Qnil fill snippet to
after registers are released.
[Bug #19299]
---
test/ruby/test_yjit.rb | 14 ++++++++++++++
yjit/src/codegen.rs | 29 ++++++++++++++---------------
2 files changed, 28 insertions(+), 15 deletions(-)
|
|
All the method call types need to handle argument shifting in case they're
called by `.send`, and we weren't handling that in `OPTIMIZED_METHOD_TYPE_CALL`.
Lack of shifting caused the stack size assertion in gen_leave() to fail.
Discovered by Rails CI: https://buildkite.com/rails/rails/builds/91705#018516c4-f8f8-469e-bc2d-ddeb25ca8317/1920-2067
Diagnosed with help from `@eileencodes` and `@k0kubun`.
Notes:
Merged: https://github.com/ruby/ruby/pull/6943
Merged-By: XrXr
|
|
Stubs we generate for invalidation don't necessarily co-locate with the
code that jump to the stub. Since we rely on co-location to keep stubs
alive as they are in the outlined code block, it used to be possible for
code GC inside branch_stub_hit() to free the stub that's its direct
caller, leading us to return to freed code after.
Stubs used to look like:
```
mov arg0, branch_ptr
mov arg1, target_idx
mov arg2, ec
call branch_stub_hit
jmp return_reg
```
Since the call and the jump after the call is the same for all stubs, we
can extract them and use a static trampoline for them. That makes
branch_stub_hit() always return to static code. Stubs now look like:
```
mov arg0, branch_ptr
mov arg1, target_idx
jmp trampoline
```
Where the trampoline is:
```
mov arg2, ec
call branch_stub_hit
jmp return_reg
```
Code GC can now free stubs without problems since we'll always return
to the trampoline, which we generate once on boot and lives forever.
This might save a small bit of memory due to factoring out the static
part of stubs, but it's probably minor.
[Bug #19234]
Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
Notes:
Merged-By: maximecb <maximecb@ruby-lang.org>
|
|
When an object becomes "too complex" (in other words it has too many
variations in the shape tree), we transition it to use a "too complex"
shape and use a hash for storing instance variables.
Without this patch, there were rare cases where shape tree growth could
"explode" and cause performance degradation on what would otherwise have
been cached fast paths.
This patch puts a limit on shape tree growth, and gracefully degrades in
the rare case where there could be a factorial growth in the shape tree.
For example:
```ruby
class NG; end
HUGE_NUMBER.times do
NG.new.instance_variable_set(:"@unique_ivar_#{_1}", 1)
end
```
We consider objects to be "too complex" when the object's class has more
than SHAPE_MAX_VARIATIONS (currently 8) leaf nodes in the shape tree and
the object introduces a new variation (a new leaf node) associated with
that class.
For example, new variations on instances of the following class would be
considered "too complex" because those instances create more than 8
leaves in the shape tree:
```ruby
class Foo; end
9.times { Foo.new.instance_variable_set(":@uniq_#{_1}", 1) }
```
However, the following class is *not* too complex because it only has
one leaf in the shape tree:
```ruby
class Foo
def initialize
@a = @b = @c = @d = @e = @f = @g = @h = @i = nil
end
end
9.times { Foo.new }
``
This case is rare, so we don't expect this change to impact performance
of most applications, but it needs to be handled.
Co-Authored-By: Aaron Patterson <tenderlove@ruby-lang.org>
Notes:
Merged: https://github.com/ruby/ruby/pull/6931
|
|
It's idempotent.
Notes:
Merged: https://github.com/ruby/ruby/pull/6930
|
|
Notes:
Merged-By: maximecb <maximecb@ruby-lang.org>
|
|
Notes:
Merged-By: k0kubun <takashikkbn@gmail.com>
|
|
* YJIT: implement getconstant YARV instruction
* Constant id is not a pointer
* Stack operands must be read after jit_prepare_routine_call
Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
Notes:
Merged-By: k0kubun <takashikkbn@gmail.com>
|
|
Notes:
Merged-By: maximecb <maximecb@ruby-lang.org>
|
|
Notes:
Merged-By: maximecb <maximecb@ruby-lang.org>
|
|
Certain code page sizes don't work and can cause crashes, so having this
value available as a command-line option is a bit dangerous. Remove it
and turn it into a constant instead.
Notes:
Merged-By: maximecb <maximecb@ruby-lang.org>
|
|
Notes:
Merged-By: k0kubun <takashikkbn@gmail.com>
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/6767
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/6767
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/6767
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/6767
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/6767
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/6767
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/6767
|
|
* YJIT: Make case-when optimization respect === redefinition
Even when a fixnum key is in the dispatch hash, if there is a case such
that its basic operations for === is redefined, we need to fall back to
checking each case like the interpreter. Semantically we're always
checking each case by calling === in order, it's just that this is not
observable when basic operations are intact.
When all the keys are fixnums, though, we can do the optimization we're
doing right now. Check for this condition.
* Update yjit/src/cruby_bindings.inc.rs
Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
Notes:
Merged-By: maximecb <maximecb@ruby-lang.org>
|
|
* YJIT: Reorder branches for Fixnum opt_case_dispatch
Co-authored-by: Maxime Chevalier-Boisvert <maxime.chevalierboisvert@shopify.com>
Co-authored-by: Alan Wu <alansi.xingwu@shopify.com>
* YJIT: Don't support too large values
Co-authored-by: Maxime Chevalier-Boisvert <maxime.chevalierboisvert@shopify.com>
Co-authored-by: Alan Wu <alansi.xingwu@shopify.com>
Notes:
Merged-By: maximecb <maximecb@ruby-lang.org>
|
|
Notes:
Merged-By: maximecb <maximecb@ruby-lang.org>
|
|
Notes:
Merged-By: maximecb <maximecb@ruby-lang.org>
|
|
Rename InsnOpnd => YARVOpnd
Make it more clear this refers to YARV insn/vm operands rather
than backend IR, x86 or ARM insn operands.
Notes:
Merged-By: maximecb <maximecb@ruby-lang.org>
|
|
Notes:
Merged-By: maximecb <maximecb@ruby-lang.org>
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/6794
|
|
Notes:
Merged-By: maximecb <maximecb@ruby-lang.org>
|
|
YJIT: Skip padding jumps to side exits
Co-authored-by: Maxime Chevalier-Boisvert <maxime.chevalierboisvert@shopify.com>
Co-authored-by: Alan Wu <alansi.xingwu@shopify.com>
Co-authored-by: Maxime Chevalier-Boisvert <maxime.chevalierboisvert@shopify.com>
Co-authored-by: Alan Wu <alansi.xingwu@shopify.com>
Notes:
Merged-By: maximecb <maximecb@ruby-lang.org>
|
|
This commit changes the shape id comparisons to use a 32 bit comparison
rather than 64 bit. That means we don't need to load the shape id to a
register on x86 machines.
Given the following program:
```ruby
class Foo
def initialize
@foo = 1
@bar = 1
end
def read
[@foo, @bar]
end
end
foo = Foo.new
foo.read
foo.read
foo.read
foo.read
foo.read
puts RubyVM::YJIT.disasm(Foo.instance_method(:read))
```
The machine code we generated _before_ this change is like this:
```
== BLOCK 1/4, ISEQ RANGE [0,3), 65 bytes ======================
# getinstancevariable
0x559a18623023: mov rax, qword ptr [r13 + 0x18]
# guard object is heap
0x559a18623027: test al, 7
0x559a1862302a: jne 0x559a1862502d
0x559a18623030: cmp rax, 4
0x559a18623034: jbe 0x559a1862502d
# guard shape, embedded, and T_OBJECT
0x559a1862303a: mov rcx, qword ptr [rax]
0x559a1862303d: movabs r11, 0xffff00000000201f
0x559a18623047: and rcx, r11
0x559a1862304a: movabs r11, 0xb000000002001
0x559a18623054: cmp rcx, r11
0x559a18623057: jne 0x559a18625046
0x559a1862305d: mov rax, qword ptr [rax + 0x18]
0x559a18623061: mov qword ptr [rbx], rax
== BLOCK 2/4, ISEQ RANGE [3,6), 0 bytes =======================
== BLOCK 3/4, ISEQ RANGE [3,6), 47 bytes ======================
# gen_direct_jmp: fallthrough
# getinstancevariable
# regenerate_branch
# getinstancevariable
# regenerate_branch
0x559a18623064: mov rax, qword ptr [r13 + 0x18]
# guard shape, embedded, and T_OBJECT
0x559a18623068: mov rcx, qword ptr [rax]
0x559a1862306b: movabs r11, 0xffff00000000201f
0x559a18623075: and rcx, r11
0x559a18623078: movabs r11, 0xb000000002001
0x559a18623082: cmp rcx, r11
0x559a18623085: jne 0x559a18625099
0x559a1862308b: mov rax, qword ptr [rax + 0x20]
0x559a1862308f: mov qword ptr [rbx + 8], rax
```
After this change, it's like this:
```
== BLOCK 1/4, ISEQ RANGE [0,3), 41 bytes ======================
# getinstancevariable
0x5560c986d023: mov rax, qword ptr [r13 + 0x18]
# guard object is heap
0x5560c986d027: test al, 7
0x5560c986d02a: jne 0x5560c986f02d
0x5560c986d030: cmp rax, 4
0x5560c986d034: jbe 0x5560c986f02d
# guard shape
0x5560c986d03a: cmp word ptr [rax + 6], 0x19
0x5560c986d03f: jne 0x5560c986f046
0x5560c986d045: mov rax, qword ptr [rax + 0x10]
0x5560c986d049: mov qword ptr [rbx], rax
== BLOCK 2/4, ISEQ RANGE [3,6), 0 bytes =======================
== BLOCK 3/4, ISEQ RANGE [3,6), 23 bytes ======================
# gen_direct_jmp: fallthrough
# getinstancevariable
# regenerate_branch
# getinstancevariable
# regenerate_branch
0x5560c986d04c: mov rax, qword ptr [r13 + 0x18]
# guard shape
0x5560c986d050: cmp word ptr [rax + 6], 0x19
0x5560c986d055: jne 0x5560c986f099
0x5560c986d05b: mov rax, qword ptr [rax + 0x18]
0x5560c986d05f: mov qword ptr [rbx + 8], rax
```
The first ivar read is a bit more complex, but the second ivar read is
much simpler. I think eventually we could teach the context about the
shape, then emit only one shape guard.
Notes:
Merged: https://github.com/ruby/ruby/pull/6737
|
|
@casperisfine reporting a bug in this gist https://gist.github.com/casperisfine/d59e297fba38eb3905a3d7152b9e9350
After investigating I found it was caused by a combination of send and a c_func that we have overwritten in the JIT. For send calls, we need to do some stack manipulation before making the call. Because of the way exits works, we need to do that stack manipulation at the last possible moment. In this case, we weren't doing that stack manipulation at all. Unfortunately, with how the code is structured there isn't a great place to do that stack manipulation for our overridden C funcs.
Each overridden C func can return a boolean stating that it shouldn't be used. We would need to do the stack manipulation after all of those checks are done. We could pass a lambda(?) or separate out the logic for "can I run this override" from "now generate the code for it". Since we are coming up on a release, I went with the path of least resistence and just decided to not use these overrides if we are in a send call.
We definitely should revist this in the future.
Notes:
Merged-By: maximecb <maximecb@ruby-lang.org>
|
|
Notes:
Merged-By: k0kubun <takashikkbn@gmail.com>
|
|
Notes:
Merged-By: k0kubun <takashikkbn@gmail.com>
|
|
Notes:
Merged-By: k0kubun <takashikkbn@gmail.com>
|
|
Co-authored-by: Alan Wu <alansi.xingwu@shopify.com>
Co-authored-by: Alan Wu <alansi.xingwu@shopify.com>
Notes:
Merged-By: k0kubun <takashikkbn@gmail.com>
|
|
Notes:
Merged-By: maximecb <maximecb@ruby-lang.org>
|
|
Previously, there is no instruction boundary patch point after
the call to a non-leaf C function we generate for
OPTIMIZED_METHOD_TYPE_CALL. This meant that if code GC is triggered
while inside the C function, we would keep running invalidated code when
we return from the C function. This had the effect of running
stale branch stubs, jumping to bad code, etc.
Use jit_prepare_routine_call() to make sure we exit from the invalidated
region as soon as possible after the C call in case of invalidation.
Notes:
Merged: https://github.com/ruby/ruby/pull/6711
|
|
* Enable --yjit-stats for release builds
In order for people in the real world to report information about how their application runs with YJIT, we want to expose stats without requiring rebuilding ruby. We can do this without overhead, with the exception of count ratio in yjit, since this relies on the interpreter also counting instructions.
This change exposes those stats, while not showing ratio in yjit if we are not in a stats build.
* Update yjit.rb
Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>
Notes:
Merged-By: maximecb <maximecb@ruby-lang.org>
|
|
Since object shapes store the capacity of an object, we no longer
need the numiv field on RObjects. This gives us one extra slot which
we can use to give embedded objects one more instance variable (for a
total of 3 ivs). This commit removes the concept of numiv from RObject.
Notes:
Merged: https://github.com/ruby/ruby/pull/6699
|