Age | Commit message (Collapse) | Author |
|
Certain code page sizes don't work and can cause crashes, so having this
value available as a command-line option is a bit dangerous. Remove it
and turn it into a constant instead.
Notes:
Merged-By: maximecb <maximecb@ruby-lang.org>
|
|
Notes:
Merged-By: k0kubun <takashikkbn@gmail.com>
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/6767
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/6767
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/6767
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/6767
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/6767
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/6767
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/6767
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/6767
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/6767
|
|
* YJIT: Make case-when optimization respect === redefinition
Even when a fixnum key is in the dispatch hash, if there is a case such
that its basic operations for === is redefined, we need to fall back to
checking each case like the interpreter. Semantically we're always
checking each case by calling === in order, it's just that this is not
observable when basic operations are intact.
When all the keys are fixnums, though, we can do the optimization we're
doing right now. Check for this condition.
* Update yjit/src/cruby_bindings.inc.rs
Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
Notes:
Merged-By: maximecb <maximecb@ruby-lang.org>
|
|
Notes:
Merged-By: maximecb <maximecb@ruby-lang.org>
|
|
Notes:
Merged-By: k0kubun <takashikkbn@gmail.com>
|
|
* YJIT: Reorder branches for Fixnum opt_case_dispatch
Co-authored-by: Maxime Chevalier-Boisvert <maxime.chevalierboisvert@shopify.com>
Co-authored-by: Alan Wu <alansi.xingwu@shopify.com>
* YJIT: Don't support too large values
Co-authored-by: Maxime Chevalier-Boisvert <maxime.chevalierboisvert@shopify.com>
Co-authored-by: Alan Wu <alansi.xingwu@shopify.com>
Notes:
Merged-By: maximecb <maximecb@ruby-lang.org>
|
|
* Fix 32 and 16 bit register store in YJIT
Co-Authored-By: Takashi Kokubun <takashikkbn@gmail.com>
* Remove an unnecessary diff
* Reuse an rm_num_bits result
* Use u16::MAX instead
* Update the link
Co-authored-by: Alan Wu <XrXr@users.noreply.github.com>
* Just use sturh for 16 bits
Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
Co-authored-by: Alan Wu <XrXr@users.noreply.github.com>
Notes:
Merged-By: maximecb <maximecb@ruby-lang.org>
|
|
Notes:
Merged-By: maximecb <maximecb@ruby-lang.org>
|
|
Notes:
Merged-By: maximecb <maximecb@ruby-lang.org>
|
|
Previously we essentially never freed block even after invalidation.
Their reference count never reached zero for a couple of reasons:
1. `Branch::block` formed a cycle with the block holding the branch
2. Strong count on a branch that has ever contained a stub never
reached 0 because we increment the `.clone()` call for
`BranchRef::into_raw()` didn't have a matching decrement.
It's not safe to immediately deallocate blocks during
invalidation since `branch_stub_hit()` can end up
running with a branch pointer from an invalidated branch.
To plug the leaks, we wait until code GC or global invalidation and
deallocate the blocks for iseqs that are definitely not running.
Notes:
Merged: https://github.com/ruby/ruby/pull/6833
|
|
When we run global invalidation for TracePoints or code GC, we clear out
all blocks in our assumptions table but we don't deallocate the backing
buffers. Let's reclaim some memory during these rare events.
Notes:
Merged: https://github.com/ruby/ruby/pull/6833
|
|
HashSet::clear() doesn't deallocate the backing buffer and shrink the
capacity. Replace with a 0-capcity set instead so we reclaim some memory
each code GC.
Notes:
Merged: https://github.com/ruby/ruby/pull/6833
|
|
Notes:
Merged-By: maximecb <maximecb@ruby-lang.org>
|
|
instead of FILE*.
Using C.fprintf is slower than String manipulation on memory. I'm going
to change the way MJIT writes files, and this is a prerequisite for it.
|
|
Rename InsnOpnd => YARVOpnd
Make it more clear this refers to YARV insn/vm operands rather
than backend IR, x86 or ARM insn operands.
Notes:
Merged-By: maximecb <maximecb@ruby-lang.org>
|
|
We frequently make branches that only have one target but we used to
always allocate space for two branch targets. This patch moves all the
information a branch target has into a struct and refer to them using
Option<Box<BranchTarget>>, this way when the second branch target is not
present it only takes 8 bytes.
Retained heap size on railsbench went from 16.17 MiB to 14.57 MiB, a
ratio of about 1.1.
Notes:
Merged: https://github.com/ruby/ruby/pull/6799
|
|
Notes:
Merged-By: maximecb <maximecb@ruby-lang.org>
|
|
Notes:
Merged-By: maximecb <maximecb@ruby-lang.org>
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/6794
|
|
Notes:
Merged-By: maximecb <maximecb@ruby-lang.org>
|
|
YJIT: x86_64: Fix cmp with number where sign bit is set
Before this commit, we were unconditionally treating unsigned ints as
signed ints when counting the number of bits required for representing
the immediate in machine code. When the size of the immediate matches
the size of the other operand, no sign extension happens, so this was
incorrect. `asm.cmp(opnd64, 0x8000_0000)` panicked even though it's
encodable as `CMP r/m32, imm32`. Large shape ids were impacted by this
issue.
Co-Authored-By: Aaron Patterson <tenderlove@ruby-lang.org>
Co-Authored-By: Alan Wu <alanwu@ruby-lang.org>
Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org>
Co-authored-by: Alan Wu <alanwu@ruby-lang.org>
Notes:
Merged-By: maximecb <maximecb@ruby-lang.org>
|
|
YJIT: Skip padding jumps to side exits
Co-authored-by: Maxime Chevalier-Boisvert <maxime.chevalierboisvert@shopify.com>
Co-authored-by: Alan Wu <alansi.xingwu@shopify.com>
Co-authored-by: Maxime Chevalier-Boisvert <maxime.chevalierboisvert@shopify.com>
Co-authored-by: Alan Wu <alansi.xingwu@shopify.com>
Notes:
Merged-By: maximecb <maximecb@ruby-lang.org>
|
|
Notes:
Merged-By: k0kubun <takashikkbn@gmail.com>
|
|
Notes:
Merged-By: k0kubun <takashikkbn@gmail.com>
|
|
This commit changes the shape id comparisons to use a 32 bit comparison
rather than 64 bit. That means we don't need to load the shape id to a
register on x86 machines.
Given the following program:
```ruby
class Foo
def initialize
@foo = 1
@bar = 1
end
def read
[@foo, @bar]
end
end
foo = Foo.new
foo.read
foo.read
foo.read
foo.read
foo.read
puts RubyVM::YJIT.disasm(Foo.instance_method(:read))
```
The machine code we generated _before_ this change is like this:
```
== BLOCK 1/4, ISEQ RANGE [0,3), 65 bytes ======================
# getinstancevariable
0x559a18623023: mov rax, qword ptr [r13 + 0x18]
# guard object is heap
0x559a18623027: test al, 7
0x559a1862302a: jne 0x559a1862502d
0x559a18623030: cmp rax, 4
0x559a18623034: jbe 0x559a1862502d
# guard shape, embedded, and T_OBJECT
0x559a1862303a: mov rcx, qword ptr [rax]
0x559a1862303d: movabs r11, 0xffff00000000201f
0x559a18623047: and rcx, r11
0x559a1862304a: movabs r11, 0xb000000002001
0x559a18623054: cmp rcx, r11
0x559a18623057: jne 0x559a18625046
0x559a1862305d: mov rax, qword ptr [rax + 0x18]
0x559a18623061: mov qword ptr [rbx], rax
== BLOCK 2/4, ISEQ RANGE [3,6), 0 bytes =======================
== BLOCK 3/4, ISEQ RANGE [3,6), 47 bytes ======================
# gen_direct_jmp: fallthrough
# getinstancevariable
# regenerate_branch
# getinstancevariable
# regenerate_branch
0x559a18623064: mov rax, qword ptr [r13 + 0x18]
# guard shape, embedded, and T_OBJECT
0x559a18623068: mov rcx, qword ptr [rax]
0x559a1862306b: movabs r11, 0xffff00000000201f
0x559a18623075: and rcx, r11
0x559a18623078: movabs r11, 0xb000000002001
0x559a18623082: cmp rcx, r11
0x559a18623085: jne 0x559a18625099
0x559a1862308b: mov rax, qword ptr [rax + 0x20]
0x559a1862308f: mov qword ptr [rbx + 8], rax
```
After this change, it's like this:
```
== BLOCK 1/4, ISEQ RANGE [0,3), 41 bytes ======================
# getinstancevariable
0x5560c986d023: mov rax, qword ptr [r13 + 0x18]
# guard object is heap
0x5560c986d027: test al, 7
0x5560c986d02a: jne 0x5560c986f02d
0x5560c986d030: cmp rax, 4
0x5560c986d034: jbe 0x5560c986f02d
# guard shape
0x5560c986d03a: cmp word ptr [rax + 6], 0x19
0x5560c986d03f: jne 0x5560c986f046
0x5560c986d045: mov rax, qword ptr [rax + 0x10]
0x5560c986d049: mov qword ptr [rbx], rax
== BLOCK 2/4, ISEQ RANGE [3,6), 0 bytes =======================
== BLOCK 3/4, ISEQ RANGE [3,6), 23 bytes ======================
# gen_direct_jmp: fallthrough
# getinstancevariable
# regenerate_branch
# getinstancevariable
# regenerate_branch
0x5560c986d04c: mov rax, qword ptr [r13 + 0x18]
# guard shape
0x5560c986d050: cmp word ptr [rax + 6], 0x19
0x5560c986d055: jne 0x5560c986f099
0x5560c986d05b: mov rax, qword ptr [rax + 0x18]
0x5560c986d05f: mov qword ptr [rbx + 8], rax
```
The first ivar read is a bit more complex, but the second ivar read is
much simpler. I think eventually we could teach the context about the
shape, then emit only one shape guard.
Notes:
Merged: https://github.com/ruby/ruby/pull/6737
|
|
@casperisfine reporting a bug in this gist https://gist.github.com/casperisfine/d59e297fba38eb3905a3d7152b9e9350
After investigating I found it was caused by a combination of send and a c_func that we have overwritten in the JIT. For send calls, we need to do some stack manipulation before making the call. Because of the way exits works, we need to do that stack manipulation at the last possible moment. In this case, we weren't doing that stack manipulation at all. Unfortunately, with how the code is structured there isn't a great place to do that stack manipulation for our overridden C funcs.
Each overridden C func can return a boolean stating that it shouldn't be used. We would need to do the stack manipulation after all of those checks are done. We could pass a lambda(?) or separate out the logic for "can I run this override" from "now generate the code for it". Since we are coming up on a release, I went with the path of least resistence and just decided to not use these overrides if we are in a send call.
We definitely should revist this in the future.
Notes:
Merged-By: maximecb <maximecb@ruby-lang.org>
|
|
Notes:
Merged-By: k0kubun <takashikkbn@gmail.com>
|
|
Notes:
Merged-By: k0kubun <takashikkbn@gmail.com>
|
|
Notes:
Merged-By: k0kubun <takashikkbn@gmail.com>
|
|
* YJIT: Stop wrapping CmePtr with CmeDependency
* YJIT: Fix an outdated comment [ci skip]
Notes:
Merged-By: k0kubun <takashikkbn@gmail.com>
|
|
Notes:
Merged-By: k0kubun <takashikkbn@gmail.com>
|
|
* YJIT: Always encode Opnd::Value in 64 bits on x86_64
for GC offsets
Co-authored-by: Alan Wu <alansi.xingwu@shopify.com>
* Introduce heap_object_p
* Leave original mov intact
* Remove unneeded branches
* Add a test for movabs
Co-authored-by: Alan Wu <alansi.xingwu@shopify.com>
Notes:
Merged-By: k0kubun <takashikkbn@gmail.com>
|
|
Notes:
Merged-By: k0kubun <takashikkbn@gmail.com>
|
|
Notes:
Merged-By: k0kubun <takashikkbn@gmail.com>
|
|
Co-authored-by: Alan Wu <alansi.xingwu@shopify.com>
Co-authored-by: Alan Wu <alansi.xingwu@shopify.com>
Notes:
Merged-By: k0kubun <takashikkbn@gmail.com>
|
|
When RUBY_DEBUG is enabled, shape ids are 16 bits. I would like to do
16 bit comparisons, so I need to load halfwords sometimes. This commit
adds LDURH so that I can load halfwords.
https://developer.arm.com/documentation/ddi0596/2021-12/Base-Instructions/LDURH--Load-Register-Halfword--unscaled--?lang=en
I verified the bytes using clang:
```
$ cat asmthing.s
.global _start
.align 2
_start:
ldurh w10, [x1]
ldurh w10, [x1, #123]
$ as asmthing.s -o asmthing.o && objdump --disassemble asmthing.o
asmthing.o: file format mach-o arm64
Disassembly of section __TEXT,__text:
0000000000000000 <ltmp0>:
0: 2a 00 40 78 ldurh w10, [x1]
4: 2a b0 47 78 ldurh w10, [x1, #123]
```
Notes:
Merged: https://github.com/ruby/ruby/pull/6729
|
|
We don't need this constant to be exposed anymore, so remove it
Notes:
Merged: https://github.com/ruby/ruby/pull/6728
|
|
* YJIT: Instrument global allocations on stats build
* Just use GLOVAL_ALLOCATOR.stats()
Notes:
Merged-By: maximecb <maximecb@ruby-lang.org>
|
|
Notes:
Merged-By: maximecb <maximecb@ruby-lang.org>
|
|
Previously, there is no instruction boundary patch point after
the call to a non-leaf C function we generate for
OPTIMIZED_METHOD_TYPE_CALL. This meant that if code GC is triggered
while inside the C function, we would keep running invalidated code when
we return from the C function. This had the effect of running
stale branch stubs, jumping to bad code, etc.
Use jit_prepare_routine_call() to make sure we exit from the invalidated
region as soon as possible after the C call in case of invalidation.
Notes:
Merged: https://github.com/ruby/ruby/pull/6711
|
|
* Enable --yjit-stats for release builds
In order for people in the real world to report information about how their application runs with YJIT, we want to expose stats without requiring rebuilding ruby. We can do this without overhead, with the exception of count ratio in yjit, since this relies on the interpreter also counting instructions.
This change exposes those stats, while not showing ratio in yjit if we are not in a stats build.
* Update yjit.rb
Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>
Notes:
Merged-By: maximecb <maximecb@ruby-lang.org>
|