path: root/yjit
Age | Commit message | Author
2022-11-24 | YJIT: rename `InsnOpnd` => `YARVOpnd` (#6801) | Maxime Chevalier-Boisvert
Rename InsnOpnd => YARVOpnd
Make it clearer that this refers to YARV insn/vm operands rather than backend IR, x86 or ARM insn operands. Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2022-11-23 | YJIT: Use a Box for branch targets to save memory | Alan Wu
We frequently make branches that only have one target, but we used to always allocate space for two branch targets. This patch moves all the information a branch target has into a struct and refers to it using Option<Box<BranchTarget>>. This way, when the second branch target is not present, it only takes 8 bytes. Retained heap size on railsbench went from 16.17 MiB to 14.57 MiB, a ratio of about 1.1. Notes: Merged: https://github.com/ruby/ruby/pull/6799
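The saving described above comes from Rust's guarantee that `Option<Box<T>>` is pointer-sized, with `None` stored as the null pointer. A minimal sketch follows; the fields of `BranchTarget` and the two wrapper structs are hypothetical stand-ins for illustration, not the layout used in YJIT.

```rust
use std::mem::size_of;

// Hypothetical stand-in for the per-target data; the real struct differs.
#[allow(dead_code)]
struct BranchTarget {
    address: usize,
    block_id: (usize, u32),
    ctx: [u8; 16],
}

// Old-style layout (illustrative): room for both targets is always reserved inline.
#[allow(dead_code)]
struct InlineTargets {
    targets: [Option<BranchTarget>; 2],
}

// New-style layout (illustrative): each target sits behind an optional heap pointer.
#[allow(dead_code)]
struct BoxedTargets {
    targets: [Option<Box<BranchTarget>>; 2],
}

fn main() {
    // An absent target costs exactly one pointer-sized slot.
    println!("Option<Box<BranchTarget>>: {} bytes", size_of::<Option<Box<BranchTarget>>>());
    println!("inline targets: {} bytes", size_of::<InlineTargets>());
    println!("boxed targets:  {} bytes", size_of::<BoxedTargets>());
}
```

The trade-off is that a target that is present now costs a pointer plus a separate heap allocation, so the change pays off when most branches have only one target, as the commit message says is the case.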
2022-11-23 | YJIT: Simplify Insn::CCall to obviate Target::FunPtr (#6793) | Takashi Kokubun
Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2022-11-23 | YJIT: Use NonNull pointer for CodePtr (#6792) | Takashi Kokubun
Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2022-11-23 | YJIT: Stop passing target1 to gen_return_branch | Takashi Kokubun
Notes: Merged: https://github.com/ruby/ruby/pull/6794
2022-11-23 | YJIT: Simplify code for RB_SPECIAL_CONST_P (#6795) | Takashi Kokubun
Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2022-11-23 | Fix YJIT backend to account for unsigned int immediates (#6789) | Jemma Issroff
YJIT: x86_64: Fix cmp with number where sign bit is set
Before this commit, we were unconditionally treating unsigned ints as signed ints when counting the number of bits required for representing the immediate in machine code. When the size of the immediate matches the size of the other operand, no sign extension happens, so this was incorrect. `asm.cmp(opnd64, 0x8000_0000)` panicked even though it's encodable as `CMP r/m32, imm32`. Large shape ids were impacted by this issue. Co-Authored-By: Aaron Patterson <tenderlove@ruby-lang.org> Co-Authored-By: Alan Wu <alanwu@ruby-lang.org> Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
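To illustrate why the signed and unsigned bit counts disagree for `0x8000_0000`, here is a small sketch. The helper names `signed_imm_bits` and `unsigned_imm_bits` are made up for this example and are not claimed to match the functions in the YJIT backend.

```rust
/// Smallest of 8/16/32/64 bits that can hold `imm` as a sign-extended value.
/// (Hypothetical helper, for illustration only.)
fn signed_imm_bits(imm: i64) -> u8 {
    if imm >= i8::MIN as i64 && imm <= i8::MAX as i64 {
        8
    } else if imm >= i16::MIN as i64 && imm <= i16::MAX as i64 {
        16
    } else if imm >= i32::MIN as i64 && imm <= i32::MAX as i64 {
        32
    } else {
        64
    }
}

/// Smallest of 8/16/32/64 bits that can hold `imm` as a plain unsigned value.
/// (Hypothetical helper, for illustration only.)
fn unsigned_imm_bits(imm: u64) -> u8 {
    if imm <= u8::MAX as u64 {
        8
    } else if imm <= u16::MAX as u64 {
        16
    } else if imm <= u32::MAX as u64 {
        32
    } else {
        64
    }
}

fn main() {
    let imm: u64 = 0x8000_0000;
    // Read as signed, bit 31 would be a sign bit, so the value no longer fits in 32 bits.
    println!("signed count:   {} bits", signed_imm_bits(imm as i64));
    // Read as unsigned, the value fits in 32 bits exactly.
    println!("unsigned count: {} bits", unsigned_imm_bits(imm));
}
```

Under these assumptions the signed count reports 64 bits while the unsigned count reports 32, which is the mismatch that made the assembler reject an immediate it could have encoded.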
2022-11-22 | YJIT: Skip padding jumps to side exits on Arm (#6790) | Takashi Kokubun
YJIT: Skip padding jumps to side exits Co-authored-by: Maxime Chevalier-Boisvert <maxime.chevalierboisvert@shopify.com> Co-authored-by: Alan Wu <alansi.xingwu@shopify.com> Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2022-11-21 | YJIT: Lower the required Rust version from 1.58.1 to 1.58.0 (#6780) | Takashi Kokubun
Notes: Merged-By: k0kubun <takashikkbn@gmail.com>
2022-11-18 | YJIT: Improve the failure message on enlarging a branch (#6769) | Takashi Kokubun
Notes: Merged-By: k0kubun <takashikkbn@gmail.com>
2022-11-18 | 32 bit comparison on shape id | Aaron Patterson
This commit changes the shape id comparisons to use a 32 bit comparison rather than 64 bit. That means we don't need to load the shape id to a register on x86 machines.

Given the following program:

```ruby
class Foo
  def initialize
    @foo = 1
    @bar = 1
  end

  def read
    [@foo, @bar]
  end
end

foo = Foo.new
foo.read
foo.read
foo.read
foo.read
foo.read

puts RubyVM::YJIT.disasm(Foo.instance_method(:read))
```

The machine code we generated _before_ this change is like this:

```
== BLOCK 1/4, ISEQ RANGE [0,3), 65 bytes ======================
  # getinstancevariable
  0x559a18623023: mov rax, qword ptr [r13 + 0x18]
  # guard object is heap
  0x559a18623027: test al, 7
  0x559a1862302a: jne 0x559a1862502d
  0x559a18623030: cmp rax, 4
  0x559a18623034: jbe 0x559a1862502d
  # guard shape, embedded, and T_OBJECT
  0x559a1862303a: mov rcx, qword ptr [rax]
  0x559a1862303d: movabs r11, 0xffff00000000201f
  0x559a18623047: and rcx, r11
  0x559a1862304a: movabs r11, 0xb000000002001
  0x559a18623054: cmp rcx, r11
  0x559a18623057: jne 0x559a18625046
  0x559a1862305d: mov rax, qword ptr [rax + 0x18]
  0x559a18623061: mov qword ptr [rbx], rax

== BLOCK 2/4, ISEQ RANGE [3,6), 0 bytes =======================
== BLOCK 3/4, ISEQ RANGE [3,6), 47 bytes ======================
  # gen_direct_jmp: fallthrough
  # getinstancevariable
  # regenerate_branch
  # getinstancevariable
  # regenerate_branch
  0x559a18623064: mov rax, qword ptr [r13 + 0x18]
  # guard shape, embedded, and T_OBJECT
  0x559a18623068: mov rcx, qword ptr [rax]
  0x559a1862306b: movabs r11, 0xffff00000000201f
  0x559a18623075: and rcx, r11
  0x559a18623078: movabs r11, 0xb000000002001
  0x559a18623082: cmp rcx, r11
  0x559a18623085: jne 0x559a18625099
  0x559a1862308b: mov rax, qword ptr [rax + 0x20]
  0x559a1862308f: mov qword ptr [rbx + 8], rax
```

After this change, it's like this:

```
== BLOCK 1/4, ISEQ RANGE [0,3), 41 bytes ======================
  # getinstancevariable
  0x5560c986d023: mov rax, qword ptr [r13 + 0x18]
  # guard object is heap
  0x5560c986d027: test al, 7
  0x5560c986d02a: jne 0x5560c986f02d
  0x5560c986d030: cmp rax, 4
  0x5560c986d034: jbe 0x5560c986f02d
  # guard shape
  0x5560c986d03a: cmp word ptr [rax + 6], 0x19
  0x5560c986d03f: jne 0x5560c986f046
  0x5560c986d045: mov rax, qword ptr [rax + 0x10]
  0x5560c986d049: mov qword ptr [rbx], rax

== BLOCK 2/4, ISEQ RANGE [3,6), 0 bytes =======================
== BLOCK 3/4, ISEQ RANGE [3,6), 23 bytes ======================
  # gen_direct_jmp: fallthrough
  # getinstancevariable
  # regenerate_branch
  # getinstancevariable
  # regenerate_branch
  0x5560c986d04c: mov rax, qword ptr [r13 + 0x18]
  # guard shape
  0x5560c986d050: cmp word ptr [rax + 6], 0x19
  0x5560c986d055: jne 0x5560c986f099
  0x5560c986d05b: mov rax, qword ptr [rax + 0x18]
  0x5560c986d05f: mov qword ptr [rbx + 8], rax
```

The first ivar read is a bit more complex, but the second ivar read is much simpler. I think eventually we could teach the context about the shape, then emit only one shape guard. Notes: Merged: https://github.com/ruby/ruby/pull/6737
2022-11-17 | Fix bug involving .send and overwritten methods. (#6752) | Jimmy Miller
@casperisfine reported a bug in this gist: https://gist.github.com/casperisfine/d59e297fba38eb3905a3d7152b9e9350 After investigating, I found it was caused by a combination of send and a c_func that we have overwritten in the JIT. For send calls, we need to do some stack manipulation before making the call. Because of the way exits work, we need to do that stack manipulation at the last possible moment. In this case, we weren't doing that stack manipulation at all. Unfortunately, with how the code is structured there isn't a great place to do that stack manipulation for our overridden C funcs. Each overridden C func can return a boolean stating that it shouldn't be used. We would need to do the stack manipulation after all of those checks are done. We could pass a lambda(?) or separate out the logic for "can I run this override" from "now generate the code for it". Since we are coming up on a release, I went with the path of least resistance and just decided to not use these overrides if we are in a send call. We definitely should revisit this in the future. Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2022-11-16 | YJIT: Shrink version lists after mutation (#6749) | Takashi Kokubun
Notes: Merged-By: k0kubun <takashikkbn@gmail.com>
2022-11-16 | YJIT: Pack BlockId and CodePtr (#6748) | Takashi Kokubun
Notes: Merged-By: k0kubun <takashikkbn@gmail.com>
2022-11-16 | YJIT: Add compiled_branch_count stats (#6746) | Takashi Kokubun
Notes: Merged-By: k0kubun <takashikkbn@gmail.com>
2022-11-16 | YJIT: Stop wrapping CmePtr with CmeDependency (#6747) | Takashi Kokubun
* YJIT: Stop wrapping CmePtr with CmeDependency
* YJIT: Fix an outdated comment [ci skip]
Notes: Merged-By: k0kubun <takashikkbn@gmail.com>
2022-11-16 | YJIT: Shrink the vectors of Block after mutation (#6739) | Takashi Kokubun
Notes: Merged-By: k0kubun <takashikkbn@gmail.com>
2022-11-15 | YJIT: Always encode Opnd::Value in 64 bits on x86_64 for GC offsets (#6733) | Takashi Kokubun
* YJIT: Always encode Opnd::Value in 64 bits on x86_64 for GC offsets
* Introduce heap_object_p
* Leave original mov intact
* Remove unneeded branches
* Add a test for movabs
Co-authored-by: Alan Wu <alansi.xingwu@shopify.com> Notes: Merged-By: k0kubun <takashikkbn@gmail.com>
2022-11-15 | YJIT: Include actual memory region size in stats (#6736) | Takashi Kokubun
Notes: Merged-By: k0kubun <takashikkbn@gmail.com>
2022-11-15 | YJIT: Count getivar side exits by receiver flag changes (#6735) | Takashi Kokubun
Notes: Merged-By: k0kubun <takashikkbn@gmail.com>
2022-11-15 | YJIT: Invalidate redefined methods only through cme (#6734) | Takashi Kokubun
Co-authored-by: Alan Wu <alansi.xingwu@shopify.com> Notes: Merged-By: k0kubun <takashikkbn@gmail.com>
2022-11-14 | Implement LDURH on Aarch64 | Aaron Patterson
When RUBY_DEBUG is enabled, shape ids are 16 bits. I would like to do 16 bit comparisons, so I need to load halfwords sometimes. This commit adds LDURH so that I can load halfwords.

https://developer.arm.com/documentation/ddi0596/2021-12/Base-Instructions/LDURH--Load-Register-Halfword--unscaled--?lang=en

I verified the bytes using clang:

```
$ cat asmthing.s
.global _start
.align 2
_start:
  ldurh w10, [x1]
  ldurh w10, [x1, #123]

$ as asmthing.s -o asmthing.o && objdump --disassemble asmthing.o

asmthing.o: file format mach-o arm64

Disassembly of section __TEXT,__text:

0000000000000000 <ltmp0>:
       0: 2a 00 40 78   ldurh w10, [x1]
       4: 2a b0 47 78   ldurh w10, [x1, #123]
```

Notes: Merged: https://github.com/ruby/ruby/pull/6729
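The bytes in the objdump output above can be reproduced with a few lines of bit packing. This is a sketch of the LDURH field layout (unscaled signed 9-bit offset in bits 20:12, base register in bits 9:5, destination in bits 4:0); the function name `encode_ldurh` is made up for this example and is not the helper added by the commit.

```rust
/// Encode `LDURH Wt, [Xn, #imm9]` as a 32-bit A64 instruction word.
/// (Illustrative helper; the name and signature are assumptions.)
fn encode_ldurh(rt: u8, rn: u8, imm9: i16) -> u32 {
    assert!(rt < 32 && rn < 32, "register numbers are 5 bits wide");
    assert!((-256..=255).contains(&imm9), "offset must fit in a signed 9-bit field");

    // Keep the low 9 bits of the two's complement offset.
    let imm_field = (imm9 as u32) & 0x1ff;

    // 0x7840_0000 holds the fixed opcode bits for LDURH.
    0x7840_0000 | (imm_field << 12) | ((rn as u32) << 5) | (rt as u32)
}

fn main() {
    // Instruction words are stored little-endian in memory,
    // so print the bytes in that order to match objdump.
    for offset in [0i16, 123] {
        let bytes = encode_ldurh(10, 1, offset).to_le_bytes();
        println!(
            "ldurh w10, [x1, #{}] => {:02x} {:02x} {:02x} {:02x}",
            offset, bytes[0], bytes[1], bytes[2], bytes[3]
        );
    }
}
```

With `rt = 10`, `rn = 1` this yields `2a 00 40 78` for offset 0 and `2a b0 47 78` for offset 123, the same bytes as the listing in the commit message.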
2022-11-14 | Remove USE_RVARGC code | Aaron Patterson
We don't need this constant to be exposed anymore, so remove it Notes: Merged: https://github.com/ruby/ruby/pull/6728
2022-11-13 | YJIT: Instrument global allocations on stats build (#6712) | Takashi Kokubun
* YJIT: Instrument global allocations on stats build
* Just use GLOBAL_ALLOCATOR.stats()
Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2022-11-13 | YJIT: Remove unused src_ctx from Block (#6714) | Takashi Kokubun
Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2022-11-11 | YJIT: Fix staying in invalidated code after proc calls | Alan Wu
Previously, there was no instruction boundary patch point after the call to a non-leaf C function we generate for OPTIMIZED_METHOD_TYPE_CALL. This meant that if code GC was triggered while inside the C function, we would keep running invalidated code when we returned from the C function. This had the effect of running stale branch stubs, jumping to bad code, etc. Use jit_prepare_routine_call() to make sure we exit from the invalidated region as soon as possible after the C call in case of invalidation. Notes: Merged: https://github.com/ruby/ruby/pull/6711
2022-11-10 | Enable --yjit-stats for release builds (#6694) | Jimmy Miller
* Enable --yjit-stats for release builds
In order for people in the real world to report information about how their application runs with YJIT, we want to expose stats without requiring rebuilding ruby. We can do this without overhead, with the exception of count ratio in yjit, since this relies on the interpreter also counting instructions. This change exposes those stats, while not showing ratio in yjit if we are not in a stats build.
* Update yjit.rb
Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com> Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com> Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2022-11-10 | Remove numiv from RObject | Jemma Issroff
Since object shapes store the capacity of an object, we no longer need the numiv field on RObjects. This gives us one extra slot which we can use to give embedded objects one more instance variable (for a total of 3 ivs). This commit removes the concept of numiv from RObject. Notes: Merged: https://github.com/ruby/ruby/pull/6699
2022-11-10 | Transition shape when object's capacity changes | Jemma Issroff
This commit adds a `capacity` field to shapes, and adds shape transitions whenever an object's capacity changes. Objects which are allocated out of a bigger size pool will also make a transition from the root shape to the shape with the correct capacity for their size pool when they are allocated. This commit will allow us to remove numiv from objects completely, and will also mean we can guarantee that if two objects share shapes, their IVs are in the same positions (an embedded and extended object cannot share shapes). This will enable us to implement ivar sets in YJIT using object shapes. Co-Authored-By: Aaron Patterson <tenderlove@ruby-lang.org> Notes: Merged: https://github.com/ruby/ruby/pull/6699
2022-11-08 | YJIT: Reset dropped_bytes when patching code | Alan Wu
We switch to a new page when we detect dropped_bytes flipping from false to true. Previously, when we patched code for invalidation during code gc, we started with the flag already set to true, so we failed to apply patches that straddle pages. We would write out jumps halfway and then stop, which left the code corrupted. Reset the flag before patching so we patch across pages properly. Notes: Merged: https://github.com/ruby/ruby/pull/6686
2022-11-08 | Implement optimize call (#6691) | Jimmy Miller
This dispatches to a c func for doing the dynamic lookup. I experimented with chain on the proc but wasn't able to detect which call sites would be monomorphic vs polymorphic. There is definitely room for optimization here, but it does reduce exits. Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2022-11-07 | YJIT: Free pages after ObjectSpace API usages (#6676) | Takashi Kokubun
Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2022-11-03 | YJIT: Make Code GC metrics available for non-stats builds (#6665) | Takashi Kokubun
Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2022-11-03 | YJIT: Fix a wrong type reference (#6661) | Takashi Kokubun
* YJIT: Fix a wrong type reference
* YJIT: Just remove CapturedSelfOpnd for now
Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2022-11-03 | YJIT: Stop incrementing write_pos if cb.has_dropped_bytes (#6664) | Takashi Kokubun
Co-Authored-By: Alan Wu <alansi.xingwu@shopify.com> Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2022-11-02 | YJIT: Support invokeblock (#6640) | Takashi Kokubun
* YJIT: Support invokeblock
* Update yjit/src/backend/arm64/mod.rs
* Update yjit/src/codegen.rs
Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com> Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2022-11-02 | YJIT: Avoid accumulating freed pages in the payload (#6657) | Takashi Kokubun
Co-Authored-By: Alan Wu <alansi.xingwu@shopify.com> Co-Authored-By: Maxime Chevalier-Boisvert <maxime.chevalierboisvert@shopify.com> Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2022-11-01 | YJIT: Visualize live ranges on register spill (#6651) | Takashi Kokubun
Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2022-11-01 | YJIT: Add an assert to help with Context changes | Alan Wu
While experimenting I found that it's easy to change Context and forget to also change the copying operation in limit_block_versions(). Add an assert to make sure we substitute a compatible generic context when limiting the number of versions. Notes: Merged: https://github.com/ruby/ruby/pull/6656
2022-11-01 | YJIT: Delete redundant ways to make Context | Alan Wu
Context::new() is the same as Context::default() and Context::new_with_stack_size() was only used in tests. Notes: Merged: https://github.com/ruby/ruby/pull/6656
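As a rough illustration of why a separate `new()` and a test-only constructor can be redundant, the sketch below derives `Default` and uses struct update syntax where a test needs one field set. The `Context` here is a hypothetical two-field stand-in, not the real YJIT type.

```rust
// Hypothetical, simplified stand-in for a JIT context; field names are assumptions.
#[derive(Clone, Debug, Default, PartialEq)]
struct Context {
    stack_size: u16,
    sp_offset: i16,
}

fn main() {
    // `Default` already provides the all-zero context a hand-written `new()` would return.
    let empty = Context::default();

    // A test that needs a particular stack size can use struct update syntax
    // instead of a dedicated constructor.
    let two_deep = Context { stack_size: 2, ..Context::default() };

    println!("{:?}", empty);
    println!("{:?}", two_deep);
}
```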
2022-10-31 | YJIT: Add RubyVM::YJIT.code_gc (#6644) | Takashi Kokubun
* YJIT: Add RubyVM::YJIT.code_gc
* Rename compiled_page_count to live_page_count
Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2022-10-31 | YJIT: reduce default `--yjit-exec-mem-size` to 128MiB instead of 256 (#6649) | Maxime Chevalier-Boisvert
Reduce default --yjit-exec-mem-size to 128MiB instead of 256 Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2022-10-27 | YJIT: Use guard_known_class() for opt_aref on Arrays (#6643) | Alan Wu
This code used to roll its own heap object check before we made a better version in guard_known_class(). The improved version uses one fewer comparison, so let's use that. Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2022-10-26 | YJIT: Support nil and blockparamproxy as blockarg in send (#6492) | Matthew Draper
Co-authored-by: John Hawthorn <john@hawthorn.email> Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2022-10-26 | YJIT: Invalidate i-cache for the other cb on next_page (#6631) | Takashi Kokubun
* YJIT: Invalidate i-cache for the other cb on next_page
* YJIT: Invalidate only what's written by jmp_ptr
* YJIT: Move the code to the arm64 backend
Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2022-10-25 | YJIT: GC and recompile all code pages (#6406) | Takashi Kokubun
when it fails to allocate a new page. Co-authored-by: Alan Wu <alansi.xingwu@shopify.com> Notes: Merged-By: k0kubun <takashikkbn@gmail.com>
2022-10-21 | Rename `iv_count` on shapes to `next_iv_index` | Jemma Issroff
`iv_count` is a misleading name because when IVs are unset, the new shape doesn't decrement this value. `next_iv_index` is an accurate and more descriptive name. Notes: Merged: https://github.com/ruby/ruby/pull/6608
2022-10-21 | YJIT: Fix page rounding for icache busting | Alan Wu
Previously, we found the current page by rounding the current pointer down to the nearest multiple of the page size. This is incorrect because pages are relative to the start of the address range we reserve. For example, if the starting address is 12KiB modulo the 16KiB page size, once we have more than 4KiB of code, calculating with the address would incorrectly give us page 1 when we're actually still on page 0. Previously, I could reproduce crashes with `make btest RUN_OPTS=--yjit-code-page-size=32` on ARM64 macOS, where system page sizes are 16KiB. Notes: Merged: https://github.com/ruby/ruby/pull/6607 Merged-By: XrXr
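The 12KiB/16KiB example from the message can be worked through numerically. In this sketch the function names, the fixed `PAGE_SIZE`, and the concrete start address are all assumptions chosen to match the numbers in the commit message; they are not the code that was patched.

```rust
const PAGE_SIZE: usize = 16 * 1024; // 16KiB pages, as in the commit's example

/// Buggy approach: derive the page from absolute addresses, i.e. count how many
/// absolute PAGE_SIZE boundaries lie between the region start and `addr`.
fn page_index_from_address(region_start: usize, addr: usize) -> usize {
    addr / PAGE_SIZE - region_start / PAGE_SIZE
}

/// Fixed approach: pages are relative to the start of the reserved region.
fn page_index_from_offset(region_start: usize, addr: usize) -> usize {
    (addr - region_start) / PAGE_SIZE
}

fn main() {
    // A region that starts 12KiB past a 16KiB boundary (1MiB is a multiple of 16KiB).
    let region_start = 1024 * 1024 + 12 * 1024;
    // 5KiB of code has been written so far, which is more than the 4KiB left
    // before the next absolute 16KiB boundary.
    let addr = region_start + 5 * 1024;

    println!("by address: page {}", page_index_from_address(region_start, addr));
    println!("by offset:  page {}", page_index_from_offset(region_start, addr));
}
```

The address-based calculation reports page 1 after only 5KiB of code, while the offset-based one still reports page 0, which is the discrepancy the fix removes.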
2022-10-21 | YJIT: Read rb_num_t as usize early | Alan Wu
This patch makes sure that we're not accidentally reading rb_num_t instruction arguments as VALUE and accidentally baking them into code and marking them. Some of these are simply moving the cast earlier, but some of these avoid potential problems for flag and ID arguments. Follow-up for 39f7eddec4c55711d56f05b085992a83bf23159e. Notes: Merged: https://github.com/ruby/ruby/pull/6606
2022-10-20 | YJIT: Fix gen_expandarray treating argument as VALUE | Alan Wu
The expandarray instruction interprets its arguments as rb_num_t. YJIT was treating the num argument as a VALUE previously and when it has a certain bit pattern, it can look like a GC pointer. The argument is not a pointer, so YJIT crashed when trying to mark those pointers. This bug existed previously, but our test suite didn't expose it until f55212bce939f736559709a8cd16c409772389c8. TestArgf#test_to_io has a line like: a1, a2, a3, a4, a5, a6, a7, a8 = array Which maps to an expandarray with an argument of 8. Qnil happened to be defined as 8, which masked the issue. Fix it by not using the argument as a VALUE. Notes: Merged: https://github.com/ruby/ruby/pull/6603
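A simplified model of why the raw count 8 was harmless under one Qnil value and dangerous under another is sketched below. The check is an assumption modeled on the `test al, 7` / `cmp rax, 4` heap guard visible in disassembly elsewhere in this log. The function name, the `qnil` parameter, and the value 4 for the newer Qnil are illustrative assumptions, not taken from this commit.

```rust
/// Simplified model: does `value` look like a pointer to a heap object?
/// `qnil` is the bit pattern of Qnil in the layout being modeled (assumption).
fn looks_like_heap_pointer(value: usize, qnil: usize) -> bool {
    // Tagged immediates (fixnums, symbols, ...) have one of the low 3 bits set.
    let is_immediate = (value & 0x7) != 0;
    // Qfalse (0) and Qnil are the only values with no bits set outside Qnil's bits.
    let is_falsy = (value & !qnil) == 0;
    !is_immediate && !is_falsy
}

fn main() {
    let expandarray_num: usize = 8; // the raw count from `a1, ..., a8 = array`

    // With Qnil defined as 8, the count is mistaken for nil, which is never marked.
    println!("Qnil = 8: {}", looks_like_heap_pointer(expandarray_num, 8));
    // With a different Qnil (4 here, an assumption), the same bits look like a heap pointer.
    println!("Qnil = 4: {}", looks_like_heap_pointer(expandarray_num, 4));
}
```

In this model the first call prints `false` and the second prints `true`, which matches the description of the bug staying hidden until the value of Qnil changed.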