path: root/yjit/src/asm
Age | Commit message | Author
2022-10-25 | YJIT: GC and recompile all code pages (#6406) | Takashi Kokubun
when it fails to allocate a new page. Co-authored-by: Alan Wu <alansi.xingwu@shopify.com> Notes: Merged-By: k0kubun <takashikkbn@gmail.com>
2022-10-21 | YJIT: Fix page rounding for icache busting | Alan Wu
Previously, we found the current page by rounding the current pointer down to the nearest multiple of the page size. This is incorrect because pages are relative to the start of the address range we reserve. For example, if the starting address is 12KiB modulo the 16KiB page size, then once we have more than 4KiB of code, calculating with the raw address would incorrectly give us page 1 when we're actually still on page 0.

Previously, I could reproduce crashes with:

    make btest RUN_OPTS=--yjit-code-page-size=32

on ARM64 macOS, where system page sizes are 16KiB.

Notes: Merged: https://github.com/ruby/ruby/pull/6607 Merged-By: XrXr
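The page-rounding bug described in this commit can be illustrated with a small sketch. This is not the YJIT implementation; the function names `page_by_address` and `page_by_offset` and the constant `KIB` are hypothetical and exist only to show the arithmetic.

```rust
/// One kibibyte, used to keep the numbers readable.
pub const KIB: usize = 1024;

/// Buggy approach: round absolute addresses down to a page boundary,
/// then count pages from the rounded start of the region.
pub fn page_by_address(addr: usize, region_start: usize, page_size: usize) -> usize {
    (addr / page_size) - (region_start / page_size)
}

/// Fixed approach: pages are counted relative to the start of the
/// reserved region, so divide the offset into the region instead.
pub fn page_by_offset(addr: usize, region_start: usize, page_size: usize) -> usize {
    (addr - region_start) / page_size
}
```

With a region starting at 12KiB modulo a 16KiB page size and 5KiB of code emitted, `page_by_address` reports page 1 while `page_by_offset` still reports page 0, which matches the scenario in the commit message.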
2022-10-19 | YJIT: Skip dumping code for the other cb on --yjit-dump-disasm (#6592) | Takashi Kokubun
YJIT: Skip dumping code for the other cb on --yjit-dump-disasm Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2022-10-19 | YJIT: fix a #[warn(unused_parens)] | Alan Wu
2022-10-19 | YJIT: fold the "asm_comments" feature into "disasm" (#6591) | Alan Wu
Previously, enabling only "disasm" didn't actually build. Since these two features are closely related and we don't really use one without the other, let's simplify and merge the two features together. Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2022-10-18 | Code clean around unused code for some architectures or features (#6581) | Jimmy Miller
Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2022-10-17 | YJIT: Interleave inline and outlined code blocks (#6460) | Takashi Kokubun
Co-authored-by: Alan Wu <alansi.xingwu@shopify.com> Co-authored-by: Maxime Chevalier-Boisvert <maxime.chevalierboisvert@shopify.com> Notes: Merged-By: k0kubun <takashikkbn@gmail.com>
2022-10-14 | More clippy fixes (#6547) | Jimmy Miller
Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2022-10-13 | fixes more clippy warnings (#6543) | Jimmy Miller
* fixes more clippy warnings
* Fix x86 c_callable to have doc_strings

Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2022-10-13 | Make op_ext an optional for code clarity (#6542) | Jimmy Miller
Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2022-10-11 | Revert "Revert "This commit implements the Object Shapes technique in CRuby."" | Jemma Issroff
This reverts commit 9a6803c90b817f70389cae10d60b50ad752da48f.
2022-10-06 | YJIT: fix ARM64 bitmask encoding for 32 bit registers (#6503) | Alan Wu
For logical instructions such as AND, there is a constraint that the N part of the bitmask immediate must be 0. We weren't respecting this condition previously and were silently emitting undefined instructions. Check for this condition in the assembler and tweak the backend to correctly detect whether a number could be encoded as an immediate in a 32 bit logical instruction. Due to the nature of the immediate encoding, the same numeric value encodes differently depending on the size of the register the instruction works on. We currently don't have cases where we use 32 bit immediates but we ran into this encoding issue during development. Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
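The register-size dependence described in this commit can be sketched with a brute-force check. This is an illustration only, not YJIT's encoder: the function names are hypothetical, and the search simply tries every rotated run of ones in every element size. It relies on the architectural rule that the N bit is 1 only for a 64-bit element, so a 32-bit instruction is limited to elements of at most 32 bits.

```rust
/// Repeat the low `size` bits of `element` across all 64 bits.
fn replicate(element: u64, size: u32) -> u64 {
    let mut out = 0u64;
    let mut shift = 0u32;
    while shift < 64 {
        out |= element << shift;
        shift += size;
    }
    out
}

/// Brute force: is `value` a rotated run of ones, repeated in
/// elements of 2, 4, ..., `max_element` bits?
fn is_bitmask_imm(value: u64, max_element: u32) -> bool {
    let mut size = 2u32;
    while size <= max_element {
        let mask = if size == 64 { u64::MAX } else { (1u64 << size) - 1 };
        for ones in 1..size {
            let run = (1u64 << ones) - 1;
            for rot in 0..size {
                // Rotate the run right by `rot` within a `size`-bit element.
                let rotated = if rot == 0 {
                    run
                } else {
                    ((run >> rot) | (run << (size - rot))) & mask
                };
                if replicate(rotated, size) == value {
                    return true;
                }
            }
        }
        size *= 2;
    }
    false
}

/// 64-bit logical instructions may use any element size (N = 0 or 1).
pub fn fits_logical_imm64(value: u64) -> bool {
    is_bitmask_imm(value, 64)
}

/// 32-bit logical instructions require N = 0, i.e. no 64-bit element.
/// The 32-bit pattern is duplicated into both halves before searching.
pub fn fits_logical_imm32(value: u32) -> bool {
    let v = value as u64;
    is_bitmask_imm(v | (v << 32), 32)
}
```

Under this sketch, `0xFFFF_FFFF` is accepted for a 64-bit register (a run of 32 ones in a 64-bit element) but rejected for a 32-bit register, where it would be an all-ones pattern.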
2022-09-30 | Revert "This commit implements the Object Shapes technique in CRuby." | Aaron Patterson
This reverts commit 68bc9e2e97d12f80df0d113e284864e225f771c2.
2022-09-30 | A bunch of clippy auto fixes for yjit (#6476) | Jimmy Miller
Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2022-09-28 | This commit implements the Object Shapes technique in CRuby. | Jemma Issroff
Object Shapes is used for accessing instance variables and representing the "frozenness" of objects. Object instances have a "shape" and the shape represents some attributes of the object (currently which instance variables are set and the "frozenness"). Shapes form a tree data structure, and when a new instance variable is set on an object, that object "transitions" to a new shape in the shape tree. Each shape has an ID that is used for caching. The shape structure is independent of class, so objects of different types can have the same shape.

For example:

```ruby
class Foo
  def initialize
    # Starts with shape id 0
    @a = 1 # transitions to shape id 1
    @b = 1 # transitions to shape id 2
  end
end

class Bar
  def initialize
    # Starts with shape id 0
    @a = 1 # transitions to shape id 1
    @b = 1 # transitions to shape id 2
  end
end

foo = Foo.new # `foo` has shape id 2
bar = Bar.new # `bar` has shape id 2
```

Both `foo` and `bar` instances have the same shape because they both set instance variables of the same name in the same order.

This technique can help to improve inline cache hits as well as generate more efficient machine code in JIT compilers.

This commit also adds some methods for debugging shapes on objects. See `RubyVM::Shape` for more details.

For more context on Object Shapes, see [Feature: #18776]

Co-Authored-By: Aaron Patterson <tenderlove@ruby-lang.org>
Co-Authored-By: Eileen M. Uchitelle <eileencodes@gmail.com>
Co-Authored-By: John Hawthorn <john@hawthorn.email>
2022-09-27 | YJIT: add assertion wrt label names (#6459) | Maxime Chevalier-Boisvert
Add assertion wrt label names Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2022-09-27 | Change IncrCounter lowering on AArch64 (#6455) | Kevin Newton
* Change IncrCounter lowering on AArch64

  Previously we were using LDADDAL which is not available on Graviton 1 chips. Instead, we're going to use an exclusive load/store group through the LDAXR/STLXR instructions.

* Update yjit/src/backend/arm64/mod.rs

Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>
Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2022-09-26 | Revert this until we can figure out WB issues or remove shapes from GC | Aaron Patterson
Revert "* expand tabs. [ci skip]"
This reverts commit 830b5b5c351c5c6efa5ad461ae4ec5085e5f0275.

Revert "This commit implements the Object Shapes technique in CRuby."
This reverts commit 9ddfd2ca004d1952be79cf1b84c52c79a55978f4.
2022-09-26 | This commit implements the Object Shapes technique in CRuby. | Jemma Issroff
Object Shapes is used for accessing instance variables and representing the "frozenness" of objects. Object instances have a "shape" and the shape represents some attributes of the object (currently which instance variables are set and the "frozenness"). Shapes form a tree data structure, and when a new instance variable is set on an object, that object "transitions" to a new shape in the shape tree. Each shape has an ID that is used for caching. The shape structure is independent of class, so objects of different types can have the same shape.

For example:

```ruby
class Foo
  def initialize
    # Starts with shape id 0
    @a = 1 # transitions to shape id 1
    @b = 1 # transitions to shape id 2
  end
end

class Bar
  def initialize
    # Starts with shape id 0
    @a = 1 # transitions to shape id 1
    @b = 1 # transitions to shape id 2
  end
end

foo = Foo.new # `foo` has shape id 2
bar = Bar.new # `bar` has shape id 2
```

Both `foo` and `bar` instances have the same shape because they both set instance variables of the same name in the same order.

This technique can help to improve inline cache hits as well as generate more efficient machine code in JIT compilers.

This commit also adds some methods for debugging shapes on objects. See `RubyVM::Shape` for more details.

For more context on Object Shapes, see [Feature: #18776]

Co-Authored-By: Aaron Patterson <tenderlove@ruby-lang.org>
Co-Authored-By: Eileen M. Uchitelle <eileencodes@gmail.com>
Co-Authored-By: John Hawthorn <john@hawthorn.email>
Notes: Merged: https://github.com/ruby/ruby/pull/6386
2022-09-14 | YJIT: Add Opnd#with_num_bits to use only 8 bits (#6359) | Takashi Kokubun
* YJIT: Add Opnd#sub_opnd to use only 8 bits
* Add with_num_bits and let arm64_split use it
* Add another assertion to with_num_bits
* Use only with_num_bits

Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2022-09-09 | Better offsets (#6315) | Kevin Newton
* Introduce InstructionOffset for AArch64

  There are a lot of instructions on AArch64 where we take an offset from PC in terms of the number of instructions. This is for loading a value relative to the PC or for jumping. We were usually accepting an A64Opnd or an i32. It can get confusing and inconsistent though because sometimes you would divide by 4 to get the number of instructions or multiply by 4 to get the number of bytes.

  This commit adds a struct that wraps an i32 in order to keep all of that logic in one place. It makes it much easier to read and reason about how these offsets are getting used.

* Use b instruction when the offset fits on AArch64

Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
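A minimal sketch of the wrapper idea from this commit, assuming a newtype over `i32` that counts 4-byte instructions. The struct name comes from the commit message; the method names (`from_insns`, `from_bytes`, `insns`, `bytes`, `fits_in`) are assumptions made for illustration and may not match the real API.

```rust
/// A PC-relative offset measured in 4-byte AArch64 instructions.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct InstructionOffset(i32);

impl InstructionOffset {
    /// Build from a count of instructions.
    pub fn from_insns(insns: i32) -> Self {
        InstructionOffset(insns)
    }

    /// Build from a byte distance; the only place that divides by 4.
    pub fn from_bytes(bytes: i32) -> Self {
        assert_eq!(bytes % 4, 0, "byte offset must be a multiple of 4");
        InstructionOffset(bytes / 4)
    }

    /// The offset as a number of instructions.
    pub fn insns(self) -> i32 {
        self.0
    }

    /// The offset as a number of bytes; the only place that multiplies by 4.
    pub fn bytes(self) -> i32 {
        self.0 * 4
    }

    /// Whether the instruction count fits a signed field of `bits` bits
    /// (for example 26 for `B`).
    pub fn fits_in(self, bits: u32) -> bool {
        let min = -(1i64 << (bits - 1));
        let max = (1i64 << (bits - 1)) - 1;
        (min..=max).contains(&(self.0 as i64))
    }
}
```

Keeping both conversions on one type means callers never divide or multiply by 4 themselves, which is the inconsistency the commit message describes.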
2022-09-01 | Let --yjit-dump-disasm=all dump ocb code as well (#6309) | Takashi Kokubun
* Let --yjit-dump-disasm=all dump ocb code as well
* Use an enum instead
* Add a None Option to DumpDisasm (#444)
* Add a None Option to DumpDisasm
* Update yjit/src/asm/mod.rs
  Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>
* Fix a build failure
* Use only a single name
* Only None will be a disabled case
* Fix cargo test
* Fix --yjit-dump-disasm=all to print outlined cb

Co-authored-by: Jimmy Miller <jimmyhmiller@gmail.com>
Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>
Notes: Merged-By: k0kubun <takashikkbn@gmail.com>
2022-08-31 | Better b.cond usage on AArch64 (#6305) | Kevin Newton
* Better b.cond usage on AArch64

  When we're lowering a conditional jump, we previously had a bit of a complicated setup where we could emit a conditional jump to skip over a jump that was the next instruction, and then write out the destination and use a branch register.

  Now instead we use the b.cond instruction if our offset fits (not common, but not unused either) and if it doesn't we write out an inverse condition to jump past loading the destination and branching directly.

* Added an inverse fn for Condition (#443)

  Prevents the need to pass two params and potentially reduces errors.

Co-authored-by: Jimmy Miller <jimmyhmiller@jimmys-mbp.lan>
Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>
Co-authored-by: Jimmy Miller <jimmyhmiller@jimmys-mbp.lan>
Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
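The lowering decision described in this commit might look roughly like the sketch below. It is not the YJIT backend: `CondJump`, `lower_cond_jump`, and `bcond_offset_fits` are hypothetical names, and only the choice between the two strategies is modelled, not the emitted instructions. It uses two architectural facts: `b.cond` carries a signed 19-bit instruction offset, and AArch64 condition codes come in pairs that differ in the lowest bit.

```rust
// A few AArch64 condition codes (4-bit encodings).
pub const EQ: u8 = 0b0000;
pub const NE: u8 = 0b0001;
pub const GE: u8 = 0b1010;
pub const LT: u8 = 0b1011;

/// Inverse of a condition code: flip the lowest bit.
/// AL/NV (0b1110, 0b1111) have no useful inverse, so reject them.
pub fn inverse(cond: u8) -> u8 {
    assert!(cond < 0b1110, "AL/NV cannot be inverted");
    cond ^ 1
}

/// The two strategies from the commit message.
#[derive(Debug, PartialEq, Eq)]
pub enum CondJump {
    /// A single `b.cond` straight to the target.
    Direct { cond: u8, insn_offset: i32 },
    /// A `b.<inverse>` that skips a load-destination-and-branch sequence.
    SkipOver { skip_cond: u8 },
}

/// `b.cond` has a signed 19-bit offset counted in instructions.
pub fn bcond_offset_fits(insn_offset: i64) -> bool {
    (-(1i64 << 18)..(1i64 << 18)).contains(&insn_offset)
}

/// Pick a strategy for a conditional jump `byte_offset` bytes away.
pub fn lower_cond_jump(cond: u8, byte_offset: i64) -> CondJump {
    assert_eq!(byte_offset % 4, 0, "targets are instruction-aligned");
    let insn_offset = byte_offset / 4;
    if bcond_offset_fits(insn_offset) {
        CondJump::Direct { cond, insn_offset: insn_offset as i32 }
    } else {
        CondJump::SkipOver { skip_cond: inverse(cond) }
    }
}
```

Having `inverse` derive the skip condition from the original one means a caller passes a single condition, which is the error-reduction the second bullet mentions.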
2022-08-29 | Fixed width immediates (https://github.com/Shopify/ruby/pull/437) | Kevin Newton
There are a lot of times when encoding AArch64 instructions that we need to represent an integer value with a custom fixed width. For example, the offset for a B instruction is 26 bits, so we store an i32 on the instruction struct and then mask it when we encode. We've been doing this masking everywhere, which has worked, but it's getting a bit copy-pasty all over the place. This commit centralizes that logic to make sure we stay consistent. Notes: Merged: https://github.com/ruby/ruby/pull/6289
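One way to centralize the masking described in this commit is a single const-generic helper. The sketch below is an assumption about the shape of such a helper, not the code from the pull request; `truncate_imm` and `encode_b` are hypothetical names. The `B` opcode bits (`0b000101` in the top six bits) and its 26-bit offset field are architectural.

```rust
/// Check that `value` fits in a signed field of `WIDTH` bits and return
/// its low `WIDTH` bits, ready to be OR-ed into an instruction word.
pub fn truncate_imm<const WIDTH: u32>(value: i32) -> u32 {
    assert!(WIDTH > 0 && WIDTH < 32, "unsupported field width");
    let min = -(1i32 << (WIDTH - 1));
    let max = (1i32 << (WIDTH - 1)) - 1;
    assert!(
        value >= min && value <= max,
        "{} does not fit in {} bits",
        value,
        WIDTH
    );
    (value as u32) & ((1u32 << WIDTH) - 1)
}

/// Encode an unconditional `B` whose offset is counted in instructions.
pub fn encode_b(insn_offset: i32) -> u32 {
    0x1400_0000 | truncate_imm::<26>(insn_offset)
}
```

For example, `encode_b(-1)` yields `0x17FF_FFFF`, a branch to the instruction immediately before it; the same helper with `::<19>` would serve a 19-bit field.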
2022-08-29 | TBZ and TBNZ for AArch64 (https://github.com/Shopify/ruby/pull/434) | Kevin Newton
Notes: Merged: https://github.com/ruby/ruby/pull/6289
2022-08-29 | LDRH and STRH for AArch64 (https://github.com/Shopify/ruby/pull/438) | Kevin Newton
Notes: Merged: https://github.com/ruby/ruby/pull/6289
2022-08-29 | Add --yjit-dump-disasm to dump every compiled code (https://github.com/Shopify/ruby/pull/430) | Takashi Kokubun
* Add --yjit-dump-disasm to dump every compiled code
* Just use get_option
* Carve out disasm_from_addr
* Avoid push_str with format!
* Share the logic through asm.compile
* This seems to negatively impact the compilation speed

Notes: Merged: https://github.com/ruby/ruby/pull/6289
2022-08-29 | Various AArch64 optimizations (https://github.com/Shopify/ruby/pull/433) | Kevin Newton
* When we're storing an immediate 0 value at a memory address, we can use STUR XZR, Xd instead of loading 0 into a register and then storing that register.
* When we're moving 0 into an argument register, we can use MOV Xd, XZR instead of loading the value into a register first.
* In the newarray instruction, we can skip looking at the stack at all if the number of values we're using is 0.

Notes: Merged: https://github.com/ruby/ruby/pull/6289
2022-08-29 | Optimize bitmask immediates (https://github.com/Shopify/ruby/pull/403) | Kevin Newton
2022-08-29 | Op::Xor for backend IR (https://github.com/Shopify/ruby/pull/397) | Kevin Newton
2022-08-29 | Fix code invalidation while OOM and OOM simulation (https://github.com/Shopify/ruby/pull/395) | Alan Wu
`YJIT.simulate_oom!` used to leave one byte of space in the code block, so our test didn't expose a problem with asserting that the write position is in bounds in `CodeBlock::set_pos`. We do the following when patching code:

1. save current write position
2. seek to middle of the code block and patch
3. restore old write position

The bounds check fails on (3) when the code block is already filled up.

Leaving one byte of space also meant that when we write that byte, we need to fill the entire code region with trapping instructions in `VirtualMem`, which made the OOM tests unnecessarily slow.

Remove the incorrect bounds check and stop leaving space in the code block when simulating OOM.
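The three patching steps in this commit can be sketched with a toy code block. `ToyCodeBlock` and its methods are hypothetical stand-ins, not YJIT's `CodeBlock`. Note one deliberate difference: the commit removed the bounds check from `set_pos`, whereas this sketch keeps a relaxed check that allows the one-past-the-end position, purely to show why a strict check fails on a full block.

```rust
/// A fixed-size buffer with a write cursor, standing in for a code block.
pub struct ToyCodeBlock {
    mem: Vec<u8>,
    write_pos: usize,
}

impl ToyCodeBlock {
    pub fn new(size: usize) -> Self {
        ToyCodeBlock { mem: vec![0; size], write_pos: 0 }
    }

    pub fn get_pos(&self) -> usize {
        self.write_pos
    }

    /// A full block has `write_pos == mem.len()`, so that position must
    /// be accepted. A strict `pos < mem.len()` check would panic when
    /// step 3 restores the cursor of a full block.
    pub fn set_pos(&mut self, pos: usize) {
        assert!(pos <= self.mem.len());
        self.write_pos = pos;
    }

    /// Write one byte; returns false when the block is out of space.
    pub fn write_byte(&mut self, byte: u8) -> bool {
        if self.write_pos < self.mem.len() {
            self.mem[self.write_pos] = byte;
            self.write_pos += 1;
            true
        } else {
            false
        }
    }

    /// Patch already-written code in place.
    pub fn patch(&mut self, at: usize, bytes: &[u8]) {
        let saved = self.get_pos(); // 1. save current write position
        self.set_pos(at); // 2. seek into the block and patch
        for &byte in bytes {
            assert!(self.write_byte(byte), "patch ran past the block");
        }
        self.set_pos(saved); // 3. restore old write position
    }

    pub fn bytes(&self) -> &[u8] {
        &self.mem
    }
}
```

Filling a 4-byte block completely and then calling `patch(1, &[0xAA, 0xBB])` succeeds here and leaves the cursor at 4, the case the original assertion rejected.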
2022-08-29 | Load mem displacement when necessary on AArch64 (https://github.com/Shopify/ruby/pull/382) | Kevin Newton
* LDR instruction for AArch64
* Split loads in arm64_split when memory address displacements do not fit
2022-08-29 | Left and right shift for IR (https://github.com/Shopify/ruby/pull/374) | Kevin Newton
* Left and right shift for IR
* Update yjit/src/backend/x86_64/mod.rs

Co-authored-by: Alan Wu <XrXr@users.noreply.github.com>
Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>
2022-08-29 | Port the YJIT defined opcode; fix C_ARG_REGS (https://github.com/Shopify/ruby/pull/342) | Noah Gibbs
2022-08-29 | A64: Fix off by one in offset encoding for BL (https://github.com/Shopify/ruby/pull/344) | Alan Wu
* A64: Fix off by one in offset encoding for BL

  It's relative to the address of the instruction not the end of it.

* A64: Fix off by one when encoding B

  It's relative to the start of the instruction not the end.

* A64: Add some tests for boundary offsets
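The off-by-one in this commit comes down to which address the offset is measured from. The sketch below contrasts the two choices; both function names are hypothetical and the code only computes the offset field, it does not encode an instruction.

```rust
/// Correct for `B`/`BL`: the offset is counted in 4-byte instructions
/// from the address of the branch instruction itself.
pub fn branch_insn_offset(branch_addr: u64, target_addr: u64) -> i64 {
    let byte_delta = target_addr as i64 - branch_addr as i64;
    assert_eq!(byte_delta % 4, 0, "targets are instruction-aligned");
    byte_delta / 4
}

/// The off-by-one variant: measuring from the end of the branch
/// instruction makes every offset one instruction too small.
pub fn branch_insn_offset_from_end(branch_addr: u64, target_addr: u64) -> i64 {
    branch_insn_offset(branch_addr + 4, target_addr)
}
```

For a branch at `0x1000` targeting `0x1008`, the first function gives 2 and the second gives 1, so a branch encoded with the second value would land one instruction short.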
2022-08-29 | Fixes (https://github.com/Shopify/ruby/pull/340) | Kevin Newton
* Fix conditional jumps to label
* Bitmask immediates cannot be u64::MAX
2022-08-29 | Fixes for AArch64 (https://github.com/Shopify/ruby/pull/338) | Kevin Newton
* Better splitting for Op::Add, Op::Sub, and Op::Cmp
* Split stores if the displacement is too large
* Use a shifted immediate argument
* Split all places where shifted immediates are used
* Add more tests to the cirrus workflow
2022-08-29 | Fixes (https://github.com/Shopify/ruby/pull/336) | Kevin Newton
* Fix bitmask encoding to u32
* Fix splitting for Op::And to account for bitmask immediate
2022-08-29 | Better splitting for Op::Test on AArch64 (https://github.com/Shopify/ruby/pull/335) | Kevin Newton
2022-08-29 | A lot of fixes coming from our pairing session (https://github.com/Shopify/ruby/pull/329) | Kevin Newton
* Move to/from SP on AArch64
* Consolidate loads and stores
* Implement LDR post-index and LDR pre-index for AArch64
* Implement STR post-index and STR pre-index for AArch64
* Module entrypoints for LDR pre/post -index and STR pre/post -index
* Use STR (pre-index) and LDR (post-index) to implement push/pop
* Go back to using MOV for to/from SP
2022-08-29 | Assert not the same register in AArch64 | Kevin Newton
2022-08-29 | BLR instruction for AArch64 (https://github.com/Shopify/ruby/pull/325) | Kevin Newton
2022-08-29 | AArch64 frames (https://github.com/Shopify/ruby/pull/324) | Kevin Newton
2022-08-29 | Conditionals (https://github.com/Shopify/ruby/pull/323) | Kevin Newton
* CSEL on AArch64
* Implement various Op::CSel* instructions
2022-08-29 | Port print_int to the new backend (https://github.com/Shopify/ruby/pull/321) | Kevin Newton
* Port print_int to the new backend
* Tests for print_int and print_str
2022-08-29 | Port print_str to new backend (https://github.com/Shopify/ruby/pull/318) | Kevin Newton
* ADR and ADRP for AArch64
* Implement Op::Jbe on X86
* Lera instruction
* Op::BakeString
* LeaPC -> LeaLabel
* Port print_str to the new backend
* Port print_value to the new backend
* Port print_ptr to the new backend
* Write null-terminators in Op::BakeString
* Fix up rebase issues on print-str port
* Add back in panic for X86 backend for unsupported instructions being lowered
* Fix target architecture
2022-08-29 | Op::CPushAll and Op::CPopAll (https://github.com/Shopify/ruby/pull/317) | Kevin Newton
Instructions for pushing all caller-save registers and the flags so that we can implement dump_insns.
2022-08-29 | Assert that the # of bytes matches for label refs (https://github.com/Shopify/ruby/pull/316) | Kevin Newton
2022-08-29 | Encode MRS and MSR for AArch64 (https://github.com/Shopify/ruby/pull/315) | Kevin Newton
2022-08-29 | Better label refs (https://github.com/Shopify/ruby/pull/310) | Kevin Newton
Previously we were using a `Box<dyn FnOnce>` to support patching the code when jumping to labels. We needed to do this because some of the closures used for patching had to capture local variables (on both X86 and ARM it was the type of condition for the conditional jumps).

To get around that, we can instead use const generics, since the condition codes are always known at compile-time. This means that the closures go from polymorphic to monomorphic, so they can be represented as an `fn` instead of a `Box<dyn FnOnce>` and fall back to a plain function pointer. This simplifies the storage of the `LabelRef` structs and should hopefully be a better default going forward.
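The const-generics approach in this commit can be sketched as follows. Only the `LabelRef` name comes from the commit message; its fields, the `COND_*` constants, the helper functions, and the toy instruction format (one condition byte followed by a 4-byte little-endian offset) are all assumptions made for illustration.

```rust
// Stand-in condition codes; the values are arbitrary for this sketch.
pub const COND_EQ: u8 = 0;
pub const COND_NE: u8 = 1;

/// A pending reference to a label. `encode` is a plain function
/// pointer, so no heap allocation or boxed closure is needed.
pub struct LabelRef {
    pub pos: usize,
    pub target: usize,
    pub encode: fn(&mut [u8], usize, i32),
}

/// The condition is a compile-time parameter instead of a captured
/// variable, so each instantiation is an ordinary function.
fn patch_cond_jump<const COND: u8>(code: &mut [u8], pos: usize, offset: i32) {
    code[pos] = COND;
    code[pos + 1..pos + 5].copy_from_slice(&offset.to_le_bytes());
}

/// Record a conditional jump at `pos` that should reach `target`.
pub fn cond_jump_ref<const COND: u8>(pos: usize, target: usize) -> LabelRef {
    LabelRef { pos, target, encode: patch_cond_jump::<COND> }
}

/// Patch every recorded reference once label positions are known.
pub fn resolve(code: &mut [u8], refs: &[LabelRef]) {
    for label_ref in refs {
        let offset = label_ref.target as i32 - label_ref.pos as i32;
        (label_ref.encode)(code, label_ref.pos, offset);
    }
}
```

A closure that captured the condition at run time could not be stored in the `fn` field; moving the condition into the type parameter is what makes the plain pointer possible. A caller would write, for example, `cond_jump_ref::<{ COND_NE }>(1, 7)` and later pass the collected refs to `resolve`.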