summaryrefslogtreecommitdiff
path: root/yjit/src/asm/x86_64
AgeCommit message (Collapse)Author
2025-06-11YJIT: x86: Fix panic writing 32-bit number with top bit setAlan Wu
Previously, `asm.mov(m32, imm32)` panicked when `imm32 > 0x80000000`. It attempted to split imm32 into a register before doing the store, but then the register size didn't match the destination size. Instead of splitting, use the `MOV r/m32, imm32` form which works for all 32-bit values. Adjust asserts that assumed that all forms undergo sign extension, which is not true for this case. See: 54edc930f9f0a658da45cfcef46648d1b6f82467 Notes: Merged: https://github.com/ruby/ruby/pull/13576
2024-01-25YJIT: Assert lea source operand typeAlan Wu
2023-11-07YJIT: Use u32 for CodePtr to save 4 bytes eachAlan Wu
We've long had a size restriction on the code memory region such that a u32 could refer to everything. This commit capitalizes on this restriction by shrinking the size of `CodePtr` to be 4 bytes from 8. To derive a full raw pointer from a `CodePtr`, one needs a base pointer. Both `CodeBlock` and `VirtualMemory` can be used for this purpose. The base pointer is readily available everywhere, except for in the case of the `jit_return` "branch". Generalize lea_label() to lea_jump_target() in the IR to delay deriving the `jit_return` address until `compile()`, when the base pointer is available. On railsbench, this yields roughly a 1% reduction to `yjit_alloc_size` (58,397,765 to 57,742,248).
2023-08-22YJIT: x64: Split mem-to-mem Insn::Store like Insn::MovAlan Wu
The ARM backend allows for this so let's make x64 consistent. Notes: Merged: https://github.com/ruby/ruby/pull/8263 Merged-By: XrXr
2023-08-11YJIT: implement codegen for rb_int_lshift (#8201)Maxime Chevalier-Boisvert
* YJIT: implement codegen for rb_int_lshift * Update yjit/src/asm/x86_64/mod.rs Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com> --------- Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com> Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2023-08-09YJIT: implement imul instruction encoding in x86 assembler (#8191)Maxime Chevalier-Boisvert
Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2023-08-04YJIT: expand bitwise shift support in x86 assembler (#8174)Maxime Chevalier-Boisvert
Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2023-02-10YJIT: add counters for polymorphic send and send with known class (#7288)Maxime Chevalier-Boisvert
Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2023-02-02Fix typos in YJIT [ci skip]Alan Wu
2023-01-18Add stats so we can keep track of x86 rel32 vs register calls (#7142)Maxime Chevalier-Boisvert
* Add stats so we can keep track of x86 rel32 vs register calls To know if we get that "prime real estate" as Alan put it. * Fix bug pointed by Alan Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2022-11-23Fix YJIT backend to account for unsigned int immediates (#6789)Jemma Issroff
YJIT: x86_64: Fix cmp with number where sign bit is set Before this commit, we were unconditionally treating unsigned ints as signed ints when counting the number of bits required for representing the immediate in machine code. When the size of the immediate matches the size of the other operand, no sign extension happens, so this was incorrect. `asm.cmp(opnd64, 0x8000_0000)` panicked even though it's encodable as `CMP r/m32, imm32`. Large shape ids were impacted by this issue. Co-Authored-By: Aaron Patterson <tenderlove@ruby-lang.org> Co-Authored-By: Alan Wu <alanwu@ruby-lang.org> Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org> Co-authored-by: Alan Wu <alanwu@ruby-lang.org> Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2022-11-15YJIT: Always encode Opnd::Value in 64 bits on x86_64 for GC offsets (#6733)Takashi Kokubun
* YJIT: Always encode Opnd::Value in 64 bits on x86_64 for GC offsets Co-authored-by: Alan Wu <alansi.xingwu@shopify.com> * Introduce heap_object_p * Leave original mov intact * Remove unneeded branches * Add a test for movabs Co-authored-by: Alan Wu <alansi.xingwu@shopify.com> Notes: Merged-By: k0kubun <takashikkbn@gmail.com>
2022-10-19YJIT: fix a #[warn(unused_parens)]Alan Wu
2022-10-19YJIT: fold the "asm_comments" feature into "disasm" (#6591)Alan Wu
Previously, enabling only "disasm" didn't actually build. Since these two features are closely related and we don't really use one without the other, let's simplify and merge the two features together. Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2022-10-13Make op_ext an optional for code clarity (#6542)Jimmy Miller
Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2022-10-11Revert "Revert "This commit implements the Object Shapes technique in CRuby.""Jemma Issroff
This reverts commit 9a6803c90b817f70389cae10d60b50ad752da48f.
2022-09-30Revert "This commit implements the Object Shapes technique in CRuby."Aaron Patterson
This reverts commit 68bc9e2e97d12f80df0d113e284864e225f771c2.
2022-09-28This commit implements the Object Shapes technique in CRuby.Jemma Issroff
Object Shapes is used for accessing instance variables and representing the "frozenness" of objects. Object instances have a "shape" and the shape represents some attributes of the object (currently which instance variables are set and the "frozenness"). Shapes form a tree data structure, and when a new instance variable is set on an object, that object "transitions" to a new shape in the shape tree. Each shape has an ID that is used for caching. The shape structure is independent of class, so objects of different types can have the same shape. For example: ```ruby class Foo def initialize # Starts with shape id 0 @a = 1 # transitions to shape id 1 @b = 1 # transitions to shape id 2 end end class Bar def initialize # Starts with shape id 0 @a = 1 # transitions to shape id 1 @b = 1 # transitions to shape id 2 end end foo = Foo.new # `foo` has shape id 2 bar = Bar.new # `bar` has shape id 2 ``` Both `foo` and `bar` instances have the same shape because they both set instance variables of the same name in the same order. This technique can help to improve inline cache hits as well as generate more efficient machine code in JIT compilers. This commit also adds some methods for debugging shapes on objects. See `RubyVM::Shape` for more details. For more context on Object Shapes, see [Feature: #18776] Co-Authored-By: Aaron Patterson <tenderlove@ruby-lang.org> Co-Authored-By: Eileen M. Uchitelle <eileencodes@gmail.com> Co-Authored-By: John Hawthorn <john@hawthorn.email>
2022-09-26Revert this until we can figure out WB issues or remove shapes from GCAaron Patterson
Revert "* expand tabs. [ci skip]" This reverts commit 830b5b5c351c5c6efa5ad461ae4ec5085e5f0275. Revert "This commit implements the Object Shapes technique in CRuby." This reverts commit 9ddfd2ca004d1952be79cf1b84c52c79a55978f4.
2022-09-26This commit implements the Object Shapes technique in CRuby.Jemma Issroff
Object Shapes is used for accessing instance variables and representing the "frozenness" of objects. Object instances have a "shape" and the shape represents some attributes of the object (currently which instance variables are set and the "frozenness"). Shapes form a tree data structure, and when a new instance variable is set on an object, that object "transitions" to a new shape in the shape tree. Each shape has an ID that is used for caching. The shape structure is independent of class, so objects of different types can have the same shape. For example: ```ruby class Foo def initialize # Starts with shape id 0 @a = 1 # transitions to shape id 1 @b = 1 # transitions to shape id 2 end end class Bar def initialize # Starts with shape id 0 @a = 1 # transitions to shape id 1 @b = 1 # transitions to shape id 2 end end foo = Foo.new # `foo` has shape id 2 bar = Bar.new # `bar` has shape id 2 ``` Both `foo` and `bar` instances have the same shape because they both set instance variables of the same name in the same order. This technique can help to improve inline cache hits as well as generate more efficient machine code in JIT compilers. This commit also adds some methods for debugging shapes on objects. See `RubyVM::Shape` for more details. For more context on Object Shapes, see [Feature: #18776] Co-Authored-By: Aaron Patterson <tenderlove@ruby-lang.org> Co-Authored-By: Eileen M. Uchitelle <eileencodes@gmail.com> Co-Authored-By: John Hawthorn <john@hawthorn.email> Notes: Merged: https://github.com/ruby/ruby/pull/6386
2022-09-14YJIT: Add Opnd#with_num_bits to use only 8 bits (#6359)Takashi Kokubun
* YJIT: Add Opnd#sub_opnd to use only 8 bits * Add with_num_bits and let arm64_split use it * Add another assertion to with_num_bits * Use only with_num_bits Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2022-08-29Port the YJIT defined opcode; fix C_ARG_REGS ↵Noah Gibbs
(https://github.com/Shopify/ruby/pull/342)
2022-08-29Better label refs (https://github.com/Shopify/ruby/pull/310)Kevin Newton
Previously we were using a `Box<dyn FnOnce>` to support patching the code when jumping to labels. We needed to do this because some of the closures that were being used to patch needed to capture local variables (on both X86 and ARM it was the type of condition for the conditional jumps). To get around that, we can instead use const generics since the condition codes are always known at compile-time. This means that the closures go from polymorphic to monomorphic, which means they can be represented as an `fn` instead of a `Box<dyn FnOnce>`, which means they can fall back to a plain function pointer. This simplifies the storage of the `LabelRef` structs and should hopefully be a better default going forward.
2022-08-29Arm64 progress (https://github.com/Shopify/ruby/pull/304)Kevin Newton
* Get initial wiring up * Split IncrCounter instruction * Breakpoints in Arm64 * Support for ORR * MOV instruction encodings * Implement JmpOpnd and CRet * Add ORN * Add MVN * PUSH, POP, CCALL for Arm64 * Some formatting and implement Op::Not for Arm64 * Consistent constants when working with the Arm64 SP * Allow OR-ing values into the memory buffer * Test lowering Arm64 ADD * Emit unconditional jumps consistently in Arm64 * Begin emitting conditional jumps for A64 * Back out some labelref changes * Remove label API that no longer exists * Use a trait for the label encoders * Encode nop * Add in nops so jumps are the same width no matter what on Arm64 * Op::Jbe for CodePtr * Pass src_addr and dst_addr instead of calculated offset to label refs * Even more jump work for Arm64 * Fix up jumps to use consistent assertions * Handle splitting Add, Sub, and Not insns for Arm64 * More Arm64 splits and various fixes * PR feedback for Arm64 support * Split up jumps and conditional jump logic
2022-08-29Make sure allocated reg size in bits matches insn out sizeMaxime Chevalier-Boisvert
2022-08-29Implement X86Reg::sub_reg() methodMaxime Chevalier-Boisvert
2022-08-29Port over putnil, putobject, and gen_leave()Maxime Chevalier-Boisvert
* Remove x86-64 dependency from codegen.rs * Port over putnil and putobject * Port over gen_leave() * Complete port of gen_leave() * Fix bug in x86 instruction splitting
2022-08-29* Arm64 Beginnings (https://github.com/Shopify/ruby/pull/291)Maxime Chevalier-Boisvert
* Initial setup for aarch64 * ADDS and SUBS * ADD and SUB for immediates * Revert moved code * Documentation * Rename Arm64* to A64* * Comments on shift types * Share sig_imm_size and unsig_imm_size
2022-08-29Implement gc offset logicMaxime Chevalier-Boisvert
2022-08-29Function to map from Opnd => X86OpndMaxime Chevalier-Boisvert
2022-08-29Start work on platform-specific codegenMaxime Chevalier-Boisvert
2022-08-29WIP backend IR sketchMaxime Chevalier-Boisvert
2022-06-14YJIT: On-demand executable memory allocation; faster boot (#5944)Alan Wu
This commit makes YJIT allocate memory for generated code gradually as needed. Previously, YJIT allocates all the memory it needs on boot in one go, leading to higher than necessary resident set size (RSS) and time spent on boot initializing the memory with a large memset(). Users should no longer need to search for a magic number to pass to `--yjit-exec-mem` since physical memory consumption should now more accurately reflect the requirement of the workload. YJIT now reserves a range of addresses on boot. This region start out with no access permission at all so buggy attempts to jump to the region crashes like before this change. To get this hardening at finer granularity than the page size, we fill each page with trapping instructions when we first allocate physical memory for the page. Most of the time applications don't need 256 MiB of executable code, so allocating on-demand ends up doing less total work than before. Case in point, a simple `ruby --yjit-call-threshold=1 -eitself` takes about half as long after this change. In terms of memory consumption, here is a table to give a rough summary of the impact: | Peak RSS in MiB | -eitself example | railsbench once | | :-------------: | ---------------: | --------------: | | before | 265 | 377 | | after | 11 | 143 | | no YJIT | 10 | 101 | A new module is introduced to handle allocation bookkeeping. `CodePtr` is moved into the module since it has a close relationship with the new `VirtualMemory` struct. This new interface has a slightly smaller surface than before in that marking a region as writable is no longer a public operation. Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2022-05-02YJIT: Remove redundant `extern crate` (#5869)Koichi ITO
Follow up https://github.com/ruby/ruby/commit/0514d81 Rust YJIT requires Rust 1.60.0 or later. So, `extern crate` looks unnecessary because it can use the following Rust 2018 edition feature: https://doc.rust-lang.org/stable/edition-guide/rust-2018/path-changes.html#no-more-extern-crate It passes the following tests. ```console % cd yjit % cargo test --features asm_comments,disasm (snip) test result: ok. 56 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s ``` Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2022-04-29YJIT: Enable default rustc lints (warnings) (#5864)Alan Wu
`rustc` performs in depth dead code analysis and issues warning even for things like unused struct fields and unconstructed enum variants. This was annoying for us during the port but hopefully they are less of an issue now. This patch enables all the unused warnings we disabled and address all the warnings we previously ignored. Generally, the approach I've taken is to use `cfg!` instead of using the `cfg` attribute and to delete code where it makes sense. I've put `#[allow(unused)]` on things we intentionally keep around for printf style debugging and on items that are too annoying to keep warning-free in all build configs. Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2022-04-29YJIT: Adopt Clippy suggestions we likeAlan Wu
This adopts most suggestions that rust-clippy is confident enough to auto apply. The manual changes mostly fix manual if-lets and take opportunities to use the `Default` trait on standard collections. Co-authored-by: Kevin Newton <kddnewton@gmail.com> Co-authored-by: Maxime Chevalier-Boisvert <maxime.chevalierboisvert@shopify.com> Notes: Merged: https://github.com/ruby/ruby/pull/5853
2022-04-27Rust YJITAlan Wu
In December 2021, we opened an [issue] to solicit feedback regarding the porting of the YJIT codebase from C99 to Rust. There were some reservations, but this project was given the go ahead by Ruby core developers and Matz. Since then, we have successfully completed the port of YJIT to Rust. The new Rust version of YJIT has reached parity with the C version, in that it passes all the CRuby tests, is able to run all of the YJIT benchmarks, and performs similarly to the C version (because it works the same way and largely generates the same machine code). We've even incorporated some design improvements, such as a more fine-grained constant invalidation mechanism which we expect will make a big difference in Ruby on Rails applications. Because we want to be careful, YJIT is guarded behind a configure option: ```shell ./configure --enable-yjit # Build YJIT in release mode ./configure --enable-yjit=dev # Build YJIT in dev/debug mode ``` By default, YJIT does not get compiled and cargo/rustc is not required. If YJIT is built in dev mode, then `cargo` is used to fetch development dependencies, but when building in release, `cargo` is not required, only `rustc`. At the moment YJIT requires Rust 1.60.0 or newer. The YJIT command-line options remain mostly unchanged, and more details about the build process are documented in `doc/yjit/yjit.md`. The CI tests have been updated and do not take any more resources than before. The development history of the Rust port is available at the following commit for interested parties: https://github.com/Shopify/ruby/commit/1fd9573d8b4b65219f1c2407f30a0a60e537f8be Our hope is that Rust YJIT will be compiled and included as a part of system packages and compiled binaries of the Ruby 3.2 release. We do not anticipate any major problems as Rust is well supported on every platform which YJIT supports, but to make sure that this process works smoothly, we would like to reach out to those who take care of building systems packages before the 3.2 release is shipped and resolve any issues that may come up. [issue]: https://bugs.ruby-lang.org/issues/18481 Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com> Co-authored-by: Noah Gibbs <the.codefolio.guy@gmail.com> Co-authored-by: Kevin Newton <kddnewton@gmail.com> Notes: Merged: https://github.com/ruby/ruby/pull/5826