path: root/yjit
Age | Commit message | Author
2022-10-11 | Revert "Revert "This commit implements the Object Shapes technique in CRuby."" | Jemma Issroff
This reverts commit 9a6803c90b817f70389cae10d60b50ad752da48f.
2022-10-06 | YJIT: add an assert for branch_stub_hit() (#6505) | Alan Wu
We set the PC in branch_stub_hit(), which only makes sense if we're running with the intended iseq for the stub. We ran into an issue caught by this while tweaking code layout.
Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2022-10-06 | YJIT: fix ARM64 bitmask encoding for 32 bit registers (#6503) | Alan Wu
For logical instructions such as AND, there is a constraint that the N part of the bitmask immediate must be 0. We weren't respecting this condition previously and were silently emitting undefined instructions.
Check for this condition in the assembler and tweak the backend to correctly detect whether a number could be encoded as an immediate in a 32 bit logical instruction. Due to the nature of the immediate encoding, the same numeric value encodes differently depending on the size of the register the instruction works on.
We currently don't have cases where we use 32 bit immediates but we ran into this encoding issue during development.
Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
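To make the register-size dependence concrete, here is a minimal Ruby sketch of an encodability check for AArch64 bitmask immediates. It is not YJIT's Rust code; the method name `bitmask_immediate?` is hypothetical, and the check simply brute-forces the architectural rule (a rotated run of ones in an element of 2 to 64 bits, replicated across the register). Only a 64-bit element needs N = 1, which is why 32 bit registers are limited to N = 0.

```ruby
# Illustrative only: can `value` be encoded as an AArch64 logical (bitmask)
# immediate for a register that is `reg_bits` wide (32 or 64)?
def bitmask_immediate?(value, reg_bits)
  full = (1 << reg_bits) - 1
  return false if value < 0 || value > full
  # All-zeros and all-ones patterns have no encoding.
  return false if value.zero? || value == full

  # Element sizes above 32 bits would require N = 1, so they are only
  # considered for 64-bit registers.
  [2, 4, 8, 16, 32, 64].select { |size| size <= reg_bits }.any? do |size|
    mask = (1 << size) - 1
    element = value & mask
    # The register value must be the element repeated across the register.
    replicated = (0...(reg_bits / size)).sum { |i| element << (i * size) }
    next false unless replicated == value
    next false if element.zero? || element == mask

    # The element must be some rotation of a contiguous run of ones.
    (0...size).any? do |rot|
      rotated = ((element >> rot) | (element << (size - rot))) & mask
      (rotated & (rotated + 1)).zero?
    end
  end
end

puts bitmask_immediate?(0xFFFF_FFFF, 32)
puts bitmask_immediate?(0xFFFF_FFFF, 64)
```

The same number, `0xFFFF_FFFF`, is rejected for a 32 bit register (it is the all-ones pattern there) but accepted for a 64 bit register, where it is a run of 32 ones inside a single 64-bit element.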
2022-10-04 | YJIT: add support for calling bmethods (#6489) | Alan Wu
* YJIT: fix a parameter name
* YJIT: add support for calling bmethods
This commit adds support for the VM_METHOD_TYPE_BMETHOD method type in YJIT. You can get these types of methods from facilities like Kernel#define_singleton_method and Module#define_method.
Even though the bodies of these methods are blocks, the parameter setup for them is exactly the same as VM_METHOD_TYPE_ISEQ, so we can reuse the same logic in gen_send_iseq(). You can see this from how vm_call_bmethod() eventually calls setup_parameters_complex() with arg_setup_method.
Bmethods do need their frame environment to be set up differently. We handle this by allowing callers of gen_send_iseq() to control the iseq, the frame flag, and the prev_ep. The `prev_ep` goes into the same location as the block handler would go into in an iseq method frame.
Co-authored-by: John Hawthorn <john@hawthorn.email>
Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
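For readers unfamiliar with the term, the following Ruby sketch shows the two facilities named above producing block-bodied methods. The class and method names (`Greeter`, `greet`, `shout`) are made up for illustration. The block keeps access to the local variable `prefix` from its defining scope, which is the kind of enclosing environment the message refers to with `prev_ep`.

```ruby
class Greeter
  prefix = "Hello"
  # Module#define_method turns the block into a method body (a bmethod).
  # The block still sees `prefix` from the surrounding class body.
  define_method(:greet) { |name| "#{prefix}, #{name}!" }
end

greeter = Greeter.new
# Kernel#define_singleton_method does the same for a single object.
greeter.define_singleton_method(:shout) { |name| greet(name).upcase }

puts greeter.greet("YJIT")
puts greeter.shout("YJIT")
```

Parameters are declared on the block exactly as they would be on a `def`, which matches the observation that parameter setup is shared with ordinary iseq methods.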
2022-10-03 | Split cmp operations that aren't 32/64 bit for arm (#6484) | Jimmy Miller
Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2022-09-30 | Fix YJIT args for rb_vm_set_ivar_idx | John Hawthorn
This was broken accidentally with the revert of shapes (it conflicted with some unrelated cleanup).
Notes: Merged: https://github.com/ruby/ruby/pull/6479
2022-09-30 | Fix YJIT build after shapes-revert | John Hawthorn
A variable had been renamed in between the merge and revert, so the build was broken. This restores it.
2022-09-30 | Revert "This commit implements the Object Shapes technique in CRuby." | Aaron Patterson
This reverts commit 68bc9e2e97d12f80df0d113e284864e225f771c2.
2022-09-30 | A bunch of clippy auto fixes for yjit (#6476) | Jimmy Miller
Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2022-09-28 | This commit implements the Object Shapes technique in CRuby. | Jemma Issroff
Object Shapes is used for accessing instance variables and representing the "frozenness" of objects. Object instances have a "shape" and the shape represents some attributes of the object (currently which instance variables are set and the "frozenness"). Shapes form a tree data structure, and when a new instance variable is set on an object, that object "transitions" to a new shape in the shape tree. Each shape has an ID that is used for caching. The shape structure is independent of class, so objects of different types can have the same shape.

For example:

```ruby
class Foo
  def initialize
    # Starts with shape id 0
    @a = 1 # transitions to shape id 1
    @b = 1 # transitions to shape id 2
  end
end

class Bar
  def initialize
    # Starts with shape id 0
    @a = 1 # transitions to shape id 1
    @b = 1 # transitions to shape id 2
  end
end

foo = Foo.new # `foo` has shape id 2
bar = Bar.new # `bar` has shape id 2
```

Both `foo` and `bar` instances have the same shape because they both set instance variables of the same name in the same order.

This technique can help to improve inline cache hits as well as generate more efficient machine code in JIT compilers.

This commit also adds some methods for debugging shapes on objects. See `RubyVM::Shape` for more details.

For more context on Object Shapes, see [Feature: #18776]

Co-Authored-By: Aaron Patterson <tenderlove@ruby-lang.org>
Co-Authored-By: Eileen M. Uchitelle <eileencodes@gmail.com>
Co-Authored-By: John Hawthorn <john@hawthorn.email>
2022-09-27 | YJIT: add assertion wrt label names (#6459) | Maxime Chevalier-Boisvert
Add assertion wrt label names
Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2022-09-27 | Change IncrCounter lowering on AArch64 (#6455) | Kevin Newton
* Change IncrCounter lowering on AArch64
Previously we were using LDADDAL which is not available on Graviton 1 chips. Instead, we're going to use an exclusive load/store group through the LDAXR/STLXR instructions.
* Update yjit/src/backend/arm64/mod.rs
Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>
Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2022-09-26 | Revert this until we can figure out WB issues or remove shapes from GC | Aaron Patterson
Revert "* expand tabs. [ci skip]"
This reverts commit 830b5b5c351c5c6efa5ad461ae4ec5085e5f0275.
Revert "This commit implements the Object Shapes technique in CRuby."
This reverts commit 9ddfd2ca004d1952be79cf1b84c52c79a55978f4.
2022-09-26 | This commit implements the Object Shapes technique in CRuby. | Jemma Issroff
Object Shapes is used for accessing instance variables and representing the "frozenness" of objects. Object instances have a "shape" and the shape represents some attributes of the object (currently which instance variables are set and the "frozenness"). Shapes form a tree data structure, and when a new instance variable is set on an object, that object "transitions" to a new shape in the shape tree. Each shape has an ID that is used for caching. The shape structure is independent of class, so objects of different types can have the same shape.

For example:

```ruby
class Foo
  def initialize
    # Starts with shape id 0
    @a = 1 # transitions to shape id 1
    @b = 1 # transitions to shape id 2
  end
end

class Bar
  def initialize
    # Starts with shape id 0
    @a = 1 # transitions to shape id 1
    @b = 1 # transitions to shape id 2
  end
end

foo = Foo.new # `foo` has shape id 2
bar = Bar.new # `bar` has shape id 2
```

Both `foo` and `bar` instances have the same shape because they both set instance variables of the same name in the same order.

This technique can help to improve inline cache hits as well as generate more efficient machine code in JIT compilers.

This commit also adds some methods for debugging shapes on objects. See `RubyVM::Shape` for more details.

For more context on Object Shapes, see [Feature: #18776]

Co-Authored-By: Aaron Patterson <tenderlove@ruby-lang.org>
Co-Authored-By: Eileen M. Uchitelle <eileencodes@gmail.com>
Co-Authored-By: John Hawthorn <john@hawthorn.email>
Notes: Merged: https://github.com/ruby/ruby/pull/6386
2022-09-23 | YJIT: Support Rust 1.58.1 for --yjit-stats on Arm (#6410) | Takashi Kokubun
* YJIT: Test Rust 1.58.1 as well on Cirrus
* YJIT: Avoid using a Rust 1.60.0 feature
* YJIT: Use autoconf to detect support
* YJIT: We actually need to run it for checking it properly
* YJIT: Try cfg!(target_feature = "lse")
* Revert "YJIT: Try cfg!(target_feature = "lse")"
  This reverts commit 4e2a9ca9a9c83052c23b5e205c91bdf79e88342e.
* YJIT: Add --features stats only when it works
* Update configure.ac
Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>
Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2022-09-22 | YJIT: add chain guards in `guard_two_fixnums` (#6422) | Maxime Chevalier-Boisvert
* Add chain guards in guard_two_fixnums, opt_eq with symbols
* Remove symbol comparison in gen_equality_specialized
Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2022-09-22 | YJIT: Refactor into gen_push_frame (#6412) | John Hawthorn
This refactors the "push frame" operation common to both gen_send_iseq and gen_send_cfunc into its own method. This allows that logic to live in one place.
Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2022-09-20 | Guard `--yjit-stats` behind `#[cfg(feature = "stats")]` (#6409) | Maxime Chevalier-Boisvert
* Guard --yjit-stats behind #[cfg(feature = "stats")]
* Only ask for --yjit-stats with dev builds on cirrus CI
* Revert "Only ask for --yjit-stats with dev builds on cirrus CI"
  This reverts commit cfb5ddfa4b9394ca240447eee02637788435b02a.
* Make it so the --yjit-stats option works for non-release builds
* Revert accidental changes
Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2022-09-20 | YJIT: Support MAKE=bmake for release build | Alan Wu
This adds support for bmake, which should allow building with `configure --enable-yjit` for the BSDs. Tested on FreeBSD 13 and on macOS with `configure MAKE=bmake` on a case-sensitive file system.
It works by including a fragment into the Makefile through the configure script, similar to common.mk. It uses the always rebuild approach to keep build system changes minimal.
Notes: Merged: https://github.com/ruby/ruby/pull/6408
2022-09-20 | YJIT: Show --yjit-stats of railsbench on CI (#6403) | Takashi Kokubun
* YJIT: Show --yjit-stats of railsbench on CI
* YJIT: Use --enable-yjit=dev to see ratio_in_yjit
* YJIT: Show master GitHub URL for quick comparison
* YJIT: Avoid making CI red by a yjit-bench failure
Notes: Merged-By: k0kubun <takashikkbn@gmail.com>
2022-09-19 | YJIT: Check if the processor supports --yjit-stats (#6401) | Takashi Kokubun
* YJIT: Add asm comment for incr_counter
* YJIT: Check if the processor supports --yjit-stats
Notes: Merged-By: k0kubun <takashikkbn@gmail.com>
2022-09-19 | Only exit if ruby2_keywords and splat together (#6395) | Jimmy Miller
Before this change, railsbench spent less time in YJIT than it did before splat support was added. This brings it back to parity.
Notes: Merged-By: k0kubun <takashikkbn@gmail.com>
2022-09-18 | Update bindgen crate (#6397) | Takashi Kokubun
to get rid of deprecated indirect dependency, ansi_term
Notes: Merged-By: k0kubun <takashikkbn@gmail.com>
2022-09-16 | Invalidate i-cache after link_labels (#6388) | Takashi Kokubun
Notes: Merged-By: k0kubun <takashikkbn@gmail.com>
2022-09-16 | Fix splat args (#6385) | Jimmy Miller
* Fix splat args
Cfuncs were not working properly so I disabled them for now. There were some checks above that were also actually preventing splat args from being called. Finally I did some basic code cleanup after realizing I didn't need to mutate argc so much.
* Add can't compile for direct cfunc splat call
* Fix typo
* Update yjit/src/codegen.rs
Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>
Notes: Merged-By: k0kubun <takashikkbn@gmail.com>
2022-09-15 | Add asm comments to make disasm more readable (#6377) | Maxime Chevalier-Boisvert
Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2022-09-14 | YJIT: Implement specialized respond_to? (#6363) | John Hawthorn
* Add rb_callable_method_entry_or_negative
* YJIT: Implement specialized respond_to?
This implements a specialized respond_to? in YJIT.
* Update yjit/src/codegen.rs
Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>
Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2022-09-14 | Initial support for VM_CALL_ARGS_SPLAT (#6341) | Jimmy Miller
* Initial support for VM_CALL_ARGS_SPLAT
This implements support for calls with splat (*) for some methods. In benchmarks this made very little difference for most benchmarks, but a large difference for binarytrees. Looking at side exits, many benchmarks now don't exit for splat, but exit for some other reason. Binarytrees however had a number of calls that used splat args that are now much faster. In my non-scientific benchmarking this made splat args performance on par with not using splat args at all.
* Fix wording and whitespace
Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>
* Get rid of side_effect reassignment
Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>
Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
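As a reminder of what a splat call site looks like in Ruby source, here is a small sketch. The method name `add3` is made up; which callee shapes YJIT actually compiles for splat is not spelled out in the message beyond "some methods", so this only illustrates the call form.

```ruby
def add3(a, b, c)
  a + b + c
end

args = [1, 2, 3]

# `*args` spreads the array into three positional arguments at the call site.
splat_total = add3(*args)

# The equivalent call without a splat, for comparison.
plain_total = add3(1, 2, 3)

puts splat_total
puts plain_total
```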
2022-09-14 | YJIT: Add Opnd#with_num_bits to use only 8 bits (#6359) | Takashi Kokubun
* YJIT: Add Opnd#sub_opnd to use only 8 bits
* Add with_num_bits and let arm64_split use it
* Add another assertion to with_num_bits
* Use only with_num_bits
Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2022-09-14 | Add comments to touch libyjit | Nobuyoshi Nakada
Notes: Merged: https://github.com/ruby/ruby/pull/6371
2022-09-14 | Touch libyjit.a which may be still old due to the cache | Nobuyoshi Nakada
Notes: Merged: https://github.com/ruby/ruby/pull/6371
2022-09-14 | Expand dependency for `$(YJIT_LIBS)` | Nobuyoshi Nakada
Currently, miniruby is **always** rebuilt when yjit is enabled, even if nothing is changed.
Notes: Merged: https://github.com/ruby/ruby/pull/6371
2022-09-09 | YJIT: Branch directly when nil? is known from types | John Hawthorn
Notes: Merged: https://github.com/ruby/ruby/pull/6350
2022-09-09 | YJIT: Branch directly when truthyness is known | John Hawthorn
Notes: Merged: https://github.com/ruby/ruby/pull/6350
2022-09-09 | YJIT: eliminate redundant mov in csel/cmov on x86 (#6348) | Maxime Chevalier-Boisvert
* Eliminate redundant mov in csel/cmov. Translate mov reg,0 into xor
* Fix x86 asm test
* Remove dbg!()
* xor optimization unsound because it resets flags
Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2022-09-09 | Better offsets (#6315) | Kevin Newton
* Introduce InstructionOffset for AArch64
There are a lot of instructions on AArch64 where we take an offset from PC in terms of the number of instructions. This is for loading a value relative to the PC or for jumping. We were usually accepting an A64Opnd or an i32. It can get confusing and inconsistent though because sometimes you would divide by 4 to get the number of instructions or multiply by 4 to get the number of bytes.
This commit adds a struct that wraps an i32 in order to keep all of that logic in one place. It makes it much easier to read and reason about how these offsets are getting used.
* Use b instruction when the offset fits on AArch64
Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
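The idea of wrapping the offset so that the divide-by-4 and multiply-by-4 conversions live in one place can be sketched as follows. This is a hypothetical Ruby rendering for illustration only; the commit describes a Rust struct wrapping an i32, and the method names `from_insns`, `from_bytes`, `insns` and `bytes` are assumptions.

```ruby
# Hypothetical sketch of a PC-relative offset counted in instructions.
class InstructionOffset
  BYTES_PER_INSN = 4 # every AArch64 instruction is 4 bytes wide

  attr_reader :insns

  # Build from a count of instructions.
  def self.from_insns(insns)
    new(insns)
  end

  # Build from a byte distance, which must be instruction-aligned.
  def self.from_bytes(bytes)
    unless (bytes % BYTES_PER_INSN).zero?
      raise ArgumentError, "byte offset must be a multiple of #{BYTES_PER_INSN}"
    end
    new(bytes / BYTES_PER_INSN)
  end

  def initialize(insns)
    @insns = insns
  end

  # Convert back to a byte distance.
  def bytes
    @insns * BYTES_PER_INSN
  end
end

backward = InstructionOffset.from_bytes(-8)
puts backward.insns
puts InstructionOffset.from_insns(3).bytes
```

Callers then ask for whichever unit they need instead of scaling a bare integer themselves.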
2022-09-08 | Remove as many unnecessary moves as possible (#6342) [tag: v3_2_0_preview2] | Kevin Newton
This commit does a bunch of stuff to try to eliminate as many unnecessary mov instructions as possible.
First, it introduces the Insn::LoadInto instruction. Previously when we needed a value to go into a specific register (like in Insn::CCall when we're putting values into the argument registers or in Insn::CRet when we're putting a value into the return register) we would first load the value and then mov it into the correct register. This resulted in a lot of duplicated work with short live ranges, since the loaded values became unnecessary almost immediately. The new instruction accepts a destination and does not interact with the register allocator at all, making it much more efficient.
We then use the new instruction when we're loading values into argument registers for AArch64 or X86_64, and when we're returning a value from AArch64. Notably we don't do it when we're returning a value from X86_64 because everything can be accomplished with a single mov anyway.
A couple of unnecessary movs were also present because when we called the split_load_opnd function in a lot of split passes we were loading all registers and instruction outputs. We no longer do that.
This commit also makes it so that UImm(0) passes through the Insn::Store split without attempting to be loaded, which allows it to take advantage of the zero register. So now instead of mov-ing 0 into a register and then calling store, it just stores XZR.
Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2022-09-01 | Allow comparing against 64-bit immediates on x86 (#6314) | Kevin Newton
Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2022-09-01 | Remove rb_iseq_each | John Hawthorn
Notes: Merged: https://github.com/ruby/ruby/pull/6187
2022-09-01 | New constant caching insn: opt_getconstant_path | John Hawthorn
Previously YARV bytecode implemented constant caching by having a pair of instructions, opt_getinlinecache and opt_setinlinecache, wrapping a series of getconstant calls (with putobject providing supporting arguments).
This commit replaces that pattern with a new instruction, opt_getconstant_path, handling both getting/setting the inline cache and fetching the constant on a cache miss.
This is implemented by storing the full constant path as a null-terminated array of IDs inside of the IC structure. idNULL is used to signal an absolute constant reference.

    $ ./miniruby --dump=insns -e '::Foo::Bar::Baz'
    == disasm: #<ISeq:<main>@-e:1 (1,0)-(1,13)> (catch: FALSE)
    0000 opt_getconstant_path <ic:0 ::Foo::Bar::Baz> ( 1)[Li]
    0002 leave

The motivation for this is that we had increasingly found the need to disassemble the instructions between the opt_getinlinecache and opt_setinlinecache in order to determine the constant we are fetching, or otherwise store metadata.
This disassembly was done:
* In opt_setinlinecache, to register the IC against the constant names it is using for granular invalidation.
* In rb_iseq_free, to unregister the IC from the invalidation table.
* In YJIT to find the position of an opt_getinlinecache instruction to invalidate it when the cache is populated
* In YJIT to register the constant names being used for invalidation.
With this change we no longer need disassembly for these (in fact rb_iseq_each is now unused), as the list of constant names being referenced is held in the IC. This should also make it possible to make more optimizations in the future.
This may also reduce the size of iseqs, as previously each constant segment required 32 bytes (on 64-bit platforms). This implementation only stores one ID per segment.
There should be no significant performance change between this and the previous implementation. Previously opt_getinlinecache was a "leaf" instruction, but it included a jump (almost always to a separate cache line). Now opt_getconstant_path is a non-leaf (it may raise/autoload/call const_missing) but it does not jump. These seem to even out.
Notes: Merged: https://github.com/ruby/ruby/pull/6187
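The same disassembly can be inspected from Ruby itself through the CRuby-specific `RubyVM::InstructionSequence` API. The sketch below assumes it runs on CRuby; whether the listing shows `opt_getconstant_path` or the older `opt_getinlinecache`/`getconstant` sequence depends on whether the interpreter includes this commit, so the script reports which one it found rather than assuming.

```ruby
# Compile a constant path without evaluating it, then look at the bytecode.
iseq = RubyVM::InstructionSequence.compile("::Foo::Bar::Baz")
disasm = iseq.disasm
puts disasm

# True only on interpreters that include the new instruction.
uses_new_insn = disasm.include?("opt_getconstant_path")
puts "opt_getconstant_path present: #{uses_new_insn}"
```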
2022-09-01 | Stop using a callee-saved register for scratch0 on aarch64 (#6312) | Takashi Kokubun
[Bug #18985]
* Callee-save x22 for aarch64
* Just use a caller-saved register
* Update yjit/src/backend/arm64/mod.rs
Co-authored-by: Alan Wu <alansi.xingwu@shopify.com>
Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>
Notes: Merged-By: k0kubun <takashikkbn@gmail.com>
2022-09-01 | Let --yjit-dump-disasm=all dump ocb code as well (#6309) | Takashi Kokubun
* Let --yjit-dump-disasm=all dump ocb code as well
* Use an enum instead
* Add a None Option to DumpDisasm (#444)
  * Add a None Option to DumpDisasm
  * Update yjit/src/asm/mod.rs
  Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>
* Fix a build failure
* Use only a single name
* Only None will be a disabled case
* Fix cargo test
* Fix --yjit-dump-disasm=all to print outlined cb
Co-authored-by: Jimmy Miller <jimmyhmiller@gmail.com>
Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>
Notes: Merged-By: k0kubun <takashikkbn@gmail.com>
2022-08-31 | Better b.cond usage on AArch64 (#6305) | Kevin Newton
* Better b.cond usage on AArch64
When we're lowering a conditional jump, we previously had a bit of a complicated setup where we could emit a conditional jump to skip over a jump that was the next instruction, and then write out the destination and use a branch register.
Now instead we use the b.cond instruction if our offset fits (not common, but not unused either) and if it doesn't we write out an inverse condition to jump past loading the destination and branching directly.
* Added an inverse fn for Condition (#443)
Prevents the need to pass two params and potentially reduces errors.
Co-authored-by: Jimmy Miller <jimmyhmiller@jimmys-mbp.lan>
Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>
Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
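The lowering decision described above can be sketched in Ruby. This is only an illustration, not the backend's Rust code: the helper names and the pseudo-assembly strings are made up, and only a few condition codes are listed. The range check reflects that B.cond carries a signed 19-bit immediate counted in instructions.

```ruby
# Each condition paired with its inverse (subset, for illustration).
INVERSE = { eq: :ne, ne: :eq, lt: :ge, ge: :lt, gt: :le, le: :gt }.freeze

# B.cond encodes a signed 19-bit offset counted in instructions.
def b_cond_fits?(insn_offset)
  insn_offset >= -(1 << 18) && insn_offset < (1 << 18)
end

# Returns pseudo-assembly lines for a conditional jump.
def lower_conditional_jump(cond, insn_offset)
  if b_cond_fits?(insn_offset)
    # Near target: a single conditional branch is enough.
    ["b.#{cond} #{insn_offset}"]
  else
    # Far target: branch on the inverse condition past an indirect jump.
    [
      "b.#{INVERSE.fetch(cond)} <skip>",
      "<load destination into scratch register>",
      "br <scratch register>",
      "<skip>:"
    ]
  end
end

puts lower_conditional_jump(:eq, 10)
puts lower_conditional_jump(:eq, 1 << 20)
```

Keeping the inverse mapping next to the conditions mirrors the point of the second bullet: the caller passes one condition and the inverse is derived rather than supplied separately.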
2022-08-30 | Skip linking rb_yjit_icache_invalidate on cargo test | Takashi Kokubun
Co-authored-by: Kevin Newton <kddnewton@gmail.com>
Notes: Merged: https://github.com/ruby/ruby/pull/6304
2022-08-29 | Check only symbol flag bits (#6301) | Takashi Kokubun
* Check only symbol flag bits
* Check all 4 bits
Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2022-08-29 | Fixed width immediates (https://github.com/Shopify/ruby/pull/437) | Kevin Newton
There are a lot of times when encoding AArch64 instructions that we need to represent an integer value with a custom fixed width. For example, the offset for a B instruction is 26 bits, so we store an i32 on the instruction struct and then mask it when we encode. We've been doing this masking everywhere, which has worked, but it's getting a bit copy-pasty all over the place. This commit centralizes that logic to make sure we stay consistent.
Notes: Merged: https://github.com/ruby/ruby/pull/6289
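A single helper that range-checks and masks a signed value to a fixed width might look like the Ruby sketch below. The name `truncate_imm` and the choice to raise on overflow are assumptions made for illustration; the actual change is in YJIT's Rust assembler and may behave differently.

```ruby
# Hypothetical helper: encode `value` as a two's-complement field that is
# `width` bits wide, rejecting values that do not fit.
def truncate_imm(value, width)
  min = -(1 << (width - 1))
  max = (1 << (width - 1)) - 1
  unless value.between?(min, max)
    raise ArgumentError, "#{value} does not fit in #{width} signed bits"
  end
  # Keep only the low `width` bits.
  value & ((1 << width) - 1)
end

# The B instruction's offset field is 26 bits wide.
puts format("%#x", truncate_imm(-1, 26))
puts format("%#x", truncate_imm(5, 26))
```

With one helper like this, every encoder calls the same code path instead of repeating its own mask.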
2022-08-29 | A64: Only clear icache when writing out new code (https://github.com/Shopify/ruby/pull/442) | Alan Wu
Previously we cleared the cache for all the code in the system when we flip memory protection, which was prohibitively expensive since the operation is not constant time. Instead, only clear the cache for the memory region of newly written code when we write out new code.
This brings the runtime for the 30k_if_else test down to about 6 seconds from the previous 45 seconds on my laptop.
Notes: Merged: https://github.com/ruby/ruby/pull/6289
2022-08-29 | TBZ and TBNZ for AArch64 (https://github.com/Shopify/ruby/pull/434) | Kevin Newton
Notes: Merged: https://github.com/ruby/ruby/pull/6289
2022-08-29 | LDRH and STRH for AArch64 (https://github.com/Shopify/ruby/pull/438) | Kevin Newton
Notes: Merged: https://github.com/ruby/ruby/pull/6289
2022-08-29 | Remove ir_ssa.rs as we aren't using it and it's now outdated | Maxime Chevalier-Boisvert
Notes: Merged: https://github.com/ruby/ruby/pull/6289