path: root/yjit/src
Age | Commit message | Author
2023-07-29 | YJIT: Drop Copy trait from Context (#8138) | Takashi Kokubun
Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2023-07-27 | YJIT: Count setivar too-complex exits (#8131) | Takashi Kokubun
Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2023-07-27 | YJIT: implement missing `asm.jg` instruction in backend (#8130) | Maxime Chevalier-Boisvert
YJIT: implement missing jg instruction in backend

While trying to implement a specialized integer left shift, I ran into a problem where we have no way to do a greater-than comparison at the moment. Surprising we went this far without ever needing it.

Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2023-07-27 | YJIT: getblockparamproxy for when block is a Proc | Alan Wu
Notes: Merged: https://github.com/ruby/ruby/pull/8124
2023-07-27 | Revert "YJIT: Fix naming for a getblockparamproxy counter" | Alan Wu
This reverts commit e7804963f09d7df7f6cce44fbb3e37809c9a15cc. Oops. The counter was for getblockparam, without "proxy", so it was aptly named. Notes: Merged: https://github.com/ruby/ruby/pull/8124
2023-07-27 | YJIT: Use dynamic dispatch for megamorphic send (#8125) | Takashi Kokubun
Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2023-07-26 | YJIT: Count the number of dynamic send dispatches (#8122) | Takashi Kokubun
Notes: Merged-By: k0kubun <takashikkbn@gmail.com>
2023-07-26 | YJIT: Fix naming for a getblockparamproxy counter | Alan Wu
The rest of the counters are prefixed with `gbpp_` and that's what `yjit.rb` uses when printing the summary. This counter wasn't included in the summary. Notes: Merged: https://github.com/ruby/ruby/pull/8121
2023-07-26 | Implement `opt_aref_with` instruction (#8118) | ywenc
* Implement gen_opt_aref_with
* Vm opt_aref_with is available
* Test opt_aref_with
* Stats for opt_aref_with

Co-authored-by: jhawthorn <jhawthorn@github.com>
Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2023-07-24 | YJIT: Fallback send instructions to vm_sendish (#8106) | Takashi Kokubun
Notes: Merged-By: k0kubun <takashikkbn@gmail.com>
2023-07-20 | YJIT: Rename exec_instruction to yjit_insns_count (#8102) | Takashi Kokubun
Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2023-07-20 | Get rid of obsoleted __bp__ references | Takashi Kokubun
2023-07-20 | YJIT: Avoid undercounting retired_in_yjit (#8038) | Takashi Kokubun
* YJIT: Count the number of failed instructions
* Rename yjit_insns_count to exec_instructions instead
* Hoist out the exec_instruction counter

Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2023-07-17 | Remove __bp__ and speed-up bmethod calls (#8060) | Alan Wu
Remove rb_control_frame_t::__bp__ and optimize bmethod calls

This commit removes the __bp__ field from rb_control_frame_t. It was introduced to help MJIT, but since MJIT was replaced by RJIT, we can use vm_base_ptr() to compute it from the SP of the previous control frame instead. Removing the field avoids needing to set it up when pushing new frames.

Simply removing __bp__ would cause crashes since RJIT and YJIT used a slightly different stack layout for bmethod calls than the interpreter. At the moment of the call, the two layouts looked as follows:

                   ┌────────────┐    ┌────────────┐
                   │ frame_base │    │ frame_base │
                   ├────────────┤    ├────────────┤
                   │    ...     │    │    ...     │
                   ├────────────┤    ├────────────┤
                   │    args    │    │    args    │
                   ├────────────┤    └────────────┘<─prev_frame_sp
                   │  receiver  │
    prev_frame_sp─>└────────────┘
                     RJIT & YJIT      interpreter

Essentially, vm_base_ptr() needs to compute the address to frame_base given prev_frame_sp in the diagrams. The presence of the receiver created an off-by-one situation.

Make the interpreter use the layout the JITs use for iseq-to-iseq bmethod calls. Doing so removes unnecessary argument shifting and vm_exec_core() re-entry from the interpreter, yielding a speed improvement visible through `benchmark/vm_defined_method.yml`:

    patched:   7578743.1 i/s
     master:   4796596.3 i/s - 1.58x slower

C-to-iseq bmethod calls now store one more VALUE than before, but that should have negligible impact on overall performance.

Note that re-entering vm_exec_core() used to be necessary for firing TracePoint events, but that's no longer the case since 9121e57a5f50bc91bae48b3b91edb283bf96cb6b.

Closes ruby/ruby#6428
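For readers unfamiliar with the term, a bmethod is a method whose body is a block, as created by `define_method`. The following is a minimal sketch of the kind of call this commit is about. The `Counter` class and its `add` method are made-up names for illustration, and the loop is not the contents of `benchmark/vm_defined_method.yml`.

```ruby
# Hypothetical example: Counter and #add are illustration-only names.
class Counter
  attr_reader :total

  def initialize
    @total = 0
  end

  # define_method with a block creates a "bmethod": the method body is the block.
  define_method(:add) do |n|
    @total += n
  end
end

counter = Counter.new
# Each call below invokes the block-backed method from Ruby code.
1_000.times { |i| counter.add(i) }
puts counter.total
```

Timing a loop like this before and after the patch is one rough way to observe the effect, though the numbers quoted above come from the project's own benchmark file.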
2023-07-17 | YJIT: refactoring to allow for fancier call threshold logic (#8078) | Maxime Chevalier-Boisvert
* YJIT: refactoring to allow for fancier call threshold logic
* Avoid potentially compiling functions multiple times.
* Update vm.c

Co-authored-by: Alan Wu <XrXr@users.noreply.github.com>
Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2023-07-13 | YJIT: Make ratio_in_yjit always available (#8064) | Takashi Kokubun
Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2023-07-13 | Remove RARRAY_CONST_PTR_TRANSIENT | Peter Zhu
RARRAY_CONST_PTR now does the same thing as RARRAY_CONST_PTR_TRANSIENT. Notes: Merged: https://github.com/ruby/ruby/pull/8071
2023-07-13 | [Feature #19730] Remove transient heap | Peter Zhu
Notes: Merged: https://github.com/ruby/ruby/pull/7942
2023-07-13 | [DOC] Removed redundant `the` | Hiroshi SHIBATA
Notes: Merged: https://github.com/ruby/ruby/pull/8067 Merged-By: nobu <nobu@ruby-lang.org>
2023-07-13 | Store object age in a bitmap | Matt Valentine-House
Closes [Feature #19729]

Previously, 2 bits of the flags on each RVALUE were reserved to store the number of GC cycles that each object has survived. This commit introduces a new bit array on the heap page, called age_bits, to store that information instead.

This patch still reserves one of the age bits in the flags (the old FL_PROMOTED0 bit, now renamed FL_PROMOTED). This is set to 0 for young objects and 1 for old objects, and is used as a performance optimisation for the write barrier. Fetching the age_bits from the heap page and doing the required math to calculate whether the object is old would slow down the write barrier, so this bit is kept in sync in the flags for fast access.

Notes: Merged: https://github.com/ruby/ruby/pull/7938
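The idea of a per-page age bitmap plus a cached "old" flag can be sketched in a few lines of Ruby. This is a toy model under stated assumptions, not the code in gc.c: `HeapPageSketch`, `AGE_BITS_PER_SLOT`, and `OLD_AGE` are hypothetical names, the promotion threshold of 3 is assumed for the sketch, and a plain Integer stands in for the bit array.

```ruby
# Illustration only: these names are hypothetical, not identifiers from gc.c.
class HeapPageSketch
  AGE_BITS_PER_SLOT = 2
  AGE_MASK = (1 << AGE_BITS_PER_SLOT) - 1
  OLD_AGE = 3 # assumed promotion threshold for this sketch

  def initialize(slots)
    @age_bits = 0                       # one Integer used as a bit array
    @promoted = Array.new(slots, false) # stands in for the FL_PROMOTED flag bit
  end

  # Read the 2-bit age stored for a slot.
  def age(slot)
    (@age_bits >> (slot * AGE_BITS_PER_SLOT)) & AGE_MASK
  end

  # Bump the age (saturating at OLD_AGE) and keep the fast-path flag in sync.
  def survive_gc(slot)
    new_age = [age(slot) + 1, OLD_AGE].min
    shift = slot * AGE_BITS_PER_SLOT
    @age_bits = (@age_bits & ~(AGE_MASK << shift)) | (new_age << shift)
    @promoted[slot] = (new_age == OLD_AGE)
  end

  # What a write barrier would consult: the cached flag, not the bitmap.
  def old?(slot)
    @promoted[slot]
  end
end

page = HeapPageSketch.new(4)
3.times { page.survive_gc(2) }
page.survive_gc(0)
p [page.age(0), page.age(2), page.old?(0), page.old?(2)]
```

The design point mirrored here is that `old?` never touches the bitmap, which is the reason the commit keeps one flag bit even after moving the age out of the object.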
2023-07-11 | YJIT: add counter for untracked gbpp exit reason (#8052) | Maxime Chevalier-Boisvert
Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2023-07-06 | YJIT: Use registers to pass stack temps to C calls (#7920) | Takashi Kokubun
* YJIT: Use registers to pass stack temps to C calls
* YJIT: Update comments in ccall
2023-07-06 | YJIT: add new stats counter for compiled ISEQ entry points (#8032) | Maxime Chevalier-Boisvert
* YJIT: add new stats counter for compiled ISEQ entry points
* Update yjit.rb

Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2023-07-05 | YJIT: Use --yjit-exec-mem-size=128 by default (#8031) | Takashi Kokubun
Notes: Merged-By: k0kubun <takashikkbn@gmail.com>
2023-07-04 | YJIT: Avoid reloading InsnOut operands (#8021) | Takashi Kokubun
Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2023-07-04 | YJIT: Break register cycles for C arguments (take 2) (#8018) | Takashi Kokubun
* Revert "Revert "YJIT: Break register cycles for C arguments (#7918)""
  This reverts commit 78ca085785460de46bfc4851a898d525c1698ef8.
* Use shifted_live_ranges for the last-insn check

Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2023-07-04 | YJIT: Fix autosplat miscomp for blocks with optionals (#8006) | Alan Wu
* YJIT: Fix autosplat miscomp for blocks with optionals
  When passing an array as the sole argument to `yield`, and the yieldee takes more than 1 optional parameter, the array is expanded similar to `*array` splat calls. This is called "autosplat" in `setup_parameters_complex()`.
  Previously, YJIT did not detect this autosplat condition. It passed the array without expanding it, deviating from interpreter behavior. Detect this condition and refuse to compile it.
  Fixes: Shopify/yjit#313
* RJIT: Fix autosplat miscomp for blocks with optionals
  This mirrors the same issue as YJIT. See previous commit.

Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
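The autosplat behavior described above can be observed from plain Ruby. This is a small sketch of the interpreter semantics only; `yield_array` is a made-up helper name, and nothing here exercises YJIT specifically unless the script is run with YJIT enabled.

```ruby
# Hypothetical helper: yields a single one-element array to the block.
def yield_array
  yield [1]
end

# Two optional parameters: the sole array argument is expanded across them,
# so a receives the element and b falls back to its default.
two_opts = yield_array { |a = :x, b = :y| [a, b] }

# One optional parameter: no autosplat, the array arrives intact.
one_opt = yield_array { |a = :x| a }

p two_opts
p one_opt
```

Comparing the two results shows why passing the array through unexpanded, as the JIT did before this fix, changes what the block sees.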
2023-06-19 | Remove taint and untrusted flags (#7958) | Nobuyoshi Nakada
* Make TAINT and UNTRUSTED flags zero
  These flags do nothing already, and should break nothing.
* Remove TAINT and UNTRUSTED macros same as functions
  These macros had been defined to use with `#ifdef`, but should not be used anymore.

Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2023-06-12 | Revert "YJIT: Break register cycles for C arguments (#7918)" | Takashi Kokubun
This reverts commit 888ba29e462075472776098f4f95eb6d3df8e730. It caused a CI failure http://ci.rvm.jp/results/trunk-yjit@ruby-sp2-docker/4598881 and I'm investigating it.
2023-06-12 | YJIT: Break register cycles for C arguments (#7918) | Takashi Kokubun
Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2023-06-08 | Remove RHASH_TRANSIENT_FLAG | Peter Zhu
Hashes are no longer allocated on the transient heap.
2023-06-06 | YJIT: Avoid identity-based known-class guards for IO objects (#7911) | Alan Wu
`IO#reopen` is very special in that it is able to change the class and singleton class of IO instances. In its presence, it is not correct to assume that IO instances have a stable class/singleton class and guard by comparing identity. Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
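A short Ruby sketch of the behavior the commit message refers to: the same object can report a different class after `IO#reopen`. The pipe and the null device are arbitrary choices made for this illustration, picked so the example does not disturb the standard streams.

```ruby
reader, writer = IO.pipe
class_before = writer.class

File.open(File::NULL, "w") do |null_file|
  # reopen re-associates writer with null_file's stream; per the commit
  # message, this can also change the receiver's class.
  writer.reopen(null_file)
end
class_after = writer.class

p [class_before, class_after]

reader.close
writer.close
```

Because the object identity of `writer` is unchanged while its class is not, a guard that only remembers "this exact object has class X" would go stale, which is the case the commit avoids.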
2023-06-06 | Unify length field for embedded and heap strings (#7908) | Peter Zhu
* Unify length field for embedded and heap strings
  The length field is of the same type and position in RString for both embedded and heap allocated strings, so we can unify it.
* Remove RSTRING_EMBED_LEN

Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2023-06-05 | YJIT: Fix a warning on cargo test (#7909) | Takashi Kokubun
Notes: Merged-By: k0kubun <takashikkbn@gmail.com>
2023-06-05 | Implement Struct on VWA | Peter Zhu
The benchmark results show that this feature has either a positive or no impact on performance. The memory usage is also mostly unchanged, except in hexapdf, where there is a decrease in RSS.

--------------  -----------  ----------  ---------  -----------  ----------  ---------  --------------  -------------
bench           master (ms)  stddev (%)  RSS (MiB)  branch (ms)  stddev (%)  RSS (MiB)  branch 1st itr  master/branch
activerecord    70.8         2.2         56.0       71.7         2.2         56.0       0.99            0.99
erubi_rails     20.5         13.6        94.7       20.5         14.3        94.2       0.93            1.00
hexapdf         2541.0       0.7         212.8      2544.4       0.7         203.4      1.00            1.00
liquid-c        65.6         0.3         38.9       65.3         0.3         38.9       1.01            1.01
liquid-compile  63.7         0.3         34.6       61.1         0.2         34.6       1.04            1.04
liquid-render   163.1        0.1         37.1       163.3        0.1         37.1       1.00            1.00
mail            139.3        0.1         50.5       137.0        0.1         50.1       0.99            1.02
psych-load      2065.7       0.1         36.9       2068.2       0.1         37.3       1.00            1.00
railsbench      2034.6       0.5         103.9      2031.9       0.5         103.8      1.02            1.00
ruby-lsp        65.3         3.1         89.8       66.2         3.0         89.7       1.01            0.99
sequel          73.2         1.0         40.3       73.4         1.0         40.3       1.00            1.00
--------------  -----------  ----------  ---------  -----------  ----------  ---------  --------------  -------------

Notes: Merged: https://github.com/ruby/ruby/pull/7871
2023-06-05 | Revert "Revert "Fix cvar caching when class is cloned"" | eileencodes
This reverts commit 10621f7cb9a0c70e568f89cce47a02e878af6778.

This was reverted because the gc integrity build started failing. We have figured out a fix so I'm reopening the PR.

Original commit message:

Fix cvar caching when class is cloned

The class variable cache that was added in ruby#4544 changed the behavior of class variables on cloned classes. As reported, when a class was cloned AND a class variable was set, and the class variable was read from the original class, reading the class variable from the cloned class would return the value from the original class.

This was happening because the IC (inline cache) is stored on the ISEQ which is shared between the original and cloned class, therefore they share the cache too.

To fix this we are now storing the `cref` in the cache so that we can check if it's equal to the current `cref`. If it's different we don't want to read from the cache. If it's the same we do. Cloned classes don't share the same cref with their original class.

This will need to be backported to 3.1 in addition to 3.2 since the bug exists in both versions.

We also added a marking function which was missing.

Fixes [Bug #19379]

Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org>
Notes: Merged: https://github.com/ruby/ruby/pull/7900
2023-06-02 | YJIT: Use #[cfg] instead of if cfg! (#7899) | Takashi Kokubun
Notes: Merged-By: k0kubun <takashikkbn@gmail.com>
2023-06-01 | Revert "Fix cvar caching when class is cloned" | Aaron Patterson
This reverts commit 77d1b082470790c17c24a2f406b4fec5d522636b.
2023-06-01 | Fix cvar caching when class is cloned | eileencodes
The class variable cache that was added in https://github.com/ruby/ruby/pull/4544 changed the behavior of class variables on cloned classes. As reported, when a class was cloned AND a class variable was set, and the class variable was read from the original class, reading the class variable from the cloned class would return the value from the original class.

This was happening because the IC (inline cache) is stored on the ISEQ which is shared between the original and cloned class, therefore they share the cache too.

To fix this we are now storing the `cref` in the cache so that we can check if it's equal to the current `cref`. If it's different we don't want to read from the cache. If it's the same we do. Cloned classes don't share the same cref with their original class.

This will need to be backported to 3.1 in addition to 3.2 since the bug exists in both versions.

We also added a marking function which was missing.

Fixes [Bug #19379]

Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org>
Notes: Merged: https://github.com/ruby/ruby/pull/7265
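The scenario in the report can be approximated with a few lines of Ruby. This is a reconstruction from the description above rather than the test added by the commit; `Original`, `Copy`, and `@@setting` are made-up names. On a Ruby build that includes the fix, the second printed value is expected to be the clone's own value; on an affected build it may instead be the original's.

```ruby
# Hypothetical names, reconstructed from the bug description.
class Original
  @@setting = :original

  def self.setting
    @@setting
  end
end

Original.setting # read once through the original class to populate the cache

Copy = Original.clone
Copy.class_variable_set(:@@setting, :copy)

p Original.setting
p Copy.setting # the read the bug report is about
```

The reader method's instruction sequence, and with it the inline cache, is shared between `Original` and `Copy`, which is why the cache needs the `cref` to tell the two apart.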
2023-05-30 | YJIT: Force showing a backtrace on panic (#7869) | Takashi Kokubun
Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2023-05-24 | Add a newline at EOF [ci skip] | Nobuyoshi Nakada
2023-05-01 | YJIT: Move exits in gen_send_iseq to functions and use ? (#7725) | Jimmy Miller
Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2023-04-25 | Generalize cfunc large array splat fix to fix many additional cases raising SystemStackError | Jeremy Evans
Originally, when 2e7bceb34ea858649e1f975a934ce1894d1f06a6 fixed cfuncs to no longer use the VM stack for large array splats, it was thought to have fully fixed Bug #4040, since the issue was fixed for methods defined in Ruby (iseqs) back in Ruby 2.2.

After additional research, I determined that the same issue affects almost all types of method calls, not just iseq and cfunc calls. There were two main types of remaining issues, important cases (where large array splat should work) and pedantic cases (where large array splat raised SystemStackError instead of ArgumentError).

Important cases:

```ruby
define_method(:a){|*a|}
a(*1380888.times)

def b(*a); end
send(:b, *1380888.times)

:b.to_proc.call(self, *1380888.times)

def d; yield(*1380888.times) end
d(&method(:b))

def self.method_missing(*a); end
not_a_method(*1380888.times)
```

Pedantic cases:

```ruby
def a; end
a(*1380888.times)
def b(_); end
b(*1380888.times)
def c(_=nil); end
c(*1380888.times)

c = Class.new do
  attr_accessor :a
  alias b a=
end.new
c.a(*1380888.times)
c.b(*1380888.times)

c = Struct.new(:a) do
  alias b a=
end.new
c.a(*1380888.times)
c.b(*1380888.times)
```

This patch fixes all usage of CALLER_SETUP_ARG with splatting a large number of arguments, and required similar fixes to use a temporary hidden array in three other cases where the VM would use the VM stack for handling a large number of arguments. However, it is possible there may be additional cases where splatting a large number of arguments still causes a SystemStackError.

This has a measurable performance impact, as it requires additional checks for a large number of arguments in many additional cases.

This change is fairly invasive, as there were many different VM functions that needed to be modified to support this. To avoid too much API change, I modified struct rb_calling_info to add a heap_argv member for storing the array, so I would not have to thread it through many functions. This struct is always stack allocated, which helps ensure GC doesn't collect it early.

Because of how invasive the changes are, and how rarely large arrays are actually splatted in Ruby code, the existing test/spec suites are not great at testing for correct behavior. To try to find and fix all issues, I tested this in CI with VM_ARGC_STACK_MAX set to -1, ensuring that a temporary array is used for all array splat method calls. This was very helpful in finding breaking cases, especially ones involving flagged keyword hashes.

Fixes [Bug #4040]

Co-authored-by: Jimmy Miller <jimmy.miller@shopify.com>
Notes: Merged: https://github.com/ruby/ruby/pull/7522
2023-04-24 | YJIT: Use general definedivar at the end of chains (#7756) | Takashi Kokubun
Notes: Merged-By: k0kubun <takashikkbn@gmail.com>
2023-04-20 | YJIT: invokesuper: Remove cme mid matching check | John Hawthorn
This check was introduced to match an assertion in the C YJIT when this was originally introduced. I don't believe it's necessary for correctness of the generated code. Co-authored-by: Adam Hess <HParker@github.com> Co-authored-by: Daniel Colson <danieljamescolson@gmail.com> Co-authored-by: Luan Vieira <luanzeba@github.com> Notes: Merged: https://github.com/ruby/ruby/pull/7740
2023-04-20 | YJIT: Merge lower_stack into the split pass (#7748) | Takashi Kokubun
Notes: Merged-By: k0kubun <takashikkbn@gmail.com>
2023-04-20 | Fix inaccurate comment | Maxime Chevalier-Boisvert
2023-04-20 | YJIT: Merge csel and mov on arm64 (#7747) | Takashi Kokubun
* YJIT: Refactor arm64_split with &mut insn
* YJIT: Merge csel and mov on arm64

Notes: Merged-By: k0kubun <takashikkbn@gmail.com>
2023-04-20 | YJIT: Avoid splitting mov for small values on arm64 (#7745) | Takashi Kokubun
* YJIT: Avoid splitting mov for small values on arm64
* Fix a comment
  Co-authored-by: Alan Wu <XrXr@users.noreply.github.com>
* YJIT: Test the 0xffff boundary

Co-authored-by: Alan Wu <XrXr@users.noreply.github.com>
Notes: Merged-By: k0kubun <takashikkbn@gmail.com>
2023-04-19 | YJIT: Replace Mov with LoadInto on arm64 (#7744) | Takashi Kokubun
* YJIT: Replace Mov with LoadInto on arm64
* YJIT: Add a test for the new pass

Notes: Merged-By: k0kubun <takashikkbn@gmail.com>