summaryrefslogtreecommitdiff
path: root/insns.def
AgeCommit message (Collapse)Author
2022-11-10Adjust indents [ci skip]Nobuyoshi Nakada
2022-09-01New constant caching insn: opt_getconstant_pathJohn Hawthorn
Previously YARV bytecode implemented constant caching by having a pair of instructions, opt_getinlinecache and opt_setinlinecache, wrapping a series of getconstant calls (with putobject providing supporting arguments). This commit replaces that pattern with a new instruction, opt_getconstant_path, handling both getting/setting the inline cache and fetching the constant on a cache miss. This is implemented by storing the full constant path as a null-terminated array of IDs inside of the IC structure. idNULL is used to signal an absolute constant reference. $ ./miniruby --dump=insns -e '::Foo::Bar::Baz' == disasm: #<ISeq:<main>@-e:1 (1,0)-(1,13)> (catch: FALSE) 0000 opt_getconstant_path <ic:0 ::Foo::Bar::Baz> ( 1)[Li] 0002 leave The motivation for this is that we had increasingly found the need to disassemble the instructions between the opt_getinlinecache and opt_setinlinecache in order to determine the constant we are fetching, or otherwise store metadata. This disassembly was done: * In opt_setinlinecache, to register the IC against the constant names it is using for granular invalidation. * In rb_iseq_free, to unregister the IC from the invalidation table. * In YJIT to find the position of a opt_getinlinecache instruction to invalidate it when the cache is populated * In YJIT to register the constant names being used for invalidation. With this change we no longe need disassemly for these (in fact rb_iseq_each is now unused), as the list of constant names being referenced is held in the IC. This should also make it possible to make more optimizations in the future. This may also reduce the size of iseqs, as previously each segment required 32 bytes (on 64-bit platforms) for each constant segment. This implementation only stores one ID per-segment. There should be no significant performance change between this and the previous implementation. Previously opt_getinlinecache was a "leaf" instruction, but it included a jump (almost always to a separate cache line). Now opt_getconstant_path is a non-leaf (it may raise/autoload/call const_missing) but it does not jump. These seem to even out. Notes: Merged: https://github.com/ruby/ruby/pull/6187
2022-08-09Add peephole optimizer for newarray(X)/expandarray(X, 0) -> opt_reverse(X)Jeremy Evans
This renames the reverse instruction to opt_reverse, since now it is only added by the optimizer. Then it uses as a more general form of swap. This optimizes multiple assignment in the popped case with more than two elements. Notes: Merged: https://github.com/ruby/ruby/pull/6158
2022-08-09Revert "Remove reverse VM instruction"Jeremy Evans
This reverts commit 5512353d97250e85c13bf10b9b32e750478cf474. Notes: Merged: https://github.com/ruby/ruby/pull/6158
2022-07-21Expand tabs [ci skip]Takashi Kokubun
[Misc #18891] Notes: Merged: https://github.com/ruby/ruby/pull/6094
2022-04-05RubyVM.stat constant cache metrics (#5766)Kevin Newton
Before the new constant cache behavior, caches were invalidated by a single global variable. You could inspect the value of this variable with RubyVM.stat(:global_constant_state). This was mostly useful to verify the behavior of the VM or to test constant loading like in Rails. With the new constant cache behavior, we introduced RubyVM.stat(:constant_cache) which returned a hash with symbol keys and integer values that represented the number of live constant caches associated with the given symbol. Additionally, we removed the old RubyVM.stat(:global_constant_state). This was proven to be not very useful, so it doesn't help you diagnose constant loading issues. So, instead we added the global constant state back into the RubyVM output. However, that number can be misleading as now when you invalidate something like `Foo::Bar::Baz` you're actually invalidating 3 different lists of inline caches. This commit attempts to get the best of both worlds. We remove RubyVM.stat(:global_constant_state) like we did originally, as it doesn't have the same semantic meaning and it could be confusing going forward. Instead we add RubyVM.stat(:constant_cache_invalidations) and RubyVM.stat(:constant_cache_misses). These two metrics should provide enough information to diagnose any constant loading issues, as well as provide a replacement for the old global constant state. Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2022-04-01Finer-grained constant cache invalidation (take 2)Kevin Newton
This commit reintroduces finer-grained constant cache invalidation. After 8008fb7 got merged, it was causing issues on token-threaded builds (such as on Windows). The issue was that when you're iterating through instruction sequences and using the translator functions to get back the instruction structs, you're either using `rb_vm_insn_null_translator` or `rb_vm_insn_addr2insn2` depending if it's a direct-threading build. `rb_vm_insn_addr2insn2` does some normalization to always return to you the non-trace version of whatever instruction you're looking at. `rb_vm_insn_null_translator` does not do that normalization. This means that when you're looping through the instructions if you're trying to do an opcode comparison, it can change depending on the type of threading that you're using. This can be very confusing. So, this commit creates a new translator function `rb_vm_insn_normalizing_translator` to always return the non-trace version so that opcode comparisons don't have to worry about different configurations. [Feature #18589] Notes: Merged: https://github.com/ruby/ruby/pull/5716
2022-03-25Revert "Finer-grained inline constant cache invalidation"Nobuyoshi Nakada
This reverts commits for [Feature #18589]: * 8008fb7352abc6fba433b99bf20763cf0d4adb38 "Update formatting per feedback" * 8f6eaca2e19828e92ecdb28b0fe693d606a03f96 "Delete ID from constant cache table if it becomes empty on ISEQ free" * 629908586b4bead1103267652f8b96b1083573a8 "Finer-grained inline constant cache invalidation" MSWin builds on AppVeyor have been crashing since the merger. Notes: Merged: https://github.com/ruby/ruby/pull/5715 Merged-By: nobu <nobu@ruby-lang.org>
2022-03-24Finer-grained inline constant cache invalidationKevin Newton
Current behavior - caches depend on a global counter. All constant mutations cause caches to be invalidated. ```ruby class A B = 1 end def foo A::B # inline cache depends on global counter end foo # populate inline cache foo # hit inline cache C = 1 # global counter increments, all caches are invalidated foo # misses inline cache due to `C = 1` ``` Proposed behavior - caches depend on name components. Only constant mutations with corresponding names will invalidate the cache. ```ruby class A B = 1 end def foo A::B # inline cache depends constants named "A" and "B" end foo # populate inline cache foo # hit inline cache C = 1 # caches that depend on the name "C" are invalidated foo # hits inline cache because IC only depends on "A" and "B" ``` Examples of breaking the new cache: ```ruby module C # Breaks `foo` cache because "A" constant is set and the cache in foo depends # on "A" and "B" class A; end end B = 1 ``` We expect the new cache scheme to be invalidated less often because names aren't frequently reused. With the cache being invalidated less, we can rely on its stability more to keep our constant references fast and reduce the need to throw away generated code in YJIT. Notes: Merged: https://github.com/ruby/ruby/pull/5433
2022-03-24Add ISEQ_BODY macroPeter Zhu
Use ISEQ_BODY macro to get the rb_iseq_constant_body of the ISeq. Using this macro will make it easier for us to change the allocation strategy of rb_iseq_constant_body when using Variable Width Allocation. Notes: Merged: https://github.com/ruby/ruby/pull/5698
2022-02-02Treat TS_ICVARC cache as separate from TS_IVC cacheJemma Issroff
Notes: Merged: https://github.com/ruby/ruby/pull/5519
2022-01-01Prefer RBOOLNobuyoshi Nakada
Notes: Merged: https://github.com/ruby/ruby/pull/5385
2021-12-02Lazily create singletons on instance_{exec,eval} (#5146)John Hawthorn
* Lazily create singletons on instance_{exec,eval} Previously when instance_exec or instance_eval was called on an object, that object would be given a singleton class so that method definitions inside the block would be added to the object rather than its class. This commit aims to improve performance by delaying the creation of the singleton class unless/until one is needed for method definition. Most of the time instance_eval is used without any method definition. This was implemented by adding a flag to the cref indicating that it represents a singleton of the object rather than a class itself. In this case CREF_CLASS returns the object's existing class, but in cases that we are defining a method (either via definemethod or VM_SPECIAL_OBJECT_CBASE which is used for undef and alias). This also happens to fix what I believe is a bug. Previously instance_eval behaved differently with regards to constant access for true/false/nil than for all other objects. I don't think this was intentional. String::Foo = "foo" "".instance_eval("Foo") # => "foo" Integer::Foo = "foo" 123.instance_eval("Foo") # => "foo" TrueClass::Foo = "foo" true.instance_eval("Foo") # NameError: uninitialized constant Foo This also slightly changes the error message when trying to define a method through instance_eval on an object which can't have a singleton class. Before: $ ruby -e '123.instance_eval { def foo; end }' -e:1:in `block in <main>': no class/module to add method (TypeError) After: $ ./ruby -e '123.instance_eval { def foo; end }' -e:1:in `block in <main>': can't define singleton (TypeError) IMO this error is a small improvement on the original and better matches the (both old and new) message when definging a method using `def self.` $ ruby -e '123.instance_eval{ def self.foo; end }' -e:1:in `block in <main>': can't define singleton (TypeError) Co-authored-by: Matthew Draper <matthew@trebex.net> * Remove "under" argument from yield_under * Move CREF_SINGLETON_SET into vm_cref_new * Simplify vm_get_const_base * Fix leaf VM_SPECIAL_OBJECT_CONST_BASE Co-authored-by: Matthew Draper <matthew@trebex.net> Notes: Merged-By: jhawthorn <john@hawthorn.email>
2021-11-18Optimize dynamic string interpolation for symbol/true/false/nil/0-9Jeremy Evans
This provides a significant speedup for symbol, true, false, nil, and 0-9, class/module, and a small speedup in most other cases. Speedups (using included benchmarks): :symbol :: 60% 0-9 :: 50% Class/Module :: 50% nil/true/false :: 20% integer :: 10% [] :: 10% "" :: 3% One reason this approach is faster is it reduces the number of VM instructions for each interpolated value. Initial idea, approach, and benchmarks from Eric Wong. I applied the same approach against the master branch, updating it to handle the significant internal changes since this was first proposed 4 years ago (such as CALL_INFO/CALL_CACHE -> CALL_DATA). I also expanded it to optimize true/false/nil/0-9/class/module, and added handling of missing methods, refined methods, and RUBY_DEBUG. This renames the tostring insn to anytostring, and adds an objtostring insn that implements the optimization. This requires making a few functions non-static, and adding some non-static functions. This disables 4 YJIT tests. Those tests should be reenabled after YJIT optimizes the new objtostring insn. Implements [Feature #13715] Co-authored-by: Eric Wong <e@80x24.org> Co-authored-by: Alan Wu <XrXr@users.noreply.github.com> Co-authored-by: Yusuke Endoh <mame@ruby-lang.org> Co-authored-by: Koichi Sasada <ko1@atdot.net> Notes: Merged: https://github.com/ruby/ruby/pull/5002 Merged-By: jeremyevans <code@jeremyevans.net>
2021-11-18Refactor setclassvariable (#5143)Eileen M. Uchitelle
We only need the cref when we have a cache miss so don't look it up until we need it. This likely speeds up class variable writes in the interpreter but also simplifies the jit code. Before ``` Warming up -------------------------------------- write a cvar 192.280k i/100ms Calculating ------------------------------------- write a cvar 1.915M (± 3.5%) i/s - 9.614M in 5.026694s ``` After ``` Warming up -------------------------------------- write a cvar 216.308k i/100ms Calculating ------------------------------------- write a cvar 2.140M (± 3.1%) i/s - 10.815M in 5.058079s ``` Followup to ruby/ruby#5137 Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2021-11-18Refactor getclassvariable (#5137)Eileen M. Uchitelle
* Refactor getclassvariable We only need the cref when we have a cache miss so don't look it up until we need it. This speeds up class variable reads in the interpreter but also simplifies the jit code. Benchmarks for master vs this branch (without yjit): Before: ``` Warming up -------------------------------------- read a cvar 1.276M i/100ms Calculating ------------------------------------- read a cvar 12.596M (± 1.7%) i/s - 63.781M in 5.064902s ``` After: ``` Warming up -------------------------------------- read a cvar 1.336M i/100ms Calculating ------------------------------------- read a cvar 13.114M (± 3.6%) i/s - 65.488M in 5.000584s ``` Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org> * Clean up function signatures / remove dead code rb_vm_getclassvariable signature has changed and we don't need rb_vm_get_cref. Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org> Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2021-10-18Eliminate some redundant checks on `num` in `newhash`Aaron Patterson
The `newhash` instruction was checking if `num` is greater than 0, but so is [`rb_hash_new_with_size`](https://github.com/ruby/ruby/blob/82e2443d8b1e3edd2607c78dddf5aac79a13492d/hash.c#L1564) as well as [`rb_hash_bulk_insert`](https://github.com/ruby/ruby/blob/82e2443d8b1e3edd2607c78dddf5aac79a13492d/hash.c#L4764). If we know the size is 0 in the instruction, we can just directly call `rb_hash_new` and only check the size once. Unfortunately, when num is greater than 0, it's still checked 3 times. Notes: Merged: https://github.com/ruby/ruby/pull/4977
2021-09-30Make Array#min/max optimization respect refined methodsJeremy Evans
Pass in ec to vm_opt_newarray_{max,min}. Avoids having to call GET_EC inside the functions, for better performance. While here, add a test for Array#min/max being redefined to test_optimization.rb. Fixes [Bug #18180] Notes: Merged: https://github.com/ruby/ruby/pull/4911 Merged-By: jeremyevans <code@jeremyevans.net>
2021-09-23Fix typo in insns.def [ci skip]Alan Wu
Notes: Merged: https://github.com/ruby/ruby/pull/4888 Merged-By: XrXr
2021-06-18Add a cache for class variableseileencodes
Redo of 34a2acdac788602c14bf05fb616215187badd504 and 931138b00696419945dc03e10f033b1f53cd50f3 which were reverted. GitHub PR #4340. This change implements a cache for class variables. Previously there was no cache for cvars. Cvar access is slow due to needing to travel all the way up th ancestor tree before returning the cvar value. The deeper the ancestor tree the slower cvar access will be. The benefits of the cache are more visible with a higher number of included modules due to the way Ruby looks up class variables. The benchmark here includes 26 modules and shows with the cache, this branch is 6.5x faster when accessing class variables. ``` compare-ruby: ruby 3.1.0dev (2021-03-15T06:22:34Z master 9e5105c) [x86_64-darwin19] built-ruby: ruby 3.1.0dev (2021-03-15T12:12:44Z add-cache-for-clas.. c6be009) [x86_64-darwin19] | |compare-ruby|built-ruby| |:--------|-----------:|---------:| |vm_cvar | 5.681M| 36.980M| | | -| 6.51x| ``` Benchmark.ips calling `ActiveRecord::Base.logger` from within a Rails application. ActiveRecord::Base.logger has 71 ancestors. The more ancestors a tree has, the more clear the speed increase. IE if Base had only one ancestor we'd see no improvement. This benchmark is run on a vanilla Rails application. Benchmark code: ```ruby require "benchmark/ips" require_relative "config/environment" Benchmark.ips do |x| x.report "logger" do ActiveRecord::Base.logger end end ``` Ruby 3.0 master / Rails 6.1: ``` Warming up -------------------------------------- logger 155.251k i/100ms Calculating ------------------------------------- ``` Ruby 3.0 with cvar cache / Rails 6.1: ``` Warming up -------------------------------------- logger 1.546M i/100ms Calculating ------------------------------------- logger 14.857M (± 4.8%) i/s - 74.198M in 5.006202s ``` Lastly we ran a benchmark to demonstate the difference between master and our cache when the number of modules increases. This benchmark measures 1 ancestor, 30 ancestors, and 100 ancestors. Ruby 3.0 master: ``` Warming up -------------------------------------- 1 module 1.231M i/100ms 30 modules 432.020k i/100ms 100 modules 145.399k i/100ms Calculating ------------------------------------- 1 module 12.210M (± 2.1%) i/s - 61.553M in 5.043400s 30 modules 4.354M (± 2.7%) i/s - 22.033M in 5.063839s 100 modules 1.434M (± 2.9%) i/s - 7.270M in 5.072531s Comparison: 1 module: 12209958.3 i/s 30 modules: 4354217.8 i/s - 2.80x (± 0.00) slower 100 modules: 1434447.3 i/s - 8.51x (± 0.00) slower ``` Ruby 3.0 with cvar cache: ``` Warming up -------------------------------------- 1 module 1.641M i/100ms 30 modules 1.655M i/100ms 100 modules 1.620M i/100ms Calculating ------------------------------------- 1 module 16.279M (± 3.8%) i/s - 82.038M in 5.046923s 30 modules 15.891M (± 3.9%) i/s - 79.459M in 5.007958s 100 modules 16.087M (± 3.6%) i/s - 81.005M in 5.041931s Comparison: 1 module: 16279458.0 i/s 100 modules: 16087484.6 i/s - same-ish: difference falls within error 30 modules: 15891406.2 i/s - same-ish: difference falls within error ``` Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org> Notes: Merged: https://github.com/ruby/ruby/pull/4544
2021-06-14[Bug #17880] Set leaf false on opt_setinlinecache (#4565)Eileen M. Uchitelle
This change fixes the bug described in https://bugs.ruby-lang.org/issues/17880. Checking `ractor_shareable_p` will cause the method to call back into Ruby. Anything calling this method can't be a leaf instruction, otherwise it could crash. By adding `attr bool leaf = false` we no longer crash because it marks the function as not a leaf. Here's a simplified reproduction script: ```ruby require "set" class Id attr_reader :db_id def initialize(db_id) @db_id = db_id end def ==(other) other.class == self.class && other.db_id == db_id end alias_method :eql?, :== def hash 10 end def <=>(other) db_id <=> other.db_id if other.is_a?(self.class) end end class Namespace IDS = Set[ Id.new(1).freeze, Id.new(2).freeze, Id.new(3).freeze, Id.new(4).freeze, ].freeze class << self def test?(id) IDS.include?(id) end end end p Namespace.test?(Id.new(1)) p Namespace.test?(Id.new(5)) ``` Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org> Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org> Notes: Merged-By: k0kubun <takashikkbn@gmail.com>
2021-05-11Revert "Filling cache values on cvar write"Aaron Patterson
This reverts commit 08de37f9fa3469365e6b5c964689ae2bae0eb9f3. This reverts commit e8ae922b62adb00a80d3d4c49f7d7b0e6026eaba.
2021-05-11Filling cache values on cvar writeeileencodes
Instead of on read. Once it's in the inline cache we never have to make one again. We want to eventually put the value into the cache, and the best opportunity to do that is when you write the value. Notes: Merged: https://github.com/ruby/ruby/pull/4340
2021-05-11Add a cache for class variableseileencodes
This change implements a cache for class variables. Previously there was no cache for cvars. Cvar access is slow due to needing to travel all the way up th ancestor tree before returning the cvar value. The deeper the ancestor tree the slower cvar access will be. The benefits of the cache are more visible with a higher number of included modules due to the way Ruby looks up class variables. The benchmark here includes 26 modules and shows with the cache, this branch is 6.5x faster when accessing class variables. ``` compare-ruby: ruby 3.1.0dev (2021-03-15T06:22:34Z master 9e5105ca45) [x86_64-darwin19] built-ruby: ruby 3.1.0dev (2021-03-15T12:12:44Z add-cache-for-clas.. c6be0093ae) [x86_64-darwin19] | |compare-ruby|built-ruby| |:--------|-----------:|---------:| |vm_cvar | 5.681M| 36.980M| | | -| 6.51x| ``` Benchmark.ips calling `ActiveRecord::Base.logger` from within a Rails application. ActiveRecord::Base.logger has 71 ancestors. The more ancestors a tree has, the more clear the speed increase. IE if Base had only one ancestor we'd see no improvement. This benchmark is run on a vanilla Rails application. Benchmark code: ```ruby require "benchmark/ips" require_relative "config/environment" Benchmark.ips do |x| x.report "logger" do ActiveRecord::Base.logger end end ``` Ruby 3.0 master / Rails 6.1: ``` Warming up -------------------------------------- logger 155.251k i/100ms Calculating ------------------------------------- ``` Ruby 3.0 with cvar cache / Rails 6.1: ``` Warming up -------------------------------------- logger 1.546M i/100ms Calculating ------------------------------------- logger 14.857M (± 4.8%) i/s - 74.198M in 5.006202s ``` Lastly we ran a benchmark to demonstate the difference between master and our cache when the number of modules increases. This benchmark measures 1 ancestor, 30 ancestors, and 100 ancestors. Ruby 3.0 master: ``` Warming up -------------------------------------- 1 module 1.231M i/100ms 30 modules 432.020k i/100ms 100 modules 145.399k i/100ms Calculating ------------------------------------- 1 module 12.210M (± 2.1%) i/s - 61.553M in 5.043400s 30 modules 4.354M (± 2.7%) i/s - 22.033M in 5.063839s 100 modules 1.434M (± 2.9%) i/s - 7.270M in 5.072531s Comparison: 1 module: 12209958.3 i/s 30 modules: 4354217.8 i/s - 2.80x (± 0.00) slower 100 modules: 1434447.3 i/s - 8.51x (± 0.00) slower ``` Ruby 3.0 with cvar cache: ``` Warming up -------------------------------------- 1 module 1.641M i/100ms 30 modules 1.655M i/100ms 100 modules 1.620M i/100ms Calculating ------------------------------------- 1 module 16.279M (± 3.8%) i/s - 82.038M in 5.046923s 30 modules 15.891M (± 3.9%) i/s - 79.459M in 5.007958s 100 modules 16.087M (± 3.6%) i/s - 81.005M in 5.041931s Comparison: 1 module: 16279458.0 i/s 100 modules: 16087484.6 i/s - same-ish: difference falls within error 30 modules: 15891406.2 i/s - same-ish: difference falls within error ``` Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org> Notes: Merged: https://github.com/ruby/ruby/pull/4340
2021-04-26Fix type-o in insns.defebrohman
"redefine" -> "redefined" Notes: Merged: https://github.com/ruby/ruby/pull/4418
2021-04-21Remove reverse VM instructionJeremy Evans
This was previously only used by the multiple assignment code, but is no longer needed after the multiple assignment execution order fix. Notes: Merged: https://github.com/ruby/ruby/pull/4398
2021-03-17Use rb_fstring for "defined" strings.Aaron Patterson
We can take advantage of fstrings to de-duplicate the defined strings. This means we don't need to keep the list of defined strings on the VM (or register them as mark objects) Notes: Merged: https://github.com/ruby/ruby/pull/4279
2021-03-17Refactor vm_defined to return a booleanAaron Patterson
We just need this function to return whether or not the thing we're looking for is defined. If it's defined, return something true, otherwise false. Notes: Merged: https://github.com/ruby/ruby/pull/4279
2021-03-17Stop calling `rb_iseq_defined_string` in vm_definedAaron Patterson
We already have access to the string from the iseqs, so we can stop calling this function. Notes: Merged: https://github.com/ruby/ruby/pull/4279
2021-03-17Store strings for `defined` in the iseqsAaron Patterson
We can know the string used for "defined" calls at compile time, then store the string in the instruction sequences Notes: Merged: https://github.com/ruby/ruby/pull/4279
2021-01-05enable constant cache on ractorsKoichi Sasada
constant cache `IC` is accessed by non-atomic manner and there are thread-safety issues, so Ruby 3.0 disables to use const cache on non-main ractors. This patch enables it by introducing `imemo_constcache` and allocates it by every re-fill of const cache like `imemo_callcache`. [Bug #17510] Now `IC` only has one entry `IC::entry` and it points to `iseq_inline_constant_cache_entry`, managed by T_IMEMO object. `IC` is atomic data structure so `rb_mjit_before_vm_ic_update()` and `rb_mjit_after_vm_ic_update()` is not needed. Notes: Merged: https://github.com/ruby/ruby/pull/4022
2020-12-25Fix a cyclic explanationTakashi Kokubun
2020-12-17encourage inlining for vm_sendish()Koichi Sasada
Some tunings. * add `inline` for vm_sendish() * pass enum instead of func ptr to vm_sendish() * reorder initial order of `calling` struct. * add ALWAYS_INLINE for vm_search_method_fastpath() * call vm_search_method_fastpath() from vm_sendish() Notes: Merged: https://github.com/ruby/ruby/pull/3922
2020-12-16Lazily move PC with RUBY_VM_CHECK_INTSTakashi Kokubun
``` $ benchmark-driver -v --rbenv 'before --jit;after --jit' --repeat-count=12 --alternate --output=all benchmark.yml before --jit: ruby 3.0.0dev (2020-12-17T06:17:46Z master 3b4d698e0b) +JIT [x86_64-linux] after --jit: ruby 3.0.0dev (2020-12-17T07:01:48Z master 843abb96f0) +JIT [x86_64-linux] last_commit=Lazily move PC with RUBY_VM_CHECK_INTS Calculating ------------------------------------- before --jit after --jit Optcarrot Lan_Master.nes 80.29343646660429 83.15779723251525 fps 82.26755637885149 85.50197941326810 83.50682959728820 88.14657804306270 85.01236533133049 88.78201988978667 87.81799334561326 88.94841008936447 87.88228562393064 89.37925215601926 88.06695585889995 89.86143277214475 88.84730834922165 90.00773346420887 90.46317871213088 90.82603371104014 90.96308347148916 91.29797694822179 90.97945938504556 91.31086331868738 91.57127890154500 91.49949184318844 ```
2020-12-16Inline getconstant on JIT (#3906)Takashi Kokubun
* Inline getconstant on JIT * Support USE_MJIT=0 Notes: Merged-By: k0kubun <takashikkbn@gmail.com>
2020-12-15fix inline method cache sync bugKoichi Sasada
`cd` is passed to method call functions to method invocation functions, but `cd` can be manipulated by other ractors simultaneously so it contains thread-safety issue. To solve this issue, this patch stores `ci` and found `cc` to `calling` and stops to pass `cd`. Notes: Merged: https://github.com/ruby/ruby/pull/3903
2020-12-10Unfortunately getinstancevariable was still not leafTakashi Kokubun
https://github.com/ruby/ruby/runs/1533401436
2020-12-10Make getinstancevariable a leaf instructionJeremy Evans
It can no longer issue a warning. Notes: Merged: https://github.com/ruby/ruby/pull/3879
2020-12-07tuning trial: newobj with current ecKoichi Sasada
Passing current ec can improve performance of newobj. This patch tries it for Array and String literals ([] and ''). Notes: Merged: https://github.com/ruby/ruby/pull/3842
2020-10-17sync RClass::ext::iv_index_tblKoichi Sasada
iv_index_tbl manages instance variable indexes (ID -> index). This data structure should be synchronized with other ractors so introduce some VM locks. This patch also introduced atomic ivar cache used by set/getinlinecache instructions. To make updating ivar cache (IVC), we changed iv_index_tbl data structure to manage (ID -> entry) and an entry points serial and index. IVC points to this entry so that cache update becomes atomically. Notes: Merged: https://github.com/ruby/ruby/pull/3662
2020-09-15Interpolated strings are no longer frozen with frozen-string-literal: trueBenoit Daloze
* Remove freezestring instruction since this was the only usage for it. * [Feature #17104] Notes: Merged: https://github.com/ruby/ruby/pull/3488
2020-07-13precalc invokebuiltin destinations卜部昌平
Noticed that struct rb_builtin_function is a purely compile-time constant. MJIT can eliminate some runtime calculations by statically generate dedicated C code generator for each builtin functions. Notes: Merged: https://github.com/ruby/ruby/pull/3305
2020-07-03Use ID instead of GENTRY for gvars. (#3278)Koichi Sasada
Use ID instead of GENTRY for gvars. Global variables are compiled into GENTRY (a pointer to struct rb_global_entry). This patch replace this GENTRY to ID and make the code simple. We need to search GENTRY from ID every time (st_lookup), so additional overhead will be introduced. However, the performance of accessing global variables is not important now a day and this simplicity helps Ractor development. Notes: Merged-By: ko1 <ko1@atdot.net>
2020-06-23Trace :return of builtin methodsTakashi Kokubun
using opt_invokebuiltin_delegate_leave insn. Since Ruby 2.7, :return of methods using builtin have not been traced properly.
2020-06-17Remove obsoleted opt_call_c_function insn (#3232)Takashi Kokubun
* Remove obsoleted opt_call_c_function insn * Keep opt_call_c_function with DEFINE_INSN_IF Notes: Merged-By: k0kubun <takashikkbn@gmail.com>
2020-06-02vm_insnhelper.c: merge opt_eq_func / opt_eql_func卜部昌平
These two function were almost identical, except in case of T_STRING/T_FLOAT. Why not merge them into one, and let the difference be handled in normal method calls (slowpath). This does not improve runtime performance for me, but at least reduces for instance rb_eql_opt from 653 bytes to 86 bytes on my machine, according to nm(1). Notes: Merged: https://github.com/ruby/ruby/pull/3169
2020-04-10Turn class variable warnings into exceptionsJeremy Evans
This changes the following warnings: * warning: class variable access from toplevel * warning: class variable @foo of D is overtaken by C into RuntimeErrors. Handle defined?(@@foo) at toplevel by returning nil instead of raising an exception (the previous behavior warned before returning nil when defined? was used). Refactor the specs to avoid the warnings even in older versions. The specs were checking for the warnings, but the purpose of the related specs as evidenced from their description is to test for behavior, not for warnings. Fixes [Bug #14541] Notes: Merged: https://github.com/ruby/ruby/pull/2987
2020-02-22Introduce disposable call-cache.Koichi Sasada
This patch contains several ideas: (1) Disposable inline method cache (IMC) for race-free inline method cache * Making call-cache (CC) as a RVALUE (GC target object) and allocate new CC on cache miss. * This technique allows race-free access from parallel processing elements like RCU. (2) Introduce per-Class method cache (pCMC) * Instead of fixed-size global method cache (GMC), pCMC allows flexible cache size. * Caching CCs reduces CC allocation and allow sharing CC's fast-path between same call-info (CI) call-sites. (3) Invalidate an inline method cache by invalidating corresponding method entries (MEs) * Instead of using class serials, we set "invalidated" flag for method entry itself to represent cache invalidation. * Compare with using class serials, the impact of method modification (add/overwrite/delete) is small. * Updating class serials invalidate all method caches of the class and sub-classes. * Proposed approach only invalidate the method cache of only one ME. See [Feature #16614] for more details. Notes: Merged: https://github.com/ruby/ruby/pull/2888
2020-02-22VALUE size packed callinfo (ci).Koichi Sasada
Now, rb_call_info contains how to call the method with tuple of (mid, orig_argc, flags, kwarg). Most of cases, kwarg == NULL and mid+argc+flags only requires 64bits. So this patch packed rb_call_info to VALUE (1 word) on such cases. If we can not represent it in VALUE, then use imemo_callinfo which contains conventional callinfo (rb_callinfo, renamed from rb_call_info). iseq->body->ci_kw_size is removed because all of callinfo is VALUE size (packed ci or a pointer to imemo_callinfo). To access ci information, we need to use these functions: vm_ci_mid(ci), _flag(ci), _argc(ci), _kwarg(ci). struct rb_call_info_kw_arg is renamed to rb_callinfo_kwarg. rb_funcallv_with_cc() and rb_method_basic_definition_p_with_cc() is temporary removed because cd->ci should be marked. Notes: Merged: https://github.com/ruby/ruby/pull/2888
2020-01-27Fixed a typo, missing "i" [ci skip]Nobuyoshi Nakada