path: root/vm_insnhelper.c
Age | Commit message | Author
2021-10-20 Extract yjit_force_iv_index and make it work when object is frozen (Alan Wu)
In an effort to simplify the logic YJIT generates for accessing instance variables, YJIT ensures that a given name-to-index mapping exists at compile time. In the case that the mapping doesn't exist, it was created by calling rb_ivar_set() with Qundef on the sample object seen at compile time. This hack doesn't work if the sample object happens to be frozen, in which case YJIT would raise a FrozenError unexpectedly. To deal with this, add a new function that only reserves the mapping but doesn't touch the object: rb_obj_ensure_iv_index_mapping(). This new function supersedes the functionality of rb_iv_index_tbl_lookup(), so the latter was removed. Reported by, and includes a test case from, John Hawthorn <john@hawthorn.email> Fixes: GH-282
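Not part of the commit message: a minimal Ruby sketch of the reported scenario, with hypothetical class and method names. The fix itself is internal to YJIT/vm_insnhelper.c; the Ruby code below only shows the shape of code that used to trip the compile-time ivar-index hack on a frozen receiver.

```ruby
# Hypothetical reproduction: reading an ivar that has no name-to-index
# mapping yet, on a frozen receiver. Compiling this accessor must not
# raise FrozenError just because the sample object is frozen.
class Sample
  def get
    @never_written # no prior write, so no ivar index mapping exists at compile time
  end
end

obj = Sample.new.freeze
p obj.get # => nil, and must not raise FrozenError
```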
2021-10-20 Add comments about special runtime routines YJIT calls (Alan Wu)
When YJIT makes calls to routines without reconstructing interpreter state through jit_prepare_routine_call(), it relies on those routines to never allocate, raise, or push/pop control frames. Add comments about this on the routines that YJIT calls. This is probably something we should dynamically verify on debug builds. It's hard to verify statically, as it requires checking all functions in the call tree. Maybe something to look at in the future.
2021-10-20 Cleanup diff against upstream. Add comments (Alan Wu)
I did a `git diff --stat` against upstream and looked at all the files that are outside of YJIT to come up with these minor changes.
2021-10-20 Put YJIT into a single compilation unit (Alan Wu)
For upstreaming, we want functions we export to be either prefixed with "rb_" or made static. Historically we haven't been following this rule, so we were "leaking" a lot of symbols, as `make leak-globals` would tell us. This change unifies all of YJIT into a single compilation unit, yjit.o, and makes everything unprefixed static to pass `make leak-globals`. This manual "unified build" setup is similar to that of vm.o. Having everything in one compilation unit allows static functions to be visible across YJIT files and removes the need for declarations in headers in some cases. Unnecessary declarations were removed. Other changes of note:
- switched to MJIT_SYMBOL_EXPORT_BEGIN, which marks things as off limits for native extensions
- the first include of each YJIT file is changed to be "internal.h"
- undefined MAP_STACK before explicitly redefining it since it collides with a definition in system headers. Consider renaming?
2021-10-20 Implement getclassvariable in yjit (eileencodes)
Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org>
2021-10-20 Add a slowpath for opt_getinlinecache (Alan Wu)
Before this change, when we encountered a constant cache that is specific to a lexical scope, we unconditionally exited. This change falls back to the interpreter's cache in this situation. This should help constant expressions in `class << self`, which is popular at Shopify due to the style guide. For simplicity, this change relies on the cache being warm while compiling to detect the need for checking the lexical scope.
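Not part of the commit: a hedged Ruby sketch (illustrative names only) of the pattern this slowpath helps, i.e. a constant referenced inside `class << self`, whose inline cache is tied to the lexical scope.

```ruby
# Illustrative only: TIMEOUT is looked up through opt_getinlinecache with a
# scope-dependent cache because the reference sits inside `class << self`.
# Previously YJIT exited unconditionally here; now it falls back to the
# interpreter's cache instead.
class Config
  TIMEOUT = 30

  class << self
    def timeout
      TIMEOUT
    end
  end
end

p Config.timeout # => 30
```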
2021-10-20 Remove vm_opt_aset (John Hawthorn)
2021-10-20 Add comments for new function (Aaron Patterson)
2021-10-20 Refactor attrset to use a function (Aaron Patterson)
This new function will do the write barrier / resize the object / check frozen for us
2021-10-20 Remove rb_opt_equality_specialized (John Hawthorn)
2021-10-20 Implement splatarray (Kevin Newton)
2021-10-20 Try to fix MJIT symbol clash with cargo cult (Maxime Chevalier-Boisvert)
2021-10-20 Implement defined bytecode (#39) (Maxime Chevalier-Boisvert)
2021-10-20 Implement setivar with a plain old function call (#34) (Maxime Chevalier-Boisvert)
* Implement setivar with a plain old function call
* Remove return
2021-10-20 Implement opt_aset as interpreter handler call (Maxime Chevalier-Boisvert)
2021-10-20 Implement opt_mod as call to interpreter function (#29) (Maxime Chevalier-Boisvert)
2021-10-20 Implement opt_eq by calling interpreter function (#28) (Maxime Chevalier-Boisvert)
2021-10-20 Implement send with alias method (#23) (Maxime Chevalier-Boisvert)
* Implement send with alias method
* Add alias_method tests
2021-10-20 Implement calls to methods with simple optional params (Alan Wu)
* Implement calls to methods with simple optional params
* Remove unnecessary MJIT_STATIC
  See comment for MJIT_STATIC. I added it not knowing whether it's required, because the function next to it has it. Don't use it and wait for problems to come up instead.
* Better naming, some comments
* Count bailing on kw-only iseqs

On railsbench:

```
opt_send_without_block exit reasons:
    bmethod                  59729 (27.7%)
    optimized_method         59137 (27.5%)
    iseq_complex_callee      41362 (19.2%)
    alias_method             33346 (15.5%)
    callsite_not_simple      19170 ( 8.9%)
    iseq_only_keywords        1300 ( 0.6%)
    kw_splat                  1299 ( 0.6%)
    cfunc_ruby_array_varg       18 ( 0.0%)
```
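Not part of the commit: a small Ruby sketch (hypothetical method) of the "simple optional params" call shape that this change starts compiling instead of bailing on.

```ruby
# A callee with simple (non-keyword, non-splat) optional parameters.
def greet(name, greeting = "hello")
  "#{greeting}, #{name}"
end

p greet("world")       # callee fills in the default
p greet("world", "hi") # caller supplies the optional argument
```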
2021-10-20 YJIT: Fancier opt_getinlinecache (Alan Wu)
Make sure `opt_getinlinecache` is in a block all on its own, and invalidate it from the interpreter when `opt_setinlinecache` runs. It will recompile with a filled cache the second time around. This lets YJIT run well when the IC for a constant is cold.
2021-10-20 YJIT: lazy polymorphic getinstancevariable (Alan Wu)
Lazily compile out a chain of checks for different known classes and whether `self` embeds its ivars or not.
* Remove trailing whitespace
* Get proper address in Capstone disassembly
* Lowercase address in Capstone disassembly
  Capstone uses lowercase for jump targets in generated listings. Let's match it.
* Use the same successor in getivar guard chains
  Cuts down on duplication
* Address reviews
* Fix copy-paste error
* Add a comment
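Not part of the commit: a hedged Ruby sketch (illustrative names) of a polymorphic getinstancevariable site, where one `@x` read sees receivers of different classes and ivar layouts, which is what the lazily compiled guard chain handles.

```ruby
module Readable
  def read_x
    @x # one getinstancevariable site shared by receivers of different classes
  end
end

class A
  include Readable
  def initialize; @x = 1; end         # @x stored at one ivar index
end

class B
  include Readable
  def initialize; @y = 0; @x = 2; end # different class and ivar layout
end

[A.new, B.new].each { |obj| p obj.read_x } # => 1, then 2
```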
2021-10-20 WIP JIT-to-JIT returns (Maxime Chevalier-Boisvert)
2021-10-19 Remove useless casts (Nobuyoshi Nakada)
2021-10-19 Get rid of type-punning cast (Nobuyoshi Nakada)
2021-10-03 Using NIL_P macro instead of `== Qnil` (S.H)
Notes: Merged: https://github.com/ruby/ruby/pull/4925 Merged-By: nobu <nobu@ruby-lang.org>
2021-10-01 Introduce rb_vm_call_with_refinements to DRY up a few calls (Jeremy Evans)
Notes: Merged: https://github.com/ruby/ruby/pull/4919
2021-09-30 Make Array#min/max optimization respect refined methods (Jeremy Evans)
Pass in ec to vm_opt_newarray_{max,min}. Avoids having to call GET_EC inside the functions, for better performance. While here, add a test for Array#min/max being redefined to test_optimization.rb. Fixes [Bug #18180] Notes: Merged: https://github.com/ruby/ruby/pull/4911 Merged-By: jeremyevans <code@jeremyevans.net>
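Not part of the commit: a hedged Ruby sketch (illustrative refinement) of the behavior being fixed: with a refinement of Array#min active, the optimized `[a, b].min` path should dispatch to the refined method rather than the built-in.

```ruby
module MinRefinement
  refine Array do
    def min
      :refined_min
    end
  end
end

using MinRefinement

a, b = 3, 1
# `[a, b].min` is eligible for the opt_newarray_min optimization;
# after this fix the optimized path still respects the active refinement.
p [a, b].min # => :refined_min
```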
2021-09-19 Extract hook macro for attributes (Nobuyoshi Nakada)
2021-09-11 Remove printf family from the mjit header (Nobuyoshi Nakada)
Linking printf family functions makes mjit objects link in unnecessary code. Notes: Merged: https://github.com/ruby/ruby/pull/4820
2021-08-29 Support tracing of attr_reader and attr_writer (Jeremy Evans)
In vm_call_method_each_type, check for c_call and c_return events before dispatching to vm_call_ivar and vm_call_attrset. With this approach, the call cache will still dispatch directly to those functions, so this change will only decrease performance for the first (uncached) call, and even then, the performance decrease is very minimal. This approach requires that we clear the call caches when tracing is enabled or disabled. The approach currently switches all vm_call_ivar and vm_call_attrset call caches to vm_call_general any time tracing is enabled or disabled. So it could theoretically result in a slowdown for code that constantly enables or disables tracing. This approach does not handle targeted tracepoints, but from my testing, c_call and c_return events are not supported for targeted tracepoints, so that shouldn't matter. This includes a benchmark showing the performance decrease is minimal if detectable at all. Fixes [Bug #16383] Fixes [Bug #10470] Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com> Notes: Merged: https://github.com/ruby/ruby/pull/4767
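Not part of the commit: a hedged TracePoint sketch (illustrative class) of what becomes observable: calls to attr_reader/attr_writer-defined methods now fire c_call/c_return events.

```ruby
class Point
  attr_reader :x
  attr_writer :y

  def initialize
    @x = 1
  end
end

point = Point.new

# attr_reader/attr_writer are normally dispatched via vm_call_ivar /
# vm_call_attrset; after this change those calls also emit trace events.
trace = TracePoint.new(:c_call, :c_return) do |tp|
  puts "#{tp.event} #{tp.method_id}" if [:x, :y=].include?(tp.method_id)
end

trace.enable do
  point.x     # now reported as c_call/c_return for :x
  point.y = 2 # now reported as c_call/c_return for :y=
end
```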
2021-08-11 Get rid of type-punning pointer casts [Bug #18062] (Nobuyoshi Nakada)
Notes: Merged: https://github.com/ruby/ruby/pull/4716
2021-08-02 Using RBOOL macro (S.H)
Notes: Merged: https://github.com/ruby/ruby/pull/4695 Merged-By: nobu <nobu@ruby-lang.org>
2021-06-23 Refactor class variable cache functions (Nobuyoshi Nakada)
Extracted repeated code as update_classvariable_cache. When the cvc table is not set in getclassvariable, an empty table was created, but it has no id and would cause [BUG], so the code was made the same as setclassvariable.
2021-06-18 Add a cache for class variables (eileencodes)
Redo of 34a2acdac788602c14bf05fb616215187badd504 and 931138b00696419945dc03e10f033b1f53cd50f3 which were reverted. GitHub PR #4340.

This change implements a cache for class variables. Previously there was no cache for cvars. Cvar access is slow due to needing to travel all the way up the ancestor tree before returning the cvar value. The deeper the ancestor tree, the slower cvar access will be. The benefits of the cache are more visible with a higher number of included modules due to the way Ruby looks up class variables. The benchmark here includes 26 modules and shows that with the cache, this branch is 6.5x faster when accessing class variables.

```
compare-ruby: ruby 3.1.0dev (2021-03-15T06:22:34Z master 9e5105c) [x86_64-darwin19]
built-ruby: ruby 3.1.0dev (2021-03-15T12:12:44Z add-cache-for-clas.. c6be009) [x86_64-darwin19]

|        |compare-ruby|built-ruby|
|:-------|-----------:|---------:|
|vm_cvar |      5.681M|   36.980M|
|        |           -|     6.51x|
```

Benchmark.ips calling `ActiveRecord::Base.logger` from within a Rails application. ActiveRecord::Base.logger has 71 ancestors. The more ancestors a tree has, the clearer the speed increase; i.e. if Base had only one ancestor we'd see no improvement. This benchmark is run on a vanilla Rails application.

Benchmark code:

```ruby
require "benchmark/ips"
require_relative "config/environment"

Benchmark.ips do |x|
  x.report "logger" do
    ActiveRecord::Base.logger
  end
end
```

Ruby 3.0 master / Rails 6.1:

```
Warming up --------------------------------------
              logger   155.251k i/100ms
Calculating -------------------------------------
```

Ruby 3.0 with cvar cache / Rails 6.1:

```
Warming up --------------------------------------
              logger     1.546M i/100ms
Calculating -------------------------------------
              logger     14.857M (± 4.8%) i/s -     74.198M in   5.006202s
```

Lastly we ran a benchmark to demonstrate the difference between master and our cache when the number of modules increases. This benchmark measures 1 ancestor, 30 ancestors, and 100 ancestors.

Ruby 3.0 master:

```
Warming up --------------------------------------
            1 module     1.231M i/100ms
          30 modules   432.020k i/100ms
         100 modules   145.399k i/100ms
Calculating -------------------------------------
            1 module     12.210M (± 2.1%) i/s -     61.553M in   5.043400s
          30 modules      4.354M (± 2.7%) i/s -     22.033M in   5.063839s
         100 modules      1.434M (± 2.9%) i/s -      7.270M in   5.072531s

Comparison:
            1 module: 12209958.3 i/s
          30 modules:  4354217.8 i/s - 2.80x  (± 0.00) slower
         100 modules:  1434447.3 i/s - 8.51x  (± 0.00) slower
```

Ruby 3.0 with cvar cache:

```
Warming up --------------------------------------
            1 module     1.641M i/100ms
          30 modules     1.655M i/100ms
         100 modules     1.620M i/100ms
Calculating -------------------------------------
            1 module     16.279M (± 3.8%) i/s -     82.038M in   5.046923s
          30 modules     15.891M (± 3.9%) i/s -     79.459M in   5.007958s
         100 modules     16.087M (± 3.6%) i/s -     81.005M in   5.041931s

Comparison:
            1 module: 16279458.0 i/s
         100 modules: 16087484.6 i/s - same-ish: difference falls within error
          30 modules: 15891406.2 i/s - same-ish: difference falls within error
```

Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org>

Notes: Merged: https://github.com/ruby/ruby/pull/4544
2021-05-11 Revert "Filling cache values on cvar write" (Aaron Patterson)
This reverts commit 08de37f9fa3469365e6b5c964689ae2bae0eb9f3. This reverts commit e8ae922b62adb00a80d3d4c49f7d7b0e6026eaba.
2021-05-11 Filling cache values on cvar write (eileencodes)
Instead of on read. Once it's in the inline cache we never have to make one again. We want to eventually put the value into the cache, and the best opportunity to do that is when you write the value. Notes: Merged: https://github.com/ruby/ruby/pull/4340
2021-05-11 Add a cache for class variables (eileencodes)
This change implements a cache for class variables. Previously there was no cache for cvars. Cvar access is slow due to needing to travel all the way up the ancestor tree before returning the cvar value. The deeper the ancestor tree, the slower cvar access will be. The benefits of the cache are more visible with a higher number of included modules due to the way Ruby looks up class variables. The benchmark here includes 26 modules and shows that with the cache, this branch is 6.5x faster when accessing class variables.

```
compare-ruby: ruby 3.1.0dev (2021-03-15T06:22:34Z master 9e5105ca45) [x86_64-darwin19]
built-ruby: ruby 3.1.0dev (2021-03-15T12:12:44Z add-cache-for-clas.. c6be0093ae) [x86_64-darwin19]

|        |compare-ruby|built-ruby|
|:-------|-----------:|---------:|
|vm_cvar |      5.681M|   36.980M|
|        |           -|     6.51x|
```

Benchmark.ips calling `ActiveRecord::Base.logger` from within a Rails application. ActiveRecord::Base.logger has 71 ancestors. The more ancestors a tree has, the clearer the speed increase; i.e. if Base had only one ancestor we'd see no improvement. This benchmark is run on a vanilla Rails application.

Benchmark code:

```ruby
require "benchmark/ips"
require_relative "config/environment"

Benchmark.ips do |x|
  x.report "logger" do
    ActiveRecord::Base.logger
  end
end
```

Ruby 3.0 master / Rails 6.1:

```
Warming up --------------------------------------
              logger   155.251k i/100ms
Calculating -------------------------------------
```

Ruby 3.0 with cvar cache / Rails 6.1:

```
Warming up --------------------------------------
              logger     1.546M i/100ms
Calculating -------------------------------------
              logger     14.857M (± 4.8%) i/s -     74.198M in   5.006202s
```

Lastly we ran a benchmark to demonstrate the difference between master and our cache when the number of modules increases. This benchmark measures 1 ancestor, 30 ancestors, and 100 ancestors.

Ruby 3.0 master:

```
Warming up --------------------------------------
            1 module     1.231M i/100ms
          30 modules   432.020k i/100ms
         100 modules   145.399k i/100ms
Calculating -------------------------------------
            1 module     12.210M (± 2.1%) i/s -     61.553M in   5.043400s
          30 modules      4.354M (± 2.7%) i/s -     22.033M in   5.063839s
         100 modules      1.434M (± 2.9%) i/s -      7.270M in   5.072531s

Comparison:
            1 module: 12209958.3 i/s
          30 modules:  4354217.8 i/s - 2.80x  (± 0.00) slower
         100 modules:  1434447.3 i/s - 8.51x  (± 0.00) slower
```

Ruby 3.0 with cvar cache:

```
Warming up --------------------------------------
            1 module     1.641M i/100ms
          30 modules     1.655M i/100ms
         100 modules     1.620M i/100ms
Calculating -------------------------------------
            1 module     16.279M (± 3.8%) i/s -     82.038M in   5.046923s
          30 modules     15.891M (± 3.9%) i/s -     79.459M in   5.007958s
         100 modules     16.087M (± 3.6%) i/s -     81.005M in   5.041931s

Comparison:
            1 module: 16279458.0 i/s
         100 modules: 16087484.6 i/s - same-ish: difference falls within error
          30 modules: 15891406.2 i/s - same-ish: difference falls within error
```

Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org>

Notes: Merged: https://github.com/ruby/ruby/pull/4340
2021-05-07 Protoized old pre-ANSI K&R style declarations and definitions (Nobuyoshi Nakada)
2021-04-23 Add back checks for empty kw splat with tests (#4405) (Alan Wu)
This reverts commit a224ce8150f2bc687cf79eb415c931d87a4cd247. Turns out the checks are needed to handle splatting an array with an empty ruby2 keywords hash. Notes: Merged-By: XrXr
2021-04-23 Remove part of comment that is no longer accurate (Jeremy Evans)
In Ruby 2.7, empty keyword splats could be added back for backwards compatibility. However, that stopped in Ruby 3.0.
2021-04-23 Remove unnecessary checks for empty kw splat (Alan Wu)
These two checks are surrounded by an if that ensures the call site is not a kw splat call site. Notes: Merged: https://github.com/ruby/ruby/pull/4404 Merged-By: XrXr
2021-04-02 fix return from orphan Proc in lambda (Koichi Sasada)
A "return" statement in a Proc in a lambda like: `lambda{ proc{ return }.call }` should return outer lambda block. However, the inner Proc can become orphan Proc from the lambda block. This "return" escape outer-scope like method, but this behavior was decieded as a bug. [Bug #17105] This patch raises LocalJumpError by checking the proc is orphan or not from lambda blocks before escaping by "return". Most of tests are written by Jeremy Evans https://github.com/ruby/ruby/pull/4223 Notes: Merged: https://github.com/ruby/ruby/pull/4347
2021-03-17 return bool instead of VALUE (Aaron Patterson)
Notes: Merged: https://github.com/ruby/ruby/pull/4279
2021-03-17 Refactor vm_defined to return a boolean (Aaron Patterson)
We just need this function to return whether or not the thing we're looking for is defined. If it's defined, return something true, otherwise false. Notes: Merged: https://github.com/ruby/ruby/pull/4279
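Not part of the commit: a hedged Ruby-level illustration of the split described above. The VM-side check (vm_defined) only needs a yes/no answer; the descriptive string that `defined?` returns is taken from the instruction sequence (see the next two commits).

```ruby
# The result strings are produced separately from the boolean "is it defined?" check.
p defined?(@not_set) # => nil (not defined)
p defined?(String)   # => "constant"
p defined?(puts)     # => "method"
@ivar = 1
p defined?(@ivar)    # => "instance-variable"
```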
2021-03-17 Stop calling `rb_iseq_defined_string` in vm_defined (Aaron Patterson)
We already have access to the string from the iseqs, so we can stop calling this function. Notes: Merged: https://github.com/ruby/ruby/pull/4279
2021-03-10 Remove DEFINED_IVAR2 from enum (John Hawthorn)
This version of defined? doesn't seem to be possible to emit anymore. Notes: Merged: https://github.com/ruby/ruby/pull/4253
2021-02-13 opt_equality_by_mid for rb_equal_opt (Koichi Sasada)
This patch improves the performance of sequential and parallel execution of rb_equal() (and rb_eql()). [Bug #17497] rb_equal_opt (and rb_eql_opt) does not have its own call data (cd), and it wastes time initializing cd. This patch introduces opt_equality_by_mid() to check equality without cd. Furthermore, current master uses "static" cd in rb_equal_opt (and rb_eql_opt), which hurts CPU caches on multi-thread execution. Now they are gone, so there is no bottleneck on parallel execution. Notes: Merged: https://github.com/ruby/ruby/pull/4177
2021-01-29 global call-cache cache table for rb_funcall* (Koichi Sasada)
The rb_funcall* family (rb_funcall(), rb_funcallv(), ...) invokes a Ruby method on a given receiver. Ruby 2.7 introduced an inline method cache backed by static memory for these calls. However, Ruby 3.0 reimplemented the method cache data structures and that inline cache was removed, so rb_funcall* had to search for methods every time. In most cases the per-Class Method Cache (pCMC) helps, but pCMC requires VM-wide locking, which hurts performance on multi-Ractor execution, especially when all Ractors call methods via rb_funcall*. This patch introduces a Global Call-Cache Cache Table (gccct) for rb_funcall*. Call caches were introduced in Ruby 3.0 to manage method cache entries atomically, and gccct enables method caching without VM-wide locking. This table solves the performance issue on multi-Ractor execution. [Bug #17497] Ruby-level method invocation does not use gccct because it already has an inline method cache and the table size is limited. Basically, rb_funcall* is not used frequently, so 1023 entries should be enough. We will revisit the table size if it is not enough. Notes: Merged: https://github.com/ruby/ruby/pull/4129
2021-01-17 Suppress the warning for the invalid method_explorer case (Nobuyoshi Nakada)
2021-01-13 Revert "[Bug #11213] let defined?(super) call respond_to_missing?" (Nobuyoshi Nakada)
This reverts commit fac2498e0299f13dffe4f09a7dd7657fb49bf643 for now, due to [Bug #17509], the breakage in the case `super` is called in `respond_to?`. Notes: Merged: https://github.com/ruby/ruby/pull/4057