summaryrefslogtreecommitdiff
path: root/vm_insnhelper.c
AgeCommit message (Collapse)Author
2020-12-17encourage inlining for vm_sendish()Koichi Sasada
Some tunings. * add `inline` for vm_sendish() * pass enum instead of func ptr to vm_sendish() * reorder initial order of `calling` struct. * add ALWAYS_INLINE for vm_search_method_fastpath() * call vm_search_method_fastpath() from vm_sendish() Notes: Merged: https://github.com/ruby/ruby/pull/3922
2020-12-16Inline getconstant on JIT (#3906)Takashi Kokubun
* Inline getconstant on JIT * Support USE_MJIT=0 Notes: Merged-By: k0kubun <takashikkbn@gmail.com>
2020-12-16tuning vm_setivar_slowpath() more.Koichi Sasada
specify inline/noinline code for is_attr condition.
2020-12-16remove unused functionKoichi Sasada
2020-12-16tuning ivar setKoichi Sasada
* make rb_init_iv_list() simple * introduce vm_setivar_slowpath() for cache miss cases ../clean/miniruby is 647ee6f091. Calculating ------------------------------------- ./miniruby ../clean/miniruby ../ruby_2_7/miniruby vm_ivar_init 7.388M 6.814M 5.771M i/s - 30.000M times in 4.060420s 4.402534s 5.198781s vm_ivar_init_subclass 2.158M 2.147M 1.974M i/s - 3.000M times in 1.390328s 1.397587s 1.519951s vm_ivar_set 128.607M 97.931M 140.668M i/s - 30.000M times in 0.233269s 0.306338s 0.213268s vm_ivar 144.315M 151.722M 117.734M i/s - 30.000M times in 0.207879s 0.197730s 0.254811s Comparison: vm_ivar_init ./miniruby: 7388398.8 i/s ../clean/miniruby: 6814257.1 i/s - 1.08x slower ../ruby_2_7/miniruby: 5770583.9 i/s - 1.28x slower vm_ivar_init_subclass ./miniruby: 2157763.6 i/s ../clean/miniruby: 2146557.0 i/s - 1.01x slower ../ruby_2_7/miniruby: 1973747.9 i/s - 1.09x slower vm_ivar_set ../ruby_2_7/miniruby: 140668063.8 i/s ./miniruby: 128606912.1 i/s - 1.09x slower ../clean/miniruby: 97931027.8 i/s - 1.44x slower vm_ivar ../clean/miniruby: 151722121.9 i/s ./miniruby: 144314526.5 i/s - 1.05x slower ../ruby_2_7/miniruby: 117734305.5 i/s - 1.29x slower Notes: Merged: https://github.com/ruby/ruby/pull/3912
2020-12-16fix typoKoichi Sasada
2020-12-15add several debug countersKoichi Sasada
add cc_found_in_ccs (renamed from cc_found_ccs), cc_not_found_in_ccs, call0_public, call0_other debug counters to measure more details. also it contains several modification. Notes: Merged: https://github.com/ruby/ruby/pull/3903
2020-12-15fix inline method cache sync bugKoichi Sasada
`cd` is passed to method call functions to method invocation functions, but `cd` can be manipulated by other ractors simultaneously so it contains thread-safety issue. To solve this issue, this patch stores `ci` and found `cc` to `calling` and stops to pass `cd`. Notes: Merged: https://github.com/ruby/ruby/pull/3903
2020-12-14fix condition and add another debug counterKoichi Sasada
mc_inline_miss_same_def is added to check same method or not. Also the mc_inline_miss_same_cc calculation was fixed.
2020-12-14add debug counters to survey the IMC missKoichi Sasada
2020-12-14create ccs with 0 capaKoichi Sasada
ccs is created with default 4 capa, but it is too much, so start with 0 and extend to 1, 2, 4, 8, ... Notes: Merged: https://github.com/ruby/ruby/pull/3892
2020-12-12fix ivar with shareable objects issueKoichi Sasada
Instance variables of sharable objects are accessible only from main ractor, so we need to check it correctly. Notes: Merged: https://github.com/ruby/ruby/pull/3887
2020-12-10Remove the uninitialized instance variable verbose mode warningJeremy Evans
This speeds up all instance variable access, even when not in verbose mode. Uninitialized instance variable warnings were rarely helpful, and resulted in slower code if you wanted to avoid warnings when run in verbose mode. Implements [Feature #17055] Notes: Merged: https://github.com/ruby/ruby/pull/3879
2020-12-01rb_ext_ractor_safe() to declare ractor-safe extKoichi Sasada
C extensions can violate the ractor-safety, so only ractor-safe C extensions (C methods) can run on non-main ractors. rb_ext_ractor_safe(true) declares that the successive defined methods are ractor-safe. Otherwiwze, defined methods checked they are invoked in main ractor and raise an error if invoked at non-main ractors. [Feature #17307] Notes: Merged: https://github.com/ruby/ruby/pull/3824
2020-11-26Revert "Set VM_FRAME_FLAG_FINISH at once on MJIT"Takashi Kokubun
This reverts commit 4d2c8edca69884a41d2f843d36023e3decdb9872. Unfortunately this seems to cause several issues: https://github.com/ruby/ruby/runs/1462188376?check_suite_focus=true http://ci.rvm.jp/results/trunk-mjit-wait@phosphorus-docker/3272802
2020-11-26Set VM_FRAME_FLAG_FINISH at once on MJITTakashi Kokubun
Performance is probably improved? $ benchmark-driver -v --rbenv 'before --jit;after --jit' --repeat-count=12 --alternate --output=all benchmark.yml before --jit: ruby 3.0.0dev (2020-11-27T04:37:47Z master 69e77e81dc) +JIT [x86_64-linux] after --jit: ruby 3.0.0dev (2020-11-27T05:28:19Z master df6b05c6dd) +JIT [x86_64-linux] last_commit=Set VM_FRAME_FLAG_FINISH at once Calculating ------------------------------------- before --jit after --jit Optcarrot Lan_Master.nes 80.89292998533379 82.19497327502751 fps 80.93130641142331 85.13943315260148 81.06214830270119 87.43757879797808 82.29172808453910 87.89942441487113 84.61206450455929 87.91309779491075 85.44545883567997 87.98026086648694 86.02923132404449 88.03081060383973 86.07411817365879 88.14650206137341 86.34348799602836 88.32791633649961 87.90257338977324 88.57599644892220 88.58006509876580 88.67426384743277 89.26611118140011 88.81669430874207 This should have no bad impact on VM because this function is ALWAYS_INLINE.
2020-11-25Prefer rb_module_new() over rb_define_module_id()Alan Wu
rb_define_module_id() doesn't do anything with its parameter so it's a bit confusing.
2020-11-20[Bug #11213] let defined?(super) call respond_to_missing?Nobuyoshi Nakada
Notes: Merged: https://github.com/ruby/ruby/pull/3777
2020-11-16Set allocator on class creationAlan Wu
Allocating an instance of a class uses the allocator for the class. When the class has no allocator set, Ruby looks for it in the super class (see rb_get_alloc_func()). It's uncommon for classes created from Ruby code to ever have an allocator set, so it's common during the allocation process to search all the way to BasicObject from the class with which the allocation is being performed. This makes creating instances of classes that have long ancestry chains more expensive than creating instances of classes have that shorter ancestry chains. Setting the allocator at class creation time removes the need to perform a search for the alloctor during allocation. This is a breaking change for C-extensions that assume that classes created from Ruby code have no allocator set. Libraries that setup a class hierarchy in Ruby code and then set the allocator on some parent class, for example, can experience breakage. This seems like an unusual use case and hopefully it is rare or non-existent in practice. Rails has many classes that have upwards of 60 elements in the ancestry chain and benchmark shows a significant improvement for allocating with a class that includes 64 modules. ``` pre: ruby 3.0.0dev (2020-11-12T14:39:27Z master 6325866421) post: ruby 3.0.0dev (2020-11-12T20:15:30Z cut-allocator-lookup) Comparison: allocate_8_deep post: 10336985.6 i/s pre: 8691873.1 i/s - 1.19x slower allocate_32_deep post: 10423181.2 i/s pre: 6264879.1 i/s - 1.66x slower allocate_64_deep post: 10541851.2 i/s pre: 4936321.5 i/s - 2.14x slower allocate_128_deep post: 10451505.0 i/s pre: 3031313.5 i/s - 3.45x slower ``` Notes: Merged: https://github.com/ruby/ruby/pull/3764
2020-11-13Improve error message when subclassing non-ClassJeremy Evans
Fixes [Bug #14726] Notes: Merged: https://github.com/ruby/ruby/pull/3330
2020-11-09Add debug counter for ivar inline cache misses that could hitAaron Patterson
This commit adds a debug counter for the case where the inline cache *missed* but the ivar index table has an entry for that ivar. This is a case where a polymorphic cache could help Notes: Merged: https://github.com/ruby/ruby/pull/3750
2020-11-09Avoid slow path ivar settingAaron Patterson
If the ivar index table exists, we can avoid the slowest path for setting ivars. Notes: Merged: https://github.com/ruby/ruby/pull/3750
2020-11-09Update vm_insnhelper.cAaron Patterson
Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com> Notes: Merged: https://github.com/ruby/ruby/pull/3740
2020-11-09Remove iv table size checkAaron Patterson
iv tables cannot shrink. If the inline cache was ever set, then there must be an entry for the instance variable in the iv table. Just set the iv list on the object to be equal to the iv index table size, then set the iv. Notes: Merged: https://github.com/ruby/ruby/pull/3740
2020-11-09eagerly initialize ivar table when index is small enoughAaron Patterson
When the inline cache is written, the iv table will contain an entry for the instance variable. If we get an inline cache hit, then we know the iv table must contain a value for the index written to the inline cache. If the index in the inline cache is larger than the list on the object, but *smaller* than the iv index table on the class, then we can just eagerly allocate the iv list to be the same size as the iv index table. This avoids duplicate work of checking frozen as well as looking up the index for the particular instance variable name. Notes: Merged: https://github.com/ruby/ruby/pull/3740
2020-10-30Ractor.make_shareable(a_proc)Koichi Sasada
Ractor.make_shareable() supports Proc object if (1) a Proc only read outer local variables (no assignments) (2) read outer local variables are shareable. Read local variables are stored in a snapshot, so after making shareable Proc, any assignments are not affeect like that: ```ruby a = 1 pr = Ractor.make_shareable(Proc.new{p a}) pr.call #=> 1 a = 2 pr.call #=> 1 # `a = 2` doesn't affect ``` [Feature #17284] Notes: Merged: https://github.com/ruby/ruby/pull/3722
2020-10-25Fix bootstrap-test error in previous commitJeremy Evans
2020-10-17sync RClass::ext::iv_index_tblKoichi Sasada
iv_index_tbl manages instance variable indexes (ID -> index). This data structure should be synchronized with other ractors so introduce some VM locks. This patch also introduced atomic ivar cache used by set/getinlinecache instructions. To make updating ivar cache (IVC), we changed iv_index_tbl data structure to manage (ID -> entry) and an entry points serial and index. IVC points to this entry so that cache update becomes atomically. Notes: Merged: https://github.com/ruby/ruby/pull/3662
2020-10-16Abort on system stack overflow during GCAlan Wu
Buggy native extensions could have mark functions that cause stack overflow. When a stack overflow happens during GC, Ruby used to recover by raising an exception, which runs the interpreter. It's not safe to run the interpreter during GC since the GC is in an inconsistent state. This could cause object allocation during GC, for example. Instead of running the interpreter and potentially causing a crash down the line, fail fast and abort. Notes: Merged: https://github.com/ruby/ruby/pull/3661
2020-10-14sync generic_ivtblKoichi Sasada
generic_ivtbl is a process global table to maintain instance variables for non T_OBJECT/T_CLASS/... objects. So we need to protect them for multi-Ractor exection. Hint: we can make them Ractor local for unshareable objects, but now it is premature optimization. Notes: Merged: https://github.com/ruby/ruby/pull/3655
2020-10-01Make minor improvements to supereileencodes
The changes here include: * Using `FL_TEST_RAW` instead of `FL_TEST` in the first check in `vm_search_super_method`. While the profile showed us spending a fair amount of time here, the subsequent benchmarks didn't show much improvement when adding this. Regardless, we know this does less work than `FL_TEST` and we know that `FL_TEST_RAW` is safe due to the previous check so it's a small but accurate optimization. * Set `mid` only once. Both `vm_ci_new_runtime` and `vm_ci_mid` were getting the `original_id` for the method entry. We can do this once and pass the variable to the 2 callers that need it. This also doesn't have a huge performance improvement but cleans up the code a bit. Benchmark: ``` | |compare-ruby|built-ruby| |:----------------|-----------:|---------:| |vm_iclass_super | 3.540M| 3.940M| | | -| 1.11x| ``` Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org> Notes: Merged: https://github.com/ruby/ruby/pull/3614
2020-09-25prohibi method call by defined_method in other racotrsKoichi Sasada
We can not call a non-isolated Proc in multiple ractors. Notes: Merged: https://github.com/ruby/ruby/pull/3584
2020-09-23Improve the performance of supereileencodes
This PR improves the performance of `super` calls. While working on some Rails optimizations jhawthorn discovered that `super` calls were slower than expected. The changes here do the following: 1) Adds a check for whether the call frame is not equal to the method entry iseq. This avoids the `rb_obj_is_kind_of` check on the next line which is quite slow. If the current call frame is equal to the method entry we know we can't have an instance eval, etc. 2) Changes `FL_TEST` to `FL_TEST_RAW`. This is safe because we've already done the check for `T_ICLASS` above. 3) Adds a benchmark for `T_ICLASS` super calls. 4) Note: makes a chage for `method_entry_cref` to use `const`. On master the benchmarks showed that `super` is 1.76x slower. Our changes improved the performance so that it is now only 1.36x slower. Benchmark IPS: ``` Warming up -------------------------------------- super 244.918k i/100ms method call 383.007k i/100ms Calculating ------------------------------------- super 2.280M (± 6.7%) i/s - 11.511M in 5.071758s method call 3.834M (± 4.9%) i/s - 19.150M in 5.008444s Comparison: method call: 3833648.3 i/s super: 2279837.9 i/s - 1.68x (± 0.00) slower ``` With changes: ``` Warming up -------------------------------------- super 308.777k i/100ms method call 375.051k i/100ms Calculating ------------------------------------- super 2.951M (± 5.4%) i/s - 14.821M in 5.039592s method call 3.551M (± 4.9%) i/s - 18.002M in 5.081695s Comparison: method call: 3551372.7 i/s super: 2950557.9 i/s - 1.20x (± 0.00) slower ``` Ruby VM benchmarks also showed an improvement: Existing `vm_super` benchmark`. ``` $ make benchmark ITEM=vm_super | |compare-ruby|built-ruby| |:---------|-----------:|---------:| |vm_super | 21.555M| 37.819M| | | -| 1.75x| ``` New `vm_iclass_super` benchmark: ``` $ make benchmark ITEM=vm_iclass_super | |compare-ruby|built-ruby| |:----------------|-----------:|---------:| |vm_iclass_super | 1.669M| 3.683M| | | -| 2.21x| ``` This is the benchmark script used for the benchmark-ips benchmarks: ```ruby require "benchmark/ips" class Foo def zuper; end def top; end last_method = "top" ("A".."M").each do |module_name| eval <<-EOM module #{module_name} def zuper; super; end def #{module_name.downcase} #{last_method} end end prepend #{module_name} EOM last_method = module_name.downcase end end foo = Foo.new Benchmark.ips do |x| x.report "super" do foo.zuper end x.report "method call" do foo.m end x.compare! end ``` Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org> Co-authored-by: John Hawthorn <john@hawthorn.email> Notes: Merged: https://github.com/ruby/ruby/pull/3545
2020-09-22Revert "Prevent SystemStackError when calling super in module with activated ↵Jeremy Evans
refinement" This reverts commit eeef16e190cdabc2ba474622720f8e3df7bac43b. This also reverts the spec change. Preventing the SystemStackError would be nice, but there is valid code that the fix breaks, and it is probably more common than cases that cause the SystemStackError. Fixes [Bug #17182] Notes: Merged: https://github.com/ruby/ruby/pull/3564
2020-09-15Interpolated strings are no longer frozen with frozen-string-literal: trueBenoit Daloze
* Remove freezestring instruction since this was the only usage for it. * [Feature #17104] Notes: Merged: https://github.com/ruby/ruby/pull/3488
2020-09-03Introduce Ractor mechanism for parallel executionKoichi Sasada
This commit introduces Ractor mechanism to run Ruby program in parallel. See doc/ractor.md for more details about Ractor. See ticket [Feature #17100] to see the implementation details and discussions. [Feature #17100] This commit does not complete the implementation. You can find many bugs on using Ractor. Also the specification will be changed so that this feature is experimental. You will see a warning when you make the first Ractor with `Ractor.new`. I hope this feature can help programmers from thread-safety issues. Notes: Merged: https://github.com/ruby/ruby/pull/3365
2020-09-01Remove the pc argument of vm_trace()Alan Wu
This makes the binary 272 bytes smaller on -O3 GCC 10.2.0. Notes: Merged: https://github.com/ruby/ruby/pull/3494
2020-08-27Fix Method#super_method for aliased methodsJeremy Evans
Previously, Method#super_method looked at the called_id to determine the method id to use, but that isn't correct for aliased methods, because the super target depends on the original method id, not the called_id. Additionally, aliases can reference methods defined in other classes and modules, and super lookup needs to start in the super of the defined class in such cases. This adds tests for Method#super_method for both types of aliases, one that uses VM_METHOD_TYPE_ALIAS and another that does not. Both check that the results for calling super methods return the expected values. To find the defined class for alias methods, add an rb_ prefix to find_defined_class_by_owner in vm_insnhelper.c and make it non-static, so that it can be called from method_super_method in proc.c. This bug was original discovered while researching [Bug #11189]. Fixes [Bug #17130] Notes: Merged: https://github.com/ruby/ruby/pull/3458 Merged-By: jeremyevans <code@jeremyevans.net>
2020-07-27Prevent SystemStackError when calling super in module with activated refinementJeremy Evans
Without this, if a refinement defines a method that calls super and includes a module with a module that calls super and has a activated refinement at the point super is called, the module method super call will end up calling back into the refinement method, creating a loop. Fixes [Bug #17007] Notes: Merged: https://github.com/ruby/ruby/pull/3309
2020-07-10Fixed another typoNobuyoshi Nakada
2020-07-10Fixed typosNobuyoshi Nakada
2020-07-10vm_push_frame_debug_counter_inc: use branches卜部昌平
Ko1 doesn't like previous code. Notes: Merged: https://github.com/ruby/ruby/pull/3296
2020-07-10vm_push_frame: move assignments around卜部昌平
Struct assignment using a compound literal is more readable than before, to me at least. It seems compilers reorder assignments anyways. Neither speedup nor slowdown is observed on my machine. Notes: Merged: https://github.com/ruby/ruby/pull/3296
2020-07-10vm_push_frame: move assertions out of the function卜部昌平
These assertions are purely static. Ned not be checked on-the-fly. Notes: Merged: https://github.com/ruby/ruby/pull/3296
2020-07-10vm_push_frame: hoist out debug codes卜部昌平
Made it a bit readable. Notes: Merged: https://github.com/ruby/ruby/pull/3296
2020-07-10nobody uses the return value of vm_push_frame卜部昌平
Surprised to see such a waste of time in this super duper hot path. Notes: Merged: https://github.com/ruby/ruby/pull/3296
2020-07-03Use ID instead of GENTRY for gvars. (#3278)Koichi Sasada
Use ID instead of GENTRY for gvars. Global variables are compiled into GENTRY (a pointer to struct rb_global_entry). This patch replace this GENTRY to ID and make the code simple. We need to search GENTRY from ID every time (st_lookup), so additional overhead will be introduced. However, the performance of accessing global variables is not important now a day and this simplicity helps Ractor development. Notes: Merged-By: ko1 <ko1@atdot.net>
2020-06-30Extracted METHOD_ENTRY_CACHEABLE macroNobuyoshi Nakada
2020-06-29vm_getivar: do not goto into a branch卜部昌平
I'm not necessarily against every goto in general, but jumping into a branch is definitely a bad idea. Better refactor. Notes: Merged: https://github.com/ruby/ruby/pull/3247
2020-06-25Decide JIT-ed insn based on cached cfuncTakashi Kokubun
for opt_* insns. opt_eq handles rb_obj_equal inside opt_eq, and all other cfunc is handled by opt_send_without_block. Therefore we can't decide which insn should be generated by checking whether it's cfunc cc or not. ``` $ benchmark-driver -v --rbenv 'before --jit;after --jit' benchmark/mjit_opt_cc_insns.yml --repeat-count=4 before --jit: ruby 2.8.0dev (2020-06-26T05:21:43Z master 9dbc2294a6) +JIT [x86_64-linux] after --jit: ruby 2.8.0dev (2020-06-26T06:30:18Z master 75cece1b0b) +JIT [x86_64-linux] last_commit=Decide JIT-ed insn based on cached cfunc Calculating ------------------------------------- before --jit after --jit mjit_nil?(1) 73.878M 74.021M i/s - 40.000M times in 0.541432s 0.540391s mjit_not(1) 72.635M 74.601M i/s - 40.000M times in 0.550702s 0.536187s mjit_eq(1, nil) 7.331M 7.445M i/s - 8.000M times in 1.091211s 1.074596s mjit_eq(nil, 1) 49.450M 64.711M i/s - 8.000M times in 0.161781s 0.123627s Comparison: mjit_nil?(1) after --jit: 74020528.4 i/s before --jit: 73878185.9 i/s - 1.00x slower mjit_not(1) after --jit: 74600882.0 i/s before --jit: 72634507.6 i/s - 1.03x slower mjit_eq(1, nil) after --jit: 7444657.4 i/s before --jit: 7331304.3 i/s - 1.02x slower mjit_eq(nil, 1) after --jit: 64710790.6 i/s before --jit: 49449507.4 i/s - 1.31x slower ```