summaryrefslogtreecommitdiff
path: root/vm_core.h
AgeCommit message (Collapse)Author
2021-06-18Add a cache for class variableseileencodes
Redo of 34a2acdac788602c14bf05fb616215187badd504 and 931138b00696419945dc03e10f033b1f53cd50f3 which were reverted. GitHub PR #4340. This change implements a cache for class variables. Previously there was no cache for cvars. Cvar access is slow due to needing to travel all the way up th ancestor tree before returning the cvar value. The deeper the ancestor tree the slower cvar access will be. The benefits of the cache are more visible with a higher number of included modules due to the way Ruby looks up class variables. The benchmark here includes 26 modules and shows with the cache, this branch is 6.5x faster when accessing class variables. ``` compare-ruby: ruby 3.1.0dev (2021-03-15T06:22:34Z master 9e5105c) [x86_64-darwin19] built-ruby: ruby 3.1.0dev (2021-03-15T12:12:44Z add-cache-for-clas.. c6be009) [x86_64-darwin19] | |compare-ruby|built-ruby| |:--------|-----------:|---------:| |vm_cvar | 5.681M| 36.980M| | | -| 6.51x| ``` Benchmark.ips calling `ActiveRecord::Base.logger` from within a Rails application. ActiveRecord::Base.logger has 71 ancestors. The more ancestors a tree has, the more clear the speed increase. IE if Base had only one ancestor we'd see no improvement. This benchmark is run on a vanilla Rails application. Benchmark code: ```ruby require "benchmark/ips" require_relative "config/environment" Benchmark.ips do |x| x.report "logger" do ActiveRecord::Base.logger end end ``` Ruby 3.0 master / Rails 6.1: ``` Warming up -------------------------------------- logger 155.251k i/100ms Calculating ------------------------------------- ``` Ruby 3.0 with cvar cache / Rails 6.1: ``` Warming up -------------------------------------- logger 1.546M i/100ms Calculating ------------------------------------- logger 14.857M (± 4.8%) i/s - 74.198M in 5.006202s ``` Lastly we ran a benchmark to demonstate the difference between master and our cache when the number of modules increases. This benchmark measures 1 ancestor, 30 ancestors, and 100 ancestors. Ruby 3.0 master: ``` Warming up -------------------------------------- 1 module 1.231M i/100ms 30 modules 432.020k i/100ms 100 modules 145.399k i/100ms Calculating ------------------------------------- 1 module 12.210M (± 2.1%) i/s - 61.553M in 5.043400s 30 modules 4.354M (± 2.7%) i/s - 22.033M in 5.063839s 100 modules 1.434M (± 2.9%) i/s - 7.270M in 5.072531s Comparison: 1 module: 12209958.3 i/s 30 modules: 4354217.8 i/s - 2.80x (± 0.00) slower 100 modules: 1434447.3 i/s - 8.51x (± 0.00) slower ``` Ruby 3.0 with cvar cache: ``` Warming up -------------------------------------- 1 module 1.641M i/100ms 30 modules 1.655M i/100ms 100 modules 1.620M i/100ms Calculating ------------------------------------- 1 module 16.279M (± 3.8%) i/s - 82.038M in 5.046923s 30 modules 15.891M (± 3.9%) i/s - 79.459M in 5.007958s 100 modules 16.087M (± 3.6%) i/s - 81.005M in 5.041931s Comparison: 1 module: 16279458.0 i/s 100 modules: 16087484.6 i/s - same-ish: difference falls within error 30 modules: 15891406.2 i/s - same-ish: difference falls within error ``` Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org> Notes: Merged: https://github.com/ruby/ruby/pull/4544
2021-06-09Pack iseq_inline_constant_cache_entryNobuyoshi Nakada
Reordered iseq_inline_constant_cache_entry members not to exceed the size of RValue.
2021-06-02Clarify these are just for MJITTakashi Kokubun
and not for third-party libraries. See: e6484a153038703447b50fcac26349249922ab28
2021-06-01Enable VM_ASSERT in --jit CIs (#4543)Takashi Kokubun
Notes: Merged-By: k0kubun <takashikkbn@gmail.com>
2021-05-26Add Thread#native_thread_id [Feature #17853]NARUSE, Yui
2021-05-21Update a comment about what 'inline' attr meansTakashi Kokubun
2021-05-11Revert "Filling cache values on cvar write"Aaron Patterson
This reverts commit 08de37f9fa3469365e6b5c964689ae2bae0eb9f3. This reverts commit e8ae922b62adb00a80d3d4c49f7d7b0e6026eaba.
2021-05-11Add a cache for class variableseileencodes
This change implements a cache for class variables. Previously there was no cache for cvars. Cvar access is slow due to needing to travel all the way up th ancestor tree before returning the cvar value. The deeper the ancestor tree the slower cvar access will be. The benefits of the cache are more visible with a higher number of included modules due to the way Ruby looks up class variables. The benchmark here includes 26 modules and shows with the cache, this branch is 6.5x faster when accessing class variables. ``` compare-ruby: ruby 3.1.0dev (2021-03-15T06:22:34Z master 9e5105ca45) [x86_64-darwin19] built-ruby: ruby 3.1.0dev (2021-03-15T12:12:44Z add-cache-for-clas.. c6be0093ae) [x86_64-darwin19] | |compare-ruby|built-ruby| |:--------|-----------:|---------:| |vm_cvar | 5.681M| 36.980M| | | -| 6.51x| ``` Benchmark.ips calling `ActiveRecord::Base.logger` from within a Rails application. ActiveRecord::Base.logger has 71 ancestors. The more ancestors a tree has, the more clear the speed increase. IE if Base had only one ancestor we'd see no improvement. This benchmark is run on a vanilla Rails application. Benchmark code: ```ruby require "benchmark/ips" require_relative "config/environment" Benchmark.ips do |x| x.report "logger" do ActiveRecord::Base.logger end end ``` Ruby 3.0 master / Rails 6.1: ``` Warming up -------------------------------------- logger 155.251k i/100ms Calculating ------------------------------------- ``` Ruby 3.0 with cvar cache / Rails 6.1: ``` Warming up -------------------------------------- logger 1.546M i/100ms Calculating ------------------------------------- logger 14.857M (± 4.8%) i/s - 74.198M in 5.006202s ``` Lastly we ran a benchmark to demonstate the difference between master and our cache when the number of modules increases. This benchmark measures 1 ancestor, 30 ancestors, and 100 ancestors. Ruby 3.0 master: ``` Warming up -------------------------------------- 1 module 1.231M i/100ms 30 modules 432.020k i/100ms 100 modules 145.399k i/100ms Calculating ------------------------------------- 1 module 12.210M (± 2.1%) i/s - 61.553M in 5.043400s 30 modules 4.354M (± 2.7%) i/s - 22.033M in 5.063839s 100 modules 1.434M (± 2.9%) i/s - 7.270M in 5.072531s Comparison: 1 module: 12209958.3 i/s 30 modules: 4354217.8 i/s - 2.80x (± 0.00) slower 100 modules: 1434447.3 i/s - 8.51x (± 0.00) slower ``` Ruby 3.0 with cvar cache: ``` Warming up -------------------------------------- 1 module 1.641M i/100ms 30 modules 1.655M i/100ms 100 modules 1.620M i/100ms Calculating ------------------------------------- 1 module 16.279M (± 3.8%) i/s - 82.038M in 5.046923s 30 modules 15.891M (± 3.9%) i/s - 79.459M in 5.007958s 100 modules 16.087M (± 3.6%) i/s - 81.005M in 5.041931s Comparison: 1 module: 16279458.0 i/s 100 modules: 16087484.6 i/s - same-ish: difference falls within error 30 modules: 15891406.2 i/s - same-ish: difference falls within error ``` Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org> Notes: Merged: https://github.com/ruby/ruby/pull/4340
2021-05-04Fix trivial -Wundef warningsBenoit Daloze
* See [Feature #17752] Co-authored-by: xtkoba (Tee KOBAYASHI) <xtkoba+ruby@gmail.com> Notes: Merged: https://github.com/ruby/ruby/pull/4428
2021-05-01Adjust struct member offset for i386 Cygwinxtkoba
Fixes [Bug #17606] Notes: Merged: https://github.com/ruby/ruby/pull/4437
2021-03-17Use rb_fstring for "defined" strings.Aaron Patterson
We can take advantage of fstrings to de-duplicate the defined strings. This means we don't need to keep the list of defined strings on the VM (or register them as mark objects) Notes: Merged: https://github.com/ruby/ruby/pull/4279
2021-01-29global call-cache cache table for rb_funcall*Koichi Sasada
rb_funcall* (rb_funcall(), rb_funcallv(), ...) functions invokes Ruby's method with given receiver. Ruby 2.7 introduced inline method cache with static memory area. However, Ruby 3.0 reimplemented the method cache data structures and the inline cache was removed. Without inline cache, rb_funcall* searched methods everytime. Most of cases per-Class Method Cache (pCMC) will be helped but pCMC requires VM-wide locking and it hurts performance on multi-Ractor execution, especially all Ractors calls methods with rb_funcall*. This patch introduced Global Call-Cache Cache Table (gccct) for rb_funcall*. Call-Cache was introduced from Ruby 3.0 to manage method cache entry atomically and gccct enables method-caching without VM-wide locking. This table solves the performance issue on multi-ractor execution. [Bug #17497] Ruby-level method invocation does not use gccct because it has inline-method-cache and the table size is limited. Basically rb_funcall* is not used frequently, so 1023 entries can be enough. We will revisit the table size if it is not enough. Notes: Merged: https://github.com/ruby/ruby/pull/4129
2021-01-06expose some C-APIs for ractorKoichi Sasada
expose some C-APIs to try to make ractor utilities on external gems. * add * rb_ractor_local_storage_value_lookup() to check availability * expose * rb_ractor_make_shareable() * rb_ractor_make_shareable_copy() * rb_proc_isolate() (not public) * rb_proc_isolate_bang() (not public) * rb_proc_ractor_make_shareable() (not public)
2021-01-05enable constant cache on ractorsKoichi Sasada
constant cache `IC` is accessed by non-atomic manner and there are thread-safety issues, so Ruby 3.0 disables to use const cache on non-main ractors. This patch enables it by introducing `imemo_constcache` and allocates it by every re-fill of const cache like `imemo_callcache`. [Bug #17510] Now `IC` only has one entry `IC::entry` and it points to `iseq_inline_constant_cache_entry`, managed by T_IMEMO object. `IC` is atomic data structure so `rb_mjit_before_vm_ic_update()` and `rb_mjit_after_vm_ic_update()` is not needed. Notes: Merged: https://github.com/ruby/ruby/pull/4022
2020-12-22separate rb_ractor_pub from rb_ractor_tKoichi Sasada
separate some fields from rb_ractor_t to rb_ractor_pub and put it at the beggining of rb_ractor_t and declare it in vm_core.h so vm_core.h can access rb_ractor_pub fields. Now rb_ec_ractor_hooks() is a complete inline function and no MJIT related issue. Notes: Merged: https://github.com/ruby/ruby/pull/3943
2020-12-22TracePoint.new(&block) should be ractor-localKoichi Sasada
TracePoint should be ractor-local because the Proc can violate the Ractor-safe. Notes: Merged: https://github.com/ruby/ruby/pull/3943
2020-12-21export rb_eRactorIsolationError for MJITKoichi Sasada
https://ci.appveyor.com/project/ruby/ruby/builds/36942168/job/7ugrpk0pndoly9wp ``` _ruby_mjit_p11920u0.c C:\Users\appveyor\AppData\Local\Temp\1/_ruby_mjit_p11920u0.c(14) : warning C4005: 'GET_SELF' : macro redefinition c:\projects\ruby\vm_insnhelper.h(111) : see previous definition of 'GET_SELF' Creating library C:\Users\appveyor\AppData\Local\Temp\1/_ruby_mjit_p11920u0.lib and object C:\Users\appveyor\AppData\Local\Temp\1/_ruby_mjit_p11920u0.exp _ruby_mjit_p11920u0.obj : error LNK2001: unresolved external symbol rb_eRactorIsolationError C:\Users\appveyor\AppData\Local\Temp\1/_ruby_mjit_p11920u0.so : fatal error LNK1120: 1 unresolved externals ```
2020-12-21Introduce Ractor::IsolationErrorKoichi Sasada
Ractor has several restrictions to keep each ractor being isolated and some operation such as `CONST="foo"` in non-main ractor raises an exception. This kind of operation raises an error but there is confusion (some code raises RuntimeError and some code raises NameError). To make clear we introduce Ractor::IsolationError which is raised when the isolation between ractors is violated. Notes: Merged: https://github.com/ruby/ruby/pull/3957
2020-12-15fix inline method cache sync bugKoichi Sasada
`cd` is passed to method call functions to method invocation functions, but `cd` can be manipulated by other ractors simultaneously so it contains thread-safety issue. To solve this issue, this patch stores `ci` and found `cc` to `calling` and stops to pass `cd`. Notes: Merged: https://github.com/ruby/ruby/pull/3903
2020-12-14Introduce negative method cacheKoichi Sasada
pCMC doesn't have negative method cache so this patch implements it. Notes: Merged: https://github.com/ruby/ruby/pull/3892
2020-12-12Signal handler type should be voidNobuyoshi Nakada
Notes: Merged: https://github.com/ruby/ruby/pull/3889
2020-12-07fix decl of ruby_single_main_ractorKoichi Sasada
On windows, MJIT doesn't work without this patch because of the declaration of ruby_single_main_ractor. This patch fix this issue and move the definition of it from ractor.c to vm.c to locate near place of ruby_current_vm_ptr. Notes: Merged: https://github.com/ruby/ruby/pull/3842
2020-12-07ruby_single_main_ractor for single ractor modeKoichi Sasada
ruby_multi_ractor was a flag that indicates the interpreter doesn't make any additional ractors (single ractor mode). Instead of boolean flag, ruby_single_main_ractor pointer is introduced which keeps main ractor's pointer if single ractor mode. If additional ractors are created, ruby_single_main_ractor becomes NULL. Notes: Merged: https://github.com/ruby/ruby/pull/3842
2020-12-01rb_ext_ractor_safe() to declare ractor-safe extKoichi Sasada
C extensions can violate the ractor-safety, so only ractor-safe C extensions (C methods) can run on non-main ractors. rb_ext_ractor_safe(true) declares that the successive defined methods are ractor-safe. Otherwiwze, defined methods checked they are invoked in main ractor and raise an error if invoked at non-main ractors. [Feature #17307] Notes: Merged: https://github.com/ruby/ruby/pull/3824
2020-11-11introduce USE_VM_CLOCK for windows.Koichi Sasada
The timer function used on windows system set timer interrupt flag of current main ractor's executing ec and thread can detect the end of time slice. However, to set all ec->interrupt_flag for all running ractors, it is requires to synchronize with other ractors. However, timer thread can not acquire the ractor-wide lock because of some limitation. To solve this issue, this patch introduces USE_VM_CLOCK compile option to introduce rb_vm_t::clock. This clock will be incremented by the timer thread and each thread can check the incrementing by comparison with previous checked clock. At last, on windows platform this patch introduces some overhead, but I think there is no critical performance issue because of this modification. Notes: Merged: https://github.com/ruby/ruby/pull/3754
2020-10-30Ractor.make_shareable(a_proc)Koichi Sasada
Ractor.make_shareable() supports Proc object if (1) a Proc only read outer local variables (no assignments) (2) read outer local variables are shareable. Read local variables are stored in a snapshot, so after making shareable Proc, any assignments are not affeect like that: ```ruby a = 1 pr = Ractor.make_shareable(Proc.new{p a}) pr.call #=> 1 a = 2 pr.call #=> 1 # `a = 2` doesn't affect ``` [Feature #17284] Notes: Merged: https://github.com/ruby/ruby/pull/3722
2020-10-29check isolated Proc more strictlyKoichi Sasada
Isolated Proc prohibit to access outer local variables, but it was violated by binding and so on, so they should be error. Notes: Merged: https://github.com/ruby/ruby/pull/3721
2020-10-28Add Thread.ignore_deadlock accessorJeremy Evans
Setting this to true disables the deadlock detector. It should only be used in cases where the deadlock could be broken via some external means, such as via a signal. Now that $SAFE is no longer used, replace the safe_level_ VM flag with ignore_deadlock for storing the setting. Fixes [Bug #13768] Notes: Merged: https://github.com/ruby/ruby/pull/3710 Merged-By: jeremyevans <code@jeremyevans.net>
2020-10-20Some global variables can be accessed from ractorsKoichi Sasada
Some global variables should be used from non-main Ractors. [Bug #17268] ```ruby # ractor-local (derived from created ractor): debug '$DEBUG' => $DEBUG, '$-d' => $-d, # ractor-local (derived from created ractor): verbose '$VERBOSE' => $VERBOSE, '$-w' => $-w, '$-W' => $-W, '$-v' => $-v, # process-local (readonly): other commandline parameters '$-p' => $-p, '$-l' => $-l, '$-a' => $-a, # process-local (readonly): getpid '$$' => $$, # thread local: process result '$?' => $?, # scope local: match '$~' => $~.inspect, '$&' => $&, '$`' => $`, '$\'' => $', '$+' => $+, '$1' => $1, # scope local: last line '$_' => $_, # scope local: last backtrace '$@' => $@, '$!' => $!, # ractor local: stdin, out, err '$stdin' => $stdin.inspect, '$stdout' => $stdout.inspect, '$stderr' => $stderr.inspect, ``` Notes: Merged: https://github.com/ruby/ruby/pull/3670
2020-10-20Use language TLS specifier if it is possible.Koichi Sasada
To access TLS, it is faster to use language TLS specifier instead of using pthread_get/setspecific functions. Original proposal is: Use native thread locals. #3665 Notes: Merged: https://github.com/ruby/ruby/pull/3667
2020-10-17sync RClass::ext::iv_index_tblKoichi Sasada
iv_index_tbl manages instance variable indexes (ID -> index). This data structure should be synchronized with other ractors so introduce some VM locks. This patch also introduced atomic ivar cache used by set/getinlinecache instructions. To make updating ivar cache (IVC), we changed iv_index_tbl data structure to manage (ID -> entry) and an entry points serial and index. IVC points to this entry so that cache update becomes atomically. Notes: Merged: https://github.com/ruby/ruby/pull/3662
2020-10-14fix releasing timing.Koichi Sasada
(1) recorded_lock_rec > current_lock_rec should not be occurred on rb_ec_vm_lock_rec_release(). (2) should be release VM lock at EXEC_TAG(), not POP_TAG(). (3) some refactoring. Notes: Merged: https://github.com/ruby/ruby/pull/3655
2020-10-14support exception when lock_rec > 0Koichi Sasada
If a ractor getting a VM lock (monitor) raises an exception, unlock can be skipped. To release VM lock correctly on exception (or other jumps with JUMP_TAG), EC_POP_TAG() releases VM lock. Notes: Merged: https://github.com/ruby/ruby/pull/3654
2020-09-21Make `Thread#join` non-blocking.Samuel Williams
Notes: Merged: https://github.com/ruby/ruby/pull/3558
2020-09-15relax dependencyKoichi Sasada
vm_sync.h does not need to include vm_core.h and ractor_pub.h. Notes: Merged: https://github.com/ruby/ruby/pull/3534
2020-09-15restart Ractor.select on intteruptKoichi Sasada
signal can interrupt Ractor.select, but if there is no exception, Ractor.select should restart automatically. Notes: Merged: https://github.com/ruby/ruby/pull/3534
2020-09-03Introduce Ractor mechanism for parallel executionKoichi Sasada
This commit introduces Ractor mechanism to run Ruby program in parallel. See doc/ractor.md for more details about Ractor. See ticket [Feature #17100] to see the implementation details and discussions. [Feature #17100] This commit does not complete the implementation. You can find many bugs on using Ractor. Also the specification will be changed so that this feature is experimental. You will see a warning when you make the first Ractor with `Ractor.new`. I hope this feature can help programmers from thread-safety issues. Notes: Merged: https://github.com/ruby/ruby/pull/3365
2020-08-21Avoid a use after free in VM assertionJeremy Evans
If the thread for the current EC has been killed, don't check the VM ptr for the EC (which gets it via the thread), as that will have already been freed. Fixes [Bug #16907] Notes: Merged: https://github.com/ruby/ruby/pull/3443
2020-07-23Remove unused field in rb_iseq_constant_bodyAlan Wu
This was introduced in 191ce5344ec42c91571f8f47c85be9138262b1c7 and has been unused since beae6cbf0fd8b6619e5212552de98022d4c4d4d4 Notes: Merged: https://github.com/ruby/ruby/pull/3353
2020-07-10RUBY_CONST_ASSERT: use STATIC_ASSERT instead卜部昌平
Static assertions shall be done using STATIC_ASSERT these days. Notes: Merged: https://github.com/ruby/ruby/pull/3296
2020-06-20Introduce Primitive.attr! to annotate 'inline' (#3242)Takashi Kokubun
[Feature #15589] Notes: Merged-By: k0kubun <takashikkbn@gmail.com>
2020-05-14Thread scheduler for light weight concurrency.Samuel Williams
Notes: Merged: https://github.com/ruby/ruby/pull/3032 Merged-By: ioquatix <samuel@codeotaku.com>
2020-05-11drop varargs.h support卜部昌平
This header file is simply out of date (for decades since at least 1989). It's the 21st century. Just stop using it.
2020-05-11sed -i 's|ruby/impl|ruby/internal|'卜部昌平
To fix build failures. Notes: Merged: https://github.com/ruby/ruby/pull/3079
2020-05-11sed -i s|ruby/3|ruby/impl|g卜部昌平
This shall fix compile errors. Notes: Merged: https://github.com/ruby/ruby/pull/3079
2020-04-13add #include guard hack卜部昌平
According to MSVC manual (*1), cl.exe can skip including a header file when that: - contains #pragma once, or - starts with #ifndef, or - starts with #if ! defined. GCC has a similar trick (*2), but it acts more stricter (e. g. there must be _no tokens_ outside of #ifndef...#endif). Sun C lacked #pragma once for a looong time. Oracle Developer Studio 12.5 finally implemented it, but we cannot assume such recent version. This changeset modifies header files so that each of them include strictly one #ifndef...#endif. I believe this is the most portable way to trigger compiler optimizations. [Bug #16770] *1: https://docs.microsoft.com/en-us/cpp/preprocessor/once *2: https://gcc.gnu.org/onlinedocs/cppinternals/Guard-Macros.html Notes: Merged: https://github.com/ruby/ruby/pull/3023
2020-04-08Merge pull request #2991 from shyouhei/ruby.h卜部昌平
Split ruby.h Notes: Merged-By: shyouhei <shyouhei@ruby-lang.org>
2020-03-19Get rid of redefinition of `rb_execution_context_t`Nobuyoshi Nakada
Regardless of the order to include "vm_core.h" and "builtin.h".
2020-03-06thread_pthread.c: allocate sigaltstack before pthread_createYusuke Endoh
A new (not-initialized-yet) pthread attempts to allocate sigaltstack by using xmalloc. It may cause GC, but because the thread is not initialized yet, ruby_native_thread_p() returns false, which leads to "[FATAL] failed to allocate memory" and exit. In fact, we can observe the error message in the log of OpenBSD CI: https://rubyci.org/logs/rubyci.s3.amazonaws.com/openbsd-current/ruby-master/log/20200306T083005Z.log.html.gz This changeset allocates sigaltstack before pthread is created.
2020-02-22Introduce disposable call-cache.Koichi Sasada
This patch contains several ideas: (1) Disposable inline method cache (IMC) for race-free inline method cache * Making call-cache (CC) as a RVALUE (GC target object) and allocate new CC on cache miss. * This technique allows race-free access from parallel processing elements like RCU. (2) Introduce per-Class method cache (pCMC) * Instead of fixed-size global method cache (GMC), pCMC allows flexible cache size. * Caching CCs reduces CC allocation and allow sharing CC's fast-path between same call-info (CI) call-sites. (3) Invalidate an inline method cache by invalidating corresponding method entries (MEs) * Instead of using class serials, we set "invalidated" flag for method entry itself to represent cache invalidation. * Compare with using class serials, the impact of method modification (add/overwrite/delete) is small. * Updating class serials invalidate all method caches of the class and sub-classes. * Proposed approach only invalidate the method cache of only one ME. See [Feature #16614] for more details. Notes: Merged: https://github.com/ruby/ruby/pull/2888