path: root/tool/ruby_vm
Age | Commit message | Author
2022-09-04 | Ruby MJIT (#6028) | Takashi Kokubun
Notes: Merged-By: k0kubun <takashikkbn@gmail.com>
2022-09-01 | New constant caching insn: opt_getconstant_path | John Hawthorn
Previously YARV bytecode implemented constant caching by having a pair of instructions, opt_getinlinecache and opt_setinlinecache, wrapping a series of getconstant calls (with putobject providing supporting arguments).

This commit replaces that pattern with a new instruction, opt_getconstant_path, handling both getting/setting the inline cache and fetching the constant on a cache miss.

This is implemented by storing the full constant path as a null-terminated array of IDs inside of the IC structure. idNULL is used to signal an absolute constant reference.

    $ ./miniruby --dump=insns -e '::Foo::Bar::Baz'
    == disasm: #<ISeq:<main>@-e:1 (1,0)-(1,13)> (catch: FALSE)
    0000 opt_getconstant_path                   <ic:0 ::Foo::Bar::Baz>    (   1)[Li]
    0002 leave

The motivation for this is that we had increasingly found the need to disassemble the instructions between the opt_getinlinecache and opt_setinlinecache in order to determine the constant we are fetching, or otherwise store metadata.

This disassembly was done:
* In opt_setinlinecache, to register the IC against the constant names it is using for granular invalidation.
* In rb_iseq_free, to unregister the IC from the invalidation table.
* In YJIT, to find the position of an opt_getinlinecache instruction to invalidate it when the cache is populated.
* In YJIT, to register the constant names being used for invalidation.

With this change we no longer need disassembly for these (in fact rb_iseq_each is now unused), as the list of constant names being referenced is held in the IC. This should also make it possible to make more optimizations in the future.

This may also reduce the size of iseqs, as previously each constant segment required 32 bytes (on 64-bit platforms). This implementation only stores one ID per segment.

There should be no significant performance change between this and the previous implementation. Previously opt_getinlinecache was a "leaf" instruction, but it included a jump (almost always to a separate cache line). Now opt_getconstant_path is a non-leaf (it may raise/autoload/call const_missing) but it does not jump. These seem to even out.

Notes: Merged: https://github.com/ruby/ruby/pull/6187
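The path encoding described in this commit can be pictured with a small Ruby model. This is only a sketch under stated assumptions, not CRuby's actual layout: `ID_NULL`, `encode_constant_path` and `decode_constant_path` are hypothetical names, symbols stand in for IDs, `nil` stands in for the terminating null, and placing the idNULL marker first for an absolute reference is an assumption made for illustration.

```ruby
# Toy model of a constant path stored as a null-terminated list of IDs.
# Assumption: symbols stand in for IDs, nil for the terminator.
ID_NULL = :idNULL # hypothetical stand-in for idNULL (absolute reference marker)

# "::Foo::Bar" splits into ["", "Foo", "Bar"]; the leading empty segment
# is what marks the reference as absolute, so it becomes ID_NULL.
def encode_constant_path(source)
  ids = source.split("::").map { |seg| seg.empty? ? ID_NULL : seg.to_sym }
  ids << nil # terminator
end

# Walk the list up to the terminator and rebuild the source form.
def decode_constant_path(ids)
  ids.take_while { |id| !id.nil? }
     .map { |id| id == ID_NULL ? "" : id.to_s }
     .join("::")
end

path = encode_constant_path("::Foo::Bar::Baz")
puts path.inspect
puts decode_constant_path(path)
```

Because the whole path lives in one list, a consumer can read the constant names directly instead of disassembling the surrounding instructions.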
2022-08-21 | Rename mjit_compile.c to mjit_compiler.c | Takashi Kokubun
I'm planning to introduce mjit_compiler.rb, and I want to make this consistent with it. Consistency with compile.c doesn't seem important for MJIT anyway.
2022-08-19 | Rename mjit_exec to jit_exec (#6262) | Takashi Kokubun
* Rename mjit_exec to jit_exec
* Rename mjit_exec_slowpath to mjit_check_iseq
* Remove mjit_exec references from comments

Notes: Merged-By: k0kubun <takashikkbn@gmail.com>
2022-07-21 | Expand tabs [ci skip] | Takashi Kokubun
[Misc #18891] Notes: Merged: https://github.com/ruby/ruby/pull/6094
2022-07-18 | Separate TS_IVC and TS_ICVARC in is_entries buffers | Jemma Issroff
This allows us to treat cvar caches differently than ivar caches. Notes: Merged: https://github.com/ruby/ruby/pull/6148
2022-07-15 | Implement Objects on VWA | Peter Zhu
This commit implements Objects on Variable Width Allocation. This allows Objects with more ivars to be embedded (i.e. contents directly follow the object header) which improves performance through better cache locality. Notes: Merged: https://github.com/ruby/ruby/pull/6117
2022-06-30 | Adjust indent [ci skip] | Nobuyoshi Nakada
2022-06-29 | Move function to `static inline` so we don't have leaked globals | Aaron Patterson
This function shouldn't leak and is only needed during instruction assembly Notes: Merged: https://github.com/ruby/ruby/pull/6069
2022-04-01 | Finer-grained constant cache invalidation (take 2) | Kevin Newton
This commit reintroduces finer-grained constant cache invalidation. After 8008fb7 got merged, it was causing issues on token-threaded builds (such as on Windows).

The issue was that when you're iterating through instruction sequences and using the translator functions to get back the instruction structs, you're using either `rb_vm_insn_null_translator` or `rb_vm_insn_addr2insn2`, depending on whether it's a direct-threading build. `rb_vm_insn_addr2insn2` does some normalization to always return the non-trace version of whatever instruction you're looking at. `rb_vm_insn_null_translator` does not do that normalization.

This means that when you're looping through the instructions and trying to do an opcode comparison, the result can change depending on the type of threading that you're using. This can be very confusing. So, this commit creates a new translator function, `rb_vm_insn_normalizing_translator`, that always returns the non-trace version so that opcode comparisons don't have to worry about different configurations.

[Feature #18589]

Notes: Merged: https://github.com/ruby/ruby/pull/5716
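The difference between a null translator and a normalizing one can be illustrated with a toy Ruby model. This is a sketch only: the real translators are C functions operating on instruction encodings, while here opcodes are plain symbols, the `trace_` prefix naming of trace variants is an assumption, and `null_translate` / `normalizing_translate` are hypothetical names.

```ruby
# Toy model: opcodes are symbols and a trace variant is assumed to be
# the base name prefixed with "trace_".
TRACE_PREFIX = "trace_"

# Returns the opcode unchanged, so trace variants leak through.
def null_translate(opcode)
  opcode
end

# Always maps a trace variant back to its base (non-trace) opcode.
def normalizing_translate(opcode)
  name = opcode.to_s
  name.start_with?(TRACE_PREFIX) ? name.delete_prefix(TRACE_PREFIX).to_sym : opcode
end

insns = [:opt_getinlinecache, :trace_opt_getinlinecache, :leave]

# An opcode comparison gives different answers depending on the translator.
puts insns.count { |op| null_translate(op) == :opt_getinlinecache }
puts insns.count { |op| normalizing_translate(op) == :opt_getinlinecache }
```

With the normalizing version, code that compares opcodes does not need to know which threading configuration produced the instruction stream.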
2022-03-25 | Revert "Finer-grained inline constant cache invalidation" | Nobuyoshi Nakada
This reverts commits for [Feature #18589]:

* 8008fb7352abc6fba433b99bf20763cf0d4adb38 "Update formatting per feedback"
* 8f6eaca2e19828e92ecdb28b0fe693d606a03f96 "Delete ID from constant cache table if it becomes empty on ISEQ free"
* 629908586b4bead1103267652f8b96b1083573a8 "Finer-grained inline constant cache invalidation"

MSWin builds on AppVeyor have been crashing since the merger.

Notes: Merged: https://github.com/ruby/ruby/pull/5715 Merged-By: nobu <nobu@ruby-lang.org>
2022-03-24 | Finer-grained inline constant cache invalidation | Kevin Newton
Current behavior - caches depend on a global counter. All constant mutations cause caches to be invalidated.

```ruby
class A
  B = 1
end

def foo
  A::B # inline cache depends on global counter
end

foo # populate inline cache
foo # hit inline cache
C = 1 # global counter increments, all caches are invalidated
foo # misses inline cache due to `C = 1`
```

Proposed behavior - caches depend on name components. Only constant mutations with corresponding names will invalidate the cache.

```ruby
class A
  B = 1
end

def foo
  A::B # inline cache depends on constants named "A" and "B"
end

foo # populate inline cache
foo # hit inline cache
C = 1 # caches that depend on the name "C" are invalidated
foo # hits inline cache because IC only depends on "A" and "B"
```

Examples of breaking the new cache:

```ruby
module C
  # Breaks `foo` cache because "A" constant is set and the cache in foo depends
  # on "A" and "B"
  class A; end
end

B = 1
```

We expect the new cache scheme to be invalidated less often because names aren't frequently reused. With the cache being invalidated less, we can rely on its stability more to keep our constant references fast and reduce the need to throw away generated code in YJIT.

Notes: Merged: https://github.com/ruby/ruby/pull/5433
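The name-keyed scheme proposed here can be modelled in a few lines of Ruby. This is a minimal sketch, not the VM's implementation: `ToyInlineCache` and `ToyConstantCacheTable` are hypothetical classes, and the table is simply a hash from a constant name to the set of caches that depend on it.

```ruby
require "set"

# Hypothetical inline cache that remembers which names it depends on.
class ToyInlineCache
  attr_reader :names

  def initialize(*names)
    @names = names
    @valid = false
  end

  def populate!
    @valid = true
  end

  def invalidate!
    @valid = false
  end

  def valid?
    @valid
  end
end

# Hypothetical invalidation table: constant name => caches using that name.
class ToyConstantCacheTable
  def initialize
    @caches_by_name = Hash.new { |hash, name| hash[name] = Set.new }
  end

  def register(cache)
    cache.names.each { |name| @caches_by_name[name] << cache }
  end

  # Called when any constant with this name is set, in any scope.
  def constant_set(name)
    @caches_by_name.fetch(name, []).each(&:invalidate!)
  end
end

table = ToyConstantCacheTable.new
foo_cache = ToyInlineCache.new(:A, :B) # models the cache for `A::B` in foo
table.register(foo_cache)
foo_cache.populate!

table.constant_set(:C)   # unrelated name: the cache survives
puts foo_cache.valid?
table.constant_set(:A)   # e.g. `class A; end` inside module C: the cache is dropped
puts foo_cache.valid?
```

The model keys on the bare name rather than the scope, which is why defining `C::A` still invalidates a cache for the top-level `A::B`.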
2022-03-24 | Add ISEQ_BODY macro | Peter Zhu
Use ISEQ_BODY macro to get the rb_iseq_constant_body of the ISeq. Using this macro will make it easier for us to change the allocation strategy of rb_iseq_constant_body when using Variable Width Allocation. Notes: Merged: https://github.com/ruby/ruby/pull/5698
2022-02-02 | Treat TS_ICVARC cache as separate from TS_IVC cache | Jemma Issroff
Notes: Merged: https://github.com/ruby/ruby/pull/5519
2021-12-05 | Make `leaf` const in VM generator | Alan Wu
Assigning to `leaf` in insns.def would give undesirable results.
2021-10-29 | vm_core.h: Avoid unaligned access to ic_serial on 32-bit machine | Yusuke Endoh
This caused Bus error on 32 bit Solaris Notes: Merged: https://github.com/ruby/ruby/pull/5049
2021-10-20 | Cleanup diff against upstream. Add comments | Alan Wu
I did a `git diff --stat` against upstream and looked at all the files that are outside of YJIT to come up with these minor changes.
2021-10-20 | Remove the scraper | Aaron Patterson
Now that we're using the jit function entry point, we don't need the scraper. Thank you for your service, scraper. ❤️
2021-10-20 | Remove some MicroJIT vestiges | Aaron Patterson
Just happened to run across these, so let's fix them
2021-10-20 | YJIT: lazy polymorphic getinstancevariable | Alan Wu
Lazily compile out a chain of checks for different known classes and whether `self` embeds its ivars or not.

* Remove trailing whitespaces
* Get proper address in Capstone disassembly
* Lowercase address in Capstone disassembly

Capstone uses lowercase for jump targets in generated listings. Let's match it.

* Use the same successor in getivar guard chains

Cuts down on duplication

* Address reviews
* Fix copypasta error
* Add a comment
2021-10-20 | Remove trailing whitespaces | Maxime Chevalier-Boisvert
2021-10-20 | Yet Another Ruby JIT! | Jose Narvaez
Renaming uJIT to YJIT. AKA s/ujit/yjit/g.
2021-10-20 | Restore interpreter regs in ujit hook. Implement leave bytecode. | Maxime Chevalier-Boisvert
2021-10-20 | Refactor uJIT code into more files for readability | Maxime Chevalier-Boisvert
2021-10-20 | Fix typo | Alan Wu
2021-10-20 | Include disassembly in MicroJIT scraper output | Alan Wu
2021-10-20 | Add to the MicroJIT scraper an example that passes ec | Alan Wu
2021-10-20 | Fix compilation for OPT_THREADED_CODE=2 | Alan Wu
2021-10-20 | Zero sized array are not standard C | Alan Wu
2021-10-20 | Compile with MicroJIT disabled when scrape fails | Alan Wu
This is just so we can build successfully on -O0 and other cases that are not supported by the code scraper.
2021-10-20 | endbr64 is fine | Alan Wu
2021-10-20 | Preliminary GNU/Linux support for code scraper | Alan Wu
Let's see if this works on CI
2021-10-20 | Refactor ujit_examples.h generator. Remove dwarfdump dependency | Alan Wu
2021-10-20 | Remove PC argument from ujit instructions | Maxime Chevalier-Boisvert
2021-10-20 | Yeah, this actually works! | Alan Wu
2021-10-20 | Add example handler for ujit and scrape it from vm.o | Alan Wu
2021-10-04 | Expose instruction information for debuggers [Feature #18026] | Nobuyoshi Nakada
Notes: Merged: https://github.com/ruby/ruby/pull/4633
2021-08-21 | Allow tracing of optimized methods | Jeremy Evans
This updates the trace instructions to directly dispatch to opt_send_without_block. So this should cause no slowdown in non-trace mode. To enable the tracing of the optimized methods, RUBY_EVENT_C_CALL and RUBY_EVENT_C_RETURN are added as events to the specialized instructions. Fixes [Bug #14870] Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com> Notes: Merged: https://github.com/ruby/ruby/pull/4739 Merged-By: jeremyevans <code@jeremyevans.net>
2021-05-31 | Decompose the captured_cc code for investigation | Takashi Kokubun
I'm investigating SEGVs like https://github.com/ruby/ruby/runs/2715166621?check_suite_focus=true. Because a lot of things are going on on this line, it's hard to identify the cause, especially because we can't get the core file of the failures. Therefore I intentionally increased the number of lines for investigation.
2021-03-10 | Remove DEFINED_IVAR2 from enum | John Hawthorn
This version of defined? doesn't seem to be possible to emit anymore. Notes: Merged: https://github.com/ruby/ruby/pull/4253
2021-01-11 | Avoid re-entering opt_invokebuiltin_delegate_leave | Takashi Kokubun
on interruption. The cancellation code was originally written for leave insn, but re-entering opt_invokebuiltin_delegate_leave insn on a cancellation is not safe, because a builtin function is executed twice.
2021-01-04 | Fix broken JIT of getinlinecache | Takashi Kokubun
e7fc353f04 reverted vm_ic_hit_p's signature change made in 53babf35ef, which broke JIT compilation of getinlinecache. To make sure it doesn't happen again, I separated vm_inlined_ic_hit_p to make the intention clear.
2021-01-04 | Avoid using inconsistent coding style | Takashi Kokubun
Other `_mjit_compile_*.erb` files don't use goto. These files should be consistent for readability.
2021-01-05 | enable constant cache on ractors | Koichi Sasada
The constant cache `IC` is accessed in a non-atomic manner, which causes thread-safety issues, so Ruby 3.0 disables the constant cache on non-main ractors. This patch enables it by introducing `imemo_constcache` and allocating one on every refill of the constant cache, like `imemo_callcache`. [Bug #17510]

Now `IC` has only one entry, `IC::entry`, which points to an `iseq_inline_constant_cache_entry` managed as a T_IMEMO object. `IC` is now an atomic data structure, so `rb_mjit_before_vm_ic_update()` and `rb_mjit_after_vm_ic_update()` are no longer needed.

Notes: Merged: https://github.com/ruby/ruby/pull/4022
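The shape of this change, a cache that holds a single reference to an immutable entry and allocates a fresh entry on every refill, can be sketched in Ruby. This is a toy model under stated assumptions and says nothing about the actual C structs: `ToyIC` and `ToyConstCacheEntry` are hypothetical names, and the `serial` field is an assumed validity stamp used only for illustration.

```ruby
# Hypothetical immutable cache entry: the cached value plus a validity stamp.
ToyConstCacheEntry = Struct.new(:value, :serial)

# Hypothetical inline cache that holds exactly one reference.
class ToyIC
  attr_reader :entry

  def initialize
    @entry = nil
  end

  # Readers load the reference once and then only look at the frozen
  # entry, so they never see a half-updated value/serial pair.
  def fetch(current_serial)
    e = @entry
    return e.value if e && e.serial == current_serial

    value = yield
    # Refill: build a complete new entry, then swap the single reference.
    @entry = ToyConstCacheEntry.new(value, current_serial).freeze
    value
  end
end

ic = ToyIC.new
lookups = 0
ic.fetch(1) { lookups += 1; :Foo }   # miss: runs the block and fills the cache
ic.fetch(1) { lookups += 1; :Foo }   # hit: the block is not called
ic.fetch(2) { lookups += 1; :Foo }   # stale serial: a fresh entry is allocated
puts lookups
```

Replacing one reference rather than mutating fields in place is what removes the need for separate before/after update hooks in this model.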
2020-12-22 | Skip checking ROBJECT_EMBED | Takashi Kokubun
when we already check ROBJECT_NUMIV(self) is larger than ROBJECT_EMBED_LEN_MAX at the beginning of the method, because the number of instance variables for the same object doesn't decrease.

```
$ benchmark-driver -v --rbenv 'before --jit;after --jit' --repeat-count=4 --alternate --output=all benchmark_3000.yml
before --jit: ruby 3.0.0dev (2020-12-23T06:32:19Z master dbb4f19969) +JIT [x86_64-linux]
after --jit: ruby 3.0.0dev (2020-12-23T07:45:42Z master 95e866c098) +JIT [x86_64-linux]
last_commit=Skip checking ROBJECT_EMBED
Calculating -------------------------------------
                             before --jit         after --jit
Optcarrot 3000 frames  102.34091772397872  102.77738408379015 fps
                       103.37784821624231  105.46530219076179
                       104.39567016876369  106.43712452152215
                       105.31782092252713  106.54986150067481
```
2020-12-21 | Prefer stdbool in vm_exec | Takashi Kokubun
Make the code a bit more modern and consistent with some other places.
2020-12-19 | Check mjit_call_p only when interrupted | Takashi Kokubun
for leaf_without_check_ints insns.

    $ benchmark-driver -v --rbenv 'before --jit;after --jit' --repeat-count=12 --alternate --output=all benchmark.yml
    before --jit: ruby 3.0.0dev (2020-12-20T05:02:18Z master 02b3555874) +JIT [x86_64-linux]
    after --jit: ruby 3.0.0dev (2020-12-20T05:36:00Z master 3f58de4eab) +JIT [x86_64-linux]
    last_commit=Check mjit_call_p only when interrupted
    Calculating -------------------------------------
                                   before --jit        after --jit
    Optcarrot Lan_Master.nes  84.50647332260259  85.85057800433144 fps
                              91.17796644338372  92.09930605656054
                              91.29346683444497  93.01336611323687
                              91.50322318568884  93.07234029037433
                              91.66560903214686  93.22773241529644
                              91.82315142636172  93.37032901061119
                              92.15066379608260  93.83701526141679
                              92.37897097456643  93.86032792681507
                              92.53049815524908  93.91211970920320
                              92.78414507914283  94.09109196967890
                              92.90299756525958  94.40107239595325
                              93.70279428858790  95.01326369371263
2020-12-19 | Prefer RB_OBJ_FROZEN_RAW | Takashi Kokubun
following the original implementation's change. RB_TYPE_P(obj, T_OBJECT) is already checked in these places.

```
$ benchmark-driver -v --rbenv 'before --jit;after --jit' --repeat-count=12 --alternate --output=all benchmark.yml
before --jit: ruby 3.0.0dev (2020-12-19T08:27:44Z master 52b1716c78) +JIT [x86_64-linux]
after --jit: ruby 3.0.0dev (2020-12-19T08:27:44Z master 52b1716c78) +JIT [x86_64-linux]
Calculating -------------------------------------
                               before --jit        after --jit
Optcarrot Lan_Master.nes  88.04551460097873  84.38303800957766 fps
                          88.25194345156318  85.31098251408059
                          88.34143982084871  86.60491582339496
                          88.63486879856976  88.23675694701865
                          88.85392212902701  88.23696283371444
                          89.05739427483194  88.97185459567562
                          89.08141031147311  90.16373192658857
                          89.11359420883423  90.61655686444394
                          89.80323392966130  90.77044959019291
                          90.58912189625207  90.88534596330966
                          90.59847996970350  91.34314801302897
                          90.61180456415137  93.11599164249547
```
2020-12-16 | Lazily move PC with RUBY_VM_CHECK_INTS | Takashi Kokubun
```
$ benchmark-driver -v --rbenv 'before --jit;after --jit' --repeat-count=12 --alternate --output=all benchmark.yml
before --jit: ruby 3.0.0dev (2020-12-17T06:17:46Z master 3b4d698e0b) +JIT [x86_64-linux]
after --jit: ruby 3.0.0dev (2020-12-17T07:01:48Z master 843abb96f0) +JIT [x86_64-linux]
last_commit=Lazily move PC with RUBY_VM_CHECK_INTS
Calculating -------------------------------------
                               before --jit        after --jit
Optcarrot Lan_Master.nes  80.29343646660429  83.15779723251525 fps
                          82.26755637885149  85.50197941326810
                          83.50682959728820  88.14657804306270
                          85.01236533133049  88.78201988978667
                          87.81799334561326  88.94841008936447
                          87.88228562393064  89.37925215601926
                          88.06695585889995  89.86143277214475
                          88.84730834922165  90.00773346420887
                          90.46317871213088  90.82603371104014
                          90.96308347148916  91.29797694822179
                          90.97945938504556  91.31086331868738
                          91.57127890154500  91.49949184318844
```
2020-12-16 | Ignore catch_except_p for PC motion | Takashi Kokubun
We probably don't need to move it when an insn is leaf...