summaryrefslogtreecommitdiff
path: root/insns.def
AgeCommit message (Collapse)Author
2024-03-19Implement chilled stringsÉtienne Barrié
[Feature #20205] As a path toward enabling frozen string literals by default in the future, this commit introduce "chilled strings". From a user perspective chilled strings pretend to be frozen, but on the first attempt to mutate them, they lose their frozen status and emit a warning rather than to raise a `FrozenError`. Implementation wise, `rb_compile_option_struct.frozen_string_literal` is no longer a boolean but a tri-state of `enabled/disabled/unset`. When code is compiled with frozen string literals neither explictly enabled or disabled, string literals are compiled with a new `putchilledstring` instruction. This instruction is identical to `putstring` except it marks the String with the `STR_CHILLED (FL_USER3)` and `FL_FREEZE` flags. Chilled strings have the `FL_FREEZE` flag as to minimize the need to check for chilled strings across the codebase, and to improve compatibility with C extensions. Notes: - `String#freeze`: clears the chilled flag. - `String#-@`: acts as if the string was mutable. - `String#+@`: acts as if the string was mutable. - `String#clone`: copies the chilled flag. Co-authored-by: Jean Boussier <byroot@ruby-lang.org>
2024-02-20Add pushtoarraykwsplat instruction to avoid unnecessary array allocationJeremy Evans
This is designed to replace the newarraykwsplat instruction, which is no longer used in the parse.y compiler after this commit. This avoids an unnecessary array allocation in the case where ARGSCAT is followed by LIST with keyword: ```ruby a = [] kw = {} [*a, 1, **kw] ``` Previous Instructions: ``` 0000 newarray 0 ( 1)[Li] 0002 setlocal_WC_0 a@0 0004 newhash 0 ( 2)[Li] 0006 setlocal_WC_0 kw@1 0008 getlocal_WC_0 a@0 ( 3)[Li] 0010 splatarray true 0012 putobject_INT2FIX_1_ 0013 putspecialobject 1 0015 newhash 0 0017 getlocal_WC_0 kw@1 0019 opt_send_without_block <calldata!mid:core#hash_merge_kwd, argc:2, ARGS_SIMPLE> 0021 newarraykwsplat 2 0023 concattoarray 0024 leave ``` New Instructions: ``` 0000 newarray 0 ( 1)[Li] 0002 setlocal_WC_0 a@0 0004 newhash 0 ( 2)[Li] 0006 setlocal_WC_0 kw@1 0008 getlocal_WC_0 a@0 ( 3)[Li] 0010 splatarray true 0012 putobject_INT2FIX_1_ 0013 pushtoarray 1 0015 putspecialobject 1 0017 newhash 0 0019 getlocal_WC_0 kw@1 0021 opt_send_without_block <calldata!mid:core#hash_merge_kwd, argc:2, ARGS_SIMPLE> 0023 pushtoarraykwsplat 0024 leave ``` pushtoarraykwsplat is designed to be simpler than newarraykwsplat. It does not take a variable number of arguments from the stack, it pops the top of the stack, and appends it to the second from the top, unless the top of the stack is an empty hash. During this work, I found the ARGSPUSH followed by HASH with keyword did not compile correctly, as it pushed the generated hash to the array even if the hash was empty. This fixes the behavior, to use pushtoarraykwsplat instead of pushtoarray in that case: ```ruby a = [] kw = {} [*a, **kw] [{}] # Before [] # After ``` This does not remove the newarraykwsplat instruction, as it is still referenced in the prism compiler (which should be updated similar to this), YJIT (only in the bindings, it does not appear to be implemented), and RJIT (in a couple comments). After those are updated, the newarraykwsplat instruction should be removed.
2024-02-12Allow `foo(**nil, &block_arg)`Alan Wu
Previously, `**nil` by itself worked, but if you add a block argument, it raised a conversion error. The presence of the block argument shouldn't change how keyword splat works. See: <https://bugs.ruby-lang.org/issues/20064>
2024-01-30Use `UNDEF_P`Nobuyoshi Nakada
2024-01-24Add pushtoarray VM instructionJeremy Evans
This instruction is similar to concattoarray, but it takes the number of arguments to push to the array, removes that number of arguments from the stack, and adds them to the array now at the top of the stack. This allows `f(*a, 1)` to allocate only a single array on the caller side (which can be reused on the callee side in the case of `def f(*a)`). Prior to this commit, `f(*a, 1)` would generate 3 arrays: * a dupped by splatarray true * 1 wrapped in array by newarray * a dupped again by concatarray Instructions Before for `a = []; f(*a, 1)`: ``` 0000 newarray 0 ( 1)[Li] 0002 setlocal_WC_0 a@0 0004 putself 0005 getlocal_WC_0 a@0 0007 splatarray true 0009 putobject_INT2FIX_1_ 0010 newarray 1 0012 concatarray 0013 opt_send_without_block <calldata!mid:f, argc:1, ARGS_SPLAT|FCALL> 0015 leave ``` Instructions After for `a = []; f(*a, 1)`: ``` 0000 newarray 0 ( 1)[Li] 0002 setlocal_WC_0 a@0 0004 putself 0005 getlocal_WC_0 a@0 0007 splatarray true 0009 putobject_INT2FIX_1_ 0010 pushtoarray 1 0012 opt_send_without_block <calldata!mid:f, argc:1, ARGS_SPLAT|ARGS_SPLAT_MUT|FCALL> 0014 leave ``` With these changes, method calls to Ruby methods should implicitly allocate at most one array. Ignore typeprof bundled gem failure due to unrecognized instruction.
2024-01-24Add concattoarray VM instructionJeremy Evans
This instruction is similar to concatarray, but assumes the first object is already an array, and appends to it directly. This is different than concatarray, which will create a new array instead of appending to an existing array. Additionally, for both concatarray and concattoarray, if the second argument cannot be converted to an array, then just push it onto the array, instead of creating a new array to wrap it, and then using concat array. This saves an array allocation in that case. This allows `f(*a, *a, *1)` to allocate only a single array on the caller side (which can be reused on the callee side in the case of `def f(*a)`). Prior to this commit, `f(*a, *a, *1)` would generate 4 arrays: * a dupped by splatarray true * a dupped again by first concatarray * 1 wrapped in array by third splatarray * result of [*a, *a] dupped by second concatarray Instructions Before for `a = []; f(*a, *a, *1)`: ``` 0000 newarray 0 ( 1)[Li] 0002 setlocal_WC_0 a@0 0004 putself 0005 getlocal_WC_0 a@0 0007 splatarray true 0009 getlocal_WC_0 a@0 0011 splatarray false 0013 concatarray 0014 putobject_INT2FIX_1_ 0015 splatarray false 0017 concatarray 0018 opt_send_without_block <calldata!mid:g, argc:1, ARGS_SPLAT|ARGS_SPLAT_MUT|FCALL> 0020 leave ``` Instructions After for `a = []; f(*a, *a, *1)`: ``` 0000 newarray 0 ( 1)[Li] 0002 setlocal_WC_0 a@0 0004 putself 0005 getlocal_WC_0 a@0 0007 splatarray true 0009 getlocal_WC_0 a@0 0011 concattoarray 0012 putobject_INT2FIX_1_ 0013 concattoarray 0014 opt_send_without_block <calldata!mid:f, argc:1, ARGS_SPLAT|ARGS_SPLAT_MUT|FCALL> 0016 leave ```
2023-12-09Ensure f(**kw, &block) calls kw.to_hash before block.to_procJeremy Evans
Previously, block.to_proc was called first, by vm_caller_setup_arg_block. kw.to_hash was called later inside CALLER_SETUP_ARG or setup_parameters_complex. This adds a splatkw instruction that is inserted before sends with ARGS_BLOCKARG and KW_SPLAT and without KW_SPLAT_MUT. This is not needed in the KW_SPLAT_MUT case, because then you know the value is a hash, and you don't need to call to_hash on it. The splatkw instruction checks whether the second to top block is a hash, and if not, replaces it with the value of calling to_hash on it (using rb_to_hash_type). As it is always before a send with ARGS_BLOCKARG and KW_SPLAT, second to top is the keyword splat, and top is the passed block.
2023-12-01Make expandarray compaction safePeter Zhu
The expandarray instruction can allocate an array, which can trigger a GC compaction. However, since it does not increment the sp until the end of the instruction, the objects it places on the stack are not marked or reference updated by the GC, which can cause the objects to move which leaves broken or incorrect objects on the stack. This commit changes the instruction to be handles_sp so the sp is incremented inside of the instruction right after the object is written on the stack.
2023-10-13YJIT: Fallback opt_getconstant_path for const_missing (#8623)Takashi Kokubun
* YJIT: Fallback opt_getconstant_path for const_missing * Fix a comment [ci skip] * Remove a wrapper function
2023-04-21Remove unused opt_call_c_function insn (#7750)Takashi Kokubun
Notes: Merged-By: k0kubun <takashikkbn@gmail.com>
2023-04-18Emit special instruction for array literal + .(hash|min|max)Aaron Patterson
This commit introduces a new instruction `opt_newarray_send` which is used when there is an array literal followed by either the `hash`, `min`, or `max` method. ``` [a, b, c].hash ``` Will emit an `opt_newarray_send` instruction. This instruction falls back to a method call if the "interested" method has been monkey patched. Here are some examples of the instructions generated: ``` $ ./miniruby --dump=insns -e '[@a, @b].max' == disasm: #<ISeq:<main>@-e:1 (1,0)-(1,12)> (catch: FALSE) 0000 getinstancevariable :@a, <is:0> ( 1)[Li] 0003 getinstancevariable :@b, <is:1> 0006 opt_newarray_send 2, :max 0009 leave $ ./miniruby --dump=insns -e '[@a, @b].min' == disasm: #<ISeq:<main>@-e:1 (1,0)-(1,12)> (catch: FALSE) 0000 getinstancevariable :@a, <is:0> ( 1)[Li] 0003 getinstancevariable :@b, <is:1> 0006 opt_newarray_send 2, :min 0009 leave $ ./miniruby --dump=insns -e '[@a, @b].hash' == disasm: #<ISeq:<main>@-e:1 (1,0)-(1,13)> (catch: FALSE) 0000 getinstancevariable :@a, <is:0> ( 1)[Li] 0003 getinstancevariable :@b, <is:1> 0006 opt_newarray_send 2, :hash 0009 leave ``` [Feature #18897] [ruby-core:109147] Co-authored-by: John Hawthorn <jhawthorn@github.com> Notes: Merged: https://github.com/ruby/ruby/pull/6090
2023-03-16Refactor jit_func_t and jit_execTakashi Kokubun
I closed https://github.com/ruby/ruby/pull/7543, but part of the diff seems useful regardless, so I extracted it.
2023-03-14YJIT: Implement throw instruction (#7491)Takashi Kokubun
* Break up jit_exec from vm_sendish * YJIT: Implement throw instruction * YJIT: Explain what rb_vm_throw does [ci skip] Notes: Merged-By: k0kubun <takashikkbn@gmail.com>
2023-03-10rename `defined_ivar` to `definedivar`Koichi Sasada
because non-opt instructions should contain `_` char. Notes: Merged: https://github.com/ruby/ruby/pull/7485
2023-03-08Add defined_ivar instructionOle Friis Østergaard
This is a variation of the `defined` instruction, for use when we are checking for an instance variable. Splitting this out as a separate instruction lets us skip some checks, and it also allows us to use an instance variable cache, letting shape analysis speed up the operation further. Notes: Merged: https://github.com/ruby/ruby/pull/7433
2023-03-06Remove obsoleted MJIT_HEADER macroTakashi Kokubun
Notes: Merged: https://github.com/ruby/ruby/pull/7461
2022-11-10Adjust indents [ci skip]Nobuyoshi Nakada
2022-09-01New constant caching insn: opt_getconstant_pathJohn Hawthorn
Previously YARV bytecode implemented constant caching by having a pair of instructions, opt_getinlinecache and opt_setinlinecache, wrapping a series of getconstant calls (with putobject providing supporting arguments). This commit replaces that pattern with a new instruction, opt_getconstant_path, handling both getting/setting the inline cache and fetching the constant on a cache miss. This is implemented by storing the full constant path as a null-terminated array of IDs inside of the IC structure. idNULL is used to signal an absolute constant reference. $ ./miniruby --dump=insns -e '::Foo::Bar::Baz' == disasm: #<ISeq:<main>@-e:1 (1,0)-(1,13)> (catch: FALSE) 0000 opt_getconstant_path <ic:0 ::Foo::Bar::Baz> ( 1)[Li] 0002 leave The motivation for this is that we had increasingly found the need to disassemble the instructions between the opt_getinlinecache and opt_setinlinecache in order to determine the constant we are fetching, or otherwise store metadata. This disassembly was done: * In opt_setinlinecache, to register the IC against the constant names it is using for granular invalidation. * In rb_iseq_free, to unregister the IC from the invalidation table. * In YJIT to find the position of a opt_getinlinecache instruction to invalidate it when the cache is populated * In YJIT to register the constant names being used for invalidation. With this change we no longe need disassemly for these (in fact rb_iseq_each is now unused), as the list of constant names being referenced is held in the IC. This should also make it possible to make more optimizations in the future. This may also reduce the size of iseqs, as previously each segment required 32 bytes (on 64-bit platforms) for each constant segment. This implementation only stores one ID per-segment. There should be no significant performance change between this and the previous implementation. Previously opt_getinlinecache was a "leaf" instruction, but it included a jump (almost always to a separate cache line). Now opt_getconstant_path is a non-leaf (it may raise/autoload/call const_missing) but it does not jump. These seem to even out. Notes: Merged: https://github.com/ruby/ruby/pull/6187
2022-08-09Add peephole optimizer for newarray(X)/expandarray(X, 0) -> opt_reverse(X)Jeremy Evans
This renames the reverse instruction to opt_reverse, since now it is only added by the optimizer. Then it uses as a more general form of swap. This optimizes multiple assignment in the popped case with more than two elements. Notes: Merged: https://github.com/ruby/ruby/pull/6158
2022-08-09Revert "Remove reverse VM instruction"Jeremy Evans
This reverts commit 5512353d97250e85c13bf10b9b32e750478cf474. Notes: Merged: https://github.com/ruby/ruby/pull/6158
2022-07-21Expand tabs [ci skip]Takashi Kokubun
[Misc #18891] Notes: Merged: https://github.com/ruby/ruby/pull/6094
2022-04-05RubyVM.stat constant cache metrics (#5766)Kevin Newton
Before the new constant cache behavior, caches were invalidated by a single global variable. You could inspect the value of this variable with RubyVM.stat(:global_constant_state). This was mostly useful to verify the behavior of the VM or to test constant loading like in Rails. With the new constant cache behavior, we introduced RubyVM.stat(:constant_cache) which returned a hash with symbol keys and integer values that represented the number of live constant caches associated with the given symbol. Additionally, we removed the old RubyVM.stat(:global_constant_state). This was proven to be not very useful, so it doesn't help you diagnose constant loading issues. So, instead we added the global constant state back into the RubyVM output. However, that number can be misleading as now when you invalidate something like `Foo::Bar::Baz` you're actually invalidating 3 different lists of inline caches. This commit attempts to get the best of both worlds. We remove RubyVM.stat(:global_constant_state) like we did originally, as it doesn't have the same semantic meaning and it could be confusing going forward. Instead we add RubyVM.stat(:constant_cache_invalidations) and RubyVM.stat(:constant_cache_misses). These two metrics should provide enough information to diagnose any constant loading issues, as well as provide a replacement for the old global constant state. Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2022-04-01Finer-grained constant cache invalidation (take 2)Kevin Newton
This commit reintroduces finer-grained constant cache invalidation. After 8008fb7 got merged, it was causing issues on token-threaded builds (such as on Windows). The issue was that when you're iterating through instruction sequences and using the translator functions to get back the instruction structs, you're either using `rb_vm_insn_null_translator` or `rb_vm_insn_addr2insn2` depending if it's a direct-threading build. `rb_vm_insn_addr2insn2` does some normalization to always return to you the non-trace version of whatever instruction you're looking at. `rb_vm_insn_null_translator` does not do that normalization. This means that when you're looping through the instructions if you're trying to do an opcode comparison, it can change depending on the type of threading that you're using. This can be very confusing. So, this commit creates a new translator function `rb_vm_insn_normalizing_translator` to always return the non-trace version so that opcode comparisons don't have to worry about different configurations. [Feature #18589] Notes: Merged: https://github.com/ruby/ruby/pull/5716
2022-03-25Revert "Finer-grained inline constant cache invalidation"Nobuyoshi Nakada
This reverts commits for [Feature #18589]: * 8008fb7352abc6fba433b99bf20763cf0d4adb38 "Update formatting per feedback" * 8f6eaca2e19828e92ecdb28b0fe693d606a03f96 "Delete ID from constant cache table if it becomes empty on ISEQ free" * 629908586b4bead1103267652f8b96b1083573a8 "Finer-grained inline constant cache invalidation" MSWin builds on AppVeyor have been crashing since the merger. Notes: Merged: https://github.com/ruby/ruby/pull/5715 Merged-By: nobu <nobu@ruby-lang.org>
2022-03-24Finer-grained inline constant cache invalidationKevin Newton
Current behavior - caches depend on a global counter. All constant mutations cause caches to be invalidated. ```ruby class A B = 1 end def foo A::B # inline cache depends on global counter end foo # populate inline cache foo # hit inline cache C = 1 # global counter increments, all caches are invalidated foo # misses inline cache due to `C = 1` ``` Proposed behavior - caches depend on name components. Only constant mutations with corresponding names will invalidate the cache. ```ruby class A B = 1 end def foo A::B # inline cache depends constants named "A" and "B" end foo # populate inline cache foo # hit inline cache C = 1 # caches that depend on the name "C" are invalidated foo # hits inline cache because IC only depends on "A" and "B" ``` Examples of breaking the new cache: ```ruby module C # Breaks `foo` cache because "A" constant is set and the cache in foo depends # on "A" and "B" class A; end end B = 1 ``` We expect the new cache scheme to be invalidated less often because names aren't frequently reused. With the cache being invalidated less, we can rely on its stability more to keep our constant references fast and reduce the need to throw away generated code in YJIT. Notes: Merged: https://github.com/ruby/ruby/pull/5433
2022-03-24Add ISEQ_BODY macroPeter Zhu
Use ISEQ_BODY macro to get the rb_iseq_constant_body of the ISeq. Using this macro will make it easier for us to change the allocation strategy of rb_iseq_constant_body when using Variable Width Allocation. Notes: Merged: https://github.com/ruby/ruby/pull/5698
2022-02-02Treat TS_ICVARC cache as separate from TS_IVC cacheJemma Issroff
Notes: Merged: https://github.com/ruby/ruby/pull/5519
2022-01-01Prefer RBOOLNobuyoshi Nakada
Notes: Merged: https://github.com/ruby/ruby/pull/5385
2021-12-02Lazily create singletons on instance_{exec,eval} (#5146)John Hawthorn
* Lazily create singletons on instance_{exec,eval} Previously when instance_exec or instance_eval was called on an object, that object would be given a singleton class so that method definitions inside the block would be added to the object rather than its class. This commit aims to improve performance by delaying the creation of the singleton class unless/until one is needed for method definition. Most of the time instance_eval is used without any method definition. This was implemented by adding a flag to the cref indicating that it represents a singleton of the object rather than a class itself. In this case CREF_CLASS returns the object's existing class, but in cases that we are defining a method (either via definemethod or VM_SPECIAL_OBJECT_CBASE which is used for undef and alias). This also happens to fix what I believe is a bug. Previously instance_eval behaved differently with regards to constant access for true/false/nil than for all other objects. I don't think this was intentional. String::Foo = "foo" "".instance_eval("Foo") # => "foo" Integer::Foo = "foo" 123.instance_eval("Foo") # => "foo" TrueClass::Foo = "foo" true.instance_eval("Foo") # NameError: uninitialized constant Foo This also slightly changes the error message when trying to define a method through instance_eval on an object which can't have a singleton class. Before: $ ruby -e '123.instance_eval { def foo; end }' -e:1:in `block in <main>': no class/module to add method (TypeError) After: $ ./ruby -e '123.instance_eval { def foo; end }' -e:1:in `block in <main>': can't define singleton (TypeError) IMO this error is a small improvement on the original and better matches the (both old and new) message when definging a method using `def self.` $ ruby -e '123.instance_eval{ def self.foo; end }' -e:1:in `block in <main>': can't define singleton (TypeError) Co-authored-by: Matthew Draper <matthew@trebex.net> * Remove "under" argument from yield_under * Move CREF_SINGLETON_SET into vm_cref_new * Simplify vm_get_const_base * Fix leaf VM_SPECIAL_OBJECT_CONST_BASE Co-authored-by: Matthew Draper <matthew@trebex.net> Notes: Merged-By: jhawthorn <john@hawthorn.email>
2021-11-18Optimize dynamic string interpolation for symbol/true/false/nil/0-9Jeremy Evans
This provides a significant speedup for symbol, true, false, nil, and 0-9, class/module, and a small speedup in most other cases. Speedups (using included benchmarks): :symbol :: 60% 0-9 :: 50% Class/Module :: 50% nil/true/false :: 20% integer :: 10% [] :: 10% "" :: 3% One reason this approach is faster is it reduces the number of VM instructions for each interpolated value. Initial idea, approach, and benchmarks from Eric Wong. I applied the same approach against the master branch, updating it to handle the significant internal changes since this was first proposed 4 years ago (such as CALL_INFO/CALL_CACHE -> CALL_DATA). I also expanded it to optimize true/false/nil/0-9/class/module, and added handling of missing methods, refined methods, and RUBY_DEBUG. This renames the tostring insn to anytostring, and adds an objtostring insn that implements the optimization. This requires making a few functions non-static, and adding some non-static functions. This disables 4 YJIT tests. Those tests should be reenabled after YJIT optimizes the new objtostring insn. Implements [Feature #13715] Co-authored-by: Eric Wong <e@80x24.org> Co-authored-by: Alan Wu <XrXr@users.noreply.github.com> Co-authored-by: Yusuke Endoh <mame@ruby-lang.org> Co-authored-by: Koichi Sasada <ko1@atdot.net> Notes: Merged: https://github.com/ruby/ruby/pull/5002 Merged-By: jeremyevans <code@jeremyevans.net>
2021-11-18Refactor setclassvariable (#5143)Eileen M. Uchitelle
We only need the cref when we have a cache miss so don't look it up until we need it. This likely speeds up class variable writes in the interpreter but also simplifies the jit code. Before ``` Warming up -------------------------------------- write a cvar 192.280k i/100ms Calculating ------------------------------------- write a cvar 1.915M (± 3.5%) i/s - 9.614M in 5.026694s ``` After ``` Warming up -------------------------------------- write a cvar 216.308k i/100ms Calculating ------------------------------------- write a cvar 2.140M (± 3.1%) i/s - 10.815M in 5.058079s ``` Followup to ruby/ruby#5137 Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2021-11-18Refactor getclassvariable (#5137)Eileen M. Uchitelle
* Refactor getclassvariable We only need the cref when we have a cache miss so don't look it up until we need it. This speeds up class variable reads in the interpreter but also simplifies the jit code. Benchmarks for master vs this branch (without yjit): Before: ``` Warming up -------------------------------------- read a cvar 1.276M i/100ms Calculating ------------------------------------- read a cvar 12.596M (± 1.7%) i/s - 63.781M in 5.064902s ``` After: ``` Warming up -------------------------------------- read a cvar 1.336M i/100ms Calculating ------------------------------------- read a cvar 13.114M (± 3.6%) i/s - 65.488M in 5.000584s ``` Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org> * Clean up function signatures / remove dead code rb_vm_getclassvariable signature has changed and we don't need rb_vm_get_cref. Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org> Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2021-10-18Eliminate some redundant checks on `num` in `newhash`Aaron Patterson
The `newhash` instruction was checking if `num` is greater than 0, but so is [`rb_hash_new_with_size`](https://github.com/ruby/ruby/blob/82e2443d8b1e3edd2607c78dddf5aac79a13492d/hash.c#L1564) as well as [`rb_hash_bulk_insert`](https://github.com/ruby/ruby/blob/82e2443d8b1e3edd2607c78dddf5aac79a13492d/hash.c#L4764). If we know the size is 0 in the instruction, we can just directly call `rb_hash_new` and only check the size once. Unfortunately, when num is greater than 0, it's still checked 3 times. Notes: Merged: https://github.com/ruby/ruby/pull/4977
2021-09-30Make Array#min/max optimization respect refined methodsJeremy Evans
Pass in ec to vm_opt_newarray_{max,min}. Avoids having to call GET_EC inside the functions, for better performance. While here, add a test for Array#min/max being redefined to test_optimization.rb. Fixes [Bug #18180] Notes: Merged: https://github.com/ruby/ruby/pull/4911 Merged-By: jeremyevans <code@jeremyevans.net>
2021-09-23Fix typo in insns.def [ci skip]Alan Wu
Notes: Merged: https://github.com/ruby/ruby/pull/4888 Merged-By: XrXr
2021-06-18Add a cache for class variableseileencodes
Redo of 34a2acdac788602c14bf05fb616215187badd504 and 931138b00696419945dc03e10f033b1f53cd50f3 which were reverted. GitHub PR #4340. This change implements a cache for class variables. Previously there was no cache for cvars. Cvar access is slow due to needing to travel all the way up th ancestor tree before returning the cvar value. The deeper the ancestor tree the slower cvar access will be. The benefits of the cache are more visible with a higher number of included modules due to the way Ruby looks up class variables. The benchmark here includes 26 modules and shows with the cache, this branch is 6.5x faster when accessing class variables. ``` compare-ruby: ruby 3.1.0dev (2021-03-15T06:22:34Z master 9e5105c) [x86_64-darwin19] built-ruby: ruby 3.1.0dev (2021-03-15T12:12:44Z add-cache-for-clas.. c6be009) [x86_64-darwin19] | |compare-ruby|built-ruby| |:--------|-----------:|---------:| |vm_cvar | 5.681M| 36.980M| | | -| 6.51x| ``` Benchmark.ips calling `ActiveRecord::Base.logger` from within a Rails application. ActiveRecord::Base.logger has 71 ancestors. The more ancestors a tree has, the more clear the speed increase. IE if Base had only one ancestor we'd see no improvement. This benchmark is run on a vanilla Rails application. Benchmark code: ```ruby require "benchmark/ips" require_relative "config/environment" Benchmark.ips do |x| x.report "logger" do ActiveRecord::Base.logger end end ``` Ruby 3.0 master / Rails 6.1: ``` Warming up -------------------------------------- logger 155.251k i/100ms Calculating ------------------------------------- ``` Ruby 3.0 with cvar cache / Rails 6.1: ``` Warming up -------------------------------------- logger 1.546M i/100ms Calculating ------------------------------------- logger 14.857M (± 4.8%) i/s - 74.198M in 5.006202s ``` Lastly we ran a benchmark to demonstate the difference between master and our cache when the number of modules increases. This benchmark measures 1 ancestor, 30 ancestors, and 100 ancestors. Ruby 3.0 master: ``` Warming up -------------------------------------- 1 module 1.231M i/100ms 30 modules 432.020k i/100ms 100 modules 145.399k i/100ms Calculating ------------------------------------- 1 module 12.210M (± 2.1%) i/s - 61.553M in 5.043400s 30 modules 4.354M (± 2.7%) i/s - 22.033M in 5.063839s 100 modules 1.434M (± 2.9%) i/s - 7.270M in 5.072531s Comparison: 1 module: 12209958.3 i/s 30 modules: 4354217.8 i/s - 2.80x (± 0.00) slower 100 modules: 1434447.3 i/s - 8.51x (± 0.00) slower ``` Ruby 3.0 with cvar cache: ``` Warming up -------------------------------------- 1 module 1.641M i/100ms 30 modules 1.655M i/100ms 100 modules 1.620M i/100ms Calculating ------------------------------------- 1 module 16.279M (± 3.8%) i/s - 82.038M in 5.046923s 30 modules 15.891M (± 3.9%) i/s - 79.459M in 5.007958s 100 modules 16.087M (± 3.6%) i/s - 81.005M in 5.041931s Comparison: 1 module: 16279458.0 i/s 100 modules: 16087484.6 i/s - same-ish: difference falls within error 30 modules: 15891406.2 i/s - same-ish: difference falls within error ``` Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org> Notes: Merged: https://github.com/ruby/ruby/pull/4544
2021-06-14[Bug #17880] Set leaf false on opt_setinlinecache (#4565)Eileen M. Uchitelle
This change fixes the bug described in https://bugs.ruby-lang.org/issues/17880. Checking `ractor_shareable_p` will cause the method to call back into Ruby. Anything calling this method can't be a leaf instruction, otherwise it could crash. By adding `attr bool leaf = false` we no longer crash because it marks the function as not a leaf. Here's a simplified reproduction script: ```ruby require "set" class Id attr_reader :db_id def initialize(db_id) @db_id = db_id end def ==(other) other.class == self.class && other.db_id == db_id end alias_method :eql?, :== def hash 10 end def <=>(other) db_id <=> other.db_id if other.is_a?(self.class) end end class Namespace IDS = Set[ Id.new(1).freeze, Id.new(2).freeze, Id.new(3).freeze, Id.new(4).freeze, ].freeze class << self def test?(id) IDS.include?(id) end end end p Namespace.test?(Id.new(1)) p Namespace.test?(Id.new(5)) ``` Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org> Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org> Notes: Merged-By: k0kubun <takashikkbn@gmail.com>
2021-05-11Revert "Filling cache values on cvar write"Aaron Patterson
This reverts commit 08de37f9fa3469365e6b5c964689ae2bae0eb9f3. This reverts commit e8ae922b62adb00a80d3d4c49f7d7b0e6026eaba.
2021-05-11Filling cache values on cvar writeeileencodes
Instead of on read. Once it's in the inline cache we never have to make one again. We want to eventually put the value into the cache, and the best opportunity to do that is when you write the value. Notes: Merged: https://github.com/ruby/ruby/pull/4340
2021-05-11Add a cache for class variableseileencodes
This change implements a cache for class variables. Previously there was no cache for cvars. Cvar access is slow due to needing to travel all the way up th ancestor tree before returning the cvar value. The deeper the ancestor tree the slower cvar access will be. The benefits of the cache are more visible with a higher number of included modules due to the way Ruby looks up class variables. The benchmark here includes 26 modules and shows with the cache, this branch is 6.5x faster when accessing class variables. ``` compare-ruby: ruby 3.1.0dev (2021-03-15T06:22:34Z master 9e5105ca45) [x86_64-darwin19] built-ruby: ruby 3.1.0dev (2021-03-15T12:12:44Z add-cache-for-clas.. c6be0093ae) [x86_64-darwin19] | |compare-ruby|built-ruby| |:--------|-----------:|---------:| |vm_cvar | 5.681M| 36.980M| | | -| 6.51x| ``` Benchmark.ips calling `ActiveRecord::Base.logger` from within a Rails application. ActiveRecord::Base.logger has 71 ancestors. The more ancestors a tree has, the more clear the speed increase. IE if Base had only one ancestor we'd see no improvement. This benchmark is run on a vanilla Rails application. Benchmark code: ```ruby require "benchmark/ips" require_relative "config/environment" Benchmark.ips do |x| x.report "logger" do ActiveRecord::Base.logger end end ``` Ruby 3.0 master / Rails 6.1: ``` Warming up -------------------------------------- logger 155.251k i/100ms Calculating ------------------------------------- ``` Ruby 3.0 with cvar cache / Rails 6.1: ``` Warming up -------------------------------------- logger 1.546M i/100ms Calculating ------------------------------------- logger 14.857M (± 4.8%) i/s - 74.198M in 5.006202s ``` Lastly we ran a benchmark to demonstate the difference between master and our cache when the number of modules increases. This benchmark measures 1 ancestor, 30 ancestors, and 100 ancestors. Ruby 3.0 master: ``` Warming up -------------------------------------- 1 module 1.231M i/100ms 30 modules 432.020k i/100ms 100 modules 145.399k i/100ms Calculating ------------------------------------- 1 module 12.210M (± 2.1%) i/s - 61.553M in 5.043400s 30 modules 4.354M (± 2.7%) i/s - 22.033M in 5.063839s 100 modules 1.434M (± 2.9%) i/s - 7.270M in 5.072531s Comparison: 1 module: 12209958.3 i/s 30 modules: 4354217.8 i/s - 2.80x (± 0.00) slower 100 modules: 1434447.3 i/s - 8.51x (± 0.00) slower ``` Ruby 3.0 with cvar cache: ``` Warming up -------------------------------------- 1 module 1.641M i/100ms 30 modules 1.655M i/100ms 100 modules 1.620M i/100ms Calculating ------------------------------------- 1 module 16.279M (± 3.8%) i/s - 82.038M in 5.046923s 30 modules 15.891M (± 3.9%) i/s - 79.459M in 5.007958s 100 modules 16.087M (± 3.6%) i/s - 81.005M in 5.041931s Comparison: 1 module: 16279458.0 i/s 100 modules: 16087484.6 i/s - same-ish: difference falls within error 30 modules: 15891406.2 i/s - same-ish: difference falls within error ``` Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org> Notes: Merged: https://github.com/ruby/ruby/pull/4340
2021-04-26Fix type-o in insns.defebrohman
"redefine" -> "redefined" Notes: Merged: https://github.com/ruby/ruby/pull/4418
2021-04-21Remove reverse VM instructionJeremy Evans
This was previously only used by the multiple assignment code, but is no longer needed after the multiple assignment execution order fix. Notes: Merged: https://github.com/ruby/ruby/pull/4398
2021-03-17Use rb_fstring for "defined" strings.Aaron Patterson
We can take advantage of fstrings to de-duplicate the defined strings. This means we don't need to keep the list of defined strings on the VM (or register them as mark objects) Notes: Merged: https://github.com/ruby/ruby/pull/4279
2021-03-17Refactor vm_defined to return a booleanAaron Patterson
We just need this function to return whether or not the thing we're looking for is defined. If it's defined, return something true, otherwise false. Notes: Merged: https://github.com/ruby/ruby/pull/4279
2021-03-17Stop calling `rb_iseq_defined_string` in vm_definedAaron Patterson
We already have access to the string from the iseqs, so we can stop calling this function. Notes: Merged: https://github.com/ruby/ruby/pull/4279
2021-03-17Store strings for `defined` in the iseqsAaron Patterson
We can know the string used for "defined" calls at compile time, then store the string in the instruction sequences Notes: Merged: https://github.com/ruby/ruby/pull/4279
2021-01-05enable constant cache on ractorsKoichi Sasada
constant cache `IC` is accessed by non-atomic manner and there are thread-safety issues, so Ruby 3.0 disables to use const cache on non-main ractors. This patch enables it by introducing `imemo_constcache` and allocates it by every re-fill of const cache like `imemo_callcache`. [Bug #17510] Now `IC` only has one entry `IC::entry` and it points to `iseq_inline_constant_cache_entry`, managed by T_IMEMO object. `IC` is atomic data structure so `rb_mjit_before_vm_ic_update()` and `rb_mjit_after_vm_ic_update()` is not needed. Notes: Merged: https://github.com/ruby/ruby/pull/4022
2020-12-25Fix a cyclic explanationTakashi Kokubun
2020-12-17encourage inlining for vm_sendish()Koichi Sasada
Some tunings. * add `inline` for vm_sendish() * pass enum instead of func ptr to vm_sendish() * reorder initial order of `calling` struct. * add ALWAYS_INLINE for vm_search_method_fastpath() * call vm_search_method_fastpath() from vm_sendish() Notes: Merged: https://github.com/ruby/ruby/pull/3922
2020-12-16Lazily move PC with RUBY_VM_CHECK_INTSTakashi Kokubun
``` $ benchmark-driver -v --rbenv 'before --jit;after --jit' --repeat-count=12 --alternate --output=all benchmark.yml before --jit: ruby 3.0.0dev (2020-12-17T06:17:46Z master 3b4d698e0b) +JIT [x86_64-linux] after --jit: ruby 3.0.0dev (2020-12-17T07:01:48Z master 843abb96f0) +JIT [x86_64-linux] last_commit=Lazily move PC with RUBY_VM_CHECK_INTS Calculating ------------------------------------- before --jit after --jit Optcarrot Lan_Master.nes 80.29343646660429 83.15779723251525 fps 82.26755637885149 85.50197941326810 83.50682959728820 88.14657804306270 85.01236533133049 88.78201988978667 87.81799334561326 88.94841008936447 87.88228562393064 89.37925215601926 88.06695585889995 89.86143277214475 88.84730834922165 90.00773346420887 90.46317871213088 90.82603371104014 90.96308347148916 91.29797694822179 90.97945938504556 91.31086331868738 91.57127890154500 91.49949184318844 ```