path: root/compile.c
2021-08-31  Remove no longer used variable line_node  (Nobuyoshi Nakada)
2021-08-31  Extract compile_block from iseq_compile_each0  (Nobuyoshi Nakada)
And constify `node` argument of `iseq_compile_each0`.
2021-08-31  Constify line_node in iseq_compile_each0  (Nobuyoshi Nakada)
2021-08-21  Allow tracing of optimized methods  (Jeremy Evans)
This updates the trace instructions to directly dispatch to opt_send_without_block. So this should cause no slowdown in non-trace mode. To enable the tracing of the optimized methods, RUBY_EVENT_C_CALL and RUBY_EVENT_C_RETURN are added as events to the specialized instructions.
Fixes [Bug #14870]
Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
Notes: Merged: https://github.com/ruby/ruby/pull/4739  Merged-By: jeremyevans <code@jeremyevans.net>
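For illustration, a hedged sketch (not from the commit) of what this makes observable: with c_call/c_return events attached to the specialized instructions, a TracePoint can now see a call such as Integer#+ even though the VM compiles it to an optimized instruction. Exact output depends on the build.

```ruby
# Hedged sketch: Integer#+ is normally compiled to opt_plus; with this change
# the :c_call/:c_return events are still expected to fire for it in trace mode.
trace = TracePoint.new(:c_call, :c_return) do |tp|
  puts "#{tp.event} #{tp.defined_class}##{tp.method_id}"
end

trace.enable do
  1 + 2
end
```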
2021-08-15  Show verbose error messages when single pattern match fails  (Kazuki Tsujimoto)
    [0] => [0, *, a]
    #=> [0] length mismatch (given 1, expected 2+) (NoMatchingPatternError)
Ignore test failures of typeprof caused by this change for now.
2021-07-29  Fix use-after-free on -DUSE_EMBED_CI=0  (Alan Wu)
On -DUSE_EMBED_CI=0, there are more GC allocations and the old code didn't keep old_operands[0] reachable while allocating. On a Debian based system, I get a crash requiring erb under GC stress mode. On macOS, tool/transcode-tblgen.rb runs incorrectly if I put GC.stress=true as the first line.
Notes: Merged: https://github.com/ruby/ruby/pull/4662  Merged-By: XrXr
2021-07-15  Add pattern matching pin support for instance/class/global variables  (Jeremy Evans)
Pin matching for local variables and constants is already supported, and it is fairly simple to add support for these variable types. Note that pin matching for method calls is still not supported without wrapping in parentheses (pin expressions). I think that's for the best as method calls are far more complex (arguments/blocks).
Implements [Feature #17724]
Notes: Merged: https://github.com/ruby/ruby/pull/4502
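A minimal sketch of the syntax this enables (variable names are illustrative); previously only local variables and constants could be pinned directly:

```ruby
@expected = 1   # instance variable pinned below
$limit    = 10  # global variable pinned below

case [1, 10]
in [^@expected, ^$limit]
  :matched
end
#=> :matched
```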
2021-07-06  Store the dup'd CDHASH in the object list during IBF load  (Aaron Patterson)
Since b2fc592c304 nothing was holding a reference to the dup'd CDHASH during IBF loading. If a GC happened to run during IBF load then the copied hash wouldn't have anything to keep it alive. We don't really want to keep the originally loaded CDHASH hash, so this patch just overwrites the original hash with the copied / modified hash.
[Bug #17984] [ruby-core:104259]
Notes: Merged: https://github.com/ruby/ruby/pull/4630
2021-06-23  Check type of instruction - can be INSN or ADJUST  (eileencodes)
If the type is ADJUST we don't want to treat it like an INSN so we have to check the type before reading from `insn_info.events`.
[Bug #18001] [ruby-core:104371]
Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org>
Notes: Merged: https://github.com/ruby/ruby/pull/4601
2021-06-18  Add a cache for class variables  (eileencodes)
Redo of 34a2acdac788602c14bf05fb616215187badd504 and 931138b00696419945dc03e10f033b1f53cd50f3 which were reverted. GitHub PR #4340.

This change implements a cache for class variables. Previously there was no cache for cvars. Cvar access is slow due to needing to travel all the way up the ancestor tree before returning the cvar value. The deeper the ancestor tree, the slower cvar access will be.

The benefits of the cache are more visible with a higher number of included modules due to the way Ruby looks up class variables. The benchmark here includes 26 modules and shows that, with the cache, this branch is 6.5x faster when accessing class variables.

```
compare-ruby: ruby 3.1.0dev (2021-03-15T06:22:34Z master 9e5105c) [x86_64-darwin19]
built-ruby: ruby 3.1.0dev (2021-03-15T12:12:44Z add-cache-for-clas.. c6be009) [x86_64-darwin19]

|         |compare-ruby|built-ruby|
|:--------|-----------:|---------:|
|vm_cvar  |      5.681M|   36.980M|
|         |           -|     6.51x|
```

Benchmark.ips calling `ActiveRecord::Base.logger` from within a Rails application. ActiveRecord::Base.logger has 71 ancestors. The more ancestors a tree has, the clearer the speed increase. I.e. if Base had only one ancestor we'd see no improvement. This benchmark is run on a vanilla Rails application.

Benchmark code:

```ruby
require "benchmark/ips"
require_relative "config/environment"

Benchmark.ips do |x|
  x.report "logger" do
    ActiveRecord::Base.logger
  end
end
```

Ruby 3.0 master / Rails 6.1:

```
Warming up --------------------------------------
              logger   155.251k i/100ms
Calculating -------------------------------------
```

Ruby 3.0 with cvar cache / Rails 6.1:

```
Warming up --------------------------------------
              logger     1.546M i/100ms
Calculating -------------------------------------
              logger     14.857M (± 4.8%) i/s -     74.198M in   5.006202s
```

Lastly we ran a benchmark to demonstrate the difference between master and our cache when the number of modules increases. This benchmark measures 1 ancestor, 30 ancestors, and 100 ancestors.

Ruby 3.0 master:

```
Warming up --------------------------------------
            1 module     1.231M i/100ms
          30 modules   432.020k i/100ms
         100 modules   145.399k i/100ms
Calculating -------------------------------------
            1 module     12.210M (± 2.1%) i/s -     61.553M in   5.043400s
          30 modules      4.354M (± 2.7%) i/s -     22.033M in   5.063839s
         100 modules      1.434M (± 2.9%) i/s -      7.270M in   5.072531s

Comparison:
            1 module: 12209958.3 i/s
          30 modules:  4354217.8 i/s - 2.80x  (± 0.00) slower
         100 modules:  1434447.3 i/s - 8.51x  (± 0.00) slower
```

Ruby 3.0 with cvar cache:

```
Warming up --------------------------------------
            1 module     1.641M i/100ms
          30 modules     1.655M i/100ms
         100 modules     1.620M i/100ms
Calculating -------------------------------------
            1 module     16.279M (± 3.8%) i/s -     82.038M in   5.046923s
          30 modules     15.891M (± 3.9%) i/s -     79.459M in   5.007958s
         100 modules     16.087M (± 3.6%) i/s -     81.005M in   5.041931s

Comparison:
            1 module: 16279458.0 i/s
         100 modules: 16087484.6 i/s - same-ish: difference falls within error
          30 modules: 15891406.2 i/s - same-ish: difference falls within error
```

Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org>
Notes: Merged: https://github.com/ruby/ruby/pull/4544
2021-06-18  Enable USE_ISEQ_NODE_ID by default  (Yusuke Endoh)
... which was formerly called EXPERIMENTAL_ISEQ_NODE_ID. See also ff69ef27b06eed1ba750e7d9cab8322f351ed245.
https://bugs.ruby-lang.org/issues/17930
Notes: Merged: https://github.com/ruby/ruby/pull/4558
2021-06-18  Make it possible to get AST::Node from Thread::Backtrace::Location  (Yusuke Endoh)
RubyVM::AST.of(Thread::Backtrace::Location) returns a node that corresponds to the location. Typically, the node is a method call, but not always. This change also includes iseq's dump/load support of node_ids for each instruction.
Notes: Merged: https://github.com/ruby/ruby/pull/4558
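A hedged usage sketch, assuming the Ruby-level entry point is RubyVM::AbstractSyntaxTree (the "RubyVM::AST" in the message) and that the calling code lives in a readable source file:

```ruby
# Maps the caller's backtrace location back to the AST node at the call site;
# typically a method-call node, per the message above.
def report_call_site
  loc  = caller_locations(1, 1).first
  node = RubyVM::AbstractSyntaxTree.of(loc)
  p [node&.type, node&.first_lineno, node&.first_column]
end

report_call_site
```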
2021-06-18  node.h: Reduce struct size to fit with Ruby object size (five VALUEs)  (Yusuke Endoh)
by merging `rb_ast_body_t#line_count` and `#script_lines`. Fortunately `line_count == RARRAY_LEN(script_lines)` was always satisfied. When script_lines is saved, it has an array of lines, and when not saved, it has a Fixnum that represents the old line_count.
Notes: Merged: https://github.com/ruby/ruby/pull/4581
2021-06-18  ast.rb: RubyVM::AST.parse and .of accepts `save_script_lines: true`  (Yusuke Endoh)
This option makes the parser keep the original source as an array of the original code lines. This feature exploits the mechanism of `SCRIPT_LINES__` but records only the specified code that is passed to RubyVM::AST.of or .parse, instead of recording all parsed program texts.
Notes: Merged: https://github.com/ruby/ruby/pull/4581
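A hedged sketch; `save_script_lines:` is the keyword as named in this commit (later releases renamed it), and reading the lines back through a `script_lines` accessor is an assumption made here for illustration:

```ruby
src  = "def add(a, b)\n  a + b\nend\n"
node = RubyVM::AbstractSyntaxTree.parse(src, save_script_lines: true)
# The recorded source is assumed to be exposed as an array of lines.
p node.script_lines
```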
2021-06-17  Adjust styles [ci skip]  (Nobuyoshi Nakada)
* --braces-after-func-def-line
* --dont-cuddle-else
* --procnames-start-lines
* --space-after-for
* --space-after-if
* --space-after-while
2021-06-03  Warn more duplicate literal hash keys  (Nobuyoshi Nakada)
Following non-special_const literals:
* T_REGEXP
Notes: Merged: https://github.com/ruby/ruby/pull/4548
2021-06-03  Warn more duplicate literal hash keys  (Nobuyoshi Nakada)
Following non-special_const literals:
* T_BIGNUM
* T_FLOAT (non-flonum)
* T_RATIONAL
* T_COMPLEX
Notes: Merged: https://github.com/ruby/ruby/pull/4548
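An illustrative snippet of the kind of hash literal these two changes start warning about; the exact warning text may vary:

```ruby
# Duplicated Rational and Regexp literal keys are now expected to be warned
# about at compile time, like duplicated String or Integer keys already were.
h = { 1r => :a, 1r => :b }
r = { /x/i => :a, /x/i => :b }
p h.size  #=> 1 (the later value wins)
p r.size  #=> 1
```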
2021-06-02  Refactor rb_vm_insn_addr2insn calls  (Takashi Kokubun)
There have been way too many ifdefs.
2021-05-28  compile.c: Emit send for === calls in when statements  (Alan Wu)
The checkmatch instruction with VM_CHECKMATCH_TYPE_CASE calls === without a call cache. Emit a send instruction to make the call instead. It includes a call cache. The call cache improves throughput of using when statements to check the class of a given object. This is useful for, say, JSON serialization.

Use of a regular send instead of checkmatch also avoids taking the VM lock every time, which is good for multi-ractor workloads.

Calculating -------------------------------------
                     master        post
vm_case_classes     11.013M     16.172M i/s -      6.000M times in 0.544795s 0.371009s
vm_case_lit           2.296       2.263 i/s -       1.000 times in 0.435606s 0.441826s
vm_case             74.098M     64.338M i/s -      6.000M times in 0.080974s 0.093257s

Comparison:
             vm_case_classes
                post:  16172114.4 i/s
              master:  11013316.9 i/s - 1.47x  slower

                 vm_case_lit
              master:         2.3 i/s
                post:         2.3 i/s - 1.01x  slower

                     vm_case
              master:  74097858.6 i/s
                post:  64338333.9 i/s - 1.15x  slower

The vm_case benchmark is a bit slower post patch, possibly due to the larger instruction sequence. The benchmark dispatches using opt_case_dispatch, so it was not running checkmatch and does not make the === call post patch.

Notes: Merged: https://github.com/ruby/ruby/pull/4468
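Roughly the kind of case/when the vm_case_classes benchmark exercises (names below are illustrative); with this change, each `when` class is reached through an ordinary send of #=== backed by a call cache:

```ruby
def tag(obj)
  case obj
  when Integer then :int
  when String  then :str
  when Symbol  then :sym
  else              :other
  end
end

p tag(1)      #=> :int
p tag("foo")  #=> :str
```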
2021-05-28  Make range literal peephole optimization target "newrange"  (Alan Wu)
The optimization looks for "checkmatch", but it could be applied to anything that has "newrange". Making the optimization target more ranges might only be fair play when all ranges are frozen. So I'm putting a reference to the ticket that froze all ranges.
[Feature #15504]
Notes: Merged: https://github.com/ruby/ruby/pull/4468
2021-05-21  Build CDHASH properly when loading iseq from binary  (Alan Wu)
Before this change, CDHASH operands were built as plain hashes when loaded from binary. Without setting up the hash with the correct st_table type, the hash can sometimes be an ar_table. When the hash is an ar_table, lookups can call the `eql?` method on keys of the hash, which makes the `opt_case_dispatch` instruction not "leaf" as it implicitly declares. The following script trips the stack canary for checking the leaf attribute for `opt_case_dispatch` on VM_CHECK_MODE > 0 (enabled by default with RUBY_DEBUG).

    rb_vm_iseq = RubyVM::InstructionSequence

    iseq = rb_vm_iseq.compile(<<-EOF)
      case Class.new(String).new("foo")
      when "foo"
        42
      end
    EOF

    puts rb_vm_iseq.load_from_binary(iseq.to_binary).eval

This commit changes the binary loading logic to build CDHASH with the right st_table type. The dumping logic and the dump format stay the same.
Notes: Merged: https://github.com/ruby/ruby/pull/4511  Merged-By: XrXr
2021-05-21  simple rescue+while+break should not use `throw`  (Koichi Sasada)
609de71f043e8ba34f22b9993e444e2e5bb05709 fixes the issue by using the `throw` insn if `ensure` is used. However, that patch introduced an additional `throw` even when it is not needed. This patch solves the issue.
This issue was pointed out by @mame.
Notes: Merged: https://github.com/ruby/ruby/pull/4507
2021-05-20  compile.c: stop the jump-jump optimization if the second has any event  (Yusuke Endoh)
Fixes [Bug #17868]
2021-05-12  Avoid improper optimization of case statements mixed integer/rational/complex  (Jeremy Evans)
Fixes [Bug #17857]
Notes: Merged: https://github.com/ruby/ruby/pull/4496
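A hedged illustration of the shape of case statement affected: numeric literals of different types sharing one `when` table. The results shown are plain Ruby #=== semantics, which the fix preserves by avoiding the improper optimization:

```ruby
def kind(x)
  case x
  when 1  then :integer
  when 1r then :rational
  when 1i then :complex
  else         :other
  end
end

p kind(1)   #=> :integer
p kind(1i)  #=> :complex (neither 1 == 1i nor 1r == 1i holds)
```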
2021-05-12  cdhash_cmp: should use ||  (卜部昌平)
cf: https://github.com/ruby/ruby/pull/4469#discussion_r628386707
2021-05-12  cdhash_cmp: recursively apply  (卜部昌平)
For instance a rational's numerator can be a bignum. Comparison using C's == can be insufficient.
Notes: Merged: https://github.com/ruby/ruby/pull/4469
2021-05-12  cdhash_cmp: can also take complex  (卜部昌平)
There are complex literals `123i`, which can also be a case condition.
Notes: Merged: https://github.com/ruby/ruby/pull/4469
2021-05-12  cdhash_cmp: rational literals with fractions  (卜部昌平)
Nobu kindly pointed out that rational literals can have fractions. Notes: Merged: https://github.com/ruby/ruby/pull/4469
2021-05-12  cdhash_cmp: can take rational literals  (卜部昌平)
Rational literals are those integers suffixed with `r`. They tend to be a part of more complex expressions like `123/456r`, but in theory they can live alone. When such "bare" rational literals are passed to a case-when branch, we have to take care of them.
Fixes [Bug #17854]
Notes: Merged: https://github.com/ruby/ruby/pull/4469
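For illustration, a "bare" rational literal of the kind described above, used directly as a when-condition:

```ruby
def half?(x)
  case x
  when 1/2r then true   # 1/2r is the Rational (1/2)
  else false
  end
end

p half?(Rational(1, 2))  #=> true
p half?(1)               #=> false
```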
2021-05-11  Revert "Filling cache values on cvar write"  (Aaron Patterson)
This reverts commit 08de37f9fa3469365e6b5c964689ae2bae0eb9f3.
This reverts commit e8ae922b62adb00a80d3d4c49f7d7b0e6026eaba.
2021-05-11  Filling cache values on cvar write  (eileencodes)
Instead of on read. Once it's in the inline cache we never have to make one again. We want to eventually put the value into the cache, and the best opportunity to do that is when you write the value.
Notes: Merged: https://github.com/ruby/ruby/pull/4340
2021-05-11  Add a cache for class variables  (eileencodes)
This change implements a cache for class variables. Previously there was no cache for cvars. Cvar access is slow due to needing to travel all the way up the ancestor tree before returning the cvar value. The deeper the ancestor tree, the slower cvar access will be.

The benefits of the cache are more visible with a higher number of included modules due to the way Ruby looks up class variables. The benchmark here includes 26 modules and shows that, with the cache, this branch is 6.5x faster when accessing class variables.

```
compare-ruby: ruby 3.1.0dev (2021-03-15T06:22:34Z master 9e5105ca45) [x86_64-darwin19]
built-ruby: ruby 3.1.0dev (2021-03-15T12:12:44Z add-cache-for-clas.. c6be0093ae) [x86_64-darwin19]

|         |compare-ruby|built-ruby|
|:--------|-----------:|---------:|
|vm_cvar  |      5.681M|   36.980M|
|         |           -|     6.51x|
```

Benchmark.ips calling `ActiveRecord::Base.logger` from within a Rails application. ActiveRecord::Base.logger has 71 ancestors. The more ancestors a tree has, the clearer the speed increase. I.e. if Base had only one ancestor we'd see no improvement. This benchmark is run on a vanilla Rails application.

Benchmark code:

```ruby
require "benchmark/ips"
require_relative "config/environment"

Benchmark.ips do |x|
  x.report "logger" do
    ActiveRecord::Base.logger
  end
end
```

Ruby 3.0 master / Rails 6.1:

```
Warming up --------------------------------------
              logger   155.251k i/100ms
Calculating -------------------------------------
```

Ruby 3.0 with cvar cache / Rails 6.1:

```
Warming up --------------------------------------
              logger     1.546M i/100ms
Calculating -------------------------------------
              logger     14.857M (± 4.8%) i/s -     74.198M in   5.006202s
```

Lastly we ran a benchmark to demonstrate the difference between master and our cache when the number of modules increases. This benchmark measures 1 ancestor, 30 ancestors, and 100 ancestors.

Ruby 3.0 master:

```
Warming up --------------------------------------
            1 module     1.231M i/100ms
          30 modules   432.020k i/100ms
         100 modules   145.399k i/100ms
Calculating -------------------------------------
            1 module     12.210M (± 2.1%) i/s -     61.553M in   5.043400s
          30 modules      4.354M (± 2.7%) i/s -     22.033M in   5.063839s
         100 modules      1.434M (± 2.9%) i/s -      7.270M in   5.072531s

Comparison:
            1 module: 12209958.3 i/s
          30 modules:  4354217.8 i/s - 2.80x  (± 0.00) slower
         100 modules:  1434447.3 i/s - 8.51x  (± 0.00) slower
```

Ruby 3.0 with cvar cache:

```
Warming up --------------------------------------
            1 module     1.641M i/100ms
          30 modules     1.655M i/100ms
         100 modules     1.620M i/100ms
Calculating -------------------------------------
            1 module     16.279M (± 3.8%) i/s -     82.038M in   5.046923s
          30 modules     15.891M (± 3.9%) i/s -     79.459M in   5.007958s
         100 modules     16.087M (± 3.6%) i/s -     81.005M in   5.041931s

Comparison:
            1 module: 16279458.0 i/s
         100 modules: 16087484.6 i/s - same-ish: difference falls within error
          30 modules: 15891406.2 i/s - same-ish: difference falls within error
```

Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org>
Notes: Merged: https://github.com/ruby/ruby/pull/4340
2021-05-07  compile.c: Pass node instead of nd_line(node) to ADD_INSN* functions  (Yusuke Endoh)
... then, new_insn_core extracts nd_line(node). Also, if a macro "EXPERIMENTAL_ISEQ_NODE_ID" is defined, this changeset keeps nd_node_id(node) for each instruction. This is intended for TypeProf to identify which AST::Node corresponds to each instruction.
This patch was originally authored by @yui-knk to show in which column a NoMethodError occurred. https://github.com/ruby/ruby/compare/master...yui-knk:feature/node_id
Co-Authored-By: Yuichiro Kaneko <yui-knk@ruby-lang.org>
Notes: Merged: https://github.com/ruby/ruby/pull/4470
2021-04-22  fix raise in exception with jump  (Koichi Sasada)
add_ensure_iseq() adds the ensure block to the end of a jump such as next/redo/return. However, if a rescue clause is in the body, that rescue catches the exception raised in the ensure clause.

    iter do
      next
    rescue R
    ensure
      raise
    end

In this case, R should not be executed, but it was executed without this patch.
Fixes [Bug #13930]
Fixes [Bug #16618]
A part of the tests was written by @jeremyevans: https://github.com/ruby/ruby/pull/4291
Notes: Merged: https://github.com/ruby/ruby/pull/4399
2021-04-21  Evaluate multiple assignment left hand side before right hand side  (Jeremy Evans)
In regular assignment, Ruby evaluates the left hand side before the right hand side. For example:

```ruby
foo[0] = bar
```

Calls `foo`, then `bar`, then `[]=` on the result of `foo`.

Previously, multiple assignment didn't work this way. If you did:

```ruby
abc.def, foo[0] = bar, baz
```

Ruby would previously call `bar`, then `baz`, then `abc`, then `def=` on the result of `abc`, then `foo`, then `[]=` on the result of `foo`.

This change makes multiple assignment similar to single assignment, changing the evaluation order of the above multiple assignment code to calling `abc`, then `foo`, then `bar`, then `baz`, then `def=` on the result of `abc`, then `[]=` on the result of `foo`.

Implementing this is challenging with the stack-based virtual machine. We need to keep track of all of the left hand side attribute setter receivers and setter arguments, and then keep track of the stack level while handling the assignment processing, so we can issue the appropriate topn instructions to get the receiver. Here's an example of how the multiple assignment is executed, showing the stack and instructions:

```
self                                      # putself
abc                                       # send
abc, self                                 # putself
abc, foo                                  # send
abc, foo, 0                               # putobject 0
abc, foo, 0, [bar, baz]                   # evaluate RHS
abc, foo, 0, [bar, baz], baz, bar         # expandarray
abc, foo, 0, [bar, baz], baz, bar, abc    # topn 5
abc, foo, 0, [bar, baz], baz, abc, bar    # swap
abc, foo, 0, [bar, baz], baz, def=        # send
abc, foo, 0, [bar, baz], baz              # pop
abc, foo, 0, [bar, baz], baz, foo         # topn 3
abc, foo, 0, [bar, baz], baz, foo, 0      # topn 3
abc, foo, 0, [bar, baz], baz, foo, 0, baz # topn 2
abc, foo, 0, [bar, baz], baz, []=         # send
abc, foo, 0, [bar, baz], baz              # pop
abc, foo, 0, [bar, baz]                   # pop
[bar, baz], foo, 0, [bar, baz]            # setn 3
[bar, baz], foo, 0                        # pop
[bar, baz], foo                           # pop
[bar, baz]                                # pop
```

As multiple assignment must deal with splats, post args, and any level of nesting, it gets quite a bit more complex than this in non-trivial cases. To handle this, struct masgn_state is added to keep track of the overall state of the mass assignment, which stores a linked list of struct masgn_attrasgn, one for each assigned attribute.

This adds a new optimization that replaces a topn 1/pop instruction combination with a single swap instruction for multiple assignment to non-aref attributes.

This new approach isn't compatible with one of the optimizations previously used, in the case where the multiple assignment return value was not needed, there was no lhs splat, and one of the left hand side used an attribute setter. This removes that optimization. Removing the optimization allowed for removing the POP_ELEMENT and adjust_stack functions.

This adds a benchmark to measure how much slower multiple assignment is with the correct evaluation order. This benchmark shows:

* 4-9% decrease for attribute sets
* 14-23% decrease for array member sets
* Basically same speed for local variable sets

Importantly, it shows no significant difference between the popped (where return value of the multiple assignment is not needed) and !popped (where return value of the multiple assignment is needed) cases for attribute and array member sets. This indicates the previous optimization, which was dropped in the evaluation order fix and only affected the popped case, is not important to performance.

Fixes [Bug #4443]
Notes: Merged: https://github.com/ruby/ruby/pull/4390  Merged-By: jeremyevans <code@jeremyevans.net>
2021-03-29  Make defined? cache the results of method calls  (Jeremy Evans)
Previously, defined? could result in many more method calls than the code it was checking. `defined? a.b.c.d.e.f` generated 15 calls, with `a` called 5 times, `b` called 4 times, etc. This was due to the fact that defined? works in a recursive manner, but it previously did not cache results. So for `defined? a.b.c.d.e.f`, the logic was similar to

```ruby
return nil unless defined? a
return nil unless defined? a.b
return nil unless defined? a.b.c
return nil unless defined? a.b.c.d
return nil unless defined? a.b.c.d.e
return nil unless defined? a.b.c.d.e.f
"method"
```

With this change, the logic is similar to the following, without the creation of a local variable:

```ruby
return nil unless defined? a
_ = a
return nil unless defined? _.b
_ = _.b
return nil unless defined? _.c
_ = _.c
return nil unless defined? _.d
_ = _.d
return nil unless defined? _.e
_ = _.e
return nil unless defined? _.f
"method"
```

In addition to eliminating redundant method calls for defined statements, this greatly simplifies the instruction sequences by eliminating duplication. Previously:

```
0000 putnil                                                           ( 1)[Li]
0001 putself
0002 defined                        func, :a, false
0006 branchunless                   73
0008 putself
0009 opt_send_without_block         <calldata!mid:a, argc:0, FCALL|VCALL|ARGS_SIMPLE>
0011 defined                        method, :b, false
0015 branchunless                   73
0017 putself
0018 opt_send_without_block         <calldata!mid:a, argc:0, FCALL|VCALL|ARGS_SIMPLE>
0020 opt_send_without_block         <calldata!mid:b, argc:0, ARGS_SIMPLE>
0022 defined                        method, :c, false
0026 branchunless                   73
0028 putself
0029 opt_send_without_block         <calldata!mid:a, argc:0, FCALL|VCALL|ARGS_SIMPLE>
0031 opt_send_without_block         <calldata!mid:b, argc:0, ARGS_SIMPLE>
0033 opt_send_without_block         <calldata!mid:c, argc:0, ARGS_SIMPLE>
0035 defined                        method, :d, false
0039 branchunless                   73
0041 putself
0042 opt_send_without_block         <calldata!mid:a, argc:0, FCALL|VCALL|ARGS_SIMPLE>
0044 opt_send_without_block         <calldata!mid:b, argc:0, ARGS_SIMPLE>
0046 opt_send_without_block         <calldata!mid:c, argc:0, ARGS_SIMPLE>
0048 opt_send_without_block         <calldata!mid:d, argc:0, ARGS_SIMPLE>
0050 defined                        method, :e, false
0054 branchunless                   73
0056 putself
0057 opt_send_without_block         <calldata!mid:a, argc:0, FCALL|VCALL|ARGS_SIMPLE>
0059 opt_send_without_block         <calldata!mid:b, argc:0, ARGS_SIMPLE>
0061 opt_send_without_block         <calldata!mid:c, argc:0, ARGS_SIMPLE>
0063 opt_send_without_block         <calldata!mid:d, argc:0, ARGS_SIMPLE>
0065 opt_send_without_block         <calldata!mid:e, argc:0, ARGS_SIMPLE>
0067 defined                        method, :f, true
0071 swap
0072 pop
0073 leave
```

After change:

```
0000 putnil                                                           ( 1)[Li]
0001 putself
0002 dup
0003 defined                        func, :a, false
0007 branchunless                   52
0009 opt_send_without_block         <calldata!mid:a, argc:0, FCALL|VCALL|ARGS_SIMPLE>
0011 dup
0012 defined                        method, :b, false
0016 branchunless                   52
0018 opt_send_without_block         <calldata!mid:b, argc:0, ARGS_SIMPLE>
0020 dup
0021 defined                        method, :c, false
0025 branchunless                   52
0027 opt_send_without_block         <calldata!mid:c, argc:0, ARGS_SIMPLE>
0029 dup
0030 defined                        method, :d, false
0034 branchunless                   52
0036 opt_send_without_block         <calldata!mid:d, argc:0, ARGS_SIMPLE>
0038 dup
0039 defined                        method, :e, false
0043 branchunless                   52
0045 opt_send_without_block         <calldata!mid:e, argc:0, ARGS_SIMPLE>
0047 defined                        method, :f, true
0051 swap
0052 pop
0053 leave
```

This fixes issues where, for pathologically small examples, Ruby would generate huge instruction sequences. Unfortunately, implementing this support is kind of a hack. This adds another parameter to compile_call for whether we should assume the receiver is already present on the stack, and has defined? set that parameter for the specific case where it is compiling a method call whose receiver is also a method call. defined_expr0 also takes an additional parameter for whether it should leave the results of the method call on the stack. If that argument is true, in the case where the method isn't defined, we jump to the pop before the leave, so the extra result is not left on the stack. This requires space for an additional label, so lfinish now needs to be able to hold 3 labels.
Fixes [Bug #17649]
Fixes [Bug #13708]
Notes: Merged: https://github.com/ruby/ruby/pull/4213
2021-03-21  Pattern matching pin operator against expression [Feature #17411]  (Kazuki Tsujimoto)
This commit is based on the patch by @nobu.
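A minimal sketch of the form this adds: an arbitrary expression can now be pinned when wrapped in parentheses.

```ruby
limit = 5

case 10
in ^(limit * 2) then :matched   # pins the value of the expression, 10
else :no_match
end
#=> :matched
```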
2021-03-17  Store strings for `defined` in the iseqs  (Aaron Patterson)
We can know the string used for "defined" calls at compile time, then store the string in the instruction sequences.
Notes: Merged: https://github.com/ruby/ruby/pull/4279
2021-03-10  Simplify ibf_dump_object_symbol by delegating to ibf_dump_object_string  (Jean Boussier)
Notes: Merged: https://github.com/ruby/ruby/pull/4119
2021-03-10  Pre-freeze ISeq names to avoid useless duplication  (Jean Boussier)
Notes: Merged: https://github.com/ruby/ruby/pull/4119
2021-03-10  Use rb_enc_interned_str in ibf_load_object_string  (Jean Boussier)
Notes: Merged: https://github.com/ruby/ruby/pull/4119
2021-03-10  Specialize ibf_load_object_symbol and ibf_dump_object_symbol  (Jean Boussier)
Notes: Merged: https://github.com/ruby/ruby/pull/4119
2021-02-16  Eliminate useless catch tables and nops from lambdas  (Aaron Patterson)
Before this commit:

```
$ ruby --dump=insn -e '1.times { |x| puts x }'
== disasm: #<ISeq:<main>@-e:1 (1,0)-(1,22)> (catch: FALSE)
== catch table
| catch type: break  st: 0000 ed: 0004 sp: 0000 cont: 0004
| == disasm: #<ISeq:block in <main>@-e:1 (1,8)-(1,22)> (catch: FALSE)
| == catch table
| | catch type: redo   st: 0001 ed: 0006 sp: 0000 cont: 0001
| | catch type: next   st: 0001 ed: 0006 sp: 0000 cont: 0006
| |------------------------------------------------------------------------
| local table (size: 1, argc: 1 [opts: 0, rest: -1, post: 0, block: -1, kw: -1@-1, kwrest: -1])
| [ 1] x@0<Arg>
| 0000 nop                                                              ( 1)[Bc]
| 0001 putself                                                          [Li]
| 0002 getlocal_WC_0                  x@0
| 0004 opt_send_without_block         <calldata!mid:puts, argc:1, FCALL|ARGS_SIMPLE>
| 0006 leave                          [Br]
|------------------------------------------------------------------------
0000 putobject_INT2FIX_1_                                             ( 1)[Li]
0001 send                           <calldata!mid:times, argc:0>, block in <main>
0004 leave
```

After this commit:

```
> ruby --dump=insn -e '1.times { |x| puts x }'
== disasm: #<ISeq:<main>@-e:1 (1,0)-(1,22)> (catch: FALSE)
0000 putobject_INT2FIX_1_                                             ( 1)[Li]
0001 send                           <calldata!mid:times, argc:0>, block in <main>
0004 leave
== disasm: #<ISeq:block in <main>@-e:1 (1,8)-(1,22)> (catch: FALSE)
local table (size: 1, argc: 1 [opts: 0, rest: -1, post: 0, block: -1, kw: -1@-1, kwrest: -1])
[ 1] x@0<Arg>
0000 putself                                                          ( 1)[LiBc]
0001 getlocal_WC_0                  x@0
0003 opt_send_without_block         <calldata!mid:puts, argc:1, FCALL|ARGS_SIMPLE>
0005 leave
```

Fixes [ruby-core:102418] [Feature #17613]
Co-Authored-By: Alan Wu <XrXr@users.noreply.github.com>
Notes: Merged: https://github.com/ruby/ruby/pull/4125
2021-01-19  Mark pattern labels as unremoveable  (Vladimir Dementyev)
Peephole optimization doesn't play well with the find pattern, at least. The only case in which pattern matching could have unreachable patterns is when we have an lasgn/dasgn node, which shouldn't happen in real life.
Fixes https://bugs.ruby-lang.org/issues/17534
2021-01-14  Fix WB for callinfo  (Aaron Patterson)
The WB for callinfo needs to be executed *after* the reference is written. Otherwise we get a WB miss.
2021-01-13  Guard callinfo  (Aaron Patterson)
Callinfo was being written into an array and the GC would not see the reference on the stack. `new_insn_send` creates a new callinfo object, then it calls `new_insn_core`. `new_insn_core` allocates a new INSN linked list item, which can end up calling `xmalloc`, which will trigger a GC: https://github.com/ruby/ruby/blob/70cd351c7c71c48ee18d7c01e851a89614086f8f/compile.c#L968-L969
Since the callinfo object isn't on the stack, the GC won't see it, and it can get collected. This patch just refactors `new_insn_send` to keep the object on the stack.
Co-authored-by: John Hawthorn <john@hawthorn.email>
Notes: Merged: https://github.com/ruby/ruby/pull/4066
2021-01-13  only add the trailing nop if the catch table is not break / next / redo  (Aaron Patterson)
We don't need nop padding when the catch tables are only for break / next / redo, so let's avoid them. This eliminates nop padding in many lambdas.
Co-authored-by: Alan Wu <XrXr@users.noreply.github.com>
Notes: Merged: https://github.com/ruby/ruby/pull/4055
2021-01-05  enable constant cache on ractors  (Koichi Sasada)
The constant cache `IC` is accessed in a non-atomic manner and there are thread-safety issues, so Ruby 3.0 disables the const cache on non-main Ractors. This patch enables it by introducing `imemo_constcache`, which is allocated on every re-fill of the const cache, like `imemo_callcache`. [Bug #17510]
Now `IC` has only one entry, `IC::entry`, and it points to an `iseq_inline_constant_cache_entry` managed as a T_IMEMO object. `IC` is an atomic data structure, so `rb_mjit_before_vm_ic_update()` and `rb_mjit_after_vm_ic_update()` are not needed.
Notes: Merged: https://github.com/ruby/ruby/pull/4022
2021-01-01  Hoisted out compile_builtin_arg to refine messages  (Nobuyoshi Nakada)
2020-12-31  Access to reserved word parameter like as `__builtin.arg!(:if)`  (Nobuyoshi Nakada)
Notes: Merged: https://github.com/ruby/ruby/pull/4015