Age | Commit message (Collapse) | Author |
|
for opt_* insns.
opt_eq handles rb_obj_equal inside opt_eq, and all other cfunc is
handled by opt_send_without_block. Therefore we can't decide which insn
should be generated by checking whether it's cfunc cc or not.
```
$ benchmark-driver -v --rbenv 'before --jit;after --jit' benchmark/mjit_opt_cc_insns.yml --repeat-count=4
before --jit: ruby 2.8.0dev (2020-06-26T05:21:43Z master 9dbc2294a6) +JIT [x86_64-linux]
after --jit: ruby 2.8.0dev (2020-06-26T06:30:18Z master 75cece1b0b) +JIT [x86_64-linux]
last_commit=Decide JIT-ed insn based on cached cfunc
Calculating -------------------------------------
before --jit after --jit
mjit_nil?(1) 73.878M 74.021M i/s - 40.000M times in 0.541432s 0.540391s
mjit_not(1) 72.635M 74.601M i/s - 40.000M times in 0.550702s 0.536187s
mjit_eq(1, nil) 7.331M 7.445M i/s - 8.000M times in 1.091211s 1.074596s
mjit_eq(nil, 1) 49.450M 64.711M i/s - 8.000M times in 0.161781s 0.123627s
Comparison:
mjit_nil?(1)
after --jit: 74020528.4 i/s
before --jit: 73878185.9 i/s - 1.00x slower
mjit_not(1)
after --jit: 74600882.0 i/s
before --jit: 72634507.6 i/s - 1.03x slower
mjit_eq(1, nil)
after --jit: 7444657.4 i/s
before --jit: 7331304.3 i/s - 1.02x slower
mjit_eq(nil, 1)
after --jit: 64710790.6 i/s
before --jit: 49449507.4 i/s - 1.31x slower
```
|
|
* Verify builtin inline annotation with VM_CHECK_MODE
* Remove static to fix the link issue on MJIT
Notes:
Merged-By: k0kubun <takashikkbn@gmail.com>
|
|
|
|
|
|
This reverts commit 19cabe8b09d92d033c244f32ff622b8e513375f1,
which didn't support tool/lib/iseq_loader_checker.rb.
|
|
|
|
This callcache is on stack, must not be GCed. However its contents are
copied from other materials, which can be an ordinal object. Should
set a flag to make sure it is properly skipped by the GC.
|
|
When on USE_EMBED_CI, cd is stored statically. Previous use could cache
stale cd->cc, which could have already been GCed. Need flush them.
Notes:
Merged: https://github.com/ruby/ruby/pull/3179
|
|
Entires not GC-able must be considered to be volatile. Not eligible for
later use.
Notes:
Merged: https://github.com/ruby/ruby/pull/3179
|
|
It was a wrong idea to assume CIs are always embedded.
Notes:
Merged: https://github.com/ruby/ruby/pull/3179
|
|
Ko1 prefers variables be assgined, instead of bare literals in function
arguments.
Notes:
Merged: https://github.com/ruby/ruby/pull/3179
|
|
This further reduces the generated binary of vm_call_method from 566
bytes to 545 bytes on my machine, according to nm(1).
Notes:
Merged: https://github.com/ruby/ruby/pull/3179
|
|
This changeset reduces the generated binary of rb_vm_call0 from 281
bytes to 211 bytes on my machine. Should reduce GC pressure as well.
Notes:
Merged: https://github.com/ruby/ruby/pull/3179
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/3179
|
|
This changeset reduces the generated binary of vm_call_method from 600
bytes to 566 bytes on my machine, accroding to nm(1).
Notes:
Merged: https://github.com/ruby/ruby/pull/3179
|
|
This changeset reduces the generated binary of vm_call_method_each_type
from 2,442 bytes to 2,378 bytes on my machine, accroding to nm(1).
Notes:
Merged: https://github.com/ruby/ruby/pull/3179
|
|
This changeset reduces the generated binary of vm_call_method_each_type
from 2,522 bytes to 2,442 bytes on my machine, accroding to nm(1).
Notes:
Merged: https://github.com/ruby/ruby/pull/3179
|
|
This changeset reduces the generated binary of
vm_call_method_missing_body from 604 bytes to 532 bytes on my machine.
Should reduce GC pressure as well.
Notes:
Merged: https://github.com/ruby/ruby/pull/3179
|
|
This changeset reduces the generated binary of vm_call_symbol from 808
bytes to 798 bytes on my machine. Should reduce GC pressure as well.
Notes:
Merged: https://github.com/ruby/ruby/pull/3179
|
|
This changeset reduces the generated binary of vm_call_alias from 188
bytes to 149 bytes on my machine, accroding to nm(1).
Notes:
Merged: https://github.com/ruby/ruby/pull/3179
|
|
This changeset reduces the generated binary of rb_eql_opt from 86 bytes to
20 bytes on my machine, according to nm(1).
Notes:
Merged: https://github.com/ruby/ruby/pull/3179
|
|
Now that vm_empty_cc is statically allocated outside of the object
space. It shall not be GCed. Here, because vm_search_cc can return
that. Must not be blindly passed to RB_OBJ_WRITE, unless assertions
fail on RGENGC_CHECK_MODE, like this:
-- C level backtrace information
-------------------------------------------
ruby(rb_print_backtrace+0x19) [0x5555557fd579] vm_dump.c:757
ruby(rb_vm_bugreport+0x151) [0x5555557fd6f1] vm_dump.c:955
ruby(rb_bug+0x1d6) [0x5555558d6396] error.c:660
ruby(check_rvalue_consistency_force+0x707) [0x5555555adb97] gc.c:1289
ruby(check_rvalue_consistency+0x1a) [0x555555598a0a] gc.c:1305
ruby(RVALUE_OLD_P+0x15) [0x5555555975d5] gc.c:1382
ruby(rb_gc_writebarrier+0x9f) [0x55555559753f] gc.c:6882
ruby(rb_obj_written+0x3a) [0x5555557a025a] include/ruby/internal/rgengc.h:180
ruby(rb_obj_write+0x41) [0x5555557a1a81] include/ruby/internal/rgengc.h:195
ruby(rb_vm_search_method_slowpath+0x5a) [0x5555557a125a] vm_insnhelper.c:1603
ruby(vm_search_method_fastpath+0x197) [0x5555557d8027] vm_insnhelper.c:1638
ruby(vm_search_method+0xea) [0x5555557d7d2a] vm_insnhelper.c:1650
ruby(vm_search_method_wrap+0x29) [0x5555557dbaf9] vm_insnhelper.c:4091
ruby(vm_sendish+0xa9) [0x5555557dba39] vm_insnhelper.c:4143
ruby(vm_exec_core+0xe357) [0x5555557b0757] insns.def:801
ruby(rb_vm_exec+0x12c) [0x5555557d17cc] vm.c:1942
ruby(invoke_block+0xea) [0x5555557f42fa] vm.c:1058
ruby(invoke_iseq_block_from_c+0x16e) [0x5555557f3eae] vm.c:1130
ruby(invoke_block_from_c_bh) vm.c:1148
ruby(vm_yield+0x71) [0x5555557f3c41] vm.c:1193
ruby(rb_yield_0+0x25) [0x5555557ca615] vm_eval.c:1141
ruby(rb_yield_1+0x27) [0x5555557ca5c7] vm_eval.c:1147
ruby(rb_yield+0x34) [0x5555557ca654] vm_eval.c:1157
ruby(rb_ary_collect+0xb0) [0x555555828320] array.c:3186
ruby(call_cfunc_0+0x29) [0x5555557f0f39] vm_insnhelper.c:2385
ruby(vm_call_cfunc_with_frame+0x278) [0x5555557eca98] vm_insnhelper.c:2553
ruby(vm_sendish+0xd0) [0x5555557dba60] vm_insnhelper.c:4146
ruby(vm_exec_core+0xe0f8) [0x5555557b04f8] insns.def:782
ruby(rb_vm_exec+0x12c) [0x5555557d17cc] vm.c:1942
ruby(invoke_block+0xea) [0x5555557f42fa] vm.c:1058
ruby(invoke_iseq_block_from_c+0x16e) [0x5555557f3eae] vm.c:1130
ruby(invoke_block_from_c_bh) vm.c:1148
ruby(vm_yield+0x71) [0x5555557f3c41] vm.c:1193
ruby(rb_yield_0+0x25) [0x5555557ca615] vm_eval.c:1141
ruby(rb_yield_1+0x27) [0x5555557ca5c7] vm_eval.c:1147
ruby(rb_yield+0x34) [0x5555557ca654] vm_eval.c:1157
ruby(rb_ary_each+0xa5) [0x55555581c795] array.c:2242
ruby(call_cfunc_0+0x29) [0x5555557f0f39] vm_insnhelper.c:2385
ruby(vm_call_cfunc_with_frame+0x278) [0x5555557eca98] vm_insnhelper.c:2553
ruby(vm_sendish+0xd0) [0x5555557dba60] vm_insnhelper.c:4146
ruby(vm_exec_core+0xe0f8) [0x5555557b04f8] insns.def:782
ruby(rb_vm_exec+0x12c) [0x5555557d17cc] vm.c:1942
ruby(invoke_block+0xea) [0x5555557f42fa] vm.c:1058
ruby(invoke_iseq_block_from_c+0x16e) [0x5555557f3eae] vm.c:1130
ruby(invoke_block_from_c_bh) vm.c:1148
ruby(vm_yield+0x71) [0x5555557f3c41] vm.c:1193
ruby(rb_yield_0+0x25) [0x5555557ca615] vm_eval.c:1141
ruby(rb_yield_1+0x27) [0x5555557ca5c7] vm_eval.c:1147
ruby(rb_yield+0x34) [0x5555557ca654] vm_eval.c:1157
ruby(rb_ary_each+0xa5) [0x55555581c795] array.c:2242
ruby(call_cfunc_0+0x29) [0x5555557f0f39] vm_insnhelper.c:2385
ruby(vm_call_cfunc_with_frame+0x278) [0x5555557eca98] vm_insnhelper.c:2553
ruby(vm_sendish+0xd0) [0x5555557dba60] vm_insnhelper.c:4146
ruby(vm_exec_core+0xe0f8) [0x5555557b04f8] insns.def:782
ruby(rb_vm_exec+0x19f) [0x5555557d183f] vm.c:1951
ruby(rb_iseq_eval+0x30) [0x5555557d2530] vm.c:2190
ruby(load_iseq_eval+0xd6) [0x5555555fa7e6] load.c:592
ruby(require_internal+0x25e) [0x5555555f7f5e] load.c:1022
ruby(rb_require_string+0x27) [0x5555555f74e7] load.c:1094
ruby(rb_f_require_relative+0x5f) [0x5555555f758f] load.c:837
ruby(call_cfunc_1+0x30) [0x5555557f0f70] vm_insnhelper.c:2391
ruby(vm_call_cfunc_with_frame+0x278) [0x5555557eca98] vm_insnhelper.c:2553
ruby(vm_call_cfunc+0xad) [0x5555557e521d] vm_insnhelper.c:2574
ruby(vm_call_method_each_type+0xc7) [0x5555557e4af7] vm_insnhelper.c:3040
ruby(vm_call_method+0x19c) [0x5555557e45dc] vm_insnhelper.c:3144
ruby(vm_call_general+0x2d) [0x5555557c8c3d] vm_insnhelper.c:3176
ruby(vm_sendish+0xd0) [0x5555557dba60] vm_insnhelper.c:4146
ruby(vm_exec_core+0xe357) [0x5555557b0757] insns.def:801
ruby(rb_vm_exec+0x12c) [0x5555557d17cc] vm.c:1942
ruby(rb_iseq_eval_main+0x30) [0x5555557d2670] vm.c:2201
ruby(rb_ec_exec_node+0x16b) [0x55555557e39b] eval.c:296
ruby(ruby_run_node+0x72) [0x55555557e1f2] eval.c:354
ruby(main+0x78) [0x55555557a5d8] main.c:50
Notes:
Merged: https://github.com/ruby/ruby/pull/3179
|
|
This changeset reduces the generated binary of rb_equal_opt from 129 bytes
to 17 bytes on my machine, according to nm(1).
Notes:
Merged: https://github.com/ruby/ruby/pull/3179
|
|
This is such a hot path that it's worth eliminating a function call. Use
the static variable directly instead.
Notes:
Merged: https://github.com/ruby/ruby/pull/3179
|
|
For debug. Must not change generated binary unless VM_ASSERT is on.
Notes:
Merged: https://github.com/ruby/ruby/pull/3179
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/3180
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/3180
|
|
According to nobu recursion can be longer than my expectation. Limit
them here.
Notes:
Merged: https://github.com/ruby/ruby/pull/3152
|
|
VM stack could overflow here. The condition is when a symbol is passed
to a block-taking method via &variable, and that symbol has never been
used for actual method names (thus yielding that results in calling
method_missing), and the VM stack is full (no single word left). This
is a once-in-a-blue-moon event. Yet there is a very tiny room of stack
overflow. We need to check that.
Notes:
Merged: https://github.com/ruby/ruby/pull/3152
|
|
Was (harmless but) redundant.
Notes:
Merged: https://github.com/ruby/ruby/pull/3152
|
|
Just cosmetic change to improve readability.
Notes:
Merged: https://github.com/ruby/ruby/pull/3152
|
|
This commit changes the number of calls of MEMCPY from...
| send | &:sym
-------------------------|-------|-------
Symbol already interned | once | twice
Symbol not pinned yet | none | once
to:
| send | &:sym
-------------------------|-------|-------
Symbol already interned | once | none
Symbol not pinned yet | twice | once
So it sacrifices exceptional situation for normal path.
Notes:
Merged: https://github.com/ruby/ruby/pull/3152
|
|
Symbol#to_proc and Object#send are closely related each other. Why not
share their implementations. By doing so we can skip recursive call of
vm_exec(), which could benefit for speed.
Notes:
Merged: https://github.com/ruby/ruby/pull/3152
|
|
This changeset slightly speeds up on my machine.
Calculating -------------------------------------
before after
Optcarrot Lan_Master.nes 38.33488426546287 40.89825082589147 fps
40.91288557922081 41.48687465359386
40.96591995270991 41.98499064664184
41.20461943032173 43.67314690779162
42.38344888176518 44.02777536251875
43.43563728880915 44.88695892714136
43.88082889062643 45.11226186242523
Notes:
Merged: https://github.com/ruby/ruby/pull/3152
|
|
This makes it possible for vm_invoke_block to pass its passed arguments
verbatimly to calling functions. Because they are tail-called the
function calls can be strength-recuced into indirect jumps, which is a
huge win.
Notes:
Merged: https://github.com/ruby/ruby/pull/3152
|
|
Use recursion for better readability.
Notes:
Merged: https://github.com/ruby/ruby/pull/3152
|
|
Moved block handler -> captured block conversion from vm_invokeblock to
each vm_invoke_*_block functions.
Notes:
Merged: https://github.com/ruby/ruby/pull/3152
|
|
Protected methods are restricted to be called according to the
class/module in where it is defined, not the actual receiver's
class. [Bug #16931]
|
|
These two function were almost identical, except in case of
T_STRING/T_FLOAT. Why not merge them into one, and let the difference be
handled in normal method calls (slowpath). This does not improve
runtime performance for me, but at least reduces for instance rb_eql_opt
from 653 bytes to 86 bytes on my machine, according to nm(1).
Notes:
Merged: https://github.com/ruby/ruby/pull/3169
|
|
to reduce code size and improve locality of hot code.
|
|
If a module has an origin, and that module is included in another
module or class, previously the iclass created for the module had
an origin pointer to the module's origin instead of the iclass's
origin.
Setting the origin pointer correctly requires using a stack, since
the origin iclass is not created until after the iclass itself.
Use a hidden ruby array to implement that stack.
Correctly assigning the origin pointers in the iclass caused a
use-after-free in GC. If a module with an origin is included
in a class, the iclass shares a method table with the module
and the iclass origin shares a method table with module origin.
Mark iclass origin with a flag that notes that even though the
iclass is an origin, it shares a method table, so the method table
should not be garbage collected. The shared method table will be
garbage collected when the module origin is garbage collected.
I've tested that this does not introduce a memory leak.
This change caused a VM assertion failure, which was traced to callable
method entries using the incorrect defined_class. Update
rb_vm_check_redefinition_opt_method and find_defined_class_by_owner
to treat iclass origins different than class origins to avoid this
issue.
This also includes a fix for Module#included_modules to skip
iclasses with origins.
Fixes [Bug #16736]
Notes:
Merged: https://github.com/ruby/ruby/pull/3136
|
|
rb_callable_method_entry() creates ccs entry in cc_tbl, but this
code overwrite by insert newly created ccs and overwrote ccs never
freed.
[Bug #16900]
Notes:
Merged: https://github.com/ruby/ruby/pull/3128
|
|
Function pointers are not void*. See also
115fec062ccf7c6d72c8d5f64b7a5d84c9fb2dd8
ce4ea956d24eab5089a143bba38126f2b11b55b6
8427fca49bd85205f5a8766292dd893f003c0e48
|
|
To fix build failures.
Notes:
Merged: https://github.com/ruby/ruby/pull/3079
|
|
This shall fix compile errors.
Notes:
Merged: https://github.com/ruby/ruby/pull/3079
|
|
|
|
The same bug as 8355a99883 existed in attr_reader too.
|
|
We started to use fastpath on invokesuper when a method is not refinements
since 5c27681813, but we shouldn't have used fastpath for attr_writer either.
`cc->aux_.attr_index` is for an actual receiver class, while we store
its superclass in `cc->klass` and therefore there's no way to properly
invalidate attr_writer's inline cache when it's called by super.
[Bug #16785]
I suspect the same bug also exists in attr_reader. I'll address that in
another commit.
|
|
when there's no need to call CALLER_SETUP_ARG and CALLER_REMOVE_EMPTY_KW_SPLAT
(i.e. !rb_splat_or_kwargs_p(ci) && !calling->kw_splat).
Micro benchmark:
```
$ benchmark-driver -v --rbenv 'before;after' benchmark/vm_send_cfunc.yml --repeat-count=4
before: ruby 2.8.0dev (2020-04-13T23:45:05Z master b9d3ceee8f) [x86_64-linux]
after: ruby 2.8.0dev (2020-04-14T00:48:52Z no-splat-fastpath 418d363722) [x86_64-linux]
Calculating -------------------------------------
before after
vm_send_cfunc 69.585M 88.724M i/s - 100.000M times in 1.437097s 1.127096s
Comparison:
vm_send_cfunc
after: 88723605.2 i/s
before: 69584737.1 i/s - 1.28x slower
```
Optcarrot:
```
$ benchmark-driver -v --rbenv 'before;after' benchmark.yml --repeat-count=12 --output=all
before: ruby 2.8.0dev (2020-04-13T23:45:05Z master b9d3ceee8f) [x86_64-linux]
after: ruby 2.8.0dev (2020-04-14T00:48:52Z no-splat-fastpath 418d363722) [x86_64-linux]
Calculating -------------------------------------
before after
Optcarrot Lan_Master.nes 50.76119601545175 42.73858236484051 fps
50.76388649761503 51.04211379912850
50.80930672252514 51.39455790755538
50.90236000778749 51.75656936556145
51.01744746340430 51.86875277356489
51.06495279015112 51.88692482485558
51.07785337168974 51.93429603190578
51.20163525187862 51.95768145071314
51.34671771913112 52.45577266040274
51.35918340835583 52.53163888762858
51.46641337418146 52.62172484121034
51.50835463462257 52.85064021113239
```
Notes:
Merged-By: k0kubun <takashikkbn@gmail.com>
|
|
Fastpath has not been used for invokesuper since it has set vm_call_super_method on every invocation.
Because it seems to be blocked only by refinements, try enabling fastpath on invokesuper when cme is not for refinements.
While this patch itself should be helpful for VM performance, a part of this patch's motivation is to unblock inlining invokesuper on JIT.
$ benchmark-driver -v --rbenv 'before;after' benchmark/vm2_super.yml --repeat-count=4
before: ruby 2.8.0dev (2020-04-11T15:19:58Z master a01bda5949) [x86_64-linux]
after: ruby 2.8.0dev (2020-04-12T02:00:08Z invokesuper-fastpath c171984ee3) [x86_64-linux]
Calculating -------------------------------------
before after
vm2_super 20.031M 32.860M i/s - 6.000M times in 0.299534s 0.182593s
Comparison:
vm2_super
after: 32859885.2 i/s
before: 20031097.3 i/s - 1.64x slower
Notes:
Merged-By: k0kubun <takashikkbn@gmail.com>
|