path: root/mjit_compile.c
Age | Commit message | Author
2022-08-21 | Rename mjit_compile.c to mjit_compiler.c | Takashi Kokubun
I'm planning to introduce mjit_compiler.rb, and I want to make this consistent with it. Consistency with compile.c doesn't seem important for MJIT anyway.
2022-08-20 | Drop mswin support of MJIT (#6265) | Takashi Kokubun
The current MJIT relies on SIGCHLD and fork(2) to be performant, and it's something mswin can't offer. You could run Linux MJIT on WSL instead. [Misc #18968] Notes: Merged-By: k0kubun <takashikkbn@gmail.com>
2022-08-19 | Rename mjit_exec to jit_exec (#6262) | Takashi Kokubun
* Rename mjit_exec to jit_exec
* Rename mjit_exec_slowpath to mjit_check_iseq
* Remove mjit_exec references from comments
Notes: Merged-By: k0kubun <takashikkbn@gmail.com>
2022-08-17 | yjit.h is not necessary for all sources using mjit.h | Nobuyoshi Nakada
Notes: Merged: https://github.com/ruby/ruby/pull/6239
2022-07-15 | Implement Objects on VWA | Peter Zhu
This commit implements Objects on Variable Width Allocation. This allows Objects with more ivars to be embedded (i.e. contents directly follow the object header) which improves performance through better cache locality. Notes: Merged: https://github.com/ruby/ruby/pull/6117
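The embedding idea (contents directly following the object header) can be illustrated with a C flexible array member. This is only a sketch under assumptions: `embedded_object_sketch`, `object_alloc_sketch`, and `ivar_slot_sketch` are hypothetical names, and the layout is not Ruby's actual object layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical object: a small header followed directly by its ivar slots. */
struct embedded_object_sketch {
    size_t ivar_count;
    long ivars[];          /* flexible array member: storage follows the header */
};

/* Allocate header and ivar slots in one contiguous, zeroed block. */
struct embedded_object_sketch *
object_alloc_sketch(size_t ivar_count)
{
    struct embedded_object_sketch *obj =
        calloc(1, sizeof(*obj) + ivar_count * sizeof(obj->ivars[0]));
    if (obj == NULL) return NULL;
    obj->ivar_count = ivar_count;
    return obj;
}

/* Address of slot i; no second pointer has to be followed to reach it. */
long *
ivar_slot_sketch(struct embedded_object_sketch *obj, size_t i)
{
    assert(obj != NULL && i < obj->ivar_count);
    return &obj->ivars[i];
}
```

Because the slots sit in the same allocation as the header, reading an ivar touches memory adjacent to the header instead of a separately allocated table, which is the cache-locality effect the commit message describes.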
2022-07-14 | MJIT: Share rb_mjit_unit through mjit_unit.h | Takashi Kokubun
mjit_compile.c should be able to access this more easily. Notes: Merged: https://github.com/ruby/ruby/pull/6140
2022-06-23 | Speed up ISeq by marking via bitmaps and IC rearranging | Aaron Patterson
This commit adds a bitfield to the iseq body that stores offsets inside the iseq buffer that contain values we need to mark. We can use this bitfield to mark objects instead of disassembling the instructions.
This commit also groups inline storage entries and adds a counter for each entry. This allows us to iterate and mark each entry without disassembling instructions.
Since we have a bitfield and grouped inline caches, we can mark all VALUE objects associated with instructions without actually disassembling the instructions at mark time.
[Feature #18875] [ruby-core:109042]
Notes: Merged: https://github.com/ruby/ruby/pull/6053
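A minimal sketch of bitmap-driven marking, assuming one bit per slot of the instruction buffer. The names (`iseq_sketch`, `mark_slots`, `mark_log`, `log_mark`) and the 64-bit word layout are hypothetical and chosen for illustration; they are not the structures used in the Ruby source.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t VALUE_T;                         /* stand-in for Ruby's VALUE */
typedef void (*mark_fn)(VALUE_T obj, void *ctx);   /* stand-in for a GC mark call */

#define BITS_PER_WORD (sizeof(uint64_t) * CHAR_BIT)

struct iseq_sketch {
    const VALUE_T *encoded;      /* instruction buffer */
    size_t size;                 /* number of slots in the buffer */
    const uint64_t *mark_bits;   /* bit i set => slot i holds an object to mark */
};

/* Visit only the slots flagged in the bitmap; no instruction decoding needed.
 * Returns the number of slots marked. */
size_t
mark_slots(const struct iseq_sketch *iseq, mark_fn mark, void *ctx)
{
    assert(iseq != NULL && mark != NULL);
    size_t marked = 0;
    size_t words = (iseq->size + BITS_PER_WORD - 1) / BITS_PER_WORD;
    for (size_t w = 0; w < words; w++) {
        uint64_t bits = iseq->mark_bits[w];
        for (size_t b = 0; bits != 0; b++, bits >>= 1) {
            size_t idx = w * BITS_PER_WORD + b;
            if ((bits & 1) && idx < iseq->size) {
                mark(iseq->encoded[idx], ctx);
                marked++;
            }
        }
    }
    return marked;
}

/* Demo collector that records what was marked (illustration only). */
struct mark_log { VALUE_T seen[8]; size_t count; };

void
log_mark(VALUE_T obj, void *ctx)
{
    struct mark_log *ml = ctx;
    if (ml->count < 8) ml->seen[ml->count++] = obj;
}
```

The cost of a mark pass here depends on the number of bitmap words and set bits rather than on decoding each instruction and its operands.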
2022-06-15 | MJIT: Get rid of obsoleted compiling_iseqs | Takashi Kokubun
2022-06-08 | Fix MJIT's ISEQ_BODY macro usage at 5f10bd634fb | Takashi Kokubun
2022-03-24 | Add ISEQ_BODY macro | Peter Zhu
Use ISEQ_BODY macro to get the rb_iseq_constant_body of the ISeq. Using this macro will make it easier for us to change the allocation strategy of rb_iseq_constant_body when using Variable Width Allocation. Notes: Merged: https://github.com/ruby/ruby/pull/5698
2021-10-12 | Extract precompile_inlinable_child_iseq to separate alloca for each iseq | Nobuyoshi Nakada
Notes: Merged: https://github.com/ruby/ruby/pull/4951
2021-08-30 | Suppress redefinition warnings of GET_SELF() | Nobuyoshi Nakada
2021-06-04 | Improve perfomance for Integer#size method [Feature #17135] (#3476) | S.H
* Improve perfomance for Integer#size method [Feature #17135]
* re-run ci
* Let MJIT frame skip work for Integer#size
Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
Notes: Merged-By: k0kubun <takashikkbn@gmail.com>
2021-06-02 | Refactor rb_vm_insn_addr2insn calls | Takashi Kokubun
There were way too many ifdefs.
2021-05-30 | Mark inlined ISeqs during MJIT compilation (#4539) | Takashi Kokubun
[Bug #17584] Notes: Merged-By: k0kubun <takashikkbn@gmail.com>
2020-12-21 | Prefer stdbool in vm_exec | Takashi Kokubun
Make the code a bit modern and consistent with some other places.
2020-12-16 | Inline getconstant on JIT (#3906) | Takashi Kokubun
* Inline getconstant on JIT
* Support USE_MJIT=0
Notes: Merged-By: k0kubun <takashikkbn@gmail.com>
2020-11-27 | Cache access to reg_cfp->self on JIT | Takashi Kokubun
```
$ benchmark-driver -v --rbenv 'before --jit;after --jit' --repeat-count=12 --alternate --output=all benchmark.yml
before --jit: ruby 3.0.0dev (2020-11-27T06:41:15Z master 8ce1711c25) +JIT [x86_64-linux]
after --jit: ruby 3.0.0dev (2020-11-27T08:36:02Z master 2c592126b9) +JIT [x86_64-linux]
last_commit=Cache access to reg_cfp->self on JIT
Calculating -------------------------------------
                                 before --jit           after --jit
Optcarrot Lan_Master.nes    82.40522392468650     82.66023870551237 fps
                            82.67998539899482     83.08660305312587
                            85.51280693947453     87.09311989553235
                            86.32925337181406     87.16115255191410
                            87.35617494926235     87.30699391518075
                            87.91865339426212     88.47590342996875
                            88.11573661006648     88.64778616696353
                            88.16060826662158     88.67015079203991
                            88.21639244865058     89.19630739497482
                            88.47241577897603     89.23443637947730
                            89.37087287229809     89.57052723997015
                            89.46969964699964     89.97803363889025
```
2020-11-26 | Revert "Set VM_FRAME_FLAG_FINISH at once on MJIT" | Takashi Kokubun
This reverts commit 4d2c8edca69884a41d2f843d36023e3decdb9872. Unfortunately this seems to cause several issues: https://github.com/ruby/ruby/runs/1462188376?check_suite_focus=true http://ci.rvm.jp/results/trunk-mjit-wait@phosphorus-docker/3272802
2020-11-26 | Set VM_FRAME_FLAG_FINISH at once on MJIT | Takashi Kokubun
Performance is probably improved?

$ benchmark-driver -v --rbenv 'before --jit;after --jit' --repeat-count=12 --alternate --output=all benchmark.yml
before --jit: ruby 3.0.0dev (2020-11-27T04:37:47Z master 69e77e81dc) +JIT [x86_64-linux]
after --jit: ruby 3.0.0dev (2020-11-27T05:28:19Z master df6b05c6dd) +JIT [x86_64-linux]
last_commit=Set VM_FRAME_FLAG_FINISH at once
Calculating -------------------------------------
                                 before --jit           after --jit
Optcarrot Lan_Master.nes    80.89292998533379     82.19497327502751 fps
                            80.93130641142331     85.13943315260148
                            81.06214830270119     87.43757879797808
                            82.29172808453910     87.89942441487113
                            84.61206450455929     87.91309779491075
                            85.44545883567997     87.98026086648694
                            86.02923132404449     88.03081060383973
                            86.07411817365879     88.14650206137341
                            86.34348799602836     88.32791633649961
                            87.90257338977324     88.57599644892220
                            88.58006509876580     88.67426384743277
                            89.26611118140011     88.81669430874207

This should have no bad impact on VM because this function is ALWAYS_INLINE.
2020-11-22 | ruby/internal/config.h needs to be included first | Takashi Kokubun
to define USE_MJIT.
2020-11-20 | Eliminate IVC sync between JIT and Ruby threads (#3799) | Takashi Kokubun
Thanks to Ractor (https://github.com/ruby/ruby/pull/2888 and https://github.com/ruby/ruby/pull/3662), inline caches support parallel access now. Notes: Merged-By: k0kubun <takashikkbn@gmail.com>
2020-10-17 | sync RClass::ext::iv_index_tbl | Koichi Sasada
iv_index_tbl manages instance variable indexes (ID -> index). This data structure must be synchronized with other ractors, so this patch introduces some VM locks. It also introduces an atomic ivar cache used by set/getinlinecache instructions. To make updating the ivar cache (IVC) atomic, the iv_index_tbl data structure now manages (ID -> entry), where an entry holds a serial and an index. The IVC points to this entry, so a cache update becomes atomic. Notes: Merged: https://github.com/ruby/ruby/pull/3662
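One way to read the (ID -> entry) change: if the cache holds a single pointer to an entry that bundles serial and index, swapping that one pointer updates both fields together. The following is a sketch using C11 atomics; `iv_entry_sketch`, `ivc_sketch`, and the helper functions are hypothetical names, and the memory orders are an assumption rather than what the Ruby VM uses.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table entry: serial and index always travel together. */
struct iv_entry_sketch {
    uint64_t class_serial;
    size_t index;
};

/* Hypothetical inline cache: a single atomically replaced pointer. */
struct ivc_sketch {
    _Atomic(const struct iv_entry_sketch *) entry;
};

void
ivc_init(struct ivc_sketch *ivc)
{
    atomic_init(&ivc->entry, NULL);
}

/* One pointer store publishes serial and index at the same time. */
void
ivc_update(struct ivc_sketch *ivc, const struct iv_entry_sketch *e)
{
    assert(ivc != NULL);
    atomic_store_explicit(&ivc->entry, e, memory_order_release);
}

/* A reader sees either the old entry or the new one, never a mix of both. */
bool
ivc_lookup(struct ivc_sketch *ivc, uint64_t serial, size_t *index_out)
{
    const struct iv_entry_sketch *e =
        atomic_load_explicit(&ivc->entry, memory_order_acquire);
    if (e == NULL || e->class_serial != serial) return false;
    *index_out = e->index;
    return true;
}
```

With two separate fields in the cache, a reader could observe a new serial paired with an old index; routing both through one pointer avoids that without a lock on the read path.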
2020-09-08 | Use size_t for MJIT's max_ivar_index | Takashi Kokubun
iseq_inline_iv_cache_entry's index is also size_t. %"PRIuSIZE" seems to print warnings against st_index_t in some environments.
2020-07-03 | Merge ivar guards on JIT (#3284) | Takashi Kokubun
when an ISeq has multiple ivar accesses. Notes: Merged-By: k0kubun <takashikkbn@gmail.com>
2020-06-25 | Show what's inlined first in "JIT inline" log | Takashi Kokubun
and add a debug log
2020-06-25 | Decide JIT-ed insn based on cached cfunc | Takashi Kokubun
for opt_* insns. opt_eq handles rb_obj_equal inside opt_eq, and all other cfunc is handled by opt_send_without_block. Therefore we can't decide which insn should be generated by checking whether it's cfunc cc or not.
```
$ benchmark-driver -v --rbenv 'before --jit;after --jit' benchmark/mjit_opt_cc_insns.yml --repeat-count=4
before --jit: ruby 2.8.0dev (2020-06-26T05:21:43Z master 9dbc2294a6) +JIT [x86_64-linux]
after --jit: ruby 2.8.0dev (2020-06-26T06:30:18Z master 75cece1b0b) +JIT [x86_64-linux]
last_commit=Decide JIT-ed insn based on cached cfunc
Calculating -------------------------------------
                     before --jit  after --jit
        mjit_nil?(1)      73.878M      74.021M i/s -     40.000M times in 0.541432s 0.540391s
         mjit_not(1)      72.635M      74.601M i/s -     40.000M times in 0.550702s 0.536187s
     mjit_eq(1, nil)       7.331M       7.445M i/s -      8.000M times in 1.091211s 1.074596s
     mjit_eq(nil, 1)      49.450M      64.711M i/s -      8.000M times in 0.161781s 0.123627s

Comparison:
                     mjit_nil?(1)
         after --jit:  74020528.4 i/s
        before --jit:  73878185.9 i/s - 1.00x  slower

                      mjit_not(1)
         after --jit:  74600882.0 i/s
        before --jit:  72634507.6 i/s - 1.03x  slower

                  mjit_eq(1, nil)
         after --jit:   7444657.4 i/s
        before --jit:   7331304.3 i/s - 1.02x  slower

                  mjit_eq(nil, 1)
         after --jit:  64710790.6 i/s
        before --jit:  49449507.4 i/s - 1.31x  slower
```
2020-06-23 | Avoid generating opt_send with cfunc cc with JIT | Takashi Kokubun
only for opt_nil_p and opt_not. While vm_method_cfunc_is is used for opt_eq too, many fast paths of it don't call it. So if it's populated, it should generate opt_send, regardless of cfunc or not. And again, opt_neq isn't relevant due to the difference in operands. So opt_nil_p and opt_not are the only variants using vm_method_cfunc_is like they use.
```
$ benchmark-driver -v --rbenv 'before2 --jit::ruby --jit;before --jit;after --jit' benchmark/mjit_opt_cc_insns.yml --repeat-count=4
before2 --jit: ruby 2.8.0dev (2020-06-22T08:37:37Z master 3238641750) +JIT [x86_64-linux]
before --jit: ruby 2.8.0dev (2020-06-23T01:01:24Z master 9ce2066209) +JIT [x86_64-linux]
after --jit: ruby 2.8.0dev (2020-06-23T06:58:37Z master 17e9df3157) +JIT [x86_64-linux]
last_commit=Avoid generating opt_send with cfunc cc with JIT
Calculating -------------------------------------
                     before2 --jit  before --jit  after --jit
        mjit_nil?(1)       54.204M       75.536M      75.031M i/s -     40.000M times in 0.737947s 0.529548s 0.533110s
         mjit_not(1)       53.822M       70.921M      71.920M i/s -     40.000M times in 0.743195s 0.564007s 0.556171s
     mjit_eq(1, nil)        7.367M        6.496M       7.331M i/s -      8.000M times in 1.085882s 1.231470s 1.091327s

Comparison:
                     mjit_nil?(1)
        before --jit:  75536059.3 i/s
         after --jit:  75031409.4 i/s - 1.01x  slower
       before2 --jit:  54204431.6 i/s - 1.39x  slower

                      mjit_not(1)
         after --jit:  71920324.1 i/s
        before --jit:  70921063.1 i/s - 1.01x  slower
       before2 --jit:  53821697.6 i/s - 1.34x  slower

                  mjit_eq(1, nil)
       before2 --jit:   7367280.0 i/s
         after --jit:   7330527.4 i/s - 1.01x  slower
        before --jit:   6496302.8 i/s - 1.13x  slower
```
2020-06-22 | Compile opt_send for opt_* only when cc has ISeq | Takashi Kokubun
because opt_nil/opt_not/opt_eq populates cc even when it doesn't fallback to opt_send_without_block because of vm_method_cfunc_is.
```
$ benchmark-driver -v --rbenv 'before --jit;after --jit' benchmark/mjit_opt_cc_insns.yml --repeat-count=4
before --jit: ruby 2.8.0dev (2020-06-22T08:11:24Z master d231b8f95b) +JIT [x86_64-linux]
after --jit: ruby 2.8.0dev (2020-06-22T08:53:27Z master e1125879ed) +JIT [x86_64-linux]
last_commit=Compile opt_send for opt_* only when cc has ISeq
Calculating -------------------------------------
                     before --jit  after --jit
        mjit_nil?(1)      54.106M      73.693M i/s -     40.000M times in 0.739288s 0.542795s
         mjit_not(1)      53.398M      74.477M i/s -     40.000M times in 0.749090s 0.537075s
     mjit_eq(1, nil)       7.427M       6.497M i/s -      8.000M times in 1.077136s 1.231326s

Comparison:
                     mjit_nil?(1)
         after --jit:  73692594.3 i/s
        before --jit:  54106108.4 i/s - 1.36x  slower

                      mjit_not(1)
         after --jit:  74477487.9 i/s
        before --jit:  53398125.0 i/s - 1.39x  slower

                  mjit_eq(1, nil)
        before --jit:   7427105.9 i/s
         after --jit:   6497063.0 i/s - 1.14x  slower
```
Actually opt_eq becomes slower by this. Maybe it's indeed using opt_send_without_block, but I'll approach that one in another commit.
2020-06-20 | Introduce Primitive.attr! to annotate 'inline' (#3242) | Takashi Kokubun
[Feature #15589] Notes: Merged-By: k0kubun <takashikkbn@gmail.com>
2020-05-26 | Eliminate a call instruction on JIT cancel path | Takashi Kokubun
by calling combined functions specialized for each cancel type. I'm hoping to improve locality of hot code, but this patch's impact should be insignificant.
2020-05-17 | Simplify maybe_special_const_class_p | Takashi Kokubun
2020-05-17 | Reduce code size for rb_class_of | Takashi Kokubun
by inlining only hot path.

=== mame/optcarrot ===
$ benchmark-driver -v --rbenv 'before --jit;after --jit' benchmark.yml --repeat-count=24 --output=all
before --jit: ruby 2.8.0dev (2020-05-18T05:21:31Z master 0e5a58b6bf) +JIT [x86_64-linux]
after --jit: ruby 2.8.0dev (2020-05-18T06:12:04Z master 0e3d71a8d1) +JIT [x86_64-linux]
last_commit=Reduce code size for rb_class_of
Calculating -------------------------------------
                                 before --jit           after --jit
Optcarrot Lan_Master.nes    71.62880463568773     70.95730063273503 fps
                            71.73973684273152     71.98447841929851
                            75.03923801841310     75.54262519509039
                            75.16300287174957     77.64029272984344
                            75.16834828625935     78.67861469580785
                            75.17670723726911     78.81879353707393
                            75.67637908020630     79.18188850392886
                            76.19843953215396     79.66484891814478
                            77.28166716118808     79.80278072861037
                            77.38509903325165     80.05859292679696
                            78.12693418455953     80.34624804808006
                            78.73654441746730     80.66326571254345
                            79.25387513454415     80.69760605740196
                            79.44137881689524     81.32053489212245
                            79.50497657368358     81.50250852553751
                            79.62401328582868     82.27544931834611
                            79.79178811723664     82.67455264522741
                            81.20275352937418     82.93857260493297
                            81.57027048640776     83.15019118788184
                            81.63373188649095     83.20728816044721
                            81.93420437766426     83.25027576772972
                            82.05716136357167     83.27072145898173
                            82.21070805525066     83.36008265822194
                            82.56924063784872     83.36112268888493

=== benchmark-driver/sinatra ===
[rps]
before: 13143.49 rps
after: 13505.70 rps

[inlined rb_class_of size]
before: 11.5K
after: 3.8K
(calculated by `dwarftree --die inlined_subroutine --flat --merge --show-size`)
2020-05-11 | sed -i 's|ruby/impl|ruby/internal|' | 卜部昌平
To fix build failures. Notes: Merged: https://github.com/ruby/ruby/pull/3079
2020-05-11 | sed -i s|ruby/3|ruby/impl|g | 卜部昌平
This shall fix compile errors. Notes: Merged: https://github.com/ruby/ruby/pull/3079
2020-05-06 | Remove OPT_CHECKED_RUN code | Takashi Kokubun
Now this one is actually not in use because we override entire leave definition for JIT.
2020-05-06 | Enable OPT_CHECKED_RUN on MJIT for debugging | Takashi Kokubun
Trying to debug errors like http://ci.rvm.jp/results/trunk-mjit@silicon-docker/2921397 http://ci.rvm.jp/results/trunk-mjit@silicon-docker/2894526
2020-05-01 | Fix MJIT compiler warnings in clang | Takashi Kokubun
2020-05-01 | Fix a wrong argument of vm_exec on JIT cancel | Takashi Kokubun
2020-05-01 | Make sure unit->id is inherited | Takashi Kokubun
to child compile_status
2020-04-30 | Include unit id in a function name of an inlined method | Takashi Kokubun
I'm trying to make it possible to include all JIT-ed code in a single C file. This is needed to guarantee uniqueness of all function names
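A sketch of why a unit id in the generated name helps: two units that inline a method at the same position would otherwise emit the same symbol into one C file. The name format `_mjit%d_inlined_%lu` and both helper functions are hypothetical and used only to illustrate the idea.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Build a function name from a unit id and a call-site position.
 * Returns the value of snprintf (length the full name needs). */
int
format_inlined_funcname(char *buf, size_t size, int unit_id, unsigned long pos)
{
    assert(buf != NULL && size > 0);
    return snprintf(buf, size, "_mjit%d_inlined_%lu", unit_id, pos);
}

/* True when two units would emit the same name for the same position. */
bool
funcnames_collide(int unit_a, int unit_b, unsigned long pos)
{
    char a[64], b[64];
    format_inlined_funcname(a, sizeof(a), unit_a, pos);
    format_inlined_funcname(b, sizeof(b), unit_b, pos);
    return strcmp(a, b) == 0;
}
```

With the unit id in the name, names from different units never match even when the position is identical, so all generated functions can coexist in a single translation unit.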
2020-04-08 | Merge pull request #2991 from shyouhei/ruby.h | 卜部昌平
Split ruby.h Notes: Merged-By: shyouhei <shyouhei@ruby-lang.org>
2020-03-30 | Optimize exivar access on JIT-ed getivar | Takashi Kokubun
JIT support of dd723771c11.

$ benchmark-driver -v --rbenv 'before;before --jit;after --jit' benchmark/mjit_exivar.yml --repeat-count=4
before: ruby 2.8.0dev (2020-03-30T12:32:26Z master e5db3da9d3) [x86_64-linux]
before --jit: ruby 2.8.0dev (2020-03-30T12:32:26Z master e5db3da9d3) +JIT [x86_64-linux]
after --jit: ruby 2.8.0dev (2020-03-31T05:57:24Z mjit-exivar 128625baec) +JIT [x86_64-linux]
Calculating -------------------------------------
                 before  before --jit  after --jit
mjit_exivar     57.944M       53.579M      54.471M i/s -    200.000M times in 3.451588s 3.732772s 3.671687s

Comparison:
             mjit_exivar
       before:  57944345.1 i/s
  after --jit:  54470876.7 i/s - 1.06x  slower
 before --jit:  53579483.4 i/s - 1.08x  slower
2020-03-12 | Avoid referring to an old value of realloc | Takashi Kokubun
OpenBSD RubyCI has failed with SEGV since 4bcd5981e80d3e1852c8723741a0069779464128. https://rubyci.org/logs/rubyci.s3.amazonaws.com/openbsd-current/ruby-master/log/20200312T223005Z.fail.html.gz This was because `status->cc_entries` could be stale after `realloc` call for inlined iseqs.
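The bug class described here is general C: a pointer copied out of a struct before `realloc` may dangle afterwards, because `realloc` is allowed to move the block. The sketch below uses hypothetical names (`status_sketch`, `reserve_entries`, `append_entries`) and plain `int` entries; it shows the safe ordering, not the actual MJIT fix.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a compile status that owns a growable array. */
struct status_sketch {
    int *cc_entries;
    size_t size;
};

/* Grow the array by `extra` slots (extra must be non-zero).
 * Returns the index of the first new slot, or (size_t)-1 on failure. */
size_t
reserve_entries(struct status_sketch *st, size_t extra)
{
    assert(st != NULL && extra > 0);
    size_t old = st->size;
    int *grown = realloc(st->cc_entries, (old + extra) * sizeof(int));
    if (grown == NULL) return (size_t)-1;   /* old block is still valid */
    st->cc_entries = grown;                 /* the block may have moved */
    st->size = old + extra;
    return old;
}

bool
append_entries(struct status_sketch *st, const int *src, size_t n)
{
    if (n == 0) return true;
    /* Unsafe ordering would be: int *base = st->cc_entries; then reserve. */
    size_t start = reserve_entries(st, n);
    if (start == (size_t)-1) return false;
    int *base = st->cc_entries;             /* re-read AFTER realloc */
    for (size_t i = 0; i < n; i++) base[start + i] = src[i];
    return true;
}
```

Whether a stale pointer actually crashes depends on the allocator's behavior, which may explain why only some platforms expose such a bug; re-reading the field after every growth removes the dependence.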
2020-03-10 | Capture inlined iseq's cc entries in root iseq's | Takashi Kokubun
jit_unit to avoid marking wrong cc entries when inlined iseq is compiled multiple times, resolving the TODO added by daf7c48d88. This obviates pseudo jit_unit in inlined iseq introduced by 7ec2359374 and fixes memory leak of the adhoc unit.
2020-02-26 | Internalize rb_mjit_unit definition again | Takashi Kokubun
Fixed a TODO in b9007b6c548f91e88fd3f2ffa23de740431fa969
2020-02-22 | Introduce disposable call-cache. | Koichi Sasada
This patch contains several ideas:

(1) Disposable inline method cache (IMC) for race-free inline method cache
    * Making call-cache (CC) as a RVALUE (GC target object) and allocate new CC on cache miss.
    * This technique allows race-free access from parallel processing elements like RCU.
(2) Introduce per-Class method cache (pCMC)
    * Instead of fixed-size global method cache (GMC), pCMC allows flexible cache size.
    * Caching CCs reduces CC allocation and allow sharing CC's fast-path between same call-info (CI) call-sites.
(3) Invalidate an inline method cache by invalidating corresponding method entries (MEs)
    * Instead of using class serials, we set "invalidated" flag for method entry itself to represent cache invalidation.
    * Compare with using class serials, the impact of method modification (add/overwrite/delete) is small.
      * Updating class serials invalidate all method caches of the class and sub-classes.
      * Proposed approach only invalidate the method cache of only one ME.

See [Feature #16614] for more details.
Notes: Merged: https://github.com/ruby/ruby/pull/2888
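Ideas (1) and (3) can be sketched together: a cache is never edited in place, a miss allocates a fresh one, and invalidation is a flag on the method entry itself. All names below are hypothetical, and `malloc`/`free` stand in for the garbage collector that owns call caches in Ruby.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical method entry carrying its own invalidation flag. */
struct method_entry_sketch {
    int method_id;
    bool invalidated;
};

/* Hypothetical disposable call cache: set once at creation, then read-only. */
struct call_cache_sketch {
    const struct method_entry_sketch *me;
};

bool
cc_valid_p(const struct call_cache_sketch *cc)
{
    return cc != NULL && cc->me != NULL && !cc->me->invalidated;
}

/* Cache hit: return the same cache. Miss: allocate a NEW cache and leave the
 * old one untouched, so a concurrent reader never sees a half-updated cache.
 * Returns NULL only when allocation fails. */
struct call_cache_sketch *
cc_refresh(struct call_cache_sketch *cc, const struct method_entry_sketch *current_me)
{
    assert(current_me != NULL);
    if (cc_valid_p(cc) && cc->me == current_me) return cc;
    struct call_cache_sketch *fresh = malloc(sizeof(*fresh));
    if (fresh == NULL) return NULL;
    fresh->me = current_me;
    return fresh;
}
```

Flipping `invalidated` on one entry affects only caches that point at that entry, which matches the commit's contrast with class serials, where one bump invalidates every cached method of a class and its subclasses.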
2020-02-22 | VALUE size packed callinfo (ci). | Koichi Sasada
Now, rb_call_info describes how to call the method with a tuple of (mid, orig_argc, flags, kwarg). In most cases, kwarg == NULL and mid+argc+flags only require 64 bits, so this patch packs rb_call_info into a VALUE (1 word) in such cases. If it cannot be represented in a VALUE, imemo_callinfo is used instead, which contains the conventional callinfo (rb_callinfo, renamed from rb_call_info). iseq->body->ci_kw_size is removed because every callinfo is now VALUE size (a packed ci or a pointer to imemo_callinfo). To access ci information, we need to use these functions: vm_ci_mid(ci), _flag(ci), _argc(ci), _kwarg(ci). struct rb_call_info_kw_arg is renamed to rb_callinfo_kwarg. rb_funcallv_with_cc() and rb_method_basic_definition_p_with_cc() are temporarily removed because cd->ci should be marked. Notes: Merged: https://github.com/ruby/ruby/pull/2888
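A sketch of packing (mid, argc, flag) into one 64-bit word with a low tag bit. The bit widths (32/15/16 plus 1 tag bit) and all `ci_*` names are assumptions made for this example; the commit message only states that the three fields fit in 64 bits when kwarg is NULL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout: [ mid:32 | argc:15 | flag:16 | tag:1 ] */
#define CI_TAG        UINT64_C(1)
#define CI_FLAG_BITS  16
#define CI_ARGC_BITS  15
#define CI_FLAG_SHIFT 1
#define CI_ARGC_SHIFT (CI_FLAG_SHIFT + CI_FLAG_BITS)
#define CI_MID_SHIFT  (CI_ARGC_SHIFT + CI_ARGC_BITS)
#define CI_MASK(bits) ((UINT64_C(1) << (bits)) - 1)

/* Packing is possible only without kwarg and when each field fits. */
bool
ci_packable_p(uint32_t argc, uint32_t flag, const void *kwarg)
{
    return kwarg == NULL
        && argc <= CI_MASK(CI_ARGC_BITS)
        && flag <= CI_MASK(CI_FLAG_BITS);
}

uint64_t
ci_pack(uint32_t mid, uint32_t argc, uint32_t flag)
{
    assert(ci_packable_p(argc, flag, NULL));
    return ((uint64_t)mid << CI_MID_SHIFT)
         | ((uint64_t)argc << CI_ARGC_SHIFT)
         | ((uint64_t)flag << CI_FLAG_SHIFT)
         | CI_TAG;
}

/* The tag bit distinguishes a packed word from an aligned pointer. */
bool     ci_packed_p(uint64_t ci) { return (ci & CI_TAG) != 0; }
uint32_t ci_mid(uint64_t ci)  { return (uint32_t)(ci >> CI_MID_SHIFT); }
uint32_t ci_argc(uint64_t ci) { return (uint32_t)((ci >> CI_ARGC_SHIFT) & CI_MASK(CI_ARGC_BITS)); }
uint32_t ci_flag(uint64_t ci) { return (uint32_t)((ci >> CI_FLAG_SHIFT) & CI_MASK(CI_FLAG_BITS)); }
```

When `ci_packable_p` is false, a caller would fall back to a heap-allocated record and store its pointer instead; an aligned pointer has a clear low bit, so `ci_packed_p` can tell the two representations apart.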
2019-12-26 | decouple internal.h headers | 卜部昌平
Saves committers' daily life by avoiding #include-ing everything from internal.h, making each file include what it needs instead. This would significantly speed up incremental builds.
We take the following inclusion order in this changeset:
1. "ruby/config.h", where _GNU_SOURCE is defined (must be the very first thing among everything).
2. RUBY_EXTCONF_H if any.
3. Standard C headers, sorted alphabetically.
4. Other system headers, maybe guarded by #ifdef
5. Everything else, sorted alphabetically.
Exceptions are those win32-related headers, which tend not to be self-contained (headers have inclusion order dependencies).
Notes: Merged: https://github.com/ruby/ruby/pull/2711
2019-11-21 | Add a proper cast to pass JIT tests on mswin. | Koichi Sasada
https://ci.appveyor.com/project/ruby/ruby/builds/29001248/job/ye80bsrmewdgw294