path: root/benchmark
Age | Commit message | Author
2020-04-22 | support builtin for Kernel#Float | S.H
# Iteration per second (i/s)

|             |compare-ruby|built-ruby|
|:------------|-----------:|---------:|
|float        |     30.395M|   38.314M|
|             |           -|     1.26x|
|float_true   |      3.833M|   27.322M|
|             |           -|     7.13x|
|float_false  |      4.182M|   24.938M|
|             |           -|     5.96x|

Notes:
    Merged: https://github.com/ruby/ruby/pull/3048
    Merged-By: nobu <nobu@ruby-lang.org>
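For readers unfamiliar with the method being benchmarked, here is a minimal sketch of `Kernel#Float` with and without the `exception:` keyword. The helper name `parse_float` is hypothetical, and the assumption that the `float_true` / `float_false` rows correspond to `exception: true` / `exception: false` is a guess from the row names, not something the commit message states.

```ruby
# Kernel#Float converts strictly: invalid input raises ArgumentError
# unless exception: false is passed, in which case nil is returned.

# Hypothetical helper (not part of the commit): lenient conversion.
def parse_float(text)
  Float(text, exception: false)
end

puts Float("1.5")                # strict conversion of a valid string
puts parse_float("abc").inspect  # invalid input, no exception raised
puts parse_float("2").inspect    # integers in strings convert to floats
```

The keyword form avoids a `begin`/`rescue` around every conversion when invalid input is expected.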
2020-04-13 | Unify vm benchmark prefixes to vm_ (#3028) | Takashi Kokubun
The vm1_ and vm2_ prefixes had special meaning until 820ad9cb1d72d0897b73dae282df3793814b27e8 and 12068aa4e980ab32a0438408a519030e65dabf5e. AFAIK the vm3_ prefix has no special meaning.

As they have confused people (like "In `benchmark`, what is the difference between `vm1_`, `vm2_` and `vm3_`?"), I'd like to remove the obsolete prefixes, since we made them unnecessary two years ago.

Notes:
    Merged-By: k0kubun <takashikkbn@gmail.com>
2020-04-13 | Make vm_call_cfunc_with_frame a fastpath (#3027) | Takashi Kokubun
when there's no need to call CALLER_SETUP_ARG and CALLER_REMOVE_EMPTY_KW_SPLAT (i.e. !rb_splat_or_kwargs_p(ci) && !calling->kw_splat).

Micro benchmark:
```
$ benchmark-driver -v --rbenv 'before;after' benchmark/vm_send_cfunc.yml --repeat-count=4
before: ruby 2.8.0dev (2020-04-13T23:45:05Z master b9d3ceee8f) [x86_64-linux]
after: ruby 2.8.0dev (2020-04-14T00:48:52Z no-splat-fastpath 418d363722) [x86_64-linux]
Calculating -------------------------------------
                         before       after
       vm_send_cfunc    69.585M     88.724M i/s -    100.000M times in 1.437097s 1.127096s

Comparison:
                    vm_send_cfunc
               after:  88723605.2 i/s
              before:  69584737.1 i/s - 1.28x  slower
```

Optcarrot:
```
$ benchmark-driver -v --rbenv 'before;after' benchmark.yml --repeat-count=12 --output=all
before: ruby 2.8.0dev (2020-04-13T23:45:05Z master b9d3ceee8f) [x86_64-linux]
after: ruby 2.8.0dev (2020-04-14T00:48:52Z no-splat-fastpath 418d363722) [x86_64-linux]
Calculating -------------------------------------
                                     before                 after
Optcarrot Lan_Master.nes  50.76119601545175     42.73858236484051 fps
                          50.76388649761503     51.04211379912850
                          50.80930672252514     51.39455790755538
                          50.90236000778749     51.75656936556145
                          51.01744746340430     51.86875277356489
                          51.06495279015112     51.88692482485558
                          51.07785337168974     51.93429603190578
                          51.20163525187862     51.95768145071314
                          51.34671771913112     52.45577266040274
                          51.35918340835583     52.53163888762858
                          51.46641337418146     52.62172484121034
                          51.50835463462257     52.85064021113239
```

Notes:
    Merged-By: k0kubun <takashikkbn@gmail.com>
2020-04-13 | Unwrap vm_call_cfunc indirection on JIT | Takashi Kokubun
for VM_METHOD_TYPE_CFUNC.

This has been known to decrease optcarrot fps:

```
$ benchmark-driver -v --rbenv 'before --jit;after --jit' benchmark.yml --repeat-count=24 --output=all
before --jit: ruby 2.8.0dev (2020-04-13T16:25:13Z master fb40495cd9) +JIT [x86_64-linux]
after --jit: ruby 2.8.0dev (2020-04-13T23:23:11Z mjit-inline-c bdcd06d159) +JIT [x86_64-linux]
Calculating -------------------------------------
                               before --jit           after --jit
Optcarrot Lan_Master.nes  66.38132676191719     67.41369177299630 fps
                          69.42728743772243     68.90327567263054
                          72.16028300263211     69.62605130880686
                          72.46631319102777     70.48818243767207
                          73.37078877002490     70.79522887347566
                          73.69422431217367     70.99021920193194
                          74.01471487018695     74.69931965402584
                          75.48685183295630     74.86714575949016
                          75.54445264507932     75.97864419721677
                          77.28089738169756     76.48908637569581
                          78.04183397891302     76.54320932488021
                          78.36807984096562     76.59407262898067
                          78.92898762543574     77.31316743361343
                          78.93576483233765     77.97153484180480
                          79.13754917503078     77.98478782102325
                          79.62648945850653     78.02263322726446
                          79.86334213878064     78.26333724045934
                          80.05100635898518     78.60056756355614
                          80.26186843769584     78.91082645644468
                          80.34205717020330     79.01226659142263
                          80.62286066044338     79.32733939423721
                          80.95883033058557     79.63793060542024
                          80.97376819251613     79.73108936622778
                          81.23050939202896     80.18280109433088
```

and I deleted this capability in an early stage of YARV-MJIT development:
https://github.com/k0kubun/yarv-mjit/commit/0ab130feeefc2b9078a1077e4fec93b3f5e45d07

I suspect either of the following things could be the cause:

* Directly calling vm_call_cfunc requires more optimization effort in GCC, resulting in a 30ms-ish compilation time increase for such methods and decreasing the number of methods compiled in a benchmarked period.
* Code size increase => more icache misses

These hypotheses could be verified by some methodologies. However, I'd like to introduce this regardless of the result because this blocks inlining C method definitions.

I may revert this commit if I give up on implementing inlining of C method definitions, which requires this change.

Microbenchmark-wise, this gives a slight performance improvement:

```
$ benchmark-driver -v --rbenv 'before --jit;after --jit' benchmark/mjit_send_cfunc.yml --repeat-count=4
before --jit: ruby 2.8.0dev (2020-04-13T16:25:13Z master fb40495cd9) +JIT [x86_64-linux]
after --jit: ruby 2.8.0dev (2020-04-13T23:23:11Z mjit-inline-c bdcd06d159) +JIT [x86_64-linux]
Calculating -------------------------------------
                     before --jit  after --jit
     mjit_send_cfunc      41.961M      56.489M i/s -    100.000M times in 2.383143s 1.770244s

Comparison:
                  mjit_send_cfunc
         after --jit:  56489372.5 i/s
        before --jit:  41961388.1 i/s - 1.35x  slower
```
2020-03-31 | Make JIT-ed leave insn leaf | Takashi Kokubun
to eliminate sp / pc moves by cancelling JIT execution on interrupts.

  $ benchmark-driver benchmark.yml -v --rbenv 'before --jit;after --jit' --repeat-count=12 --output=all
  before --jit: ruby 2.8.0dev (2020-04-01T03:48:56Z master 5a81562dfe) +JIT [x86_64-linux]
  after --jit: ruby 2.8.0dev (2020-04-01T04:58:01Z master 39beb26a27) +JIT [x86_64-linux]
  Calculating -------------------------------------
                                 before --jit           after --jit
  Optcarrot Lan_Master.nes  75.06409603894944     76.06422026555558 fps
                            75.12025067279242     78.48161731616810
                            77.42020273492177     79.78958240950033
                            79.07253675128945     79.88645902325614
                            79.99179109732327     80.33743931749331
                            80.07633091008627     80.53790081529166
                            80.15450942667547     80.99048270668010
                            80.48372803283709     81.70497146081003
                            80.57410149187352     82.79494539467382
                            81.80449157081202     82.85797792223954
                            82.24629397834902     83.00603891515506
                            82.63708148686703     83.23221006969828

  $ benchmark-driver -v --rbenv 'before;before --jit;after --jit' benchmark/mjit_leave.yml --repeat-count=4
  before: ruby 2.8.0dev (2020-04-01T03:48:56Z master 5a81562dfe) [x86_64-linux]
  before --jit: ruby 2.8.0dev (2020-04-01T03:48:56Z master 5a81562dfe) +JIT [x86_64-linux]
  after --jit: ruby 2.8.0dev (2020-04-01T04:58:01Z master 39beb26a27) +JIT [x86_64-linux]
  Calculating -------------------------------------
                           before  before --jit  after --jit
            mjit_leave   106.656M       82.786M      91.635M i/s -    200.000M times in 1.875183s 2.415881s 2.182569s

  Comparison:
                         mjit_leave
                before: 106656239.9 i/s
           after --jit:  91635143.7 i/s - 1.16x  slower
          before --jit:  82785537.2 i/s - 1.29x  slower
2020-03-30 | Remove an unused pragma | Takashi Kokubun
It originally had a string literal, but it no longer has one.
2020-03-30 | Optimize exivar access on JIT-ed getivar | Takashi Kokubun
JIT support of dd723771c11.

  $ benchmark-driver -v --rbenv 'before;before --jit;after --jit' benchmark/mjit_exivar.yml --repeat-count=4
  before: ruby 2.8.0dev (2020-03-30T12:32:26Z master e5db3da9d3) [x86_64-linux]
  before --jit: ruby 2.8.0dev (2020-03-30T12:32:26Z master e5db3da9d3) +JIT [x86_64-linux]
  after --jit: ruby 2.8.0dev (2020-03-31T05:57:24Z mjit-exivar 128625baec) +JIT [x86_64-linux]
  Calculating -------------------------------------
                           before  before --jit  after --jit
           mjit_exivar    57.944M       53.579M      54.471M i/s -    200.000M times in 3.451588s 3.732772s 3.671687s

  Comparison:
                        mjit_exivar
                before:  57944345.1 i/s
           after --jit:  54470876.7 i/s - 1.06x  slower
          before --jit:  53579483.4 i/s - 1.08x  slower
2020-03-17 | Reduce allocations for keyword argument hashes | Jeremy Evans
Previously, passing a keyword splat to a method always allocated a hash on the caller side, and accepting arbitrary keywords in a method allocated a separate hash on the callee side. Passing explicit keywords to a method that accepted a keyword splat did not allocate a hash on the caller side, but resulted in two hashes allocated on the callee side.

This commit makes passing a single keyword splat to a method not allocate a hash on the caller side. Passing multiple keyword splats or a mix of explicit keywords and a keyword splat still generates a hash on the caller side. On the callee side, if arbitrary keywords are not accepted, it does not allocate a hash. If arbitrary keywords are accepted, it will allocate a hash, but this commit uses a callinfo flag to indicate whether the caller already allocated a hash, and if so, the callee can use the passed hash without duplicating it. So this commit should make it so that a maximum of a single hash is allocated during method calls.

To set the callinfo flag appropriately, method call argument compilation checks if only a single keyword splat is given. If only one keyword splat is given, the VM_CALL_KW_SPLAT_MUT callinfo flag is not set, since in that case the keyword splat is passed directly and not mutable. If more than one splat is used, a new hash needs to be generated on the caller side, and in that case the callinfo flag is set, indicating the keyword splat is mutable by the callee.

In compile_hash, used for both hash and keyword argument compilation, if compiling keyword arguments and only a single keyword splat is used, pass the argument directly.

On the caller side, in vm_args.c, the callinfo flag needs to be recognized and handled. Because the keyword splat argument may not be a hash, it needs to be converted to a hash first if not. Then, unless the callinfo flag is set, the hash needs to be duplicated. The temporary copy of the callinfo flag, kw_flag, is updated if a hash was duplicated, to prevent the need to duplicate it again. If we are converting to a hash or duplicating a hash, we need to update the argument array, which can include duplicating the positional splat array if one was passed. CALLER_SETUP_ARG and a couple of other places need to be modified to handle similar issues for other types of calls.

This includes fairly comprehensive tests for different ways keywords are handled internally, checking that you get equal results but that keyword splats on the caller side result in distinct objects for keyword rest parameters.

Included are benchmarks for keyword argument calls. Brief results when compiled without optimization:

    def kw(a: 1) a end
    def kws(**kw) kw end
    h = {a: 1}

    kw(a: 1)       # about same
    kw(**h)        # 2.37x faster
    kws(a: 1)      # 1.30x faster
    kws(**h)       # 2.19x faster
    kw(a: 1, **h)  # 1.03x slower
    kw(**h, **h)   # about same
    kws(a: 1, **h) # 1.16x faster
    kws(**h, **h)  # 1.14x faster

Notes:
    Merged: https://github.com/ruby/ruby/pull/2945
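The behaviour the tests in this commit are described as checking (equal results, but a distinct object for a keyword rest parameter) can be observed from plain Ruby. This is a minimal sketch reusing the `kw` / `kws` definitions from the message; the constant name `SOURCE_KEYWORDS` is an assumption for illustration, and the sketch shows only the visible semantics, not the allocation counts.

```ruby
def kw(a: 1)
  a
end

def kws(**kw)
  kw
end

# Hypothetical caller-side hash, frozen to show the callee never
# receives this exact object.
SOURCE_KEYWORDS = { a: 1 }.freeze

received = kws(**SOURCE_KEYWORDS)

p received == SOURCE_KEYWORDS       # same contents
p received.equal?(SOURCE_KEYWORDS)  # but not the same object
p received.frozen?                  # the callee's hash is its own copy
p kw(**SOURCE_KEYWORDS)             # explicit keyword parameter receives the value
```

Because the callee's hash is always a separate object, mutating `kw` inside `kws` cannot affect the caller's hash, which is why the optimization has to track whether a copy was already made.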
2020-03-17 | support builtin for Kernel#clone | S.H
Notes:
    Merged: https://github.com/ruby/ruby/pull/2954
    Merged-By: nobu <nobu@ruby-lang.org>
2020-02-29 | Added more benchmarks for String | Nobuyoshi Nakada
2020-01-31 | Improve `String#slice!` performance | Nobuyoshi Nakada
Instead of searching twice to extract and to delete, extract and delete the found position at the first search.

This makes it nearly twice as fast, for regexps and strings.

|              |compare-ruby|built-ruby|
|:-------------|-----------:|---------:|
|regexp-short  |      2.143M|    3.918M|
|regexp-long   |    105.162k|  205.410k|
|string-short  |      3.789M|    7.964M|
|string-long   |      1.301M|    2.457M|

Notes:
    Merged: https://github.com/ruby/ruby/pull/2871
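As a reminder of what the optimized method does, here is a small sketch of `String#slice!` with a regexp and with a string argument. The helper `slice_out` is hypothetical and exists only so the removed part and the remainder can be inspected together.

```ruby
# Hypothetical helper: returns [removed_part, remaining_text] without
# mutating the caller's string.
def slice_out(source, pattern)
  text = source.dup
  removed = text.slice!(pattern)  # extracts and deletes the first match
  [removed, text]
end

p slice_out("hello world", /o w/)  # regexp argument
p slice_out("hello world", "lo")   # string argument
p slice_out("hello world", "xyz")  # no match: nil, text unchanged
```

Extraction and deletion happen in one call, which is the pair of operations the commit describes as previously requiring two searches.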
2020-01-22 | Drop executable bit of *.{yml,h,mk.tmpl} | Kazuhiro NISHIYAMA
2020-01-11 | Let execution context local storage be an ID table | Lourens Naudé
Notes:
    Merged: https://github.com/ruby/ruby/pull/2814
2020-01-08 | Speeds up fallback to Hash#default_proc in rb_hash_aref by removing a method call | Lourens Naudé
Notes:
    Merged: https://github.com/ruby/ruby/pull/2821
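The code path in question is the one taken when `Hash#[]` misses and the hash has a default proc. A minimal sketch of that path from the Ruby side follows; the constant name `SQUARES` is an assumption for illustration.

```ruby
# A hash whose default proc computes and stores missing values.
SQUARES = Hash.new { |hash, key| hash[key] = key * key }

p SQUARES[4]               # miss: the default proc runs and stores the value
p SQUARES.key?(4)          # the proc stored the result
p SQUARES.fetch(5, :none)  # fetch does not consult the default proc
```

Only lookups through `[]` fall back to the default proc, so that is where the saved method call would be felt.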
2019-11-09 | Remove unneeded exec bits from some files | David Rodríguez
I noticed that some files in rubygems were executable, and I could think of no reason why they should be.

In general, I think ruby files should never have the executable bit set unless they include a shebang, so I ran the following command over the whole repo:

```bash
find . -name '*.rb' -type f -executable -exec bash -c 'grep -L "^#!" $1 || chmod -x $1' _ {} \;
```

Notes:
    Merged: https://github.com/ruby/ruby/pull/2662
2019-10-22 | Benchmark for [Feature #16155] | Nobuyoshi Nakada
2019-10-21 | Stop making a redundant hash copy in Hash#dup (#2489) | Dylan Thacker-Smith
* Stop making a redundant hash copy in Hash#dup

  It was making a copy of the hash without rehashing, then created an extra copy of the hash to do the rehashing. Since rehashing creates a new copy already, this change just uses that rehashing to make the copy.

  [Bug #16121]

* Remove redundant Check_Type after to_hash

* Fix freeing and clearing destination hash in Hash#initialize_copy

  The code was assuming the state of the destination hash based on the source hash for clearing any existing table on it. If these don't match, then that can cause the old table to be leaked. This can be seen by compiling hash.c with `#define HASH_DEBUG 1` and running the following script, which will crash from a debug assertion.

  ```ruby
  h = 9.times.map { |i| [i, i] }.to_h
  h.send(:initialize_copy, {})
  ```

* Remove dead code paths in rb_hash_initialize_copy

  Given that `RHASH_ST_TABLE_P(h)` is defined as `(!RHASH_AR_TABLE_P(h))`, it shouldn't be possible for a hash to be neither of these, so there is no need for the removed `else if` blocks.

* Share implementation between Hash#replace and Hash#initialize_copy

  This also fixes key rehashing for small hashes backed by an array table for Hash#replace. This used to be done consistently in ruby 2.5.x, but stopped being done for small arrays in ruby 2.6.x.

  This also brings optimization improvements that were done for Hash#initialize_copy to Hash#replace.

* Add the Hash#dup benchmark
2019-09-28 | Optimize Array#flatten and flatten! for already flattened arrays (#2495) | Dylan Thacker-Smith
* Optimize Array#flatten and flatten! for already flattened arrays

* Add benchmark for Array#flatten and Array#flatten!

[Bug #16119]
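The already-flattened case targeted here looks like this from the caller's side. This is a sketch of the documented return values only; the helper name `flatten_report` is hypothetical and nothing here measures the speedup.

```ruby
# Hypothetical helper: shows what flatten and flatten! return for an input.
def flatten_report(array)
  copy = array.dup
  {
    flattened: array.flatten,  # always a new array
    bang_result: copy.flatten! # nil when nothing needed flattening
  }
end

p flatten_report([1, 2, 3])       # already flat
p flatten_report([1, [2, [3]]])   # nested
```

Since `flatten!` returns `nil` for an already flat array, callers that chain on its result need to handle that case.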
2019-09-26 | Reduce ISeq size of mjit_exec benchmark | Takashi Kokubun
to avoid unwanted memory pressure
2019-09-26 | Add special runner to benchmark mjit_exec | Takashi Kokubun
I wanted to dynamically generate benchmark cases to test various numbers of methods. Thus I added a dedicated runner for benchmark-driver.
2019-09-21 | Add a benchmark for JIT-ed code dispatch | Takashi Kokubun
2019-09-19 | reuse cc->call | 卜部昌平
I noticed that in case of a cache miss, the re-calculated cc->me can be the same method entry as the previous one. That is an okay situation, but can't we partially reuse the cache, because cc->call should still be valid then?

One thing that has to be special-cased is when the method entry gets amended by some refinements. That happens behind the scenes of the call cache mechanism. We have to check if cc->me->def points to the previously saved one.

  Calculating -------------------------------------
                            trunk        ours
  vm2_poly_same_method     1.534M      2.025M i/s -      6.000M times in 3.910203s 2.962752s

  Comparison:
               vm2_poly_same_method
                  ours:   2025143.9 i/s
                 trunk:   1534447.2 i/s - 1.32x  slower

Notes:
    Merged: https://github.com/ruby/ruby/pull/2468
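To make the situation concrete, here is a rough sketch of a call site where the receiver's class keeps changing while the resolved method stays the same. The class names `RecordBase`, `Foo`, and `Bar` and the helper `call_alternating` are assumptions for illustration; this is not the content of the `vm2_poly_same_method` benchmark file.

```ruby
# Hypothetical classes: both subclasses inherit `save` from the same owner.
class RecordBase
  def save
    :saved
  end
end

class Foo < RecordBase; end
class Bar < RecordBase; end

# The single `receiver.save` call site below sees alternating receiver
# classes, so a cache keyed on the class misses on every call even though
# method lookup ends at the same definition each time.
def call_alternating(receivers, rounds)
  results = []
  rounds.times do
    receivers.each { |receiver| results << receiver.save }
  end
  results
end

p call_alternating([Foo.new, Bar.new], 2)
p Foo.instance_method(:save).owner  # RecordBase
p Bar.instance_method(:save).owner  # RecordBase
```

In this shape of code the lookup result is identical on every miss, which is the case where keeping the previously computed call handler is worthwhile.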
2019-09-02 | Add a benchmark for opt_regexpmatch2 | Takashi Kokubun
vm2_regexp was for opt_regexpmatch1.
2019-08-10 | Close created files [ci skip] | Nobuyoshi Nakada
2019-08-10 | Fix typo in comment [ci skip] | Masato Ohba
s/Thtread/Thread
2019-08-05 | n+1 to include n in range | Yaw Boakye
Python's range stops right before n, which means factL never returns the correct result.

Closes: https://github.com/ruby/ruby/pull/1982
2019-08-02 | Revert "Revert "Add a specialized instruction for `.nil?` calls"" | Yusuke Endoh
This reverts commit a0980f2446c0db735b8ffeb37e241370c458a626.

Retry for macOS Mojave.
2019-08-02 | Revert "Add a specialized instruction for `.nil?` calls" | Yusuke Endoh
This reverts commit 9faef3113fb4331524b81ba73005ba13fa0ef6c6.

It seemed to cause a failure on macOS Mojave, though I'm unsure how.
https://rubyci.org/logs/rubyci.s3.amazonaws.com/osx1014/ruby-master/log/20190802T034503Z.fail.html.gz

This tentative revert is to check if the issue is actually caused by the change or not.
2019-07-31 | Add a specialized instruction for `.nil?` calls | Aaron Patterson
This commit adds a specialized instruction for calls to `.nil?`. It is about 27% faster than master in the case where the object is nil or not nil. In the case where an object implements `nil?`, I think it may be slightly slower. Here is a benchmark:

```ruby
require "benchmark/ips"

class Niller
  def nil?; true; end
end

not_nil = Object.new
xnil = nil
niller = Niller.new

Benchmark.ips do |x|
  x.report("nil?")    { xnil.nil? }
  x.report("not nil") { not_nil.nil? }
  x.report("niller")   { niller.nil? }
end
```

On Ruby master:

```
[aaron@TC ~/g/ruby (master)]$ ./ruby compil.rb
Warming up --------------------------------------
                nil?   429.195k i/100ms
             not nil   437.889k i/100ms
              niller   437.935k i/100ms
Calculating -------------------------------------
                nil?     20.166M (± 8.1%) i/s -    100.002M in   5.002794s
             not nil     20.046M (± 7.6%) i/s -     99.839M in   5.020086s
              niller     22.467M (± 6.1%) i/s -    112.111M in   5.013817s
[aaron@TC ~/g/ruby (master)]$ ./ruby compil.rb
Warming up --------------------------------------
                nil?   449.660k i/100ms
             not nil   433.836k i/100ms
              niller   443.073k i/100ms
Calculating -------------------------------------
                nil?     19.997M (± 8.8%) i/s -     99.375M in   5.020458s
             not nil     20.529M (± 7.0%) i/s -    102.385M in   5.020689s
              niller     21.796M (± 8.0%) i/s -    108.110M in   5.002300s
[aaron@TC ~/g/ruby (master)]$ ./ruby compil.rb
Warming up --------------------------------------
                nil?   402.119k i/100ms
             not nil   438.968k i/100ms
              niller   398.226k i/100ms
Calculating -------------------------------------
                nil?     20.050M (±12.2%) i/s -     98.519M in   5.008817s
             not nil     20.614M (± 8.0%) i/s -    102.280M in   5.004531s
              niller     22.223M (± 8.8%) i/s -    110.309M in   5.013106s
```

On this branch:

```
[aaron@TC ~/g/ruby (specialized-nilp)]$ ./ruby compil.rb
Warming up --------------------------------------
                nil?   468.371k i/100ms
             not nil   456.517k i/100ms
              niller   454.981k i/100ms
Calculating -------------------------------------
                nil?     27.849M (± 7.8%) i/s -    138.169M in   5.001730s
             not nil     26.417M (± 8.7%) i/s -    131.020M in   5.011674s
              niller     21.561M (± 7.5%) i/s -    107.376M in   5.018113s
[aaron@TC ~/g/ruby (specialized-nilp)]$ ./ruby compil.rb
Warming up --------------------------------------
                nil?   477.259k i/100ms
             not nil   428.712k i/100ms
              niller   446.109k i/100ms
Calculating -------------------------------------
                nil?     28.071M (± 7.3%) i/s -    139.837M in   5.016590s
             not nil     25.789M (±12.9%) i/s -    126.470M in   5.011144s
              niller     20.002M (±12.2%) i/s -     98.144M in   5.001737s
[aaron@TC ~/g/ruby (specialized-nilp)]$ ./ruby compil.rb
Warming up --------------------------------------
                nil?   467.676k i/100ms
             not nil   445.791k i/100ms
              niller   415.024k i/100ms
Calculating -------------------------------------
                nil?     26.907M (± 8.0%) i/s -    133.755M in   5.013915s
             not nil     25.319M (± 7.9%) i/s -    125.713M in   5.007758s
              niller     19.569M (±11.8%) i/s -     96.286M in   5.008533s
```

Co-Authored-By: Ashe Connor <kivikakk@github.com>
2019-07-20 | Explain what's benchmark/lib/load.rb [ci skip] | Takashi Kokubun
I'm actually not using this, but ko1 is.
2019-07-18 | Add note about setting `vm.max_map_count` for Linux. | Samuel Williams
2019-07-18 | Add benchmark to help diagnose performance regression. | Samuel Williams
See https://bugs.ruby-lang.org/issues/16009 for more details.
2019-07-12 | * remove trailing spaces. | Nobuyoshi Nakada
2019-07-12 | Improved fiber benchmarks. Increase number of iterations. | Samuel Williams
2019-07-02 | Adjust memory_status.rb under the tool directory. | Hiroshi SHIBATA
2019-07-01 | Use realpath(3) instead of custom realpath implementation if available | Jeremy Evans
This approach is simpler than the previous approach which tries to emulate realpath(3). It also performs much better on both Linux and OpenBSD on the included benchmarks.

By using realpath(3), we can better integrate with system security features such as OpenBSD's unveil(2) system call.

This does not use realpath(3) on Windows even if it exists, as the approach for checking for absolute paths does not work for drive letters. This can be fixed without too much difficulty, though until Windows defines realpath(3), there is no need to do so.

For File.realdirpath, where the last element of the path is not required to exist, fall back to the previous approach, as realpath(3) on most operating systems requires the whole path be valid (per POSIX), and the operating systems where this isn't true either plan to conform to POSIX or may change to conform to POSIX in the future.

glibc realpath(3) does not handle /path/to/file.rb/../other_file.rb paths, returning ENOTDIR in that case. Fall back to the previous code if realpath(3) returns ENOTDIR.

glibc doesn't like realpath(3) usage for paths like /dev/fd/5, returning ENOENT even though the path may appear to exist in the filesystem. If ENOENT is returned and the path exists, then fall back to the default approach.
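The difference between `File.realpath` and `File.realdirpath` that motivates the fallback can be seen with a path whose last component does not exist. This is a minimal sketch; the helper name `resolve_missing_leaf` and the file name `missing.txt` are assumptions for illustration.

```ruby
require "tmpdir"

# Hypothetical helper: resolves a path whose final component is absent,
# once with realdirpath and once with realpath.
def resolve_missing_leaf
  Dir.mktmpdir do |dir|
    base = File.realpath(dir)                  # canonical form of the temp dir
    missing = File.join(base, "missing.txt")   # this file is never created

    lenient = File.realdirpath(missing)        # last component may be absent
    strict = begin
      File.realpath(missing)                   # every component must exist
    rescue Errno::ENOENT
      :enoent
    end

    { lenient_matches: lenient == missing, strict: strict }
  end
end

p resolve_missing_leaf
```

`File.realpath` raises for the missing leaf while `File.realdirpath` resolves it, which is why the latter cannot simply delegate to a realpath(3) that requires the whole path to be valid.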
2019-06-19 | Improve benchmarks and tests for threads. | Samuel Williams
2019-06-10 | Make sure to suppress .irbrc on benchmark | Takashi Kokubun
By the way, this is already improved by nobu:

```
$ benchmark-driver benchmark/irb_exec.yml --rbenv '2.6.3;2.7.0-preview1;before;after' -v
2.6.3: ruby 2.6.3p62 (2019-04-16 revision 67580) [x86_64-linux]
2.7.0-preview1: ruby 2.7.0preview1 (2019-05-31 trunk c55db6aa271df4a689dc8eb0039c929bf6ed43ff) [x86_64-linux]
before: ruby 2.7.0dev (2019-06-10T21:13:14+09:00 master 973fd18f11) [x86_64-linux]
after: ruby 2.7.0dev (2019-06-10T21:18:56+09:00 master 976c689ad4) [x86_64-linux]
Calculating -------------------------------------
                          2.6.3  2.7.0-preview1      before       after
            irb_exec     11.868           5.872       6.297      10.278 i/s -      30.000 times in 2.527776s 5.108997s 4.764167s 2.918821s

Comparison:
                         irb_exec
               2.6.3:        11.9 i/s
               after:        10.3 i/s - 1.15x  slower
              before:         6.3 i/s - 1.88x  slower
      2.7.0-preview1:         5.9 i/s - 2.02x  slower
```
2019-06-10 | Add a benchmark of irb boot time | Takashi Kokubun
```
$ benchmark-driver benchmark/irb_exec.yml --rbenv '2.6.3;2.7.0-preview1'
Calculating -------------------------------------
                          2.6.3  2.7.0-preview1
            irb_exec     11.844           5.171 i/s -      30.000 times in 2.532887s 5.801960s

Comparison:
                         irb_exec
               2.6.3:        11.8 i/s
      2.7.0-preview1:         5.2 i/s - 2.29x  slower
```
2019-06-05 | Optimize CGI.escapeHTML by reducing buffer extension | Takashi Kokubun
and switch-case branches.

Buffer allocation optimization using `ALLOCA_N` would be the main benefit of this patch. It eliminates the O(N) buffer extensions.

It also reduces the number of branches by using an escape table, as in https://mattn.kaoriya.net/software/lang/c/20160817011915.htm.

Closes: https://github.com/ruby/ruby/pull/2226

Co-authored-by: Nobuyoshi Nakada <nobu@ruby-lang.org>
Co-authored-by: Yasuhiro MATSUMOTO <mattn.jp@gmail.com>
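For context, this is the Ruby-level behaviour of the method being optimized: five characters are replaced by entities and everything else is copied through. The helper name `escape_html_demo` is hypothetical; the sketch says nothing about the C implementation beyond what the commit message states.

```ruby
require "cgi"

# Hypothetical wrapper around the method under discussion.
def escape_html_demo(text)
  CGI.escapeHTML(text)
end

puts escape_html_demo(%q(<a href="x">Tom & 'Jerry'</a>))
puts escape_html_demo("plain text")  # nothing to escape
```

Each escaped character grows the output (a single `&` becomes the five characters `&amp;`), so the output size depends on the input contents; that is where the buffer handling mentioned in the commit message comes into play.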
2019-06-05 | Revert "Optimize CGI.escapeHTML by reducing buffer extension" | Takashi Kokubun
This reverts commit 8d81e59aa7a62652caf85f9c8db371703668c149.

`ALLOCA_N` does not check for stack overflow, unlike ALLOCV. I'll fix it and re-commit it.
2019-06-05 | Optimize CGI.escapeHTML by reducing buffer extension | Takashi Kokubun
and switch-case branches.

Buffer allocation optimization using `ALLOCA_N` would be the main benefit of this patch. It eliminates the O(N) buffer extensions.

It also reduces the number of branches by using an escape table, as in https://mattn.kaoriya.net/software/lang/c/20160817011915.htm.

Closes: https://github.com/ruby/ruby/pull/2226

Co-authored-by: Nobuyoshi Nakada <nobu@ruby-lang.org>
Co-authored-by: Yasuhiro MATSUMOTO <mattn.jp@gmail.com>
2019-06-01 | Add a benchmark using IRB::Color | Takashi Kokubun
I heard that this part would not actually be a bottleneck for rendering, because writing anything to the terminal takes far longer anyway, but I thought this benchmark script might be useful for benchmarking Ruby itself.
2019-05-07 | Reduce ONIG_NREGION from 10 to 4: power of 2 and testing revealed most pattern matches are less than or equal to 4 results | Lourens Naudé
Closes: https://github.com/ruby/ruby/pull/2135
2019-05-03 | Improve performance of case-conversion methods | Nobuyoshi Nakada
2019-04-17 | string.c: improve splitting into chars | nobu
* string.c (rb_str_split_m): improve splitting into chars by an empty string, without a regexp.

  Comparison:
                            to_chars-1
            built-ruby:   1273527.6 i/s
          compare-ruby:    189423.3 i/s - 6.72x  slower

                           to_chars-10
            built-ruby:    120993.5 i/s
          compare-ruby:     37075.8 i/s - 3.26x  slower

                          to_chars-100
            built-ruby:     15646.4 i/s
          compare-ruby:      4012.1 i/s - 3.90x  slower

                         to_chars-1000
            built-ruby:      1295.1 i/s
          compare-ruby:       408.5 i/s - 3.17x  slower

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67582 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
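The case being optimized is `String#split` with an empty string as the separator. A minimal sketch of that call follows; the helper name `to_chars` borrows the benchmark label but is otherwise hypothetical.

```ruby
# Hypothetical helper: split a string into characters using an empty
# string separator (no regexp involved).
def to_chars(text)
  text.split("")
end

p to_chars("ruby")
p to_chars("日本語")                    # works per character, not per byte
p to_chars("ruby") == "ruby".chars  # same result as String#chars
```

`String#chars` gives the same result and may read more clearly in new code; `split("")` remains common in existing code.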
2019-01-21 | benchmark/app_aobench.rb: complete commented code to write the image to a file | eregon
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66900 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2019-01-21 | benchmark/app_aobench.rb: remove extra printf arguments | eregon
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66899 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2019-01-21 | benchmark/app_aobench.rb: move `srand(0)` at the top | eregon
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66898 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2019-01-21 | benchmark/app_aobench.rb: add `srand(0)` | mame
To prevent noise in the benchmark result, just in case. [Bug #15552]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66893 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
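Seeding the global generator makes every run draw the same sequence, which is what removes run-to-run noise from a benchmark that uses `rand`. A minimal sketch, with the helper name `seeded_samples` being an assumption for illustration:

```ruby
# Hypothetical helper: reseed the global generator, then draw `count` values.
def seeded_samples(seed, count)
  srand(seed)
  Array.new(count) { rand }
end

first_run = seeded_samples(0, 3)
second_run = seeded_samples(0, 3)

p first_run == second_run                 # same seed, same sequence
p seeded_samples(1, 3) == first_run       # a different seed gives a different sequence
```

Placing the `srand` call before any use of `rand` ensures the whole run is covered by the fixed seed.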