author Alan Wu <XrXr@users.noreply.github.com> 2024-07-25 19:48:28 -0400
committer Alan Wu <XrXr@users.noreply.github.com> 2024-07-26 11:44:34 -0400
commit 158177e39995aa62df781b8ef64e800bf0c8670f
tree 7f540634bef52f098acf00da176c541d92b5ac22 /test/ruby/test_array.rb
parent a06cfa7e89cd59d7ea1278e8d8424c23a38cc521
Improve allocation throughput by outlining cache miss code path

Previously, GCC 11 on x86-64 inlined the heavy weight logic for
potentially triggering GC into newobj_alloc(). This slowed down
the hotter code path where the ractor cache hits, causing a degradation
to allocation throughput.

Outline the logic into a separate function and have it never inlined.

This restores allocation throughput to the same level as
98eeadc ("Development of 3.4.0 started.").

To evaluate, instrument miniruby so it allocates a bunch of objects and
then exits:

    diff --git a/eval.c b/eval.c
    --- a/eval.c
    +++ b/eval.c
    @@ -92,6 +92,15 @@ ruby_setup(void)
         }
         EC_POP_TAG();

    +rb_gc_disable();
    +rb_execution_context_t *ec = GET_EC();
    +long const n = 20000000;
    +for (long i = 0; i < n; ++i) {
    +    rb_wb_protected_newobj_of(ec, 0, T_OBJECT, 40);
    +}
    +printf("alloc %ld\n", n);
    +exit(0);
    +
         return state;
     }

With `3.3-equiv` being 98eeadc, and `pre` being f2728c3393d and `post`
being this commit, I have:

    $ hyperfine -L buildtag post,pre,3.3-equiv '/ruby/build-{buildtag}/miniruby'
    Benchmark 1: /ruby/build-post/miniruby
      Time (mean ± σ):     873.4 ms ±   2.8 ms    [User: 377.6 ms, System: 490.2 ms]
      Range (min … max):   868.3 ms … 877.8 ms    10 runs

    Benchmark 2: /ruby/build-pre/miniruby
      Time (mean ± σ):     960.1 ms ±   2.8 ms    [User: 430.8 ms, System: 523.9 ms]
      Range (min … max):   955.5 ms … 964.2 ms    10 runs

    Benchmark 3: /ruby/build-3.3-equiv/miniruby
      Time (mean ± σ):     886.9 ms ±   2.8 ms    [User: 379.5 ms, System: 501.0 ms]
      Range (min … max):   883.0 ms … 890.8 ms    10 runs

    Summary
      '/ruby/build-post/miniruby' ran
        1.02 ± 0.00 times faster than '/ruby/build-3.3-equiv/miniruby'
        1.10 ± 0.00 times faster than '/ruby/build-pre/miniruby'

These results are from a Skylake server with GCC 11.

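For readers unfamiliar with the technique, the following is a minimal sketch of outlining a cold path, not the code from this commit. All names (`alloc_cache`, `alloc_fast`, `alloc_slow`, `CACHE_SLOTS`, `SLOT_SIZE`) are hypothetical, and the "refill" simply reuses one static chunk where a real allocator would fetch fresh memory or possibly run GC. It assumes GCC or Clang, since `__attribute__((noinline))` and `__builtin_expect` are compiler extensions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sizes for illustration only. */
#define CACHE_SLOTS 64
#define SLOT_SIZE 40

struct alloc_cache {
    unsigned char *next;   /* next free byte in the current chunk */
    unsigned char *end;    /* one past the last usable byte */
    unsigned long refills; /* number of times the slow path ran */
};

static unsigned char chunk[CACHE_SLOTS * SLOT_SIZE];

/* Cold path: marked noinline so its body is never merged into callers.
 * A real allocator would do its heavy work here (for example, obtain a
 * new page or trigger GC); this toy version just resets to `chunk`. */
__attribute__((noinline))
static void *alloc_slow(struct alloc_cache *cache, size_t size)
{
    assert(size <= sizeof(chunk));
    cache->next = chunk + size;
    cache->end = chunk + sizeof(chunk);
    cache->refills++;
    return chunk;
}

/* Hot path: a bounds check and a pointer bump, cheap to inline. */
static inline void *alloc_fast(struct alloc_cache *cache, size_t size)
{
    if (__builtin_expect(cache->next != NULL &&
                         (size_t)(cache->end - cache->next) >= size, 1)) {
        void *p = cache->next;
        cache->next += size;
        return p;
    }
    return alloc_slow(cache, size); /* cache miss */
}
```

Keeping the miss handler out of line leaves the inlined caller small, which is the effect the commit message attributes to the change; whether it helps in a given build depends on the compiler and target.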
Diffstat (limited to 'test/ruby/test_array.rb')
0 files changed, 0 insertions, 0 deletions