| Age | Commit message | Author |
|
https://github.com/ruby/mmtk/commit/9f730cc709
|
|
Adds MMTK_HEAP_MODE=cpu, a dynamic heap-sizing policy that grows
or shrinks the heap after each GC cycle to keep measured GC CPU
overhead near a configurable target. The control law follows
Tavakolisomeh et al., 'Heap Size Adjustment with CPU Control', MPLR
'23: a sigmoid of the (averaged) GC CPU overhead error in (-inf, +inf)
maps to a heap-size adjustment factor in (0.5, 1.5).
Implementation lives alongside the existing 'ruby' delegated trigger
in gc/mmtk/src/heap/. T_GC is wall-clock GC duration; T_APP is process
CPU time delta read via clock_gettime(CLOCK_PROCESS_CPUTIME_ID), which
correctly credits multi-threaded mutator parallelism. Nursery-only
generational GCs are skipped so the trigger only re-sizes at full
collections.
Configuration:
MMTK_GC_CPU_TARGET target GC CPU overhead, percent. Default 5.
MMTK_GC_CPU_WINDOW number of recent cycles averaged. Default 3.
The default differs from the paper's recommended 15. The paper
targets ZGC, a concurrent generational collector; MMTk-Ruby currently
ships stop-the-world Immix, where every percent of GC CPU also blocks
the mutator. An empirical sweep of MMTK_GC_CPU_TARGET across
ruby-bench (railsbench, lobsters, psych-load, liquid-render, lee)
found 5-6 to be Pareto-optimal vs the existing 'ruby' heap mode:
about 6 percent geomean throughput improvement at essentially equal
peak RSS. Targets >=10 trade large amounts of throughput for modest
RSS savings on this collector.
bin/smoke-test, bin/ruby-mmtk-mode, bin/compare-heap-modes, and
doc/testing-cpu-heap-mode.md are included so reviewers and future
contributors can reproduce the sweep against ruby/ruby-bench.
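The control law described above can be sketched roughly as follows. This is a minimal illustration, not the code in gc/mmtk/src/heap/: the `CpuHeapController` type, the `GAIN` constant, and the overhead definition T_GC / (T_GC + T_APP) are assumptions made for the example. Only the (0.5, 1.5) output range, the averaging window, and the percent target come from the commit message.

```rust
use std::collections::VecDeque;

/// Illustrative sigmoid gain; not taken from the commit or the paper.
const GAIN: f64 = 20.0;

/// Hypothetical controller for a CPU-overhead-driven heap trigger.
pub struct CpuHeapController {
    /// Target GC CPU overhead as a fraction (0.05 for MMTK_GC_CPU_TARGET=5).
    target: f64,
    /// Number of recent full-GC cycles averaged (MMTK_GC_CPU_WINDOW).
    window: usize,
    samples: VecDeque<f64>,
}

impl CpuHeapController {
    pub fn new(target_percent: f64, window: usize) -> Self {
        Self {
            target: target_percent / 100.0,
            window: window.max(1),
            samples: VecDeque::new(),
        }
    }

    /// Record one full collection and return a heap-size factor in (0.5, 1.5).
    /// `t_gc` and `t_app` are durations in seconds.
    pub fn on_full_gc(&mut self, t_gc: f64, t_app: f64) -> f64 {
        let total = t_gc + t_app;
        // Assumed overhead definition: GC share of the cycle's total time.
        let overhead = if total > 0.0 { t_gc / total } else { 0.0 };
        if self.samples.len() == self.window {
            self.samples.pop_front();
        }
        self.samples.push_back(overhead);
        let avg = self.samples.iter().sum::<f64>() / self.samples.len() as f64;
        let error = avg - self.target;
        // The logistic function maps (-inf, +inf) to (0, 1);
        // adding 0.5 shifts the result into (0.5, 1.5).
        0.5 + 1.0 / (1.0 + (-GAIN * error).exp())
    }
}
```

With this shape, an averaged overhead equal to the target yields a factor of 1.0 (heap unchanged), overhead above the target grows the heap, and overhead below it shrinks the heap.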
https://github.com/ruby/mmtk/commit/1f223f5ad5
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|
|
https://github.com/ruby/mmtk/commit/a46b68fe5b
|
|
The only caller of this now unconditionally constructs a binding options
object, so this is dead code.
https://github.com/ruby/mmtk/commit/d832004e89
|
|
This is a debug mode in Ruby where an extra word is used after each
object to store the address of the Ractor that owns the object, used for
debug purposes only.
While we're working on Ractors, we also need to be able to test with
MMTk enabled, so we should introduce support for this to the MMTk
binding as well.
As implemented we'll default the binding options to have everything
disabled and hardcoded to 0, as was always the case, but if
RACTOR_CHECK_MODE is enabled, we'll build and pass a valid RubyBinding
object to MMTk.
https://github.com/ruby/mmtk/commit/83cb291313
|
|
Outputs the number of GC cycles that are moving.
https://github.com/ruby/mmtk/commit/fef8f04186
|
|
https://github.com/ruby/mmtk/commit/7889da7c0e
|
|
Instead of sending all 128 buffered objects to one bucket,
round-robin distribute them across all worker buckets so
parallel obj_free work stays balanced.
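A minimal sketch of the round-robin idea, assuming object addresses are plain `usize` values and that a cursor is carried across batches so consecutive flushes do not always start at bucket 0. The function name and the cursor are hypothetical and not taken from the binding.

```rust
/// Spread `batch` across `buckets` one element at a time, resuming at `cursor`.
/// `cursor` is updated so the next batch continues where this one stopped.
pub fn distribute_round_robin(
    batch: &[usize],
    buckets: &mut [Vec<usize>],
    cursor: &mut usize,
) {
    let n = buckets.len();
    if n == 0 {
        return;
    }
    for &obj in batch {
        buckets[*cursor % n].push(obj);
        *cursor = (*cursor + 1) % n;
    }
}
```

For a full batch of 128 objects and 4 worker buckets, each bucket receives 32 objects instead of one bucket receiving all 128.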
https://github.com/ruby/mmtk/commit/e1f926cd21
|
|
https://github.com/ruby/mmtk/commit/26ec9f7f89
|
|
Previously, every object allocation in rb_gc_impl_new_obj made a
per-object FFI call into Rust (mmtk_add_obj_free_candidate), which
acquired a mutex on one of the WeakProcessor's candidate vecs, pushed a
single element, and released the mutex. That's an FFI crossing + mutex
lock/unlock on every single allocation.
Now, each MMTk_ractor_cache has two local buffers (parallel-freeable and
non-parallel-freeable, 128 entries each). On allocation, we just store
the pointer into the local buffer. When a buffer fills up, we flush the
entire batch in one FFI call using mmtk_add_obj_free_candidates, which
does a single mutex acquisition and extend_from_slice for the whole
batch.
We picked 128 as our buffer size at random. We should probably
investigate further what an optimum size for this is.
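The batching scheme can be sketched as below. This is an illustration only: in the binding the local buffers live in the C-side MMTk_ractor_cache and the flush crosses FFI, whereas here both sides are plain Rust and the type names (`SharedCandidates`, `LocalBuffer`) are hypothetical.

```rust
use std::sync::Mutex;

/// Buffer size from the commit message.
const BUFFER_CAPACITY: usize = 128;

/// Stand-in for one of the WeakProcessor's candidate vecs.
pub struct SharedCandidates {
    candidates: Mutex<Vec<usize>>,
}

impl SharedCandidates {
    pub fn new() -> Self {
        Self { candidates: Mutex::new(Vec::new()) }
    }

    /// One lock acquisition and one extend_from_slice per batch.
    pub fn add_batch(&self, batch: &[usize]) {
        let mut guard = self.candidates.lock().unwrap();
        guard.extend_from_slice(batch);
    }

    pub fn len(&self) -> usize {
        self.candidates.lock().unwrap().len()
    }
}

/// Stand-in for one per-ractor local buffer.
pub struct LocalBuffer {
    entries: [usize; BUFFER_CAPACITY],
    count: usize,
}

impl LocalBuffer {
    pub fn new() -> Self {
        Self { entries: [0; BUFFER_CAPACITY], count: 0 }
    }

    /// Allocation path: store locally, flush only when the buffer is full.
    pub fn push(&mut self, obj: usize, shared: &SharedCandidates) {
        self.entries[self.count] = obj;
        self.count += 1;
        if self.count == BUFFER_CAPACITY {
            self.flush(shared);
        }
    }

    /// Hand all pending entries to the shared list in one call.
    pub fn flush(&mut self, shared: &SharedCandidates) {
        if self.count > 0 {
            shared.add_batch(&self.entries[..self.count]);
            self.count = 0;
        }
    }

    pub fn pending(&self) -> usize {
        self.count
    }
}
```

With 300 allocations the lock is taken twice on the allocation path (once per full buffer) rather than 300 times; any remainder has to be flushed explicitly before the candidates are processed.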
https://github.com/ruby/mmtk/commit/23c4a9a676
|
|
https://github.com/ruby/mmtk/commit/86fa2fd4af
|
|
https://github.com/ruby/mmtk/commit/002faa8f92
|
|
https://github.com/ruby/mmtk/commit/8813e76bf8
|
|
Redoes commit 544770d, which seems to have been accidentally undone in b27d935.
|
|
This allows the mutator thread to dump its backtrace when a GC thread crashes.
https://github.com/ruby/mmtk/commit/40ff9ffee7
|
|
https://github.com/ruby/mmtk/commit/58210c88ed
|
|
https://github.com/ruby/mmtk/commit/42adba630e
|
|
The argument to `is_data_encoding` is assumed to be `T_DATA`.
|
|
This makes it easier to visualize in profilers which one is non-parallel.
https://github.com/ruby/mmtk/commit/ba68b2ef3b
|
|
This commit allows objects that are safe to be freed in parallel to do so.
A decrease in object freeing time can be seen in profiles.
The benchmarks don't show much difference.
Before:
-------------- -------------------- ---------- ---------
bench sequential free (ms) stddev (%) RSS (MiB)
activerecord 242.3 7.4 84.3
chunky-png 439.1 0.6 75.6
erubi-rails 1221.2 4.2 132.7
hexapdf 1544.8 1.8 429.1
liquid-c 42.7 7.4 48.5
liquid-compile 41.4 8.3 52.2
liquid-render 100.6 3.0 56.8
mail 108.9 2.1 65.1
psych-load 1536.9 0.6 43.4
railsbench 1633.5 2.6 146.2
rubocop 126.5 15.8 142.1
ruby-lsp 129.6 9.7 112.2
sequel 47.9 6.5 44.6
shipit 1152.0 2.7 315.2
-------------- -------------------- ---------- ---------
After:
-------------- ------------------ ---------- ---------
bench parallel free (ms) stddev (%) RSS (MiB)
activerecord 235.1 5.5 87.4
chunky-png 440.8 0.8 68.1
erubi-rails 1105.3 0.8 128.0
hexapdf 1578.3 4.1 405.1
liquid-c 42.6 7.1 48.4
liquid-compile 41.5 8.1 52.1
liquid-render 101.2 2.8 53.3
mail 109.7 2.7 64.8
psych-load 1567.7 1.1 44.4
railsbench 1644.9 1.9 150.9
rubocop 125.6 15.4 148.5
ruby-lsp 127.9 5.8 104.6
sequel 48.2 6.1 44.1
shipit 1215.3 4.7 320.5
-------------- ------------------ ---------- ---------
https://github.com/ruby/mmtk/commit/4f0b5fd2eb
|
|
This commit implements moving Immix in MMTk, which allows objects to move
in the GC.
The performance of this implementation is not yet amazing. It is very
similar to non-moving Immix in many of the benchmarks and slightly slower
in others. The benchmark results are shown below.
-------------- ----------------- ---------- ---------
bench Moving Immix (ms) stddev (%) RSS (MiB)
activerecord 241.9 0.5 86.6
chunky-png 447.8 0.8 74.9
erubi-rails 1183.9 0.8 136.1
hexapdf 1607.9 2.6 402.3
liquid-c 45.4 6.7 44.9
liquid-compile 44.1 9.3 53.0
liquid-render 105.4 4.5 55.9
lobsters 650.1 9.7 418.4
mail 115.4 2.1 64.4
psych-load 1656.8 0.8 43.6
railsbench 1653.5 1.3 149.8
rubocop 127.0 15.6 142.1
ruby-lsp 130.7 10.5 99.4
sequel 52.8 7.2 45.6
shipit 1187.0 3.9 311.0
-------------- ----------------- ---------- ---------
-------------- --------------------- ---------- ---------
bench Non-moving Immix (ms) stddev (%) RSS (MiB)
activerecord 218.9 2.7 86.1
chunky-png 464.6 0.8 66.7
erubi-rails 1119.0 4.3 132.7
hexapdf 1539.8 1.8 425.2
liquid-c 40.6 6.9 45.2
liquid-compile 40.6 8.1 52.9
liquid-render 99.3 2.3 48.3
mail 107.4 5.3 65.4
psych-load 1535.6 1.0 39.5
railsbench 1565.6 1.1 149.6
rubocop 122.5 14.3 146.7
ruby-lsp 128.4 10.7 106.4
sequel 44.1 4.0 45.7
shipit 1154.5 2.7 358.5
-------------- --------------------- ---------- ---------
|
|
https://github.com/ruby/mmtk/commit/f4c46cabc7
|
|
|
|
This heap emulates the growth characteristics of the Ruby default GC's
heap. By default, the heap grows by 40%, requires at least 20% empty
after a GC, and allows at most 65% empty before it shrinks the heap. This
is all configurable via the same environment variables the default GC
uses (`RUBY_GC_HEAP_FREE_SLOTS_GOAL_RATIO`, `RUBY_GC_HEAP_FREE_SLOTS_MIN_RATIO`,
`RUBY_GC_HEAP_FREE_SLOTS_MAX_RATIO`, respectively).
The Ruby heap can be enabled via the `MMTK_HEAP_MODE=ruby` environment
variable.
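One way the three ratios could interact after a GC is sketched below. This is an assumption-laden illustration, not the binding's trigger: the `decide` function, the `HeapDecision` enum, and the choice to grow by multiplying the current size by (1 + goal ratio) are hypothetical, and the amount to shrink by is deliberately left out because the commit message does not specify it.

```rust
#[derive(Debug, PartialEq)]
pub enum HeapDecision {
    /// Grow the heap to the given size in bytes.
    Grow(usize),
    Keep,
    /// Shrink the heap; the target size is not modelled here.
    Shrink,
}

/// Defaults mirror the values quoted in the commit message.
pub struct RubyHeapRatios {
    pub goal: f64, // RUBY_GC_HEAP_FREE_SLOTS_GOAL_RATIO
    pub min: f64,  // RUBY_GC_HEAP_FREE_SLOTS_MIN_RATIO
    pub max: f64,  // RUBY_GC_HEAP_FREE_SLOTS_MAX_RATIO
}

impl Default for RubyHeapRatios {
    fn default() -> Self {
        Self { goal: 0.40, min: 0.20, max: 0.65 }
    }
}

/// Decide what to do with a heap of `total` bytes of which `used` are live.
pub fn decide(total: usize, used: usize, r: &RubyHeapRatios) -> HeapDecision {
    if total == 0 {
        return HeapDecision::Keep;
    }
    let free_ratio = (total - used.min(total)) as f64 / total as f64;
    if free_ratio < r.min {
        // Too little free space after GC: grow by the goal ratio (assumed).
        let grown = (total as f64 * (1.0 + r.goal)).round() as usize;
        HeapDecision::Grow(grown)
    } else if free_ratio > r.max {
        HeapDecision::Shrink
    } else {
        HeapDecision::Keep
    }
}
```

Under these assumptions, a 1000-byte heap with 900 bytes live (10% free) grows to 1400 bytes, while the same heap with 100 bytes live (90% free) becomes a candidate for shrinking.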
Compared to the dynamic heap in MMTk (which uses the MemBalancer algorithm),
the Ruby heap allows the heap to grow more generously, which uses a bit
more memory but offers significant performance gains because it runs GC
much less frequently.
We can see in the benchmarks below that the Ruby heap gives faster
performance than the dynamic heap in every benchmark, and is over 2x faster
in many of them. Memory is often around 10-20% higher, with
certain outliers that use significantly more memory, like hexapdf and
erubi-rails. We can also see that this brings MMTk's Ruby heap much
closer in performance to the default GC.
Ruby heap benchmark results:
-------------- -------------- ---------- ---------
bench ruby heap (ms) stddev (%) RSS (MiB)
activerecord 233.6 10.7 85.9
chunky-png 457.1 1.1 79.3
erubi-rails 1148.0 3.8 133.3
hexapdf 1570.5 2.4 403.0
liquid-c 42.8 5.3 43.4
liquid-compile 41.3 7.6 52.6
liquid-render 102.8 3.8 55.3
lobsters 651.9 8.0 426.3
mail 106.4 1.8 67.2
psych-load 1552.1 0.8 43.4
railsbench 1707.2 6.0 145.6
rubocop 127.2 15.3 148.8
ruby-lsp 136.6 11.7 113.7
sequel 47.2 5.9 44.4
shipit 1197.5 3.6 301.0
-------------- -------------- ---------- ---------
Dynamic heap benchmark results:
-------------- ----------------- ---------- ---------
bench dynamic heap (ms) stddev (%) RSS (MiB)
activerecord 845.3 3.1 76.7
chunky-png 525.9 0.4 38.9
erubi-rails 2694.9 3.4 115.8
hexapdf 2344.8 5.6 164.9
liquid-c 73.7 5.0 40.5
liquid-compile 107.1 6.8 40.3
liquid-render 147.2 1.7 39.5
lobsters 697.6 4.5 342.0
mail 224.6 2.1 64.0
psych-load 4326.7 0.6 37.4
railsbench 3218.0 5.5 124.7
rubocop 203.6 6.1 110.9
ruby-lsp 350.7 3.2 79.0
sequel 121.8 2.5 39.6
shipit 1510.1 3.1 220.8
-------------- ----------------- ---------- ---------
Default GC benchmark results:
-------------- --------------- ---------- ---------
bench default GC (ms) stddev (%) RSS (MiB)
activerecord 148.4 0.6 67.9
chunky-png 440.2 0.7 57.0
erubi-rails 722.7 0.3 97.8
hexapdf 1466.2 1.7 254.3
liquid-c 32.5 3.6 42.3
liquid-compile 31.2 1.9 35.4
liquid-render 88.3 0.7 30.8
lobsters 633.6 7.0 305.4
mail 76.6 1.6 53.2
psych-load 1166.2 1.3 29.1
railsbench 1262.9 2.3 114.7
rubocop 105.6 0.8 95.4
ruby-lsp 101.6 1.4 75.4
sequel 27.4 1.2 33.1
shipit 1083.1 1.5 163.4
-------------- --------------- ---------- ---------
https://github.com/ruby/mmtk/commit/c0ca29922d
|
|
Adding a fast path for the bump pointer allocator can improve allocation
performance.
For the following microbenchmark with MMTK_HEAP_MIN=100MiB:
10_000_000.times { String.new }
Before:
810.7 ms ± 8.3 ms [User: 790.9 ms, System: 40.3 ms]
After:
777.9 ms ± 10.4 ms [User: 759.0 ms, System: 37.9 ms]
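The general shape of a bump-pointer fast path is sketched below. It is a simplified model, not the binding's code: addresses are simulated as `usize` values, and `CHUNK_SIZE`, the starting address, and the `slow_path_calls` counter are assumptions made so the example is self-contained.

```rust
/// Hypothetical size of the region handed out by the slow path.
const CHUNK_SIZE: usize = 32 * 1024;

pub struct BumpAllocator {
    cursor: usize,
    limit: usize,
    next_chunk: usize,
    pub slow_path_calls: usize,
}

impl BumpAllocator {
    pub fn new() -> Self {
        // cursor == limit == 0 forces the first allocation onto the slow path.
        Self { cursor: 0, limit: 0, next_chunk: 0x1000, slow_path_calls: 0 }
    }

    /// Fast path: align the cursor, bounds-check, bump, return.
    /// `align` must be a power of two.
    #[inline(always)]
    pub fn alloc(&mut self, size: usize, align: usize) -> usize {
        let start = (self.cursor + align - 1) & !(align - 1);
        let end = start + size;
        if end <= self.limit {
            self.cursor = end;
            return start;
        }
        self.alloc_slow(size, align)
    }

    /// Slow path: acquire a fresh chunk and allocate from its start.
    #[cold]
    fn alloc_slow(&mut self, size: usize, align: usize) -> usize {
        assert!(size <= CHUNK_SIZE, "large objects are out of scope here");
        self.slow_path_calls += 1;
        let start = (self.next_chunk + align - 1) & !(align - 1);
        self.limit = self.next_chunk + CHUNK_SIZE;
        self.cursor = start + size;
        self.next_chunk += CHUNK_SIZE;
        start
    }
}
```

The benefit comes from the common case being a handful of arithmetic instructions and one comparison, with the expensive work of acquiring memory kept out of line.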
https://github.com/ruby/mmtk/commit/0ff5c9f579
|
|
This will allow the Ruby backtrace, memory mapping, etc. to be output
when a Ruby mutator thread panics.
https://github.com/ruby/mmtk/commit/d10fd325dd
|
|
If a reference marked weak becomes a special const, it will crash because
it is not a GC handled object. We should skip special consts here.
https://github.com/ruby/mmtk/commit/870a79426b
|
|
https://github.com/ruby/mmtk/commit/84975a8840
|
|
https://github.com/ruby/mmtk/commit/45f991578e
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/13610
|
|
Use more idiomatic Rust approaches.
https://github.com/ruby/mmtk/commit/ef125f9eae
|
|
We also enable `#![warn(unsafe_op_in_unsafe_fn)]` in the whole mmtk_ruby
crate.
https://github.com/ruby/mmtk/commit/8b8025f71a
|
|
Remove the unused constant HAS_MOVED_GFIELDSTBL and related methods.
In the mmtk/mmtk-ruby repo, we are now able to find the global field
(IV) table of a moved object during copying GC without using the
HAS_MOVED_GFIELDSTBL bit. We synchronize some of the code, although we
haven't implemented moving GC in ruby/mmtk, yet.
See: https://github.com/mmtk/mmtk-ruby/commit/13080acdf553f20a88a7ea9ab9f6877611017136
https://github.com/ruby/mmtk/commit/400ba4e747
|
|
And get rid of the `obj_to_id_tbl`
It's no longer needed, the `object_id` is now stored inline
in the object alongside instance variables.
We still need the inverse table in case `_id2ref` is invoked, but
we lazily build it by walking the heap if that happens.
The `object_id` concern is also no longer specific to the GC
implementation; it is now handled generically.
Co-Authored-By: Matt Valentine-House <matt@eightbitraptor.com>
Notes:
Merged: https://github.com/ruby/ruby/pull/13159
|
|
Ivars will no longer be the only thing stored inline
via shapes, so keeping the `iv_index` and `ivptr` names
would be confusing.
Instance variables won't be the only thing stored inline
via shapes, so keeping the `ivptr` name would be confusing.
`field` encompasses anything that can be stored in a VALUE array.
Similarly, `gen_ivtbl` becomes `gen_fields_tbl`.
Notes:
Merged: https://github.com/ruby/ruby/pull/13159
|
|
We rely on scan_vm_specific_roots to reach all stacks via the following
path:
VM -> ractors -> threads -> fibers -> stacks
https://github.com/ruby/mmtk/commit/0a6a835aaa
|
|
We now use `MMTK::handle_user_collection_request(true, ...)` to force
triggering a GC instead of enabling GC temporarily.
https://github.com/ruby/mmtk/commit/02ef47f818
|
|
https://github.com/ruby/mmtk/commit/9da566e26a
|
|
https://github.com/ruby/mmtk/commit/e52b973611
|
|
https://github.com/ruby/mmtk/commit/6a78ffaf16
|
|
https://github.com/ruby/mmtk/commit/5bbac70c69
|
|
https://github.com/ruby/mmtk/commit/810f897603
|
|
https://github.com/ruby/mmtk/commit/67da9ea5b8
|
|
https://github.com/ruby/mmtk/commit/836a9059cb
|
|
https://github.com/ruby/mmtk/commit/79ce2008a3
|
|
https://github.com/ruby/mmtk/commit/c8b1f4c156
|
|
https://github.com/ruby/mmtk/commit/4a24d55d91
|
|
https://github.com/ruby/mmtk/commit/1d2f7b9cfc
|
|
https://github.com/ruby/mmtk/commit/5c5c454f65
|
|
UNIQUE_OBJECT_ENQUEUING guarantees that object marking is atomic so that
an object cannot be marked more than once.
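The guarantee can be illustrated with an atomic compare-and-swap on a mark bit: among any number of racing GC workers, exactly one wins and enqueues the object. This is a sketch of the idea only, not MMTk's implementation; `ObjectHeader`, `attempt_mark`, and `count_winners` are hypothetical names.

```rust
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};

/// Hypothetical object header holding only a mark bit.
pub struct ObjectHeader {
    marked: AtomicBool,
}

impl ObjectHeader {
    pub fn new() -> Self {
        Self { marked: AtomicBool::new(false) }
    }

    /// Returns true for exactly one caller; that caller enqueues the object.
    pub fn attempt_mark(&self) -> bool {
        self.marked
            .compare_exchange(false, true, Ordering::AcqRel, Ordering::Acquire)
            .is_ok()
    }
}

/// Race `threads` workers on one object and count how many marked it.
pub fn count_winners(threads: usize) -> usize {
    let obj = ObjectHeader::new();
    let winners = AtomicUsize::new(0);
    std::thread::scope(|s| {
        for _ in 0..threads {
            s.spawn(|| {
                if obj.attempt_mark() {
                    winners.fetch_add(1, Ordering::Relaxed);
                }
            });
        }
    });
    winners.load(Ordering::Relaxed)
}
```

Without the atomic operation, two workers could both observe the object as unmarked and both enqueue it, so it would be scanned more than once.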
https://github.com/ruby/mmtk/commit/2f97fd8207
|