path: root/gc/mmtk
Age  Commit message  Author
2025-12-22[ruby/mmtk] Implement Ruby heapPeter Zhu
This heap emulates the growth characteristics of the Ruby default GC's heap. By default, the heap grows by 40%, requires at least 20% empty after a GC, and allows at most 65% empty before it shrinks the heap. This is all configurable via the same environment variables the default GC uses (`RUBY_GC_HEAP_FREE_SLOTS_GOAL_RATIO`, `RUBY_GC_HEAP_FREE_SLOTS_MIN_RATIO`, `RUBY_GC_HEAP_FREE_SLOTS_MAX_RATIO`, respectively). The Ruby heap can be enabled via the `MMTK_HEAP_MODE=ruby` environment variable.

Compared to the dynamic heap in MMTk (which uses the MemBalancer algorithm), the Ruby heap allows the heap to grow more generously, which uses a bit more memory but offers significant performance gains because it runs GC much less frequently.

The benchmarks below show that the Ruby heap is faster than the dynamic heap in every benchmark, and over 2x faster in many of them. Memory usage is often around 10-20% higher, with certain outliers that use significantly more memory, such as hexapdf and erubi-rails. The Ruby heap also brings MMTk much closer in performance to the default GC.
Ruby heap benchmark results:

--------------  --------------  ----------  ---------
bench           ruby heap (ms)  stddev (%)  RSS (MiB)
activerecord    233.6           10.7        85.9
chunky-png      457.1           1.1         79.3
erubi-rails     1148.0          3.8         133.3
hexapdf         1570.5          2.4         403.0
liquid-c        42.8            5.3         43.4
liquid-compile  41.3            7.6         52.6
liquid-render   102.8           3.8         55.3
lobsters        651.9           8.0         426.3
mail            106.4           1.8         67.2
psych-load      1552.1          0.8         43.4
railsbench      1707.2          6.0         145.6
rubocop         127.2           15.3        148.8
ruby-lsp        136.6           11.7        113.7
sequel          47.2            5.9         44.4
shipit          1197.5          3.6         301.0
--------------  --------------  ----------  ---------

Dynamic heap benchmark results:

--------------  -----------------  ----------  ---------
bench           dynamic heap (ms)  stddev (%)  RSS (MiB)
activerecord    845.3              3.1         76.7
chunky-png      525.9              0.4         38.9
erubi-rails     2694.9             3.4         115.8
hexapdf         2344.8             5.6         164.9
liquid-c        73.7               5.0         40.5
liquid-compile  107.1              6.8         40.3
liquid-render   147.2              1.7         39.5
lobsters        697.6              4.5         342.0
mail            224.6              2.1         64.0
psych-load      4326.7             0.6         37.4
railsbench      3218.0             5.5         124.7
rubocop         203.6              6.1         110.9
ruby-lsp        350.7              3.2         79.0
sequel          121.8              2.5         39.6
shipit          1510.1             3.1         220.8
--------------  -----------------  ----------  ---------

Default GC benchmark results:

--------------  ---------------  ----------  ---------
bench           default GC (ms)  stddev (%)  RSS (MiB)
activerecord    148.4            0.6         67.9
chunky-png      440.2            0.7         57.0
erubi-rails     722.7            0.3         97.8
hexapdf         1466.2           1.7         254.3
liquid-c        32.5             3.6         42.3
liquid-compile  31.2             1.9         35.4
liquid-render   88.3             0.7         30.8
lobsters        633.6            7.0         305.4
mail            76.6             1.6         53.2
psych-load      1166.2           1.3         29.1
railsbench      1262.9           2.3         114.7
rubocop         105.6            0.8         95.4
ruby-lsp        101.6            1.4         75.4
sequel          27.4             1.2         33.1
shipit          1083.1           1.5         163.4
--------------  ---------------  ----------  ---------

https://github.com/ruby/mmtk/commit/c0ca29922d
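As a rough illustration of the growth policy described in the "Implement Ruby heap" commit, the following Ruby sketch models the resize decision made after a GC. It is not the ruby/mmtk implementation: `HeapPolicy`, `from_env`, and `next_heap_size` are hypothetical names, and the shrink target (shrinking until the goal ratio of the heap is free) is an assumption, since the commit message only states the three thresholds.

```ruby
# Hypothetical model of the resize decision; not the ruby/mmtk code.
HeapPolicy = Struct.new(:goal_ratio, :min_ratio, :max_ratio) do
  # Read the same environment variables the commit message names,
  # falling back to the defaults it states (40%, 20%, 65%).
  def self.from_env(env = ENV)
    new(
      Float(env.fetch("RUBY_GC_HEAP_FREE_SLOTS_GOAL_RATIO", "0.40")),
      Float(env.fetch("RUBY_GC_HEAP_FREE_SLOTS_MIN_RATIO", "0.20")),
      Float(env.fetch("RUBY_GC_HEAP_FREE_SLOTS_MAX_RATIO", "0.65"))
    )
  end

  # Decide the heap size to use after a GC that left live_bytes alive.
  def next_heap_size(heap_bytes, live_bytes)
    free_ratio = (heap_bytes - live_bytes).to_f / heap_bytes
    if free_ratio < min_ratio
      # Too little free space after GC: grow by the goal ratio.
      (heap_bytes * (1.0 + goal_ratio)).round
    elsif free_ratio > max_ratio
      # Too much free space. Assumption: shrink until goal_ratio is free.
      (live_bytes / (1.0 - goal_ratio)).round
    else
      heap_bytes
    end
  end
end

policy = HeapPolicy.from_env
puts policy.next_heap_size(100 * 1024 * 1024, 90 * 1024 * 1024)
```

With the default thresholds, a heap that is 90% live after a GC grows, one that is 50% live keeps its size, and one that is 20% live shrinks.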
2025-12-21[ruby/mmtk] Add a 32 byte heap for allocating smaller objectsPeter Zhu
https://github.com/ruby/mmtk/commit/c4cca6c1c3
2025-12-20[ruby/mmtk] Implement fast path for bump pointer allocatorPeter Zhu
Adding a fast path for the bump pointer allocator can improve allocation performance. For the following microbenchmark with MMTK_HEAP_MIN=100MiB:

    10_000_000.times { String.new }

Before:

    810.7 ms ± 8.3 ms [User: 790.9 ms, System: 40.3 ms]

After:

    777.9 ms ± 10.4 ms [User: 759.0 ms, System: 37.9 ms]

https://github.com/ruby/mmtk/commit/0ff5c9f579
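For readers unfamiliar with the technique, here is a toy Ruby model of a bump pointer allocator with a fast path and a slow path. It is only a sketch of the general idea, not the code from the commit: `BumpAllocator`, its fields, and the block-refill strategy are all hypothetical, addresses are plain integers, and alignment is ignored.

```ruby
# Toy model of a bump pointer allocator; all names are hypothetical.
class BumpAllocator
  attr_reader :slow_path_calls

  def initialize(block_size)
    @block_size = block_size
    @cursor = 0            # next free "address" in the current block
    @limit = 0             # end of the current block (exclusive)
    @next_block_start = 0  # where the next fresh block would begin
    @slow_path_calls = 0
  end

  # Fast path: one addition and one comparison when the block has room.
  def alloc(size)
    new_cursor = @cursor + size
    if new_cursor <= @limit
      addr = @cursor
      @cursor = new_cursor
      return addr
    end
    alloc_slow(size)
  end

  private

  # Slow path: acquire a fresh block, then allocate from its start.
  def alloc_slow(size)
    raise ArgumentError, "object larger than a block" if size > @block_size

    @slow_path_calls += 1
    @cursor = @next_block_start
    @limit = @cursor + @block_size
    @next_block_start = @limit
    addr = @cursor
    @cursor += size
    addr
  end
end

allocator = BumpAllocator.new(64)
3.times { puts allocator.alloc(32) }
```

The point of a fast path is that most allocations only touch the cursor and the limit; the more expensive work of obtaining a new block happens only when the current block is exhausted.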
2025-12-20[ruby/mmtk] Make rb_gc_impl_heap_id_for_size use MMTK_HEAP_COUNTPeter Zhu
https://github.com/ruby/mmtk/commit/2185189df4
2025-12-20[ruby/mmtk] Call rb_bug when Ruby mutator thread panicsPeter Zhu
This will allow the Ruby backtrace, memory mapping, etc. to be output when a Ruby mutator thread panics. https://github.com/ruby/mmtk/commit/d10fd325dd
2025-12-19[ruby/mmtk] Extract max object size to MMTK_MAX_OBJ_SIZEPeter Zhu
https://github.com/ruby/mmtk/commit/ed9036c295
2025-12-19[ruby/mmtk] Extract heap count to MMTK_HEAP_COUNT macroPeter Zhu
https://github.com/ruby/mmtk/commit/4e789e118b
2025-12-05Revert "gc.c: Pass shape_id to `newobj_init`"Peter Zhu
This reverts commit 228d13f6ed914d1e7f6bd2416e3f5be8283be865. This commit makes default.c and mmtk.c depend on shape.h, which prevents them from building independently.
2025-12-03gc.c: Pass shape_id to `newobj_init`Jean Boussier
Attempt to fix the following SEGV:

```
ruby(gc_mark) ../src/gc/default/default.c:4429
ruby(gc_mark_children+0x45) [0x560b380bf8b5] ../src/gc/default/default.c:4625
ruby(gc_mark_stacked_objects) ../src/gc/default/default.c:4647
ruby(gc_mark_stacked_objects_all) ../src/gc/default/default.c:4685
ruby(gc_marks_rest) ../src/gc/default/default.c:5707
ruby(gc_marks+0x4e7) [0x560b380c41c1] ../src/gc/default/default.c:5821
ruby(gc_start) ../src/gc/default/default.c:6502
ruby(heap_prepare+0xa4) [0x560b380c4efc] ../src/gc/default/default.c:2074
ruby(heap_next_free_page) ../src/gc/default/default.c:2289
ruby(newobj_cache_miss) ../src/gc/default/default.c:2396
ruby(RB_SPECIAL_CONST_P+0x0) [0x560b380c5df4] ../src/gc/default/default.c:2420
ruby(RB_BUILTIN_TYPE) ../src/include/ruby/internal/value_type.h:184
ruby(newobj_init) ../src/gc/default/default.c:2136
ruby(rb_gc_impl_new_obj) ../src/gc/default/default.c:2500
ruby(newobj_of) ../src/gc.c:996
ruby(rb_imemo_new+0x37) [0x560b380d8bed] ../src/imemo.c:46
ruby(imemo_fields_new) ../src/imemo.c:105
ruby(rb_imemo_fields_new) ../src/imemo.c:120
```

I have no reproduction, but my understanding based on the backtrace and error is that GC is triggered inside `newobj_init`, causing the new object to be marked while in an incomplete state. I believe the fix is to pass the `shape_id` down to `newobj_init` so it can be set before the GC has a chance to trigger.
2025-11-19[ruby/mmtk] Ensure not blocking for GC in rb_gc_impl_before_forkPeter Zhu
rb_gc_impl_before_fork locks the VM and barriers all the Ractors before calling mmtk_before_fork. However, since rb_mmtk_block_for_gc is a barrier point, one or more Ractors could be paused there. mmtk_before_fork is not compatible with that: it assumes that the MMTk workers are idle, but they are not idle because they are busy working on a GC.

This commit essentially implements a trylock. It will optimistically lock but will release the lock if it detects that any other Ractors are waiting in rb_mmtk_block_for_gc.

For example, the following script demonstrates the issue:

    puts "Hello #{Process.pid}"

    100.times do |i|
      puts "i = #{i}"
      Ractor.new(i) do |j|
        puts "Ractor #{j} hello"
        1000.times do |i|
          s = "#{j}-#{i}"
        end
        Ractor.receive
        puts "Ractor #{j} goodbye"
      end
      pid = fork { }
      puts "Child pid is #{pid}"
      _, status = Process.waitpid2 pid
      puts status.success?
    end

    puts "Goodbye"

We can see the MMTk worker thread is waiting to start the GC:

    #4 0x00007ffff66538b1 in rb_mmtk_stop_the_world () at gc/mmtk/mmtk.c:101
    #5 0x00007ffff6d04caf in mmtk_ruby::collection::{impl#0}::stop_all_mutators<mmtk::scheduler::gc_work::{impl#14}::do_work::{closure_env#0}<mmtk::plan::immix::gc_work::ImmixGCWorkContext<mmtk_ruby::Ruby, 0>>> (_tls=..., mutator_visitor=...) at src/collection.rs:23

However, the mutator thread is stuck in mmtk_before_fork trying to stop that worker thread:

    #4 0x00007ffff6c0b621 in std::sys::thread::unix::Thread::join () at library/std/src/sys/thread/unix.rs:134
    #5 0x00007ffff6658b6e in std::thread::JoinInner<()>::join<()> (self=...)
    #6 0x00007ffff6658d4c in std::thread::JoinHandle<()>::join<()> (self=...)
    #7 0x00007ffff665795e in mmtk_ruby::binding::RubyBinding::join_all_gc_threads (self=0x7ffff72462d0 <mmtk_ruby::BINDING+8>) at src/binding.rs:115
    #8 0x00007ffff66561a8 in mmtk_ruby::api::mmtk_before_fork () at src/api.rs:309
    #9 0x00007ffff66556ff in rb_gc_impl_before_fork (objspace_ptr=0x555555d17980) at gc/mmtk/mmtk.c:1054
    #10 0x00005555556bbc3e in rb_gc_before_fork () at gc.c:5429

https://github.com/ruby/mmtk/commit/1a629504a7
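The "optimistically lock, then back off if anyone is waiting for a GC" pattern from the commit above can be modelled in plain Ruby threads. This is only a sketch of the general pattern under stated assumptions: `ForkGate` and all of its methods are hypothetical, a `Mutex` stands in for the VM lock, and a counter stands in for the set of Ractors parked in rb_mmtk_block_for_gc.

```ruby
# Hypothetical model of a "trylock with back-off"; not the ruby/mmtk code.
class ForkGate
  attr_reader :attempts

  def initialize
    @vm_lock = Mutex.new     # stands in for the VM lock
    @state_lock = Mutex.new  # protects @gc_waiters
    @gc_waiters = 0          # mutators currently blocked waiting for a GC
    @attempts = 0
  end

  def enter_gc_wait
    @state_lock.synchronize { @gc_waiters += 1 }
  end

  def leave_gc_wait
    @state_lock.synchronize { @gc_waiters -= 1 }
  end

  # Take the lock optimistically; if any mutator is waiting for a GC,
  # release the lock so the GC can make progress, then retry.
  def with_lock_when_no_gc_waiters
    loop do
      @vm_lock.lock
      @attempts += 1
      if @state_lock.synchronize { @gc_waiters.zero? }
        begin
          return yield
        ensure
          @vm_lock.unlock
        end
      end
      @vm_lock.unlock
      Thread.pass
    end
  end
end

gate = ForkGate.new
gate.enter_gc_wait
waiter = Thread.new { sleep 0.05; gate.leave_gc_wait }
puts gate.with_lock_when_no_gc_waiters { "safe to fork" }
waiter.join
```

The block only runs once no mutator is parked waiting for a GC, which is the condition the commit message says mmtk_before_fork depends on.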
2025-11-19[ruby/mmtk] Add VM barrier in rb_gc_impl_before_forkPeter Zhu
We need the VM barrier in rb_gc_impl_before_fork to stop the other Ractors because otherwise they could be allocating objects in the fast path which could be calling mmtk_add_obj_free_candidate. Since mmtk_add_obj_free_candidate acquires a lock on obj_free_candidates in weak_proc.rs, this lock may not be released in the child process after the Ractor dies.

For example, the following script demonstrates the issue:

    puts "Hello #{Process.pid}"

    100.times do |i|
      puts "i = #{i}"
      Ractor.new(i) do |j|
        puts "Ractor #{j} hello"
        1000.times do |i|
          s = "#{j}-#{i}"
        end
        Ractor.receive
        puts "Ractor #{j} goodbye"
      end
      pid = fork { }
      puts "Child pid is #{pid}"
      _, status = Process.waitpid2 pid
      puts status.success?
    end

    puts "Goodbye"

In the child process, we can see that it is stuck trying to acquire the lock on obj_free_candidates:

    #5 0x00007192bfb53f10 in mmtk_ruby::weak_proc::WeakProcessor::get_all_obj_free_candidates (self=0x7192c0657498 <mmtk_ruby::BINDING+72>) at src/weak_proc.rs:52
    #6 0x00007192bfa634c3 in mmtk_ruby::api::mmtk_get_all_obj_free_candidates () at src/api.rs:295
    #7 0x00007192bfa61d50 in rb_gc_impl_shutdown_call_finalizer (objspace_ptr=0x578c17abfc50) at gc/mmtk/mmtk.c:1032
    #8 0x0000578c1601e48e in rb_ec_finalize (ec=0x578c17ac06d0) at eval.c:166
    #9 rb_ec_cleanup (ec=<optimized out>, ex=<optimized out>) at eval.c:257
    #10 0x0000578c1601ebf6 in ruby_cleanup (ex=<optimized out>) at eval.c:180
    #11 ruby_stop (ex=<optimized out>) at eval.c:292
    #12 0x0000578c16127124 in rb_f_fork (obj=<optimized out>) at process.c:4291
    #13 rb_f_fork (obj=<optimized out>) at process.c:4281

https://github.com/ruby/mmtk/commit/eb4b229858
2025-11-14[ruby/mmtk] Lock VM in fork hooksPeter Zhu
If we are using multiple Ractors, other Ractors may allocate objects after rb_gc_impl_before_fork is run because it does not lock the VM. This can leave the GC in a bad state because rb_gc_impl_before_fork may have terminated the GC threads, so a GC cannot run until rb_gc_impl_after_fork has run. https://github.com/ruby/mmtk/commit/e4bea5676d
2025-11-09[ruby/mmtk] Lock the VM when freeing objects in rb_gc_impl_shutdown_call_finalizerPeter Zhu
https://github.com/ruby/mmtk/commit/1828f6596f
2025-11-08Move rb_gc_verify_shareable to gc.cPeter Zhu
rb_gc_verify_shareable is not GC implementation specific so it should live in gc.c.
2025-10-31[ruby/mmtk] Bump mmtk-corePeter Zhu
https://github.com/ruby/mmtk/commit/9876d8f0a1
2025-10-23catch up modular-gcKoichi Sasada
2025-09-17Update rb_gc_impl_new_obj in mmtk.cPeter Zhu
2025-08-25Fix MMTk for compatibilityPeter Zhu
2025-07-30[ruby/mmtk] Skip weak references that are special constsPeter Zhu
If a reference marked weak becomes a special const, it will crash because it is not a GC handled object. We should skip special consts here. https://github.com/ruby/mmtk/commit/870a79426b
2025-07-29[ruby/mmtk] Fix warnings from cargo fmtPeter Zhu
https://github.com/ruby/mmtk/commit/84975a8840
2025-07-29[ruby/mmtk] Fix clippy warningsPeter Zhu
https://github.com/ruby/mmtk/commit/45f991578e
2025-06-13mmtk: Get rid of unused reference to FL_EXIVARJean Boussier
Notes: Merged: https://github.com/ruby/ruby/pull/13610
2025-06-09Take file and line in GC VM locksPeter Zhu
This commit adds file and line to GC VM locking functions for debugging purposes and adds upper case macros to pass __FILE__ and __LINE__. Notes: Merged: https://github.com/ruby/ruby/pull/13550
2025-06-03Allow pass special constants to the write barrierPeter Zhu
Some GC implementations want to always know when an object is written to, even if the written value is a special constant. Checking special constants in rb_obj_written was a micro-optimization that made assumptions about the GC implementation. Notes: Merged: https://github.com/ruby/ruby/pull/13497
2025-05-30[ruby/mmtk] Fix environment variable parsingKunshan Wang
Use more idiomatic Rust approaches. https://github.com/ruby/mmtk/commit/ef125f9eae
2025-05-30[ruby/mmtk] Fix clippy warnings and formatting.Kunshan Wang
We also enable `#![warn(unsafe_op_in_unsafe_fn)]` in the whole mmtk_ruby crate. https://github.com/ruby/mmtk/commit/8b8025f71a
2025-05-30[ruby/mmtk] Bump MMTk and dependencies versionKunshan Wang
https://github.com/ruby/mmtk/commit/de252637ec
2025-05-30[ruby/mmtk] Remove unused constantKunshan Wang
Remove the unused constant HAS_MOVED_GFIELDSTBL and related methods. In the mmtk/mmtk-ruby repo, we are now able to find the global field (IV) table of a moved object during copying GC without using the HAS_MOVED_GFIELDSTBL bit. We synchronize some of the code, although we haven't implemented moving GC in ruby/mmtk, yet. See: https://github.com/mmtk/mmtk-ruby/commit/13080acdf553f20a88a7ea9ab9f6877611017136 https://github.com/ruby/mmtk/commit/400ba4e747
2025-05-29[ruby/mmtk] Remove dependance on internal/object.hPeter Zhu
https://github.com/ruby/mmtk/commit/fdc13963f0
2025-05-26Add shape_id to RBasic under 32 bitJohn Hawthorn
This makes `RBobject` `4B` larger on 32 bit systems but simplifies the implementation a lot. [Feature #21353] Co-authored-by: Jean Boussier <byroot@ruby-lang.org> Notes: Merged: https://github.com/ruby/ruby/pull/13341
2025-05-21[ruby/mmtk] Fix object ID in rb_gc_impl_define_finalizerPeter Zhu
The 0th element of the finalizer table array should be the object ID. https://github.com/ruby/mmtk/commit/75e4a82652
2025-05-21[ruby/mmtk] Fix object ID for finalizersPeter Zhu
We should get the object ID for finalizers in rb_gc_impl_define_finalizer instead of when we create the finalizer job in make_final_job because when we are in multi-Ractor mode, object ID needs to walk the references which allocates an identity hash table. We cannot allocate in make_final_job because it is in a MMTk worker thread. https://github.com/ruby/mmtk/commit/922f22a690
2025-05-16rb_gc_impl_copy_finalizer: generate a new object idJean Boussier
Fix a regression introduced by: https://github.com/ruby/ruby/pull/13155 Notes: Merged: https://github.com/ruby/ruby/pull/13350
2025-05-16Add missing lock to `rb_gc_impl_copy_finalizer`Jean Boussier
Notes: Merged: https://github.com/ruby/ruby/pull/13350
2025-05-15Add missing lock in `rb_gc_impl_undefine_finalizer`Jean Boussier
The table is global so accesses must be synchronized. Notes: Merged: https://github.com/ruby/ruby/pull/13349
2025-05-15YJIT: ZJIT: Allow both JITs in the same buildAlan Wu
This commit allows building YJIT and ZJIT simultaneously, a "combo build". Previously, `./configure --enable-yjit --enable-zjit` failed. At runtime, though, only one of the two can be enabled at a time. Add a root Cargo workspace that contains both the yjit and zjit crate. The common Rust build integration mechanisms are factored out into defs/jit.mk. Combo YJIT+ZJIT dev builds are supported; if either JIT uses `--enable-*=dev`, both of them are built in dev mode. The combo build requires Cargo, but building one JIT at a time with only rustc in release build remains supported. Notes: Merged: https://github.com/ruby/ruby/pull/13262
2025-05-08Move `object_id` in object fields.Jean Boussier
And get rid of the `obj_to_id_tbl`. It's no longer needed: the `object_id` is now stored inline in the object alongside instance variables. We still need the inverse table in case `_id2ref` is invoked, but we lazily build it by walking the heap if that happens. Handling `object_id` is also no longer a GC implementation concern; it is now generic. Co-Authored-By: Matt Valentine-House <matt@eightbitraptor.com> Notes: Merged: https://github.com/ruby/ruby/pull/13159
2025-05-08Rename `ivptr` -> `fields`, `next_iv_index` -> `next_field_index`Jean Boussier
Instance variables will no longer be the only thing stored inline via shapes, so keeping the `iv_index` and `ivptr` names would be confusing. `field` encompasses anything that can be stored in a VALUE array. Similarly, `gen_ivtbl` becomes `gen_fields_tbl`. Notes: Merged: https://github.com/ruby/ruby/pull/13159
2025-04-23rb_gc_impl_define_finalizer: unlock on early returnJean Boussier
2025-04-22Add missing lock in `rb_gc_impl_define_finalizer`Jean Boussier
`objspace->finalizer_table` must be synchronized, otherwise concurrent insertion from multiple ractors will cause a crash.

Repro:

```ruby
ractors = 5.times.map do |i|
  Ractor.new do
    100_000.times.map do
      o = Object.new
      ObjectSpace.define_finalizer(o, ->(id) {})
      o
    end
  end
end

ractors.each(&:take)
```

Notes: Merged: https://github.com/ruby/ruby/pull/13151
2025-04-15Bump crossbeam-channel from 0.5.13 to 0.5.15 in /gc/mmtkdependabot[bot]
Bumps [crossbeam-channel](https://github.com/crossbeam-rs/crossbeam) from 0.5.13 to 0.5.15.
- [Release notes](https://github.com/crossbeam-rs/crossbeam/releases)
- [Changelog](https://github.com/crossbeam-rs/crossbeam/blob/master/CHANGELOG.md)
- [Commits](https://github.com/crossbeam-rs/crossbeam/compare/crossbeam-channel-0.5.13...crossbeam-channel-0.5.15)

---
updated-dependencies:
- dependency-name: crossbeam-channel
  dependency-version: 0.5.15
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>

Notes: Merged: https://github.com/ruby/ruby/pull/13097
2025-04-15Lazily create `objspace->id_to_obj_tbl`Jean Boussier
This inverse table is only useful if `ObjectSpace._id2ref` is used, which is extremely rare. The only notable exception is the `drb` gem and even then it has an option not to rely on `_id2ref`. So if we assume this table will never be looked up, we can just not maintain it, and if it turns out `_id2ref` is called, we can lock the VM and re-build it.

```
compare-ruby: ruby 3.5.0dev (2025-04-10T09:44:40Z master 684cfa42d7) +YJIT +PRISM [arm64-darwin24]
built-ruby: ruby 3.5.0dev (2025-04-10T10:13:43Z lazy-id-to-obj d3aa9626cc) +YJIT +PRISM [arm64-darwin24]
warming up..

|           |compare-ruby|built-ruby|
|:----------|-----------:|---------:|
|baseline   |     26.364M|   25.974M|
|           |       1.01x|         -|
|object_id  |     10.293M|   14.202M|
|           |           -|     1.38x|
```

Notes: Merged: https://github.com/ruby/ruby/pull/13115
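The lazy inverse table idea from the commit above can be sketched in a few lines of Ruby. This is a simplified model, not the interpreter's implementation: `ObjectRegistry` and its methods are hypothetical, an array stands in for the heap, and a `Mutex` stands in for locking the VM.

```ruby
# Hypothetical model of a lazily built id -> object table.
class ObjectRegistry
  def initialize
    @objects = []     # [id, obj] pairs; stands in for walking the heap
    @next_id = 0
    @id_to_obj = nil  # inverse table, built only on first lookup
    @lock = Mutex.new # stands in for the VM lock
  end

  # Register an object and return its id. The inverse table is only
  # maintained once some caller has asked for it.
  def add(obj)
    @lock.synchronize do
      id = (@next_id += 1)
      @objects << [id, obj]
      @id_to_obj[id] = obj if @id_to_obj
      id
    end
  end

  # Rare path: build the inverse table on first use, then look up.
  def id2ref(id)
    @lock.synchronize do
      @id_to_obj ||= @objects.to_h
      @id_to_obj.fetch(id) { raise RangeError, "#{id} is not a known id" }
    end
  end

  def inverse_table_built?
    !@id_to_obj.nil?
  end
end

registry = ObjectRegistry.new
id = registry.add("hello")
puts registry.inverse_table_built?
puts registry.id2ref(id)
puts registry.inverse_table_built?
```

The common path (`add`) pays nothing for the inverse table until the first `id2ref` call, which is the trade-off the benchmark in the commit message measures for `object_id`.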
2025-04-08[ruby/mmtk] Do root scanning in scan_vm_specific_rootsKunshan Wang
We rely on scan_vm_specific_roots to reach all stacks via the following path: VM -> ractors -> threads -> fibers -> stacks https://github.com/ruby/mmtk/commit/0a6a835aaa
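The traversal path named in the commit above (VM -> ractors -> threads -> fibers -> stacks) can be pictured with a small Ruby model. The struct names and `stack_roots` are hypothetical and exist only to show the shape of the walk; the real scanning is done by the binding's root scanning code, not by anything like this.

```ruby
# Hypothetical data model of the VM -> ractors -> threads -> fibers -> stacks path.
StackModel  = Struct.new(:slots)
FiberModel  = Struct.new(:stack)
ThreadModel = Struct.new(:fibers)
RactorModel = Struct.new(:threads)
VMModel     = Struct.new(:ractors)

# Collect every stack slot reachable from the VM by following the chain.
def stack_roots(vm)
  vm.ractors.flat_map do |ractor|
    ractor.threads.flat_map do |thread|
      thread.fibers.flat_map { |fiber| fiber.stack.slots }
    end
  end
end

vm = VMModel.new([
  RactorModel.new([
    ThreadModel.new([
      FiberModel.new(StackModel.new([:a, :b])),
      FiberModel.new(StackModel.new([:c]))
    ])
  ]),
  RactorModel.new([
    ThreadModel.new([FiberModel.new(StackModel.new([:d]))])
  ])
])

p stack_roots(vm)
```

Starting from a single VM-level entry point means a newly created ractor, thread, or fiber is covered automatically as long as it is linked into this chain.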
2025-04-01Remove incorrect assertionMatt Valentine-House
ractor_cache will always be NULL in this context Notes: Merged: https://github.com/ruby/ruby/pull/13031
2025-03-31Don't preserve `object_id` when moving object to another RactorJean Boussier
That seemed like the logical thing to do to me, but ko1 disagrees. Notes: Merged: https://github.com/ruby/ruby/pull/13008
2025-03-31Ractor: Fix moving embedded objectsJean Boussier
[Bug #20271] [Bug #20267] [Bug #20255]

`rb_obj_alloc(RBASIC_CLASS(obj))` will always allocate from the basic 40B pool, so if `obj` is larger than `40B`, we'll create a corrupted object when we later copy the shape_id. Instead we can use the same logic as Ractor copy, which is to use `rb_obj_clone`, and later ask the GC to free the original object.

We then must turn it into a `T_OBJECT`, because otherwise just changing its class to `RactorMoved` leaves a lot of ways to keep using the object, e.g.:

```
a = [1, 2, 3]
Ractor.new{}.send(a, move: true)
[].concat(a) # Should raise, but didn't.
```

If it turns out that `rb_obj_clone` isn't performant enough for some uses, we can always have carefully crafted specialized paths for the types that would benefit from it.

Notes: Merged: https://github.com/ruby/ruby/pull/13008
2025-03-13Output object_id in object metadata for MMTkPeter Zhu
Notes: Merged: https://github.com/ruby/ruby/pull/12915
2025-02-24[ruby/mmtk] Trigger forced GC in GC.startKunshan Wang
We now use `MMTK::handle_user_collection_request(true, ...)` to force triggering a GC instead of enabling GC temporarily. https://github.com/ruby/mmtk/commit/02ef47f818
2025-02-20[ruby/mmtk] Fix compatibility for Rust 1.85Peter Zhu
https://github.com/ruby/mmtk/commit/9da566e26a
2025-02-19Implement rb_gc_object_metadata for MMTkPeter Zhu
Notes: Merged: https://github.com/ruby/ruby/pull/12777