path: root/gc/default
Age | Commit message | Author
5 daysRun FREEOBJ hook as separate stepJohn Hawthorn
7 daysEnsure DTrace probes capture all GC marking eventsKunshan Wang
`gc_prof_mark_timer_start` and `gc_prof_mark_timer_stop` include DTrace hooks for the `MARK_BEGIN` and `MARK_END` events, respectively. Previously, those probes were only triggered in `gc_marks`. However, `gc_marks_continue` and `gc_rest` also perform marking, and that activity was not captured by the probes. We move the invocation of `gc_prof_mark_timer_start` and `gc_prof_mark_timer_stop` into `gc_marking_enter` and `gc_marking_exit` to ensure all marking activities are captured by the probes.
2026-05-20Unify gc_counter_t on rbimpl_atomic_uint64_tMatt Valentine-House
2026-05-20Snapshot malloc counters at end of sweepMatt Valentine-House
Snapshotting at start of marking lets sweep-time frees count against the next epoch, which roughly halves GC frequency on alloc-heavy workloads. Move the snapshot to end of sweep so the next epoch starts from a clean baseline.
2026-05-20Make sure we flush the cached count to update heap slotsMatt Valentine-House
2026-05-20Better feature detection for malloc counter locksMatt Valentine-House
2026-05-20Reorder rb_gc_impl_stat to keep heap_live_slots accurateMatt Valentine-House
Several SETs in rb_gc_impl_stat may allocate a T_BIGNUM RVALUE when the value exceeds FIXNUM_MAX. This is invisible on LP64 but trips on LLP64 Windows and ILP32 Linux where FIXNUM_MAX ~= 1.07GB. If those allocations happen *after* setting heap_live_slots, then stat[:heap_live_slots] reflects a stale snapshot, and tests that assert on it fail. This commit reorders everything so every potentially-allocating SET runs first, and the slot counters are SET last.
2026-05-20Expose monotonic malloc/free totals via GC.statMatt Valentine-House
2026-05-20Use monotonic add/sub counters for malloc_increaseMatt Valentine-House
Replace the single objspace->malloc_counters.{increase,oldmalloc_increase} size_t fields with pairs of monotonically-increasing counters. Snapshots of these counters are taken at each GC, so that the live malloc_increase is computed as (malloc - malloc_at_last_gc) - (free - free_at_last_gc). We update the baselines at each GC. Minor GCs update malloc and free associated with young objects only (counters). Major GCs update based on "oldcounters" as well.

The malloc/free counters are 64 bits wide, which should provide ample headroom for real world programs (>500 years at 1 GB/sec allocation rate XD). We use size_t on 64-bit and uint64_t on 32-bit, wrapped by a gc_counter_t struct.

However, because updating a uint64_t is a multi-instruction operation on 32-bit architectures, we have to introduce a lock to the malloc_counters struct to avoid racing. We introduced 2 new macros, MALLOC_COUNTERS_LOCK and MALLOC_COUNTERS_UNLOCK, that use `rb_nativethread_lock_t`. The lock is initialized in rb_gc_impl_objspace_init and destroyed in rb_gc_impl_objspace_free. We chose this because it mirrors the existing finalizer_lock pattern in wbcheck.

On 64-bit platforms aligned 64-bit loads are atomic, and writes are already using RUBY_ATOMIC_SIZE_ADD, so the locks are not needed and the macros do nothing.
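The snapshot arithmetic described above can be sketched in a few lines of C. This is only an illustration under stated assumptions: the struct layout and the names `malloc_total`, `free_total`, `live_malloc_increase` and `snapshot_at_gc` are hypothetical rather than taken from the Ruby source, and the locking and atomics the commit discusses are left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: two counters that only ever grow, plus the
 * values they held at the last GC. Not the real objspace struct. */
struct malloc_counters {
    uint64_t malloc_total;      /* bytes ever allocated */
    uint64_t free_total;        /* bytes ever freed */
    uint64_t malloc_at_last_gc; /* baseline snapshot */
    uint64_t free_at_last_gc;   /* baseline snapshot */
};

/* (malloc - malloc_at_last_gc) - (free - free_at_last_gc).
 * The result can be negative when more was freed than allocated. */
int64_t live_malloc_increase(const struct malloc_counters *c)
{
    assert(c->malloc_total >= c->malloc_at_last_gc);
    assert(c->free_total >= c->free_at_last_gc);
    uint64_t allocated = c->malloc_total - c->malloc_at_last_gc;
    uint64_t freed = c->free_total - c->free_at_last_gc;
    if (allocated >= freed) {
        return (int64_t)(allocated - freed);
    }
    return -(int64_t)(freed - allocated);
}

/* Move the baseline so the next epoch starts from zero. */
void snapshot_at_gc(struct malloc_counters *c)
{
    c->malloc_at_last_gc = c->malloc_total;
    c->free_at_last_gc = c->free_total;
}
```

Because the totals never decrease, taking a new baseline is a plain copy and the per-epoch delta never needs to be reset in place.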
2026-05-20Preserve usable slot size when RVALUE_OVERHEAD is non-zeroMatt Valentine-House
We made a mistake calculating slot sizes during the heap slot sizes refactor. Previously BASE_SLOT_SIZE included RVALUE_OVERHEAD; this was lost during the refactor to use the SLOT macro. The result of this was that when Ruby was compiled with -DRUBY_DEBUG it was assumed that the last word of each slot was RVALUE_OVERHEAD. Because this hadn't been taken into account at allocation time, all slots were effectively one word shorter. This PR adds RVALUE_OVERHEAD to the size calculated in the SLOT macro directly, so it will be added on to the physically allocated size at allocation time.
2026-05-16Always define RB_GC_OBJ_HAS_SUFFIXPeter Zhu
2026-05-16Move GC object suffix to gc.hPeter Zhu
2026-05-14Move rb_ractor_setup_belonging to rb_newobjPeter Zhu
2026-05-14Fix ec NULL assertion failure during gc stressMatt Valentine-House
rb_gc_initialize_vm_context calls GET_EC, which does VM_ASSERT(ec != NULL). When Ruby is built with RUBY_DEBUG=1 and GC stress is set to run at boot with RUBY_DEBUG=gc_stress then GC gets run inside Init_BareVM when we're setting up the main_thread. In gc_start we gate the GC with some early returns that prevent us actually attempting a GC if the heap and objspace are not ready yet, but we're attempting to initialize the gc's VM context before those gates, causing the assertion to fail (because the VM isn't ready yet). This commit moves the vm_context setup after the gates, so we don't attempt it before objspace and the heap are fully set up. To repro this bug configure with --enable-dev-env and cflags=-DRUBY_DEBUG and then run RUBY_DEBUG=gc_stress ./ruby -v
2026-05-07Use rb_gc_get_ec in rb_gc_event_hookPeter Zhu
This would allow rb_gc_event_hook to run in a GC thread that is a non-Ruby thread.
2026-05-05gc: Simplify updating the shape after moveJean Boussier
Back when this code was added, moving a T_OBJECT to a different size pool required rebuilding its shape tree, which could allocate, potentially triggering GC during GC. Ref: https://github.com/ruby/ruby/pull/6926 Ref: https://github.com/ruby/ruby/pull/6938 However, this is no longer a concern. `SHAPE_T_OBJECT` has been removed, and now transitioning a shape from one size pool to another never involves an allocation. Ref: https://github.com/ruby/ruby/pull/13519 Hence we can remove a lot of complexity, and directly update the shape right after moving the object.
2026-05-04Use EC saved in GC for root markingPeter Zhu
Since EC is thread-local, we previously used rb_gc_worker_thread_set_vm_context in MMTk worker threads to temporarily set the EC. However, this was inelegant and also occasionally caused crashes when marking threads/fibers for the current EC since it will mark the current machine stack twice (once during root marking and once for the fiber). However, since the machine stack is actively being used, the contents may be different when marking the fiber. Since all objects on the machine stack are pinned, this may cause an unpinned object to be pinned, which is not allowed in Immix. The following crash can be observed:

    Object 0x200fffbc7d8 is trying to pin 0x200ffc80188
    0: mmtk_ruby::handle_gc_thread_panic
    1: mmtk_ruby::set_panic_hook::{{closure}}
    2: <alloc::boxed::Box<dyn for<'a, 'b> core::ops::function::Fn<(&'a std::panic::PanicHookInfo<'b>,), Output = ()> + core::marker::Sync + core::marker::Send> as core::ops::function::Fn<(&std::panic::PanicHookInfo,)>>::call at /rustc/59807616e1fa2540724bfbac14d7976d7e4a3860/library/alloc/src/boxed.rs:2254:9
    3: std::panicking::panic_with_hook at /rustc/59807616e1fa2540724bfbac14d7976d7e4a3860/library/std/src/panicking.rs:833:13
    4: std::panicking::panic_handler::{closure#0} at /rustc/59807616e1fa2540724bfbac14d7976d7e4a3860/library/std/src/panicking.rs:698:13
    5: std::sys::backtrace::__rust_end_short_backtrace::<std::panicking::panic_handler::{closure#0}, !> at /rustc/59807616e1fa2540724bfbac14d7976d7e4a3860/library/std/src/sys/backtrace.rs:182:18
    6: __rustc::rust_begin_unwind at /rustc/59807616e1fa2540724bfbac14d7976d7e4a3860/library/std/src/panicking.rs:689:5
    7: core::panicking::panic_fmt at /rustc/59807616e1fa2540724bfbac14d7976d7e4a3860/library/core/src/panicking.rs:80:14
    8: <mmtk_ruby::scanning::VMScanning as mmtk::vm::scanning::Scanning<mmtk_ruby::Ruby>>::scan_object_and_trace_edges::{{closure}}
    9: mmtk_ruby::abi::ObjectClosure::c_function_registered
    10: rb_mmtk_call_object_closure at gc/mmtk/mmtk.c:976:19
    11: rb_gc_impl_mark_and_pin at gc/mmtk/mmtk.c:1008:5
    12: rb_gc_impl_mark_and_pin at gc/mmtk/mmtk.c:1004:1
    13: gc_mark_maybe_internal at gc.c:2908:5
    14: gc_mark_maybe_internal at gc.c:2906:1
    15: gc_mark_maybe_each_location at gc.c:2939:5
    16: gc_mark_maybe_each_location at gc.c:2937:1
    17: each_location at gc.c:2924:9
    18: each_location_ptr at gc.c:2933:5
    19: each_location_ptr at gc.c:2930:1
    20: rb_gc_mark_machine_context at gc.c:3200:5
    21: rb_execution_context_mark at vm.c:3768:9
    22: cont_mark at cont.c:1155:5
    23: fiber_mark at cont.c:1284:5
    24: rb_mmtk_call_gc_mark_children at gc/mmtk/mmtk.c:318:5
    25: <mmtk_ruby::scanning::VMScanning as mmtk::vm::scanning::Scanning<mmtk_ruby::Ruby>>::scan_object_and_trace_edges::{{closure}}
2026-04-09Replace sweeping_heaps map with a counterMatt Valentine-House
We implemented some bit twiddling logic with an unsigned int to have a neat way of tracking which heaps were currently sweeping, but we actually don't need to care which heap is sweeping right now, just whether some are or not, so we can replace this with a counter.
2026-04-09Use the pre-processor to generate slot sizes and reciprocalsMatt Valentine-House
2026-04-09Allow flex in heap growth thresholdMatt Valentine-House
Add a 7/8 multiplier to the min_free_slots checks in gc_sweep_finish_heap and gc_marks_finish, allowing heaps to be up to ~12.5% below the free slots target without triggering a major GC or forced growth. With 12 heaps instead of 5, each heap independently hitting the exact threshold would cause excessive memory growth. The slack prevents cascading growth decisions while still ensuring heaps stay close to their target occupancy.
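As a rough illustration of the 7/8 slack, the check might look like the sketch below. The function name `free_slots_below_flex_target` is hypothetical, and the real checks in gc_sweep_finish_heap and gc_marks_finish may be structured differently. Only the 7/8 factor comes from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: true when a heap has fallen more than ~12.5%
 * below its free-slot target and should trigger growth or a major GC. */
bool free_slots_below_flex_target(size_t free_slots, size_t min_free_slots)
{
    assert(min_free_slots <= SIZE_MAX / 7); /* guard the multiplication */
    size_t flexed_target = min_free_slots * 7 / 8;
    return free_slots < flexed_target;
}
```

With a target of 8,000 free slots the flexed threshold is 7,000, so a heap with 7,500 free slots is left alone while one with 6,999 is not.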
2026-04-09Cache has_sweeping_pages as a bitfieldMatt Valentine-House
2026-04-09Introduce RVALUE_SIZE GC constantMatt Valentine-House
Add GC::INTERNAL_CONSTANTS[:RVALUE_SIZE] to store the usable size (excluding debug overhead) of the smallest pool that can hold a standard RVALUE.
2026-04-09Introduce power-of-two size poolsMatt Valentine-House
Replace the RVALUE_SLOT_SIZE-multiplier based pool sizes with explicit power-of-two (and near-power-of-two) slot sizes. On 64-bit this gives 12 heaps (32, 40, 64, 80, 96, 128, 160, 256, 512, 640, 768, 1024) instead of 5, providing finer granularity and less internal fragmentation. On 32-bit the layout is 5 heaps (32, 64, 128, 256, 512).
2026-04-09Use sizeof(VALUE) for pointer alignment checksMatt Valentine-House
2026-04-06Revert "Double sweep byte budget to preserve heap 1 behavior"Matt Valentine-House
This reverts commit c617c5ec85ff69a5a8b13c56d51fcd234c00e1e2.
2026-04-06Double sweep byte budget to preserve heap 1 behaviorMatt Valentine-House
Anchors the historical 2048/1024 slot counts on the 80-byte heap instead of the 40-byte heap. This isolates whether the major GC elimination seen in railsbench was caused by heap 1's halved budget in the previous commit.
2026-04-06Convert incremental sweep budget from slots to bytesMatt Valentine-House
Larger slot pools are less heavily used, so a fixed slot count over-services them relative to allocation pressure. Divide a byte budget by heap->slot_size so the effective per-step slot count tapers inversely with slot size.
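The tapering effect can be shown with a small sketch. The function name and the 81,920-byte budget used in the usage note (2048 slots of 40 bytes) are assumptions for illustration, not values confirmed by the commit.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: how many slots one incremental sweep step may
 * process in a heap with the given slot size. */
size_t sweep_slots_per_step(size_t byte_budget, size_t slot_size)
{
    assert(slot_size > 0);
    return byte_budget / slot_size;
}
```

Under that assumed budget, a 40-byte heap sweeps 2048 slots per step, an 80-byte heap 1024, and a 640-byte heap 128, so larger pools receive proportionally fewer slots of service per step.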
2026-03-30Introduce GC.stat_heap(:heap_allocatable_slots)Matt Valentine-House
2026-03-30Remove unused minimum_slots_for_heap functionMatt Valentine-House
2026-03-30Replace heap_init_slots array with single heap_init_bytes targetMatt Valentine-House
Replace per-heap GC_HEAP_INIT_SLOTS with a single GC_HEAP_INIT_BYTES target. Instead of allocating a fixed 10k slot budget for each heap to grow into, this PR gives each heap a fixed 2.5 MiB (2,621,440 bytes) heap growth allowance. This keeps the overall heap size budget roughly the same, but allows the smaller pools to grow much larger before more pages are allocated.

Before:
```
Heap 0: 10,000 × 40  =   400,000 bytes
Heap 1: 10,000 × 80  =   800,000 bytes
Heap 2: 10,000 × 160 = 1,600,000 bytes
Heap 3: 10,000 × 320 = 3,200,000 bytes
Heap 4: 10,000 × 640 = 6,400,000 bytes
Total: 12,400,000 bytes (50,000 slots)
```

After:
```
Heap 0: 2,621,440 / 40  = 65,536 slots
Heap 1: 2,621,440 / 80  = 32,768 slots
Heap 2: 2,621,440 / 160 = 16,384 slots
Heap 3: 2,621,440 / 320 =  8,192 slots
Heap 4: 2,621,440 / 640 =  4,096 slots
Total: 13,107,200 bytes (126,976 slots)
```
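The totals quoted in this commit can be re-derived with a short sketch. The five slot sizes are taken from the figures in the message; the function names are hypothetical and exist only to reproduce the arithmetic.

```c
#include <assert.h>
#include <stddef.h>

enum { HEAP_COUNT = 5 };

/* Slot sizes as listed in the commit message. */
static const size_t slot_sizes[HEAP_COUNT] = {40, 80, 160, 320, 640};

/* Old scheme: every heap gets the same slot count, so bytes vary. */
size_t total_bytes_for_slot_budget(size_t slots_per_heap)
{
    size_t total = 0;
    for (int i = 0; i < HEAP_COUNT; i++) {
        total += slots_per_heap * slot_sizes[i];
    }
    return total;
}

/* New scheme: every heap gets the same byte allowance, so slots vary. */
size_t total_slots_for_byte_budget(size_t bytes_per_heap)
{
    size_t total = 0;
    for (int i = 0; i < HEAP_COUNT; i++) {
        assert(slot_sizes[i] > 0);
        total += bytes_per_heap / slot_sizes[i];
    }
    return total;
}
```

Calling `total_bytes_for_slot_budget(10000)` gives 12,400,000 bytes and `total_slots_for_byte_budget(2621440)` gives 126,976 slots, matching the totals quoted in the message.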
2026-03-25wip is this still necessaryMatt Valentine-House
2026-03-09Compress the size_to_heap_idx tableMatt Valentine-House
Index on 8 byte chunks instead of individual bytes. This works because all pool slot sizes are pointer aligned, so all sizes in an 8 byte range map to the same heap.
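A minimal sketch of such a compressed table follows, assuming five pools of 40, 80, 160, 320 and 640 bytes. The names `build_size_table` and `heap_idx_for_size` are hypothetical, and the real table may be generated at compile time rather than filled in at startup.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { HEAP_COUNT = 5, MAX_SLOT_SIZE = 640, CHUNK = 8 };

/* Assumed slot sizes; every one is a multiple of CHUNK. */
static const size_t slot_sizes[HEAP_COUNT] = {40, 80, 160, 320, 640};

/* One entry per 8-byte chunk instead of one per byte. */
static uint8_t size_to_heap_idx[MAX_SLOT_SIZE / CHUNK + 1];

void build_size_table(void)
{
    size_t heap = 0;
    for (size_t chunk = 0; chunk <= MAX_SLOT_SIZE / CHUNK; chunk++) {
        /* Advance to the smallest heap whose slots fit chunk * 8 bytes. */
        while (slot_sizes[heap] < chunk * CHUNK) {
            heap++;
        }
        size_to_heap_idx[chunk] = (uint8_t)heap;
    }
}

size_t heap_idx_for_size(size_t size)
{
    assert(size > 0 && size <= MAX_SLOT_SIZE);
    /* Round up to the next chunk boundary, then look up. */
    return size_to_heap_idx[(size + CHUNK - 1) / CHUNK];
}
```

Rounding the request up to a multiple of 8 never changes which heap it lands in, because no slot size falls strictly between two chunk boundaries. The table therefore needs 81 entries rather than 641.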
2026-03-09Look up slot sizes for allocations in a tableMatt Valentine-House
Also remove BASE_SLOT_SIZE.
2026-03-09Suppress format warningsNobuyoshi Nakada
Use the appropriate modifier. `size_t` is not always `unsigned long`, even if the size is the same.
2026-02-25Move class pre-aging out of the allocation pathMatt Valentine-House
Previously classes and modules were pre-aged, i.e. as soon as they were allocated they were aged to old_age - 1. This was done with the assumption that classes are generally always long lived, so we should assume that any that survive a single GC can be considered old. This commit keeps the same semantics, but moves the logic out of the allocation path, in order to simplify allocation. Classes and modules are now set to old-age the first time they are marked.
2026-02-23Fix spurious uses of BASE_SLOT_SIZEMatt Valentine-House
In gc_sweep_plane, VALGRIND_MAKE_MEM_UNDEFINED was using BASE_SLOT_SIZE which only covers the smallest pool's slot size. For larger size pools this left the tail of the slot with stale state. Use the page's actual slot_size instead. In gc_prof_set_heap_info, heap_use_size and heap_total_size were computed as object_count * BASE_SLOT_SIZE, undercounting memory for objects in larger size pools. Sum across all heaps using each pool's actual slot size for correct byte totals.
2026-02-20Remove NUM_IN_PAGE macroMatt Valentine-House
This was being used to calculate the starting point of the slots in a page in order to make them evenly divisible by a bitmap plane. In https://github.com/ruby/ruby/pull/16150 we restructured the bitmaps in order to pack them such that 1 bit == 1 slot, and removed the masking, meaning that we no longer need to align against planes. This was the last remaining use of the NUM_IN_PAGE macro, so we can remove that as well.
2026-02-19Remove HEAP_PAGE_OBJ_LIMITMatt Valentine-House
This was useful when there was only a single size pool to have an easy way of referencing the average number of objects a page could hold (this would vary by a few in real terms because of page alignment). But with multiple heaps, each heap contains pages with different numbers of objects because slot sizes are different. So when we use HEAP_PAGE_OBJ_LIMIT to do any kind of calculation (such as calculating freeable pages), we're significantly underestimating the number of freeable pages in the larger size pools, which will cause us to hold on to pages unnecessarily. This commit replaces uses of HEAP_PAGE_OBJ_LIMIT with a more accurate approximation for the actual heap being manipulated. It also removes HEAP_PAGE_OBJ_LIMIT from GC::INTERNAL_CONSTANTS.
2026-02-13Use UINT32_MAX as magic divisorMatt Valentine-House
As @jhawthorn pointed out, the original calculation used `(1 << 32) / heap->slot_size + 1`, which leads to a subtle off-by-one error that gets shifted away because our slot sizes aren't powers of 2. This is still worth fixing now, so that we don't trip up over it if we change slot sizes in the future.
2026-02-13We can't actually hardcode theseMatt Valentine-House
because BASE_SLOT_SIZE changes on 32 bit, and when debug/devel symbols are added
2026-02-13Make sure we clear the bits when adding a new pageMatt Valentine-House
2026-02-13hardcode and look up magic numbersMatt Valentine-House
instead of computing them on page add
2026-02-13gc: implement slot-based bitmap indexing with division magicMatt Valentine-House
Replace the BASE_SLOT_SIZE-granularity bitmap scheme with slot-based indexing where each bit represents one slot regardless of size.

Key changes:
- Add slot_div_magic field to heap_page for fast division
- Use Go-inspired formula: slot_index = (offset * div_magic) >> 32
- Update all bitmap iteration to use one-bit-per-slot scheme
- Remove slot_bits_mask from rb_heap_t (no longer needed)

This enables arbitrary slot sizes (not just power-of-two multiples of BASE_SLOT_SIZE) by decoupling bitmap indexing from slot size.

Functions updated:
- gc_sweep_plane/gc_sweep_page
- rgengc_rememberset_mark/rgengc_rememberset_mark_plane
- gc_marks_wb_unprotected_objects/gc_marks_wb_unprotected_objects_plane
- gc_compact_plane/gc_compact_page
- invalidate_moved_plane/invalidate_moved_page
- RVALUE_AGE_GET/RVALUE_AGE_SET_BITMAP

Inspired by Go runtime's mbitmap.go divideByElemSize().
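The division-by-multiplication trick can be sketched as follows. The magic constant uses the `UINT32_MAX / slot_size + 1` form from the later "Use UINT32_MAX as magic divisor" commit. The function names and the 64 KiB page size in the usage note are assumptions for illustration, not the actual heap_page code.

```c
#include <assert.h>
#include <stdint.h>

/* Precompute roughly 2^32 / slot_size, rounded up where inexact. */
uint32_t slot_div_magic(uint32_t slot_size)
{
    assert(slot_size > 1); /* slot_size == 1 would overflow the + 1 */
    return UINT32_MAX / slot_size + 1;
}

/* slot_index = (offset * div_magic) >> 32, using a 64-bit product. */
uint32_t slot_index(uint32_t offset, uint32_t div_magic)
{
    return (uint32_t)(((uint64_t)offset * div_magic) >> 32);
}

/* Exhaustively compare against real division for every byte offset
 * inside a page of the given size. Returns 1 when all offsets agree. */
int magic_matches_division(uint32_t slot_size, uint32_t page_size)
{
    uint32_t magic = slot_div_magic(slot_size);
    for (uint32_t offset = 0; offset < page_size; offset++) {
        if (slot_index(offset, magic) != offset / slot_size) {
            return 0;
        }
    }
    return 1;
}
```

For example, `magic_matches_division(40, 65536)` returns 1, as do the power-of-two cases such as 64 and 1024. The rounding error in the magic constant stays too small to affect the result as long as offsets are bounded by a page of that size.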
2026-01-31ruby_xfree: reject memory allocated by ruby_mimallocJean Boussier
2026-01-30Clear age and unprotected bits for page at onceJohn Hawthorn
This aims to speed up sweeping by clearing all age and wb_unprotected bits for unmarked objects. This should be faster because we can clear up to a whole plane of objects (64 slots) at once.
2026-01-30Use bit plane for age bitsJohn Hawthorn
Previously we used two adjacent bits in the same word to store the object's age. This changes that to instead store the age in the same bit position across two adjacent words. This makes age use the exact same bit positions as the other bitmaps (just across two words).
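One way to picture the bit-plane layout is the sketch below. It assumes 64-bit bitmap words, a 2-bit age, and that the low and high age planes for each group of 64 slots sit in adjacent words. The function names and the exact interleaving are hypothetical rather than taken from the source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t bits_t;
enum { BITS_PER_WORD = 64 };

/* Assumed layout: age_bits[2*p] is the low age bit and age_bits[2*p + 1]
 * the high age bit for slots p*64 .. p*64 + 63. */
unsigned age_get(const bits_t *age_bits, size_t slot)
{
    size_t plane = slot / BITS_PER_WORD;
    unsigned bit = (unsigned)(slot % BITS_PER_WORD);
    bits_t lo = (age_bits[plane * 2] >> bit) & 1;
    bits_t hi = (age_bits[plane * 2 + 1] >> bit) & 1;
    return (unsigned)(lo | (hi << 1));
}

void age_set(bits_t *age_bits, size_t slot, unsigned age)
{
    assert(age < 4); /* two bits of age */
    size_t plane = slot / BITS_PER_WORD;
    unsigned bit = (unsigned)(slot % BITS_PER_WORD);
    bits_t mask = (bits_t)1 << bit;
    bits_t lo = (bits_t)(age & 1) << bit;
    bits_t hi = (bits_t)((age >> 1) & 1) << bit;
    age_bits[plane * 2] = (age_bits[plane * 2] & ~mask) | lo;
    age_bits[plane * 2 + 1] = (age_bits[plane * 2 + 1] & ~mask) | hi;
}

/* Because age uses the same bit positions as the mark bitmap, the ages
 * of all unmarked slots in a plane can be cleared with two ANDs. */
void age_clear_unmarked(bits_t *age_bits, size_t plane, bits_t mark_word)
{
    age_bits[plane * 2] &= mark_word;
    age_bits[plane * 2 + 1] &= mark_word;
}
```

The last helper shows why the layout helps the "Clear age and unprotected bits for page at once" commit above: a single mark word can mask a whole plane of ages without per-slot shifting.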
2026-01-30gc.c: also verify sized_xrealloc old sizeJean Boussier
2026-01-29Remove dead gc_stat_sym_weak_references_countPeter Zhu
2026-01-29gc.c: Verify provided size in `rb_gc_impl_free`Jean Boussier
For now the provided size is just for GC statistics, but in the future we may want to forward it to C23's `free_sized` and passing an incorrect size to it is undefined behavior.
2026-01-26rename rb_gc_obj_free_on_sweep -> rb_gc_obj_needs_cleanup_pMatt Valentine-House