path: root/gc.c
Age | Commit message | Author
4 days | Disambiguate private and public RSTRUCT_ helpers | Jean Boussier
RSTRUCT_LEN / RSTRUCT_GET / RSTRUCT_SET all exist in two versions: a public one that does type and frozen checks, and a private one that doesn't. This is error-prone because the public version is always accessible, but the private one requires including `internal/struct.h`. So you may have some code that relies on the public version, and later on the private header is included and changes the behavior. This already led to introducing a bug in YJIT & ZJIT: https://github.com/ruby/ruby/pull/15835
6 days | Optimize rb_mark_generic_ivar for T_DATA and T_STRUCT | Peter Zhu
T_DATA and T_STRUCT could have ivars but might not use the generic_fields_tbl. This commit skips lookup in the generic_fields_tbl for those cases.
11 days | Dump fstr and frozen status in rb_raw_obj_info_buitin_type | Peter Zhu
11 days | Add field handle_weak_references to TypedData | Peter Zhu
This commit adds a field handle_weak_references to rb_data_type_struct for the callback when handling weak references. This saves TypedData objects from needing to expose their rb_data_type_struct and weak references function.
12 days | Add rb_gc_print_backtrace | Peter Zhu
2025-12-31 | Make `RTYPEDDATA_EMBEDDABLE_P` internal-use only | Nobuyoshi Nakada
It should not be exposed because it is so implementation specific that it is only used in gc.c even within the entire Ruby source tree.
2025-12-31 | Introduce typed-data embeddable predicate macros | Nobuyoshi Nakada
The combination of `&` and `&&` is confusing.
2025-12-29 | Add rb_gc_move_obj_during_marking | Peter Zhu
2025-12-29 | Add rb_gc_register_pinning_obj | Peter Zhu
2025-12-26 | Remove old APIs to allocate a data object deprecated for 5 years | Nobuyoshi Nakada
2025-12-25 | Implement cont using declare weak references | Peter Zhu
2025-12-25 | Implement weak references on gen fields cache | Peter Zhu
2025-12-25 | Implement callcache using declare weak references | Peter Zhu
2025-12-25 | Implement WeakMap and WeakKeyMap using declare weak references | Peter Zhu
2025-12-25 | Implement declaring weak references | Peter Zhu
[Feature #21084]

# Summary

The current way of marking weak references uses `rb_gc_mark_weak(VALUE *ptr)`. This presents challenges because Ruby's GC is incremental, meaning that if the `ptr` changes (e.g. realloc'd or free'd), then we could have an invalid memory access. This also overwrites `*ptr = Qundef` if `*ptr` is dead, which prevents any cleanup from being run (e.g. freeing memory or deleting entries from hash tables).

This ticket proposes `rb_gc_declare_weak_references`, which declares that an object has weak references and calls a cleanup function after marking, allowing the object to clean up any memory for dead objects.

# Introduction

In [[Feature #19783]](https://bugs.ruby-lang.org/issues/19783), I introduced an API allowing objects to mark weak references. The function signature looks like this:

```c
void rb_gc_mark_weak(VALUE *ptr);
```

`rb_gc_mark_weak` is called during the marking phase of the GC to specify that the memory at `ptr` holds a pointer to a Ruby object that is weakly referenced. `rb_gc_mark_weak` appends this pointer to a list that is processed after the marking phase of the GC. If the object at `*ptr` is no longer alive, then it overwrites the object reference with a special value (`*ptr = Qundef`).

However, this API resulted in two challenges:

1. Ruby's default GC is incremental, which means that the GC is not run in one phase, but rather split into chunks of work that interleave with Ruby execution. The `ptr` passed into `rb_gc_mark_weak` could be on the malloc heap, and that memory could be realloc'd or even free'd. We had to use workarounds such as `rb_gc_remove_weak` to ensure that there were no illegal memory accesses. This made `rb_gc_mark_weak` difficult to use, impacted runtime performance, and increased memory usage.
2. When an object dies, `rb_gc_mark_weak` only overwrites the reference with `Qundef`. This means that if we want to do any cleanup (e.g. free a piece of memory or delete a hash table entry), we could not do that and had to defer this process elsewhere (e.g. during marking or runtime).

In this ticket, I'm proposing a new API for weak references. Instead of an object marking its weak references during the marking phase, the object declares that it has weak references using the `rb_gc_declare_weak_references` function. This declaration occurs during runtime (e.g. after the object has been created) rather than during GC.

After an object declares that it has weak references, it will have its callback function called after marking as long as that object is alive. This callback function can then call a special function `rb_gc_handle_weak_references_alive_p` to determine whether its references are alive. This will allow the callback function to do whatever it wants on the object, allowing it to perform any cleanup work it needs.

This significantly simplifies the code for `ObjectSpace::WeakMap` and `ObjectSpace::WeakKeyMap` because it no longer needs to have the workarounds for the limitations of `rb_gc_mark_weak`.

# Performance

The performance results below demonstrate that `ObjectSpace::WeakMap#[]=` is now about 60% faster because the implementation has been simplified and the number of allocations has been reduced. We can see that there is not a significant impact on the performance of `ObjectSpace::WeakMap#[]`.

Base:

```
ObjectSpace::WeakMap#[]= 4.620M (± 6.4%) i/s (216.44 ns/i) - 23.342M in 5.072149s
ObjectSpace::WeakMap#[] 30.967M (± 1.9%) i/s (32.29 ns/i) - 154.998M in 5.007157s
```

Branch:

```
ObjectSpace::WeakMap#[]= 7.336M (± 2.8%) i/s (136.31 ns/i) - 36.755M in 5.013983s
ObjectSpace::WeakMap#[] 30.902M (± 5.4%) i/s (32.36 ns/i) - 155.901M in 5.064060s
```

Code:

```
require "bundler/inline"

gemfile do
  source "https://rubygems.org"
  gem "benchmark-ips"
end

wmap = ObjectSpace::WeakMap.new
key = Object.new
val = Object.new
wmap[key] = val

Benchmark.ips do |x|
  x.report("ObjectSpace::WeakMap#[]=") do |times|
    i = 0
    while i < times
      wmap[Object.new] = Object.new
      i += 1
    end
  end

  x.report("ObjectSpace::WeakMap#[]") do |times|
    i = 0
    while i < times
      wmap[key]
      wmap[val] # does not exist
      i += 1
    end
  end
end
```

# Alternative designs

Currently, `rb_gc_declare_weak_references` is designed to be an internal-only API. This allows us to assume the object types that call `rb_gc_declare_weak_references`. In the future, if we want to open up this API to third parties, we may want to change this function to something like:

```c
void rb_gc_add_cleaner(VALUE obj, void (*callback)(VALUE obj));
```

This will allow the third party to implement a custom `callback` that gets called after the marking phase of GC to clean up any dead references. I chose not to implement this design because it is less efficient as we would need to store a mapping from `obj` to `callback`, which requires extra memory.
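The declare-then-clean-up flow described in this commit message can be sketched as a toy model in plain Ruby. This is not how gc.c implements it: `ToyGC`, `ToyWeakMap`, `declare_weak_references`, `alive?` and `run` are hypothetical names that only mirror the roles of `rb_gc_declare_weak_references`, the `handle_weak_references` callback and `rb_gc_handle_weak_references_alive_p`, and "marking" is simulated by passing in an explicit list of live objects.

```ruby
# Toy model only: liveness is decided by an explicit list, not by a real GC.
class ToyGC
  def initialize
    @declared = [] # objects that declared they hold weak references
    @marked = {}   # object_id => true for every object found alive
  end

  # Hypothetical stand-in for rb_gc_declare_weak_references.
  # Called at runtime (e.g. right after creation), not during GC.
  def declare_weak_references(obj)
    @declared << obj
  end

  # Hypothetical stand-in for rb_gc_handle_weak_references_alive_p.
  def alive?(obj)
    @marked.key?(obj.object_id)
  end

  # One simulated GC cycle: record what was marked, then call the cleanup
  # callback of every declared object that is itself still alive.
  def run(live_objects)
    @marked = live_objects.to_h { |o| [o.object_id, true] }
    @declared.select! { |obj| alive?(obj) }
    @declared.each { |obj| obj.handle_weak_references(self) }
  end
end

class ToyWeakMap
  attr_reader :table

  def initialize(gc)
    @table = {}.compare_by_identity
    gc.declare_weak_references(self)
  end

  def []=(key, value)
    @table[key] = value
  end

  # Cleanup callback: the map itself deletes entries whose key or value died,
  # instead of having the reference overwritten behind its back.
  def handle_weak_references(gc)
    @table.delete_if { |k, v| !gc.alive?(k) || !gc.alive?(v) }
  end
end

gc = ToyGC.new
wmap = ToyWeakMap.new(gc)
k1, v1, k2, v2 = Array.new(4) { Object.new }
wmap[k1] = v1
wmap[k2] = v2

gc.run([wmap, k1, v1]) # k2 and v2 were not marked in this cycle
remaining = wmap.table.size
```

After the simulated cycle only the `k1` entry remains, and the deletion happened inside the map's own callback, which is the cleanup opportunity the commit message says `rb_gc_mark_weak` could not offer.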
2025-12-23 | Move special const check to gc.c for rb_gc_impl_object_moved_p | Peter Zhu
2025-12-20 | Check slot_size before zeroing memory for GC hook | Peter Zhu
If the slot_size < RVALUE_SIZE then we would underflow in the memset.
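A small sketch of why the check matters, assuming the `memset` length is derived from `slot_size - RVALUE_SIZE` in unsigned `size_t` arithmetic (the commit message does not show the code). Ruby integers do not wrap, so the wraparound is emulated with a mask; `size_t_sub`, `bytes_to_zero` and the value 40 for `RVALUE_SIZE` are illustrative assumptions.

```ruby
SIZE_MAX = (1 << 64) - 1 # assumes a 64-bit size_t
RVALUE_SIZE = 40         # illustrative value, not taken from the commit

# Emulate C's unsigned subtraction, which wraps around instead of going negative.
def size_t_sub(a, b)
  (a - b) & SIZE_MAX
end

# Guarded version: only subtract when the slot is at least RVALUE_SIZE bytes.
def bytes_to_zero(slot_size)
  return 0 if slot_size < RVALUE_SIZE
  size_t_sub(slot_size, RVALUE_SIZE)
end

unguarded = size_t_sub(32, RVALUE_SIZE) # => 18446744073709551608
guarded_small = bytes_to_zero(32)       # => 0
guarded_large = bytes_to_zero(80)       # => 40
```

Without the guard, a slot smaller than `RVALUE_SIZE` would yield a length close to `SIZE_MAX`, which is the underflow the commit message refers to.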
2025-12-17 | [DOC] Small changes to docs for ObjectSpace#each_object (#15564) | Luke Gruber
Change example to use user-defined class instead of `Numeric`.
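For illustration, `ObjectSpace.each_object` with a user-defined class might look like the sketch below. This is not the example from the updated docs; `Widget` is a hypothetical class name.

```ruby
# Hypothetical user-defined class for the illustration.
class Widget; end

# Keep references so the instances stay alive while we iterate.
widgets = [Widget.new, Widget.new]

found = []
# With a block, each_object yields each live instance of the given class
# and returns the number of objects it yielded.
count = ObjectSpace.each_object(Widget) { |w| found << w }
```

A user-defined class makes the result easy to reason about, because the only instances are the ones the script created itself.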
2025-12-17 | Rename to `struct rbimpl_size_overflow_tag` | Nobuyoshi Nakada
This struct is used for addition, not only for multiplication, so remove the word `mul`, and make the member names more descriptive.
2025-12-11 | Add assumption to free_vm_weak_references | John Hawthorn
Help the compiler know that we always get a heap object here.
2025-12-10 | Add `NUM2PTR` and `PTR2NUM` macros | Nobuyoshi Nakada
These macros have been defined here and there, so collect them.
2025-12-09 | Fix typos in gc.c and gc.rb | hi
2025-12-07 | Output ivar length for T_OBJECT in obj_info | Peter Zhu
2025-12-06 | Fix id2ref for multi-Ractor | Peter Zhu
The id2ref table needs to be under a VM lock to ensure there are no race conditions. The following script crashes:

    o = Object.new
    ObjectSpace._id2ref(o.object_id)

    10.times.map do
      Ractor.new do
        10_000.times do
          a = Object.new
          a.object_id
        end
      end
    end.map(&:value)

With:

    [BUG] Object ID seen, but not in _id2ref table: object_id=2800 object=T_OBJECT
    ruby 4.0.0dev (2025-12-06T15:15:43Z ractor-id2ref-fix e7f9abdc91) +PRISM [x86_64-linux]

    -- Control frame information -----------------------------------------------
    c:0001 p:---- s:0003 e:000002 l:y b:---- DUMMY [FINISH]

    -- Threading information ---------------------------------------------------
    Total ractor count: 5
    Ruby thread count for this ractor: 1

    -- C level backtrace information -------------------------------------------
    miniruby(rb_print_backtrace+0x14) [0x6047d09b2dff] vm_dump.c:1105
    miniruby(rb_vm_bugreport) vm_dump.c:1450
    miniruby(rb_bug_without_die_internal+0x5f) [0x6047d066bf57] error.c:1098
    miniruby(rb_bug) error.c:1116
    miniruby(rb_gc_get_ractor_newobj_cache+0x0) [0x6047d066c8dd] gc.c:2052
    miniruby(gc_sweep_plane+0xad) [0x6047d079276d] gc/default/default.c:3513
    miniruby(gc_sweep_page) gc/default/default.c:3605
    miniruby(gc_sweep_step) gc/default/default.c:3886
    miniruby(gc_sweep+0x1ba) [0x6047d0794cfa] gc/default/default.c:4154
    miniruby(gc_start+0xbf2) [0x6047d0796742] gc/default/default.c:6519
    miniruby(heap_prepare+0xcc) [0x6047d079748c] gc/default/default.c:2090
    miniruby(heap_next_free_page) gc/default/default.c:2305
    miniruby(newobj_cache_miss) gc/default/default.c:2412
    miniruby(newobj_alloc+0xd) [0x6047d0798ff5] gc/default/default.c:2436
    miniruby(rb_gc_impl_new_obj) gc/default/default.c:2515
    miniruby(newobj_of) gc.c:996
    miniruby(rb_wb_protected_newobj_of) gc.c:1046
    miniruby(str_alloc_embed+0x28) [0x6047d08fda18] string.c:1019
    miniruby(str_enc_new) string.c:1069
    miniruby(prep_io+0x5) [0x6047d07cda14] io.c:9305
    miniruby(prep_stdio) io.c:9347
    miniruby(rb_io_prep_stdin) io.c:9365
    miniruby(thread_start_func_2+0x77c) [0x6047d093a55c] thread.c:679
    miniruby(thread_sched_lock_+0x0) [0x6047d093aacd] thread_pthread.c:2241
    miniruby(co_start) thread_pthread_mn.c:469
2025-12-05 | Revert "gc.c: Pass shape_id to `newobj_init`" | Peter Zhu
This reverts commit 228d13f6ed914d1e7f6bd2416e3f5be8283be865. This commit makes default.c and mmtk.c depend on shape.h, which prevents them from building independently.
2025-12-05 | Allow rb_thread_call_with_gvl() to work when thread already has GVL | Keenan Brock
[Feature #20750]

Co-authored-by: Benoit Daloze <eregontp@gmail.com>
2025-12-03 | gc.c: check if the struct has fields before marking the fields_obj | Jean Boussier
If GC triggers in the middle of `struct_alloc`, and the struct has more than 3 elements, then the `fields_obj` reference is garbage. We must first check the shape to know if it was actually initialized.
2025-12-03 | gc.c: Pass shape_id to `newobj_init` | Jean Boussier
Attempt to fix the following SEGV:

```
ruby(gc_mark) ../src/gc/default/default.c:4429
ruby(gc_mark_children+0x45) [0x560b380bf8b5] ../src/gc/default/default.c:4625
ruby(gc_mark_stacked_objects) ../src/gc/default/default.c:4647
ruby(gc_mark_stacked_objects_all) ../src/gc/default/default.c:4685
ruby(gc_marks_rest) ../src/gc/default/default.c:5707
ruby(gc_marks+0x4e7) [0x560b380c41c1] ../src/gc/default/default.c:5821
ruby(gc_start) ../src/gc/default/default.c:6502
ruby(heap_prepare+0xa4) [0x560b380c4efc] ../src/gc/default/default.c:2074
ruby(heap_next_free_page) ../src/gc/default/default.c:2289
ruby(newobj_cache_miss) ../src/gc/default/default.c:2396
ruby(RB_SPECIAL_CONST_P+0x0) [0x560b380c5df4] ../src/gc/default/default.c:2420
ruby(RB_BUILTIN_TYPE) ../src/include/ruby/internal/value_type.h:184
ruby(newobj_init) ../src/gc/default/default.c:2136
ruby(rb_gc_impl_new_obj) ../src/gc/default/default.c:2500
ruby(newobj_of) ../src/gc.c:996
ruby(rb_imemo_new+0x37) [0x560b380d8bed] ../src/imemo.c:46
ruby(imemo_fields_new) ../src/imemo.c:105
ruby(rb_imemo_fields_new) ../src/imemo.c:120
```

I have no reproduction, but my understanding based on the backtrace and error is that GC is triggered inside `newobj_init`, causing the new object to be marked while in an incomplete state.

I believe the fix is to pass the `shape_id` down to `newobj_init` so it can be set before the GC has a chance to trigger.
2025-12-03 | Rename `rb_obj_exivar_p` -> `rb_obj_gen_fields_p` | Jean Boussier
The "EXIVAR" terminology has been replaced by "gen fields" AKA "generic fields". Exivar implies variable, but generic fields include more than just variables, e.g. `object_id`.
2025-12-03 | Handle NEWOBJ tracepoints settings fields | Jean Boussier
[Bug #21710]

- struct.c: `struct_alloc`

  It is possible for a `NEWOBJ` tracepoint callback to write fields into a newly allocated object before `struct_alloc` had the time to set the `RSTRUCT_GEN_FIELDS` flags and such. Hence we can't blindly initialize the `fields_obj` reference to `0`; we first need to check that no fields were added yet.

- object.c: `rb_class_allocate_instance`

  Similarly, if a `NEWOBJ` tracepoint tries to set fields on the object, the `shape_id` must already be set, as it's required on T_OBJECT to know where to write fields.

`NEWOBJ_OF` had to be refactored to accept a `shape_id`.
2025-12-02 | Box: Mark boxes when a class/module is originally defined in it. | Satoshi Tagomori
When a class/module is defined by an extension library in a box, checking the types of instances of the class needs to access its data type (rb_data_type_t). So if a class still exists (not GCed), the box must exist too (to be marked).
2025-11-26 | Eliminate redundant work and branching when marking T_OBJECT (#15274) | Luke Gruber
2025-11-20 | Accurate GC.stat under multi-Ractor mode | John Hawthorn
2025-11-18 | Fix crash in optimal size for large T_OBJECT | John Hawthorn
Previously any T_OBJECT with >= 94 IVARs would crash during compaction attempting to make an object too large to embed.
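A sketch of the kind of scenario described, assuming that an object with many instance variables followed by compaction is enough to exercise this path. That is an assumption: the commit message does not include a repro script, and `ManyIvars` is a hypothetical class.

```ruby
# Hypothetical class that gives each instance `n` instance variables.
class ManyIvars
  def initialize(n)
    n.times { |i| instance_variable_set(:"@ivar_#{i}", i) }
  end
end

# 100 is comfortably above the 94-ivar threshold named in the commit message.
obj = ManyIvars.new(100)

# Compaction is not available on every platform or build, so guard the call.
begin
  GC.compact if GC.respond_to?(:compact)
rescue NotImplementedError
  # Compaction unsupported here; nothing to exercise.
end

ivar_count = obj.instance_variables.size
last_value = obj.instance_variable_get(:@ivar_99)
```

On a build with the fix, the object should survive compaction with all of its instance variables intact.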
2025-11-09 | Make rb_gc_obj_optimal_size always return allocatable size | Peter Zhu
Previously, it could return sizes that aren't allocatable for arrays and strings.
2025-11-08 | Move rb_gc_verify_shareable to gc.c | Peter Zhu
rb_gc_verify_shareable is not GC implementation specific so it should live in gc.c.
2025-11-07 | renaming internal data structures and functions from namespace to box | Satoshi Tagomori
2025-11-04 | Release VM lock before running finalizers (#15050) | Luke Gruber
We shouldn't run any Ruby code with the VM lock held.
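The general pattern behind this rule can be sketched in plain Ruby, with a `Mutex` standing in for the VM lock. This is only an analogy and says nothing about how gc.c is structured; `ToyFinalizerTable` and its methods are hypothetical.

```ruby
# Toy model: a Mutex stands in for the VM lock.
class ToyFinalizerTable
  def initialize
    @lock = Mutex.new
    @pending = []
  end

  # Register a finalizer; the shared list is only touched under the lock.
  def add(&finalizer)
    @lock.synchronize { @pending << finalizer }
  end

  def run_finalizers
    # Detach the pending list while holding the lock...
    to_run = @lock.synchronize do
      list = @pending
      @pending = []
      list
    end
    # ...and run user code only after the lock has been released.
    to_run.each(&:call)
  end
end

table = ToyFinalizerTable.new
log = []
table.add do
  log << :first
  # User code re-enters the table. This is safe only because the lock is
  # not held while finalizers run; Mutex is not reentrant.
  table.add { log << :second }
end

table.run_finalizers
table.run_finalizers
```

If `run_finalizers` called the blocks inside `synchronize`, the nested `add` would raise a `ThreadError` for recursive locking, which is a small-scale version of why arbitrary Ruby code should not run under a lock.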
2025-11-04 | Fix rb_gc_impl_checking_shareable for modular GC | John Hawthorn
This implements it the same as the other modular GC functions.
2025-10-23 | catch up modular-gc | Koichi Sasada
2025-10-23 | use `SET_SHAREABLE` | Koichi Sasada
to adopt strict shareable rule.

* (basically) shareable objects only refer to shareable objects
* (exception) shareable objects can refer to unshareable objects but should not leak references to unshareable objects to the Ruby world
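The basic rule is visible from Ruby through `Ractor.shareable?` and `Ractor.make_shareable`. The sketch below shows only that Ruby-level behavior; `SET_SHAREABLE` itself is an internal C macro and is not exercised here.

```ruby
inner = +"mutable"     # an unfrozen String
outer = [inner].freeze # frozen container that still refers to it

# Frozen, but not shareable: it refers to an unshareable object.
before = Ractor.shareable?(outer)

# make_shareable freezes everything reachable from the object.
Ractor.make_shareable(outer)

after = Ractor.shareable?(outer)
inner_frozen = inner.frozen?
```

Here `before` is false and `after` is true, which matches the first bullet: an object counts as shareable only once everything it refers to is shareable as well.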
2025-10-21 | Move rb_class_classext_free to class.c | Peter Zhu
2025-10-02 | ZJIT: Add `NoSingletonClass` patch point (#14680) | Stan Lo
* ZJIT: Add NoSingletonClass patch point

  This patch point makes sure that when the object has a singleton class, the JIT code is invalidated. As of now, this is only needed for C call optimization.

  In YJIT, the singleton class guard only applies to Array, Hash, and String. But in ZJIT, we may optimize C calls from gems (e.g. `sqlite3`). So the patch point needs to be applied to a broader range of classes.

* ZJIT: Only generate NoSingletonClass guard when the type can have singleton class

* ZJIT: Update or forget NoSingletonClass patch point when needed
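A short Ruby illustration of why a singleton class matters for an optimized C call: once an object gets a singleton method, the same call site no longer reaches the class's C implementation. This shows the language-level behavior only and makes no claim about how ZJIT implements the patch point.

```ruby
list = [1, 2, 3]
before = list.size # dispatches to Array#size, implemented in C

# Defining a singleton method gives `list` a singleton class...
def list.size
  :overridden
end

after = list.size      # ...so the same call now runs the Ruby method
other = [1, 2, 3].size # other arrays are unaffected
```

Code that was compiled on the assumption that `size` always means `Array#size` would return the wrong answer for `list`, so it has to be invalidated or guarded when a singleton class appears.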
2025-09-30 | ZJIT: Add --zjit-trace-exits (#14640) | Aiden Fox Ivey
Add side exit tracing functionality for ZJIT
2025-09-25 | ZJIT: Actually call rb_zjit_root_update_references() | Alan Wu
Previously unused.
2025-09-24 | Don't require to set PC before allocating hidden object | Alan Wu
ZJIT doesn't set PC before rb_set_ivar(), and that allocates a managed ID table. Was a false positive from gc_validate_pc().
2025-09-23 | gc_validate_pc(): Exclude imemos, add a test and explain the asserts | Alan Wu
The validation is relevant only for traceable userland Ruby objects that Ruby code could interact with. ZJIT's use of rb_vm_method_cfunc_is() allocates a CC imemo and was failing this validation when it was actually fine. Relax the check.
2025-09-18 | Prevent GC from running during `newobj_of` for internal_event_newobj. | Luke Gruber
If another ractor is calling for GC, we need to prevent the current one from joining the barrier. Otherwise, our half-built object will be marked.

The repro script was:

test.rb:
```ruby
require "objspace"
1000.times do
  ObjectSpace.trace_object_allocations do
    r = Ractor.new do
      _obj = 'a' * 1024
    end
    r.join
  end
end
```

$ untilfail lldb -b ./exe/ruby -o "target create ./exe/ruby" -o "run test.rb" -o continue

It would fail at `ractor_port_mark`, where `rp->r` was a garbage value.

Credit to John for finding the solution.

Co-authored-by: John Hawthorn <john.hawthorn@shopify.com>
2025-09-17 | Fill more of the slot with garbage | Peter Zhu
2025-09-17 | Remove setting v1, v2, v3 when creating a new object | Peter Zhu
Setting v1, v2, v3 when we allocate an object assumes that we always allocate 40 byte objects. By removing v1, v2, v3, we can make the base slot size another size.
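The 40-byte figure can be checked with simple arithmetic, assuming a 64-bit platform where each word is 8 bytes and an object header of two words (`flags` and `klass`) ahead of `v1`, `v2`, `v3`. The header layout is an assumption added for illustration; the commit message only names the three `v` fields.

```ruby
WORD_SIZE = 8 # bytes per word, assuming a 64-bit platform

header_fields = %i[flags klass] # assumed two-word object header
payload_fields = %i[v1 v2 v3]   # the three words named in the commit message

slot_size = (header_fields.size + payload_fields.size) * WORD_SIZE # => 40
```

Writing all three payload words unconditionally therefore presumes a slot of at least five words, which is the 40-byte assumption the commit removes.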