ruby.git - The Ruby Programming Language

Age	Commit message	Author
6 days	Disambiguate private and public RSTRUCT_ helpers	Jean Boussier
	RSTRUCT_LEN / RSTRUCT_GET / RSTRUCT_SET all exist in two versions: a public one that does type and frozen checks, and a private one that doesn't. This is error-prone because the public version is always accessible, but the private one requires including `internal/struct.h`. So some code may rely on the public version, and when the private header is later included, the behavior changes. This already led to a bug in YJIT & ZJIT: https://github.com/ruby/ruby/pull/15835
8 days	Optimize rb_mark_generic_ivar for T_DATA and T_STRUCT	Peter Zhu
	T_DATA and T_STRUCT could have ivars but might not use the generic_fields_tbl. This commit skips lookup in the generic_fields_tbl for those cases.
13 days	Dump fstr and frozen status in rb_raw_obj_info_buitin_type	Peter Zhu

13 days	Add field handle_weak_references to TypedData	Peter Zhu
	This commit adds a field handle_weak_references to rb_data_type_struct for the callback used when handling weak references. This saves TypedData objects from needing to expose their rb_data_type_struct and weak references function.
14 days	Add rb_gc_print_backtrace	Peter Zhu

2025-12-31	Make `RTYPEDDATA_EMBEDDABLE_P` internal-use only	Nobuyoshi Nakada
	It should not be exposed because it is so implementation specific that it is only used in gc.c even within the entire Ruby source tree.
2025-12-31	Introduce typed-data embeddable predicate macros	Nobuyoshi Nakada
	The combination of `&` and `&&` is confusing.
2025-12-29	Add rb_gc_move_obj_during_marking	Peter Zhu

2025-12-29	Add rb_gc_register_pinning_obj	Peter Zhu

2025-12-26	Remove old APIs to allocate a data object deprecated for 5 years	Nobuyoshi Nakada

2025-12-25	Implement cont using declare weak references	Peter Zhu

2025-12-25	Implement weak references on gen fields cache	Peter Zhu

2025-12-25	Implement callcache using declare weak references	Peter Zhu

2025-12-25	Implement WeakMap and WeakKeyMap using declare weak references	Peter Zhu

2025-12-25	Implement declaring weak references	Peter Zhu
	[Feature #21084]

# Summary

The current way of marking weak references uses `rb_gc_mark_weak(VALUE *ptr)`. This presents challenges because Ruby's GC is incremental, meaning that if the `ptr` changes (e.g. realloc'd or free'd), then we could have an invalid memory access. This also overwrites `*ptr = Qundef` if `*ptr` is dead, which prevents any cleanup from being run (e.g. freeing memory or deleting entries from hash tables).

This ticket proposes `rb_gc_declare_weak_references`, which declares that an object has weak references and calls a cleanup function after marking, allowing the object to clean up any memory for dead objects.

# Introduction

In [[Feature #19783]](https://bugs.ruby-lang.org/issues/19783), I introduced an API allowing objects to mark weak references. The function signature looks like this:

```c
void rb_gc_mark_weak(VALUE *ptr);
```

`rb_gc_mark_weak` is called during the marking phase of the GC to specify that the memory at `ptr` holds a pointer to a Ruby object that is weakly referenced. `rb_gc_mark_weak` appends this pointer to a list that is processed after the marking phase of the GC. If the object at `*ptr` is no longer alive, then it overwrites the object reference with a special value (`*ptr = Qundef`).

However, this API resulted in two challenges:

1. Ruby's default GC is incremental, which means that the GC is not run in one phase, but rather split into chunks of work that interleave with Ruby execution. The `ptr` passed into `rb_gc_mark_weak` could be on the malloc heap, and that memory could be realloc'd or even free'd. We had to use workarounds such as `rb_gc_remove_weak` to ensure that there were no illegal memory accesses. This made `rb_gc_mark_weak` difficult to use, impacted runtime performance, and increased memory usage.
2. When an object dies, `rb_gc_mark_weak` only overwrites the reference with `Qundef`. This means that if we want to do any cleanup (e.g. free a piece of memory or delete a hash table entry), we could not do that and had to defer this process elsewhere (e.g. during marking or runtime).

In this ticket, I'm proposing a new API for weak references. Instead of an object marking its weak references during the marking phase, the object declares that it has weak references using the `rb_gc_declare_weak_references` function. This declaration occurs during runtime (e.g. after the object has been created) rather than during GC.

After an object declares that it has weak references, it will have its callback function called after marking as long as that object is alive. This callback function can then call a special function `rb_gc_handle_weak_references_alive_p` to determine whether its references are alive. This will allow the callback function to do whatever it wants on the object, allowing it to perform any cleanup work it needs.

This significantly simplifies the code for `ObjectSpace::WeakMap` and `ObjectSpace::WeakKeyMap` because it no longer needs to have the workarounds for the limitations of `rb_gc_mark_weak`.

# Performance

The performance results below demonstrate that `ObjectSpace::WeakMap#[]=` is now about 60% faster because the implementation has been simplified and the number of allocations has been reduced. We can see that there is not a significant impact on the performance of `ObjectSpace::WeakMap#[]`.

Base:

```
ObjectSpace::WeakMap#[]= 4.620M (± 6.4%) i/s (216.44 ns/i) - 23.342M in 5.072149s
ObjectSpace::WeakMap#[] 30.967M (± 1.9%) i/s (32.29 ns/i) - 154.998M in 5.007157s
```

Branch:

```
ObjectSpace::WeakMap#[]= 7.336M (± 2.8%) i/s (136.31 ns/i) - 36.755M in 5.013983s
ObjectSpace::WeakMap#[] 30.902M (± 5.4%) i/s (32.36 ns/i) - 155.901M in 5.064060s
```

Code:

```
require "bundler/inline"

gemfile do
  source "https://rubygems.org"
  gem "benchmark-ips"
end

wmap = ObjectSpace::WeakMap.new
key = Object.new
val = Object.new
wmap[key] = val

Benchmark.ips do |x|
  x.report("ObjectSpace::WeakMap#[]=") do |times|
    i = 0
    while i < times
      wmap[Object.new] = Object.new
      i += 1
    end
  end

  x.report("ObjectSpace::WeakMap#[]") do |times|
    i = 0
    while i < times
      wmap[key]
      wmap[val] # does not exist
      i += 1
    end
  end
end
```

# Alternative designs

Currently, `rb_gc_declare_weak_references` is designed to be an internal-only API. This allows us to assume the object types that call `rb_gc_declare_weak_references`. In the future, if we want to open up this API to third parties, we may want to change this function to something like:

```c
void rb_gc_add_cleaner(VALUE obj, void (*callback)(VALUE obj));
```

This will allow the third party to implement a custom `callback` that gets called after the marking phase of GC to clean up any dead references. I chose not to implement this design because it is less efficient as we would need to store a mapping from `obj` to `callback`, which requires extra memory.
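The declare-then-callback protocol described in this commit can be sketched as a toy model in plain Ruby. This is only an illustration of the control flow, not Ruby's GC: `ToyGC`, `ToyWeakMap`, `alive?`, and `run` are hypothetical names invented here, and "marking" is reduced to a list of roots passed in by hand.

```ruby
# Toy model of the "declare weak references" protocol. Every name here is
# hypothetical illustration code; it is not Ruby's GC and calls no GC API.
class ToyGC
  def initialize
    @declared = [] # objects that declared they hold weak references
    @marked = {}   # object_id => true for objects found alive this cycle
  end

  # Stand-in for rb_gc_declare_weak_references: called at runtime, not in GC.
  def declare_weak_references(holder)
    @declared << holder
  end

  # Stand-in for rb_gc_handle_weak_references_alive_p.
  def alive?(obj)
    @marked.key?(obj.object_id)
  end

  # One simplified cycle: mark the roots, then let each live holder clean up.
  def run(roots)
    @marked = {}
    roots.each { |obj| @marked[obj.object_id] = true }
    @declared.select! { |holder| alive?(holder) }
    @declared.each { |holder| holder.handle_weak_references(self) }
  end
end

class ToyWeakMap
  attr_reader :table

  def initialize(gc)
    @table = {}.compare_by_identity
    gc.declare_weak_references(self)
  end

  def []=(key, value)
    @table[key] = value
  end

  # Callback run after marking: drop entries whose key or value died.
  def handle_weak_references(gc)
    @table.delete_if { |key, value| !gc.alive?(key) || !gc.alive?(value) }
  end
end

gc = ToyGC.new
map = ToyWeakMap.new(gc)
k1, v1, k2, v2 = Object.new, Object.new, Object.new, Object.new
map[k1] = v1
map[k2] = v2
gc.run([map, k1, v1]) # k2 and v2 are not among the roots
puts map.table.size
```

The point of the sketch is that the holder deletes its own dead entries inside the callback, rather than having a slot overwritten with a sentinel and cleaning up later.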
2025-12-23	Move special const check to gc.c for rb_gc_impl_object_moved_p	Peter Zhu

2025-12-20	Check slot_size before zeroing memory for GC hook	Peter Zhu
	If the slot_size < RVALUE_SIZE then we would underflow in the memset.
2025-12-17	[DOC] Small changes to docs for ObjectSpace#each_object (#15564)	Luke Gruber
	Change example to use user-defined class instead of `Numeric`.
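A minimal sketch of what iterating over a user-defined class with `ObjectSpace.each_object` looks like. This is not the example from the documentation change itself; `Widget` and `live_widgets` are hypothetical names used only for illustration.

```ruby
# Hypothetical user-defined class; any class of your own works the same way.
class Widget; end

# Collect every live Widget instance that the object space currently knows of.
def live_widgets
  found = []
  ObjectSpace.each_object(Widget) { |widget| found << widget }
  found
end

# Hold strong references so the instances cannot be collected while we look.
widgets = Array.new(3) { Widget.new }
found = live_widgets
puts widgets.all? { |w| found.any? { |f| f.equal?(w) } }
```

Using a class you define yourself keeps the result small and predictable, because only instances your own code created can show up.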
2025-12-17	Rename to `struct rbimpl_size_overflow_tag`	Nobuyoshi Nakada
	This struct is used for addition not only for multiplication, so remove the word `mul`, and make the member names more descriptive.
2025-12-11	Add assumption to free_vm_weak_references	John Hawthorn
	Help the compiler know that we always get a heap object here.
2025-12-10	Add `NUM2PTR` and `PTR2NUM` macros	Nobuyoshi Nakada
	These macros have been defined here and there, so collect them.
2025-12-09	Fix typos in gc.c and gc.rb	hi

2025-12-07	Output ivar length for T_OBJECT in obj_info	Peter Zhu

2025-12-06	Fix id2ref for multi-Ractor	Peter Zhu
	The id2ref table needs to be under a VM lock to ensure there are no race conditions. The following script crashes:

    o = Object.new
    ObjectSpace._id2ref(o.object_id)
    10.times.map do
      Ractor.new do
        10_000.times do
          a = Object.new
          a.object_id
        end
      end
    end.map(&:value)

With:

    [BUG] Object ID seen, but not in _id2ref table: object_id=2800 object=T_OBJECT
    ruby 4.0.0dev (2025-12-06T15:15:43Z ractor-id2ref-fix e7f9abdc91) +PRISM [x86_64-linux]

    -- Control frame information -----------------------------------------------
    c:0001 p:---- s:0003 e:000002 l:y b:---- DUMMY [FINISH]

    -- Threading information ---------------------------------------------------
    Total ractor count: 5
    Ruby thread count for this ractor: 1

    -- C level backtrace information -------------------------------------------
    miniruby(rb_print_backtrace+0x14) [0x6047d09b2dff] vm_dump.c:1105
    miniruby(rb_vm_bugreport) vm_dump.c:1450
    miniruby(rb_bug_without_die_internal+0x5f) [0x6047d066bf57] error.c:1098
    miniruby(rb_bug) error.c:1116
    miniruby(rb_gc_get_ractor_newobj_cache+0x0) [0x6047d066c8dd] gc.c:2052
    miniruby(gc_sweep_plane+0xad) [0x6047d079276d] gc/default/default.c:3513
    miniruby(gc_sweep_page) gc/default/default.c:3605
    miniruby(gc_sweep_step) gc/default/default.c:3886
    miniruby(gc_sweep+0x1ba) [0x6047d0794cfa] gc/default/default.c:4154
    miniruby(gc_start+0xbf2) [0x6047d0796742] gc/default/default.c:6519
    miniruby(heap_prepare+0xcc) [0x6047d079748c] gc/default/default.c:2090
    miniruby(heap_next_free_page) gc/default/default.c:2305
    miniruby(newobj_cache_miss) gc/default/default.c:2412
    miniruby(newobj_alloc+0xd) [0x6047d0798ff5] gc/default/default.c:2436
    miniruby(rb_gc_impl_new_obj) gc/default/default.c:2515
    miniruby(newobj_of) gc.c:996
    miniruby(rb_wb_protected_newobj_of) gc.c:1046
    miniruby(str_alloc_embed+0x28) [0x6047d08fda18] string.c:1019
    miniruby(str_enc_new) string.c:1069
    miniruby(prep_io+0x5) [0x6047d07cda14] io.c:9305
    miniruby(prep_stdio) io.c:9347
    miniruby(rb_io_prep_stdin) io.c:9365
    miniruby(thread_start_func_2+0x77c) [0x6047d093a55c] thread.c:679
    miniruby(thread_sched_lock_+0x0) [0x6047d093aacd] thread_pthread.c:2241
    miniruby(co_start) thread_pthread_mn.c:469
2025-12-05	Revert "gc.c: Pass shape_id to `newobj_init`"	Peter Zhu
	This reverts commit 228d13f6ed914d1e7f6bd2416e3f5be8283be865. This commit makes default.c and mmtk.c depend on shape.h, which prevents them from building independently.
2025-12-05	Allow rb_thread_call_with_gvl() to work when thread already has GVL	Keenan Brock
	[Feature #20750] Co-authored-by: Benoit Daloze <eregontp@gmail.com>
2025-12-03	gc.c: check if the struct has fields before marking the fields_obj	Jean Boussier
	If GC triggers in the middle of `struct_alloc` and the struct has more than 3 elements, then the `fields_obj` reference is garbage. We must first check the shape to know if it was actually initialized.
2025-12-03	gc.c: Pass shape_id to `newobj_init`	Jean Boussier
	Attempt to fix the following SEGV:

```
ruby(gc_mark) ../src/gc/default/default.c:4429
ruby(gc_mark_children+0x45) [0x560b380bf8b5] ../src/gc/default/default.c:4625
ruby(gc_mark_stacked_objects) ../src/gc/default/default.c:4647
ruby(gc_mark_stacked_objects_all) ../src/gc/default/default.c:4685
ruby(gc_marks_rest) ../src/gc/default/default.c:5707
ruby(gc_marks+0x4e7) [0x560b380c41c1] ../src/gc/default/default.c:5821
ruby(gc_start) ../src/gc/default/default.c:6502
ruby(heap_prepare+0xa4) [0x560b380c4efc] ../src/gc/default/default.c:2074
ruby(heap_next_free_page) ../src/gc/default/default.c:2289
ruby(newobj_cache_miss) ../src/gc/default/default.c:2396
ruby(RB_SPECIAL_CONST_P+0x0) [0x560b380c5df4] ../src/gc/default/default.c:2420
ruby(RB_BUILTIN_TYPE) ../src/include/ruby/internal/value_type.h:184
ruby(newobj_init) ../src/gc/default/default.c:2136
ruby(rb_gc_impl_new_obj) ../src/gc/default/default.c:2500
ruby(newobj_of) ../src/gc.c:996
ruby(rb_imemo_new+0x37) [0x560b380d8bed] ../src/imemo.c:46
ruby(imemo_fields_new) ../src/imemo.c:105
ruby(rb_imemo_fields_new) ../src/imemo.c:120
```

I have no reproduction, but my understanding based on the backtrace and error is that GC is triggered inside `newobj_init`, causing the new object to be marked while in an incomplete state. I believe the fix is to pass the `shape_id` down to `newobj_init` so it can be set before the GC has a chance to trigger.
2025-12-03	Rename `rb_obj_exivar_p` -> `rb_obj_gen_fields_p`	Jean Boussier
	The "EXIVAR" terminology has been replaced by "gen fields" AKA "generic fields". Exivar implies variable, but generic fields include more than just variables, e.g. `object_id`.
2025-12-03	Handle NEWOBJ tracepoints settings fields	Jean Boussier
	[Bug #21710]

- struct.c: `struct_alloc`
  It is possible for a `NEWOBJ` tracepoint callback to write fields into a newly allocated object before `struct_alloc` has had the time to set the `RSTRUCT_GEN_FIELDS` flags and such. Hence we can't blindly initialize the `fields_obj` reference to `0`; we first need to check that no fields were added yet.
- object.c: `rb_class_allocate_instance`
  Similarly, if a `NEWOBJ` tracepoint tries to set fields on the object, the `shape_id` must already be set, as it's required on T_OBJECT to know where to write fields. `NEWOBJ_OF` had to be refactored to accept a `shape_id`.
2025-12-02	Box: Mark boxes when a class/module is originally defined in it.	Satoshi Tagomori
	When a class/module is defined by an extension library in a box, checking the types of instances of the class needs to access its data type (rb_data_type_t). So if a class still exists (not GCed), the box must exist too (and so must be marked).
2025-11-26	Eliminate redundant work and branching when marking T_OBJECT (#15274)	Luke Gruber

2025-11-20	Accurate GC.stat under multi-Ractor mode	John Hawthorn

2025-11-18	Fix crash in optimal size for large T_OBJECT	John Hawthorn
	Previously any T_OBJECT with >= 94 IVARs would crash during compaction attempting to make an object too large to embed.
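A sketch of the kind of script that exercises this path: objects with many instance variables that survive a compaction. This is not the commit's own test. `ManyIvars` and `compact_and_check` are hypothetical names, and the `GC.respond_to?(:compact)` guard assumes compaction may be unavailable on some platforms.

```ruby
# Hypothetical class whose instances carry `count` instance variables.
class ManyIvars
  def initialize(count)
    count.times { |i| instance_variable_set(:"@v#{i}", i) }
  end
end

# Build a few large objects, compact the heap if supported, and verify that
# every object still has all of its instance variables afterwards.
def compact_and_check(count)
  objs = Array.new(10) { ManyIvars.new(count) }
  GC.compact if GC.respond_to?(:compact)
  objs.all? do |obj|
    obj.instance_variables.size == count &&
      obj.instance_variable_get(:"@v#{count - 1}") == count - 1
  end
end

puts compact_and_check(100)
```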
2025-11-09	Make rb_gc_obj_optimal_size always return allocatable size	Peter Zhu
	Previously, it could return sizes that aren't allocatable for arrays and strings.
2025-11-08	Move rb_gc_verify_shareable to gc.c	Peter Zhu
	rb_gc_verify_shareable is not GC implementation specific so it should live in gc.c.
2025-11-07	renaming internal data structures and functions from namespace to box	Satoshi Tagomori

2025-11-04	Release VM lock before running finalizers (#15050)	Luke Gruber
	We shouldn't run any ruby code with the VM lock held.
2025-11-04	Fix rb_gc_impl_checking_shareable for modular GC	John Hawthorn
	This implements it the same as the other modular GC functions
2025-10-23	catch up modular-gc	Koichi Sasada

2025-10-23	use `SET_SHAREABLE`	Koichi Sasada
	to adopt strict shareable rule.
* (basically) shareable objects only refer to shareable objects
* (exception) shareable objects can refer to unshareable objects, but should not leak references to unshareable objects to the Ruby world
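The basic rule above is visible from Ruby through `Ractor.shareable?` and `Ractor.make_shareable`. The sketch below is only an illustration of that rule at the Ruby level and does not touch the internal `SET_SHAREABLE` macro. The variable names are made up for the example.

```ruby
# A frozen container that still points at a mutable String is not shareable,
# because a shareable object may only refer to shareable objects.
shallow = [String.new("a")].freeze

# make_shareable freezes the whole object graph, so the result qualifies.
deep = Ractor.make_shareable([String.new("a")])

puts Ractor.shareable?(shallow) # false
puts Ractor.shareable?(deep)    # true
puts deep.first.frozen?         # true
```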
2025-10-21	Move rb_class_classext_free to class.c	Peter Zhu

2025-10-02	ZJIT: Add `NoSingletonClass` patch point (#14680)	Stan Lo
	* ZJIT: Add NoSingletonClass patch point
  This patch point makes sure that when the object has a singleton class, the JIT code is invalidated. As of now, this is only needed for C call optimization. In YJIT, the singleton class guard only applies to Array, Hash, and String. But in ZJIT, we may optimize C calls from gems (e.g. `sqlite3`). So the patch point needs to be applied to a broader range of classes.
* ZJIT: Only generate NoSingletonClass guard when the type can have singleton class
* ZJIT: Update or forget NoSingletonClass patch point when needed
2025-09-30	ZJIT: Add --zjit-trace-exits (#14640)	Aiden Fox Ivey
	Add side exit tracing functionality for ZJIT
2025-09-25	ZJIT: Actually call rb_zjit_root_update_references()	Alan Wu
	Previously unused.
2025-09-24	Don't require to set PC before allocating hidden object	Alan Wu
	ZJIT doesn't set PC before rb_set_ivar(), and that allocates a managed ID table. Was a false positive from gc_validate_pc().
2025-09-23	gc_validate_pc(): Exclude imemos, add a test and explain the asserts	Alan Wu
	The validation is relevant only for traceable userland Ruby objects that Ruby code could interact with. ZJIT's use of rb_vm_method_cfunc_is() allocates a CC imemo and was failing this validation when it was actually fine. Relax the check.
2025-09-18	Prevent GC from running during `newobj_of` for internal_event_newobj.	Luke Gruber
	If another ractor is calling for GC, we need to prevent the current one from joining the barrier. Otherwise, our half-built object will be marked. The repro script was:

test.rb:

```ruby
require "objspace"
1000.times do
  ObjectSpace.trace_object_allocations do
    r = Ractor.new do
      _obj = 'a' * 1024
    end
    r.join
  end
end
```

$ untilfail lldb -b ./exe/ruby -o "target create ./exe/ruby" -o "run test.rb" -o continue

It would fail at `ractor_port_mark`, rp->r was a garbage value.

Credit to John for finding the solution.

Co-authored-by: John Hawthorn <john.hawthorn@shopify.com>
2025-09-17	Fill more of the slot with garbage	Peter Zhu

2025-09-17	Remove setting v1, v2, v3 when creating a new object	Peter Zhu
	Setting v1, v2, v3 when we allocate an object assumes that we always allocate 40 byte objects. By removing v1, v2, v3, we can make the base slot size another size.