[Feature #21084]
# Summary
The current way of marking weak references uses `rb_gc_mark_weak(VALUE *ptr)`.
This presents challenges because Ruby's GC is incremental, meaning that if the
`ptr` changes (e.g. realloc'd or free'd), then we could have an invalid memory
access. It also overwrites `*ptr` with `Qundef` if `*ptr` is dead, which prevents
any cleanup from being run (e.g. freeing memory or deleting entries from hash
tables). This ticket proposes `rb_gc_declare_weak_references` which declares
that an object has weak references and calls a cleanup function after marking,
allowing the object to clean up any memory for dead objects.
# Introduction
In [[Feature #19783]](https://bugs.ruby-lang.org/issues/19783), I introduced an
API that allows objects to mark weak references. The function signature looks
like this:
```c
void rb_gc_mark_weak(VALUE *ptr);
```
`rb_gc_mark_weak` is called during the marking phase of the GC to specify that
the memory at `ptr` holds a pointer to a Ruby object that is weakly referenced.
`rb_gc_mark_weak` appends this pointer to a list that is processed after the
marking phase of the GC. If the object at `*ptr` is no longer alive, then it
overwrites the object reference with a special value (`*ptr = Qundef`).
However, this API resulted in two challenges:
1. Ruby's default GC is incremental, which means that the GC is not run in one
phase, but rather split into chunks of work that interleave with Ruby
execution. The `ptr` passed into `rb_gc_mark_weak` could be on the malloc
heap, and that memory could be realloc'd or even free'd. We had to use
workarounds such as `rb_gc_remove_weak` to ensure that there were no illegal
memory accesses. This made `rb_gc_mark_weak` difficult to use, impacted
runtime performance, and increased memory usage.
2. When an object dies, `rb_gc_mark_weak` only overwrites the reference with
`Qundef`. This means that if we want to do any cleanup (e.g. free a piece of
memory or delete a hash table entry), we cannot do it at that point and have
to defer it elsewhere (e.g. during marking or runtime).
In this ticket, I'm proposing a new API for weak references. Instead of an
object marking its weak references during the marking phase, the object declares
that it has weak references using the `rb_gc_declare_weak_references` function.
This declaration occurs during runtime (e.g. after the object has been created)
rather than during GC.
After an object declares that it has weak references, it will have its callback
function called after marking as long as that object is alive. This callback
function can then call a special function `rb_gc_handle_weak_references_alive_p`
to determine whether its references are alive. This will allow the callback
function to do whatever it wants on the object, allowing it to perform any
cleanup work it needs.
This significantly simplifies the code for `ObjectSpace::WeakMap` and
`ObjectSpace::WeakKeyMap` because they no longer need the workarounds for
the limitations of `rb_gc_mark_weak`.
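The difference between the two designs can be illustrated with a toy model. The sketch below does not use Ruby's C API: `toy_object`, `toy_weak_list`, `toy_alive_p`, and `toy_weak_list_cleanup` are hypothetical names, and `toy_alive_p` only plays the role that the text assigns to `rb_gc_handle_weak_references_alive_p`. It shows a cleanup callback that runs after marking and removes dead entries outright instead of leaving a `Qundef`-style placeholder behind.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins: none of these names exist in Ruby's C API. */
typedef struct toy_object {
    bool marked; /* set by the (imaginary) marking phase */
} toy_object;

typedef struct toy_weak_list {
    toy_object *refs[8]; /* weakly referenced objects */
    size_t len;
} toy_weak_list;

/* Hypothetical liveness query, standing in for the check the callback
 * is described as making on each of its weak references. */
static bool toy_alive_p(const toy_object *obj)
{
    return obj->marked;
}

/* Record a weak reference in the owner. */
static void toy_weak_list_push(toy_weak_list *list, toy_object *obj)
{
    assert(list->len < sizeof(list->refs) / sizeof(list->refs[0]));
    list->refs[list->len++] = obj;
}

/* Cleanup callback: runs once after marking while the owner is alive.
 * Dead entries are dropped and the survivors are compacted in place. */
static void toy_weak_list_cleanup(toy_weak_list *list)
{
    size_t kept = 0;
    for (size_t i = 0; i < list->len; i++) {
        if (toy_alive_p(list->refs[i])) {
            list->refs[kept++] = list->refs[i];
        }
    }
    list->len = kept;
}
```

Because the callback owns the whole cleanup step, it can also free side buffers or delete hash table entries at the same point, which is the part the `Qundef` overwrite could not express.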
# Performance
The performance results below demonstrate that `ObjectSpace::WeakMap#[]=` is now
about 60% faster because the implementation has been simplified and the number
of allocations has been reduced. We can see that there is not a significant
impact on the performance of `ObjectSpace::WeakMap#[]`.
Base:
```
ObjectSpace::WeakMap#[]=
4.620M (± 6.4%) i/s (216.44 ns/i) - 23.342M in 5.072149s
ObjectSpace::WeakMap#[]
30.967M (± 1.9%) i/s (32.29 ns/i) - 154.998M in 5.007157s
```
Branch:
```
ObjectSpace::WeakMap#[]=
7.336M (± 2.8%) i/s (136.31 ns/i) - 36.755M in 5.013983s
ObjectSpace::WeakMap#[]
30.902M (± 5.4%) i/s (32.36 ns/i) - 155.901M in 5.064060s
```
Code:
```ruby
require "bundler/inline"

gemfile do
  source "https://rubygems.org"
  gem "benchmark-ips"
end

wmap = ObjectSpace::WeakMap.new
key = Object.new
val = Object.new
wmap[key] = val

Benchmark.ips do |x|
  x.report("ObjectSpace::WeakMap#[]=") do |times|
    i = 0
    while i < times
      wmap[Object.new] = Object.new
      i += 1
    end
  end

  x.report("ObjectSpace::WeakMap#[]") do |times|
    i = 0
    while i < times
      wmap[key]
      wmap[val] # does not exist
      i += 1
    end
  end
end
```
# Alternative designs
Currently, `rb_gc_declare_weak_references` is designed to be an internal-only
API. This allows us to assume the object types that call
`rb_gc_declare_weak_references`. In the future, if we want to open up this API
to third parties, we may want to change this function to something like:
```c
void rb_gc_add_cleaner(VALUE obj, void (*callback)(VALUE obj));
```
This will allow the third party to implement a custom `callback` that gets
called after the marking phase of GC to clean up any dead references. I chose
not to implement this design because it is less efficient as we would need to
store a mapping from `obj` to `callback`, which requires extra memory.
|
|
[Bug #21710]
- struct.c: `struct_alloc`
It is possible for a `NEWOBJ` tracepoint callback to write fields
into a newly allocated object before `struct_alloc` has had time
to set the `RSTRUCT_GEN_FIELDS` flags and such.
Hence we can't blindly initialize the `fields_obj` reference to `0`;
we first need to check that no fields were added yet.
- object.c: `rb_class_allocate_instance`
Similarly, if a `NEWOBJ` tracepoint tries to set fields on the object,
the `shape_id` must already be set, as it's required on T_OBJECT to
know where to write fields.
`NEWOBJ_OF` had to be refactored to accept a `shape_id`.
|
|
to adopt the strict shareable rule.
* (basically) shareable objects only refer to shareable objects
* (exception) shareable objects can refer to unshareable objects
but should not leak references to unshareable objects to the Ruby world
|
|
* `RB_OBJ_SET_SHAREABLE(obj)` makes `obj` shareable.
All objects reachable from `obj` should be shareable.
* `RB_OBJ_SET_FROZEN_SHAREABLE(obj)` does the same as above,
but freezes `obj` before making it shareable.
Also, `rb_gc_verify_shareable(obj)` is introduced to strictly check
that `obj` does not violate the shareable rule (a shareable object
only refers to shareable objects).
The rule has some exceptions: some shareable objects can refer to
unshareable objects. For example, a Ractor object (which is a shareable
object) can refer to Ractor-local objects.
To handle such cases, a `check_shareable` flag is also introduced.
The `STRICT_VERIFY_SHAREABLE` macro is also introduced to verify
the strict shareable rule at `SET_SHAREABLE`.
|
|
If we still fit in the existing imemo/fields object we can
update it atomically, saving a reallocation.
|
|
And get rid of the `obj_to_id_tbl`
It's no longer needed, the `object_id` is now stored inline
in the object alongside instance variables.
We still need the inverse table in case `_id2ref` is invoked, but
we lazily build it by walking the heap if that happens.
Handling `object_id` is also no longer a concern of the GC
implementation, but of generic code.
Co-Authored-By: Matt Valentine-House <matt@eightbitraptor.com>
Notes:
Merged: https://github.com/ruby/ruby/pull/13159
|
|
Using `rb_obj_clone` introduces other problems, such as invoking
`initialize_*` callbacks in the context of the parent ractor.
So we can revert to copying the content of the object slots,
but in a way that is aware of size pools.
Notes:
Merged: https://github.com/ruby/ruby/pull/13070
|
|
[Bug #20271]
[Bug #20267]
[Bug #20255]
`rb_obj_alloc(RBASIC_CLASS(obj))` will always allocate from the basic
40B pool, so if `obj` is larger than `40B`, we'll create a corrupted
object when we later copy the shape_id.
Instead we can use the same logic as ractor copy, which is
to use `rb_obj_clone`, and later ask the GC to free the original
object.
We then must turn it into a `T_OBJECT`, because otherwise
just changing its class to `RactorMoved` leaves a lot of
ways to keep using the object, e.g.:
```
a = [1, 2, 3]
Ractor.new{}.send(a, move: true)
[].concat(a) # Should raise, but didn't.
```
If it turns out that `rb_obj_clone` isn't performant enough
for some uses, we can always have carefully crafted specialized
paths for the types that would benefit from it.
Notes:
Merged: https://github.com/ruby/ruby/pull/13008
|
|
This function replaces the internal rb_obj_gc_flags API. rb_gc_object_metadata
returns an array of name and value pairs, with the last element having
0 for the name.
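A minimal sketch of how a caller might walk such a sentinel-terminated array, assuming each entry is a name and value pair and that a name of 0 marks the end. The struct layout and the `toy_` helpers are hypothetical and only illustrate the convention described above, not the actual Ruby declarations.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical entry layout: the integers stand in for an ID and a VALUE. */
struct toy_metadata_entry {
    uintptr_t name; /* 0 terminates the array */
    uintptr_t val;
};

/* Count the entries that precede the terminating entry. */
static size_t toy_metadata_count(const struct toy_metadata_entry *entries)
{
    assert(entries != NULL);
    size_t n = 0;
    while (entries[n].name != 0) {
        n++;
    }
    return n;
}

/* Look up `name`; returns 1 and stores the value in *out when found, else 0. */
static int toy_metadata_lookup(const struct toy_metadata_entry *entries,
                               uintptr_t name, uintptr_t *out)
{
    assert(entries != NULL && name != 0);
    for (; entries->name != 0; entries++) {
        if (entries->name == name) {
            *out = entries->val;
            return 1;
        }
    }
    return 0;
}
```

Terminating with a zero name means the caller needs no separate length argument, at the cost of reserving 0 as a name that can never be used for real data.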
Notes:
Merged: https://github.com/ruby/ruby/pull/12777
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/12546
|
|
When reference updating ObjectSpace.trace_object_allocations, we need to
check whether the object is valid or not because it does not mark the
object so the object may be dead. This can cause a segmentation fault
if the object is on a free heap page.
For example, the following script crashes:

    require "objspace"

    objs = []
    ObjectSpace.trace_object_allocations do
      1_000_000.times do
        objs << Object.new
      end
    end
    objs = nil

    # Free pages that the objs were on
    GC.start

    # Run compaction and check that it doesn't crash
    GC.compact

Notes:
Merged: https://github.com/ruby/ruby/pull/12360
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/12271
|
|
We should use the rb_gc_enable/rb_gc_disable_no_rest APIs instead of
directly setting the ruby_disable_gc variable.
Notes:
Merged: https://github.com/ruby/ruby/pull/12264
|
|
We have name fragmentation for this feature, including "shared GC",
"modular GC", and "external GC". This commit standardizes the feature
name to "modular GC" and the implementation to "GC library".
Notes:
Merged: https://github.com/ruby/ruby/pull/12261
|
|
So that it doesn't get included in the generated binaries for builds
that don't support loading shared GC modules
Co-Authored-By: Peter Zhu <peter@peterzhu.ca>
Notes:
Merged: https://github.com/ruby/ruby/pull/12149
|
|
Use PR_SET_VMA_ANON_NAME to set human-readable names for anonymous
virtual memory areas mapped by `mmap()` when compiled and run on Linux
5.17 or higher. This makes it convenient for developers to debug mmap.
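As a rough illustration of the mechanism, the sketch below maps an anonymous region and then tries to name it with `prctl`. The helper `toy_map_named_region` is hypothetical and is not the code added by this commit. The fallback `#define`s carry the values from the Linux UAPI headers for toolchains whose headers predate 5.17. The naming call is best effort: on kernels without support (older than 5.17, or built without `CONFIG_ANON_VMA_NAME`) it fails and the mapping simply stays unnamed.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS on strict glibc builds */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#ifdef __linux__
#include <sys/prctl.h>
#endif

/* Fallbacks for older headers; values as defined in the Linux UAPI. */
#ifndef PR_SET_VMA
#define PR_SET_VMA 0x53564d41
#endif
#ifndef PR_SET_VMA_ANON_NAME
#define PR_SET_VMA_ANON_NAME 0
#endif

/* Hypothetical helper: map `len` bytes of anonymous memory and try to
 * attach a human-readable name to the mapping. Returns NULL on failure. */
static void *toy_map_named_region(size_t len, const char *name)
{
    assert(len > 0 && name != NULL);
    void *addr = mmap(NULL, len, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (addr == MAP_FAILED) {
        return NULL;
    }
#ifdef __linux__
    /* Best effort: the result is ignored because naming is only a
     * debugging aid and unsupported kernels reject the request. */
    (void)prctl(PR_SET_VMA, PR_SET_VMA_ANON_NAME,
                (unsigned long)addr, (unsigned long)len, (unsigned long)name);
#endif
    return addr;
}
```

On a kernel that supports it, a region named this way appears in `/proc/<pid>/maps` as `[anon:<name>]` rather than as an unlabeled range.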
Notes:
Merged: https://github.com/ruby/ruby/pull/12119
|
|
This will add +MOD_GC to the version string and Ruby description when
Ruby is compiled with shared gc support.
When shared GC support is compiled in and a GC module has been loaded
using RUBY_GC_LIBRARY, the version string will include the name of
the currently active GC as reported by the rb_gc_active_gc_name function
in the form
+MOD_GC[gc_name]
[Feature #20794]
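The two forms of the suffix can be sketched as follows. `toy_format_gc_suffix` is a hypothetical helper, not the function used in Ruby's version code; it assumes that a `NULL` name means no GC module has been loaded.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: writes "+MOD_GC" when no module is active, or
 * "+MOD_GC[name]" when one is. Returns what snprintf returns. */
static int toy_format_gc_suffix(char *buf, size_t size, const char *gc_name)
{
    assert(buf != NULL && size > 0);
    if (gc_name == NULL) {
        return snprintf(buf, size, "+MOD_GC");
    }
    return snprintf(buf, size, "+MOD_GC[%s]", gc_name);
}
```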
Notes:
Merged: https://github.com/ruby/ruby/pull/11872
|
|
Now that we've inlined the eden_heap into the size_pool, we should
rename the size_pool to heap, so that Ruby contains multiple heaps, each
holding objects of a different size.
The term heap as a collection of memory pages is more common in memory
management nomenclature, whereas size_pool was a name chosen out of
necessity during the development of the Variable Width Allocation
features of Ruby.
The concept of size pools was introduced in order to facilitate
different sized objects (other than the default 40 bytes). They wrapped
the eden heap and the tomb heap, and some related state, and provided a
reasonably simple way of duplicating all related concerns, to provide
multiple pools that all shared the same structure but held different
objects.
Since then various changes have happened in Ruby's memory layout:
* The concept of tomb heaps has been replaced by a global free pages list,
with each page having its slot size reconfigured at the point when it
is resurrected.
* The eden heap has been inlined into the size pool itself, so that now
the size pool directly controls the free_pages list, the sweeping
page, the compaction cursor and the other state that was previously
being managed by the eden heap.
Now that there is no need for a heap wrapper, we should refer to the
collection of pages containing Ruby objects as a heap again rather than
a size pool.
Notes:
Merged: https://github.com/ruby/ruby/pull/11771
|
|
This commit splits gc.c into two files:
- gc.c now only contains code not specific to Ruby GC. This includes
code to mark objects (which the GC implementation may choose not to
use) and wrappers for internal APIs that the implementation may need
to use (e.g. locking the VM).
- gc_impl.c now contains the implementation of Ruby's GC. This includes
marking, sweeping, compaction, and statistics. Most importantly,
gc_impl.c only uses public APIs in Ruby and a limited set of functions
exposed in gc.c. This allows us to build gc_impl.c independently of
Ruby and plug Ruby's GC into itself.
|
|
Many places call ruby_mimmalloc then MEMZERO. This can be reduced by
using ruby_mimcalloc instead.
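The same simplification in plain C, as a sketch that uses the standard `malloc`, `memset`, and `calloc` instead of Ruby's internal allocators. `struct toy_state` and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical structure that must start out zeroed. */
struct toy_state {
    int count;
    void *head;
};

/* Before: allocate, then zero the memory in a second step. */
static struct toy_state *toy_state_new_two_step(void)
{
    struct toy_state *s = malloc(sizeof(*s));
    if (s != NULL) {
        memset(s, 0, sizeof(*s));
    }
    return s;
}

/* After: a single call that returns zeroed memory. */
static struct toy_state *toy_state_new_calloc(void)
{
    return calloc(1, sizeof(struct toy_state));
}

/* Returns nonzero when every field still has its zero value. */
static int toy_state_is_blank(const struct toy_state *s)
{
    assert(s != NULL);
    return s->count == 0 && s->head == NULL;
}
```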
|
|
Co-Authored-By: Peter Zhu <peter@peterzhu.ca>
|
|
This PR moves `rb_copy_wb_protected_attribute` and
`rb_gc_copy_finalizer` into a single function called
`rb_gc_copy_attributes` to be called by `init_copy`. This reduces the
surface area of the GC API.
Co-authored-by: Peter Zhu <peter@peterzhu.ca>
|
|
ruby_env_debug_option gets called after Init_gc_stress, so the
--debug=gc_stress flag never works.
|
|
It is not used anywhere else.
|
|
It's not used outside of gc.c.
|
|
This removes the assumption about SIZE_POOL_COUNT for shapes.
|
|
Marking `Qnil` or `Qfalse` works fine, having
an extra macro to avoid it isn't needed.
|
|
rb_objspace_marked_object_p is no longer used in the objspace module, so
we can remove it.
|
|
rb_objspace_data_type_memsize is not used in the objspace module, so we
can make it private.
|
|
I was trying to debug an (unrelated) issue in the GC, and wanted to turn
on the trace-level GC output by compiling it with -DRGENGC_DEBUG=5.
Unfortunately, this actually causes a crash in newobj_init() because the
code there tries to log the obj_info() of the newly created object.
However, the object is not actually sufficiently set up for some of the
things that obj_info() tries to do:
* The instance variable table for a class is not yet initialized, and
when using variable-length RVALUES, said ivar table is embedded in
as-yet uninitialized memory after the struct RValue. Attempting to read
this, as obj_info() does, causes a crash.
* T_DATA variables need to dereference their ->type field to print out
the underlying C type name, which is not set up until newobj_fill() is
called.
To fix this, create a new method `obj_info_basic`, which dumps out only
the parts of the object that are valid before the object is fully
initialized.
[Fixes #18795]
|
|
When a generic instance variable has a shape, it is marked movable. If
it transitions to too complex, it needs to update references, otherwise
it may have incorrect references.
|
|
Previously the growth was 3(embed), 6, 12, 24, ...
With this change it's now 3(embed), 8, 16, 32, 64, ... by default.
However, since a power of two isn't the best size for all allocators,
if `malloc_usable_size` is available, we use it to discover the best
offset.
On Linux/glibc 2.35 for instance, the growth will be 3(embed), 7, 15, 31
to avoid wasting 8B per object.
Test program:
```c
#include <malloc.h> /* malloc_usable_size on glibc; other allocators may declare it elsewhere */
#include <stdio.h>
#include <stdlib.h>

/* Assumption: a VALUE is pointer-sized, as on common 64-bit builds. */
#define VALUE_SIZE (sizeof(void *))

/* Allocate room for `slots` VALUEs and report how many extra bytes
 * the allocator actually reserved for that request. */
size_t test(size_t slots) {
    size_t allocated = slots * VALUE_SIZE;
    void *test_ptr = malloc(allocated);
    size_t wasted = malloc_usable_size(test_ptr) - allocated;
    free(test_ptr);
    fprintf(stderr, "slots = %zu, wasted_bytes = %zu\n", slots, wasted);
    return wasted;
}

int main(int argc, char *argv[]) {
    size_t best_padding = 0;
    size_t padding = 0;
    /* Find the smallest padding (0..2 slots) that wastes no bytes. */
    for (padding = 0; padding <= 2; padding++) {
        size_t wasted = test(8 - padding);
        if (wasted == 0) {
            best_padding = padding;
            break;
        }
    }
    size_t index = 0;
    fprintf(stderr, "=============== naive ================\n");
    size_t list_size = 4;
    for (index = 0; index < 10; index++) {
        test(list_size);
        list_size *= 2;
    }
    fprintf(stderr, "=============== auto-padded (-%zu) ================\n", best_padding);
    list_size = 4;
    for (index = 0; index < 10; index++) {
        test(list_size - best_padding);
        list_size *= 2;
    }
    fprintf(stderr, "\n\n");
    return 0;
}
```
```
===== glibc ======
slots = 8, wasted_bytes = 8
slots = 7, wasted_bytes = 0
=============== naive ================
slots = 4, wasted_bytes = 8
slots = 8, wasted_bytes = 8
slots = 16, wasted_bytes = 8
slots = 32, wasted_bytes = 8
slots = 64, wasted_bytes = 8
slots = 128, wasted_bytes = 8
slots = 256, wasted_bytes = 8
slots = 512, wasted_bytes = 8
slots = 1024, wasted_bytes = 8
slots = 2048, wasted_bytes = 8
=============== auto-padded (-1) ================
slots = 3, wasted_bytes = 0
slots = 7, wasted_bytes = 0
slots = 15, wasted_bytes = 0
slots = 31, wasted_bytes = 0
slots = 63, wasted_bytes = 0
slots = 127, wasted_bytes = 0
slots = 255, wasted_bytes = 0
slots = 511, wasted_bytes = 0
slots = 1023, wasted_bytes = 0
slots = 2047, wasted_bytes = 0
```
```
========== jemalloc =======
slots = 8, wasted_bytes = 0
=============== naive ================
slots = 4, wasted_bytes = 0
slots = 8, wasted_bytes = 0
slots = 16, wasted_bytes = 0
slots = 32, wasted_bytes = 0
slots = 64, wasted_bytes = 0
slots = 128, wasted_bytes = 0
slots = 256, wasted_bytes = 0
slots = 512, wasted_bytes = 0
slots = 1024, wasted_bytes = 0
slots = 2048, wasted_bytes = 0
=============== auto-padded (-0) ================
slots = 4, wasted_bytes = 0
slots = 8, wasted_bytes = 0
slots = 16, wasted_bytes = 0
slots = 32, wasted_bytes = 0
slots = 64, wasted_bytes = 0
slots = 128, wasted_bytes = 0
slots = 256, wasted_bytes = 0
slots = 512, wasted_bytes = 0
slots = 1024, wasted_bytes = 0
slots = 2048, wasted_bytes = 0
```
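The resulting growth sequences can be reproduced with a small helper. This is a sketch under the assumption that each capacity is a power of two (starting at 8) minus the discovered padding; `toy_next_capacity` is a hypothetical name and not the function Ruby uses.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: return the smallest (power of two - padding) that
 * is strictly larger than `current`, starting from a power of two of 8. */
static size_t toy_next_capacity(size_t current, size_t padding)
{
    assert(padding <= 2); /* the test program above probes 0..2 slots */
    size_t pow2 = 8;
    while (pow2 - padding <= current) {
        pow2 *= 2;
    }
    return pow2 - padding;
}
```

With a padding of 0 this steps from the embedded capacity of 3 to 8, 16, 32, and so on. With a padding of 1, as measured on glibc above, it steps to 7, 15, 31.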
|
|
WeakMap can crash during compaction because the st_insert could allocate
memory.
|
|
If we are in the middle of incremental marking, then Ruby code can
execute that deallocates memory buffers that have been passed to
rb_gc_mark_weak, which can cause use-after-free bugs.
Notes:
Merged: https://github.com/ruby/ruby/pull/8375
|
|
This is an internal-only function not exposed to the C extension API.
Its only use so far is from rb_vm_mark, where it's used to mark the
values in the vm->trap_list.cmd array.
There shouldn't be any reason why these cannot move.
This commit allows them to move by updating their references during the
reference updating step of compaction.
To do this we've introduced another internal function
rb_gc_update_values as a partner to rb_gc_mark_values.
This allows us to refactor rb_gc_mark_values to not pin
Notes:
Merged: https://github.com/ruby/ruby/pull/8341
|
|
[Feature #19783]
This commit adds support for weak references in the GC through the
function `rb_gc_mark_weak`. Unlike strong references, weak references
do not mark the object, but rather let the GC know that an object
refers to another one. If the child object is freed, the pointer from
the parent object is overwritten with `Qundef`.
Co-Authored-By: Jean Boussier <byroot@ruby-lang.org>
Notes:
Merged: https://github.com/ruby/ruby/pull/8113
|
|
[Feature #18885]
For now, the optimizations performed are:
- Run a major GC
- Compact the heap
- Promote all surviving objects to oldgen
Other optimizations may follow.
Notes:
Merged: https://github.com/ruby/ruby/pull/7662
|