| Age | Commit message | Author |
|
[Bug #21548]
In lazy sweeping, if we need to allocate an object in a heap where we
weren't able to free any slots, but we also either have empty pages or
could allocate new pages, then we want to preemptively claim a page
because it's possible that sweeping another heap will call gc_sweep_finish_heap,
which may use up all of the empty/allocatable pages. If other heaps are
not finished sweeping then we do not finish this GC and we will end up
triggering a new GC cycle during this GC phase.
|
|
|
|
rb_gc_impl_writebarrier_remember is not Ractor safe because it writes to
bitmaps and also pushes onto the mark stack during incremental marking.
We should acquire the VM lock to prevent race conditions.
In the case that the object is not old, there is no performance impact.
However, we can see a performance impact in this microbenchmark where the
object is old:
    4.times.map do
      Ractor.new do
        ary = []
        3.times { GC.start }
        10_000_000.times do |i|
          ary.push(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17)
          ary.clear
        end
      end
    end.map(&:value)
Before:
Time (mean ± σ): 682.4 ms ± 5.1 ms [User: 2564.8 ms, System: 16.0 ms]
After:
Time (mean ± σ): 5.522 s ± 0.096 s [User: 8.237 s, System: 7.931 s]
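The locking pattern described in this message can be sketched in plain Ruby. This is only an illustrative model under stated assumptions: `ToyObject`, `ToyRememberSet`, and the `old` flag are hypothetical names, a `Mutex` stands in for the VM lock, and a `Hash` stands in for the remember bitmap. It shows why young objects pay nothing (they return before the lock) while old objects pay for the lock on every call.

```ruby
# Hypothetical model; none of these names exist in Ruby's GC.
ToyObject = Struct.new(:id, :old)

class ToyRememberSet
  attr_reader :remembered, :mark_stack, :lock_acquisitions

  def initialize(incremental_marking:)
    @lock = Mutex.new          # stands in for the VM lock
    @remembered = {}           # stands in for the remember bitmap
    @mark_stack = []
    @incremental_marking = incremental_marking
    @lock_acquisitions = 0
  end

  def remember(obj)
    # Fast path: a young object never touches shared state, so no lock is taken.
    return false unless obj.old

    @lock.synchronize do
      @lock_acquisitions += 1
      return false if @remembered[obj.id] # already remembered
      @remembered[obj.id] = true
      @mark_stack << obj if @incremental_marking
      true
    end
  end
end
```

In this model, calling `remember` repeatedly on an old object takes the lock every time, which mirrors the contention the microbenchmark measures.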
Co-Authored-By: Luke Gruber <luke.gruber@shopify.com>
Co-Authored-By: John Hawthorn <john@hawthorn.email>
|
|
Assuming not all objects are moved during compaction, it
is preferable to avoid rewriting references that haven't moved
so as to avoid invalidating potentially shared memory pages.
|
|
|
|
objspace->flags.immediate_sweep shares the same word as
objspace->flags.during_incremental_marking. So in gc_start we need to
assign it after gc_enter() so that we hold the VM lock and have issued a
barrier, as rb_gc_impl_writebarrier is reading
objspace->flags.during_incremental_marking.
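As a general illustration of why flags that share a word are coupled, the sketch below models the word as an Integer. The bit positions are hypothetical, and the two-writer interleaving is an assumption chosen to make the hazard visible; the commit itself is about a reader of `during_incremental_marking` racing with the store to `immediate_sweep`. The point is only that setting one flag is a read-modify-write of the whole word.

```ruby
# Hypothetical bit positions; the real layout of objspace->flags is not shown here.
IMMEDIATE_SWEEP            = 0b01
DURING_INCREMENTAL_MARKING = 0b10

# An unsynchronized interleaving of two updates to the same word.
def interleaved_flag_update
  flags_word = 0
  stale_copy = flags_word                   # writer A loads the whole word
  flags_word |= DURING_INCREMENTAL_MARKING  # writer B sets its bit meanwhile
  flags_word = stale_copy | IMMEDIATE_SWEEP # writer A stores back, erasing B's bit
  flags_word
end

# The same two updates performed one after the other.
def ordered_flag_update
  flags_word = 0
  flags_word |= DURING_INCREMENTAL_MARKING
  flags_word |= IMMEDIATE_SWEEP
  flags_word
end
```

Serializing the store behind a lock and barrier, as the message describes, corresponds to the ordered case.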
|
|
Some GC modules, notably MMTk, support parallel GC, i.e. multiple GC
threads work in parallel during a GC. Currently, when two GC threads
scan two iseq objects simultaneously when YJIT is enabled, both threads
will attempt to borrow `CodeBlock::mem_block`, which results in a
panic.
This commit makes one part of the change.
We now set the YJIT code memory to writable in bulk before the
reference-updating phase, and reset it to executable in bulk after the
reference-updating phase. Previously, YJIT lazily set memory pages
writable while updating object references embedded in JIT-compiled
machine code, and set the memory back to executable by calling
`mark_all_executable`. This approach is inherently unfriendly to
parallel GC because (1) it borrows `CodeBlock::mem_block`, and (2) it
sets the whole `CodeBlock` as executable which races with other GC
threads that are updating other iseq objects. It also has performance
overhead due to the frequent invocation of system calls. We now set the
permission of all the code memory in bulk before and after the reference
updating phase. Multiple GC threads can now perform raw memory writes
in parallel. We should also see a performance improvement during moving
GC because of the reduced number of `mprotect` system calls.
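To make the difference in permission changes concrete, here is a toy counting model. It is not YJIT code: `ToyCodeMemory`, `lazy_update`, and `bulk_update` are hypothetical, each counted permission change stands in for one `mprotect` call, and each iseq is represented simply as the list of page indexes its references live on.

```ruby
# Toy model of code memory; each permission change stands in for one mprotect call.
class ToyCodeMemory
  attr_reader :permission_changes, :writes

  def initialize(page_count)
    @writable = Array.new(page_count, false)
    @permission_changes = 0
    @writes = 0
  end

  def make_writable(page)
    return if @writable[page]
    @writable[page] = true
    @permission_changes += 1
  end

  def make_all_writable
    @writable.fill(true)
    @permission_changes += 1
  end

  def make_all_executable
    @writable.fill(false)
    @permission_changes += 1
  end

  def write(page)
    raise "page #{page} is not writable" unless @writable[page]
    @writes += 1
  end
end

# Old approach (as modelled here): flip pages lazily, reset after every iseq.
def lazy_update(memory, iseqs)
  iseqs.each do |pages|
    pages.each do |page|
      memory.make_writable(page)
      memory.write(page)
    end
    memory.make_all_executable
  end
  memory.permission_changes
end

# New approach: one flip before the phase, raw writes, one flip after.
def bulk_update(memory, iseqs)
  memory.make_all_writable
  iseqs.each { |pages| pages.each { |page| memory.write(page) } }
  memory.make_all_executable
  memory.permission_changes
end
```

With three iseqs touching pages `[[0, 1], [1, 2], [3]]`, the lazy model performs eight permission changes and the bulk model performs two, for the same five writes.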
|
|
The asan and valgrind macros when BUILDING_MODULAR_GC don't use the variables,
which could cause the compiler to emit unused variable warnings.
|
|
We assert that the GC is not in a cycle in gc_start, but it does not show
what phase we're in if the assertion fails. This commit adds a debug
message for when the assertion fails.
|
|
|
|
This commit adds file and line to GC VM locking functions for debugging
purposes and adds upper case macros to pass __FILE__ and __LINE__.
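The idea of recording the call site on lock entry can be sketched in Ruby. This is an analogy, not the C implementation: `TracedLock` and its methods are hypothetical, and because Ruby has no macros the caller passes `__FILE__` and `__LINE__` explicitly where the C code would use an upper case macro to do so.

```ruby
# Hypothetical lock wrapper that remembers where it was acquired.
class TracedLock
  attr_reader :held_at

  def initialize
    @mutex = Mutex.new
    @held_at = nil
  end

  # Callers pass their own location, e.g. lock.enter(__FILE__, __LINE__).
  def enter(file, line)
    @mutex.lock
    @held_at = "#{file}:#{line}"
  end

  def leave
    @held_at = nil
    @mutex.unlock
  end
end
```

When a lock is unexpectedly held, `held_at` names the call site that took it, which is the kind of debugging information the message refers to.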
Notes:
Merged: https://github.com/ruby/ruby/pull/13550
|
|
Some GC implementations want to always know when an object is written to,
even if the written value is a special constant. Checking special constants
in rb_obj_written was a micro-optimization that made assumptions about
the GC implementation.
Notes:
Merged: https://github.com/ruby/ruby/pull/13497
|
|
We don't want the default GC to depend on Ruby internals so we can build
it as a modular GC.
Notes:
Merged: https://github.com/ruby/ruby/pull/13476
|
|
* Added `Ractor::Port`
* `Ractor::Port#receive` (support multi-threads)
* `Ractor::Port#close`
* `Ractor::Port#closed?`
* Added some methods
* `Ractor#join`
* `Ractor#value`
* `Ractor#monitor`
* `Ractor#unmonitor`
* Removed some methods
* `Ractor#take`
* `Ractor.yield`
* Change the spec
* `Ractor.select`
You can wait for multiple sequences of messages with `Ractor::Port`.
```ruby
ports = 3.times.map{ Ractor::Port.new }
ports.map.with_index do |port, ri|
  Ractor.new port,ri do |port, ri|
    3.times{|i| port << "r#{ri}-#{i}"}
  end
end
p ports.each{|port| pp 3.times.map{port.receive}}
```
In this example, we use 3 ports, and 3 Ractors send messages to them respectively.
We can receive a series of messages from each port.
You can use `Ractor#value` to get the last value of a Ractor's block:
```ruby
result = Ractor.new do
  heavy_task()
end.value
```
You can wait for the termination of a Ractor with `Ractor#join` like this:
```ruby
Ractor.new do
  some_task()
end.join
```
`#value` and `#join` are similar to `Thread#value` and `Thread#join`.
To implement `#join`, `Ractor#monitor` (and `Ractor#unmonitor`) is introduced.
This commit changes the `Ractor.select()` method.
It now only accepts ports or Ractors, and returns when a port receives a message or a Ractor terminates.
We remove `Ractor.yield` and `Ractor#take` because:
* `Ractor::Port` supports most similar use cases in a simpler manner.
* Removing them significantly simplifies the code.
We also change the internal thread scheduler code (thread_pthread.c):
* During barrier synchronization, we keep the `ractor_sched` lock to avoid deadlocks.
This lock is released by `rb_ractor_sched_barrier_end()`
which is called at the end of operations that require the barrier.
* Fix potential deadlock issues by checking interrupts just before setting UBF.
https://bugs.ruby-lang.org/issues/21262
Notes:
Merged: https://github.com/ruby/ruby/pull/13445
|
|
This makes `RBobject` `4B` larger on 32 bit systems
but simplifies the implementation a lot.
[Feature #21353]
Co-authored-by: Jean Boussier <byroot@ruby-lang.org>
Notes:
Merged: https://github.com/ruby/ruby/pull/13341
|
|
We unpoison slots allocated out of the GC, so we don't need to disable
the assertions that read from the memory.
Notes:
Merged: https://github.com/ruby/ruby/pull/13351
|
|
We can assume that the compiler will have __builtin_clzll so we can implement
nlz_int64 using that.
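For reference, the quantity computed by `nlz_int64` (the number of leading zero bits in a 64-bit value) can be expressed in Ruby with `Integer#bit_length`. This is only a sketch of the arithmetic, not the C code. Note one difference: `__builtin_clzll(0)` is undefined in C, so a C implementation must treat zero separately, whereas this Ruby version returns 64 for zero.

```ruby
# Number of leading zero bits in an unsigned 64-bit value.
def nlz_int64(x)
  unless x.is_a?(Integer) && x.between?(0, 2**64 - 1)
    raise ArgumentError, "expected an unsigned 64-bit integer"
  end
  64 - x.bit_length
end
```

For example, `nlz_int64(1)` is 63 because only the lowest bit is set, and `nlz_int64(2**63)` is 0 because the highest bit is set.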
Notes:
Merged: https://github.com/ruby/ruby/pull/13351
|
|
This allows RVALUE_OVERHEAD to be defined elsewhere.
Notes:
Merged: https://github.com/ruby/ruby/pull/13381
|
|
The finalizer table can't be read or modified without the VM lock.
Notes:
Merged: https://github.com/ruby/ruby/pull/13350
|
|
Fix a regression introduced by: https://github.com/ruby/ruby/pull/13155
Notes:
Merged: https://github.com/ruby/ruby/pull/13350
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/13350
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/13340
|
|
The table is global so accesses must be synchronized.
Notes:
Merged: https://github.com/ruby/ruby/pull/13349
|
|
The previous implementation assumed the size of `RBasic` is `2 * sizeof(VALUE)`.
We might as well not make that assumption and use a proper `sizeof`.
Co-Authored-By: John Hawthorn <john@hawthorn.email>
Notes:
Merged: https://github.com/ruby/ruby/pull/13348
|
|
This avoids a race condition where we were clearing the cache from
another ractor while it was in use. Oops!
Fixes this failure http://ci.rvm.jp/results/master@oci-aarch64/5750416
Notes:
Merged: https://github.com/ruby/ruby/pull/13294
|
|
Followup: https://github.com/ruby/ruby/pull/13286
Notes:
Merged: https://github.com/ruby/ruby/pull/13288
|
|
After fork we reset to single ractor mode (which IMO we shouldn't do,
but it requires more work to fix) and so we need to add the pending
object counts back to the main heap.
Notes:
Merged: https://github.com/ruby/ruby/pull/13286
|
|
This allows the default GC to not need debug_counter.h when building as a
modular GC.
Notes:
Merged: https://github.com/ruby/ruby/pull/13269
|
|
We don't need to check for USE_DEBUG_COUNTER because the code is no-op
if USE_DEBUG_COUNTER is not enabled.
Notes:
Merged: https://github.com/ruby/ruby/pull/13269
|
|
And get rid of the `obj_to_id_tbl`
It's no longer needed, the `object_id` is now stored inline
in the object alongside instance variables.
We still need the inverse table in case `_id2ref` is invoked, but
we lazily build it by walking the heap if that happens.
The `object_id` is also no longer a GC implementation
concern; it is now handled by generic code.
Co-Authored-By: Matt Valentine-House <matt@eightbitraptor.com>
Notes:
Merged: https://github.com/ruby/ruby/pull/13159
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/13264
|
|
Currently the count of allocated objects for a heap is incremented
without regard to parallelism, which leads to incorrect counts.
By maintaining a local counter in the ractor newobj cache, and only
syncing atomically with some granularity, we can improve the correctness
without increasing contention.
The allocated object count is also synced when the ractor is freed.
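The batching scheme described above can be sketched with threads standing in for ractors. Everything here is hypothetical: the class names, the `SYNC_EVERY` granularity of 1024, and the use of a `Mutex` in place of an atomic add. Each worker counts locally and only touches the shared total once per batch, plus a final flush that mirrors syncing when the ractor is freed.

```ruby
# Shared per-heap total; the Mutex stands in for an atomic add.
class SharedAllocationCount
  def initialize
    @mutex = Mutex.new
    @total = 0
  end

  def add(n)
    @mutex.synchronize { @total += n }
  end

  def total
    @mutex.synchronize { @total }
  end
end

# Per-worker cache that batches increments before syncing.
class LocalAllocationCache
  SYNC_EVERY = 1024 # hypothetical granularity

  def initialize(shared)
    @shared = shared
    @pending = 0
  end

  def count_allocation
    @pending += 1
    flush if @pending >= SYNC_EVERY
  end

  def flush
    @shared.add(@pending)
    @pending = 0
  end
end

def count_in_parallel(workers:, allocations_each:)
  shared = SharedAllocationCount.new
  threads = workers.times.map do
    Thread.new do
      cache = LocalAllocationCache.new(shared)
      allocations_each.times { cache.count_allocation }
      cache.flush # final sync, like the sync when a ractor is freed
    end
  end
  threads.each(&:join)
  shared.total
end
```

Between flushes the shared total can lag behind by up to `SYNC_EVERY - 1` per worker, which is the trade-off for touching shared state less often.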
Co-authored-by: Jean Boussier <jean.boussier@gmail.com>
Notes:
Merged: https://github.com/ruby/ruby/pull/13192
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/13181
|
|
In cb1ea54bbf6cdf49c53f42720fec1a151069810c I added one more
metadata flag, but didn't notice `RB_GC_OBJECT_METADATA_ENTRY_COUNT`
had to be incremented.
This should fix ASAN builds.
Interestingly, bdb25959fb047af0358f33d7327b7752dca14aa4 already
caused the count to be off by one, so I had to increment it by
2.
Notes:
Merged: https://github.com/ruby/ruby/pull/13179
|
|
Given that the performance of the currently planned ractor local GC
implementation will be heavily influenced by the number of shareable
objects, it would be valuable to know how many of them
are in the heap.
|
|
This makes the finalizer table fully self-contained, so the GC no
longer needs to delay cleaning the `obj_to_id_tbl`.
Notes:
Merged: https://github.com/ruby/ruby/pull/13155
|
|
|
|
`objspace->finalizer_table` must be synchronized,
otherwise concurrent insertion from multiple ractors
will cause a crash.
Repro:
```ruby
ractors = 5.times.map do |i|
  Ractor.new do
    100_000.times.map do
      o = Object.new
      ObjectSpace.define_finalizer(o, ->(id) {})
      o
    end
  end
end
ractors.each(&:take)
```
Notes:
Merged: https://github.com/ruby/ruby/pull/13151
|
|
This inverse table is only useful if `ObjectSpace._id2ref` is used,
which is extremely rare. The only notable exception is the `drb` gem
and even then it has an option not to rely on `_id2ref`.
So if we assume this table will never be looked up, we can just
not maintain it, and if it turns out `_id2ref` is called, we
can lock the VM and re-build it.
```
compare-ruby: ruby 3.5.0dev (2025-04-10T09:44:40Z master 684cfa42d7) +YJIT +PRISM [arm64-darwin24]
built-ruby: ruby 3.5.0dev (2025-04-10T10:13:43Z lazy-id-to-obj d3aa9626cc) +YJIT +PRISM [arm64-darwin24]
warming up..
| |compare-ruby|built-ruby|
|:----------|-----------:|---------:|
|baseline | 26.364M| 25.974M|
| | 1.01x| -|
|object_id | 10.293M| 14.202M|
| | -| 1.38x|
```
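The lazy rebuild strategy described in this message can be modelled in a few lines. This is a sketch under assumptions, not the VM's data structure: `LazyIdTable` is hypothetical, an Array stands in for the heap, and a `Mutex` stands in for locking the VM. The inverse table costs nothing until the first lookup, at which point it is built by walking every tracked object.

```ruby
# Hypothetical model of an id -> object table that is built only on demand.
class LazyIdTable
  def initialize
    @lock = Mutex.new
    @objects = []      # stands in for the heap
    @id_to_obj = nil   # inverse table, absent until first lookup
  end

  def track(obj)
    @lock.synchronize do
      @objects << obj
      # Only maintain the inverse table once it has been built.
      @id_to_obj[obj.object_id] = obj if @id_to_obj
    end
    obj
  end

  def inverse_table_built?
    !@id_to_obj.nil?
  end

  def id2ref(id)
    @lock.synchronize do
      # First lookup: walk every tracked object and build the table.
      @id_to_obj ||= @objects.each_with_object({}) { |o, h| h[o.object_id] = o }
      @id_to_obj.fetch(id) { raise RangeError, "#{id} is not a tracked id" }
    end
  end
end
```

Programs that never call `id2ref` never pay for the table, which is the case the benchmark above measures.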
Notes:
Merged: https://github.com/ruby/ruby/pull/13115
|
|
[Bug #21214]
If we allocate objects where one heap holds transient objects and another
holds long lived objects, then the heap with transient objects will grow
along with the heap with long lived objects, causing higher memory usage.
For example, we can see this issue in this script:
    def allocate_small_object = []
    def allocate_large_object = Array.new(10)
    arys = Array.new(1_000_000) do
      # Allocate 10 small transient objects
      10.times { allocate_small_object }
      # Allocate 1 large object that is persistent
      allocate_large_object
    end
    pp GC.stat
    pp GC.stat_heap
Before this change:
    heap_live_slots: 2837243
    {0 =>
      {slot_size: 40,
       heap_eden_pages: 1123,
       heap_eden_slots: 1838807},
     2 =>
      {slot_size: 160,
       heap_eden_pages: 2449,
       heap_eden_slots: 1001149},
    }
After this change:
    heap_live_slots: 1094474
    {0 =>
      {slot_size: 40,
       heap_eden_pages: 58,
       heap_eden_slots: 94973},
     2 =>
      {slot_size: 160,
       heap_eden_pages: 2449,
       heap_eden_slots: 1001149},
    }
Notes:
Merged: https://github.com/ruby/ruby/pull/13061
|
|
That seemed like the logical thing to do to me, but ko1 disagrees.
Notes:
Merged: https://github.com/ruby/ruby/pull/13008
|
|
[Bug #20271]
[Bug #20267]
[Bug #20255]
`rb_obj_alloc(RBASIC_CLASS(obj))` will always allocate from the basic
40B pool, so if `obj` is larger than `40B`, we'll create a corrupted
object when we later copy the shape_id.
Instead we can use the same logic as ractor copy, which is
to use `rb_obj_clone`, and later ask the GC to free the original
object.
We then must turn it into a `T_OBJECT`, because otherwise
just changing its class to `RactorMoved` leaves a lot of
ways to keep using the object, e.g.:
```
a = [1, 2, 3]
Ractor.new{}.send(a, move: true)
[].concat(a) # Should raise, but didn't.
```
If it turns out that `rb_obj_clone` isn't performant enough
for some uses, we can always have carefully crafted specialized
paths for the types that would benefit from it.
Notes:
Merged: https://github.com/ruby/ruby/pull/13008
|
|
Was reading some assembly and noticed the dead branches generated for
FL_TEST(). Just a quick basic pass to change the obvious places; there
may be other opportunities.
Notes:
Merged: https://github.com/ruby/ruby/pull/12980
Merged-By: XrXr
|
|
It's not used outside of default.c.
Notes:
Merged: https://github.com/ruby/ruby/pull/12964
|
|
It's not used outside of default.c.
Notes:
Merged: https://github.com/ruby/ruby/pull/12964
|
|
Moving object_id dumping from ObjectSpace to the GC flags allows ObjectSpace
to not assume the FL_SEEN_OBJ_ID flag and instead move it to the responsibility
of the GC.
Notes:
Merged: https://github.com/ruby/ruby/pull/12915
|
|
There are 7 entries in RB_GC_OBJECT_METADATA_ENTRY_COUNT.
|
|
This will allow ObjectSpace.dump to output the age of the object.
Notes:
Merged: https://github.com/ruby/ruby/pull/12777
|
|
This function replaces the internal rb_obj_gc_flags API. rb_gc_object_metadata
returns an array of name and value pairs, with the last element having
0 for the name.
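Consuming a list that is terminated by a sentinel entry, as described above, can be sketched as follows. The `MetadataEntry` struct, the helper name, and the sample entry names are hypothetical; only the convention that a name of 0 marks the end comes from the message.

```ruby
# Hypothetical stand-in for one name/value pair.
MetadataEntry = Struct.new(:name, :value)

# Collect entries up to, but not including, the terminating entry whose name is 0.
def collect_metadata(entries)
  result = {}
  entries.each do |entry|
    break if entry.name == 0 # sentinel: end of the list
    result[entry.name] = entry.value
  end
  result
end
```

A caller does not need to know the number of entries in advance; it simply stops at the sentinel.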
Notes:
Merged: https://github.com/ruby/ruby/pull/12777
|
|
We had been using a stub weak definition of `mprotect` in wasm/missing.c
so far, but wasi-sdk 23 added mprotect emulation to wasi-libc[^1], so the
emulation is now linked instead. However, the emulation doesn't support
PROT_NONE and fails with ENOSYS, so we need to avoid calling mprotect
completely on WASI.
[^1]: https://github.com/WebAssembly/wasi-libc/commit/7528b13170462c82e367d91ae0ecead84e470ceb
Notes:
Merged: https://github.com/ruby/ruby/pull/12776
|