Age | Commit message | Author |
|
Notes:
Merged: https://github.com/ruby/ruby/pull/5909
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/5897
Merged-By: nobu <nobu@ruby-lang.org>
|
|
We don't need to allocate a new page in gc_sweep_finish_size_pool.
It can be allocated when needed.
Notes:
Merged: https://github.com/ruby/ruby/pull/5885
|
|
Some size pools may not have any pages/slots, so total_slots is 0. This
causes a divide-by-zero in the calculation. This commit adds a special
case to catch the case when total_slots is 0 and returns the number of
pages for heap_init_slots.
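A rough Ruby model of the guard, for illustration only. The real code is C in gc.c; the method name, the constants, and the growth formula here are hypothetical stand-ins chosen to show where the division by total_slots happens.

```ruby
# Hypothetical values, not the ones gc.c uses.
HEAP_INIT_SLOTS = 10_000
SLOTS_PER_PAGE  = 409

# Returns how many pages a size pool should have after growing.
def pages_to_extend(total_slots:, free_slots:, used_pages:, goal_ratio: 0.4)
  if total_slots.zero?
    # Empty size pool: skip the ratio calculation entirely and return
    # the number of pages needed to hold HEAP_INIT_SLOTS slots.
    return (HEAP_INIT_SLOTS.to_f / SLOTS_PER_PAGE).ceil
  end

  # Without the guard above, total_slots == 0 makes this 0.0 / 0.0.
  growth = (total_slots - free_slots).to_f / ((1 - goal_ratio) * total_slots)
  (growth * used_pages).ceil
end
```

In C the unguarded integer or floating-point division would be the failure point; in this Ruby model the same input would produce NaN and raise on `ceil`.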
Notes:
Merged: https://github.com/ruby/ruby/pull/5885
|
|
If the size pool has no or few pages/slots, then min_free_slots will
be a very small number (or even 0). Then the heap won't be eligible to
grow, causing GC thrashing or infinite loops.
Notes:
Merged: https://github.com/ruby/ruby/pull/5885
|
|
Size pools with no pages won't be swept, so gc_sweep_finish_size_pool
will never be called on them, but gc_sweep_finish_size_pool must be
called to grow the size pool.
Notes:
Merged: https://github.com/ruby/ruby/pull/5885
|
|
Depending on alignment, the last bitmap plane may not be used. Then it will
appear as if all of the objects on that plane are unmarked, which will
cause a buffer overrun when we try to free the object. This commit
changes the loop to calculate the number of planes used
(bitmap_plane_count).
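A simplified Ruby sketch of the idea: iterate only over the planes that actually cover slots, rather than over every plane allocated in the bitmap. The 64-bit plane width and the helper names other than bitmap_plane_count are assumptions, and the real calculation in gc.c also has to account for where the first slot starts within the page.

```ruby
BITS_PER_PLANE = 64 # assumed width of one bitmap word

# Number of bitmap planes needed to cover total_slots slots.
def bitmap_plane_count(total_slots)
  (total_slots + BITS_PER_PLANE - 1) / BITS_PER_PLANE
end

# Returns the slot numbers whose mark bit is clear. mark_bits is an
# array of Integers, one per allocated plane; planes beyond the used
# count are never read, so their all-zero bits are not mistaken for
# unmarked objects.
def unmarked_slots(mark_bits, total_slots)
  free = []
  bitmap_plane_count(total_slots).times do |plane|
    BITS_PER_PLANE.times do |bit|
      slot = plane * BITS_PER_PLANE + bit
      break if slot >= total_slots
      free << slot if mark_bits[plane][bit].zero?
    end
  end
  free
end
```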
Notes:
Merged: https://github.com/ruby/ruby/pull/5885
|
|
Since 4d8f76286beefbb8f7fba2479f6d0a0b4a47304c, we need to dereference
the includer field on iclasses, so we need to mark it to make sure
it's alive.
Sometimes during compaction we crash because the field is dangling,
though I have a hard time constructing such a situation. See
http://ci.rvm.jp/results/trunk@ruby-iga/3947725
Notes:
Merged: https://github.com/ruby/ruby/pull/5890
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/5884
|
|
`start` is of type uintptr_t so it does not need to be cast to VALUE.
|
|
We didn't update the includer field during compaction so it could become
a dangling pointer after compaction. It's only recently that we started
to dereference the field, and we were only comparing the pointer before
then, so the omission only recently started to cause crashes.
By instrumenting object.c:833 with `rp(includer);`, you can see the
includer field become `T_NONE` with the following script:
```ruby
mod = Module.new do
  protected def foo = 1
end
klass = Class.new do
  include Module.new
  def run
    foo
  end
end
klass.include(mod)
GC.verify_compaction_references(double_heap: true, toward: :empty)
klass.new.run
```
I found a crash in a private application that this patch fixes, but
wasn't able to develop a small reproducer. Hence the above demo that
requires instrumentation.
Notes:
Merged: https://github.com/ruby/ruby/pull/5880
|
|
|
|
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/5783
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/5775
|
|
During VM startup, rb_objspace_alloc sets malloc_limit
(objspace->malloc_params.limit) before ruby_gc_set_params is called, thus
nullifying the effect of RUBY_GC_MALLOC_LIMIT before the initial GC run.
The call sequence is as follows:
main.c::main()
  ruby_init
    ruby_setup
      Init_BareVM
        rb_objspace_alloc // malloc_limit = gc_params.malloc_limit_min;
  ruby_options
    ruby_process_options
      process_options
        ruby_gc_set_params // RUBY_GC_MALLOC_LIMIT => gc_params.malloc_limit_min
With ruby_gc_set_params setting malloc_limit, RUBY_GC_MALLOC_LIMIT
affects the process sooner.
[ruby-core:107170]
|
|
WASM does not have proper support for mmap.
Notes:
Merged: https://github.com/ruby/ruby/pull/5749
|
|
Commit dde164e968e382d50b07ad4559468885cbff33ef decoupled incremental
marking from page sizes. This commit changes Ruby heap page sizes to
64KiB. Doing so will have several benefits:
1. We can use compaction on systems with 64KiB system page sizes (e.g.
PowerPC).
2. Larger page sizes will allow Variable Width Allocation to increase
slot sizes and embed larger objects.
3. Since commit 002fa2859962f22de8afdbeece04966ea57b7da9, macOS has 64
KiB pages. Making page sizes 64 KiB will bring these systems to
parity.
I have attached some benchmark results below.
Discourse:
On Discourse, we saw much better p99 performance (e.g. for "categories"
it went from 214ms on master to 134ms on branch, for "home" it went
from 265ms to 251ms). We don’t see much change in p60, p75, and p90
performance. We also see a slight decrease in memory usage by 1.04x.
Branch RSS: 354.9MB
Master RSS: 368.2MB
railsbench:
On railsbench, we don’t see a big change in RPS or p99
performance. We don’t see a big difference in memory usage.
Branch RPS: 826.27
Master RPS: 824.85
Branch p99: 1.67
Master p99: 1.72
Branch RSS: 88.72MB
Master RSS: 88.48MB
liquid:
We don’t see a significant change in liquid performance.
Branch parse & render: 28.653 i/s
Master parse & render: 28.563 i/s
Notes:
Merged: https://github.com/ruby/ruby/pull/5749
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/5746
|
|
Currently, rb_aligned_malloc uses mmap if Ruby heap pages can be
allocated through mmap (when system heap page size <= Ruby heap page
size). If the Ruby heap page size is increased to 64KiB, then mmap will
be used on systems with 64KiB system page sizes. However, the transient
heap also uses rb_aligned_malloc and requires 32KiB alignment. This
would break in the current implementation since it would allocate sizes
through mmap that are not a multiple of the system page size.
This commit adds heap_page_body_allocate which will use mmap when
possible and changes rb_aligned_malloc to not use mmap (and only
use posix_memalign).
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/5637
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/5637
|
|
This commit changes the way compaction moves objects and sweeps pages in
order to better facilitate object movement between size pools.
Previously we would move the scan cursor first until we found an empty
slot and then we'd decrement the compact cursor until we found something
to move into that slot. We would sweep the page that contained the scan
cursor before trying to fill it.
In this algorithm we first move the compact cursor down until we find an
object to move. We then take a free page from the desired destination
heap (always the same heap in this current iteration of the code).
If there is no free page we sweep the page at the sweeping_page cursor,
add it to the free pages, and advance the cursor to the next page, and
try again.
We sweep one page from each size pool in this way, and then repeat that
process until all the size pools are compacted (all the cursors have
met), and then we update references and sweep the rest of the heap.
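The cursor movement can be modelled in Ruby as below. This is a toy sketch under heavy assumptions: a single heap, pages represented as arrays with nil for empty slots, and "taking a free page" reduced to taking free slots from already-swept pages. The method and variable names are hypothetical and it does not update references.

```ruby
# pages: array of pages, each an array of objects or nil (empty slot).
def compact!(pages)
  sweep_cursor   = 0                 # next page to sweep for free slots
  compact_cursor = pages.length - 1  # page currently being evacuated
  free = []                          # [page, slot] pairs on swept pages

  while compact_cursor >= sweep_cursor
    page = pages[compact_cursor]
    page.each_index do |slot|
      next if page[slot].nil?

      # No free slot available: sweep the page at the sweep cursor,
      # add its empty slots to the free list, advance, and try again.
      while free.empty? && sweep_cursor < compact_cursor
        pages[sweep_cursor].each_index do |s|
          free << [sweep_cursor, s] if pages[sweep_cursor][s].nil?
        end
        sweep_cursor += 1
      end
      break if free.empty? # the cursors have met

      dest_page, dest_slot = free.shift
      pages[dest_page][dest_slot] = page[slot]
      page[slot] = nil
    end
    compact_cursor -= 1
  end
  pages
end
```

Objects only ever move to pages below the compact cursor, so the loop ends once the two cursors meet.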
Notes:
Merged: https://github.com/ruby/ruby/pull/5637
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/5720
|
|
Currently, the number of incremental marking steps is calculated based
on the number of pooled pages available. This means that if we make Ruby
heap pages larger, it would run fewer incremental marking steps (which
would mean each incremental marking step takes longer).
This commit changes incremental marking to run after every
INCREMENTAL_MARK_STEP_ALLOCATIONS number of allocations. This means that
the behaviour of incremental marking remains the same regardless of the
Ruby heap page size.
I've benchmarked against discourse benchmarks and did not get a
significant change in response times beyond the margin of error. This is
expected as this new incremental marking algorithm behaves very
similarly to the previous one.
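As a minimal Ruby model of the allocation-count trigger: the class below is a hypothetical stand-in for the allocator, and the value 500 for INCREMENTAL_MARK_STEP_ALLOCATIONS is assumed for the sketch.

```ruby
INCREMENTAL_MARK_STEP_ALLOCATIONS = 500 # value assumed for this sketch

class ToyAllocator
  attr_reader :mark_steps

  def initialize
    @allocations_since_step = 0
    @mark_steps = 0
  end

  # Called once per object allocation, independent of page size.
  def allocate
    @allocations_since_step += 1
    return if @allocations_since_step < INCREMENTAL_MARK_STEP_ALLOCATIONS

    @allocations_since_step = 0
    @mark_steps += 1 # stand-in for running one incremental marking step
  end
end
```

Because the counter tracks allocations rather than pooled pages, the number of steps does not depend on how many slots fit in a heap page.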
Notes:
Merged: https://github.com/ruby/ruby/pull/5732
|
|
* Prefixed ccan headers
* Remove unprefixed names in ccan/build_assert
* Remove unprefixed names in ccan/check_type
* Remove unprefixed names in ccan/container_of
* Remove unprefixed names in ccan/list
Co-authored-by: Samuel Williams <samuel.williams@oriontransfer.co.nz>
Notes:
Merged-By: ioquatix <samuel@codeotaku.com>
|
|
|
|
Use ISEQ_BODY macro to get the rb_iseq_constant_body of the ISeq. Using
this macro will make it easier for us to change the allocation strategy
of rb_iseq_constant_body when using Variable Width Allocation.
Notes:
Merged: https://github.com/ruby/ruby/pull/5698
|
|
Previously, we would build a new `superclasses` array for each class,
even though for all immediate subclasses of a class, the array is
identical.
This avoids duplicating the arrays on leaf classes (those without
subclasses) by calculating and storing a "superclasses including self"
array on a class when it's first inherited and sharing that among all
subclasses.
An additional trick used is that the "superclass array including self"
is valid as "self"'s superclass array. It just has its own class at the
end. We can use this to avoid an extra pointer of storage and can use
one bit of a flag to track that we've "upgraded" the array.
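The sharing scheme can be sketched in Ruby as follows. ToyClass and its fields are hypothetical; the real implementation is C and stores the array and the flag on the class itself.

```ruby
class ToyClass
  attr_reader :name, :depth, :chain

  def initialize(name, parent = nil)
    @name = name
    @with_self = false # the "upgraded" flag bit
    if parent
      @depth = parent.depth + 1
      @chain = parent.chain_including_self # shared, not copied
    else
      @depth = 0
      @chain = []
    end
  end

  # On first inheritance, upgrade the stored array so that it ends
  # with this class. Every later subclass reuses the same array.
  def chain_including_self
    unless @with_self
      @chain = (@chain.first(@depth) + [self]).freeze
      @with_self = true
    end
    @chain
  end

  # The upgraded array is still valid for self: ignore the last entry.
  def superclasses
    @chain.first(@depth)
  end
end
```

Leaf classes never build an array of their own; they only point at the array owned by their parent.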
Notes:
Merged: https://github.com/ruby/ruby/pull/5604
|
|
Previously when checking ancestors, we would walk all the way up the
ancestry chain checking each parent for a matching class or module.
I believe this was especially unfriendly to CPU cache since for each
step we need to check two cache lines (the class and class ext).
This check is used quite often in:
* case statements
* rescue statements
* Calling protected methods
* Class#is_a?
* Module#===
* Module#<=>
I believe it's most common to check a class against a parent class, so
this commit aims to improve that (unfortunately it does not help when
checking for an included Module).
This is done by storing on each class the number and an array of all
parent classes, in order (BasicObject is at index 0). Using this we can
check whether a class is a subclass of another in constant time since we
know the location to expect it in the hierarchy.
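A small Ruby model of the constant-time check. The Node class is hypothetical and copies the array per class for simplicity; only the lookup matters here.

```ruby
class Node
  attr_reader :superclasses

  def initialize(parent = nil)
    # All parent classes in order; the root ends up at index 0.
    @superclasses = parent ? parent.superclasses + [parent] : []
  end

  def depth
    @superclasses.length
  end

  # A class can only be an ancestor of self if it sits at index
  # other.depth in self's array, so one lookup replaces the walk
  # up the ancestry chain.
  def subclass_of?(other)
    equal?(other) || @superclasses[other.depth].equal?(other)
  end
end
```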
Notes:
Merged: https://github.com/ruby/ruby/pull/5568
|
|
Changes size and capacity of darray to size_t to support more
elements.
Adds functions to darray that use GC allocation functions.
Notes:
Merged: https://github.com/ruby/ruby/pull/5546
|
|
`ObjectSpace::WeakMap#each*` should check key's liveness.
fix [Bug #18586]
Notes:
Merged: https://github.com/ruby/ruby/pull/5556
|
|
(1) gc_verify_internal_consistency() uses barrier locking
for consistency while `during_gc == true` at the end
of the sweep on `RGENGC_CHECK_MODE >= 2`.
(2) `rb_objspace_reachable_objects_from()` is called without
VM synchronization and it checks `during_gc != true`.
So (1) and (2) together cause a BUG because of `during_gc == true`.
To prevent this error, wait for VM barrier on `during_gc == false`
and introduce VM locking on `rb_objspace_reachable_objects_from()`.
http://ci.rvm.jp/results/trunk-asserts@phosphorus-docker/3830088
Notes:
Merged: https://github.com/ruby/ruby/pull/5552
|
|
Cached mark stack chunks should also be freed when freeing objspace.
Notes:
Merged: https://github.com/ruby/ruby/pull/5536
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/5523
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/5523
|
|
gc_marks_continue will start sweeping when it finishes marking. However,
if the heap we are trying to allocate into is full, then the sweeping
may not yield any free slots. If we don't call gc_sweep_continue
immediately after this, then another GC will be started halfway during
lazy sweeping. gc_sweep_continue will either grow the heap or finish
sweeping.
Notes:
Merged: https://github.com/ruby/ruby/pull/5521
|
|
Add a new macro BASE_SLOT_SIZE that determines the slot size.
For Variable Width Allocation (compiled with USE_RVARGC=1), all slot
sizes are powers-of-2 multiples of BASE_SLOT_SIZE.
For USE_RVARGC=0, BASE_SLOT_SIZE is set to sizeof(RVALUE).
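For illustration, the resulting slot sizes can be computed in Ruby like this. The value 40 (sizeof(RVALUE) on a typical 64-bit build) and the pool count of 5 are assumptions, and the helper names are hypothetical.

```ruby
BASE_SLOT_SIZE  = 40 # assumed: sizeof(RVALUE) on a typical 64-bit build
SIZE_POOL_COUNT = 5  # assumed number of size pools

# Each size pool's slot size is BASE_SLOT_SIZE times a power of 2.
def slot_sizes
  Array.new(SIZE_POOL_COUNT) { |i| BASE_SLOT_SIZE << i }
end

# Index of the smallest size pool whose slots can hold `bytes`,
# or nil if the request is larger than the biggest slot.
def size_pool_index_for(bytes)
  slot_sizes.index { |size| size >= bytes }
end
```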
Notes:
Merged: https://github.com/ruby/ruby/pull/5517
|
|
The for loops are not correctly iterating heap pages in
gc_verify_heap_page.
Notes:
Merged: https://github.com/ruby/ruby/pull/5503
|
|
Define `MAP_ANONYMOUS` to `MAP_ANON` if undefined on old systems.
Notes:
Merged: https://github.com/ruby/ruby/pull/5506
Merged-By: nobu <nobu@ruby-lang.org>
|
|
|
|
|
|
These places never replace the value, so call rb_id_table_foreach_values
instead of rb_id_table_foreach_values_with_replace.
Notes:
Merged: https://github.com/ruby/ruby/pull/5486
|
|
Renames rb_id_table_foreach_with_replace to
rb_id_table_foreach_values_with_replace and passes only the value to the
callback. We can use this in GC compaction when we cannot access the
global symbol array.
Notes:
Merged: https://github.com/ruby/ruby/pull/5486
|
|
The if statement is redundant since if `index == 0` then
`BITS_BITLENGTH * index == 0`.
Notes:
Merged: https://github.com/ruby/ruby/pull/5479
|
|
NUM_IN_PAGE could return a value much larger than 64. According to the
C11 spec 6.5.7 paragraph 3 this is undefined behavior:
> If the value of the right operand is negative or is greater than or
> equal to the width of the promoted left operand, the behavior is
> undefined.
On most platforms, this is usually not a problem as the architecture
will mask off all out-of-range bits.
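One common way to keep such a shift in range is to split the slot number into a word index and a bit offset below the word width. The Ruby sketch below models that arithmetic only; it is not necessarily the change made in this commit, the method names are illustrative, and Ruby Integers do not overflow, so the undefined behaviour itself cannot be reproduced here.

```ruby
BITS_BITLENGTH = 64
WORD_MASK = (1 << BITS_BITLENGTH) - 1 # models a 64-bit word

# Which bitmap word holds the bit for this slot number.
def bitmap_index(num_in_page)
  num_in_page / BITS_BITLENGTH
end

# The bit within that word. The shift count is reduced modulo the
# word width first, so it is always in 0...BITS_BITLENGTH.
def bitmap_bit(num_in_page)
  (1 << (num_in_page % BITS_BITLENGTH)) & WORD_MASK
end
```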
Notes:
Merged: https://github.com/ruby/ruby/pull/5478
|
|
WebAssembly doesn't support signals, so we can't use read barriers,
and therefore we can't use compaction.
Notes:
Merged: https://github.com/ruby/ruby/pull/5475
|
|
|
|
WebAssembly has an unbounded number of function-local registers and
stack values, but there is no way to scan the values in a call stack
for now. This implementation uses Asyncify to spill wasm locals out
into linear memory.
Notes:
Merged: https://github.com/ruby/ruby/pull/5407
|
|
WASI does not yet support signals.
Notes:
Merged: https://github.com/ruby/ruby/pull/5407
|