| Age | Commit message (Collapse) | Author |
|
There seems to be another manifestation of bug #20021, where some of the
compaction tests are failing on i686 for unrelated PR's because of fake
"live" references to moved objects on the machine stack.
We _could_ solve this by counting how many objects are pinned during
compaction, but doing that involves pushing down the mark & pin bitset
merge into gc_compact_plane and out of gc_compact_page, which I thought
was pretty ugly.
Now that we've solved bug #20022 though, we're able to compact
arbitrarily many objects with GC.verify_compaction_references, so the
number of objects we're moving is now 50,000 instead of 500. Since
that's now much larger than the number of objects likely to be pinned, I
think it's safe enough to just add a fudge-factor to the tests.
Any _other_ change in GC.verify_compaction_references that breaks
compaction is now highly likely to break the assertion by more than 10
objects, since it's operating on so many more in the first place.
[Bug #20021]
|
|
The intention of GC.verify_compaction_references is, I believe, to force
every single movable object to be moved, so that it's possible to debug
native extensions which not correctly updating their references to
objects they mark as movable.
To do this, it doubles the number of allocated pages for each size pool,
and sorts the heap pages so that the free ones are swept first; thus,
every object in an old page should be moved into a free slot in one of
the new pages.
This worked fine until movement of objects _between_ size pools during
compaction was implemented. That causes some problems for
verify_compaction_references:
* We were doubling the number of pages in each size pool, but actually
if some objects need to move into a _different_ pool, there's no
guarantee that they'll be enough room in that one.
* It's possible for the sweep & compact cursors to meet in one size pool
before all the objects that want to move into that size pool from
another are processed by the compaction.
You can see these problems by changing some of the movement tests in
test_gc_compact.rb to try and move e.g. 50,000 objects instead of
500; the test is not able to actually move all of the objects in a
single compaction run.
To fix this, we do two things in verify_compaction_references:
* Firstly, we add enough pages to every size pool to make them the same
size. This ensures that their compact cursors will all have space to
move during compaction (even if that means empty pages are
pointlessly compacted)
* Then, we examine every object and determine where it _wants_ to be
compacted into. We use this information to add additional pages to
each size pool to handle all objects which should live there.
With these two changes, we can move arbitrary amounts of objects into
the correct size pool in a single call to verify_compaction_references.
My _motivation_ for performing this work was to try and fix some test
stability issues in test_gc_compact.rb. I now no longer think that we
actually see this particular bug in rubyci.org, but I also think
verify_compaction_references should do what it says on the tin, so it's
worth keeping.
[Bug #20022]
|
|
There have been some sproradically flaky tests related to GC compaction,
which fail with:
1) Failure:
TestGCCompact#test_moving_hashes_down_size_pools [/test/ruby/test_gc_compact.rb:442]:
Expected 499 to be >= 500.
What's happening here, is that, _sometimes_, depending on very unlucky
combinations of machine things, one of the expected-to-be-moved hashes
might be found on the machine stack during GC, and thus pinned.
One factor which seems to make this _more_ likely is that GCC 11 on
Ubuntu 22.04 seems to want to allocate 440 bytes of stack space for
`gc_start`, which is much more than it actually uses on the common code
path. The result is that there are some 50-odd VALUE-sized cells "live"
on the stack which may well contain valid heap pointers from previous
function calls, and will need to be pinned.
This is, of course, totally normal and expected; Ruby's GC is
conservative and if there is the possibility that a VALUE might be live
on the machine stack, it can't be moved. However, it does make these
tests flaky.
This commit "fixes" the tests by performing the work in a fiber; the
fiber goes out of scope and should be collected by the call to
verify_compaction_references, so there should be no references to the
to-be-moved objects floating around on the machine stack.
Fixes [#20021]
|
|
This reverts commit c5abe0d08f8f7686422e6eef374cf8c78aefacb6.
Notes:
Merged: https://github.com/ruby/ruby/pull/8166
|
|
Needs more investigations.
|
|
This test is flaky on "SPARC Solaris 10 (gcc)" CI with this message:
TestGCCompact#test_moving_objects_between_size_pools [test/ruby/test_gc_compact.rb:378]:
Expected 499 to be >= 500.
|
|
It's not guaranteed that the first element will always be embedded.
Notes:
Merged: https://github.com/ruby/ruby/pull/8116
|
|
This test fails on Solaris SPARC with the following error and I can't
figure out why:
TestGCCompact#test_moving_hashes_down_size_pools
Expected 499 to be >= 500.
|
|
|
|
|
|
This test is flaky on some CI systems.
|
|
This allows Hashes with ST tables to fit int he 80 byte size pool.
Notes:
Merged: https://github.com/ruby/ruby/pull/7742
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/7742
|
|
[Bug #19536]
When objects are moved between size pools, their frozen status is lost
in the shape. This will cause the frozen check to be bypassed when there
is an inline cache. For example, the following script should raise a
FrozenError, but doesn't on Ruby 3.2 and master.
class A
def add_ivars
@a = @b = @c = @d = 1
end
def set_a
@a = 10
end
end
a = A.new
a.add_ivars
a.freeze
b = A.new
b.add_ivars
b.set_a # Set the inline cache in set_a
GC.verify_compaction_references(expand_heap: true, toward: :empty)
a.set_a
Notes:
Merged: https://github.com/ruby/ruby/pull/7553
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/7071
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/7071
|
|
The reference updating code for strings is not re-embedding strings
because the code is incorrectly wrapped inside of a
`if (STR_SHARED_P(obj))` clause. Shared strings can't be re-embedded
so this ends up being a no-op. This means that strings can be moved to a
large size pool during compaction, but won't be re-embedded, which would
waste the space.
Notes:
Merged: https://github.com/ruby/ruby/pull/7071
|
|
Previously if any of the tests that move objects between size pools
failed to move anything, then the call to stats.dig would return `nil`
which would then cause assert_operator to error.
This should be a test Failure, rather than an Error so this commit uses
a default value of 0 if stats.dig fails to find a key.
Also refactor object movement tests to use stats.dig, rather than :[]
Notes:
Merged: https://github.com/ruby/ruby/pull/6978
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/7013
|
|
Frozen arrays should not move from heap allocated to embedded because
frozen arrays could be shared roots for other (shared) arrays. If the
frozen array moves from heap allocated to embedded it would cause issues
since the shared array would no longer know where to set the pointer
in the shared root.
Notes:
Merged: https://github.com/ruby/ruby/pull/7013
|
|
When moving Objects between size pools we have to assign a new shape.
This happened during updating references - we tried to create a new shape
tree that mirrored the existing tree, but based on the root shape of the
new size pool.
This causes allocations to happen if the new tree doesn't already exist,
potentially triggering a GC, during GC.
This commit changes object movement to look for a pre-existing new tree
during object movement, and if that tree does not exist, we don't move
the object to the new pool.
This allows us to remove the shape allocation from update references.
Co-Authored-By: Peter Zhu <peter@peterzhu.ca>
Notes:
Merged: https://github.com/ruby/ruby/pull/6938
|
|
This reverts commit 9c54466e299aa91af225bc2d92a3d7755730948f.
We're seeing crashes in Shopify CI after this commit.
|
|
When moving Objects between size pools we have to assign a new shape.
This happened during updating references - we tried to create a new shape
tree that mirrored the existing tree, but based on the root shape of the
new size pool.
This causes allocations to happen if the new tree doesn't already exist,
potentially triggering a GC, during GC.
This commit changes object movement to look for a pre-existing new tree
during object movement, and if that tree does not exist, we don't move
the object to the new pool.
This allows us to remove the shape allocation from update references.
Co-Authored-By: Peter Zhu <peter@peterzhu.ca>
Notes:
Merged: https://github.com/ruby/ruby/pull/6926
|
|
This commit adds a `capacity` field to shapes, and adds shape
transitions whenever an object's capacity changes. Objects which are
allocated out of a bigger size pool will also make a transition from the
root shape to the shape with the correct capacity for their size pool
when they are allocated.
This commit will allow us to remove numiv from objects completely, and
will also mean we can guarantee that if two objects share shapes, their
IVs are in the same positions (an embedded and extended object cannot
share shapes). This will enable us to implement ivar sets in YJIT using
object shapes.
Co-Authored-By: Aaron Patterson <tenderlove@ruby-lang.org>
Notes:
Merged: https://github.com/ruby/ruby/pull/6699
|
|
This commit implements Objects on Variable Width Allocation. This allows
Objects with more ivars to be embedded (i.e. contents directly follow the
object header) which improves performance through better cache locality.
Notes:
Merged: https://github.com/ruby/ruby/pull/6117
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/6099
|
|
This commit enables Arrays to move between size pools during compaction.
This can occur if the array is mutated such that it would fit in a
different size pool when embedded.
The move is carried out in two stages:
1. The RVALUE is moved to a destination heap during object movement
phase of compaction
2. The array data is re-embedded and the original buffer free'd if
required. This happens during the update references step
Notes:
Merged: https://github.com/ruby/ruby/pull/6099
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/6107
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/6077
|
|
And re-embed any strings that can now fit inside the slot they've been
moved to
Notes:
Merged: https://github.com/ruby/ruby/pull/5986
|
|
Define `GC.verify_compaction_references` as a built-in ruby method,
according to GC compaction support via `GC::OPTS`.
Notes:
Merged: https://github.com/ruby/ruby/pull/5972
|
|
`vm_trace_hook()` runs global hooks before running local hooks.
Previously, we read the local hook list before running the global hooks
which led to use-after-free when a global hook frees the local hook
list. A global hook can do this by disabling a local TracePoint, for
example.
Delay local hook list loading until after running the global hooks.
Issue discovered by Jeremy Evans in GH-5862.
[Bug #18730]
Notes:
Merged: https://github.com/ruby/ruby/pull/5865
|
|
Fixes [Bug #18779]
Define the following methods as `rb_f_notimplement` on unsupported
platforms:
- GC.compact
- GC.auto_compact
- GC.auto_compact=
- GC.latest_compact_info
- GC.verify_compaction_references
This change allows users to call `GC.respond_to?(:compact)` to
properly test for compaction support. Previously, it was necessary to
invoke `GC.compact` or `GC.verify_compaction_references` and check if
those methods raised `NotImplementedError` to determine if compaction
was supported.
This follows the precedent set for other platform-specific
methods. For example, in `process.c` for methods such as
`Process.fork`, `Process.setpgid`, and `Process.getpriority`.
Notes:
Merged: https://github.com/ruby/ruby/pull/5934
|
|
WebAssembly doesn't support signals so we can't use read
barriers so we can't use compaction.
Notes:
Merged: https://github.com/ruby/ruby/pull/5475
|
|
|
|
Parallel worker's stdout is captured as the control protocol.
Notes:
Merged: https://github.com/ruby/ruby/pull/5286
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/5077
|
|
The compaction tests get stuck randomly on s390x for unknown reason.
http://rubyci.s3.amazonaws.com/s390x/ruby-master/log/20211104T030003Z.fail.html.gz
```
[13715/21145] TestGCCompact#test_gc_compact_statstimeout: output interval exceeds 1800.0 seconds.
```
We spent some time to investigate this issue, but we can't figure out
why, and it is unlikely that we'll be able to fix it anytime soon.
This random failure makes the CI unuseful, so tentatively we suppress
this test for a while. A contribution from those who are familiar with
s390x is welcome.
Notes:
Merged: https://github.com/ruby/ruby/pull/5077
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/4697
|
|
Manual compaction also requires a read barrier, so we need to disable
even manual compaction on platforms that don't support mprotect.
[Bug #17871]
Notes:
Merged: https://github.com/ruby/ruby/pull/4528
|
|
... the following timeout failure.
http://rubyci.s3.amazonaws.com/rhel_zlinux/ruby-master/log/20210408T213303Z.fail.html.gz
```
[ 8871/21204] TestGCCompact#test_ast_compactstimeout: output interval exceeds 600.0 seconds.
timeout: the process group 28416 is alive.
PSOUT PGID PID ELAPSED %CPU VSZ COMMAND COMMAND
PSOUT 28416 28416 12:46 0.0 108120 gmake gmake TESTS=--hide-skip -v RUBYOPT=-w test-all
PSOUT 28416 28423 12:46 88.2 1446124 ruby ./test/runner.rb: TestGCCompact#test_ast_compacts
timeout: INT signal sent.
timeout: INT signal sent.
timeout: TERM signal sent.
timeout: TERM signal sent.
timeout: KILL signal sent.
```
This error repeatedly occurs on RHEL s390x.
This change sends SEGV when timeout occurs so that it should dump the backtrace.
|
|
This reverts commit a9920e7782f225b97e173a88640fe9e116b9964f.
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/4227
|
|
It seems this breaks tests on Solaris, so I'm reverting it until we
figure out the right fix.
http://rubyci.s3.amazonaws.com/solaris11-sunc/ruby-master/log/20210224T210007Z.fail.html.gz
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/4221
|
|
Both explicit compaction routines (gc_compact and the verify references form)
need to clear the heap before executing compaction. Otherwise some
objects may not be alive, and we'll need the read barrier. The heap
must only contain *live* objects if we want to disable the read barrier
during explicit compaction.
The previous commit was missing the "clear the heap" phase from the
"verify references" explicit compaction function.
Fixes [Bug #17306]
|
|
This reverts commit 63ad55cd882e4010fe313d271af006a430b5ffa8.
Revert "Disable read barrier on explicit compaction request"
This reverts commit 490b57783d80f0c5f7882c66d9fb6aa02713c9a5.
|
|
Auto Compaction uses mprotect to implement a read barrier. mprotect can
only work on regions of memory that are a multiple of the OS page size.
Ruby's pages are a multiple of 4kb, but some platforms (like ppc64le)
don't have 4kb page sizes. This commit disables the features on those
platforms.
Fixes [Bug #17306]
|
|
|
|
* `GC.auto_compact=`, `GC.auto_compact` can be used to control when
compaction runs. Setting `auto_compact=` to true will cause
compaction to occurr duing major collections. At the moment,
compaction adds significant overhead to major collections, so please
test first!
[Feature #17176]
|