summaryrefslogtreecommitdiff
path: root/ext/objspace/objspace_dump.c
AgeCommit message (Collapse)Author
2022-12-15Indicate if a shape is too_complex in ObjectSpace#dumpJemma Issroff
Notes: Merged: https://github.com/ruby/ruby/pull/6939
2022-12-15Transition complex objects to "too complex" shapeJemma Issroff
When an object becomes "too complex" (in other words it has too many variations in the shape tree), we transition it to use a "too complex" shape and use a hash for storing instance variables. Without this patch, there were rare cases where shape tree growth could "explode" and cause performance degradation on what would otherwise have been cached fast paths. This patch puts a limit on shape tree growth, and gracefully degrades in the rare case where there could be a factorial growth in the shape tree. For example: ```ruby class NG; end HUGE_NUMBER.times do NG.new.instance_variable_set(:"@unique_ivar_#{_1}", 1) end ``` We consider objects to be "too complex" when the object's class has more than SHAPE_MAX_VARIATIONS (currently 8) leaf nodes in the shape tree and the object introduces a new variation (a new leaf node) associated with that class. For example, new variations on instances of the following class would be considered "too complex" because those instances create more than 8 leaves in the shape tree: ```ruby class Foo; end 9.times { Foo.new.instance_variable_set(":@uniq_#{_1}", 1) } ``` However, the following class is *not* too complex because it only has one leaf in the shape tree: ```ruby class Foo def initialize @a = @b = @c = @d = @e = @f = @g = @h = @i = nil end end 9.times { Foo.new } `` This case is rare, so we don't expect this change to impact performance of most applications, but it needs to be handled. Co-Authored-By: Aaron Patterson <tenderlove@ruby-lang.org> Notes: Merged: https://github.com/ruby/ruby/pull/6931
2022-12-15Add variation_count on classesJemma Issroff
Count how many "variations" each class creates. A "variation" is a a unique ordering of instance variables on a particular class. This can also be thought of as a branch in the shape tree. For example, the following Foo class will have 2 variations: ```ruby class Foo ; end Foo.new.instance_variable_set(:@a, 1) # case 1: creates one variation Foo.new.instance_variable_set(:@b, 1) # case 2: creates another variation foo = Foo.new foo.instance_variable_set(:@a, 1) # does not create a new variation foo.instance_variable_set(:@b, 1) # does not create a new variation (a continuation of the variation in case 1) ``` We will use this number to limit the amount of shapes that a class can create and fallback to using a hash iv lookup. Co-Authored-By: Aaron Patterson <tenderlove@ruby-lang.org> Notes: Merged: https://github.com/ruby/ruby/pull/6931
2022-12-14objspace_dump.c: don't dump class of T_IMEMOJean Boussier
They don't actually have a class. Notes: Merged: https://github.com/ruby/ruby/pull/6925
2022-12-12[DOC] Don't document private methods in objspacePeter Zhu
2022-12-09objspace_dump.c: dump the capacity field for INITIAL_CAPACITY shapesJean Boussier
We forgot about that one, it's quite useful to see which capacity we started from. Notes: Merged: https://github.com/ruby/ruby/pull/6891
2022-12-08ObjectSpace.dump_all: dump shapes as wellJean Boussier
I see several arguments in doing so. First they use a non trivial amount of memory, so for various memory profiling/mapping tools it is relevant to have visibility of the space occupied by shapes. Then, some pathological code can create a tons of shape, so it is valuable to have a way to have a way to observe shapes without having to compile Ruby with `SHAPE_DEBUG=1`. And additionally it's likely much faster to dump then this way than to use `RubyVM::Shape`. There are however a few open questions: - Shapes can't respect the `since:` argument. Not sure what to do when it is provided. Would probably make sense to not dump them. - Maybe it would make more sense to have a separate `ObjectSpace.dump_shapes`? - Maybe instead `dump_all` should take a `shapes: false` argument? Additionally, `ObjectSpace.dump_shapes` is added for the use case of debugging the evolution of the shape tree. Notes: Merged: https://github.com/ruby/ruby/pull/6868
2022-12-05Add shape_id to heap dumpJemma Issroff
Notes: Merged: https://github.com/ruby/ruby/pull/6864
2022-11-10Remove numiv from RObjectJemma Issroff
Since object shapes store the capacity of an object, we no longer need the numiv field on RObjects. This gives us one extra slot which we can use to give embedded objects one more instance variable (for a total of 3 ivs). This commit removes the concept of numiv from RObject. Notes: Merged: https://github.com/ruby/ruby/pull/6699
2022-11-02Use shared flags of the typePeter Zhu
The ELTS_SHARED flag is generic, so we should prefer to use the flags specific of the type (STR_SHARED for strings and RARRAY_SHARED_FLAG for arrays).
2022-07-22Adjust indents [ci skip]Nobuyoshi Nakada
2022-07-22Get rid of magic numbersNobuyoshi Nakada
Notes: Merged: https://github.com/ruby/ruby/pull/6166
2022-07-22Dump non-ASCII char as unsignedNobuyoshi Nakada
Non-ASCII code may be negative on platforms plain char is signed. Notes: Merged: https://github.com/ruby/ruby/pull/6166
2022-07-21Revert "objspace_dump.c: skip dumping method name if not pure ASCII"Jean byroot Boussier
This reverts commit 79406e3600862bbb6dcdd7c5ef8de1978e6f916c. Notes: Merged: https://github.com/ruby/ruby/pull/6165
2022-07-21objspace_dump.c: skip dumping method name if not pure ASCIIJean Boussier
Sidekiq has a method named `❨╯°□°❩╯︵┻━┻`which corrupts heap dumps. Normally we could just dump is as is since it's valid UTF-8 and need no escaping. But our code to escape control characters isn't UTF-8 aware so it's more complicated than it seems. Ultimately since the overwhelming majority of method names are pure ASCII, it's not a big loss to just skip it. Notes: Merged: https://github.com/ruby/ruby/pull/6161
2022-07-21Expand tabs [ci skip]Takashi Kokubun
[Misc #18891] Notes: Merged: https://github.com/ruby/ruby/pull/6094
2022-07-05Local functions should be `static`Nobuyoshi Nakada
2022-07-04ObjectSpace.dump: Include string coderangeJean Boussier
I suspect that some shared pages are invalidated because some static string don't have their coderange set eagerly. So the first time they are scanned, the entire memory page is invalidated. Being able to see the coderange in `ObjectSpace` would help debug this. And in addition `dump` currently call `is_broken_string()` and `is_ascii_string()` which both end up scanning the string and assigning coderange. I think it's undesirable as `dump` should be read only. Notes: Merged: https://github.com/ruby/ruby/pull/6076
2022-03-01Show embed status of array when len is 0 in objspace dumpPeter Zhu
Notes: Merged: https://github.com/ruby/ruby/pull/5609
2022-02-03Add the size pool slot size to the output of ObjectSpace.dump/dump_allMatt Valentine-House
Notes: Merged: https://github.com/ruby/ruby/pull/5520
2021-04-26Fix compiler warnings in objspace_dump.c when assertions are turned onPeter Zhu
Example: ``` In file included from ../../../include/ruby/defines.h:72, from ../../../include/ruby/ruby.h:23, from ../../../gc.h:3, from ../../../ext/objspace/objspace_dump.c:15: ../../../ext/objspace/objspace_dump.c: In function ‘dump_append_ld’: ../../../ext/objspace/objspace_dump.c:95:26: warning: comparison of integer expressions of different signedness: ‘long unsigned int’ and ‘int’ [-Wsign-compare] 95 | RUBY_ASSERT(required <= width); | ^~ ``` Notes: Merged: https://github.com/ruby/ruby/pull/4417
2021-02-04objspace_dump.c: tag singleton classes and reference the superclassJean Boussier
Notes: Merged: https://github.com/ruby/ruby/pull/4104
2021-01-20objspace_dump.c: Handle allocation path and line missingJean Boussier
Notes: Merged: https://github.com/ruby/ruby/pull/4078
2020-09-28Make ext/objspace ASAN friendlyAaron Patterson
ext/objspace iterates over the heap, but some slots in the heap are poisoned, so we need to take care of that when running with ASAN Notes: Merged: https://github.com/ruby/ruby/pull/3592
2020-09-15Parse ObjectSpace.dump_all / dump arguments in Ruby to avoid allocation noiseJean Boussier
[Feature #17045] ObjectSpace.dump_all should allocate as little as possible in the GC heap Up until this commit ObjectSpace.dump_all allocates two Hash because of `rb_scan_args`. It also can allocate a `File` because of `rb_io_get_write_io`. These allocations are problematic because `dump_all` dumps the Ruby heap, so it should try modify as little as possible what it is observing. Notes: Merged: https://github.com/ruby/ruby/pull/3530
2020-09-11Add missing breakKazuhiro NISHIYAMA
pointed out by Coverity Scan
2020-09-09Optimize ObjectSpace.dump_allJean Boussier
The two main optimization are: - buffer writes for improved performance - avoid formatting functions when possible ``` | |compare-ruby|built-ruby| |:------------------|-----------:|---------:| |dump_all_string | 1.038| 195.925| | | -| 188.77x| |dump_all_file | 33.453| 139.645| | | -| 4.17x| |dump_all_dev_null | 44.030| 278.552| | | -| 6.33x| ``` Notes: Merged: https://github.com/ruby/ruby/pull/3420
2020-09-09Add a :since option to dump_allJean Boussier
This is useful to see what a block of code allocated, e.g. ``` GC.start GC.disable ObjectSpace.trace_object_allocations do # run some code end gc_gen = GC.count allocations = ObjectSpace.dump_all(output: :file, since: gc_gen) GC.enable GC.start retentions = ObjectSpace.dump_all(output: :file, since: gc_gen) ``` Notes: Merged: https://github.com/ruby/ruby/pull/3368
2020-08-17Fix method name escaping in ObjectSpace.dumpJohn Hawthorn
It's possible to define methods with any name, even if the parser doesn't support it and it can only be used with ex. send. This fixes an issue where invalid JSON was output from ObjectSpace.dump when a method name needed escaping. Notes: Merged: https://github.com/ruby/ruby/pull/3383
2020-08-17Also escape DEL codeNobuyoshi Nakada
2020-08-17Fixed the radix for control charsNobuyoshi Nakada
2020-07-23Avoid allocating a string when dumping an anonymous module or classJean Boussier
Notes: Merged: https://github.com/ruby/ruby/pull/3349
2020-07-10Fix missing imemo cases in objspace_dump by refactoringAlan Wu
imemo_callcache and imemo_callinfo were not handled by the `objspace` module and were showing up as "unknown" in the dump. Extract the code for naming imemos and use that in both the GC and the `objspace` module. Notes: Merged: https://github.com/ruby/ruby/pull/3304
2020-04-08Suppress -Wswitch warningsNobuyoshi Nakada
2019-12-26decouple internal.h headers卜部昌平
Saves comitters' daily life by avoid #include-ing everything from internal.h to make each file do so instead. This would significantly speed up incremental builds. We take the following inclusion order in this changeset: 1. "ruby/config.h", where _GNU_SOURCE is defined (must be the very first thing among everything). 2. RUBY_EXTCONF_H if any. 3. Standard C headers, sorted alphabetically. 4. Other system headers, maybe guarded by #ifdef 5. Everything else, sorted alphabetically. Exceptions are those win32-related headers, which tend not be self- containing (headers have inclusion order dependencies). Notes: Merged: https://github.com/ruby/ruby/pull/2711
2019-07-31* expand tabs.git
2019-07-31Use 1 byte hint for ar_table [Feature #15602]Koichi Sasada
On ar_table, Do not keep a full-length hash value (FLHV, 8 bytes) but keep a 1 byte hint from a FLHV (lowest byte of FLHV). An ar_table only contains at least 8 entries, so hints consumes 8 bytes at most. We can store hints in RHash::ar_hint. On 32bit CPU, we use 4 entries ar_table. The advantages: * We don't need to keep FLHV so ar_table only consumes 16 bytes (VALUEs of key and value) * 8 entries = 128 bytes. * We don't need to scan ar_table, but only need to check hints in many cases. Especially we don't need to access ar_table if there is no match entries (in many cases). It will increase memory cache locality. The disadvantages: * This technique can increase `#eql?` time because hints can conflicts (in theory, it conflicts once in 256 times). It can introduce incompatibility if there is a object x where x.eql? returns true even if hash values are different. I believe we don't need to care such irregular case. * We need to re-calculate FLHV if we need to switch from ar_table to st_table (e.g. exceeds 8 entries). It also can introduce incompatibility, on mutating key objects. I believe we don't need to care such irregular case too. Add new debug counters to measure the performance: * artable_hint_hit - hint is matched and eql?#=>true * artable_hint_miss - hint is not matched but eql?#=>false * artable_hint_notfound - lookup counts
2019-07-08Let struct dump_config in objspace fit in a single cache lineLourens Naudé
Let dump_config boolean members roots and full_heap be bit flags instead Closes: https://github.com/ruby/ruby/pull/2274
2018-10-09ext/objspace/objspace_dump.c: print addresses consistentlynobu
The format addresses are printed in are different if you use `ObjectSpace.dump_all(output: :stdout)` vs. `ObjectSpace.dump_all(output: :string)` (or `ObjectSpace.dump`) due to differences in the underlying `vfprintf` implementation. Use `"%#"PRIxVALUE` to format `VALUE`. Co-authored-by: Ashe Connor <ashe@kivikakk.ee> git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@64974 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-10-09* expand tabs.svn
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@64973 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-10-09Revert "ext/objspace/objspace_dump.c: print addresses consistently"naruse
This reverts commit r64970. Visual C++ 12.0 doesn't have PRIxPTR. Anyway we have our own vfprintf implementation BSD_vfprintf(). If you want to have portable vfprintf, replace it with BSD_vfprintf like vsnprintf or just use BSD_vfprintf. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@64972 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-10-08* expand tabs.svn
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@64971 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-10-08ext/objspace/objspace_dump.c: print addresses consistentlytenderlove
The format addresses are printed in are different if you use `ObjectSpace.dump_all(output: :stdout)` vs. `ObjectSpace.dump_all(output: :string)` (or `ObjectSpace.dump`) due to differences in the underlying `vfprintf` implementation. Use %"PRIxPTR" instead to be consistent across both. Co-authored-by: Ashe Connor <ashe@kivikakk.ee> git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@64970 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-05-09Rename imemo_alloc with imemo_tmpbufmame
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63372 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-02-16no ID cache in Init functionsnobu
Init functions are called only once, cache is useless. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@62429 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-01-09Include ruby/{io,encoding}.h before internal.hkazu
because of r61712 and r61713 git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@61725 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-12-21Fix `imemo_name` to dump new imemo typestenderlove
New IMEMO types were introduced, this just fixes the function that converts the type to support the new types. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@61377 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-11-04Remove RNODE cast from NODE utility functionsmame
Now, casting NODE to VALUE is not recommended. This change requires an explicit cast from VALUE to NODE to use the NODE utility functions such as `nd_type`. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60643 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-09-27objspace_dump.c: remove unnecessary breaknobu
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60043 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-01-31Add IMEMO type to heap dump output.tenderlove
IMEMO objects have many types. Without this change, we cannot see what types of IMEMO objects are being used when dumping the heap. Adding the type to the IMEMO object will allow us to gather statistics about IMEMO objects being used. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57486 b2dd03c8-39d4-4d8f-98ff-823fe69b080e