| Age | Commit message | Author |
|
Use no-inline version `rb_current_ec` on Arm64
The TLS across .so issue seems related to Arm64, but not Darwin.
|
|
Fix use-after-free in constant cache
[Bug #20921]
When we create a cache entry for a constant, the following sequence of
events could happen:
- vm_track_constant_cache is called to insert a constant cache.
- In vm_track_constant_cache, we first look up the ST table for the ID
of the constant. Assume the ST table exists because another iseq also
holds a cache entry for this ID.
- We then insert into this ST table with the iseq_inline_constant_cache.
- However, while inserting into this ST table, it allocates memory, which
could trigger a GC. Assume that it does trigger a GC.
- The GC frees the one and only other iseq that holds a cache entry for
this ID.
- In remove_from_constant_cache, it will appear that the ST table is now
empty because there are no more iseq with cache entries for this ID, so
we free the ST table.
- We complete GC and continue our st_insert. However, this ST table has
been freed so we now have a use-after-free.
This issue is very hard to reproduce, because it requires that the GC runs
at a very specific time. However, we can make it show up by applying this
patch which runs GC right before the st_insert to mimic the st_insert
triggering a GC:
    diff --git a/vm_insnhelper.c b/vm_insnhelper.c
    index 3cb23f06f0..a93998136a 100644
    --- a/vm_insnhelper.c
    +++ b/vm_insnhelper.c
    @@ -6338,6 +6338,10 @@ vm_track_constant_cache(ID id, void *ic)
             rb_id_table_insert(const_cache, id, (VALUE)ics);
         }
    +    if (id == rb_intern("MyConstant")) rb_gc();
    +
         st_insert(ics, (st_data_t) ic, (st_data_t) Qtrue);
     }
And if we run this script:
    Object.const_set("MyConstant", "Hello!")
    my_proc = eval("-> { MyConstant }")
    my_proc.call
    my_proc = eval("-> { MyConstant }")
    my_proc.call
We can see that ASAN outputs a use-after-free error:
==36540==ERROR: AddressSanitizer: heap-use-after-free on address 0x606000049528 at pc 0x000102f3ceac bp 0x00016d607a70 sp 0x00016d607a68
READ of size 8 at 0x606000049528 thread T0
#0 0x102f3cea8 in do_hash st.c:321
#1 0x102f3ddd0 in rb_st_insert st.c:1132
#2 0x103140700 in vm_track_constant_cache vm_insnhelper.c:6345
#3 0x1030b91d8 in vm_ic_track_const_chain vm_insnhelper.c:6356
#4 0x1030b8cf8 in rb_vm_opt_getconstant_path vm_insnhelper.c:6424
#5 0x1030bc1e0 in vm_exec_core insns.def:263
#6 0x1030b55fc in rb_vm_exec vm.c:2585
#7 0x1030fe0ac in rb_iseq_eval_main vm.c:2851
#8 0x102a82588 in rb_ec_exec_node eval.c:281
#9 0x102a81fe0 in ruby_run_node eval.c:319
#10 0x1027f3db4 in rb_main main.c:43
#11 0x1027f3bd4 in main main.c:68
#12 0x183900270 (<unknown module>)
0x606000049528 is located 8 bytes inside of 56-byte region [0x606000049520,0x606000049558)
freed by thread T0 here:
#0 0x104174d40 in free+0x98 (libclang_rt.asan_osx_dynamic.dylib:arm64e+0x54d40)
#1 0x102ada89c in rb_gc_impl_free default.c:8183
#2 0x102ada7dc in ruby_sized_xfree gc.c:4507
#3 0x102ac4d34 in ruby_xfree gc.c:4518
#4 0x102f3cb34 in rb_st_free_table st.c:663
#5 0x102bd52d8 in remove_from_constant_cache iseq.c:119
#6 0x102bbe2cc in iseq_clear_ic_references iseq.c:153
#7 0x102bbd2a0 in rb_iseq_free iseq.c:166
#8 0x102b32ed0 in rb_imemo_free imemo.c:564
#9 0x102ac4b44 in rb_gc_obj_free gc.c:1407
#10 0x102af4290 in gc_sweep_plane default.c:3546
#11 0x102af3bdc in gc_sweep_page default.c:3634
#12 0x102aeb140 in gc_sweep_step default.c:3906
#13 0x102aeadf0 in gc_sweep_rest default.c:3978
#14 0x102ae4714 in gc_sweep default.c:4155
#15 0x102af8474 in gc_start default.c:6484
#16 0x102afbe30 in garbage_collect default.c:6363
#17 0x102ad37f0 in rb_gc_impl_start default.c:6816
#18 0x102ad3634 in rb_gc gc.c:3624
#19 0x1031406ec in vm_track_constant_cache vm_insnhelper.c:6342
#20 0x1030b91d8 in vm_ic_track_const_chain vm_insnhelper.c:6356
#21 0x1030b8cf8 in rb_vm_opt_getconstant_path vm_insnhelper.c:6424
#22 0x1030bc1e0 in vm_exec_core insns.def:263
#23 0x1030b55fc in rb_vm_exec vm.c:2585
#24 0x1030fe0ac in rb_iseq_eval_main vm.c:2851
#25 0x102a82588 in rb_ec_exec_node eval.c:281
#26 0x102a81fe0 in ruby_run_node eval.c:319
#27 0x1027f3db4 in rb_main main.c:43
#28 0x1027f3bd4 in main main.c:68
#29 0x183900270 (<unknown module>)
previously allocated by thread T0 here:
#0 0x104174c04 in malloc+0x94 (libclang_rt.asan_osx_dynamic.dylib:arm64e+0x54c04)
#1 0x102ada0ec in rb_gc_impl_malloc default.c:8198
#2 0x102acee44 in ruby_xmalloc gc.c:4438
#3 0x102f3c85c in rb_st_init_table_with_size st.c:571
#4 0x102f3c900 in rb_st_init_table st.c:600
#5 0x102f3c920 in rb_st_init_numtable st.c:608
#6 0x103140698 in vm_track_constant_cache vm_insnhelper.c:6337
#7 0x1030b91d8 in vm_ic_track_const_chain vm_insnhelper.c:6356
#8 0x1030b8cf8 in rb_vm_opt_getconstant_path vm_insnhelper.c:6424
#9 0x1030bc1e0 in vm_exec_core insns.def:263
#10 0x1030b55fc in rb_vm_exec vm.c:2585
#11 0x1030fe0ac in rb_iseq_eval_main vm.c:2851
#12 0x102a82588 in rb_ec_exec_node eval.c:281
#13 0x102a81fe0 in ruby_run_node eval.c:319
#14 0x1027f3db4 in rb_main main.c:43
#15 0x1027f3bd4 in main main.c:68
#16 0x183900270 (<unknown module>)
This commit fixes this bug by adding an inserting_constant_cache_id field
to the VM, which stores the ID that is currently being inserted and, in
remove_from_constant_cache, we don't free the ST table for ID equal to
this one.
Co-Authored-By: Alan Wu <alanwu@ruby-lang.org>
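The guard described above can be modelled in plain Ruby. This is only a toy sketch, not the C implementation: `ConstantCacheRegistry`, `track`, `remove`, and the `guarded:` switch are hypothetical names, and the block passed to `track` stands in for a GC that runs in the middle of the insertion. Ruby cannot exhibit a real use-after-free, so the bug shows up here as an insert into a table that has already been dropped from the registry.

```ruby
require "set"

# Toy model: maps a constant ID to the set of cache entries that reference it.
class ConstantCacheRegistry
  attr_reader :tables

  def initialize(guarded:)
    @tables = {}
    @guarded = guarded
    @inserting_id = nil # stand-in for inserting_constant_cache_id
  end

  # Shaped like vm_track_constant_cache: find the per-ID table, then insert.
  # The optional block plays the role of a GC triggered by the insertion.
  def track(id, entry)
    table = (@tables[id] ||= Set.new)
    @inserting_id = id if @guarded
    yield if block_given? # "GC" may run here and call #remove
    table << entry
    table
  ensure
    @inserting_id = nil
  end

  # Shaped like remove_from_constant_cache: drop the entry, and drop the
  # whole table once it is empty, unless an insert for this ID is in flight.
  def remove(id, entry)
    table = @tables[id] or return
    table.delete(entry)
    return unless table.empty?
    return if id == @inserting_id
    @tables.delete(id)
  end
end

[false, true].each do |guarded|
  registry = ConstantCacheRegistry.new(guarded: guarded)
  registry.track(:MyConstant, :ic_a)
  table = registry.track(:MyConstant, :ic_b) { registry.remove(:MyConstant, :ic_a) }
  still_registered = registry.tables[:MyConstant].equal?(table)
  puts "guarded=#{guarded}: table still registered? #{still_registered}"
end
```

Without the guard, the second insert lands in a table the registry no longer knows about; with it, the table survives the simulated GC.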
|
|
081ee3d35509110f383cb7dd8d1205def0cdd1e8,1c97abaabae6844c861705fd07f532292dcffa74: [Backport #19907] (#10315)
Add memory leak test for eval kwargs
De-dup identical callinfo objects
Previously every call to vm_ci_new (when the CI was not packable) would
result in a different callinfo being returned. This meant that every
kwarg callsite had its own CI.
When calling, different CIs result in different CCs. These CIs and CCs
both end up persisted on the T_CLASS inside cc_tbl. So in an eval loop
this resulted in a memory leak of both types of object. This also likely
resulted in extra memory used, and extra time searching, in non-eval
cases.
For simplicity in this commit I always allocate a CI object inside
rb_vm_ci_lookup, but ideally we would lazily allocate it only when
needed. I hope to do that as a follow up in the future.
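The de-duplication idea can be sketched as a lookup table keyed on the callinfo's contents. This is a rough Ruby illustration only: `CallInfo`, `CallInfoTable`, and the field names are hypothetical, and a plain Hash keeps every entry alive forever, whereas a real VM table would also have to drop an entry when its callinfo is freed.

```ruby
# Hypothetical stand-in for a callinfo: method name, flags, argc, keyword names.
CallInfo = Struct.new(:mid, :flag, :argc, :kwargs)

class CallInfoTable
  def initialize
    @table = {}
  end

  # Returns the one shared CallInfo for these contents, creating it on first use.
  def lookup(mid, flag, argc, kwargs)
    kwargs = kwargs.dup.freeze
    key = [mid, flag, argc, kwargs].freeze
    @table[key] ||= CallInfo.new(mid, flag, argc, kwargs).freeze
  end

  def size
    @table.size
  end
end

table = CallInfoTable.new
a = table.lookup(:foo, 0, 1, [:bar])
b = table.lookup(:foo, 0, 1, [:bar])
c = table.lookup(:foo, 0, 1, [:baz])
puts a.equal?(b) # identical contents share one object
puts a.equal?(c) # different keyword names get their own object
puts table.size
```

Because identical call sites now share one object, anything cached per callinfo is created once rather than once per evaluated call site.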
|
|
Our current implementation of rb_postponed_job_register suffers from
some safety issues that can lead to interpreter crashes (see bug #1991).
Essentially, the issue is that jobs can be called with the wrong
arguments.
We made two attempts to fix this whilst keeping the promised semantics,
but:
* The first one involved masking/unmasking when flushing jobs, which
was believed to be too expensive
* The second one involved a lock-free, multi-producer, single-consumer
ringbuffer, which was too complex
The critical insight behind this third solution is that essentially the
only users of these APIs are a) internal code, or b) profiling gems.
For a), none of the usages actually require variable data; they will
work just fine with the preregistration interface.
For b), generally profiling gems only call a single callback with a
single piece of data (which is actually usually just zero) for the life
of the program. The ringbuffer is complex because it needs to support
multi-word inserts of job & data (which can't be atomic); but nobody
actually even needs that functionality, really.
So, this commit:
* Introduces a pre-registration API for jobs, with a GVL-requiring
rb_postponed_job_preregister, which returns a handle which can be
used with an async-signal-safe rb_postponed_job_trigger.
* Deprecates rb_postponed_job_register (and re-implements it on top of
the preregister function for compatibility)
* Moves all the internal usages of postponed job register to
pre-registration
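To see why splitting registration from triggering helps, here is a toy Ruby model of the idea. It is not the C API: `PostponedJobTable`, its methods, and the 32-slot capacity are assumptions made for illustration. Registration stores the function and its data once; triggering only flips a bit, which is the kind of single-word update that can be made atomic and async-signal-safe.

```ruby
# Toy model of a preregistered-job table; all names here are hypothetical.
class PostponedJobTable
  MAX_JOBS = 32 # assumed capacity: one bit per slot in a single machine word

  def initialize
    @jobs = []     # slot index => [callable, data]
    @triggered = 0 # bitmask of slots waiting to run
  end

  # Registration does the slow work up front and returns a small integer
  # handle. In this model, registering the same callable again reuses its slot.
  def preregister(callable, data = nil)
    existing = @jobs.index { |func, _| func.equal?(callable) }
    return existing if existing
    raise "job table full" if @jobs.size >= MAX_JOBS
    @jobs << [callable, data]
    @jobs.size - 1
  end

  # Triggering only sets one bit; no multi-word insert of job and data.
  def trigger(handle)
    @triggered |= (1 << handle)
  end

  # Later, at a safe point, the pending mask is drained and the jobs run.
  def flush
    pending, @triggered = @triggered, 0
    @jobs.each_with_index do |(func, data), i|
      func.call(data) if pending[i] == 1
    end
  end
end

samples = []
table = PostponedJobTable.new
handle = table.preregister(->(data) { samples << data }, :tick)
3.times { table.trigger(handle) } # repeated triggers coalesce into one run
table.flush
p samples # prints [:tick]
```

The trade-off in this model is that a job cannot carry different data per trigger, which matches the observation above that callers do not need that functionality.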
|
|
This patch introduces thread specific storage APIs
for tools which use `rb_internal_thread_event_hook` APIs.
* `rb_internal_thread_specific_key_create()` to create a tool specific
thread local storage key and allocate the storage if not available.
* `rb_internal_thread_specific_set()` stores data in thread- and
tool-specific storage.
* `rb_internal_thread_specific_get()` retrieves data from thread- and
tool-specific storage.
Note that `rb_internal_thread_specific_get|set(thread_val, key)`
can be called without the GVL and is both async-signal-safe and safe for
multi-threading (native threads), so you can call it in any internal
thread event hook. Furthermore, you can call it from other native
threads. Of course, `thread_val` must still be alive while the data is
accessed through these functions.
Note that you should not forget to clean up the data you set.
|
|
It is safe for `rb_vm_tag_jmpbuf_{init,deinit}` to raise an exception, since
the given tag has not yet been pushed to `ec->tag`, or has already been
popped from it, at that time. `ec->tag` is therefore always valid, so raising
an exception when xmalloc fails is safe.
|
|
The `rb_jmpbuf_t` type is considerably large due to the inline-allocated
Asyncify buffer, and it leads to stack overflow even with a small number
of C-method call frames. This commit allocates the Asyncify buffer used
by `rb_wasm_setjmp` on the heap to mitigate the issue.
This patch introduces a new type `rb_vm_tag_jmpbuf_t` to abstract the
representation of a jump buffer, and init/deinit hook points to manage
lifetime of the buffer. These changes are effectively NFC for non-wasm
platforms.
|
|
* Port call threshold logic from Rust to C for performance
* Prefix global/field names with yjit_
* Fix linker error
* Fix preprocessor condition for rb_yjit_threshold_hit
* Fix third linker issue
* Exclude yjit_calls_at_interv from RJIT bindgen
---------
Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
|
|
This patch introduces an M:N thread scheduler for the Ractor system.
In general, an M:N thread scheduler employs N native threads (OS threads)
to manage M user-level threads (Ruby threads in this case).
In the Ruby interpreter, 1 native thread is provided for 1 Ractor
and all Ruby threads are managed by the native thread.
Since Ruby 1.9, the interpreter has used a 1:1 thread scheduler, which means
1 Ruby thread has 1 native thread. The M:N scheduler changes this strategy.
Because of compatibility issues (and stability issues in the implementation),
the main Ractor doesn't use the M:N scheduler by default. In other words,
threads on the main Ractor will be managed with the 1:1 thread scheduler.
There are additional settings by environment variables:
`RUBY_MN_THREADS=1` enables M:N thread scheduler on the main ractor.
Note that non-main ractors use the M:N scheduler without this
configuration. With this configuration, single ractor applications
run threads on M:1 thread scheduler (green threads, user-level threads).
`RUBY_MAX_CPU=n` specifies maximum number of native threads for
M:N scheduler (default: 8).
This patch will be reverted soon if hard-to-fix issues are found.
[Bug #19842]
|
|
Previously, Kernel#lambda returned a non-lambda proc when given a
non-literal block and issued a warning under the `:deprecated` category.
With this change, Kernel#lambda will always return a lambda proc, if it
returns without raising.
Due to interactions with block passing optimizations, we previously had
two separate code paths for detecting whether Kernel#lambda got a
literal block. This change allows us to remove one path, the hack done
with rb_control_frame_t::block_code introduced in 85a337f for supporting
situations where Kernel#lambda returned a non-lambda proc.
[Feature #19777]
Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
Notes:
Merged: https://github.com/ruby/ruby/pull/8405
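A small script can make the difference between a literal and a non-literal block concrete. This is only an illustrative sketch; what the second call does depends on the Ruby version running it, so the script reports the outcome instead of assuming one.

```ruby
# A literal block: Kernel#lambda returns a lambda.
literal = lambda { |x| x }
puts literal.lambda?

# A non-literal block: a proc created elsewhere and passed with &.
blk = proc { |x| x }
outcome =
  begin
    lambda(&blk)
  rescue ArgumentError => e
    e
  end

if outcome.is_a?(Proc)
  puts "returned a proc, lambda? = #{outcome.lambda?}"
else
  puts "raised #{outcome.class}: #{outcome.message}"
end
```

On a Ruby with this change, the second call should either raise or return a proc for which `lambda?` is true; before it, the call returned a non-lambda proc.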
|
|
* YJIT: implement side chain fallback for setlocal to avoid exiting
* Update yjit/src/codegen.rs
Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
---------
Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
Notes:
Merged-By: maximecb <maximecb@ruby-lang.org>
|
|
It's the actual cfp[6] in the default build, so it's confusing to say
otherwise in the comment.
|
|
Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>
Notes:
Merged-By: k0kubun <takashikkbn@gmail.com>
|
|
`struct rb_calling_info::cd` is introduced and `rb_calling_info::ci`
is replaced with it to manipulate the inline cache of the iseq during
the method invocation process. As a result, `ci` can be accessed with
`calling->cd->ci`. It adds one indirection but it can be justified
by the following points:
1) `vm_search_method_fastpath()` doesn't need `ci` and also
`vm_call_iseq_setup_normal()` doesn't need `ci`. It means
reducing `cd->ci` access in `vm_sendish()` can make it faster.
2) most of method types need to access `ci` once in theory
so that 1 additional indirection doesn't matter.
Notes:
Merged: https://github.com/ruby/ruby/pull/8129
|
|
Notes:
Merged-By: k0kubun <takashikkbn@gmail.com>
|
|
Remove rb_control_frame_t::__bp__ and optimize bmethod calls
This commit removes the __bp__ field from rb_control_frame_t. It was
introduced to help MJIT, but since MJIT was replaced by RJIT, we can use
vm_base_ptr() to compute it from the SP of the previous control frame
instead. Removing the field avoids needing to set it up when pushing new
frames.
Simply removing __bp__ would cause crashes since RJIT and YJIT used a
slightly different stack layout for bmethod calls than the interpreter.
At the moment of the call, the two layouts looked as follows:
               ┌────────────┐    ┌────────────┐
               │ frame_base │    │ frame_base │
               ├────────────┤    ├────────────┤
               │    ...     │    │    ...     │
               ├────────────┤    ├────────────┤
               │    args    │    │    args    │
               ├────────────┤    └────────────┘<─prev_frame_sp
               │  receiver  │
prev_frame_sp─>└────────────┘
                RJIT & YJIT        interpreter
Essentially, vm_base_ptr() needs to compute the address to frame_base
given prev_frame_sp in the diagrams. The presence of the receiver
created an off-by-one situation.
Make the interpreter use the layout the JITs use for iseq-to-iseq
bmethod calls. Doing so removes unnecessary argument shifting and
vm_exec_core() re-entry from the interpreter, yielding a speed
improvement visible through `benchmark/vm_defined_method.yml`:
patched: 7578743.1 i/s
master: 4796596.3 i/s - 1.58x slower
C-to-iseq bmethod calls now store one more VALUE than before, but that
should have negligible impact on overall performance.
Note that re-entering vm_exec_core() used to be necessary for firing
TracePoint events, but that's no longer the case since
9121e57a5f50bc91bae48b3b91edb283bf96cb6b.
Closes ruby/ruby#6428
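For readers unfamiliar with the term, a "bmethod" is a method whose body is a block, typically created with `define_method`. The snippet below is a minimal illustration with hypothetical names (`Greeter`, `greet_block`, `greet_def`); it assumes, without having checked the benchmark file, that calls like `greet_block` are the kind this change speeds up.

```ruby
class Greeter
  # A method defined from a block (a "bmethod").
  define_method(:greet_block) { |name| "hello, #{name}" }

  # The same method defined with def, for comparison.
  def greet_def(name)
    "hello, #{name}"
  end
end

g = Greeter.new
puts g.greet_block("world")
puts g.greet_def("world")
```

Both calls return the same string; the difference lies only in how the VM sets up the frame for the call.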
|
|
For compilers that do not eliminate references to functions that are
never called, such as SunC.
|
|
* Remove unused SIGCHLD handling.
* Remove unused `init_sigchld`.
* Remove unnecessary `#define RUBY_SIGCHLD (0)`.
* Remove unused `SIGCHLD_LOSSY`.
Notes:
Merged-By: ioquatix <samuel@codeotaku.com>
|
|
After [1], using ext/Setup to link some, but not all extensions failed
during linking. I did not know about this option, and had assumed that
only `--with-static-linked-ext` builds can include statically linked
extensions.
Include the support code for statically linked extensions in all
configurations like before [1]. Initialize the table lazily to minimize
footprint on builds that have no statically linked extensions.
[1]: 790cf4b6d0475614afb127b416e87cfa39044d67 "Fix autoload status of
statically linked extensions"
Notes:
Merged: https://github.com/ruby/ruby/pull/7729
|
|
SystemStackError
Originally, when 2e7bceb34ea858649e1f975a934ce1894d1f06a6 fixed cfuncs to no
longer use the VM stack for large array splats, it was thought to have fully
fixed Bug #4040, since the issue was fixed for methods defined in Ruby (iseqs)
back in Ruby 2.2.
After additional research, I determined that the same issue affects almost all
types of method calls, not just iseq and cfunc calls. There were two main
types of remaining issues, important cases (where large array splat should
work) and pedantic cases (where large array splat raised SystemStackError
instead of ArgumentError).
Important cases:
```ruby
define_method(:a){|*a|}
a(*1380888.times)
def b(*a); end
send(:b, *1380888.times)
:b.to_proc.call(self, *1380888.times)
def d; yield(*1380888.times) end
d(&method(:b))
def self.method_missing(*a); end
not_a_method(*1380888.times)
```
Pedantic cases:
```ruby
def a; end
a(*1380888.times)
def b(_); end
b(*1380888.times)
def c(_=nil); end
c(*1380888.times)
c = Class.new do
attr_accessor :a
alias b a=
end.new
c.a(*1380888.times)
c.b(*1380888.times)
c = Struct.new(:a) do
alias b a=
end.new
c.a(*1380888.times)
c.b(*1380888.times)
```
This patch fixes all usage of CALLER_SETUP_ARG with splatting a large
number of arguments, and required similar fixes to use a temporary
hidden array in three other cases where the VM would use the VM stack
for handling a large number of arguments. However, it is possible
there may be additional cases where splatting a large number
of arguments still causes a SystemStackError.
This has a measurable performance impact, as it requires additional
checks for a large number of arguments in many additional cases.
This change is fairly invasive, as there were many different VM
functions that needed to be modified to support this. To avoid
too much API change, I modified struct rb_calling_info to add a
heap_argv member for storing the array, so I would not have to
thread it through many functions. This struct is always stack
allocated, which helps ensure the GC doesn't collect it early.
Because of how invasive the changes are, and how rarely large
arrays are actually splatted in Ruby code, the existing test/spec
suites are not great at testing for correct behavior. To try to
find and fix all issues, I tested this in CI with
VM_ARGC_STACK_MAX set to -1, ensuring that a temporary array is used
for all array splat method calls. This was very helpful in
finding breaking cases, especially ones involving flagged keyword
hashes.
Fixes [Bug #4040]
Co-authored-by: Jimmy Miller <jimmy.miller@shopify.com>
Notes:
Merged: https://github.com/ruby/ruby/pull/7522
|
|
Rebuilding the loaded feature index slowed down with the bug fix
for #17885 in 79a4484a072e9769b603e7b4fbdb15b1d7eccb15. The
slowdown was extreme if realpath emulation was used, but even when
not emulated, it could be about 10x slower.
This adds loaded_features_realpath_map to rb_vm_struct. This is a
hidden hash mapping loaded feature paths to realpaths. When
rebuilding the loaded feature index, look at this hash to get
cached realpath values, and skip calling rb_check_realpath if a
cached value is found.
Fixes [Bug #19246]
Notes:
Merged: https://github.com/ruby/ruby/pull/7699
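The caching approach can be sketched in Ruby as a map from feature path to realpath that survives across index rebuilds. This is a simplified model, not the C code: `FeatureIndex`, `rebuild`, and `resolve` are hypothetical names, and caching a failed lookup as nil is a choice made for the sketch.

```ruby
# Hypothetical model of a path => realpath cache consulted on each rebuild.
class FeatureIndex
  attr_reader :realpath_calls

  def initialize
    @realpath_map = {} # stand-in for loaded_features_realpath_map
    @realpath_calls = 0
  end

  # Rebuilds the index, resolving each feature through the cache.
  def rebuild(features)
    features.to_h { |path| [path, cached_realpath(path)] }
  end

  private

  # Only calls the expensive resolver when the path has not been seen before.
  def cached_realpath(path)
    @realpath_map.fetch(path) do
      @realpath_calls += 1
      @realpath_map[path] = resolve(path)
    end
  end

  # Returns nil instead of raising when the path cannot be resolved.
  def resolve(path)
    File.realpath(path)
  rescue SystemCallError
    nil
  end
end

index = FeatureIndex.new
features = [".", "..", "no/such/feature.rb"]
index.rebuild(features)
index.rebuild(features)
puts index.realpath_calls # prints 3: the second rebuild hits the cache
```

The second rebuild performs no realpath calls at all, which is where the saving comes from when the index is rebuilt repeatedly.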
|
|
The `catch_except_p` flag is used for communicating between parent and
child iseq's that a throw instruction was emitted. So for example if a
child iseq has a throw in it and the parent wants to catch the throw, we
use this flag to communicate to the parent iseq that a throw instruction
was emitted.
This flag is only useful at compile time; it only impacts the
compilation process, so it seems to be fine to move it from the iseq body
to the compile_data struct.
Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org>
Notes:
Merged: https://github.com/ruby/ruby/pull/7652
|
|
so that now shape can happily include gc.h
Notes:
Merged: https://github.com/ruby/ruby/pull/7393
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/7393
|
|
to `rb_thread_t::has_dedicated_nt`
Notes:
Merged: https://github.com/ruby/ruby/pull/7638
|
|
`rb_current_ractor()` expects that `ec` and `r` are valid.
`rb_current_ractor_raw()` with a `false` parameter allows returning
NULL if `ec` is not available.
Notes:
Merged: https://github.com/ruby/ruby/pull/7617
|
|
If the iseq only contains an `opt_invokebuiltin_delegate_leave` insn and
the builtin function (bf) is inlinable, the caller doesn't need to
build a method frame.
`vm_call_single_noarg_inline_builtin` is a fast path for such cases.
Notes:
Merged: https://github.com/ruby/ruby/pull/7486
|
|
I closed https://github.com/ruby/ruby/pull/7543, but part of the diff
seems useful regardless, so I extracted it.
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/7465
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/7465
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/7465
|
|
* Remove `waitpid_lock` and related code.
* Remove un-necessary test.
* Remove `rb_thread_sleep_interruptible` dead code.
Notes:
Merged-By: ioquatix <samuel@codeotaku.com>
|
|
Notes:
Merged-By: k0kubun <takashikkbn@gmail.com>
|
|
* Revert "Remove special handling of `SIGCHLD`. (#7482)"
This reverts commit 44a0711eab7fbc71ac2c8ff489d8c53e97a8fe75.
* Revert "Remove prototypes for functions that are no longer used. (#7497)"
This reverts commit 4dce12bead3bfd91fd80b5e7195f7f540ffffacb.
* Revert "Remove SIGCHLD `waidpid`. (#7476)"
This reverts commit 1658e7d96696a656d9bd0a0c84c82cde86914ba2.
* Fix change to rjit variable name.
Notes:
Merged-By: ioquatix <samuel@codeotaku.com>
|
|
* Remove `waitpid_lock` and related code.
* Remove un-necessary test.
* Remove `rb_thread_sleep_interruptible` dead code.
Notes:
Merged-By: ioquatix <samuel@codeotaku.com>
|
|
We used to require that MJIT be supported whenever YJIT is supported. However,
now that RJIT has dropped some platforms that YJIT supports, that no longer
makes sense. We should be able to enable only YJIT, and vice versa.
Notes:
Merged-By: k0kubun <takashikkbn@gmail.com>
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/7462
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/7462
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/7461
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/7459
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/7448
|
|
|