path: root/iseq.c
AgeCommit messageAuthor
2025-12-18Remove assertion in encoded_iseq_trace_instrument (#15616)Luke Gruber
`encoded_iseq_trace_instrument` is safe to call in a ractor if the iseq is new. In that case, the VM lock is not taken. This assertion was added in 4fb537b1ee28bb37dbe551ac65c279d436c756bc.
2025-12-17refactor: utilize a predefined macroYusuke Endoh
2025-12-16Make tracepoints with set_trace_func or TracePoint.new ractor local (#15468)Luke Gruber
Before this change, GC'ing any Ractor object caused you to lose all enabled tracepoints across all ractors (even main). Now tracepoints are ractor-local and this doesn't happen. Internal events are still global. Fixes [Bug #19112]
2025-12-12Binding#implicit_parameters, etc. support the implicit "it" parameterYusuke Endoh
[Bug #21049]
2025-11-28[DOC] Fix backticks in InstructionSequence docsPeter Zhu
2025-11-07renaming internal data structures and functions from namespace to boxSatoshi Tagomori
2025-10-30fix ibf and coverage sharable issueKoichi Sasada
2025-10-23skip jit payloadKoichi Sasada
They should be checked, but not sure JIT code...
2025-10-23use `SET_SHAREABLE`Koichi Sasada
to adopt the strict shareable rule:
* (basically) shareable objects only refer to shareable objects
* (exception) shareable objects can refer to unshareable objects, but should not leak references to unshareable objects to the Ruby world
2025-09-29Update Namespace#eval to use control frames instead of namespace_push/popSatoshi Tagomori
With this change, the code passed to Namespace#eval cannot refer to local variables around the calling line, but it should not be able to refer to these values anyway. The code is evaluated in the receiver namespace, independently of the local context.
2025-09-29Update current namespace management by using control frames and lexical contextsSatoshi Tagomori
to fix inconsistent and wrong current namespace detections. This includes:
* Moving load_path and related things from rb_vm_t to rb_namespace_t to simplify accessing those values via namespace (instead of accessing either vm or ns)
* Initializing root_namespace earlier and consolidate builtin_namespace into root_namespace
* Adding VM_FRAME_FLAG_NS_REQUIRE for checkpoints to detect a namespace to load/require files
* Removing implicit refinements in the root namespace which was used to determine the namespace to be loaded (replaced by VM_FRAME_FLAG_NS_REQUIRE)
* Removing namespaces from rb_proc_t because its namespace can be identified by lexical context
* Starting to use ep[VM_ENV_DATA_INDEX_SPECVAL] to store the current namespace when the frame type is MAGIC_TOP or MAGIC_CLASS (block handlers don't exist in this case)
2025-09-25ZJIT: Forget about dead ISEQs in `Invariants`Alan Wu
Without this, we crash during reference update.
2025-09-24Ractor.shareable_procKoichi Sasada
call-seq: Ractor.shareable_proc(self: nil){} -> shareable proc

It returns a shareable Proc object. The Proc object is shareable and the self in a block will be replaced with the value passed via the `self:` keyword.

In a shareable Proc, the outer variables should
* (1) refer to shareable objects
* (2) not be overwritten

```ruby
a = 42
Ractor.shareable_proc{ p a }
#=> OK

b = 43
Ractor.shareable_proc{ p b; b = 44 }
#=> Ractor::IsolationError because 'b' is reassigned in the block.

c = 44
Ractor.shareable_proc{ p c }
#=> Ractor::IsolationError because 'c' will be reassigned outside of the block.
c = 45

d = 45
d = 46 if cond
Ractor.shareable_proc{ p d }
#=> Ractor::IsolationError because 'd' was reassigned outside of the block.
```

The last `d`'s case can be relaxed in a future version.

The above check is done by static analysis at compile time, so changes made through reflection features such as `Binding#local_variable_set` cannot be detected.

```ruby
e = 42
shpr = Ractor.shareable_proc{ p e } #=> OK
binding.local_variable_set(:e, 43)
shpr.call #=> 42 (returns the value captured when the Proc was created)
```

Ractor.shareable_lambda is also introduced.

[Feature #21550] [Feature #21557]
2025-08-28Make `RubyVM::AST.of` return a parent node of NODE_SCOPEYusuke Endoh
This change makes `RubyVM::AST.of` and `.node_id_for_backtrace_location` return a parent node of NODE_SCOPE (such as NODE_DEFN) instead of the NODE_SCOPE node itself. (In future, we may remove NODE_SCOPE, which is a bit hacky AST node.) This is preparation for [Feature #21543].
2025-08-27Rename rb_hook_list_mark_and_update to rb_hook_list_mark_and_movePeter Zhu
2025-08-18Use mark and move for iseqwPeter Zhu
2025-08-01Use `rb_gc_mark_weak` for `cc->klass`.Jean Boussier
One of the biggest remaining contention points is `RClass.cc_table`. The logical solution would be to turn it into a managed object, so we can use an RCU strategy, given it's read heavy. However, that's not currently possible because the table can't be freed before the owning class, given the class free function MUST go over all the CC entries to invalidate them. However, if the `CC->klass` reference is weak marked, then the GC will take care of setting the reference to `Qundef`.
2025-07-11Rename some set_* functions to set_table_*Jeremy Evans
These functions conflict with the planned C-API functions. Since they deal with the underlying set_table pointers and not Set instances, this seems like a more accurate name as well.
2025-07-09ZJIT: Mark profiled objects when marking ISEQ (#13784)Takashi Kokubun
2025-06-18More write barriers to local_iseq and parent_iseqJohn Hawthorn
Found by wbcheck Notes: Merged: https://github.com/ruby/ruby/pull/13646
2025-06-03Fix memory leak in Prism's RubyVM::InstructionSequence.newPeter Zhu
[Bug #21394]

There are two ways to make RubyVM::InstructionSequence.new raise which would cause the options->scopes to leak memory:

1. Passing in any (non T_FILE) object where the to_str raises.
2. Passing in a T_FILE object where String#initialize_dup raises. This is because rb_io_path dups the string.

Example 1:

    10.times do
      100_000.times do
        RubyVM::InstructionSequence.new(nil)
      rescue TypeError
      end

      puts `ps -o rss= -p #{$$}`
    end

Before:

    13392
    17104
    20256
    23920
    27264
    30432
    33584
    36752
    40032
    43232

After:

    9392
    11072
    11648
    11648
    11648
    11712
    11712
    11712
    11744
    11744

Example 2:

    require "tempfile"

    MyError = Class.new(StandardError)
    String.prepend(Module.new do
      def initialize_dup(_)
        if $raise_on_dup
          raise MyError
        else
          super
        end
      end
    end)

    Tempfile.create do |f|
      10.times do
        100_000.times do
          $raise_on_dup = true
          RubyVM::InstructionSequence.new(f)
        rescue MyError
        else
          raise "MyError was not raised during RubyVM::InstructionSequence.new"
        end

        puts `ps -o rss= -p #{$$}`
      ensure
        $raise_on_dup = false
      end
    end

Before:

    14080
    18512
    22000
    25184
    28320
    31600
    34736
    37904
    41088
    44256

After:

    12016
    12464
    12880
    12880
    12880
    12912
    12912
    12912
    12912
    12912

Notes: Merged: https://github.com/ruby/ruby/pull/13496
2025-05-29Read {max_iv,variation}_count from prime classextJohn Hawthorn
MAX_IV_COUNT is a hint which determines the size of variable width allocation we should use for a given class. We don't need to scope this by namespace: if we end up with larger builtin objects on some namespaces, that isn't a user-visible problem, just extra memory use. Similarly, variation_count is used to track whether a given object has had too many branches in the shapes it has used, and to use too_complex when that happens. That's also just a hint, so we can use the same value across namespaces without it being visible to users. Previously, variation_count was being incremented (written to) on the RCLASS_EXT_READABLE ext, which seems incorrect if we wanted it to be different across namespaces. Notes: Merged: https://github.com/ruby/ruby/pull/13434
2025-05-15Ensure shape_id is never used on T_IMEMOJean Boussier
It doesn't make sense to set ivars or anything shape related on a T_IMEMO. Co-Authored-By: John Hawthorn <john@hawthorn.email> Notes: Merged: https://github.com/ruby/ruby/pull/13347
2025-05-12Cast up `int` instruction code to `VALUE`Nobuyoshi Nakada
Fix Visual C warnings:
```
iseq.c(3793): warning C4312: 'type cast': conversion from 'int' to 'void *' of greater size
iseq.c(3794): warning C4312: 'type cast': conversion from 'int' to 'void *' of greater size
```
Notes: Merged: https://github.com/ruby/ruby/pull/13304
2025-05-11namespace on readSatoshi Tagomori
2025-04-28Add comments for cryptic functions in iseq.cTakashi Kokubun
2025-04-28ZJIT: Drop trace_zjit_* instructions (#13189)Takashi Kokubun
Notes: Merged-By: k0kubun <takashikkbn@gmail.com>
2025-04-26Use `set_table` to track const cachesJean Boussier
Now that we have a `set_table` implementation, we can use it to track const caches and save some memory. We could even save some more memory if `numtable` didn't store a copy of the `hash` and instead recomputed it every time, but this is a quick win. Notes: Merged: https://github.com/ruby/ruby/pull/13184
2025-04-19Fix style [ci skip]Nobuyoshi Nakada
2025-03-17Avoid pinning `storage_head` in `iseq_mark_and_move` (#12880)Eileen M. Uchitelle
* Avoid pinning `storage_head` in `iseq_mark_and_move`

This refactor changes the behavior of `iseq_mark_and_move` to avoid pinning the `storage_head`. Previously, pinning was required because if these objects were GC'd during `iseq_set_sequence`, it would be possible to end up with a half-built array of instructions. However, in order to implement a moving immix algorithm we can't pin these objects, so this refactoring changes the code to mark and move. To accomplish this, it was required to add `iseq_size`, `iseq_encoded`, and the `mark_bits` union to the `iseq_compile_data` struct. In addition, `iseq_compile_data` sets a bool for whether there is a single word of mark bits or a list of them. While this change is needed for moving immix, it should be better for Ruby's GC as well.

* Don't allocate mark_offset_bits for one word

If only one word is needed, we don't need to allocate mark_offset_bits and can instead directly write to it.

---------

Co-authored-by: Peter Zhu <peter@peterzhu.ca>
Notes: Merged-By: eileencodes <eileencodes@gmail.com>
2025-03-12Push a real iseq in rb_vm_push_frame_fname()Alan Wu
Previously, vm_make_env_each() (used during proc creation and for the debug inspector C API) picked up the non-GC-allocated iseq that rb_vm_push_frame_fname() creates, which led to a SEGV when the GC tried to mark the non-GC object. Put a real iseq imemo instead. Speed should be about the same since the old code also did an imemo allocation and a malloc allocation. A real iseq allows ironing out the special-casing of dummy frames in rb_execution_context_mark() and rb_execution_context_update(). A check is added to RubyVM::ISeq#eval, though, to stop attempts to run dummy iseqs. [Bug #21180] Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org> Notes: Merged: https://github.com/ruby/ruby/pull/12898
2025-03-12Have `ast` live longer in ISeq.compile_file to fix GC stress crashAlan Wu
Previously, live range of `ast_value` ended on the call right before rb_ast_dispose(), which led to premature collection and use-after-free. We observed this crashing on -O3, -DVM_CHECK_MODE, with GCC 11.4.0 on Ubuntu. Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org> Notes: Merged: https://github.com/ruby/ruby/pull/12898
2025-02-13[Feature #21116] Extract RJIT as a third-party gemNobuyoshi Nakada
Notes: Merged: https://github.com/ruby/ruby/pull/12740
2025-01-13Proc#parameters: Show anonymous optionals as `[:opt]`Alan Wu
Have this for lead parameters as well as parameters after rest ("post"). [Bug #20974] Notes: Merged: https://github.com/ruby/ruby/pull/12547
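As a rough illustration of where anonymous parameters come from (a sketch only; the constant names `DESTRUCTURING_PROC` and `DESTRUCTURING_LAMBDA` are made up here): a block parameter that destructures its argument has no name of its own, so `Proc#parameters` has only a kind to report for it. The exact entry printed for such a parameter may differ between Ruby versions, so no output is shown.

```ruby
# The first parameter is destructured into a and b, so the parameter itself is
# anonymous; the second parameter is an ordinary named one.
DESTRUCTURING_PROC = proc { |(a, b), c| [a, b, c] }
DESTRUCTURING_LAMBDA = lambda { |(a, b), c| [a, b, c] }

# Procs report their positional parameters as optional, lambdas as required.
p DESTRUCTURING_PROC.parameters
p DESTRUCTURING_LAMBDA.parameters
```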
2025-01-07Correctly set node_id on iseq locationAaron Patterson
The iseq location object has a slot for node ids. parse.y was correctly populating that field but Prism was not. This commit populates the field with the ast node id for that iseq [Bug #21014] Notes: Merged: https://github.com/ruby/ruby/pull/12527
2025-01-02[DOC] Exclude 'Method' from RDoc's autolinkingNobuyoshi Nakada
Notes: Merged: https://github.com/ruby/ruby/pull/12496
2024-12-19Prefix asan_poison_object with rbPeter Zhu
Notes: Merged: https://github.com/ruby/ruby/pull/12385
2024-11-29Fix use-after-free in constant cachePeter Zhu
[Bug #20921]

When we create a cache entry for a constant, the following sequence of events could happen:

- vm_track_constant_cache is called to insert a constant cache.
- In vm_track_constant_cache, we first look up the ST table for the ID of the constant. Assume the ST table exists because another iseq also holds a cache entry for this ID.
- We then insert into this ST table with the iseq_inline_constant_cache.
- However, while inserting into this ST table, it allocates memory, which could trigger a GC. Assume that it does trigger a GC.
- The GC frees the one and only other iseq that holds a cache entry for this ID.
- In remove_from_constant_cache, it will appear that the ST table is now empty because there are no more iseq with cache entries for this ID, so we free the ST table.
- We complete GC and continue our st_insert. However, this ST table has been freed so we now have a use-after-free.

This issue is very hard to reproduce, because it requires that the GC runs at a very specific time. However, we can make it show up by applying this patch which runs GC right before the st_insert to mimic the st_insert triggering a GC:

    diff --git a/vm_insnhelper.c b/vm_insnhelper.c
    index 3cb23f06f0..a93998136a 100644
    --- a/vm_insnhelper.c
    +++ b/vm_insnhelper.c
    @@ -6338,6 +6338,10 @@ vm_track_constant_cache(ID id, void *ic)
             rb_id_table_insert(const_cache, id, (VALUE)ics);
         }

    +    if (id == rb_intern("MyConstant")) rb_gc();
    +
         st_insert(ics, (st_data_t) ic, (st_data_t) Qtrue);
     }

And if we run this script:

    Object.const_set("MyConstant", "Hello!")

    my_proc = eval("-> { MyConstant }")
    my_proc.call

    my_proc = eval("-> { MyConstant }")
    my_proc.call

We can see that ASAN outputs a use-after-free error:

    ==36540==ERROR: AddressSanitizer: heap-use-after-free on address 0x606000049528 at pc 0x000102f3ceac bp 0x00016d607a70 sp 0x00016d607a68
    READ of size 8 at 0x606000049528 thread T0
        #0 0x102f3cea8 in do_hash st.c:321
        #1 0x102f3ddd0 in rb_st_insert st.c:1132
        #2 0x103140700 in vm_track_constant_cache vm_insnhelper.c:6345
        #3 0x1030b91d8 in vm_ic_track_const_chain vm_insnhelper.c:6356
        #4 0x1030b8cf8 in rb_vm_opt_getconstant_path vm_insnhelper.c:6424
        #5 0x1030bc1e0 in vm_exec_core insns.def:263
        #6 0x1030b55fc in rb_vm_exec vm.c:2585
        #7 0x1030fe0ac in rb_iseq_eval_main vm.c:2851
        #8 0x102a82588 in rb_ec_exec_node eval.c:281
        #9 0x102a81fe0 in ruby_run_node eval.c:319
        #10 0x1027f3db4 in rb_main main.c:43
        #11 0x1027f3bd4 in main main.c:68
        #12 0x183900270 (<unknown module>)

    0x606000049528 is located 8 bytes inside of 56-byte region [0x606000049520,0x606000049558)
    freed by thread T0 here:
        #0 0x104174d40 in free+0x98 (libclang_rt.asan_osx_dynamic.dylib:arm64e+0x54d40)
        #1 0x102ada89c in rb_gc_impl_free default.c:8183
        #2 0x102ada7dc in ruby_sized_xfree gc.c:4507
        #3 0x102ac4d34 in ruby_xfree gc.c:4518
        #4 0x102f3cb34 in rb_st_free_table st.c:663
        #5 0x102bd52d8 in remove_from_constant_cache iseq.c:119
        #6 0x102bbe2cc in iseq_clear_ic_references iseq.c:153
        #7 0x102bbd2a0 in rb_iseq_free iseq.c:166
        #8 0x102b32ed0 in rb_imemo_free imemo.c:564
        #9 0x102ac4b44 in rb_gc_obj_free gc.c:1407
        #10 0x102af4290 in gc_sweep_plane default.c:3546
        #11 0x102af3bdc in gc_sweep_page default.c:3634
        #12 0x102aeb140 in gc_sweep_step default.c:3906
        #13 0x102aeadf0 in gc_sweep_rest default.c:3978
        #14 0x102ae4714 in gc_sweep default.c:4155
        #15 0x102af8474 in gc_start default.c:6484
        #16 0x102afbe30 in garbage_collect default.c:6363
        #17 0x102ad37f0 in rb_gc_impl_start default.c:6816
        #18 0x102ad3634 in rb_gc gc.c:3624
        #19 0x1031406ec in vm_track_constant_cache vm_insnhelper.c:6342
        #20 0x1030b91d8 in vm_ic_track_const_chain vm_insnhelper.c:6356
        #21 0x1030b8cf8 in rb_vm_opt_getconstant_path vm_insnhelper.c:6424
        #22 0x1030bc1e0 in vm_exec_core insns.def:263
        #23 0x1030b55fc in rb_vm_exec vm.c:2585
        #24 0x1030fe0ac in rb_iseq_eval_main vm.c:2851
        #25 0x102a82588 in rb_ec_exec_node eval.c:281
        #26 0x102a81fe0 in ruby_run_node eval.c:319
        #27 0x1027f3db4 in rb_main main.c:43
        #28 0x1027f3bd4 in main main.c:68
        #29 0x183900270 (<unknown module>)

    previously allocated by thread T0 here:
        #0 0x104174c04 in malloc+0x94 (libclang_rt.asan_osx_dynamic.dylib:arm64e+0x54c04)
        #1 0x102ada0ec in rb_gc_impl_malloc default.c:8198
        #2 0x102acee44 in ruby_xmalloc gc.c:4438
        #3 0x102f3c85c in rb_st_init_table_with_size st.c:571
        #4 0x102f3c900 in rb_st_init_table st.c:600
        #5 0x102f3c920 in rb_st_init_numtable st.c:608
        #6 0x103140698 in vm_track_constant_cache vm_insnhelper.c:6337
        #7 0x1030b91d8 in vm_ic_track_const_chain vm_insnhelper.c:6356
        #8 0x1030b8cf8 in rb_vm_opt_getconstant_path vm_insnhelper.c:6424
        #9 0x1030bc1e0 in vm_exec_core insns.def:263
        #10 0x1030b55fc in rb_vm_exec vm.c:2585
        #11 0x1030fe0ac in rb_iseq_eval_main vm.c:2851
        #12 0x102a82588 in rb_ec_exec_node eval.c:281
        #13 0x102a81fe0 in ruby_run_node eval.c:319
        #14 0x1027f3db4 in rb_main main.c:43
        #15 0x1027f3bd4 in main main.c:68
        #16 0x183900270 (<unknown module>)

This commit fixes this bug by adding a inserting_constant_cache_id field to the VM, which stores the ID that is currently being inserted and, in remove_from_constant_cache, we don't free the ST table for ID equal to this one.

Co-Authored-By: Alan Wu <alanwu@ruby-lang.org>
Notes: Merged: https://github.com/ruby/ruby/pull/12203
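The guard described in this commit can be modeled in a few lines of Ruby. This is a toy sketch, not the VM's C code: the class `ConstantCacheRegistry`, its methods, and the block that stands in for a GC run are all hypothetical, and plain arrays stand in for the per-ID ST tables. It only shows why remembering the ID that is being inserted keeps the table alive across a removal that happens mid-insert.

```ruby
# Toy model only: plain Ruby arrays stand in for the VM's per-ID ST tables.
class ConstantCacheRegistry
  def initialize
    @tables = {}         # constant ID => list of cache entries
    @inserting_id = nil  # ID whose table is in the middle of an insert
  end

  # Register `entry` for `id`. The optional block stands in for a GC that
  # happens to run while the insert is allocating memory.
  def track(id, entry)
    table = (@tables[id] ||= [])
    @inserting_id = id
    yield if block_given?
    table << entry
  ensure
    @inserting_id = nil
  end

  # Drop `entry`; discard the now-empty table unless an insert into it is pending.
  def remove(id, entry)
    table = @tables[id]
    return unless table

    table.delete(entry)
    @tables.delete(id) if table.empty? && id != @inserting_id
  end

  # Snapshot of the entries currently registered for `id`.
  def entries(id)
    @tables.fetch(id, []).dup
  end
end

registry = ConstantCacheRegistry.new
registry.track(:MyConstant, :first_iseq_cache)
registry.track(:MyConstant, :second_iseq_cache) do
  # Simulated GC: the only other holder of a cache entry goes away mid-insert.
  registry.remove(:MyConstant, :first_iseq_cache)
end
p registry.entries(:MyConstant)
```

Without the `id != @inserting_id` check, the table would be dropped from `@tables` during the simulated GC and the second entry would land in a table nobody can reach, which is the toy equivalent of the use-after-free.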
2024-11-28Avoid an operation on a pointer after freeYusuke Endoh
A follow-up to ef59175a68c448fe334125824b477a9e1d5629bc. That commit uses `&body->local_table[...]` but `body->local_table` is already freed. I think it is undefined behavior to calculate a pointer that exceeds the bound by more than 1. This change moves the free of `body->local_table` after the calculation. Coverity Scan found this issue. Notes: Merged: https://github.com/ruby/ruby/pull/12194
2024-11-13Move Array#map to RubyTakashi Kokubun
Co-Authored-By: Aaron Patterson <tenderlove@ruby-lang.org> Notes: Merged: https://github.com/ruby/ruby/pull/12074
2024-11-08Fix memory leak in prism when syntax error in iseq compilationPeter Zhu
If there's a syntax error during iseq compilation then prism would leak memory because it would not free the pm_parse_result_t.

This commit changes pm_iseq_new_with_opt to have a rb_protect to catch when an error is raised, and return NULL and set error_state to a value that can be raised by calling rb_jump_tag after memory has been freed.

For example:

    10.times do
      10_000.times do
        eval("/[/=~s")
      rescue SyntaxError
      end

      puts `ps -o rss= -p #{$$}`
    end

Before:

    39280
    68736
    99232
    128864
    158896
    188208
    217344
    246304
    275376
    304592

After:

    12192
    13200
    14256
    14848
    16000
    16000
    16000
    16064
    17232
    17952

Notes: Merged: https://github.com/ruby/ruby/pull/12036
2024-10-16RubyVM::InstructionSequence.of Thread::Backtrace::LocationKevin Newton
This would be useful for debugging. Notes: Merged: https://github.com/ruby/ruby/pull/11896
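A minimal sketch of how this might be used for debugging, assuming a CRuby build that includes this change. The helper name `iseq_for_current_frame` is made up for illustration, and the fallback branch is an assumption about how Rubies without this change behave when handed a Location.

```ruby
# RubyVM is specific to CRuby; passing a Location needs a Ruby with this change.
def iseq_for_current_frame
  location = caller_locations(0, 1).first # the frame of this method itself
  RubyVM::InstructionSequence.of(location)
rescue TypeError, ArgumentError
  nil # assumption: a Ruby without this change may reject a Location here
end

iseq = iseq_for_current_frame
puts iseq ? iseq.disasm : "InstructionSequence.of(Location) is not supported on this Ruby"
```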
2024-10-04Fix intermediate array off-by-one errorKevin Newton
Co-authored-by: Adam Hess <HParker@github.com> Notes: Merged: https://github.com/ruby/ruby/pull/11800
2024-10-02Mark iseq keyword default values during compilationPeter Zhu
During compilation, we write keyword default values into the iseq, so we should mark it to ensure it does not get GC'd.

This might fix issues on ASAN like http://ci.rvm.jp/logfiles/brlog.trunk_asan.20240927-194923

    ==805257==ERROR: AddressSanitizer: use-after-poison on address 0x7b7e5e3e2828 at pc 0x5e09ac4822f8 bp 0x7ffde56b0140 sp 0x7ffde56b0138
    READ of size 8 at 0x7b7e5e3e2828 thread T0
        #0 0x5e09ac4822f7 in RB_BUILTIN_TYPE include/ruby/internal/value_type.h:191:30
        #1 0x5e09ac4822f7 in rbimpl_RB_TYPE_P_fastpath include/ruby/internal/value_type.h:352:19
        #2 0x5e09ac4822f7 in gc_mark gc/default.c:4488:9
        #3 0x5e09ac51011e in rb_iseq_mark_and_move iseq.c:361:17
        #4 0x5e09ac4b85c4 in rb_imemo_mark_and_move imemo.c:386:9
        #5 0x5e09ac467544 in rb_gc_mark_children gc.c:2508:9
        #6 0x5e09ac482c24 in gc_mark_children gc/default.c:4673:5
        #7 0x5e09ac482c24 in gc_mark_stacked_objects gc/default.c:4694:9
        #8 0x5e09ac482c24 in gc_mark_stacked_objects_all gc/default.c:4732:12
        #9 0x5e09ac48c7f9 in gc_marks_rest gc/default.c:5755:9
        #10 0x5e09ac48c7f9 in gc_marks gc/default.c:5870:9
        #11 0x5e09ac48c7f9 in gc_start gc/default.c:6517:13

Notes: Merged: https://github.com/ruby/ruby/pull/11755
2024-10-02Make default parser enum and define getter/setterNobuyoshi Nakada
Notes: Merged: https://github.com/ruby/ruby/pull/11761
2024-09-16[PRISM] Assume an eval context for RubyVM::ISEQ compileKevin Newton
Fixes [Bug #20741] Notes: Merged: https://github.com/ruby/ruby/pull/11632
2024-08-29[PRISM] Handle RubyVM.keep_script_linesKevin Newton
Notes: Merged: https://github.com/ruby/ruby/pull/11501
2024-08-21[PRISM] Implement unused block warningeileencodes
Related: ruby/prism#2935 Notes: Merged: https://github.com/ruby/ruby/pull/11415
2024-08-15Show anonymous and ambiguous params in ISeq disassemblyKevin Newton
Previously, in the disassembly for ISeqs, there was no way to know if the anon_rest, anon_kwrest, or ambiguous_param0 flags were set. This commit extends the names of the rest, kwrest, and lead params to display this information. They are relevant for the ISeqs' runtime behavior. Notes: Merged: https://github.com/ruby/ruby/pull/11237 Merged-By: XrXr
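One way to look at this, sketched under the assumption of a CRuby interpreter: compile a method that takes an anonymous rest and an anonymous keyword-rest parameter and print its disassembly. The method name `forward_all` and the constant `FORWARD_ALL_DISASM` are made up for the example, and the exact text of the parameter names in the dump depends on the Ruby version, so no output is reproduced here.

```ruby
# Compile a method with anonymous rest (*) and keyword-rest (**) parameters.
# RubyVM::InstructionSequence is specific to CRuby.
source = "def forward_all(*, **); end"
FORWARD_ALL_DISASM = RubyVM::InstructionSequence.compile(source).disasm

# The dump covers the top-level iseq and the nested method iseq; the method's
# parameter listing is where the anonymous-parameter information would appear.
puts FORWARD_ALL_DISASM
```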
2024-08-11compile.c: don't allocate empty default values listJean Boussier
It just wastes memory. Notes: Merged: https://github.com/ruby/ruby/pull/11361