summaryrefslogtreecommitdiff
path: root/yjit/src/stats.rs
AgeCommit message (Collapse)Author
2026-03-16YJIT: Fix not reading locals from `cfp->ep` after `YJIT.enable` and ↵Alan Wu
exceptional entry [Backport #21941] In case of `--yjit-disable`, YJIT only starts to record environment escapes after `RubyVM::YJIT.enable`. Previously we falsely assumed that we always have a full history all the way back to VM boot. This had YJIT install and run code that assume EP=BP when EP≠BP for some exceptional entry into the middle of a running frame, if the environment escaped before `YJIT.enable`. The fix is to reject exceptional entry with an escaped environment. Rename things and explain in more detail how the predicate for deciding to assume EP=BP works. It's quite subtle since it reasons about all parties in the system that push a control frame and then run JIT code. Note that while can_assume_on_stack_env() checks the currently running environment if it so happens to be the one YJIT is compiling against, it can return true for any ISEQ. The check isn't necessary for fixing the bug, and the load bearing part of this patch is the change to exceptional entries. This fix is flat on speed and space on ruby-bench headline benchmarks. Many thanks for the community effort to create a small test case for this bug.
2025-12-15YJIT: Bail out if proc would be stored above stack topRandy Stauner
Fixes [Bug #21266].
2025-12-05YJIT: Fix including stats for ZJIT instructions when ZJIT not in buildAlan Wu
2025-11-26YJIT: Abort expandarray optimization if method_missing is definedRandy Stauner
Fixes: [Bug #21707] [AW: rewrote comments] Co-authored-by: Alan Wu <alanwu@ruby-lang.org>
2025-10-28YJIT, ZJIT: Fix unnecessary `use` of macrosTakashi Kokubun
https://github.com/ruby/ruby/actions/runs/18887695798/job/53907237061?pr=14975
2025-10-12YJIT: Fix unused warning from `cargo test`Alan Wu
2025-09-30ZJIT: Add --zjit-trace-exits (#14640)Aiden Fox Ivey
Add side exit tracing functionality for ZJIT
2025-09-11ZJIT: Add support for stats_allocatorAiden Fox Ivey
* Using the shared jit crate, support for a single global_allocator can function * Solves --zjit-mem-size
2025-08-27ZJIT: Implement side exit stats (#14357)Takashi Kokubun
2025-08-26Remove `opt_aref_with` and `opt_aset_with`Aaron Patterson
When these instructions were introduced it was common to read from a hash with mutable string literals. However, these days, I think these instructions are fairly rare. I tested this with the lobsters benchmark, and saw no difference in speed. In order to be sure, I tracked down every use of this instruction in the lobsters benchmark, and there were only 4 places where it was used. Additionally, this patch fixes a case where "chilled strings" should emit a warning but they don't. ```ruby class Foo def self.[](x)= x.gsub!(/hello/, "hi") end Foo["hello world"] ``` Removing these instructions shows this warning: ``` > ./miniruby -vw test.rb ruby 3.5.0dev (2025-08-25T21:36:50Z rm-opt_aref_with dca08e286c) +PRISM [arm64-darwin24] test.rb:2: warning: literal string will be frozen in the future (run with --debug-frozen-string-literal for more information) ``` [Feature #21553]
2025-05-15YJIT: handle opt_aset_withJean Boussier
``` # frozen_string_ltieral: true hash["literal"] = value ``` Notes: Merged: https://github.com/ruby/ruby/pull/13342
2025-03-07YJIT: Add Counter::invalidate_everythingAlan Wu
When YJIT is forced to discard all the code, that's bad for performance, so there should be an easy way to know about it. Notes: Merged: https://github.com/ruby/ruby/pull/12882
2025-02-14Only count VM instructions in YJIT stats buildsAaron Patterson
The instruction counter is slowing multi-Ractor applications. I had changed it to use a thread local, but using a thread local is slowing single threaded applications. This commit only enables the instruction counter in YJIT stats builds until we can figure out a way to gather the information with lower overhead. Co-authored-by: Randy Stauner <randy.stauner@shopify.com> Notes: Merged: https://github.com/ruby/ruby/pull/12670
2025-01-30YJIT: Remove comments that refer to the removed "stats" featureAlan Wu
The Cargo feature was removed in 2de8b5b8054f311c4cee112dcab5208b66cc62a4 and it's available in all build configs now. [ci skip]
2025-01-30YJIT: Turn on dead code lint for the stats moduleAlan Wu
2025-01-10YJIT: Rename send_iseq_forwarding->send_forwardingAlan Wu
It's in gen_send_general(), so nothing specifically to do with iseqs. Notes: Merged: https://github.com/ruby/ruby/pull/12550
2025-01-10Make rb_vm_insns_count a thread local variableAaron Patterson
`rb_vm_insns_count` is a global variable used for reporting YJIT statistics. It is a counter that tallies the number of interpreter instructions that have been executed, this way we can approximate how much time we're spending in YJIT compared to the interpreter. Unfortunately keeping this statistic means that every instruction executed in the interpreter loop must increment the counter. Normally this isn't a problem, but in multi-threaded situations (when Ractors are used), incrementing this counter can become quite costly due to page caching issues. Additionally, since there is no locking when incrementing this global, the count can't really make sense in a multi-threaded environment. This commit changes `rb_vm_insns_count` to a thread local. That way each Ractor has it's own copy of the counter and incrementing the counter becomes quite cheap. Of course this means that in multi-threaded situations, the value doesn't really make sense (but it didn't make sense before because of the lack of locking). The counter is used for YJIT statistics, and since YJIT is basically disabled when Ractors are in use, I don't think we care about inaccuracies (for the time being). We can revisit this counter when we give YJIT multi-threading support, but for the time being this commit restores multi-threaded performance. To test this, I used the benchmark in [Bug #20489]. Here is the performance on Ruby 3.2: ``` $ time RUBY_MAX_CPU=12 ./miniruby -v ../test.rb 8 8 ruby 3.2.0 (2022-12-25 revision a528908271) [x86_64-linux] [0...1, 1...2, 2...3, 3...4, 4...5, 5...6, 6...7, 7...8] ../test.rb:43: warning: Ractor is experimental, and the behavior may change in future versions of Ruby! Also there are many implementation issues. ________________________________________________________ Executed in 2.53 secs fish external usr time 19.86 secs 370.00 micros 19.86 secs sys time 0.02 secs 320.00 micros 0.02 secs ``` We can see the regression in performance on the master branch: ``` $ time RUBY_MAX_CPU=12 ./miniruby -v ../test.rb 8 8 ruby 3.5.0dev (2025-01-10T16:22:26Z master 4a2702dafb) +PRISM [x86_64-linux] [0...1, 1...2, 2...3, 3...4, 4...5, 5...6, 6...7, 7...8] ../test.rb:43: warning: Ractor is experimental, and the behavior may change in future versions of Ruby! Also there are many implementation issues. ________________________________________________________ Executed in 24.87 secs fish external usr time 195.55 secs 0.00 micros 195.55 secs sys time 0.00 secs 716.00 micros 0.00 secs ``` Here are the stats after this commit: ``` $ time RUBY_MAX_CPU=12 ./miniruby -v ../test.rb 8 8 ruby 3.5.0dev (2025-01-10T20:37:06Z tl 3ef0432779) +PRISM [x86_64-linux] [0...1, 1...2, 2...3, 3...4, 4...5, 5...6, 6...7, 7...8] ../test.rb:43: warning: Ractor is experimental, and the behavior may change in future versions of Ruby! Also there are many implementation issues. ________________________________________________________ Executed in 2.46 secs fish external usr time 19.34 secs 381.00 micros 19.34 secs sys time 0.01 secs 321.00 micros 0.01 secs ``` [Bug #20489] Notes: Merged: https://github.com/ruby/ruby/pull/12549
2024-12-13YJIT: Speculate block arg for `c_func_method(&nil)` calls (#12326)Alan Wu
A good amount of call sites always pass nil as block argument, but the nil doesn't show up in the context. Put a runtime guard for those cases to handle it. Particular relevant for the `ruby-lsp` benchmark in `yjit-bench`. Up to a 2% speedup across headline benchmarks. Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com> Co-authored-by: Maxime Chevalier-Boisvert <maxime.chevalierboisvert@shopify.com> Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org> Co-authored-by: Kevin Menard <kevin@nirvdrum.com> Co-authored-by: Randy Stauner <randy.stauner@shopify.com> Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2024-12-04YJIT: track time since initialization (#12263)Maxime Chevalier-Boisvert
Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2024-11-20YJIT: Make compilation_failure a default stat (#12128)Alan Wu
It's good to monitor compilation failures. Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2024-11-14YJIT: Specialize String#dup (#12090)Takashi Kokubun
Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2024-11-14YJIT: Specialize Integer#pred (#12082)Takashi Kokubun
Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2024-11-13YJIT: Add inline_block_count stat (#12081)Takashi Kokubun
Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2024-11-13YJIT: Specialize `String#[]` (`String#slice`) with fixnum arguments (#12069)Randy Stauner
* YJIT: Specialize `String#[]` (`String#slice`) with fixnum arguments String#[] is in the top few C calls of several YJIT benchmarks: liquid-compile rubocop mail sudoku This speeds up these benchmarks by 1-2%. * YJIT: Try harder to get type info for `String#[]` In the large generated code of the mail gem the context doesn't have the type info. In that case if we peek at the stack and add a guard we can still apply the specialization and it speeds up the mail benchmark by 5%. Co-authored-by: Maxime Chevalier-Boisvert <maxime.chevalierboisvert@shopify.com> Co-authored-by: Takashi Kokubun (k0kubun) <takashikkbn@gmail.com> --------- Co-authored-by: Maxime Chevalier-Boisvert <maxime.chevalierboisvert@shopify.com> Co-authored-by: Takashi Kokubun (k0kubun) <takashikkbn@gmail.com> Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2024-10-23YJIT: Check when gen_branch() failsAlan Wu
We got some core dumps in the wild where a PendingBranch had everything as None, leading to a panic unwrapping in PendingBranch::into_branch(). This happened while compiling a `branchif`. It seems that the only way this can happen is when core::gen_branch() fails, but not due to OOM. We wouldn't have reach into_branch() when OOM, and the only way to not leave markers that would've set the branch's start_addr to some value in gen_branch() is for set_target() to fail, causing an early return. Unfortunately, it's hard to tell the exact sequence of events that led to this situation, but regardless, the dumps show us that we should check for errors in gen_branch(). Because gen_branch() is used deep in the stack during compilation (e.g. guard_known_class() -> jit_chain_guard() -> gen_branch()), it'd be bad for compile speed to propagate the error everywhere, not to mention the massive patch required. Opt for a flag checked near the end of compilation. Notes: Merged: https://github.com/ruby/ruby/pull/11938 Merged-By: XrXr
2024-10-07YJIT: Add --yjit-mem-size option (#11810)Takashi Kokubun
* YJIT: Add --yjit-mem-size option * Improve --help * s/the region/this virtual memory region/ Co-authored-by: Maxime Chevalier-Boisvert <maxime.chevalierboisvert@shopify.com> --------- Co-authored-by: Maxime Chevalier-Boisvert <maxime.chevalierboisvert@shopify.com> Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2024-09-25YJIT: Cache Context decoding (#11680)Takashi Kokubun
Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2024-09-17YJIT: Accept key for runtime_stats to return only that stat (#11536)Randy Stauner
Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2024-08-27YJIT: Encode doubles to VALUE objects and move stat generation to rust (#11388)Randy Stauner
* YJIT: Encode doubles to VALUE objects and move stat generation to rust Stats that can now be generated from rust have been moved there. * Move object_shape_count call for runtime_stats to rust This reduces the ruby method to a single primitive. * Change hash_aset_usize from macro to function Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2024-08-02YJIT: Enhance the `String#<<` method substitution to handle integer ↵Kevin Menard
codepoint values. (#11032) * Document why we need to explicitly spill registers. * Simplify passing a byte value to `str_buf_cat`. * YJIT: Enhance the `String#<<` method substitution to handle integer codepoint values. * YJIT: Move runtime type check into YJIT. Performing the check in YJIT means we can make assumptions about the type. It also improves correctness of stack traces in cases where the codepoint argument is not a String or a Fixnum. Notes: Merged-By: maximecb <maximecb@ruby-lang.org>
2024-06-18Optimized forwarding callers and calleesAaron Patterson
This patch optimizes forwarding callers and callees. It only optimizes methods that only take `...` as their parameter, and then pass `...` to other calls. Calls it optimizes look like this: ```ruby def bar(a) = a def foo(...) = bar(...) # optimized foo(123) ``` ```ruby def bar(a) = a def foo(...) = bar(1, 2, ...) # optimized foo(123) ``` ```ruby def bar(*a) = a def foo(...) list = [1, 2] bar(*list, ...) # optimized end foo(123) ``` All variants of the above but using `super` are also optimized, including a bare super like this: ```ruby def foo(...) super end ``` This patch eliminates intermediate allocations made when calling methods that accept `...`. We can observe allocation elimination like this: ```ruby def m x = GC.stat(:total_allocated_objects) yield GC.stat(:total_allocated_objects) - x end def bar(a) = a def foo(...) = bar(...) def test m { foo(123) } end test p test # allocates 1 object on master, but 0 objects with this patch ``` ```ruby def bar(a, b:) = a + b def foo(...) = bar(...) def test m { foo(1, b: 2) } end test p test # allocates 2 objects on master, but 0 objects with this patch ``` How does it work? ----------------- This patch works by using a dynamic stack size when passing forwarded parameters to callees. The caller's info object (known as the "CI") contains the stack size of the parameters, so we pass the CI object itself as a parameter to the callee. When forwarding parameters, the forwarding ISeq uses the caller's CI to determine how much stack to copy, then copies the caller's stack before calling the callee. The CI at the forwarded call site is adjusted using information from the caller's CI. I think this description is kind of confusing, so let's walk through an example with code. ```ruby def delegatee(a, b) = a + b def delegator(...) delegatee(...) # CI2 (FORWARDING) end def caller delegator(1, 2) # CI1 (argc: 2) end ``` Before we call the delegator method, the stack looks like this: ``` Executing Line | Code | Stack ---------------+---------------------------------------+-------- 1| def delegatee(a, b) = a + b | self 2| | 1 3| def delegator(...) | 2 4| # | 5| delegatee(...) # CI2 (FORWARDING) | 6| end | 7| | 8| def caller | -> 9| delegator(1, 2) # CI1 (argc: 2) | 10| end | ``` The ISeq for `delegator` is tagged as "forwardable", so when `caller` calls in to `delegator`, it writes `CI1` on to the stack as a local variable for the `delegator` method. The `delegator` method has a special local called `...` that holds the caller's CI object. Here is the ISeq disasm fo `delegator`: ``` == disasm: #<ISeq:delegator@-e:1 (1,0)-(1,39)> local table (size: 1, argc: 0 [opts: 0, rest: -1, post: 0, block: -1, kw: -1@-1, kwrest: -1]) [ 1] "..."@0 0000 putself ( 1)[LiCa] 0001 getlocal_WC_0 "..."@0 0003 send <calldata!mid:delegatee, argc:0, FCALL|FORWARDING>, nil 0006 leave [Re] ``` The local called `...` will contain the caller's CI: CI1. Here is the stack when we enter `delegator`: ``` Executing Line | Code | Stack ---------------+---------------------------------------+-------- 1| def delegatee(a, b) = a + b | self 2| | 1 3| def delegator(...) | 2 -> 4| # | CI1 (argc: 2) 5| delegatee(...) # CI2 (FORWARDING) | cref_or_me 6| end | specval 7| | type 8| def caller | 9| delegator(1, 2) # CI1 (argc: 2) | 10| end | ``` The CI at `delegatee` on line 5 is tagged as "FORWARDING", so it knows to memcopy the caller's stack before calling `delegatee`. In this case, it will memcopy self, 1, and 2 to the stack before calling `delegatee`. It knows how much memory to copy from the caller because `CI1` contains stack size information (argc: 2). Before executing the `send` instruction, we push `...` on the stack. The `send` instruction pops `...`, and because it is tagged with `FORWARDING`, it knows to memcopy (using the information in the CI it just popped): ``` == disasm: #<ISeq:delegator@-e:1 (1,0)-(1,39)> local table (size: 1, argc: 0 [opts: 0, rest: -1, post: 0, block: -1, kw: -1@-1, kwrest: -1]) [ 1] "..."@0 0000 putself ( 1)[LiCa] 0001 getlocal_WC_0 "..."@0 0003 send <calldata!mid:delegatee, argc:0, FCALL|FORWARDING>, nil 0006 leave [Re] ``` Instruction 001 puts the caller's CI on the stack. `send` is tagged with FORWARDING, so it reads the CI and _copies_ the callers stack to this stack: ``` Executing Line | Code | Stack ---------------+---------------------------------------+-------- 1| def delegatee(a, b) = a + b | self 2| | 1 3| def delegator(...) | 2 4| # | CI1 (argc: 2) -> 5| delegatee(...) # CI2 (FORWARDING) | cref_or_me 6| end | specval 7| | type 8| def caller | self 9| delegator(1, 2) # CI1 (argc: 2) | 1 10| end | 2 ``` The "FORWARDING" call site combines information from CI1 with CI2 in order to support passing other values in addition to the `...` value, as well as perfectly forward splat args, kwargs, etc. Since we're able to copy the stack from `caller` in to `delegator`'s stack, we can avoid allocating objects. I want to do this to eliminate object allocations for delegate methods. My long term goal is to implement `Class#new` in Ruby and it uses `...`. I was able to implement `Class#new` in Ruby [here](https://github.com/ruby/ruby/pull/9289). If we adopt the technique in this patch, then we can optimize allocating objects that take keyword parameters for `initialize`. For example, this code will allocate 2 objects: one for `SomeObject`, and one for the kwargs: ```ruby SomeObject.new(foo: 1) ``` If we combine this technique, plus implement `Class#new` in Ruby, then we can reduce allocations for this common operation. Co-Authored-By: John Hawthorn <john@hawthorn.email> Co-Authored-By: Alan Wu <XrXr@users.noreply.github.com>
2024-06-13YJIT: Delete otherwise-empty defer_compilation() blocksAlan Wu
Calls to defer_compilation() leave behind a stub and a `struct Block` that we retain. If the block is empty, it only exits to hold the `struct Branch` that the stub needs. This patch transplants the branch out of the empty block into the newly generated block when the defer_compilation() stub is hit, and deletes the empty block to save memory. To assist the transplantation, `Block::outgoing` is now a `MutableBranchList`, and `Branch::Block` now in a `Cell`. These types don't incur a size cost. On the `lobsters` benchmark, `yjit_alloc_size` is roughly 98% of what it was before the change. Co-authored-by: Kevin Menard <kevin.menard@shopify.com> Co-authored-by: Randy Stauner <randy@r4s6.net> Co-authored-by: Maxime Chevalier-Boisvert <maxime.chevalierboisvert@shopify.com>
2024-06-12YJIT: add context cache hits stat (#10979)Maxime Chevalier-Boisvert
* YJIT: add context cache hits stat This stat should make more sense when it comes to interpreting the effectiveness of the cache on large deployed apps.
2024-06-11YJIT: Make num_contexts_encoded a default counterTakashi Kokubun
2024-06-11YJIT: add context cache size stat, lazily allocate cacheMaxime Chevalier-Boisvert
* YJIT: add context cache size stat * Allocate the context cache in a box so CRuby doesn't pay overhead * Add an extra debug assertion
2024-06-07YJIT: implement variable-length context encoding scheme (#10888)Maxime Chevalier-Boisvert
* Implement BitVector data structure for variable-length context encoding * Rename method to make intent clearer * Rename write_uint => push_uint to make intent clearer * Implement debug trait for BitVector * Fix bug in BitVector::read_uint_at(), enable more tests * Add one more test for good measure * Start sketching Context::encode() * Progress on variable length context encoding * Add tests. Fix bug. * Encode stack state * Add comments. Try to estimate context encoding size. * More compact encoding for stack size * Commit before rebase * Change Context::encode() to take a BitVector as input * Refactor BitVector::read_uint(), add helper read functions * Implement Context::decode() function. Add test. * Fix bug, add tests * Rename methods * Add Context::encode() and decode() methods using global data * Make encode and decode methods use u32 indices * Refactor YJIT to use variable-length context encoding * Tag functions as allow unused * Add a simple caching mechanism and stats for bytes per context etc * Add comments, fix formatting * Grow vector of bytes by 1.2x instead of 2x * Add debug assert to check round-trip encoding-decoding * Take some rustfmt formatting * Add decoded_from field to Context to reuse previous encodings * Remove olde context stats * Re-add stack_size assert * Disable decoded_from optimization for now
2024-05-28YJIT: limit size of call count stats dict (#10858)Maxime Chevalier-Boisvert
* YJIT: limit size of call count stats dict Someone reported that logs were getting bloated because the ISEQ and C call count dicts were huge, since they include all of the call sites. I wrote code on the Rust size to limit the size of the dict to avoid this problem. The size limit is hardcoded at 20, but I figure this is probably fine? * Fix bug reported by Kokubun.
2024-05-06YJIT: Fix comment and counter in rb_yjit_invalidate_ep_is_bp() (#10722)Alan Wu
`mem::take` substitutes an empty instance which makes `jit.ep_is_bp()` return false.
2024-04-03YJIT: Suppress warn(static_mut_refs) (#10440)Takashi Kokubun
2024-03-28YJIT: add iseq_alloc_count to stats (#10398)Maxime Chevalier-Boisvert
* YJIT: add iseq_alloc_count to stats * Remove an empty line --------- Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
2024-03-25YJIT: Propagate Array, Hash, and String classes (#10323)Takashi Kokubun
2024-03-18YJIT: Support arity=-2 cfuncs (#10268)Alan Wu
This type of cfuncs shows up as consume a lot of cycles in profiles of the lobsters benchmark, even though in the stats they don't happen that frequently. Might be a bug in the profiling, but these calls are not too bad to support, so might as well do it. Co-authored-by: Maxime Chevalier-Boisvert <maxime.chevalierboisvert@shopify.com>
2024-03-13YJIT: Fallback cfunc varg splat for ruby2_keywords (#10226)Takashi Kokubun
2024-03-06YJIT: String#getbyte codegen (#10188)Maxime Chevalier-Boisvert
* WIP getbyte implementation * WIP String#getbyte implementation * Fix whitespace in stats.rs * fix? * Fix whitespace, add comment --------- Co-authored-by: Aaron Patterson <aaron.patterson@shopify.com>
2024-02-27YJIT: Support splat with C methods with -1 arityAlan Wu
Usually we deal with splats by speculating that they're of a specific size. In this case, the C method takes a pointer and a length, so we can support changing sizes just fine.
2024-02-23YJIT: Lazily push a frame for specialized C funcs (#10080)Takashi Kokubun
* YJIT: Lazily push a frame for specialized C funcs Co-authored-by: Maxime Chevalier-Boisvert <maxime.chevalierboisvert@shopify.com> * Fix a comment on pc_to_cfunc * Rename rb_yjit_check_pc to rb_yjit_lazy_push_frame * Rename it to jit_prepare_lazy_frame_call * Fix a typo * Optimize String#getbyte as well * Optimize String#byteslice as well --------- Co-authored-by: Maxime Chevalier-Boisvert <maxime.chevalierboisvert@shopify.com>
2024-02-22YJIT: Optimize attr_writer (#9986)Takashi Kokubun
* YJIT: Optimize attr_writer * Comment about StackOpnd vs SelfOpnd
2024-02-20YJIT: Support `**nil` for cfuncsAlan Wu
Similar to the iseq call support. Fairly straight forward.
2024-02-16YJIT: Remove unused countersAlan Wu
2024-02-16YJIT: Support `**nil`Alan Wu
This adds YJIT support for VM_CALL_KW_SPLAT with nil, specifically for when we already know from the context that it's done with a nil. This is enough to support forwarding with `...` when there no keyword arguments are present. Amend the kw_rest support to propagate the type of the parameter to help with this. Test interactions with splat, since the splat array sits lower on the stack when a kw_splat argument is present.