Optimize symproc calls

2023-04-25T15:06:16+00:00

Similar to the bmethod/send optimization, this avoids using
CALLER_ARG_SPLAT if not necessary.  As long as the receiver argument
can be shifted off, other arguments are passed through as-is.

This optimizes the following types of calls:

* symproc.(recv) ~5%
* symproc.(recv, *args) ~65% for args.length == 200
* symproc.(recv, *args, **kw) ~45% for args.length == 200
* symproc.(recv, **kw) ~30%
* symproc.(recv, kw: 1) ~100%

Note that empty argument splats do get slower with this approach,
by about 2-3%.  This is probably because iseq argument setup is
slower for empty argument splats than CALLER_SETUP_ARG is. Other
than non-empty argument splats, other argument splats are faster,
with the speedup depending on the number of arguments.

The following types of calls are not optimized:

* symproc.(*args)
* symproc.(*args, **kw)

This is because the you cannot shift the receiver argument off
without first splatting the arg.

ruby.git/benchmark/vm_call_symproc.yml, branch v4.0.3

Optimize symproc calls