| Age | Commit message (Collapse) | Author |
|
* use correct svar
Without this patch, svar location is used "nearest Ruby frame".
It is almost correct but it doesn't correct when the `each` method
is written in Ruby.
```ruby
class C
include Enumerable
def each
%w(bar baz).each{|e| yield e}
end
end
C.new.grep(/(b.)/){|e| p [$1, e]}
```
This patch fix this issue by traversing ifunc's cfp.
Note that if cfp doesn't specify this Thread's cfp stack, reserved
svar location (`ec->root_svar`) is used.
* make yjit-bindgen
---------
Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
Notes:
Merged-By: k0kubun <takashikkbn@gmail.com>
|
|
|
|
* YJIT: Handle splat with opt more fully
* Update yjit/src/codegen.rs
---------
Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>
Notes:
Merged-By: maximecb <maximecb@ruby-lang.org>
|
|
Notes:
Merged-By: maximecb <maximecb@ruby-lang.org>
|
|
Grouping these together helps with finding all of the unimplemented
method types. It was interleaved with some other match arm long and
short previously.
Notes:
Merged: https://github.com/ruby/ruby/pull/7210
|
|
Notes:
Merged-By: maximecb <maximecb@ruby-lang.org>
|
|
Notes:
Merged-By: maximecb <maximecb@ruby-lang.org>
|
|
Notes:
Merged-By: maximecb <maximecb@ruby-lang.org>
|
|
YJIT: Fix BorrowMutError
Notes:
Merged-By: k0kubun <takashikkbn@gmail.com>
|
|
* YJIT: Skip defer_compilation for fixnums if possible
* YJIT: It should be Some(false)
* YJIT: Define two_fixnums_on_stack on Context
Notes:
Merged-By: maximecb <maximecb@ruby-lang.org>
|
|
This makes it so that the generator and the output code read in the same
order. I think it reads better this way.
Notes:
Merged-By: maximecb <maximecb@ruby-lang.org>
|
|
Notes:
Merged-By: k0kubun <takashikkbn@gmail.com>
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/7158
Merged-By: XrXr
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/7155
|
|
Note: On the new code of yjit/src/core.rs:2178, we no longer leave the state `.block=None` but `.address=Some...`, which might be important.
We assume it's actually not needed and take a risk here to minimize heap allocations, but in case it turns out to be necessary, we could signal/resurrect that state by introducing a new BranchTarget (or BranchShape) variant dedicated to it.
Notes:
Merged-By: k0kubun <takashikkbn@gmail.com>
|
|
are exiting (#6929)
YJIT: Implement splat for cfuncs. Split exit cases
This also implements a new check for ruby2keywords as the last
argument of a splat. This does mean that we generate more code, but in
actual benchmarks where we gained speed from this (binarytrees) I
don't see any significant slow down. I did have to struggle here with
the register allocator to find code that didn't allocate too many
registers. It's a bit hard when everything is implicit. But I think I
got to the minimal amount of copying and stuff given our current
allocation strategy.
Notes:
Merged-By: maximecb <maximecb@ruby-lang.org>
|
|
Fewer cycles running nops when these jumps are not taken. Fixing all
these so when they get copy pasted in the future we save on padding.
Notes:
Merged: https://github.com/ruby/ruby/pull/7150
|
|
YJIT: implement codegen for String#empty?
Notes:
Merged-By: maximecb <maximecb@ruby-lang.org>
|
|
* Add stats so we can keep track of x86 rel32 vs register calls
To know if we get that "prime real estate" as Alan put it.
* Fix bug pointed by Alan
Notes:
Merged-By: maximecb <maximecb@ruby-lang.org>
|
|
Shorter, and easier to parse without parentheses.
Notes:
Merged: https://github.com/ruby/ruby/pull/7121
|
|
Using a constant shows intention better and is less noisy. It always
took me a second to parse the long expression.
Notes:
Merged: https://github.com/ruby/ruby/pull/7121
|
|
* Add job to check clippy lints in CI
* Address all remaining clippy lints
* Check lints on arm64 as well
* Apply latest clippy lints
* Do not exit 0 on clippy warnings
Notes:
Merged-By: maximecb <maximecb@ruby-lang.org>
|
|
|
|
* YJIT: Add a few asm comments
* YJIT: Clarify exiting insns
* YJIT: Fix cargo test
Notes:
Merged-By: k0kubun <takashikkbn@gmail.com>
|
|
* Differentiate T_ARRAY and array subclasses
This commit teaches the YJIT context the difference between Arrays
(objects with type T_ARRAY and class rb_cArray) vs Array subclasses
(objects with type T_ARRAY but _not_ class rb_cArray). It uses this
information to reduce the number of guards emitted when using
`jit_guard_known_klass` with rb_cArray, notably opt_aref
* Update yjit/src/core.rs
Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>
Notes:
Merged-By: maximecb <maximecb@ruby-lang.org>
|
|
Previously, we did not update `cfp->sp` before calling the C function of
ISEQs marked with `Primitive.attr! "inline"` (leaf builtins). This
caused the GC to miss temporary values on the stack in case the function
allocates and triggers a GC run. Right now, there is only a few leaf
builtins in numeric.rb on Integer methods such as `Integer#~`. Since
these methods only allocate when operating on big numbers, we missed
this issue.
Fix by saving PC and SP before calling the functions -- our usual
protocol for calling C functions that may allocate on the GC heap.
[Bug #19316]
Notes:
Merged-By: maximecb <maximecb@ruby-lang.org>
|
|
warning: unused variable: `start_addr`
--> ../yjit/src/asm/mod.rs:359:39
|
359 | pub fn remove_comments(&mut self, start_addr: CodePtr, end_addr: CodePtr) {
| ^^^^^^^^^^ help: if this is intentional, prefix it with an underscore: `_start_addr`
|
= note: `#[warn(unused_variables)]` on by default
warning: unused variable: `end_addr`
--> ../yjit/src/asm/mod.rs:359:60
|
359 | pub fn remove_comments(&mut self, start_addr: CodePtr, end_addr: CodePtr) {
|
Notes:
Merged-By: maximecb <maximecb@ruby-lang.org>
|
|
Notes:
Merged-By: maximecb <maximecb@ruby-lang.org>
|
|
* YJIT: Make iseq_get_location consistent with iseq.c
* YJIT: Call it "YJIT entry point"
Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>
Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>
Notes:
Merged-By: k0kubun <takashikkbn@gmail.com>
|
|
* YJIT: Colorize outlined code differently
on --yjit-dump-disasm
* YJIT: Reduce the number of escape sequences
Notes:
Merged-By: k0kubun <takashikkbn@gmail.com>
|
|
We should differentiate between set and get for megamorphic exits. This
patch fixes the megamorphic exit name in gen_setinstancevariable so that
we can tell the difference between megamorphic get / set sites
Notes:
Merged: https://github.com/ruby/ruby/pull/7072
|
|
Since the panic message is in stderr, better to use the same stream in
case stdout and stderr are not synced due to IO redirection.
|
|
It's a register spill issue. Fix by moving the Qnil fill snippet to
after registers are released.
[Bug #19299]
Notes:
Merged: https://github.com/ruby/ruby/pull/7059
|
|
Notes:
Merged-By: k0kubun <takashikkbn@gmail.com>
|
|
On a hash miss we need to call default if it is redefined in order to
return the default value to be used. Previously we checked this with
rb_method_basic_definition_p, which avoids the method call but requires
a method lookup.
This commit replaces the previous check with BASIC_OP_UNREDEFINED_P and
a new BOP_DEFAULT. We still need to fall back to
rb_method_basic_definition_p when called on a subclasss of hash.
| |compare-ruby|built-ruby|
|:---------------|-----------:|---------:|
|hash_aref_miss | 2.692| 3.531|
| | -| 1.31x|
Co-authored-by: Daniel Colson <danieljamescolson@gmail.com>
Co-authored-by: "Ian C. Anderson" <ian@iancanderson.com>
Co-authored-by: Jack McCracken <me@jackmc.xyz>
Notes:
Merged: https://github.com/ruby/ruby/pull/6945
|
|
All the method call types need to handle argument shifting in case they're
called by `.send`, and we weren't handling that in `OPTIMIZED_METHOD_TYPE_CALL`.
Lack of shifting caused the stack size assertion in gen_leave() to fail.
Discovered by Rails CI: https://buildkite.com/rails/rails/builds/91705#018516c4-f8f8-469e-bc2d-ddeb25ca8317/1920-2067
Diagnosed with help from `@eileencodes` and `@k0kubun`.
Notes:
Merged: https://github.com/ruby/ruby/pull/6943
Merged-By: XrXr
|
|
SIZE_POOL_COUNT is a GC macro, it should belong in gc.h and not shape.h.
SIZE_POOL_COUNT doesn't depend on shape.h so we can have shape.h depend
on gc.h.
Co-Authored-By: Matt Valentine-House <matt@eightbitraptor.com>
Notes:
Merged: https://github.com/ruby/ruby/pull/6940
|
|
Stubs we generate for invalidation don't necessarily co-locate with the
code that jump to the stub. Since we rely on co-location to keep stubs
alive as they are in the outlined code block, it used to be possible for
code GC inside branch_stub_hit() to free the stub that's its direct
caller, leading us to return to freed code after.
Stubs used to look like:
```
mov arg0, branch_ptr
mov arg1, target_idx
mov arg2, ec
call branch_stub_hit
jmp return_reg
```
Since the call and the jump after the call is the same for all stubs, we
can extract them and use a static trampoline for them. That makes
branch_stub_hit() always return to static code. Stubs now look like:
```
mov arg0, branch_ptr
mov arg1, target_idx
jmp trampoline
```
Where the trampoline is:
```
mov arg2, ec
call branch_stub_hit
jmp return_reg
```
Code GC can now free stubs without problems since we'll always return
to the trampoline, which we generate once on boot and lives forever.
This might save a small bit of memory due to factoring out the static
part of stubs, but it's probably minor.
[Bug #19234]
Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
Notes:
Merged-By: maximecb <maximecb@ruby-lang.org>
|
|
When an object becomes "too complex" (in other words it has too many
variations in the shape tree), we transition it to use a "too complex"
shape and use a hash for storing instance variables.
Without this patch, there were rare cases where shape tree growth could
"explode" and cause performance degradation on what would otherwise have
been cached fast paths.
This patch puts a limit on shape tree growth, and gracefully degrades in
the rare case where there could be a factorial growth in the shape tree.
For example:
```ruby
class NG; end
HUGE_NUMBER.times do
NG.new.instance_variable_set(:"@unique_ivar_#{_1}", 1)
end
```
We consider objects to be "too complex" when the object's class has more
than SHAPE_MAX_VARIATIONS (currently 8) leaf nodes in the shape tree and
the object introduces a new variation (a new leaf node) associated with
that class.
For example, new variations on instances of the following class would be
considered "too complex" because those instances create more than 8
leaves in the shape tree:
```ruby
class Foo; end
9.times { Foo.new.instance_variable_set(":@uniq_#{_1}", 1) }
```
However, the following class is *not* too complex because it only has
one leaf in the shape tree:
```ruby
class Foo
def initialize
@a = @b = @c = @d = @e = @f = @g = @h = @i = nil
end
end
9.times { Foo.new }
``
This case is rare, so we don't expect this change to impact performance
of most applications, but it needs to be handled.
Co-Authored-By: Aaron Patterson <tenderlove@ruby-lang.org>
Notes:
Merged: https://github.com/ruby/ruby/pull/6931
|
|
It's idempotent.
Notes:
Merged: https://github.com/ruby/ruby/pull/6930
|
|
* YJIT: Change the default mem size to 64MiB
* Also update ruby --help
Co-authored-by: Alan Wu <XrXr@users.noreply.github.com>
Notes:
Merged-By: maximecb <maximecb@ruby-lang.org>
|
|
Notes:
Merged-By: maximecb <maximecb@ruby-lang.org>
|
|
Notes:
Merged-By: k0kubun <takashikkbn@gmail.com>
|
|
* YJIT: implement getconstant YARV instruction
* Constant id is not a pointer
* Stack operands must be read after jit_prepare_routine_call
Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
Notes:
Merged-By: k0kubun <takashikkbn@gmail.com>
|
|
The new version has an option to merge everything into a big
`extern "C"` block and it's nicer.
More importantly, this upgrade fixes an issue where Ubuntu with Clang 12
and macOS with Clang 14 gave a one line diff for `rb_shape_t`. It was
slightly annoying because we use macOS locally.
Notes:
Merged: https://github.com/ruby/ruby/pull/6887
|
|
Notes:
Merged-By: maximecb <maximecb@ruby-lang.org>
|
|
Notes:
Merged-By: maximecb <maximecb@ruby-lang.org>
|
|
Notes:
Merged-By: ioquatix <samuel@codeotaku.com>
|
|
With this change, we're storing the iv name on an inline cache on
setinstancevariable instructions. This allows us to check the inline
cache to count instance variables set in initialize and give us an
estimate of iv capacity for an object.
For the purpose of estimating the number of instance variables required
for an object, we're assuming that all initialize methods will call
`super`.
This change allows us to estimate the number of instance variables
required without disassembling instruction sequences.
Co-Authored-By: Aaron Patterson <tenderlove@ruby-lang.org>
Notes:
Merged: https://github.com/ruby/ruby/pull/6870
|
|
Prior to this commit the `OPTIMIZED_CMP` macro relied on a method lookup
to determine whether `<=>` was overridden. The result of the lookup was
cached, but only for the duration of the specific method that
initialized the cmp_opt_data cache structure.
With this method lookup, `[x,y].max` is slower than doing `x > y ?
x : y` even though there's an optimized instruction for "new array max".
(John noticed somebody a proposed micro-optimization based on this fact
in https://github.com/mastodon/mastodon/pull/19903.)
```rb
a, b = 1, 2
Benchmark.ips do |bm|
bm.report('conditional') { a > b ? a : b }
bm.report('method') { [a, b].max }
bm.compare!
end
```
Before:
```
Comparison:
conditional: 22603733.2 i/s
method: 19820412.7 i/s - 1.14x (± 0.00) slower
```
This commit replaces the method lookup with a new CMP basic op, which
gives the examples above equivalent performance.
After:
```
Comparison:
method: 24022466.5 i/s
conditional: 23851094.2 i/s - same-ish: difference falls within
error
```
Relevant benchmarks show an improvement to Array#max and Array#min when
not using the optimized newarray_max instruction as well. They are
noticeably faster for small arrays with the relevant types, and the same
or maybe a touch faster on larger arrays.
```
$ make benchmark COMPARE_RUBY=<master@5958c305> ITEM=array_min
$ make benchmark COMPARE_RUBY=<master@5958c305> ITEM=array_max
```
The benchmarks added in this commit also look generally improved.
Co-authored-by: John Hawthorn <jhawthorn@github.com>
|