ruby.git - The Ruby Programming Language

Age	Commit message (Collapse)	Author
2021-10-29	vm_core.h: Avoid unaligned access to ic_serial on 32-bit machine	Yusuke Endoh
	This caused Bus error on 32 bit Solaris Notes: Merged: https://github.com/ruby/ruby/pull/5049
2021-10-27	the core problem is the Proc is not shareable	Satoshi Moris Tagomori
	Notes: Merged: https://github.com/ruby/ruby/pull/4771
2021-10-20	Extract yjit_force_iv_index and make it work when object is frozen	Alan Wu
	In an effort to simplify the logic YJIT generates for accessing instance variable, YJIT ensures that a given name-to-index mapping exists at compile time. In the case that the mapping doesn't exist, it was created by using rb_ivar_set() with Qundef on the sample object we see at compile time. This hack isn't fine if the sample object happens to be frozen, in which case YJIT would raise a FrozenError unexpectedly. To deal with this, make a new function that only reserves the mapping but doesn't touch the object. This is rb_obj_ensure_iv_index_mapping(). This new function superceeds the functionality of rb_iv_index_tbl_lookup() so it was removed. Reported by and includes a test case from John Hawthorn <john@hawthorn.email> Fixes: GH-282
2021-10-20	Add comments about special runtime routines YJIT calls	Alan Wu
	When YJIT make calls to routines without reconstructing interpreter state through jit_prepare_routine_call(), it relies on the routine to never allocate, raise, and push/pop control frames. Comment about this on the routines that YJTI calls. This is probably something we should dynamically verify on debug builds. It's hard to statically verify this as it requires verifying all functions in the call tree. Maybe something to look at in the future.
2021-10-20	Cleanup diff against upstream. Add comments	Alan Wu
	I did a `git diff --stat` against upstream and looked at all the files that are outside of YJIT to come up with these minor changes.
2021-10-20	Put YJIT into a single compilation unit	Alan Wu
	For upstreaming, we want functions we export either prefixed with "rb_" or made static. Historically we haven't been following this rule, so we were "leaking" a lot of symbols as `make leak-globals` would tell us. This change unifies everything YJIT into a single compilation unit, yjit.o, and makes everything unprefixed static to pass `make leak-globals`. This manual "unified build" setup is similar to that of vm.o. Having everything in one compilation unit allows static functions to be visible across YJIT files and removes the need for declarations in headers in some cases. Unnecessary declarations were removed. Other changes of note: - switched to MJIT_SYMBOL_EXPORT_BEGIN which indicates stuff as being off limits for native extensions - the first include of each YJIT file is change to be "internal.h" - undefined MAP_STACK before explicitly redefining it since it collide's with a definition in system headers. Consider renaming?
2021-10-20	Implement getclassvariable in yjit	eileencodes
	Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org>
2021-10-20	Add a slowpath for opt_getinlinecache	Alan Wu
	Before this change, when we encounter a constant cache that is specific to a lexical scope, we unconditionally exit. This change falls back to the interpreter's cache in this situation. This should help constant expressions in `class << self`, which is popular at Shopify due to the style guide. This change relies on the cache being warm while compiling to detect the need for checking the lexical scope for simplicity.
2021-10-20	Remove vm_opt_aset	John Hawthorn

2021-10-20	Add comments for new function	Aaron Patterson

2021-10-20	Refactor attrset to use a function	Aaron Patterson
	This new function will do the write barrier / resize the object / check frozen for us
2021-10-20	Remove rb_opt_equality_specialized	John Hawthorn

2021-10-20	Implement splatarray	Kevin Newton

2021-10-20	Try to fix MJIT symbol clash with cargo cult	Maxime Chevalier-Boisvert

2021-10-20	Implement defined bytecode (#39)	Maxime Chevalier-Boisvert

2021-10-20	Implement setivar with a plain old function call (#34)	Maxime Chevalier-Boisvert
	* Implement setivar with a plain old function call * Remove return
2021-10-20	Implement opt_aset as interpreter handler call	Maxime Chevalier-Boisvert

2021-10-20	Implement opt_mod as call to interpreter function (#29)	Maxime Chevalier-Boisvert

2021-10-20	Implement opt_eq by calling interpreter function (#28)	Maxime Chevalier-Boisvert

2021-10-20	Implement send with alias method (#23)	Maxime Chevalier-Boisvert
	* Implement send with alias method * Add alias_method tests
2021-10-20	Implement calls to methods with simple optional params	Alan Wu
	* Implement calls to methods with simple optional params * Remove unnecessary MJIT_STATIC See comment for MJIT_STATIC. I added it not knowing whether it's required because the function next to it has it. Don't use it and wait for problems to come up instead. * Better naming, some comments * Count bailing on kw only iseqs On railsbench: ``` opt_send_without_block exit reasons: bmethod 59729 (27.7%) optimized_method 59137 (27.5%) iseq_complex_callee 41362 (19.2%) alias_method 33346 (15.5%) callsite_not_simple 19170 ( 8.9%) iseq_only_keywords 1300 ( 0.6%) kw_splat 1299 ( 0.6%) cfunc_ruby_array_varg 18 ( 0.0%) ```
2021-10-20	YJIT: Fancier opt_getinlinecache	Alan Wu
	Make sure `opt_getinlinecache` is in a block all on its own, and invalidate it from the interpreter when `opt_setinlinecache`. It will recompile with a filled cache the second time around. This lets YJIT runs well when the IC for constant is cold.
2021-10-20	YJIT: lazy polymorphic getinstancevariable	Alan Wu
	Lazily compile out a chain of checks for different known classes and whether `self` embeds its ivars or not. * Remove trailing whitespaces * Get proper addresss in Capstone disassembly * Lowercase address in Capstone disassembly Capstone uses lowercase for jump targets in generated listings. Let's match it. * Use the same successor in getivar guard chains Cuts down on duplication * Address reviews * Fix copypasta error * Add a comment
2021-10-20	WIP JIT-to-JIT returns	Maxime Chevalier-Boisvert

2021-10-19	Remove useless casts	Nobuyoshi Nakada

2021-10-19	Get rid of type-punning cast	Nobuyoshi Nakada

2021-10-03	Using NIL_P macro instead of `== Qnil`	S.H
	Notes: Merged: https://github.com/ruby/ruby/pull/4925 Merged-By: nobu <nobu@ruby-lang.org>
2021-10-01	Introduce rb_vm_call_with_refinements to DRY up a few calls	Jeremy Evans
	Notes: Merged: https://github.com/ruby/ruby/pull/4919
2021-09-30	Make Array#min/max optimization respect refined methods	Jeremy Evans
	Pass in ec to vm_opt_newarray_{max,min}. Avoids having to call GET_EC inside the functions, for better performance. While here, add a test for Array#min/max being redefined to test_optimization.rb. Fixes [Bug #18180] Notes: Merged: https://github.com/ruby/ruby/pull/4911 Merged-By: jeremyevans <code@jeremyevans.net>
2021-09-19	Extract hook macro for attributes	Nobuyoshi Nakada

2021-09-11	Remove printf family from the mjit header	Nobuyoshi Nakada
	Linking printf family functions makes mjit objects to link unnecessary code. Notes: Merged: https://github.com/ruby/ruby/pull/4820
2021-08-29	Support tracing of attr_reader and attr_writer	Jeremy Evans
	In vm_call_method_each_type, check for c_call and c_return events before dispatching to vm_call_ivar and vm_call_attrset. With this approach, the call cache will still dispatch directly to those functions, so this change will only decrease performance for the first (uncached) call, and even then, the performance decrease is very minimal. This approach requires that we clear the call caches when tracing is enabled or disabled. The approach currently switches all vm_call_ivar and vm_call_attrset call caches to vm_call_general any time tracing is enabled or disabled. So it could theoretically result in a slowdown for code that constantly enables or disables tracing. This approach does not handle targeted tracepoints, but from my testing, c_call and c_return events are not supported for targeted tracepoints, so that shouldn't matter. This includes a benchmark showing the performance decrease is minimal if detectable at all. Fixes [Bug #16383] Fixes [Bug #10470] Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com> Notes: Merged: https://github.com/ruby/ruby/pull/4767
2021-08-11	Get rid of type-punning pointer casts [Bug #18062]	Nobuyoshi Nakada
	Notes: Merged: https://github.com/ruby/ruby/pull/4716
2021-08-02	Using RBOOL macro	S.H
	Notes: Merged: https://github.com/ruby/ruby/pull/4695 Merged-By: nobu <nobu@ruby-lang.org>
2021-06-23	Refactor class variable cache functions	Nobuyoshi Nakada
	Extracted repeated code as update_classvariable_cache. When cvc table is not set in getclassvariable, an empty table was created but it has no id and would cause [BUG], so made the code same as setclassvariable.
2021-06-18	Add a cache for class variables	eileencodes
	Redo of 34a2acdac788602c14bf05fb616215187badd504 and 931138b00696419945dc03e10f033b1f53cd50f3 which were reverted. GitHub PR #4340. This change implements a cache for class variables. Previously there was no cache for cvars. Cvar access is slow due to needing to travel all the way up th ancestor tree before returning the cvar value. The deeper the ancestor tree the slower cvar access will be. The benefits of the cache are more visible with a higher number of included modules due to the way Ruby looks up class variables. The benchmark here includes 26 modules and shows with the cache, this branch is 6.5x faster when accessing class variables. ``` compare-ruby: ruby 3.1.0dev (2021-03-15T06:22:34Z master 9e5105c) [x86_64-darwin19] built-ruby: ruby 3.1.0dev (2021-03-15T12:12:44Z add-cache-for-clas.. c6be009) [x86_64-darwin19] \| \|compare-ruby\|built-ruby\| \|:--------\|-----------:\|---------:\| \|vm_cvar \| 5.681M\| 36.980M\| \| \| -\| 6.51x\| ``` Benchmark.ips calling `ActiveRecord::Base.logger` from within a Rails application. ActiveRecord::Base.logger has 71 ancestors. The more ancestors a tree has, the more clear the speed increase. IE if Base had only one ancestor we'd see no improvement. This benchmark is run on a vanilla Rails application. Benchmark code: ```ruby require "benchmark/ips" require_relative "config/environment" Benchmark.ips do \|x\| x.report "logger" do ActiveRecord::Base.logger end end ``` Ruby 3.0 master / Rails 6.1: ``` Warming up -------------------------------------- logger 155.251k i/100ms Calculating ------------------------------------- ``` Ruby 3.0 with cvar cache / Rails 6.1: ``` Warming up -------------------------------------- logger 1.546M i/100ms Calculating ------------------------------------- logger 14.857M (± 4.8%) i/s - 74.198M in 5.006202s ``` Lastly we ran a benchmark to demonstate the difference between master and our cache when the number of modules increases. This benchmark measures 1 ancestor, 30 ancestors, and 100 ancestors. Ruby 3.0 master: ``` Warming up -------------------------------------- 1 module 1.231M i/100ms 30 modules 432.020k i/100ms 100 modules 145.399k i/100ms Calculating ------------------------------------- 1 module 12.210M (± 2.1%) i/s - 61.553M in 5.043400s 30 modules 4.354M (± 2.7%) i/s - 22.033M in 5.063839s 100 modules 1.434M (± 2.9%) i/s - 7.270M in 5.072531s Comparison: 1 module: 12209958.3 i/s 30 modules: 4354217.8 i/s - 2.80x (± 0.00) slower 100 modules: 1434447.3 i/s - 8.51x (± 0.00) slower ``` Ruby 3.0 with cvar cache: ``` Warming up -------------------------------------- 1 module 1.641M i/100ms 30 modules 1.655M i/100ms 100 modules 1.620M i/100ms Calculating ------------------------------------- 1 module 16.279M (± 3.8%) i/s - 82.038M in 5.046923s 30 modules 15.891M (± 3.9%) i/s - 79.459M in 5.007958s 100 modules 16.087M (± 3.6%) i/s - 81.005M in 5.041931s Comparison: 1 module: 16279458.0 i/s 100 modules: 16087484.6 i/s - same-ish: difference falls within error 30 modules: 15891406.2 i/s - same-ish: difference falls within error ``` Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org> Notes: Merged: https://github.com/ruby/ruby/pull/4544
2021-05-11	Revert "Filling cache values on cvar write"	Aaron Patterson
	This reverts commit 08de37f9fa3469365e6b5c964689ae2bae0eb9f3. This reverts commit e8ae922b62adb00a80d3d4c49f7d7b0e6026eaba.
2021-05-11	Filling cache values on cvar write	eileencodes
	Instead of on read. Once it's in the inline cache we never have to make one again. We want to eventually put the value into the cache, and the best opportunity to do that is when you write the value. Notes: Merged: https://github.com/ruby/ruby/pull/4340
2021-05-11	Add a cache for class variables	eileencodes
	This change implements a cache for class variables. Previously there was no cache for cvars. Cvar access is slow due to needing to travel all the way up th ancestor tree before returning the cvar value. The deeper the ancestor tree the slower cvar access will be. The benefits of the cache are more visible with a higher number of included modules due to the way Ruby looks up class variables. The benchmark here includes 26 modules and shows with the cache, this branch is 6.5x faster when accessing class variables. ``` compare-ruby: ruby 3.1.0dev (2021-03-15T06:22:34Z master 9e5105ca45) [x86_64-darwin19] built-ruby: ruby 3.1.0dev (2021-03-15T12:12:44Z add-cache-for-clas.. c6be0093ae) [x86_64-darwin19] \| \|compare-ruby\|built-ruby\| \|:--------\|-----------:\|---------:\| \|vm_cvar \| 5.681M\| 36.980M\| \| \| -\| 6.51x\| ``` Benchmark.ips calling `ActiveRecord::Base.logger` from within a Rails application. ActiveRecord::Base.logger has 71 ancestors. The more ancestors a tree has, the more clear the speed increase. IE if Base had only one ancestor we'd see no improvement. This benchmark is run on a vanilla Rails application. Benchmark code: ```ruby require "benchmark/ips" require_relative "config/environment" Benchmark.ips do \|x\| x.report "logger" do ActiveRecord::Base.logger end end ``` Ruby 3.0 master / Rails 6.1: ``` Warming up -------------------------------------- logger 155.251k i/100ms Calculating ------------------------------------- ``` Ruby 3.0 with cvar cache / Rails 6.1: ``` Warming up -------------------------------------- logger 1.546M i/100ms Calculating ------------------------------------- logger 14.857M (± 4.8%) i/s - 74.198M in 5.006202s ``` Lastly we ran a benchmark to demonstate the difference between master and our cache when the number of modules increases. This benchmark measures 1 ancestor, 30 ancestors, and 100 ancestors. Ruby 3.0 master: ``` Warming up -------------------------------------- 1 module 1.231M i/100ms 30 modules 432.020k i/100ms 100 modules 145.399k i/100ms Calculating ------------------------------------- 1 module 12.210M (± 2.1%) i/s - 61.553M in 5.043400s 30 modules 4.354M (± 2.7%) i/s - 22.033M in 5.063839s 100 modules 1.434M (± 2.9%) i/s - 7.270M in 5.072531s Comparison: 1 module: 12209958.3 i/s 30 modules: 4354217.8 i/s - 2.80x (± 0.00) slower 100 modules: 1434447.3 i/s - 8.51x (± 0.00) slower ``` Ruby 3.0 with cvar cache: ``` Warming up -------------------------------------- 1 module 1.641M i/100ms 30 modules 1.655M i/100ms 100 modules 1.620M i/100ms Calculating ------------------------------------- 1 module 16.279M (± 3.8%) i/s - 82.038M in 5.046923s 30 modules 15.891M (± 3.9%) i/s - 79.459M in 5.007958s 100 modules 16.087M (± 3.6%) i/s - 81.005M in 5.041931s Comparison: 1 module: 16279458.0 i/s 100 modules: 16087484.6 i/s - same-ish: difference falls within error 30 modules: 15891406.2 i/s - same-ish: difference falls within error ``` Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org> Notes: Merged: https://github.com/ruby/ruby/pull/4340
2021-05-07	Protoized old pre-ANSI K&R style declarations and definitions	Nobuyoshi Nakada

2021-04-23	Add back checks for empty kw splat with tests (#4405)	Alan Wu
	This reverts commit a224ce8150f2bc687cf79eb415c931d87a4cd247. Turns out the checks are needed to handle splatting an array with an empty ruby2 keywords hash. Notes: Merged-By: XrXr
2021-04-23	Remove part of comment that is no longer accurate	Jeremy Evans
	In Ruby 2.7, empty keyword splats could be added back for backwards compatibility. However, that stopped in Ruby 3.0.
2021-04-23	Remove unnecessary checks for empty kw splat	Alan Wu
	These two checks are surrounded by an if that ensures the call site is not a kw splat call site. Notes: Merged: https://github.com/ruby/ruby/pull/4404 Merged-By: XrXr
2021-04-02	fix return from orphan Proc in lambda	Koichi Sasada
	A "return" statement in a Proc in a lambda like: `lambda{ proc{ return }.call }` should return outer lambda block. However, the inner Proc can become orphan Proc from the lambda block. This "return" escape outer-scope like method, but this behavior was decieded as a bug. [Bug #17105] This patch raises LocalJumpError by checking the proc is orphan or not from lambda blocks before escaping by "return". Most of tests are written by Jeremy Evans https://github.com/ruby/ruby/pull/4223 Notes: Merged: https://github.com/ruby/ruby/pull/4347
2021-03-17	return bool instead of VALUE	Aaron Patterson
	Notes: Merged: https://github.com/ruby/ruby/pull/4279
2021-03-17	Refactor vm_defined to return a boolean	Aaron Patterson
	We just need this function to return whether or not the thing we're looking for is defined. If it's defined, return something true, otherwise false. Notes: Merged: https://github.com/ruby/ruby/pull/4279
2021-03-17	Stop calling `rb_iseq_defined_string` in vm_defined	Aaron Patterson
	We already have access to the string from the iseqs, so we can stop calling this function. Notes: Merged: https://github.com/ruby/ruby/pull/4279
2021-03-10	Remove DEFINED_IVAR2 from enum	John Hawthorn
	This version of defined? doesn't seem to be possible to emit anymore. Notes: Merged: https://github.com/ruby/ruby/pull/4253
2021-02-13	opt_equality_by_mid for rb_equal_opt	Koichi Sasada
	This patch improves the performance of sequential and parallel execution of rb_equal() (and rb_eql()). [Bug #17497] rb_equal_opt (and rb_eql_opt) does not have own cd and it waste a time to initialize cd. This patch introduces opt_equality_by_mid() to check equality without cd. Furthermore, current master uses "static" cd on rb_equal_opt (and rb_eql_opt) and it hurts CPU caches on multi-thread execution. Now they are gone so there are no bottleneck on parallel execution. Notes: Merged: https://github.com/ruby/ruby/pull/4177
2021-01-29	global call-cache cache table for rb_funcall*	Koichi Sasada
	rb_funcall* (rb_funcall(), rb_funcallv(), ...) functions invokes Ruby's method with given receiver. Ruby 2.7 introduced inline method cache with static memory area. However, Ruby 3.0 reimplemented the method cache data structures and the inline cache was removed. Without inline cache, rb_funcall* searched methods everytime. Most of cases per-Class Method Cache (pCMC) will be helped but pCMC requires VM-wide locking and it hurts performance on multi-Ractor execution, especially all Ractors calls methods with rb_funcall. This patch introduced Global Call-Cache Cache Table (gccct) for rb_funcall. Call-Cache was introduced from Ruby 3.0 to manage method cache entry atomically and gccct enables method-caching without VM-wide locking. This table solves the performance issue on multi-ractor execution. [Bug #17497] Ruby-level method invocation does not use gccct because it has inline-method-cache and the table size is limited. Basically rb_funcall* is not used frequently, so 1023 entries can be enough. We will revisit the table size if it is not enough. Notes: Merged: https://github.com/ruby/ruby/pull/4129