ruby.git - The Ruby Programming Language

Age	Commit message	Author
2024-03-06	Move FL_SINGLETON to FL_USER1	Jean Boussier
	This frees FL_USER0 on both T_MODULE and T_CLASS. Note: prior to this, FL_SINGLETON was never set on T_MODULE, so checking for `FL_SINGLETON` without first checking that `FL_TYPE` was `T_CLASS` was valid. That's no longer the case.
2024-03-01	Don't pin named structs defined in Ruby	Jean Boussier
	[Bug #20311] `rb_define_class_under` assumes it's called from C and that the reference might be held in a C global variable, so it adds the class to the VM root. In the case of `Struct.new('Name')`, this is wasteful and makes the struct immortal.
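For context, a minimal sketch of what a "named struct defined in Ruby" looks like. The struct name `GrExamplePoint` is a made-up name for illustration; the sketch only shows the Ruby-visible naming behavior, not the pinning change itself.

```ruby
# Passing a String as the first argument defines a constant under Struct.
# "GrExamplePoint" is a hypothetical name chosen for this example.
named = Struct.new('GrExamplePoint', :x, :y)

# The returned class is the same object as the constant Struct::GrExamplePoint.
same_object = named.equal?(Struct::GrExamplePoint)
named_name = named.name

# Without a String argument the struct class is anonymous until assigned to a constant.
anonymous = Struct.new(:x, :y)
anonymous_name = anonymous.name

puts same_object
puts named_name
p anonymous_name
```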
2024-02-28	Make rb_define_finalizer_no_check private	Peter Zhu

2024-02-28	Remove unused rb_gc_id2ref_obj_tbl	Peter Zhu

2024-02-26	Remove rb_objspace_marked_object_p	Peter Zhu
	rb_objspace_marked_object_p is no longer used in the objspace module, so we can remove it.
2024-02-26	Make rb_objspace_data_type_memsize private	Peter Zhu
	rb_objspace_data_type_memsize is not used in the objspace module, so we can make it private.
2024-02-26	Remove unused rb_objspace_each_objects_without_setup	Peter Zhu

2024-02-22	Extract imemo functions from gc.c into imemo.c	Peter Zhu

2024-02-21	Add IMEMO_NEW	Peter Zhu
	Rather than exposing that an imemo has a flag and four fields, this changes the implementation to only expose one field (the klass) and fills the rest with 0. Each type will have to fill in the values itself.
2024-02-20	De-dup identical callinfo objects	John Hawthorn
	Previously every call to vm_ci_new (when the CI was not packable) would result in a different callinfo being returned. This meant that every kwarg callsite had its own CI. When calling, different CIs result in different CCs. These CIs and CCs both end up persisted on the T_CLASS inside cc_tbl. So in an eval loop this resulted in a memory leak of both types of object. This also likely resulted in extra memory used, and extra time searching, in non-eval cases. For simplicity in this commit I always allocate a CI object inside rb_vm_ci_lookup, but ideally we would lazily allocate it only when needed. I hope to do that as a follow up in the future.
2024-02-21	Introduce NODE_REGX to manage regexp literal	yui-knk

2024-02-20	[Feature #20257] Rearchitect Ripper	yui-knk
	Introduce another semantic value stack for Ripper so that Ripper can manage both Node and Ruby Object separately. This rearchitecture of Ripper solves the following issues, so this commit adds test cases for them:
	* [Bug 10436] https://bugs.ruby-lang.org/issues/10436
	* [Bug 18988] https://bugs.ruby-lang.org/issues/18988
	* [Bug 20055] https://bugs.ruby-lang.org/issues/20055
	Checked that the only differences in `Ripper.sexp` output for files under `/test/ruby` are in test_pattern_matching.rb. The differences come from differences between the `new_hash_pattern_tail` functions of the parser and Ripper: Ripper's `new_hash_pattern_tail` didn't call `assignable`, so `kw_rest_arg` wasn't marked as a local variable. This is also fixed by this commit.
```
--- a/./tmp/before/test_pattern_matching.rb
+++ b/./tmp/after/test_pattern_matching.rb
@@ -3607,7 +3607,7 @@
 [:in,
 [:hshptn, nil, [], [:var_field, [:@ident, "a", [984, 13]]]],
 [[:binary,
- [:vcall, [:@ident, "a", [985, 10]]],
+ [:var_ref, [:@ident, "a", [985, 10]]],
 :==,
 [:hash, nil]]],
 nil]]],
@@ -3662,7 +3662,7 @@
 [:in,
 [:hshptn, nil, [], [:var_field, [:@ident, "a", [993, 13]]]],
 [[:binary,
- [:vcall, [:@ident, "a", [994, 10]]],
+ [:var_ref, [:@ident, "a", [994, 10]]],
 :==,
 [:hash,
 [:assoclist_from_args,
@@ -3813,7 +3813,7 @@
 [:command,
 [:@ident, "raise", [1022, 10]],
 [:args_add_block,
- [[:vcall, [:@ident, "b", [1022, 16]]]],
+ [[:var_ref, [:@ident, "b", [1022, 16]]]],
 false]]],
 [:else, [[:var_ref, [:@kw, "true", [1024, 10]]]]]]]],
 nil,
@@ -3876,7 +3876,7 @@
 [:@int, "0", [1033, 15]]],
 :"&&",
 [:binary,
- [:vcall, [:@ident, "b", [1033, 20]]],
+ [:var_ref, [:@ident, "b", [1033, 20]]],
 :==,
 [:hash, nil]]]],
 nil]]],
@@ -3946,7 +3946,7 @@
 [:@int, "0", [1042, 15]]],
 :"&&",
 [:binary,
- [:vcall, [:@ident, "b", [1042, 20]]],
+ [:var_ref, [:@ident, "b", [1042, 20]]],
 :==,
 [:hash,
 [:assoclist_from_args,
@@ -5206,7 +5206,7 @@
 [[:assoc_new,
 [:@label, "c:", [1352, 22]],
 [:@int, "0", [1352, 25]]]]]],
- [:vcall, [:@ident, "r", [1352, 29]]]],
+ [:var_ref, [:@ident, "r", [1352, 29]]]],
 false]]],
 [:binary,
 [:call,
@@ -5299,7 +5299,7 @@
 [:assoc_new,
 [:@label, "c:", [1367, 34]],
 [:@int, "0", [1367, 37]]]]]],
- [:vcall, [:@ident, "r", [1367, 41]]]],
+ [:var_ref, [:@ident, "r", [1367, 41]]]],
 false]]],
 [:binary,
 [:call,
@@ -5931,7 +5931,7 @@
 [:in,
 [:hshptn, nil, [], [:var_field, [:@ident, "r", [1533, 11]]]],
 [[:binary,
- [:vcall, [:@ident, "r", [1534, 8]]],
+ [:var_ref, [:@ident, "r", [1534, 8]]],
 :==,
 [:hash,
 [:assoclist_from_args,
```
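The `:vcall` versus `:var_ref` distinction in the diff can be seen on much smaller inputs. The following is a minimal sketch using the standard `ripper` library; it does not reproduce the hash-pattern case from the commit, only the general rule that an identifier known to be a local variable is reported as `:var_ref`.

```ruby
require 'ripper'

# A bare identifier with no prior assignment could be a method call,
# so Ripper reports it as :vcall.
unknown = Ripper.sexp("a")

# After an assignment the parser knows `a` is a local variable,
# so the later reference is reported as :var_ref.
known = Ripper.sexp("a = 1; a")

has_vcall = unknown.flatten.include?(:vcall)
has_var_ref = known.flatten.include?(:var_ref)

puts has_vcall
puts has_var_ref
```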
2024-02-19	[Bug #20280] Check by `rb_parser_enc_str_coderange`	Nobuyoshi Nakada
	Co-authored-by: Yuichiro Kaneko <spiketeika@gmail.com>
2024-02-19	[Bug #20280] Raise SyntaxError on invalid encoding symbol	Nobuyoshi Nakada

2024-02-14	Move rb_class_allocate_instance from gc.c to object.c	Peter Zhu

2024-02-13	Specialize String#byteslice(a, b) (#9939)	Aaron Patterson
	* Specialize String#byteslice(a, b)
	This adds a specialization for String#byteslice when there are two parameters. This makes our protobuf parser go from 5.84x slower to 5.33x slower
```
Comparison:
decode upstream (53738 bytes): 7228.5 i/s
decode protobuff (53738 bytes): 1236.8 i/s - 5.84x slower

Comparison:
decode upstream (53738 bytes): 7024.8 i/s
decode protobuff (53738 bytes): 1318.5 i/s - 5.33x slower
```
	* Update yjit/src/codegen.rs
	---------
	Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>
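As a reminder of the call shape being specialized, here is a small sketch of the two-argument form of `String#byteslice`, which takes a byte offset and a byte length. The sample string is arbitrary and chosen to contain a multibyte character.

```ruby
# "\u00E9" (e with acute accent) occupies two bytes in UTF-8,
# so this 5-character string is 6 bytes long.
s = "h\u00E9llo"
total_bytes = s.bytesize

# byteslice(start, length) works in bytes, not characters.
first_byte = s.byteslice(0, 1)   # the single-byte "h"
accented   = s.byteslice(1, 2)   # the two bytes of the accented character
tail       = s.byteslice(3, 3)   # the remaining "llo"

# A start offset past the end of the string yields nil.
out_of_range = s.byteslice(10, 2)

p total_bytes, first_byte, accented, tail, out_of_range
```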
2024-02-12	proc.c: get rid of `CLONESETUP`	Jean Boussier
	[Bug #20253] All the way down to Ruby 1.9, `Proc`, `Method`, `UnboundMethod` and `Binding` always had their own specific clone and dup routine. This caused various discrepancies with how other objects behave on `dup` and `clone`. [Bug #20250], [Bug #20253]. This commit gets rid of `CLONESETUP` and uses the same codepath as all other types, to ensure consistency. NB: It's still not accepting the `freeze` keyword argument on `clone`. Co-Authored-By: Étienne Barrié <etienne.barrie@gmail.com>
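For reference, this is a sketch of how ordinary objects behave on `dup` and `clone`, which is the baseline the commit message compares against. It deliberately uses a plain `Object` rather than a `Proc`, since the exact `Proc` behavior depends on the Ruby version.

```ruby
obj = Object.new

# Give the object a singleton method, then freeze it.
def obj.greet
  "hi"
end
obj.freeze

cloned = obj.clone
duped  = obj.dup

# clone keeps the frozen state and the singleton methods.
clone_frozen = cloned.frozen?
clone_greets = cloned.respond_to?(:greet)

# dup copies the object's state but drops both.
dup_frozen = duped.frozen?
dup_greets = duped.respond_to?(:greet)

# clone(freeze: false) returns an unfrozen copy of a frozen object.
thawed_frozen = obj.clone(freeze: false).frozen?

p clone_frozen, clone_greets, dup_frozen, dup_greets, thawed_frozen
```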
2024-02-09	Remove ruby object from string nodes	yui-knk
	String nodes hold a Ruby string object in `VALUE nd_lit`. This commit changes it to `struct rb_parser_string *string` to reduce the dependency on Ruby objects. These strings are sometimes concatenated with other strings, so string concatenation functions are needed.
2024-02-05	Make io_fwrite safe for compaction	Peter Zhu
	[Bug #20169] Embedded strings are not safe for system calls without the GVL because compaction can cause pages to be locked, causing the operation to fail with EFAULT. This commit changes io_fwrite to use rb_str_tmp_frozen_no_embed_acquire, which guarantees that the returned string is not embedded.
2024-02-01	Parenthesize casted argument	Nobuyoshi Nakada

2024-01-31	Introduced `rb_node_const_decl_val` function	S.H
	Introduce `rb_node_const_decl_val` function to allow `rb_ary_join` and `rb_ary_reverse` functions to be removed from Universal Parser.
2024-01-30	Use `UNDEF_P`	Nobuyoshi Nakada

2024-01-27	Introduce `NODE_ENCODING`	S.H
	`__ENCODING__` was managed by `NODE_LIT` with an Encoding object. Introduce `NODE_ENCODING` so that 1. `__ENCODING__` is detectable from the AST Node 2. Dependency on Ruby objects is reduced in parse.y
2024-01-24	Define `IO_WITHOUT_GVL` macro	Nobuyoshi Nakada

2024-01-23	Make lastline and nextline to be rb_parser_string	yui-knk
	This commit changes `struct parser_params` lastline and nextline from `VALUE` (String object) to `rb_parser_string_t *` so that dependency on Ruby Object is reduced. `parser_string_buffer_t string_buffer` is added to `struct parser_params` to manage `rb_parser_string_t` pointers of each line. All allocated line strings are freed in `rb_ruby_parser_free`.
2024-01-19	Mark asan fake stacks during machine stack marking	KJ Tsanaktsidis
	ASAN leaves a pointer to the fake frame on the stack; we can use the __asan_addr_is_in_fake_stack API to work out the extent of the fake stack and thus mark any VALUEs contained therein. [Bug #20001]
2024-01-19	Define special macros for asan/msan being enabled	KJ Tsanaktsidis
	__has_feature is a clang-ism, and GCC has a different way to tell if sanitizers are enabled. For this reason, I don't want to spray __has_feature all over the codebase for other places where conditional compilation based on sanitizers is required. [Bug #20001]
2024-01-19	Make stack bounds detection work with ASAN	KJ Tsanaktsidis
	Where a local variable is used as part of the stack bounds detection, it has to actually be on the stack. ASAN can put local variables on "fake stacks", however, with addresses in different memory mappings. This completely destroys the stack bounds calculation, and can lead to e.g. things not getting GC marked on the machine stack or stack overflow checks that always fail. The __asan_addr_is_in_fake_stack helper can be used to get the _real_ stack address of such variables, and thus perform the stack size calculation properly. [Bug #20001]
2024-01-12	s/SafeStringValue/StringValue/	Xavier Noria
	The macro SafeStringValue() became just StringValue() in c5c05460ac2, and it is deprecated nowadays. This patch replaces remaining macro usage. Some occurrences are left in ext/stringio and ext/win32ole, they should be fixed upstream. The macro itself is not deleted, because it may be used in extensions.
2024-01-12	Statically allocate parser config	yui-knk

2024-01-12	Revert "Pass down "stack start" variables from closer to the top of the stack"	KJ Tsanaktsidis
	This reverts commit 4ba8f0dc993953d3ddda6328e3ef17a2fc2cbde5.
2024-01-12	Revert "Make stack bounds detection work with ASAN"	KJ Tsanaktsidis
	This reverts commit 6185cfdf38e26026c6d38220eeca48689e54cdcf.
2024-01-12	Revert "Define special macros for asan/msan being enabled"	KJ Tsanaktsidis
	This reverts commit bdafad879093ef16a9a649154c4b2e4ebf492656.
2024-01-12	Revert "Mark asan fake stacks during machine stack marking"	KJ Tsanaktsidis
	This reverts commit d10bc3a2b8300cffc383e10c3730871e851be24c.
2024-01-12	Mark asan fake stacks during machine stack marking	KJ Tsanaktsidis
	ASAN leaves a pointer to the fake frame on the stack; we can use the __asan_addr_is_in_fake_stack API to work out the extent of the fake stack and thus mark any VALUEs contained therein. [Bug #20001]
2024-01-12	Define special macros for asan/msan being enabled	KJ Tsanaktsidis
	__has_feature is a clang-ism, and GCC has a different way to tell if sanitizers are enabled. For this reason, I don't want to spray __has_feature all over the codebase for other places where conditional compilation based on sanitizers is required. [Bug #20001]
2024-01-12	Make stack bounds detection work with ASAN	KJ Tsanaktsidis
	Where a local variable is used as part of the stack bounds detection, it has to actually be on the stack. ASAN can put local variables on "fake stacks", however, with addresses in different memory mappings. This completely destroys the stack bounds calculation, and can lead to e.g. things not getting GC marked on the machine stack or stack overflow checks that always fail. The __asan_addr_is_in_fake_stack helper can be used to get the _real_ stack address of such variables, and thus perform the stack size calculation properly. [Bug #20001]
2024-01-12	Pass down "stack start" variables from closer to the top of the stack	KJ Tsanaktsidis
	The implementation of `native_thread_init_stack` for the various threading models can use the address of a local variable as part of the calculation of the machine stack extents:
	* pthreads uses it as a lower-bound on the start of the stack, because glibc (and maybe other libcs) can store its own data on the stack before calling into user code on thread creation.
	* win32 uses it as an argument to VirtualQuery, which gets the extent of the memory mapping which contains the variable
	However, the local being used for this is actually allocated _inside_ the `native_thread_init_stack` frame; that means the caller might allocate a VALUE on the stack that actually lies outside the bounds stored in machine.stack_{start,end}. A local variable from one level above the topmost frame that stores VALUEs on the stack must be drilled down into the call to `native_thread_init_stack` to be used in the calculation. This probably doesn't _really_ matter for the win32 case (they'll be in the same memory mapping so VirtualQuery should return the same thing), but definitely could matter for the pthreads case. [Bug #20001]
2024-01-11	Free environ when RUBY_FREE_AT_EXIT	Peter Zhu
	The environ is malloc'd, so it gets reported as a memory leak. This commit adds ruby_free_proctitle which frees it during shutdown when RUBY_FREE_AT_EXIT is set.
	STACK OF 1 INSTANCE OF 'ROOT LEAK: <calloc in ruby_init_setproctitle>':
	5 dyld 0x18b7090e0 start + 2360
	4 ruby 0x10000e3a8 main + 100 main.c:58
	3 ruby 0x1000b4dfc ruby_options + 180 eval.c:121
	2 ruby 0x1001c5f70 ruby_process_options + 200 ruby.c:3014
	1 ruby 0x10035c9fc ruby_init_setproctitle + 76 setproctitle.c:105
	0 libsystem_malloc.dylib 0x18b8c7b78 _malloc_zone_calloc_instrumented_or_legacy + 100
2024-01-11	Fix crash when printing RGENGC_DEBUG=5 output from GC	KJ Tsanaktsidis
	I was trying to debug an (unrelated) issue in the GC, and wanted to turn on the trace-level GC output by compiling it with -DRGENGC_DEBUG=5. Unfortunately, this actually causes a crash in newobj_init() because the code there tries to log the obj_info() of the newly created object. However, the object is not actually sufficiently set up for some of the things that obj_info() tries to do:
	* The instance variable table for a class is not yet initialized, and when using variable-length RVALUES, said ivar table is embedded in as-yet uninitialized memory after the struct RValue. Attempting to read this, as obj_info() does, causes a crash.
	* T_DATA variables need to dereference their ->type field to print out the underlying C type name, which is not set up until newobj_fill() is called.
	To fix this, create a new method `obj_info_basic`, which dumps out only the parts of the object that are valid before the object is fully initialized. [Fixes #18795]
2024-01-09	Introduce NODE_SYM to manage symbol literal	yui-knk
	`:sym` was managed by `NODE_LIT` with a `Symbol` object. This commit introduces `NODE_SYM` so that 1. Symbol literals are detectable from the AST Node 2. Dependency on Ruby objects is reduced
2024-01-08	Change numeric node value functions argument to `NODE *`	yui-knk
	Change the argument to align with other node value functions like `rb_node_line_lineno_val`.
2024-01-07	Introduce Numeric Node's	S-H-GAMELINKS

2024-01-05	Do not `poll` first	Koichi Sasada
	Before this patch, the MN scheduler waited for I/O with the following steps:
	1. `poll(fd, timeout=0)` to check whether the fd is ready.
	2. If the fd is not ready, wait with the MN thread scheduler.
	3. Call `func` to issue the blocking I/O call.
	The advantage of calling `poll()` in advance is that we can wait for I/O readiness on any fd. However, `poll()` becomes overhead for fds that are already ready.
	This patch changes the steps to:
	1. Call `func` to issue the blocking I/O call.
	2. If `func` returns `EWOULDBLOCK`, the fd is `O_NONBLOCK` and we need to wait until the fd is ready, so wait with the MN thread scheduler.
	In this case, we can wait only for `O_NONBLOCK` fds. Otherwise it waits with blocking operations such as the `read()` system call. However, we no longer need to call `poll()` in advance to check whether the fd is ready.
	With this patch we can observe a performance improvement on a microbenchmark that repeats blocking I/O (not an `O_NONBLOCK` fd) with and without the MN thread scheduler.
```ruby
require 'benchmark'

f = open('/dev/null', 'w')
f.sync = true

TN = 1
N = 1_000_000 / TN

Benchmark.bm{|x|
  x.report{
    TN.times.map{
      Thread.new{
        N.times{f.print '.'}
      }
    }.each(&:join)
  }
}
__END__
TN = 1
                 user     system      total        real
ruby32       0.393966   0.101122   0.495088 (  0.495235)
ruby33       0.493963   0.089521   0.583484 (  0.584091)
ruby33+MN    0.639333   0.200843   0.840176 (  0.840291) <- Slow
this+MN      0.512231   0.099091   0.611322 (  0.611074) <- Good
```
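The "try the call first, wait only on `EWOULDBLOCK`" ordering can be sketched at the Ruby level. This is an illustration of the idea only, not the scheduler's C implementation: the helper name `read_trying_first` is made up, and `IO.select` stands in for waiting with the MN thread scheduler.

```ruby
# Hypothetical helper: attempt the read immediately and wait for
# readiness only when the kernel reports that the call would block.
def read_trying_first(io, maxlen)
  loop do
    # With exception: false, read_nonblock returns :wait_readable
    # instead of raising when the read would block.
    result = io.read_nonblock(maxlen, exception: false)
    return result unless result == :wait_readable

    # Only now do we pay for a readiness wait (a stand-in for the scheduler).
    IO.select([io])
  end
end

reader, writer = IO.pipe
writer.write("ready")

# Data is already available, so the first attempt succeeds without any wait.
data = read_trying_first(reader, 5)
puts data

writer.close
reader.close
```

When the fd is already readable, the loop body runs once and never reaches `IO.select`, which mirrors the overhead the commit removes.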
2024-01-02	Introduce NODE_FILE	yui-knk
	`__FILE__` was managed by `NODE_STR` with a `String` object. This commit introduces `NODE_FILE` and `struct rb_parser_string` so that 1. `__FILE__` is detectable from the AST Node 2. Dependency on Ruby objects is reduced
2023-12-29	Introduce NODE_LINE	yui-knk
	`__LINE__` was managed by `NODE_LIT` with an `Integer` object. This commit introduces `NODE_LINE` so that 1. `__LINE__` is detectable from the AST Node 2. Dependency on Ruby objects is reduced
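The keyword literals touched by these parser commits evaluate to ordinary Ruby objects, which is why the parser previously stored them as object-backed nodes. A short sketch of the Ruby-visible values (it says nothing about the internal node layout):

```ruby
# Each keyword literal evaluates to an object of a core class.
line = __LINE__      # Integer: the current source line number
file = __FILE__      # String: the current source file name
enc  = __ENCODING__  # Encoding: the source encoding of this file

p line.class, file.class, enc.class
```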
2023-12-25	Move internal ST functions to internal/st.h	Peter Zhu
	st_replace and st_init_existing_table_with_size are functions used internally in Ruby and should not be publicly visible.
2023-12-20	Correct free_on_exit env var to free_at_exit	HParker

2023-12-20	declare `rb_thread_io_blocking_call`	Koichi Sasada

2023-12-19	Set m_tbl right after allocation	Peter Zhu
	We should set the m_tbl right after allocation, before anything that can trigger GC, to prevent clone_p from becoming old and needing to fire write barriers. Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org>