Diffstat (limited to 'doc/extension.rdoc')
| -rw-r--r-- | doc/extension.rdoc | 308 |
1 files changed, 253 insertions, 55 deletions
diff --git a/doc/extension.rdoc b/doc/extension.rdoc
index c0360ae625..9fc507706e 100644
--- a/doc/extension.rdoc
+++ b/doc/extension.rdoc
@@ -1,5 +1,7 @@
 # extension.rdoc - -*- RDoc -*- created at: Mon Aug 7 16:45:54 JST 1995
 
+{日本語}[rdoc-ref:extension.ja.rdoc]
+
 = Creating extension libraries for Ruby
 
 This document explains how to make extension libraries for Ruby.
@@ -315,11 +317,11 @@ rb_ary_aref(int argc, const VALUE *argv, VALUE ary) ::
 
 rb_ary_entry(VALUE ary, long offset) ::
 
-  \ary[offset]
+  ary\[offset]
 
 rb_ary_store(VALUE ary, long offset, VALUE obj) ::
 
-  \ary[offset] = obj
+  ary\[offset] = obj
 
 rb_ary_subseq(VALUE ary, long beg, long len) ::
 
@@ -747,18 +749,68 @@ RUBY_TYPED_WB_PROTECTED ::
   barriers in all implementations of methods of that object as
   appropriate. Otherwise Ruby might crash while running.
 
-  More about write barriers can be found in "Generational GC" in
-  Appendix D.
+  More about write barriers can be found in {Generational
+  GC}[rdoc-ref:@Appendix+D.+Generational+GC].
 
 RUBY_TYPED_FROZEN_SHAREABLE ::
 
-  This flag indicates that the object is shareable object
-  if the object is frozen. See Appendix F more details.
+  This flag indicates that the object is shareable object if the object
+  is frozen. See {Ractor support}[rdoc-ref:@Appendix+F.+Ractor+support]
+  more details.
 
   If this flag is not set, the object can not become a shareable
   object by Ractor.make_shareable() method.
 
-You can allocate and wrap the structure in one step.
+RUBY_TYPED_EMBEDDABLE ::
+
+  This flag indicates that Ruby may store the C struct inside the object
+  slot, rather than allocate it separately with +malloc+.
+  However, it is not a guarantee. Ruby may decide not to embed the object.
+  For instance if it's too large to fit into one of the available slot sizes.
+
+  Embedding the C struct inside the object slot reduces pointer chasing,
+  malloc overhead, and improves sweep performance.
+  In some cases, it can also reduce the memory footprint of the object.
+
+  To be embeddable, types must abide by some restrictions:
+
+  * Pointers to the C struct, or into the C struct, MUST NOT be stored,
+    as they become invalid when GC compaction occurs.
+    It is however valid to pass and use such pointers for as long as the Ruby
+    object remains on the stack.
+
+    In a sense, this is similar to the restrictions of a stack allocated struct.
+
+    The +RB_GC_GUARD+ macro must be used to ensure the object is not moved by
+    compaction and not freed, unless the object is passed directly as an
+    argument from Ruby to C, i.e. as a parameter of a function used with
+    +rb_define_method+ and similar.
+
+  * The +DATA_PTR+ and +RTYPEDDATA_DATA+ macro can't be used.
+    Only +RTYPEDDATA_GET_DATA+` or +TypedData_Get_Struct+ macros can be used
+    with embeddable objects.
+    Accessing `RDATA(obj)->data` or `RTYPEDDATA(obj)->data` is invalid too.
+
+  * The +dfree+ function MUST NOT free the C struct itself.
+    Setting +dfree+ to +RUBY_DEFAULT_FREE+ is fine.
+    To support older Ruby versions without this feature, you can
+    conditionally free the C struct if +RUBY_TYPED_EMBEDDABLE+ isn't defined.
+
+  * The type must have the +RUBY_TYPED_FREE_IMMEDIATELY+ flag set.
+
+  If the embedded C struct is of variable size, +rb_data_typed_object_zalloc+
+  can be used instead of +TypedData_Make_Struct+.
+
+  See {Embedded TypedData}[rdoc-ref:@Appendix+G.+Embedded+TypedData] for a
+  commented example of how to use +RUBY_TYPED_EMBEDDABLE+.
+
+
+Note that this macro can raise an exception. If sval to be wrapped
+holds a resource needs to be released (e.g., allocated memory, handle
+from an external library, and etc), you will have to use rb_protect.
+
+You can allocate and wrap the structure in one step, in more
+preferable manner.
 
   TypedData_Make_Struct(klass, type, data_type, sval)
 
@@ -767,10 +819,71 @@ the structure, which is also allocated. This macro works like:
 
   (sval = ZALLOC(type), TypedData_Wrap_Struct(klass, data_type, sval))
 
+However, you should use this macro instead of "allocation then wrap"
+like the above code if it is simply allocated, because the latter can
+raise a NoMemoryError and sval will be memory leaked in that case.
+
 Arguments klass and data_type work like their counterparts in
 TypedData_Wrap_Struct(). A pointer to the allocated structure will be
 assigned to sval, which should be a pointer of the type specified.
 
+==== Declaratively marking/compacting struct references
+
+In the case where your struct refers to Ruby objects that are simple values,
+not wrapped in conditional logic or complex data structures an alternative
+approach to marking and reference updating is provided, by declaring offset
+references to the VALUES in your struct.
+
+Doing this allows the Ruby GC to support marking these references and GC
+compaction without the need to define the +dmark+ and +dcompact+ callbacks.
+
+You must define a static list of VALUE pointers to the offsets within your
+struct where the references are located, and set the "data" member to point to
+this reference list. The reference list must end with +RUBY_END_REFS+.
+
+Some Macros have been provided to make edge referencing easier:
+
+* <code>RUBY_TYPED_DECL_MARKING</code> =A flag that can be set on the +ruby_data_type_t+ to indicate that references are being declared as edges.
+
+* <code>RUBY_REFERENCES(ref_list_name)</code> - Define _ref_list_name_ as a list of references
+
+* <code>RUBY_REF_END</code> - The end mark of the references list.
+
+* <code>RUBY_REF_EDGE(struct, member)</code> - Declare _member_ as a VALUE edge from _struct_. Use this after +RUBY_REFERENCES_START+
+
+* +RUBY_REFS_LIST_PTR+ - Coerce the reference list into a format that can be
+  accepted by the existing +dmark+ interface.
+
+The example below is from Dir (defined in +dir.c+)
+
+    // The struct being wrapped. Notice this contains 3 members of which the second
+    // is a VALUE reference to another ruby object.
+    struct dir_data {
+        DIR *dir;
+        const VALUE path;
+        rb_encoding *enc;
+    }
+
+    // Define a reference list `dir_refs` containing a single entry to `path`.
+    // Needs terminating with RUBY_REF_END
+    RUBY_REFERENCES(dir_refs) = {
+        RUBY_REF_EDGE(dir_data, path),
+        RUBY_REF_END
+    };
+
+    // Override the "dmark" field with the defined reference list now that we
+    // no longer need a marking callback and add RUBY_TYPED_DECL_MARKING to the
+    // flags field
+    static const rb_data_type_t dir_data_type = {
+        "dir",
+        {RUBY_REFS_LIST_PTR(dir_refs), dir_free, dir_memsize,},
+        0, NULL, RUBY_TYPED_WB_PROTECTED | RUBY_TYPED_FREE_IMMEDIATELY | RUBY_TYPED_DECL_MARKING
+    };
+
+Declaring simple references declaratively in this manner allows the GC to both
+mark, and move the underlying object, and automatically update the reference to
+it during compaction.
+
 ==== Ruby object to C struct
 
 To retrieve the C pointer from the T_DATA object, use the macro
@@ -1980,7 +2093,7 @@ the <code>*_kw</code> functions introduced in Ruby 2.7.
   #define rb_proc_call_with_block_kw(p, c, v, b, kw) rb_proc_call_with_block(p, c, v, b)
   #define rb_method_call_kw(c, v, m, kw) rb_method_call(c, v, m)
   #define rb_method_call_with_block_kw(c, v, m, b, kw) rb_method_call_with_block(c, v, m, b)
-  #define rb_eval_cmd_kwd(c, a, kw) rb_eval_cmd(c, a, 0)
+  #define rb_eval_cmd_kw(c, a, kw) rb_eval_cmd(c, a, 0)
   #endif
 
 == Appendix C. Functions available for use in extconf.rb
@@ -2152,70 +2265,155 @@ Ractor safety around C extensions has the following properties:
 
 To make a "Ractor-safe" C extension, we need to check the following points:
 
-(1) Do not share unshareable objects between ractors
+1. Do not share unshareable objects between ractors
+
+   For example, C's global variable can lead sharing an unshareable objects
+   between ractors.
 
-For example, C's global variable can lead sharing an unshareable objects
-between ractors.
+     VALUE g_var;
+     VALUE set(VALUE self, VALUE v){ return g_var = v; }
+     VALUE get(VALUE self){ return g_var; }
 
-  VALUE g_var;
-  VALUE set(VALUE self, VALUE v){ return g_var = v; }
-  VALUE get(VALUE self){ return g_var; }
+   set() and get() pair can share an unshareable objects using g_var, and
+   it is Ractor-unsafe.
 
-set() and get() pair can share an unshareable objects using g_var, and
-it is Ractor-unsafe.
+   Not only using global variables directly, some indirect data structure
+   such as global st_table can share the objects, so please take care.
 
-Not only using global variables directly, some indirect data structure
-such as global st_table can share the objects, so please take care.
+   Note that class and module objects are shareable objects, so you can
+   keep the code "cFoo = rb_define_class(...)" with C's global variables.
 
-Note that class and module objects are shareable objects, so you can
-keep the code "cFoo = rb_define_class(...)" with C's global variables.
+2. Check the thread-safety of the extension
 
-(2) Check the thread-safety of the extension
+   An extension should be thread-safe. For example, the following code is
+   not thread-safe:
 
-An extension should be thread-safe. For example, the following code is
-not thread-safe:
+     bool g_called = false;
+     VALUE call(VALUE self) {
+       if (g_called) rb_raise("recursive call is not allowed.");
+       g_called = true;
+       VALUE ret = do_something();
+       g_called = false;
+       return ret;
+     }
 
-  bool g_called = false;
-  VALUE call(VALUE self) {
-    if (g_called) rb_raise("recursive call is not allowed.");
-    g_called = true;
-    VALUE ret = do_something();
-    g_called = false;
-    return ret;
+   because g_called global variable should be synchronized by other
+   ractor's threads. To avoid such data-race, some synchronization should
+   be used. Check include/ruby/thread_native.h and include/ruby/atomic.h.
+
+   With Ractors, all objects given as method parameters and the receiver (self)
+   are guaranteed to be from the current Ractor or to be shareable. As a
+   consequence, it is easier to make code ractor-safe than to make code generally
+   thread-safe. For example, we don't need to lock an array object to access the
+   element of it.
+
+3. Check the thread-safety of any used library
+
+   If the extension relies on an external library, such as a function foo() from
+   a library libfoo, the function libfoo foo() should be thread safe.
+
+4. Make an object shareable
+
+   This is not required to make an extension Ractor-safe.
+
+   If an extension provides special objects defined by rb_data_type_t,
+   consider these objects can become shareable or not.
+
+   RUBY_TYPED_FROZEN_SHAREABLE flag indicates that these objects can be
+   shareable objects if the object is frozen. This means that if the object
+   is frozen, the mutation of wrapped data is not allowed.
+
+5. Others
+
+   There are possibly other points or requirements which must be considered in the
+   making of a Ractor-safe extension. This document will be extended as they are
+   discovered.
+
+== Appendix G. Embedded TypedData
+
+Here is an example of how to use +RUBY_TYPED_EMBEDDABLE+::
+
+  struct my_data {
+    struct timespec created_at;
+    size_t buffer_capa;
+    char *buffer;
+  };
+
+  static void
+  my_data_free(void *ptr)
+  {
+    struct my_data *data = (struct my_data *)ptr;
+
+    // Deliberately don't free `ptr` if it is embeddable.
+    // Only auxiliary memory need to be freed.
+    ruby_xfree(data->buffer);
   }
 
-because g_called global variable should be synchronized by other
-ractor's threads. To avoid such data-race, some synchronization should
-be used. Check include/ruby/thread_native.h and include/ruby/atomic.h.
+  static size_t
+  my_data_size(const void *ptr)
+  {
+    const struct my_data *data = (const struct my_data *)ptr;
+    // We don't need to account for `sizeof(struct my_struct)` because it is embedded inside the Ruby object.
+    // Only auxiliary memory need to be reported.
+    return data->buffer_capa;
+  }
 
-With Ractors, all objects given as method parameters and the receiver (self)
-are guaranteed to be from the current Ractor or to be shareable. As a
-consequence, it is easier to make code ractor-safe than to make code generally
-thread-safe. For example, we don't need to lock an array object to access the
-element of it.
+  static const rb_data_type_t my_type = {
+    .wrap_struct_name = "my_type",
+    .function = {
+      .dfree = my_data_free,
+      .dsize = my_data_size,
+    }
+    .flags = RUBY_TYPED_FREE_IMMEDIATELY | RUBY_TYPED_EMBEDDABLE,
+  };
 
-(3) Check the thread-safety of any used library
+  static VALUE
+  my_data_alloc(VALUE klass)
+  {
+    struct my_data *data;
+    VALUE obj = TypedData_Make_Struct(klass, struct my_data, &my_type, data);
 
-If the extension relies on an external library, such as a function foo() from
-a library libfoo, the function libfoo foo() should be thread safe.
+    // Is it fine to pass pointers into the embedded struct, for as long as
+    // the called function won't use it after the Ruby object have left the stack.
+    clock_gettime(CLOCK_REALTIME, &data->created_at);
+    data->buffer_capa = 1024;
+    data->buffer = ZALLOC_N(char, data->buffer_capa);
 
-(4) Make an object shareable
+    return obj
+  }
 
-This is not required to make an extension Ractor-safe.
+  static VALUE
+  my_data_m_parse(VALUE klass)
+  {
+    struct my_data *data;
+    VALUE my_data_obj = my_data_alloc(klass);
+    TypedData_Get_Struct(obj, struct my_data, &my_type, data);
 
-If an extension provides special objects defined by rb_data_type_t,
-consider these objects can become shareable or not.
+    // `my_data_obj` was allocated from C, `RB_GC_GUARD` must be used to
+    // ensure the compiler will keep its reference on the stack.
+    RB_GC_GUARD(my_data_obj)
+  }
 
-RUBY_TYPED_FROZEN_SHAREABLE flag indicates that these objects can be
-shareable objects if the object is frozen. This means that if the object
-is frozen, the mutation of wrapped data is not allowed.
+  static VALUE
+  my_data_read(VALUE self)
+  {
+    struct my_data *data;
+    TypedData_Get_Struct(obj, struct my_data, &my_type, data);
 
-(5) Others
+    // `self` is received from `rb_define_method` so `RB_GC_GUARD` isn't necessary.
+    return rb_str_new(data->buffer, data->buffer_capa)
+  }
 
-There are possibly other points or requirements which must be considered in the
-making of a Ractor-safe extension. This document will be extended as they are
-discovered.
+  void
+  Init_my_data(void)
+  {
+    VALUE cMyData = rb_define_class("MyData");
+    rb_define_method(cMyData, "read", my_data_read, 0);
+    rb_define_singleton_method(cMyData, "parse", my_data_m_parse, 0);
+  }
 
-:enddoc: Local variables:
-:enddoc: fill-column: 70
-:enddoc: end:
+--
+Local variables:
+fill-column: 70
+end:
+++
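Reviewer's sketch, not part of the patch: the thread-safety item in the Ractor appendix shows an unsynchronized `g_called` flag and points to include/ruby/atomic.h without showing a fix. The same guard can be illustrated in plain C11, assuming `<stdatomic.h>` is available. `guarded_call` and the `do_something` stub are hypothetical names for illustration only, and standard atomics are swapped in here for Ruby's own atomic macros so that the sketch compiles without ruby.h.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the extension's real work. */
static int do_something(void) { return 42; }

/* One flag shared by every thread; test-and-set is a single atomic step,
 * so two threads can never both observe "not called" and proceed. */
static atomic_flag g_called = ATOMIC_FLAG_INIT;

/* Returns true and stores the result in *out when the call ran.
 * Returns false when another call is already in progress. */
static bool guarded_call(int *out)
{
    assert(out != NULL);
    if (atomic_flag_test_and_set(&g_called)) {
        return false; /* a real extension would raise a Ruby exception here */
    }
    *out = do_something();
    atomic_flag_clear(&g_called);
    return true;
}
```

Unlike the original boolean, this guard rejects concurrent entry from other threads as well as recursive entry, which may or may not be the intended behavior. In an actual extension, if `do_something()` can raise, the flag would also have to be cleared on the exceptional path (for example through `rb_ensure`), which this sketch does not cover.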
