Diffstat (limited to 'doc/extension.rdoc')
-rw-r--r-- doc/extension.rdoc 308
1 files changed, 253 insertions, 55 deletions
diff --git a/doc/extension.rdoc b/doc/extension.rdoc
index c0360ae625..9fc507706e 100644
--- a/doc/extension.rdoc
+++ b/doc/extension.rdoc
@@ -1,5 +1,7 @@
# extension.rdoc - -*- RDoc -*- created at: Mon Aug 7 16:45:54 JST 1995
+{日本語}[rdoc-ref:extension.ja.rdoc]
+
= Creating extension libraries for Ruby
This document explains how to make extension libraries for Ruby.
@@ -315,11 +317,11 @@ rb_ary_aref(int argc, const VALUE *argv, VALUE ary) ::
rb_ary_entry(VALUE ary, long offset) ::
- \ary[offset]
+ ary\[offset]
rb_ary_store(VALUE ary, long offset, VALUE obj) ::
- \ary[offset] = obj
+ ary\[offset] = obj
rb_ary_subseq(VALUE ary, long beg, long len) ::
@@ -747,18 +749,68 @@ RUBY_TYPED_WB_PROTECTED ::
barriers in all implementations of methods of that object as
appropriate. Otherwise Ruby might crash while running.
- More about write barriers can be found in "Generational GC" in
- Appendix D.
+ More about write barriers can be found in {Generational
+ GC}[rdoc-ref:@Appendix+D.+Generational+GC].
RUBY_TYPED_FROZEN_SHAREABLE ::
- This flag indicates that the object is shareable object
- if the object is frozen. See Appendix F more details.
+ This flag indicates that the object is a shareable object if it is
+ frozen. See {Ractor support}[rdoc-ref:@Appendix+F.+Ractor+support]
+ for more details.
If this flag is not set, the object can not become a shareable
object by Ractor.make_shareable() method.
-You can allocate and wrap the structure in one step.
+RUBY_TYPED_EMBEDDABLE ::
+
+ This flag indicates that Ruby may store the C struct inside the object
+ slot, rather than allocate it separately with +malloc+.
+ However, this is not a guarantee: Ruby may decide not to embed the object,
+ for instance if it is too large to fit into one of the available slot sizes.
+
+ Embedding the C struct inside the object slot reduces pointer chasing and
+ malloc overhead, and improves sweep performance.
+ In some cases, it can also reduce the memory footprint of the object.
+
+ To be embeddable, types must abide by some restrictions:
+
+ * Pointers to the C struct, or into the C struct, MUST NOT be stored,
+ as they become invalid when GC compaction occurs.
+ It is however valid to pass and use such pointers for as long as the Ruby
+ object remains on the stack.
+
+ In a sense, this is similar to the restrictions of a stack allocated struct.
+
+ The +RB_GC_GUARD+ macro must be used to ensure the object is not moved by
+ compaction and not freed, unless the object is passed directly as an
+ argument from Ruby to C, i.e. as a parameter of a function used with
+ +rb_define_method+ and similar.
+
+ * The +DATA_PTR+ and +RTYPEDDATA_DATA+ macros can't be used.
+ Only the +RTYPEDDATA_GET_DATA+ or +TypedData_Get_Struct+ macros can be used
+ with embeddable objects.
+ Accessing <code>RDATA(obj)->data</code> or <code>RTYPEDDATA(obj)->data</code> is invalid too.
+
+ * The +dfree+ function MUST NOT free the C struct itself.
+ Setting +dfree+ to +RUBY_DEFAULT_FREE+ is fine.
+ To support older Ruby versions without this feature, you can
+ conditionally free the C struct if +RUBY_TYPED_EMBEDDABLE+ isn't defined.
+
+ * The type must have the +RUBY_TYPED_FREE_IMMEDIATELY+ flag set.
+
+ If the embedded C struct is of variable size, +rb_data_typed_object_zalloc+
+ can be used instead of +TypedData_Make_Struct+.
+
+ See {Embedded TypedData}[rdoc-ref:@Appendix+G.+Embedded+TypedData] for a
+ commented example of how to use +RUBY_TYPED_EMBEDDABLE+.
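The conditional free mentioned in the +dfree+ restriction above can be sketched as follows. This is a minimal illustration that builds without ruby.h: +sketch_xfree+ and +freed_blocks+ are hypothetical stand-ins (a real extension would call +ruby_xfree+), and the only assumption taken from Ruby is that +RUBY_TYPED_EMBEDDABLE+ is visible to the preprocessor when the feature exists.

```c
#include <stdlib.h>

/* Counts released blocks so the behaviour can be observed in this sketch. */
static int freed_blocks = 0;

/* Hypothetical stand-in for ruby_xfree(), so the sketch builds without ruby.h. */
static void
sketch_xfree(void *ptr)
{
    if (ptr) freed_blocks++;
    free(ptr);
}

struct my_data {
    size_t buffer_capa;
    char *buffer;
};

static void
my_data_free(void *ptr)
{
    struct my_data *data = ptr;

    /* Auxiliary memory is always owned by the struct and must be released. */
    sketch_xfree(data->buffer);
#ifndef RUBY_TYPED_EMBEDDABLE
    /* Older Ruby: the struct was allocated separately, so release it too. */
    sketch_xfree(data);
#endif
}
```

Built without ruby.h, the macro is undefined, so both the buffer and the struct are released. With a Ruby that defines the macro, only the buffer would be.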
+
+
+Note that this macro can raise an exception. If the sval to be wrapped
+holds a resource that needs to be released (e.g., allocated memory or a
+handle from an external library), you will have to use rb_protect.
+
+You can allocate and wrap the structure in one step, which is the
+preferable approach.
TypedData_Make_Struct(klass, type, data_type, sval)
@@ -767,10 +819,71 @@ the structure, which is also allocated. This macro works like:
(sval = ZALLOC(type), TypedData_Wrap_Struct(klass, data_type, sval))
+However, if the structure is simply allocated, you should use this macro
+rather than the "allocate then wrap" sequence shown above, because the
+wrapping step can raise a NoMemoryError, in which case sval would be leaked.
+
Arguments klass and data_type work like their counterparts in
TypedData_Wrap_Struct(). A pointer to the allocated structure will
be assigned to sval, which should be a pointer of the type specified.
+==== Declaratively marking/compacting struct references
+
+When your struct refers to Ruby objects as simple values, not wrapped in
+conditional logic or complex data structures, an alternative approach to
+marking and reference updating is available: declaring the offsets of the
+VALUEs in your struct.
+
+Doing this allows the Ruby GC to support marking these references and GC
+compaction without the need to define the +dmark+ and +dcompact+ callbacks.
+
+You must define a static list of the offsets within your struct where the
+VALUE references are located, and set the "dmark" member to point to
+this reference list. The reference list must end with +RUBY_REF_END+.
+
+Some macros have been provided to make edge referencing easier:
+
+* <code>RUBY_TYPED_DECL_MARKING</code> - A flag that can be set on the +rb_data_type_t+ to indicate that references are being declared as edges.
+
+* <code>RUBY_REFERENCES(ref_list_name)</code> - Define _ref_list_name_ as a list of references.
+
+* <code>RUBY_REF_END</code> - The end mark of the references list.
+
+* <code>RUBY_REF_EDGE(struct, member)</code> - Declare _member_ as a VALUE edge from _struct_. Use this inside a list defined with +RUBY_REFERENCES+.
+
+* +RUBY_REFS_LIST_PTR+ - Coerce the reference list into a format that can be
+ accepted by the existing +dmark+ interface.
+
+The example below is from Dir (defined in +dir.c+):
+
+ // The struct being wrapped. Notice this contains 3 members of which the second
+ // is a VALUE reference to another ruby object.
+ struct dir_data {
+ DIR *dir;
+ const VALUE path;
+ rb_encoding *enc;
+ };
+
+ // Define a reference list `dir_refs` containing a single entry to `path`.
+ // Needs terminating with RUBY_REF_END
+ RUBY_REFERENCES(dir_refs) = {
+ RUBY_REF_EDGE(struct dir_data, path),
+ RUBY_REF_END
+ };
+
+ // Override the "dmark" field with the defined reference list now that we
+ // no longer need a marking callback and add RUBY_TYPED_DECL_MARKING to the
+ // flags field
+ static const rb_data_type_t dir_data_type = {
+ "dir",
+ {RUBY_REFS_LIST_PTR(dir_refs), dir_free, dir_memsize,},
+ 0, NULL, RUBY_TYPED_WB_PROTECTED | RUBY_TYPED_FREE_IMMEDIATELY | RUBY_TYPED_DECL_MARKING
+ };
+
+Declaring simple references in this manner allows the GC to mark and move
+the underlying object, and to automatically update the reference to it
+during compaction.
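To make the idea of an offset-based reference list more concrete, here is a plain C sketch of how such a list can be walked. It is not Ruby's implementation: +value_t+, +REF_END+, +struct dir_like+, +each_declared_ref+ and +relocate+ are all hypothetical names chosen for illustration, standing in for +VALUE+, +RUBY_REF_END+ and the GC's internal marking and compaction steps.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for Ruby's VALUE; an assumption made for this sketch only. */
typedef uintptr_t value_t;

/* Hypothetical sentinel playing the role of RUBY_REF_END. */
#define REF_END SIZE_MAX

struct dir_like {
    void *dir;
    value_t path;   /* the only reference to another object */
    void *enc;
};

/* Offsets of the reference members, terminated by the sentinel. */
static const size_t dir_like_refs[] = {
    offsetof(struct dir_like, path),
    REF_END
};

/* Call visit on every declared reference slot; return how many were visited. */
static size_t
each_declared_ref(void *base, const size_t *refs,
                  void (*visit)(value_t *slot, void *ctx), void *ctx)
{
    size_t n = 0;
    for (; *refs != REF_END; refs++, n++) {
        visit((value_t *)((char *)base + *refs), ctx);
    }
    return n;
}

/* Example visitor: rewrite the slot in place, as a compacting collector
 * would after moving the referenced object. */
static void
relocate(value_t *slot, void *ctx)
{
    *slot += *(const value_t *)ctx;
}
```

Because the walker only needs the base pointer and the offset list, the same loop can serve both marking (read each slot) and compaction (rewrite each slot), which is why no per-type callback is required.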
+
==== Ruby object to C struct
To retrieve the C pointer from the T_DATA object, use the macro
@@ -1980,7 +2093,7 @@ the <code>*_kw</code> functions introduced in Ruby 2.7.
#define rb_proc_call_with_block_kw(p, c, v, b, kw) rb_proc_call_with_block(p, c, v, b)
#define rb_method_call_kw(c, v, m, kw) rb_method_call(c, v, m)
#define rb_method_call_with_block_kw(c, v, m, b, kw) rb_method_call_with_block(c, v, m, b)
- #define rb_eval_cmd_kwd(c, a, kw) rb_eval_cmd(c, a, 0)
+ #define rb_eval_cmd_kw(c, a, kw) rb_eval_cmd(c, a, 0)
#endif
== Appendix C. Functions available for use in extconf.rb
@@ -2152,70 +2265,155 @@ Ractor safety around C extensions has the following properties:
To make a "Ractor-safe" C extension, we need to check the following points:
-(1) Do not share unshareable objects between ractors
+1. Do not share unshareable objects between ractors
+
+ For example, a global variable in C can lead to sharing an unshareable
+ object between ractors.
-For example, C's global variable can lead sharing an unshareable objects
-between ractors.
+ VALUE g_var;
+ VALUE set(VALUE self, VALUE v){ return g_var = v; }
+ VALUE get(VALUE self){ return g_var; }
- VALUE g_var;
- VALUE set(VALUE self, VALUE v){ return g_var = v; }
- VALUE get(VALUE self){ return g_var; }
+ The set() and get() pair can share an unshareable object through g_var,
+ which makes it Ractor-unsafe.
-set() and get() pair can share an unshareable objects using g_var, and
-it is Ractor-unsafe.
+ Besides direct use of global variables, indirect data structures such
+ as a global st_table can also share objects, so please take care.
-Not only using global variables directly, some indirect data structure
-such as global st_table can share the objects, so please take care.
+ Note that class and module objects are shareable objects, so you can
+ keep the code "cFoo = rb_define_class(...)" with C's global variables.
-Note that class and module objects are shareable objects, so you can
-keep the code "cFoo = rb_define_class(...)" with C's global variables.
+2. Check the thread-safety of the extension
-(2) Check the thread-safety of the extension
+ An extension should be thread-safe. For example, the following code is
+ not thread-safe:
-An extension should be thread-safe. For example, the following code is
-not thread-safe:
+ bool g_called = false;
+ VALUE call(VALUE self) {
+ if (g_called) rb_raise(rb_eRuntimeError, "recursive call is not allowed.");
+ g_called = true;
+ VALUE ret = do_something();
+ g_called = false;
+ return ret;
+ }
- bool g_called = false;
- VALUE call(VALUE self) {
- if (g_called) rb_raise("recursive call is not allowed.");
- g_called = true;
- VALUE ret = do_something();
- g_called = false;
- return ret;
+ because access to the g_called global variable is not synchronized with
+ other ractors' threads. To avoid such a data race, some synchronization
+ should be used. Check include/ruby/thread_native.h and include/ruby/atomic.h.
+
+ With Ractors, all objects given as method parameters and the receiver (self)
+ are guaranteed to be from the current Ractor or to be shareable. As a
+ consequence, it is easier to make code ractor-safe than to make code generally
+ thread-safe. For example, we don't need to lock an array object to access
+ its elements.
+
+3. Check the thread-safety of any used library
+
+ If the extension relies on an external library, such as a function foo() from
+ a library libfoo, the function foo() in libfoo should be thread-safe.
+
+4. Make an object shareable
+
+ This is not required to make an extension Ractor-safe.
+
+ If an extension provides special objects defined by rb_data_type_t,
+ consider whether these objects can become shareable or not.
+
+ The RUBY_TYPED_FROZEN_SHAREABLE flag indicates that these objects can be
+ shareable objects if they are frozen. This means that if the object
+ is frozen, mutation of the wrapped data is not allowed.
+
+5. Others
+
+ There are possibly other points or requirements which must be considered in the
+ making of a Ractor-safe extension. This document will be extended as they are
+ discovered.
+
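As an illustration of the synchronization mentioned in point 2, the unsynchronized +g_called+ flag could be replaced by an atomic test-and-set. The sketch below uses C11 <code><stdatomic.h></code> rather than the helpers in include/ruby/atomic.h, so that it builds on its own. +try_enter+ and +leave+ are hypothetical helper names, not part of Ruby's API.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Replaces the plain `bool g_called` from the example in point 2. */
static atomic_flag g_called = ATOMIC_FLAG_INIT;

/* Atomically claim the flag. Returns true if the caller may proceed,
 * false if another call is already in progress. */
static bool
try_enter(void)
{
    return !atomic_flag_test_and_set(&g_called);
}

/* Release the flag so that the next call can proceed. */
static void
leave(void)
{
    atomic_flag_clear(&g_called);
}
```

In an extension, the method body would raise when +try_enter+ returns false and call +leave+ once the work is finished. If the work itself can raise, +leave+ would presumably need to run from an +rb_ensure+ handler so that the flag is not left set.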
+== Appendix G. Embedded TypedData
+
+Here is an example of how to use +RUBY_TYPED_EMBEDDABLE+:
+
+ struct my_data {
+ struct timespec created_at;
+ size_t buffer_capa;
+ char *buffer;
+ };
+
+ static void
+ my_data_free(void *ptr)
+ {
+ struct my_data *data = (struct my_data *)ptr;
+
+ // Deliberately don't free `ptr` itself, since it may be embedded.
+ // Only auxiliary memory needs to be freed.
+ ruby_xfree(data->buffer);
}
-because g_called global variable should be synchronized by other
-ractor's threads. To avoid such data-race, some synchronization should
-be used. Check include/ruby/thread_native.h and include/ruby/atomic.h.
+ static size_t
+ my_data_size(const void *ptr)
+ {
+ const struct my_data *data = (const struct my_data *)ptr;
+ // We don't need to account for `sizeof(struct my_data)` because it is embedded inside the Ruby object.
+ // Only auxiliary memory needs to be reported.
+ return data->buffer_capa;
+ }
-With Ractors, all objects given as method parameters and the receiver (self)
-are guaranteed to be from the current Ractor or to be shareable. As a
-consequence, it is easier to make code ractor-safe than to make code generally
-thread-safe. For example, we don't need to lock an array object to access the
-element of it.
+ static const rb_data_type_t my_type = {
+ .wrap_struct_name = "my_type",
+ .function = {
+ .dfree = my_data_free,
+ .dsize = my_data_size,
+ },
+ .flags = RUBY_TYPED_FREE_IMMEDIATELY | RUBY_TYPED_EMBEDDABLE,
+ };
-(3) Check the thread-safety of any used library
+ static VALUE
+ my_data_alloc(VALUE klass)
+ {
+ struct my_data *data;
+ VALUE obj = TypedData_Make_Struct(klass, struct my_data, &my_type, data);
-If the extension relies on an external library, such as a function foo() from
-a library libfoo, the function libfoo foo() should be thread safe.
+ // It is fine to pass pointers into the embedded struct, as long as
+ // the called function won't use them after the Ruby object has left the stack.
+ clock_gettime(CLOCK_REALTIME, &data->created_at);
+ data->buffer_capa = 1024;
+ data->buffer = ZALLOC_N(char, data->buffer_capa);
-(4) Make an object shareable
+ return obj;
+ }
-This is not required to make an extension Ractor-safe.
+ static VALUE
+ my_data_m_parse(VALUE klass)
+ {
+ struct my_data *data;
+ VALUE my_data_obj = my_data_alloc(klass);
+ TypedData_Get_Struct(my_data_obj, struct my_data, &my_type, data);
-If an extension provides special objects defined by rb_data_type_t,
-consider these objects can become shareable or not.
+ // `my_data_obj` was allocated from C, `RB_GC_GUARD` must be used to
+ // ensure the compiler will keep its reference on the stack.
+ RB_GC_GUARD(my_data_obj);
+ return my_data_obj;
+ }
-RUBY_TYPED_FROZEN_SHAREABLE flag indicates that these objects can be
-shareable objects if the object is frozen. This means that if the object
-is frozen, the mutation of wrapped data is not allowed.
+ static VALUE
+ my_data_read(VALUE self)
+ {
+ struct my_data *data;
+ TypedData_Get_Struct(self, struct my_data, &my_type, data);
-(5) Others
+ // `self` is received from `rb_define_method` so `RB_GC_GUARD` isn't necessary.
+ return rb_str_new(data->buffer, data->buffer_capa);
+ }
-There are possibly other points or requirements which must be considered in the
-making of a Ractor-safe extension. This document will be extended as they are
-discovered.
+ void
+ Init_my_data(void)
+ {
+ VALUE cMyData = rb_define_class("MyData", rb_cObject);
+ rb_define_alloc_func(cMyData, my_data_alloc);
+ rb_define_method(cMyData, "read", my_data_read, 0);
+ rb_define_singleton_method(cMyData, "parse", my_data_m_parse, 0);
+ }
-:enddoc: Local variables:
-:enddoc: fill-column: 70
-:enddoc: end:
+--
+Local variables:
+fill-column: 70
+end:
+++