Diffstat (limited to 'doc')
290 files changed, 30224 insertions, 4221 deletions
diff --git a/doc/.document b/doc/.document index d739c9f6bc..3a95c9617b 100644 --- a/doc/.document +++ b/doc/.document @@ -1,3 +1,13 @@ -*.rdoc -NEWS-* +[^_]*.md +[^_]*.rb +[^_]*.rdoc +contributing +distribution +NEWS syntax +optparse +date +rdoc +jit +security +language diff --git a/doc/ChangeLog-0.06_to_0.52 b/doc/ChangeLog/ChangeLog-0.06_to_0.52 index 63826081b3..63826081b3 100644 --- a/doc/ChangeLog-0.06_to_0.52 +++ b/doc/ChangeLog/ChangeLog-0.06_to_0.52 diff --git a/doc/ChangeLog-0.50_to_0.60 b/doc/ChangeLog/ChangeLog-0.50_to_0.60 index 5f5b03ff40..5f5b03ff40 100644 --- a/doc/ChangeLog-0.50_to_0.60 +++ b/doc/ChangeLog/ChangeLog-0.50_to_0.60 diff --git a/doc/ChangeLog-0.60_to_1.1 b/doc/ChangeLog/ChangeLog-0.60_to_1.1 index bd5f140dc3..59d195e780 100644 --- a/doc/ChangeLog-0.60_to_1.1 +++ b/doc/ChangeLog/ChangeLog-0.60_to_1.1 @@ -28,7 +28,7 @@ Fri Aug 8 11:16:50 1997 Yukihiro Matsumoto <matz@netlab.co.jp> Thu Aug 7 11:40:01 1997 Yukihiro Matsumoto <matz@netlab.co.jp> - * variable.c (mod_constants): lists constants defiend in the + * variable.c (mod_constants): lists constants defined in the modules/classes. * variable.c (rb_const_set): no longer warns about constant @@ -49,7 +49,7 @@ Mon Aug 4 11:50:28 1997 Yukihiro Matsumoto <matz@netlab.co.jp> classes (or modules) dynamically. * variable.c (rb_class_path): scan class constants for anonymous - classes/modules to make up pathes. + classes/modules to make up paths. Wed Jul 30 08:45:12 1997 Yukihiro Matsumoto <matz@netlab.co.jp> @@ -76,7 +76,7 @@ Wed Jul 23 09:56:55 1997 Yukihiro Matsumoto <matz@caelum.co.jp> specified object. * class.c (mod_instance_methods): returns list of method names of - the class instnace. + the class instance. Fri Jul 11 22:38:55 1997 Yukihiro Matsumoto <matz@caelum.co.jp> @@ -538,7 +538,7 @@ Wed Mar 12 10:20:30 1997 Yukihiro Matsumoto <matz@caelum.co.jp> Mon Mar 10 20:44:22 1997 Yukihiro Matsumoto <matz@caelum.co.jp> * re.c (reg_regsub): \& for substitution. 
\`, \', and \+ are - avaiable also. + available also. Thu Mar 6 01:47:03 1997 Yukihiro Matsumoto <matz@caelum.co.jp> @@ -3573,7 +3573,7 @@ Fri Mar 17 15:56:44 1995 Yukihiro Matsumoto (matz@ix-02) * dln.c: dlopenのあるマシンではそちらを使うように．ただし,ちゃん と動いているかどうかは自信がない. - * regex.c: virtual concatinationをやめた． + * regex.c: virtual concatenationをやめた． Thu Mar 16 11:32:57 1995 Yukihiro Matsumoto (matz@ix-02) diff --git a/doc/ChangeLog-1.8.0 b/doc/ChangeLog/ChangeLog-1.8.0 index 3f7d6bfb3c..6d9453d011 100644 --- a/doc/ChangeLog-1.8.0 +++ b/doc/ChangeLog/ChangeLog-1.8.0 @@ -13020,7 +13020,7 @@ Tue Jun 12 00:41:18 2001 Yukihiro Matsumoto <matz@ruby-lang.org> Mon Jun 11 14:29:41 2001 WATANABE Hirofumi <eban@ruby-lang.org> - * confgure.in: add RUBY_CANONICAL_BUILD. + * configure.in: add RUBY_CANONICAL_BUILD. Sun Jun 10 17:31:47 2001 Guy Decoux <decoux@moulon.inra.fr> diff --git a/doc/ChangeLog-1.9.3 b/doc/ChangeLog/ChangeLog-1.9.3 index eecfc44325..03a7f3eabf 100644 --- a/doc/ChangeLog-1.9.3 +++ b/doc/ChangeLog/ChangeLog-1.9.3 @@ -5746,7 +5746,7 @@ Wed Mar 2 14:06:01 2011 NARUSE, Yui <naruse@ruby-lang.org> Wed Mar 2 14:02:29 2011 Shota Fukumori <sorah@tubusu.net> * test/testunit/test_parallel.rb(TestParallel#spawn_runner): - Fix outputing empty line in running test. + Fix outputting empty line in running test. * test/testunit/tests_for_parallel/test_third.rb: Remove `sleep` @@ -5765,7 +5765,7 @@ Tue Mar 1 21:48:22 2011 Shota Fukumori <sorah@tubusu.net> * test/testunit/test_parallel.rb(TestParallelWorker#test_quit_in_test): Fix for above specification change. * test/testunit/test_parallel.rb(TestParallel#spawn_runner): - Fix outputing empty line in running test. + Fix outputting empty line in running test.
Tue Mar 1 20:51:57 2011 KOSAKI Motohiro <kosaki.motohiro@gmail.com> @@ -7541,7 +7541,7 @@ Tue Jan 11 20:32:59 2011 Tanaka Akira <akr@fsij.org> Tue Jan 11 13:06:38 2011 NAKAMURA Usaku <usa@ruby-lang.org> - * array.c (rb_ary_resize): should care of embeded array when extending + * array.c (rb_ary_resize): should care of embedded array when extending the array. * array.c (rb_ary_resize): need to set capa when changing the real @@ -9012,7 +9012,7 @@ Thu Dec 2 01:24:39 2010 NARUSE, Yui <naruse@ruby-lang.org> Thu Dec 2 01:02:03 2010 NARUSE, Yui <naruse@ruby-lang.org> - * ext/json: Update github/flori/json from 1.4.2+ to + * ext/json: Update github/ruby/json from 1.4.2+ to e22b2f2bdfe6a9b0. this fixes some bugs. Thu Dec 2 00:05:44 2010 NARUSE, Yui <naruse@ruby-lang.org> @@ -9563,7 +9563,7 @@ Wed Nov 17 16:09:52 2010 Yuki Sonoda (Yugui) <yugui@yugui.jp> Wed Nov 17 16:04:23 2010 Yuki Sonoda (Yugui) <yugui@yugui.jp> - * test/ruby/envutil.rb (Test::Unit::Assersions#assert_warn): + * test/ruby/envutil.rb (Test::Unit::Assertions#assert_warn): new assertion to assert that a particular warning message is displayed. forward port from branches/ruby_1_9_2@29795. @@ -9781,7 +9781,7 @@ Wed Nov 10 07:20:10 2010 Nobuyoshi Nakada <nobu@ruby-lang.org> Tue Nov 9 21:57:45 2010 Nobuyoshi Nakada <nobu@ruby-lang.org> * dln.c (init_funcname): allocate and build initialization - funciton name at once. + function name at once. Tue Nov 9 21:14:54 2010 Nobuyoshi Nakada <nobu@ruby-lang.org> @@ -23228,7 +23228,7 @@ Fri Sep 11 10:38:33 2009 URABE Shyouhei <shyouhei@ruby-lang.org> * lib/net/http.rb (Net::HTTPHeader::encode_kvpair): also call to_s to k. 
A patch from swdyh <youhei@gmail.com> - http://github.com/swdyh/ruby/tree/c847f43c2ccb679b9ff728f8b1b16c6ceeb57f39 + https://github.com/swdyh/ruby/tree/c847f43c2ccb679b9ff728f8b1b16c6ceeb57f39 Fri Sep 11 09:45:11 2009 Nobuyoshi Nakada <nobu@ruby-lang.org> @@ -62969,7 +62969,7 @@ Thu Jul 12 12:24:29 2007 Nobuyoshi Nakada <nobu@ruby-lang.org> Thu Jul 12 10:30:46 2007 Nobuyoshi Nakada <nobu@ruby-lang.org> - * thread.c (thread_start_func_2): moved prototye from thread_*.ci. + * thread.c (thread_start_func_2): moved prototype from thread_*.ci. * thread_pthread.ci (thread_start_func_2): not use a directive inside a macro argument. [ruby-talk:258763] @@ -73273,7 +73273,7 @@ Fri Nov 18 17:35:09 2005 Hidetoshi NAGAI <nagai@ai.kyutech.ac.jp> * ext/tk/lib/multi-tk.rb: add restriction to access the entried command table and manipulate other IPs (for reason of security). - Now, a IP object can be controlled by only its master IP or the + Now, an IP object can be controlled by only its master IP or the default IP. * ext/tk/lib/remote-tk.rb: add restriction to manipulate. @@ -76346,7 +76346,7 @@ Tue Jul 5 14:52:56 2005 Hidetoshi NAGAI <nagai@ai.kyutech.ac.jp> * ext/tk/lib/tk/validation.rb: ditto. - * ext/tk/lib/tk/namespace.rb: arguemnts for TclTkIp#_merge_tklist + * ext/tk/lib/tk/namespace.rb: arguments for TclTkIp#_merge_tklist should be UTF-8 strings. Mon Jul 4 19:29:32 2005 Hirokazu Yamamoto <ocean@m2.ccsnet.ne.jp> @@ -77285,7 +77285,7 @@ Sun May 15 09:57:30 2005 Nobuyoshi Nakada <nobu@ruby-lang.org> Sat May 14 23:59:11 2005 Nobuyoshi Nakada <nobu@ruby-lang.org> * error.c (exc_exception, {exit,name_err,syserr}_initialize): call - Execption#initialize. fixed: [ruby-talk:142593] + Exception#initialize. 
fixed: [ruby-talk:142593] Sat May 14 23:56:41 2005 Erik Huelsmann <ehuels@gmail.com> @@ -77435,7 +77435,7 @@ Sat Apr 30 06:57:39 2005 GOTOU Yuuzou <gotoyuzo@notwork.org> (suggested by Tatsuki Sugiura) * lib/webrick/cgi.rb - (WEBrick::CGI#initalize): set a dummy to @config[:ServerSoftware] + (WEBrick::CGI#initialize): set a dummy to @config[:ServerSoftware] if SERVER_SOFTWARE environment variable is not given. (WEBrick::CGI#start): req.path_info must be a String. (WEBrick::CGI::Socket#request_line): treat REQUEST_METHOD, PATH_INFO @@ -82696,7 +82696,7 @@ Tue Sep 14 20:24:49 2004 Minero Aoki <aamine@loveruby.net> * ext/ripper/depend: Borland make does not accept pipes in Makefile rules. [ruby-dev:24589] - * ext/ripper/depend: separate rules for developpers. + * ext/ripper/depend: separate rules for developers. * ext/ripper/Makefile.dev: new file. @@ -82931,7 +82931,7 @@ Wed Sep 8 18:44:03 2004 Nobuyoshi Nakada <nobu@ruby-lang.org> Wed Sep 8 15:19:49 2004 Hidetoshi NAGAI <nagai@ai.kyutech.ac.jp> - * ext/tcltklib/tcltklib.c (ip_init): cannot create a IP at level 4 + * ext/tcltklib/tcltklib.c (ip_init): cannot create an IP at level 4 * ext/tk/lib/multi-tk.rb: improve 'exit' operation, security check, and error treatment @@ -88452,7 +88452,7 @@ Tue Dec 16 03:17:29 2003 why the lucky stiff <why@ruby-lang.org> Tue Dec 16 01:14:44 2003 Nobuyoshi Nakada <nobu@ruby-lang.org> - * eval.c (catch_timer): check rb_thread_crtical in main native + * eval.c (catch_timer): check rb_thread_critical in main native thread. * eval.c (thread_timer): just sends signals periodically, to @@ -92076,7 +92076,7 @@ Mon Sep 1 16:59:10 2003 Nobuyoshi Nakada <nobu@ruby-lang.org> * eval.c (rb_thread_start_0): should not error_print() within terminated thread, because $stderr used by it might be - overriden now. [ruby-dev:21280] + overridden now. 
[ruby-dev:21280] Sun Aug 31 22:46:55 2003 WATANABE Hirofumi <eban@ruby-lang.org> @@ -92616,7 +92616,7 @@ Fri Aug 8 03:22:28 2003 GOTOU Yuuzou <gotoyuzo@notwork.org> Thu Aug 7 14:40:37 2003 WATANABE Hirofumi <eban@ruby-lang.org> - * cygwin/GNUmakefile: better --disbale-shared option support. + * cygwin/GNUmakefile: better --disable-shared option support. * cygwin/GNUmakefile: add forwarding DLL target for cygwin. diff --git a/doc/ChangeLog-2.0.0 b/doc/ChangeLog/ChangeLog-2.0.0 index a1a79b8dca..de47b66a69 100644 --- a/doc/ChangeLog-2.0.0 +++ b/doc/ChangeLog/ChangeLog-2.0.0 @@ -9758,7 +9758,7 @@ Thu Aug 23 16:20:04 2012 Koichi Sasada <ko1@atdot.net> are b10. If flonum is activated, then USE_FLONUM macro is 1. I'll write detailed in this technique on - https://bugs.ruby-lang.org/projects/ruby-trunk/wiki/Flonum_tech + https://bugs.ruby-lang.org/projects/ruby-master/wiki/Flonum_tech * benchmark/bmx_temp.rb: add an benchmark for simple Float calculation. @@ -13008,7 +13008,7 @@ Thu Jun 7 15:53:03 2012 Koichi Sasada <ko1@atdot.net> * .gdbinit: add function `trace_machine_instructions' to trace in native machine assemble. - See https://bugs.ruby-lang.org/projects/ruby-trunk/wiki/MachineInstructionsTraceWithGDB + See https://bugs.ruby-lang.org/projects/ruby-master/wiki/MachineInstructionsTraceWithGDB for more details. Wed Jun 6 21:31:21 2012 Tanaka Akira <akr@fsij.org> @@ -14426,7 +14426,7 @@ Tue May 8 02:34:26 2012 NARUSE, Yui <naruse@ruby-lang.org> Mon May 7 21:19:17 2012 NARUSE, Yui <naruse@ruby-lang.org> * ext/json: Merge JSON 1.7.1. 
- https://github.com/flori/json/commit/e5b9a9465c1159fae533bca320d950b772bcb4ac + https://github.com/ruby/json/commit/e5b9a9465c1159fae533bca320d950b772bcb4ac Mon May 7 22:54:22 2012 Martin Bosslet <Martin.Bosslet@googlemail.com> @@ -14711,7 +14711,7 @@ Fri Apr 27 01:45:05 2012 NARUSE, Yui <naruse@ruby-lang.org> (22) main thread waits at gvl_yield:112 (native_cond_wait) As described above, the main thread can't escape from rb_threadptr_execute_interrupts_common. - See extended memo: http://bugs.ruby-lang.org/projects/ruby-trunk/wiki/R35480_ExtendedMemo + See extended memo: http://bugs.ruby-lang.org/projects/ruby-master/wiki/R35480_ExtendedMemo Fri Apr 27 07:15:07 2012 Tanaka Akira <akr@fsij.org> @@ -16451,7 +16451,7 @@ Mon Mar 5 17:11:44 2012 Nobuyoshi Nakada <nobu@ruby-lang.org> Exception#initialize doesn't use visible instance variable for the exception message, so call the method with the message. patched by Jingwen Owen Ou <jingweno AT gmail.com>. - http://github.com/ruby/ruby/pull/41 + https://github.com/ruby/ruby/pull/41 Mon Mar 5 16:50:22 2012 NAKAMURA Usaku <usa@ruby-lang.org> @@ -16858,13 +16858,13 @@ Fri Feb 24 13:54:33 2012 Aaron Patterson <aaron@tenderlovemaking.com> Fri Feb 24 12:07:34 2012 Ayumu AIZAWA <ayumu.aizawa@gmail.com> * lib/net/http.rb: Fix documentation. Patched from Florian Mhun - via http://github.com/ruby/ruby/pull/96 + via https://github.com/ruby/ruby/pull/96 Fri Feb 24 11:48:07 2012 Ayumu AIZAWA <ayumu.aizawa@gmail.com> * string.c (rb_str_prepend): Fix documentation for String#prepend. 
- Patched from Franck Verrot via http://github.com/ruby/ruby/pull/98 - and Andrew Horsman via http://github.com/ruby/ruby/pull/55 + Patched from Franck Verrot via https://github.com/ruby/ruby/pull/98 + and Andrew Horsman via https://github.com/ruby/ruby/pull/55 Fri Feb 24 10:08:33 2012 Eric Hodel <drbrain@segment7.net> diff --git a/doc/ChangeLog-2.1.0 b/doc/ChangeLog/ChangeLog-2.1.0 index 76edfd3ce7..b6977dbccd 100644 --- a/doc/ChangeLog-2.1.0 +++ b/doc/ChangeLog/ChangeLog-2.1.0 @@ -659,7 +659,7 @@ Mon Dec 9 02:10:32 2013 NARUSE, Yui <naruse@ruby-lang.org> * lib/net/http/responses.rb: Add `HTTPIMUsed`, as it is also supported by rack/rails. - RFC - http://tools.ietf.org/html/rfc3229 + RFC - https://www.rfc-editor.org/rfc/rfc3229 by Vipul A M <vipulnsward@gmail.com> https://github.com/ruby/ruby/pull/447 fix GH-447 @@ -2167,7 +2167,7 @@ Wed Nov 20 11:46:38 2013 NARUSE, Yui <naruse@ruby-lang.org> * ext/json: merge JSON 1.8.1. https://github.com/nurse/json/compare/002ac2771ce32776b32ccd2d06e5604de6c36dcd...e09ffc0d7da25d0393873936c118c188c78dbac3 * Remove Rubinius exception since transcoding should be working now. - * Fix https://github.com/flori/json/issues/162 reported by Marc-Andre + * Fix https://github.com/ruby/json/issues/162 reported by Marc-Andre Lafortune <github_rocks@marc-andre.ca>. Thanks! * Applied patches by Yui NARUSE <naruse@airemix.jp> to suppress warning with -Wchar-subscripts and better validate UTF-8 strings. @@ -3596,7 +3596,7 @@ Tue Oct 22 19:19:05 2013 Koichi Sasada <ko1@atdot.net> maintains all pages. For example, pages are allocated from the heap_pages. - See https://bugs.ruby-lang.org/projects/ruby-trunk/wiki/GC_design + See https://bugs.ruby-lang.org/projects/ruby-master/wiki/GC_design and https://bugs.ruby-lang.org/attachments/4015/data-heap_structure_with_multiple_heaps.png for more details. 
@@ -8612,7 +8612,7 @@ Wed Jul 17 14:31:13 2013 Koichi Sasada <ko1@atdot.net> (4) heap::sorted is an array of "slots", sorted by an address of slot::body. - See https://bugs.ruby-lang.org/projects/ruby-trunk/wiki/GC_design + See https://bugs.ruby-lang.org/projects/ruby-master/wiki/GC_design for more details (figure). * gc.c: Avoid "heaps" terminology. It is ambiguous. @@ -17913,7 +17913,7 @@ Tue Feb 12 12:02:35 2013 NARUSE, Yui <naruse@ruby-lang.org> * ext/json: merge JSON 1.7.7. This includes security fix. [CVE-2013-0269] - https://github.com/flori/json/commit/d0a62f3ced7560daba2ad546d83f0479a5ae2cf2 + https://github.com/ruby/json/commit/d0a62f3ced7560daba2ad546d83f0479a5ae2cf2 https://groups.google.com/d/topic/rubyonrails-security/4_YvCpLzL58/discussion Mon Feb 11 23:08:48 2013 Tanaka Akira <akr@fsij.org> diff --git a/doc/ChangeLog-2.2.0 b/doc/ChangeLog/ChangeLog-2.2.0 index 5a7dbf826d..0edcf0122b 100644 --- a/doc/ChangeLog-2.2.0 +++ b/doc/ChangeLog/ChangeLog-2.2.0 @@ -5648,7 +5648,7 @@ Wed Aug 6 04:16:05 2014 NARUSE, Yui <naruse@ruby-lang.org> * lib/net/http/requests.rb (Net::HTTP::Options::RESPONSE_HAS_BODY): OPTIONS requests may have response bodies. [Feature #8429] - http://tools.ietf.org/html/rfc7231#section-4.3.7 + https://www.rfc-editor.org/rfc/rfc7231#section-4.3.7 Wed Aug 6 03:18:04 2014 NARUSE, Yui <naruse@ruby-lang.org> diff --git a/doc/ChangeLog-2.3.0 b/doc/ChangeLog/ChangeLog-2.3.0 index 7f3c4e672a..e616183895 100644 --- a/doc/ChangeLog-2.3.0 +++ b/doc/ChangeLog/ChangeLog-2.3.0 @@ -170,7 +170,7 @@ Tue Dec 22 14:31:28 2015 Toru Iwase <tietew@tietew.net> should return unfrozen new string. [ruby-core:72426] [Bug #11858] -Tue Dec 22 05:39:58 2015 Takashi Kokubun <takashikkbn@gmail.com> +Tue Dec 22 05:39:58 2015 Takashi Kokubun <k0kubun@ruby-lang.org> * ext/cgi/escape/escape.c (preserve_original_state): Preserve original state for tainted and frozen. 
[Fix GH-1166] @@ -208,7 +208,7 @@ Mon Dec 21 09:33:17 2015 Karol Bucek <kares@users.noreply.github.com> * ext/openssl/lib/openssl/ssl.rb (OpenSSL::SSL::SSLSocket): fix NotImplementedError typo. [Fix GH-1165] -Sun Dec 20 20:54:51 2015 Takashi Kokubun <takashikkbn@gmail.com> +Sun Dec 20 20:54:51 2015 Takashi Kokubun <k0kubun@ruby-lang.org> * cgi/escape/escape.c: Optimize CGI.escapeHTML for ASCII-compatible encodings. [Fix GH-1164] @@ -476,7 +476,7 @@ Tue Dec 15 17:57:57 2015 Martin Duerst <duerst@it.aoyama.ac.jp> to the correct one in the IANA registry (IBM037) and added an alias (ebcdic-cp-us) -Tue Dec 15 16:19:26 2015 Takashi Kokubun <takashikkbn@gmail.com> +Tue Dec 15 16:19:26 2015 Takashi Kokubun <k0kubun@ruby-lang.org> * lib/erb.rb: Render erb with array buffer for function call optimization. [fix GH-1143] @@ -488,7 +488,7 @@ Tue Dec 15 13:50:05 2015 Nobuyoshi Nakada <nobu@ruby-lang.org> * string.c (rb_str_oct): [DOC] mention radix indicators. [ruby-core:71310] [Bug #11648] -Tue Dec 15 12:20:30 2015 Takashi Kokubun <takashikkbn@gmail.com> +Tue Dec 15 12:20:30 2015 Takashi Kokubun <k0kubun@ruby-lang.org> * lib/erb.rb: Simplify regexp to optimize erb scanner. [fix GH-1144] @@ -1211,7 +1211,7 @@ Sun Dec 6 08:39:05 2015 SHIBATA Hiroshi <hsbt@ruby-lang.org> upstream changes. https://github.com/ruby/ruby/commit/4d059bf9f5f10f3d3088de49fc87e5555db7770d - https://github.com/flori/json/commit/d4c99de78905d96c3f301f48b2c789943bb3f098 + https://github.com/ruby/json/commit/d4c99de78905d96c3f301f48b2c789943bb3f098 * ext/json/lib/json/version.rb: ditto. @@ -2670,7 +2670,7 @@ Sat Nov 7 09:51:38 2015 Koichi Sasada <ko1@atdot.net> * vm_trace.c (rb_threadptr_exec_event_hooks_orig): maintain trace_running counter on internal events. - This patch is made by Takashi Kokubun <takashikkbn@gmail.com>. + This patch is made by Takashi Kokubun <k0kubun@ruby-lang.org>. 
[Bug #11603] https://github.com/ruby/ruby/pull/1059 Sat Nov 7 03:32:27 2015 Koichi Sasada <ko1@atdot.net> @@ -5283,7 +5283,7 @@ Sat Aug 1 06:54:36 2015 Aaron Patterson <tenderlove@ruby-lang.org> * ext/openssl/ossl_ssl.c (Init_ossl_ssl): OpenSSL declares these constants as longs, so we should follow that and use LONG2NUM. - http://git.io/vOqxD + https://github.com/openssl/openssl/blob/34750dc25d74e3db4c1ba43cd219d3f4825e4c65/include/openssl/ssl.h#L391 Sat Aug 1 04:06:29 2015 Aaron Patterson <tenderlove@ruby-lang.org> @@ -6754,7 +6754,8 @@ Thu Jul 2 09:51:44 2015 SHIBATA Hiroshi <hsbt@ruby-lang.org> Thu Jul 2 06:49:44 2015 SHIBATA Hiroshi <hsbt@ruby-lang.org> * lib/rubygems: Update to RubyGems HEAD(c202db2). - this version contains many enhancements see http://git.io/vtNwF + this version contains many enhancements see + https://github.com/rubygems/rubygems/blob/c202db2d681eb3c3a02f187d346fbb2e8d733b26/History.txt#L3 * test/rubygems: ditto. Wed Jul 1 23:50:34 2015 Kazuhiro NISHIYAMA <zn@mbf.nifty.com> @@ -10988,7 +10989,7 @@ Fri Feb 13 21:16:00 2015 Yusuke Endoh <mame@tsg.ne.jp> Fri Feb 13 14:19:06 2015 SHIBATA Hiroshi <shibata.hiroshi@gmail.com> - * ext/json: merge upstream from flori/json + * ext/json: merge upstream from ruby/json change usage of TypedData. [Feature #10739][ruby-core:67564] Thu Feb 12 18:34:01 2015 multisnow <infinity.blick.winkel@gmail.com> @@ -11555,7 +11556,7 @@ Tue Jan 13 21:08:22 2015 SHIBATA Hiroshi <shibata.hiroshi@gmail.com> * ext/json, test/json: merge JSON HEAD(259dee6) separate implementation of Typed_Data macro. 
- https://github.com/flori/json/compare/v1.8.1...v1.8.2 + https://github.com/ruby/json/compare/v1.8.1...v1.8.2 Tue Jan 13 14:16:35 2015 Nobuyoshi Nakada <nobu@ruby-lang.org> @@ -12008,7 +12009,7 @@ Mon Dec 29 10:37:27 2014 Thiago Lewin <thiago_lewin@yahoo.com.br> Mon Dec 29 07:27:23 2014 SHIBATA Hiroshi <shibata.hiroshi@gmail.com> * ext/json, test/json: merge JSON HEAD(17fe8e7) - https://github.com/flori/json/compare/v1.8.1...17fe8e7 + https://github.com/ruby/json/compare/v1.8.1...17fe8e7 Sun Dec 28 23:49:37 2014 Michal Papis <mpapis@gmail.com> diff --git a/doc/ChangeLog-2.4.0 b/doc/ChangeLog/ChangeLog-2.4.0 index 96b5ecb077..5e126fbd90 100644 --- a/doc/ChangeLog-2.4.0 +++ b/doc/ChangeLog/ChangeLog-2.4.0 @@ -792,7 +792,7 @@ Wed Oct 5 12:57:21 2016 Richard Schneeman <richard.schneeman+foo@gmail.com> Wed Oct 5 11:47:19 2016 SHIBATA Hiroshi <hsbt@ruby-lang.org> - * io.c: Fixed equivalent ruby code with core implemention. + * io.c: Fixed equivalent ruby code with core implementation. [fix GH-1429][ci skip] Patch by @sos4nt Wed Oct 5 11:36:21 2016 SHIBATA Hiroshi <hsbt@ruby-lang.org> @@ -888,7 +888,7 @@ Sun Oct 2 02:03:06 2016 NAKAMURA Usaku <usa@ruby-lang.org> Sat Oct 1 23:08:47 2016 NAKAMURA Usaku <usa@ruby-lang.org> - * ext/date/date_parse.c (date_zone_to_diff): it's nonsence and really + * ext/date/date_parse.c (date_zone_to_diff): it's nonsense and really harm that to use unary minus operator with unsigned value. get rid of test failures introduced at r56312. @@ -2668,8 +2668,8 @@ Wed Jul 6 07:11:27 2016 Shugo Maeda <shugo@ruby-lang.org> Tue Jul 5 20:49:30 2016 SHIBATA Hiroshi <hsbt@ruby-lang.org> * ext/json/*, test/json/*: Update json-2.0.1. 
- Changes of 2.0.0: https://github.com/flori/json/blob/f679ebd0c69a94e3e70a897ac9a229f5779c2ee1/CHANGES.md#2015-09-11-200 - Changes of 2.0.1: https://github.com/flori/json/blob/f679ebd0c69a94e3e70a897ac9a229f5779c2ee1/CHANGES.md#2016-07-01-201 + Changes of 2.0.0: https://github.com/ruby/json/blob/f679ebd0c69a94e3e70a897ac9a229f5779c2ee1/CHANGES.md#2015-09-11-200 + Changes of 2.0.1: https://github.com/ruby/json/blob/f679ebd0c69a94e3e70a897ac9a229f5779c2ee1/CHANGES.md#2016-07-01-201 [Feature #12542][ruby-dev:49706][fix GH-1395] Tue Jul 5 19:39:49 2016 Naohisa Goto <ngotogenome@gmail.com> @@ -7356,7 +7356,7 @@ Thu Mar 17 11:51:48 2016 NARUSE, Yui <naruse@ruby-lang.org> Note: CryptGenRandom function is PRNG and doesn't check its entropy, so it won't block. [Bug #12139] https://msdn.microsoft.com/ja-jp/library/windows/desktop/aa379942.aspx - https://tools.ietf.org/html/rfc4086#section-7.1.3 + https://www.rfc-editor.org/rfc/rfc4086#section-7.1.3 https://eprint.iacr.org/2007/419.pdf http://www.cs.huji.ac.il/~dolev/pubs/thesis/msc-thesis-leo.pdf diff --git a/doc/ChangeLog-YARV b/doc/ChangeLog/ChangeLog-YARV index a8b999dff2..83df05c52c 100644 --- a/doc/ChangeLog-YARV +++ b/doc/ChangeLog/ChangeLog-YARV @@ -493,7 +493,7 @@ Sun Dec 31 17:42:05 2006 Koichi Sasada <ko1@atdot.net> * thread.c : remove some line break - * yarvcore.c : reoder initialize sequence to mark main thread + * yarvcore.c : reorder initialize sequence to mark main thread 2006-08-18(Fri) 16:51:34 +0900 Koichi Sasada <ko1@atdot.net> @@ -1481,7 +1481,7 @@ Sun Dec 31 17:42:05 2006 Koichi Sasada <ko1@atdot.net> * yarvcore.c : ditto - * yarvtest/test_thread.rb : separete assersions to tests + * yarvtest/test_thread.rb : separate assertions to tests 2006-02-21(Tue) 02:13:33 +900 Yukihiro Matsumoto <matz@ruby-lang.org> @@ -1503,7 +1503,7 @@ Sun Dec 31 17:42:05 2006 Koichi Sasada <ko1@atdot.net> * yarvcore.c : support Proc#dup/clone, Binding#dup/clone - * sample/test.rb : remove unsupport features (Proc as Binding) 
+ * sample/test.rb : remove unsupported features (Proc as Binding) 2006-02-20(Mon) 16:28:59 +0900 Koichi Sasada <ko1@atdot.net> @@ -1560,7 +1560,7 @@ Sun Dec 31 17:42:05 2006 Koichi Sasada <ko1@atdot.net> * yarvtest/test_thread.rb : add a test for above * vm.h, vm.c, vm_dump.c, insns.def : add FRAME_MAGIC_LAMBDA and - support return from lambda (especially retrun from method defined + support return from lambda (especially return from method defined by "define_method") * yarvtest/test_method.rb : add a test for above @@ -1606,7 +1606,7 @@ Sun Dec 31 17:42:05 2006 Koichi Sasada <ko1@atdot.net> * vm.c : "return" from lambda{} break block - * eval.c : Unsupport Proc as Binding + * eval.c : Unsupported Proc as Binding * test/ruby/test_eval.rb : apply above changes @@ -3816,7 +3816,7 @@ Sun Dec 31 17:42:05 2006 Koichi Sasada <ko1@atdot.net> 2005-12-29(Thu) 12:27:12 +0900 Koichi Sasada <ko1@atdot.net> * compile.c, yarvcore.h : - remvoe needless yarv_iseq_t#rewind_frame_size + remove needless yarv_iseq_t#rewind_frame_size 2005-12-29(Thu) 11:17:58 +0900 Koichi Sasada <ko1@atdot.net> @@ -4530,7 +4530,7 @@ Sun Dec 31 17:42:05 2006 Koichi Sasada <ko1@atdot.net> * vm.c : fix return process - * vm_macro.def : fix option prameters + * vm_macro.def : fix option parameters * yarvtest/test_method.rb : add tests for above @@ -4555,7 +4555,7 @@ Sun Dec 31 17:42:05 2006 Koichi Sasada <ko1@atdot.net> * eval_intern.h : fix PASS_PASSED_BLOCK() - * eval_load.c : fix re-enter require (temporalily) + * eval_load.c : fix re-enter require (temporarily) * insns.def : permit re-open class when superclass is same @@ -4729,7 +4729,7 @@ Sun Dec 31 17:42:05 2006 Koichi Sasada <ko1@atdot.net> * eval.c, eval_intern.h, vm.c, eval_jump.h, yarvcore.h : re-define PUSH/POP/EXEC/JUMP_TAG to use thread local tag - * inits.c, yarvcore.c : fix boostrap + * inits.c, yarvcore.c : fix bootstrap 2005-10-03(Mon) 22:28:24 +0900 Koichi Sasada <ko1@atdot.net> @@ -4909,7 +4909,7 @@ Sun Dec 31 17:42:05 2006 Koichi 
Sasada <ko1@atdot.net> 2005-09-14(Wed) 06:11:43 +0900 Koichi Sasada <ko1@atdot.net> * yarvcore.h, vm_evalbody.h, vm.h, vm_dump.c, - compile.c, yarvcore.c : use #ifdef insted of #if for recognize + compile.c, yarvcore.c : use #ifdef instead of #if for recognize vm options * vm_opts.h : fix default options @@ -4973,13 +4973,13 @@ Sun Dec 31 17:42:05 2006 Koichi Sasada <ko1@atdot.net> * common.mk : rule test -> test2, test1 -> test - * compile.c : fix when clause bug and splat arugment + * compile.c : fix when clause bug and splat argument 2005-08-17(Wed) 05:22:31 +0900 Koichi Sasada <ko1@atdot.net> * compile.c : fix block local parameter setting routine and support - massign in block parameter initialze + massign in block parameter initialize * yarvtest/test_yield.rb : add tests for above @@ -5394,7 +5394,7 @@ Sun Dec 31 17:42:05 2006 Koichi Sasada <ko1@atdot.net> * tmpl/vmtc.inc.tmpl : add const prefix - * /rb/asm_parse.rb, extconf.rb : added and make assembler analised output + * /rb/asm_parse.rb, extconf.rb : added and make assembler analysed output * opt_operand.def : add send operands unification @@ -5654,7 +5654,7 @@ Sun Dec 31 17:42:05 2006 Koichi Sasada <ko1@atdot.net> 2005-03-01(Tue) 13:50:04 +0900 Koichi Sasada <ko1@atdot.net> * yarvcore.c (yarvcore_eval_parsed) : added - (separeted from yarvcore_eval) + (separated from yarvcore_eval) * yarvcore.c, compile.c : iseq_translate_direct_threaded_code is moved to compile.c @@ -5806,7 +5806,7 @@ Sun Dec 31 17:42:05 2006 Koichi Sasada <ko1@atdot.net> * compiled.c : add constant pool - * vm_evalbody.inc, call_cfunc.inc, vm.c : separeted from vm.c + * vm_evalbody.inc, call_cfunc.inc, vm.c : separated from vm.c * insns.def : fix return val @@ -5840,7 +5840,7 @@ Sun Dec 31 17:42:05 2006 Koichi Sasada <ko1@atdot.net> * depend : fixed for above - * extconf.rb : add option --(enable|disalbe)-opt-insns-unification + * extconf.rb : add option --(enable|disable)-opt-insns-unification 2005-02-11(Fri) 12:14:39 +0900 Koichi 
Sasada <ko1@atdot.net> @@ -5957,7 +5957,7 @@ Sun Dec 31 17:42:05 2006 Koichi Sasada <ko1@atdot.net> * evalc.patch : fix for above - * benchmark/bm_lists.rb : fix (unsupport block passing) + * benchmark/bm_lists.rb : fix (unsupported block passing) * benchmark/run.rb : use full path to ruby @@ -6014,7 +6014,7 @@ Sun Dec 31 17:42:05 2006 Koichi Sasada <ko1@atdot.net> * test/test_block.rb (test_ifunc) : test for above - * vm.c (get_block_objec, thread_make_env_object) : fixed bugs + * vm.c (get_block_object, thread_make_env_object) : fixed bugs * test/test_bin.rb (test_xstr) : remove `ls` test @@ -6067,7 +6067,7 @@ Sun Dec 31 17:42:05 2006 Koichi Sasada <ko1@atdot.net> 2005-01-04(Tue) 06:25:45 +0900 Koichi Sasada <ko1@atdot.net> - * compile.h : COMPILE_ERROR break contol (instead of return) + * compile.h : COMPILE_ERROR break control (instead of return) * compile.c : support NODE_MASGN @@ -6108,7 +6108,7 @@ Sun Dec 31 17:42:05 2006 Koichi Sasada <ko1@atdot.net> * insns.def : support super, zsuper (currently, super can't handle with block) - * test/test_bin.rb : add test for op_asgin2, op_assgin_and/or + * test/test_bin.rb : add test for op_assign2, op_assign_and/or * test/test_class.rb : add test for super, zsuper @@ -6272,7 +6272,7 @@ Sun Dec 31 17:42:05 2006 Koichi Sasada <ko1@atdot.net> * opt_operand.def : add unification insn send * rb/insns2vm.rb : define symbol instead of declare const - variable (for more optmize on VC) + variable (for more optimize on VC) * insns.def : move enter point in send @@ -6322,7 +6322,7 @@ Sun Dec 31 17:42:05 2006 Koichi Sasada <ko1@atdot.net> 2004-12-02(Thu) 13:20:41 +0900 Koichi Sasada <ko1@atdot.net> * yarvcore.c, vm.h, vm.c, insns.def, insnhelper.h, yarvutil.rb : - add usage analisys framework + add usage analysis framework * disasm.c : insn_operand_intern to separate function @@ -6489,7 +6489,7 @@ Sun Dec 31 17:42:05 2006 Koichi Sasada <ko1@atdot.net> 2004-11-01(Mon) 04:45:54 +0900 Koichi Sasada <ko1@atdot.net> * yarvcore.h, 
compile.c, debug.c, version.h : - redesgin gc debug scheme (GC_CHECK()) + redesign gc debug scheme (GC_CHECK()) * yarvcore.c : mark iseqobj->current_block on GC @@ -6820,7 +6820,7 @@ Sun Dec 31 17:42:05 2006 Koichi Sasada <ko1@atdot.net> * depend : add tbench rule - * yarvcore.h : add 'exten ID idBackquote' + * yarvcore.h : add 'extern ID idBackquote' 2004-05-18(Tue) 00:09:48 +0900 Koichi Sasada <ko1@atdot.net> diff --git a/doc/NEWS-1.8.7 b/doc/NEWS/NEWS-1.8.7 index 5da39ff265..5da39ff265 100644 --- a/doc/NEWS-1.8.7 +++ b/doc/NEWS/NEWS-1.8.7 diff --git a/doc/NEWS-1.9.1 b/doc/NEWS/NEWS-1.9.1 index fb11026d60..fb11026d60 100644 --- a/doc/NEWS-1.9.1 +++ b/doc/NEWS/NEWS-1.9.1 diff --git a/doc/NEWS-1.9.2 b/doc/NEWS/NEWS-1.9.2 index fedb1f6633..430c6cc4f5 100644 --- a/doc/NEWS-1.9.2 +++ b/doc/NEWS/NEWS-1.9.2 @@ -240,7 +240,7 @@ with all sufficient information, see the ChangeLog file. * new alias of item.guid.isPermaLink= * DL - * Now uses libffi as a backend if avaiable. + * Now uses libffi as a backend if available. It means DL works fine on more platforms. * Fiddle diff --git a/doc/NEWS-1.9.3 b/doc/NEWS/NEWS-1.9.3 index 484660f420..484660f420 100644 --- a/doc/NEWS-1.9.3 +++ b/doc/NEWS/NEWS-1.9.3 diff --git a/doc/NEWS-2.0.0 b/doc/NEWS/NEWS-2.0.0 index 414789dcd1..e070b19976 100644 --- a/doc/NEWS-2.0.0 +++ b/doc/NEWS/NEWS-2.0.0 @@ -116,8 +116,7 @@ with all sufficient information, see the ChangeLog file. corresponding method in the prepending module. * added Module.prepended and Module.prepend_features, similar to included and append_features. - * added Module#refine, which extends a class or module locally. - [experimental] + * added Module#refine, which extends a class or module locally. [experimental] * extended method: * Module#define_method accepts a UnboundMethod from a Module. * Module#const_get accepts a qualified constant string, e.g. @@ -377,7 +376,7 @@ with all sufficient information, see the ChangeLog file. 
:TLSv1_2, :TLSv1_2_server, :TLSv1_2_client or :TLSv1_1, :TLSv1_1_server :TLSv1_1_client. The version being effectively used can be queried with OpenSSL::SSL#ssl_version. Furthermore, it is also possible to - blacklist the new TLS versions with OpenSSL::SSL:OP_NO_TLSv1_1 and + blacklist the new TLS versions with OpenSSL::SSL::OP_NO_TLSv1_1 and OpenSSL::SSL::OP_NO_TLSv1_2. * Added OpenSSL::SSL::SSLContext#renegotiation_cb. A user-defined callback may be set which gets called whenever a new handshake is negotiated. This diff --git a/doc/NEWS-2.1.0 b/doc/NEWS/NEWS-2.1.0 index 5d4152b8dc..26f2374e94 100644 --- a/doc/NEWS-2.1.0 +++ b/doc/NEWS/NEWS-2.1.0 @@ -155,7 +155,7 @@ with all sufficient information, see the ChangeLog file. Foo#foo private. * Kernel#untrusted?, untrust, and trust - * These methods are deprecated and their behavior is same as tainted?, + * These methods are deprecated and their behavior is the same as tainted?, taint, and untaint, respectively. If $VERBOSE is true, they show warnings. * Module#ancestors diff --git a/doc/NEWS-2.2.0 b/doc/NEWS/NEWS-2.2.0 index 5564c606ae..8b2bd0ba0a 100644 --- a/doc/NEWS-2.2.0 +++ b/doc/NEWS/NEWS-2.2.0 @@ -90,7 +90,7 @@ with all sufficient information, see the ChangeLog file. * Method * New methods: - * Method#curry([arity]) returns a curried Proc. + * Method#curry([ arity ]) returns a curried Proc. * Method#super_method returns a Method of superclass, which would be called when super is used. @@ -250,8 +250,7 @@ with all sufficient information, see the ChangeLog file. * Logger::Application is extracted to logger-application gem. It's unmaintain code. * ObjectSpace (after requiring "objspace") - * ObjectSpace.memsize_of(obj) returns a size includes sizeof(RVALUE). - [Bug #8984] + * ObjectSpace.memsize_of(obj) returns a size includes sizeof(RVALUE). [Bug #8984] * Prime * incompatible changes: @@ -319,7 +318,7 @@ with all sufficient information, see the ChangeLog file. * rb_sym2str() added. 
This is almost same as `rb_id2str(SYM2ID(sym))` but not pinning a dynamic symbol. -* rb_str_cat_cstr() added. This is same as `rb_str_cat2()`. +* rb_str_cat_cstr() added. This is the same as `rb_str_cat2()`. * `rb_str_substr()` and `rb_str_subseq()` will share middle of a string, but not only the end of a string, in the future. Therefore, result @@ -353,8 +352,7 @@ with all sufficient information, see the ChangeLog file. * VM * Use frozen string literals for Hash#[] and Hash#[]= * Fast keyword arguments passing [Feature #10440] - * Allow to receive huge splatted array by a rest argument - [Feature #10440] + * Allow to receive huge splatted array by a rest argument [Feature #10440] * Process * Process creation methods, such as spawn(), uses vfork() system call. diff --git a/doc/NEWS-2.3.0 b/doc/NEWS/NEWS-2.3.0 index 489aba4a89..065515257e 100644 --- a/doc/NEWS-2.3.0 +++ b/doc/NEWS/NEWS-2.3.0 @@ -16,20 +16,19 @@ with all sufficient information, see the ChangeLog file or Redmine * frozen-string-literal pragma: - * new pragma, frozen-string-literal has been experimentally introduced. - [Feature #8976] + * new pragma, frozen-string-literal has been experimentally introduced. [Feature #8976] * besides, --enable/--disable=frozen-string-literal options also have been introduced. [Feature #8976] * command line options --debug or --debug=frozen-string-literal enable additional debugging mode which shows created location with at frozen - object error (RuntimeError). - [Feature #11725] + object error (RuntimeError). [Feature #11725] * safe navigation operator: * new method call syntax, `object&.foo', method #foo is called on - `object' if it is not nil. - this is similar to `try!' in Active Support, except: + `object' if it is not nil. [Feature #11537] + + This is similar to `try!' in Active Support, except: * method name is syntactically required obj.try! {} # valid obj&. 
{} # syntax error @@ -38,7 +37,6 @@ with all sufficient information, see the ChangeLog file or Redmine obj&.foo(bar()) # bar() is conditionally evaluated * attribute assignment is valid obj&.attr += 1 - [Feature #11537] * the did_you_mean gem: @@ -53,15 +51,13 @@ with all sufficient information, see the ChangeLog file or Redmine * indented here document: * new string literal, here document starts with `<<~`. - refer doc/syntax/literals.rdoc for more details. - [Feature #9098] + refer doc/syntax/literals.rdoc for more details. [Feature #9098] === Core classes updates (outstanding ones only) * ARGF - * ARGF.read_nonblock supports `exception: false' like IO#read_nonblock. - [Feature #11358] + * ARGF.read_nonblock supports `exception: false' like IO#read_nonblock. [Feature #11358] * Array @@ -78,8 +74,7 @@ with all sufficient information, see the ChangeLog file or Redmine * Enumerable - * Enumerable#grep_v is added as inverse version of Enumerable#grep. - [Feature #11049] + * Enumerable#grep_v is added as inverse version of Enumerable#grep. [Feature #11049] * Enumerable#chunk_while [Feature #10769] * Enumerator::Lazy @@ -105,8 +100,7 @@ with all sufficient information, see the ChangeLog file or Redmine this affect only files opened as binary. [Feature #11218] * new option parameter `flags' is added. - this parameter is bitwise-ORed to oflags generated by normal mode argument. - [Feature #11253] + this parameter is bitwise-ORed to oflags generated by normal mode argument. [Feature #11253] * IO#advise no longer raises Errno::ENOSYS in cases where it was detected at build time but not available at runtime. [Feature #11806] @@ -125,8 +119,7 @@ with all sufficient information, see the ChangeLog file or Redmine * Numeric * Numeric#positive? and Numeric#negative? are added, which return - true when the receiver is positive and negative respectively. - [Feature #11151] + true when the receiver is positive and negative respectively. 
[Feature #11151] * Proc @@ -152,11 +145,9 @@ with all sufficient information, see the ChangeLog file or Redmine * String - * String#+@ and String#-@ are added to get mutable/frozen strings. - [Feature #11782] + * String#+@ and String#-@ are added to get mutable/frozen strings. [Feature #11782] - * String.new now accepts new option parameter `encoding'. - [Feature #11785] + * String.new now accepts new option parameter `encoding'. [Feature #11785] * Struct * Struct#dig [Feature #11688] @@ -233,12 +224,10 @@ with all sufficient information, see the ChangeLog file or Redmine * OpenSSL * OpenSSL::SSL::SSLSocket#accept_nonblock and - OpenSSL::SSL::SSLSocket#connect_nonblock supports `exception: false`. - [Feature #10532] + OpenSSL::SSL::SSLSocket#connect_nonblock supports `exception: false`. [Feature #10532] * Pathname - * Pathname#descend and Pathname#ascend supported blockless form. - [Feature #11052] + * Pathname#descend and Pathname#ascend supported blockless form. [Feature #11052] * Socket * Socket#connect_nonblock, Socket#accept_nonblock, @@ -246,8 +235,7 @@ with all sufficient information, see the ChangeLog file or Redmine BasicSocket#recv_nonblock, BasicSocket#recvmsg_nonblock, BasicSocket#sendmsg_nonblock all support `exception: false` to return :wait_readable or :wait_writable symbols instead of raising - IO::WaitReadable or IO::WaitWritable exceptions - [Feature #10532] [Feature #11229] + IO::WaitReadable or IO::WaitWritable exceptions [Feature #10532] [Feature #11229] * BasicSocket#recv and BasicSocket#recv_nonblock allow an output String buffer argument like IO#read and IO#read_nonblock to reduce GC overhead [Feature #11242] @@ -255,8 +243,7 @@ with all sufficient information, see the ChangeLog file or Redmine * StringIO * In read-only mode, StringIO#set_encoding no longer sets the encoding of its buffer string. Setting the encoding of the string directly - without StringIO#set_encoding may cause unpredictable behavior now. 
- [Bug #11827] + without StringIO#set_encoding may cause unpredictable behavior now. [Bug #11827] * timeout * Object#timeout is now warned as deprecated when called. @@ -297,8 +284,7 @@ with all sufficient information, see the ChangeLog file or Redmine * default value of Net::HTTP#open_timeout is now 60 (was nil). * Net::Telnet - * Net::Telnet is extracted to net-telnet gem. It's unmaintain code. - [Feature #11083] + * Net::Telnet is extracted to net-telnet gem. It's unmaintain code. [Feature #11083] * Psych * Updated to Psych 2.0.17 @@ -330,8 +316,7 @@ with all sufficient information, see the ChangeLog file or Redmine class is already defined but its superclass does not match the given superclass, as well as definitions in ruby level. -* rb_timespec_now() is added to fetch current datetime as struct timespec. - [Feature #11558] +* rb_timespec_now() is added to fetch current datetime as struct timespec. [Feature #11558] * rb_time_timespec_new() is added to create a time object with epoch, nanosecond, and UTC/localtime/time offset arguments. [Feature #11558] @@ -354,11 +339,9 @@ with all sufficient information, see the ChangeLog file or Redmine === Implementation improvements -* Optimize Proc#call to eliminate method frame construction. - [Feature #11569] +* Optimize Proc#call to eliminate method frame construction. [Feature #11569] -* Reconsidering method entry data structure. - [Bug #11278] +* Reconsidering method entry data structure. [Bug #11278] * Introducing new table data structure for ID keys tables used by method table and so on. New table structure is simple and fast @@ -367,13 +350,11 @@ with all sufficient information, see the ChangeLog file or Redmine * Machine code level tuning for object allocation and method calling code. r52099, r52254 -* RubyVM::InstructionSequence is extended for future improvement. - [Feature #11788] +* RubyVM::InstructionSequence is extended for future improvement. 
[Feature #11788] * Case dispatch is now optimized for all special constant literals including nil, true, and false. Previously, only literal strings, - symbols, integers and floats compiled to optimized case dispatch. - [Feature #11769] + symbols, integers and floats compiled to optimized case dispatch. [Feature #11769] * Instance variables on non-pure Ruby classes (T_DATA, T_FILE, etc..) is less expensive to store than before. [Feature #11170] @@ -382,8 +363,7 @@ with all sufficient information, see the ChangeLog file or Redmine constant-time. Previously, Struct elements beyond the first 10 elements used a linear scan. [Feature #10585] -* The Set class got several speed up. - [Misc #10754], [r52591] +* The Set class got several speed up. [Misc #10754], [r52591] * Socket and I/O-related improvements @@ -397,8 +377,8 @@ with all sufficient information, see the ChangeLog file or Redmine addition to reducing expensive exceptions. [Feature #11044] * (Linux-only) waiting on a single FD anywhere in the stdlib no longer - uses select(2), making it immune to slowdowns with high-numbered FDs. - [Feature #11081] [Feature #11377] + uses select(2), making it immune to slowdowns with high-numbered + FDs. [Feature #11081] [Feature #11377] * CGI.escapeHTML is optimized with C extension. https://github.com/ruby/ruby/pull/1164 diff --git a/doc/NEWS-2.4.0 b/doc/NEWS/NEWS-2.4.0 index 28e855cde1..8a02f03809 100644 --- a/doc/NEWS-2.4.0 +++ b/doc/NEWS/NEWS-2.4.0 @@ -14,16 +14,13 @@ with all sufficient information, see the ChangeLog file or Redmine === Language changes -* Multiple assignment in conditional expression is now allowed. - [Feature #10617] +* Multiple assignment in conditional expression is now allowed. [Feature #10617] * Refinements is enabled at method by Symbol#to_proc. [Feature #9451] -* Refinements is enabled with Kernel#send and BasicObject#__send__. - [Feature #11476] +* Refinements is enabled with Kernel#send and BasicObject#__send__. 
[Feature #11476] -* Rescue modifier now applicable to method arguments. - [Feature #12686] +* Rescue modifier now applicable to method arguments. [Feature #12686] * Toplevel return is now allowed. [Feature #4840] @@ -32,17 +29,21 @@ with all sufficient information, see the ChangeLog file or Redmine * Array * Array#concat [Feature #12333] + Now takes multiple arguments. * Array#max and Array#min. [Feature #12172] + This may cause a tiny incompatibility: if you redefine Enumerable#max and call max to an Array, your redefinition will be now ignored. You should also redefine Array#max. * Array#pack [Feature #12754] + Now takes optional argument `buffer:' to reuse already allocated buffer. * Array#sum [Feature #12217] + This is different from Enumerable#sum in that Array#sum doesn't depend on the definition of each method. @@ -56,8 +57,7 @@ with all sufficient information, see the ChangeLog file or Redmine * Enumerable - * Enumerable#chunk called without a block now return an Enumerator - [Feature #2172] + * Enumerable#chunk called without a block now return an Enumerator [Feature #2172] * Enumerable#sum [Feature #12217] * Enumerable#uniq [Feature #11090] @@ -95,6 +95,7 @@ with all sufficient information, see the ChangeLog file or Redmine * Integer#round now takes an optional keyword argument, half option, and the default behavior is round-up now. [Bug #12548] [Bug #12958] + half option can be one of :even, :up, and :down. [Feature #12953] * IO @@ -104,8 +105,7 @@ with all sufficient information, see the ChangeLog file or Redmine * Kernel - * Kernel#clone now takes an optional keyword argument, freeze flag. - [Feature #12300] + * Kernel#clone now takes an optional keyword argument, freeze flag. [Feature #12300] * MatchData @@ -138,6 +138,7 @@ with all sufficient information, see the ChangeLog file or Redmine for UTR #51 Unicode Emoji, Version 4.0 emoji zwj sequences. * Regexp#match? [Feature #8110] + This returns bool and doesn't save backref. * Update to Onigmo 6.0.0. 
@@ -153,6 +154,7 @@ with all sufficient information, see the ChangeLog file or Redmine * String#casecmp? [Feature #12786] * String#concat, String#prepend [Feature #12333] + Now takes multiple arguments. * String#each_line, String#lines now takes an optional keyword argument, @@ -189,8 +191,7 @@ with all sufficient information, see the ChangeLog file or Redmine * Thread - * Thread#report_on_exception and Thread.report_on_exception - [Feature #6647] + * Thread#report_on_exception and Thread.report_on_exception [Feature #6647] * TracePoint @@ -200,8 +201,7 @@ with all sufficient information, see the ChangeLog file or Redmine * New module named Warning is introduced. By default it has only one singleton method, named warn. This makes it possible for - 3rd-party libraries to control the way warnings are handled. - [Feature #12299] + 3rd-party libraries to control the way warnings are handled. [Feature #12299] === Stdlib updates (outstanding ones only) @@ -215,8 +215,7 @@ with all sufficient information, see the ChangeLog file or Redmine * IPAddr - * IPAddr#== and IPAddr#<=> no longer raise an exception if coercion fails. - [Bug #12799] + * IPAddr#== and IPAddr#<=> no longer raise an exception if coercion fails. [Bug #12799] * IRB @@ -256,8 +255,7 @@ with all sufficient information, see the ChangeLog file or Redmine * Readline - * Readline.quoting_detection_proc and Readline.quoting_detection_proc= - [Feature #12659] + * Readline.quoting_detection_proc and Readline.quoting_detection_proc= [Feature #12659] * REXML @@ -267,8 +265,7 @@ with all sufficient information, see the ChangeLog file or Redmine * set - * New methods: Set#compare_by_identity and Set#compare_by_identity?. - [Feature #12210] + * New methods: Set#compare_by_identity and Set#compare_by_identity?. 
[Feature #12210] * WEBrick @@ -277,6 +274,7 @@ with all sufficient information, see the ChangeLog file or Redmine === Compatibility issues (excluding feature bug fixes) * Array#sum and Enumerable#sum are implemented. [Feature #12217] + Ruby itself has no compatibility problem because Ruby didn't have sum method for arrays before Ruby 2.4. However many third party gems, activesupport, facets, simple_stats, etc, @@ -286,6 +284,7 @@ with all sufficient information, see the ChangeLog file or Redmine be perfectly compatible with all of them. * Fixnum and Bignum are unified into Integer [Feature #12005] + Fixnum class and Bignum class is removed. Integer class is changed from abstract class to concrete class. For example, 0 is an instance of Integer: 0.class returns Integer. @@ -300,6 +299,7 @@ with all sufficient information, see the ChangeLog file or Redmine * String/Symbol#upcase/downcase/swapcase/capitalize(!) now work for all of Unicode, not only for ASCII. [Feature #10085] + No change is needed if the data is in ASCII anyway or if the limitation to ASCII was only tolerated while waiting for a more extensive implementation. A change (using the :ascii option) is needed in cases where Unicode data @@ -307,6 +307,7 @@ with all sufficient information, see the ChangeLog file or Redmine A good example of this are internationalized domain names. * TRUE / FALSE / NIL + These constants are now obsoleted. [Feature #12574] Use true / false / nil resp. instead. @@ -332,8 +333,7 @@ with all sufficient information, see the ChangeLog file or Redmine * Shellwords.shellwords (shellsplit) treats the backslash as escape character only when followed by one of the following characters: - $ ` " \ <newline> - [Bug #10055] + $ ` " \ <newline> [Bug #10055] * Time @@ -348,11 +348,13 @@ with all sufficient information, see the ChangeLog file or Redmine * Tk * Tk is removed from stdlib. [Feature #8539] + https://github.com/ruby/tk is the new upstream. 
* XMLRPC * XMLRPC is removed from stdlib, and bundled as gem. [Feature #12160][ruby-core:74239] + https://github.com/ruby/xmlrpc is the new upstream. * Zlib @@ -390,8 +392,8 @@ with all sufficient information, see the ChangeLog file or Redmine === Miscellaneous changes * ChangeLog is removed from the repository. + It is generated from commit messages in Subversion by `make dist`. Also note that now people should follow Git style commit message. - The template is written at - [Short (50 chars or less) summary of changes](https://git-scm.com/book/ch5-2.html). - [Feature #12283] + The template is written at {Short (50 chars or less) summary of + changes}[https://git-scm.com/book/ch5-2.html]. [Feature #12283] diff --git a/doc/NEWS-2.5.0 b/doc/NEWS/NEWS-2.5.0 index 221c0328c1..af7f3ada01 100644 --- a/doc/NEWS-2.5.0 +++ b/doc/NEWS/NEWS-2.5.0 @@ -68,7 +68,7 @@ with all sufficient information, see the ChangeLog file or Redmine * File.rename releases GVL. [Feature #13951] * File::Stat#atime, File::Stat#mtime and File::Stat#ctime support fractional second timestamps on Windows 8 and later. [Feature #13726] - * File::Stat#ino and File.indentical? support ReFS 128bit ino on Windows 8.1 + * File::Stat#ino and File.identical? support ReFS 128bit ino on Windows 8.1 and later. 
[Feature #13731] * File.readable?, File.readable_real?, File.writable?, File.writable_real?, File.executable?, File.executable_real?, File.mkfifo, File.readlink, @@ -466,7 +466,7 @@ with all sufficient information, see the ChangeLog file or Redmine === Compatibility issues (excluding feature bug fixes) -* Socket +* BasicSocket * BasicSocket#read_nonblock and BasicSocket#write_nonblock no longer set the O_NONBLOCK file description flag as side effect diff --git a/doc/NEWS-2.6.0 b/doc/NEWS/NEWS-2.6.0 index 2303a5bd41..6e70696de2 100644 --- a/doc/NEWS-2.6.0 +++ b/doc/NEWS/NEWS-2.6.0 @@ -50,24 +50,24 @@ sufficient information, see the ChangeLog file or Redmine === Core classes updates (outstanding ones only) -Array:: +[Array] - New methods:: + [New methods] * Added Array#union and Array#difference instance methods. [Feature #14097] - Modified method:: + [Modified method] * Array#to_h now accepts a block that maps elements to new key/value pairs. [Feature #15143] - Aliased methods:: + [Aliased methods] * Array#filter is a new alias for Array#select. [Feature #13784] * Array#filter! is a new alias for Array#select!. [Feature #13784] -Binding:: +[Binding] - New method:: + [New method] * Added Binding#source_location. [Feature #14230] @@ -79,97 +79,97 @@ Binding:: binding's source location [Bug #4352]. So, users should use this newly-introduced method instead of Kernel#eval. -Dir:: +[Dir] - New methods:: + [New methods] * Added Dir#each_child and Dir#children instance methods. [Feature #13969] -Enumerable:: +[Enumerable] - New method:: + [New method] * Enumerable#chain returns an enumerator object that iterates over the elements of the receiver and then those of each argument in sequence. [Feature #15144] - Modified method:: + [Modified method] * Enumerable#to_h now accepts a block that maps elements to new key/value pairs. [Feature #15143] - Aliased method:: + [Aliased method] * Enumerable#filter is a new alias for Enumerable#select. 
[Feature #13784] -Enumerator::ArithmeticSequence:: +[Enumerator::ArithmeticSequence] * This is a new class to represent a generator of an arithmetic sequence, that is a number sequence defined by a common difference. It can be used for representing what is similar to Python's slice. You can get an instance of this class from Numeric#step and Range#step. -Enumerator::Chain:: +[Enumerator::Chain] * This is a new class to represent a chain of enumerables that works as a single enumerator, generated by such methods as Enumerable#chain and Enumerator#+. -Enumerator::Lazy:: +[Enumerator::Lazy] - Aliased method:: + [Aliased method] * Enumerator::Lazy#filter is a new alias for Enumerator::Lazy#select. [Feature #13784] -Enumerator:: +[Enumerator] - New methods:: + [New methods] * Enumerator#+ returns an enumerator object that iterates over the elements of the receiver and then those of the other operand. [Feature #15144] -ENV:: +[ENV] - Modified method:: + [Modified method] * ENV.to_h now accepts a block that maps names and values to new keys and values. [Feature #15143] -Exception:: +[Exception] - New options:: + [New options] * Exception#full_message takes +:highlight+ and +:order+ options. [Bug #14324] -Hash:: +[Hash] - Modified methods:: + [Modified methods] * Hash#merge, Hash#merge!, and Hash#update now accept multiple arguments. [Feature #15111] * Hash#to_h now accepts a block that maps keys and values to new keys and values. [Feature #15143] - Aliased methods:: + [Aliased methods] * Hash#filter is a new alias for Hash#select. [Feature #13784] * Hash#filter! is a new alias for Hash#select!. [Feature #13784] -IO:: +[IO] - New option:: + [New option] * Added new mode character <code>'x'</code> to open files for exclusive access. [Feature #11258] -Kernel:: +[Kernel] - Aliased method:: + [Aliased method] * Kernel#then is a new alias for Kernel#yield_self. 
[Feature #14594] - New options:: + [New options] * Kernel#Complex, Kernel#Float, Kernel#Integer, and Kernel#Rational take an +:exception+ option to specify the way of @@ -178,98 +178,98 @@ Kernel:: * Kernel#system takes an +:exception+ option to raise an exception on failure. [Feature #14386] - Incompatible changes:: + [Incompatible changes] * Kernel#system and Kernel#exec do not close non-standard file descriptors (the default of the +:close_others+ option is changed to +false+, but we still set the +FD_CLOEXEC+ flag on descriptors we create). [Misc #14907] -KeyError:: +[KeyError] - New options:: + [New options] * KeyError.new accepts +:receiver+ and +:key+ options to set receiver and key in Ruby code. [Feature #14313] -Method:: +[Method] - New methods:: + [New methods] * Added Method#<< and Method#>> for Proc composition. [Feature #6284] -Module:: +[Module] - Modified methods:: + [Modified methods] * Module#method_defined?, Module#private_method_defined?, and Module#protected_method_defined? now accept the second parameter as optional. If it is +true+ (the default value), it checks ancestor modules/classes, or checks only the class itself. [Feature #14944] -NameError:: +[NameError] - New option:: + [New option] * NameError.new accepts a +:receiver+ option to set receiver in Ruby code. [Feature #14313] -NilClass:: +[NilClass] - New method:: + [New method] * NilClass#=~ is added for compatibility. [Feature #15231] -NoMethodError:: +[NoMethodError] - New option:: + [New option] * NoMethodError.new accepts a +:receiver+ option to set receiver in Ruby code. [Feature #14313] -Numeric:: +[Numeric] - Incompatible changes:: + [Incompatible changes] * Numeric#step now returns an instance of the Enumerator::ArithmeticSequence class rather than one of the Enumerator class. -OpenStruct:: +[OpenStruct] - Modified method:: + [Modified method] * OpenStruct#to_h now accepts a block that maps keys and values to new keys and values. 
[Feature #15143] -Proc:: +[Proc] - New methods:: + [New methods] * Added Proc#<< and Proc#>> for Proc composition. [Feature #6284] - Incompatible changes:: + [Incompatible changes] * Proc#call doesn't change <code>$SAFE</code> any more. [Feature #14250] -Random:: +[Random] - New method:: + [New method] * Added Random.bytes. [Feature #4938] -Range:: +[Range] - New method:: + [New method] * Added Range#% instance method. [Feature #14697] - Incompatible changes:: + [Incompatible changes] * Range#=== now uses the +#cover?+ instead of the +#include?+ method. [Feature #14575] * Range#cover? now accepts a Range object. [Feature #14473] * Range#step now returns an instance of the Enumerator::ArithmeticSequence class rather than one of the Enumerator class. -Regexp/String:: +[Regexp/String] * Update Unicode version from 10.0.0 to 11.0.0. [Feature #14802] @@ -278,9 +278,9 @@ Regexp/String:: * Update Emoji version from 5.0 to 11.0.0 [Feature #14802] -RubyVM::AbstractSyntaxTree:: +[RubyVM::AbstractSyntaxTree] - New methods:: + [New methods] * RubyVM::AbstractSyntaxTree.parse parses a given string and returns AST nodes. [experimental] @@ -291,46 +291,46 @@ RubyVM::AbstractSyntaxTree:: * RubyVM::AbstractSyntaxTree.of returns AST nodes of the given proc or method. [experimental] -RubyVM:: +[RubyVM] - New method:: + [New method] * RubyVM.resolve_feature_path identifies the file that will be loaded by "require(feature)". [experimental] [Feature #15230] -String:: +[String] * String#crypt is now deprecated. [Feature #14915] - New features:: + [New features] * String#split yields each substring to the block if given. [Feature #4780] -Struct:: +[Struct] - Modified method:: + [Modified method] * Struct#to_h now accepts a block that maps keys and values to new keys and values. [Feature #15143] - Aliased method:: + [Aliased method] * Struct#filter is a new alias for Struct#select. 
[Feature #13784] -Time:: +[Time] - New features:: + [New features] * Time.new and Time#getlocal accept a timezone object as well as a UTC offset string. Time#+, Time#-, and Time#succ also preserve the timezone. [Feature #14850] -TracePoint:: +[TracePoint] - New features:: + [New features] * "script_compiled" event is supported. [Feature #15287] - New methods:: + [New methods] * TracePoint#parameters [Feature #14694] @@ -338,23 +338,23 @@ TracePoint:: * TracePoint#eval_script [Feature #15287] - Modified method:: + [Modified method] * TracePoint#enable accepts new keywords "target:" and "target_line:". [Feature #15289] === Stdlib updates (outstanding ones only) -BigDecimal:: +[BigDecimal] Update to version 1.4.0. This version includes several compatibility issues, see Compatibility issues section below for details. - Modified method:: + [Modified method] * BigDecimal() accepts the new keyword "exception:" similar to Float(). - Note for the differences among recent versions:: + [Note for the differences among recent versions] You should want to know the differences among recent versions of bigdecimal. Please select the suitable version of bigdecimal according to the following @@ -371,13 +371,13 @@ BigDecimal:: * 2.0.0 will be released soon after releasing Ruby 2.6.0. This version will not have the BigDecimal.new method. -Bundler:: +[Bundler] * Add Bundler to Standard Library. [Feature #12733] * Use 1.17.2, the latest stable version. -Coverage:: +[Coverage] A oneshot_lines mode is added. [Feature #15022] @@ -386,7 +386,7 @@ Coverage:: A hook for each line is fired at most once, and after it is fired the hook flag is removed, i.e., it runs with zero overhead. - New options:: + [New options] * Add +:oneshot_lines+ keyword argument to Coverage.start. @@ -394,20 +394,20 @@ Coverage:: If +clear+ is true, it clears the counters to zero. If +stop+ is true, it disables coverage measurement. 
- New methods:: + [New methods] * Coverage.line_stub, which is a simple helper function that creates the "stub" of line coverage from a given source code. -CSV:: +[CSV] * Upgrade to 3.0.2. This includes performance improvements especially for writing. Writing is about 2 times faster. See https://github.com/ruby/csv/blob/master/NEWS.md. -ERB:: +[ERB] - New options:: + [New options] * Add +:trim_mode+ and +:eoutvar+ keyword arguments to ERB.new. Now non-keyword arguments other than the first one are softly deprecated @@ -416,15 +416,15 @@ ERB:: * erb command's <tt>-S</tt> option is deprecated, and will be removed in the next version. -FileUtils:: +[FileUtils] - New methods:: + [New methods] * FileUtils#cp_lr. [Feature #4189] -Matrix:: +[Matrix] - New methods:: + [New methods] * Matrix#antisymmetric?, Matrix#skew_symmetric? @@ -436,30 +436,30 @@ Matrix:: * Vector#[]= -Net:: +[Net] - New options:: + [New options] * Add +:write_timeout+ keyword argument to Net::HTTP.new. [Feature #13396] - New methods:: + [New methods] * Add Net::HTTP#write_timeout and Net::HTTP#write_timeout=. [Feature #13396] - New constant:: + [New constant] * Add Net::HTTPClientException to deprecate Net::HTTPServerException, whose name is misleading. [Bug #14688] -NKF:: +[NKF] * Upgrade to nkf v2.1.5 -Psych:: +[Psych] * Upgrade to Psych 3.1.0 -RDoc:: +[RDoc] * Become about 2 times faster. @@ -478,12 +478,12 @@ RDoc:: * Fix many parsing bugs. -REXML:: +[REXML] * Upgrade to REXML 3.1.9. See https://github.com/ruby/rexml/blob/master/NEWS.md. - Improved some XPath implementations:: + [Improved some XPath implementations] * <code>concat()</code> function: Stringify all arguments before concatenating. @@ -493,7 +493,7 @@ REXML:: * Support <code>"*:#{ELEMENT_NAME}"</code> syntax in XPath 2.0. 
- Fixed some XPath implementations:: + [Fixed some XPath implementations] * <code>"//#{ELEMENT_NAME}[#{POSITION}]"</code> case @@ -517,14 +517,14 @@ REXML:: * <code>"name(#{NODE_SET})"</code> case -RSS:: +[RSS] - New options:: + [New options] * RSS::Parser.parse now accepts options as Hash. +:validate+ , +:ignore_unknown_element+ , +:parser_class+ options are available. -RubyGems:: +[RubyGems] * Upgrade to RubyGems 3.0.1 @@ -532,32 +532,32 @@ RubyGems:: * https://blog.rubygems.org/2018/12/23/3.0.1-released.html -Set:: +[Set] - Aliased method:: + [Aliased method] * Set#filter! is a new alias for Set#select!. [Feature #13784] -URI:: +[URI] - New constant:: + [New constant] * Add URI::File to handle the file URI scheme. [Feature #14035] === Compatibility issues (excluding feature bug fixes) -Dir:: +[Dir] * Dir.glob with <code>'\0'</code>-separated pattern list will be deprecated, and is now warned. [Feature #14643] -File:: +[File] * File.read, File.binread, File.write, File.binwrite, File.foreach, and File.readlines do not invoke external commands even if the path starts with the pipe character <code>'|'</code>. [Feature #14245] -Object:: +[Object] * Object#=~ is deprecated. [Feature #15231] @@ -580,7 +580,7 @@ Object:: * thwait * tracer -BigDecimal:: +[BigDecimal] * The following methods are removed. @@ -595,7 +595,7 @@ BigDecimal:: * BigDecimal.new will be removed in version 2.0. -Pathname:: +[Pathname] * Pathname#read, Pathname#binread, Pathname#write, Pathname#binwrite, Pathname#each_line and Pathname#readlines do not invoke external @@ -650,12 +650,12 @@ Pathname:: in their names. This eliminates the burden of each teeny upgrade on the platform that users need to rebuild every extension library. 
- Before:: + [Before] * libruby.2.6.0.dylib * libruby.2.6.dylib -> libruby.2.6.0.dylib * libruby.dylib -> libruby.2.6.0.dylib - After:: + [After] * libruby.2.6.dylib * libruby.dylib -> libruby.2.6.dylib diff --git a/doc/NEWS/NEWS-2.7.0 b/doc/NEWS/NEWS-2.7.0 new file mode 100644 index 0000000000..7607a473de --- /dev/null +++ b/doc/NEWS/NEWS-2.7.0 @@ -0,0 +1,845 @@ +# -*- rdoc -*- + += NEWS for Ruby 2.7.0 + +This document is a list of user visible feature changes made between +releases except for bug fixes. + +Note that each entry is kept so brief that no reason behind or reference +information is supplied with. For a full list of changes with all +sufficient information, see the ChangeLog file or Redmine +(e.g. <tt>https://bugs.ruby-lang.org/issues/$FEATURE_OR_BUG_NUMBER</tt>). + +== Changes since the 2.6.0 release + +=== Language changes + +==== Pattern matching + +* Pattern matching is introduced as an experimental feature. [Feature #14912] + + case [0, [1, 2, 3]] + in [a, [b, *c]] + p a #=> 0 + p b #=> 1 + p c #=> [2, 3] + end + + case {a: 0, b: 1} + in {a: 0, x: 1} + :unreachable + in {a: 0, b: var} + p var #=> 1 + end + + case -1 + in 0 then :unreachable + in 1 then :unreachable + end #=> NoMatchingPatternError + + json = <<END + { + "name": "Alice", + "age": 30, + "children": [{ "name": "Bob", "age": 2 }] + } + END + + JSON.parse(json, symbolize_names: true) in {name: "Alice", children: [{name: name, age: age}]} + + p name #=> "Bob" + p age #=> 2 + + JSON.parse(json, symbolize_names: true) in {name: "Alice", children: [{name: "Charlie", age: age}]} + #=> NoMatchingPatternError + +* See the following slides for more details: + * https://speakerdeck.com/k_tsj/pattern-matching-new-feature-in-ruby-2-dot-7 + * Note that the slides are slightly obsolete. + +* The warning against pattern matching can be suppressed with + {-W:no-experimental option}[#label-Warning+option]. 
+ +==== The spec of keyword arguments is changed towards 3.0 + +* Automatic conversion of keyword arguments and positional arguments is + deprecated, and conversion will be removed in Ruby 3. [Feature #14183] + + * When a method call passes a Hash at the last argument, and when it + passes no keywords, and when the called method accepts keywords, + a warning is emitted. To continue treating the hash as keywords, + add a double splat operator to avoid the warning and ensure + correct behavior in Ruby 3. + + def foo(key: 42); end; foo({key: 42}) # warned + def foo(**kw); end; foo({key: 42}) # warned + def foo(key: 42); end; foo(**{key: 42}) # OK + def foo(**kw); end; foo(**{key: 42}) # OK + + * When a method call passes keywords to a method that accepts keywords, + but it does not pass enough required positional arguments, the + keywords are treated as a final required positional argument, and a + warning is emitted. Pass the argument as a hash instead of keywords + to avoid the warning and ensure correct behavior in Ruby 3. + + def foo(h, **kw); end; foo(key: 42) # warned + def foo(h, key: 42); end; foo(key: 42) # warned + def foo(h, **kw); end; foo({key: 42}) # OK + def foo(h, key: 42); end; foo({key: 42}) # OK + + * When a method accepts specific keywords but not a keyword splat, and + a hash or keywords splat is passed to the method that includes both + Symbol and non-Symbol keys, the hash will continue to be split, and + a warning will be emitted. You will need to update the calling code + to pass separate hashes to ensure correct behavior in Ruby 3. + + def foo(h={}, key: 42); end; foo("key" => 43, key: 42) # warned + def foo(h={}, key: 42); end; foo({"key" => 43, key: 42}) # warned + def foo(h={}, key: 42); end; foo({"key" => 43}, key: 42) # OK + + * If a method does not accept keywords, and is called with keywords, + the keywords are still treated as a positional hash, with no warning. + This behavior will continue to work in Ruby 3. 
+ + def foo(opt={}); end; foo( key: 42 ) # OK + +* Non-symbols are allowed as keyword argument keys if the method accepts + arbitrary keywords. [Feature #14183] + + * Non-Symbol keys in a keyword arguments hash were prohibited in 2.6.0, + but are now allowed again. [Bug #15658] + + def foo(**kw); p kw; end; foo("str" => 1) #=> {"str"=>1} + +* <code>**nil</code> is allowed in method definitions to explicitly mark + that the method accepts no keywords. Calling such a method with keywords + will result in an ArgumentError. [Feature #14183] + + def foo(h, **nil); end; foo(key: 1) # ArgumentError + def foo(h, **nil); end; foo(**{key: 1}) # ArgumentError + def foo(h, **nil); end; foo("str" => 1) # ArgumentError + def foo(h, **nil); end; foo({key: 1}) # OK + def foo(h, **nil); end; foo({"str" => 1}) # OK + +* Passing an empty keyword splat to a method that does not accept keywords + no longer passes an empty hash, unless the empty hash is necessary for + a required parameter, in which case a warning will be emitted. Remove + the double splat to continue passing a positional hash. [Feature #14183] + + h = {}; def foo(*a) a end; foo(**h) # [] + h = {}; def foo(a) a end; foo(**h) # {} and warning + h = {}; def foo(*a) a end; foo(h) # [{}] + h = {}; def foo(a) a end; foo(h) # {} + +* Above warnings can be suppressed also with {-W:no-deprecated option}[#label-Warning+option]. + +==== Numbered parameters + +* Numbered parameters as default block parameters are introduced. [Feature #4475] + + [1, 2, 10].map { _1.to_s(16) } #=> ["1", "2", "a"] + [[1, 2], [3, 4]].map { _1 + _2 } #=> [3, 7] + + You can still define a local variable named +_1+ and so on, + and that is honored when present, but renders a warning. 
+ + _1 = 0 #=> warning: `_1' is reserved for numbered parameter; consider another name + [1].each { p _1 } # prints 0 instead of 1 + +==== proc/lambda without block is deprecated + +* Proc.new and Kernel#proc with no block in a method called with a block will + now display a warning. + + def foo + proc + end + foo { puts "Hello" } #=> warning: Capturing the given block using Kernel#proc is deprecated; use `&block` instead + + This warning can be suppressed with {-W:no-deprecated option}[#label-Warning+option]. + +* Kernel#lambda with no block in a method called with a block raises an exception. + + def bar + lambda + end + bar { puts "Hello" } #=> tried to create Proc object without a block (ArgumentError) + +==== Other miscellaneous changes + +* A beginless range is experimentally introduced. It might be useful + in +case+, new call-sequence of the <code>Comparable#clamp</code>, + constants and DSLs. [Feature #14799] + + ary[..3] # identical to ary[0..3] + + case RUBY_VERSION + when ..."2.4" then puts "EOL" + # ... + end + + age.clamp(..100) + + where(sales: ..100) + +* Setting <code>$;</code> to a non-nil value will now display a warning. [Feature #14240] + This includes the usage in String#split. + This warning can be suppressed with {-W:no-deprecated option}[#label-Warning+option]. + +* Setting <code>$,</code> to a non-nil value will now display a warning. [Feature #14240] + This includes the usage in Array#join. + This warning can be suppressed with {-W:no-deprecated option}[#label-Warning+option]. + +* Quoted here-document identifiers must end within the same line. + + <<"EOS + " # This had been warned since 2.4; Now it raises a SyntaxError + EOS + +* The flip-flop syntax deprecation is reverted. [Feature #5400] + +* Comment lines can be placed between fluent dot now. + + foo + # .bar + .baz # => foo.baz + +* Calling a private method with a literal +self+ as the receiver + is now allowed. 
[Feature #11297] [Feature #16123] + +* Modifier rescue now operates the same for multiple assignment as single + assignment. [Bug #8279] + + a, b = raise rescue [1, 2] + # Previously parsed as: (a, b = raise) rescue [1, 2] + # Now parsed as: a, b = (raise rescue [1, 2]) + +* +yield+ in singleton class syntax will now display a warning. This behavior + will soon be deprecated. [Feature #15575]. + + def foo + class << Object.new + yield #=> warning: `yield' in class syntax will not be supported from Ruby 3.0. [Feature #15575] + end + end + foo { p :ok } + + This warning can be suppressed with {-W:no-deprecated option}[#label-Warning+option]. + +* Argument forwarding by <code>(...)</code> is introduced. [Feature #16253] + + def foo(...) + bar(...) + end + + All arguments to +foo+ are forwarded to +bar+, including keyword and + block arguments. + Note that the parentheses are mandatory. <code>bar ...</code> is parsed + as an endless range. + +* Access and setting of <code>$SAFE</code> will now always display a warning. + <code>$SAFE</code> will become a normal global variable in Ruby 3.0. [Feature #16131] + +* <code>Object#{taint,untaint,trust,untrust}</code> and related functions in the C-API + no longer have an effect (all objects are always considered untainted), and will now + display a warning in verbose mode. This warning will be disabled even in non-verbose mode in + Ruby 3.0, and the methods and C functions will be removed in Ruby 3.2. [Feature #16131] + +* Refinements take place at Object#method and Module#instance_method. [Feature #15373] + +=== Command line options + +==== Warning option + +The +-W+ option has been extended with a following +:+, to manage categorized +warnings. 
[Feature #16345] [Feature #16420] + +* To suppress deprecation warnings: + + $ ruby -e '$; = ""' + -e:1: warning: `$;' is deprecated + + $ ruby -W:no-deprecated -e '$; = //' + +* It works with the +RUBYOPT+ environment variable: + + $ RUBYOPT=-W:no-deprecated ruby -e '$; = //' + +* To suppress experimental feature warnings: + + $ ruby -e '0 in a' + -e:1: warning: Pattern matching is experimental, and the behavior may change in future versions of Ruby! + + $ ruby -W:no-experimental -e '0 in a' + +* To suppress both by using +RUBYOPT+, set space separated values: + + $ RUBYOPT='-W:no-deprecated -W:no-experimental' ruby -e '($; = "") in a' + +See also Warning in {Core classes updates}[#label-Core+classes+updates+-28outstanding+ones+only-29]. + +=== Core classes updates (outstanding ones only) + +[Array] + + [New methods] + + * Added Array#intersection. [Feature #16155] + + * Added Array#minmax, with a faster implementation than Enumerable#minmax. [Bug #15929] + +[Comparable] + + [Modified method] + + * Comparable#clamp now accepts a Range argument. [Feature #14784] + + -1.clamp(0..2) #=> 0 + 1.clamp(0..2) #=> 1 + 3.clamp(0..2) #=> 2 + # With beginless and endless ranges: + -1.clamp(0..) #=> 0 + 3.clamp(..2) #=> 2 + + +[Complex] + + [New method] + + * Added Complex#<=>. + So <code>0 <=> 0i</code> will not raise NoMethodError. [Bug #15857] + +[Dir] + + [Modified methods] + + * Dir.glob and Dir.[] no longer allow NUL-separated glob pattern. + Use Array instead. [Feature #14643] + +[Encoding] + + [New encoding] + + * Added new encoding CESU-8. [Feature #15931] + +[Enumerable] + + [New methods] + + * Added Enumerable#filter_map. [Feature #15323] + + [1, 2, 3].filter_map {|x| x.odd? ? x.to_s : nil } #=> ["1", "3"] + + * Added Enumerable#tally. [Feature #11076] + + ["A", "B", "C", "B", "A"].tally #=> {"A"=>2, "B"=>2, "C"=>1} + +[Enumerator] + + [New methods] + + * Added Enumerator.produce to generate an Enumerator from any custom + data transformation. 
[Feature #14781] + + require "date" + dates = Enumerator.produce(Date.today, &:succ) #=> infinite sequence of dates + dates.detect(&:tuesday?) #=> next Tuesday + + * Added Enumerator::Lazy#eager that generates a non-lazy enumerator + from a lazy enumerator. [Feature #15901] + + a = %w(foo bar baz) + e = a.lazy.map {|x| x.upcase }.map {|x| x + "!" }.eager + p e.class #=> Enumerator + p e.map {|x| x + "?" } #=> ["FOO!?", "BAR!?", "BAZ!?"] + + * Added Enumerator::Yielder#to_proc so that a Yielder object + can be directly passed to another method as a block + argument. [Feature #15618] + + * Added Enumerator::Lazy#with_index be lazy + Previously, Enumerator::Lazy#with_index was not defined, so it + picked up the default implementation from Enumerator, which was + not lazy. [Bug #7877] + + ("a"..).lazy.with_index(1) { |it, index| puts "#{index}:#{it}" }.take(3).force + # => 1:a + # 2:b + # 3:c + +[Fiber] + + [New method] + + * Added Fiber#raise that behaves like Fiber#resume but raises an + exception on the resumed fiber. [Feature #10344] + +[File] + + [New method] + + * Added File.absolute_path? to check whether a path is absolute or + not in a portable way. [Feature #15868] + + File.absolute_path?("/foo") # => true (on *nix) + File.absolute_path?("C:/foo") # => true (on Windows) + File.absolute_path?("foo") # => false + + [Modified method] + + * File.extname now returns a dot string for names ending with a dot on + non-Windows platforms. [Bug #15267] + + File.extname("foo.") #=> "." + +[FrozenError] + + [New method] + + * Added FrozenError#receiver to return the frozen object on which + modification was attempted. To set this object when raising + FrozenError in Ruby code, FrozenError.new accepts a +:receiver+ + option. [Feature #15751] + +[GC] + + [New method] + + * Added GC.compact method for compacting the heap. + This function compacts live objects in the heap so that fewer pages may + be used, and the heap may be more CoW (copy-on-write) friendly. 
[Feature #15626] + + Details on the algorithm and caveats can be found here: + https://bugs.ruby-lang.org/issues/15626 + +[IO] + + [New method] + + * Added IO#set_encoding_by_bom to check the BOM and set the external + encoding. [Bug #15210] + +[Integer] + + [Modified method] + + * Integer#[] now supports range operations. [Feature #8842] + + 0b01001101[2, 4] #=> 0b0011 + 0b01001100[2..5] #=> 0b0011 + 0b01001100[2...6] #=> 0b0011 + # ^^^^ + +[Method] + + [Modified method] + + * Method#inspect shows more information. [Feature #14145] + +[Module] + + [New methods] + + * Added Module#const_source_location to retrieve the location where a + constant is defined. [Feature #10771] + + * Added Module#ruby2_keywords for marking a method as passing keyword + arguments through a regular argument splat, useful when delegating + all arguments to another method in a way that can be backwards + compatible with older Ruby versions. [Bug #16154] + + [Modified methods] + + * Module#autoload? now takes an +inherit+ optional argument, like + Module#const_defined?. [Feature #15777] + + * Module#name now always returns a frozen String. The returned String is + always the same for a given Module. This change is + experimental. [Feature #16150] + +[NilClass / TrueClass / FalseClass] + + [Modified methods] + + * NilClass#to_s, TrueClass#to_s, and FalseClass#to_s now always return a + frozen String. The returned String is always the same for each of these + values. This change is experimental. [Feature #16150] + +[ObjectSpace::WeakMap] + + [Modified method] + + * ObjectSpace::WeakMap#[]= now accepts special objects as either key or + values. [Feature #16035] + +[Proc] + + [New method] + + * Added Proc#ruby2_keywords for marking the proc as passing keyword + arguments through a regular argument splat, useful when delegating + all arguments to another method or proc in a way that can be backwards + compatible with older Ruby versions. 
[Feature #16404] + +[Range] + + [New method] + + * Added Range#minmax, with a faster implementation than Enumerable#minmax. + It returns a maximum that now corresponds to Range#max. [Bug #15807] + + [Modified method] + + * Range#=== now uses Range#cover? for String arguments, too (in Ruby 2.6, it was + changed from Range#include? for all types except strings). [Bug #15449] + + +[RubyVM] + + [Removed method] + + * +RubyVM.resolve_feature_path+ moved to + <code>$LOAD_PATH.resolve_feature_path</code>. [Feature #15903] [Feature #15230] + +[String] + + [Unicode] + + * Update Unicode version and Emoji version from 11.0.0 to + 12.0.0. [Feature #15321] + + * Update Unicode version to 12.1.0, adding support for + U+32FF SQUARE ERA NAME REIWA. [Feature #15195] + + * Update Unicode Emoji version to 12.1. [Feature #16272] + +[Symbol] + + [New methods] + + * Added Symbol#start_with? and Symbol#end_with? methods. [Feature #16348] + +[Time] + + [New methods] + + * Added Time#ceil method. [Feature #15772] + + * Added Time#floor method. [Feature #15653] + + [Modified method] + + * Time#inspect is separated from Time#to_s and it shows + the time's sub second. [Feature #15958] + +[UnboundMethod] + + [New method] + + * Added UnboundMethod#bind_call method. [Feature #15955] + + <code>umethod.bind_call(obj, ...)</code> is semantically equivalent + to <code>umethod.bind(obj).call(...)</code>. This idiom is used in + some libraries to call a method that is overridden. The added + method does the same without allocation of an intermediate Method + object. + + class Foo + def add_1(x) + x + 1 + end + end + class Bar < Foo + def add_1(x) # override + x + 2 + end + end + + obj = Bar.new + p obj.add_1(1) #=> 3 + p Foo.instance_method(:add_1).bind(obj).call(1) #=> 2 + p Foo.instance_method(:add_1).bind_call(obj, 1) #=> 2 + +[Warning] + + [New methods] + + * Added Warning.[] and Warning.[]= to manage emitting/suppressing + some categories of warnings. 
[Feature #16345] [Feature #16420] + +[$LOAD_PATH] + + [New method] + + * Added <code>$LOAD_PATH.resolve_feature_path</code>. [Feature #15903] [Feature #15230] + +=== Stdlib updates (outstanding ones only) + +[Bundler] + + * Upgrade to Bundler 2.1.2. + See https://github.com/bundler/bundler/releases/tag/v2.1.2 + +[CGI] + + * CGI.escapeHTML becomes 2~5x faster when there is at least one escaped character. + See https://github.com/ruby/ruby/pull/2226 + +[CSV] + + * Upgrade to 3.1.2. + See https://github.com/ruby/csv/blob/master/NEWS.md. + +[Date] + + * Date.jisx0301, Date#jisx0301, and Date.parse support the new Japanese + era. [Feature #15742] + +[Delegator] + + * Object#DelegateClass accepts a block and module_evals it in the context + of the returned class, similar to Class.new and Struct.new. + +[ERB] + + * Prohibit marshaling ERB instance. + +[IRB] + + * Introduce syntax highlighting inspired by the Pry gem to Binding#irb + source lines, REPL input, and inspect output of some core-class objects. + + * Introduce multiline editing mode provided by Reline. + + * Show documentation when completion. + + * Enable auto indent and save/load history by default. + +[JSON] + + * Upgrade to 2.3.0. + +[Net::FTP] + + * Add Net::FTP#features to check available features, and Net::FTP#option to + enable/disable each of them. [Feature #15964] + +[Net::HTTP] + + * Add +ipaddr+ optional parameter to Net::HTTP#start to replace the address for + the TCP/IP connection. [Feature #5180] + +[Net::IMAP] + + * Add Server Name Indication (SNI) support. [Feature #15594] + +[open-uri] + + * Warn open-uri's "open" method at Kernel. + Use URI.open instead. [Misc #15893] + + * The default charset of "text/*" media type is UTF-8 instead of + ISO-8859-1. [Bug #15933] + +[OptionParser] + + * Now show "Did you mean?" for unknown options. 
[Feature #16256] + + test.rb: + + require "optparse" + OptionParser.new do |opts| + opts.on("-f", "--foo", "foo") {|v| } + opts.on("-b", "--bar", "bar") {|v| } + opts.on("-c", "--baz", "baz") {|v| } + end.parse! + + example: + + $ ruby test.rb --baa + Traceback (most recent call last): + test.rb:7:in `<main>': invalid option: --baa (OptionParser::InvalidOption) + Did you mean? baz + bar + +[Pathname] + + * Pathname.glob now delegates 3 arguments to Dir.glob + to accept +base+ keyword. [Feature #14405] + +[Racc] + + * Merge 1.4.15 from upstream repository and added cli of racc. + +[Reline] + + * New stdlib that is compatible with the readline stdlib but is + implemented in pure Ruby. It also provides a multiline editing mode. + +[REXML] + + * Upgrade to 3.2.3. + See https://github.com/ruby/rexml/blob/master/NEWS.md. + +[RSS] + + * Upgrade to RSS 0.2.8. + See https://github.com/ruby/rss/blob/master/NEWS.md. + +[RubyGems] + + * Upgrade to RubyGems 3.1.2. + * https://github.com/rubygems/rubygems/releases/tag/v3.1.0 + * https://github.com/rubygems/rubygems/releases/tag/v3.1.1 + * https://github.com/rubygems/rubygems/releases/tag/v3.1.2 + +[StringScanner] + + * Upgrade to 1.0.3. + See https://github.com/ruby/strscan/blob/master/NEWS.md. + +=== Compatibility issues (excluding feature bug fixes) + +* The following libraries are no longer bundled gems. + Install corresponding gems to use these features. + * CMath (cmath gem) + * Scanf (scanf gem) + * Shell (shell gem) + * Synchronizer (sync gem) + * ThreadsWait (thwait gem) + * E2MM (e2mmap gem) + +[Proc] + * The Proc#to_s format was changed. [Feature #16101] + +[Range] + * Range#minmax used to iterate on the range to determine the maximum. + It now uses the same algorithm as Range#max. In rare cases (e.g. + ranges of Floats or Strings), this may yield different results. 
[Bug #15807] + +=== Stdlib compatibility issues (excluding feature bug fixes) + +* Promote stdlib to default gems + * The following default gems were published on rubygems.org + * benchmark + * cgi + * delegate + * getoptlong + * net-pop + * net-smtp + * open3 + * pstore + * readline + * readline-ext + * singleton + * The following default gems were only promoted at ruby-core, + but not yet published on rubygems.org. + * monitor + * observer + * timeout + * tracer + * uri + * yaml +* The <tt>did_you_mean</tt> gem has been promoted up to a default gem from a bundled gem + +[pathname] + + * Kernel#Pathname when called with a Pathname argument now returns + the argument instead of creating a new Pathname. This is more + similar to other Kernel methods, but can break code that modifies + the return value and expects the argument not to be modified. + +[profile.rb, Profiler__] + + * Removed from standard library. It was unmaintained since Ruby 2.0.0. + +=== C API updates + +* Many <code>*_kw</code> functions have been added for setting whether + the final argument being passed should be treated as keywords. You + may need to switch to these functions to avoid keyword argument + separation warnings, and to ensure correct behavior in Ruby 3. + +* The <code>:</code> character in rb_scan_args format string is now + treated as keyword arguments. Passing a positional hash instead of + keyword arguments will emit a deprecation warning. + +* C API declarations with +ANYARGS+ are changed not to use +ANYARGS+. + See https://github.com/ruby/ruby/pull/2404 + +=== Implementation improvements + +[Fiber] + + * Allow selecting different coroutine implementations by using + +--with-coroutine=+, e.g. + + $ ./configure --with-coroutine=ucontext + $ ./configure --with-coroutine=copy + + * Replace previous stack cache with fiber pool cache. The fiber pool + allocates many stacks in a single memory region. Stack allocation + becomes O(log N) and fiber creation is amortized O(1). 
Around 10x + performance improvement was measured in micro-benchmarks. + https://github.com/ruby/ruby/pull/2224 + +[File] + * File.realpath now uses realpath(3) on many platforms, which can + significantly improve performance. [Feature #15797] + +[Hash] + * Change data structure of small Hash objects. [Feature #15602] + +[Monitor] + * Monitor class is written in C-extension. [Feature #16255] + +[Thread] + + * VM stack memory allocation is now combined with native thread stack, + improving thread allocation performance and reducing allocation related + failures. Around 10x performance improvement was measured in micro-benchmarks. + +[JIT] + + * JIT-ed code is recompiled to less-optimized code when an optimization assumption is invalidated. + + * Method inlining is performed when a method is considered as pure. + This optimization is still experimental and many methods are NOT considered as pure yet. + + * The default value of +--jit-max-cache+ is changed from 1,000 to 100. + + * The default value of +--jit-min-calls+ is changed from 5 to 10,000. + +[RubyVM] + + * Per-call-site method cache, which has been there since around 1.9, was + improved: cache hit rate raised from 89% to 94%. + See https://github.com/ruby/ruby/pull/2583 + +[RubyVM::InstructionSequence] + + * RubyVM::InstructionSequence#to_binary method generates compiled binary. + The binary size is reduced. [Feature #16163] + +=== Miscellaneous changes + +* Support for IA64 architecture has been removed. Hardware for testing was + difficult to find, native fiber code is difficult to implement, and it added + non-trivial complexity to the interpreter. [Feature #15894] + +* Require compilers to support C99. [Misc #15347] + + * Details of our dialect: https://bugs.ruby-lang.org/projects/ruby-master/wiki/C99 + +* Ruby's upstream repository is changed from Subversion to Git. + + * https://git.ruby-lang.org/ruby.git + + * RUBY_REVISION class is changed from Integer to String. 
+ + * RUBY_DESCRIPTION includes Git revision instead of Subversion's one. + +* Support built-in methods in Ruby with the <code>_\_builtin_</code> syntax. [Feature #16254] + + Some methods are defined in *.rb (such as trace_point.rb). + For example, it is easy to define a method which accepts keyword arguments. diff --git a/doc/NEWS/NEWS-3.0.0.md b/doc/NEWS/NEWS-3.0.0.md new file mode 100644 index 0000000000..9fbaf504b4 --- /dev/null +++ b/doc/NEWS/NEWS-3.0.0.md @@ -0,0 +1,829 @@ +# NEWS for Ruby 3.0.0 + +This document is a list of user visible feature changes +since the **2.7.0** release, except for bug fixes. + +Note that each entry is kept to a minimum, see links for details. + +## Language changes + +* Keyword arguments are now separated from positional arguments. + Code that resulted in deprecation warnings in Ruby 2.7 will now + result in ArgumentError or different behavior. [[Feature #14183]] + +* Procs accepting a single rest argument and keywords are no longer + subject to autosplatting. This now matches the behavior of Procs + accepting a single rest argument and no keywords. + [[Feature #16166]] + + ```ruby + pr = proc{|*a, **kw| [a, kw]} + + pr.call([1]) + # 2.7 => [[1], {}] + # 3.0 => [[[1]], {}] + + pr.call([1, {a: 1}]) + # 2.7 => [[1], {:a=>1}] # and deprecation warning + # 3.0 => [[[1, {:a=>1}]], {}] + ``` + +* Arguments forwarding (`...`) now supports leading arguments. + [[Feature #16378]] + + ```ruby + def method_missing(meth, ...) + send(:"do_#{meth}", ...) + end + ``` + +* Pattern matching (`case/in`) is no longer experimental. [[Feature #17260]] + +* One-line pattern matching is redesigned. [EXPERIMENTAL] + + * `=>` is added. It can be used like a rightward assignment. + [[Feature #17260]] + + ```ruby + 0 => a + p a #=> 0 + + {b: 0, c: 1} => {b:} + p b #=> 0 + ``` + + * `in` is changed to return `true` or `false`. 
[[Feature #17371]] + + ```ruby + # version 3.0 + 0 in 1 #=> false + + # version 2.7 + 0 in 1 #=> raise NoMatchingPatternError + ``` + +* Find-pattern is added. [EXPERIMENTAL] + [[Feature #16828]] + + ```ruby + case ["a", 1, "b", "c", 2, "d", "e", "f", 3] + in [*pre, String => x, String => y, *post] + p pre #=> ["a", 1] + p x #=> "b" + p y #=> "c" + p post #=> [2, "d", "e", "f", 3] + end + ``` + +* Endless method definition is added. [EXPERIMENTAL] + [[Feature #16746]] + + ```ruby + def square(x) = x * x + ``` + +* Interpolated String literals are no longer frozen when + `# frozen-string-literal: true` is used. [[Feature #17104]] + +* Magic comment `shareable_constant_value` added to freeze constants. + See {Magic Comments}[rdoc-ref:syntax/comments.rdoc@Magic+Comments] for more details. + [[Feature #17273]] + +* A {static analysis}[rdoc-label:label-Static+analysis] foundation is + introduced. + * {RBS}[rdoc-label:label-RBS] is introduced. It is a type definition + language for Ruby programs. + * {TypeProf}[rdoc-label:label-TypeProf] is experimentally bundled. It is a + type analysis tool for Ruby programs. + +* Deprecation warnings are no longer shown by default (since Ruby 2.7.2). + Turn them on with `-W:deprecated` (or with `-w` to show other warnings too). + [[Feature #16345]] + +* `$SAFE` and `$KCODE` are now normal global variables with no special behavior. + C-API methods related to `$SAFE` have been removed. + [[Feature #16131]] [[Feature #17136]] + +* yield in singleton class definitions in methods is now a SyntaxError + instead of a warning. yield in a class definition outside of a method + is now a SyntaxError instead of a LocalJumpError. [[Feature #15575]] + +* When a class variable is overtaken by the same definition in an + ancestor class/module, a RuntimeError is now raised (previously, + it only issued a warning in verbose mode). Additionally, accessing a + class variable from the toplevel scope is now a RuntimeError. 
+ [[Bug #14541]] + +* Assigning to a numbered parameter is now a SyntaxError instead of + a warning. + +## Command line options + +### `--help` option + +When the environment variable `RUBY_PAGER` or `PAGER` is present and has +a non-empty value, and the standard input and output are tty, the `--help` +option shows the help message via the pager designated by the value. +[[Feature #16754]] + +### `--backtrace-limit` option + +The `--backtrace-limit` option limits the maximum length of a backtrace. +[[Feature #8661]] + +## Core classes updates + +Outstanding ones only. + +* Array + + * The following methods now return Array instances instead of + subclass instances when called on subclass instances: + [[Bug #6087]] + + * Array#drop + * Array#drop_while + * Array#flatten + * Array#slice! + * Array#slice / Array#[] + * Array#take + * Array#take_while + * Array#uniq + * Array#* + + * Can be sliced with Enumerator::ArithmeticSequence + + ```ruby + dirty_data = ['--', 'data1', '--', 'data2', '--', 'data3'] + dirty_data[(1..).step(2)] # take each second element + # => ["data1", "data2", "data3"] + ``` + +* Binding + + * Binding#eval when called with one argument will use `"(eval)"` + for `__FILE__` and `1` for `__LINE__` in the evaluated code. + [[Bug #4352]] [[Bug #17419]] + +* ConditionVariable + + * ConditionVariable#wait may now invoke the `block`/`unblock` scheduler + hooks in a non-blocking context. [[Feature #16786]] + +* Dir + + * Dir.glob and Dir.[] now sort the results by default, and + accept the `sort:` keyword option. [[Feature #8709]] + +* ENV + + * ENV.except has been added, which returns a hash excluding the + given keys and their values. [[Feature #15822]] + + * Windows: Read ENV names and values as UTF-8 encoded Strings + [[Feature #12650]] + +* Encoding + + * Added new encoding IBM720. 
[[Feature #16233]] + + * Changed default for Encoding.default_external to UTF-8 on Windows + [[Feature #16604]] + +* Fiber + + * Fiber.new(blocking: true/false) allows you to create non-blocking + execution contexts. [[Feature #16786]] + + * Fiber#blocking? tells whether the fiber is non-blocking. [[Feature #16786]] + + * Fiber#backtrace and Fiber#backtrace_locations provide per-fiber backtrace. + [[Feature #16815]] + + * The limitation of Fiber#transfer is relaxed. [[Bug #17221]] + +* GC + + * GC.auto_compact= and GC.auto_compact have been added to control + when compaction runs. Setting `auto_compact=` to `true` will cause + compaction to occur during major collections. At the moment, + compaction adds significant overhead to major collections, so please + test first! [[Feature #17176]] + +* Hash + + * Hash#transform_keys and Hash#transform_keys! now accept a hash that maps + keys to new keys. [[Feature #16274]] + + * Hash#except has been added, which returns a hash excluding the + given keys and their values. [[Feature #15822]] + +* IO + + * IO#nonblock? now defaults to `true`. [[Feature #16786]] + + * IO#wait_readable, IO#wait_writable, IO#read, IO#write and other + related methods (e.g. IO#puts, IO#gets) may invoke the scheduler hook + `#io_wait(io, events, timeout)` in a non-blocking execution context. + [[Feature #16786]] + +* Kernel + + * Kernel#clone when called with the `freeze: false` keyword will call + `#initialize_clone` with the `freeze: false` keyword. + [[Bug #14266]] + + * Kernel#clone when called with the `freeze: true` keyword will call + `#initialize_clone` with the `freeze: true` keyword, and will + return a frozen copy even if the receiver is unfrozen. + [[Feature #16175]] + + * Kernel#eval when called with two arguments will use `"(eval)"` + for `__FILE__` and `1` for `__LINE__` in the evaluated code. + [[Bug #4352]] + + * Kernel#lambda now warns if called without a literal block. 
+ [[Feature #15973]] + + * Kernel.sleep invokes the scheduler hook `#kernel_sleep(...)` in a + non-blocking execution context. [[Feature #16786]] + +* Module + + * Module#include and Module#prepend now affect classes and modules + that have already included or prepended the receiver, mirroring the + behavior if the arguments were included in the receiver before + the other modules and classes included or prepended the receiver. + [[Feature #9573]] + + ```ruby + class C; end + module M1; end + module M2; end + C.include M1 + M1.include M2 + p C.ancestors #=> [C, M1, M2, Object, Kernel, BasicObject] + ``` + + * Module#public, Module#protected, Module#private, Module#public_class_method, + Module#private_class_method, toplevel "private" and "public" methods + now accept single array argument with a list of method names. [[Feature #17314]] + + * Module#attr_accessor, Module#attr_reader, Module#attr_writer and Module#attr + methods now return an array of defined method names as symbols. + [[Feature #17314]] + + * Module#alias_method now returns the defined alias as a symbol. + [[Feature #17314]] + +* Mutex + + * `Mutex` is now acquired per-`Fiber` instead of per-`Thread`. This change + should be compatible for essentially all usages and avoids blocking when + using a scheduler. [[Feature #16792]] + +* Proc + + * Proc#== and Proc#eql? are now defined and will return true for + separate Proc instances if the procs were created from the same block. + [[Feature #14267]] + +* Queue / SizedQueue + + * Queue#pop, SizedQueue#push and related methods may now invoke the + `block`/`unblock` scheduler hooks in a non-blocking context. + [[Feature #16786]] + +* Ractor + + * New class added to enable parallel execution. See rdoc-ref:ractor.md for + more details. + +* Random + + * `Random::DEFAULT` now refers to the `Random` class instead of being a `Random` instance, + so it can work with `Ractor`. 
+ [[Feature #17322]] + + * `Random::DEFAULT` is deprecated since its value is now confusing and it is no longer global, + use `Kernel.rand`/`Random.rand` directly, or create a `Random` instance with `Random.new` instead. + [[Feature #17351]] + + +* String + + * The following methods now return or yield String instances + instead of subclass instances when called on subclass instances: + [[Bug #10845]] + + * String#* + * String#capitalize + * String#center + * String#chomp + * String#chop + * String#delete + * String#delete_prefix + * String#delete_suffix + * String#downcase + * String#dump + * String#each_char + * String#each_grapheme_cluster + * String#each_line + * String#gsub + * String#ljust + * String#lstrip + * String#partition + * String#reverse + * String#rjust + * String#rpartition + * String#rstrip + * String#scrub + * String#slice! + * String#slice / String#[] + * String#split + * String#squeeze + * String#strip + * String#sub + * String#succ / String#next + * String#swapcase + * String#tr + * String#tr_s + * String#upcase + +* Symbol + + * Symbol#to_proc now returns a lambda Proc. [[Feature #16260]] + + * Symbol#name has been added, which returns the name of the symbol + if it is named. The returned string is frozen. [[Feature #16150]] + +* Fiber + + * Introduce Fiber.set_scheduler for intercepting blocking operations and + Fiber.scheduler for accessing the current scheduler. See + rdoc-ref:fiber.md for more details about what operations are supported and + how to implement the scheduler hooks. [[Feature #16786]] + + * Fiber.blocking? tells whether the current execution context is + blocking. [[Feature #16786]] + +* Thread + + * Thread#join invokes the scheduler hooks `block`/`unblock` in a + non-blocking execution context. [[Feature #16786]] + + * Thread.ignore_deadlock accessor has been added for disabling the + default deadlock detection, allowing the use of signal handlers to + break deadlock. 
[[Bug #13768]] + +* Warning + + * Warning#warn now supports a category keyword argument. + [[Feature #17122]] + +## Stdlib updates + +Outstanding ones only. + +* BigDecimal + + * Update to BigDecimal 3.0.0 + + * This version is Ractor compatible. + +* Bundler + + * Update to Bundler 2.2.3 + +* CGI + + * Update to 0.2.0 + + * This version is Ractor compatible. + +* CSV + + * Update to CSV 3.1.9 + +* Date + + * Update to Date 3.1.1 + + * This version is Ractor compatible. + +* Digest + + * Update to Digest 3.0.0 + + * This version is Ractor compatible. + +* Etc + + * Update to Etc 1.2.0 + + * This version is Ractor compatible. + +* Fiddle + + * Update to Fiddle 1.0.5 + +* IRB + + * Update to IRB 1.2.6 + +* JSON + + * Update to JSON 2.5.0 + + * This version is Ractor compatible. + +* Set + + * Update to set 1.0.0 + + * SortedSet has been removed for dependency and performance reasons. + + * Set#join is added as a shorthand for `.to_a.join`. + + * Set#<=> is added. + +* Socket + + * Add :connect_timeout to TCPSocket.new [[Feature #17187]] + +* Net::HTTP + + * Net::HTTP#verify_hostname= and Net::HTTP#verify_hostname have been + added to skip hostname verification. [[Feature #16555]] + + * Net::HTTP.get, Net::HTTP.get_response, and Net::HTTP.get_print + can take the request headers as a Hash in the second argument when the + first argument is a URI. [[Feature #16686]] + +* Net::SMTP + + * Add SNI support. + + * Net::SMTP.start arguments are keyword arguments. + + * TLS should not check the host name by default. + +* OpenStruct + + * Initialization is no longer lazy. [[Bug #12136]] + + * Builtin methods can now be overridden safely. [[Bug #15409]] + + * Implementation uses only methods ending with `!`. + + * Ractor compatible. + + * Improved support for YAML. [[Bug #8382]] + + * Use officially discouraged. Read OpenStruct@Caveats section. + +* Pathname + + * Ractor compatible. + +* Psych + + * Update to Psych 3.3.0 + + * This version is Ractor compatible. 
+ +* Reline + + * Update to Reline 0.1.5 + +* RubyGems + + * Update to RubyGems 3.2.3 + +* StringIO + + * Update to StringIO 3.0.0 + + * This version is Ractor compatible. + +* StringScanner + + * Update to StringScanner 3.0.0 + + * This version is Ractor compatible. + +* URI + + * URI.escape and URI.unescape have been removed. + Instead, use the following methods depending on your specific use case. + + * CGI.escape + * URI.encode_www_form + * URI.encode_www_form_component + * CGI.unescape + * URI.decode_www_form + * URI.decode_www_form_component + +## Compatibility issues + +Excluding feature bug fixes. + +* Regexp literals and all Range objects are frozen. [[Feature #8948]] [[Feature #16377]] [[Feature #15504]] + + ```ruby + /foo/.frozen? #=> true + (42...).frozen? #=> true + ``` + +* EXPERIMENTAL: Hash#each consistently yields a 2-element array. [[Bug #12706]] + + * Now `{ a: 1 }.each(&->(k, v) { })` raises an ArgumentError + due to lambda's arity check. + +* When writing to STDOUT redirected to a closed pipe, a broken pipe + error message is no longer shown. [[Feature #14413]] + +* `TRUE`/`FALSE`/`NIL` constants are no longer defined. + +* Integer#zero? overrides Numeric#zero? for optimization. [[Misc #16961]] + +* Enumerable#grep and Enumerable#grep_v, when passed a Regexp and no block, no longer modify + Regexp.last_match. [[Bug #17030]] + +* Requiring 'open-uri' no longer redefines `Kernel#open`. + Call `URI.open` directly or use `URI#open` instead. [[Misc #15893]] + +* SortedSet has been removed for dependency and performance reasons. + +## Stdlib compatibility issues + +* Default gems + + * The following libraries are promoted to default gems from stdlib.
+ + * English + * abbrev + * base64 + * drb + * debug + * erb + * find + * net-ftp + * net-http + * net-imap + * net-protocol + * open-uri + * optparse + * pp + * prettyprint + * resolv-replace + * resolv + * rinda + * set + * securerandom + * shellwords + * tempfile + * tmpdir + * time + * tsort + * un + * weakref + + * The following extensions are promoted to default gems from stdlib. + + * digest + * io-nonblock + * io-wait + * nkf + * pathname + * syslog + * win32ole + +* Bundled gems + + * net-telnet and xmlrpc have been removed from the bundled gems. + If you are interested in maintaining them, please comment on + your plan at https://github.com/ruby/xmlrpc + or https://github.com/ruby/net-telnet. + +* SDBM has been removed from the Ruby standard library. [[Bug #8446]] + + * Issues with sdbm will be handled at https://github.com/ruby/sdbm + +* WEBrick has been removed from the Ruby standard library. [[Feature #17303]] + + * Issues with WEBrick will be handled at https://github.com/ruby/webrick + +## C API updates + +* C API functions related to `$SAFE` have been removed. + [[Feature #16131]] + +* C API header file `ruby/ruby.h` was split. [[GH-2991]] + + This should have no impact on extension libraries, + but users might experience slow compilations. + +* Memory view interface [EXPERIMENTAL] + + * The memory view interface is a C-API set to exchange a raw memory area, + such as a numeric array or a bitmap image, between extension libraries. + The extension libraries can also share the metadata of the memory area + that consists of the shape, the element format, and so on. + Using these kinds of metadata, the extension libraries can share even + a multidimensional array appropriately. + This feature was designed with reference to Python's buffer protocol. + [[Feature #13767]] [[Feature #14722]] + +* Ractor related C APIs are introduced (experimental) in "include/ruby/ractor.h". + +## Implementation improvements + +* New method cache mechanism for Ractor.
[[Feature #16614]] + + * Inline method caches pointed from ISeq can be accessed by multiple Ractors + in parallel and synchronization is needed even for method caches. However, + such synchronization adds overhead, so new inline method cache + mechanisms are introduced: (1) a disposable inline method cache, (2) a per-Class method cache, + and (3) a new invalidation mechanism. (1) can avoid per-method call + synchronization because it only uses atomic operations. + See the ticket for more details. + +* The number of hashes allocated when using a keyword splat in + a method call has been reduced to a maximum of 1, and passing + a keyword splat to a method that accepts specific keywords + does not allocate a hash. + +* `super` is optimized when the same type of method was called in the previous call, + unless it is a refinement or an attr reader or writer. + +### JIT + +* Performance improvements of JIT-ed code + + * Microarchitectural optimizations + + * Native functions shared by multiple methods are deduplicated on JIT compaction. + + * Decrease code size of hot paths by some optimizations and partitioning cold paths. + + * Instance variables + + * Eliminate some redundant checks. + + * Skip checking a class and an object multiple times in a method when possible. + + * Optimize accesses in some core classes like Hash and their subclasses. + + * Method inlining support for some C methods + + * `Kernel`: `#class`, `#frozen?` + + * `Integer`: `#-@`, `#~`, `#abs`, `#bit_length`, `#even?`, `#integer?`, `#magnitude`, + `#odd?`, `#ord`, `#to_i`, `#to_int`, `#zero?` + + * `Struct`: reader methods for 10th or later members + + * Constant references are inlined. + + * Always generate appropriate code for `==`, `nil?`, and `!` calls depending on + a receiver class. + + * Reduce the number of PC accesses on branches and method returns. + + * Optimize C method calls a little. + +* Compilation process improvements + + * It does not keep temporary files in /tmp anymore.
+ + * Throttle GC and compaction of JIT-ed code. + + * Avoid GC-ing JIT-ed code when not necessary. + + * GC-ing JIT-ed code is executed in a background thread. + + * Reduce the number of locks between Ruby and JIT threads. + +## Static analysis + +### RBS + +* RBS is a new language for type definition of Ruby programs. + It allows writing types of classes and modules with advanced + types including union types, overloading, generics, and + _interface types_ for duck typing. + +* Ruby ships with type definitions for core/stdlib classes. + +* `rbs` gem is bundled to load and process RBS files. + +### TypeProf + +* TypeProf is a type analysis tool for Ruby code based on abstract interpretation. + + * It reads non-annotated Ruby code, tries inferring its type signature, and prints + the analysis result in RBS format. + + * Though it supports only a subset of the Ruby language yet, we will continuously + improve the coverage of language features, analysis performance, and usability. + +```ruby +# test.rb +def foo(x) + if x > 10 + x.to_s + else + nil + end +end + +foo(42) +``` + +```console +$ typeprof test.rb +# Classes +class Object + def foo : (Integer) -> String? +end +``` + +## Miscellaneous changes + +* Methods using `ruby2_keywords` will no longer keep empty keyword + splats, those are now removed just as they are for methods not + using `ruby2_keywords`. + +* When an exception is caught in the default handler, the error + message and backtrace are printed in order from the innermost. + [[Feature #8661]] + +* Accessing an uninitialized instance variable no longer emits a + warning in verbose mode. 
[[Feature #17055]] + +[Bug #4352]: https://bugs.ruby-lang.org/issues/4352 +[Bug #6087]: https://bugs.ruby-lang.org/issues/6087 +[Bug #8382]: https://bugs.ruby-lang.org/issues/8382 +[Bug #8446]: https://bugs.ruby-lang.org/issues/8446 +[Feature #8661]: https://bugs.ruby-lang.org/issues/8661 +[Feature #8709]: https://bugs.ruby-lang.org/issues/8709 +[Feature #8948]: https://bugs.ruby-lang.org/issues/8948 +[Feature #9573]: https://bugs.ruby-lang.org/issues/9573 +[Bug #10845]: https://bugs.ruby-lang.org/issues/10845 +[Bug #12136]: https://bugs.ruby-lang.org/issues/12136 +[Feature #12650]: https://bugs.ruby-lang.org/issues/12650 +[Bug #12706]: https://bugs.ruby-lang.org/issues/12706 +[Feature #13767]: https://bugs.ruby-lang.org/issues/13767 +[Bug #13768]: https://bugs.ruby-lang.org/issues/13768 +[Feature #14183]: https://bugs.ruby-lang.org/issues/14183 +[Bug #14266]: https://bugs.ruby-lang.org/issues/14266 +[Feature #14267]: https://bugs.ruby-lang.org/issues/14267 +[Feature #14413]: https://bugs.ruby-lang.org/issues/14413 +[Bug #14541]: https://bugs.ruby-lang.org/issues/14541 +[Feature #14722]: https://bugs.ruby-lang.org/issues/14722 +[Bug #15409]: https://bugs.ruby-lang.org/issues/15409 +[Feature #15504]: https://bugs.ruby-lang.org/issues/15504 +[Feature #15575]: https://bugs.ruby-lang.org/issues/15575 +[Feature #15822]: https://bugs.ruby-lang.org/issues/15822 +[Misc #15893]: https://bugs.ruby-lang.org/issues/15893 +[Feature #15921]: https://bugs.ruby-lang.org/issues/15921 +[Feature #15973]: https://bugs.ruby-lang.org/issues/15973 +[Feature #16131]: https://bugs.ruby-lang.org/issues/16131 +[Feature #16150]: https://bugs.ruby-lang.org/issues/16150 +[Feature #16166]: https://bugs.ruby-lang.org/issues/16166 +[Feature #16175]: https://bugs.ruby-lang.org/issues/16175 +[Feature #16233]: https://bugs.ruby-lang.org/issues/16233 +[Feature #16260]: https://bugs.ruby-lang.org/issues/16260 +[Feature #16274]: https://bugs.ruby-lang.org/issues/16274 +[Feature #16345]: 
https://bugs.ruby-lang.org/issues/16345 +[Feature #16377]: https://bugs.ruby-lang.org/issues/16377 +[Feature #16378]: https://bugs.ruby-lang.org/issues/16378 +[Feature #16555]: https://bugs.ruby-lang.org/issues/16555 +[Feature #16604]: https://bugs.ruby-lang.org/issues/16604 +[Feature #16614]: https://bugs.ruby-lang.org/issues/16614 +[Feature #16686]: https://bugs.ruby-lang.org/issues/16686 +[Feature #16746]: https://bugs.ruby-lang.org/issues/16746 +[Feature #16754]: https://bugs.ruby-lang.org/issues/16754 +[Feature #16786]: https://bugs.ruby-lang.org/issues/16786 +[Feature #16792]: https://bugs.ruby-lang.org/issues/16792 +[Feature #16815]: https://bugs.ruby-lang.org/issues/16815 +[Feature #16828]: https://bugs.ruby-lang.org/issues/16828 +[Misc #16961]: https://bugs.ruby-lang.org/issues/16961 +[Bug #17030]: https://bugs.ruby-lang.org/issues/17030 +[Feature #17055]: https://bugs.ruby-lang.org/issues/17055 +[Feature #17104]: https://bugs.ruby-lang.org/issues/17104 +[Feature #17122]: https://bugs.ruby-lang.org/issues/17122 +[Feature #17136]: https://bugs.ruby-lang.org/issues/17136 +[Feature #17176]: https://bugs.ruby-lang.org/issues/17176 +[Feature #17187]: https://bugs.ruby-lang.org/issues/17187 +[Bug #17221]: https://bugs.ruby-lang.org/issues/17221 +[Feature #17260]: https://bugs.ruby-lang.org/issues/17260 +[Feature #17273]: https://bugs.ruby-lang.org/issues/17273 +[Feature #17303]: https://bugs.ruby-lang.org/issues/17303 +[Feature #17314]: https://bugs.ruby-lang.org/issues/17314 +[Feature #17322]: https://bugs.ruby-lang.org/issues/17322 +[Feature #17351]: https://bugs.ruby-lang.org/issues/17351 +[Feature #17371]: https://bugs.ruby-lang.org/issues/17371 +[Bug #17419]: https://bugs.ruby-lang.org/issues/17419 +[GH-2991]: https://github.com/ruby/ruby/pull/2991 diff --git a/doc/NEWS/NEWS-3.1.0.md b/doc/NEWS/NEWS-3.1.0.md new file mode 100644 index 0000000000..686003894e --- /dev/null +++ b/doc/NEWS/NEWS-3.1.0.md @@ -0,0 +1,660 @@ +# NEWS for Ruby 3.1.0 + +This document 
is a list of user-visible feature changes +since the **3.0.0** release, except for bug fixes. + +Note that each entry is kept to a minimum, see links for details. + +## Language changes + +* The block argument can now be anonymous if the block will + only be passed to another method. [[Feature #11256]] + + ```ruby + def foo(&) + bar(&) + end + ``` + +* Pin operator now takes an expression. [[Feature #17411]] + + ```ruby + Prime.each_cons(2).lazy.find_all{_1 in [n, ^(n + 2)]}.take(3).to_a + #=> [[3, 5], [5, 7], [11, 13]] + ``` + +* Pin operator now supports instance, class, and global variables. + [[Feature #17724]] + + ```ruby + @n = 5 + Prime.each_cons(2).lazy.find{_1 in [n, ^@n]} + #=> [3, 5] + ``` + +* One-line pattern matching is no longer experimental. + +* Parentheses can be omitted in one-line pattern matching. + [[Feature #16182]] + + ```ruby + [0, 1] => _, x + {y: 2} => y: + x #=> 1 + y #=> 2 + ``` + +* Multiple assignment evaluation order has been made consistent with + single assignment evaluation order. With single assignment, Ruby + uses a left-to-right evaluation order. With this code: + + ```ruby + foo[0] = bar + ``` + + The following evaluation order is used: + + 1. `foo` + 2. `bar` + 3. `[]=` called on the result of `foo` + + In Ruby before 3.1.0, multiple assignment did not follow this + evaluation order. With this code: + + ```ruby + foo[0], bar.baz = a, b + ``` + + Versions of Ruby before 3.1.0 would evaluate in the following + order + + 1. `a` + 2. `b` + 3. `foo` + 4. `[]=` called on the result of `foo` + 5. `bar` + 6. `baz=` called on the result of `bar` + + Starting in Ruby 3.1.0, the evaluation order is now consistent with + single assignment, with the left-hand side being evaluated before + the right-hand side: + + 1. `foo` + 2. `bar` + 3. `a` + 4. `b` + 5. `[]=` called on the result of `foo` + 6. `baz=` called on the result of `bar` + + [[Bug #4443]] + +* Values in Hash literals and keyword arguments can be omitted. 
+ [[Feature #14579]] + + For example, + + * `{x:, y:}` is syntactic sugar for `{x: x, y: y}`. + * `foo(x:, y:)` is syntactic sugar for `foo(x: x, y: y)`. + + Constant names, local variable names, and method names are allowed as + key names. Note that a reserved word is considered as a local + variable or method name even if it's a pseudo variable name such as + `self`. + +* Non-main Ractors can get instance variables (ivars) of classes/modules + if ivars refer to shareable objects. + [[Feature #17592]] + +* A command syntax is allowed in endless method definitions, i.e., + you can now write `def foo = puts "Hello"`. + Note that `private def foo = puts "Hello"` does not parse. + [[Feature #17398]] + +## Command line options + +* `--disable-gems` is now explicitly declared as "just for debugging". + Never use it in any real-world codebase. + [[Feature #17684]] + +## Core classes updates + +Note: We're only listing outstanding class updates. + +* Array + + * Array#intersect? is added. [[Feature #15198]] + +* Class + + * Class#subclasses is added, which returns an array of classes + directly inheriting from the receiver, not + including singleton classes. + [[Feature #18273]] + + ```ruby + class A; end + class B < A; end + class C < B; end + class D < A; end + A.subclasses #=> [D, B] + B.subclasses #=> [C] + C.subclasses #=> [] + ``` + +* Enumerable + + * Enumerable#compact is added. [[Feature #17312]] + + * Enumerable#tally now accepts an optional hash to count. [[Feature #17744]] + + * Enumerable#each_cons and Enumerable#each_slice now return the receiver. [[GH-1509]] + + ```ruby + [1, 2, 3].each_cons(2){} + # 3.0 => nil + # 3.1 => [1, 2, 3] + + [1, 2, 3].each_slice(2){} + # 3.0 => nil + # 3.1 => [1, 2, 3] + ``` + +* Enumerator::Lazy + + * Enumerator::Lazy#compact is added. [[Feature #17312]] + +* File + + * File.dirname now accepts an optional argument for the level to + strip path components. [[Feature #12194]] + +* GC + + * "GC.measure_total_time = true" enables the measurement of GC.
+ Measurement can introduce overhead. It is enabled by default. + GC.measure_total_time returns the current setting. + GC.stat[:time] or GC.stat(:time) returns measured time + in milli-seconds. [[[Feature #10917]]] + + * GC.total_time returns measured time in nano-seconds. [[[Feature #10917]]] + +* Integer + + * Integer.try_convert is added. [[Feature #15211]] + +* Kernel + + * Kernel#load now accepts a module as the second argument, + and will load the file using the given module as the + top-level module. [[Feature #6210]] + +* Marshal + + * Marshal.load now accepts a `freeze: true` option. + All returned objects are frozen except for `Class` and + `Module` instances. Strings are deduplicated. [[Feature #18148]] + +* MatchData + + * MatchData#match is added [[Feature #18172]] + + * MatchData#match_length is added [[Feature #18172]] + +* Method / UnboundMethod + + * Method#public?, Method#private?, Method#protected?, + UnboundMethod#public?, UnboundMethod#private?, + UnboundMethod#protected? have been added. [[Feature #11689]] + +* Module + + * Module#prepend now modifies the ancestor chain if the receiver + already includes the argument. Module#prepend still does not + modify the ancestor chain if the receiver has already prepended + the argument. [[Bug #17423]] + + * Module#private, #public, #protected, and #module_function will + now return their arguments. If a single argument is given, it + is returned. If no arguments are given, nil is returned. If + multiple arguments are given, they are returned as an array. + [[Feature #12495]] + +* Process + + * Process.\_fork is added. This is a core method for fork(2). + Do not call this method directly; it is called by existing + fork methods: Kernel.#fork, Process.fork, and IO.popen("-"). + Application monitoring libraries can overwrite this method to + hook fork events. [[Feature #17795]] + +* Struct + + * Passing only keyword arguments to Struct#initialize is warned. 
+ You need to use a Hash literal to set a Hash to the first member. + [[Feature #16806]] + + * StructClass#keyword_init? is added [[Feature #18008]] + +* String + + * Update Unicode version to 13.0.0 [[Feature #17750]] + and Emoji version to 13.0 [[Feature #18029]] + + * String#unpack and String#unpack1 now accept an `offset:` keyword + argument to start the unpacking after an arbitrary number of bytes + have been skipped. If `offset` is outside of the string bounds + `ArgumentError` is raised. [[Feature #18254]] + +* Thread + + * Thread#native_thread_id is added. [[Feature #17853]] + +* Thread::Backtrace + + * Thread::Backtrace.limit, which returns the value to limit backtrace + length set by `--backtrace-limit` command line option, is added. + [[Feature #17479]] + +* Thread::Queue + + * Thread::Queue.new now accepts an Enumerable of initial values. + [[Feature #17327]] + +* Time + + * Time.new now accepts an optional `in:` keyword argument for the + timezone, as do `Time.at` and `Time.now`, so that you can + now omit minor arguments to `Time.new`. [[Feature #17485]] + + ```ruby + Time.new(2021, 12, 25, in: "+07:00") + #=> 2021-12-25 00:00:00 +0700 + ``` + + At the same time, time component strings are converted to + integers more strictly now. + + ```ruby + Time.new(2021, 12, 25, "+07:30") + #=> invalid value for Integer(): "+07:30" (ArgumentError) + ``` + + Ruby 3.0 or earlier returned the probably unexpected result + `2021-12-25 07:00:00`, not `2021-12-25 07:30:00` nor + `2021-12-25 00:00:00 +07:30`. + + * Time#strftime supports RFC 3339 UTC for unknown offset local + time, `-0000`, as `%-z`. [[Feature #17544]] + +* TracePoint + + * TracePoint.allow_reentry is added to allow reentry during a TracePoint + callback. + [[Feature #15912]] + +* $LOAD_PATH + + * $LOAD_PATH.resolve_feature_path does not raise. [[Feature #16043]] + +* Fiber Scheduler + + * Add support for `Addrinfo.getaddrinfo` using `address_resolve` hook.
+ [[Feature #17370]] + + * Introduce non-blocking `Timeout.timeout` using `timeout_after` hook. + [[Feature #17470]] + + * Introduce new scheduler hooks `io_read` and `io_write` along with a + low level `IO::Buffer` for zero-copy read/write. [[Feature #18020]] + + * IO hooks `io_wait`, `io_read`, `io_write` receive the original IO object + where possible. [[Bug #18003]] + + * Make `Monitor` fiber-safe. [[Bug #17827]] + + * Replace copy coroutine with pthread implementation. [[Feature #18015]] + +* Refinement + + * New class which represents a module created by Module#refine. + `include` and `prepend` are deprecated, and `import_methods` is added + instead. [[Bug #17429]] + +## Stdlib updates + +* The following default gems are updated. + * RubyGems 3.3.3 + * base64 0.1.1 + * benchmark 0.2.0 + * bigdecimal 3.1.1 + * bundler 2.3.3 + * cgi 0.3.1 + * csv 3.2.2 + * date 3.2.2 + * did_you_mean 1.6.1 + * digest 3.1.0 + * drb 2.1.0 + * erb 2.2.3 + * error_highlight 0.3.0 + * etc 1.3.0 + * fcntl 1.0.1 + * fiddle 1.1.0 + * fileutils 1.6.0 + * find 0.1.1 + * io-console 0.5.10 + * io-wait 0.2.1 + * ipaddr 1.2.3 + * irb 1.4.1 + * json 2.6.1 + * logger 1.5.0 + * net-http 0.2.0 + * net-protocol 0.1.2 + * nkf 0.1.1 + * open-uri 0.2.0 + * openssl 3.0.0 + * optparse 0.2.0 + * ostruct 0.5.2 + * pathname 0.2.0 + * pp 0.3.0 + * prettyprint 0.1.1 + * psych 4.0.3 + * racc 1.6.0 + * rdoc 6.4.0 + * readline 0.0.3 + * readline-ext 0.1.4 + * reline 0.3.0 + * resolv 0.2.1 + * rinda 0.1.1 + * ruby2_keywords 0.0.5 + * securerandom 0.1.1 + * set 1.0.2 + * stringio 3.0.1 + * strscan 3.0.1 + * tempfile 0.1.2 + * time 0.2.0 + * timeout 0.2.0 + * tmpdir 0.1.2 + * un 0.2.0 + * uri 0.11.0 + * yaml 0.2.0 + * zlib 2.1.1 +* The following bundled gems are updated. + * minitest 5.15.0 + * power_assert 2.0.1 + * rake 13.0.6 + * test-unit 3.5.3 + * rexml 3.2.5 + * rbs 2.0.0 + * typeprof 0.21.1 +* The following default gems are now bundled gems.
+ * net-ftp 0.1.3 + * net-imap 0.2.2 + * net-pop 0.1.1 + * net-smtp 0.3.1 + * matrix 0.4.2 + * prime 0.1.2 + * debug 1.4.0 +* The following gems have been removed from the Ruby standard library. + * dbm + * gdbm + * tracer + +* Coverage measurement now supports suspension. You can use `Coverage.suspend` + to stop the measurement temporarily, and `Coverage.resume` to restart it. + See [[Feature #18176]] for details. + +* Random::Formatter is moved to random/formatter.rb, so that you can + use `Random#hex`, `Random#base64`, and so on without SecureRandom. + [[Feature #18190]] + +## Compatibility issues + +Note: Excluding feature bug fixes. + +* `rb_io_wait_readable`, `rb_io_wait_writable` and `rb_wait_for_single_fd` are + deprecated in favour of `rb_io_maybe_wait_readable`, + `rb_io_maybe_wait_writable` and `rb_io_maybe_wait` respectively. + `rb_thread_wait_fd` and `rb_thread_fd_writable` are deprecated. [[Bug #18003]] + +## Stdlib compatibility issues + +* `ERB#initialize` warns about `safe_level` and later arguments even without -w. + [[Feature #14256]] + +* `lib/debug.rb` is replaced with `debug.gem`. + +* `Kernel#pp` in `lib/pp.rb` uses the width of `IO#winsize` by default. + This means that the output width is automatically changed depending on + your terminal size. [[Feature #12913]] + +* Psych 4.0 changes `Psych.load` to use `safe_load` by default. + You may need to use Psych 3.3.2 for migrating to this behavior. + [[Bug #17866]] + +## C API updates + +* Documented. [[GH-4815]] + +* `rb_gc_force_recycle` is deprecated and has been changed to a no-op. + [[Feature #18290]] + +## Implementation improvements + +* Inline cache mechanism is introduced for reading class variables. + [[Feature #17763]] + +* `instance_eval` and `instance_exec` now only allocate a singleton class when + required, avoiding extra objects and improving performance. [[GH-5146]] + +* The performance of `Struct` accessors is improved.
[[GH-5131]] + +* `mandatory_only?` builtin special form to improve performance on + builtin methods. [[GH-5112]] + +* Experimental feature Variable Width Allocation in the garbage collector. + This feature is turned off by default and can be enabled by compiling Ruby + with flag `USE_RVARGC=1` set. [[Feature #18045]] [[Feature #18239]] + +## JIT + +* Rename Ruby 3.0's `--jit` to `--mjit`, and alias `--jit` to `--yjit` + on non-Windows x86-64 platforms and to `--mjit` on others. + +### MJIT + +* The default `--mjit-max-cache` is changed from 100 to 10000. + +* JIT-ed code is no longer cancelled when a TracePoint for class events + is enabled. + +* The JIT compiler no longer skips compilation of methods longer than + 1000 instructions. + +* `--mjit-verbose` and `--mjit-warning` output "JIT cancel" when JIT-ed + code is disabled because TracePoint or GC.compact is used. + +### YJIT: New experimental in-process JIT compiler + +New JIT compiler available as an experimental feature. [[Feature #18229]] + +See [this blog post](https://shopify.engineering/yjit-just-in-time-compiler-cruby +) introducing the project. + +* Disabled by default, use `--yjit` command-line option to enable YJIT. + +* Performance improvements on benchmarks based on real-world software, + up to 22% on railsbench, 39% on liquid-render. + +* Fast warm-up times. + +* Limited to Unix-like x86-64 platforms for now. + +## Static analysis + +### RBS + +* Generics type parameters can be bounded ([PR](https://github.com/ruby/rbs/pull/844)). + + ```rbs + # `T` must be compatible with the `_Output` interface. + # `PrettyPrint[String]` is ok, but `PrettyPrint[Integer]` is a type error. + class PrettyPrint[T < _Output] + interface _Output + def <<: (String) -> void + end + + attr_reader output: T + + def initialize: (T output) -> void + end + ``` + +* Type aliases can be generic. ([PR](https://github.com/ruby/rbs/pull/823)) + + ```rbs + # Defines a generic type `list`. 
+ type list[T] = [ T, list[T] ] + | nil + + type str_list = list[String] + type int_list = list[Integer] + ``` + +* [rbs collection](https://github.com/ruby/rbs/blob/cdd6a3a896001e25bd1feda3eab7f470bae935c1/docs/collection.md) has been introduced to manage gems’ RBSs. + +* Many signatures for built-in and standard libraries have been added/updated. + +* It includes many bug fixes and performance improvements too. + +See the [CHANGELOG.md](https://github.com/ruby/rbs/blob/cdd6a3a896001e25bd1feda3eab7f470bae935c1/CHANGELOG.md) for more information. + +### TypeProf + +* [Experimental IDE support](https://github.com/ruby/typeprof/blob/ca15c5dae9bd62668463165f8409bd66ce7de223/doc/ide.md) has been implemented. +* Many bug fixes and performance improvements since Ruby 3.0.0. + +## Debugger + +* A new debugger [debug.gem](https://github.com/ruby/debug) is bundled. + debug.gem is a fast debugger implementation, and it provides many features + like remote debugging, colorful REPL, IDE (VSCode) integration, and more. + It replaces the `lib/debug.rb` standard library. + +* The `rdbg` command is also installed into the `bin/` directory to start and control + debugging execution. + +## error_highlight + +A built-in gem called error_highlight has been introduced. +It shows fine-grained error locations in the backtrace. + +Example: `title = json[:article][:title]` + +If `json` is nil, it shows: + +```console +$ ruby test.rb +test.rb:2:in `<main>': undefined method `[]' for nil:NilClass (NoMethodError) + +title = json[:article][:title] + ^^^^^^^^^^ +``` + +If `json[:article]` returns nil, it shows: + +```console +$ ruby test.rb +test.rb:2:in `<main>': undefined method `[]' for nil:NilClass (NoMethodError) + +title = json[:article][:title] + ^^^^^^^^ +``` + +This feature is enabled by default. +You can disable it with the command-line option `--disable-error_highlight`. +See [the repository](https://github.com/ruby/error_highlight) for details.
+ +## IRB Autocomplete and Document Display + +The IRB now has an autocomplete feature, where you can just type in the code, and the completion candidates dialog will appear. You can use Tab and Shift+Tab to move up and down. + +If documents are installed when you select a completion candidate, the documentation dialog will appear next to the completion candidates dialog, showing part of the content. You can read the full document by pressing Alt+d. + +## Miscellaneous changes + +* lib/objspace/trace.rb is added, which is a tool for tracing the object + allocation. Just by requiring this file, tracing is started *immediately*. + Just by `Kernel#p`, you can investigate where an object was created. + Note that just requiring this file brings a large performance overhead. + This is only for debugging purposes. Do not use this in production. + [[Feature #17762]] + +* Now exceptions raised in finalizers will be printed to `STDERR`, unless + `$VERBOSE` is `nil`. [[Feature #17798]] + +* `ruby -run -e httpd` displays URLs to access. [[Feature #17847]] + +* Add `ruby -run -e colorize` to colorize Ruby code using + `IRB::Color.colorize_code`. 
+ +[Bug #4443]: https://bugs.ruby-lang.org/issues/4443 +[Feature #6210]: https://bugs.ruby-lang.org/issues/6210 +[Feature #10917]: https://bugs.ruby-lang.org/issues/10917 +[Feature #11256]: https://bugs.ruby-lang.org/issues/11256 +[Feature #11689]: https://bugs.ruby-lang.org/issues/11689 +[Feature #12194]: https://bugs.ruby-lang.org/issues/12194 +[Feature #12495]: https://bugs.ruby-lang.org/issues/12495 +[Feature #12913]: https://bugs.ruby-lang.org/issues/12913 +[Feature #14256]: https://bugs.ruby-lang.org/issues/14256 +[Feature #14579]: https://bugs.ruby-lang.org/issues/14579 +[Feature #15198]: https://bugs.ruby-lang.org/issues/15198 +[Feature #15211]: https://bugs.ruby-lang.org/issues/15211 +[Feature #15912]: https://bugs.ruby-lang.org/issues/15912 +[Feature #16043]: https://bugs.ruby-lang.org/issues/16043 +[Feature #16182]: https://bugs.ruby-lang.org/issues/16182 +[Feature #16806]: https://bugs.ruby-lang.org/issues/16806 +[Feature #17312]: https://bugs.ruby-lang.org/issues/17312 +[Feature #17327]: https://bugs.ruby-lang.org/issues/17327 +[Feature #17370]: https://bugs.ruby-lang.org/issues/17370 +[Feature #17398]: https://bugs.ruby-lang.org/issues/17398 +[Feature #17411]: https://bugs.ruby-lang.org/issues/17411 +[Bug #17423]: https://bugs.ruby-lang.org/issues/17423 +[Bug #17429]: https://bugs.ruby-lang.org/issues/17429 +[Feature #17470]: https://bugs.ruby-lang.org/issues/17470 +[Feature #17479]: https://bugs.ruby-lang.org/issues/17479 +[Feature #17485]: https://bugs.ruby-lang.org/issues/17485 +[Feature #17544]: https://bugs.ruby-lang.org/issues/17544 +[Feature #17592]: https://bugs.ruby-lang.org/issues/17592 +[Feature #17684]: https://bugs.ruby-lang.org/issues/17684 +[Feature #17724]: https://bugs.ruby-lang.org/issues/17724 +[Feature #17744]: https://bugs.ruby-lang.org/issues/17744 +[Feature #17750]: https://bugs.ruby-lang.org/issues/17750 +[Feature #17762]: https://bugs.ruby-lang.org/issues/17762 +[Feature #17763]: https://bugs.ruby-lang.org/issues/17763 
+[Feature #17795]: https://bugs.ruby-lang.org/issues/17795 +[Feature #17798]: https://bugs.ruby-lang.org/issues/17798 +[Bug #17827]: https://bugs.ruby-lang.org/issues/17827 +[Feature #17847]: https://bugs.ruby-lang.org/issues/17847 +[Feature #17853]: https://bugs.ruby-lang.org/issues/17853 +[Bug #17866]: https://bugs.ruby-lang.org/issues/17866 +[Bug #18003]: https://bugs.ruby-lang.org/issues/18003 +[Feature #18008]: https://bugs.ruby-lang.org/issues/18008 +[Feature #18015]: https://bugs.ruby-lang.org/issues/18015 +[Feature #18020]: https://bugs.ruby-lang.org/issues/18020 +[Feature #18029]: https://bugs.ruby-lang.org/issues/18029 +[Feature #18045]: https://bugs.ruby-lang.org/issues/18045 +[Feature #18148]: https://bugs.ruby-lang.org/issues/18148 +[Feature #18172]: https://bugs.ruby-lang.org/issues/18172 +[Feature #18176]: https://bugs.ruby-lang.org/issues/18176 +[Feature #18190]: https://bugs.ruby-lang.org/issues/18190 +[Feature #18229]: https://bugs.ruby-lang.org/issues/18229 +[Feature #18239]: https://bugs.ruby-lang.org/issues/18239 +[Feature #18254]: https://bugs.ruby-lang.org/issues/18254 +[Feature #18273]: https://bugs.ruby-lang.org/issues/18273 +[Feature #18290]: https://bugs.ruby-lang.org/issues/18290 + +[GH-1509]: https://github.com/ruby/ruby/pull/1509 +[GH-4815]: https://github.com/ruby/ruby/pull/4815 +[GH-5112]: https://github.com/ruby/ruby/pull/5112 +[GH-5131]: https://github.com/ruby/ruby/pull/5131 +[GH-5146]: https://github.com/ruby/ruby/pull/5146 diff --git a/doc/NEWS/NEWS-3.2.0.md b/doc/NEWS/NEWS-3.2.0.md new file mode 100644 index 0000000000..3a48c1964d --- /dev/null +++ b/doc/NEWS/NEWS-3.2.0.md @@ -0,0 +1,820 @@ +# NEWS for Ruby 3.2.0 + +This document is a list of user-visible feature changes +since the **3.1.0** release, except for bug fixes. + +Note that each entry is kept to a minimum, see links for details. 
+ +## Language changes + +* Anonymous rest and keyword rest arguments can now be passed as + arguments, instead of just used in method parameters. + [[Feature #18351]] + + ```ruby + def foo(*) + bar(*) + end + def baz(**) + quux(**) + end + ``` + +* A proc that accepts a single positional argument and keywords will + no longer autosplat. [[Bug #18633]] + + ```ruby + proc{|a, **k| a}.call([1, 2]) + # Ruby 3.1 and before + # => 1 + # Ruby 3.2 and after + # => [1, 2] + ``` + +* Constant assignment evaluation order for constants set on explicit + objects has been made consistent with single attribute assignment + evaluation order. With this code: + + ```ruby + foo::BAR = baz + ``` + + `foo` is now called before `baz`. Similarly, for multiple assignments + to constants, left-to-right evaluation order is used. With this + code: + + ```ruby + foo1::BAR1, foo2::BAR2 = baz1, baz2 + ``` + + The following evaluation order is now used: + + 1. `foo1` + 2. `foo2` + 3. `baz1` + 4. `baz2` + + [[Bug #15928]] + +* "Find pattern" is no longer experimental. + [[Feature #18585]] + +* Methods taking a rest parameter (like `*args`) and wishing to delegate keyword + arguments through `foo(*args)` must now be marked with `ruby2_keywords` + (if not already the case). In other words, all methods wishing to delegate + keyword arguments through `*args` must now be marked with `ruby2_keywords`, + with no exception. This will make it easier to transition to other ways of + delegation once a library can require Ruby 3+. Previously, the `ruby2_keywords` + flag was kept if the receiving method took `*args`, but this was a bug and an + inconsistency. A good technique to find the potentially-missing `ruby2_keywords` + is to run the test suite, for where it fails find the last method which must + receive keyword arguments, use `puts nil, caller, nil` there, and check each + method/block on the call chain which must delegate keywords is correctly marked + as `ruby2_keywords`. 
[[Bug #18625]] [[Bug #16466]] + + ```ruby + def target(**kw) + end + + # Accidentally worked without ruby2_keywords in Ruby 2.7-3.1, ruby2_keywords + # needed in 3.2+. Just like (*args, **kwargs) or (...) would be needed on + # both #foo and #bar when migrating away from ruby2_keywords. + ruby2_keywords def bar(*args) + target(*args) + end + + ruby2_keywords def foo(*args) + bar(*args) + end + + foo(k: 1) + ``` + +## Core classes updates + +Note: We're only listing outstanding class updates. + +* Fiber + + * Introduce Fiber.[] and Fiber.[]= for inheritable fiber storage. + Introduce Fiber#storage and Fiber#storage= (experimental) for + getting and resetting the current storage. Introduce + `Fiber.new(storage:)` for setting the storage when creating a + fiber. [[Feature #19078]] + + Existing Thread and Fiber local variables can be tricky to use. + Thread-local variables are shared between all fibers, making it + hard to isolate, while Fiber-local variables can be hard to + share. It is often desirable to define unit of execution + ("execution context") such that some state is shared between all + fibers and threads created in that context. This is what Fiber + storage provides. + + ```ruby + def log(message) + puts "#{Fiber[:request_id]}: #{message}" + end + + def handle_requests + while request = read_request + Fiber.schedule do + Fiber[:request_id] = SecureRandom.uuid + + request.messages.each do |message| + Fiber.schedule do + log("Handling #{message}") # Log includes inherited request_id. + end + end + end + end + end + ``` + + You should generally consider Fiber storage for any state which + you want to be shared implicitly between all fibers and threads + created in a given context, e.g. a connection pool, a request + id, a logger level, environment variables, configuration, etc. + +* Fiber::Scheduler + + * Introduce `Fiber::Scheduler#io_select` for non-blocking IO.select. 
+ [[Feature #19060]] + +* IO + + * Introduce IO#timeout= and IO#timeout which can cause + IO::TimeoutError to be raised if a blocking operation exceeds the + specified timeout. [[Feature #18630]] + + ```ruby + STDIN.timeout = 1 + STDIN.read # => Blocking operation timed out! (IO::TimeoutError) + ``` + + * Introduce `IO.new(..., path:)` and promote `File#path` to `IO#path`. + [[Feature #19036]] + +* Class + + * Class#attached_object, which returns the object for which + the receiver is the singleton class. Raises TypeError if the + receiver is not a singleton class. + [[Feature #12084]] + + ```ruby + class Foo; end + + Foo.singleton_class.attached_object #=> Foo + Foo.new.singleton_class.attached_object #=> #<Foo:0x000000010491a370> + Foo.attached_object #=> TypeError: `Foo' is not a singleton class + nil.singleton_class.attached_object #=> TypeError: `NilClass' is not a singleton class + ``` + +* Data + + * New core class to represent simple immutable value object. The class is + similar to Struct and partially shares an implementation, but has more + lean and strict API. [[Feature #16122]] + + ```ruby + Measure = Data.define(:amount, :unit) + distance = Measure.new(100, 'km') #=> #<data Measure amount=100, unit="km"> + weight = Measure.new(amount: 50, unit: 'kg') #=> #<data Measure amount=50, unit="kg"> + weight.with(amount: 40) #=> #<data Measure amount=40, unit="kg"> + weight.amount #=> 50 + weight.amount = 40 #=> NoMethodError: undefined method `amount=' + ``` + +* Encoding + + * Encoding#replicate has been deprecated and will be removed in 3.3. [[Feature #18949]] + * The dummy `Encoding::UTF_16` and `Encoding::UTF_32` encodings no longer + try to dynamically guess the endian based on a byte order mark. + Use `Encoding::UTF_16BE`/`UTF_16LE` and `Encoding::UTF_32BE`/`UTF_32LE` instead. + This change speeds up getting the encoding of a String. [[Feature #18949]] + * Limit maximum encoding set size by 256. 
+ If exceeding maximum size, `EncodingError` will be raised. [[Feature #18949]] + +* Enumerator + + * Enumerator.product has been added. Enumerator::Product is the implementation. [[Feature #18685]] + +* Exception + + * Exception#detailed_message has been added. + The default error printer calls this method on the Exception object + instead of #message. [[Feature #18564]] + +* Hash + + * Hash#shift now always returns nil if the hash is + empty, instead of returning the default value or + calling the default proc. [[Bug #16908]] + +* Integer + + * Integer#ceildiv has been added. [[Feature #18809]] + +* Kernel + + * Kernel#binding raises RuntimeError if called from a non-Ruby frame + (such as a method defined in C). [[Bug #18487]] + +* MatchData + + * MatchData#byteoffset has been added. [[Feature #13110]] + * MatchData#deconstruct has been added. [[Feature #18821]] + * MatchData#deconstruct_keys has been added. [[Feature #18821]] + +* Module + + * Module.used_refinements has been added. [[Feature #14332]] + * Module#refinements has been added. [[Feature #12737]] + * Module#const_added has been added. [[Feature #17881]] + * Module#undefined_instance_methods has been added. [[Feature #12655]] + +* Proc + + * Proc#dup returns an instance of subclass. [[Bug #17545]] + * Proc#parameters now accepts lambda keyword. [[Feature #15357]] + +* Process + * Added `RLIMIT_NPTS` constant to FreeBSD platform + +* Regexp + + * The cache-based optimization is introduced. + Many (but not all) Regexp matching is now in linear time, which + will prevent regular expression denial of service (ReDoS) + vulnerability. [[Feature #19104]] + + * Regexp.linear_time? is introduced. [[Feature #19194]] + + * Regexp.new now supports passing the regexp flags not only as an Integer, + but also as a String. Unknown flags raise ArgumentError. + Otherwise, anything other than `true`, `false`, `nil` or Integer will be warned. + [[Feature #18788]] + + * Regexp.timeout= has been added. 
Also, Regexp.new now supports timeout keyword. + See [[Feature #17837]] + +* Refinement + + * Refinement#refined_class has been added. [[Feature #12737]] + +* RubyVM::AbstractSyntaxTree + + * Add `error_tolerant` option for `parse`, `parse_file` and `of`. [[Feature #19013]] + With this option + + 1. SyntaxError is suppressed + 2. AST is returned for invalid input + 3. `end` is complemented when a parser reaches the end of input but `end` is insufficient + 4. `end` is treated as keyword based on indent + + ```ruby + # Without error_tolerant option + root = RubyVM::AbstractSyntaxTree.parse(<<~RUBY) + def m + a = 10 + if + end + RUBY + # => <internal:ast>:33:in `parse': syntax error, unexpected `end' (SyntaxError) + + # With error_tolerant option + root = RubyVM::AbstractSyntaxTree.parse(<<~RUBY, error_tolerant: true) + def m + a = 10 + if + end + RUBY + p root # => #<RubyVM::AbstractSyntaxTree::Node:SCOPE@1:0-4:3> + + # `end` is treated as keyword based on indent + root = RubyVM::AbstractSyntaxTree.parse(<<~RUBY, error_tolerant: true) + module Z + class Foo + foo. + end + + def bar + end + end + RUBY + p root.children[-1].children[-1].children[-1].children[-2..-1] + # => [#<RubyVM::AbstractSyntaxTree::Node:CLASS@2:2-4:5>, #<RubyVM::AbstractSyntaxTree::Node:DEFN@6:2-7:5>] + ``` + + * Add `keep_tokens` option for `parse`, `parse_file` and `of`. Add `#tokens` and `#all_tokens` + for RubyVM::AbstractSyntaxTree::Node [[Feature #19070]] + + ```ruby + root = RubyVM::AbstractSyntaxTree.parse("x = 1 + 2", keep_tokens: true) + root.tokens # => [[0, :tIDENTIFIER, "x", [1, 0, 1, 1]], [1, :tSP, " ", [1, 1, 1, 2]], ...] + root.tokens.map{_1[2]}.join # => "x = 1 + 2" + ``` + +* Set + + * Set is now available as a built-in class without the need for `require "set"`. [[Feature #16989]] + It is currently autoloaded via the Set constant or a call to Enumerable#to_set. + +* String + + * String#byteindex and String#byterindex have been added.
[[Feature #13110]] + * Update Unicode to Version 15.0.0 and Emoji Version 15.0. [[Feature #18639]] + (also applies to Regexp) + * String#bytesplice has been added. [[Feature #18598]] + * String#dedup has been added as an alias to String#-@. [[Feature #18595]] + +* Struct + + * A Struct class can also be initialized with keyword arguments + without `keyword_init: true` on Struct.new [[Feature #16806]] + + ```ruby + Post = Struct.new(:id, :name) + Post.new(1, "hello") #=> #<struct Post id=1, name="hello"> + # From Ruby 3.2, the following code also works without keyword_init: true. + Post.new(id: 1, name: "hello") #=> #<struct Post id=1, name="hello"> + ``` + +* Thread + + * Thread.each_caller_location is added. [[Feature #16663]] + +* Thread::Queue + + * Thread::Queue#pop(timeout: sec) is added. [[Feature #18774]] + +* Thread::SizedQueue + + * Thread::SizedQueue#pop(timeout: sec) is added. [[Feature #18774]] + * Thread::SizedQueue#push(timeout: sec) is added. [[Feature #18944]] + +* Time + + * Time#deconstruct_keys is added, allowing to use Time instances + in pattern-matching expressions [[Feature #19071]] + + * Time.new now can parse a string like generated by Time#inspect + and return a Time instance based on the given argument. + [[Feature #18033]] + +* SyntaxError + * SyntaxError#path has been added. [[Feature #19138]] + +* TracePoint + + * TracePoint#binding now returns `nil` for `c_call`/`c_return` TracePoints. + [[Bug #18487]] + * TracePoint#enable `target_thread` keyword argument now defaults to the + current thread if a block is given and `target` and `target_line` keyword + arguments are not passed. [[Bug #16889]] + +* UnboundMethod + + * `UnboundMethod#==` returns `true` if the actual method is same. For example, + `String.instance_method(:object_id) == Array.instance_method(:object_id)` + returns `true`. [[Feature #18798]] + + * `UnboundMethod#inspect` does not show the receiver of `instance_method`. 
+ For example `String.instance_method(:object_id).inspect` returns + `"#<UnboundMethod: Kernel#object_id()>"` + (was `"#<UnboundMethod: String(Kernel)#object_id()>"`). + +* GC + + * Expose `need_major_gc` via `GC.latest_gc_info`. [GH-6791] + +* ObjectSpace + + * `ObjectSpace.dump_all` dump shapes as well. [GH-6868] + +## Stdlib updates + +* Bundler + + * Bundler now uses [PubGrub] resolver instead of [Molinillo] for performance improvement. + * Add --ext=rust support to bundle gem for creating simple gems with Rust extensions. + [[GH-rubygems-6149]] + * Make cloning git repos faster [[GH-rubygems-4475]] + +* RubyGems + + * Add mswin support for cargo builder. [[GH-rubygems-6167]] + +* CGI + + * `CGI.escapeURIComponent` and `CGI.unescapeURIComponent` are added. + [[Feature #18822]] + +* Coverage + + * `Coverage.setup` now accepts `eval: true`. By this, `eval` and related methods are + able to generate code coverage. [[Feature #19008]] + + * `Coverage.supported?(mode)` enables detection of what coverage modes are + supported. [[Feature #19026]] + +* Date + + * Added `Date#deconstruct_keys` and `DateTime#deconstruct_keys` same as [[Feature #19071]] + +* ERB + + * `ERB::Util.html_escape` is made faster than `CGI.escapeHTML`. + * It no longer allocates a String object when no character needs to be escaped. + * It skips calling `#to_s` method when an argument is already a String. + * `ERB::Escape.html_escape` is added as an alias to `ERB::Util.html_escape`, + which has not been monkey-patched by Rails. + * `ERB::Util.url_encode` is made faster using `CGI.escapeURIComponent`. + * `-S` option is removed from `erb` command. + +* FileUtils + + * Add FileUtils.ln_sr method and `relative:` option to FileUtils.ln_s. + [[Feature #18925]] + +* IRB + + * debug.gem integration commands have been added: `debug`, `break`, `catch`, + `next`, `delete`, `step`, `continue`, `finish`, `backtrace`, `info` + * They work even if you don't have `gem "debug"` in your Gemfile. 
+ * See also: [What's new in Ruby 3.2's IRB?](https://st0012.dev/whats-new-in-ruby-3-2-irb) + * More Pry-like commands and features have been added. + * `edit` and `show_cmds` (like Pry's `help`) are added. + * `ls` takes `-g` or `-G` option to filter out outputs. + * `show_source` is aliased from `$` and accepts unquoted inputs. + * `whereami` is aliased from `@`. + +* Net::Protocol + + * Improve `Net::BufferedIO` performance. [[GH-net-protocol-14]] + +* Pathname + + * Added `Pathname#lutime`. [[GH-pathname-20]] + +* Socket + + * Added the following constants for supported platforms. + * `SO_INCOMING_CPU` + * `SO_INCOMING_NAPI_ID` + * `SO_RTABLE` + * `SO_SETFIB` + * `SO_USER_COOKIE` + * `TCP_KEEPALIVE` + * `TCP_CONNECTION_INFO` + +* SyntaxSuggest + + * The feature of `syntax_suggest` formerly `dead_end` is integrated in Ruby. + [[Feature #18159]] + +* UNIXSocket + + * Add support for UNIXSocket on Windows. Emulate anonymous sockets. Add + support for File.socket? and File::Stat#socket? where possible. + [[Feature #19135]] + +* The following default gems are updated. 
+ + * RubyGems 3.4.1 + * abbrev 0.1.1 + * benchmark 0.2.1 + * bigdecimal 3.1.3 + * bundler 2.4.1 + * cgi 0.3.6 + * csv 3.2.6 + * date 3.3.3 + * delegate 0.3.0 + * did_you_mean 1.6.3 + * digest 3.1.1 + * drb 2.1.1 + * english 0.7.2 + * erb 4.0.2 + * error_highlight 0.5.1 + * etc 1.4.2 + * fcntl 1.0.2 + * fiddle 1.1.1 + * fileutils 1.7.0 + * forwardable 1.3.3 + * getoptlong 0.2.0 + * io-console 0.6.0 + * io-nonblock 0.2.0 + * io-wait 0.3.0 + * ipaddr 1.2.5 + * irb 1.6.2 + * json 2.6.3 + * logger 1.5.3 + * mutex_m 0.1.2 + * net-http 0.3.2 + * net-protocol 0.2.1 + * nkf 0.1.2 + * open-uri 0.3.0 + * open3 0.1.2 + * openssl 3.1.0 + * optparse 0.3.1 + * ostruct 0.5.5 + * pathname 0.2.1 + * pp 0.4.0 + * pstore 0.1.2 + * psych 5.0.1 + * racc 1.6.2 + * rdoc 6.5.0 + * readline-ext 0.1.5 + * reline 0.3.2 + * resolv 0.2.2 + * resolv-replace 0.1.1 + * securerandom 0.2.2 + * set 1.0.3 + * stringio 3.0.4 + * strscan 3.0.5 + * syntax_suggest 1.0.2 + * syslog 0.1.1 + * tempfile 0.1.3 + * time 0.2.1 + * timeout 0.3.1 + * tmpdir 0.1.3 + * tsort 0.1.1 + * un 0.2.1 + * uri 0.12.0 + * weakref 0.1.2 + * win32ole 1.8.9 + * yaml 0.2.1 + * zlib 3.0.0 + +* The following bundled gems are updated. + + * minitest 5.16.3 + * power_assert 2.0.3 + * test-unit 3.5.7 + * net-ftp 0.2.0 + * net-imap 0.3.4 + * net-pop 0.1.2 + * net-smtp 0.3.3 + * rbs 2.8.2 + * typeprof 0.21.3 + * debug 1.7.1 + +See GitHub releases like [GitHub Releases of Logger](https://github.com/ruby/logger/releases) or changelog for details of the default gems or bundled gems. + +## Supported platforms + +* WebAssembly/WASI is added. See [wasm/README.md] and [ruby.wasm] for more details. [[Feature #18462]] + +## Compatibility issues + +* `String#to_c` currently treat a sequence of underscores as an end of Complex + string. [[Bug #19087]] + +* Now `ENV.clone` raises `TypeError` as well as `ENV.dup` [[Bug #17767]] + +### Removed constants + +The following deprecated constants are removed. 
+ +* `Fixnum` and `Bignum` [[Feature #12005]] +* `Random::DEFAULT` [[Feature #17351]] +* `Struct::Group` +* `Struct::Passwd` + +### Removed methods + +The following deprecated methods are removed. + +* `Dir.exists?` [[Feature #17391]] +* `File.exists?` [[Feature #17391]] +* `Kernel#=~` [[Feature #15231]] +* `Kernel#taint`, `Kernel#untaint`, `Kernel#tainted?` + [[Feature #16131]] +* `Kernel#trust`, `Kernel#untrust`, `Kernel#untrusted?` + [[Feature #16131]] +* `Method#public?`, `Method#private?`, `Method#protected?`, + `UnboundMethod#public?`, `UnboundMethod#private?`, `UnboundMethod#protected?` + [[Bug #18729]] [[Bug #18751]] [[Bug #18435]] + +### Source code incompatibility of extension libraries + +* Extension libraries provide PRNG, subclasses of Random, need updates. + See [PRNG update] below for more information. [[Bug #19100]] + +### Error printer + +* Ruby no longer escapes control characters and backslashes in an + error message. [[Feature #18367]] + +### Constant lookup when defining a class/module + +* When defining a class/module directly under the Object class by class/module + statement, if there is already a class/module defined by `Module#include` + with the same name, the statement was handled as "open class" in Ruby 3.1 or before. + Since Ruby 3.2, a new class is defined instead. [[Feature #18832]] + +## Stdlib compatibility issues + +* Psych no longer bundles libyaml sources. + And also Fiddle no longer bundles libffi sources. + Users need to install the libyaml/libffi library themselves via the package + manager like apt, yum, brew, etc. + + Psych and fiddle supported the static build with specific version of libyaml + and libffi sources. You can build psych with libyaml-0.2.5 like this. + + ```console + $ ./configure --with-libyaml-source-dir=/path/to/libyaml-0.2.5 + ``` + + And you can build fiddle with libffi-3.4.4 like this. 
+ + ```console + $ ./configure --with-libffi-source-dir=/path/to/libffi-3.4.4 + ``` + + [[Feature #18571]] + +* Check cookie name/path/domain characters in `CGI::Cookie`. [[CVE-2021-33621]] + +* `URI.parse` returns an empty string in host instead of nil. [[sec-156615]] + +## C API updates + +### Updated C APIs + +The following APIs are updated. + +* PRNG update + + `rb_random_interface_t` in ruby/random.h was updated and versioned. + Extension libraries which use this interface and were built for older + versions need to be rebuilt with an added `init_int32` function. + +### Added C APIs + +* `VALUE rb_hash_new_capa(long capa)` was added to create hashes with the desired capacity. +* `rb_internal_thread_add_event_hook` and `rb_internal_thread_remove_event_hook` were added to instrument thread scheduling. + The following events are available: + * `RUBY_INTERNAL_THREAD_EVENT_STARTED` + * `RUBY_INTERNAL_THREAD_EVENT_READY` + * `RUBY_INTERNAL_THREAD_EVENT_RESUMED` + * `RUBY_INTERNAL_THREAD_EVENT_SUSPENDED` + * `RUBY_INTERNAL_THREAD_EVENT_EXITED` +* `rb_debug_inspector_current_depth` and `rb_debug_inspector_frame_depth` are added for debuggers. + +### Removed C APIs + +The following deprecated APIs are removed. + +* `rb_cData` variable. +* "taintedness" and "trustedness" functions. [[Feature #16131]] + +## Implementation improvements + +* Fixed several race conditions in Kernel#autoload. [[Bug #18782]] +* Cache invalidation for expressions referencing constants is now + more fine-grained. `RubyVM.stat(:global_constant_state)` was + removed because it was closely tied to the previous caching scheme + where setting any constant invalidates all caches in the system. + New keys, `:constant_cache_invalidations` and `:constant_cache_misses`, + were introduced to help with use cases for `:global_constant_state`. + [[Feature #18589]] +* The cache-based optimization for Regexp matching is introduced.
+ [[Feature #19104]] +* [Variable Width Allocation](https://shopify.engineering/ruby-variable-width-allocation) + is now enabled by default. [[Feature #18239]] +* Added a new instance variable caching mechanism, called object shapes, which + improves inline cache hits for most objects and allows us to generate very + efficient JIT code. Objects whose instance variables are defined in a + consistent order will see the most performance benefits. + [[Feature #18776]] +* Speed up marking instruction sequences by using a bitmap to find "markable" + objects. This change results in faster major collections. + [[Feature #18875]] + +## JIT + +### YJIT + +* YJIT is no longer experimental + * Has been tested on production workloads for over a year and proven to be quite stable. +* YJIT now supports both x86-64 and arm64/aarch64 CPUs on Linux, MacOS, BSD and other UNIX platforms. + * This release brings support for Mac M1/M2, AWS Graviton and Raspberry Pi 4. +* Building YJIT now requires Rust 1.58.0+. [[Feature #18481]] + * In order to ensure that CRuby is built with YJIT, please install `rustc` >= 1.58.0 + before running `./configure` + * Please reach out to the YJIT team should you run into any issues. +* Physical memory for JIT code is lazily allocated. Unlike Ruby 3.1, + the RSS of a Ruby process is minimized because virtual memory pages + allocated by `--yjit-exec-mem-size` will not be mapped to physical + memory pages until actually utilized by JIT code. +* Introduce Code GC that frees all code pages when the memory consumption + by JIT code reaches `--yjit-exec-mem-size`. + * `RubyVM::YJIT.runtime_stats` returns Code GC metrics in addition to + existing `inline_code_size` and `outlined_code_size` keys: + `code_gc_count`, `live_page_count`, `freed_page_count`, and `freed_code_size`. +* Most of the statistics produced by `RubyVM::YJIT.runtime_stats` are now available in release builds. 
+ * Simply run ruby with `--yjit-stats` to compute and dump stats (incurs some run-time overhead). +* YJIT is now optimized to take advantage of object shapes. [[Feature #18776]] +* Take advantage of finer-grained constant invalidation to invalidate less code when defining new constants. [[Feature #18589]] +* The default `--yjit-exec-mem-size` is changed to 64 (MiB). +* The default `--yjit-call-threshold` is changed to 30. + +### MJIT + +* The MJIT compiler is re-implemented in Ruby as `ruby_vm/mjit/compiler`. +* MJIT compiler is executed under a forked Ruby process instead of + doing it in a native thread called MJIT worker. [[Feature #18968]] + * As a result, Microsoft Visual Studio (MSWIN) is no longer supported. +* MinGW is no longer supported. [[Feature #18824]] +* Rename `--mjit-min-calls` to `--mjit-call-threshold`. +* Change default `--mjit-max-cache` back from 10000 to 100. + +[Feature #12005]: https://bugs.ruby-lang.org/issues/12005 +[Feature #12084]: https://bugs.ruby-lang.org/issues/12084 +[Feature #12655]: https://bugs.ruby-lang.org/issues/12655 +[Feature #12737]: https://bugs.ruby-lang.org/issues/12737 +[Feature #13110]: https://bugs.ruby-lang.org/issues/13110 +[Feature #14332]: https://bugs.ruby-lang.org/issues/14332 +[Feature #15231]: https://bugs.ruby-lang.org/issues/15231 +[Feature #15357]: https://bugs.ruby-lang.org/issues/15357 +[Bug #15928]: https://bugs.ruby-lang.org/issues/15928 +[Feature #16122]: https://bugs.ruby-lang.org/issues/16122 +[Feature #16131]: https://bugs.ruby-lang.org/issues/16131 +[Bug #16466]: https://bugs.ruby-lang.org/issues/16466 +[Feature #16663]: https://bugs.ruby-lang.org/issues/16663 +[Feature #16806]: https://bugs.ruby-lang.org/issues/16806 +[Bug #16889]: https://bugs.ruby-lang.org/issues/16889 +[Bug #16908]: https://bugs.ruby-lang.org/issues/16908 +[Feature #16989]: https://bugs.ruby-lang.org/issues/16989 +[Feature #17351]: https://bugs.ruby-lang.org/issues/17351 +[Feature #17391]: 
https://bugs.ruby-lang.org/issues/17391 +[Bug #17545]: https://bugs.ruby-lang.org/issues/17545 +[Bug #17767]: https://bugs.ruby-lang.org/issues/17767 +[Feature #17837]: https://bugs.ruby-lang.org/issues/17837 +[Feature #17881]: https://bugs.ruby-lang.org/issues/17881 +[Feature #18033]: https://bugs.ruby-lang.org/issues/18033 +[Feature #18159]: https://bugs.ruby-lang.org/issues/18159 +[Feature #18239]: https://bugs.ruby-lang.org/issues/18239#note-17 +[Feature #18351]: https://bugs.ruby-lang.org/issues/18351 +[Feature #18367]: https://bugs.ruby-lang.org/issues/18367 +[Bug #18435]: https://bugs.ruby-lang.org/issues/18435 +[Feature #18462]: https://bugs.ruby-lang.org/issues/18462 +[Feature #18481]: https://bugs.ruby-lang.org/issues/18481 +[Bug #18487]: https://bugs.ruby-lang.org/issues/18487 +[Feature #18564]: https://bugs.ruby-lang.org/issues/18564 +[Feature #18571]: https://bugs.ruby-lang.org/issues/18571 +[Feature #18585]: https://bugs.ruby-lang.org/issues/18585 +[Feature #18589]: https://bugs.ruby-lang.org/issues/18589 +[Feature #18595]: https://bugs.ruby-lang.org/issues/18595 +[Feature #18598]: https://bugs.ruby-lang.org/issues/18598 +[Bug #18625]: https://bugs.ruby-lang.org/issues/18625 +[Feature #18630]: https://bugs.ruby-lang.org/issues/18630 +[Bug #18633]: https://bugs.ruby-lang.org/issues/18633 +[Feature #18639]: https://bugs.ruby-lang.org/issues/18639 +[Feature #18685]: https://bugs.ruby-lang.org/issues/18685 +[Bug #18729]: https://bugs.ruby-lang.org/issues/18729 +[Bug #18751]: https://bugs.ruby-lang.org/issues/18751 +[Feature #18774]: https://bugs.ruby-lang.org/issues/18774 +[Feature #18776]: https://bugs.ruby-lang.org/issues/18776 +[Bug #18782]: https://bugs.ruby-lang.org/issues/18782 +[Feature #18788]: https://bugs.ruby-lang.org/issues/18788 +[Feature #18798]: https://bugs.ruby-lang.org/issues/18798 +[Feature #18809]: https://bugs.ruby-lang.org/issues/18809 +[Feature #18821]: https://bugs.ruby-lang.org/issues/18821 +[Feature #18822]: 
https://bugs.ruby-lang.org/issues/18822 +[Feature #18824]: https://bugs.ruby-lang.org/issues/18824 +[Feature #18832]: https://bugs.ruby-lang.org/issues/18832 +[Feature #18875]: https://bugs.ruby-lang.org/issues/18875 +[Feature #18925]: https://bugs.ruby-lang.org/issues/18925 +[Feature #18944]: https://bugs.ruby-lang.org/issues/18944 +[Feature #18949]: https://bugs.ruby-lang.org/issues/18949 +[Feature #18968]: https://bugs.ruby-lang.org/issues/18968 +[Feature #19008]: https://bugs.ruby-lang.org/issues/19008 +[Feature #19013]: https://bugs.ruby-lang.org/issues/19013 +[Feature #19026]: https://bugs.ruby-lang.org/issues/19026 +[Feature #19036]: https://bugs.ruby-lang.org/issues/19036 +[Feature #19060]: https://bugs.ruby-lang.org/issues/19060 +[Feature #19070]: https://bugs.ruby-lang.org/issues/19070 +[Feature #19071]: https://bugs.ruby-lang.org/issues/19071 +[Feature #19078]: https://bugs.ruby-lang.org/issues/19078 +[Bug #19087]: https://bugs.ruby-lang.org/issues/19087 +[Bug #19100]: https://bugs.ruby-lang.org/issues/19100 +[Feature #19104]: https://bugs.ruby-lang.org/issues/19104 +[Feature #19135]: https://bugs.ruby-lang.org/issues/19135 +[Feature #19138]: https://bugs.ruby-lang.org/issues/19138 +[Feature #19194]: https://bugs.ruby-lang.org/issues/19194 +[Molinillo]: https://github.com/CocoaPods/Molinillo +[PubGrub]: https://github.com/jhawthorn/pub_grub +[GH-net-protocol-14]: https://github.com/ruby/net-protocol/pull/14 +[GH-pathname-20]: https://github.com/ruby/pathname/pull/20 +[GH-6791]: https://github.com/ruby/ruby/pull/6791 +[GH-6868]: https://github.com/ruby/ruby/pull/6868 +[GH-rubygems-4475]: https://github.com/rubygems/rubygems/pull/4475 +[GH-rubygems-6149]: https://github.com/rubygems/rubygems/pull/6149 +[GH-rubygems-6167]: https://github.com/rubygems/rubygems/pull/6167 +[sec-156615]: https://hackerone.com/reports/156615 +[CVE-2021-33621]: https://www.ruby-lang.org/en/news/2022/11/22/http-response-splitting-in-cgi-cve-2021-33621/ +[wasm/README.md]: 
https://github.com/ruby/ruby/blob/master/wasm/README.md +[ruby.wasm]: https://github.com/ruby/ruby.wasm diff --git a/doc/NEWS/NEWS-3.3.0.md b/doc/NEWS/NEWS-3.3.0.md new file mode 100644 index 0000000000..364786d754 --- /dev/null +++ b/doc/NEWS/NEWS-3.3.0.md @@ -0,0 +1,529 @@ +# NEWS for Ruby 3.3.0 + +This document is a list of user-visible feature changes +since the **3.2.0** release, except for bug fixes. + +Note that each entry is kept to a minimum, see links for details. + +## Command line options + +* A new `performance` warning category was introduced. + They are not displayed by default even in verbose mode. + Turn them on with `-W:performance` or `Warning[:performance] = true`. [[Feature #19538]] + +* A new `RUBY_CRASH_REPORT` environment variable was introduced to allow + redirecting Ruby crash reports to a file or sub command. See the `BUG REPORT ENVIRONMENT` + section of the ruby manpage for further details. [[Feature #19790]] + +## Core classes updates + +Note: We're only listing outstanding class updates. + +* Array + + * Array#pack now raises ArgumentError for unknown directives. [[Bug #19150]] + +* Dir + + * Dir.for_fd added for returning a Dir object for the directory specified + by the provided directory file descriptor. [[Feature #19347]] + * Dir.fchdir added for changing the directory to the directory specified + by the provided directory file descriptor. [[Feature #19347]] + * Dir#chdir added for changing the directory to the directory specified by + the provided `Dir` object. [[Feature #19347]] + +* Encoding + + * `Encoding#replicate` has been removed, it was already deprecated. [[Feature #18949]] + +* Fiber + + * Introduce Fiber#kill. [[Bug #595]] + + ```ruby + fiber = Fiber.new do + while true + puts "Yielding..." + Fiber.yield + end + ensure + puts "Exiting..." + end + + fiber.resume + # Yielding... + fiber.kill + # Exiting... + ``` + +* MatchData + + * MatchData#named_captures now accepts optional `symbolize_names` + keyword. 
[[Feature #19591]] + +* Module + + * Module#set_temporary_name added for setting a temporary name for a + module. [[Feature #19521]] + +* ObjectSpace::WeakKeyMap + + * New core class to build collections with weak references. + The class use equality semantic to lookup keys like a regular hash, + but it doesn't hold strong references on the keys. [[Feature #18498]] + +* ObjectSpace::WeakMap + + * ObjectSpace::WeakMap#delete was added to eagerly clear weak map + entries. [[Feature #19561]] + +* Proc + * Now Proc#dup and Proc#clone call `#initialize_dup` and `#initialize_clone` + hooks respectively. [[Feature #19362]] + +* Process + + * New Process.warmup method that notify the Ruby virtual machine that the boot sequence is finished, + and that now is a good time to optimize the application. This is useful + for long-running applications. The actual optimizations performed are entirely + implementation-specific and may change in the future without notice. [[Feature #18885]] + +* Process::Status + + * Process::Status#& and Process::Status#>> are deprecated. [[Bug #19868]] + +* Range + + * Range#reverse_each can now process beginless ranges with an Integer endpoint. [[Feature #18515]] + * Range#reverse_each now raises TypeError for endless ranges. [[Feature #18551]] + * Range#overlap? added for checking if two ranges overlap. [[Feature #19839]] + +* Refinement + + * Add Refinement#target as an alternative of Refinement#refined_class. + Refinement#refined_class is deprecated and will be removed in Ruby + 3.4. [[Feature #19714]] + +* Regexp + + * The cache-based optimization now supports lookarounds and atomic groupings. That is, match + for Regexp containing these extensions can now also be performed in linear time to the length + of the input string. However, these cannot contain captures and cannot be nested. [[Feature #19725]] + +* String + + * String#unpack now raises ArgumentError for unknown directives. 
[[Bug #19150]] + * String#bytesplice now accepts new arguments index/length or range of the + source string to be copied. [[Feature #19314]] + +* Thread::Queue + + * Thread::Queue#freeze now raises TypeError. [[Bug #17146]] + +* Thread::SizedQueue + + * Thread::SizedQueue#freeze now raises TypeError. [[Bug #17146]] + +* Time + + * Time.new with a string argument became stricter. [[Bug #19293]] + + ```ruby + Time.new('2023-12-20') + # no time information (ArgumentError) + ``` + +* TracePoint + + * TracePoint supports `rescue` event. When the raised exception was rescued, + the TracePoint will fire the hook. `rescue` event only supports Ruby-level + `rescue`. [[Feature #19572]] + +## Stdlib updates + +* RubyGems and Bundler warn if users do `require` the following gems without adding them to Gemfile or gemspec. + This is because they will become the bundled gems in the future version of Ruby. This warning is suppressed + if you use bootsnap gem. We recommend to run your application with `DISABLE_BOOTSNAP=1` environmental variable + at least once. This is limitation of this version. + [[Feature #19351]] [[Feature #19776]] [[Feature #19843]] + * abbrev + * base64 + * bigdecimal + * csv + * drb + * getoptlong + * mutex_m + * nkf + * observer + * racc + * resolv-replace + * rinda + * syslog + +* Socket#recv and Socket#recv_nonblock returns `nil` instead of an empty string on closed + connections. Socket#recvmsg and Socket#recvmsg_nonblock returns `nil` instead of an empty packet on closed + connections. [[Bug #19012]] + +* Name resolution such as Socket.getaddrinfo, Socket.getnameinfo, Addrinfo.getaddrinfo, etc. + can now be interrupted. [[Feature #19965]] + +* Random::Formatter#alphanumeric is extended to accept optional `chars` + keyword argument. [[Feature #18183]] + +The following default gem is added. + +* prism 0.19.0 + +The following default gems are updated. 
+ +* RubyGems 3.5.3 +* abbrev 0.1.2 +* base64 0.2.0 +* benchmark 0.3.0 +* bigdecimal 3.1.5 +* bundler 2.5.3 +* cgi 0.4.1 +* csv 3.2.8 +* date 3.3.4 +* delegate 0.3.1 +* drb 2.2.0 +* english 0.8.0 +* erb 4.0.3 +* error_highlight 0.6.0 +* etc 1.4.3 +* fcntl 1.1.0 +* fiddle 1.1.2 +* fileutils 1.7.2 +* find 0.2.0 +* getoptlong 0.2.1 +* io-console 0.7.1 +* io-nonblock 0.3.0 +* io-wait 0.3.1 +* ipaddr 1.2.6 +* irb 1.11.0 +* json 2.7.1 +* logger 1.6.0 +* mutex_m 0.2.0 +* net-http 0.4.0 +* net-protocol 0.2.2 +* nkf 0.1.3 +* observer 0.1.2 +* open-uri 0.4.1 +* open3 0.2.1 +* openssl 3.2.0 +* optparse 0.4.0 +* ostruct 0.6.0 +* pathname 0.3.0 +* pp 0.5.0 +* prettyprint 0.2.0 +* pstore 0.1.3 +* psych 5.1.2 +* rdoc 6.6.2 +* readline 0.0.4 +* reline 0.4.1 +* resolv 0.3.0 +* rinda 0.2.0 +* securerandom 0.3.1 +* set 1.1.0 +* shellwords 0.2.0 +* singleton 0.2.0 +* stringio 3.1.0 +* strscan 3.0.7 +* syntax_suggest 2.0.0 +* syslog 0.1.2 +* tempfile 0.2.1 +* time 0.3.0 +* timeout 0.4.1 +* tmpdir 0.2.0 +* tsort 0.2.0 +* un 0.3.0 +* uri 0.13.0 +* weakref 0.1.3 +* win32ole 1.8.10 +* yaml 0.3.0 +* zlib 3.1.0 + +The following bundled gem is promoted from default gems. + +* racc 1.7.3 + +The following bundled gems are updated. + +* minitest 5.20.0 +* rake 13.1.0 +* test-unit 3.6.1 +* rexml 3.2.6 +* rss 0.3.0 +* net-ftp 0.3.3 +* net-imap 0.4.9 +* net-smtp 0.4.0 +* rbs 3.4.0 +* typeprof 0.21.9 +* debug 1.9.1 + +See GitHub releases like [Logger](https://github.com/ruby/logger/releases) or +changelog for details of the default gems or bundled gems. 
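As a rough way to compare the versions listed above with what a given Ruby installation actually ships, the sketch below asks RubyGems for the installed version of a few gems. The helper name `installed_gem_version` and the choice of sample gems are illustrative assumptions, not part of the NEWS entry; `Gem::Specification.find_by_name` is the standard RubyGems lookup.

```ruby
require "rubygems"

# Hypothetical helper (not from NEWS): returns the version string of a gem
# that RubyGems can see, or nil when no such gem is installed.
def installed_gem_version(name)
  Gem::Specification.find_by_name(name).version.to_s
rescue Gem::LoadError
  # find_by_name raises a Gem::LoadError subclass when the gem is missing.
  nil
end

# Sample gem names taken from the lists above; output depends on the local Ruby.
%w[prism json racc].each do |gem_name|
  version = installed_gem_version(gem_name)
  puts "#{gem_name}: #{version || 'not installed'}"
end
```

The printed versions will differ between Ruby releases and environments, so the output is best read side by side with the lists above rather than treated as fixed.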
+ +### Prism + +* Introduced [the Prism parser](https://github.com/ruby/prism) as a default gem + * Prism is a portable, error tolerant, and maintainable recursive descent parser for the Ruby language +* Prism is production ready and actively maintained, you can use it in place of Ripper + * There is [extensive documentation](https://ruby.github.io/prism/) on how to use Prism + * Prism is both a C library that will be used internally by CRuby and a Ruby gem that can be used by any tooling which needs to parse Ruby code + * Notable methods in the Prism API are: + * `Prism.parse(source)` which returns the AST as part of a parse result object + * `Prism.parse_comments(source)` which returns the comments + * `Prism.parse_success?(source)` which returns true if there are no errors +* You can make pull requests or issues directly on [the Prism repository](https://github.com/ruby/prism) if you are interested in contributing +* You can now use `ruby --parser=prism` or `RUBYOPT="--parser=prism"` to experiment with the Prism compiler. Please note that this flag is for debugging only. + +## Compatibility issues + +* Subprocess creation/forking via the following file open methods is deprecated. [[Feature #19630]] + * Kernel#open + * URI.open + * IO.binread + * IO.foreach + * IO.readlines + * IO.read + * IO.write + +* When given a non-lambda, non-literal block, Kernel#lambda with now raises + ArgumentError instead of returning it unmodified. These usages have been + issuing warnings under the `Warning[:deprecated]` category since Ruby 3.0.0. + [[Feature #19777]] + +* The `RUBY_GC_HEAP_INIT_SLOTS` environment variable has been deprecated and + removed. Environment variables `RUBY_GC_HEAP_%d_INIT_SLOTS` should be + used instead. [[Feature #19785]] + +* `it` calls without arguments in a block with no ordinary parameters are + deprecated. `it` will be a reference to the first block parameter in Ruby 3.4. 
+ [[Feature #18980]] + +* The error message for NoMethodError has changed: for efficiency it no longer uses the target object's `#inspect`, + and says "instance of ClassName" instead. [[Feature #18285]] + + ```ruby + ([1] * 100).nonexisting + # undefined method `nonexisting' for an instance of Array (NoMethodError) + ``` + +* Anonymous parameter forwarding is now disallowed inside a block + that uses anonymous parameters. [[Feature #19370]] + +## Stdlib compatibility issues + +* `racc` is promoted to a bundled gem. + * You need to add `racc` to your `Gemfile` if you use `racc` in a Bundler environment. +* `ext/readline` is retired. + * `reline` is a pure Ruby implementation compatible with the `ext/readline` API. + We will rely on `reline` in the future. If you need to use `ext/readline`, you can install + it from rubygems.org with `gem install readline-ext`. + * We no longer need to install libraries like `libreadline` or `libedit`. + +## C API updates + +* `rb_postponed_job` updates + * New APIs and deprecated APIs (see comments for details) + * added: `rb_postponed_job_preregister()` + * added: `rb_postponed_job_trigger()` + * deprecated: `rb_postponed_job_register()` (with a semantic change, see below) + * deprecated: `rb_postponed_job_register_one()` + * The postponed job APIs have been changed to address some rare crashes. + To solve the issue, we introduced two new APIs and deprecated the current APIs. + The semantics of these functions have also changed slightly; `rb_postponed_job_register` + now behaves like the `once` variant in that multiple calls with the same + `func` might be coalesced into a single execution of the `func`. + [[Feature #20057]] + +* Some updates for internal thread event hook APIs + * `rb_internal_thread_event_data_t` with a target Ruby thread (VALUE) + and callback functions (`rb_internal_thread_event_callback`) receive it. 
+ https://github.com/ruby/ruby/pull/8885 + * The following functions are introduced to manipulate Ruby thread local data + from internal thread event hook APIs (they are introduced since Ruby 3.2). + https://github.com/ruby/ruby/pull/8936 + * `rb_internal_thread_specific_key_create()` + * `rb_internal_thread_specific_get()` + * `rb_internal_thread_specific_set()` + +* `rb_profile_thread_frames()` is introduced to get a frames from + a specific thread. + [[Feature #10602]] + +* `rb_data_define()` is introduced to define `Data`. [[Feature #19757]] + +* `rb_ext_resolve_symbol()` is introduced to search a function from + extension libraries. [[Feature #20005]] + +* IO related updates: + * The details of `rb_io_t` will be hidden and deprecated attributes + are added for each members. [[Feature #19057]] + * `rb_io_path(VALUE io)` is introduced to get a path of `io`. + * `rb_io_closed_p(VALUE io)` to get opening or closing of `io`. + * `rb_io_mode(VALUE io)` to get the mode of `io`. + * `rb_io_open_descriptor()` is introduced to make an IO object from a file + descriptor. + +## Implementation improvements + +### Parser + +* Replace Bison with [Lrama LALR parser generator](https://github.com/ruby/lrama). + No need to install Bison to build Ruby from source code anymore. + We will no longer suffer bison compatibility issues and we can use new features by just implementing it to Lrama. [[Feature #19637]] + * See [The future vision of Ruby Parser](https://rubykaigi.org/2023/presentations/spikeolaf.html) for detail. + * Lrama internal parser is a LR parser generated by Racc for maintainability. + * Parameterizing Rules `(?, *, +)` are supported, it will be used in Ruby parse.y. + +### GC / Memory management + +* Major performance improvements over Ruby 3.2 + * Young objects referenced by old objects are no longer immediately + promoted to the old generation. This significantly reduces the frequency of + major GC collections. 
[[Feature #19678]] + * A new `REMEMBERED_WB_UNPROTECTED_OBJECTS_LIMIT_RATIO` tuning variable was + introduced to control the number of unprotected objects cause a major GC + collection to trigger. The default is set to `0.01` (1%). This significantly + reduces the frequency of major GC collection. [[Feature #19571]] + * Write Barriers were implemented for many core types that were missing them, + notably `Time`, `Enumerator`, `MatchData`, `Method`, `File::Stat`, `BigDecimal` + and several others. This significantly reduces minor GC collection time and major + GC collection frequency. + * Most core classes are now using Variable Width Allocation, notably `Hash`, `Time`, + `Thread::Backtrace`, `Thread::Backtrace::Location`, `File::Stat`, `Method`. + This makes these classes faster to allocate and free, use less memory and reduce + heap fragmentation. +* `defined?(@ivar)` is optimized with Object Shapes. + +### YJIT + +* Major performance improvements over Ruby 3.2 + * Support for splat and rest arguments has been improved. + * Registers are allocated for stack operations of the virtual machine. + * More calls with optional arguments are compiled. Exception handlers are also compiled. + * Unsupported call types and megamorphic call sites no longer exit to the interpreter. + * Basic methods like Rails `#blank?` and + [specialized `#present?`](https://github.com/rails/rails/pull/49909) are inlined. + * `Integer#*`, `Integer#!=`, `String#!=`, `String#getbyte`, + `Kernel#block_given?`, `Kernel#is_a?`, `Kernel#instance_of?`, and `Module#===` + are specially optimized. + * Compilation speed is now slightly faster than Ruby 3.2. + * Now more than 3x faster than the interpreter on Optcarrot! +* Significantly improved memory usage over Ruby 3.2 + * Metadata for compiled code uses a lot less memory. + * `--yjit-call-threshold` is automatically raised from 30 to 120 + when the application has more than 40,000 ISEQs. 
+ * `--yjit-cold-threshold` is added to skip compiling cold ISEQs. + * More compact code is generated on Arm64. +* Code GC is now disabled by default + * `--yjit-exec-mem-size` is treated as a hard limit where compilation of new code stops. + * No sudden drops in performance due to code GC. + Better copy-on-write behavior on servers reforking with + [Pitchfork](https://github.com/shopify/pitchfork). + * You can still enable code GC if desired with `--yjit-code-gc` +* Add `RubyVM::YJIT.enable` that can enable YJIT at run-time + * You can start YJIT without modifying command-line arguments or environment variables. + Rails 7.2 will [enable YJIT by default](https://github.com/rails/rails/pull/49947) + using this method. + * This can also be used to enable YJIT only once your application is + done booting. `--yjit-disable` can be used if you want to use other + YJIT options while disabling YJIT at boot. +* More YJIT stats are available by default + * `yjit_alloc_size` and several more metadata-related stats are now available by default. + * `ratio_in_yjit` stat produced by `--yjit-stats` is now available in release builds, + a special stats or dev build is no longer required to access most stats. +* Add more profiling capabilities + * `--yjit-perf` is added to facilitate profiling with Linux perf. + * `--yjit-trace-exits` now supports sampling with `--yjit-trace-exits-sample-rate=N` +* More thorough testing and multiple bug fixes +* `--yjit-stats=quiet` is added to avoid printing stats on exit. + +### MJIT + +* MJIT is removed. + * `--disable-jit-support` is removed. Consider using `--disable-yjit --disable-rjit` instead. + +### RJIT + +* Introduced a pure-Ruby JIT compiler RJIT. + * RJIT supports only x86\_64 architecture on Unix platforms. + * Unlike MJIT, it doesn't require a C compiler at runtime. +* RJIT exists only for experimental purposes. + * You should keep using YJIT in production. + +### M:N Thread scheduler + +* M:N Thread scheduler is introduced. 
[[Feature #19842]] + * Background: Ruby 1.8 and earlier used an M:1 thread scheduler (M Ruby threads + on 1 native thread, also called user-level threads or green threads). + Ruby 1.9 and later use a 1:1 thread scheduler (1 Ruby thread per + native thread). M:1 threads take fewer resources than 1:1 + threads because they need only 1 native thread. However, it is difficult + to support context switching for every blocking operation, so 1:1 + threads have been employed since Ruby 1.9. The M:N thread scheduler uses N native + threads for M Ruby threads (N is generally a small number). It doesn't + need the same number of native threads as Ruby threads (similar to the M:1 + thread scheduler). Our M:N threads also support blocking operations + as well as 1:1 threads do. See the ticket for more details. + Our M:N thread scheduler takes the goroutine scheduler in the + Go language as its reference. + * Within a Ractor, only 1 thread can run at a time because of the + implementation. Therefore, for applications that use only one Ractor + (most applications), the M:N thread scheduler works as an M:1 thread scheduler + with further extensions over Ruby 1.8. + * The M:N thread scheduler can introduce incompatibilities for C extensions, + so it is disabled by default on the main Ractor. + Setting the `RUBY_MN_THREADS=1` environment variable enables it. + On non-main Ractors, the M:N thread scheduler is enabled (and cannot + currently be disabled). + * `N` (the number of native threads) can be specified with the `RUBY_MAX_CPU` + environment variable. The default is 8. + Note that more than `N` native threads are used to support many kinds of + blocking operations. 
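To try the scheduler described above, a script like the following can be run once normally and once with `RUBY_MN_THREADS=1` set, on Ruby 3.3 or later. The program's results should be the same either way, since only the mapping of Ruby threads onto native threads changes. The file name `mn_demo.rb`, the thread count, and the sleep duration are arbitrary choices for illustration.

```ruby
# Run as:  RUBY_MN_THREADS=1 ruby mn_demo.rb   (mn_demo.rb is a hypothetical file name)
# The script itself only reads the variable so it can report what was requested.
mn_requested = ENV["RUBY_MN_THREADS"] == "1"

# Start more Ruby threads than the default native-thread limit of 8
# mentioned above; each one blocks briefly and returns a value.
threads = 32.times.map do |i|
  Thread.new do
    sleep 0.01 # a blocking operation
    i * i
  end
end

# Thread#value joins the thread and returns the block's result.
results = threads.map(&:value)

puts "M:N scheduler requested for the main Ractor: #{mn_requested}"
puts "sum of squares: #{results.sum}"
```

Because the main Ractor keeps the 1:1 scheduler unless the variable is set, comparing the two runs is a simple smoke test for C extensions that might be sensitive to the change.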
+ +[Bug #595]: https://bugs.ruby-lang.org/issues/595 +[Feature #10602]: https://bugs.ruby-lang.org/issues/10602 +[Bug #17146]: https://bugs.ruby-lang.org/issues/17146 +[Feature #18183]: https://bugs.ruby-lang.org/issues/18183 +[Feature #18285]: https://bugs.ruby-lang.org/issues/18285 +[Feature #18498]: https://bugs.ruby-lang.org/issues/18498 +[Feature #18515]: https://bugs.ruby-lang.org/issues/18515 +[Feature #18551]: https://bugs.ruby-lang.org/issues/18551 +[Feature #18885]: https://bugs.ruby-lang.org/issues/18885 +[Feature #18949]: https://bugs.ruby-lang.org/issues/18949 +[Feature #18980]: https://bugs.ruby-lang.org/issues/18980 +[Bug #19012]: https://bugs.ruby-lang.org/issues/19012 +[Feature #19057]: https://bugs.ruby-lang.org/issues/19057 +[Bug #19150]: https://bugs.ruby-lang.org/issues/19150 +[Bug #19293]: https://bugs.ruby-lang.org/issues/19293 +[Feature #19314]: https://bugs.ruby-lang.org/issues/19314 +[Feature #19347]: https://bugs.ruby-lang.org/issues/19347 +[Feature #19351]: https://bugs.ruby-lang.org/issues/19351 +[Feature #19362]: https://bugs.ruby-lang.org/issues/19362 +[Feature #19370]: https://bugs.ruby-lang.org/issues/19370 +[Feature #19521]: https://bugs.ruby-lang.org/issues/19521 +[Feature #19538]: https://bugs.ruby-lang.org/issues/19538 +[Feature #19561]: https://bugs.ruby-lang.org/issues/19561 +[Feature #19571]: https://bugs.ruby-lang.org/issues/19571 +[Feature #19572]: https://bugs.ruby-lang.org/issues/19572 +[Feature #19591]: https://bugs.ruby-lang.org/issues/19591 +[Feature #19630]: https://bugs.ruby-lang.org/issues/19630 +[Feature #19637]: https://bugs.ruby-lang.org/issues/19637 +[Feature #19678]: https://bugs.ruby-lang.org/issues/19678 +[Feature #19714]: https://bugs.ruby-lang.org/issues/19714 +[Feature #19725]: https://bugs.ruby-lang.org/issues/19725 +[Feature #19757]: https://bugs.ruby-lang.org/issues/19757 +[Feature #19776]: https://bugs.ruby-lang.org/issues/19776 +[Feature #19777]: https://bugs.ruby-lang.org/issues/19777 +[Feature 
#19785]: https://bugs.ruby-lang.org/issues/19785 +[Feature #19790]: https://bugs.ruby-lang.org/issues/19790 +[Feature #19839]: https://bugs.ruby-lang.org/issues/19839 +[Feature #19842]: https://bugs.ruby-lang.org/issues/19842 +[Feature #19843]: https://bugs.ruby-lang.org/issues/19843 +[Bug #19868]: https://bugs.ruby-lang.org/issues/19868 +[Feature #19965]: https://bugs.ruby-lang.org/issues/19965 +[Feature #20005]: https://bugs.ruby-lang.org/issues/20005 +[Feature #20057]: https://bugs.ruby-lang.org/issues/20057 diff --git a/doc/NEWS/NEWS-3.4.0.md b/doc/NEWS/NEWS-3.4.0.md new file mode 100644 index 0000000000..e9cc3a9569 --- /dev/null +++ b/doc/NEWS/NEWS-3.4.0.md @@ -0,0 +1,962 @@ +# NEWS for Ruby 3.4.0 + +This document is a list of user-visible feature changes +since the **3.3.0** release, except for bug fixes. + +Note that each entry is kept to a minimum, see links for details. + +## Language changes + +* `it` is added to reference a block parameter. [[Feature #18980]] + +* String literals in files without a `frozen_string_literal` comment now emit a deprecation warning + when they are mutated. + These warnings can be enabled with `-W:deprecated` or by setting `Warning[:deprecated] = true`. + To disable this change, you can run Ruby with the `--disable-frozen-string-literal` + command line argument. [[Feature #20205]] + + * `String#+@` now duplicates when mutating the string would emit + a deprecation warning, offered as a replacement for the + `str.dup if str.frozen?` pattern. + +* Keyword splatting `nil` when calling methods is now supported. + `**nil` is treated similarly to `**{}`, passing no keywords, + and not calling any conversion methods. [[Bug #20064]] + +* Block passing is no longer allowed in index assignment + (e.g. `a[0, &b] = 1`). [[Bug #19918]] + +* Keyword arguments are no longer allowed in index assignment + (e.g. `a[0, kw: 1] = 2`). 
[[Bug #20218]] + +* The toplevel name `::Ruby` is reserved now, and the definition will be warned + when `Warning[:deprecated]`. [[Feature #20884]] + +## Core classes updates + +Note: We're only listing outstanding class updates. + + +* Array + + * `Array#fetch_values` was added. [[Feature #20702]] + +* Exception + + * `Exception#set_backtrace` now accepts arrays of `Thread::Backtrace::Location`. + `Kernel#raise`, `Thread#raise` and `Fiber#raise` also accept this new format. [[Feature #13557]] + +* Fiber::Scheduler + + * An optional `Fiber::Scheduler#blocking_operation_wait` hook allows blocking operations to be moved out of the + event loop in order to reduce latency and improve multi-core processor utilization. [[Feature #20876]] + +* GC + + * `GC.config` added to allow setting configuration variables on the Garbage + Collector. [[Feature #20443]] + + * GC configuration parameter `rgengc_allow_full_mark` introduced. When `false` + GC will only mark young objects. Default is `true`. [[Feature #20443]] + +* Hash + + * `Hash.new` now accepts an optional `capacity:` argument, to preallocate the hash with a given capacity. + This can improve performance when building large hashes incrementally by saving on reallocation and + rehashing of keys. [[Feature #19236]] + +* IO::Buffer + + * `IO::Buffer#copy` can release the GVL, allowing other threads to run while copying data. [[Feature #20902]] + +* Integer + + * `Integer#**` used to return `Float::INFINITY` when the return value is large, but now returns an `Integer`. + If the return value is extremely large, it raises an exception. + [[Feature #20811]] + +* MatchData + + * `MatchData#bytebegin` and `MatchData#byteend` have been added. [[Feature #20576]] + +* Object + + * `Object#singleton_method` now returns methods in modules prepended to or included in the + receiver's singleton class. 
[[Bug #20620]] + + ```rb + o = Object.new + o.extend(Module.new{def a = 1}) + o.singleton_method(:a).call #=> 1 + ``` + +* Ractor + + * `require` in Ractor is allowed. The requiring process will be run on + the main Ractor. + `Ractor._require(feature)` is added to run requiring process on the + main Ractor. + [[Feature #20627]] + + * `Ractor.main?` is added. [[Feature #20627]] + + * `Ractor.[]` and `Ractor.[]=` are added to access the ractor local storage + of the current Ractor. [[Feature #20715]] + + * `Ractor.store_if_absent(key){ init }` is added to initialize ractor local + variables in thread-safety. [[Feature #20875]] + +* Range + + * `Range#size` now raises `TypeError` if the range is not iterable. [[Misc #18984]] + * `Range#step` now consistently has a semantics of iterating by using `+` operator + for all types, not only numerics. [[Feature #18368]] + + ```ruby + (Time.utc(2022, 2, 24)..).step(24*60*60).take(3) + #=> [2022-02-24 00:00:00 UTC, 2022-02-25 00:00:00 UTC, 2022-02-26 00:00:00 UTC] + ``` + +* Rational + + * `Rational#**` used to return `Float::INFINITY` or `Float::NAN` + when the numerator of the return value is large, but now returns an `Rational`. + If it is extremely large, it raises an exception. [[Feature #20811]] + +* RubyVM::AbstractSyntaxTree + + * Add `RubyVM::AbstractSyntaxTree::Node#locations` method which returns location objects + associated with the AST node. [[Feature #20624]] + * Add `RubyVM::AbstractSyntaxTree::Location` class which holds location information. [[Feature #20624]] + + +* String + + * `String#append_as_bytes` was added to more easily and efficiently work with binary buffers and protocols. + It directly concatenate the arguments into the string without any encoding validation or conversion. + [[Feature #20594]] + +* Symbol + + * The string returned by `Symbol#to_s` now emits a deprecation warning when mutated, and will be + frozen in a future version of Ruby. 
+ These warnings can be enabled with `-W:deprecated` or by setting `Warning[:deprecated] = true`. + [[Feature #20350]] + +* Time + + * On Windows, `Time#zone` now encodes the system timezone name in UTF-8 + instead of the active code page, if it contains non-ASCII characters. + [[Bug #20929]] + + * `Time#xmlschema` and its `Time#iso8601` alias have been moved into the core Time + class; previously they were an extension provided by the `time` gem. [[Feature #20707]] + +* Warning + + * Add `Warning.categories` method which returns a list of possible warning categories. + [[Feature #20293]] + +## Stdlib updates + +We only list stdlib changes that are notable feature changes. + +* RubyGems + + * Add `--attestation` option to gem push. It enables storing the signature of a build artifact to sigstore.dev. + +* Bundler + + * Add a `lockfile_checksums` configuration to include checksums in fresh lockfiles. + * Add bundle lock `--add-checksums` to add checksums to an existing lockfile. + +* JSON + + * Performance improvements: `JSON.parse` is about 1.5 times faster than in json-2.7.x. + +* Tempfile + + * The keyword argument `anonymous: true` is implemented for Tempfile.create. + `Tempfile.create(anonymous: true)` removes the created temporary file immediately, + so applications don't need to remove the file. + [[Feature #20497]] + +* win32/sspi.rb + + * This library is now extracted from the Ruby repository to [ruby/net-http-sspi]. + [[Feature #20775]] + +* Socket + + * `Socket::ResolutionError` and `Socket::ResolutionError#error_code` were added. + [[Feature #20018]] + +* IRB + + * Interactive method completion is now improved with type information by default. + [[Feature #20778]] + +Other changes are listed in the following sections. We also list the release history since the previously bundled version (the one in Ruby 3.3.0) for gems that have GitHub releases. + +The following default gem is added. + +* win32-registry 0.1.0 + +The following default gems are updated. 
+ +* RubyGems 3.6.2 +* benchmark 0.4.0 +* bundler 2.6.2 +* date 3.4.1 +* delegate 0.4.0 +* did_you_mean 2.0.0 +* digest 3.2.0 +* erb 4.0.4 +* error_highlight 0.7.0 +* etc 1.4.5 +* fcntl 1.2.0 +* fiddle 1.1.6 +* fileutils 1.7.3 +* io-console 0.8.0 +* io-nonblock 0.3.1 +* ipaddr 1.2.7 +* irb 1.14.3 +* json 2.9.1 +* logger 1.6.4 +* net-http 0.6.0 +* open-uri 0.5.0 +* openssl 3.3.0 +* optparse 0.6.0 +* ostruct 0.6.1 +* pathname 0.4.0 +* pp 0.6.2 +* prism 1.2.0 +* pstore 0.1.4 +* psych 5.2.2 +* rdoc 6.10.0 +* reline 0.6.0 +* resolv 0.6.0 +* securerandom 0.4.1 +* set 1.1.1 +* shellwords 0.2.2 +* singleton 0.3.0 +* stringio 3.1.2 +* strscan 3.1.2 +* syntax_suggest 2.0.2 +* tempfile 0.3.1 +* time 0.4.1 +* timeout 0.4.3 +* tmpdir 0.3.1 +* uri 1.0.2 +* win32ole 1.9.1 +* yaml 0.4.0 +* zlib 3.2.1 + + * 3.5.3 to [v3.5.4][RubyGems-v3.5.4], [v3.5.5][RubyGems-v3.5.5], [v3.5.6][RubyGems-v3.5.6], [v3.5.7][RubyGems-v3.5.7], [v3.5.8][RubyGems-v3.5.8], [v3.5.9][RubyGems-v3.5.9], [v3.5.10][RubyGems-v3.5.10], [v3.5.11][RubyGems-v3.5.11], [v3.5.12][RubyGems-v3.5.12], [v3.5.13][RubyGems-v3.5.13], [v3.5.14][RubyGems-v3.5.14], [v3.5.15][RubyGems-v3.5.15], [v3.5.16][RubyGems-v3.5.16], [v3.5.17][RubyGems-v3.5.17], [v3.5.18][RubyGems-v3.5.18], [v3.5.19][RubyGems-v3.5.19], [v3.5.20][RubyGems-v3.5.20], [v3.5.21][RubyGems-v3.5.21], [v3.5.22][RubyGems-v3.5.22], [v3.5.23][RubyGems-v3.5.23], [v3.6.0][RubyGems-v3.6.0], [v3.6.1][RubyGems-v3.6.1], [v3.6.2][RubyGems-v3.6.2] +* [benchmark][benchmark] 0.4.0 + * 0.3.0 to [v0.4.0][benchmark-v0.4.0] +* [bundler][bundler] 2.6.2 + * 2.5.3 to [v2.5.4][bundler-v2.5.4], [v2.5.5][bundler-v2.5.5], [v2.5.6][bundler-v2.5.6], [v2.5.7][bundler-v2.5.7], [v2.5.8][bundler-v2.5.8], [v2.5.9][bundler-v2.5.9], [v2.5.10][bundler-v2.5.10], [v2.5.11][bundler-v2.5.11], [v2.5.12][bundler-v2.5.12], [v2.5.13][bundler-v2.5.13], [v2.5.14][bundler-v2.5.14], [v2.5.15][bundler-v2.5.15], [v2.5.16][bundler-v2.5.16], [v2.5.17][bundler-v2.5.17], [v2.5.18][bundler-v2.5.18], 
[v2.5.19][bundler-v2.5.19], [v2.5.20][bundler-v2.5.20], [v2.5.21][bundler-v2.5.21], [v2.5.22][bundler-v2.5.22], [v2.5.23][bundler-v2.5.23], [v2.6.0][bundler-v2.6.0], [v2.6.1][bundler-v2.6.1], [v2.6.2][bundler-v2.6.2] +* [date][date] 3.4.1 + * 3.3.4 to [v3.4.0][date-v3.4.0], [v3.4.1][date-v3.4.1] +* [delegate][delegate] 0.4.0 + * 0.3.1 to [v0.4.0][delegate-v0.4.0] +* [did_you_mean][did_you_mean] 2.0.0 + * 1.6.3 to [v2.0.0][did_you_mean-v2.0.0] +* [digest][digest] 3.2.0 + * 3.1.1 to [v3.2.0.pre0][digest-v3.2.0.pre0], [v3.2.0][digest-v3.2.0] +* [erb][erb] 4.0.4 + * 4.0.3 to [v4.0.4][erb-v4.0.4] +* [error_highlight][error_highlight] 0.7.0 + * 0.6.0 to [v0.7.0][error_highlight-v0.7.0] +* [etc][etc] 1.4.5 + * 1.4.3 to [v1.4.4][etc-v1.4.4], [v1.4.5][etc-v1.4.5] +* [fcntl][fcntl] 1.2.0 + * 1.1.0 to [v1.2.0][fcntl-v1.2.0] +* [fiddle][fiddle] 1.1.6 + * 1.1.2 to [v1.1.3][fiddle-v1.1.3], [v1.1.4][fiddle-v1.1.4], [v1.1.5][fiddle-v1.1.5], [v1.1.6][fiddle-v1.1.6] +* [fileutils][fileutils] 1.7.3 + * 1.7.2 to [v1.7.3][fileutils-v1.7.3] +* [io-console][io-console] 0.8.0 + * 0.7.1 to [v0.7.2][io-console-v0.7.2], [v0.8.0.beta1][io-console-v0.8.0.beta1], [v0.8.0][io-console-v0.8.0] +* [io-nonblock][io-nonblock] 0.3.1 + * 0.3.0 to [v0.3.1][io-nonblock-v0.3.1] +* [ipaddr][ipaddr] 1.2.7 + * 1.2.6 to [v1.2.7][ipaddr-v1.2.7] +* [irb][irb] 1.14.3 + * 1.11.0 to [v1.11.1][irb-v1.11.1], [v1.11.2][irb-v1.11.2], [v1.12.0][irb-v1.12.0], [v1.13.0][irb-v1.13.0], [v1.13.1][irb-v1.13.1], [v1.13.2][irb-v1.13.2], [v1.14.0][irb-v1.14.0], [v1.14.1][irb-v1.14.1], [v1.14.2][irb-v1.14.2], [v1.14.3][irb-v1.14.3] +* [json][json] 2.9.1 + * 2.7.1 to [v2.7.2][json-v2.7.2], [v2.7.3.rc1][json-v2.7.3.rc1], [v2.7.3][json-v2.7.3], [v2.7.4][json-v2.7.4], [v2.7.5][json-v2.7.5], [v2.7.6][json-v2.7.6], [v2.8.0][json-v2.8.0], [v2.8.1][json-v2.8.1], [v2.8.2][json-v2.8.2], [v2.9.0][json-v2.9.0], [v2.9.1][json-v2.9.1] +* [logger][logger] 1.6.4 + * 1.6.0 to [v1.6.1][logger-v1.6.1], [v1.6.2][logger-v1.6.2], 
[v1.6.3][logger-v1.6.3], [v1.6.4][logger-v1.6.4] +* [net-http][net-http] 0.6.0 + * 0.4.0 to [v0.4.1][net-http-v0.4.1], [v0.5.0][net-http-v0.5.0], [v0.6.0][net-http-v0.6.0] +* [open-uri][open-uri] 0.5.0 + * 0.4.1 to [v0.5.0][open-uri-v0.5.0] +* [optparse][optparse] 0.6.0 + * 0.4.0 to [v0.5.0][optparse-v0.5.0], [v0.6.0][optparse-v0.6.0] +* [ostruct][ostruct] 0.6.1 + * 0.6.0 to [v0.6.1][ostruct-v0.6.1] +* [pathname][pathname] 0.4.0 + * 0.3.0 to [v0.4.0][pathname-v0.4.0] +* [pp][pp] 0.6.2 + * 0.5.0 to [v0.6.0][pp-v0.6.0], [v0.6.1][pp-v0.6.1], [v0.6.2][pp-v0.6.2] +* [prism][prism] 1.2.0 + * 0.19.0 to [v0.20.0][prism-v0.20.0], [v0.21.0][prism-v0.21.0], [v0.22.0][prism-v0.22.0], [v0.23.0][prism-v0.23.0], [v0.24.0][prism-v0.24.0], [v0.25.0][prism-v0.25.0], [v0.26.0][prism-v0.26.0], [v0.27.0][prism-v0.27.0], [v0.28.0][prism-v0.28.0], [v0.29.0][prism-v0.29.0], [v0.30.0][prism-v0.30.0], [v1.0.0][prism-v1.0.0], [v1.1.0][prism-v1.1.0], [v1.2.0][prism-v1.2.0] +* [pstore][pstore] 0.1.4 + * 0.1.3 to [v0.1.4][pstore-v0.1.4] +* [psych][psych] 5.2.2 + * 5.1.2 to [v5.2.0.beta1][psych-v5.2.0.beta1], [v5.2.0.beta2][psych-v5.2.0.beta2], [v5.2.0.beta3][psych-v5.2.0.beta3], [v5.2.0.beta4][psych-v5.2.0.beta4], [v5.2.0.beta5][psych-v5.2.0.beta5], [v5.2.0.beta6][psych-v5.2.0.beta6], [v5.2.0.beta7][psych-v5.2.0.beta7], [v5.2.0][psych-v5.2.0], [v5.2.1][psych-v5.2.1], [v5.2.2][psych-v5.2.2] +* [rdoc][rdoc] 6.10.0 + * 6.6.2 to [v6.7.0][rdoc-v6.7.0], [v6.8.0][rdoc-v6.8.0], [v6.8.1][rdoc-v6.8.1], [v6.9.0][rdoc-v6.9.0], [v6.9.1][rdoc-v6.9.1], [v6.10.0][rdoc-v6.10.0] +* [reline][reline] 0.6.0 + * 0.4.1 to [v0.4.2][reline-v0.4.2], [v0.4.3][reline-v0.4.3], [v0.5.0.pre.1][reline-v0.5.0.pre.1], [v0.5.0][reline-v0.5.0], [v0.5.1][reline-v0.5.1], [v0.5.2][reline-v0.5.2], [v0.5.3][reline-v0.5.3], [v0.5.4][reline-v0.5.4], [v0.5.5][reline-v0.5.5], [v0.5.6][reline-v0.5.6], [v0.5.7][reline-v0.5.7], [v0.5.8][reline-v0.5.8], [v0.5.9][reline-v0.5.9], [v0.5.10][reline-v0.5.10], [v0.5.11][reline-v0.5.11], 
[v0.5.12][reline-v0.5.12], [v0.6.0][reline-v0.6.0] +* [resolv][resolv] 0.6.0 + * 0.3.0 to [v0.4.0][resolv-v0.4.0], [v0.5.0][resolv-v0.5.0], [v0.6.0][resolv-v0.6.0] +* [securerandom][securerandom] 0.4.1 + * 0.3.1 to [v0.3.2][securerandom-v0.3.2], [v0.4.0][securerandom-v0.4.0], [v0.4.1][securerandom-v0.4.1] +* [set][set] 1.1.1 + * 1.1.0 to [v1.1.1][set-v1.1.1] +* [shellwords][shellwords] 0.2.2 + * 0.2.0 to [v0.2.1][shellwords-v0.2.1], [v0.2.2][shellwords-v0.2.2] +* [singleton][singleton] 0.3.0 + * 0.2.0 to [v0.3.0][singleton-v0.3.0] +* [stringio][stringio] 3.1.2 + * 3.1.0 to [v3.1.1][stringio-v3.1.1], [v3.1.2][stringio-v3.1.2] +* [strscan][strscan] 3.1.2 + * 3.0.7 to [v3.0.8][strscan-v3.0.8], [v3.0.9][strscan-v3.0.9], [v3.1.0][strscan-v3.1.0], [v3.1.1][strscan-v3.1.1], [v3.1.2][strscan-v3.1.2] +* [syntax_suggest][syntax_suggest] 2.0.2 + * 2.0.0 to [v2.0.1][syntax_suggest-v2.0.1], [v2.0.2][syntax_suggest-v2.0.2] +* [tempfile][tempfile] 0.3.1 + * 0.2.1 to [v0.3.0][tempfile-v0.3.0], [v0.3.1][tempfile-v0.3.1] +* [time][time] 0.4.1 + * 0.3.0 to [v0.4.0][time-v0.4.0], [v0.4.1][time-v0.4.1] +* [timeout][timeout] 0.4.3 + * 0.4.1 to [v0.4.2][timeout-v0.4.2], [v0.4.3][timeout-v0.4.3] +* [tmpdir][tmpdir] 0.3.1 + * 0.2.0 to [v0.3.0][tmpdir-v0.3.0], [v0.3.1][tmpdir-v0.3.1] +* [uri][uri] 1.0.2 + * 0.13.0 to [v0.13.1][uri-v0.13.1], [v1.0.0][uri-v1.0.0], [v1.0.1][uri-v1.0.1], [v1.0.2][uri-v1.0.2] +* [win32ole][win32ole] 1.9.1 + * 1.8.10 to [v1.9.0][win32ole-v1.9.0], [v1.9.1][win32ole-v1.9.1] +* [yaml][yaml] 0.4.0 + * 0.3.0 to [v0.4.0][yaml-v0.4.0] +* [zlib][zlib] 3.2.1 + * 3.1.0 to [v3.1.1][zlib-v3.1.1], [v3.2.0][zlib-v3.2.0], [v3.2.1][zlib-v3.2.1] + +The following bundled gem is added. + +* [repl_type_completor][repl_type_completor] 0.1.9 + +The following bundled gems are updated. 
+ +* [minitest][minitest] 5.25.4 + * 5.20.0 to [v5.25.4][minitest-v5.25.4] +* [power_assert][power_assert] 2.0.5 + * 2.0.3 to [v2.0.4][power_assert-v2.0.4], [v2.0.5][power_assert-v2.0.5] +* [rake][rake] 13.2.1 + * 13.1.0 to [v13.2.0][rake-v13.2.0], [v13.2.1][rake-v13.2.1] +* [test-unit][test-unit] 3.6.7 + * 3.6.1 to [3.6.2][test-unit-3.6.2], [3.6.3][test-unit-3.6.3], [3.6.4][test-unit-3.6.4], [3.6.5][test-unit-3.6.5], [3.6.6][test-unit-3.6.6], [3.6.7][test-unit-3.6.7] +* [rexml][rexml] 3.4.0 + * 3.2.6 to [v3.2.7][rexml-v3.2.7], [v3.2.8][rexml-v3.2.8], [v3.2.9][rexml-v3.2.9], [v3.3.0][rexml-v3.3.0], [v3.3.1][rexml-v3.3.1], [v3.3.2][rexml-v3.3.2], [v3.3.3][rexml-v3.3.3], [v3.3.4][rexml-v3.3.4], [v3.3.5][rexml-v3.3.5], [v3.3.6][rexml-v3.3.6], [v3.3.7][rexml-v3.3.7], [v3.3.8][rexml-v3.3.8], [v3.3.9][rexml-v3.3.9], [v3.4.0][rexml-v3.4.0] +* [rss][rss] 0.3.1 + * 0.3.0 to [0.3.1][rss-0.3.1] +* [net-ftp][net-ftp] 0.3.8 + * 0.3.3 to [v0.3.4][net-ftp-v0.3.4], [v0.3.5][net-ftp-v0.3.5], [v0.3.6][net-ftp-v0.3.6], [v0.3.7][net-ftp-v0.3.7], [v0.3.8][net-ftp-v0.3.8] +* [net-imap][net-imap] 0.5.4 + * 0.4.9 to [v0.4.9.1][net-imap-v0.4.9.1], [v0.4.10][net-imap-v0.4.10], [v0.4.11][net-imap-v0.4.11], [v0.4.12][net-imap-v0.4.12], [v0.4.13][net-imap-v0.4.13], [v0.4.14][net-imap-v0.4.14], [v0.4.15][net-imap-v0.4.15], [v0.4.16][net-imap-v0.4.16], [v0.4.17][net-imap-v0.4.17], [v0.5.0][net-imap-v0.5.0], [v0.4.18][net-imap-v0.4.18], [v0.5.1][net-imap-v0.5.1], [v0.5.2][net-imap-v0.5.2], [v0.5.3][net-imap-v0.5.3], [v0.5.4][net-imap-v0.5.4] +* [net-smtp][net-smtp] 0.5.0 + * 0.4.0 to [v0.4.0.1][net-smtp-v0.4.0.1], [v0.5.0][net-smtp-v0.5.0] +* [prime][prime] 0.1.3 + * 0.1.2 to [v0.1.3][prime-v0.1.3] +* [rbs][rbs] 3.8.0 + * 3.4.0 to [v3.4.1][rbs-v3.4.1], [v3.4.2][rbs-v3.4.2], [v3.4.3][rbs-v3.4.3], [v3.4.4][rbs-v3.4.4], [v3.5.0.pre.1][rbs-v3.5.0.pre.1], [v3.5.0.pre.2][rbs-v3.5.0.pre.2], [v3.5.0][rbs-v3.5.0], [v3.5.1][rbs-v3.5.1], [v3.5.2][rbs-v3.5.2], [v3.5.3][rbs-v3.5.3], 
[v3.6.0.dev.1][rbs-v3.6.0.dev.1], [v3.6.0.pre.1][rbs-v3.6.0.pre.1], [v3.6.0.pre.2][rbs-v3.6.0.pre.2], [v3.6.0.pre.3][rbs-v3.6.0.pre.3], [v3.6.0][rbs-v3.6.0], [v3.6.1][rbs-v3.6.1], [v3.7.0.dev.1][rbs-v3.7.0.dev.1], [v3.7.0.pre.1][rbs-v3.7.0.pre.1], [v3.7.0][rbs-v3.7.0], [v3.8.0.pre.1][rbs-v3.8.0.pre.1], [v3.8.0][rbs-v3.8.0] +* [typeprof][typeprof] 0.30.1 + * 0.21.9 to [v0.30.1][typeprof-v0.30.1] +* [debug][debug] 1.10.0 + * 1.9.1 to [v1.9.2][debug-v1.9.2], [v1.10.0][debug-v1.10.0] +* [racc][racc] 1.8.1 + * 1.7.3 to [v1.8.0][racc-v1.8.0], [v1.8.1][racc-v1.8.1] + +The following bundled gems are promoted from default gems. + +* [mutex_m][mutex_m] 0.3.0 + * 0.2.0 to [v0.3.0][mutex_m-v0.3.0] +* [getoptlong][getoptlong] 0.2.1 +* [base64][base64] 0.2.0 +* [bigdecimal][bigdecimal] 3.1.8 + * 3.1.5 to [v3.1.6][bigdecimal-v3.1.6], [v3.1.7][bigdecimal-v3.1.7], [v3.1.8][bigdecimal-v3.1.8] +* [observer][observer] 0.1.2 +* [abbrev][abbrev] 0.1.2 +* [resolv-replace][resolv-replace] 0.1.1 +* [rinda][rinda] 0.2.0 +* [drb][drb] 2.2.1 + * 2.2.0 to [v2.2.1][drb-v2.2.1] +* [nkf][nkf] 0.2.0 + * 0.1.3 to [v0.2.0][nkf-v0.2.0] +* [syslog][syslog] 0.2.0 + * 0.1.2 to [v0.2.0][syslog-v0.2.0] +* [csv][csv] 3.3.2 + * 3.2.8 to [v3.2.9][csv-v3.2.9], [v3.3.0][csv-v3.3.0], [v3.3.1][csv-v3.3.1], [v3.3.2][csv-v3.3.2] + +## Supported platforms + +## Compatibility issues + +* Error messages and backtrace displays have been changed. + + * Use a single quote instead of a backtick as an opening quote. [[Feature #16495]] + * Display a class name before a method name (only when the class has a permanent name). [[Feature #19117]] + * Extra `rescue`/`ensure` frames are no longer available on the backtrace. [[Feature #20275]] + * `Kernel#caller`, `Thread::Backtrace::Location`’s methods, etc. are also changed accordingly.
+ + Old: + ``` + test.rb:1:in `foo': undefined method `time' for an instance of Integer + from test.rb:2:in `<main>' + ``` + + New: + ``` + test.rb:1:in 'Object#foo': undefined method 'time' for an instance of Integer + from test.rb:2:in '<main>' + ``` + +* `Hash#inspect` rendering has been changed. [[Bug #20433]] + + * Symbol keys are displayed using the modern symbol key syntax: `"{user: 1}"` + * Other keys now have spaces around `=>`: `'{"user" => 1}'`, while previously they didn't: `'{"user"=>1}'` + +* `Kernel#Float()` now accepts a decimal string with the decimal part omitted. [[Feature #20705]] + + ```rb + Float("1.") #=> 1.0 (previously, an ArgumentError was raised) + Float("1.E-1") #=> 0.1 (previously, an ArgumentError was raised) + ``` + +* `String#to_f` now accepts a decimal string with the decimal part omitted. [[Feature #20705]] + Note that the result changes when an exponent is specified. + + ```rb + "1.".to_f #=> 1.0 + "1.E-1".to_f #=> 0.1 (previously, 1.0 was returned) + ``` + +* `Refinement#refined_class` has been removed. [[Feature #19714]] + +## Stdlib compatibility issues + +* DidYouMean + + * `DidYouMean::SPELL_CHECKERS[]=` and `DidYouMean::SPELL_CHECKERS.merge!` are removed. + +* Net::HTTP + + * Removed the following deprecated constants: + * `Net::HTTP::ProxyMod` + * `Net::NetPrivate::HTTPRequest` + * `Net::HTTPInformationCode` + * `Net::HTTPSuccessCode` + * `Net::HTTPRedirectionCode` + * `Net::HTTPRetriableCode` + * `Net::HTTPClientErrorCode` + * `Net::HTTPFatalErrorCode` + * `Net::HTTPServerErrorCode` + * `Net::HTTPResponseReceiver` + * `Net::HTTPResponceReceiver` + + These constants have been deprecated since 2012. + +* Timeout + + * Reject negative values for `Timeout.timeout`. [[Bug #20795]] + +* URI + + * Switched the default parser from the RFC 2396 compliant one to the RFC 3986 compliant one. + [[Bug #19266]] + +## C API updates + +* `rb_newobj` and `rb_newobj_of` (and corresponding macros `RB_NEWOBJ`, `RB_NEWOBJ_OF`, `NEWOBJ`, `NEWOBJ_OF`) have been removed.
[[Feature #20265]] +* Removed deprecated function `rb_gc_force_recycle`. [[Feature #18290]] + +## Implementation improvements + +* The default parser is now Prism. + To use the conventional parser, use the command-line argument `--parser=parse.y`. + [[Feature #20564]] + +* Happy Eyeballs version 2 (RFC8305), an algorithm that ensures faster and more reliable connections + by attempting IPv6 and IPv4 concurrently, is used in `Socket.tcp` and `TCPSocket.new`. + To disable it globally, set the environment variable `RUBY_TCP_NO_FAST_FALLBACK=1` or + call `Socket.tcp_fast_fallback=false`. + Or to disable it on a per-method basis, use the keyword argument `fast_fallback: false`. + [[Feature #20108]] [[Feature #20782]] + +* Alternative garbage collector (GC) implementations can be loaded dynamically + through the modular garbage collector feature. To enable this feature, + configure Ruby with `--with-modular-gc` at build time. GC libraries can be + loaded at runtime using the environment variable `RUBY_GC_LIBRARY`. + [[Feature #20351]] + +* Ruby's built-in garbage collector has been split into a separate file at + `gc/default/default.c` and interacts with Ruby using an API defined in + `gc/gc_impl.h`. The built-in garbage collector can now also be built as a + library using `make modular-gc MODULAR_GC=default` and enabled using the + environment variable `RUBY_GC_LIBRARY=default`. [[Feature #20470]] + +* An experimental GC library is provided based on [MMTk](https://www.mmtk.io/). + This GC library can be built using `make modular-gc MODULAR_GC=mmtk` and + enabled using the environment variable `RUBY_GC_LIBRARY=mmtk`. This requires + the Rust toolchain on the build machine. [[Feature #20860]] + +### YJIT + +#### New features + +* Command-line options + * `--yjit-mem-size` introduces a unified memory limit (default 128MiB) to track total YJIT memory usage, + providing a more intuitive alternative to the old `--yjit-exec-mem-size` option. 
+ * `--yjit-trace-exits=COUNTER` allows tracing of counted exits and fallbacks. + * `--yjit-perf=codegen` allows profiling of JIT code based on YJIT's codegen functions. + * `--yjit-log` enables a compilation log to track what gets compiled. +* Ruby API + * `RubyVM::YJIT.enable(log: true)` also enables a compilation log. + * `RubyVM::YJIT.log` provides access to the tail of the compilation log at run-time. +* YJIT stats + * `RubyVM::YJIT.runtime_stats` now always provides additional statistics on + invalidation, inlining, and metadata encoding. + * `RubyVM::YJIT.runtime_stats[:iseq_calls]` is added to profile non-inlined Ruby method calls. + * `RubyVM::YJIT.runtime_stats[:cfunc_calls]` is truncated to the top 20 entries for better performance. + +#### New optimizations + +* Compressed context reduces memory needed to store YJIT metadata +* Allocate registers for local variables and Ruby method arguments +* When YJIT is enabled, use more Core primitives written in Ruby: + * `Array#each`, `Array#select`, `Array#map` rewritten in Ruby for better performance [[Feature #20182]]. +* Ability to inline small/trivial methods such as: + * Empty methods + * Methods returning a constant + * Methods returning `self` + * Methods directly returning an argument +* Specialized codegen for many more runtime methods +* Optimize `String#getbyte`, `String#setbyte` and other string methods +* Optimize bitwise operations to speed up low-level bit/byte manipulation +* Support shareable constants in multi-ractor mode +* Various other incremental optimizations + +## Miscellaneous changes + +* Passing a block to a method which doesn't use the passed block will show + a warning in verbose mode (`-w`). + In connection with this, a new `strict_unused_block` warning category was introduced. + Turn it on with `-W:strict_unused_block` or `Warning[:strict_unused_block] = true`.
+ [[Feature #15554]] + +* Redefining some core methods that are specially optimized by the interpreter + and JIT like `String#freeze` or `Integer#+` now emits a performance class + warning (`-W:performance` or `Warning[:performance] = true`). + [[Feature #20429]] + +[Feature #13557]: https://bugs.ruby-lang.org/issues/13557 +[Feature #15554]: https://bugs.ruby-lang.org/issues/15554 +[Feature #16495]: https://bugs.ruby-lang.org/issues/16495 +[Feature #18290]: https://bugs.ruby-lang.org/issues/18290 +[Feature #18368]: https://bugs.ruby-lang.org/issues/18368 +[Feature #18980]: https://bugs.ruby-lang.org/issues/18980 +[Misc #18984]: https://bugs.ruby-lang.org/issues/18984 +[Feature #19117]: https://bugs.ruby-lang.org/issues/19117 +[Feature #19236]: https://bugs.ruby-lang.org/issues/19236 +[Bug #19266]: https://bugs.ruby-lang.org/issues/19266 +[Feature #19714]: https://bugs.ruby-lang.org/issues/19714 +[Bug #19918]: https://bugs.ruby-lang.org/issues/19918 +[Feature #20018]: https://bugs.ruby-lang.org/issues/20018 +[Bug #20064]: https://bugs.ruby-lang.org/issues/20064 +[Feature #20108]: https://bugs.ruby-lang.org/issues/20108 +[Feature #20182]: https://bugs.ruby-lang.org/issues/20182 +[Feature #20205]: https://bugs.ruby-lang.org/issues/20205 +[Bug #20218]: https://bugs.ruby-lang.org/issues/20218 +[Feature #20265]: https://bugs.ruby-lang.org/issues/20265 +[Feature #20275]: https://bugs.ruby-lang.org/issues/20275 +[Feature #20293]: https://bugs.ruby-lang.org/issues/20293 +[Feature #20350]: https://bugs.ruby-lang.org/issues/20350 +[Feature #20351]: https://bugs.ruby-lang.org/issues/20351 +[Feature #20429]: https://bugs.ruby-lang.org/issues/20429 +[Bug #20433]: https://bugs.ruby-lang.org/issues/20433 +[Feature #20443]: https://bugs.ruby-lang.org/issues/20443 +[Feature #20470]: https://bugs.ruby-lang.org/issues/20470 +[Feature #20497]: https://bugs.ruby-lang.org/issues/20497 +[Feature #20564]: https://bugs.ruby-lang.org/issues/20564 +[Feature #20576]: 
https://bugs.ruby-lang.org/issues/20576 +[Feature #20594]: https://bugs.ruby-lang.org/issues/20594 +[Bug #20620]: https://bugs.ruby-lang.org/issues/20620 +[Feature #20624]: https://bugs.ruby-lang.org/issues/20624 +[Feature #20627]: https://bugs.ruby-lang.org/issues/20627 +[Feature #20702]: https://bugs.ruby-lang.org/issues/20702 +[Feature #20705]: https://bugs.ruby-lang.org/issues/20705 +[Feature #20707]: https://bugs.ruby-lang.org/issues/20707 +[Feature #20715]: https://bugs.ruby-lang.org/issues/20715 +[Feature #20775]: https://bugs.ruby-lang.org/issues/20775 +[Feature #20778]: https://bugs.ruby-lang.org/issues/20778 +[Feature #20782]: https://bugs.ruby-lang.org/issues/20782 +[Bug #20795]: https://bugs.ruby-lang.org/issues/20795 +[Feature #20811]: https://bugs.ruby-lang.org/issues/20811 +[Feature #20860]: https://bugs.ruby-lang.org/issues/20860 +[Feature #20875]: https://bugs.ruby-lang.org/issues/20875 +[Feature #20876]: https://bugs.ruby-lang.org/issues/20876 +[Feature #20884]: https://bugs.ruby-lang.org/issues/20884 +[Feature #20902]: https://bugs.ruby-lang.org/issues/20902 +[Bug #20929]: https://bugs.ruby-lang.org/issues/20929 +[RubyGems-v3.5.4]: https://github.com/rubygems/rubygems/releases/tag/v3.5.4 +[RubyGems-v3.5.5]: https://github.com/rubygems/rubygems/releases/tag/v3.5.5 +[RubyGems-v3.5.6]: https://github.com/rubygems/rubygems/releases/tag/v3.5.6 +[RubyGems-v3.5.7]: https://github.com/rubygems/rubygems/releases/tag/v3.5.7 +[RubyGems-v3.5.8]: https://github.com/rubygems/rubygems/releases/tag/v3.5.8 +[RubyGems-v3.5.9]: https://github.com/rubygems/rubygems/releases/tag/v3.5.9 +[RubyGems-v3.5.10]: https://github.com/rubygems/rubygems/releases/tag/v3.5.10 +[RubyGems-v3.5.11]: https://github.com/rubygems/rubygems/releases/tag/v3.5.11 +[RubyGems-v3.5.12]: https://github.com/rubygems/rubygems/releases/tag/v3.5.12 +[RubyGems-v3.5.13]: https://github.com/rubygems/rubygems/releases/tag/v3.5.13 +[RubyGems-v3.5.14]: 
https://github.com/rubygems/rubygems/releases/tag/v3.5.14 +[RubyGems-v3.5.15]: https://github.com/rubygems/rubygems/releases/tag/v3.5.15 +[RubyGems-v3.5.16]: https://github.com/rubygems/rubygems/releases/tag/v3.5.16 +[RubyGems-v3.5.17]: https://github.com/rubygems/rubygems/releases/tag/v3.5.17 +[RubyGems-v3.5.18]: https://github.com/rubygems/rubygems/releases/tag/v3.5.18 +[RubyGems-v3.5.19]: https://github.com/rubygems/rubygems/releases/tag/v3.5.19 +[RubyGems-v3.5.20]: https://github.com/rubygems/rubygems/releases/tag/v3.5.20 +[RubyGems-v3.5.21]: https://github.com/rubygems/rubygems/releases/tag/v3.5.21 +[RubyGems-v3.5.22]: https://github.com/rubygems/rubygems/releases/tag/v3.5.22 +[RubyGems-v3.5.23]: https://github.com/rubygems/rubygems/releases/tag/v3.5.23 +[RubyGems-v3.6.0]: https://github.com/rubygems/rubygems/releases/tag/v3.6.0 +[RubyGems-v3.6.1]: https://github.com/rubygems/rubygems/releases/tag/v3.6.1 +[RubyGems-v3.6.2]: https://github.com/rubygems/rubygems/releases/tag/v3.6.2 +[benchmark-v0.4.0]: https://github.com/ruby/benchmark/releases/tag/v0.4.0 +[bundler-v2.5.4]: https://github.com/rubygems/rubygems/releases/tag/bundler-v2.5.4 +[bundler-v2.5.5]: https://github.com/rubygems/rubygems/releases/tag/bundler-v2.5.5 +[bundler-v2.5.6]: https://github.com/rubygems/rubygems/releases/tag/bundler-v2.5.6 +[bundler-v2.5.7]: https://github.com/rubygems/rubygems/releases/tag/bundler-v2.5.7 +[bundler-v2.5.8]: https://github.com/rubygems/rubygems/releases/tag/bundler-v2.5.8 +[bundler-v2.5.9]: https://github.com/rubygems/rubygems/releases/tag/bundler-v2.5.9 +[bundler-v2.5.10]: https://github.com/rubygems/rubygems/releases/tag/bundler-v2.5.10 +[bundler-v2.5.11]: https://github.com/rubygems/rubygems/releases/tag/bundler-v2.5.11 +[bundler-v2.5.12]: https://github.com/rubygems/rubygems/releases/tag/bundler-v2.5.12 +[bundler-v2.5.13]: https://github.com/rubygems/rubygems/releases/tag/bundler-v2.5.13 +[bundler-v2.5.14]: 
https://github.com/rubygems/rubygems/releases/tag/bundler-v2.5.14 +[bundler-v2.5.15]: https://github.com/rubygems/rubygems/releases/tag/bundler-v2.5.15 +[bundler-v2.5.16]: https://github.com/rubygems/rubygems/releases/tag/bundler-v2.5.16 +[bundler-v2.5.17]: https://github.com/rubygems/rubygems/releases/tag/bundler-v2.5.17 +[bundler-v2.5.18]: https://github.com/rubygems/rubygems/releases/tag/bundler-v2.5.18 +[bundler-v2.5.19]: https://github.com/rubygems/rubygems/releases/tag/bundler-v2.5.19 +[bundler-v2.5.20]: https://github.com/rubygems/rubygems/releases/tag/bundler-v2.5.20 +[bundler-v2.5.21]: https://github.com/rubygems/rubygems/releases/tag/bundler-v2.5.21 +[bundler-v2.5.22]: https://github.com/rubygems/rubygems/releases/tag/bundler-v2.5.22 +[bundler-v2.5.23]: https://github.com/rubygems/rubygems/releases/tag/bundler-v2.5.23 +[bundler-v2.6.0]: https://github.com/rubygems/rubygems/releases/tag/bundler-v2.6.0 +[bundler-v2.6.1]: https://github.com/rubygems/rubygems/releases/tag/bundler-v2.6.1 +[bundler-v2.6.2]: https://github.com/rubygems/rubygems/releases/tag/bundler-v2.6.2 +[date-v3.4.0]: https://github.com/ruby/date/releases/tag/v3.4.0 +[date-v3.4.1]: https://github.com/ruby/date/releases/tag/v3.4.1 +[delegate-v0.4.0]: https://github.com/ruby/delegate/releases/tag/v0.4.0 +[did_you_mean-v2.0.0]: https://github.com/ruby/did_you_mean/releases/tag/v2.0.0 +[digest-v3.2.0.pre0]: https://github.com/ruby/digest/releases/tag/v3.2.0.pre0 +[digest-v3.2.0]: https://github.com/ruby/digest/releases/tag/v3.2.0 +[erb-v4.0.4]: https://github.com/ruby/erb/releases/tag/v4.0.4 +[etc-v1.4.4]: https://github.com/ruby/etc/releases/tag/v1.4.4 +[etc-v1.4.5]: https://github.com/ruby/etc/releases/tag/v1.4.5 +[fcntl-v1.2.0]: https://github.com/ruby/fcntl/releases/tag/v1.2.0 +[fiddle-v1.1.3]: https://github.com/ruby/fiddle/releases/tag/v1.1.3 +[fiddle-v1.1.4]: https://github.com/ruby/fiddle/releases/tag/v1.1.4 +[fiddle-v1.1.5]: https://github.com/ruby/fiddle/releases/tag/v1.1.5 
+[fiddle-v1.1.6]: https://github.com/ruby/fiddle/releases/tag/v1.1.6 +[fileutils-v1.7.3]: https://github.com/ruby/fileutils/releases/tag/v1.7.3 +[io-console-v0.7.2]: https://github.com/ruby/io-console/releases/tag/v0.7.2 +[io-console-v0.8.0.beta1]: https://github.com/ruby/io-console/releases/tag/v0.8.0.beta1 +[io-console-v0.8.0]: https://github.com/ruby/io-console/releases/tag/v0.8.0 +[io-nonblock-v0.3.1]: https://github.com/ruby/io-nonblock/releases/tag/v0.3.1 +[ipaddr-v1.2.7]: https://github.com/ruby/ipaddr/releases/tag/v1.2.7 +[irb-v1.11.1]: https://github.com/ruby/irb/releases/tag/v1.11.1 +[irb-v1.11.2]: https://github.com/ruby/irb/releases/tag/v1.11.2 +[irb-v1.12.0]: https://github.com/ruby/irb/releases/tag/v1.12.0 +[irb-v1.13.0]: https://github.com/ruby/irb/releases/tag/v1.13.0 +[irb-v1.13.1]: https://github.com/ruby/irb/releases/tag/v1.13.1 +[irb-v1.13.2]: https://github.com/ruby/irb/releases/tag/v1.13.2 +[irb-v1.14.0]: https://github.com/ruby/irb/releases/tag/v1.14.0 +[irb-v1.14.1]: https://github.com/ruby/irb/releases/tag/v1.14.1 +[irb-v1.14.2]: https://github.com/ruby/irb/releases/tag/v1.14.2 +[irb-v1.14.3]: https://github.com/ruby/irb/releases/tag/v1.14.3 +[json-v2.7.2]: https://github.com/ruby/json/releases/tag/v2.7.2 +[json-v2.7.3.rc1]: https://github.com/ruby/json/releases/tag/v2.7.3.rc1 +[json-v2.7.3]: https://github.com/ruby/json/releases/tag/v2.7.3 +[json-v2.7.4]: https://github.com/ruby/json/releases/tag/v2.7.4 +[json-v2.7.5]: https://github.com/ruby/json/releases/tag/v2.7.5 +[json-v2.7.6]: https://github.com/ruby/json/releases/tag/v2.7.6 +[json-v2.8.0]: https://github.com/ruby/json/releases/tag/v2.8.0 +[json-v2.8.1]: https://github.com/ruby/json/releases/tag/v2.8.1 +[json-v2.8.2]: https://github.com/ruby/json/releases/tag/v2.8.2 +[json-v2.9.0]: https://github.com/ruby/json/releases/tag/v2.9.0 +[json-v2.9.1]: https://github.com/ruby/json/releases/tag/v2.9.1 +[logger-v1.6.1]: https://github.com/ruby/logger/releases/tag/v1.6.1 +[logger-v1.6.2]: 
https://github.com/ruby/logger/releases/tag/v1.6.2 +[logger-v1.6.3]: https://github.com/ruby/logger/releases/tag/v1.6.3 +[logger-v1.6.4]: https://github.com/ruby/logger/releases/tag/v1.6.4 +[net-http-v0.4.1]: https://github.com/ruby/net-http/releases/tag/v0.4.1 +[net-http-v0.5.0]: https://github.com/ruby/net-http/releases/tag/v0.5.0 +[net-http-v0.6.0]: https://github.com/ruby/net-http/releases/tag/v0.6.0 +[open-uri-v0.5.0]: https://github.com/ruby/open-uri/releases/tag/v0.5.0 +[optparse-v0.5.0]: https://github.com/ruby/optparse/releases/tag/v0.5.0 +[optparse-v0.6.0]: https://github.com/ruby/optparse/releases/tag/v0.6.0 +[ostruct-v0.6.1]: https://github.com/ruby/ostruct/releases/tag/v0.6.1 +[pathname-v0.4.0]: https://github.com/ruby/pathname/releases/tag/v0.4.0 +[pp-v0.6.0]: https://github.com/ruby/pp/releases/tag/v0.6.0 +[pp-v0.6.1]: https://github.com/ruby/pp/releases/tag/v0.6.1 +[pp-v0.6.2]: https://github.com/ruby/pp/releases/tag/v0.6.2 +[prism-v0.20.0]: https://github.com/ruby/prism/releases/tag/v0.20.0 +[prism-v0.21.0]: https://github.com/ruby/prism/releases/tag/v0.21.0 +[prism-v0.22.0]: https://github.com/ruby/prism/releases/tag/v0.22.0 +[prism-v0.23.0]: https://github.com/ruby/prism/releases/tag/v0.23.0 +[prism-v0.24.0]: https://github.com/ruby/prism/releases/tag/v0.24.0 +[prism-v0.25.0]: https://github.com/ruby/prism/releases/tag/v0.25.0 +[prism-v0.26.0]: https://github.com/ruby/prism/releases/tag/v0.26.0 +[prism-v0.27.0]: https://github.com/ruby/prism/releases/tag/v0.27.0 +[prism-v0.28.0]: https://github.com/ruby/prism/releases/tag/v0.28.0 +[prism-v0.29.0]: https://github.com/ruby/prism/releases/tag/v0.29.0 +[prism-v0.30.0]: https://github.com/ruby/prism/releases/tag/v0.30.0 +[prism-v1.0.0]: https://github.com/ruby/prism/releases/tag/v1.0.0 +[prism-v1.1.0]: https://github.com/ruby/prism/releases/tag/v1.1.0 +[prism-v1.2.0]: https://github.com/ruby/prism/releases/tag/v1.2.0 +[pstore-v0.1.4]: https://github.com/ruby/pstore/releases/tag/v0.1.4 
+[psych-v5.2.0.beta1]: https://github.com/ruby/psych/releases/tag/v5.2.0.beta1 +[psych-v5.2.0]: https://github.com/ruby/psych/releases/tag/v5.2.0 +[psych-v5.2.0.beta2]: https://github.com/ruby/psych/releases/tag/v5.2.0.beta2 +[psych-v5.2.0.beta3]: https://github.com/ruby/psych/releases/tag/v5.2.0.beta3 +[psych-v5.2.0.beta4]: https://github.com/ruby/psych/releases/tag/v5.2.0.beta4 +[psych-v5.2.0.beta5]: https://github.com/ruby/psych/releases/tag/v5.2.0.beta5 +[psych-v5.2.0.beta6]: https://github.com/ruby/psych/releases/tag/v5.2.0.beta6 +[psych-v5.2.0.beta7]: https://github.com/ruby/psych/releases/tag/v5.2.0.beta7 +[psych-v5.2.1]: https://github.com/ruby/psych/releases/tag/v5.2.1 +[psych-v5.2.2]: https://github.com/ruby/psych/releases/tag/v5.2.2 +[rdoc-v6.7.0]: https://github.com/ruby/rdoc/releases/tag/v6.7.0 +[rdoc-v6.8.0]: https://github.com/ruby/rdoc/releases/tag/v6.8.0 +[rdoc-v6.8.1]: https://github.com/ruby/rdoc/releases/tag/v6.8.1 +[rdoc-v6.9.0]: https://github.com/ruby/rdoc/releases/tag/v6.9.0 +[rdoc-v6.9.1]: https://github.com/ruby/rdoc/releases/tag/v6.9.1 +[rdoc-v6.10.0]: https://github.com/ruby/rdoc/releases/tag/v6.10.0 +[reline-v0.5.0.pre.1]: https://github.com/ruby/reline/releases/tag/v0.5.0.pre.1 +[reline-v0.4.2]: https://github.com/ruby/reline/releases/tag/v0.4.2 +[reline-v0.4.3]: https://github.com/ruby/reline/releases/tag/v0.4.3 +[reline-v0.5.0]: https://github.com/ruby/reline/releases/tag/v0.5.0 +[reline-v0.5.1]: https://github.com/ruby/reline/releases/tag/v0.5.1 +[reline-v0.5.2]: https://github.com/ruby/reline/releases/tag/v0.5.2 +[reline-v0.5.3]: https://github.com/ruby/reline/releases/tag/v0.5.3 +[reline-v0.5.4]: https://github.com/ruby/reline/releases/tag/v0.5.4 +[reline-v0.5.5]: https://github.com/ruby/reline/releases/tag/v0.5.5 +[reline-v0.5.6]: https://github.com/ruby/reline/releases/tag/v0.5.6 +[reline-v0.5.7]: https://github.com/ruby/reline/releases/tag/v0.5.7 +[reline-v0.5.8]: https://github.com/ruby/reline/releases/tag/v0.5.8 
+[reline-v0.5.9]: https://github.com/ruby/reline/releases/tag/v0.5.9 +[reline-v0.5.10]: https://github.com/ruby/reline/releases/tag/v0.5.10 +[reline-v0.5.11]: https://github.com/ruby/reline/releases/tag/v0.5.11 +[reline-v0.5.12]: https://github.com/ruby/reline/releases/tag/v0.5.12 +[reline-v0.6.0]: https://github.com/ruby/reline/releases/tag/v0.6.0 +[resolv-v0.4.0]: https://github.com/ruby/resolv/releases/tag/v0.4.0 +[resolv-v0.5.0]: https://github.com/ruby/resolv/releases/tag/v0.5.0 +[resolv-v0.6.0]: https://github.com/ruby/resolv/releases/tag/v0.6.0 +[securerandom-v0.3.2]: https://github.com/ruby/securerandom/releases/tag/v0.3.2 +[securerandom-v0.4.0]: https://github.com/ruby/securerandom/releases/tag/v0.4.0 +[securerandom-v0.4.1]: https://github.com/ruby/securerandom/releases/tag/v0.4.1 +[set-v1.1.1]: https://github.com/ruby/set/releases/tag/v1.1.1 +[shellwords-v0.2.1]: https://github.com/ruby/shellwords/releases/tag/v0.2.1 +[shellwords-v0.2.2]: https://github.com/ruby/shellwords/releases/tag/v0.2.2 +[singleton-v0.3.0]: https://github.com/ruby/singleton/releases/tag/v0.3.0 +[stringio-v3.1.1]: https://github.com/ruby/stringio/releases/tag/v3.1.1 +[stringio-v3.1.2]: https://github.com/ruby/stringio/releases/tag/v3.1.2 +[strscan-v3.0.8]: https://github.com/ruby/strscan/releases/tag/v3.0.8 +[strscan-v3.0.9]: https://github.com/ruby/strscan/releases/tag/v3.0.9 +[strscan-v3.1.0]: https://github.com/ruby/strscan/releases/tag/v3.1.0 +[strscan-v3.1.1]: https://github.com/ruby/strscan/releases/tag/v3.1.1 +[strscan-v3.1.2]: https://github.com/ruby/strscan/releases/tag/v3.1.2 +[syntax_suggest-v2.0.1]: https://github.com/ruby/syntax_suggest/releases/tag/v2.0.1 +[syntax_suggest-v2.0.2]: https://github.com/ruby/syntax_suggest/releases/tag/v2.0.2 +[tempfile-v0.3.0]: https://github.com/ruby/tempfile/releases/tag/v0.3.0 +[tempfile-v0.3.1]: https://github.com/ruby/tempfile/releases/tag/v0.3.1 +[time-v0.4.0]: https://github.com/ruby/time/releases/tag/v0.4.0 +[time-v0.4.1]: 
https://github.com/ruby/time/releases/tag/v0.4.1 +[timeout-v0.4.2]: https://github.com/ruby/timeout/releases/tag/v0.4.2 +[timeout-v0.4.3]: https://github.com/ruby/timeout/releases/tag/v0.4.3 +[tmpdir-v0.3.0]: https://github.com/ruby/tmpdir/releases/tag/v0.3.0 +[tmpdir-v0.3.1]: https://github.com/ruby/tmpdir/releases/tag/v0.3.1 +[uri-v0.13.1]: https://github.com/ruby/uri/releases/tag/v0.13.1 +[uri-v1.0.0]: https://github.com/ruby/uri/releases/tag/v1.0.0 +[uri-v1.0.1]: https://github.com/ruby/uri/releases/tag/v1.0.1 +[uri-v1.0.2]: https://github.com/ruby/uri/releases/tag/v1.0.2 +[win32ole-v1.9.0]: https://github.com/ruby/win32ole/releases/tag/v1.9.0 +[win32ole-v1.9.1]: https://github.com/ruby/win32ole/releases/tag/v1.9.1 +[yaml-v0.4.0]: https://github.com/ruby/yaml/releases/tag/v0.4.0 +[zlib-v3.1.1]: https://github.com/ruby/zlib/releases/tag/v3.1.1 +[zlib-v3.2.0]: https://github.com/ruby/zlib/releases/tag/v3.2.0 +[zlib-v3.2.1]: https://github.com/ruby/zlib/releases/tag/v3.2.1 +[minitest-v5.25.4]: https://github.com/seattlerb/minitest/releases/tag/v5.25.4 +[power_assert-v2.0.4]: https://github.com/ruby/power_assert/releases/tag/v2.0.4 +[power_assert-v2.0.5]: https://github.com/ruby/power_assert/releases/tag/v2.0.5 +[rake-v13.2.0]: https://github.com/ruby/rake/releases/tag/v13.2.0 +[rake-v13.2.1]: https://github.com/ruby/rake/releases/tag/v13.2.1 +[test-unit-3.6.2]: https://github.com/test-unit/test-unit/releases/tag/3.6.2 +[test-unit-3.6.3]: https://github.com/test-unit/test-unit/releases/tag/3.6.3 +[test-unit-3.6.4]: https://github.com/test-unit/test-unit/releases/tag/3.6.4 +[test-unit-3.6.5]: https://github.com/test-unit/test-unit/releases/tag/3.6.5 +[test-unit-3.6.6]: https://github.com/test-unit/test-unit/releases/tag/3.6.6 +[test-unit-3.6.7]: https://github.com/test-unit/test-unit/releases/tag/3.6.7 +[rexml-v3.2.7]: https://github.com/ruby/rexml/releases/tag/v3.2.7 +[rexml-v3.2.8]: https://github.com/ruby/rexml/releases/tag/v3.2.8 +[rexml-v3.2.9]: 
https://github.com/ruby/rexml/releases/tag/v3.2.9 +[rexml-v3.3.0]: https://github.com/ruby/rexml/releases/tag/v3.3.0 +[rexml-v3.3.1]: https://github.com/ruby/rexml/releases/tag/v3.3.1 +[rexml-v3.3.2]: https://github.com/ruby/rexml/releases/tag/v3.3.2 +[rexml-v3.3.3]: https://github.com/ruby/rexml/releases/tag/v3.3.3 +[rexml-v3.3.4]: https://github.com/ruby/rexml/releases/tag/v3.3.4 +[rexml-v3.3.5]: https://github.com/ruby/rexml/releases/tag/v3.3.5 +[rexml-v3.3.6]: https://github.com/ruby/rexml/releases/tag/v3.3.6 +[rexml-v3.3.7]: https://github.com/ruby/rexml/releases/tag/v3.3.7 +[rexml-v3.3.8]: https://github.com/ruby/rexml/releases/tag/v3.3.8 +[rexml-v3.3.9]: https://github.com/ruby/rexml/releases/tag/v3.3.9 +[rexml-v3.4.0]: https://github.com/ruby/rexml/releases/tag/v3.4.0 +[rss-0.3.1]: https://github.com/ruby/rss/releases/tag/0.3.1 +[net-ftp-v0.3.4]: https://github.com/ruby/net-ftp/releases/tag/v0.3.4 +[net-ftp-v0.3.5]: https://github.com/ruby/net-ftp/releases/tag/v0.3.5 +[net-ftp-v0.3.6]: https://github.com/ruby/net-ftp/releases/tag/v0.3.6 +[net-ftp-v0.3.7]: https://github.com/ruby/net-ftp/releases/tag/v0.3.7 +[net-ftp-v0.3.8]: https://github.com/ruby/net-ftp/releases/tag/v0.3.8 +[net-imap-v0.4.9.1]: https://github.com/ruby/net-imap/releases/tag/v0.4.9.1 +[net-imap-v0.4.10]: https://github.com/ruby/net-imap/releases/tag/v0.4.10 +[net-imap-v0.4.11]: https://github.com/ruby/net-imap/releases/tag/v0.4.11 +[net-imap-v0.4.12]: https://github.com/ruby/net-imap/releases/tag/v0.4.12 +[net-imap-v0.4.13]: https://github.com/ruby/net-imap/releases/tag/v0.4.13 +[net-imap-v0.4.14]: https://github.com/ruby/net-imap/releases/tag/v0.4.14 +[net-imap-v0.4.15]: https://github.com/ruby/net-imap/releases/tag/v0.4.15 +[net-imap-v0.4.16]: https://github.com/ruby/net-imap/releases/tag/v0.4.16 +[net-imap-v0.4.17]: https://github.com/ruby/net-imap/releases/tag/v0.4.17 +[net-imap-v0.5.0]: https://github.com/ruby/net-imap/releases/tag/v0.5.0 +[net-imap-v0.4.18]: 
https://github.com/ruby/net-imap/releases/tag/v0.4.18 +[net-imap-v0.5.1]: https://github.com/ruby/net-imap/releases/tag/v0.5.1 +[net-imap-v0.5.2]: https://github.com/ruby/net-imap/releases/tag/v0.5.2 +[net-imap-v0.5.3]: https://github.com/ruby/net-imap/releases/tag/v0.5.3 +[net-imap-v0.5.4]: https://github.com/ruby/net-imap/releases/tag/v0.5.4 +[net-smtp-v0.4.0.1]: https://github.com/ruby/net-smtp/releases/tag/v0.4.0.1 +[net-smtp-v0.5.0]: https://github.com/ruby/net-smtp/releases/tag/v0.5.0 +[prime-v0.1.3]: https://github.com/ruby/prime/releases/tag/v0.1.3 +[rbs-v3.4.1]: https://github.com/ruby/rbs/releases/tag/v3.4.1 +[rbs-v3.4.2]: https://github.com/ruby/rbs/releases/tag/v3.4.2 +[rbs-v3.4.3]: https://github.com/ruby/rbs/releases/tag/v3.4.3 +[rbs-v3.4.4]: https://github.com/ruby/rbs/releases/tag/v3.4.4 +[rbs-v3.5.0.pre.1]: https://github.com/ruby/rbs/releases/tag/v3.5.0.pre.1 +[rbs-v3.5.0.pre.2]: https://github.com/ruby/rbs/releases/tag/v3.5.0.pre.2 +[rbs-v3.5.0]: https://github.com/ruby/rbs/releases/tag/v3.5.0 +[rbs-v3.5.1]: https://github.com/ruby/rbs/releases/tag/v3.5.1 +[rbs-v3.5.2]: https://github.com/ruby/rbs/releases/tag/v3.5.2 +[rbs-v3.5.3]: https://github.com/ruby/rbs/releases/tag/v3.5.3 +[rbs-v3.6.0.dev.1]: https://github.com/ruby/rbs/releases/tag/v3.6.0.dev.1 +[rbs-v3.6.0.pre.1]: https://github.com/ruby/rbs/releases/tag/v3.6.0.pre.1 +[rbs-v3.6.0.pre.2]: https://github.com/ruby/rbs/releases/tag/v3.6.0.pre.2 +[rbs-v3.6.0.pre.3]: https://github.com/ruby/rbs/releases/tag/v3.6.0.pre.3 +[rbs-v3.6.0]: https://github.com/ruby/rbs/releases/tag/v3.6.0 +[rbs-v3.6.1]: https://github.com/ruby/rbs/releases/tag/v3.6.1 +[rbs-v3.7.0.dev.1]: https://github.com/ruby/rbs/releases/tag/v3.7.0.dev.1 +[rbs-v3.7.0.pre.1]: https://github.com/ruby/rbs/releases/tag/v3.7.0.pre.1 +[rbs-v3.7.0]: https://github.com/ruby/rbs/releases/tag/v3.7.0 +[rbs-v3.8.0.pre.1]: https://github.com/ruby/rbs/releases/tag/v3.8.0.pre.1 +[rbs-v3.8.0]: https://github.com/ruby/rbs/releases/tag/v3.8.0 
+[debug-v1.9.2]: https://github.com/ruby/debug/releases/tag/v1.9.2 +[debug-v1.10.0]: https://github.com/ruby/debug/releases/tag/v1.10.0 +[racc-v1.8.0]: https://github.com/ruby/racc/releases/tag/v1.8.0 +[racc-v1.8.1]: https://github.com/ruby/racc/releases/tag/v1.8.1 +[mutex_m-v0.3.0]: https://github.com/ruby/mutex_m/releases/tag/v0.3.0 +[bigdecimal-v3.1.6]: https://github.com/ruby/bigdecimal/releases/tag/v3.1.6 +[bigdecimal-v3.1.7]: https://github.com/ruby/bigdecimal/releases/tag/v3.1.7 +[bigdecimal-v3.1.8]: https://github.com/ruby/bigdecimal/releases/tag/v3.1.8 +[drb-v2.2.1]: https://github.com/ruby/drb/releases/tag/v2.2.1 +[nkf-v0.2.0]: https://github.com/ruby/nkf/releases/tag/v0.2.0 +[syslog-v0.2.0]: https://github.com/ruby/syslog/releases/tag/v0.2.0 +[csv-v3.2.9]: https://github.com/ruby/csv/releases/tag/v3.2.9 +[csv-v3.3.0]: https://github.com/ruby/csv/releases/tag/v3.3.0 +[csv-v3.3.1]: https://github.com/ruby/csv/releases/tag/v3.3.1 +[csv-v3.3.2]: https://github.com/ruby/csv/releases/tag/v3.3.2 +[ruby/net-http-sspi]: https://github.com/ruby/net-http-sspi +[typeprof-v0.30.1]: https://github.com/ruby/typeprof/releases/tag/v0.30.1 + +[RubyGems]: https://github.com/rubygems/rubygems +[benchmark]: https://github.com/ruby/benchmark +[bundler]: https://github.com/rubygems/rubygems +[date]: https://github.com/ruby/date +[delegate]: https://github.com/ruby/delegate +[did_you_mean]: https://github.com/ruby/did_you_mean +[digest]: https://github.com/ruby/digest +[erb]: https://github.com/ruby/erb +[error_highlight]: https://github.com/ruby/error_highlight +[etc]: https://github.com/ruby/etc +[fcntl]: https://github.com/ruby/fcntl +[fiddle]: https://github.com/ruby/fiddle +[fileutils]: https://github.com/ruby/fileutils +[io-console]: https://github.com/ruby/io-console +[io-nonblock]: https://github.com/ruby/io-nonblock +[ipaddr]: https://github.com/ruby/ipaddr +[irb]: https://github.com/ruby/irb +[json]: https://github.com/ruby/json +[logger]: 
https://github.com/ruby/logger +[net-http]: https://github.com/ruby/net-http +[open-uri]: https://github.com/ruby/open-uri +[optparse]: https://github.com/ruby/optparse +[ostruct]: https://github.com/ruby/ostruct +[pathname]: https://github.com/ruby/pathname +[pp]: https://github.com/ruby/pp +[prism]: https://github.com/ruby/prism +[pstore]: https://github.com/ruby/pstore +[psych]: https://github.com/ruby/psych +[rdoc]: https://github.com/ruby/rdoc +[reline]: https://github.com/ruby/reline +[resolv]: https://github.com/ruby/resolv +[securerandom]: https://github.com/ruby/securerandom +[set]: https://github.com/ruby/set +[shellwords]: https://github.com/ruby/shellwords +[singleton]: https://github.com/ruby/singleton +[stringio]: https://github.com/ruby/stringio +[strscan]: https://github.com/ruby/strscan +[syntax_suggest]: https://github.com/ruby/syntax_suggest +[tempfile]: https://github.com/ruby/tempfile +[time]: https://github.com/ruby/time +[timeout]: https://github.com/ruby/timeout +[tmpdir]: https://github.com/ruby/tmpdir +[uri]: https://github.com/ruby/uri +[win32ole]: https://github.com/ruby/win32ole +[yaml]: https://github.com/ruby/yaml +[zlib]: https://github.com/ruby/zlib + +[repl_type_completor]: https://github.com/ruby/repl_type_completor +[minitest]: https://github.com/seattlerb/minitest +[power_assert]: https://github.com/ruby/power_assert +[rake]: https://github.com/ruby/rake +[test-unit]: https://github.com/test-unit/test-unit +[rexml]: https://github.com/ruby/rexml +[rss]: https://github.com/ruby/rss +[net-ftp]: https://github.com/ruby/net-ftp +[net-imap]: https://github.com/ruby/net-imap +[net-smtp]: https://github.com/ruby/net-smtp +[prime]: https://github.com/ruby/prime +[rbs]: https://github.com/ruby/rbs +[typeprof]: https://github.com/ruby/typeprof +[debug]: https://github.com/ruby/debug +[racc]: https://github.com/ruby/racc +[mutex_m]: https://github.com/ruby/mutex_m +[getoptlong]: https://github.com/ruby/getoptlong +[base64]: 
https://github.com/ruby/base64 +[bigdecimal]: https://github.com/ruby/bigdecimal +[observer]: https://github.com/ruby/observer +[abbrev]: https://github.com/ruby/abbrev +[resolv-replace]: https://github.com/ruby/resolv-replace +[rinda]: https://github.com/ruby/rinda +[drb]: https://github.com/ruby/drb +[nkf]: https://github.com/ruby/nkf +[syslog]: https://github.com/ruby/syslog +[csv]: https://github.com/ruby/csv diff --git a/doc/NEWS/NEWS-4.0.0.md b/doc/NEWS/NEWS-4.0.0.md new file mode 100644 index 0000000000..5d932fbf5d --- /dev/null +++ b/doc/NEWS/NEWS-4.0.0.md @@ -0,0 +1,802 @@ +# NEWS for Ruby 4.0.0 + +This document is a list of user-visible feature changes +since the **3.4.0** release, except for bug fixes. + +Note that each entry is kept to a minimum; see links for details. + +## Language changes + +* `*nil` no longer calls `nil.to_a`, similar to how `**nil` does + not call `nil.to_hash`. [[Feature #21047]] + +* Logical binary operators (`||`, `&&`, `and` and `or`) at the + beginning of a line continue the previous line, like fluent dot. + The following code examples are equivalent: + + ```ruby + if condition1 + && condition2 + ... + end + ``` + + Previously: + + ```ruby + if condition1 && condition2 + ... + end + ``` + + ```ruby + if condition1 && + condition2 + ... + end + ``` + + [[Feature #20925]] + +## Core classes updates + +Note: We're only listing outstanding class updates. + +* Array + + * `Array#rfind` has been added as a more efficient alternative to `array.reverse_each.find` [[Feature #21678]] + * `Array#find` has been added as a more efficient override of `Enumerable#find` [[Feature #21678]] +* Binding + + * `Binding#local_variables` no longer includes numbered parameters. + Also, `Binding#local_variable_get`, `Binding#local_variable_set`, and + `Binding#local_variable_defined?` now reject numbered parameters. 
+ [[Bug #21049]] + + * `Binding#implicit_parameters`, `Binding#implicit_parameter_get`, and + `Binding#implicit_parameter_defined?` have been added to access + numbered parameters and "it" parameter. [[Bug #21049]] + +* Enumerator + + * `Enumerator.produce` now accepts an optional `size` keyword argument + to specify the size of the enumerator. It can be an integer, + `Float::INFINITY`, a callable object (such as a lambda), or `nil` to + indicate unknown size. When not specified, the size defaults to + `Float::INFINITY`. + + ```ruby + # Infinite enumerator + enum = Enumerator.produce(1, size: Float::INFINITY, &:succ) + enum.size # => Float::INFINITY + + # Finite enumerator with known/computable size + abs_dir = File.expand_path("./baz") # => "/foo/bar/baz" + traverser = Enumerator.produce(abs_dir, size: -> { abs_dir.count("/") + 1 }) { + raise StopIteration if it == "/" + File.dirname(it) + } + traverser.size # => 4 + ``` + + [[Feature #21701]] + +* ErrorHighlight + + * When an ArgumentError is raised, it now displays code snippets for + both the method call (caller) and the method definition (callee). + [[Feature #21543]] + + ``` + test.rb:1:in 'Object#add': wrong number of arguments (given 1, expected 2) (ArgumentError) + + caller: test.rb:3 + | add(1) + ^^^ + callee: test.rb:1 + | def add(x, y) = x + y + ^^^ + from test.rb:3:in '<main>' + ``` + +* Fiber + + * Introduce support for `Fiber#raise(cause:)` argument similar to + `Kernel#raise`. [[Feature #21360]] + +* Fiber::Scheduler + + * Introduce `Fiber::Scheduler#fiber_interrupt` to interrupt a fiber with a + given exception. The initial use case is to interrupt a fiber that is + waiting on a blocking IO operation when the IO operation is closed. + [[Feature #21166]] + + * Introduce `Fiber::Scheduler#yield` to allow the fiber scheduler to + continue processing when signal exceptions are disabled. + [[Bug #21633]] + + * Reintroduce the `Fiber::Scheduler#io_close` hook for asynchronous `IO#close`. 
+ + * Invoke `Fiber::Scheduler#io_write` when flushing the IO write buffer. + [[Bug #21789]] + +* File + + * `File::Stat#birthtime` is now available on Linux via the statx + system call when supported by the kernel and filesystem. + [[Feature #21205]] + +* IO + + * `IO.select` accepts `Float::INFINITY` as a timeout argument. + [[Feature #20610]] + + * A deprecated behavior, process creation by `IO` class methods + with a leading `|`, was removed. [[Feature #19630]] + +* Kernel + + * `Kernel#inspect` now checks for the existence of a `#instance_variables_to_inspect` method, + allowing control over which instance variables are displayed in the `#inspect` string: + + ```ruby + class DatabaseConfig + def initialize(host, user, password) + @host = host + @user = user + @password = password + end + + private def instance_variables_to_inspect = [:@host, :@user] + end + + conf = DatabaseConfig.new("localhost", "root", "hunter2") + conf.inspect #=> #<DatabaseConfig:0x0000000104def350 @host="localhost", @user="root"> + ``` + + [[Feature #21219]] + + * A deprecated behavior, process creation by `Kernel#open` with a + leading `|`, was removed. [[Feature #19630]] + +* Math + + * `Math.log1p` and `Math.expm1` are added. [[Feature #21527]] + +* Pathname + + * Pathname has been promoted from a default gem to a core class of Ruby. + [[Feature #17473]] + +* Proc + + * `Proc#parameters` now shows anonymous optional parameters as `[:opt]` + instead of `[:opt, nil]`, making the output consistent with when the + anonymous parameter is required. [[Bug #20974]] + +* Ractor + + * `Ractor::Port` class was added for a new synchronization mechanism + to communicate between Ractors. 
[[Feature #21262]] + + ```ruby + port1 = Ractor::Port.new + port2 = Ractor::Port.new + Ractor.new port1, port2 do |port1, port2| + port1 << 1 + port2 << 11 + port1 << 2 + port2 << 12 + end + 2.times{ p port1.receive } #=> 1, 2 + 2.times{ p port2.receive } #=> 11, 12 + ``` + + `Ractor::Port` provides the following methods: + + * `Ractor::Port#receive` + * `Ractor::Port#send` (or `Ractor::Port#<<`) + * `Ractor::Port#close` + * `Ractor::Port#closed?` + + As a result, `Ractor.yield` and `Ractor#take` were removed. + + * `Ractor#join` and `Ractor#value` were added to wait for the + termination of a Ractor. These are similar to `Thread#join` + and `Thread#value`. + + * `Ractor#monitor` and `Ractor#unmonitor` were added as low-level + interfaces used internally to implement `Ractor#join`. + + * `Ractor.select` now only accepts Ractors and Ports. If Ractors are given, + it returns when a Ractor terminates. + + * `Ractor#default_port` was added. Each `Ractor` has a default port, + which is used by `Ractor#send` and `Ractor.receive`. + + * `Ractor#close_incoming` and `Ractor#close_outgoing` were removed. + + * `Ractor.shareable_proc` and `Ractor.shareable_lambda` were introduced + to create a shareable Proc or lambda. + [[Feature #21550]], [[Feature #21557]] + +* Range + + * `Range#to_set` now performs size checks to prevent issues with + endless ranges. [[Bug #21654]] + + * `Range#overlap?` now correctly handles infinite (unbounded) ranges. + [[Bug #21185]] + + * `Range#max` behavior on beginless integer ranges has been fixed. + [[Bug #21174]] [[Bug #21175]] + +* Ruby + + * A new toplevel module `Ruby` has been defined, which contains + Ruby-related constants. This module was reserved in Ruby 3.4 + and is now officially defined. [[Feature #20884]] + +* Ruby::Box + + * A new (experimental) feature that provides separation of definitions. + For details about "Ruby Box", see [doc/language/box.md](doc/language/box.md). 
+ [[Feature #21311]] [[Misc #21385]] + +* Set + + * `Set` is now a core class, instead of an autoloaded stdlib class. + [[Feature #21216]] + + * `Set#inspect` now uses a simpler display, similar to literal arrays + (e.g., `Set[1, 2, 3]` instead of `#<Set: {1, 2, 3}>`). [[Feature #21389]] + + * Passing arguments to `Set#to_set` and `Enumerable#to_set` is now deprecated. + [[Feature #21390]] + +* Socket + + * `Socket.tcp` and `TCPSocket.new` accept an `open_timeout` keyword argument to specify + the timeout for the initial connection. [[Feature #21347]] + * When a user-specified timeout occurred in `TCPSocket.new`, either `Errno::ETIMEDOUT` + or `IO::TimeoutError` could previously be raised depending on the situation. + This behavior has been unified so that `IO::TimeoutError` is now consistently raised. + (Please note that, in `Socket.tcp`, there are still cases where `Errno::ETIMEDOUT` + may be raised in similar situations, and that in both cases `Errno::ETIMEDOUT` may be + raised when the timeout occurs at the OS level.) + +* String + + * Update Unicode to Version 17.0.0 and Emoji Version 17.0. + [[Feature #19908]][[Feature #20724]][[Feature #21275]] (also applies to Regexp) + + * `String#strip`, `strip!`, `lstrip`, `lstrip!`, `rstrip`, and `rstrip!` + are extended to accept `*selectors` arguments. [[Feature #21552]] + +* Thread + + * Introduce support for `Thread#raise(cause:)` argument similar to + `Kernel#raise`. [[Feature #21360]] + +## Stdlib updates + +We only list stdlib changes that are notable feature changes. + +Other changes are listed in the following sections. We also list the release +history since the previous bundled version (Ruby 3.4.0) where GitHub +releases are available. + +The following bundled gems are promoted from default gems. 
+ +* ostruct 0.6.3 + * 0.6.1 to [v0.6.2][ostruct-v0.6.2], [v0.6.3][ostruct-v0.6.3] +* pstore 0.2.0 + * 0.1.4 to [v0.2.0][pstore-v0.2.0] +* benchmark 0.5.0 + * 0.4.0 to [v0.4.1][benchmark-v0.4.1], [v0.5.0][benchmark-v0.5.0] +* logger 1.7.0 + * 1.6.4 to [v1.6.5][logger-v1.6.5], [v1.6.6][logger-v1.6.6], [v1.7.0][logger-v1.7.0] +* rdoc 7.0.3 + * 6.14.0 to [v6.14.1][rdoc-v6.14.1], [v6.14.2][rdoc-v6.14.2], [v6.15.0][rdoc-v6.15.0], [v6.15.1][rdoc-v6.15.1], [v6.16.0][rdoc-v6.16.0], [v6.16.1][rdoc-v6.16.1], [v6.17.0][rdoc-v6.17.0], [v7.0.0][rdoc-v7.0.0], [v7.0.1][rdoc-v7.0.1], [v7.0.2][rdoc-v7.0.2], [v7.0.3][rdoc-v7.0.3] +* win32ole 1.9.2 + * 1.9.1 to [v1.9.2][win32ole-v1.9.2] +* irb 1.16.0 + * 1.14.3 to [v1.15.0][irb-v1.15.0], [v1.15.1][irb-v1.15.1], [v1.15.2][irb-v1.15.2], [v1.15.3][irb-v1.15.3], [v1.16.0][irb-v1.16.0] +* reline 0.6.3 + * 0.6.0 to [v0.6.1][reline-v0.6.1], [v0.6.2][reline-v0.6.2], [v0.6.3][reline-v0.6.3] +* readline 0.0.4 +* fiddle 1.1.8 + * 1.1.6 to [v1.1.7][fiddle-v1.1.7], [v1.1.8][fiddle-v1.1.8] + +The following default gem is added. + +* win32-registry 0.1.2 + +The following default gems are updated. 
+ +* RubyGems 4.0.3 +* bundler 4.0.3 +* date 3.5.1 + * 3.4.1 to [v3.5.0][date-v3.5.0], [v3.5.1][date-v3.5.1] +* delegate 0.6.1 + * 0.4.0 to [v0.5.0][delegate-v0.5.0], [v0.6.0][delegate-v0.6.0], [v0.6.1][delegate-v0.6.1] +* digest 3.2.1 + * 3.2.0 to [v3.2.1][digest-v3.2.1] +* english 0.8.1 + * 0.8.0 to [v0.8.1][english-v0.8.1] +* erb 6.0.1 + * 4.0.4 to [v5.1.2][erb-v5.1.2], [v5.1.3][erb-v5.1.3], [v6.0.0][erb-v6.0.0], [v6.0.1][erb-v6.0.1] +* error_highlight 0.7.1 +* etc 1.4.6 +* fcntl 1.3.0 + * 1.2.0 to [v1.3.0][fcntl-v1.3.0] +* fileutils 1.8.0 + * 1.7.3 to [v1.8.0][fileutils-v1.8.0] +* forwardable 1.4.0 + * 1.3.3 to [v1.4.0][forwardable-v1.4.0] +* io-console 0.8.2 + * 0.8.1 to [v0.8.2][io-console-v0.8.2] +* io-nonblock 0.3.2 +* io-wait 0.4.0 + * 0.3.2 to [v0.3.3][io-wait-v0.3.3], [v0.3.5.test1][io-wait-v0.3.5.test1], [v0.3.5][io-wait-v0.3.5], [v0.3.6][io-wait-v0.3.6], [v0.4.0][io-wait-v0.4.0] +* ipaddr 1.2.8 +* json 2.18.0 + * 2.9.1 to [v2.10.0][json-v2.10.0], [v2.10.1][json-v2.10.1], [v2.10.2][json-v2.10.2], [v2.11.0][json-v2.11.0], [v2.11.1][json-v2.11.1], [v2.11.2][json-v2.11.2], [v2.11.3][json-v2.11.3], [v2.12.0][json-v2.12.0], [v2.12.1][json-v2.12.1], [v2.12.2][json-v2.12.2], [v2.13.0][json-v2.13.0], [v2.13.1][json-v2.13.1], [v2.13.2][json-v2.13.2], [v2.14.0][json-v2.14.0], [v2.14.1][json-v2.14.1], [v2.15.0][json-v2.15.0], [v2.15.1][json-v2.15.1], [v2.15.2][json-v2.15.2], [v2.16.0][json-v2.16.0], [v2.17.0][json-v2.17.0], [v2.17.1][json-v2.17.1], [v2.18.0][json-v2.18.0] +* net-http 0.9.1 + * 0.6.0 to [v0.7.0][net-http-v0.7.0], [v0.8.0][net-http-v0.8.0], [v0.9.0][net-http-v0.9.0], [v0.9.1][net-http-v0.9.1] +* openssl 4.0.0 + * 3.3.1 to [v3.3.2][openssl-v3.3.2], [v4.0.0][openssl-v4.0.0] +* optparse 0.8.1 + * 0.6.0 to [v0.7.0][optparse-v0.7.0], [v0.8.0][optparse-v0.8.0], [v0.8.1][optparse-v0.8.1] +* pp 0.6.3 + * 0.6.2 to [v0.6.3][pp-v0.6.3] +* prism 1.7.0 + * 1.5.2 to [v1.6.0][prism-v1.6.0], [v1.7.0][prism-v1.7.0] +* psych 5.3.1 + * 5.2.2 to [v5.2.3][psych-v5.2.3], 
[v5.2.4][psych-v5.2.4], [v5.2.5][psych-v5.2.5], [v5.2.6][psych-v5.2.6], [v5.3.0][psych-v5.3.0], [v5.3.1][psych-v5.3.1] +* resolv 0.7.0 + * 0.6.2 to [v0.6.3][resolv-v0.6.3], [v0.7.0][resolv-v0.7.0] +* stringio 3.2.0 + * 3.1.2 to [v3.1.3][stringio-v3.1.3], [v3.1.4][stringio-v3.1.4], [v3.1.5][stringio-v3.1.5], [v3.1.6][stringio-v3.1.6], [v3.1.7][stringio-v3.1.7], [v3.1.8][stringio-v3.1.8], [v3.1.9][stringio-v3.1.9], [v3.2.0][stringio-v3.2.0] +* strscan 3.1.6 + * 3.1.2 to [v3.1.3][strscan-v3.1.3], [v3.1.4][strscan-v3.1.4], [v3.1.5][strscan-v3.1.5], [v3.1.6][strscan-v3.1.6] +* time 0.4.2 + * 0.4.1 to [v0.4.2][time-v0.4.2] +* timeout 0.6.0 + * 0.4.3 to [v0.4.4][timeout-v0.4.4], [v0.5.0][timeout-v0.5.0], [v0.6.0][timeout-v0.6.0] +* uri 1.1.1 + * 1.0.4 to [v1.1.0][uri-v1.1.0], [v1.1.1][uri-v1.1.1] +* weakref 0.1.4 + * 0.1.3 to [v0.1.4][weakref-v0.1.4] +* zlib 3.2.2 + * 3.2.1 to [v3.2.2][zlib-v3.2.2] + +The following bundled gems are updated. + +* minitest 6.0.0 +* power_assert 3.0.1 + * 2.0.5 to [v3.0.0][power_assert-v3.0.0], [v3.0.1][power_assert-v3.0.1] +* rake 13.3.1 + * 13.2.1 to [v13.3.0][rake-v13.3.0], [v13.3.1][rake-v13.3.1] +* test-unit 3.7.5 + * 3.6.7 to [3.6.8][test-unit-3.6.8], [3.6.9][test-unit-3.6.9], [3.7.0][test-unit-3.7.0], [3.7.1][test-unit-3.7.1], [3.7.2][test-unit-3.7.2], [3.7.3][test-unit-3.7.3], [3.7.4][test-unit-3.7.4], [3.7.5][test-unit-3.7.5] +* rexml 3.4.4 +* rss 0.3.2 + * 0.3.1 to [0.3.2][rss-0.3.2] +* net-ftp 0.3.9 + * 0.3.8 to [v0.3.9][net-ftp-v0.3.9] +* net-imap 0.6.2 + * 0.5.8 to [v0.5.9][net-imap-v0.5.9], [v0.5.10][net-imap-v0.5.10], [v0.5.11][net-imap-v0.5.11], [v0.5.12][net-imap-v0.5.12], [v0.5.13][net-imap-v0.5.13], [v0.6.0][net-imap-v0.6.0], [v0.6.1][net-imap-v0.6.1], [v0.6.2][net-imap-v0.6.2] +* net-smtp 0.5.1 + * 0.5.0 to [v0.5.1][net-smtp-v0.5.1] +* matrix 0.4.3 + * 0.4.2 to [v0.4.3][matrix-v0.4.3] +* prime 0.1.4 + * 0.1.3 to [v0.1.4][prime-v0.1.4] +* rbs 3.10.0 + * 3.8.0 to [v3.8.1][rbs-v3.8.1], [v3.9.0.dev.1][rbs-v3.9.0.dev.1], 
[v3.9.0.pre.1][rbs-v3.9.0.pre.1], [v3.9.0.pre.2][rbs-v3.9.0.pre.2], [v3.9.0][rbs-v3.9.0], [v3.9.1][rbs-v3.9.1], [v3.9.2][rbs-v3.9.2], [v3.9.3][rbs-v3.9.3], [v3.9.4][rbs-v3.9.4], [v3.9.5][rbs-v3.9.5], [v3.10.0.pre.1][rbs-v3.10.0.pre.1], [v3.10.0.pre.2][rbs-v3.10.0.pre.2], [v3.10.0][rbs-v3.10.0] +* typeprof 0.31.1 +* debug 1.11.1 + * 1.11.0 to [v1.11.1][debug-v1.11.1] +* base64 0.3.0 + * 0.2.0 to [v0.3.0][base64-v0.3.0] +* bigdecimal 4.0.1 + * 3.1.8 to [v3.2.0][bigdecimal-v3.2.0], [v3.2.1][bigdecimal-v3.2.1], [v3.2.2][bigdecimal-v3.2.2], [v3.2.3][bigdecimal-v3.2.3], [v3.3.0][bigdecimal-v3.3.0], [v3.3.1][bigdecimal-v3.3.1], [v4.0.0][bigdecimal-v4.0.0], [v4.0.1][bigdecimal-v4.0.1] +* drb 2.2.3 + * 2.2.1 to [v2.2.3][drb-v2.2.3] +* syslog 0.3.0 + * 0.2.0 to [v0.3.0][syslog-v0.3.0] +* csv 3.3.5 + * 3.3.2 to [v3.3.3][csv-v3.3.3], [v3.3.4][csv-v3.3.4], [v3.3.5][csv-v3.3.5] +* repl_type_completor 0.1.12 + +### RubyGems and Bundler + +Ruby 4.0 bundles RubyGems and Bundler version 4. See the following links for details. + +* [Upgrading to RubyGems/Bundler 4 - RubyGems Blog](https://blog.rubygems.org/2025/12/03/upgrade-to-rubygems-bundler-4.html) +* [4.0.0 Released - RubyGems Blog](https://blog.rubygems.org/2025/12/03/4.0.0-released.html) +* [4.0.1 Released - RubyGems Blog](https://blog.rubygems.org/2025/12/09/4.0.1-released.html) +* [4.0.2 Released - RubyGems Blog](https://blog.rubygems.org/2025/12/17/4.0.2-released.html) +* [4.0.3 Released - RubyGems Blog](https://blog.rubygems.org/2025/12/23/4.0.3-released.html) + +## Supported platforms + +* Windows + + * Dropped support for MSVC versions older than 14.0 (_MSC_VER 1900). + This means Visual Studio 2015 or later is now required. + +## Compatibility issues + +* The following methods were removed from Ractor due to the addition of `Ractor::Port`: + + * `Ractor.yield` + * `Ractor#take` + * `Ractor#close_incoming` + * `Ractor#close_outgoing` + + [[Feature #21262]] + +* `ObjectSpace._id2ref` is deprecated. 
[[Feature #15408]] + +* `Process::Status#&` and `Process::Status#>>` have been removed. + They were deprecated in Ruby 3.3. [[Bug #19868]] + +* `rb_path_check` has been removed. This function was used for + `$SAFE` path checking which was removed in Ruby 2.7, + and was already deprecated. + [[Feature #20971]] + +* A backtrace for `ArgumentError` of "wrong number of arguments" now + includes the receiver's class or module name (e.g., in `Foo#bar` + instead of in `bar`). [[Bug #21698]] + +* Backtraces no longer display `internal` frames. + These methods now appear as if they are in the Ruby source file, + consistent with other C-implemented methods. [[Bug #20968]] + + Before: + ``` + $ ruby -e '[1].fetch_values(42)' + <internal:array>:211:in 'Array#fetch': index 42 outside of array bounds: -1...1 (IndexError) + from <internal:array>:211:in 'block in Array#fetch_values' + from <internal:array>:211:in 'Array#map!' + from <internal:array>:211:in 'Array#fetch_values' + from -e:1:in '<main>' + ``` + + After: + ``` + $ ruby -e '[1].fetch_values(42)' + -e:1:in 'Array#fetch_values': index 42 outside of array bounds: -1...1 (IndexError) + from -e:1:in '<main>' + ``` + +## Stdlib compatibility issues + +* The CGI library has been removed from the default gems. We now only provide `cgi/escape` for + the following methods: + + * `CGI.escape` and `CGI.unescape` + * `CGI.escapeHTML` and `CGI.unescapeHTML` + * `CGI.escapeURIComponent` and `CGI.unescapeURIComponent` + * `CGI.escapeElement` and `CGI.unescapeElement` + + [[Feature #21258]] + +* With the move of `Set` from stdlib to a core class, `set/sorted_set.rb` has + been removed, and `SortedSet` is no longer an autoloaded constant. Please + install the `sorted_set` gem and `require 'sorted_set'` to use `SortedSet`. 
+ [[Feature #21287]] + +* Net::HTTP + + * The default behavior of automatically setting the `Content-Type` header + to `application/x-www-form-urlencoded` for requests with a body + (e.g., `POST`, `PUT`) when the header was not explicitly set has been + removed. If your application relied on this automatic default, your + requests will now be sent without a Content-Type header, potentially + breaking compatibility with certain servers. + [[GH-net-http #205]] + +## C API updates + +* IO + + * `rb_thread_fd_close` is deprecated and now a no-op. If you need to expose + file descriptors from C extensions to Ruby code, create an `IO` instance + using `RUBY_IO_MODE_EXTERNAL` and use `rb_io_close(io)` to close it (this + also interrupts and waits for all pending operations on the `IO` + instance). Directly closing file descriptors does not interrupt pending + operations, and may lead to undefined behaviour. In other words, if two + `IO` objects share the same file descriptor, closing one does not affect + the other. [[Feature #18455]] + +* GVL + + * `rb_thread_call_with_gvl` now works with or without the GVL. + This allows gems to avoid checking `ruby_thread_has_gvl_p`. + Please still be diligent about the GVL. [[Feature #20750]] + +* Set + + * A C API for `Set` has been added. The following methods are supported: + [[Feature #21459]] + + * `rb_set_foreach` + * `rb_set_new` + * `rb_set_new_capa` + * `rb_set_lookup` + * `rb_set_add` + * `rb_set_clear` + * `rb_set_delete` + * `rb_set_size` + +## Implementation improvements + +* `Class#new` (ex. `Object.new`) is faster in all cases, but especially when passing keyword arguments. This has also been integrated into YJIT and ZJIT. [[Feature #21254]] +* GC heaps of different size pools now grow independently, reducing memory usage when only some pools contain long-lived objects +* GC sweeping is faster on pages of large objects +* "Generic ivar" objects (String, Array, `TypedData`, etc.) 
now use a new internal "fields" object for faster instance variable access +* The GC avoids maintaining an internal `id2ref` table until it is first used, making `object_id` allocation and GC sweeping faster +* `object_id` and `hash` are faster on Class and Module objects +* Larger bignum Integers can remain embedded using variable width allocation +* `Random`, `Enumerator::Product`, `Enumerator::Chain`, `Addrinfo`, + `StringScanner`, and some internal objects are now write-barrier protected, + which reduces GC overhead. + +### Ractor + +A lot of work has gone into making Ractors more stable, performant, and usable. These improvements bring Ractor implementation closer to leaving experimental status. + +* Performance improvements + * Frozen strings and the symbol table internally use a lock-free hash set [[Feature #21268]] + * Method cache lookups avoid locking in most cases + * Class (and generic ivar) instance variable access is faster and avoids locking + * CPU cache contention is avoided in object allocation by using a per-ractor counter + * CPU cache contention is avoided in xmalloc/xfree by using a thread-local counter + * `object_id` avoids locking in most cases +* Bug fixes and stability + * Fixed possible deadlocks when combining Ractors and Threads + * Fixed issues with require and autoload in a Ractor + * Fixed encoding/transcoding issues across Ractors + * Fixed race conditions in GC operations and method invalidation + * Fixed issues with processes forking after starting a Ractor + * GC allocation counts are now accurate under Ractors + * Fixed TracePoints not working after GC [[Bug #19112]] + +## JIT + +* ZJIT + * Introduce an [experimental method-based JIT compiler](https://docs.ruby-lang.org/en/master/jit/zjit_md.html). + Where available, ZJIT can be enabled at runtime with the `--zjit` option or by calling `RubyVM::ZJIT.enable`. + When building Ruby, Rust 1.85.0 or later is required to include ZJIT support. 
+ * As of Ruby 4.0.0, ZJIT is faster than the interpreter, but not yet as fast as YJIT. + We encourage experimentation with ZJIT, but advise against deploying it in production for now. + * Our goal is to make ZJIT faster than YJIT and production-ready in Ruby 4.1. +* YJIT + * `RubyVM::YJIT.runtime_stats` + * `ratio_in_yjit` no longer works in the default build. + Use `--enable-yjit=stats` on `configure` to enable it on `--yjit-stats`. + * Add `invalidate_everything` to default stats, which is + incremented when every code is invalidated by TracePoint. + * Add `mem_size:` and `call_threshold:` options to `RubyVM::YJIT.enable`. +* RJIT + * `--rjit` is removed. We will move the implementation of the third-party JIT API + to the [ruby/rjit](https://github.com/ruby/rjit) repository. + +[Feature #15408]: https://bugs.ruby-lang.org/issues/15408 +[Feature #17473]: https://bugs.ruby-lang.org/issues/17473 +[Feature #18455]: https://bugs.ruby-lang.org/issues/18455 +[Bug #19112]: https://bugs.ruby-lang.org/issues/19112 +[Feature #19630]: https://bugs.ruby-lang.org/issues/19630 +[Bug #19868]: https://bugs.ruby-lang.org/issues/19868 +[Feature #19908]: https://bugs.ruby-lang.org/issues/19908 +[Feature #20610]: https://bugs.ruby-lang.org/issues/20610 +[Feature #20724]: https://bugs.ruby-lang.org/issues/20724 +[Feature #20750]: https://bugs.ruby-lang.org/issues/20750 +[Feature #20884]: https://bugs.ruby-lang.org/issues/20884 +[Feature #20925]: https://bugs.ruby-lang.org/issues/20925 +[Bug #20968]: https://bugs.ruby-lang.org/issues/20968 +[Feature #20971]: https://bugs.ruby-lang.org/issues/20971 +[Bug #20974]: https://bugs.ruby-lang.org/issues/20974 +[Feature #21047]: https://bugs.ruby-lang.org/issues/21047 +[Bug #21049]: https://bugs.ruby-lang.org/issues/21049 +[Feature #21166]: https://bugs.ruby-lang.org/issues/21166 +[Bug #21174]: https://bugs.ruby-lang.org/issues/21174 +[Bug #21175]: https://bugs.ruby-lang.org/issues/21175 +[Bug #21185]: https://bugs.ruby-lang.org/issues/21185 
+[Feature #21205]: https://bugs.ruby-lang.org/issues/21205 +[Feature #21216]: https://bugs.ruby-lang.org/issues/21216 +[Feature #21219]: https://bugs.ruby-lang.org/issues/21219 +[Feature #21254]: https://bugs.ruby-lang.org/issues/21254 +[Feature #21258]: https://bugs.ruby-lang.org/issues/21258 +[Feature #21268]: https://bugs.ruby-lang.org/issues/21268 +[Feature #21262]: https://bugs.ruby-lang.org/issues/21262 +[Feature #21275]: https://bugs.ruby-lang.org/issues/21275 +[Feature #21287]: https://bugs.ruby-lang.org/issues/21287 +[Feature #21311]: https://bugs.ruby-lang.org/issues/21311 +[Feature #21347]: https://bugs.ruby-lang.org/issues/21347 +[Feature #21360]: https://bugs.ruby-lang.org/issues/21360 +[Misc #21385]: https://bugs.ruby-lang.org/issues/21385 +[Feature #21389]: https://bugs.ruby-lang.org/issues/21389 +[Feature #21390]: https://bugs.ruby-lang.org/issues/21390 +[Feature #21459]: https://bugs.ruby-lang.org/issues/21459 +[Feature #21527]: https://bugs.ruby-lang.org/issues/21527 +[Feature #21543]: https://bugs.ruby-lang.org/issues/21543 +[Feature #21550]: https://bugs.ruby-lang.org/issues/21550 +[Feature #21552]: https://bugs.ruby-lang.org/issues/21552 +[Feature #21557]: https://bugs.ruby-lang.org/issues/21557 +[Bug #21633]: https://bugs.ruby-lang.org/issues/21633 +[Bug #21654]: https://bugs.ruby-lang.org/issues/21654 +[Feature #21678]: https://bugs.ruby-lang.org/issues/21678 +[Bug #21698]: https://bugs.ruby-lang.org/issues/21698 +[Feature #21701]: https://bugs.ruby-lang.org/issues/21701 +[Bug #21789]: https://bugs.ruby-lang.org/issues/21789 +[GH-net-http #205]: https://github.com/ruby/net-http/issues/205 +[ostruct-v0.6.2]: https://github.com/ruby/ostruct/releases/tag/v0.6.2 +[ostruct-v0.6.3]: https://github.com/ruby/ostruct/releases/tag/v0.6.3 +[pstore-v0.2.0]: https://github.com/ruby/pstore/releases/tag/v0.2.0 +[benchmark-v0.4.1]: https://github.com/ruby/benchmark/releases/tag/v0.4.1 +[benchmark-v0.5.0]: https://github.com/ruby/benchmark/releases/tag/v0.5.0 
+[logger-v1.6.5]: https://github.com/ruby/logger/releases/tag/v1.6.5 +[logger-v1.6.6]: https://github.com/ruby/logger/releases/tag/v1.6.6 +[logger-v1.7.0]: https://github.com/ruby/logger/releases/tag/v1.7.0 +[rdoc-v6.14.1]: https://github.com/ruby/rdoc/releases/tag/v6.14.1 +[rdoc-v6.14.2]: https://github.com/ruby/rdoc/releases/tag/v6.14.2 +[rdoc-v6.15.0]: https://github.com/ruby/rdoc/releases/tag/v6.15.0 +[rdoc-v6.15.1]: https://github.com/ruby/rdoc/releases/tag/v6.15.1 +[rdoc-v6.16.0]: https://github.com/ruby/rdoc/releases/tag/v6.16.0 +[rdoc-v6.16.1]: https://github.com/ruby/rdoc/releases/tag/v6.16.1 +[rdoc-v6.17.0]: https://github.com/ruby/rdoc/releases/tag/v6.17.0 +[rdoc-v7.0.0]: https://github.com/ruby/rdoc/releases/tag/v7.0.0 +[rdoc-v7.0.1]: https://github.com/ruby/rdoc/releases/tag/v7.0.1 +[rdoc-v7.0.2]: https://github.com/ruby/rdoc/releases/tag/v7.0.2 +[rdoc-v7.0.3]: https://github.com/ruby/rdoc/releases/tag/v7.0.3 +[win32ole-v1.9.2]: https://github.com/ruby/win32ole/releases/tag/v1.9.2 +[irb-v1.15.0]: https://github.com/ruby/irb/releases/tag/v1.15.0 +[irb-v1.15.1]: https://github.com/ruby/irb/releases/tag/v1.15.1 +[irb-v1.15.2]: https://github.com/ruby/irb/releases/tag/v1.15.2 +[irb-v1.15.3]: https://github.com/ruby/irb/releases/tag/v1.15.3 +[irb-v1.16.0]: https://github.com/ruby/irb/releases/tag/v1.16.0 +[reline-v0.6.1]: https://github.com/ruby/reline/releases/tag/v0.6.1 +[reline-v0.6.2]: https://github.com/ruby/reline/releases/tag/v0.6.2 +[reline-v0.6.3]: https://github.com/ruby/reline/releases/tag/v0.6.3 +[fiddle-v1.1.7]: https://github.com/ruby/fiddle/releases/tag/v1.1.7 +[fiddle-v1.1.8]: https://github.com/ruby/fiddle/releases/tag/v1.1.8 +[date-v3.5.0]: https://github.com/ruby/date/releases/tag/v3.5.0 +[date-v3.5.1]: https://github.com/ruby/date/releases/tag/v3.5.1 +[delegate-v0.5.0]: https://github.com/ruby/delegate/releases/tag/v0.5.0 +[delegate-v0.6.0]: https://github.com/ruby/delegate/releases/tag/v0.6.0 +[delegate-v0.6.1]: 
https://github.com/ruby/delegate/releases/tag/v0.6.1 +[digest-v3.2.1]: https://github.com/ruby/digest/releases/tag/v3.2.1 +[english-v0.8.1]: https://github.com/ruby/english/releases/tag/v0.8.1 +[erb-v5.1.2]: https://github.com/ruby/erb/releases/tag/v5.1.2 +[erb-v5.1.3]: https://github.com/ruby/erb/releases/tag/v5.1.3 +[erb-v6.0.0]: https://github.com/ruby/erb/releases/tag/v6.0.0 +[erb-v6.0.1]: https://github.com/ruby/erb/releases/tag/v6.0.1 +[fcntl-v1.3.0]: https://github.com/ruby/fcntl/releases/tag/v1.3.0 +[fileutils-v1.8.0]: https://github.com/ruby/fileutils/releases/tag/v1.8.0 +[forwardable-v1.4.0]: https://github.com/ruby/forwardable/releases/tag/v1.4.0 +[io-console-v0.8.2]: https://github.com/ruby/io-console/releases/tag/v0.8.2 +[io-wait-v0.3.3]: https://github.com/ruby/io-wait/releases/tag/v0.3.3 +[io-wait-v0.3.5.test1]: https://github.com/ruby/io-wait/releases/tag/v0.3.5.test1 +[io-wait-v0.3.5]: https://github.com/ruby/io-wait/releases/tag/v0.3.5 +[io-wait-v0.3.6]: https://github.com/ruby/io-wait/releases/tag/v0.3.6 +[io-wait-v0.4.0]: https://github.com/ruby/io-wait/releases/tag/v0.4.0 +[json-v2.10.0]: https://github.com/ruby/json/releases/tag/v2.10.0 +[json-v2.10.1]: https://github.com/ruby/json/releases/tag/v2.10.1 +[json-v2.10.2]: https://github.com/ruby/json/releases/tag/v2.10.2 +[json-v2.11.0]: https://github.com/ruby/json/releases/tag/v2.11.0 +[json-v2.11.1]: https://github.com/ruby/json/releases/tag/v2.11.1 +[json-v2.11.2]: https://github.com/ruby/json/releases/tag/v2.11.2 +[json-v2.11.3]: https://github.com/ruby/json/releases/tag/v2.11.3 +[json-v2.12.0]: https://github.com/ruby/json/releases/tag/v2.12.0 +[json-v2.12.1]: https://github.com/ruby/json/releases/tag/v2.12.1 +[json-v2.12.2]: https://github.com/ruby/json/releases/tag/v2.12.2 +[json-v2.13.0]: https://github.com/ruby/json/releases/tag/v2.13.0 +[json-v2.13.1]: https://github.com/ruby/json/releases/tag/v2.13.1 +[json-v2.13.2]: https://github.com/ruby/json/releases/tag/v2.13.2 +[json-v2.14.0]: 
https://github.com/ruby/json/releases/tag/v2.14.0 +[json-v2.14.1]: https://github.com/ruby/json/releases/tag/v2.14.1 +[json-v2.15.0]: https://github.com/ruby/json/releases/tag/v2.15.0 +[json-v2.15.1]: https://github.com/ruby/json/releases/tag/v2.15.1 +[json-v2.15.2]: https://github.com/ruby/json/releases/tag/v2.15.2 +[json-v2.16.0]: https://github.com/ruby/json/releases/tag/v2.16.0 +[json-v2.17.0]: https://github.com/ruby/json/releases/tag/v2.17.0 +[json-v2.17.1]: https://github.com/ruby/json/releases/tag/v2.17.1 +[json-v2.18.0]: https://github.com/ruby/json/releases/tag/v2.18.0 +[net-http-v0.7.0]: https://github.com/ruby/net-http/releases/tag/v0.7.0 +[net-http-v0.8.0]: https://github.com/ruby/net-http/releases/tag/v0.8.0 +[net-http-v0.9.0]: https://github.com/ruby/net-http/releases/tag/v0.9.0 +[net-http-v0.9.1]: https://github.com/ruby/net-http/releases/tag/v0.9.1 +[openssl-v3.3.2]: https://github.com/ruby/openssl/releases/tag/v3.3.2 +[openssl-v4.0.0]: https://github.com/ruby/openssl/releases/tag/v4.0.0 +[optparse-v0.7.0]: https://github.com/ruby/optparse/releases/tag/v0.7.0 +[optparse-v0.8.0]: https://github.com/ruby/optparse/releases/tag/v0.8.0 +[optparse-v0.8.1]: https://github.com/ruby/optparse/releases/tag/v0.8.1 +[pp-v0.6.3]: https://github.com/ruby/pp/releases/tag/v0.6.3 +[prism-v1.6.0]: https://github.com/ruby/prism/releases/tag/v1.6.0 +[prism-v1.7.0]: https://github.com/ruby/prism/releases/tag/v1.7.0 +[psych-v5.2.3]: https://github.com/ruby/psych/releases/tag/v5.2.3 +[psych-v5.2.4]: https://github.com/ruby/psych/releases/tag/v5.2.4 +[psych-v5.2.5]: https://github.com/ruby/psych/releases/tag/v5.2.5 +[psych-v5.2.6]: https://github.com/ruby/psych/releases/tag/v5.2.6 +[psych-v5.3.0]: https://github.com/ruby/psych/releases/tag/v5.3.0 +[psych-v5.3.1]: https://github.com/ruby/psych/releases/tag/v5.3.1 +[resolv-v0.6.3]: https://github.com/ruby/resolv/releases/tag/v0.6.3 +[resolv-v0.7.0]: https://github.com/ruby/resolv/releases/tag/v0.7.0 +[stringio-v3.1.3]: 
https://github.com/ruby/stringio/releases/tag/v3.1.3 +[stringio-v3.1.4]: https://github.com/ruby/stringio/releases/tag/v3.1.4 +[stringio-v3.1.5]: https://github.com/ruby/stringio/releases/tag/v3.1.5 +[stringio-v3.1.6]: https://github.com/ruby/stringio/releases/tag/v3.1.6 +[stringio-v3.1.7]: https://github.com/ruby/stringio/releases/tag/v3.1.7 +[stringio-v3.1.8]: https://github.com/ruby/stringio/releases/tag/v3.1.8 +[stringio-v3.1.9]: https://github.com/ruby/stringio/releases/tag/v3.1.9 +[stringio-v3.2.0]: https://github.com/ruby/stringio/releases/tag/v3.2.0 +[strscan-v3.1.3]: https://github.com/ruby/strscan/releases/tag/v3.1.3 +[strscan-v3.1.4]: https://github.com/ruby/strscan/releases/tag/v3.1.4 +[strscan-v3.1.5]: https://github.com/ruby/strscan/releases/tag/v3.1.5 +[strscan-v3.1.6]: https://github.com/ruby/strscan/releases/tag/v3.1.6 +[time-v0.4.2]: https://github.com/ruby/time/releases/tag/v0.4.2 +[timeout-v0.4.4]: https://github.com/ruby/timeout/releases/tag/v0.4.4 +[timeout-v0.5.0]: https://github.com/ruby/timeout/releases/tag/v0.5.0 +[timeout-v0.6.0]: https://github.com/ruby/timeout/releases/tag/v0.6.0 +[uri-v1.1.0]: https://github.com/ruby/uri/releases/tag/v1.1.0 +[uri-v1.1.1]: https://github.com/ruby/uri/releases/tag/v1.1.1 +[weakref-v0.1.4]: https://github.com/ruby/weakref/releases/tag/v0.1.4 +[zlib-v3.2.2]: https://github.com/ruby/zlib/releases/tag/v3.2.2 +[power_assert-v3.0.0]: https://github.com/ruby/power_assert/releases/tag/v3.0.0 +[power_assert-v3.0.1]: https://github.com/ruby/power_assert/releases/tag/v3.0.1 +[rake-v13.3.0]: https://github.com/ruby/rake/releases/tag/v13.3.0 +[rake-v13.3.1]: https://github.com/ruby/rake/releases/tag/v13.3.1 +[test-unit-3.6.8]: https://github.com/test-unit/test-unit/releases/tag/3.6.8 +[test-unit-3.6.9]: https://github.com/test-unit/test-unit/releases/tag/3.6.9 +[test-unit-3.7.0]: https://github.com/test-unit/test-unit/releases/tag/3.7.0 +[test-unit-3.7.1]: https://github.com/test-unit/test-unit/releases/tag/3.7.1 
+[test-unit-3.7.2]: https://github.com/test-unit/test-unit/releases/tag/3.7.2 +[test-unit-3.7.3]: https://github.com/test-unit/test-unit/releases/tag/3.7.3 +[test-unit-3.7.4]: https://github.com/test-unit/test-unit/releases/tag/3.7.4 +[test-unit-3.7.5]: https://github.com/test-unit/test-unit/releases/tag/3.7.5 +[rss-0.3.2]: https://github.com/ruby/rss/releases/tag/0.3.2 +[net-ftp-v0.3.9]: https://github.com/ruby/net-ftp/releases/tag/v0.3.9 +[net-imap-v0.5.9]: https://github.com/ruby/net-imap/releases/tag/v0.5.9 +[net-imap-v0.5.10]: https://github.com/ruby/net-imap/releases/tag/v0.5.10 +[net-imap-v0.5.11]: https://github.com/ruby/net-imap/releases/tag/v0.5.11 +[net-imap-v0.5.12]: https://github.com/ruby/net-imap/releases/tag/v0.5.12 +[net-imap-v0.5.13]: https://github.com/ruby/net-imap/releases/tag/v0.5.13 +[net-imap-v0.6.0]: https://github.com/ruby/net-imap/releases/tag/v0.6.0 +[net-imap-v0.6.1]: https://github.com/ruby/net-imap/releases/tag/v0.6.1 +[net-imap-v0.6.2]: https://github.com/ruby/net-imap/releases/tag/v0.6.2 +[net-smtp-v0.5.1]: https://github.com/ruby/net-smtp/releases/tag/v0.5.1 +[matrix-v0.4.3]: https://github.com/ruby/matrix/releases/tag/v0.4.3 +[prime-v0.1.4]: https://github.com/ruby/prime/releases/tag/v0.1.4 +[rbs-v3.8.1]: https://github.com/ruby/rbs/releases/tag/v3.8.1 +[rbs-v3.9.0.dev.1]: https://github.com/ruby/rbs/releases/tag/v3.9.0.dev.1 +[rbs-v3.9.0.pre.1]: https://github.com/ruby/rbs/releases/tag/v3.9.0.pre.1 +[rbs-v3.9.0.pre.2]: https://github.com/ruby/rbs/releases/tag/v3.9.0.pre.2 +[rbs-v3.9.0]: https://github.com/ruby/rbs/releases/tag/v3.9.0 +[rbs-v3.9.1]: https://github.com/ruby/rbs/releases/tag/v3.9.1 +[rbs-v3.9.2]: https://github.com/ruby/rbs/releases/tag/v3.9.2 +[rbs-v3.9.3]: https://github.com/ruby/rbs/releases/tag/v3.9.3 +[rbs-v3.9.4]: https://github.com/ruby/rbs/releases/tag/v3.9.4 +[rbs-v3.9.5]: https://github.com/ruby/rbs/releases/tag/v3.9.5 +[rbs-v3.10.0.pre.1]: https://github.com/ruby/rbs/releases/tag/v3.10.0.pre.1 
+[rbs-v3.10.0.pre.2]: https://github.com/ruby/rbs/releases/tag/v3.10.0.pre.2 +[rbs-v3.10.0]: https://github.com/ruby/rbs/releases/tag/v3.10.0 +[debug-v1.11.1]: https://github.com/ruby/debug/releases/tag/v1.11.1 +[base64-v0.3.0]: https://github.com/ruby/base64/releases/tag/v0.3.0 +[bigdecimal-v3.2.0]: https://github.com/ruby/bigdecimal/releases/tag/v3.2.0 +[bigdecimal-v3.2.1]: https://github.com/ruby/bigdecimal/releases/tag/v3.2.1 +[bigdecimal-v3.2.2]: https://github.com/ruby/bigdecimal/releases/tag/v3.2.2 +[bigdecimal-v3.2.3]: https://github.com/ruby/bigdecimal/releases/tag/v3.2.3 +[bigdecimal-v3.3.0]: https://github.com/ruby/bigdecimal/releases/tag/v3.3.0 +[bigdecimal-v3.3.1]: https://github.com/ruby/bigdecimal/releases/tag/v3.3.1 +[bigdecimal-v4.0.0]: https://github.com/ruby/bigdecimal/releases/tag/v4.0.0 +[bigdecimal-v4.0.1]: https://github.com/ruby/bigdecimal/releases/tag/v4.0.1 +[drb-v2.2.3]: https://github.com/ruby/drb/releases/tag/v2.2.3 +[syslog-v0.3.0]: https://github.com/ruby/syslog/releases/tag/v0.3.0 +[csv-v3.3.3]: https://github.com/ruby/csv/releases/tag/v3.3.3 +[csv-v3.3.4]: https://github.com/ruby/csv/releases/tag/v3.3.4 +[csv-v3.3.5]: https://github.com/ruby/csv/releases/tag/v3.3.5 diff --git a/doc/_regexp.rdoc b/doc/_regexp.rdoc new file mode 100644 index 0000000000..aa55a7eebf --- /dev/null +++ b/doc/_regexp.rdoc @@ -0,0 +1,1284 @@ +A {regular expression}[https://en.wikipedia.org/wiki/Regular_expression] +(also called a _regexp_) is a <i>match pattern</i> (also simply called a _pattern_). + +A common notation for a regexp uses enclosing slash characters: + + /foo/ + +A regexp may be applied to a <i>target string</i>; +The part of the string (if any) that matches the pattern is called a _match_, +and may be said <i>to match</i>: + + re = /red/ + re.match?('redirect') # => true # Match at beginning of target. + re.match?('bored') # => true # Match at end of target. + re.match?('credit') # => true # Match within target. 
+ re.match?('foo') # => false # No match. + +== \Regexp Uses + +A regexp may be used: + +- To extract substrings based on a given pattern: + + re = /foo/ # => /foo/ + re.match('food') # => #<MatchData "foo"> + re.match('good') # => nil + + See sections {Method match}[rdoc-ref:Regexp@Method+match] + and {Operator =~}[rdoc-ref:Regexp@Operator+-3D~]. + +- To determine whether a string matches a given pattern: + + re.match?('food') # => true + re.match?('good') # => false + + See section {Method match?}[rdoc-ref:Regexp@Method+match-3F]. + +- As an argument for calls to certain methods in other classes and modules; + most such methods accept an argument that may be either a string + or the (much more powerful) regexp. + + See {Regexp Methods}[rdoc-ref:language/regexp/methods.rdoc]. + +== \Regexp Objects + +A regexp object has: + +- A source; see {Sources}[rdoc-ref:Regexp@Sources]. + +- Several modes; see {Modes}[rdoc-ref:Regexp@Modes]. + +- A timeout; see {Timeouts}[rdoc-ref:Regexp@Timeouts]. + +- An encoding; see {Encodings}[rdoc-ref:Regexp@Encodings]. + +== Creating a \Regexp + +A regular expression may be created with: + +- A regexp literal using slash characters + (see {Regexp Literals}[rdoc-ref:syntax/literals.rdoc@Regexp+Literals]): + + # This is a very common usage. + /foo/ # => /foo/ + +- A <tt>%r</tt> regexp literal + (see {%r: Regexp Literals}[rdoc-ref:syntax/literals.rdoc@25r-3A+Regexp+Literals]): + + # Same delimiter character at beginning and end; + # useful for avoiding escaping characters + %r/name\/value pair/ # => /name\/value pair/ + %r:name/value pair: # => /name\/value pair/ + %r|name/value pair| # => /name\/value pair/ + + # Certain "paired" characters can be delimiters. + %r[foo] # => /foo/ + %r{foo} # => /foo/ + %r(foo) # => /foo/ + %r<foo> # => /foo/ + +- Method Regexp.new. 
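As a hedged sketch of the last item above: Regexp.new builds a regexp at run time from a string, optionally with mode options. The variable names below (including +user_text+) are illustrative only and do not come from the original text.

```ruby
# Build a regexp from a string; it matches the same text as the literal /foo/.
re = Regexp.new('foo')
pattern_text = re.source # the pattern text, without delimiters

# An options argument sets modes; here, case-insensitive matching.
re_i = Regexp.new('foo', Regexp::IGNORECASE)

# Regexp.escape makes arbitrary text safe to embed as a literal pattern.
user_text = 'a.b' # hypothetical input
literal = Regexp.new(Regexp.escape(user_text))
```

Escaping first means the dot in +user_text+ matches only a literal dot, not any character.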
+ +== Method <tt>match</tt> + +Each of the methods Regexp#match, String#match, and Symbol#match +returns a MatchData object if a match was found, +nil+ otherwise; +each also sets {global variables}[rdoc-ref:Regexp@Global+Variables]: + + 'food'.match(/foo/) # => #<MatchData "foo"> + 'food'.match(/bar/) # => nil + +== Operator <tt>=~</tt> + +Each of the operators Regexp#=~, String#=~, and Symbol#=~ +returns an integer offset if a match was found, +nil+ otherwise; +each also sets {global variables}[rdoc-ref:Regexp@Global+Variables]: + + /bar/ =~ 'foo bar' # => 4 + 'foo bar' =~ /bar/ # => 4 + /baz/ =~ 'foo bar' # => nil + +== Method <tt>match?</tt> + +Each of the methods Regexp#match?, String#match?, and Symbol#match? +returns +true+ if a match was found, +false+ otherwise; +none sets {global variables}[rdoc-ref:Regexp@Global+Variables]: + + 'food'.match?(/foo/) # => true + 'food'.match?(/bar/) # => false + +== Global Variables + +Certain regexp-oriented methods assign values to global variables: + +- <tt>#match</tt>: see {Method match}[rdoc-ref:Regexp@Method+match]. +- <tt>#=~</tt>: see {Operator =~}[rdoc-ref:Regexp@Operator+-3D~]. + +The affected global variables are: + +- <tt>$~</tt>: Returns a MatchData object, or +nil+. +- <tt>$&</tt>: Returns the matched part of the string, or +nil+. +- <tt>$`</tt>: Returns the part of the string to the left of the match, or +nil+. +- <tt>$'</tt>: Returns the part of the string to the right of the match, or +nil+. +- <tt>$+</tt>: Returns the last group matched, or +nil+. +- <tt>$1</tt>, <tt>$2</tt>, etc.: Returns the first, second, etc., + matched group, or +nil+. + Note that <tt>$0</tt> is quite different; + it returns the name of the currently executing program. + +These variables, except for <tt>$~</tt>, are shorthands for methods of +<tt>$~</tt>. See MatchData@Global+variables+equivalence. + +Examples: + + # Matched string, but no matched groups. 
+ 'foo bar bar baz'.match('bar') + $~ # => #<MatchData "bar"> + $& # => "bar" + $` # => "foo " + $' # => " bar baz" + $+ # => nil + $1 # => nil + + # Matched groups. + /s(\w{2}).*(c)/.match('haystack') + $~ # => #<MatchData "stac" 1:"ta" 2:"c"> + $& # => "stac" + $` # => "hay" + $' # => "k" + $+ # => "c" + $1 # => "ta" + $2 # => "c" + $3 # => nil + + # No match. + 'foo'.match('bar') + $~ # => nil + $& # => nil + $` # => nil + $' # => nil + $+ # => nil + $1 # => nil + +Note that Regexp#match?, String#match?, and Symbol#match? +do not set global variables. + +== Sources + +As seen above, the simplest regexp uses a literal expression as its source: + + re = /foo/ # => /foo/ + re.match('food') # => #<MatchData "foo"> + re.match('good') # => nil + +A rich collection of available _subexpressions_ +gives the regexp great power and flexibility: + +- {Special characters}[rdoc-ref:Regexp@Special+Characters] +- {Source literals}[rdoc-ref:Regexp@Source+Literals] +- {Character classes}[rdoc-ref:Regexp@Character+Classes] +- {Shorthand character classes}[rdoc-ref:Regexp@Shorthand+Character+Classes] +- {Anchors}[rdoc-ref:Regexp@Anchors] +- {Alternation}[rdoc-ref:Regexp@Alternation] +- {Quantifiers}[rdoc-ref:Regexp@Quantifiers] +- {Groups and captures}[rdoc-ref:Regexp@Groups+and+Captures] +- {Unicode}[rdoc-ref:Regexp@Unicode] +- {POSIX Bracket Expressions}[rdoc-ref:Regexp@POSIX+Bracket+Expressions] +- {Comments}[rdoc-ref:Regexp@Comments] + +=== Special Characters + +\Regexp special characters, called _metacharacters_, +have special meanings in certain contexts; +depending on the context, these are sometimes metacharacters: + + . ? - + * ^ \ | $ ( ) [ ] { } + +To match a metacharacter literally, backslash-escape it: + + # Matches one or more 'o' characters. + /o+/.match('foo') # => #<MatchData "oo"> + # Would match 'o+'. 
+ /o\+/.match('foo') # => nil + +To match a backslash literally, backslash-escape it: + + /\./.match('\.') # => #<MatchData "."> + /\\./.match('\.') # => #<MatchData "\\."> + +Method Regexp.escape returns an escaped string: + + Regexp.escape('.?-+*^\|$()[]{}') + # => "\\.\\?\\-\\+\\*\\^\\\\\\|\\$\\(\\)\\[\\]\\{\\}" + +=== Source Literals + +The source literal largely behaves like a double-quoted string; +see {Double-Quoted String Literals}[rdoc-ref:syntax/literals.rdoc@Double-Quoted+String+Literals]. + +In particular, a source literal may contain interpolated expressions: + + s = 'foo' # => "foo" + /#{s}/ # => /foo/ + /#{s.capitalize}/ # => /Foo/ + /#{2 + 2}/ # => /4/ + +There are differences between an ordinary string literal and a source literal; +see {Shorthand Character Classes}[rdoc-ref:Regexp@Shorthand+Character+Classes]. + +- <tt>\s</tt> in an ordinary string literal is equivalent to a space character; + in a source literal, it's shorthand for matching a whitespace character. +- In an ordinary string literal, these are (needlessly) escaped characters; + in a source literal, they are shorthands for various matching characters: + + \w \W \d \D \h \H \S \R + +=== Character Classes + +A <i>character class</i> is delimited by square brackets; +it specifies that certain characters match at a given point in the target string: + + # This character class will match any vowel. + re = /B[aeiou]rd/ + re.match('Bird') # => #<MatchData "Bird"> + re.match('Bard') # => #<MatchData "Bard"> + re.match('Byrd') # => nil + +A character class may contain hyphen characters to specify ranges of characters: + + # These regexps have the same effect. + /[abcdef]/.match('foo') # => #<MatchData "f"> + /[a-f]/.match('foo') # => #<MatchData "f"> + /[a-cd-f]/.match('foo') # => #<MatchData "f"> + +When the first character of a character class is a caret (<tt>^</tt>), +the sense of the class is inverted: it matches any character _except_ those specified. 
+ + /[^a-eg-z]/.match('f') # => #<MatchData "f"> + +A character class may contain another character class. +By itself this isn't useful because <tt>[a-z[0-9]]</tt> +describes the same set as <tt>[a-z0-9]</tt>. + +However, character classes also support the <tt>&&</tt> operator, +which performs set intersection on its arguments. +The two can be combined as follows: + + /[a-w&&[^c-g]z]/ # ([a-w] AND ([^c-g] OR z)) + +This is equivalent to: + + /[abh-w]/ + +=== Shorthand Character Classes + +Each of the following metacharacters serves as a shorthand +for a character class: + +- <tt>/./</tt>: Matches any character except a newline: + + /./.match('foo') # => #<MatchData "f"> + /./.match("\n") # => nil + +- <tt>/./m</tt>: Matches any character, including a newline; + see {Multiline Mode}[rdoc-ref:Regexp@Multiline+Mode]: + + /./m.match("\n") # => #<MatchData "\n"> + +- <tt>/\w/</tt>: Matches a word character: equivalent to <tt>[a-zA-Z0-9_]</tt>: + + /\w/.match(' foo') # => #<MatchData "f"> + /\w/.match(' _') # => #<MatchData "_"> + /\w/.match(' ') # => nil + +- <tt>/\W/</tt>: Matches a non-word character: equivalent to <tt>[^a-zA-Z0-9_]</tt>: + + /\W/.match(' ') # => #<MatchData " "> + /\W/.match('_') # => nil + +- <tt>/\d/</tt>: Matches a digit character: equivalent to <tt>[0-9]</tt>: + + /\d/.match('THX1138') # => #<MatchData "1"> + /\d/.match('foo') # => nil + +- <tt>/\D/</tt>: Matches a non-digit character: equivalent to <tt>[^0-9]</tt>: + + /\D/.match('123Jump!') # => #<MatchData "J"> + /\D/.match('123') # => nil + +- <tt>/\h/</tt>: Matches a hexdigit character: equivalent to <tt>[0-9a-fA-F]</tt>: + + /\h/.match('xyz fedcba9876543210') # => #<MatchData "f"> + /\h/.match('xyz') # => nil + +- <tt>/\H/</tt>: Matches a non-hexdigit character: equivalent to <tt>[^0-9a-fA-F]</tt>: + + /\H/.match('fedcba9876543210xyz') # => #<MatchData "x"> + /\H/.match('fedcba9876543210') # => nil + +- <tt>/\s/</tt>: Matches a whitespace character: equivalent to <tt>/[ \t\r\n\f\v]/</tt>: + 
+ /\s/.match('foo bar') # => #<MatchData " "> + /\s/.match('foo') # => nil + +- <tt>/\S/</tt>: Matches a non-whitespace character: equivalent to <tt>/[^ \t\r\n\f\v]/</tt>: + + /\S/.match(" \t\r\n\f\v foo") # => #<MatchData "f"> + /\S/.match(" \t\r\n\f\v") # => nil + +- <tt>/\R/</tt>: Matches a linebreak, platform-independently: + + /\R/.match("\r") # => #<MatchData "\r"> # Carriage return (CR) + /\R/.match("\n") # => #<MatchData "\n"> # Newline (LF) + /\R/.match("\f") # => #<MatchData "\f"> # Formfeed (FF) + /\R/.match("\v") # => #<MatchData "\v"> # Vertical tab (VT) + /\R/.match("\r\n") # => #<MatchData "\r\n"> # CRLF + /\R/.match("\u0085") # => #<MatchData "\u0085"> # Next line (NEL) + /\R/.match("\u2028") # => #<MatchData "\u2028"> # Line separator (LSEP) + /\R/.match("\u2029") # => #<MatchData "\u2029"> # Paragraph separator (PSEP) + +=== Anchors + +An anchor is a metasequence that matches a zero-width position between +characters in the target string. + +For a subexpression with no anchor, +matching may begin anywhere in the target string: + + /real/.match('surrealist') # => #<MatchData "real"> + +For a subexpression with an anchor, +matching must begin at the matched anchor. 
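The difference between anchored and unanchored matching described above can be sketched as follows. This is a minimal illustration using the <tt>\A</tt> anchor; the variable names are assumptions made for the example.

```ruby
unanchored = /real/
anchored   = /\Areal/ # \A requires the match to start at the beginning of the string

found_anywhere = unanchored.match?('surrealist') # 'real' occurs inside the target
found_at_start = anchored.match?('surrealist')   # target does not begin with 'real'
found_realist  = anchored.match?('realist')      # target begins with 'real'

# An anchor consumes no characters: matching the anchor alone yields an empty match.
zero_width = /\A/.match('realist')
```

Here +zero_width[0]+ is the empty string, which shows that the anchor matched a position rather than a character.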
+ +==== Boundary Anchors + +Each of these anchors matches a boundary: + +- <tt>^</tt>: Matches the beginning of a line: + + /^bar/.match("foo\nbar") # => #<MatchData "bar"> + /^ar/.match("foo\nbar") # => nil + +- <tt>$</tt>: Matches the end of a line: + + /bar$/.match("foo\nbar") # => #<MatchData "bar"> + /ba$/.match("foo\nbar") # => nil + +- <tt>\A</tt>: Matches the beginning of the string: + + /\Afoo/.match('foo bar') # => #<MatchData "foo"> + /\Afoo/.match(' foo bar') # => nil + +- <tt>\Z</tt>: Matches the end of the string; + if string ends with a single newline, + it matches just before the ending newline: + + /foo\Z/.match('bar foo') # => #<MatchData "foo"> + /foo\Z/.match('foo bar') # => nil + /foo\Z/.match("bar foo\n") # => #<MatchData "foo"> + /foo\Z/.match("bar foo\n\n") # => nil + +- <tt>\z</tt>: Matches the end of the string: + + /foo\z/.match('bar foo') # => #<MatchData "foo"> + /foo\z/.match('foo bar') # => nil + /foo\z/.match("bar foo\n") # => nil + +- <tt>\b</tt>: Matches word boundary when not inside brackets; + matches backspace (<tt>"0x08"</tt>) when inside brackets: + + /foo\b/.match('foo bar') # => #<MatchData "foo"> + /foo\b/.match('foobar') # => nil + +- <tt>\B</tt>: Matches non-word boundary: + + /foo\B/.match('foobar') # => #<MatchData "foo"> + /foo\B/.match('foo bar') # => nil + +- <tt>\G</tt>: Matches first matching position: + + In methods like String#gsub and String#scan, it changes on each iteration. + It initially matches the beginning of subject, and in each following iteration it matches where the last match finished. + + " a b c".gsub(/ /, '_') # => "____a_b_c" + " a b c".gsub(/\G /, '_') # => "____a b c" + + In methods like Regexp#match and String#match + that take an optional offset, it matches where the search begins. 
+ + "hello, world".match(/,/, 3) # => #<MatchData ","> + "hello, world".match(/\G,/, 3) # => nil + +==== Lookaround Anchors + +Lookahead anchors: + +- <tt>(?=pat)</tt>: Positive lookahead assertion: + ensures that the following characters match _pat_, + but doesn't include those characters in the matched substring. + +- <tt>(?!pat)</tt>: Negative lookahead assertion: + ensures that the following characters <i>do not</i> match _pat_, + but doesn't include those characters in the matched substring. + +Lookbehind anchors: + +- <tt>(?<=pat)</tt>: Positive lookbehind assertion: + ensures that the preceding characters match _pat_, but + doesn't include those characters in the matched substring. + +- <tt>(?<!pat)</tt>: Negative lookbehind assertion: + ensures that the preceding characters do not match + _pat_, but doesn't include those characters in the matched substring. + +The pattern below uses positive lookahead and positive lookbehind to match +text appearing in <tt><b></tt>...<tt></b></tt> tags +without including the tags in the match: + + /(?<=<b>)\w+(?=<\/b>)/.match("Fortune favors the <b>bold</b>.") + # => #<MatchData "bold"> + +The pattern in a lookbehind must be fixed-width, +but top-level alternatives may have different lengths; +for example, <tt>(?<=a|bc)</tt> is allowed, but <tt>(?<=aaa(?:b|cd))</tt> is not. + +==== Match-Reset Anchor + +- <tt>\K</tt>: Match reset: + the matched content preceding <tt>\K</tt> in the regexp is excluded from the result. + For example, the following two regexps are almost equivalent: + + /ab\Kc/.match('abc') # => #<MatchData "c"> + /(?<=ab)c/.match('abc') # => #<MatchData "c"> + + These match the same string and <tt>$&</tt> equals <tt>'c'</tt>, + while the matched position is different. + + As are the following two regexps: + + /(a)\K(b)\Kc/ + /(?<=(?<=(a))(b))c/ + +=== Alternation + +The vertical bar metacharacter (<tt>|</tt>) may be used within parentheses +to express alternation: +two or more subexpressions any of which may match the target string. 
+ +Two alternatives: + + re = /(a|b)/ + re.match('foo') # => nil + re.match('bar') # => #<MatchData "b" 1:"b"> + +Four alternatives: + + re = /(a|b|c|d)/ + re.match('shazam') # => #<MatchData "a" 1:"a"> + re.match('cold') # => #<MatchData "c" 1:"c"> + +Each alternative is a subexpression, and may be composed of other subexpressions: + + re = /([a-c]|[x-z])/ + re.match('bar') # => #<MatchData "b" 1:"b"> + re.match('ooz') # => #<MatchData "z" 1:"z"> + +Method Regexp.union provides a convenient way to construct +a regexp with alternatives. + +=== Quantifiers + +A simple regexp matches one character: + + /\w/.match('Hello') # => #<MatchData "H"> + +An added _quantifier_ specifies how many matches are required or allowed: + +- <tt>*</tt> - Matches zero or more times: + + /\w*/.match('') + # => #<MatchData ""> + /\w*/.match('x') + # => #<MatchData "x"> + /\w*/.match('xyz') + # => #<MatchData "xyz"> + +- <tt>+</tt> - Matches one or more times: + + /\w+/.match('') # => nil + /\w+/.match('x') # => #<MatchData "x"> + /\w+/.match('xyz') # => #<MatchData "xyz"> + +- <tt>?</tt> - Matches zero or one times: + + /\w?/.match('') # => #<MatchData ""> + /\w?/.match('x') # => #<MatchData "x"> + /\w?/.match('xyz') # => #<MatchData "x"> + +- <tt>{</tt>_n_<tt>}</tt> - Matches exactly _n_ times: + + /\w{2}/.match('') # => nil + /\w{2}/.match('x') # => nil + /\w{2}/.match('xyz') # => #<MatchData "xy"> + +- <tt>{</tt>_min_<tt>,}</tt> - Matches _min_ or more times: + + /\w{2,}/.match('') # => nil + /\w{2,}/.match('x') # => nil + /\w{2,}/.match('xy') # => #<MatchData "xy"> + /\w{2,}/.match('xyz') # => #<MatchData "xyz"> + +- <tt>{,</tt>_max_<tt>}</tt> - Matches _max_ or fewer times: + + /\w{,2}/.match('') # => #<MatchData ""> + /\w{,2}/.match('x') # => #<MatchData "x"> + /\w{,2}/.match('xyz') # => #<MatchData "xy"> + +- <tt>{</tt>_min_<tt>,</tt>_max_<tt>}</tt> - + Matches at least _min_ times and at most _max_ times: + + /\w{1,2}/.match('') # => nil + /\w{1,2}/.match('x') # => #<MatchData 
"x"> + /\w{1,2}/.match('xyz') # => #<MatchData "xy"> + +==== Greedy, Lazy, or Possessive Matching + +Quantifier matching may be greedy, lazy, or possessive: + +- In _greedy_ matching, as many occurrences as possible are matched + while still allowing the overall match to succeed. + Greedy quantifiers: <tt>*</tt>, <tt>+</tt>, <tt>?</tt>, + <tt>{min, max}</tt> and its variants. +- In _lazy_ matching, the minimum number of occurrences are matched. + Lazy quantifiers: <tt>*?</tt>, <tt>+?</tt>, <tt>??</tt>, + <tt>{min, max}?</tt> and its variants. +- In _possessive_ matching, once a match is found, there is no backtracking; + that match is retained, even if it jeopardises the overall match. + Possessive quantifiers: <tt>*+</tt>, <tt>++</tt>, <tt>?+</tt>. + Note that <tt>{min, max}</tt> and its variants do _not_ support possessive matching. + +More: + +- About greedy and lazy matching, see + {Choosing Minimal or Maximal Repetition}[https://doc.lagout.org/programmation/Regular%20Expressions/Regular%20Expressions%20Cookbook_%20Detailed%20Solutions%20in%20Eight%20Programming%20Languages%20%282nd%20ed.%29%20%5BGoyvaerts%20%26%20Levithan%202012-09-06%5D.pdf#tutorial-backtrack]. +- About possessive matching, see + {Eliminate Needless Backtracking}[https://doc.lagout.org/programmation/Regular%20Expressions/Regular%20Expressions%20Cookbook_%20Detailed%20Solutions%20in%20Eight%20Programming%20Languages%20%282nd%20ed.%29%20%5BGoyvaerts%20%26%20Levithan%202012-09-06%5D.pdf#tutorial-backtrack]. 
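The three matching styles above can be compared on a single target string. This is a minimal sketch; the target string and variable names are chosen only for illustration.

```ruby
s = '<b>bold</b>'

# Greedy: .+ takes as much as it can, then gives back just enough for '>' to match.
greedy = s[/<.+>/]

# Lazy: .+? takes as little as it can, so the match ends at the first '>'.
lazy = s[/<.+?>/]

# Possessive: .++ takes everything to the end of the string and never gives any
# back, so the trailing '>' in the pattern has nothing left to match.
possessive = s[/<.++>/]
```

With this target, the greedy form returns the whole string, the lazy form returns only <tt>"<b>"</tt>, and the possessive form returns +nil+.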
+ +=== Groups and Captures + +A simple regexp has (at most) one match: + + re = /\d\d\d\d-\d\d-\d\d/ + re.match('1943-02-04') # => #<MatchData "1943-02-04"> + re.match('1943-02-04').size # => 1 + re.match('foo') # => nil + +Adding one or more pairs of parentheses, <tt>(subexpression)</tt>, +defines _groups_, which may result in multiple matched substrings, +called _captures_: + + re = /(\d\d\d\d)-(\d\d)-(\d\d)/ + re.match('1943-02-04') # => #<MatchData "1943-02-04" 1:"1943" 2:"02" 3:"04"> + re.match('1943-02-04').size # => 4 + +The first capture is the entire matched string; +the other captures are the matched substrings from the groups. + +A group may have a {quantifier}[rdoc-ref:Regexp@Quantifiers]: + + re = /July 4(th)?/ + re.match('July 4') # => #<MatchData "July 4" 1:nil> + re.match('July 4th') # => #<MatchData "July 4th" 1:"th"> + + re = /(foo)*/ + re.match('') # => #<MatchData "" 1:nil> + re.match('foo') # => #<MatchData "foo" 1:"foo"> + re.match('foofoo') # => #<MatchData "foofoo" 1:"foo"> + + re = /(foo)+/ + re.match('') # => nil + re.match('foo') # => #<MatchData "foo" 1:"foo"> + re.match('foofoo') # => #<MatchData "foofoo" 1:"foo"> + +The returned \MatchData object gives access to the matched substrings: + + re = /(\d\d\d\d)-(\d\d)-(\d\d)/ + md = re.match('1943-02-04') + # => #<MatchData "1943-02-04" 1:"1943" 2:"02" 3:"04"> + md[0] # => "1943-02-04" + md[1] # => "1943" + md[2] # => "02" + md[3] # => "04" + +==== Non-Capturing Groups + +A group may be made non-capturing; +it is still a group (and, for example, can have a quantifier), +but its matching substring is not included among the captures. + +A non-capturing group begins with <tt>?:</tt> (inside the parentheses): + + # Don't capture the year. 
+ re = /(?:\d\d\d\d)-(\d\d)-(\d\d)/ + md = re.match('1943-02-04') # => #<MatchData "1943-02-04" 1:"02" 2:"04"> + +==== Backreferences + +A group match may also be referenced within the regexp itself; +such a reference is called a +backreference+: + + /[csh](..) [csh]\1 in/.match('The cat sat in the hat') + # => #<MatchData "cat sat in" 1:"at"> + +This table shows how each subexpression in the regexp above +matches a substring in the target string: + + | Subexpression in Regexp | Matching Substring in Target String | + |---------------------------|-------------------------------------| + | First '[csh]' | Character 'c' | + | '(..)' | First substring 'at' | + | First space ' ' | First space character ' ' | + | Second '[csh]' | Character 's' | + | '\1' (backreference 'at') | Second substring 'at' | + | ' in' | Substring ' in' | + +A regexp may contain any number of groups: + +- For a large number of groups: + + - The ordinary <tt>\\n</tt> notation applies only for _n_ in range (1..9). + - The <tt>MatchData[n]</tt> notation applies for any non-negative _n_. + +- <tt>\0</tt> is a special backreference, referring to the entire matched string; + it may not be used within the regexp itself, + but may be used outside it (for example, in a substitution method call): + + 'The cat sat in the hat'.gsub(/[csh]at/, '\0s') + # => "The cats sats in the hats" + +==== Named Captures + +As seen above, a capture can be referred to by its number. +A capture can also have a name, +prefixed as <tt>?<name></tt> or <tt>?'name'</tt>, +and the name (symbolized) may be used as an index in <tt>MatchData[]</tt>: + + md = /\$(?<dollars>\d+)\.(?'cents'\d+)/.match("$3.67") + # => #<MatchData "$3.67" dollars:"3" cents:"67"> + md[:dollars] # => "3" + md[:cents] # => "67" + # The capture numbers are still valid. 
+ md[2] # => "67" + +When a regexp contains a named capture, there are no unnamed captures: + + /\$(?<dollars>\d+)\.(\d+)/.match("$3.67") + # => #<MatchData "$3.67" dollars:"3"> + +A named group may be backreferenced as <tt>\k<name></tt>: + + /(?<vowel>[aeiou]).\k<vowel>.\k<vowel>/.match('ototomy') + # => #<MatchData "ototo" vowel:"o"> + +When (and only when) a regexp contains named capture groups +and appears before the <tt>=~</tt> operator, +the captured substrings are assigned to local variables with corresponding names: + + /\$(?<dollars>\d+)\.(?<cents>\d+)/ =~ '$3.67' + dollars # => "3" + cents # => "67" + +Method MatchData#named_captures returns a hash of the capture names and substrings; +method Regexp#named_captures returns a hash of the capture names and group numbers; +method Regexp#names returns an array of the capture names. + +==== Atomic Grouping + +A group may be made _atomic_ with <tt>(?></tt>_subexpression_<tt>)</tt>. + +This causes the subexpression to be matched +independently of the rest of the expression, +so that the matched substring becomes fixed for the remainder of the match, +unless the entire subexpression must be abandoned and subsequently revisited. + +In this way _subexpression_ is treated as a non-divisible whole. +Atomic grouping is typically used to optimise patterns +to prevent needless backtracking. + +Example (without atomic grouping): + + /".*"/.match('"Quote"') # => #<MatchData "\"Quote\""> + +Analysis: + +1. The leading subexpression <tt>"</tt> in the pattern matches the first character + <tt>"</tt> in the target string. +2. The next subexpression <tt>.*</tt> matches the next substring <tt>Quote"</tt> + (including the trailing double-quote). +3. Now there is nothing left in the target string to match + the trailing subexpression <tt>"</tt> in the pattern; + this would cause the overall match to fail. +4. The matched substring is backtracked by one position: <tt>Quote</tt>. +5. The final subexpression <tt>"</tt> now matches the final substring <tt>"</tt>, + and the overall match succeeds. 
+ +If subexpression <tt>.*</tt> is grouped atomically, +the backtracking is disabled, and the overall match fails: + + /"(?>.*)"/.match('"Quote"') # => nil + +Atomic grouping can affect performance; +see {Atomic Group}[https://www.regular-expressions.info/atomic.html]. + +==== Subexpression Calls + +As seen above, a backreference number (<tt>\\n</tt>) or name (<tt>\k<name></tt>) +gives access to a captured _substring_; +the corresponding regexp _subexpression_ may also be accessed, +via the number n (<tt>\\gn</tt>) or name (<tt>\g<name></tt>): + + /\A(?<paren>\(\g<paren>*\))*\z/.match('(())') + # ^1 + # ^2 + # ^3 + # ^4 + # ^5 + # ^6 + # ^7 + # ^8 + # ^9 + # ^10 + +The pattern: + +1. Matches at the beginning of the string, i.e. before the first character. +2. Enters a named group +paren+. +3. Matches the first character in the string, <tt>'('</tt>. +4. Calls the +paren+ group again, i.e. recurses back to the second step. +5. Re-enters the +paren+ group. +6. Matches the second character in the string, <tt>'('</tt>. +7. Attempts to call +paren+ a third time, + but fails because doing so would prevent an overall successful match. +8. Matches the third character in the string, <tt>')'</tt>; + marks the end of the second recursive call. +9. Matches the fourth character in the string, <tt>')'</tt>. +10. Matches the end of the string. + +See {Subexpression calls}[https://learnbyexample.github.io/Ruby_Regexp/groupings-and-backreferences.html?highlight=subexpression#subexpression-calls]. + +==== Conditionals + +The conditional construct takes the form <tt>(?(cond)yes|no)</tt>, where: + +- _cond_ may be a capture number or name. +- The match to be applied is _yes_ if _cond_ is captured; + otherwise the match to be applied is _no_. +- If not needed, <tt>|no</tt> may be omitted. 
+ +Examples: + + re = /\A(foo)?(?(1)(T)|(F))\z/ + re.match('fooT') # => #<MatchData "fooT" 1:"foo" 2:"T" 3:nil> + re.match('F') # => #<MatchData "F" 1:nil 2:nil 3:"F"> + re.match('fooF') # => nil + re.match('T') # => nil + + re = /\A(?<xyzzy>foo)?(?(<xyzzy>)(T)|(F))\z/ + re.match('fooT') # => #<MatchData "fooT" xyzzy:"foo"> + re.match('F') # => #<MatchData "F" xyzzy:nil> + re.match('fooF') # => nil + re.match('T') # => nil + + +==== Absence Operator + +The absence operator is a special group that matches anything which does _not_ match the contained subexpressions. + + /(?~real)/.match('surrealist') # => #<MatchData "surrea"> + /(?~real)ist/.match('surrealist') # => #<MatchData "ealist"> + /sur(?~real)ist/.match('surrealist') # => nil + +=== Unicode + +==== Unicode Properties + +The <tt>/\p{property_name}/</tt> construct (with lowercase +p+) +matches characters using a Unicode property name, +much like a character class; +property +Alpha+ specifies alphabetic characters: + + /\p{Alpha}/.match('a') # => #<MatchData "a"> + /\p{Alpha}/.match('1') # => nil + +A property can be inverted +by prefixing the name with a caret character (<tt>^</tt>): + + /\p{^Alpha}/.match('1') # => #<MatchData "1"> + /\p{^Alpha}/.match('a') # => nil + +Or by using <tt>\P</tt> (uppercase +P+): + + /\P{Alpha}/.match('1') # => #<MatchData "1"> + /\P{Alpha}/.match('a') # => nil + +See {Unicode Properties}[rdoc-ref:language/regexp/unicode_properties.rdoc] +for regexps based on the numerous properties. 
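A property and its inverted form can be used together when scanning or cleaning text. The sample string below is an assumption chosen for illustration; it contains one non-ASCII letter, written as an escape.

```ruby
text = "Ruby 3.3 ist sch\u00f6n"

# Runs of alphabetic characters, including the non-ASCII letter.
words = text.scan(/\p{Alpha}+/)
# => ["Ruby", "ist", "schön"]

# Remove everything that is not alphabetic.
letters_only = text.gsub(/\P{Alpha}/, '')
# => "Rubyistschön"
```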
+ +Some commonly-used properties correspond to POSIX bracket expressions: + +- <tt>/\p{Alnum}/</tt>: Alphabetic and numeric character +- <tt>/\p{Alpha}/</tt>: Alphabetic character +- <tt>/\p{Blank}/</tt>: Space or tab +- <tt>/\p{Cntrl}/</tt>: Control character +- <tt>/\p{Digit}/</tt>: Digit +- <tt>/\p{Lower}/</tt>: Lowercase alphabetical character +- <tt>/\p{Print}/</tt>: Like <tt>\p{Graph}</tt>, but includes the space character +- <tt>/\p{Punct}/</tt>: Punctuation character +- <tt>/\p{Space}/</tt>: Whitespace character (<tt>[:blank:]</tt>, newline, + carriage return, etc.) +- <tt>/\p{Upper}/</tt>: Uppercase alphabetical character +- <tt>/\p{XDigit}/</tt>: Digit allowed in a hexadecimal number (i.e., 0-9a-fA-F) + +These are also commonly used: + +- <tt>/\p{Emoji}/</tt>: Unicode emoji. +- <tt>/\p{Graph}/</tt>: Characters excluding <tt>/\p{Cntrl}/</tt> and <tt>/\p{Space}/</tt>. + Note that invisible characters under the Unicode + {"Format"}[https://www.compart.com/en/unicode/category/Cf] category are included. +- <tt>/\p{Word}/</tt>: A member in one of these Unicode character + categories (see below) or having one of these Unicode properties: + + - Unicode categories: + - +Mark+ (+M+). + - <tt>Decimal Number</tt> (+Nd+) + - <tt>Connector Punctuation</tt> (+Pc+). + + - Unicode properties: + - +Alpha+ + - <tt>Join_Control</tt> + +- <tt>/\p{ASCII}/</tt>: A character in the ASCII character set. +- <tt>/\p{Any}/</tt>: Any Unicode character (including unassigned characters). +- <tt>/\p{Assigned}/</tt>: An assigned character. + +==== Unicode Character Categories + +A Unicode character category name: + +- May be either its full name or its abbreviated name. +- Is case-insensitive. +- Treats a space, a hyphen, and an underscore as equivalent. 
+ +Examples: + + /\p{lu}/ # => /\p{lu}/ + /\p{LU}/ # => /\p{LU}/ + /\p{Uppercase Letter}/ # => /\p{Uppercase Letter}/ + /\p{Uppercase_Letter}/ # => /\p{Uppercase_Letter}/ + /\p{UPPERCASE-LETTER}/ # => /\p{UPPERCASE-LETTER}/ + +Below are the Unicode character category abbreviations and names. +Enumerations of characters in each category are at the links. + +Letters: + +- +L+, +Letter+: +LC+, +Lm+, or +Lo+. +- +LC+, +Cased_Letter+: +Ll+, +Lt+, or +Lu+. +- {Ll, Lowercase_Letter}[https://www.compart.com/en/unicode/category/Ll]. +- {Lm, Modifier_Letter}[https://www.compart.com/en/unicode/category/Lm]. +- {Lo, Other_Letter}[https://www.compart.com/en/unicode/category/Lo]. +- {Lt, Titlecase_Letter}[https://www.compart.com/en/unicode/category/Lt]. +- {Lu, Uppercase_Letter}[https://www.compart.com/en/unicode/category/Lu]. + +Marks: + +- +M+, +Mark+: +Mc+, +Me+, or +Mn+. +- {Mc, Spacing_Mark}[https://www.compart.com/en/unicode/category/Mc]. +- {Me, Enclosing_Mark}[https://www.compart.com/en/unicode/category/Me]. +- {Mn, Nonspacing_Mark}[https://www.compart.com/en/unicode/category/Mn]. + +Numbers: + +- +N+, +Number+: +Nd+, +Nl+, or +No+. +- {Nd, Decimal_Number}[https://www.compart.com/en/unicode/category/Nd]. +- {Nl, Letter_Number}[https://www.compart.com/en/unicode/category/Nl]. +- {No, Other_Number}[https://www.compart.com/en/unicode/category/No]. + +Punctuation: + +- +P+, +Punctuation+: +Pc+, +Pd+, +Pe+, +Pf+, +Pi+, +Po+, or +Ps+. +- {Pc, Connector_Punctuation}[https://www.compart.com/en/unicode/category/Pc]. +- {Pd, Dash_Punctuation}[https://www.compart.com/en/unicode/category/Pd]. +- {Pe, Close_Punctuation}[https://www.compart.com/en/unicode/category/Pe]. +- {Pf, Final_Punctuation}[https://www.compart.com/en/unicode/category/Pf]. +- {Pi, Initial_Punctuation}[https://www.compart.com/en/unicode/category/Pi]. +- {Po, Other_Punctuation}[https://www.compart.com/en/unicode/category/Po]. +- {Ps, Open_Punctuation}[https://www.compart.com/en/unicode/category/Ps]. 
+ +- +S+, +Symbol+: +Sc+, +Sk+, +Sm+, or +So+. +- {Sc, Currency_Symbol}[https://www.compart.com/en/unicode/category/Sc]. +- {Sk, Modifier_Symbol}[https://www.compart.com/en/unicode/category/Sk]. +- {Sm, Math_Symbol}[https://www.compart.com/en/unicode/category/Sm]. +- {So, Other_Symbol}[https://www.compart.com/en/unicode/category/So]. + +- +Z+, +Separator+: +Zl+, +Zp+, or +Zs+. +- {Zl, Line_Separator}[https://www.compart.com/en/unicode/category/Zl]. +- {Zp, Paragraph_Separator}[https://www.compart.com/en/unicode/category/Zp]. +- {Zs, Space_Separator}[https://www.compart.com/en/unicode/category/Zs]. + +- +C+, +Other+: +Cc+, +Cf+, +Cn+, +Co+, or +Cs+. +- {Cc, Control}[https://www.compart.com/en/unicode/category/Cc]. +- {Cf, Format}[https://www.compart.com/en/unicode/category/Cf]. +- {Cn, Unassigned}[https://www.compart.com/en/unicode/category/Cn]. +- {Co, Private_Use}[https://www.compart.com/en/unicode/category/Co]. +- {Cs, Surrogate}[https://www.compart.com/en/unicode/category/Cs]. + +==== Unicode Scripts and Blocks + +Among the Unicode properties are: + +- {Unicode scripts}[https://en.wikipedia.org/wiki/Script_(Unicode)]; + see {supported scripts}[https://www.unicode.org/standard/supported.html]. +- {Unicode blocks}[https://en.wikipedia.org/wiki/Unicode_block]; + see {supported blocks}[http://www.unicode.org/Public/UNIDATA/Blocks.txt]. + +=== POSIX Bracket Expressions + +A POSIX <i>bracket expression</i> is also similar to a character class. +These expressions provide a portable alternative to the above, +with the added benefit of encompassing non-ASCII characters: + +- <tt>/\d/</tt> matches only ASCII decimal digits +0+ through +9+. +- <tt>/[[:digit:]]/</tt> matches any character in the Unicode + <tt>Decimal Number</tt> (+Nd+) category; + see below. 
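The difference between <tt>/\d/</tt> and <tt>/[[:digit:]]/</tt> shows up with a non-ASCII digit. This sketch uses U+0663 (ARABIC-INDIC DIGIT THREE) as an assumed sample character from the +Nd+ category.

```ruby
arabic_three = "\u0663" # ARABIC-INDIC DIGIT THREE, Unicode category Nd

ascii_only = /\d/.match?(arabic_three)          # => false
posix      = /[[:digit:]]/.match?(arabic_three) # => true
property   = /\p{Digit}/.match?(arabic_three)   # => true
```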
+ +The POSIX bracket expressions: + +- <tt>/[[:digit:]]/</tt>: Matches a {Unicode digit}[https://www.compart.com/en/unicode/category/Nd]: + + /[[:digit:]]/.match('9') # => #<MatchData "9"> + /[[:digit:]]/.match("\u1fbf9") # => #<MatchData "9"> + +- <tt>/[[:xdigit:]]/</tt>: Matches a digit allowed in a hexadecimal number; + equivalent to <tt>[0-9a-fA-F]</tt>. + +- <tt>/[[:upper:]]/</tt>: Matches a {Unicode uppercase letter}[https://www.compart.com/en/unicode/category/Lu]: + + /[[:upper:]]/.match('A') # => #<MatchData "A"> + /[[:upper:]]/.match("\u00c6") # => #<MatchData "Æ"> + +- <tt>/[[:lower:]]/</tt>: Matches a {Unicode lowercase letter}[https://www.compart.com/en/unicode/category/Ll]: + + /[[:lower:]]/.match('a') # => #<MatchData "a"> + /[[:lower:]]/.match("\u01fd") # => #<MatchData "ǽ"> + +- <tt>/[[:alpha:]]/</tt>: Matches <tt>/[[:upper:]]/</tt> or <tt>/[[:lower:]]/</tt>. + +- <tt>/[[:alnum:]]/</tt>: Matches <tt>/[[:alpha:]]/</tt> or <tt>/[[:digit:]]/</tt>. + +- <tt>/[[:space:]]/</tt>: Matches {Unicode space character}[https://www.compart.com/en/unicode/category/Zs]: + + /[[:space:]]/.match(' ') # => #<MatchData " "> + /[[:space:]]/.match("\u2005") # => #<MatchData " "> + +- <tt>/[[:blank:]]/</tt>: Matches <tt>/[[:space:]]/</tt> or tab character: + + /[[:blank:]]/.match(' ') # => #<MatchData " "> + /[[:blank:]]/.match("\u2005") # => #<MatchData " "> + /[[:blank:]]/.match("\t") # => #<MatchData "\t"> + +- <tt>/[[:cntrl:]]/</tt>: Matches {Unicode control character}[https://www.compart.com/en/unicode/category/Cc]: + + /[[:cntrl:]]/.match("\u0000") # => #<MatchData "\u0000"> + /[[:cntrl:]]/.match("\u009f") # => #<MatchData "\u009F"> + +- <tt>/[[:graph:]]/</tt>: Matches any character + except <tt>/[[:space:]]/</tt> or <tt>/[[:cntrl:]]/</tt>. + +- <tt>/[[:print:]]/</tt>: Matches <tt>/[[:graph:]]/</tt> or space character. 
+ +- <tt>/[[:punct:]]/</tt>: Matches any {Unicode punctuation character}[https://www.compart.com/en/unicode/category/Po]. + +Ruby also supports these (non-POSIX) bracket expressions: + +- <tt>/[[:ascii:]]/</tt>: Matches a character in the ASCII character set. +- <tt>/[[:word:]]/</tt>: Matches a character in one of these Unicode character + categories or having one of these Unicode properties: + + - Unicode categories: + - +Mark+ (+M+). + - <tt>Decimal Number</tt> (+Nd+) + - <tt>Connector Punctuation</tt> (+Pc+). + + - Unicode properties: + - +Alpha+ + - <tt>Join_Control</tt> + +=== Comments + +A comment may be included in a regexp pattern +using the <tt>(?#</tt>_comment_<tt>)</tt> construct, +where _comment_ is arbitrary text that is ignored by the regexp engine: + + /foo(?#Ignore me)bar/.match('foobar') # => #<MatchData "foobar"> + +The comment may not include an unescaped terminator character. + +See also {Extended Mode}[rdoc-ref:Regexp@Extended+Mode]. + +== Modes + +Each of these modifiers sets a mode for the regexp: + +- +i+: <tt>/pattern/i</tt> sets + {Case-Insensitive Mode}[rdoc-ref:Regexp@Case-Insensitive+Mode]. +- +m+: <tt>/pattern/m</tt> sets + {Multiline Mode}[rdoc-ref:Regexp@Multiline+Mode]. +- +x+: <tt>/pattern/x</tt> sets + {Extended Mode}[rdoc-ref:Regexp@Extended+Mode]. +- +o+: <tt>/pattern/o</tt> sets + {Interpolation Mode}[rdoc-ref:Regexp@Interpolation+Mode]. + +Any, all, or none of these may be applied. 
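As a brief sketch of combining modifiers, the pattern below applies +i+, +m+, and +x+ together; the pattern and target string are made up for illustration.

```ruby
# x: the spaces in the pattern are ignored, so this is equivalent to /a.c/im.
# m: the dot also matches a newline.
# i: letters match regardless of case.
re = /a . c/imx

matched = re.match?("A\nC") # => true

# All three mode flags are reflected in the options integer.
all_flags = Regexp::IGNORECASE | Regexp::MULTILINE | Regexp::EXTENDED
flags_set = (re.options == all_flags) # => true
```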
+ +Modifiers +i+, +m+, and +x+ may be applied to subexpressions: + +- <tt>(?modifier)</tt> turns the mode "on" for ensuing subexpressions +- <tt>(?-modifier)</tt> turns the mode "off" for ensuing subexpressions +- <tt>(?modifier:subexp)</tt> turns the mode "on" for _subexp_ within the group +- <tt>(?-modifier:subexp)</tt> turns the mode "off" for _subexp_ within the group + +Example: + + re = /(?i)te(?-i)st/ + re.match('test') # => #<MatchData "test"> + re.match('TEst') # => #<MatchData "TEst"> + re.match('TEST') # => nil + re.match('teST') # => nil + + re = /t(?i:e)st/ + re.match('test') # => #<MatchData "test"> + re.match('tEst') # => #<MatchData "tEst"> + re.match('tEST') # => nil + +Method Regexp#options returns an integer whose value shows +the settings for case-insensitivity mode, multiline mode, and extended mode. + +=== Case-Insensitive Mode + +By default, a regexp is case-sensitive: + + /foo/.match('FOO') # => nil + +Modifier +i+ enables case-insensitive mode: + + /foo/i.match('FOO') + # => #<MatchData "FOO"> + +Method Regexp#casefold? returns whether the mode is case-insensitive. + +=== Multiline Mode + +The multiline-mode in Ruby is what is commonly called a "dot-all mode": + +- Without the +m+ modifier, the subexpression <tt>.</tt> does not match newlines: + + /a.c/.match("a\nc") # => nil + +- With the modifier, it does match: + + /a.c/m.match("a\nc") # => #<MatchData "a\nc"> + +Unlike other languages, the modifier +m+ does not affect the anchors <tt>^</tt> and <tt>$</tt>. +These anchors always match at line-boundaries in Ruby. + +=== Extended Mode + +Modifier +x+ enables extended mode, which means that: + +- Literal white space in the pattern is to be ignored. +- Character <tt>#</tt> marks the remainder of its containing line as a comment, + which is also to be ignored for matching purposes. + +In extended mode, whitespace and comments may be used +to form a self-documented regexp. 
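Because literal whitespace is ignored in extended mode, a space that should be matched has to be escaped. A minimal sketch, with illustrative variable names:

```ruby
# The space in the pattern is ignored, so the pattern is effectively /ab/.
ignored  = /a b/x.match?('ab')   # => true
no_space = /a b/x.match?('a b')  # => false

# A backslash-escaped space is matched literally.
escaped  = /a\ b/x.match?('a b') # => true
```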
+ +Regexp not in extended mode (matches some Roman numerals): + + pattern = '^M{0,3}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})$' + re = /#{pattern}/ + re.match('MCMXLIII') # => #<MatchData "MCMXLIII" 1:"CM" 2:"XL" 3:"III"> + +Regexp in extended mode: + + pattern = <<-EOT + ^ # beginning of string + M{0,3} # thousands - 0 to 3 Ms + (CM|CD|D?C{0,3}) # hundreds - 900 (CM), 400 (CD), 0-300 (0 to 3 Cs), + # or 500-800 (D, followed by 0 to 3 Cs) + (XC|XL|L?X{0,3}) # tens - 90 (XC), 40 (XL), 0-30 (0 to 3 Xs), + # or 50-80 (L, followed by 0 to 3 Xs) + (IX|IV|V?I{0,3}) # ones - 9 (IX), 4 (IV), 0-3 (0 to 3 Is), + # or 5-8 (V, followed by 0 to 3 Is) + $ # end of string + EOT + re = /#{pattern}/x + re.match('MCMXLIII') # => #<MatchData "MCMXLIII" 1:"CM" 2:"XL" 3:"III"> + +=== Interpolation Mode + +Modifier +o+ means that the first time a literal regexp with interpolations +is encountered, +the generated Regexp object is saved and used for all future evaluations +of that literal regexp. +Without modifier +o+, the generated Regexp is not saved, +so each evaluation of the literal regexp generates a new Regexp object. + +Without modifier +o+: + + def letters; sleep 5; /[A-Z][a-z]/; end + words = %w[abc def xyz] + start = Time.now + words.each {|word| word.match(/\A[#{letters}]+\z/) } + Time.now - start # => 15.0174892 + +With modifier +o+: + + start = Time.now + words.each {|word| word.match(/\A[#{letters}]+\z/o) } + Time.now - start # => 5.0010866 + +Note that if the literal regexp does not have interpolations, +the +o+ behavior is the default. + +== Encodings + +By default, a regexp with only US-ASCII characters has US-ASCII encoding: + + re = /foo/ + re.source.encoding # => #<Encoding:US-ASCII> + re.encoding # => #<Encoding:US-ASCII> + +A regular expression containing non-US-ASCII characters +is assumed to use the source encoding. +This can be overridden with one of the following modifiers. 
+ +- <tt>/pat/n</tt>: US-ASCII if only containing US-ASCII characters, + otherwise ASCII-8BIT: + + /foo/n.encoding # => #<Encoding:US-ASCII> + /foo\xff/n.encoding # => #<Encoding:ASCII-8BIT> + /foo\x7f/n.encoding # => #<Encoding:US-ASCII> + +- <tt>/pat/u</tt>: UTF-8 + + /foo/u.encoding # => #<Encoding:UTF-8> + +- <tt>/pat/e</tt>: EUC-JP + + /foo/e.encoding # => #<Encoding:EUC-JP> + +- <tt>/pat/s</tt>: Windows-31J + + /foo/s.encoding # => #<Encoding:Windows-31J> + +A regexp can be matched against a target string when either: + +- They have the same encoding. +- The regexp's encoding is a fixed encoding and the string + contains only ASCII characters. + Method Regexp#fixed_encoding? returns whether the regexp + has a <i>fixed</i> encoding. + +If a match between incompatible encodings is attempted an +<tt>Encoding::CompatibilityError</tt> exception is raised. + +Example: + + re = eval("# encoding: ISO-8859-1\n/foo\\xff?/") + re.encoding # => #<Encoding:ISO-8859-1> + re =~ "foo".encode("UTF-8") # => 0 + re =~ "foo\u0100" # Raises Encoding::CompatibilityError + +The encoding may be explicitly fixed by including Regexp::FIXEDENCODING +in the second argument for Regexp.new: + + # Regexp with encoding ISO-8859-1. + re = Regexp.new("a".force_encoding('iso-8859-1'), Regexp::FIXEDENCODING) + re.encoding # => #<Encoding:ISO-8859-1> + # Target string with encoding UTF-8. + s = "a\u3042" + s.encoding # => #<Encoding:UTF-8> + re.match(s) # Raises Encoding::CompatibilityError. + +== Timeouts + +When either a regexp source or a target string comes from untrusted input, +malicious values could become a denial-of-service attack; +to prevent such an attack, it is wise to set a timeout. 
+ +\Regexp has two timeout values: + +- A class default timeout, used for a regexp whose instance timeout is +nil+; + this default is initially +nil+, and may be set by method Regexp.timeout=: + + Regexp.timeout # => nil + Regexp.timeout = 3.0 + Regexp.timeout # => 3.0 + +- An instance timeout, which defaults to +nil+ and may be set in Regexp.new: + + re = Regexp.new('foo', timeout: 5.0) + re.timeout # => 5.0 + +When regexp.timeout is +nil+, the timeout "falls through" to Regexp.timeout; +when regexp.timeout is non-+nil+, that value controls timing out: + + | regexp.timeout Value | Regexp.timeout Value | Result | + |----------------------|----------------------|-----------------------------| + | nil | nil | Never times out. | + | nil | Float | Times out in Float seconds. | + | Float | Any | Times out in Float seconds. | + +== Optimization + +For certain values of the pattern and target string, +matching time can grow polynomially or exponentially in relation to the input size; +the potential vulnerability arising from this is the {regular expression denial-of-service}[https://en.wikipedia.org/wiki/ReDoS] (ReDoS) attack. + +\Regexp matching can apply an optimization to prevent ReDoS attacks. +When the optimization is applied, matching time increases linearly (not polynomially or exponentially) +in relation to the input size, and a ReDoS attack is not possible. + +This optimization is applied if the pattern meets these criteria: + +- No backreferences. +- No subexpression calls. +- No nested lookaround anchors or atomic groups. +- No nested quantifiers with counting (i.e. no nested <tt>{n}</tt>, + <tt>{min,}</tt>, <tt>{,max}</tt>, or <tt>{min,max}</tt> style quantifiers) + +You can use method Regexp.linear_time? 
to determine whether a pattern meets these criteria: + + Regexp.linear_time?(/a*/) # => true + Regexp.linear_time?('a*') # => true + Regexp.linear_time?(/(a*)\1/) # => false + +However, an untrusted source may not be safe even if the method returns +true+, +because the optimization uses memoization (which may invoke large memory consumption). + +== References + +Read: + +- <i>Mastering Regular Expressions</i> + by Jeffrey E.F. Friedl. +- <i>Regular Expressions Cookbook</i> + by Jan Goyvaerts & Steven Levithan. + +Explore, test: + +- {Rubular}[https://rubular.com/]: interactive online editor. diff --git a/doc/_timezones.rdoc b/doc/_timezones.rdoc new file mode 100644 index 0000000000..a2ac46584f --- /dev/null +++ b/doc/_timezones.rdoc @@ -0,0 +1,163 @@ +== Timezone Specifiers + +Certain +Time+ methods accept arguments that specify timezones: + +- Time.at: keyword argument +in:+. +- Time.new: positional argument +zone+ or keyword argument +in:+. +- Time.now: keyword argument +in:+. +- Time#getlocal: positional argument +zone+. +- Time#localtime: positional argument +zone+. + +The value given with any of these must be one of the following +(each detailed below): + +- {Hours/minutes offset}[rdoc-ref:Time@Hours-2FMinutes+Offsets]. +- {Single-letter offset}[rdoc-ref:Time@Single-Letter+Offsets]. +- {Integer offset}[rdoc-ref:Time@Integer+Offsets]. +- {Timezone object}[rdoc-ref:Time@Timezone+Objects]. +- {Timezone name}[rdoc-ref:Time@Timezone+Names]. + +=== Hours/Minutes Offsets + +The zone value may be a string offset from UTC +in the form <tt>'+HH:MM'</tt> or <tt>'-HH:MM'</tt>, +where: + +- +HH+ is the 2-digit hour in the range <tt>0..23</tt>. +- +MM+ is the 2-digit minute in the range <tt>0..59</tt>. 
+ +Examples: + + t = Time.utc(2000, 1, 1, 20, 15, 1) # => 2000-01-01 20:15:01 UTC + Time.at(t, in: '-23:59') # => 1999-12-31 20:16:01 -2359 + Time.at(t, in: '+23:59') # => 2000-01-02 20:14:01 +2359 + +=== Single-Letter Offsets + +The zone value may be a letter in the range <tt>'A'..'I'</tt> +or <tt>'K'..'Z'</tt>; +see {List of military time zones}[https://en.wikipedia.org/wiki/List_of_military_time_zones]: + + t = Time.utc(2000, 1, 1, 20, 15, 1) # => 2000-01-01 20:15:01 UTC + Time.at(t, in: 'A') # => 2000-01-01 21:15:01 +0100 + Time.at(t, in: 'I') # => 2000-01-02 05:15:01 +0900 + Time.at(t, in: 'K') # => 2000-01-02 06:15:01 +1000 + Time.at(t, in: 'Y') # => 2000-01-01 08:15:01 -1200 + Time.at(t, in: 'Z') # => 2000-01-01 20:15:01 UTC + +=== \Integer Offsets + +The zone value may be an integer number of seconds +in the range <tt>-86399..86399</tt>: + + t = Time.utc(2000, 1, 1, 20, 15, 1) # => 2000-01-01 20:15:01 UTC + Time.at(t, in: -86399) # => 1999-12-31 20:15:02 -235959 + Time.at(t, in: 86399) # => 2000-01-02 20:15:00 +235959 + +=== Timezone Objects + +The zone value may be an object responding to certain timezone methods, an +instance of {Timezone}[https://github.com/panthomakos/timezone] and +{TZInfo}[https://tzinfo.github.io] for example. + +The timezone methods are: + +- +local_to_utc+: + + Called when Time.new is invoked with +tz+ as the value of positional + argument +zone+ or keyword argument +in:+. + + Argument:: a {Time-like object}[rdoc-ref:Time@Time-Like+Objects]. + Returns:: a {Time-like object}[rdoc-ref:Time@Time-Like+Objects] in the UTC timezone. + +- +utc_to_local+: + + Called when Time.at or Time.now is invoked with +tz+ as the value for + keyword argument +in:+, and when Time#getlocal or Time#localtime is called + with +tz+ as the value for positional argument +zone+. + + The UTC offset will be calculated as the difference between the + original time and the returned object as an +Integer+. 
+ If the object is in fixed offset, its +utc_offset+ is also counted. + + Argument:: a {Time-like object}[rdoc-ref:Time@Time-Like+Objects]. + Returns:: a {Time-like object}[rdoc-ref:Time@Time-Like+Objects] in the local timezone. + +A custom timezone class may have these instance methods, +which will be called if defined: + +- +abbr+: + + Called when Time#strftime is invoked with a format involving <tt>%Z</tt>. + + Argument:: a {Time-like object}[rdoc-ref:Time@Time-Like+Objects]. + Returns:: a string abbreviation for the timezone name. + +- +dst?+: + + Called when Time.at or Time.now is invoked with +tz+ as the value for + keyword argument +in:+, and when Time#getlocal or Time#localtime is + called with +tz+ as the value for positional argument +zone+. + + Argument:: a {Time-like object}[rdoc-ref:Time@Time-Like+Objects]. + Returns:: whether the time is daylight saving time. + +- +name+: + + Called when <tt>Marshal.dump(t)</tt> is invoked + + Argument:: none. + Returns:: the string name of the timezone. + +==== +Time+-Like Objects + +A +Time+-like object is a container object capable of interfacing with +timezone libraries for timezone conversion. + +The argument to the timezone conversion methods above will have attributes +similar to Time, except that timezone related attributes are meaningless. + +The objects returned by +local_to_utc+ and +utc_to_local+ methods of the +timezone object may be of the same class as their arguments, of arbitrary +object classes, or of class Integer. + +For a returned class other than +Integer+, the class must have the +following methods: + +- +year+ +- +mon+ +- +mday+ +- +hour+ +- +min+ +- +sec+ +- +isdst+ +- +to_i+ + +For a returned +Integer+, its components, decomposed in UTC, are +interpreted as times in the specified timezone. 
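A minimal sketch of a custom timezone object follows, assuming a fixed offset from UTC. The class name +FixedOffsetZone+ and its constructor are hypothetical; only +local_to_utc+ and +utc_to_local+ come from the description above, and the sketch relies on the Time-like argument supporting <tt>+</tt> and <tt>-</tt> with a number of seconds.

```ruby
# Hypothetical fixed-offset timezone; not part of Ruby or of any library.
class FixedOffsetZone
  def initialize(offset_seconds)
    @offset = offset_seconds
  end

  # Converts a local wall-clock time to UTC.
  def local_to_utc(t)
    t - @offset
  end

  # Converts a UTC time to the local wall-clock time.
  def utc_to_local(t)
    t + @offset
  end
end

tz = FixedOffsetZone.new(9 * 3600) # nine hours ahead of UTC
utc = Time.utc(2000, 1, 1, 12, 0, 0)

# Time.at calls tz.utc_to_local and derives the UTC offset from the result.
local = Time.at(utc, in: tz)
local_hour = local.hour
local_offset = local.utc_offset
```

The optional methods +abbr+, <tt>dst?</tt>, and +name+ are left out here; they would be needed for <tt>%Z</tt> formatting, daylight saving queries, and marshaling respectively.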
+ +=== Timezone Names + +If the class (the receiver of class methods, or the class of the receiver +of instance methods) has +find_timezone+ singleton method, this method is +called to achieve the corresponding timezone object from a timezone name. + +For example, using {Timezone}[https://github.com/panthomakos/timezone]: + class TimeWithTimezone < Time + require 'timezone' + def self.find_timezone(z) = Timezone[z] + end + + TimeWithTimezone.now(in: "America/New_York") #=> 2023-12-25 00:00:00 -0500 + TimeWithTimezone.new("2023-12-25 America/New_York") #=> 2023-12-25 00:00:00 -0500 + +Or, using {TZInfo}[https://tzinfo.github.io]: + class TimeWithTZInfo < Time + require 'tzinfo' + def self.find_timezone(z) = TZInfo::Timezone.get(z) + end + + TimeWithTZInfo.now(in: "America/New_York") #=> 2023-12-25 00:00:00 -0500 + TimeWithTZInfo.new("2023-12-25 America/New_York") #=> 2023-12-25 00:00:00 -0500 + +You can define this method per subclasses, or on the toplevel Time class. diff --git a/doc/contributing.rdoc b/doc/contributing.rdoc deleted file mode 100644 index 7d39c12fda..0000000000 --- a/doc/contributing.rdoc +++ /dev/null @@ -1,447 +0,0 @@ -= Contributing to Ruby - -Ruby has a vast and friendly community with hundreds of people contributing to -a thriving open-source ecosystem. This guide is designed to cover ways for -participating in the development of CRuby. - -There are plenty of ways for you to help even if you're not ready to write -code or documentation. You can help by reporting issues, testing patches, and -trying out beta releases with your applications. - -== How To Report - -If you've encountered a bug in Ruby please report it to the redmine issue -tracker available at {bugs.ruby-lang.org}[https://bugs.ruby-lang.org/]. Do not -report security vulnerabilities here, there is a {separate -channel}[rdoc-label:label-Reporting+Security+Issues] for them. - -There are a few simple steps you should follow in order to receive feedback -on your ticket. 
- -* If you haven't already, - {sign up for an account}[https://bugs.ruby-lang.org/account/register] on the - bug tracker. -* Try the latest version. - - If you aren't already using the latest version, try installing a newer - stable release. See - {Downloading Ruby}[https://www.ruby-lang.org/en/downloads/]. -* Look to see if anyone already reported your issue, try - {searching on redmine}[https://bugs.ruby-lang.org/projects/ruby-trunk/issues] - for your problem. -* If you can't find a ticket addressing your issue, - {create a new one}[https://bugs.ruby-lang.org/projects/ruby-trunk/issues/new]. -* Choose the target version, usually current. Bugs will be first fixed in the - current release and then {backported}[rdoc-label:label-Backport+Requests]. -* Fill in the Ruby version you're using when experiencing this issue - (<code>ruby -v</code>). -* Attach any logs or reproducible programs to provide additional information. - Reproducible scripts should be as small as possible. -* Briefly describe your problem. A 2-3 sentence description will help give a - quick response. -* Pick a category, such as core for common problems, or lib for a standard - library. -* Check the {Maintainers - list}[https://bugs.ruby-lang.org/projects/ruby/wiki/Maintainers] and assign - the ticket if there is an active maintainer for the library or feature. -* If the ticket doesn't have any replies after 10 days, you can send a - reminder. -* Please reply to feedback requests. If a bug report doesn't get any feedback, - it'll eventually get rejected. 
- -=== Reporting to downstream distributions - -You can report downstream issues for the following distributions via their bug tracker: - -* {debian}[https://bugs.debian.org/cgi-bin/pkgreport.cgi?src=ruby-defaults] -* {freebsd}[http://www.freebsd.org/cgi/query-pr-summary.cgi?text=ruby] -* {redhat}[https://bugzilla.redhat.com/buglist.cgi?bug_status=NEW&bug_status=ASSIGNED&bug_status=REOPENED&bug_status=MODIFIED] -* {macports}[https://trac.macports.org/query?status=assigned&status=new&status=reopened&port=~ruby] -* etc (add your distribution bug tracker here) - -=== Platform Maintainers - -For platform specific bugs in Ruby, you can assign your ticket to the current -maintainer for a specific platform. - -The current active platform maintainers are as follows: - -[mswin64 (Microsoft Windows)] - NAKAMURA Usaku (usa) -[mingw32 (Minimalist GNU for Windows)] - Nobuyoshi Nakada (nobu) -[AIX] - Yutaka Kanemoto (kanemoto) -[FreeBSD] - Akinori MUSHA (knu) -[Solaris] - Naohisa Goto (ngoto) -[RHEL, CentOS] - KOSAKI Motohiro (kosaki) -[macOS] - Kenta Murata (mrkn) -[OpenBSD] - Jeremy Evans (jeremyevans0) -[cygwin, bcc32, djgpp, wince, ...] - none. (Maintainer WANTED) - -== Reporting Security Issues - -Security vulnerabilities receive special treatment since they may negatively -affect many users. There is a private mailing list that all security issues -should be reported to and will be handled discretely. Email the -mailto:security@ruby-lang.org list and the problem will be published after -fixes have been released. You can also encrypt the issue using {the PGP public -key}[https://www.ruby-lang.org/security.asc] for the list. - -== Reporting Other Issues - -If you're having an issue with the website, or maybe the mailing list, you can -contact the webmaster to help resolve the problem. 
- -The current webmaster is: - -* Hiroshi SHIBATA (hsbt) - -You can also report issues with the ruby-lang.org website on the issue tracker: - -* {issue tracker}[https://github.com/ruby/www.ruby-lang.org/issues] - -== Resolve Existing Issues - -As a next step beyond reporting issues you can help the core team resolve -existing issues. If you check the Everyone's Issues list in GitHub Issues, -you will find a lot of issues already requiring attention. What can you do for -these? Quite a bit, actually: - -When a bug report goes for a while without any feedback, it goes to the bug -graveyard which is unfortunate. If you check the {issues -list}[https://bugs.ruby-lang.org/projects/ruby-trunk/issues] you will find lots -of delinquent bugs that require attention. - -You can help by verifying the existing tickets, try to reproduce the reported -issue on your own and comment if you still experience the bug. Some issues -lack attention because of too much ambiguity, to help you can narrow down the -problem and provide more specific details or instructions to reproduce the -bug. You might also try contributing a failing test in the form of a patch, -which we will cover later in this guide. - -It may also help to try out patches other contributors have submitted to -redmine, if gone without notice. In this case the +patch+ command is your -friend, see <code>man patch</code> for more information. Basically this would -go something like this: - - cd path/to/ruby - patch -p0 < path/to/patch - -You will then be prompted to apply the patch with the associated files. After -building ruby again, you should try to run the tests and verify if the change -actually worked or fixed the bug. It's important to provide valuable feedback -on the patch that can help reach the overall goal, try to answer some of these -questions: - -* What do you like about this change? -* What would you do differently? -* Are there any other edge cases not tested? 
-* Is there any documentation that would be affected by this change? - -If you can answer some or all of these questions, you're on the right track. -If your comment simply says "+1", then odds are that other reviewers aren't -going to take it too seriously. Show that you took the time to review the -patch. - -== How To Request Features - -If there's a new feature that you want to see added to Ruby, you will need to -write a convincing proposal and patch to implement the feature. - -For new features in CRuby, use the {'Feature' -tracker}[https://bugs.ruby-lang.org/projects/ruby-trunk/issues?set_filter=1&tracker_id=2] -on ruby-master. For non-CRuby dependent features, features that would apply to -alternate Ruby implementations such as JRuby and Rubinius, use the {CommonRuby -tracker}[https://bugs.ruby-lang.org/projects/common-ruby]. - -When writing a proposal be sure to check for previous discussions on the -topic and have a solid use case. You will need to be persuasive and convince -Matz on your new feature. You should also consider the potential compatibility -issues that this new feature might raise. - -Consider making your feature into a gem, and if there are enough people who -benefit from your feature it could help persuade ruby-core. Although feature -requests can seem like an alluring way to contribute to Ruby, often these -discussions can lead nowhere and exhaust time and energy that could be better -spent fixing bugs. Choose your battles. - -A good template for a feature proposal should look something like this: - -[Abstract] - Summary of your feature -[Background] - Describe current behavior and why it is problem. Related work, such as - solutions in other language helps us to understand the problem. -[Proposal] - Describe your proposal in details -[Details] - If it has complicated feature, describe it -[Usecase] - How would your feature be used? Who will benefit from it? -[Discussion] - Discuss about this proposal. 
A list of pros and cons will help start - discussion. -[Limitation] - Limitation of your proposal -[Another alternative proposal] - If there are alternative proposals, show them. -[See also] - Links to the other related resources - -=== Slideshow - -At the Ruby Developer Meeting in Japan, committers discuss Feature Proposals together in Tokyo. We will judge proposals and then accept, reject, or give feedback for them. -If you have a stalled proposal, making a slide to submit is good way to get feedback. - -Slides should be: - -* One-page slide -* Include a corresponding ticket number -* MUST include a figure and/or short example code -* SHOULD have less sentence in natural language (try to write less than 140 characters) -* It is RECOMMENDED to itemize: motivation/use case, proposal, pros/cons, corner case -* PDF or Image (Web browsers can show it) - -Please note: - -* Even if the proposal is generally acceptable, it won't be accepted without writing corner cases in the ticket -* Slide's example: DevelopersMeeting20130727Japan - -== Backport Requests - -When a new version of Ruby is released, it starts at patch level 0 (p0), and -bugs will be fixed first on the master branch. If it's determined that a bug -exists in a previous version of Ruby that is still in the bug fix stage of -maintenance, then a patch will be backported. After the maintenance stage of a -particular Ruby version ends, it goes into "security fix only" mode which -means only security related vulnerabilities will be backported. Versions in -End-of-life (EOL) will not receive any updates and it is recommended you -upgrade as soon as possible. - -If a major security issue is found or after a certain amount of time since the -last patch level release, a new patch-level release will be made. - -When submitting a backport request please confirm the bug has been fixed in -newer versions and exists in maintenance mode versions. 
There is a backport -tracker for each major version still in maintenance where you can request a -particular revision merged in the affected version of Ruby. - -Each major version of Ruby has a release manager that should be assigned to -handle backport requests. You can find the list of release managers on the -{wiki}[https://bugs.ruby-lang.org/projects/ruby/wiki/ReleaseEngineering]. - -=== Branches - -Status and maintainers of branches are listed on the -{wiki}[https://bugs.ruby-lang.org/projects/ruby/wiki/ReleaseEngineering]. - -== Running tests - -In order to help resolve existing issues and contributing patches to Ruby you -need to be able to run the test suite. - -CRuby uses git for source control, the {git homepage}[https://git-scm.com/] -has installation instructions with links to documentation for learning more -about git. There is a mirror of the repository on {github}[https://github.com/ruby/ruby]. -. For other resources see the {ruby-core documentation on -ruby-lang.org}[https://www.ruby-lang.org/en/community/ruby-core/]. - -Install the prerequisite dependencies for building the CRuby interpreter to -run tests. - -* C compiler -* autoconf -* bison -* gperf -* ruby - Ruby itself is prerequisite in order to build Ruby from source. It - can be 1.8. 
- -You should also have access to development headers for the following -libraries, but these are not required: - -* NDBM/QDBM -* GDBM -* OpenSSL/LibreSSL -* readline/editline(libedit) -* zlib -* libffi -* libyaml -* libexecinfo (FreeBSD) - -Now let's build CRuby: - -* Checkout the CRuby source code: - - git clone https://github.com/ruby/ruby.git ruby-master - -* Generate the configuration files and build: - - cd ruby-master - autoconf - mkdir build && cd build # its good practice to build outside of source dir - mkdir ~/.rubies # we will install to .rubies/ruby-master in our home dir - ../configure --prefix="${HOME}/.rubies/ruby-master" - make up && make install - -After adding Ruby to your PATH, you should be ready to run the test suite: - - make test - -You can also use +test-all+ to run all of the tests with the RUNRUBY -interpreter just built. Use TESTS or RUNRUBYOPT to pass parameters, such as: - - make test-all TESTS=-v - -This is also how you can run a specific test from our build dir: - - make test-all TESTS=drb/test_drb.rb - -You can run +test+ and +test-all+ at once by +check+ . - - make check - -For older versions of Ruby you will need to run the build setup again after -checking out the associated branch in git, for example if you wanted to -checkout 1.9.3: - - git clone https://github.com/ruby/ruby.git --branch ruby_1_9_3 - -Once you checked out the source code, you can update the local copy by: - - make up - -Or, update, build, install and check, by just: - - make love - -== Contributing Documentation - -If you're interested in contributing documentation directly to CRuby there is -some information available at -{Contributing}[https://github.com/ruby/ruby#contributing]. - -There is also the {Ruby Reference -Manual}[https://github.com/rurema/doctree/wiki] in Japanese. 
- -== Contributing A Patch - -=== Deciding what to patch - -Before you submit a patch, there are a few things you should know: - -* Pay attention to the maintenance policy for stable and maintained versions of Ruby. -* Released versions in security mode will not merge feature changes. -* Search for previous discussions on ruby-core to verify the maintenance policy -* Patches must be distributed under Ruby's license. -* This license may change in the future, you must join the discussion if you don't agree to the change - -To improve the chance your patch will be accepted please follow these simple rules: - -* Bug fixes should be committed on master first -* Format of the patch file must be a unified diff (ie: diff -pu, svn diff, or git diff) -* Don't introduce cosmetic changes -* Follow the original coding style of the code -* Don't mix different changes in one commit - -First thing you should do is check out the code if you haven't already: - - git clone https://github.com/ruby/ruby.git ruby-master - -Now create a dedicated branch: - - cd ruby-master - git checkout -b my_new_branch - -The name of your branch doesn't really matter because it will only exist on -your local computer and won't be part of the official Ruby repository. It will -be used to create patches based on the differences between your branch and -master, or edge Ruby. - -=== Coding style - -Here are some general rules to follow when writing Ruby and C code for CRuby: - -* Indent 4 spaces for C without tabs (old codes might use tabs for eight-space indentation, - but newer codes recommend to use spaces only) -* Indent 2 space tabs for Ruby -* Do not use TABs in ruby codes -* ANSI C style for 1.9+ for function declarations -* Follow C90 (not C99) Standard -* PascalStyle for class/module names. -* UNDERSCORE_SEPARATED_UPPER_CASE for other constants. -* Capitalize words. -* ABBRs should be all upper case. 
-* Do as others do - -=== Commit messages - -When you're ready to commit: - - git commit path/to/files - -This will open your editor in which you write your commit message. -Use the following style for commit messages: - -* Use a succint subject line. -* Include reasoning behind the change in the commit message, focusing on why - the change is being made. -* Refer to redmine issue (such as Fixes [Bug #1234] or Implements - [Feature #3456]), or discussion on the mailing list - (such as [ruby-core:12345]). -* For GitHub issues, use [GH-#] (such as [Fixes GH-234]). -* Follow the style used by other committers. - -=== Contributing your code - -Now that you've got some code you want to contribute, let's get set up to -generate a patch. Start by forking the github mirror, check the {github docs on -forking}[https://help.github.com/articles/fork-a-repo] if you get stuck here. -You will only need a github account if you intend to host your repository -on github. - -Next copy the writable url for your fork and add it as a git remote, replace -"my_username" with your github account name: - - git remote add my_fork git@github.com:my_username/ruby.git - # Now we can push our branch to our fork - git push my_fork my_new_branch - -In order to generate a patch that you can upload to the bug tracker, we can use -the github interface to review our changes just visit -https://github.com/my_username/ruby/compare/master...my_new_branch - -Next, you can simply add '.patch' to the end of this URL and it will generate -the patch for you, save the file to your computer and upload it to the bug -tracker. Alternatively you can submit a pull request, but for the best chances -to receive feedback add it is recommended you add it to redmine. 
- -Since git is a distributed system, you are welcome to host your git repository -on any {publicly accessible hosting -site}[https://git.wiki.kernel.org/index.php/GitHosting], including {hosting your -own}[https://www.kernel.org/pub/software/scm/git/docs/user-manual.html#public-repositories] -You may use the {'git format-patch'}[https://git-scm.com/docs/git-format-patch] -command to generate patch files to upload to redmine. You may also use -the {'git request-pull'}[https://git-scm.com/docs/git-request-pull] command for -formatting pull request messages to redmine. - -=== Updating the official repository - -If you are a committer, you can push changes directly into the official -repository: - - git push origin your-branch-name:master - -However, it is likely will have become outdated, and you will have to -update it. In that case, run: - - git fetch origin - git rebase remotes/origin/master - -and then try pushing your changes again. diff --git a/doc/bug_triaging.rdoc b/doc/contributing/bug_triaging.rdoc index 310eff1aeb..83fe88cabe 100644 --- a/doc/bug_triaging.rdoc +++ b/doc/contributing/bug_triaging.rdoc @@ -27,12 +27,12 @@ If you cannot reproduce the example with the master branch, but can reproduce the issue on the latest version for the branch, then it is likely the bug has already been fixed, but it has not been backported yet. Try to determine which commit fixed it, and update the issue noting that the issue has been -fixed but not yet backported. If the ruby version is in the security +fixed but not yet backported. If the Ruby version is in the security maintenance phase or no longer supported, change the status to Closed. This -change be made without adding a note to avoid spamming the mailing list. +change can be made without adding a note to avoid spamming the mailing list. 
For issues that may require backwards incompatible changes or may benefit from -general committer attention or discussion, considering adding them as agenda +general committer attention or discussion, consider adding them as agenda items for the next committer meeting (https://bugs.ruby-lang.org/issues/14770). == Crash Bugs Without Reproducers @@ -41,27 +41,26 @@ Many bugs reported have little more than a crash report, often with no way to reproduce the issue. These bugs are difficult to triage as they often do not contain enough information. - -For these bugs, if the ruby version is the master branch or is the latest +For these bugs, if the Ruby version is the master branch or is the latest release for the branch and the branch is in normal maintenance phase, look at the backtrace and see if you can determine what could be causing the issue. -If you can guess would could be causing the issue, see if you can put together +If you can guess what could be causing the issue, see if you can put together a reproducible example (this is in general quite difficult). If you cannot guess what could be causing the issue, or cannot put together a reproducible example yourself, please ask the reporter to provide a reproducible example, and change the status to Feedback. -if the ruby version is no longer current (e.g. 2.5.0 when the latest version -on the ruby 2.5 branch is 2.5.5), add a note to the issue asking the reporter -to try the latest ruby version for the branch and report back, and change the -status to Feedback. If the ruby version is in the security maintenance phase +If the Ruby version is no longer current (e.g. 2.5.0 when the latest version +on the Ruby 2.5 branch is 2.5.5), add a note to the issue asking the reporter +to try the latest Ruby version for the branch and report back, and change the +status to Feedback. If the Ruby version is in the security maintenance phase or no longer supported, change the status to Closed. 
This change can be made without adding a note. == Crash Bugs With 3rd Party C Extensions -If the crash happens inside a 3rd party C extension, try to figure out which -C extension it happens inside, and add a note to the issue to report the +If the crash happens inside a 3rd party C extension, try to figure out inside +which C extension it happens, and add a note to the issue to report the issue to that C extension, and set the status to Third Party's Issue. == Non-Bug reports @@ -75,6 +74,6 @@ improvements) or Misc. This change can be made without adding a note. There are many issues that are stale, with no updates in months or even years. For stale issues in Feedback state, where the feedback has not been received, you can change the status to Closed without adding a note. For stale issues -in Assigned state, you can reach out the assignee and see if they can update +in Assigned state, you can reach out to the assignee and see if they can update the issue. If the assignee is no longer an active committer, remove them as the assignee and change the status to Open. diff --git a/doc/contributing/building_ruby.md b/doc/contributing/building_ruby.md new file mode 100644 index 0000000000..286bd1f116 --- /dev/null +++ b/doc/contributing/building_ruby.md @@ -0,0 +1,355 @@ +# Building Ruby + +## Dependencies + +1. Install the prerequisite dependencies for building the CRuby interpreter: + + * C compiler + + For RubyGems, you will also need: + + * [OpenSSL] 1.1.x or 3.0.x / [LibreSSL] + * [libyaml] 0.1.7 or later + * [zlib] + + If you want to build from the git repository, you will also need: + + * [autoconf] - 2.67 or later + * [gperf] - 3.1 or later + * Usually unneeded; only if you edit some source files using gperf + * ruby - 3.1 or later + * We can upgrade this version to system ruby version of the latest + Ubuntu LTS. + * git - 2.32 or later + * Anterior versions may work; 2.32 or later will prevent build + errors in case your system `.gitconfig` uses `$HOME` paths. 
+ +2. Install optional, recommended dependencies: + + * [libffi] (to build fiddle) + * [gmp] (if you wish to accelerate Bignum operations) + * [rustc] - 1.58.0 or later, if you wish to build + [YJIT](rdoc-ref:RubyVM::YJIT). + + If you want to link the libraries (e.g., gmp) installed into other than + the OS default place, typically using Homebrew on macOS, pass the + `--with-opt-dir` (or `--with-gmp-dir` for gmp) option to `configure`. + + ```sh + configure --with-opt-dir=$(brew --prefix gmp):$(brew --prefix jemalloc) + ``` + + As for the libraries needed for particular extensions only and not for + Ruby (openssl, readline, libyaml, zlib), you can add `--with-EXTLIB-dir` + options to the command line or to `CONFIGURE_ARGS` environment variable. + The command line options will be embedded in `rbconfig.rb`, while the + latter environment variable is not embedded and is only used when + building the extension libraries. + + ```sh + export CONFIGURE_ARGS="" + for ext in openssl readline libyaml zlib; do + CONFIGURE_ARGS="${CONFIGURE_ARGS} --with-$ext-dir=$(brew --prefix $ext)" + done + ``` + +[OpenSSL]: https://www.openssl.org +[LibreSSL]: https://www.libressl.org +[libyaml]: https://github.com/yaml/libyaml/ +[zlib]: https://www.zlib.net +[autoconf]: https://www.gnu.org/software/autoconf/ +[gperf]: https://www.gnu.org/software/gperf/ +[libffi]: https://sourceware.org/libffi/ +[gmp]: https://gmplib.org +[rustc]: https://www.rust-lang.org + +## Quick start guide + +1. Download ruby source code: + + Select one of the below. + + 1. Build from the tarball: + + Download the latest tarball from [Download Ruby] page and extract + it. Example for Ruby 3.0.2: + + ```sh + tar -xzf ruby-3.0.2.tar.gz + cd ruby-3.0.2 + ``` + + 2. Build from the git repository: + + Checkout the CRuby source code: + + ```sh + git clone https://github.com/ruby/ruby.git + cd ruby + ``` + + Run the GNU Autoconf script (which generates the `configure` script): + + ```sh + ./autogen.sh + ``` + +2. 
Create a `build` directory inside the repository directory: + + ```sh + mkdir build && cd build + ``` + + While it's not necessary to build in a dedicated directory like this, it's good + practice to do so. + +3. We'll eventually install our new Ruby in `~/.rubies/ruby-master`, so we'll create that directory: + + ```sh + mkdir ~/.rubies + ``` + +4. Run the `configure` script (which generates the `Makefile`): + + ```sh + ../configure --prefix="${HOME}/.rubies/ruby-master" + ``` + + - Passing `-C` (or `--config-cache`) also reduces the time `configure` takes on + subsequent runs. + +5. Build Ruby: + + ```sh + make + ``` + +6. [Run tests](testing_ruby.md) to confirm your build succeeded. + +7. Install our newly-compiled Ruby into `~/.rubies/ruby-master`: + + ```sh + make install + ``` + + - If you need to run `make install` with `sudo` and want to avoid document + generation with different permissions, you can use `make SUDO=sudo + install`. + +8. You can then try your new Ruby out, for example: + + ```sh + ~/.rubies/ruby-master/bin/ruby -e "puts 'Hello, World!'" + ``` + +By the end, your repo will look like this: + +```text +ruby +├── autogen.sh # Pre-existing Autoconf script, used in step 1 +├── configure # Generated in step 1, which generates the `Makefile` in step 4 +├── build # Created in step 2 and populated in step 4 +│ ├── GNUmakefile # Generated by `../configure` +│ ├── Makefile # Generated by `../configure` +│ ├── object.o # Compiled object file, built by `make` +│ └── ... other compiled `.o` object files +│ +│ # Other interesting files: +├── include +│ └── ruby.h # The main public header +├── internal +│ ├── object.h +│ └── ... other header files used by the `.c` files in the repo root. +├── lib +│ └── # Default gems, like `bundler`, `erb`, `set`, `yaml`, etc. +├── spec +│ └── # A mirror of the Ruby specification from github.com/ruby/spec +├── test +│ ├── ruby +│ └── ... +├── object.c +└── ... 
other `.c` files +``` + +[Download Ruby]: https://www.ruby-lang.org/en/downloads/ + +### Unexplainable Build Errors + +If you are having unexplainable build errors, after saving all your work, try +running `git clean -xfd` in the source root to remove all git ignored local +files. If you are working from a source directory that's been updated several +times, you may have temporary build artifacts from previous releases which can +cause build failures. + +## Building on Windows + +The documentation for building on Windows can be found in [the separated +file](../distribution/windows.md). + +## More details + +If you're interested in continuing development on Ruby, here are more details +about Ruby's build to help out. + +### Running make scripts in parallel + +In GNU make[^caution-gmake-3] and BSD make implementations, to run a specific make script in +parallel, pass the flag `-j<number of processes>`. For instance, to run tests +on 8 processes, use: + +```sh +make test-all -j8 +``` + +We can also set `MAKEFLAGS` to run _all_ `make` commands in parallel. + +Having the right `--jobs` flag will ensure all processors are utilized when +building software projects. To do this effectively, you can set `MAKEFLAGS` in +your shell configuration/profile: + +```sh +# On macOS with Fish shell: +export MAKEFLAGS="--jobs "(sysctl -n hw.ncpu) + +# On macOS with Bash/ZSH shell: +export MAKEFLAGS="--jobs $(sysctl -n hw.ncpu)" + +# On Linux with Fish shell: +export MAKEFLAGS="--jobs "(nproc) + +# On Linux with Bash/ZSH shell: +export MAKEFLAGS="--jobs $(nproc)" +``` + +[^caution-gmake-3]: **CAUTION**: GNU make 3 is missing some features for parallel execution, we +recommend to upgrade to GNU make 4 or later. + +### Miniruby vs Ruby + +Miniruby is a version of Ruby which has no external dependencies and lacks +certain features. It can be useful in Ruby development because it allows for +faster build times. Miniruby is built before Ruby. 
A functional Miniruby is +required to build Ruby. To build Miniruby: + +```sh +make miniruby +``` + +## Debugging + +You can use either lldb or gdb for debugging. Before debugging, you need to +create a `test.rb` with the Ruby script you'd like to run. You can use the +following make targets: + +* `make run`: Runs `test.rb` using Miniruby +* `make lldb`: Runs `test.rb` using Miniruby in lldb +* `make gdb`: Runs `test.rb` using Miniruby in gdb +* `make runruby`: Runs `test.rb` using Ruby +* `make lldb-ruby`: Runs `test.rb` using Ruby in lldb +* `make gdb-ruby`: Runs `test.rb` using Ruby in gdb + +For VS Code users, you can set up editor-based debugging experience by running: + +```shell +cp -r misc/.vscode .vscode +``` + +This will add launch configurations for debugging Ruby itself by running `test.rb` with `lldb`. + +**Note**: if you build Ruby under the `./build` folder, you'll need to update `.vscode/launch.json`'s program entry accordingly to: `"${workspaceFolder}/build/ruby"` + +### Compiling for Debugging + +You can compile Ruby with the `RUBY_DEBUG` macro to enable debugging on some +features. One example is debugging object shapes in Ruby with +`RubyVM::Shape.of(object)`. + +Additionally Ruby can be compiled to support the `RUBY_DEBUG` environment +variable to enable debugging on some features. An example is using +`RUBY_DEBUG=gc_stress` to debug GC-related issues. + +There is also support for the `RUBY_DEBUG_LOG` environment variable to log a +lot of information about what the VM is doing, via the `USE_RUBY_DEBUG_LOG` +macro. + +You should also configure Ruby without optimization and other flags that may +interfere with debugging by changing the optimization flags. + +Bringing it all together: + +```sh +./configure cppflags="-DRUBY_DEBUG=1 -DUSE_RUBY_DEBUG_LOG=1" --enable-debug-env optflags="-O0 -fno-omit-frame-pointer" +``` + +### Building with Address Sanitizer + +Using the address sanitizer (ASAN) is a great way to detect memory issues. 
It +can detect memory safety issues in Ruby itself, and also in any C extensions +compiled with and loaded into a Ruby compiled with ASAN. + +```sh +./autogen.sh +mkdir build && cd build +../configure CC=clang-18 cflags="-fsanitize=address -fno-omit-frame-pointer -DUSE_MN_THREADS=0" # and any other options you might like +make +``` + +The compiled Ruby will now automatically crash with a report and a backtrace +if ASAN detects a memory safety issue. To run Ruby's test suite under ASAN, +issue the following command. Note that this will take quite a long time (over +two hours on my laptop); the `RUBY_TEST_TIMEOUT_SCALE` and +`SYNTAX_SUGGEST_TIMEOUT` variables are required to make sure tests don't +spuriously fail with timeouts when in fact they're just slow. + +```sh +RUBY_TEST_TIMEOUT_SCALE=5 SYNTAX_SUGGEST_TIMEOUT=600 make check +``` + +Please note, however, the following caveats! + +* Due to [Bug #20243], Clang generates code for thread-local variables which + doesn't work with M:N threading. Thus, it's necessary to disable M:N + threading support at build time for now (with the `-DUSE_MN_THREADS=0` + configure argument). +* ASAN will only work when using Clang version 18 or later - it requires + [llvm/llvm-project#75290] related to multithreaded `fork`. +* ASAN has only been tested so far with Clang on Linux. It may or may not work + with other compilers or on other platforms - please file an issue on + [Ruby Issue Tracking System] if you run into problems with such configurations + (or, to report that they actually work properly!) +* In particular, although I have not yet tried it, I have reason to believe + ASAN will _not_ work properly on macOS yet - the fix for the multithreaded + fork issue was actually reverted for macOS (see [llvm/llvm-project#75659]). + Please open an issue on [Ruby Issue Tracking System] if this is a problem for + you. 
+ +[Revision 9d0a5148]: https://bugs.ruby-lang.org/projects/ruby-master/repository/git/revisions/9d0a5148ae062a0481a4a18fbeb9cfd01dc10428 +[Bug #20243]: https://bugs.ruby-lang.org/issues/20243 +[llvm/llvm-project#75290]: https://github.com/llvm/llvm-project/pull/75290 +[llvm/llvm-project#75659]: https://github.com/llvm/llvm-project/pull/75659#issuecomment-1861584777 +[Ruby Issue Tracking System]: https://bugs.ruby-lang.org + +## How to measure coverage of C and Ruby code + +You need to be able to use gcc (gcov) and lcov visualizer. + +```sh +./autogen.sh +./configure --enable-gcov +make +make update-coverage +rm -f test-coverage.dat +make test-all COVERAGE=true +make lcov +open lcov-out/index.html +``` + +If you need only C code coverage, you can remove `COVERAGE=true` from the +above process. You can also use `gcov` command directly to get per-file +coverage. + +If you need only Ruby code coverage, you can remove `--enable-gcov`. Note +that `test-coverage.dat` accumulates all runs of `make test-all`. Make sure +that you remove the file if you want to measure one test run. + +You can see the coverage result of CI: https://rubyci.org/coverage diff --git a/doc/contributing/concurrency_guide.md b/doc/contributing/concurrency_guide.md new file mode 100644 index 0000000000..1fb58f7203 --- /dev/null +++ b/doc/contributing/concurrency_guide.md @@ -0,0 +1,154 @@ +# Concurrency Guide + +This is a guide to thinking about concurrency in the cruby source code, whether that's contributing to Ruby +by writing C or by contributing to one of the JITs. This does not touch on native extensions, only the core +language. It will go over: + +* What needs synchronizing? +* How to use the VM lock, and what you can and can't do when you've acquired this lock. +* What you can and can't do when you've acquired other native locks. +* The difference between the VM lock and the GVL. +* What a VM barrier is and when to use it. +* The lock ordering of some important locks. 
+* How ruby interrupt handling works. +* The timer thread and what it's responsible for. + +## What needs synchronizing? + +Before ractors, only one ruby thread could run at once. That didn't mean you could forget about concurrency issues, though. The timer thread +is a native thread that interacts with other ruby threads and changes some VM internals, so if these changes can be done in parallel by both the timer +thread and a ruby thread, they need to be synchronized. + +When you add ractors to the mix, it gets more complicated. However, ractors allow you to forget about synchronization for non-shareable objects because +they aren't used across ractors. Only one ruby thread can touch the object at once. For shareable objects, they are deeply frozen so there isn't any +mutation on the objects themselves. However, something like reading/writing constants across ractors does need to be synchronized. In this case, ruby threads need to +see a consistent view of the VM. If publishing the update takes 2 steps or even two separate instructions, like in this case, synchronization is required. + +Most synchronization is to protect VM internals. These internals include structures for the thread scheduler on each ractor, the global ractor scheduler, the +coordination between ruby threads and ractors, global tables (for `fstrings`, encodings, symbols and global vars), etc. Anything that can be mutated by a ractor +that can also be read or mutated by another ractor at the same time requires proper synchronization. + +## The VM Lock + +There's only one VM lock and it is for critical sections that can only be entered by one ractor at a time. +Without ractors, the VM lock is useless. It does not stop all ractors from running, as ractors can run +without trying to acquire this lock. If you're updating global (shared) data between ractors and aren't using +atomics, you need to use a lock and this is a convenient one to use. 
Unlike other locks, you can allocate ruby-managed +memory with it held. When you take the VM lock, there are things you can and can't do during your critical section: + +You can (as long as no other locks are also held before the VM lock): + +* Create ruby objects, call `ruby_xmalloc`, etc. + +You can't: + +* Context switch to another ruby thread or ractor. This is important, as many things can cause ruby-level context switches including: + + * Calling any ruby method through, for example, `rb_funcall`. If you execute ruby code, a context switch could happen. + This also applies to ruby methods defined in C, as they can be redefined in Ruby. Things that call ruby methods such as + `rb_obj_respond_to` are also disallowed. + + * Calling `rb_raise`. This will call `initialize` on the new exception object. With the VM lock + held, nothing you call should be able to raise an exception. `NoMemoryError` is allowed, however. + + * Calling `rb_nogvl` or a ruby-level mechanism that can context switch like `rb_mutex_lock`. + + * Enter any blocking operation managed by ruby. This will context switch to another ruby thread using `rb_nogvl` or + something equivalent. A blocking operation is one that blocks the thread's progress, such as `sleep` or `IO#read`. + +Internally, the VM lock is the `vm->ractor.sync.lock`. + +You need to be on a ruby thread to take the VM lock. You also can't take it inside any functions that could be called during sweeping, as MMTK sweeps +on another thread and you need a valid `ec` to grab the lock. For this same reason (among others), you can't take it from the timer thread either. + +## Other Locks + +All native locks that aren't the VM lock share a more strict set of rules for what's allowed during the critical section. By native locks, we mean +anything that uses `rb_native_mutex_lock`. 
Some important locks include the `interrupt_lock`, the ractor scheduling lock (protects global scheduling data structures), +the thread scheduling lock (local to each ractor, protects per-ractor scheduling data structures) and the ractor lock (local to each ractor, protects ractor data structures). + +When you acquire one of these locks, + +You can: + +* Allocate memory through non-ruby allocation such as raw `malloc` or the standard library. But be careful, some functions like `strdup` use +ruby allocation through the use of macros! + +* Use `ccan` lists, as they don't allocate. + +* Do the usual things like set variables or struct fields, manipulate linked lists, signal condition variables, etc. + +You can't: + +* Allocate ruby-managed memory. This includes creating ruby objects or using `ruby_xmalloc` or `st_insert`. This is disallowed +because if that allocation causes a GC, all other ruby threads must join a VM barrier as soon as possible +(when they next check interrupts or acquire the VM lock). This is so that no other ractors are running during GC. If a ruby thread +is waiting (blocked) on this same native lock, it can't join the barrier and a deadlock occurs because the barrier will never finish. + +* Raise exceptions. You also can't use `EC_JUMP_TAG` if it jumps out of the critical section. + +* Context switch. See the `VM Lock` section for more info. + +## Difference Between VM Lock and GVL + +The VM Lock is a particular lock in the source code. There is only one VM Lock. The GVL, on the other hand, is more of a combination of locks. +It is "acquired" when a ruby thread is about to run or is running. Since many ruby threads can run at the same time if they're in different ractors, +there are many GVLs (1 per `SNT` + 1 for the main ractor). It can no longer be thought of as a "Global VM Lock" like it once was before ractors. + +## VM Barriers + +Sometimes, taking the VM Lock isn't enough and you need a guarantee that all ractors have stopped. 
This happens when running `GC`, for instance. +To get a barrier, you take the VM Lock and call `rb_vm_barrier()`. For the duration that the VM lock is held, no other ractors will be running. It's not used +often as taking a barrier slows ractor performance down considerably, but it's useful to know about and is sometimes the only solution. + +## Lock Orderings + +It's a good idea to not hold more than 2 locks at once on the same thread. Locking multiple locks can introduce deadlocks, so do it with care. When locking +multiple locks at once, follow an ordering that is consistent across the program, otherwise you can introduce deadlocks. Here are the orderings of some important locks: + +* VM lock before ractor_sched_lock +* thread_sched_lock before ractor_sched_lock +* interrupt_lock before timer_th.waiting_lock +* timer_th.waiting_lock before ractor_sched_lock + +These orderings are subject to change, so check the source if you're not sure. On top of this: + +* During each `ubf` (unblock) function, the VM lock can be taken around it in some circumstances. This happens during VM shutdown, for example. +See the "Interrupt Handling" section for more details. + +## Ruby Interrupt Handling + +When the VM runs ruby code, ruby's threads intermittently check ruby-level interrupts. These software interrupts +are for various things in ruby and they can be set by other ruby threads or the timer thread. + +* Ruby threads check when they should give up their timeslice. The native thread switches to another ruby thread when their time is up. +* The timer thread sends a "trap" interrupt to the main thread if any ruby-level signal handlers are pending. +* Ruby threads can have other ruby threads run tasks for them by sending them an interrupt. For instance, ractors send +the main thread an interrupt when they need to `require` a file so that it's done on the main thread. They wait for the +main thread's result. 
+* During VM shutdown, a "terminate" interrupt is sent to all ractor main threads to stop them as soon as possible. +* When calling `Thread#raise`, the caller sends an interrupt to that thread telling it which exception to raise. +* Unlocking a mutex sends the next waiter (if any) an interrupt telling it to grab the lock. +* Signalling or broadcasting on a condition variable tells the waiter(s) to wake up. + +This isn't a complete list. + +When sending an interrupt to a ruby thread, the ruby thread can be blocked. For example, it could be in the middle of a `TCPSocket#read` call. If so, +the receiving thread's `ubf` (unblock function) gets called from the thread (ruby thread or timer thread) that sent the interrupt. +Each ruby thread has a `ubf` that is set when it enters a blocking operation and is unset after returning from it. By default, this `ubf` function sends a +`SIGVTALRM` to the receiving thread to try to unblock it from the kernel so it can check its interrupts. There are other `ubfs` that +aren't associated with a syscall, such as when calling `Ractor#join` or `sleep`. All `ubfs` are called with the `interrupt_lock` held, +so take that into account when using locks inside `ubfs`. + +Remember, `ubfs` can be called from the timer thread so you cannot assume an `ec` inside them. The `ec` (execution context) is only set on ruby threads. + +## The Timer Thread + +The timer thread has a few functions. They are: + +* Send interrupts to ruby threads that have run for their whole timeslice. +* Wake up M:N ruby threads (threads in non-main ractors) blocked on IO or after a specified timeout. This +uses `kqueue` or `epoll`, depending on the OS, to receive IO events on behalf of the threads. +* Continue sending the `SIGVTALRM` signal if a thread is still blocked on a syscall after the first `ubf` call. +* Signal native threads (`SNT`) waiting on a ractor if there are ractors waiting in the global run queue. 
+* Create more `SNT`s if some are blocked, like on IO or on `Ractor#join`. diff --git a/doc/contributing/contributing.md b/doc/contributing/contributing.md new file mode 100644 index 0000000000..a2ed00ab90 --- /dev/null +++ b/doc/contributing/contributing.md @@ -0,0 +1,35 @@ +# Contributing to Ruby + +## Ruby Issues + +To report an issue in the Ruby core: + +* [Report issues](reporting_issues.md). + +## Ruby Core + +To contribute to the Ruby core functionality, +you'll need initially to: + +* [Build Ruby](building_ruby.md) on your system. +* [Test Ruby](testing_ruby.md), to make sure the build is correct. + +Then: + +* [Make changes to Ruby](making_changes_to_ruby.md). + +And possibly: + +* [Benchmark Ruby](https://github.com/ruby/ruby/tree/master/benchmark#make-benchmark). + +## Ruby Documentation + +To contribute to the Ruby core documentation, see: + +* [Making changes to the Ruby documentation](documentation_guide.md). + +## Ruby Standard Library + +To contribute to the Ruby Standard Library, see: + +* [Making changes to the Ruby Standard Library](making_changes_to_stdlibs.md). diff --git a/doc/contributing/documentation_guide.md b/doc/contributing/documentation_guide.md new file mode 100644 index 0000000000..4b1e2ac9ad --- /dev/null +++ b/doc/contributing/documentation_guide.md @@ -0,0 +1,623 @@ +# Documentation Guide + +This guide discusses recommendations for documenting +classes, modules, and methods +in the Ruby core and in the Ruby standard library. + +## Generating documentation + +Most Ruby documentation lives in the source files and is written in +[RDoc format](https://ruby.github.io/rdoc/RDoc/MarkupReference.html). + +Some pages live under the `doc` folder and can be written in either +`.rdoc` or `.md` format, determined by the file extension. 
+ +To generate the output of documentation changes in HTML in the +`{build folder}/.ext/html` directory, run the following inside your +build directory: + +```sh +make html +``` + +If you don't have a build directory, follow the [quick start +guide](building_ruby.md#label-Quick+start+guide) up to step 4. + +Then you can preview your changes by opening +`{build folder}/.ext/html/index.html` file in your browser. + +## Goal + +The goal of Ruby documentation is to impart the most important +and relevant information in the shortest time. +The reader should be able to quickly understand the usefulness +of the subject code and how to use it. + +Providing too little information is bad, but providing unimportant +information or unnecessary examples is not good either. +Use your judgment about what the user needs to know. + +## General Guidelines + +- Keep in mind that the reader may not be fluent in \English. +- Write short declarative or imperative sentences. +- Group sentences into (ideally short) paragraphs, + each covering a single topic. +- Organize material with + [headings]. +- Refer to authoritative and relevant sources using + [links](https://ruby.github.io/rdoc/RDoc/MarkupReference.html#class-RDoc::MarkupReference-label-Links). +- Use simple verb tenses: simple present, simple past, simple future. +- Use simple sentence structure, not compound or complex structure. +- Avoid: + - Excessive comma-separated phrases; consider a [list]. + - Idioms and culture-specific references. + - Overuse of headings. + - Using US-ASCII-incompatible characters in C source files; + see [Characters](#label-Characters) below. + +### Characters + +Use only US-ASCII-compatible characters in a C source file. +(If you use other characters, the Ruby CI will gently let you know.) 
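Before pushing, a quick local check can find offending lines. The following is a minimal sketch, not part of the Ruby tree: the helper name `non_ascii_lines` is hypothetical, and it only relies on `String#ascii_only?`.

```ruby
# Hypothetical helper (an assumption for illustration, not a Ruby CI tool):
# returns the 1-based line numbers of lines that contain non-US-ASCII bytes.
def non_ascii_lines(source)
  source.each_line.with_index(1).filter_map do |line, lineno|
    lineno unless line.ascii_only?
  end
end

# The sample text uses an escape so that this example itself stays ASCII-only.
sample = "int a;\n/* caf\u00e9 */\nint b;\n"
non_ascii_lines(sample) # => [2]
```

To check a real file, pass it the result of `File.read` on the C source path.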
+ +If you want to put ASCII-incompatible characters into the documentation +for a C-coded class, module, or method, there are workarounds +involving new files `doc/*.rdoc`: + +- For class `Foo` (defined in file `foo.c`), + create file `doc/foo.rdoc`, declare `class Foo; end`, + and place the class documentation above that declaration: + + ```ruby + # Documentation for class Foo goes here. + class Foo; end + ``` + +- Similarly, for module `Bar` (defined in file `bar.c`), + create file `doc/bar.rdoc`, declare `module Bar; end`, + and place the module documentation above that declaration: + + ```ruby + # Documentation for module Bar goes here. + module Bar; end + ``` + +- For a method, things are different. + Documenting a method as above disables the "click to toggle source" feature + in the rendered documentation. + + Therefore it's best to use file inclusion: + + - Retain the `call-seq` in the C code. + - Use file inclusion (`:include:`) to include text from an .rdoc file. + + Example: + + ```c + /* + * call-seq: + * each_byte {|byte| ... } -> self + * each_byte -> enumerator + * + * :include: doc/string/each_byte.rdoc + * + */ + ``` + +### \RDoc + +Ruby is documented using RDoc. +For information on \RDoc syntax and features, see the +[RDoc Markup Reference](https://ruby.github.io/rdoc/RDoc/MarkupReference.html). + +### Output from `irb` + +For code examples, consider using interactive Ruby, +[irb](https://ruby-doc.org/stdlib/libdoc/irb/rdoc/IRB.html). + +For a code example that includes `irb` output, +consider aligning `# => ...` in successive lines. +Alignment may sometimes aid readability: + +```ruby +a = [1, 2, 3] #=> [1, 2, 3] +a.shuffle! #=> [2, 3, 1] +a #=> [2, 3, 1] +``` + +### Headings + +Organize a long discussion for a class or module with [headings]. + +Do not use formal headings in the documentation for a method or constant. 
+ +In the rare case where heading-like structures are needed +within the documentation for a method or constant, use +[bold text](https://ruby.github.io/rdoc/RDoc/MarkupReference.html#class-RDoc::MarkupReference-label-Bold) +as pseudo-headings. + +### Blank Lines + +A blank line begins a new paragraph. + +A [code block](https://ruby.github.io/rdoc/RDoc/MarkupReference.html#class-RDoc::MarkupReference-label-Code+Blocks) +or [list] should be preceded by and followed by a blank line. +This is unnecessary for the HTML output, but helps in the `ri` output. + +### \Method Names + +For a method name in text: + +- For a method in the current class or module, + use a double-colon for a singleton method, + or a hash mark for an instance method: + <tt>::bar</tt>, <tt>#baz</tt>. +- Otherwise, include the class or module name + and use a dot for a singleton method, + or a hash mark for an instance method: + <tt>Foo.bar</tt>, <tt>Foo#baz</tt>. + +### Embedded Code and Commands + +Code or commands embedded in running text (i.e., not in a code block) +should be marked up as +[monofont]. + +Code that is a simple string should include the quote marks. + +### Auto-Linking + +Most often, the name of a class, module, or method +is auto-linked: + +```rdoc +- Float. +- Enumerable. +- File.new +- File#read. +``` + +renders as: + +> - Float. +> - Enumerable. +> - File.new +> - File#read. + +In general, \RDoc's auto-linking should not be suppressed. +For example, we should write just plain _Float_ (which is auto-linked): + +```rdoc +Returns a Float. +``` + +which renders as: + +> Returns a Float. + +However, _do_ suppress auto-linking when the word in question +does not refer to a Ruby entity (e.g., some uses of _Class_ or _English_): + +```rdoc +Class variables can be tricky. +``` + +renders as: + +> Class variables can be tricky. + +Also, _do_ suppress auto-linking when the word in question +refers to the current document +(e.g., _Float_ in the documentation for class Float). 
+ +In this case you may consider forcing the name to +[monofont], +which suppresses auto-linking, and also emphasizes that the word is a class name: + +```rdoc +A +Float+ object represents .... +``` + +renders as: + +> A `Float` object represents .... + +For a _very_ few, _very_ often-discussed classes, +you might consider avoiding the capitalized class name altogether. +For example, for some mentions of arrays, +you might write simply the lowercase _array_. + +Instead of: + +```rdoc +For an empty Array, .... +``` + +which renders as: + +> For an empty Array, .... + +you might write: + +```rdoc +For an empty array, .... +``` + +which renders as: + +> For an empty array, .... + +This more casual usage avoids both auto-linking and distracting font changes, +and is unlikely to cause confusion. + +This principle may be usefully applied, in particular, for: + +- An array. +- An integer. +- A hash. +- A string. + +However, it should be applied _only_ when referring to an _instance_ of the class, +and _never_ when referring to the class itself. + +### Explicit Links + +When writing an explicit link, follow these guidelines. + +#### `rdoc-ref` Scheme + +Use the `rdoc-ref` scheme for: + +- A link in core documentation to other core documentation. +- A link in core documentation to documentation in a standard library package. +- A link in a standard library package to other documentation in that same + standard library package. + +See section "`rdoc-ref` Scheme" in [links]. + +#### URL-Based Link + +Use a full URL-based link for: + +- A link in standard library documentation to documentation in the core. +- A link in standard library documentation to documentation in a different + standard library package. + +Doing so ensures that the link will be valid even when the package documentation +is built independently (separately from the core documentation). + +The link should lead to a target in https://docs.ruby-lang.org/en/master/. 
+ +Also use a full URL-based link for a link to an off-site document. + +### Variable Names + +The name of a variable (as specified in its call-seq) should be marked up as +[monofont]. + +Also, use monofont text for the name of a transient variable +(i.e., one defined and used only in the discussion, such as `n`). + +### HTML Tags + +In general, avoid using HTML tags (even in formats where it's allowed) +because `ri` (the Ruby Interactive reference tool) +may not render them properly. + +### Tables + +In particular, avoid building tables with HTML tags +(<tt><table></tt>, etc.). + +Alternatives: + +- A {verbatim text block}[https://ruby.github.io/rdoc/RDoc/MarkupReference.html#class-RDoc::MarkupReference-label-Verbatim+Text+Blocks], + using spaces and punctuation to format the text; + note that {text markup}[https://ruby.github.io/rdoc/RDoc/MarkupReference.html#class-RDoc::MarkupReference-label-Text+Markup] + will not be honored: + + - Example {source}[https://github.com/ruby/ruby/blob/34d802f32f00df1ac0220b62f72605827c16bad8/file.c#L6570-L6596]. + - Corresponding {output}[rdoc-ref:File@Read-2FWrite+Mode]. + +- (Markdown format only): A {GitHub Flavored Markdown (GFM) table}[https://github.github.com/gfm/#tables-extension-], + using special formatting for the text: + + - Example {source}[https://github.com/ruby/ruby/blob/34d802f32f00df1ac0220b62f72605827c16bad8/doc/contributing/glossary.md?plain=1]. + - Corresponding {output}[https://docs.ruby-lang.org/en/master/contributing/glossary_md.html]. + +### Languages in Examples + +For symbols and strings in documentation examples: + +- Prefer \English in \English documentation: <tt>'Hello'</tt>. +- Prefer Japanese in Japanese documentation: <tt>'こんにちは'</tt>. +- If a second language is needed (as, for example, characters with different byte-sizes), + prefer Japanese in \English documentation and \English in Japanese documentation. +- Use examples in other languages only as necessary: see String#capitalize. 
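The "different byte-sizes" case mentioned in the list can be illustrated with a short sketch. It assumes UTF-8 source encoding (Ruby's default); the helper name `char_and_byte_counts` is hypothetical and exists only for this illustration.

```ruby
# Hypothetical helper for illustration: pairs the character count
# with the byte count of a string.
def char_and_byte_counts(str)
  [str.length, str.bytesize]
end

# Assumes UTF-8 source encoding, where each of these Japanese
# characters occupies three bytes.
char_and_byte_counts('Hello')      # => [5, 5]
char_and_byte_counts('こんにちは') # => [5, 15]
```

Both strings have five characters, so the pair makes the difference between characters and bytes easy to see.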
+ +## Documenting Classes and Modules + +The general structure of the class or module documentation should be: + +- Synopsis +- Common uses, with examples +- "What's Here" summary (optional) + +### Synopsis + +The synopsis is a short description of what the class or module does +and why the reader might want to use it. +Avoid details in the synopsis. + +### Common Uses + +Show common uses of the class or module. +Depending on the class or module, this section may vary greatly +in both length and complexity. + +### What's Here Summary + +The documentation for a class or module may include a "What's Here" section. + +Guidelines: + +- The section title is `What's Here`. +- Consider listing the parent class and any included modules; consider + [links] to their "What's Here" sections if those exist. +- All methods mentioned in the left-pane table of contents + should be listed (including any methods extended from another class). +- Attributes (which are not included in the TOC) may also be listed. +- Display methods as items in one or more bullet lists: + + - Begin each item with the method name, followed by a colon + and a short description. + - If the method has aliases, mention them in parentheses before the colon + (and do not list the aliases separately). + - Check the rendered documentation to determine whether \RDoc has recognized + the method and linked to it; if not, manually insert a + [link](https://ruby.github.io/rdoc/RDoc/MarkupReference.html#class-RDoc::MarkupReference-label-Links). + +- If there are numerous entries, consider grouping them into subsections with headings. +- If there are more than a few such subsections, + consider adding a table of contents just below the main section title. + +## Documenting Methods + +### General Structure + +The general structure of the method documentation should be: + +- Calling sequence (for methods written in C). +- Synopsis (short description). +- In-brief examples (optional) +- Details and examples. 
+- Argument description (if necessary). +- Corner cases and exceptions. +- Related methods (optional). + +### Calling Sequence (for methods written in C) + +For methods written in Ruby, \RDoc documents the calling sequence automatically. + +For methods written in C, \RDoc cannot determine what arguments +the method accepts, so those need to be documented using \RDoc directive +[`call-seq:`](https://ruby.github.io/rdoc/RDoc/MarkupReference.html#class-RDoc::MarkupReference-label-Directives+for+Method+Documentation). + +For a singleton method, use the form: + +```rdoc +class_name.method_name(method_args) {|block_args| ... } -> return_type +``` + +Example: + +```rdoc +* call-seq: +* Hash.new(default_value = nil) -> new_hash +* Hash.new {|hash, key| ... } -> new_hash +``` + +For an instance method, use the form +(omitting any prefix, just as RDoc does for a Ruby-coded method): + +```rdoc +method_name(method_args) {|block_args| ... } -> return_type +``` + +For example, in Array, use: + +```rdoc +* call-seq: +* count -> integer +* count(obj) -> integer +* count {|element| ... } -> integer +``` + +```rdoc +* call-seq: +* <=> other -> -1, 0, 1, or nil +``` + +For a binary-operator style method (e.g., Array#&), +cite `self` in the call-seq (not, e.g., `array` or `receiver`): + +```rdoc +* call-seq: +* self & other_array -> new_array +``` + +Arguments: + +- If the method does not accept arguments, omit the parentheses. +- If the method accepts optional arguments: + + - Separate each argument name and its default value with ` = ` + (equal-sign with surrounding spaces). + - If the method has the same behavior with either an omitted + or an explicit argument, use a `call-seq` with optional arguments. + For example, use: + + ```rdoc + * call-seq: + * respond_to?(symbol, include_all = false) -> true or false + ``` + + - If the behavior is different with an omitted or an explicit argument, + use a `call-seq` with separate lines. 
+ For example, in Enumerable, use: + + ```rdoc + * call-seq: + * max -> element + * max(n) -> array + ``` + +Block: + +- If the method does not accept a block, omit the block. +- If the method accepts a block, the `call-seq` should have `{|args| ... }`, + not `{|args| block }` or `{|args| code }`. +- If the method accepts a block, but returns an Enumerator when the block is omitted, + the `call-seq` should show both forms: + + ```rdoc + * call-seq: + * array.select {|element| ... } -> new_array + * array.select -> new_enumerator + ``` + +Return types: + +- If the method can return multiple different types, + separate the types with "or" and, if necessary, commas. +- If the method can return multiple types, use `object`. +- If the method returns the receiver, use `self`. +- If the method returns an object of the same class, + prefix `new_` if and only if the object is not `self`; + example: `new_array`. + +Aliases: + +- Omit aliases from the `call-seq`, unless the alias is an + operator method. If listing both a regular method and an + operator method in the `call-seq`, explain in the details and + examples section when it is recommended to use the regular method + and when it is recommended to use the operator method. + +### Synopsis + +The synopsis comes next, and is a short description of what the +method does and why you would want to use it. Ideally, this +is a single sentence, but for more complex methods it may require +an entire paragraph. + +For `Array#count`, the synopsis is: + +> Returns a count of specified elements. + +This is great as it is short and descriptive. Avoid documenting +too much in the synopsis, stick to the most important information +for the benefit of the reader. + +### In-Brief Examples + +For a method whose documentation is lengthy, +consider adding an "in-brief" passage, +showing examples that summarize the method's uses. 
+ +The passage may answer some users' questions +(without their having to read long documentation); +see Array#[] and Array#[]=. + +### Details and Examples + +Most non-trivial methods benefit from examples, as well as details +beyond what is given in the synopsis. In the details and examples +section, you can document how the method handles different types +of arguments, and provide examples of proper usage. In this +section, focus on how to use the method properly, not on how the +method handles improper arguments or corner cases. + +Not every behavior of a method requires an example. If the method +is documented to return `self`, you don't need to provide an example +showing the return value is the same as the receiver. If the method +is documented to return `nil`, you don't need to provide an example +showing that it returns `nil`. If the details mention that for a +certain argument type, an empty array is returned, you don't need +to provide an example for that. + +Only add an example if it provides the user additional information; +do not add an example if it provides the same information given +in the synopsis or details. The purpose of examples is not to prove +what the details are stating. + +Many methods that can take an optional block call the block if it is given, +but return a new Enumerator if the block is not given; +in that case, do not provide an example, +but do state the fact (with the auto-linking uppercase Enumerator): + +```rdoc +* With no block given, returns a new Enumerator. +``` + +### Argument Description (if necessary) + +For methods that require arguments, if not obvious and not explicitly +mentioned in the details or implicitly shown in the examples, you can +provide details about the types of arguments supported. When discussing +the types of arguments, use simple language even if less-precise, such +as "level must be an integer", not "level must be an Integer-convertible +object". 
The vast majority of use will be with the expected type, not an +argument that is explicitly convertible to the expected type, and +documenting the difference is not important. + +For methods that take blocks, it can be useful to document the type of +argument passed if it is not obvious, not explicitly mentioned in the +details, and not implicitly shown in the examples. + +If there is more than one argument or block argument, use a +[labeled list](https://ruby.github.io/rdoc/RDoc/MarkupReference.html#class-RDoc::MarkupReference-label-Labeled+Lists). + +### Corner Cases and Exceptions + +For corner cases of methods, such as atypical usage, briefly mention +the behavior, but do not provide any examples. + +Only document exceptions raised if they are not obvious. For example, +if you have stated earlier that an argument type must be an integer, +you do not need to document that a `TypeError` is raised if a non-integer +is passed. Do not provide examples of exceptions being raised unless +that is a common case, such as `Hash#fetch` raising a `KeyError`. + +### Related Methods (optional) + +In some cases, it is useful to document which methods are related to +the current method. For example, documentation for `Hash#[]` might +mention `Hash#fetch` as a related method, and `Hash#merge` might mention +`Hash#merge!` as a related method. + +- Consider which methods may be related + to the current method, and if you think the reader would benefit from it, + at the end of the method documentation, add a line starting with + "Related: " (e.g. "Related: #fetch."). +- Don't list more than three related methods. + If you think more than three methods are related, + list the three you think are most important. +- Consider adding: + + - A phrase suggesting how the related method is similar to, + or different from, the current method. + See an example at Time#getutc. + - Example code that illustrates the similarities and differences. 
+ See examples at Time#ctime, Time#inspect, Time#to_s. + +### Methods Accepting Multiple Argument Types + +For methods that accept multiple argument types, in some cases it can +be useful to document the different argument types separately. It's +best to use a separate paragraph for each case you are discussing. + +[headings]: https://ruby.github.io/rdoc/RDoc/MarkupReference.html#class-RDoc::MarkupReference-label-Headings +[list]: https://ruby.github.io/rdoc/RDoc/MarkupReference.html#class-RDoc::MarkupReference-label-Lists +[monofont]: https://ruby.github.io/rdoc/RDoc/MarkupReference.html#class-RDoc::MarkupReference-label-Monofont diff --git a/doc/dtrace_probes.rdoc b/doc/contributing/dtrace_probes.rdoc index d2cdd56902..1b20597ab4 100644 --- a/doc/dtrace_probes.rdoc +++ b/doc/contributing/dtrace_probes.rdoc @@ -52,15 +52,21 @@ with when they are fired and the arguments they take: [ruby:::method-entry(classname, methodname, filename, lineno);] This probe is fired just before a method is entered. - classname name of the class (a string) - methodname name of the method about to be executed (a string) - filename the file name where the method is _being called_ (a string) - lineno the line number where the method is _being called_ (an int) + classname:: name of the class (a string) + methodname:: name of the method about to be executed (a string) + filename:: the file name where the method is _being called_ (a string) + lineno:: the line number where the method is _being called_ (an int) + + *NOTE*: will only be fired if tracing is enabled, e.g. with: <code>TracePoint.new{}.enable</code>. + See Feature#14104[https://bugs.ruby-lang.org/issues/14104] for more details. [ruby:::method-return(classname, methodname, filename, lineno);] This probe is fired just after a method has returned. The arguments are the same as "ruby:::method-entry". + *NOTE*: will only be fired if tracing is enabled, e.g. with: <code>TracePoint.new{}.enable</code>. 
+ See Feature#14104[https://bugs.ruby-lang.org/issues/14104] for more details. + [ruby:::cmethod-entry(classname, methodname, filename, lineno);] This probe is fired just before a C method is entered. The arguments are the same as "ruby:::method-entry". @@ -72,9 +78,9 @@ with when they are fired and the arguments they take: [ruby:::require-entry(requiredfile, filename, lineno);] This probe is fired on calls to rb_require_safe (when a file is required). - requiredfile is the name of the file to be required (string). - filename is the file that called "require" (string). - lineno is the line number where the call to require was made (int). + requiredfile:: the name of the file to be required (string). + filename:: the file that called "+require+" (string). + lineno:: the line number where the call to require was made (int). [ruby:::require-return(requiredfile, filename, lineno);] This probe is fired just before rb_require_safe (when a file is required) @@ -84,11 +90,11 @@ with when they are fired and the arguments they take: [ruby:::find-require-entry(requiredfile, filename, lineno);] This probe is fired right before search_required is called. search_required determines whether the file has already been required by searching loaded - features ($"), and if not, figures out which file must be loaded. + features (<code>$"</code>), and if not, figures out which file must be loaded. - requiredfile is the file to be required (string). - filename is the file that called "require" (string). - lineno is the line number where the call to require was made (int). + requiredfile:: the file to be required (string). + filename:: the file that called "require" (string). + lineno:: the line number where the call to require was made (int). [ruby:::find-require-return(requiredfile, filename, lineno);] This probe is fired right after search_required returns. 
See the @@ -106,56 +112,56 @@ with when they are fired and the arguments they take: [ruby:::raise(classname, filename, lineno);] This probe is fired when an exception is raised. - classname is the class name of the raised exception (string) - filename the name of the file where the exception was raised (string) - lineno the line number in the file where the exception was raised (int) + classname:: the class name of the raised exception (string) + filename:: the name of the file where the exception was raised (string) + lineno:: the line number in the file where the exception was raised (int) [ruby:::object-create(classname, filename, lineno);] This probe is fired when an object is about to be allocated. - classname the class of the allocated object (string) - filename the name of the file where the object is allocated (string) - lineno the line number in the file where the object is allocated (int) + classname:: the class of the allocated object (string) + filename:: the name of the file where the object is allocated (string) + lineno:: the line number in the file where the object is allocated (int) [ruby:::array-create(length, filename, lineno);] This probe is fired when an Array is about to be allocated. - length the size of the array (long) - filename the name of the file where the array is allocated (string) - lineno the line number in the file where the array is allocated (int) + length:: the size of the array (long) + filename:: the name of the file where the array is allocated (string) + lineno:: the line number in the file where the array is allocated (int) [ruby:::hash-create(length, filename, lineno);] This probe is fired when a Hash is about to be allocated. 
- length the size of the hash (long) - filename the name of the file where the hash is allocated (string) - lineno the line number in the file where the hash is allocated (int) + length:: the size of the hash (long) + filename:: the name of the file where the hash is allocated (string) + lineno:: the line number in the file where the hash is allocated (int) [ruby:::string-create(length, filename, lineno);] This probe is fired when a String is about to be allocated. - length the size of the string (long) - filename the name of the file where the string is allocated (string) - lineno the line number in the file where the string is allocated (int) + length:: the size of the string (long) + filename:: the name of the file where the string is allocated (string) + lineno:: the line number in the file where the string is allocated (int) [ruby:::symbol-create(str, filename, lineno);] This probe is fired when a Symbol is about to be allocated. - str the contents of the symbol (string) - filename the name of the file where the string is allocated (string) - lineno the line number in the file where the string is allocated (int) + str:: the contents of the symbol (string) + filename:: the name of the file where the string is allocated (string) + lineno:: the line number in the file where the string is allocated (int) [ruby:::parse-begin(sourcefile, lineno);] Fired just before parsing and compiling a source file. - sourcefile the file being parsed (string) - lineno the line number where the source starts (int) + sourcefile:: the file being parsed (string) + lineno:: the line number where the source starts (int) [ruby:::parse-end(sourcefile, lineno);] Fired just after parsing and compiling a source file. - sourcefile the file being parsed (string) - lineno the line number where the source ended (int) + sourcefile:: the file being parsed (string) + lineno:: the line number where the source ended (int) [ruby:::gc-mark-begin();] Fired at the beginning of a mark phase. 
@@ -172,7 +178,7 @@ with when they are fired and the arguments they take: [ruby:::method-cache-clear(class, sourcefile, lineno);] Fired when the method cache is cleared. - class is the classname being cleared, or "global" (string) - sourcefile the file being parsed (string) - lineno the line number where the source ended (int) + class:: the classname being cleared, or "global" (string) + sourcefile:: the file being parsed (string) + lineno:: the line number where the source ended (int) diff --git a/doc/contributing/glossary.md b/doc/contributing/glossary.md new file mode 100644 index 0000000000..3ec9796147 --- /dev/null +++ b/doc/contributing/glossary.md @@ -0,0 +1,48 @@ +# Ruby Internals Glossary + +Just a list of acronyms I've run across in the Ruby source code and their meanings. + +| Term | Definition | +| --- | -----------| +| `bmethod` | Method defined by `define_method() {}` (a Block that runs as a Method). | +| `BIN` | Basic Instruction Name. Used as a macro to reference the YARV instruction. Converts pop into YARVINSN_pop. | +| `bop` | Basic Operator. Relates to methods like `Integer` plus and minus which can be optimized as long as they haven't been redefined. | +| `cc` | Call Cache. An inline cache structure for the call site. Stored in the `cd`. | +| `cd` | Call Data. A data structure that points at the `ci` and the `cc`. `iseq` objects point at the `cd`, and access call information and call caches via this structure. | +| CFG | Control Flow Graph. Representation of the program where all control-flow and data dependencies have been made explicit by unrolling the stack and local variables. | +| `cfp`| Control Frame Pointer. Represents a Ruby stack frame. Calling a method pushes a new frame (cfp), returning pops a frame. Points at the `pc`, `sp`, `ep`, and the corresponding `iseq`| +| `ci` | Call Information. Refers to an `rb_callinfo` struct. 
Contains call information about the call site, including number of parameters to be passed, whether they are keyword arguments or not, etc. Used in conjunction with the `cc` and `cd`. | +| `cme` | Callable Method Entry. Refers to the `rb_callable_method_entry_t` struct, the internal representation of a Ruby method that has `defined_class` and `owner` set and is ready for dispatch. | +| `cref` | Class reference. A structure pointing to the class reference where `klass_or_self`, visibility scope, and refinements are stored. It also stores a pointer to the next class in the hierarchy referenced by `rb_cref_struct * next`. The Class reference is lexically scoped. | +| CRuby | Reference implementation of Ruby written in C | +| `cvar` | Class Variable. Refers to a Ruby class variable like `@@foo` | +| `dvar` | Dynamic Variable. Used by the parser to refer to local variables that are defined outside of the current lexical scope. For example `def foo; bar = 1; -> { p bar }; end` the "bar" inside the block is a `dvar` | +| `ec` | Execution Context. The top level VM context, points at the current `cfp` | +| `ep` | Environment Pointer. Local variables, including method parameters are stored in the `ep` array. The `ep` is pointed to by the `cfp` | +| GC | Garbage Collector | +| `gvar` | Global Variable. Refers to a Ruby global variable like `$$`, etc | +| `ICLASS` | Internal Class. When a module is included, the target class gets a new superclass which is an instance of an `ICLASS`. The `ICLASS` represents the module in the inheritance chain. | +| `ifunc` | Internal FUNCtion. A block implemented in C. | +| `iseq` | Instruction Sequence. Usually "iseq" in the C code will refer to an `rb_iseq_t` object that holds a reference to the actual instruction sequences which are executed by the VM. The object also holds information about the code, like the method name associated with the code. | +| `insn` | Instruction. Refers to a YARV instruction. | +| `insns` | Instructions. 
Usually an array of YARV instructions. | +| `ivar` | Instance Variable. Refers to a Ruby instance variable like `@foo` | +| `imemo` | Internal Memo. A tagged struct whose memory is managed by Ruby's GC, but contains internal information and isn't meant to be exposed to Ruby programs. Contains various information depending on the type. See the `imemo_type` enum for different types. | +| `IVC` | Instance Variable Cache. Cache specifically for instance variable access | +| JIT | Just In Time compiler | +| `lep` | Local Environment Pointer. An `ep` which is tagged `VM_ENV_FLAG_LOCAL`. Usually this is the `ep` of a method (rather than a block, whose `ep` isn't "local") | +| `local` | Local. Refers to a local variable. | +| `me` | Method Entry. Refers to an `rb_method_entry_t` struct, the internal representation of a Ruby method. | +| MRI | Matz's Ruby Implementation | +| `pc` | Program Counter. Usually the instruction that will be executed _next_ by the VM. Pointed to by the `cfp` and incremented by the VM | +| `snt` | Shared Native Thread. OS thread on which many ruby threads can run. Ruby threads from different ractors can even run on the same SNT. Ruby threads can switch SNTs when they context switch. SNTs are used in the M:N threading model. By default, non-main ractors use this model. +| `dnt` | Dedicated Native Thread. OS thread on which only one ruby thread can run. The ruby thread always runs on that same OS thread. DNTs are used in the 1:1 threading model. By default, the main ractor uses this model. +| `sp` | Stack Pointer. The top of the stack. The VM executes instructions in the `iseq` and instructions will push and pop values on the stack. The VM updates the `sp` on the `cfp` to point at the top of the stack| +| ST table | ST table is the main C implementation of a hash (smaller Ruby hashes may be backed by AR tables). | +| `svar` | Special Variable. Refers to special local variables like `$~` and `$_`. 
See the `getspecial` instruction in `insns.def` | +| `VALUE` | VALUE is a pointer to a ruby object from the Ruby C code. | +| VM | Virtual Machine. In MRI's case YARV (Yet Another Ruby VM) +| WB | Write Barrier. To do with GC write barriers | +| WC | Wild Card. As seen in instructions like `getlocal_WC_0`. It means this instruction takes a "wild card" for the parameter (in this case an index for a local) | +| YARV | Yet Another Ruby VM. The virtual machine that CRuby uses | +| ZOMBIE | A zombie object. An object that has a finalizer which hasn't been executed yet. The object has been collected, so is "dead", but the finalizer hasn't run yet so it's still somewhat alive. | diff --git a/doc/contributing/making_changes_to_ruby.md b/doc/contributing/making_changes_to_ruby.md new file mode 100644 index 0000000000..260fadb7e3 --- /dev/null +++ b/doc/contributing/making_changes_to_ruby.md @@ -0,0 +1,28 @@ +# Contributing a pull request + +## Code style + +Here are some general rules to follow when writing Ruby and C code for CRuby: + +* Do not change code unrelated to your pull request (including style fixes) +* Indent 4 spaces for C without tabs (tabs are two levels of indentation, equivalent to 8 spaces) +* Indent 2 spaces for Ruby without tabs +* ANSI C style for function declarations +* Follow C99 Standard +* PascalStyle for class/module names +* UNDERSCORE_SEPARATED_UPPER_CASE for other constants +* Abbreviations should be all upper case + +## Commit messages + +Use the following style for commit messages: + +* Use a succinct subject line +* Include reasoning behind the change in the commit message, focusing on why the change is being made +* Refer to issue (such as `Fixes [Bug #1234]` or `Implements [Feature #3456]`), or discussion on the mailing list (such as [ruby-core:12345]) + +## CI + +GitHub actions will run on each pull request. + +There is [a CI that runs on master](https://rubyci.org/). 
It has broad coverage of different systems and architectures, such as Solaris SPARC and macOS. diff --git a/doc/contributing/making_changes_to_stdlibs.md b/doc/contributing/making_changes_to_stdlibs.md new file mode 100644 index 0000000000..2ceb2e6075 --- /dev/null +++ b/doc/contributing/making_changes_to_stdlibs.md @@ -0,0 +1,49 @@ +# Making Changes To Standard Libraries + +Everything in the [lib](https://github.com/ruby/ruby/tree/master/lib) directory is mirrored from a standalone repository into the Ruby repository. +If you'd like to make contributions to standard libraries, do so in the standalone repositories, and the +changes will be automatically mirrored into the Ruby repository. + +For example, ERB lives in [a separate repository](https://github.com/ruby/erb) and is mirrored into [Ruby](https://github.com/ruby/ruby/tree/master/lib/erb). + +## Maintainers + +You can find the list of maintainers [here](https://docs.ruby-lang.org/en/master/maintainers_md.html#label-Maintainers). + +## Build + +First, install its dependencies using: + +```shell +bundle install +``` + +### Libraries with C-extension + +If the library has a `/ext` directory, it has C files that you need to compile with: + +```shell +bundle exec rake compile +``` + +## Running tests + +All standard libraries use [test-unit](https://github.com/test-unit/test-unit) as the test framework. 
+ +To run all tests: + +```shell +bundle exec rake test +``` + +To run a single test file: + +```shell +bundle exec rake test TEST="test/test_foo.rb" +``` + +To run a single test case: + +```shell +bundle exec rake test TEST="test/test_foo.rb" TESTOPTS="--name=/test_mytest/" +``` diff --git a/doc/contributing/memory_view.md b/doc/contributing/memory_view.md new file mode 100644 index 0000000000..0b1369163d --- /dev/null +++ b/doc/contributing/memory_view.md @@ -0,0 +1,167 @@ +# MemoryView + +MemoryView provides the features to share multidimensional homogeneous arrays of +fixed-size element on memory among extension libraries. + +## Disclaimer + +* This feature is still experimental. The specification described here can be changed in the future. + +* This document is under construction. Please refer the master branch of ruby for the latest version of this document. + +## Overview + +We sometimes deal with certain kinds of objects that have arrays of the same typed fixed-size elements on a contiguous memory area as its internal representation. +Numo::NArray in numo-narray and Magick::Image in rmagick are typical examples of such objects. +MemoryView plays the role of the hub to share the internal data of such objects without copy among such libraries. + +Copy-less sharing of data is very important in some field such as data analysis, machine learning, and image processing. In these field, people need to handle large amount of on-memory data with several libraries. If we are forced to copy to exchange large data among libraries, a large amount of the data processing time must be occupied by copying data. You can avoid such wasting time by using MemoryView. + +MemoryView has two categories of APIs: + +1. Producer API + + Classes can register own MemoryView entry which allows objects of that classes to expose their MemoryView + +2. 
Consumer API + + Consumer API allows us to obtain and manage the MemoryView of an object + +## MemoryView structure + +A MemoryView structure, `rb_memory_view_t`, is used for exporting objects' MemoryView. +This structure contains the reference of the object, which is the owner of the MemoryView, the pointer to the head of exported memory, and the metadata that describes the structure of the memory. The metadata can describe multidimensional arrays with strides. + +### The member of MemoryView structure + +The MemoryView structure consists of the following members. + +- `VALUE obj` + + The reference to the original object that has the memory exported via the MemoryView. + + RubyVM manages the reference count of the MemoryView-exported objects to guard them from the garbage collection. The consumers do not have to struggle to guard this object from GC. + +- `void *data` + + The pointer to the head of the exported memory. + +- `ssize_t byte_size` + + The number of bytes in the memory pointed by `data`. + +- `bool readonly` + + `true` for readonly memory, `false` for writable memory. + +- `const char *format` + + A string to describe the format of an element, or NULL for unsigned byte. + +- `ssize_t item_size` + + The number of bytes in each element. + +- `const rb_memory_view_item_component_t *item_desc.components` + + The array of the metadata of the component in an element. + +- `size_t item_desc.length` + + The number of items in `item_desc.components`. + +- `ssize_t ndim` + + The number of dimensions. + +- `const ssize_t *shape` + + A `ndim` size array indicating the number of elements in each dimension. + This can be `NULL` when `ndim` is 1. + +- `const ssize_t *strides` + + A `ndim` size array indicating the number of bytes to skip to go to the next element in each dimension. + This can be `NULL` when `ndim` is 1. + +- `const ssize_t *sub_offsets` + + A `ndim` size array consisting of the offsets in each dimension when the MemoryView exposes a nested array. 
+ This can be `NULL` when the MemoryView exposes a flat array. + +- `void *private_data` + + The private data that the MemoryView provider uses internally. + This can be `NULL` when no private data is needed. + +## MemoryView APIs + +### For consumers + +- `bool rb_memory_view_available_p(VALUE obj)` + + Return `true` if `obj` supports exporting a MemoryView. Return `false` otherwise. + + A `true` return value does not guarantee that `rb_memory_view_get` will succeed. + +- `bool rb_memory_view_get(VALUE obj, rb_memory_view_t *view, int flags)` + + If the given `obj` supports exporting a MemoryView that conforms to the given `flags`, this function fills `view` with the information of the MemoryView and returns `true`. In this case, the reference count of `obj` is increased. + + If the given combination of `obj` and `flags` cannot export a MemoryView, this function returns `false`. The content of `view` is not touched in this case. + + The exported MemoryView must be released by `rb_memory_view_release` when the MemoryView is no longer needed. + +- `bool rb_memory_view_release(rb_memory_view_t *view)` + + Release the given MemoryView `view` and decrement the reference count of `view->obj`. + + Consumers must call this function when the MemoryView is no longer needed. Failing to call this function leads to a memory leak. + +- `ssize_t rb_memory_view_item_size_from_format(const char *format, const char **err)` + + Calculate the number of bytes occupied by an element. + + When the calculation fails, the failed location in `format` is stored into `err`, and the function returns `-1`. + +- `void *rb_memory_view_get_item_pointer(rb_memory_view_t *view, const ssize_t *indices)` + + Calculate the location of the item indicated by the given `indices`. + The length of `indices` must equal `view->ndim`. + This function initializes `view->item_desc` if needed. 
+ +- `VALUE rb_memory_view_get_item(rb_memory_view_t *view, const ssize_t *indices)` + + Return the Ruby object representation of the item indicated by the given `indices`. + The length of `indices` must equal `view->ndim`. + This function uses `rb_memory_view_get_item_pointer`. + +- `rb_memory_view_init_as_byte_array(rb_memory_view_t *view, VALUE obj, void *data, const ssize_t len, const bool readonly)` + + Fill the members of `view` as a 1-dimensional byte array. + +- `void rb_memory_view_fill_contiguous_strides(const ssize_t ndim, const ssize_t item_size, const ssize_t *const shape, const bool row_major_p, ssize_t *const strides)` + + Fill the `strides` array with the byte strides of a contiguous array of the given shape with the given element size. + +- `void rb_memory_view_prepare_item_desc(rb_memory_view_t *view)` + + Fill the `item_desc` member of `view`. + +- `bool rb_memory_view_is_contiguous(const rb_memory_view_t *view)` + + Return `true` if the data in the MemoryView `view` is row-major or column-major contiguous. + + Return `false` otherwise. + +- `bool rb_memory_view_is_row_major_contiguous(const rb_memory_view_t *view)` + + Return `true` if the data in the MemoryView `view` is row-major contiguous. + + Return `false` otherwise. + +- `bool rb_memory_view_is_column_major_contiguous(const rb_memory_view_t *view)` + + Return `true` if the data in the MemoryView `view` is column-major contiguous. + + Return `false` otherwise. diff --git a/doc/contributing/reporting_issues.md b/doc/contributing/reporting_issues.md new file mode 100644 index 0000000000..a1a2295712 --- /dev/null +++ b/doc/contributing/reporting_issues.md @@ -0,0 +1,102 @@ +# Reporting Issues +## Reporting security issues + +If you've found a security vulnerability, please follow +[these instructions](https://www.ruby-lang.org/en/security/). 
+ +## Reporting bugs + +If you've encountered a bug in Ruby, please report it to the Redmine issue +tracker available at [bugs.ruby-lang.org](https://bugs.ruby-lang.org/), by +following these steps: + +* Check if anyone has already reported your issue by + searching [the Redmine issue tracker](https://bugs.ruby-lang.org/projects/ruby-master/issues). +* If you haven't already, + [sign up for an account](https://bugs.ruby-lang.org/account/register) on the + Redmine issue tracker. +* If you can't find a ticket addressing your issue, please [create a new issue](https://bugs.ruby-lang.org/projects/ruby-master/issues/new). You will need to fill in the subject, description and Ruby version. + + * Ensure the issue exists on Ruby master by trying to replicate your bug on + the head of master (see ["making changes to Ruby"](making_changes_to_ruby.md)). + * Write a concise subject and briefly describe your problem in the description section. If + your issue affects [a released version of Ruby](#label-Backport+requests), please say so. + * Fill in the Ruby version you're using when experiencing this issue + (the output of running `ruby -v`). + * Attach any logs or reproducible programs to provide additional information. + Any scripts should be as small as possible. +* If the ticket doesn't have any replies after 10 days, you can send a + reminder. +* Please reply to feedback requests. If a bug report doesn't get any feedback, + it'll eventually get rejected. + +### Reporting website issues + +If you're having an issue with the bug tracker or the mailing list, you can +contact the webmaster, Hiroshi SHIBATA (hsbt@ruby-lang.org). + +You can report issues with ruby-lang.org on the +[repo's issue tracker](https://github.com/ruby/www.ruby-lang.org/issues). + +## Requesting features + +If there's a new feature that you want to see added to Ruby, you will need to +write a proposal on [the Redmine issue tracker](https://bugs.ruby-lang.org/projects/ruby-master/issues/new). 
+When you open the issue, select `Feature` in the Tracker dropdown. + +When writing a proposal, be sure to check for previous discussions on the +topic and have a solid use case. You should also consider the potential +compatibility issues that this new feature might raise. Consider making +your feature into a gem, and if there are enough people who benefit from +your feature it could help persuade Ruby core. + +Here is a template you can use for a feature proposal: + +```markdown +# Abstract + +Briefly summarize your feature + +# Background + +Describe current behavior + +# Proposal + +Describe your feature in detail + +# Use cases + +Give specific example uses of your feature + +# Discussion + +Describe why this feature is necessary and better than using existing features + +# See also + +Link to other related resources (such as implementations in other languages) +``` + +## Backport requests + +If a bug exists in a released version of Ruby, please report this in the issue. +Once this bug is fixed, the fix can be backported if deemed necessary. Only Ruby +committers can request backporting, and backporting is done by the backport manager. +New patch versions are released at the discretion of the backport manager. + +[Ruby versions](https://www.ruby-lang.org/en/downloads/) can be in one of three maintenance states: + +* Stable releases: backport any bug fixes +* Security maintenance: only backport security fixes +* End of life: no backports, please upgrade your Ruby version + +## Add context to existing issues + +There are several ways you can help with a bug that aren't directly +resolving it. 
These include: + +* Verifying or reproducing the existing issue and reporting it +* Adding more specific reproduction instructions +* Contributing a failing test as a patch (see ["making changes to Ruby"](making_changes_to_ruby.md)) +* Testing patches that others have submitted (see ["making changes to Ruby"](making_changes_to_ruby.md)) diff --git a/doc/contributing/testing_ruby.md b/doc/contributing/testing_ruby.md new file mode 100644 index 0000000000..4c7ce7f6a8 --- /dev/null +++ b/doc/contributing/testing_ruby.md @@ -0,0 +1,155 @@ +# Testing Ruby + +All the commands below assume that you're running them from the `build/` directory made during [Building Ruby](building_ruby.md). + +Most commands below should work with [GNU make](https://www.gnu.org/software/make/) (the default on Linux and macOS), [BSD make](https://man.freebsd.org/cgi/man.cgi?make(1)) and [NMAKE](https://learn.microsoft.com/en-us/cpp/build/reference/nmake-reference), except where indicated otherwise. + +## Test suites + +There are several test suites in the Ruby codebase: + +We can run any of the make scripts [in parallel](building_ruby.md#label-Running+make+scripts+in+parallel) to speed them up. + +1. [bootstraptest/](https://github.com/ruby/ruby/tree/master/bootstraptest) + + This is a small test suite that runs on [Miniruby](building_ruby.md#label-Miniruby+vs+Ruby). 
We can run it with: + + ```sh + make btest + ``` + + To run individual bootstrap tests, we can either specify a list of filenames or use the `--sets` flag in the variable `BTESTS`: + + ```sh + make btest BTESTS="../bootstraptest/test_string.rb ../bootstraptest/test_class.rb" + make btest BTESTS="--sets=string,class" + ``` + + To run these tests with verbose logging, we can add `-v` to the `OPTS`: + + ```sh + make btest OPTS="--sets=string,class -v" + ``` + + If we want to run the bootstrap test suite on Ruby (not Miniruby), we can use: + + ```sh + make test + ``` + + To run these tests with verbose logging, we can add `-v` to the `OPTS`: + + ```sh + make test OPTS=-v + ``` + + (GNU make only) To run a specific file, we can use: + + ```sh + make ../test/ruby/test_string.rb + ``` + + You can use the `-n` test option to run a specific test with a regex: + + ```sh + make ../test/ruby/test_string.rb TESTOPTS="-n /test_.*_to_s/" + ``` + +2. [test/](https://github.com/ruby/ruby/tree/master/test) + + This is a more comprehensive test suite that runs on Ruby. We can run it with: + + ```sh + make test-all + ``` + + We can run a specific test file or directory in this suite using the `TESTS` option, for example: + + ```sh + make test-all TESTS="../test/ruby/" + make test-all TESTS="../test/ruby/test_string.rb" + ``` + + We can run a specific test in this suite using the `TESTS` option, specifying + first the file name, and then the test name, prefixed with `--name`. For example: + + ```sh + make test-all TESTS="../test/ruby/test_string.rb --name=TestString#test_to_s" + ``` + + To run these tests with verbose logging, we can add `-v` to `TESTS`: + + ```sh + make test-all TESTS=-v + ``` + + We can display the help of the `TESTS` option: + + ```sh + make test-all TESTS=--help + ``` + + We can run all the tests in `test/`, `bootstraptest/` and `spec/` (the `spec/` is explained in a later section) all together with: + + ```sh + make check + ``` + +3. 
[spec/ruby](https://github.com/ruby/ruby/tree/master/spec/ruby) + + This is a test suite defined in [the Ruby spec repository](https://github.com/ruby/spec), and is periodically mirrored into the `spec/ruby` directory of this repository. It tests the behavior of the Ruby programming language. We can run this using: + + ```sh + make test-spec + ``` + + We can run a specific test file or directory in this suite using the `SPECOPTS` option, for example: + + ```sh + make test-spec SPECOPTS="../spec/ruby/core/string/" + make test-spec SPECOPTS="../spec/ruby/core/string/to_s_spec.rb" + ``` + + To run a specific test, we can use the `--example` flag to match against the test name: + + ```sh + make test-spec SPECOPTS="../spec/ruby/core/string/to_s_spec.rb --example='returns self when self.class == String'" + ``` + + To run these specs with verbose logging, we can add `-Vfs` to the `SPECOPTS`: + + ```sh + make test-spec SPECOPTS="../spec/ruby/core/string/to_s_spec.rb -Vfs" + ``` + + (GNU make only) To run a ruby-spec file or directory, we can use + + ```sh + make ../spec/ruby/core/string/to_s_spec.rb + ``` + +4. [spec/bundler](https://github.com/ruby/ruby/tree/master/spec/bundler) + + The bundler test suite is defined in [the RubyGems repository](https://github.com/rubygems/rubygems/tree/master/bundler/spec), and is periodically mirrored into the `spec/bundler` directory of this repository. 
We can run this using: + + ```sh + make test-bundler + ``` + + To run a specific bundler spec file, we can use `BUNDLER_SPECS` as follows: + + ```sh + make test-bundler BUNDLER_SPECS=commands/exec_spec.rb + ``` + +## Troubleshooting + +### Running test suites on s390x CPU Architecture + +If we see failing tests related to the zlib library on s390x CPU architecture, we can run the test suites with `DFLTCC=0` to pass: + +```sh +DFLTCC=0 make check +``` + +The failures can happen with the zlib library applying the patch [madler/zlib#410](https://github.com/madler/zlib/pull/410) to enable the deflate algorithm producing a different compressed byte stream. We manage this issue at [[ruby-core:114942][Bug #19909]](https://bugs.ruby-lang.org/issues/19909). diff --git a/doc/contributing/vm_stack_and_frames.md b/doc/contributing/vm_stack_and_frames.md new file mode 100644 index 0000000000..c7dc59db16 --- /dev/null +++ b/doc/contributing/vm_stack_and_frames.md @@ -0,0 +1,163 @@ +# Ruby VM Stack and Frame Layout + +This document explains the Ruby VM stack architecture, including how the value +stack (SP) and control frames (CFP) share a single contiguous memory region, +and how individual frames are structured. + +## VM Stack Architecture + +The Ruby VM uses a single contiguous stack (`ec->vm_stack`) with two different +regions growing toward each other. Understanding this requires distinguishing +the overall architecture (how CFPs and values share one stack) from individual +frame internals (how values are organized for one single frame). + +```text +High addresses (ec->vm_stack + ec->vm_stack_size) + ↓ + [CFP region starts here] ← RUBY_VM_END_CONTROL_FRAME(ec) + [CFP - 1] New frame pushed here (grows downward) + [CFP - 2] Another frame + ... + + (Unused space - stack overflow when they meet) + + ... 
Value stack grows UP toward higher addresses + [SP + n] Values pushed here + [ec->cfp->sp] Current executing frame's stack pointer + ↑ +Low addresses (ec->vm_stack) +``` + +The "unused space" represents free space available for new frames and values. When this gap closes (CFP meets SP), stack overflow occurs. + +### Stack Growth Directions + +**Control Frames (CFP):** + +- Start at `ec->vm_stack + ec->vm_stack_size` (high addresses) +- Grow **downward** toward lower addresses as frames are pushed +- Each new frame is allocated at `cfp - 1` (lower address) +- The `rb_control_frame_t` structure itself moves downward + +**Value Stack (SP):** + +- Starts at `ec->vm_stack` (low addresses) +- Grows **upward** toward higher addresses as values are pushed +- Each frame's `cfp->sp` points to the top of its value stack + +### Stack Overflow + +When recursive calls push too many frames, CFP grows downward until it collides +with SP growing upward. The VM detects this with `CHECK_VM_STACK_OVERFLOW0`, +which computes `const rb_control_frame_struct *bound = (void *)&sp[margin];` +and raises if `cfp <= &bound[1]`. + +## Understanding Individual Frame Value Stacks + +Each frame has its own portion of the overall VM stack, called its "VM value stack" +or simply "value stack". This space is pre-allocated when the frame is created, +with size determined by: + +- `local_size` - space for local variables +- `stack_max` - maximum depth for temporary values during execution + +The frame's value stack grows upward from its base (where self/arguments/locals +live) toward `cfp->sp` (the current top of temporary values). + +## Visualizing How Frames Fit in the VM Stack + +The left side shows the overall VM stack with CFP metadata separated from frame +values. The right side zooms into one frame's value region, revealing its internal +structure. 
+ +```text +Overall VM Stack (ec->vm_stack): Zooming into Frame 2's value stack: + +High addr (vm_stack + vm_stack_size) High addr (cfp->sp) + ↓ ┌ + [CFP 1 metadata] │ [Temporaries] + [CFP 2 metadata] ─────────┐ │ [Env: Flags/Block/CME] ← cfp->ep + [CFP 3 metadata] │ │ [Locals] + ──────────────── │ ┌─┤ [Arguments] + (unused space) │ │ │ [self] + ──────────────── │ │ └ + [Frame 3 values] │ │ Low addr (frame base) + [Frame 2 values] <────────┴───────┘ + [Frame 1 values] + ↑ +Low addr (vm_stack) +``` + +## Examining a Single Frame's Value Stack + +Now let's walk through a concrete Ruby program to see how a single frame's +value stack is structured internally: + +```ruby +def foo(x, y) + z = x.casecmp(y) +end + +foo(:one, :two) +``` + +First, after arguments are evaluated and right before the `send` to `foo`: + +```text + ┌────────────┐ + putself │ :two │ + putobject :one 0x2 ├────────────┤ + putobject :two │ :one │ +► send <:foo, argc:2> 0x1 ├────────────┤ + leave │ self │ + 0x0 └────────────┘ +``` + +The `put*` instructions have pushed 3 items onto the stack. It's now time to +add a new control frame for `foo`. The following is the shape of the stack +after one instruction in `foo`: + +```text + cfp->sp=0x8 at this point. + 0x8 ┌────────────┐◄──Stack space for temporaries + │ :one │ live above the environment. + 0x7 ├────────────┤ + getlocal x@0 │ < flags > │ foo's rb_control_frame_t +► getlocal y@1 0x6 ├────────────┤◄──has cfp->ep=0x6 + send <:casecmp, argc:1> │ <no block> │ + dup 0x5 ├────────────┤ The flags, block, and CME triple + setlocal z@2 │ <CME: foo> │ (VM_ENV_DATA_SIZE) form an + leave 0x4 ├────────────┤ environment. They can be used to + │ z (nil) │ figure out what local variables + 0x3 ├────────────┤ are below them. + │ :two │ + 0x2 ├────────────┤ Notice how the arguments, now + │ :one │ locals, never moved. This layout + 0x1 ├────────────┤ allows for argument transfer + │ self │ without copying. 
+ 0x0 └────────────┘ +``` + +Given that locals have lower address than `cfp->ep`, it makes sense then that +`getlocal` in `insns.def` has `val = *(vm_get_ep(GET_EP(), level) - idx);`. +When accessing variables in the immediate scope, where `level=0`, it's +essentially `val = cfp->ep[-idx];`. + +Note that this EP-relative index has a different basis than the index that comes +after "@" in disassembly listings. The "@" index is relative to the 0th local +(`x` in this case). + +### Q&A + +Q: It seems that the receiver is always at an offset relative to EP, + like locals. Couldn't we use EP to access it instead of using `cfp->self`? + +A: Not all calls put the `self` in the callee on the stack. Two + examples are `Proc#call`, where the receiver is the Proc object, but `self` + inside the callee is `Proc#receiver`, and `yield`, where the receiver isn't + pushed onto the stack before the arguments. + +Q: Why have `cfp->ep` when it seems that everything is below `cfp->sp`? + +A: In the example, `cfp->ep` points to the stack, but it can also point to the + GC heap. Blocks can capture and evacuate their environment to the heap. diff --git a/doc/contributors.rdoc b/doc/contributors.rdoc deleted file mode 100644 index 7c3722032b..0000000000 --- a/doc/contributors.rdoc +++ /dev/null @@ -1,793 +0,0 @@ -= Contributors to Ruby - -The following list might be incomplete. Feel free to add your name if your -patch was accepted into Ruby. - -== A - -Ayumu AIZAWA (ayumin) -* committer - -AKIYOSHI, Masamichi (akiyoshi) -* committer -* He had maintained the VMS support on 2003-2004. - -Muhammad Ali -* wrote rdoc for Fiber - -Minero Aoki (aamine) -* committer -* He is the maintainer of: - * fileutils - * net/http, net/https - * net/pop - * net/smtp - * racc - * ripper - * strscan - -Wakou Aoyama (wakou) -* committer -* He was the maintainer of some standard libraries. 
- -Koji Arai -* committer - -arton -* He is the distributor of ActiveScriptRuby and experimental 1.9.0-x installers for win32. -* Wrote patches for win32ole, gc.c, tmpdir.rb - -Sergey Avseyev -* Added IO#pread and IO#pwrite. - -== B - -Daniel Berger -* a patch for irb -* documentation -* He wrote forwardable.rb - -David Black (dblack) -* committer -* He is the maintainer of scanf - -Ken Bloom -* a patch for REXML. - -Oliver M. Bolzer -* a patch for soap - -Alexey Borzenkov -* a patch for mkmf.rb - -Evan Brodie -* a patch for documentation of Float#round - -Richard Brown -* a patch for configure.in - -Dirkjan Bussink -* a patch for date.rb - -Daniel Bovensiepen -* documentation -* a patch for irb - -== C - -Brian Candler -* a patch for configure.in, net/telnet - -keith cascio -* a patch for optparse.rb - -Frederick Cheung -* a patch for test/ruby/test_symbol.rb - -Christoph -* patches for set.rb - -Sean Chittenden -* patches for net/http, cgi - -William D. Clinger -* ruby_strtod is based on his paper. - -== D - -Ryan Davis (ryan) -* committer -* He wrote and is the maintainer of miniunit - -Guy Decoux (ts) -* committer - -Zach Dennis - -Martin Duerst (duerst) -* committer -* M17N - -Paul Duncan -* patches for rdoc - -Alexander Dymo -* a patch for lib/benchmark.rb - -== E - -Yusuke Endoh (mame) -* committer -* He wrote and is the maintainer of base64 library (1.9) -* did much upon YARV compiler. 
- -erlercw -* wrote Integer::gcd2 - -== F - -Frank S.Fejes -* a patch for net/pop - -Fundakowski Feldman -* a patch for process.c - -Mauricio Fernandez -* patches for parse.y - -David Flanagan (davidflanagan) -* committer -* M17N - -Takeyuki Fujioka (xibbar) -* committer -* He is the maintainer of cgi/* - -FUKUMOTO, Atsushi -* a patch for tracer.rb - -Shota Fukumori (sorah) -* committer -* #4415 parallel unit/test - -Tadayoshi Funaba (tadf) -* committer -* He wrote and is the maintainer of - * date - * parsedate (1.8) -* He ported rational.rb and complex.rb, which 1.8 contains, into rational.c and complex.c of 1.9. - -== G - -David M. Gay -* ruby_strtod - -Florian Gilcher -* documentation - -GOTOU, Kentaro (gotoken) -* committer -* He wrote benchmark.rb -* He is the maintainer of: - * benchmark.rb - * open3 - -GOTOU, Yuuzou (gotoyuzo) -* committer - -James Edward Gray II (jeg2) -* committer -* He wrote the faster implementation of CSV and is the maintainer of csv. -* Wrote documentation for rdoc - -== H - -Phil Hagelberg -* patch for ruby-mode.el's documentation. - -Kirk Haines (wyhaines) -* committer -* the maintainer of ruby_1_8_6 branch - -Shinichiro Hamaji -* fixed memory leaks (marshal.c, string.c) - -Shin-ichiro HARA -* the developer and the sysop of ruby-{dev,list,core,talk} archive. -* a patch for numeric.c - -Chris Heath (traumdeutung) -* a patch for proc.c - -HIROKAWA Hisashi -* fixed socket/socket.c - -Daniel Hob -* He wrote: - * SMTP-TLS support for net/smtp. - * POP3S support - -Eric Hodel (drbrain) -* committer -* He is the maintainer of: - * rdoc - * ri - * rubygems - -Erik Hollensbe -* a patch for delegate.rb - -Johan Holmberg -* a patch for dir.c -* documentation - -Erik Huelsmann - -Dae San Hwang -* built a continuous integration environment on OpenSolaris. 
- -== I - -Nobuhiro IMAI -* a patch for logger.rb - -"incorporate" -* a patch for sprintf.c - -Keiju Ishitsuka (keiju) -* committer -* He wrote and is the maintainer of: - * cmath.rb (1.9) - * complex.rb (1.8) - * e2mmap.rb - * forwardable.rb - * irb - * mathn - * matrix.rb - * mutex_m.rb - * rational.rb (1.8) - * sync.rb - * shell/* - * thwait.rb - * tracer.rb - -== J - -Curtis Jackson -* missing/dup2.c - -Alan Johnson -* a patch for net/ftp - -Lyle Johnson -* patches for nkf, bigdecimal, numeric.c - -== K - -Yoshihiro Kambayashi -* a patch for enc/trans/single_byte.trans. -* He wrote supports for some encodings. - -Yutaka Kanemoto -* patches for common.mk, AIX AF_INET6 support - -Motoyuki Kasahara -* He wrote getoptlong.rb - -Masahiro Kawato -* a patch for shellwords.rb - -Wataru Kimura -* a patch for configure.in - -Michael Klishin -* patch for make help. - -Noritada Kobayashi -* a patch for optparse.rb - -Shigeo Kobayashi (shigek) -* committer -* He is the maintainer of bigdecimal - -KONISHI, Hiromasa (H_Konishi) -* committer -* He had maintained the bcc32 support in 2004. - -Kornelius "murphy" Kalnbach -* documentation - -K.Kosako (kosako) -* committer -* He wrote Oniguruma. - -Takehiro Kubo -* patches for dl 64bit support. - -== L - -Marc-Andre Lafortune (marcandre) -* committer -* patches for hash.c, array.c, thread.c, enumc, string.c, range.c and rdoc documentation. - -Hongli Lai -* improved pstore.rb -* patch for tool/file2lastrev.rb. - -raspberry lemon -* a patch for webrick/httpproxy.rb. - -Christian Loew -* a patch for fileutils.rb - -== M - -Shugo Maeda (shugo) -* committer -* A system administrator of ruby-lang.org servers. -* He wrote and is the maintainer of: - * monitor.rb - * net/ftp - * net/imap - -Stephan Maka (mathew) -* documentation - -Yukihiro Matsumoto (matz) -* Matz -- the founder, language designer of Ruby. -* committer -* Ruby itself, most of Ruby. 
-* He is the maintainer of: - * singleton - * timeout - * gdbm - * sdbm - -Konrad Meyer -* documentation - -Mib Software -* missing/vsnprintf.c - -Todd C. Miller -* missing/strlcat.c -* missing/strlcpy.c - -MIYASAKA, Masaru -* a patch for cgi.rb - -Stefan Monnier -* regex.c was fixed with based on his Emacs21 patch. - -Marcel Moolenaar -* patches for eval.c and gc.c. - -moonwolf -* a patch for REXML, xmlrpc - -Hiroshi Moriyama -* a patch for yaml. - -Kyosuke Morohashi -* a patch for gem_prelude.rb - -Kenta Murata -* patches for json, bignum.c - -Akinori MUSHA (knu) -* committer -* He wrote and is the maintainer of: - * abbrev.rb - * generator (1.8) - * enumerator (1.8) - * set - * ipaddr.rb - * digest/* - * syslog -* He is the branch maintainer of ruby_1_8, the release manager of 1.8 series. - -== N - -Hidetoshi NAGAI (nagai) -* committer -* He is the maintainer of tk/* - -Nobuyoshi Nakada (nobu) -* committer -* a.k.a. the "patch monster" -* He wrote and is the maintainer of: - * optparse - * stringio - * io/wait - * iconv - -Satoshi Nakagawa -* patches for util.c - -Narihiro Nakamura (nari) -* committer -* a.k.a. authorNari -* working at GC - -NAKAMURA, Hiroshi (nahi) -* committer -* He is the maintainer of: - * csv.rb (1.8) - * logger.rb - * soap/* (1.8) - * wsdl/* (1.8) - * xsd/* (1.8) - -NAKAMURA, Usaku (usa) -* committer -* a.k.a. unak -* He is the maintainer of mswin32 and mswin64 support. - -NARUSE, Yui (naruse) -* committer -* a.k.a. "nurse" -* Did much upon m17n. -* He is the maintainer of: - * json - * nkf - -Christian Neukirchen -* a patch for webrick/httputils - -Michael Neumann (mneumann) -* committer -* He is the maintainer of - * xmlrpc (1.8) - * gserver (1.8) - -NISHIO Hirokazu -* wrote a patch for CVE-2010-0541 - -Kazuhiro NISHIYAMA (kazu) -* committer -* a.k.a. znz - -Go Noguchi - -Martin Nordholts -* misc/rdebug.el - -nmu -* a patch for socket - -== O - -okkez -* He is a sysop of the Ruby Reference Manual Renewal Project. 
-* fixed ipaddr.rb, ext/etc - -Haruhiko Okumura -* some of missing/* is based on his book: - * missing/erf.c - * missing/lgamma_r.c - * missing/tgamma.c - -OMAE, jun -* a patch for debug.rb - -Eugene Ossintsev -* documentation - -== P - -Heesob Park -* a patch for win32/win32.c. - -pegacorn -* a patch for instruby.rb - -== Q - -== R - -Gaston Ramos -* documentation - -The Regents of the University of California -* missing/crypt.c -* missing/vsnprintf.c - -Sam Roberts -* patch for socket -* documentation - -Michal Rokos (michal) -* committer -* He was the maintainer of DJGPP support. - -rubikitch -* a patch for io.c - -Marcus Rueckert -* a patch for mkconfig.rb. - -Run Paint Run Run -* patch for enc/unicode.c -* documentation - -Sean Russell (ser) -* committer -* He wrote and is the maintainer of REXML. - -== S - -Kazuo Saito (ksaito) -* committer -* M17N - -Tadashi Saito -* patches for test/ruby/test_math.rb, thread_*.c, bignum.c -* working upon BigDecimal. -* did much upon documentation - -Masahiro Sakai -* a patch for io.c - -Laurent Sansonetti -* a patch for tool/ytab.sed - -Jeff Saracco -* documentation - -Koichi Sasada (ko1) -* committer -* He wrote YARV. - -Hugh Sasse -* a patch for net/http -* documentation - -Charlie Savage -* a patch for win32/Makefile.sub - -Michael Scholz -* a patch for ruby-mode.el - -Arthur Schreiber -* patch for net/http and rdoc. - -Masatoshi SEKI (seki) -* committer -* He wrote and is the maintainer of: - * drb/* - * erb - * rinda - -Roman Shterenzon -* a patch for open-uri. - -Kent Sibilev - -Gavin Sinclair (gsinclair) -* committer - -John W. Small -* He wrote gserver.rb - -Yuki Sonoda (yugui) -* committer -* She is the maintainer of man/* manual pages and is the release manager of 1.9 series. -* She wrote prime.rb. -* A developer and a sysop of redmine.ruby-lang.org. - -SOUMA, Yutaka -* a patch for pack.c. 
- -Tatsuki Sugiura -* WebDAV support for net/http - -Masaki Suketa (suke) -* committer -* He is the maintainer of win32ole - -sheepman -* patches for ruby.c, thread.c, stringio, enum.c, webrick, net/http - -Siena. (siena) -* committer - -Kirill A. Shutemov -* a patch for parse.y - -Darren Smith -* a patch for golf_prelude.rb - -Richard M. Stallman -* missing/alloca.c - -Robin Stocker -* documentation - -Joshua Stowers -* a patch for array.c - -Marcus Stollsteimer (stomar) -* committer -* a maintainer of www.ruby-lang.org -* patches for cgi (HTML5 tag maker), numeric.c, bigdecimal, ostruct.rb, prime.rb, and others -* documentation - -Adam Strzelecki -* a patch for compile.c - -Masashi Sumi -* improved net/pop.rb - -Eric Sunshine -* NeXT OpenStep, Rhapsody support - -Kouhei Sutou (kou) -* committer -* He wrote and is the maintainer of rss/* - -David Symonds -* documentation - -== T - -TAKANO Mitsuhiro (takano32) -* committer -* He is the maintainer of IA-64 support. -* BigDecimal - -TAKAO, Kouji (kouji) -* committer -* He is the maintainer of readline. - -Nathaniel Talbott (ntalbott) -* committer -* He was the maintainer of test/unit, runit, rubyunit. - -TANAKA, Akira (akr) -* committer -* Did much upon m17n. -* And he is the maintainer of: - * open-uri - * pathname - * pp - * resolv-replace - * resolv - * time - * tsort - -Takaaki Tateishi (ttate) -* committer -* He was the maintainer of dl - -Technorama Ltd. (technoroma) -* committer -* openssl - -Andrew Thompson -* a patch for socket.c IRIX support. - -Dave Thomas (dave) -* committer -* a.k.a. the Pragmatic Programmer. -* He wrote rdoc. - -Tietew -* patches for win32 support - -Masahiro Tomita -* a patch for cgi.rb - -Jakub Travnik -* a patch for eval.c - -Tom Truscott -* missing/crypt.c - -== U - -UEDA, Satoshi -* a patch for uri - -Takaaki Uematsu (uema2) -* committer -* He was the maintainer of WinCE support. 
- -UENO, Katsuhiro (katsu) -* committer -* He is the maintainer of zlib - -Hajimu UMEMOTO -* He wrote ipaddr.rb - -URABE, Shyouhei (shyouhei) -* committer -* a.k.a. mput. -* He is the branch maintainer of ruby_1_8_6 and ruby_1_8_7 -* and is the release manager of 1.8.x-pXXX. - -== V - -Joel VanderWerf -* a patch for numeric.c - -Peter Vanbroekhoven - -Corinna Vinschen - -== W - -wanabe (wanabe) -* committer -* fixed YARV and Oniguruma. - -Chun Wang -* a patch for time.rb - -WATANABE, Hirofumi (eban) -* committer -* He is the maintainer of - * ftools (1.8) - * tmpdir - * un - * Win32API - -WATANABE, Tetsuya -* a patch for ruby.c - -William Webber (wew) -* committer - -Jim Weirich (jim) -* committer -* He wrote Rake. - -Nathan Weizenbaum -* fixed misc/ruby-mode.el. - -why the lukky stiff (why) -* committer -* He is the maintainer of syck - -Caley Woods -* documentation - -Gary Wright -* documentation - -== X - -== Y - -Akira Yamada (akira) -* committer -* He is the maintainer of ruby related packages at Debian project. - -Keita Yamaguchi -* patches for enum.c, parse.y -* documentation - -Hirokazu Yamamoto (ocean) -* committer - -Hirotaka Yoshioka -* a patch for improving SEGV handling - -== Z - -Aristarkh A Zagorodnikov -* a patch for io.c - -Alexander Zavorine -* committer -* He is the maintainer for Symbian OS. - -Chiyuan Zhang -* a patch for misc/ruby-mode.el. - -Dee Zsombor (zunda) -* a patch for thread_pthread.c - -Dan Zwell -* a patch for net/pop - - diff --git a/doc/csv/arguments/io.rdoc b/doc/csv/arguments/io.rdoc new file mode 100644 index 0000000000..f5fe1d1975 --- /dev/null +++ b/doc/csv/arguments/io.rdoc @@ -0,0 +1,5 @@ +* Argument +io+ should be an IO object that is: + * Open for reading; on return, the IO object will be closed. + * Positioned at the beginning. + To position at the end, for appending, use method CSV.generate. + For any other positioning, pass a preset \StringIO object instead. 
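A minimal sketch of the "preset \StringIO" case mentioned above, assuming the object is handed to CSV.new; the sample data and variable names are illustrative and not part of the original docs:

```ruby
require "csv"
require "stringio"

# Illustrative data (an assumption, not from the original docs):
# one preamble line followed by two CSV rows.
io = StringIO.new("preamble to skip\nfoo,0\nbar,1\n")

# Preset the position: consume the first line before handing the
# StringIO to CSV, so that parsing starts at "foo,0".
io.gets

# CSV reads from the current position of the StringIO.
rows = CSV.new(io).read
```

Here +rows+ should hold only the two data rows, because the preamble was consumed before CSV saw the stream.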
diff --git a/doc/csv/options/common/col_sep.rdoc b/doc/csv/options/common/col_sep.rdoc new file mode 100644 index 0000000000..3f23c6d2d3 --- /dev/null +++ b/doc/csv/options/common/col_sep.rdoc @@ -0,0 +1,57 @@ +====== Option +col_sep+ + +Specifies the \String field separator to be used +for both parsing and generating. +The \String will be transcoded into the data's \Encoding before use. + +Default value: + CSV::DEFAULT_OPTIONS.fetch(:col_sep) # => "," (comma) + +Using the default (comma): + str = CSV.generate do |csv| + csv << [:foo, 0] + csv << [:bar, 1] + csv << [:baz, 2] + end + str # => "foo,0\nbar,1\nbaz,2\n" + ary = CSV.parse(str) + ary # => [["foo", "0"], ["bar", "1"], ["baz", "2"]] + +Using +:+ (colon): + col_sep = ':' + str = CSV.generate(col_sep: col_sep) do |csv| + csv << [:foo, 0] + csv << [:bar, 1] + csv << [:baz, 2] + end + str # => "foo:0\nbar:1\nbaz:2\n" + ary = CSV.parse(str, col_sep: col_sep) + ary # => [["foo", "0"], ["bar", "1"], ["baz", "2"]] + +Using +::+ (two colons): + col_sep = '::' + str = CSV.generate(col_sep: col_sep) do |csv| + csv << [:foo, 0] + csv << [:bar, 1] + csv << [:baz, 2] + end + str # => "foo::0\nbar::1\nbaz::2\n" + ary = CSV.parse(str, col_sep: col_sep) + ary # => [["foo", "0"], ["bar", "1"], ["baz", "2"]] + +Using <tt>''</tt> (empty string): + col_sep = '' + str = CSV.generate(col_sep: col_sep) do |csv| + csv << [:foo, 0] + csv << [:bar, 1] + csv << [:baz, 2] + end + str # => "foo0\nbar1\nbaz2\n" + +--- + +Raises an exception if parsing with the empty \String: + col_sep = '' + # Raises ArgumentError (:col_sep must be 1 or more characters: "") + CSV.parse("foo0\nbar1\nbaz2\n", col_sep: col_sep) + diff --git a/doc/csv/options/common/quote_char.rdoc b/doc/csv/options/common/quote_char.rdoc new file mode 100644 index 0000000000..67fd3af68b --- /dev/null +++ b/doc/csv/options/common/quote_char.rdoc @@ -0,0 +1,42 @@ +====== Option +quote_char+ + +Specifies the character (\String of length 1) used to quote fields +in both
parsing and generating. +This String will be transcoded into the data's \Encoding before use. + +Default value: + CSV::DEFAULT_OPTIONS.fetch(:quote_char) # => "\"" (double quote) + +This is useful for an application that incorrectly uses <tt>'</tt> (single-quote) +to quote fields, instead of the correct <tt>"</tt> (double-quote). + +Using the default (double quote): + str = CSV.generate do |csv| + csv << ['foo', 0] + csv << ["'bar'", 1] + csv << ['"baz"', 2] + end + str # => "foo,0\n'bar',1\n\"\"\"baz\"\"\",2\n" + ary = CSV.parse(str) + ary # => [["foo", "0"], ["'bar'", "1"], ["\"baz\"", "2"]] + +Using <tt>'</tt> (single-quote): + quote_char = "'" + str = CSV.generate(quote_char: quote_char) do |csv| + csv << ['foo', 0] + csv << ["'bar'", 1] + csv << ['"baz"', 2] + end + str # => "foo,0\n'''bar''',1\n\"baz\",2\n" + ary = CSV.parse(str, quote_char: quote_char) + ary # => [["foo", "0"], ["'bar'", "1"], ["\"baz\"", "2"]] + +--- + +Raises an exception if the \String length is greater than 1: + # Raises ArgumentError (:quote_char has to be nil or a single character String) + CSV.new('', quote_char: 'xx') + +Raises an exception if the value is not a \String: + # Raises ArgumentError (:quote_char has to be nil or a single character String) + CSV.new('', quote_char: :foo) diff --git a/doc/csv/options/common/row_sep.rdoc b/doc/csv/options/common/row_sep.rdoc new file mode 100644 index 0000000000..eae15b4a84 --- /dev/null +++ b/doc/csv/options/common/row_sep.rdoc @@ -0,0 +1,91 @@ +====== Option +row_sep+ + +Specifies the row separator, a \String or the \Symbol <tt>:auto</tt> (see below), +to be used for both parsing and generating. + +Default value: + CSV::DEFAULT_OPTIONS.fetch(:row_sep) # => :auto + +--- + +When +row_sep+ is a \String, that \String becomes the row separator. +The String will be transcoded into the data's Encoding before use. 
+ +Using <tt>"\n"</tt>: + row_sep = "\n" + str = CSV.generate(row_sep: row_sep) do |csv| + csv << [:foo, 0] + csv << [:bar, 1] + csv << [:baz, 2] + end + str # => "foo,0\nbar,1\nbaz,2\n" + ary = CSV.parse(str) + ary # => [["foo", "0"], ["bar", "1"], ["baz", "2"]] + +Using <tt>|</tt> (pipe): + row_sep = '|' + str = CSV.generate(row_sep: row_sep) do |csv| + csv << [:foo, 0] + csv << [:bar, 1] + csv << [:baz, 2] + end + str # => "foo,0|bar,1|baz,2|" + ary = CSV.parse(str, row_sep: row_sep) + ary # => [["foo", "0"], ["bar", "1"], ["baz", "2"]] + +Using <tt>--</tt> (two hyphens): + row_sep = '--' + str = CSV.generate(row_sep: row_sep) do |csv| + csv << [:foo, 0] + csv << [:bar, 1] + csv << [:baz, 2] + end + str # => "foo,0--bar,1--baz,2--" + ary = CSV.parse(str, row_sep: row_sep) + ary # => [["foo", "0"], ["bar", "1"], ["baz", "2"]] + +Using <tt>''</tt> (empty string): + row_sep = '' + str = CSV.generate(row_sep: row_sep) do |csv| + csv << [:foo, 0] + csv << [:bar, 1] + csv << [:baz, 2] + end + str # => "foo,0bar,1baz,2" + ary = CSV.parse(str, row_sep: row_sep) + ary # => [["foo", "0bar", "1baz", "2"]] + +--- + +When +row_sep+ is the \Symbol +:auto+ (the default), +generating uses <tt>"\n"</tt> as the row separator: + str = CSV.generate do |csv| + csv << [:foo, 0] + csv << [:bar, 1] + csv << [:baz, 2] + end + str # => "foo,0\nbar,1\nbaz,2\n" + +Parsing, on the other hand, invokes auto-discovery of the row separator. + +Auto-discovery reads ahead in the data looking for the next <tt>\r\n</tt>, +\n+, or +\r+ sequence. +The sequence will be selected even if it occurs in a quoted field, +assuming that you would have the same line endings there. 
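As a rough sketch of the auto-discovery just described, assuming no +row_sep+ is passed so the default +:auto+ applies, data written with <tt>\r\n</tt> or bare <tt>\r</tt> endings should parse into the same rows. The sample strings and variable names are illustrative, not taken from the original docs:

```ruby
require "csv"

# Illustrative inputs (an assumption, not from the original docs):
# the same two rows written with "\r\n" and with bare "\r" endings.
crlf_data = "foo,0\r\nbar,1\r\n"
cr_data   = "foo,0\rbar,1\r"

# No row_sep is given, so the default :auto is in effect and the
# separator is discovered by reading ahead in the data.
crlf_rows = CSV.parse(crlf_data)
cr_rows   = CSV.parse(cr_data)
```

Both calls are expected to yield the same two rows, since the first line ending found in each input decides the separator.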
+ +Example: + str = CSV.generate do |csv| + csv << [:foo, 0] + csv << [:bar, 1] + csv << [:baz, 2] + end + str # => "foo,0\nbar,1\nbaz,2\n" + ary = CSV.parse(str) + ary # => [["foo", "0"], ["bar", "1"], ["baz", "2"]] + +The default <tt>$INPUT_RECORD_SEPARATOR</tt> (<tt>$/</tt>) is used +if any of the following is true: +* None of those sequences is found. +* Data is +ARGF+, +STDIN+, +STDOUT+, or +STDERR+. +* The stream is only available for output. + +Obviously, discovery takes a little time. Set manually if speed is important. Also note that IO objects should be opened in binary mode on Windows if this feature will be used as the line-ending translation can cause problems with resetting the document position to where it was before the read ahead. diff --git a/doc/csv/options/generating/force_quotes.rdoc b/doc/csv/options/generating/force_quotes.rdoc new file mode 100644 index 0000000000..11afd1a16c --- /dev/null +++ b/doc/csv/options/generating/force_quotes.rdoc @@ -0,0 +1,17 @@ +====== Option +force_quotes+ + +Specifies the boolean that determines whether each output field is to be double-quoted. + +Default value: + CSV::DEFAULT_OPTIONS.fetch(:force_quotes) # => false + +For examples in this section: + ary = ['foo', 0, nil] + +Using the default, +false+: + str = CSV.generate_line(ary) + str # => "foo,0,\n" + +Using +true+: + str = CSV.generate_line(ary, force_quotes: true) + str # => "\"foo\",\"0\",\"\"\n" diff --git a/doc/csv/options/generating/quote_empty.rdoc b/doc/csv/options/generating/quote_empty.rdoc new file mode 100644 index 0000000000..4c5645c662 --- /dev/null +++ b/doc/csv/options/generating/quote_empty.rdoc @@ -0,0 +1,12 @@ +====== Option +quote_empty+ + +Specifies the boolean that determines whether an empty value is to be double-quoted. 
+ +Default value: + CSV::DEFAULT_OPTIONS.fetch(:quote_empty) # => true + +With the default +true+: + CSV.generate_line(['"', ""]) # => "\"\"\"\",\"\"\n" + +With +false+: + CSV.generate_line(['"', ""], quote_empty: false) # => "\"\"\"\",\n" diff --git a/doc/csv/options/generating/write_converters.rdoc b/doc/csv/options/generating/write_converters.rdoc new file mode 100644 index 0000000000..d1a9cc748f --- /dev/null +++ b/doc/csv/options/generating/write_converters.rdoc @@ -0,0 +1,25 @@ +====== Option +write_converters+ + +Specifies converters to be used in generating fields. +See {Write Converters}[#class-CSV-label-Write+Converters] + +Default value: + CSV::DEFAULT_OPTIONS.fetch(:write_converters) # => nil + +With no write converter: + str = CSV.generate_line(["\na\n", "\tb\t", " c "]) + str # => "\"\na\n\",\tb\t, c \n" + +With a write converter: + strip_converter = proc {|field| field.strip } + str = CSV.generate_line(["\na\n", "\tb\t", " c "], write_converters: strip_converter) + str # => "a,b,c\n" + +With two write converters (called in order): + upcase_converter = proc {|field| field.upcase } + downcase_converter = proc {|field| field.downcase } + write_converters = [upcase_converter, downcase_converter] + str = CSV.generate_line(['a', 'b', 'c'], write_converters: write_converters) + str # => "a,b,c\n" + +See also {Write Converters}[#class-CSV-label-Write+Converters] diff --git a/doc/csv/options/generating/write_empty_value.rdoc b/doc/csv/options/generating/write_empty_value.rdoc new file mode 100644 index 0000000000..67be5662cb --- /dev/null +++ b/doc/csv/options/generating/write_empty_value.rdoc @@ -0,0 +1,15 @@ +====== Option +write_empty_value+ + +Specifies the object that is to be substituted for each field +that has an empty \String. 
+ +Default value: + CSV::DEFAULT_OPTIONS.fetch(:write_empty_value) # => "" + +Without the option: + str = CSV.generate_line(['a', '', 'c', '']) + str # => "a,\"\",c,\"\"\n" + +With the option: + str = CSV.generate_line(['a', '', 'c', ''], write_empty_value: "x") + str # => "a,x,c,x\n" diff --git a/doc/csv/options/generating/write_headers.rdoc b/doc/csv/options/generating/write_headers.rdoc new file mode 100644 index 0000000000..c56aa48adb --- /dev/null +++ b/doc/csv/options/generating/write_headers.rdoc @@ -0,0 +1,29 @@ +====== Option +write_headers+ + +Specifies the boolean that determines whether a header row is included in the output; +ignored if there are no headers. + +Default value: + CSV::DEFAULT_OPTIONS.fetch(:write_headers) # => nil + +Without +write_headers+: + file_path = 't.csv' + CSV.open(file_path,'w', + :headers => ['Name','Value'] + ) do |csv| + csv << ['foo', '0'] + end + CSV.open(file_path) do |csv| + csv.shift + end # => ["foo", "0"] + +With +write_headers+: + CSV.open(file_path,'w', + :write_headers => true, + :headers => ['Name','Value'] + ) do |csv| + csv << ['foo', '0'] + end + CSV.open(file_path) do |csv| + csv.shift + end # => ["Name", "Value"] diff --git a/doc/csv/options/generating/write_nil_value.rdoc b/doc/csv/options/generating/write_nil_value.rdoc new file mode 100644 index 0000000000..65d33ff54e --- /dev/null +++ b/doc/csv/options/generating/write_nil_value.rdoc @@ -0,0 +1,14 @@ +====== Option +write_nil_value+ + +Specifies the object that is to be substituted for each +nil+-valued field.
+ +Default value: + CSV::DEFAULT_OPTIONS.fetch(:write_nil_value) # => nil + +Without the option: + str = CSV.generate_line(['a', nil, 'c', nil]) + str # => "a,,c,\n" + +With the option: + str = CSV.generate_line(['a', nil, 'c', nil], write_nil_value: "x") + str # => "a,x,c,x\n" diff --git a/doc/csv/options/parsing/converters.rdoc b/doc/csv/options/parsing/converters.rdoc new file mode 100644 index 0000000000..211fa48de6 --- /dev/null +++ b/doc/csv/options/parsing/converters.rdoc @@ -0,0 +1,46 @@ +====== Option +converters+ + +Specifies converters to be used in parsing fields. +See {Field Converters}[#class-CSV-label-Field+Converters] + +Default value: + CSV::DEFAULT_OPTIONS.fetch(:converters) # => nil + +The value may be a field converter name +(see {Stored Converters}[#class-CSV-label-Stored+Converters]): + str = '1,2,3' + # Without a converter + array = CSV.parse_line(str) + array # => ["1", "2", "3"] + # With built-in converter :integer + array = CSV.parse_line(str, converters: :integer) + array # => [1, 2, 3] + +The value may be a converter list +(see {Converter Lists}[#class-CSV-label-Converter+Lists]): + str = '1,3.14159' + # Without converters + array = CSV.parse_line(str) + array # => ["1", "3.14159"] + # With built-in converters + array = CSV.parse_line(str, converters: [:integer, :float]) + array # => [1, 3.14159] + +The value may be a \Proc custom converter: +(see {Custom Field Converters}[#class-CSV-label-Custom+Field+Converters]): + str = ' foo , bar , baz ' + # Without a converter + array = CSV.parse_line(str) + array # => [" foo ", " bar ", " baz "] + # With a custom converter + array = CSV.parse_line(str, converters: proc {|field| field.strip }) + array # => ["foo", "bar", "baz"] + +See also {Custom Field Converters}[#class-CSV-label-Custom+Field+Converters] + +--- + +Raises an exception if the converter is not a converter name or a \Proc: + str = 'foo,0' + # Raises NoMethodError (undefined method `arity' for nil:NilClass) + CSV.parse(str, 
converters: :foo) diff --git a/doc/csv/options/parsing/empty_value.rdoc b/doc/csv/options/parsing/empty_value.rdoc new file mode 100644 index 0000000000..7d3bcc078c --- /dev/null +++ b/doc/csv/options/parsing/empty_value.rdoc @@ -0,0 +1,13 @@ +====== Option +empty_value+ + +Specifies the object that is to be substituted +for each field that has an empty \String. + +Default value: + CSV::DEFAULT_OPTIONS.fetch(:empty_value) # => "" (empty string) + +With the default, <tt>""</tt>: + CSV.parse_line('a,"",b,"",c') # => ["a", "", "b", "", "c"] + +With a different object: + CSV.parse_line('a,"",b,"",c', empty_value: 'x') # => ["a", "x", "b", "x", "c"] diff --git a/doc/csv/options/parsing/field_size_limit.rdoc b/doc/csv/options/parsing/field_size_limit.rdoc new file mode 100644 index 0000000000..797c5776fc --- /dev/null +++ b/doc/csv/options/parsing/field_size_limit.rdoc @@ -0,0 +1,39 @@ +====== Option +field_size_limit+ + +Specifies the \Integer field size limit. + +Default value: + CSV::DEFAULT_OPTIONS.fetch(:field_size_limit) # => nil + +This is a maximum size CSV will read ahead looking for the closing quote for a field. +(In truth, it reads to the first line ending beyond this size.) +If a quote cannot be found within the limit CSV will raise a MalformedCSVError, +assuming the data is faulty. +You can use this limit to prevent what are effectively DoS attacks on the parser. +However, this limit can cause a legitimate parse to fail; +therefore the default value is +nil+ (no limit). 
+ +For the examples in this section: + str = <<~EOT + "a","b" + " + 2345 + ","" + EOT + str # => "\"a\",\"b\"\n\"\n2345\n\",\"\"\n" + +Using the default +nil+: + ary = CSV.parse(str) + ary # => [["a", "b"], ["\n2345\n", ""]] + +Using <tt>50</tt>: + field_size_limit = 50 + ary = CSV.parse(str, field_size_limit: field_size_limit) + ary # => [["a", "b"], ["\n2345\n", ""]] + +--- + +Raises an exception if a field is too long: + big_str = "123456789\n" * 1024 + # Raises CSV::MalformedCSVError (Field size exceeded in line 1.) + CSV.parse('valid,fields,"' + big_str + '"', field_size_limit: 2048) diff --git a/doc/csv/options/parsing/header_converters.rdoc b/doc/csv/options/parsing/header_converters.rdoc new file mode 100644 index 0000000000..309180805f --- /dev/null +++ b/doc/csv/options/parsing/header_converters.rdoc @@ -0,0 +1,43 @@ +====== Option +header_converters+ + +Specifies converters to be used in parsing headers. +See {Header Converters}[#class-CSV-label-Header+Converters] + +Default value: + CSV::DEFAULT_OPTIONS.fetch(:header_converters) # => nil + +Identical in functionality to option {converters}[#class-CSV-label-Option+converters] +except that: +- The converters apply only to the header row. +- The built-in header converters are +:downcase+ and +:symbol+. 
+ +This section assumes prior execution of: + str = <<-EOT + Name,Value + foo,0 + bar,1 + baz,2 + EOT + # With no header converter + table = CSV.parse(str, headers: true) + table.headers # => ["Name", "Value"] + +The value may be a header converter name +(see {Stored Converters}[#class-CSV-label-Stored+Converters]): + table = CSV.parse(str, headers: true, header_converters: :downcase) + table.headers # => ["name", "value"] + +The value may be a converter list +(see {Converter Lists}[#class-CSV-label-Converter+Lists]): + header_converters = [:downcase, :symbol] + table = CSV.parse(str, headers: true, header_converters: header_converters) + table.headers # => [:name, :value] + +The value may be a \Proc custom converter +(see {Custom Header Converters}[#class-CSV-label-Custom+Header+Converters]): + upcase_converter = proc {|field| field.upcase } + table = CSV.parse(str, headers: true, header_converters: upcase_converter) + table.headers # => ["NAME", "VALUE"] + +See also {Custom Header Converters}[#class-CSV-label-Custom+Header+Converters] + diff --git a/doc/csv/options/parsing/headers.rdoc b/doc/csv/options/parsing/headers.rdoc new file mode 100644 index 0000000000..0ea151f24b --- /dev/null +++ b/doc/csv/options/parsing/headers.rdoc @@ -0,0 +1,63 @@ +====== Option +headers+ + +Specifies a boolean, \Symbol, \Array, or \String to be used +to define column headers. 
+ +Default value: + CSV::DEFAULT_OPTIONS.fetch(:headers) # => false + +--- + +Without +headers+: + str = <<-EOT + Name,Count + foo,0 + bar,1 + bax,2 + EOT + csv = CSV.new(str) + csv # => #<CSV io_type:StringIO encoding:UTF-8 lineno:0 col_sep:"," row_sep:"\n" quote_char:"\""> + csv.headers # => nil + csv.shift # => ["Name", "Count"] + +--- + +If set to +true+ or the \Symbol +:first_row+, +the first row of the data is treated as a row of headers: + str = <<-EOT + Name,Count + foo,0 + bar,1 + bax,2 + EOT + csv = CSV.new(str, headers: true) + csv # => #<CSV io_type:StringIO encoding:UTF-8 lineno:2 col_sep:"," row_sep:"\n" quote_char:"\"" headers:["Name", "Count"]> + csv.headers # => ["Name", "Count"] + csv.shift # => #<CSV::Row "Name":"bar" "Count":"1"> + +--- + +If set to an \Array, the \Array elements are treated as headers: + str = <<-EOT + foo,0 + bar,1 + bax,2 + EOT + csv = CSV.new(str, headers: ['Name', 'Count']) + csv + csv.headers # => ["Name", "Count"] + csv.shift # => #<CSV::Row "Name":"bar" "Count":"1"> + +--- + +If set to a \String +str+, method <tt>CSV::parse_line(str, options)</tt> is called +with the current +options+, and the returned \Array is treated as headers: + str = <<-EOT + foo,0 + bar,1 + bax,2 + EOT + csv = CSV.new(str, headers: 'Name,Count') + csv + csv.headers # => ["Name", "Count"] + csv.shift # => #<CSV::Row "Name":"bar" "Count":"1"> diff --git a/doc/csv/options/parsing/liberal_parsing.rdoc b/doc/csv/options/parsing/liberal_parsing.rdoc new file mode 100644 index 0000000000..603de28613 --- /dev/null +++ b/doc/csv/options/parsing/liberal_parsing.rdoc @@ -0,0 +1,38 @@ +====== Option +liberal_parsing+ + +Specifies the boolean or hash value that determines whether +CSV will attempt to parse input not conformant with RFC 4180, +such as double quotes in unquoted fields. 
+ +Default value: + CSV::DEFAULT_OPTIONS.fetch(:liberal_parsing) # => false + +For the next two examples: + str = 'is,this "three, or four",fields' + +Without +liberal_parsing+: + # Raises CSV::MalformedCSVError (Illegal quoting in line 1.) + CSV.parse_line(str) + +With +liberal_parsing+: + ary = CSV.parse_line(str, liberal_parsing: true) + ary # => ["is", "this \"three", " or four\"", "fields"] + +Use the +backslash_quote+ sub-option to parse values that use +a backslash to escape a double-quote character. This +causes the parser to treat <code>\"</code> as if it were +<code>""</code>. + +For the next two examples: + str = 'Show,"Harry \"Handcuff\" Houdini, the one and only","Tampa Theater"' + +With +liberal_parsing+, but without the +backslash_quote+ sub-option: + # Incorrect interpretation of backslash; incorrectly interprets the quoted comma as a field separator. + ary = CSV.parse_line(str, liberal_parsing: true) + ary # => ["Show", "\"Harry \\\"Handcuff\\\" Houdini", " the one and only\"", "Tampa Theater"] + puts ary[1] # => "Harry \"Handcuff\" Houdini + +With +liberal_parsing+ and its +backslash_quote+ sub-option: + ary = CSV.parse_line(str, liberal_parsing: { backslash_quote: true }) + ary # => ["Show", "Harry \"Handcuff\" Houdini, the one and only", "Tampa Theater"] + puts ary[1] # => Harry "Handcuff" Houdini, the one and only diff --git a/doc/csv/options/parsing/nil_value.rdoc b/doc/csv/options/parsing/nil_value.rdoc new file mode 100644 index 0000000000..412e8795e8 --- /dev/null +++ b/doc/csv/options/parsing/nil_value.rdoc @@ -0,0 +1,12 @@ +====== Option +nil_value+ + +Specifies the object that is to be substituted for each null (no-text) field.
+ +Default value: + CSV::DEFAULT_OPTIONS.fetch(:nil_value) # => nil + +With the default, +nil+: + CSV.parse_line('a,,b,,c') # => ["a", nil, "b", nil, "c"] + +With a different object: + CSV.parse_line('a,,b,,c', nil_value: 0) # => ["a", 0, "b", 0, "c"] diff --git a/doc/csv/options/parsing/return_headers.rdoc b/doc/csv/options/parsing/return_headers.rdoc new file mode 100644 index 0000000000..45d2e3f3de --- /dev/null +++ b/doc/csv/options/parsing/return_headers.rdoc @@ -0,0 +1,22 @@ +====== Option +return_headers+ + +Specifies the boolean that determines whether method #shift +returns or ignores the header row. + +Default value: + CSV::DEFAULT_OPTIONS.fetch(:return_headers) # => false + +Examples: + str = <<-EOT + Name,Count + foo,0 + bar,1 + bax,2 + EOT + # Without return_headers, the first row returned is data. + csv = CSV.new(str, headers: true) + csv.shift # => #<CSV::Row "Name":"foo" "Count":"0"> + # With return_headers, the first row returned is the headers. + csv = CSV.new(str, headers: true, return_headers: true) + csv.shift # => #<CSV::Row "Name":"Name" "Count":"Count"> + diff --git a/doc/csv/options/parsing/skip_blanks.rdoc b/doc/csv/options/parsing/skip_blanks.rdoc new file mode 100644 index 0000000000..2c8f7b7bb8 --- /dev/null +++ b/doc/csv/options/parsing/skip_blanks.rdoc @@ -0,0 +1,31 @@ +====== Option +skip_blanks+ + +Specifies a boolean that determines whether blank lines in the input will be ignored; +a line that contains a column separator is not considered to be blank. + +Default value: + CSV::DEFAULT_OPTIONS.fetch(:skip_blanks) # => false + +See also option {skip_lines}[#class-CSV-label-Option+skip_lines].
+ +For examples in this section: + str = <<-EOT + foo,0 + + bar,1 + baz,2 + + , + EOT + +Using the default, +false+: + ary = CSV.parse(str) + ary # => [["foo", "0"], [], ["bar", "1"], ["baz", "2"], [], [nil, nil]] + +Using +true+: + ary = CSV.parse(str, skip_blanks: true) + ary # => [["foo", "0"], ["bar", "1"], ["baz", "2"], [nil, nil]] + +Using a truthy value: + ary = CSV.parse(str, skip_blanks: :foo) + ary # => [["foo", "0"], ["bar", "1"], ["baz", "2"], [nil, nil]] diff --git a/doc/csv/options/parsing/skip_lines.rdoc b/doc/csv/options/parsing/skip_lines.rdoc new file mode 100644 index 0000000000..1481c40a5f --- /dev/null +++ b/doc/csv/options/parsing/skip_lines.rdoc @@ -0,0 +1,37 @@ +====== Option +skip_lines+ + +Specifies an object to use in identifying comment lines in the input that are to be ignored: +* If a \Regexp, ignores lines that match it. +* If a \String, converts it to a \Regexp, ignores lines that match it. +* If +nil+, no lines are considered to be comments. + +Default value: + CSV::DEFAULT_OPTIONS.fetch(:skip_lines) # => nil + +For examples in this section: + str = <<-EOT + # Comment + foo,0 + bar,1 + baz,2 + # Another comment + EOT + str # => "# Comment\nfoo,0\nbar,1\nbaz,2\n# Another comment\n" + +Using the default, +nil+: + ary = CSV.parse(str) + ary # => [["# Comment"], ["foo", "0"], ["bar", "1"], ["baz", "2"], ["# Another comment"]] + +Using a \Regexp: + ary = CSV.parse(str, skip_lines: /^#/) + ary # => [["foo", "0"], ["bar", "1"], ["baz", "2"]] + +Using a \String: + ary = CSV.parse(str, skip_lines: '#') + ary # => [["foo", "0"], ["bar", "1"], ["baz", "2"]] + +--- + +Raises an exception if given an object that is not a \Regexp, a \String, or +nil+: + # Raises ArgumentError (:skip_lines has to respond to #match: 0) + CSV.parse(str, skip_lines: 0) diff --git a/doc/csv/options/parsing/strip.rdoc b/doc/csv/options/parsing/strip.rdoc new file mode 100644 index 0000000000..56ae4310c3 --- /dev/null +++ b/doc/csv/options/parsing/strip.rdoc @@ -0,0 
+1,15 @@ +====== Option +strip+ + +Specifies the boolean value that determines whether +whitespace is stripped from each input field. + +Default value: + CSV::DEFAULT_OPTIONS.fetch(:strip) # => false + +With default value +false+: + ary = CSV.parse_line(' a , b ') + ary # => [" a ", " b "] + +With value +true+: + ary = CSV.parse_line(' a , b ', strip: true) + ary # => ["a", "b"] diff --git a/doc/csv/options/parsing/unconverted_fields.rdoc b/doc/csv/options/parsing/unconverted_fields.rdoc new file mode 100644 index 0000000000..3e7f839d49 --- /dev/null +++ b/doc/csv/options/parsing/unconverted_fields.rdoc @@ -0,0 +1,27 @@ +====== Option +unconverted_fields+ + +Specifies the boolean that determines whether unconverted field values are to be available. + +Default value: + CSV::DEFAULT_OPTIONS.fetch(:unconverted_fields) # => nil + +The unconverted field values are those found in the source data, +prior to any conversions performed via option +converters+. + +When option +unconverted_fields+ is +true+, +each returned row (\Array or \CSV::Row) has an added method, ++unconverted_fields+, that returns the unconverted field values: + str = <<-EOT + foo,0 + bar,1 + baz,2 + EOT + # Without unconverted_fields + csv = CSV.parse(str, converters: :integer) + csv # => [["foo", 0], ["bar", 1], ["baz", 2]] + csv.first.respond_to?(:unconverted_fields) # => false + # With unconverted_fields + csv = CSV.parse(str, converters: :integer, unconverted_fields: true) + csv # => [["foo", 0], ["bar", 1], ["baz", 2]] + csv.first.respond_to?(:unconverted_fields) # => true + csv.first.unconverted_fields # => ["foo", "0"] diff --git a/doc/csv/recipes/filtering.rdoc b/doc/csv/recipes/filtering.rdoc new file mode 100644 index 0000000000..1552bf0fb8 --- /dev/null +++ b/doc/csv/recipes/filtering.rdoc @@ -0,0 +1,158 @@ +== Recipes for Filtering \CSV + +These recipes are specific code examples for specific \CSV filtering tasks. + +For other recipes, see {Recipes for CSV}[./recipes_rdoc.html]. 
+ +All code snippets on this page assume that the following has been executed: + require 'csv' + +=== Contents + +- {Source and Output Formats}[#label-Source+and+Output+Formats] + - {Filtering String to String}[#label-Filtering+String+to+String] + - {Recipe: Filter String to String with Headers}[#label-Recipe-3A+Filter+String+to+String+with+Headers] + - {Recipe: Filter String to String Without Headers}[#label-Recipe-3A+Filter+String+to+String+Without+Headers] + - {Filtering String to IO Stream}[#label-Filtering+String+to+IO+Stream] + - {Recipe: Filter String to IO Stream with Headers}[#label-Recipe-3A+Filter+String+to+IO+Stream+with+Headers] + - {Recipe: Filter String to IO Stream Without Headers}[#label-Recipe-3A+Filter+String+to+IO+Stream+Without+Headers] + - {Filtering IO Stream to String}[#label-Filtering+IO+Stream+to+String] + - {Recipe: Filter IO Stream to String with Headers}[#label-Recipe-3A+Filter+IO+Stream+to+String+with+Headers] + - {Recipe: Filter IO Stream to String Without Headers}[#label-Recipe-3A+Filter+IO+Stream+to+String+Without+Headers] + - {Filtering IO Stream to IO Stream}[#label-Filtering+IO+Stream+to+IO+Stream] + - {Recipe: Filter IO Stream to IO Stream with Headers}[#label-Recipe-3A+Filter+IO+Stream+to+IO+Stream+with+Headers] + - {Recipe: Filter IO Stream to IO Stream Without Headers}[#label-Recipe-3A+Filter+IO+Stream+to+IO+Stream+Without+Headers] + +=== Source and Output Formats + +You can use a Unix-style "filter" for \CSV data. +The filter reads source \CSV data and writes output \CSV data as modified by the filter. +The input and output \CSV data may be any mixture of \Strings and \IO streams. + +==== Filtering \String to \String + +You can filter one \String to another, with or without headers. 
+ +===== Recipe: Filter \String to \String with Headers + +Use class method CSV.filter with option +headers+ to filter a \String to another \String: + in_string = "Name,Value\nfoo,0\nbar,1\nbaz,2\n" + out_string = '' + CSV.filter(in_string, out_string, headers: true) do |row| + row[0] = row[0].upcase + row[1] *= 4 + end + out_string # => "Name,Value\nFOO,0000\nBAR,1111\nBAZ,2222\n" + +===== Recipe: Filter \String to \String Without Headers + +Use class method CSV.filter without option +headers+ to filter a \String to another \String: + in_string = "foo,0\nbar,1\nbaz,2\n" + out_string = '' + CSV.filter(in_string, out_string) do |row| + row[0] = row[0].upcase + row[1] *= 4 + end + out_string # => "FOO,0000\nBAR,1111\nBAZ,2222\n" + +==== Filtering \String to \IO Stream + +You can filter a \String to an \IO stream, with or without headers. + +===== Recipe: Filter \String to \IO Stream with Headers + +Use class method CSV.filter with option +headers+ to filter a \String to an \IO stream: + in_string = "Name,Value\nfoo,0\nbar,1\nbaz,2\n" + path = 't.csv' + File.open(path, 'w') do |out_io| + CSV.filter(in_string, out_io, headers: true) do |row| + row[0] = row[0].upcase + row[1] *= 4 + end + end + p File.read(path) # => "Name,Value\nFOO,0000\nBAR,1111\nBAZ,2222\n" + +===== Recipe: Filter \String to \IO Stream Without Headers + +Use class method CSV.filter without option +headers+ to filter a \String to an \IO stream: + in_string = "foo,0\nbar,1\nbaz,2\n" + path = 't.csv' + File.open(path, 'w') do |out_io| + CSV.filter(in_string, out_io) do |row| + row[0] = row[0].upcase + row[1] *= 4 + end + end + p File.read(path) # => "FOO,0000\nBAR,1111\nBAZ,2222\n" + +==== Filtering \IO Stream to \String + +You can filter an \IO stream to a \String, with or without headers. 
+ +===== Recipe: Filter \IO Stream to \String with Headers + +Use class method CSV.filter with option +headers+ to filter an \IO stream to a \String: + in_string = "Name,Value\nfoo,0\nbar,1\nbaz,2\n" + path = 't.csv' + File.write(path, in_string) + out_string = '' + File.open(path) do |in_io| + CSV.filter(in_io, out_string, headers: true) do |row| + row[0] = row[0].upcase + row[1] *= 4 + end + end + out_string # => "Name,Value\nFOO,0000\nBAR,1111\nBAZ,2222\n" + +===== Recipe: Filter \IO Stream to \String Without Headers + +Use class method CSV.filter without option +headers+ to filter an \IO stream to a \String: + in_string = "foo,0\nbar,1\nbaz,2\n" + path = 't.csv' + File.write(path, in_string) + out_string = '' + File.open(path) do |in_io| + CSV.filter(in_io, out_string) do |row| + row[0] = row[0].upcase + row[1] *= 4 + end + end + out_string # => "FOO,0000\nBAR,1111\nBAZ,2222\n" + +==== Filtering \IO Stream to \IO Stream + +You can filter an \IO stream to another \IO stream, with or without headers. 
+ +===== Recipe: Filter \IO Stream to \IO Stream with Headers + +Use class method CSV.filter with option +headers+ to filter an \IO stream to another \IO stream: + in_path = 't.csv' + in_string = "Name,Value\nfoo,0\nbar,1\nbaz,2\n" + File.write(in_path, in_string) + out_path = 'u.csv' + File.open(in_path) do |in_io| + File.open(out_path, 'w') do |out_io| + CSV.filter(in_io, out_io, headers: true) do |row| + row[0] = row[0].upcase + row[1] *= 4 + end + end + end + p File.read(out_path) # => "Name,Value\nFOO,0000\nBAR,1111\nBAZ,2222\n" + +===== Recipe: Filter \IO Stream to \IO Stream Without Headers + +Use class method CSV.filter without option +headers+ to filter an \IO stream to another \IO stream: + in_path = 't.csv' + in_string = "foo,0\nbar,1\nbaz,2\n" + File.write(in_path, in_string) + out_path = 'u.csv' + File.open(in_path) do |in_io| + File.open(out_path, 'w') do |out_io| + CSV.filter(in_io, out_io) do |row| + row[0] = row[0].upcase + row[1] *= 4 + end + end + end + p File.read(out_path) # => "FOO,0000\nBAR,1111\nBAZ,2222\n" diff --git a/doc/csv/recipes/generating.rdoc b/doc/csv/recipes/generating.rdoc new file mode 100644 index 0000000000..e61838d31a --- /dev/null +++ b/doc/csv/recipes/generating.rdoc @@ -0,0 +1,246 @@ +== Recipes for Generating \CSV + +These recipes are specific code examples for specific \CSV generating tasks. + +For other recipes, see {Recipes for CSV}[./recipes_rdoc.html]. 
+ +All code snippets on this page assume that the following has been executed: + require 'csv' + +=== Contents + +- {Output Formats}[#label-Output+Formats] + - {Generating to a String}[#label-Generating+to+a+String] + - {Recipe: Generate to String with Headers}[#label-Recipe-3A+Generate+to+String+with+Headers] + - {Recipe: Generate to String Without Headers}[#label-Recipe-3A+Generate+to+String+Without+Headers] + - {Generating to a File}[#label-Generating+to+a+File] + - {Recipe: Generate to File with Headers}[#label-Recipe-3A+Generate+to+File+with+Headers] + - {Recipe: Generate to File Without Headers}[#label-Recipe-3A+Generate+to+File+Without+Headers] + - {Generating to an IO Stream}[#label-Generating+to+an+IO+Stream] + - {Recipe: Generate to IO Stream with Headers}[#label-Recipe-3A+Generate+to+IO+Stream+with+Headers] + - {Recipe: Generate to IO Stream Without Headers}[#label-Recipe-3A+Generate+to+IO+Stream+Without+Headers] +- {Converting Fields}[#label-Converting+Fields] + - {Recipe: Filter Generated Field Strings}[#label-Recipe-3A+Filter+Generated+Field+Strings] + - {Recipe: Specify Multiple Write Converters}[#label-Recipe-3A+Specify+Multiple+Write+Converters] +- {RFC 4180 Compliance}[#label-RFC+4180+Compliance] + - {Row Separator}[#label-Row+Separator] + - {Recipe: Generate Compliant Row Separator}[#label-Recipe-3A+Generate+Compliant+Row+Separator] + - {Recipe: Generate Non-Compliant Row Separator}[#label-Recipe-3A+Generate+Non-Compliant+Row+Separator] + - {Column Separator}[#label-Column+Separator] + - {Recipe: Generate Compliant Column Separator}[#label-Recipe-3A+Generate+Compliant+Column+Separator] + - {Recipe: Generate Non-Compliant Column Separator}[#label-Recipe-3A+Generate+Non-Compliant+Column+Separator] + - {Quote Character}[#label-Quote+Character] + - {Recipe: Generate Compliant Quote Character}[#label-Recipe-3A+Generate+Compliant+Quote+Character] + - {Recipe: Generate Non-Compliant Quote 
Character}[#label-Recipe-3A+Generate+Non-Compliant+Quote+Character] + +=== Output Formats + +You can generate \CSV output to a \String, to a \File (via its path), or to an \IO stream. + +==== Generating to a \String + +You can generate \CSV output to a \String, with or without headers. + +===== Recipe: Generate to \String with Headers + +Use class method CSV.generate with option +headers+ to generate to a \String. + +This example uses method CSV#<< to append the rows +that are to be generated: + output_string = CSV.generate('', headers: ['Name', 'Value'], write_headers: true) do |csv| + csv << ['Foo', 0] + csv << ['Bar', 1] + csv << ['Baz', 2] + end + output_string # => "Name,Value\nFoo,0\nBar,1\nBaz,2\n" + +===== Recipe: Generate to \String Without Headers + +Use class method CSV.generate without option +headers+ to generate to a \String. + +This example uses method CSV#<< to append the rows +that are to be generated: + output_string = CSV.generate do |csv| + csv << ['Foo', 0] + csv << ['Bar', 1] + csv << ['Baz', 2] + end + output_string # => "Foo,0\nBar,1\nBaz,2\n" + +==== Generating to a \File + +You can generate \CSV data to a \File, with or without headers. + +===== Recipe: Generate to \File with Headers + +Use class method CSV.open with option +headers+ to generate to a \File. + +This example uses method CSV#<< to append the rows +that are to be generated: + path = 't.csv' + CSV.open(path, 'w', headers: ['Name', 'Value'], write_headers: true) do |csv| + csv << ['Foo', 0] + csv << ['Bar', 1] + csv << ['Baz', 2] + end + p File.read(path) # => "Name,Value\nFoo,0\nBar,1\nBaz,2\n" + +===== Recipe: Generate to \File Without Headers + +Use class method CSV.open without option +headers+ to generate to a \File. 
+ +This example uses method CSV#<< to append the rows +that are to be generated: + path = 't.csv' + CSV.open(path, 'w') do |csv| + csv << ['Foo', 0] + csv << ['Bar', 1] + csv << ['Baz', 2] + end + p File.read(path) # => "Foo,0\nBar,1\nBaz,2\n" + +==== Generating to an \IO Stream + +You can generate \CSV data to an \IO stream, with or without headers. + +===== Recipe: Generate to \IO Stream with Headers + +Use class method CSV.new with option +headers+ to generate \CSV data to an \IO stream: + path = 't.csv' + File.open(path, 'w') do |file| + csv = CSV.new(file, headers: ['Name', 'Value'], write_headers: true) + csv << ['Foo', 0] + csv << ['Bar', 1] + csv << ['Baz', 2] + end + p File.read(path) # => "Name,Value\nFoo,0\nBar,1\nBaz,2\n" + +===== Recipe: Generate to \IO Stream Without Headers + +Use class method CSV.new without option +headers+ to generate \CSV data to an \IO stream: + path = 't.csv' + File.open(path, 'w') do |file| + csv = CSV.new(file) + csv << ['Foo', 0] + csv << ['Bar', 1] + csv << ['Baz', 2] + end + p File.read(path) # => "Foo,0\nBar,1\nBaz,2\n" + +=== Converting Fields + +You can use _write_ _converters_ to convert fields when generating \CSV. + +==== Recipe: Filter Generated Field Strings + +Use option <tt>:write_converters</tt> and a custom converter to convert field values when generating \CSV. + +This example defines and uses a custom write converter to strip whitespace from generated fields: + strip_converter = proc {|field| field.respond_to?(:strip) ? field.strip : field } + output_string = CSV.generate(write_converters: strip_converter) do |csv| + csv << [' foo ', 0] + csv << [' bar ', 1] + csv << [' baz ', 2] + end + output_string # => "foo,0\nbar,1\nbaz,2\n" + +==== Recipe: Specify Multiple Write Converters + +Use option <tt>:write_converters</tt> and multiple custom converters +to convert field values when generating \CSV. 
+ +This example defines and uses two custom write converters to strip and upcase generated fields: + strip_converter = proc {|field| field.respond_to?(:strip) ? field.strip : field } + upcase_converter = proc {|field| field.respond_to?(:upcase) ? field.upcase : field } + converters = [strip_converter, upcase_converter] + output_string = CSV.generate(write_converters: converters) do |csv| + csv << [' foo ', 0] + csv << [' bar ', 1] + csv << [' baz ', 2] + end + output_string # => "FOO,0\nBAR,1\nBAZ,2\n" + +=== RFC 4180 Compliance + +By default, \CSV generates data that is compliant with +{RFC 4180}[https://www.rfc-editor.org/rfc/rfc4180] +with respect to: +- Column separator. +- Quote character. + +==== Row Separator + +RFC 4180 specifies the row separator CRLF (Ruby <tt>"\r\n"</tt>). + +===== Recipe: Generate Compliant Row Separator + +For strict compliance, use option +:row_sep+ to specify row separator <tt>"\r\n"</tt>: + output_string = CSV.generate('', row_sep: "\r\n") do |csv| + csv << ['Foo', 0] + csv << ['Bar', 1] + csv << ['Baz', 2] + end + output_string # => "Foo,0\r\nBar,1\r\nBaz,2\r\n" + +===== Recipe: Generate Non-Compliant Row Separator + +For data with non-compliant row separators, use option +:row_sep+ with a different value. +This example uses semicolon (<tt>";"</tt>) as its row separator: + output_string = CSV.generate('', row_sep: ";") do |csv| + csv << ['Foo', 0] + csv << ['Bar', 1] + csv << ['Baz', 2] + end + output_string # => "Foo,0;Bar,1;Baz,2;" + +==== Column Separator + +RFC 4180 specifies column separator COMMA (Ruby <tt>","</tt>). 
+ +===== Recipe: Generate Compliant Column Separator + +Because the \CSV default column separator is <tt>","</tt>, +you need not specify option +:col_sep+ for compliant data: + output_string = CSV.generate('') do |csv| + csv << ['Foo', 0] + csv << ['Bar', 1] + csv << ['Baz', 2] + end + output_string # => "Foo,0\nBar,1\nBaz,2\n" + +===== Recipe: Generate Non-Compliant Column Separator + +For data with non-compliant column separators, use option +:col_sep+. +This example uses TAB (<tt>"\t"</tt>) as its column separator: + output_string = CSV.generate('', col_sep: "\t") do |csv| + csv << ['Foo', 0] + csv << ['Bar', 1] + csv << ['Baz', 2] + end + output_string # => "Foo\t0\nBar\t1\nBaz\t2\n" + +==== Quote Character + +RFC 4180 specifies quote character DQUOTE (Ruby <tt>"\""</tt>). + +===== Recipe: Generate Compliant Quote Character + +Because the \CSV default quote character is <tt>"\""</tt>, +you need not specify option +:quote_char+ for compliant data: + output_string = CSV.generate('', force_quotes: true) do |csv| + csv << ['Foo', 0] + csv << ['Bar', 1] + csv << ['Baz', 2] + end + output_string # => "\"Foo\",\"0\"\n\"Bar\",\"1\"\n\"Baz\",\"2\"\n" + +===== Recipe: Generate Non-Compliant Quote Character + +For data with non-compliant quote characters, use option +:quote_char+. +This example uses SQUOTE (<tt>"'"</tt>) as its quote character: + output_string = CSV.generate('', quote_char: "'", force_quotes: true) do |csv| + csv << ['Foo', 0] + csv << ['Bar', 1] + csv << ['Baz', 2] + end + output_string # => "'Foo','0'\n'Bar','1'\n'Baz','2'\n" diff --git a/doc/csv/recipes/parsing.rdoc b/doc/csv/recipes/parsing.rdoc new file mode 100644 index 0000000000..1b7071e33f --- /dev/null +++ b/doc/csv/recipes/parsing.rdoc @@ -0,0 +1,545 @@ +== Recipes for Parsing \CSV + +These recipes are specific code examples for specific \CSV parsing tasks. + +For other recipes, see {Recipes for CSV}[./recipes_rdoc.html]. 
+ +All code snippets on this page assume that the following has been executed: + require 'csv' + +=== Contents + +- {Source Formats}[#label-Source+Formats] + - {Parsing from a String}[#label-Parsing+from+a+String] + - {Recipe: Parse from String with Headers}[#label-Recipe-3A+Parse+from+String+with+Headers] + - {Recipe: Parse from String Without Headers}[#label-Recipe-3A+Parse+from+String+Without+Headers] + - {Parsing from a File}[#label-Parsing+from+a+File] + - {Recipe: Parse from File with Headers}[#label-Recipe-3A+Parse+from+File+with+Headers] + - {Recipe: Parse from File Without Headers}[#label-Recipe-3A+Parse+from+File+Without+Headers] + - {Parsing from an IO Stream}[#label-Parsing+from+an+IO+Stream] + - {Recipe: Parse from IO Stream with Headers}[#label-Recipe-3A+Parse+from+IO+Stream+with+Headers] + - {Recipe: Parse from IO Stream Without Headers}[#label-Recipe-3A+Parse+from+IO+Stream+Without+Headers] +- {RFC 4180 Compliance}[#label-RFC+4180+Compliance] + - {Row Separator}[#label-Row+Separator] + - {Recipe: Handle Compliant Row Separator}[#label-Recipe-3A+Handle+Compliant+Row+Separator] + - {Recipe: Handle Non-Compliant Row Separator}[#label-Recipe-3A+Handle+Non-Compliant+Row+Separator] + - {Column Separator}[#label-Column+Separator] + - {Recipe: Handle Compliant Column Separator}[#label-Recipe-3A+Handle+Compliant+Column+Separator] + - {Recipe: Handle Non-Compliant Column Separator}[#label-Recipe-3A+Handle+Non-Compliant+Column+Separator] + - {Quote Character}[#label-Quote+Character] + - {Recipe: Handle Compliant Quote Character}[#label-Recipe-3A+Handle+Compliant+Quote+Character] + - {Recipe: Handle Non-Compliant Quote Character}[#label-Recipe-3A+Handle+Non-Compliant+Quote+Character] + - {Recipe: Allow Liberal Parsing}[#label-Recipe-3A+Allow+Liberal+Parsing] +- {Special Handling}[#label-Special+Handling] + - {Special Line Handling}[#label-Special+Line+Handling] + - {Recipe: Ignore Blank Lines}[#label-Recipe-3A+Ignore+Blank+Lines] + - {Recipe: Ignore Selected 
Lines}[#label-Recipe-3A+Ignore+Selected+Lines] + - {Special Field Handling}[#label-Special+Field+Handling] + - {Recipe: Strip Fields}[#label-Recipe-3A+Strip+Fields] + - {Recipe: Handle Null Fields}[#label-Recipe-3A+Handle+Null+Fields] + - {Recipe: Handle Empty Fields}[#label-Recipe-3A+Handle+Empty+Fields] +- {Converting Fields}[#label-Converting+Fields] + - {Converting Fields to Objects}[#label-Converting+Fields+to+Objects] + - {Recipe: Convert Fields to Integers}[#label-Recipe-3A+Convert+Fields+to+Integers] + - {Recipe: Convert Fields to Floats}[#label-Recipe-3A+Convert+Fields+to+Floats] + - {Recipe: Convert Fields to Numerics}[#label-Recipe-3A+Convert+Fields+to+Numerics] + - {Recipe: Convert Fields to Dates}[#label-Recipe-3A+Convert+Fields+to+Dates] + - {Recipe: Convert Fields to DateTimes}[#label-Recipe-3A+Convert+Fields+to+DateTimes] + - {Recipe: Convert Assorted Fields to Objects}[#label-Recipe-3A+Convert+Assorted+Fields+to+Objects] + - {Recipe: Convert Fields to Other Objects}[#label-Recipe-3A+Convert+Fields+to+Other+Objects] + - {Recipe: Filter Field Strings}[#label-Recipe-3A+Filter+Field+Strings] + - {Recipe: Register Field Converters}[#label-Recipe-3A+Register+Field+Converters] + - {Using Multiple Field Converters}[#label-Using+Multiple+Field+Converters] + - {Recipe: Specify Multiple Field Converters in Option :converters}[#label-Recipe-3A+Specify+Multiple+Field+Converters+in+Option+-3Aconverters] + - {Recipe: Specify Multiple Field Converters in a Custom Converter List}[#label-Recipe-3A+Specify+Multiple+Field+Converters+in+a+Custom+Converter+List] +- {Converting Headers}[#label-Converting+Headers] + - {Recipe: Convert Headers to Lowercase}[#label-Recipe-3A+Convert+Headers+to+Lowercase] + - {Recipe: Convert Headers to Symbols}[#label-Recipe-3A+Convert+Headers+to+Symbols] + - {Recipe: Filter Header Strings}[#label-Recipe-3A+Filter+Header+Strings] + - {Recipe: Register Header Converters}[#label-Recipe-3A+Register+Header+Converters] + - {Using Multiple Header 
Converters}[#label-Using+Multiple+Header+Converters] + - {Recipe: Specify Multiple Header Converters in Option :header_converters}[#label-Recipe-3A+Specify+Multiple+Header+Converters+in+Option+-3Aheader_converters] + - {Recipe: Specify Multiple Header Converters in a Custom Header Converter List}[#label-Recipe-3A+Specify+Multiple+Header+Converters+in+a+Custom+Header+Converter+List] +- {Diagnostics}[#label-Diagnostics] + - {Recipe: Capture Unconverted Fields}[#label-Recipe-3A+Capture+Unconverted+Fields] + - {Recipe: Capture Field Info}[#label-Recipe-3A+Capture+Field+Info] + +=== Source Formats + +You can parse \CSV data from a \String, from a \File (via its path), or from an \IO stream. + +==== Parsing from a \String + +You can parse \CSV data from a \String, with or without headers. + +===== Recipe: Parse from \String with Headers + +Use class method CSV.parse with option +headers+ to read a source \String all at once +(may have memory resource implications): + string = "Name,Value\nfoo,0\nbar,1\nbaz,2\n" + CSV.parse(string, headers: true) # => #<CSV::Table mode:col_or_row row_count:4> + +Use instance method CSV#each with option +headers+ to read a source \String one row at a time: + CSV.new(string, headers: true).each do |row| + p row + end +Output: + #<CSV::Row "Name":"foo" "Value":"0"> + #<CSV::Row "Name":"bar" "Value":"1"> + #<CSV::Row "Name":"baz" "Value":"2"> + +===== Recipe: Parse from \String Without Headers + +Use class method CSV.parse without option +headers+ to read a source \String all at once +(may have memory resource implications): + string = "foo,0\nbar,1\nbaz,2\n" + CSV.parse(string) # => [["foo", "0"], ["bar", "1"], ["baz", "2"]] + +Use instance method CSV#each without option +headers+ to read a source \String one row at a time: + CSV.new(string).each do |row| + p row + end +Output: + ["foo", "0"] + ["bar", "1"] + ["baz", "2"] + +==== Parsing from a \File + +You can parse \CSV data from a \File, with or without headers. 
+ +===== Recipe: Parse from \File with Headers + +Use class method CSV.read with option +headers+ to read a file all at once: + string = "Name,Value\nfoo,0\nbar,1\nbaz,2\n" + path = 't.csv' + File.write(path, string) + CSV.read(path, headers: true) # => #<CSV::Table mode:col_or_row row_count:4> + +Use class method CSV.foreach with option +headers+ to read one row at a time: + CSV.foreach(path, headers: true) do |row| + p row + end +Output: + #<CSV::Row "Name":"foo" "Value":"0"> + #<CSV::Row "Name":"bar" "Value":"1"> + #<CSV::Row "Name":"baz" "Value":"2"> + +===== Recipe: Parse from \File Without Headers + +Use class method CSV.read without option +headers+ to read a file all at once: + string = "foo,0\nbar,1\nbaz,2\n" + path = 't.csv' + File.write(path, string) + CSV.read(path) # => [["foo", "0"], ["bar", "1"], ["baz", "2"]] + +Use class method CSV.foreach without option +headers+ to read one row at a time: + CSV.foreach(path) do |row| + p row + end +Output: + ["foo", "0"] + ["bar", "1"] + ["baz", "2"] + +==== Parsing from an \IO Stream + +You can parse \CSV data from an \IO stream, with or without headers. 
+ +===== Recipe: Parse from \IO Stream with Headers + +Use class method CSV.parse with option +headers+ to read an \IO stream all at once: + string = "Name,Value\nfoo,0\nbar,1\nbaz,2\n" + path = 't.csv' + File.write(path, string) + File.open(path) do |file| + CSV.parse(file, headers: true) + end # => #<CSV::Table mode:col_or_row row_count:4> + +Use class method CSV.foreach with option +headers+ to read one row at a time: + File.open(path) do |file| + CSV.foreach(file, headers: true) do |row| + p row + end + end +Output: + #<CSV::Row "Name":"foo" "Value":"0"> + #<CSV::Row "Name":"bar" "Value":"1"> + #<CSV::Row "Name":"baz" "Value":"2"> + +===== Recipe: Parse from \IO Stream Without Headers + +Use class method CSV.parse without option +headers+ to read an \IO stream all at once: + string = "foo,0\nbar,1\nbaz,2\n" + path = 't.csv' + File.write(path, string) + File.open(path) do |file| + CSV.parse(file) + end # => [["foo", "0"], ["bar", "1"], ["baz", "2"]] + +Use class method CSV.foreach without option +headers+ to read one row at a time: + File.open(path) do |file| + CSV.foreach(file) do |row| + p row + end + end +Output: + ["foo", "0"] + ["bar", "1"] + ["baz", "2"] + +=== RFC 4180 Compliance + +By default, \CSV parses data that is compliant with +{RFC 4180}[https://www.rfc-editor.org/rfc/rfc4180] +with respect to: +- Row separator. +- Column separator. +- Quote character. + +==== Row Separator + +RFC 4180 specifies the row separator CRLF (Ruby <tt>"\r\n"</tt>). + +Although the \CSV default row separator is <tt>"\n"</tt>, +the parser also by default handles row separator <tt>"\r"</tt> and the RFC-compliant <tt>"\r\n"</tt>. 
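The default row-separator handling described above can be checked with a small sketch. It assumes only the standard +csv+ library; the +sources+ and +results+ names are illustrative and not part of the original recipes. The same two rows are terminated three different ways and parsed without any +:row_sep+ option:

```ruby
require 'csv'

# The same two rows, terminated with LF, CR, and CRLF respectively.
sources = {
  lf:   "foo,1\nbar,2\n",
  cr:   "foo,1\rbar,2\r",
  crlf: "foo,1\r\nbar,2\r\n",
}

# The default value of :row_sep asks the parser to detect the separator.
default_row_sep = CSV::DEFAULT_OPTIONS.fetch(:row_sep) # => :auto

# No :row_sep is given, so each source is parsed with auto-detection.
results = sources.transform_values {|source| CSV.parse(source) }
results.values.uniq # => [[["foo", "1"], ["bar", "2"]]]
```

All three sources should yield the same rows, which is why an explicit +:row_sep+ is only needed for strict compliance checking or for unusual separators.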
+ +===== Recipe: Handle Compliant Row Separator + +For strict compliance, use option +:row_sep+ to specify row separator <tt>"\r\n"</tt>, +which allows the compliant row separator: + source = "foo,1\r\nbar,1\r\nbaz,2\r\n" + CSV.parse(source, row_sep: "\r\n") # => [["foo", "1"], ["bar", "1"], ["baz", "2"]] +But rejects other row separators: + source = "foo,1\nbar,1\nbaz,2\n" + CSV.parse(source, row_sep: "\r\n") # Raises MalformedCSVError + source = "foo,1\rbar,1\rbaz,2\r" + CSV.parse(source, row_sep: "\r\n") # Raises MalformedCSVError + source = "foo,1\n\rbar,1\n\rbaz,2\n\r" + CSV.parse(source, row_sep: "\r\n") # Raises MalformedCSVError + +===== Recipe: Handle Non-Compliant Row Separator + +For data with non-compliant row separators, use option +:row_sep+. +This example source uses semicolon (<tt>";"</tt>) as its row separator: + source = "foo,1;bar,1;baz,2;" + CSV.parse(source, row_sep: ';') # => [["foo", "1"], ["bar", "1"], ["baz", "2"]] + +==== Column Separator + +RFC 4180 specifies column separator COMMA (Ruby <tt>","</tt>). + +===== Recipe: Handle Compliant Column Separator + +Because the \CSV default column separator is <tt>","</tt>, +you need not specify option +:col_sep+ for compliant data: + source = "foo,1\nbar,1\nbaz,2\n" + CSV.parse(source) # => [["foo", "1"], ["bar", "1"], ["baz", "2"]] + +===== Recipe: Handle Non-Compliant Column Separator + +For data with non-compliant column separators, use option +:col_sep+. +This example source uses TAB (<tt>"\t"</tt>) as its column separator: + source = "foo\t1\nbar\t1\nbaz\t2\n" + CSV.parse(source, col_sep: "\t") # => [["foo", "1"], ["bar", "1"], ["baz", "2"]] + +==== Quote Character + +RFC 4180 specifies quote character DQUOTE (Ruby <tt>"\""</tt>). 
+ +===== Recipe: Handle Compliant Quote Character + +Because the \CSV default quote character is <tt>"\""</tt>, +you need not specify option +:quote_char+ for compliant data: + source = "\"foo\",\"1\"\n\"bar\",\"1\"\n\"baz\",\"2\"\n" + CSV.parse(source) # => [["foo", "1"], ["bar", "1"], ["baz", "2"]] + +===== Recipe: Handle Non-Compliant Quote Character + +For data with non-compliant quote characters, use option +:quote_char+. +This example source uses SQUOTE (<tt>"'"</tt>) as its quote character: + source = "'foo','1'\n'bar','1'\n'baz','2'\n" + CSV.parse(source, quote_char: "'") # => [["foo", "1"], ["bar", "1"], ["baz", "2"]] + +==== Recipe: Allow Liberal Parsing + +Use option +:liberal_parsing+ to specify that \CSV should +attempt to parse input not conformant with RFC 4180, such as double quotes in unquoted fields: + source = 'is,this "three, or four",fields' + CSV.parse(source) # Raises MalformedCSVError + CSV.parse(source, liberal_parsing: true) # => [["is", "this \"three", " or four\"", "fields"]] + +=== Special Handling + +You can use parsing options to specify special handling for certain lines and fields. + +==== Special Line Handling + +Use parsing options to specify special handling for blank lines, or for other selected lines. + +===== Recipe: Ignore Blank Lines + +Use option +:skip_blanks+ to ignore blank lines: + source = <<-EOT + foo,0 + + bar,1 + baz,2 + + , + EOT + parsed = CSV.parse(source, skip_blanks: true) + parsed # => [["foo", "0"], ["bar", "1"], ["baz", "2"], [nil, nil]] + +===== Recipe: Ignore Selected Lines + +Use option +:skip_lines+ to ignore selected lines. + source = <<-EOT + # Comment + foo,0 + bar,1 + baz,2 + # Another comment + EOT + parsed = CSV.parse(source, skip_lines: /^#/) + parsed # => [["foo", "0"], ["bar", "1"], ["baz", "2"]] + +==== Special Field Handling + +Use parsing options to specify special handling for certain field values. 
+ +===== Recipe: Strip Fields + +Use option +:strip+ to strip parsed field values: + CSV.parse_line(' a , b ', strip: true) # => ["a", "b"] + +===== Recipe: Handle Null Fields + +Use option +:nil_value+ to specify a value that will replace each field +that is null (no text): + CSV.parse_line('a,,b,,c', nil_value: 0) # => ["a", 0, "b", 0, "c"] + +===== Recipe: Handle Empty Fields + +Use option +:empty_value+ to specify a value that will replace each field +that is empty (\String of length 0): + CSV.parse_line('a,"",b,"",c', empty_value: 'x') # => ["a", "x", "b", "x", "c"] + +=== Converting Fields + +You can use field converters to change parsed \String fields into other objects, +or to otherwise modify the \String fields. + +==== Converting Fields to Objects + +Use field converters to change parsed \String objects into other, more specific, objects. + +There are built-in field converters for converting to objects of certain classes: +- \Float +- \Integer +- \Date +- \DateTime + +Other built-in field converters include: +- +:numeric+: converts to \Integer and \Float. +- +:all+: converts to \DateTime, \Integer, \Float. + +You can also define field converters to convert to objects of other classes. 
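As a quick illustration of how the built-in converters listed above differ, the following sketch applies three of them to the same line. It assumes only the standard +csv+ library; the variable names are illustrative. A converter that cannot convert a field leaves that field as the original \String:

```ruby
require 'csv'

line = '1,2.5,x'

# :integer converts only fields that look like integers.
with_integer = CSV.parse_line(line, converters: :integer) # => [1, "2.5", "x"]

# :float converts every field that can be read as a Float.
with_float = CSV.parse_line(line, converters: :float)     # => [1.0, 2.5, "x"]

# :numeric tries :integer first, then :float.
with_numeric = CSV.parse_line(line, converters: :numeric) # => [1, 2.5, "x"]
```

Because +:numeric+ tries +:integer+ before +:float+, the field <tt>"1"</tt> stays an \Integer rather than becoming <tt>1.0</tt>.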
+ +===== Recipe: Convert Fields to Integers + +Convert fields to \Integer objects using built-in converter +:integer+: + source = "Name,Value\nfoo,0\nbar,1\nbaz,2\n" + parsed = CSV.parse(source, headers: true, converters: :integer) + parsed.map {|row| row['Value'].class} # => [Integer, Integer, Integer] + +===== Recipe: Convert Fields to Floats + +Convert fields to \Float objects using built-in converter +:float+: + source = "Name,Value\nfoo,0\nbar,1\nbaz,2\n" + parsed = CSV.parse(source, headers: true, converters: :float) + parsed.map {|row| row['Value'].class} # => [Float, Float, Float] + +===== Recipe: Convert Fields to Numerics + +Convert fields to \Integer and \Float objects using built-in converter +:numeric+: + source = "Name,Value\nfoo,0\nbar,1.1\nbaz,2.2\n" + parsed = CSV.parse(source, headers: true, converters: :numeric) + parsed.map {|row| row['Value'].class} # => [Integer, Float, Float] + +===== Recipe: Convert Fields to Dates + +Convert fields to \Date objects using built-in converter +:date+: + source = "Name,Date\nfoo,2001-02-03\nbar,2001-02-04\nbaz,2001-02-03\n" + parsed = CSV.parse(source, headers: true, converters: :date) + parsed.map {|row| row['Date'].class} # => [Date, Date, Date] + +===== Recipe: Convert Fields to DateTimes + +Convert fields to \DateTime objects using built-in converter +:date_time+: + source = "Name,DateTime\nfoo,2001-02-03\nbar,2001-02-04\nbaz,2020-05-07T14:59:00-05:00\n" + parsed = CSV.parse(source, headers: true, converters: :date_time) + parsed.map {|row| row['DateTime'].class} # => [DateTime, DateTime, DateTime] + +===== Recipe: Convert Assorted Fields to Objects + +Convert assorted fields to objects using built-in converter +:all+: + source = "Type,Value\nInteger,0\nFloat,1.0\nDateTime,2001-02-04\n" + parsed = CSV.parse(source, headers: true, converters: :all) + parsed.map {|row| row['Value'].class} # => [Integer, Float, DateTime] + +===== Recipe: Convert Fields to Other Objects + +Define a custom field converter to 
convert \String fields into other objects. +This example defines and uses a custom field converter +that converts each column-1 value to a \Rational object: + rational_converter = proc do |field, field_context| + field_context.index == 1 ? field.to_r : field + end + source = "Name,Value\nfoo,0\nbar,1\nbaz,2\n" + parsed = CSV.parse(source, headers: true, converters: rational_converter) + parsed.map {|row| row['Value'].class} # => [Rational, Rational, Rational] + +==== Recipe: Filter Field Strings + +Define a custom field converter to modify \String fields. +This example defines and uses a custom field converter +that strips whitespace from each field value: + strip_converter = proc {|field| field.strip } + source = "Name,Value\n foo , 0 \n bar , 1 \n baz , 2 \n" + parsed = CSV.parse(source, headers: true, converters: strip_converter) + parsed['Name'] # => ["foo", "bar", "baz"] + parsed['Value'] # => ["0", "1", "2"] + +==== Recipe: Register Field Converters + +Register a custom field converter, assigning it a name; +then refer to the converter by its name: + rational_converter = proc do |field, field_context| + field_context.index == 1 ? field.to_r : field + end + CSV::Converters[:rational] = rational_converter + source = "Name,Value\nfoo,0\nbar,1\nbaz,2\n" + parsed = CSV.parse(source, headers: true, converters: :rational) + parsed['Value'] # => [(0/1), (1/1), (2/1)] + +==== Using Multiple Field Converters + +You can use multiple field converters in either of these ways: +- Specify converters in option +:converters+. +- Specify converters in a custom converter list. 
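When several converters are given, their order can matter. To my understanding, converters are tried in the order given, and once a field has been converted to something other than a \String the remaining converters are skipped for that field. The sketch below, with an assumed sample line, compares the two orderings:

```ruby
require 'csv'

# Assumed sample line with one integer-like field and one float-like field.
line = '0,1.5'

# :integer is tried first; fields it cannot convert fall through to :float.
int_first = CSV.parse_line(line, converters: [:integer, :float])

# :float is tried first and converts both fields, so :integer never applies.
float_first = CSV.parse_line(line, converters: [:float, :integer])

p int_first.map(&:class)
p float_first.map(&:class)
```

Listing the more specific converter first is usually the safer choice.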
+ +===== Recipe: Specify Multiple Field Converters in Option +:converters+ + +Apply multiple field converters by specifying them in option +:converters+: + source = "Name,Value\nfoo,0\nbar,1.0\nbaz,2.0\n" + parsed = CSV.parse(source, headers: true, converters: [:integer, :float]) + parsed['Value'] # => [0, 1.0, 2.0] + +===== Recipe: Specify Multiple Field Converters in a Custom Converter List + +Apply multiple field converters by defining and registering a custom converter list: + strip_converter = proc {|field| field.strip } + CSV::Converters[:strip] = strip_converter + CSV::Converters[:my_converters] = [:integer, :float, :strip] + source = "Name,Value\n foo , 0 \n bar , 1.0 \n baz , 2.0 \n" + parsed = CSV.parse(source, headers: true, converters: :my_converters) + parsed['Name'] # => ["foo", "bar", "baz"] + parsed['Value'] # => [0, 1.0, 2.0] + +=== Converting Headers + +You can use header converters to modify parsed \String headers. + +Built-in header converters include: +- +:symbol+: converts \String header to \Symbol. +- +:downcase+: converts \String header to lowercase. + +You can also define header converters to otherwise modify header \Strings. + +==== Recipe: Convert Headers to Lowercase + +Convert headers to lowercase using built-in converter +:downcase+: + source = "Name,Value\nfoo,0\nbar,1\nbaz,2\n" + parsed = CSV.parse(source, headers: true, header_converters: :downcase) + parsed.headers # => ["name", "value"] + +==== Recipe: Convert Headers to Symbols + +Convert headers to downcased Symbols using built-in converter +:symbol+: + source = "Name,Value\nfoo,0\nbar,1\nbaz,2\n" + parsed = CSV.parse(source, headers: true, header_converters: :symbol) + parsed.headers # => [:name, :value] + parsed.headers.map {|header| header.class} # => [Symbol, Symbol] + +==== Recipe: Filter Header Strings + +Define a custom header converter to modify \String fields. 
+This example defines and uses a custom header converter +that capitalizes each header \String: + capitalize_converter = proc {|header| header.capitalize } + source = "NAME,VALUE\nfoo,0\nbar,1\nbaz,2\n" + parsed = CSV.parse(source, headers: true, header_converters: capitalize_converter) + parsed.headers # => ["Name", "Value"] + +==== Recipe: Register Header Converters + +Register a custom header converter, assigning it a name; +then refer to the converter by its name: + capitalize_converter = proc {|header| header.capitalize } + CSV::HeaderConverters[:capitalize] = capitalize_converter + source = "NAME,VALUE\nfoo,0\nbar,1\nbaz,2\n" + parsed = CSV.parse(source, headers: true, header_converters: :capitalize) + parsed.headers # => ["Name", "Value"] + +==== Using Multiple Header Converters + +You can use multiple header converters in either of these ways: +- Specify header converters in option +:header_converters+. +- Specify header converters in a custom header converter list. + +===== Recipe: Specify Multiple Header Converters in Option :header_converters + +Apply multiple header converters by specifying them in option +:header_converters+: + source = "Name,Value\nfoo,0\nbar,1.0\nbaz,2.0\n" + parsed = CSV.parse(source, headers: true, header_converters: [:downcase, :symbol]) + parsed.headers # => [:name, :value] + +===== Recipe: Specify Multiple Header Converters in a Custom Header Converter List + +Apply multiple header converters by defining and registering a custom header converter list: + CSV::HeaderConverters[:my_header_converters] = [:symbol, :downcase] + source = "NAME,VALUE\nfoo,0\nbar,1.0\nbaz,2.0\n" + parsed = CSV.parse(source, headers: true, header_converters: :my_header_converters) + parsed.headers # => [:name, :value] + +=== Diagnostics + +==== Recipe: Capture Unconverted Fields + +To capture unconverted field values, use option +:unconverted_fields+: + source = "Name,Value\nfoo,0\nbar,1\nbaz,2\n" + parsed = CSV.parse(source, converters: :integer, 
unconverted_fields: true) + parsed # => [["Name", "Value"], ["foo", 0], ["bar", 1], ["baz", 2]] + parsed.each {|row| p row.unconverted_fields } +Output: + ["Name", "Value"] + ["foo", "0"] + ["bar", "1"] + ["baz", "2"] + +==== Recipe: Capture Field Info + +To capture field info in a custom converter, accept two block arguments. +The first is the field value; the second is a +CSV::FieldInfo+ object: + strip_converter = proc {|field, field_info| p field_info; field.strip } + source = " foo , 0 \n bar , 1 \n baz , 2 \n" + parsed = CSV.parse(source, converters: strip_converter) + parsed # => [["foo", "0"], ["bar", "1"], ["baz", "2"]] +Output: + #<struct CSV::FieldInfo index=0, line=1, header=nil> + #<struct CSV::FieldInfo index=1, line=1, header=nil> + #<struct CSV::FieldInfo index=0, line=2, header=nil> + #<struct CSV::FieldInfo index=1, line=2, header=nil> + #<struct CSV::FieldInfo index=0, line=3, header=nil> + #<struct CSV::FieldInfo index=1, line=3, header=nil> diff --git a/doc/csv/recipes/recipes.rdoc b/doc/csv/recipes/recipes.rdoc new file mode 100644 index 0000000000..9bf7885b1e --- /dev/null +++ b/doc/csv/recipes/recipes.rdoc @@ -0,0 +1,6 @@ +== Recipes for \CSV + +The recipes are specific code examples for specific tasks. See: +- {Recipes for Parsing CSV}[./parsing_rdoc.html] +- {Recipes for Generating CSV}[./generating_rdoc.html] +- {Recipes for Filtering CSV}[./filtering_rdoc.html] diff --git a/doc/distribution/distribution.md b/doc/distribution/distribution.md new file mode 100644 index 0000000000..164e1b7109 --- /dev/null +++ b/doc/distribution/distribution.md @@ -0,0 +1,48 @@ +# Distribution + +This document outlines the expected way to distribute Ruby, with a specific focus on building Ruby packages. + +## Getting the Ruby Tarball + +### Official Releases + +The tarball for official releases is created by the release manager. The release manager uploads the tarball to the [Ruby website](https://www.ruby-lang.org/en/downloads/). 
+
+Downstream distributors should use the official release tarballs as part of their build process. This ensures that the tarball is created in a consistent way, and that the tarball is cryptographically verified.
+
+### Using the nightly tarball for testing
+
+See the Snapshots section of the [Ruby website](https://www.ruby-lang.org/en/downloads/).
+
+### Building a manual snapshot tarball for testing
+
+This can be useful if the nightly tarball does not have all changes yet.
+
+In a Ruby source tree cloned using git:
+
+```console
+$ ./autogen.sh
+$ ./configure -C
+$ make
+$ make dist
+```
+
+This will create several tarballs in the `tmp` directory. Each tarball will be named e.g. `ruby-<version>.tar.gz` (several different compression formats will be generated).
+
+## Building the Tarball
+
+See [Building Ruby](contributing/building_ruby.md).
+
+## Updating the Ruby Standard Library
+
+The Ruby standard library is a collection of Ruby files that are included with Ruby. These files are used to provide the basic functionality of Ruby. The standard library is located in the `lib` directory and is distributed as part of the Ruby tarball.
+
+Occasionally, the standard library needs to be updated, for example when a security issue is found in a default gem or standard gem. There are two main ways that Ruby would update this code.
+
+### Releasing an Updated Ruby Gem
+
+Normally, the Ruby gem maintainer will release an updated gem. This gem can be installed alongside the default gem. This allows the user to update the gem without having to update Ruby.
+
+### Releasing a New Ruby Version
+
+If the update is critical, then the Ruby maintainers may decide to release a new version of Ruby. This new version will include the updated standard library.
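As a rough illustration of the verification mentioned under Official Releases, a packaging script might compare a downloaded tarball against a published digest. This is only a sketch: the helper name `tarball_matches?` is hypothetical, and it assumes the expected SHA-256 value was obtained from a trusted source.

```ruby
require 'digest'

# Hypothetical helper: true when the file's SHA-256 digest equals the expected hex string.
def tarball_matches?(path, expected_sha256)
  Digest::SHA256.file(path).hexdigest == expected_sha256.downcase
end
```

A build script could call this helper with the tarball path and the published digest, and abort when it returns `false`.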
diff --git a/doc/distribution/windows.md b/doc/distribution/windows.md
new file mode 100644
index 0000000000..26a727d7ad
--- /dev/null
+++ b/doc/distribution/windows.md
@@ -0,0 +1,304 @@
+# Windows
+
+Ruby supports a few native build platforms for Windows.
+
+* mswin: Build using Microsoft Visual C++ compiler with vcruntimeXXX.dll
+* mingw-msvcrt: Build using compiler for Mingw with msvcrtXX.dll
+* mingw-ucrt: Build using compiler for Mingw with Windows Universal CRT
+
+## Building Ruby using Mingw with UCRT
+
+The easiest build environment is just a standard [RubyInstaller-Devkit]
+installation and [git-for-windows]. You might like to use [VSCode] as an
+editor.
+
+### Build examples
+
+Ruby core development can be done either in Windows `cmd` like:
+
+```batch
+ridk install
+ridk enable ucrt64
+
+pacman -S --needed %MINGW_PACKAGE_PREFIX%-openssl %MINGW_PACKAGE_PREFIX%-libyaml %MINGW_PACKAGE_PREFIX%-libffi
+
+mkdir c:\work\ruby
+cd /d c:\work\ruby
+
+git clone https://github.com/ruby/ruby src
+
+sh ./src/autogen.sh
+
+mkdir build
+cd build
+sh ../src/configure -C --disable-install-doc
+make
+```
+
+or in MSYS2 `bash` like:
+
+```bash
+ridk install
+ridk enable ucrt64
+bash
+
+pacman -S --needed $MINGW_PACKAGE_PREFIX-openssl $MINGW_PACKAGE_PREFIX-libyaml $MINGW_PACKAGE_PREFIX-libffi
+
+mkdir /c/work/ruby
+cd /c/work/ruby
+
+git clone https://github.com/ruby/ruby src
+
+./src/autogen.sh
+mkdir build && cd build
+../src/configure -C --disable-install-doc
+make
+```
+
+If you have another MSYS2 environment installed via another package manager such as `scoop`, you need to set `$MINGW_PACKAGE_PREFIX` to `mingw-w64-ucrt-x86_64`.
+And you need to add `--with-opt-dir` option to `configure` command like: + +```batch +sh ../../ruby/configure -C --disable-install-doc --with-opt-dir=C:\Users\username\scoop\apps\msys2\current\ucrt64 +``` + +[RubyInstaller-Devkit]: https://rubyinstaller.org/ +[git-for-windows]: https://gitforwindows.org/ +[VSCode]: https://code.visualstudio.com/ + +## Building Ruby using Visual C++ + +### Requirement + +1. Windows 10/Windows Server 2016 or later. + +2. Visual C++ 14.0 (2015) or later. + + **Note** if you want to build x64 version, use native compiler for + x64. + + The minimum requirement is here: + * VC++/MSVC on VS 2017/2019/2022 version build tools. + * Windows 10/11 SDK + + You can install Visual Studio Build Tools with `winget`. + `win32\install-buildtools.cmd` is a batch file to install the + minimum requirements excluding the IDE etc. + +3. Please set environment variable `INCLUDE`, `LIB`, `PATH` to run + required commands properly from the command line. These are set + properly by `vsdevcmd.bat` or `vcvarall*.bat` usually. You can run + the following command to set them in your command line. + + To native build: + + ``` + cmd /k win32\vssetup.cmd + ``` + + To cross build arm64 binary: + + ``` + cmd /k win32\vssetup.cmd -arch=arm64 + ``` + + To cross build x64 binary: + + ``` + cmd /k win32\vssetup.cmd -arch=x64 + ``` + + This batch file is a wrapper of `vsdevcmd.bat` and options are + passed to it as-is. `win32\vssetup.cmd -help` for other command + line options. + + **Note** building ruby requires following commands. + + * `nmake` + * `cl` + * `ml` + * `lib` + * `dumpbin` + +4. If you want to build from GIT source, following commands are required. + * `git` + * `ruby` 3.1 or later + + You can use [scoop](https://scoop.sh/) to install them like: + + ```batch + scoop install git ruby + ``` + + The windows version of `git` configured with `autocrlf` is `true`. The Ruby + test suite may fail with `autocrlf` set to `true`. 
You can set it to `false` + like: + + ```batch + git config --global core.autocrlf false + ``` + +5. You need to install required libraries using [vcpkg](https://vcpkg.io/) on + directory of ruby repository like: + + ```batch + vcpkg --triplet x64-windows install + ``` + +6. Enable Command Extension of your command line. It's the default behavior + of `cmd.exe`. If you want to enable it explicitly, run `cmd.exe` with + `/E:ON` option. + +### How to compile and install + +1. Execute `win32\configure.bat` on your build directory. + You can specify the target platform as an argument. + For example, run `configure --target=i686-mswin32`. + You can also specify the install directory. + For example, run `configure --prefix=<install_directory>`. + Default of the install directory is `/usr` . + +2. If you want to append to the executable and DLL file names, + specify `--program-prefix` and `--program-suffix`, like + `win32\configure.bat --program-suffix=-$(MAJOR)$(MINOR)`. + + Also, the `--install-name` and `--so-name` options specify the + exact base names of the executable and DLL files, respectively, + like `win32\configure.bat --install-name=$(RUBY_BASE_NAME)-$(MAJOR)$(MINOR)`. + + By default, the name for the executable without a console window + is generated from the _RUBY_INSTALL_NAME_ specified as above by + replacing `ruby` with `rubyw`. If you want to make it different + more, modify _RUBYW_INSTALL_NAME_ directly in the Makefile. + +3. You need specify vcpkg directory to use `--with-opt-dir` + option like `win32\configure.bat --with-opt-dir=C:/vcpkg_installed/x64-windows` + +4. Run `nmake up` if you are building from GIT source. + +5. Run `nmake` + +6. Run `nmake prepare-vcpkg` with administrator privilege if you need to + copy vcpkg installed libraries like `libssl-3-x64.dll` to the build directory. + +7. Run `nmake check` + +8. Run `nmake install` + +### Build examples + +* Build on the ruby source directory. 
+ + ``` + ruby source directory: C:\ruby + build directory: C:\ruby + install directory: C:\usr\local + ``` + + ```batch + C: + cd \ruby + win32\configure --prefix=/usr/local + nmake + nmake check + nmake install + ``` + +* Build on the relative directory from the ruby source directory. + + ``` + ruby source directory: C:\ruby + build directory: C:\ruby\mswin32 + install directory: C:\usr\local + ``` + + ```batch + C: + cd \ruby + mkdir mswin32 + cd mswin32 + ..\win32\configure --prefix=/usr/local + nmake + nmake check + nmake install + ``` + +* Build on the different drive. + + ``` + ruby source directory: C:\src\ruby + build directory: D:\build\ruby + install directory: C:\usr\local + ``` + + ```batch + D: + cd D:\build\ruby + C:\src\ruby\win32\configure --prefix=/usr/local + nmake + nmake check + nmake install DESTDIR=C: + ``` + +* Build x64 version (requires native x64 VC++ compiler) + + ``` + ruby source directory: C:\ruby + build directory: C:\ruby + install directory: C:\usr\local + ``` + + ```batch + C: + cd \ruby + win32\configure --prefix=/usr/local --target=x64-mswin64 + nmake + nmake check + nmake install + ``` + +### Bugs + +You can **NOT** use a path name that contains any white space characters +as the ruby source directory, this restriction comes from the behavior +of `!INCLUDE` directives of `NMAKE`. + +You can build ruby in any directory including the source directory, +except `win32` directory in the source directory. +This is restriction originating in the path search method of `NMAKE`. + +### Dependency management + +Ruby uses [vcpkg](https://vcpkg.io/) to manage dependencies on mswin platform. 
+
+You can update and install it under the build directory like:
+
+```batch
+nmake update-vcpkg # Update baseline version of vcpkg
+nmake install-vcpkg # Install vcpkg from build directory
+```
+
+
+## Icons
+
+Any icon files (`*.ico`) in the build directory, directories specified with
+_icondirs_ make variable and `win32` directory under the ruby
+source directory will be included in DLL or executable files, according
+to their base names.
+
+    $(RUBY_INSTALL_NAME).ico or ruby.ico --> $(RUBY_INSTALL_NAME).exe
+    $(RUBYW_INSTALL_NAME).ico or rubyw.ico --> $(RUBYW_INSTALL_NAME).exe
+    the others --> $(RUBY_SO_NAME).dll
+
+Although no icons are distributed with the ruby source, you can use
+anything you like. You will be able to find many images by search engines.
+For example, the following are made from the [Ruby logo kit]:
+
+* Small [favicon] in the official site
+
+* [vit-ruby.ico] or [icon itself]
+
+[Ruby logo kit]: https://cache.ruby-lang.org/pub/misc/logo/ruby-logo-kit.zip
+[favicon]: https://www.ruby-lang.org/favicon.ico
+[vit-ruby.ico]: http://ruby.morphball.net/vit-ruby-ico_en.html
+[icon itself]: http://ruby.morphball.net/icon/vit-ruby.ico
diff --git a/doc/etc.rd.ja b/doc/etc.rd.ja
deleted file mode 100644
index 5d64594fc3..0000000000
--- a/doc/etc.rd.ja
+++ /dev/null
@@ -1,75 +0,0 @@
-# etc.rd.ja - -*- mode: rd; coding: utf-8; -*- created at: Fri Jul 14 00:47:15 JST 1995
-=begin
-
-= Etc (module)
-
-A module for obtaining information from the running OS. It can also be
-used by including it in a class.
-
-== Module Function
-
---- getlogin
-
-    Returns your own login name. If this fails, it is a good idea to
-    use getpwuid().
-
---- getpwnam(name)
-
-    Searches the /etc/passwd file (or a DBM file or NIS database) and
-    returns the passwd entry whose name is name. The return value is a
-    passwd structure with the following members.
-
-      struct passwd
-        name # user name (string)
-        passwd # password (string)
-        uid # user ID (integer)
-        gid # group ID (integer)
-        gecos # gecos field (string)
-        dir # home directory (string)
-        shell # login shell (string)
-        # The members below are not provided on some systems.
-        change # password change time (integer)
-        quota # quota (integer)
-        age # age (integer)
-        uclass # user access class (string)
-        comment # comment (string)
-        expire # account expiration time (integer)
-      end
-
-    See getpwnam(3) for details.
-
---- getpwuid([uid])
-
-    Returns the passwd entry whose user ID is uid. The return value is the
-    same as for getpwnam(). If the argument is omitted, the value of getuid()
-    is used. See getpwuid(3) for details.
-
---- getgrgid(gid)
-
-    Searches the /etc/group file (or ... see getpwnam) and returns the
-    group entry whose group ID is gid. The return value is a group
-    structure with the following members.
-
-      struct group
-        name # group name (string)
-        passwd # group password (string)
-        gid # group ID (integer)
-        mem # array of group member names
-      end
-
-    See getgrgid(3) for details.
-
---- getgrnam(name)
-
-    Returns the group entry named name. The return value is the same as
-    for getgrgid(). See getgrnam(3) for details.
-
---- group
-
-    An iterator for accessing all group entries in order.
-
---- passwd
-
-    An iterator for accessing all passwd entries in order.
-
-=end
diff --git a/doc/examples/files.rdoc b/doc/examples/files.rdoc
new file mode 100644
index 0000000000..cb400c81be
--- /dev/null
+++ b/doc/examples/files.rdoc
@@ -0,0 +1,26 @@
+# English text with newlines.
+text = <<~EOT
+  First line
+  Second line
+
+  Fourth line
+  Fifth line
+EOT
+
+# Japanese text.
+japanese = 'こんにちは'
+
+# Binary data.
+data = "\u9990\u9991\u9992\u9993\u9994"
+
+# Text file.
+File.write('t.txt', text)
+
+# File with Japanese text.
+File.write('t.ja', japanese)
+
+# File with binary data.
+f = File.new('t.dat', 'wb:UTF-16') +f.write(data) +f.close + diff --git a/doc/extension.ja.rdoc b/doc/extension.ja.rdoc index d83be10729..381b94a230 100644 --- a/doc/extension.ja.rdoc +++ b/doc/extension.ja.rdoc @@ -1,5 +1,7 @@ # extension.ja.rdoc - -*- RDoc -*- created at: Mon Aug 7 16:45:54 JST 1995 +{English}[rdoc-ref:extension.rdoc] + = Rubyã®æ‹¡å¼µãƒ©ã‚¤ãƒ–ラリã®ä½œã‚Šæ–¹ Rubyã®æ‹¡å¼µãƒ©ã‚¤ãƒ–ラリã®ä½œã‚Šæ–¹ã‚’説明ã—ã¾ã™ï¼Ž @@ -215,17 +217,6 @@ rb_str_new_literal(const char *ptr) :: Cã®ãƒªãƒ†ãƒ©ãƒ«æ–‡å—列ã‹ã‚‰Rubyã®æ–‡å—列を生æˆã™ã‚‹ï¼Ž -rb_tainted_str_new(const char *ptr, long len) :: - - 汚染マークãŒä»˜åŠ ã•ã‚ŒãŸæ–°ã—ã„Rubyã®æ–‡å—列を生æˆã™ã‚‹ï¼Žå¤–部 - ã‹ã‚‰ã®ãƒ‡ãƒ¼ã‚¿ã«åŸºã¥ãæ–‡å—列ã«ã¯æ±šæŸ“マークãŒä»˜åŠ ã•れるã¹ã - ã§ã‚る. - -rb_tainted_str_new2(const char *ptr) :: -rb_tainted_str_new_cstr(const char *ptr) :: - - Cã®æ–‡å—列ã‹ã‚‰æ±šæŸ“マークãŒä»˜åŠ ã•れãŸRubyã®æ–‡å—列を生æˆã™ã‚‹ï¼Ž - rb_str_append(VALUE str1, VALUE str2) :: Rubyã®æ–‡å—列str1ã«Rubyã®æ–‡å—列str2ã‚’è¿½åŠ ã™ã‚‹ï¼Ž @@ -628,12 +619,14 @@ C言語ã¨Rubyã®é–“ã§æƒ…å ±ã‚’å…±æœ‰ã™ã‚‹æ–¹æ³•ã«ã¤ã„ã¦è§£èª¬ã—ã¾ã™ï¼Ž Qtrue :: Qfalse :: - 真å½å€¤ï¼ŽQfalseã¯C言語ã§ã‚‚å½ã¨ã¿ãªã•れã¾ã™(ã¤ã¾ã‚Š0). + 真å½å€¤ï¼ŽC言語ã‹ã‚‰è¦‹ãŸã€Œtrueã€ã¨ã€Œfalseã€ï¼Ž Qnil :: C言語ã‹ã‚‰è¦‹ãŸã€Œnilã€ï¼Ž +RTEST(obj)ã¨ã„ã†ãƒžã‚¯ãƒã¯objãŒQfalseã‹Qnilã®ã¨ã0ã‚’è¿”ã—ã¾ã™ï¼Ž + === Cã¨Rubyã§å…±æœ‰ã•れる大域変数 Cã¨Rubyã§å¤§åŸŸå¤‰æ•°ã‚’使ã£ã¦æƒ…å ±ã‚’å…±æœ‰ã§ãã¾ã™ï¼Žå…±æœ‰ã§ãる大域 @@ -708,30 +701,28 @@ Cã®ä¸–界ã§å®šç¾©ã•れãŸãƒ‡ãƒ¼ã‚¿(æ§‹é€ ä½“)ã‚’Rubyã®ã‚ªãƒ–ジェクトã¨ã ã“ã®ãƒžã‚¯ãƒã®æˆ»ã‚Šå€¤ã¯ç”Ÿæˆã•れãŸã‚ªãƒ–ジェクトを表ã™VALUE値ã§ã™ï¼Ž -klassã¯ã“ã®ã‚ªãƒ–ジェクトã®ã‚¯ãƒ©ã‚¹ã§ã™ï¼Ždata_typeã¯ã“ã®æ§‹é€ 体を -RubyãŒç®¡ç†ã™ã‚‹ãŸã‚ã®æƒ…å ±ã‚’è¨˜è¿°ã—ãŸconst rb_data_type_tåž‹ã¸ã® -ãƒã‚¤ãƒ³ã‚¿ã§ã™ï¼Ž +klassã¯ã“ã®ã‚ªãƒ–ジェクトã®ã‚¯ãƒ©ã‚¹ã§ã™ï¼Žklassã¯, Objectクラス㋠+ら派生ã—, å¿…ãšrb_define_alloc_funcã‹rb_undef_alloc_funcを呼 +ã³å‡ºã—ã¦allocatorã‚’è¨å®šã—ã¦ãã ã•ã„. 
-ãªãŠ, klassã¯, Objectã‚„ä»–ã®ã‚¯ãƒ©ã‚¹ã§ã¯ãªãData (rb_cData)ã¨ã„ -ã†ç‰¹åˆ¥ãªã‚¯ãƒ©ã‚¹ã‹ã‚‰æ´¾ç”Ÿã™ã‚‹ã“ã¨ãŒæŽ¨å¥¨ã•れã¾ã™ï¼Ž -Dataã‹ã‚‰æ´¾ç”Ÿã—ãªã„å ´åˆã«ã¯, å¿…ãšrb_undef_alloc_func(klass) -を呼ã³å‡ºã—ã¦ãã ã•ã„. +data_typeã¯ã“ã®æ§‹é€ 体をRubyãŒç®¡ç†ã™ã‚‹ãŸã‚ã®æƒ…å ±ã‚’è¨˜è¿°ã—㟠+const rb_data_type_tåž‹ã¸ã®ãƒã‚¤ãƒ³ã‚¿ã§ã™ï¼Ž rb_data_type_tã¯æ¬¡ã®ã‚ˆã†ã«å®šç¾©ã•れã¦ã„ã¾ã™ï¼Ž typedef struct rb_data_type_struct rb_data_type_t; struct rb_data_type_struct { - const char *wrap_struct_name; - struct { - void (*dmark)(void*); - void (*dfree)(void*); - size_t (*dsize)(const void *); - void *reserved[2]; - } function; - const rb_data_type_t *parent; - void *data; - VALUE flags; + const char *wrap_struct_name; + struct { + void (*dmark)(void*); + void (*dfree)(void*); + size_t (*dsize)(const void *); + void *reserved[2]; + } function; + const rb_data_type_t *parent; + void *data; + VALUE flags; }; wrap_struct_nameã¯ã“ã®æ§‹é€ 体をè˜åˆ¥ã™ã‚‹åå‰ã§ã™ï¼Žä¸»ã«çµ±è¨ˆæƒ…å ± @@ -754,8 +745,8 @@ dmarkã¯ã‚¬ãƒ¼ãƒ™ãƒ¼ã‚¸ã‚³ãƒ¬ã‚¯ã‚¿ãŒã‚ªãƒ–ジェクトã¸ã®å‚照をマーク ++ dfreeã¯ã“ã®æ§‹é€ 体ãŒã‚‚ã†ä¸è¦ã«ãªã£ãŸæ™‚ã«å‘¼ã°ã‚Œã‚‹é–¢æ•°ã§ã™ï¼Žã“ -ã®é–¢æ•°ãŒã‚¬ãƒ¼ãƒ™ãƒ¼ã‚¸ã‚³ãƒ¬ã‚¯ã‚¿ã‹ã‚‰å‘¼ã°ã‚Œã¾ã™ï¼Žã“れãŒ-1ã®å ´åˆã¯ï¼Œ -å˜ç´”ã«æ§‹é€ 体ãŒè§£æ”¾ã•れã¾ã™ï¼Ž +ã®é–¢æ•°ãŒã‚¬ãƒ¼ãƒ™ãƒ¼ã‚¸ã‚³ãƒ¬ã‚¯ã‚¿ã‹ã‚‰å‘¼ã°ã‚Œã¾ã™ï¼Žã“れ㌠+RUBY_DEFAULT_FREEã®å ´åˆã¯ï¼Œå˜ç´”ã«æ§‹é€ 体ãŒè§£æ”¾ã•れã¾ã™ï¼Ž dsizeã¯æ§‹é€ ä½“ãŒæ¶ˆè²»ã—ã¦ã„るメモリã®ãƒã‚¤ãƒˆæ•°ã‚’è¿”ã™é–¢æ•°ã§ã™ï¼Ž 引数ã¨ã—ã¦æ§‹é€ 体ã¸ã®ãƒã‚¤ãƒ³ã‚¿ãŒæ¸¡ã•れã¾ã™ï¼Žå®Ÿè£…困難ã§ã‚れã°0 @@ -791,9 +782,14 @@ RUBY_TYPED_WB_PROTECTED :: メソッドã®å®Ÿè£…ã«é©åˆ‡ã«ãƒ©ã‚¤ãƒˆãƒãƒªã‚¢ã‚’挿入ã™ã‚‹è²¬ä»»ãŒã‚りã¾ã™ï¼Ž ã•ã‚‚ãªãã°Rubyã¯å®Ÿè¡Œæ™‚ã«ã‚¯ãƒ©ãƒƒã‚·ãƒ¥ã™ã‚‹å¯èƒ½æ€§ãŒã‚りã¾ã™ï¼Ž - ライトãƒãƒªã‚¢ã«ã¤ã„ã¦ã¯doc/extension.ja.rdocã®Appendix D - "世代別GC"ã‚‚å‚ç…§ã—ã¦ãã ã•ã„. + ライトãƒãƒªã‚¢ã«ã¤ã„ã¦ã¯{世代別 + GC}[rdoc-ref:@Appendix+D.+-E4-B8-96-E4-BB-A3-E5-88-A5GC] + ã‚‚å‚ç…§ã—ã¦ãã ã•ã„. +ã“ã®ãƒžã‚¯ãƒã¯ä¾‹å¤–を発生ã•ã›ã‚‹å¯èƒ½æ€§ãŒã‚ã‚‹ã“ã¨ã«æ³¨æ„ã—ã¦ãã ã• +ã„. 
ラップã•れる sval ãŒï¼Œè§£æ”¾ã™ã‚‹å¿…è¦ãŒã‚るリソース (割り +当ã¦ã‚‰ã‚ŒãŸãƒ¡ãƒ¢ãƒªï¼Œå¤–部ライブラリã‹ã‚‰ã®ãƒãƒ³ãƒ‰ãƒ«ãªã©) ã‚’ä¿æŒã— +ã¦ã„ã‚‹å ´åˆã¯ï¼Œrb_protect を使用ã™ã‚‹å¿…è¦ãŒã‚りã¾ã™ï¼Ž Cã®æ§‹é€ 体ã®å‰²å½“ã¨å¯¾å¿œã™ã‚‹ã‚ªãƒ–ジェクトã®ç”Ÿæˆã‚’åŒæ™‚ã«è¡Œã†ãƒžã‚¯ ãƒã¨ã—ã¦ä»¥ä¸‹ã®ã‚‚ã®ãŒæä¾›ã•れã¦ã„ã¾ã™ï¼Ž @@ -899,12 +895,12 @@ dbm.cã§ã¯TypedData_Make_Structを以下ã®ã‚ˆã†ã«ä½¿ã£ã¦ã„ã¾ã™ï¼Ž obj = TypedData_Make_Struct(klass, struct dbmdata, &dbm_type, dbmp); -ã“ã“ã§ã¯dbmdataæ§‹é€ ä½“ã¸ã®ãƒã‚¤ãƒ³ã‚¿ã‚’Dataã«ã‚«ãƒ—セル化ã—ã¦ã„ -ã¾ã™ï¼ŽDBM*を直接カプセル化ã—ãªã„ã®ã¯close()ã—ãŸæ™‚ã®å‡¦ç†ã‚’考 -ãˆã¦ã®ã“ã¨ã§ã™ï¼Ž +ã“ã“ã§ã¯dbmdataæ§‹é€ ä½“ã¸ã®ãƒã‚¤ãƒ³ã‚¿ã‚’Rubyオブジェクトã«ã‚«ãƒ—ã‚» +ル化ã—ã¦ã„ã¾ã™ï¼ŽDBM*を直接カプセル化ã—ãªã„ã®ã¯close()ã—ãŸæ™‚ +ã®å‡¦ç†ã‚’考ãˆã¦ã®ã“ã¨ã§ã™ï¼Ž -Dataオブジェクトã‹ã‚‰dbmstructæ§‹é€ ä½“ã®ãƒã‚¤ãƒ³ã‚¿ã‚’å–り出ã™ãŸã‚ -ã«ä»¥ä¸‹ã®ãƒžã‚¯ãƒã‚’使ã£ã¦ã„ã¾ã™ï¼Ž +Rubyオブジェクトã‹ã‚‰dbmdataæ§‹é€ ä½“ã®ãƒã‚¤ãƒ³ã‚¿ã‚’å–り出ã™ãŸã‚ã« +以下ã®ãƒžã‚¯ãƒã‚’使ã£ã¦ã„ã¾ã™ï¼Ž #define GetDBM(obj, dbmp) do {\ TypedData_Get_Struct((obj), struct dbmdata, &dbm_type, (dbmp));\ @@ -1077,6 +1073,20 @@ Rubyã®ã‚½ãƒ¼ã‚¹ã¯ã„ãã¤ã‹ã«åˆ†é¡žã™ã‚‹ã“ã¨ãŒå‡ºæ¥ã¾ã™ï¼Žã“ã®ã†ã ã¦ã„ã¾ã™ï¼Žã“れらã®ã‚½ãƒ¼ã‚¹ã¯ä»Šã¾ã§ã®èª¬æ˜Žã§ã»ã¨ã‚“ã©ç†è§£ã§ãる㨠æ€ã„ã¾ã™ï¼Ž +=== Rubyã®ãƒ˜ãƒƒãƒ€ãƒ•ァイル + +<tt>$repo_root/include/ruby</tt>以下ã¯ã™ã¹ã¦<tt>make +install</tt>ã§ã‚¤ãƒ³ã‚¹ãƒˆãƒ¼ãƒ«ã•れã¾ã™ï¼Žæ‹¡å¼µãƒ©ã‚¤ãƒ–ラリã‹ã‚‰ã¯ï¼Œ +<tt>#include <ruby.h></tt>ã§ã‚¤ãƒ³ã‚¯ãƒ«ãƒ¼ãƒ‰ã™ã‚‹å¿…è¦ãŒã‚りã¾ã™ï¼Ž ++rbimpl_+,+RBIMPL_+ã®ãƒ—レフィックスãŒä»˜ã„ãŸå®Ÿè£…ã®è©³ç´°ã®ãŸã‚ +ã®ã‚·ãƒ³ãƒœãƒ«ã‚’除ã,ã™ã¹ã¦ã®ã‚·ãƒ³ãƒœãƒ«ã¯å…¬é–‹APIã§ã™ï¼Ž + +拡張ライブラリã§ç›´æŽ¥ã‚¤ãƒ³ã‚¯ãƒ«ãƒ¼ãƒ‰ã§ãã‚‹ã®ã¯ï¼Œ +<tt>$repo_root/include/ruby/*.h</tt>ã®ã†ã¡ï¼Œå¯¾å¿œã™ã‚‹ +<tt>HAVE_RUBY_*_H</tt>マクãƒãŒ +<tt>$repo_root/include/ruby.h</tt>ヘッダーã§å®šç¾©ã•れã¦ã„ã‚‹ã‚‚ +ã®ã§ã™ï¼Ž + === Ruby言語ã®ã‚³ã‚¢ class.c :: クラスã¨ãƒ¢ã‚¸ãƒ¥ãƒ¼ãƒ« @@ -1119,9 +1129,8 @@ lex.c :: è‡ªå‹•ç”Ÿæˆ -> opt*.inc : è‡ªå‹•ç”Ÿæˆ -> vm.inc : è‡ªå‹•ç”Ÿæˆ -=== æ£è¦è¡¨ç¾ã‚¨ãƒ³ã‚¸ãƒ³ (鬼車) +=== æ£è¦è¡¨ç¾ã‚¨ãƒ³ã‚¸ãƒ³ (鬼雲) - regex.c regcomp.c regenc.c regerror.c @@ -1251,7 +1260,6 @@ 
Data_Get_Struct(data, type, sval) :: RB_INTEGER_TYPE_P(value) RB_FLOAT_TYPE_P(value) void Check_Type(VALUE value, int type) - SafeStringValue(value) === åž‹å¤‰æ› @@ -1694,6 +1702,9 @@ HAVE_RUBY_*_H :: ã‚’æ„味ã™ã‚‹ï¼ŽãŸã¨ãˆã°ï¼ŒHAVE_RUBY_ST_H ãŒå®šç¾©ã•れã¦ã„ã‚‹å ´åˆã¯ å˜ãªã‚‹ st.h ã§ã¯ãªã ruby/st.h を使用ã™ã‚‹ï¼Ž + ã“れらã®ãƒžã‚¯ãƒã«å¯¾å¿œã™ã‚‹ãƒ˜ãƒƒãƒ€ãƒ¼ãƒ•ァイルã¯ï¼Œæ‹¡å¼µãƒ©ã‚¤ãƒ–ラリ + ã‹ã‚‰ç›´æŽ¥ã‚¤ãƒ³ã‚¯ãƒ«ãƒ¼ãƒ‰ã—ã¦ã‚‚よã„. + RB_EVENT_HOOKS_HAVE_CALLBACK_DATA :: rb_add_event_hook() ãŒãƒ•ãƒƒã‚¯é–¢æ•°ã«æ¸¡ã™ data を第3引数ã¨ã—㦠@@ -1727,7 +1738,7 @@ have_func(func, header) :: ックã™ã‚‹ï¼ŽfuncãŒæ¨™æº–ã§ã¯ãƒªãƒ³ã‚¯ã•れãªã„ライブラリ内ã®ã‚‚ã®ã§ ã‚る時ã«ã¯å…ˆã«have_libraryã§ãã®ãƒ©ã‚¤ãƒ–ラリをãƒã‚§ãƒƒã‚¯ã—ã¦ãŠ ã事.ãƒã‚§ãƒƒã‚¯ã«æˆåŠŸã™ã‚‹ã¨ï¼Œãƒ—リプãƒã‚»ãƒƒã‚µãƒžã‚¯ãƒ - `HAVE_{FUNC}` を定義ã—,trueã‚’è¿”ã™ï¼Ž + <tt>HAVE_{FUNC}</tt> を定義ã—,trueã‚’è¿”ã™ï¼Ž have_var(var, header) :: @@ -1735,41 +1746,49 @@ have_var(var, header) :: クã™ã‚‹ï¼ŽvarãŒæ¨™æº–ã§ã¯ãƒªãƒ³ã‚¯ã•れãªã„ライブラリ内ã®ã‚‚ã®ã§ã‚ る時ã«ã¯å…ˆã«have_libraryã§ãã®ãƒ©ã‚¤ãƒ–ラリをãƒã‚§ãƒƒã‚¯ã—ã¦ãŠã 事.ãƒã‚§ãƒƒã‚¯ã«æˆåŠŸã™ã‚‹ã¨ï¼Œãƒ—リプãƒã‚»ãƒƒã‚µãƒžã‚¯ãƒ - `HAVE_{VAR}` を定義ã—,trueã‚’è¿”ã™ï¼Ž + <tt>HAVE_{VAR}</tt> を定義ã—,trueã‚’è¿”ã™ï¼Ž have_header(header) :: ヘッダファイルã®å˜åœ¨ã‚’ãƒã‚§ãƒƒã‚¯ã™ã‚‹ï¼Žãƒã‚§ãƒƒã‚¯ã«æˆåŠŸã™ã‚‹ã¨ï¼Œ - プリプãƒã‚»ãƒƒã‚µãƒžã‚¯ãƒ `HAVE_{HEADER_H}` を定義ã—,trueã‚’è¿”ã™ï¼Ž + プリプãƒã‚»ãƒƒã‚µãƒžã‚¯ãƒ <tt>HAVE_{HEADER_H}</tt> を定義ã—,trueã‚’è¿”ã™ï¼Ž (スラッシュやドットã¯ã‚¢ãƒ³ãƒ€ãƒ¼ã‚¹ã‚³ã‚¢ã«ç½®æ›ã•れる) find_header(header, path...) 
:: ヘッダファイルheaderã®å˜åœ¨ã‚’ -Ipath ã‚’è¿½åŠ ã—ãªãŒã‚‰ãƒã‚§ãƒƒã‚¯ ã™ã‚‹ï¼Žãƒã‚§ãƒƒã‚¯ã«æˆåŠŸã™ã‚‹ã¨ï¼Œãƒ—リプãƒã‚»ãƒƒã‚µãƒžã‚¯ãƒ - `HAVE_{HEADER_H}` を定義ã—,trueã‚’è¿”ã™ï¼Ž + <tt>HAVE_{HEADER_H}</tt> を定義ã—,trueã‚’è¿”ã™ï¼Ž (スラッシュやドットã¯ã‚¢ãƒ³ãƒ€ãƒ¼ã‚¹ã‚³ã‚¢ã«ç½®æ›ã•れる) have_struct_member(type, member[, header[, opt]]) :: ヘッダファイルheaderをインクルードã—ã¦åž‹typeãŒå®šç¾©ã•れ, ãªãŠã‹ã¤ãƒ¡ãƒ³ãƒmemberãŒå˜åœ¨ã™ã‚‹ã‹ã‚’ãƒã‚§ãƒƒã‚¯ã™ã‚‹ï¼Žãƒã‚§ãƒƒã‚¯ã« - æˆåŠŸã™ã‚‹ã¨ï¼Œãƒ—リプãƒã‚»ãƒƒã‚µãƒžã‚¯ãƒ `HAVE_{TYPE}_{MEMBER}` ã‚’ + æˆåŠŸã™ã‚‹ã¨ï¼Œãƒ—リプãƒã‚»ãƒƒã‚µãƒžã‚¯ãƒ <tt>HAVE_{TYPE}_{MEMBER}</tt> ã‚’ 定義ã—,trueã‚’è¿”ã™ï¼Ž have_type(type, header, opt) :: ヘッダファイルheaderをインクルードã—ã¦åž‹typeãŒå˜åœ¨ã™ã‚‹ã‹ã‚’ ãƒã‚§ãƒƒã‚¯ã™ã‚‹ï¼Žãƒã‚§ãƒƒã‚¯ã«æˆåŠŸã™ã‚‹ã¨ï¼Œãƒ—リプãƒã‚»ãƒƒã‚µãƒžã‚¯ãƒ - `HAVE_TYPE_{TYPE}` を定義ã—,trueã‚’è¿”ã™ï¼Ž + <tt>HAVE_TYPE_{TYPE}</tt> を定義ã—,trueã‚’è¿”ã™ï¼Ž check_sizeof(type, header) :: ヘッダファイルheaderをインクルードã—ã¦åž‹typeã®charå˜ä½ã‚µã‚¤ ズを調ã¹ã‚‹ï¼Žãƒã‚§ãƒƒã‚¯ã«æˆåŠŸã™ã‚‹ã¨ï¼Œãƒ—リプãƒã‚»ãƒƒã‚µãƒžã‚¯ãƒ - `SIZEOF_{TYPE}` を定義ã—,ãã®ã‚µã‚¤ã‚ºã‚’è¿”ã™ï¼Žå®šç¾©ã•れã¦ã„㪠+ <tt>SIZEOF_{TYPE}</tt> を定義ã—,ãã®ã‚µã‚¤ã‚ºã‚’è¿”ã™ï¼Žå®šç¾©ã•れã¦ã„㪠ã„ã¨ãã¯nilã‚’è¿”ã™ï¼Ž +append_cppflags(array-of-flags[, opt]) :: +append_cflags(array-of-flags[, opt]) :: +append_ldflags(array-of-flags[, opt]) :: + + å„flagãŒä½¿ç”¨å¯èƒ½ã§ã‚れã°ï¼Œãれãžã‚Œ$CPPFLAGS, $CFLAGS, + $LDFLAGSã«è¿½åŠ ã™ã‚‹ï¼Žã‚³ãƒ³ãƒ‘イラã®ãƒ•ラグã«ã¯ç§»æ¤æ€§ãŒãªã„ã®ã§ï¼Œ + 変数ã«ç›´æŽ¥è¿½åŠ ã›ãšã“れらを使ã†ã“ã¨ãŒæœ›ã¾ã—ã„. + create_makefile(target[, target_prefix]) :: 拡張ライブラリ用ã®Makefileを生æˆã™ã‚‹ï¼Žã“ã®é–¢æ•°ã‚’呼ã°ãªã‘れ @@ -1838,9 +1857,23 @@ RGenGCã¯ï¼ŒéŽåŽ»ã®æ‹¡å¼µãƒ©ã‚¤ãƒ–ラリã«ï¼ˆã»ã¼ï¼‰äº’æ›æ€§ã‚’ä¿ã¤ã‚ˆã スã™ã‚‹ã‚ˆã†ãªã‚³ãƒ¼ãƒ‰ã¯æ›¸ã‹ãªã„よã†ã«ã—ã¦ä¸‹ã•ã„.代ã‚りã«ï¼Œrb_ary_aref(), rb_ary_store() ãªã©ã®ï¼Œé©åˆ‡ãª API 関数を利用ã™ã‚‹ã‚ˆã†ã«ã—ã¦ä¸‹ã•ã„. -ãã®ã»ã‹ï¼Œå¯¾å¿œã«ã¤ã„ã¦ã®è©³ç´°ã¯ extension.rdoc ã®ã€ŒAppendix D. Generational -GCã€ã‚’å‚ç…§ã—ã¦ä¸‹ã•ã„. +ãã®ã»ã‹ï¼Œå¯¾å¿œã«ã¤ã„ã¦ã®è©³ç´°ã¯ {Appendix D. 
Generational +GC}[rdoc-ref:extension.rdoc@Appendix+D.+Generational+GC]ã‚’å‚ +ç…§ã—ã¦ä¸‹ã•ã„. + +== Appendix E. Ractor サãƒãƒ¼ãƒˆ -:enddoc: Local variables: -:enddoc: fill-column: 60 -:enddoc: end: +Ruby 3.0 ã‹ã‚‰ã€Ruby プãƒã‚°ãƒ©ãƒ を並列ã«å®Ÿè¡Œã™ã‚‹ãŸã‚ã®ä»•組ã¿ã§ã‚ã‚‹ Ractor +ãŒå°Žå…¥ã•れã¾ã—ãŸã€‚é©åˆ‡ã«ä¸¦åˆ—ã«å®Ÿè¡Œã™ã‚‹ãŸã‚ã«ã¯ã€Ractor サãƒãƒ¼ãƒˆãŒå¿…è¦ã« +ãªã‚Šã¾ã™ã€‚サãƒãƒ¼ãƒˆã—ã¦ã„ãªã„ライブラリã¯ã€ãƒ¡ã‚¤ãƒ³ Ractor 以外ã§å®Ÿè¡Œã™ã‚‹ã¨ +エラーã«ãªã‚Šã¾ã™ï¼ˆRactor::UnsafeError)。 + +Ractor をサãƒãƒ¼ãƒˆã™ã‚‹ãŸã‚ã®è©³ç´°ã¯ã€{Appendix F. Ractor +support}[rdoc-ref:extension.rdoc@Appendix+F.+Ractor+support] +ã‚’å‚ç…§ã—ã¦ãã ã•ã„。 + +-- +Local variables: +fill-column: 60 +end: +++ diff --git a/doc/extension.rdoc b/doc/extension.rdoc index 1355cdae64..18dc5817d4 100644 --- a/doc/extension.rdoc +++ b/doc/extension.rdoc @@ -1,6 +1,8 @@ # extension.rdoc - -*- RDoc -*- created at: Mon Aug 7 16:45:54 JST 1995 -= Creating Extension Libraries for Ruby +{日本語}[rdoc-ref:extension.ja.rdoc] + += Creating extension libraries for Ruby This document explains how to make extension libraries for Ruby. @@ -10,8 +12,8 @@ In C, variables have types and data do not have types. In contrast, Ruby variables do not have a static type, and data themselves have types, so data will need to be converted between the languages. -Data in Ruby are represented by the C type `VALUE'. Each VALUE data -has its data type. +Objects in Ruby are represented by the C type `VALUE'. Each VALUE +data has its data type. To retrieve C data from a VALUE, you need to: @@ -20,7 +22,7 @@ To retrieve C data from a VALUE, you need to: Converting to the wrong data type may cause serious problems. -=== Data Types +=== Ruby data types The Ruby interpreter has the following data types: @@ -54,7 +56,7 @@ T_ZOMBIE :: object awaiting finalization Most of the types are represented by C structures. -=== Check Data Type of the VALUE +=== Check type of the VALUE data The macro TYPE() defined in ruby.h shows the data type of the VALUE. 
TYPE() returns the constant number T_XXXX described above. To handle @@ -88,12 +90,14 @@ There are also faster check macros for fixnums and nil. FIXNUM_P(obj) NIL_P(obj) -=== Convert VALUE into C Data +=== Convert VALUE into C data The data for type T_NIL, T_FALSE, T_TRUE are nil, false, true respectively. They are singletons for the data type. The equivalent C constants are: Qnil, Qfalse, Qtrue. -Note that Qfalse is false in C also (i.e. 0), but not Qnil. +RTEST() will return true if a VALUE is neither Qfalse nor Qnil. +If you need to differentiate Qfalse from Qnil, +specifically test against Qfalse. The T_FIXNUM data is a 31bit or 63bit length fixed integer. This size depends on the size of long: if long is 32bit then @@ -141,7 +145,7 @@ Notice: Do not change the value of the structure directly, unless you are responsible for the result. This ends up being the cause of interesting bugs. -=== Convert C Data into VALUE +=== Convert C data into VALUE To convert C data to Ruby values: @@ -167,14 +171,14 @@ INT2NUM() :: for arbitrary sized integers. INT2NUM() converts an integer into a Bignum if it is out of the FIXNUM range, but is a bit slower. -=== Manipulating Ruby Data +=== Manipulating Ruby object As I already mentioned, it is not recommended to modify an object's internal structure. To manipulate objects, use the functions supplied by the Ruby interpreter. Some (not all) of the useful functions are listed below: -==== String Functions +==== String functions rb_str_new(const char *ptr, long len) :: @@ -190,16 +194,6 @@ rb_str_new_literal(const char *ptr) :: Creates a new Ruby string from a C string literal. -rb_tainted_str_new(const char *ptr, long len) :: - - Creates a new tainted Ruby string. Strings from external data - sources should be tainted. - -rb_tainted_str_new2(const char *ptr) :: -rb_tainted_str_new_cstr(const char *ptr) :: - - Creates a new tainted Ruby string from a C string. - rb_sprintf(const char *format, ...) 
:: rb_vsprintf(const char *format, va_list ap) :: @@ -287,7 +281,7 @@ rb_str_modify(VALUE str) :: you MUST call this function before modifying the contents using RSTRING_PTR and/or rb_str_set_len. -==== Array Functions +==== Array functions rb_ary_new() :: @@ -346,13 +340,13 @@ rb_ary_cat(VALUE ary, const VALUE *ptr, long len) :: == Extending Ruby with C -=== Adding New Features to Ruby +=== Adding new features to Ruby You can add new features (classes, methods, etc.) to the Ruby interpreter. Ruby provides APIs for defining the following things: - Classes, Modules -- Methods, Singleton Methods +- Methods, singleton methods - Constants ==== Class and Module Definition @@ -370,7 +364,7 @@ To define nested classes or modules, use the functions below: VALUE rb_define_class_under(VALUE outer, const char *name, VALUE super) VALUE rb_define_module_under(VALUE outer, const char *name) -==== Method and Singleton Method Definition +==== Method and singleton method definition To define methods or singleton methods, use these functions: @@ -458,12 +452,24 @@ you may rely on: VALUE rb_call_super(int argc, const VALUE *argv) +To specify whether keyword arguments are passed when calling super: + + VALUE rb_call_super_kw(int argc, const VALUE *argv, int kw_splat) + ++kw_splat+ can have these possible values (used by all methods that accept ++kw_splat+ argument): + +RB_NO_KEYWORDS :: Do not pass keywords +RB_PASS_KEYWORDS :: Pass keywords, final argument should be a hash of keywords +RB_PASS_CALLED_KEYWORDS :: Pass keywords if current method was called with + keywords, useful for argument delegation + To achieve the receiver of the current scope (if no other way is available), you can use: VALUE rb_current_receiver(void) -==== Constant Definition +==== Constant definition We have 2 functions to define constants: @@ -473,11 +479,11 @@ We have 2 functions to define constants: The former is to define a constant under specified class/module. The latter is to define a global constant. 
-=== Use Ruby Features from C +=== Use Ruby features from C There are several ways to invoke Ruby's features from C code. -==== Evaluate Ruby Programs in a String +==== Evaluate Ruby programs in a string The easiest way to use Ruby's functionality from a C program is to evaluate the string as Ruby program. This function will do the job: @@ -546,7 +552,7 @@ and to convert Ruby Symbol object to ID, use ID SYM2ID(VALUE symbol) -==== Invoke Ruby Method from C +==== Invoke Ruby method from C To invoke methods directly, you can use the function below @@ -555,7 +561,7 @@ To invoke methods directly, you can use the function below This function invokes a method on the recv, with the method name specified by the symbol mid. -==== Accessing the Variables and Constants +==== Accessing the variables and constants You can access class variables and instance variables using access functions. Also, global variables can be shared between both @@ -574,9 +580,9 @@ To access the constants of the class/module: See also Constant Definition above. -== Information Sharing Between Ruby and C +== Information sharing between Ruby and C -=== Ruby Constants That Can Be Accessed From C +=== Ruby constants that can be accessed from C As stated in section 1.3, the following Ruby constants can be referred from C. @@ -590,7 +596,7 @@ Qnil :: Ruby nil in C scope. -=== Global Variables Shared Between C and Ruby +=== Global variables shared between C and Ruby Information can be shared between the two environments using shared global variables. To define them, you can use functions listed below: @@ -632,7 +638,7 @@ The prototypes of the getter and setter functions are as follows: VALUE (*getter)(ID id); void (*setter)(VALUE val, ID id); -=== Encapsulate C Data into a Ruby Object +=== Encapsulate C data into a Ruby object Sometimes you need to expose your struct in the C world as a Ruby object. @@ -655,30 +661,30 @@ with the next macro. TypedData_Wrap_Struct() returns a created Ruby object as a VALUE. 
-The klass argument is the class for the object. +The klass argument is the class for the object. The klass should +derive from rb_cObject, and the allocator must be set by calling +rb_define_alloc_func or rb_undef_alloc_func. + data_type is a pointer to a const rb_data_type_t which describes how Ruby should manage the struct. -It is recommended that klass derives from a special class called -Data (rb_cData) but not from Object or other ordinal classes. -If it doesn't, you have to call rb_undef_alloc_func(klass). - rb_data_type_t is defined like this. Let's take a look at each member of the struct. typedef struct rb_data_type_struct rb_data_type_t; struct rb_data_type_struct { - const char *wrap_struct_name; - struct { - void (*dmark)(void*); - void (*dfree)(void*); - size_t (*dsize)(const void *); - void *reserved[2]; - } function; - const rb_data_type_t *parent; - void *data; - VALUE flags; + const char *wrap_struct_name; + struct { + void (*dmark)(void*); + void (*dfree)(void*); + size_t (*dsize)(const void *); + void (*dcompact)(void*); + void *reserved[1]; + } function; + const rb_data_type_t *parent; + void *data; + VALUE flags; }; wrap_struct_name is an identifier of this instance of the struct. @@ -699,14 +705,22 @@ Note that it is recommended to avoid such a reference. ++ dfree is a function to free the pointer allocation. -If this is -1, the pointer will be just freed. +If this is RUBY_DEFAULT_FREE, the pointer will be just freed. dsize calculates memory consumption in bytes by the struct. Its parameter is a pointer to your struct. You can pass 0 as dsize if it is hard to implement such a function. But it is still recommended to avoid 0. -You have to fill reserved and parent with 0. +dcompact is invoked when memory compaction took place. +Referred Ruby objects that were marked by rb_gc_mark_movable() +can here be updated per rb_gc_location(). + +You have to fill reserved with 0. 
+ +parent can point to another C type definition that the Ruby object +inherits from. TypedData_Get_Struct() then also accepts +derived objects. You can fill "data" with an arbitrary value for your use. Ruby does nothing with the member. @@ -722,7 +736,7 @@ RUBY_TYPED_FREE_IMMEDIATELY :: You can specify this flag if the dfree never unlocks Ruby's internal lock (GVL). - If this flag is not set, Ruby defers invokation of dfree() + If this flag is not set, Ruby defers invocation of dfree() and invokes dfree() at the same time as finalizers. RUBY_TYPED_WB_PROTECTED :: @@ -735,25 +749,100 @@ RUBY_TYPED_WB_PROTECTED :: barriers in all implementations of methods of that object as appropriate. Otherwise Ruby might crash while running. - More about write barriers can be found in "Generational GC" in - Appendix D. + More about write barriers can be found in {Generational + GC}[rdoc-ref:@Appendix+D.+Generational+GC]. -You can allocate and wrap the structure in one step. +RUBY_TYPED_FROZEN_SHAREABLE :: + + This flag indicates that the object is a shareable object if the object + is frozen. See {Ractor support}[rdoc-ref:@Appendix+F.+Ractor+support] + for more details. + + If this flag is not set, the object cannot become a shareable + object via the Ractor.make_shareable() method. + +Note that this macro can raise an exception. If the sval to be wrapped +holds a resource that needs to be released (e.g., allocated memory or a handle +from an external library), you will have to use rb_protect. + +You can allocate and wrap the structure in one step, which is the +preferred approach. TypedData_Make_Struct(klass, type, data_type, sval) -This macro returns an allocated Data object, wrapping the pointer to +This macro returns an allocated T_DATA object, wrapping the pointer to the structure, which is also allocated.
This macro works like: (sval = ZALLOC(type), TypedData_Wrap_Struct(klass, data_type, sval)) +However, you should use this macro instead of "allocation then wrap" +like the above code if it is simply allocated, because the latter can +raise a NoMemoryError and sval will be leaked in that case. + Arguments klass and data_type work like their counterparts in TypedData_Wrap_Struct(). A pointer to the allocated structure will be assigned to sval, which should be a pointer of the type specified. +==== Declaratively marking/compacting struct references + +In the case where your struct refers to Ruby objects that are simple values, +not wrapped in conditional logic or complex data structures, an alternative +approach to marking and reference updating is provided, by declaring offset +references to the VALUEs in your struct. + +Doing this allows the Ruby GC to support marking these references and GC +compaction without the need to define the +dmark+ and +dcompact+ callbacks. + +You must define a static list of VALUE pointers to the offsets within your +struct where the references are located, and set the "dmark" member to point to +this reference list. The reference list must end with +RUBY_REF_END+. + +Some macros have been provided to make edge referencing easier: + +* <code>RUBY_TYPED_DECL_MARKING</code> - A flag that can be set on the +rb_data_type_t+ to indicate that references are being declared as edges. + +* <code>RUBY_REFERENCES(ref_list_name)</code> - Define _ref_list_name_ as a list of references + +* <code>RUBY_REF_END</code> - The end mark of the references list. + +* <code>RUBY_REF_EDGE(struct, member)</code> - Declare _member_ as a VALUE edge from _struct_. Use this after +RUBY_REFERENCES+ + +* +RUBY_REFS_LIST_PTR+ - Coerce the reference list into a format that can be + accepted by the existing +dmark+ interface. + +The example below is from Dir (defined in +dir.c+) + + // The struct being wrapped.
Notice this contains 3 members of which the second + // is a VALUE reference to another ruby object. + struct dir_data { + DIR *dir; + const VALUE path; + rb_encoding *enc; + }; + + // Define a reference list `dir_refs` containing a single entry to `path`. + // It must be terminated with RUBY_REF_END. + RUBY_REFERENCES(dir_refs) = { + RUBY_REF_EDGE(struct dir_data, path), + RUBY_REF_END + }; + + // Override the "dmark" field with the defined reference list now that we + // no longer need a marking callback and add RUBY_TYPED_DECL_MARKING to the + // flags field + static const rb_data_type_t dir_data_type = { + "dir", + {RUBY_REFS_LIST_PTR(dir_refs), dir_free, dir_memsize,}, + 0, NULL, RUBY_TYPED_WB_PROTECTED | RUBY_TYPED_FREE_IMMEDIATELY | RUBY_TYPED_DECL_MARKING + }; + +Declaring simple references declaratively in this manner allows the GC to both +mark and move the underlying object, and to automatically update the reference to +it during compaction. + ==== Ruby object to C struct -To retrieve the C pointer from the Data object, use the macro +To retrieve the C pointer from the T_DATA object, use the macro TypedData_Get_Struct(). TypedData_Get_Struct(obj, type, &data_type, sval) @@ -768,7 +857,7 @@ OK, here's the example of making an extension library. This is the extension to access DBMs. The full source is included in the ext/ directory in the Ruby's source tree. -=== Make the Directory +=== Make the directory % mkdir ext/dbm @@ -795,11 +884,14 @@ the library. Here's the example of an initializing function. + #include <ruby.h> void Init_dbm(void) { /* define DBM class */ VALUE cDBM = rb_define_class("DBM", rb_cObject); + /* Redefine DBM.allocate */ + rb_define_alloc_func(cDBM, fdbm_alloc); /* DBM includes Enumerable module */ rb_include_module(cDBM, rb_mEnumerable); @@ -809,7 +901,7 @@ Here's the example of an initializing function.
/* DBM instance method close(): no args */ rb_define_method(cDBM, "close", fdbm_close, 0); /* DBM instance method []: 1 argument */ - rb_define_method(cDBM, "[]", fdbm_fetch, 1); + rb_define_method(cDBM, "[]", fdbm_aref, 1); /* ... */ @@ -832,10 +924,19 @@ TypedData_Make_Struct. RUBY_TYPED_FREE_IMMEDIATELY, }; - obj = TypedData_Make_Struct(klass, struct dbmdata, &dbm_type, dbmp); + static VALUE + fdbm_alloc(VALUE klass) + { + struct dbmdata *dbmp; + /* Allocate T_DATA object and C struct and fill struct with zero bytes */ + return TypedData_Make_Struct(klass, struct dbmdata, &dbm_type, dbmp); + } This code wraps the dbmdata structure into a Ruby object. We avoid wrapping DBM* directly, because we want to cache size information. +Since Object.allocate allocates an ordinary T_OBJECT type (instead +of T_DATA), it's important to either use rb_define_alloc_func() to +overwrite it or rb_undef_alloc_func() to delete it. To retrieve the dbmdata structure from a Ruby object, we define the following macro: @@ -853,9 +954,13 @@ There are three kinds of way to receive method arguments. First, methods with a fixed number of arguments receive arguments like this: static VALUE - fdbm_delete(VALUE obj, VALUE keystr) + fdbm_aref(VALUE obj, VALUE keystr) { - /* ... */ + struct dbmdata *dbmp; + GetDBM(obj, dbmp); + /* Use dbmp to access the key */ + dbm_fetch(dbmp->di_dbm, StringValueCStr(keystr)); + /* ... */ } The first argument of the C function is the self, the rest are the @@ -920,6 +1025,9 @@ need to put at the top of the file. You can use the functions below to check various conditions. 
+ append_cppflags(array-of-flags[, opt]): append each flag to $CPPFLAGS if usable + append_cflags(array-of-flags[, opt]): append each flag to $CFLAGS if usable + append_ldflags(array-of-flags[, opt]): append each flag to $LDFLAGS if usable have_macro(macro[, headers[, opt]]): check whether macro is defined have_library(lib[, func[, headers[, opt]]]): check whether library containing function exists find_library(lib[, func, *paths]): find library from paths @@ -948,6 +1056,10 @@ The value of the variables below will affect the Makefile. $LDFLAGS: included in LDFLAGS make variable (such as -L) $objs: list of object file names +Compiler/linker flags are usually not portable, so you should use ++append_cppflags+, +append_cflags+ and +append_ldflags+ respectively +instead of appending to the above variables directly. + Normally, the object files list is automatically generated by searching source files, but you must define them explicitly if any sources will be generated while building. @@ -956,7 +1068,7 @@ If a compilation condition is not fulfilled, you should not call ``create_makefile''. The Makefile will not be generated, compilation will not be done. -=== Prepare Depend (Optional) +=== Prepare depend (Optional) If the file named depend exists, Makefile will include that file to check dependencies. You can make this file by invoking @@ -995,15 +1107,32 @@ You may need to rb_debug the extension. Extensions can be linked statically by adding the directory name in the ext/Setup file so that you can inspect the extension with the debugger. -=== Done! Now You Have the Extension Library +=== Done! Now you have the extension library You can do anything you want with your library. The author of Ruby will not claim any restrictions on your code depending on the Ruby API. Feel free to use, modify, distribute or sell your program. -== Appendix A. Ruby Source Files Overview +== Appendix A.
Ruby header and source files overview + +=== Ruby header files + +Everything under <tt>$repo_root/include/ruby</tt> is installed with +<tt>make install</tt>. +It should be included per <tt>#include <ruby.h></tt> from C extensions. +All symbols are public API with the exception of symbols prefixed with ++rbimpl_+ or +RBIMPL_+. They are implementation details and shouldn't +be used by C extensions. + +Only <tt>$repo_root/include/ruby/*.h</tt> whose corresponding macros +are defined in the <tt>$repo_root/include/ruby.h</tt> header are +allowed to be <tt>#include</tt>-d by C extensions. + +Header files under <tt>$repo_root/internal/</tt> or directly under the +root <tt>$repo_root/*.h</tt> are not make-installed. +They are internal headers with only internal APIs. -=== Ruby Language Core +=== Ruby language core class.c :: classes and modules error.c :: exception classes and exception mechanism @@ -1012,14 +1141,14 @@ load.c :: library loading object.c :: objects variable.c :: variables and constants -=== Ruby Syntax Parser +=== Ruby syntax parser parse.y :: grammar definition parse.c :: automatically generated from parse.y defs/keywords :: reserved keywords lex.c :: automatically generated from keywords -=== Ruby Evaluator (a.k.a. YARV) +=== Ruby evaluator (a.k.a. 
YARV) compile.c eval.c @@ -1045,9 +1174,8 @@ lex.c :: automatically generated from keywords -> opt*.inc : automatically generated -> vm.inc : automatically generated -=== Regular Expression Engine (Oniguruma) +=== Regular expression engine (Onigumo) - regex.c regcomp.c regenc.c regerror.c @@ -1055,7 +1183,7 @@ lex.c :: automatically generated from keywords regparse.c regsyntax.c -=== Utility Functions +=== Utility functions debug.c :: debug symbols for C debugger dln.c :: dynamic loading @@ -1063,7 +1191,7 @@ st.c :: general purpose hash table strftime.c :: formatting times util.c :: misc utilities -=== Ruby Interpreter Implementation +=== Ruby interpreter implementation dmyext.c dmydln.c @@ -1077,7 +1205,7 @@ util.c :: misc utilities gem_prelude.rb prelude.rb -=== Class Library +=== Class library array.c :: Array bignum.c :: Bignum @@ -1116,13 +1244,13 @@ transcode.c :: Encoding::Converter enc/*.c :: encoding classes enc/trans/* :: codepoint mapping tables -=== goruby Interpreter Implementation +=== goruby interpreter implementation goruby.c golf_prelude.rb : goruby specific libraries. -> golf_prelude.c : automatically generated -== Appendix B. Ruby Extension API Reference +== Appendix B. Ruby extension API reference === Types @@ -1132,7 +1260,7 @@ VALUE :: such as struct RString, etc. To refer the values in structures, use casting macros like RSTRING(obj). -=== Variables and Constants +=== Variables and constants Qnil :: @@ -1146,7 +1274,7 @@ Qfalse :: false object -=== C Pointer Wrapping +=== C pointer wrapping Data_Wrap_Struct(VALUE klass, void (*mark)(), void (*free)(), void *sval) :: @@ -1166,7 +1294,7 @@ Data_Get_Struct(data, type, sval) :: This macro retrieves the pointer value from DATA, and assigns it to the variable sval. 
-=== Checking Data Types +=== Checking VALUE types RB_TYPE_P(value, type) :: @@ -1196,11 +1324,7 @@ void Check_Type(VALUE value, int type) :: Ensures +value+ is of the given internal +type+ or raises a TypeError -SafeStringValue(value) :: - - Checks that +value+ is a String and is not tainted - -=== Data Type Conversion +=== VALUE type conversion FIX2INT(value), INT2FIX(i) :: @@ -1284,7 +1408,7 @@ rb_str_new2(s) :: char * -> String -=== Defining Classes and Modules +=== Defining classes and modules VALUE rb_define_class(const char *name, VALUE super) :: @@ -1311,7 +1435,7 @@ void rb_extend_object(VALUE object, VALUE module) :: Extend the object with the module's attributes. -=== Defining Global Variables +=== Defining global variables void rb_define_variable(const char *name, VALUE *var) :: @@ -1355,7 +1479,7 @@ void rb_gc_register_mark_object(VALUE object) :: Tells GC to protect the +object+, which may not be referenced anywhere. -=== Constant Definition +=== Constant definition void rb_define_const(VALUE klass, const char *name, VALUE val) :: @@ -1367,7 +1491,7 @@ void rb_define_global_const(const char *name, VALUE val) :: rb_define_const(rb_cObject, name, val) -=== Method Definition +=== Method definition rb_define_method(VALUE klass, const char *name, VALUE (*func)(ANYARGS), int argc) :: @@ -1398,7 +1522,7 @@ rb_scan_args(int argc, VALUE *argv, const char *fmt, ...) :: according to the format string. The format can be described in ABNF as follows: - scan-arg-spec := param-arg-spec [option-hash-arg-spec] [block-arg-spec] + scan-arg-spec := param-arg-spec [keyword-arg-spec] [block-arg-spec] param-arg-spec := pre-arg-spec [post-arg-spec] / post-arg-spec / pre-opt-post-arg-spec @@ -1407,7 +1531,7 @@ rb_scan_args(int argc, VALUE *argv, const char *fmt, ...) 
:: [num-of-trailing-mandatory-args] pre-opt-post-arg-spec := num-of-leading-mandatory-args num-of-optional-args num-of-trailing-mandatory-args - option-hash-arg-spec := sym-for-option-hash-arg + keyword-arg-spec := sym-for-keyword-arg block-arg-spec := sym-for-block-arg num-of-leading-mandatory-args := DIGIT ; The number of leading @@ -1419,18 +1543,10 @@ rb_scan_args(int argc, VALUE *argv, const char *fmt, ...) :: ; captured as a ruby array num-of-trailing-mandatory-args := DIGIT ; The number of trailing ; mandatory arguments - sym-for-option-hash-arg := ":" ; Indicates that an option - ; hash is captured if the last - ; argument is a hash or can be - ; converted to a hash with - ; #to_hash. When the last - ; argument is nil, it is - ; captured if it is not - ; ambiguous to take it as - ; empty option hash; i.e. '*' - ; is not specified and - ; arguments are given more - ; than sufficient. + sym-for-keyword-arg := ":" ; Indicates that keyword + ; argument captured as a hash. + ; If keyword arguments are not + ; provided, returns nil. sym-for-block-arg := "&" ; Indicates that an iterator ; block should be captured if ; given @@ -1445,6 +1561,19 @@ rb_scan_args(int argc, VALUE *argv, const char *fmt, ...) :: The number of given arguments, excluding an option hash or iterator block, is returned. +rb_scan_args_kw(int kw_splat, int argc, VALUE *argv, const char *fmt, ...) :: + + The same as +rb_scan_args+, except the +kw_splat+ argument specifies whether + keyword arguments are provided (instead of being determined by the call + from Ruby to the C function). +kw_splat+ should be one of the following + values: + + RB_SCAN_ARGS_PASS_CALLED_KEYWORDS :: Same behavior as +rb_scan_args+. + RB_SCAN_ARGS_KEYWORDS :: The final argument should be a hash treated as + keywords. + RB_SCAN_ARGS_LAST_HASH_KEYWORDS :: Treat a final argument as keywords if it + is a hash, and not as keywords otherwise. 
+ int rb_get_kwargs(VALUE keyword_hash, const ID *table, int required, int optional, VALUE *values) :: Retrieves argument VALUEs bound to keywords, which directed by +table+ @@ -1483,11 +1612,41 @@ VALUE rb_funcallv(VALUE recv, ID mid, int argc, VALUE *argv) :: Invokes a method, passing arguments as an array of values. Able to call even private/protected methods. +VALUE rb_funcallv_kw(VALUE recv, ID mid, int argc, VALUE *argv, int kw_splat) :: + + Same as rb_funcallv, using +kw_splat+ to determine whether keyword + arguments are passed. + VALUE rb_funcallv_public(VALUE recv, ID mid, int argc, VALUE *argv) :: Invokes a method, passing arguments as an array of values. Able to call only public methods. +VALUE rb_funcallv_public_kw(VALUE recv, ID mid, int argc, VALUE *argv, int kw_splat) :: + + Same as rb_funcallv_public, using +kw_splat+ to determine whether keyword + arguments are passed. + +VALUE rb_funcall_passing_block(VALUE recv, ID mid, int argc, const VALUE* argv) :: + + Same as rb_funcallv_public, except it passes the currently active block as + the block when calling the method. + +VALUE rb_funcall_passing_block_kw(VALUE recv, ID mid, int argc, const VALUE* argv, int kw_splat) :: + + Same as rb_funcall_passing_block, using +kw_splat+ to determine whether + keyword arguments are passed. + +VALUE rb_funcall_with_block(VALUE recv, ID mid, int argc, const VALUE *argv, VALUE passed_procval) :: + + Same as rb_funcallv_public, except +passed_procval+ specifies the block to + pass to the method. + +VALUE rb_funcall_with_block_kw(VALUE recv, ID mid, int argc, const VALUE *argv, VALUE passed_procval, int kw_splat) :: + + Same as rb_funcall_with_block, using +kw_splat+ to determine whether + keyword arguments are passed. + VALUE rb_eval_string(const char *str) :: Compiles and executes the string as a Ruby program. @@ -1508,7 +1667,7 @@ int rb_respond_to(VALUE obj, ID id) :: Returns true if the object responds to the message specified by id.
-=== Instance Variables +=== Instance variables VALUE rb_iv_get(VALUE obj, const char *name) :: @@ -1519,7 +1678,7 @@ VALUE rb_iv_set(VALUE obj, const char *name, VALUE val) :: Sets the value of the instance variable. -=== Control Structure +=== Control structure VALUE rb_block_call(VALUE recv, ID mid, int argc, VALUE * argv, VALUE (*func) (ANYARGS), VALUE data2) :: @@ -1532,6 +1691,11 @@ VALUE rb_block_call(VALUE recv, ID mid, int argc, VALUE * argv, VALUE (*func) (A whereas yielded values can be gotten via argc/argv of the third/fourth arguments. +VALUE rb_block_call_kw(VALUE recv, ID mid, int argc, VALUE * argv, VALUE (*func) (ANYARGS), VALUE data2, int kw_splat) :: + + Same as rb_funcall_with_block, using +kw_splat+ to determine whether + keyword arguments are passed. + \[OBSOLETE] VALUE rb_iterate(VALUE (*func1)(), VALUE arg1, VALUE (*func2)(), VALUE arg2) :: Calls the function func1, supplying func2 as the block. func1 will be @@ -1543,7 +1707,32 @@ VALUE rb_block_call(VALUE recv, ID mid, int argc, VALUE * argv, VALUE (*func) (A VALUE rb_yield(VALUE val) :: - Evaluates the block with value val. + Yields val as a single argument to the block. + +VALUE rb_yield_values(int n, ...) :: + + Yields +n+ number of arguments to the block, using one C argument per Ruby + argument. + +VALUE rb_yield_values2(int n, VALUE *argv) :: + + Yields +n+ number of arguments to the block, with all Ruby arguments in the + C argv array. + +VALUE rb_yield_values_kw(int n, VALUE *argv, int kw_splat) :: + + Same as rb_yield_values2, using +kw_splat+ to determine whether + keyword arguments are passed. + +VALUE rb_yield_splat(VALUE args) :: + + Same as rb_yield_values2, except arguments are specified by the Ruby + array +args+. + +VALUE rb_yield_splat_kw(VALUE args, int kw_splat) :: + + Same as rb_yield_splat, using +kw_splat+ to determine whether + keyword arguments are passed. 
VALUE rb_rescue(VALUE (*func1)(ANYARGS), VALUE arg1, VALUE (*func2)(ANYARGS), VALUE arg2) :: @@ -1585,7 +1774,7 @@ void rb_iter_break_value(VALUE value) :: return the given argument value. This function never return to the caller. -=== Exceptions and Errors +=== Exceptions and errors void rb_warn(const char *fmt, ...) :: @@ -1658,7 +1847,7 @@ int rb_wait_for_single_fd(int fd, int events, struct timeval *timeout) :: Use a NULL +timeout+ to wait indefinitely. -=== I/O Multiplexing +=== I/O multiplexing Ruby supports I/O multiplexing based on the select(2) system call. The Linux select_tut(2) manpage @@ -1710,7 +1899,7 @@ int rb_thread_fd_select(int nfds, rb_fdset_t *readfds, rb_fdset_t *writefds, rb_ rb_io_wait_writable, or rb_wait_for_single_fd functions since they can be optimized for specific platforms (currently, only Linux). -=== Initialize and Start the Interpreter +=== Initialize and start the interpreter The embedding API functions are below (not needed for extension libraries): @@ -1735,7 +1924,7 @@ void ruby_script(char *name) :: Specifies the name of the script ($0). -=== Hooks for the Interpreter Events +=== Hooks for the interpreter events void rb_add_event_hook(rb_event_hook_func_t func, rb_event_flag_t events, VALUE data) :: @@ -1777,7 +1966,7 @@ void rb_gc_adjust_memory_usage(ssize_t diff) :: is decreased; a memory block is freed or a block is reallocated as smaller size. This function may trigger the GC. -=== Macros for Compatibility +=== Macros for compatibility Some macros to check API compatibilities are available by default. @@ -1788,13 +1977,13 @@ NORETURN_STYLE_NEW :: HAVE_RB_DEFINE_ALLOC_FUNC :: Means that function rb_define_alloc_func() is provided, that means the - allocation framework is used. This is same as the result of + allocation framework is used. This is the same as the result of have_func("rb_define_alloc_func", "ruby.h"). 
HAVE_RB_REG_NEW_STR :: Means that function rb_reg_new_str() is provided, that creates Regexp - object from String object. This is same as the result of + object from String object. This is the same as the result of have_func("rb_reg_new_str", "ruby.h"). HAVE_RB_IO_T :: @@ -1812,11 +2001,57 @@ HAVE_RUBY_*_H :: instance, when HAVE_RUBY_ST_H is defined you should use ruby/st.h not mere st.h. + Header files corresponding to these macros may be <tt>#include</tt> + directly from extension libraries. + RB_EVENT_HOOKS_HAVE_CALLBACK_DATA :: Means that rb_add_event_hook() takes the third argument `data', to be passed to the given event hook function. +=== Defining backward compatible macros for keyword argument functions + +Most ruby C extensions are designed to support multiple Ruby versions. +In order to correctly support Ruby 2.7+ in regards to keyword +argument separation, C extensions need to use <code>*_kw</code> +functions. However, these functions do not exist in Ruby 2.6 and +below, so in those cases macros should be defined to allow you to use +the same code on multiple Ruby versions. Here are example macros +you can use in extensions that support Ruby 2.6 (or below) when using +the <code>*_kw</code> functions introduced in Ruby 2.7. + + #ifndef RB_PASS_KEYWORDS + /* Only define macros on Ruby <2.7 */ + #define rb_funcallv_kw(o, m, c, v, kw) rb_funcallv(o, m, c, v) + #define rb_funcallv_public_kw(o, m, c, v, kw) rb_funcallv_public(o, m, c, v) + #define rb_funcall_passing_block_kw(o, m, c, v, kw) rb_funcall_passing_block(o, m, c, v) + #define rb_funcall_with_block_kw(o, m, c, v, b, kw) rb_funcall_with_block(o, m, c, v, b) + #define rb_scan_args_kw(kw, c, v, s, ...) 
rb_scan_args(c, v, s, __VA_ARGS__) + #define rb_call_super_kw(c, v, kw) rb_call_super(c, v) + #define rb_yield_values_kw(c, v, kw) rb_yield_values2(c, v) + #define rb_yield_splat_kw(a, kw) rb_yield_splat(a) + #define rb_block_call_kw(o, m, c, v, f, p, kw) rb_block_call(o, m, c, v, f, p) + #define rb_fiber_resume_kw(o, c, v, kw) rb_fiber_resume(o, c, v) + #define rb_fiber_yield_kw(c, v, kw) rb_fiber_yield(c, v) + #define rb_enumeratorize_with_size_kw(o, m, c, v, f, kw) rb_enumeratorize_with_size(o, m, c, v, f) + #define SIZED_ENUMERATOR_KW(obj, argc, argv, size_fn, kw_splat) \ + rb_enumeratorize_with_size((obj), ID2SYM(rb_frame_this_func()), \ + (argc), (argv), (size_fn)) + #define RETURN_SIZED_ENUMERATOR_KW(obj, argc, argv, size_fn, kw_splat) do { \ + if (!rb_block_given_p()) \ + return SIZED_ENUMERATOR(obj, argc, argv, size_fn); \ + } while (0) + #define RETURN_ENUMERATOR_KW(obj, argc, argv, kw_splat) RETURN_SIZED_ENUMERATOR(obj, argc, argv, 0) + #define rb_check_funcall_kw(o, m, c, v, kw) rb_check_funcall(o, m, c, v) + #define rb_obj_call_init_kw(o, c, v, kw) rb_obj_call_init(o, c, v) + #define rb_class_new_instance_kw(c, v, k, kw) rb_class_new_instance(c, v, k) + #define rb_proc_call_kw(p, a, kw) rb_proc_call(p, a) + #define rb_proc_call_with_block_kw(p, c, v, b, kw) rb_proc_call_with_block(p, c, v, b) + #define rb_method_call_kw(c, v, m, kw) rb_method_call(c, v, m) + #define rb_method_call_with_block_kw(c, v, m, b, kw) rb_method_call_with_block(c, v, m, b) + #define rb_eval_cmd_kw(c, a, kw) rb_eval_cmd(c, a, 0) + #endif + == Appendix C. Functions available for use in extconf.rb See documentation for {mkmf}[rdoc-ref:MakeMakefile]. @@ -1918,7 +2153,7 @@ Before inserting write barriers, you need to know about RGenGC algorithm available in include/ruby/ruby.h. An example is available in iseq.c. For a complete guide for RGenGC and write barriers, please refer to -<https://bugs.ruby-lang.org/projects/ruby-trunk/wiki/RGenGC>. 
+<https://bugs.ruby-lang.org/projects/ruby-master/wiki/RGenGC>. == Appendix E. RB_GC_GUARD to protect from premature GC @@ -1969,6 +2204,89 @@ keyword in C. RB_GC_GUARD has the following advantages: compilers and architectures. RB_GC_GUARD is customizable for broken systems/compilers without negatively affecting other systems. -:enddoc: Local variables: -:enddoc: fill-column: 70 -:enddoc: end: +== Appendix F. Ractor support + +Ractor(s) are the parallel execution mechanism introduced in Ruby 3.0. All +ractors can run in parallel on a different OS thread (using an underlying system +provided thread), so the C extension should be thread-safe. A C extension that +can run in multiple ractors is called "Ractor-safe". + +Ractor safety around C extensions has the following properties: +1. By default, all C extensions are recognized as Ractor-unsafe. +2. Ractor-unsafe C-methods may only be called from the main Ractor. If invoked + by a non-main Ractor, then a Ractor::UnsafeError is raised. +3. If an extension is to be marked as Ractor-safe, the extension should + call rb_ext_ractor_safe(true) in the Init_ function for the extension, and + all defined methods will be marked as Ractor-safe. + +To make a "Ractor-safe" C extension, we need to check the following points: + +1. Do not share unshareable objects between ractors + + For example, a C global variable can lead to sharing an unshareable object + between ractors. + + VALUE g_var; + VALUE set(VALUE self, VALUE v){ return g_var = v; } + VALUE get(VALUE self){ return g_var; } + + The set() and get() pair can share an unshareable object through g_var, so + it is Ractor-unsafe. + + Besides direct use of global variables, an indirect data structure + such as a global st_table can also share objects, so please take care. + + Note that class and module objects are shareable objects, so you can + keep the code "cFoo = rb_define_class(...)" with C's global variables. + +2. 
Check the thread-safety of the extension + + An extension should be thread-safe. For example, the following code is + not thread-safe: + + bool g_called = false; + VALUE call(VALUE self) { + if (g_called) rb_raise(rb_eRuntimeError, "recursive call is not allowed."); + g_called = true; + VALUE ret = do_something(); + g_called = false; + return ret; + } + + because access to the g_called global variable is not synchronized with + other ractors' threads. To avoid such a data race, some synchronization should + be used. Check include/ruby/thread_native.h and include/ruby/atomic.h. + + With Ractors, all objects given as method parameters and the receiver (self) + are guaranteed to be from the current Ractor or to be shareable. As a + consequence, it is easier to make code ractor-safe than to make code generally + thread-safe. For example, we don't need to lock an array object to access + its elements. + +3. Check the thread-safety of any used library + + If the extension relies on an external library, such as a function foo() from + a library libfoo, the function foo() should be thread-safe. + +4. Make an object shareable + + This is not required to make an extension Ractor-safe. + + If an extension provides special objects defined by rb_data_type_t, + consider whether these objects can become shareable. + + The RUBY_TYPED_FROZEN_SHAREABLE flag indicates that these objects can be + shareable objects if the object is frozen. This means that if the object + is frozen, the mutation of wrapped data is not allowed. + +5. Others + + There are possibly other points or requirements which must be considered in the + making of a Ractor-safe extension. This document will be extended as they are + discovered. 
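Seen from the Ruby side, the shareability rules in point 1 above can be checked with Ractor.shareable?. The following is a minimal sketch, assuming Ruby 3.0 or later; the variable names are illustrative only and are not part of the extension API:

```ruby
# Minimal sketch (assumes Ruby 3.0+): which objects count as shareable.
# All variable names here are illustrative.
mutable_string = +"text"          # unfrozen String: unshareable
frozen_string  = "text".freeze    # frozen String: shareable
mutable_array  = [frozen_string]  # unfrozen container: unshareable

p Ractor.shareable?(mutable_string)
p Ractor.shareable?(frozen_string)
p Ractor.shareable?(mutable_array)
p Ractor.shareable?(String)       # class and module objects are shareable

# Ractor.make_shareable deep-freezes an object so that it becomes shareable.
shared_array = Ractor.make_shareable(["a", "b"])
p Ractor.shareable?(shared_array)
p shared_array.frozen?
```

An object that a C extension keeps in a global variable is only safe to hand to another ractor if a check like this would report it as shareable.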
+ +-- +Local variables: +fill-column: 70 +end: +++ diff --git a/doc/float.rb b/doc/float.rb new file mode 100644 index 0000000000..01668bfc6d --- /dev/null +++ b/doc/float.rb @@ -0,0 +1,128 @@ +# A \Float object stores a real number +# using the native architecture's double-precision floating-point representation. +# +# == \Float Imprecisions +# +# Some real numbers can be represented precisely as \Float objects: +# +# 37.5 # => 37.5 +# 98.75 # => 98.75 +# 12.3125 # => 12.3125 +# +# Others cannot; among these are the transcendental numbers, including: +# +# - Pi, <i>π</i>: in mathematics, a number of infinite precision: +# 3.1415926535897932384626433... (to 25 places); +# in Ruby, it is of limited precision (in this case, to 15 decimal places): +# +# Math::PI # => 3.141592653589793 +# +# - Euler's number, <i>e</i>: in mathematics, a number of infinite precision: +# 2.7182818284590452353602874... (to 25 places); +# in Ruby, it is of limited precision (in this case, to 15 decimal places): +# +# Math::E # => 2.718281828459045 +# +# Some floating-point computations in Ruby give precise results: +# +# 1.0/2 # => 0.5 +# 100.0/8 # => 12.5 +# +# Others do not: +# +# - In mathematics, 2/3 as a decimal number is an infinitely-repeating decimal: +# 0.666... (forever); +# in Ruby, +2.0/3+ is of limited precision (in this case, to 16 decimal places): +# +# 2.0/3 # => 0.6666666666666666 +# +# - In mathematics, the square root of 2 is an irrational number of infinite precision: +# 1.4142135623730950488016887... 
(to 25 decimal places); +# in Ruby, it is of limited precision (in this case, to 16 decimal places): +# +# Math.sqrt(2.0) # => 1.4142135623730951 +# +# - Even a simple computation can introduce imprecision: +# +# x = 0.1 + 0.2 # => 0.30000000000000004 +# y = 0.3 # => 0.3 +# x == y # => false +# +# See: +# +# - https://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html +# - https://github.com/rdp/ruby_tutorials_core/wiki/Ruby-Talk-FAQ#-why-are-rubys-floats-imprecise +# - https://en.wikipedia.org/wiki/Floating_point#Accuracy_problems +# +# Note that precise storage and computation of rational numbers +# is possible using Rational objects. +# +# == Creating a \Float +# +# You can create a \Float object explicitly with: +# +# - A {floating-point literal}[rdoc-ref:syntax/literals.rdoc@Float+Literals]. +# +# You can convert certain objects to Floats with: +# +# - Method #Float. +# +# == What's Here +# +# First, what's elsewhere. Class \Float: +# +# - Inherits from +# {class Numeric}[rdoc-ref:Numeric@What-27s+Here] +# and {class Object}[rdoc-ref:Object@What-27s+Here]. +# - Includes {module Comparable}[rdoc-ref:Comparable@What-27s+Here]. +# +# Here, class \Float provides methods for: +# +# - {Querying}[rdoc-ref:Float@Querying] +# - {Comparing}[rdoc-ref:Float@Comparing] +# - {Converting}[rdoc-ref:Float@Converting] +# +# === Querying +# +# - #finite?: Returns whether +self+ is finite. +# - #hash: Returns the integer hash code for +self+. +# - #infinite?: Returns whether +self+ is infinite. +# - #nan?: Returns whether +self+ is a NaN (not-a-number). +# +# === Comparing +# +# - #<: Returns whether +self+ is less than the given value. +# - #<=: Returns whether +self+ is less than or equal to the given value. +# - #<=>: Returns a number indicating whether +self+ is less than, equal +# to, or greater than the given value. +# - #== (aliased as #=== and #eql?): Returns whether +self+ is equal to +# the given value. 
+# - #>: Returns whether +self+ is greater than the given value. +# - #>=: Returns whether +self+ is greater than or equal to the given value. +# +# === Converting +# +# - #% (aliased as #modulo): Returns +self+ modulo the given value. +# - #*: Returns the product of +self+ and the given value. +# - #**: Returns the value of +self+ raised to the power of the given value. +# - #+: Returns the sum of +self+ and the given value. +# - #-: Returns the difference of +self+ and the given value. +# - #/: Returns the quotient of +self+ and the given value. +# - #ceil: Returns the smallest number greater than or equal to +self+. +# - #coerce: Returns a 2-element array containing the given value converted to a \Float +# and +self+ +# - #divmod: Returns a 2-element array containing the quotient and remainder +# results of dividing +self+ by the given value. +# - #fdiv: Returns the \Float result of dividing +self+ by the given value. +# - #floor: Returns the greatest number smaller than or equal to +self+. +# - #next_float: Returns the next-larger representable \Float. +# - #prev_float: Returns the next-smaller representable \Float. +# - #quo: Returns the quotient from dividing +self+ by the given value. +# - #round: Returns +self+ rounded to the nearest value, to a given precision. +# - #to_i (aliased as #to_int): Returns +self+ truncated to an Integer. +# - #to_s (aliased as #inspect): Returns a string containing the place-value +# representation of +self+ in the given radix. +# - #truncate: Returns +self+ truncated to a given precision. 
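The imprecision described under "Float Imprecisions" is usually handled either by comparing with a tolerance or by switching to Rational. The following is a minimal sketch of both; the helper name nearly_equal? and its default tolerance are assumptions made for illustration and are not part of Float:

```ruby
# Minimal sketch: tolerance-based Float comparison and exact Rational math.
# nearly_equal? is a hypothetical helper, not a core method.
def nearly_equal?(a, b, tolerance = Float::EPSILON)
  # Scale the tolerance by the larger magnitude (at least 1.0).
  (a - b).abs <= tolerance * [a.abs, b.abs, 1.0].max
end

x = 0.1 + 0.2
y = 0.3

p x == y                # exact comparison is affected by rounding
p nearly_equal?(x, y)   # tolerance-based comparison
p y.next_float == x     # x is the next representable Float after y

# Rational keeps the same computation exact.
exact = Rational(1, 10) + Rational(2, 10)
p exact == Rational(3, 10)
p exact.to_f
```

A relative tolerance is used here so that the comparison still behaves sensibly for large values; a fixed absolute tolerance may be enough when the expected magnitudes are known.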
+# + + class Float; end diff --git a/doc/forwardable.rd.ja b/doc/forwardable.rd.ja index 171724b2e5..53e8202513 100644 --- a/doc/forwardable.rd.ja +++ b/doc/forwardable.rd.ja @@ -1,4 +1,4 @@ - -- forwatable.rb + -- forwardable.rb $Release Version: 1.1 $ $Revision$ diff --git a/doc/globals.rdoc b/doc/globals.rdoc deleted file mode 100644 index 146b7fc34f..0000000000 --- a/doc/globals.rdoc +++ /dev/null @@ -1,72 +0,0 @@ -# -*- mode: rdoc; coding: utf-8; fill-column: 74; -*- - -== Pre-defined global variables - -$!:: The Exception object set by Kernel#raise. -$@:: The same as <code>$!.backtrace</code>. -$~:: The information about the last match in the current scope (thread-local and frame-local). -$&:: The string matched by the last successful match. -$`:: The string to the left of the last successful match. -$':: The string to the right of the last successful match. -$+:: The highest group matched by the last successful match. -$1:: The Nth group of the last successful match. May be > 1. -$=:: This variable is no longer effective. Deprecated. -$/:: The input record separator, newline by default. Aliased to $-0. -$\:: The output record separator for Kernel#print and IO#write. Default is +nil+. -$,:: The output field separator for Kernel#print and Array#join. Non-nil $, will be deprecated. -$;:: The default separator for String#split. Non-nil $; will be deprecated. Aliased to $-F. -$.:: The current input line number of the last file that was read. -$<:: The same as ARGF. -$>:: The default output stream for Kernel#print and Kernel#printf. $stdout by default. -$_:: The last input line of string by gets or readline. -$0:: Contains the name of the script being executed. May be assignable. -$*:: The same as ARGV. -$$:: The process number of the Ruby running this script. Same as Process.pid. -$?:: The status of the last executed child process (thread-local). -$LOAD_PATH:: Load path for searching Ruby scripts and extension libraries used - by Kernel#load and Kernel#require. 
Aliased to $: and $-I. - Has a singleton method <code>$LOAD_PATH.resolve_feature_path(feature)</code> - that returns [+:rb+ or +:so+, path], which resolves the feature to - the path the original Kernel#require method would load. -$LOADED_FEATURES:: The array contains the module names loaded by require. - Aliased to $". -$DEBUG:: The debug flag, which is set by the -d switch. Enabling debug - output prints each exception raised to $stderr (but not its - backtrace). Setting this to a true value enables debug output as - if -d were given on the command line. Setting this to a false - value disables debug output. Aliased to $-d. -$FILENAME:: Current input filename from ARGF. Same as ARGF.filename. -$stderr:: The current standard error output. -$stdin:: The current standard input. -$stdout:: The current standard output. -$VERBOSE:: The verbose flag, which is set by the -w or -v switch. Setting - this to a true value enables warnings as if -w or -v were given - on the command line. Setting this to +nil+ disables warnings, - including from Kernel#warn. Aliased to $-v and $-w. -$-a:: True if option -a is set. Read-only variable. -$-i:: In in-place-edit mode, this variable holds the extension, otherwise +nil+. -$-l:: True if option -l is set. Read-only variable. -$-p:: True if option -p is set. Read-only variable. - -== Pre-defined global constants - -TRUE:: The typical true value. Deprecated. -FALSE:: The +false+ itself. Deprecated. -NIL:: The +nil+ itself. Deprecated. -STDIN:: The standard input. The default value for $stdin. -STDOUT:: The standard output. The default value for $stdout. -STDERR:: The standard error output. The default value for $stderr. -ENV:: The hash contains current environment variables. -ARGF:: The virtual concatenation of the files given on command line (or from $stdin if no files were given). -ARGV:: An Array of command line arguments given for the script. -DATA:: The file object of the script, pointing just after <code>__END__</code>. 
-TOPLEVEL_BINDING:: The Binding of the top level scope. -RUBY_VERSION:: The Ruby language version. -RUBY_RELEASE_DATE:: The release date string. -RUBY_PLATFORM:: The platform identifier. -RUBY_PATCHLEVEL:: The patchlevel for this ruby. If this is a development build of ruby the patchlevel will be -1. -RUBY_REVISION:: The GIT commit hash for this ruby. -RUBY_COPYRIGHT:: The copyright string for ruby. -RUBY_ENGINE:: The name of the Ruby implementation. -RUBY_ENGINE_VERSION:: The version of the Ruby implementation. -RUBY_DESCRIPTION:: The same as <tt>ruby --version</tt>, a String describing various aspects of the Ruby implementation. diff --git a/doc/index.md b/doc/index.md new file mode 100644 index 0000000000..596825a19c --- /dev/null +++ b/doc/index.md @@ -0,0 +1,65 @@ +# Ruby Documentation + +Welcome to the official Ruby programming language documentation. + +## Getting Started + +New to Ruby? Start with our [Getting Started Guide](https://www.ruby-lang.org/en/documentation/quickstart/). + +## Core Classes and Modules + +Explore the essential classes and modules: + +- [String](String.html) - Text manipulation and string utilities. +- [Symbol](Symbol.html) - Named identifiers inside the Ruby interpreter. +- [Array](Array.html) - Ordered collections of objects. +- [Hash](Hash.html) - Key-value pairs for efficient data retrieval. +- [Integer](Integer.html) - \Integer number class. +- [Float](Float.html) - Floating-point number class. +- [Enumerable](Enumerable.html) - Collection traversal and searching. +- [File](File.html) - \File operations and handling. +- [IO](IO.html) - Input/output functionality. +- [Time](Time.html) - \Time representation. +- [Regexp](Regexp.html) - Regular expressions for pattern matching. +- [Range](Range.html) - Representing a range of values. +- [Exception](Exception.html) - Base class for all exceptions. +- [Thread](Thread.html) - Multithreading and concurrency. 
+ +## Language Reference + +Deep dive into Ruby's syntax and features: + +- [Ruby Syntax](rdoc-ref:syntax.rdoc) +- [Exceptions](rdoc-ref:exceptions.md) +- [Implicit Conversions](rdoc-ref:implicit_conversion.rdoc) + +## Standard Libraries + +There are some standard libraries included in Ruby that are also commonly used, such as: + +- [Date](Date.html) - \Date representation. +- [JSON](JSON.html) - \JSON encoding and decoding. +- [ERB](ERB.html) - Embedded Ruby for templating. +- [Net::HTTP](Net/HTTP.html) - HTTP client library. + +Use the following links to access the comprehensive set of libraries included with Ruby: + +- [Standard Library Documentation](rdoc-ref:standard_library.md) +- [Maintainers](rdoc-ref:maintainers.md) + +## Contribute to Ruby + +Get involved with the Ruby community: + +- [Contribution Guide](rdoc-ref:contributing/contributing.md) +- [Documentation Guide](rdoc-ref:contributing/documentation_guide.md) +- [Reporting Issues](rdoc-ref:contributing/reporting_issues.md) +- [Building Ruby](rdoc-ref:contributing/building_ruby.md) +- [Testing Ruby](rdoc-ref:contributing/testing_ruby.md) +- [Issue Tracker](https://bugs.ruby-lang.org/projects/ruby-master/issues) + +## Additional Resources + +- [Ruby Homepage](https://www.ruby-lang.org/) +- [RubyGems](https://rubygems.org/) +- [Ruby Community](https://www.ruby-lang.org/en/community/) diff --git a/doc/irb/irb-tools.rd.ja b/doc/irb/irb-tools.rd.ja deleted file mode 100644 index b997f0edea..0000000000 --- a/doc/irb/irb-tools.rd.ja +++ /dev/null @@ -1,184 +0,0 @@ -Bonus commands and libraries related to irb - $Release Version: 0.7.1 $ - $Revision$ - by Keiju ISHITSUKA(Nihon Rational Co.,Ltd.) - -=begin - -:Commands: -* rtags -- ruby tags command - -:Function libraries: -* xmp -- irb version of gotoken xmp-function - -:Class libraries: -* frame.rb -- frame tracer -* completion.rb -- irb completor - -= rtags - -rtags is a command that creates TAG files for emacs and vi. - -== Usage - - rtags [-vi] file.... 
- -A TAGS file for emacs is created in the current directory. 
With the -vi option, a tags file for vi is created instead. - -With emacs, the standard etags.el can be used as is. The following can be searched: - -* classes -* methods -* singleton methods -* alias -* accessors declared with attr (only when the parameter is a symbol or a string literal) -* accessors declared with attr_XXX (only when the parameter is a symbol or a string literal) - -What differs from usage with C and similar languages is completion. - -Function names are completed as - - function_name( - -classes as - - ::ClassName::....::ClassName - -methods as - - ::ClassName::....::ClassName#method_name - -and singleton methods (class methods) as - - ::ClassName::....::ClassName.method_name - -= xmp.rb - -An upward-compatible version of gotoken's xmp. It is very heavy, however, so it is best used only when gotoken's xmp cannot handle the task. - -== Usage - -=== Using it as a function - - require "irb/xmp" - xmp <<END - foo = 1 - foo - END - --- - foo = 1 - ==>1 - foo - ==>1 - -=== Using an XMP instance - -In this case, XMP holds the context information, so variable values and the like are retained. - - require "irb/xmp" - xmp = XMP.new - xmp.puts <<END - foo = 1 - foo - END - xmp.puts <<END - foo - END - === - foo = 1 - ==>1 - foo - ==>1 - foo - ==>1 - -== About the context - -The XMP methods evaluate code in the context of the caller. -If a context is given explicitly, the code is evaluated in that context. - -Example: - - xmp "foo", an_binding - -:Note: -Multithreading is not supported. - -= frame.rb -A class for handling information about the frames currently being executed. - -* IRB::Frame.top(n = 0) - Retrieves the n-th context from the top. n = 0 is the topmost. -* IRB::Frame.bottom(n = 0) - Retrieves the n-th context from the bottom. n = 0 is the bottommost. -* IRB::Frame.sender - Retrieves the sender object. The sender is the self of the side that called the method. - -:Note: -Ruby execution is traced using set_trace_func. Multithreading is not supported. 
- -= completion.rb -Provides the completion feature for irb. - -== Usage - -Either run - - % irb -r irb/completion - -or put - - require "irb/completion" - -in ~/.irbrc. 
You can also run require "irb/completion" while irb is running. - -Pressing (TAB) while irb is running performs completion. - -Pressing (TAB) at the top level lists all candidate syntax elements, classes, and methods. If there is only one candidate, it is completed in full. - - irb(main):001:0> in - in inspect instance_eval - include install_alias_method instance_of? - initialize install_aliases instance_variables - irb(main):001:0> inspect - "main" - irb(main):002:0> foo = Object.new - #<Object:0x4027146c> - - Pressing (TAB) after ((|variable name.|)) lists the methods of that object. - - irb(main):003:0> foo. - foo.== foo.frozen? foo.protected_methods - foo.=== foo.hash foo.public_methods - foo.=~ foo.id foo.respond_to? - foo.__id__ foo.inspect foo.send - foo.__send__ foo.instance_eval foo.singleton_methods - foo.class foo.instance_of? foo.taint - foo.clone foo.instance_variables foo.tainted? - foo.display foo.is_a? foo.to_a - foo.dup foo.kind_of? foo.to_s - foo.eql? foo.method foo.type - foo.equal? foo.methods foo.untaint - foo.extend foo.nil? - foo.freeze foo.private_methods - -=end - -% Begin Emacs Environment -% Local Variables: -% mode: text -% comment-column: 0 -% comment-start: "%" -% comment-end: "\n" -% End: -% - diff --git a/doc/irb/irb.rd.ja b/doc/irb/irb.rd.ja deleted file mode 100644 index 31b7cdf3e3..0000000000 --- a/doc/irb/irb.rd.ja +++ /dev/null @@ -1,408 +0,0 @@ -irb -- interactive ruby - $Release Version: 0.9.5 $ - $Revision$ - by Keiju ISHITSUKA(keiju@ruby-lang.org) -=begin -= What is irb? - -irb stands for interactive ruby. It is a tool for easily entering and executing Ruby expressions from standard input. - -= Startup - - % irb - -starts irb. - -= Usage - -Using irb is quite easy as long as you know Ruby. Basically, you just run the irb command. When irb starts, a prompt like the one below appears. Then enter Ruby expressions. Each expression is executed as soon as it is complete. 
- - dim% irb - irb(main):001:0> 1+2 - 3 - irb(main):002:0> class Foo - irb(main):003:1> def foo - irb(main):004:2> print 1 - irb(main):005:2> end - irb(main):006:1> end - nil - irb(main):007:0> - -irb also supports the Readline module. When the Readline module is installed, using it is the default behavior. - -= Command-line options - - irb.rb [options] file_name opts - options: - -f Do not read ~/.irbrc. - -d Set $DEBUG to true (same as ruby -d) - -r load-module Same as ruby -r. - -I path Add path to $LOAD_PATH. - -U Same as ruby -U. - -E enc Same as ruby -E. - -w Same as ruby -w. - -W[level=2] Same as ruby -W. - --context-mode n Set how the Binding object associated with a newly - created workspace is built, to one of 0 through 3. - --echo Show the result of execution (default). - --noecho Do not show the result of execution. - --inspect Use inspect for result output. - --noinspect Do not use inspect for result output. - --readline Use the readline library. - --noreadline Do not use the readline library. - --colorize Use colorization. - --nocolorize Do not use colorization. - --prompt prompt-mode/--prompt-mode prompt-mode - Switch the prompt mode. The currently defined prompt - modes are default, simple, xmp, and inf-ruby. - --inf-ruby-mode Use a prompt suited to emacs inf-ruby-mode. Unless - specified otherwise, the readline library is not used. - --sample-book-mode/--simple-prompt - A mode that uses a very simple prompt. - --noprompt Do not display a prompt. - --single-irb Share the object obtained by evaluating self in irb - with sub-irbs. - --tracer Trace command execution. - --back-trace-limit n - Display only the first n and the last n entries of - a backtrace. The default is 16 - - --verbose Output detailed messages. - --noverbose Do not output detailed messages (default). - -v, --version Show the irb version. - -h, --help Show the irb help. - -- Do not treat the following command-line arguments as options. 
- -= Configuration - -irb reads ``~/.irbrc'' at startup. 
If it does not exist, irb tries to load -``.irbrc'', ``irb.rc'', ``_irbrc'', and ``$irbrc'' in that order. - -Instead of passing options, the default behavior can also be set with the following commands. - - IRB.conf[:IRB_NAME]="irb" - IRB.conf[:USE_TRACER]=false - IRB.conf[:USE_LOADER]=false - IRB.conf[:IGNORE_SIGINT]=true - IRB.conf[:IGNORE_EOF]=false - IRB.conf[:INSPECT_MODE]=nil - IRB.conf[:IRB_RC] = nil - IRB.conf[:BACK_TRACE_LIMIT]=16 - IRB.conf[:USE_LOADER] = false - IRB.conf[:USE_READLINE] = nil - IRB.conf[:USE_TRACER] = false - IRB.conf[:IGNORE_SIGINT] = true - IRB.conf[:IGNORE_EOF] = false - IRB.conf[:PROMPT_MODE] = :DEFAULT - IRB.conf[:PROMPT] = {...} - IRB.conf[:VERBOSE]=true - -== Prompt settings - -To customize the prompt, use - - IRB.conf[:PROMPT] - -For example, write an expression like the following in .irbrc: - - IRB.conf[:PROMPT][:MY_PROMPT] = { # name of the prompt mode - :PROMPT_I => nil, # normal prompt - :PROMPT_N => nil, # prompt for continuation lines - :PROMPT_S => nil, # prompt for continuation lines of strings and the like - :PROMPT_C => nil, # prompt while an expression continues - :RETURN => " ==>%s\n" # format used when printing the return value - } - -To select a prompt mode, - - irb --prompt my-prompt - -starts irb in that prompt mode. Alternatively, the following can be written in .irbrc. - - IRB.conf[:PROMPT_MODE] = :MY_PROMPT - -PROMPT_I, PROMPT_N, PROMPT_S, and PROMPT_C specify a format. - - %N The name of the running command is output. - %m The main object (self) is output using to_s. - %M The main object (self) is output using inspect. - %l The type of the string being entered (", ', /, ], where `]' means inside %w) - %NNi The indentation level. NN is a number, as in %NNd of printf. It can - be omitted - %NNn The line number. - %% % - -For example, the default prompt mode is: - - IRB.conf[:PROMPT_MODE][:DEFAULT] = { - :PROMPT_I => "%N(%m):%03n:%i> ", - :PROMPT_N => "%N(%m):%03n:%i> ", - :PROMPT_S => "%N(%m):%03n:%i%l ", - :PROMPT_C => "%N(%m):%03n:%i* ", - :RETURN => "%s\n" - } - -RETURN currently uses printf format. 
This specification may change in the future. 
- -== Sub-irb settings - -The command-line options and IRB.conf determine the default settings used when a (sub-)irb starts; the conf command described in `5. Commands' allows each (sub-)irb to be configured individually. - -If a proc is set in IRB.conf[:IRB_RC], that proc is called with the irb context as its argument whenever a sub-irb is started. This makes it possible to change the settings for each sub-irb. - - -= Commands - -Each irb extension command is defined under both a short name and a name prefixed with `irb_'. This is in case the short name gets overridden. - ---- exit, quit, irb_exit - Exit. - In a sub-irb, exit that sub-irb. - ---- conf, irb_context - Show the current irb settings. Settings can be changed by sending - messages to conf. - ---- conf.eval_history = N - Configure the history of evaluation results. - N is an integer or nil. If N > 0, that many results are kept in the history. If N == 0, - an unlimited number is kept. If nil, the history feature is turned off (default). - ---- conf.back_trace_limit - Display only the first n and the last n entries of a backtrace. - The default is 16 - ---- conf.ignore_eof = true/false - Set the behavior when ^D is entered. If true, ^D is ignored; if false, - irb exits. - ---- conf.ignore_sigint= true/false - Set the behavior when ^C is entered. If false, irb exits. If true, - the behavior is as follows: - During input: cancel what has been entered so far and return to the top level. - During execution: abort the execution. - ---- conf.inf_ruby_mode = true/false - Use a prompt suited to inf-ruby-mode. The default is false. - ---- conf.inspect_mode = true/false/nil - Set the inspect mode. - true: display results using inspect. - false: display results using the normal print. - nil: in normal mode, inspect mode is used; in math mode, non - inspect mode is used. - ---- conf.use_loader = true/false - Switch for the mode that uses irb's own file loading for load/require - (off by default). This mode applies to IRB as a whole. 
- ---- conf.prompt_c - The prompt used while a line continues, such as right after if. - ---- conf.prompt_i - The normal prompt. 
- ---- conf.prompt_s - The prompt used inside a string and similar constructs. - ---- conf.rc - Whether ~/.irbrc has been read. - ---- conf.use_prompt = true/false - Whether to display a prompt. By default a prompt is displayed. - ---- conf.use_readline = true/false/nil - Whether to use readline. - true: use readline. - false: do not use readline. - nil: (default) try to use the readline library except in - inf-ruby-mode. -# -#--- conf.verbose=T/F -# Whether irb outputs various messages. - ---- cws, chws, irb_cws, irb_chws, irb_change_workspace [obj] - Make obj the self. If obj is omitted, the home workspace, that is, - the main object at the time irb was started, becomes self. - ---- pushws, irb_pushws, irb_push_workspace [obj] - Similar to the UNIX shell command pushd. - ---- popws, irb_popws, irb_pop_workspace - Similar to the UNIX shell command popd. - ---- irb [obj] - Start a sub-irb. If obj is given, that obj becomes self. - ---- jobs, irb_jobs - List the sub-irbs - ---- fg n, irb_fg n - Switch to the specified sub-irb. n is one of the following. - - irb number - thread - irb object - self (the obj given when started with irb obj) - ---- kill n, irb_kill n - Kill a sub-irb. n is the same as for fg. - ---- source, irb_source path - Similar to the UNIX shell command source. Evaluates the script in path - in the current environment. - ---- irb_load path, prev - - The irb version of Ruby's load. - -= System variables - ---- _ - Holds the result of the previous evaluation (a local variable). ---- __ - Holds the history of evaluation results. - __[line_no] returns the result evaluated on that line. If line_no is negative, - the result from -line_no entries before the latest one is returned. - -= Usage example - -It looks like the following. 
- - dim% ruby irb.rb - irb(main):001:0> irb # start a sub-irb - irb#1(main):001:0> jobs # list the sub-irbs - #0->irb on main (#<Thread:0x400fb7e4> : stop) - #1->irb#1 on main (#<Thread:0x40125d64> : running) - nil - irb#1(main):002:0> fg 0 # switch jobs - nil - irb(main):002:0> class Foo;end - nil - irb(main):003:0> irb Foo # start irb with Foo as - # the context - irb#2(Foo):001:0> def foo # define Foo#foo - irb#2(Foo):002:1> print 1 - irb#2(Foo):003:1> end - nil - irb#2(Foo):004:0> fg 0 # switch jobs - nil - irb(main):004:0> jobs # list the jobs - #0->irb on main (#<Thread:0x400fb7e4> : running) - #1->irb#1 on main (#<Thread:0x40125d64> : stop) - #2->irb#2 on Foo (#<Thread:0x4011d54c> : stop) - nil - irb(main):005:0> Foo.instance_methods # Foo#foo is properly - # defined - ["foo"] - irb(main):006:0> fg 2 # switch jobs - nil - irb#2(Foo):005:0> def bar # define Foo#bar - irb#2(Foo):006:1> print "bar" - irb#2(Foo):007:1> end - nil - irb#2(Foo):010:0> Foo.instance_methods - ["bar", "foo"] - irb#2(Foo):011:0> fg 0 - nil - irb(main):007:0> f = Foo.new - #<Foo:0x4010af3c> - irb(main):008:0> irb f # start irb on an instance - # of Foo. - irb#3(#<Foo:0x4010af3c>):001:0> jobs - #0->irb on main (#<Thread:0x400fb7e4> : stop) - #1->irb#1 on main (#<Thread:0x40125d64> : stop) - #2->irb#2 on Foo (#<Thread:0x4011d54c> : stop) - #3->irb#3 on #<Foo:0x4010af3c> (#<Thread:0x4010a1e0> : running) - nil - irb#3(#<Foo:0x4010af3c>):002:0> foo # run f.foo - nil - irb#3(#<Foo:0x4010af3c>):003:0> bar # run f.bar - barnil - irb#3(#<Foo:0x4010af3c>):004:0> kill 1, 2, 3# kill the jobs - nil - irb(main):009:0> jobs - #0->irb on main (#<Thread:0x400fb7e4> : running) - nil - irb(main):010:0> exit # exit - dim% - -= Restrictions - -irb evaluates input incrementally, as soon as it can be evaluated (when an expression is closed). As a result, its behavior can differ slightly from running ruby directly. - -The currently known problems are described below. 
- -== Declaring local variables - -In ruby, the following program results in an error. - - eval "foo = 0" - foo - -- - -:2: undefined local variable or method `foo' for #<Object:0x40283118> (NameError) - --- - NameError - -With irb, however, - - >> eval "foo = 0" - => 0 - >> foo - => 0 - -no error is raised. This is because ruby first compiles the whole script and determines the local variables at that point, whereas irb evaluates each expression automatically as soon as it becomes executable (when the expression is closed). In the example above, - - eval "foo = 0" - -is evaluated as soon as it is entered, which defines the variable at that point, so foo is already defined when the next expression is evaluated. - -To remove this difference between ruby and irb, wrap the code in begin...end so that it is executed as a batch: - - >> begin - ?> eval "foo = 0" - >> foo - >> end - NameError: undefined local variable or method `foo' for #<Object:0x4013d0f0> - (irb):3 - (irb_local_binding):1:in `eval' - -== Here documents - -The implementation of here documents is currently incomplete. - -== Symbols - -irb sometimes misjudges whether something is a symbol. Specifically, it may treat an expression as continuing onto the next line even though it is complete. 
- -=end - -% Begin Emacs Environment -% Local Variables: -% mode: text -% comment-column: 0 -% comment-start: "%" -% comment-end: "\n" -% End: -% diff --git a/doc/jit/yjit.md b/doc/jit/yjit.md new file mode 100644 index 0000000000..24aa163e60 --- /dev/null +++ b/doc/jit/yjit.md @@ -0,0 +1,544 @@ +<p align="center"> + <a href="https://yjit.org/" target="_blank" rel="noopener noreferrer"> + <img src="https://user-images.githubusercontent.com/224488/131155756-aa8fb528-a813-4dfd-99ac-8785c3d5eed7.png" width="400"> + </a> +</p> + +YJIT - Yet Another Ruby JIT +=========================== + +YJIT is a lightweight, minimalistic Ruby JIT built inside CRuby. +It lazily compiles code using a Basic Block Versioning (BBV) architecture. +YJIT is currently supported for macOS, Linux and BSD on x86-64 and arm64/aarch64 CPUs. +This project is open source and falls under the same license as CRuby. 
+ +<p align="center"><b> + If you're using YJIT in production, please + <a href="mailto:ruby@shopify.com">share your success stories with us!</a> +</b></p> + +If you wish to learn more about the approach taken, here are some conference talks and publications: + +- MPLR 2023 talk: [Evaluating YJIT’s Performance in a Production Context: A Pragmatic Approach](https://www.youtube.com/watch?v=pVRmPZcNUhc) +- RubyKaigi 2023 keynote: [Optimizing YJIT’s Performance, from Inception to Production](https://www.youtube.com/watch?v=X0JRhh8w_4I) +- RubyKaigi 2023 keynote: [Fitting Rust YJIT into CRuby](https://www.youtube.com/watch?v=GI7vvAgP_Qs) +- RubyKaigi 2022 keynote: [Stories from developing YJIT](https://www.youtube.com/watch?v=EMchdR9C8XM) +- RubyKaigi 2022 talk: [Building a Lightweight IR and Backend for YJIT](https://www.youtube.com/watch?v=BbLGqTxTRp0) +- RubyKaigi 2021 talk: [YJIT: Building a New JIT Compiler Inside CRuby](https://www.youtube.com/watch?v=PBVLf3yfMs8) +- Blog post: [YJIT: Building a New JIT Compiler Inside CRuby](https://pointersgonewild.com/2021/06/02/yjit-building-a-new-jit-compiler-inside-cruby/) +- MPLR 2023 paper: [Evaluating YJIT’s Performance in a Production Context: A Pragmatic Approach](https://dl.acm.org/doi/10.1145/3617651.3622982) +- VMIL 2021 paper: [YJIT: A Basic Block Versioning JIT Compiler for CRuby](https://dl.acm.org/doi/10.1145/3486606.3486781) +- MoreVMs 2021 talk: [YJIT: Building a New JIT Compiler Inside CRuby](https://www.youtube.com/watch?v=vucLAqv7qpc) +- ECOOP 2016 talk: [Interprocedural Type Specialization of JavaScript Programs Without Type Analysis](https://www.youtube.com/watch?v=sRNBY7Ss97A) +- ECOOP 2016 paper: [Interprocedural Type Specialization of JavaScript Programs Without Type Analysis](https://drops.dagstuhl.de/opus/volltexte/2016/6101/pdf/LIPIcs-ECOOP-2016-7.pdf) +- ECOOP 2015 talk: [Simple and Effective Type Check Removal through Lazy Basic Block Versioning](https://www.youtube.com/watch?v=S-aHBuoiYE0) +- 
ECOOP 2015 paper: [Simple and Effective Type Check Removal through Lazy Basic Block Versioning](https://arxiv.org/pdf/1411.0352.pdf) + +To cite YJIT in your publications, please cite the MPLR 2023 paper: + +```BibTeX +@inproceedings{yjit_mplr_2023, +author = {Chevalier-Boisvert, Maxime and Kokubun, Takashi and Gibbs, Noah and Wu, Si Xing (Alan) and Patterson, Aaron and Issroff, Jemma}, +title = {Evaluating YJIT’s Performance in a Production Context: A Pragmatic Approach}, +year = {2023}, +isbn = {9798400703805}, +publisher = {Association for Computing Machinery}, +address = {New York, NY, USA}, +url = {https://doi.org/10.1145/3617651.3622982}, +doi = {10.1145/3617651.3622982}, +booktitle = {Proceedings of the 20th ACM SIGPLAN International Conference on Managed Programming Languages and Runtimes}, +pages = {20–33}, +numpages = {14}, +keywords = {dynamically typed, optimization, just-in-time, virtual machine, ruby, compiler, bytecode}, +location = {Cascais, Portugal}, +series = {MPLR 2023} +} +``` + +## Current Limitations + +YJIT may not be suitable for certain applications. It currently only supports macOS, Linux and BSD on x86-64 and arm64/aarch64 CPUs. YJIT will use more memory than the Ruby interpreter because the JIT compiler needs to generate machine code in memory and maintain additional state information. +You can change how much executable memory is allocated using [YJIT's command-line options](#command-line-options). + +## Installation + +### Requirements + +You will need to install: + + - All the usual build tools for Ruby. See [Building Ruby](../contributing/building_ruby.md) + - The Rust compiler `rustc` + - The Rust version must be [>= 1.58.0](../../yjit/Cargo.toml). + - Optionally, only if you wish to build in dev/debug mode, Rust's `cargo` + +If you don't intend on making code changes to YJIT itself, we recommend +obtaining `rustc` through your OS's package manager since that +likely reuses the same vendor which provides the C toolchain. 
+ +If you will be changing YJIT's Rust code, we suggest using the +[first-party installation method][rust-install] for Rust. Rust also provides +first class [support][editor-tools] for many source code editors. + +[rust-install]: https://www.rust-lang.org/tools/install +[editor-tools]: https://www.rust-lang.org/tools + +### Building YJIT + +Start by cloning the `ruby/ruby` repository: + +```sh +git clone https://github.com/ruby/ruby yjit +cd yjit +``` + +The YJIT `ruby` binary can be built with either GCC or Clang. It can be built either in dev (debug) mode or in release mode. For maximum performance, compile YJIT in release mode with GCC. More detailed build instructions are provided in the [Ruby README](https://github.com/ruby/ruby#how-to-build). + +```sh +# Configure in release mode for maximum performance, build and install +./autogen.sh +./configure --enable-yjit --prefix=$HOME/.rubies/ruby-yjit --disable-install-doc +make -j && make install +``` + +or + +```sh +# Configure in lower-performance dev (debug) mode for development, build and install +./autogen.sh +./configure --enable-yjit=dev --prefix=$HOME/.rubies/ruby-yjit --disable-install-doc +make -j && make install +``` + +Dev mode includes extended YJIT statistics, but can be slow. 
For only statistics you can configure in stats mode: + +```sh +# Configure in extended-stats mode without slow runtime checks, build and install +./autogen.sh +./configure --enable-yjit=stats --prefix=$HOME/.rubies/ruby-yjit --disable-install-doc +make -j && make install +``` + +On macOS, you may need to specify where to find some libraries: + +```sh +# Install dependencies +brew install openssl libyaml + +# Configure in dev (debug) mode for development, build and install +./autogen.sh +./configure --enable-yjit=dev --prefix=$HOME/.rubies/ruby-yjit --disable-install-doc --with-opt-dir="$(brew --prefix openssl):$(brew --prefix readline):$(brew --prefix libyaml)" +make -j && make install +``` + +Typically configure will choose the default C compiler. To specify the C compiler, use + +```sh +# Choosing a specific c compiler +export CC=/path/to/my/chosen/c/compiler +``` + +before running `./configure`. + +You can test that YJIT works correctly by running: + +```sh +# Quick tests found in /bootstraptest +make btest + +# Complete set of tests +make -j test-all +``` + +## Usage + +### Examples + +Once YJIT is built, you can either use `./miniruby` from within your build directory, or switch to the YJIT version of `ruby` +by using the `chruby` tool: + +```sh +chruby ruby-yjit +ruby myscript.rb +``` + +You can dump statistics about compilation and execution by running YJIT with the `--yjit-stats` command-line option: + +```sh +./miniruby --yjit-stats myscript.rb +``` + +You can see what YJIT has compiled by running YJIT with the `--yjit-log` command-line option: + +```sh +./miniruby --yjit-log myscript.rb +``` + +The machine code generated for a given method can be printed by adding `puts RubyVM::YJIT.disasm(method(:method_name))` to a Ruby script. Note that no code will be generated if the method is not compiled. 
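As a minimal sketch of the `RubyVM::YJIT.disasm` call mentioned above: the method `add_one` and the helper `yjit_disasm_for` are hypothetical names introduced only for illustration. The guards assume that disassembly may be unavailable (YJIT not built in, not enabled, or not built in dev mode), in which case a short explanation is returned instead of machine code.

```ruby
# Hypothetical method used only for illustration.
def add_one(n)
  n + 1
end

# Call the method more often than the default call threshold (30, per the
# options list in this document) so that YJIT has a chance to compile it.
100.times { |i| add_one(i) }

# Hypothetical helper: returns the disassembly as a String, or a short
# explanation when no disassembly can be produced in this process.
def yjit_disasm_for(meth)
  return "YJIT is not available in this Ruby build" unless defined?(RubyVM::YJIT)
  return "YJIT is not enabled (run with --yjit)" unless RubyVM::YJIT.enabled?
  return "RubyVM::YJIT.disasm is not available in this build" unless RubyVM::YJIT.respond_to?(:disasm)

  # Assumption: disasm may return nil, for example when the method was not
  # compiled or when YJIT was not built in dev mode.
  RubyVM::YJIT.disasm(meth) || "no machine code available for this method"
end

puts yjit_disasm_for(method(:add_one))
```

Running this with a dev build, e.g. `./miniruby --yjit script.rb`, is the case where actual machine code is most likely to be printed; in other configurations the script only reports why nothing was printed.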
+ +<h3 id="command-line-options">Command-Line Options</h3> + +YJIT supports all command-line options supported by upstream CRuby, but also adds a few YJIT-specific options: + +- `--yjit`: enable YJIT (disabled by default) +- `--yjit-mem-size=N`: soft limit on YJIT memory usage in MiB (default: 128). Tries to limit `code_region_size + yjit_alloc_size` +- `--yjit-exec-mem-size=N`: hard limit on executable memory block in MiB. Limits `code_region_size` +- `--yjit-call-threshold=N`: number of calls after which YJIT begins to compile a function. + It defaults to 30, and it's then increased to 120 when the number of ISEQs in the process reaches 40,000. +- `--yjit-cold-threshold=N`: number of global calls after which an ISEQ is considered cold and not + compiled, lower values mean less code is compiled (default 200K) +- `--yjit-stats`: print statistics after the execution of a program (incurs a run-time cost) +- `--yjit-stats=quiet`: gather statistics while running a program but don't print them. Stats are accessible through `RubyVM::YJIT.runtime_stats`. (incurs a run-time cost) +- `--yjit-log[=file|dir]`: log all compilation events to the specified file or directory. If no name is supplied, the last 1024 log entries will be printed to stderr when the application exits. +- `--yjit-log=quiet`: gather a circular buffer of recent YJIT compilations. The compilation log entries are accessible through `RubyVM::YJIT.log` and old entries will be discarded if the buffer is not drained quickly. (incurs a run-time cost) +- `--yjit-disable`: disable YJIT despite other `--yjit*` flags for lazily enabling it with `RubyVM::YJIT.enable` +- `--yjit-code-gc`: enable code GC (disabled by default as of Ruby 3.3). + It will cause all machine code to be discarded when the executable memory size limit is hit, meaning JIT compilation will then start over. + This can allow you to use a lower executable memory size limit, but may cause a slight drop in performance when the limit is hit. 
+- `--yjit-perf`: enable frame pointers and profiling with the `perf` tool +- `--yjit-trace-exits`: produce a Marshal dump of backtraces from all exits. Automatically enables `--yjit-stats` +- `--yjit-trace-exits=COUNTER`: produce a Marshal dump of backtraces from a counted exit or a fallback. Automatically enables `--yjit-stats` +- `--yjit-trace-exits-sample-rate=N`: trace exit locations only every Nth occurrence. Automatically enables `--yjit-trace-exits` + +Note that there is also an environment variable `RUBY_YJIT_ENABLE` which can be used to enable YJIT. +This can be useful for some deployment scripts where specifying an extra command-line option to Ruby is not practical. + +You can also enable YJIT at run-time using `RubyVM::YJIT.enable`. This can allow you to enable YJIT after your application is done +booting, which makes it possible to avoid compiling any initialization code. + +You can verify that YJIT is enabled using `RubyVM::YJIT.enabled?` or by checking that `ruby --yjit -v` includes the string `+YJIT`: + +```sh +ruby --yjit -v +ruby 3.3.0dev (2023-01-31T15:11:10Z master 2a0bf269c9) +YJIT dev [x86_64-darwin22] + +ruby --yjit -e "p RubyVM::YJIT.enabled?" +true + +ruby -e "RubyVM::YJIT.enable; p RubyVM::YJIT.enabled?" +true +``` + +### Benchmarking + +We have collected a set of benchmarks and implemented a simple benchmarking harness in the [yjit-bench](https://github.com/Shopify/yjit-bench) repository. This benchmarking harness is designed to disable CPU frequency scaling, set process affinity and disable address space randomization so that the variance between benchmarking runs will be as small as possible. + +## Performance Tips for Production Deployments + +While YJIT options default to what we think would work well for most workloads, +they might not necessarily be the best configuration for your application. +This section covers tips on improving YJIT performance in case YJIT does not +speed up your application in production. 
+ +### Increasing --yjit-mem-size + +The `--yjit-mem-size` value can be used to set the maximum amount of memory that YJIT +is allowed to use. This corresponds to the total of `RubyVM::YJIT.runtime_stats[:code_region_size]` +and `RubyVM::YJIT.runtime_stats[:yjit_alloc_size]`. +Increasing the `--yjit-mem-size` value means more code +can be optimized by YJIT, at the cost of more memory usage. + +If you start Ruby with `--yjit-stats`, e.g. using an environment variable `RUBYOPT=--yjit-stats`, +`RubyVM::YJIT.runtime_stats[:ratio_in_yjit]` shows the percentage of total YARV instructions +executed by YJIT as opposed to the CRuby interpreter. +Ideally, `ratio_in_yjit` should be as large as 99%, and increasing `--yjit-mem-size` often +helps improve `ratio_in_yjit`. + +### Running workers as long as possible + +It's helpful to call the same code as many times as possible before a process restarts. +If a process is killed too frequently, the time taken for compiling methods may outweigh +the speedup obtained by compiling them. + +You should monitor the number of requests each process has served. +If you're periodically killing worker processes, e.g. with `unicorn-worker-killer` or `puma_worker_killer`, +you may want to reduce the killing frequency or increase the limit. + +## Reducing YJIT Memory Usage + +YJIT allocates memory for JIT code and metadata. Enabling YJIT generally results in more memory usage. +This section goes over tips on minimizing YJIT memory usage in case it uses more than your capacity. + +### Decreasing --yjit-mem-size + +YJIT uses memory for compiled code and metadata. You can change the maximum amount of memory +that YJIT can use by specifying a different `--yjit-mem-size` command-line option. The default value +is currently `128` (MiB). +When changing this value, you may want to monitor `RubyVM::YJIT.runtime_stats[:ratio_in_yjit]` +as explained above. 
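The monitoring described above could be sketched as follows. The helper name `yjit_snapshot` is hypothetical, and the code assumes that `:code_region_size` and `:yjit_alloc_size` are byte counts and that `:ratio_in_yjit` is only present when Ruby was started with `--yjit-stats`. It returns `nil` when YJIT is not enabled in the current process.

```ruby
# Hypothetical helper: collects the two figures discussed above.
# Returns nil when YJIT is not available or not enabled.
def yjit_snapshot
  return nil unless defined?(RubyVM::YJIT) && RubyVM::YJIT.enabled?

  stats = RubyVM::YJIT.runtime_stats
  return nil unless stats

  # Assumption: both sizes are reported in bytes; missing keys are skipped.
  used_bytes = [stats[:code_region_size], stats[:yjit_alloc_size]].compact.sum

  {
    ratio_in_yjit: stats[:ratio_in_yjit],      # nil without --yjit-stats
    used_mib: used_bytes / (1024.0 * 1024.0)   # compare with --yjit-mem-size
  }
end

snapshot = yjit_snapshot
if snapshot.nil?
  puts "YJIT is not enabled in this process"
else
  ratio = snapshot[:ratio_in_yjit]
  if ratio
    puts format("ratio_in_yjit: %.1f%%", ratio)
  else
    puts "ratio_in_yjit missing (start Ruby with --yjit-stats)"
  end
  puts format("code_region_size + yjit_alloc_size: %.1f MiB", snapshot[:used_mib])
end
```

Logging such a snapshot periodically from a long-running worker is one way to see whether a smaller or larger `--yjit-mem-size` changes `ratio_in_yjit` for your workload.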
+ +### Enabling YJIT lazily + +If you enable YJIT with the `--yjit` option or `RUBY_YJIT_ENABLE=1`, YJIT may compile code that is +used only during application boot. `RubyVM::YJIT.enable` allows you to enable YJIT from Ruby code, +and you can call this after your application is initialized, e.g. in Unicorn's `after_fork` hook. +If you use any YJIT options (`--yjit-*`), YJIT will start at boot by default, but `--yjit-disable` +allows you to start Ruby with YJIT disabled while still passing YJIT tuning options. + +## Code Optimization Tips + +This section contains tips on writing Ruby code that will run as fast as possible on YJIT. Some of this advice is based on current limitations of YJIT, while other advice is broadly applicable. It probably won't be practical to apply these tips everywhere in your codebase. You should ideally start by profiling your application using a tool such as [stackprof](https://github.com/tmm1/stackprof) so that you can determine which methods make up most of the execution time. You can then refactor the specific methods that make up the largest fractions of the execution time. We do not recommend modifying your entire codebase based on the current limitations of YJIT. + +- Avoid using `OpenStruct` +- Avoid redefining basic integer operations (e.g. +, -, <, >) +- Avoid redefining the meaning of `nil`, equality, etc. +- Avoid allocating objects in the hot parts of your code +- Minimize layers of indirection + - Avoid writing wrapper classes if you can (e.g. a class that only wraps a Ruby hash) + - Avoid methods that just call another method +- Ruby method calls are costly. 
Avoid things such as methods that only return a value from a hash +- Try to write code so that the same variables and method arguments always have the same type +- Avoid using `TracePoint` as it can cause YJIT to deoptimize code +- Avoid using `binding` as it can cause YJIT to deoptimize code + +You can also use the `--yjit-stats` command-line option to see which bytecodes cause YJIT to exit, and refactor your code to avoid using these instructions in the hottest methods of your code. + +### Other Statistics + +If you run `ruby` with `--yjit-stats`, YJIT will track and return performance statistics in `RubyVM::YJIT.runtime_stats`. + +```rb +$ RUBYOPT="--yjit-stats" irb +irb(main):001:0> RubyVM::YJIT.runtime_stats +=> +{:inline_code_size=>340745, + :outlined_code_size=>297664, + :all_stats=>true, + :yjit_insns_count=>1547816, + :send_callsite_not_simple=>7267, + :send_kw_splat=>7, + :send_ivar_set_method=>72, +... +``` + +Some of the counters include: + +* `:yjit_insns_count` - how many Ruby bytecode instructions have been executed +* `:binding_allocations` - number of bindings allocated +* `:binding_set` - number of variables set via a binding +* `:code_gc_count` - number of garbage collections of compiled code since process start +* `:vm_insns_count` - number of instructions executed by the Ruby interpreter +* `:compiled_iseq_count` - number of bytecode sequences compiled +* `:inline_code_size` - size in bytes of compiled YJIT blocks +* `:outlined_code_size` - size in bytes of YJIT error-handling compiled code +* `:side_exit_count` - number of side exits taken at runtime +* `:total_exit_count` - number of exits, including side exits, taken at runtime +* `:avg_len_in_yjit` - average number of instructions in compiled blocks before exiting to interpreter + +Counters starting with "exit_" show reasons for YJIT code taking a side exit (a return to the interpreter). + +Performance counter names are not guaranteed to remain the same between Ruby versions. 
If you're curious what each counter means, +it's usually best to search the source code for it, but it may change in a later Ruby version. + +The printed text after a `--yjit-stats` run includes other information that may be named differently than the information in `RubyVM::YJIT.runtime_stats`. + +## Contributing + +We welcome open source contributions. You should feel free to open new issues to report bugs or just to ask questions. +Suggestions on how to make this readme file more helpful for new contributors are most welcome. + +Bug fixes and bug reports are very valuable to us. If you find a bug in YJIT, it's very possible that nobody has reported it before, +or that we don't have a good reproduction for it, so please open an issue and provide as much information as you can about your configuration and a description of how you encountered the problem. List the commands you used to run YJIT so that we can easily reproduce the issue on our end and investigate it. If you are able to produce a small program reproducing the error to help us track it down, that is very much appreciated as well. + +If you would like to contribute a large patch to YJIT, we suggest opening an issue or a discussion on the [Shopify/ruby repository](https://github.com/Shopify/ruby/issues) so that +we can have an active discussion. A common problem is that sometimes people submit large pull requests to open source projects +without prior communication, and we have to reject them because the work they implemented does not fit within the design of the +project. We want to save you time and frustration, so please reach out so we can have a productive discussion as to how +you can contribute patches we will want to merge into YJIT. 
+ +### Source Code Organization + +The YJIT source code is divided between: + +- `yjit.c`: code YJIT uses to interface with the rest of CRuby +- `yjit.h`: C definitions YJIT exposes to the rest of CRuby +- `yjit.rb`: `YJIT` Ruby module that is exposed to Ruby +- `yjit/src/asm/*`: in-memory assembler we use to generate machine code +- `yjit/src/codegen.rs`: logic for translating Ruby bytecode to machine code +- `yjit/src/core.rs`: basic block versioning logic, core structure of YJIT +- `yjit/src/stats.rs`: gathering of run-time statistics +- `yjit/src/options.rs`: handling of command-line options +- `yjit/src/cruby.rs`: C bindings manually exposed to the Rust codebase +- `yjit/bindgen/src/main.rs`: C bindings exposed to the Rust codebase through bindgen + +The core of CRuby's interpreter logic is found in: + +- `insns.def`: defines Ruby's bytecode instructions (gets compiled into `vm.inc`) +- `vm_insnshelper.c`: logic used by Ruby's bytecode instructions +- `vm_exec.c`: Ruby interpreter loop + +### Generating C bindings with bindgen + +In order to expose C functions to the Rust codebase, you will need to generate C bindings: + +```sh +CC=clang ./configure --enable-yjit=dev +make -j yjit-bindgen +``` + +This uses the bindgen tools to generate/update `yjit/src/cruby_bindings.inc.rs` based on the +bindings listed in `yjit/bindgen/src/main.rs`. Avoid manually editing `cruby_bindings.inc.rs` +as it could be automatically regenerated at a later time. If you need to manually add C bindings, +add them to `yjit/src/cruby.rs` instead. 
+ +### Coding & Debugging Protips + +There are multiple test suites: + +- `make btest` (see `/bootstraptest`) +- `make test-all` +- `make test-spec` +- `make check` runs all of the above +- `make yjit-check` runs quick checks to see that YJIT is working correctly + +The tests can be run in parallel like this: + +```sh +make -j test-all RUN_OPTS="--yjit-call-threshold=1" +``` + +Or single-threaded like this, to more easily identify which specific test is failing: + +```sh +make test-all TESTOPTS=--verbose RUN_OPTS="--yjit-call-threshold=1" +``` + +To run a single test file with `test-all`: + +```sh +make test-all TESTS='test/-ext-/marshal/test_usrmarshal.rb' RUNRUBYOPT=--debugger=lldb RUN_OPTS="--yjit-call-threshold=1" +``` + +It's also possible to filter tests by name to run a single test: + +```sh +make test-all TESTS='-n /test_float_plus/' RUN_OPTS="--yjit-call-threshold=1" +``` + +You can also run one specific test in `btest`: + +```sh +make btest BTESTS=bootstraptest/test_ractor.rb RUN_OPTS="--yjit-call-threshold=1" +``` + +There are shortcuts to run/debug your own test/repro in `test.rb`: + +```sh +make run # runs ./miniruby test.rb +make lldb # launches ./miniruby test.rb in lldb +``` + +You can use the Intel syntax for disassembly in LLDB, keeping it consistent with YJIT's disassembly: + +```sh +echo "settings set target.x86-disassembly-flavor intel" >> ~/.lldbinit +``` + +## Running x86 YJIT on Apple's Rosetta + +For development purposes, it is possible to run x86 YJIT on an Apple M1 via Rosetta. You can find basic +instructions below, but there are a few caveats listed further down. + +First, install Rosetta: + +```console +$ softwareupdate --install-rosetta +``` + +Now any command can be run with Rosetta via the `arch` command line tool. 
+ +Then you can start your shell in an x86 environment: + +```console +$ arch -x86_64 zsh +``` + +You can double check your current architecture via the `arch` command: + +```console +$ arch -x86_64 zsh +$ arch +i386 +``` + +You may need to set the default target for `rustc` to x86-64, e.g. + +```console +$ rustup default stable-x86_64-apple-darwin +``` + +While in your i386 shell, install Cargo and Homebrew, then hack away! + +### Rosetta Caveats + +1. You must install a version of Homebrew for each architecture +2. Cargo will install in $HOME/.cargo by default, and I don't know a good way to change architectures after install + +If you use Fish shell you can [read this link](https://tenderlovemaking.com/2022/01/07/homebrew-rosetta-and-ruby.html) for information on making the dev environment easier. + +## Profiling with Linux perf + +`--yjit-perf` allows you to profile JIT-ed methods along with other native functions using Linux perf. +When you run Ruby with `perf record`, perf looks up `/tmp/perf-{pid}.map` to resolve symbols in JIT code, +and this option lets YJIT write method symbols into that file as well as enabling frame pointers. + +### Call graph + +Here's an example way to use this option with [Firefox Profiler](https://profiler.firefox.com) +(See also: [Profiling with Linux perf](https://profiler.firefox.com/docs/#/./guide-perf-profiling)): + +```bash +# Compile the interpreter with frame pointers enabled +./configure --enable-yjit --prefix=$HOME/.rubies/ruby-yjit --disable-install-doc cflags=-fno-omit-frame-pointer +make -j && make install + +# [Optional] Allow running perf without sudo +echo 0 | sudo tee /proc/sys/kernel/kptr_restrict +echo -1 | sudo tee /proc/sys/kernel/perf_event_paranoid + +# Profile Ruby with --yjit-perf +cd ../yjit-bench +PERF="record --call-graph fp" ruby --yjit-perf -Iharness-perf benchmarks/liquid-render/benchmark.rb + +# View results on Firefox Profiler https://profiler.firefox.com. 
+# Create /tmp/test.perf as below and upload it using "Load a profile from file". +perf script --fields +pid > /tmp/test.perf +``` + +### YJIT codegen + +You can also profile the number of cycles consumed by code generated by each YJIT function. + +```bash +# Install perf +apt-get install linux-tools-common linux-tools-generic linux-tools-`uname -r` + +# [Optional] Allow running perf without sudo +echo 0 | sudo tee /proc/sys/kernel/kptr_restrict +echo -1 | sudo tee /proc/sys/kernel/perf_event_paranoid + +# Profile Ruby with --yjit-perf=codegen +cd ../yjit-bench +PERF=record ruby --yjit-perf=codegen -Iharness-perf benchmarks/lobsters/benchmark.rb + +# Aggregate results +perf script > /tmp/perf.txt +../ruby/misc/yjit_perf.py /tmp/perf.txt +``` + +#### Building perf with Python support + +The above instructions work fine for most people, but you could also use +a handy `perf script -s` interface if you build perf from source. + +```bash +# Build perf from source for Python support +sudo apt-get install libpython3-dev python3-pip flex libtraceevent-dev \ + libelf-dev libunwind-dev libaudit-dev libslang2-dev libdw-dev +git clone --depth=1 https://github.com/torvalds/linux +cd linux/tools/perf +make +make install + +# Aggregate results +perf script -s ../ruby/misc/yjit_perf.py +``` diff --git a/doc/jit/zjit.md b/doc/jit/zjit.md new file mode 100644 index 0000000000..38124cb737 --- /dev/null +++ b/doc/jit/zjit.md @@ -0,0 +1,397 @@ +# ZJIT: ADVANCED RUBY JIT PROTOTYPE + +ZJIT is a method-based just-in-time (JIT) compiler for Ruby. It uses profile +information from the interpreter to guide optimization in the compiler. + +ZJIT is currently supported for macOS, Linux and BSD on x86-64 and arm64/aarch64 CPUs. +This project is open source and falls under the same license as CRuby. + +## Current Limitations + +ZJIT may not be suitable for certain applications. It currently only supports macOS, Linux and BSD on x86-64 and arm64/aarch64 CPUs. 
ZJIT will use more memory than the Ruby interpreter because the JIT compiler needs to generate machine code in memory and maintain additional state information. +You can change how much executable memory is allocated using [ZJIT's command-line options](rdoc-ref:@Command-Line+Options). + +## Contributing + +We welcome open source contributions. Feel free to open new issues to report +bugs or just to ask questions. Suggestions on how to make this document more +helpful for new contributors are most welcome. + +Bug fixes and bug reports are very valuable to us. If you find a bug in ZJIT, +it's very possible that nobody has reported it before, or that we don't have +a good reproduction for it, so please open a ticket on [the official Ruby bug +tracker][rubybugs] (or, if you don't want to make an account, [on +Shopify/ruby][shopifyruby]) and provide as much information as you can about +your configuration and a description of how you encountered the problem. List +the commands you used to run ZJIT so that we can easily reproduce the issue on +our end and investigate it. If you are able to produce a small program +reproducing the error to help us track it down, that is very much appreciated +as well. + +[rubybugs]: https://bugs.ruby-lang.org/projects/ruby-master +[shopifyruby]: https://github.com/Shopify/ruby/issues + +If you would like to contribute a large patch to ZJIT, we suggest starting with +a casual chat [on Zulip][zulip] and then opening an issue on the [Shopify/ruby +repository][shopifyruby] so that we can have a technical discussion. A common +problem is that sometimes people submit large pull requests to open source +projects without prior communication, and we have to reject them because the +work they implemented does not fit within the design of the project. We want to +save you time and frustration, so please reach out so we can have a productive +discussion as to how you can contribute patches we will want to merge into +ZJIT. 
+ +[zulip]: https://zjit.zulipchat.com/ + +## Build Instructions + +Refer to [Building Ruby](rdoc-ref:contributing/building_ruby.md) for general build prerequisites. +Additionally, ZJIT requires Rust 1.85.0 or later. Release builds need only `rustc`. Development +builds require `cargo` and may download dependencies. GNU Make is required. + +### For normal use + +To build ZJIT on macOS: + +```bash +./autogen.sh + +./configure \ + --enable-zjit \ + --prefix="$HOME"/.rubies/ruby-zjit \ + --disable-install-doc \ + --with-opt-dir="$(brew --prefix openssl):$(brew --prefix readline):$(brew --prefix libyaml)" + +make -j miniruby +``` + +To build ZJIT on Linux: + +```bash +./autogen.sh + +./configure \ + --enable-zjit \ + --prefix="$HOME"/.rubies/ruby-zjit \ + --disable-install-doc + +make -j miniruby +``` + +### For development + +To build ZJIT on macOS: + +```bash +./autogen.sh + +./configure \ + --enable-zjit=dev \ + --prefix="$HOME"/.rubies/ruby-zjit \ + --disable-install-doc \ + --with-opt-dir="$(brew --prefix openssl):$(brew --prefix readline):$(brew --prefix libyaml)" + +make -j miniruby +``` + +To build ZJIT on Linux: + +```bash +./autogen.sh + +./configure \ + --enable-zjit=dev \ + --prefix="$HOME"/.rubies/ruby-zjit \ + --disable-install-doc + +make -j miniruby +``` + +Note that `--enable-zjit=dev` does a lot of IR validation, which helps to catch errors early but means that compilation and warmup are significantly slower. 
+ +The valid values for `--enable-zjit` are, from fastest to slowest: +* `--enable-zjit`: enable ZJIT in release mode for maximum performance +* `--enable-zjit=stats`: enable ZJIT in extended-stats mode +* `--enable-zjit=dev_nodebug`: enable ZJIT in development mode but without slow runtime checks +* `--enable-zjit=dev`: enable ZJIT in debug mode for development, also enables `RUBY_DEBUG` + +### Regenerate bindings + +When modifying `zjit/bindgen/src/main.rs` you need to regenerate bindings in `zjit/src/cruby_bindings.inc.rs` with: + +```bash +make zjit-bindgen +``` + +## Documentation + +### Command-Line Options + +See `ruby --help` for ZJIT-specific command-line options: + +``` +$ ruby --help +... +ZJIT options: + --zjit-mem-size=num + Max amount of memory that ZJIT can use in MiB (default: 128). + --zjit-call-threshold=num + Number of calls to trigger JIT (default: 30). + --zjit-num-profiles=num + Number of profiled calls before JIT (default: 5). + --zjit-stats[=quiet] + Enable collecting ZJIT statistics (=quiet to suppress output). + --zjit-disable Disable ZJIT for lazily enabling it with RubyVM::ZJIT.enable. + --zjit-perf Dump ISEQ symbols into /tmp/perf-{}.map for Linux perf. + --zjit-log-compiled-iseqs=path + Log compiled ISEQs to the file. The file will be truncated. + --zjit-trace-exits[=counter] + Record source on side-exit. `Counter` picks specific counter. + --zjit-trace-exits-sample-rate=num + Frequency at which to record side exits. Must be `usize`. 
+$ +``` + +### Source level documentation + +You can generate and open the source level documentation in your browser using: + +```bash +cargo doc --document-private-items -p zjit --open +``` + +### Graph of the Type System + +You can generate a graph of the ZJIT type hierarchy using: + +```bash +ruby zjit/src/hir_type/gen_hir_type.rb > zjit/src/hir_type/hir_type.inc.rs +dot -O -Tpdf zjit_types.dot +open zjit_types.dot.pdf +``` + +## Testing + +Note that tests link against CRuby, so calling `cargo test` or `cargo nextest` directly is not expected to build. All tests are instead run through `make`. + +### Setup + +First, ensure you have `cargo` installed. If you do not already have it, you can use [rustup.rs](https://rustup.rs/). + +Also install cargo-binstall with: + +```bash +cargo install cargo-binstall +``` + +Make sure to add `--enable-zjit=dev` when you run `configure`, then install the following tools: + +```bash +cargo binstall --secure cargo-nextest +cargo binstall --secure cargo-insta +``` + +`cargo-insta` is used for updating snapshots. `cargo-nextest` runs each test in its own process, which is valuable since CRuby only supports booting once per process, and most APIs are not thread safe. + +### Running unit tests + +For testing functionality within ZJIT, use: + +```bash +make zjit-test +``` + +You can also run a single test case by specifying the function name: + +```bash +make zjit-test ZJIT_TESTS=test_putobject +``` + +#### Snapshot Testing + +ZJIT uses [insta](https://insta.rs/) for snapshot testing within unit tests. When tests fail due to snapshot mismatches, pending snapshots are created. The test command will notify you if there are pending snapshots: + +``` +Pending snapshots found. 
Accept with: make zjit-test-update +``` + +To update/accept all the snapshot changes: + +```bash +make zjit-test-update +``` + +You can also review snapshot changes interactively one by one: + +```bash +cd zjit && cargo insta review +``` + +Test changes will be reviewed alongside code changes. + +### Running integration tests + +This command runs Ruby execution tests. + +```bash +make test-all TESTS="test/ruby/test_zjit.rb" +``` + +You can also run a single test case by matching the method name: + +```bash +make test-all TESTS="test/ruby/test_zjit.rb -n TestZJIT#test_putobject" +``` + +### Running all tests + +Runs both `make zjit-test` and `test/ruby/test_zjit.rb`: + +```bash +make zjit-check +``` + +## Statistics Collection + +ZJIT provides detailed statistics about JIT compilation and execution behavior. + +### Basic Stats + +Run with basic statistics printed on exit: + +```bash +./miniruby --zjit-stats script.rb +``` + +Collect stats without printing (access via `RubyVM::ZJIT.stats` in Ruby): + +```bash +./miniruby --zjit-stats=quiet script.rb +``` + +### Accessing Stats in Ruby + +```ruby +# Check if stats are enabled +if RubyVM::ZJIT.stats_enabled? + stats = RubyVM::ZJIT.stats + puts "Compiled ISEQs: #{stats[:compiled_iseq_count]}" + puts "Failed ISEQs: #{stats[:failed_iseq_count]}" + + # You can also reset stats during execution + RubyVM::ZJIT.reset_stats! +end +``` + +### Performance Ratio + +The `ratio_in_zjit` stat shows the percentage of Ruby instructions executed in JIT code vs interpreter. +This metric only appears when ZJIT is built with `--enable-zjit=stats` [or more](#build-instructions) (which enables `rb_vm_insn_count` tracking) and represents a key performance indicator for ZJIT effectiveness. + +### Tracing side exits + +Through [Stackprof](https://github.com/tmm1/stackprof), detailed information about the methods that the JIT side-exits from can be displayed after some execution of a program. 
Optionally, you can use `--zjit-trace-exits-sample-rate=N` to sample every N-th occurrence. Enabling `--zjit-trace-exits-sample-rate=N` will automatically enable `--zjit-trace-exits`. + +```bash +./miniruby --zjit-trace-exits script.rb +``` + +A file called `zjit_exits_{pid}.dump` will be created in the same directory as `script.rb`. Viewing the side exited methods can be done with Stackprof: + +```bash +stackprof path/to/zjit_exits_{pid}.dump +``` + +### Viewing HIR in Iongraph + +Using `--zjit-dump-hir-iongraph` will dump all compiled functions into a directory named `/tmp/zjit-iongraph-{PROCESS_PID}`. Each file will be named `func_{ZJIT_FUNC_NAME}.json`. In order to use them in the Iongraph viewer, you'll need to use `jq` to collate them to a single file. An example invocation of `jq` is shown below for reference. + +`jq --slurp --null-input '.functions=inputs | .version=1' /tmp/zjit-iongraph-{PROCESS_PID}/func*.json > ~/Downloads/ion.json` + +From there, you can use https://mozilla-spidermonkey.github.io/iongraph/ to view your trace. + +### Printing ZJIT Errors + +`--zjit-debug` prints ZJIT compilation errors and other diagnostics: + +```bash +./miniruby --zjit-debug script.rb +``` + +As you might guess from the name, this option is intended mostly for ZJIT developers. + +## Useful dev commands + +To view YARV output for code snippets: + +```bash +./miniruby --dump=insns -e0 +``` + +To run code snippets with ZJIT: + +```bash +./miniruby --zjit -e0 +``` + +You can also try https://www.rubyexplorer.xyz/ to view Ruby YARV disasm output with syntax highlighting +in a way that can be easily shared with other team members. + +## Understanding Ruby Stacks + +Ruby execution involves three distinct stacks and understanding them will help you understand ZJIT's implementation: + +### 1. Native Stack + +- **Purpose**: Return addresses and saved registers. 
ZJIT also uses it for some C functions' argument arrays +- **Management**: OS-managed, one per native thread +- **Growth**: Downward from high addresses +- **Constants**: `NATIVE_STACK_PTR`, `NATIVE_BASE_PTR` + +### 2. Ruby VM Stack + +The Ruby VM uses a single contiguous memory region (`ec->vm_stack`) containing two sub-stacks that grow toward each other. When they meet, stack overflow occurs. + +See [doc/contributing/vm_stack_and_frames.md](rdoc-ref:contributing/vm_stack_and_frames.md) for detailed architecture and frame layout. + +**Control Frame Stack:** + +- **Stores**: Frame metadata (`rb_control_frame_t` structures) +- **Growth**: Downward from `vm_stack + size` (high addresses) +- **Constants**: `CFP` + +**Value Stack:** + +- **Stores**: YARV bytecode operands (self, arguments, locals, temporaries) +- **Growth**: Upward from `vm_stack` (low addresses) +- **Constants**: `SP` + +## ZJIT Glossary + +This glossary contains terms that are helpful for understanding ZJIT. + +Please note that some terms may appear in CRuby internals too but with different meanings. + +| Term | Definition | +| ----------------- | ------------------------------------------------------------------------------------------------------------------------------- | +| HIR | High-level Intermediate Representation. High-level (Ruby semantics) graph representation in static single-assignment (SSA) form | +| LIR | Low-level Intermediate Representation. Low-level IR used in the backend for assembly generation | +| SSA | Static Single Assignment. A form where each variable is assigned exactly once | +| `opnd` | Operand. An operand to an IR instruction (can be register, memory, immediate, etc.) | +| `dst` | Destination. The output operand of an instruction where the result is stored | +| VReg | Virtual Register. A virtual register that gets lowered to physical register or memory | +| `insn_id` | Instruction ID. 
An index of an instruction in a function | +| `block_id` | The index of a basic block, which effectively acts like a pointer | +| `branch` | Control flow edge between basic blocks in the compiled code | +| `cb` | Code Block. Memory region for generated machine code | +| `entry` | The starting address of compiled code for an ISEQ | +| Patch Point | Location in generated code that can be modified later in case assumptions get invalidated | +| Frame State | Captured state of the Ruby stack frame at a specific point for deoptimization | +| Guard | A run-time check that ensures assumptions are still valid | +| `invariant` | An assumption that JIT code relies on, requiring invalidation if broken | +| Deopt | Deoptimization. Process of falling back from JIT code to interpreter | +| Side Exit | Exit from JIT code back to interpreter | +| Type Lattice | Hierarchy of types used for type inference and optimization | +| Constant Folding | Optimization that evaluates constant expressions at compile time | +| RSP | x86-64 stack pointer register used for native stack operations | +| Register Spilling | Process of moving register values to memory when running out of physical registers | diff --git a/doc/language/box.md b/doc/language/box.md new file mode 100644 index 0000000000..8c7fd20b20 --- /dev/null +++ b/doc/language/box.md @@ -0,0 +1,361 @@ +# Ruby Box - Ruby's in-process separation of Classes and Modules + +Ruby Box is designed to provide separated spaces in a Ruby process, to isolate application code, libraries and monkey patches. 
+ +## Known issues + +* Experimental warning is shown when ruby starts with `RUBY_BOX=1` (specify `-W:no-experimental` option to hide it) +* Installing native extensions may fail under `RUBY_BOX=1` because of stack level too deep in extconf.rb +* `require 'active_support/core_ext'` may fail under `RUBY_BOX=1` +* Defined methods in a box may not be referred by built-in methods written in Ruby + +## TODOs + +* Add the loaded box on iseq to check if another box tries running the iseq (add a field only when VM_CHECK_MODE?) +* Assign its own TOPLEVEL_BINDING in boxes +* Fix calling `warn` in boxes to refer `$VERBOSE` and `Warning.warn` in the box +* Make an internal data container class `Ruby::Box::Entry` invisible +* More test cases about `$LOAD_PATH` and `$LOADED_FEATURES` + +## How to use + +### Enabling Ruby Box + +First, an environment variable should be set at the ruby process bootup: `RUBY_BOX=1`. +The only valid value is `1` to enable Ruby Box. Other values (or unset `RUBY_BOX`) means disabling Ruby Box. And setting the value after Ruby program starts doesn't work. + +### Using Ruby Box + +`Ruby::Box` class is the entrypoint of Ruby Box. + +```ruby +box = Ruby::Box.new +box.require('something') # or require_relative, load +``` + +The required file (either .rb or .so/.dll/.bundle) is loaded in the box (`box` here). The required/loaded files from `something` will be loaded in the box recursively. + +```ruby +# something.rb + +X = 1 + +class Something + def self.x = X + def x = ::X +end +``` + +Classes/modules, those methods and constants defined in the box can be accessed via `box` object. + +```ruby +X = 2 +p X # 2 +p ::X # 2 +p box::Something.x # 1 +p box::X # 1 +``` + +Instance methods defined in the box also run with definitions in the box. + +```ruby +s = box::Something.new + +p s.x # 1 +``` + +## Specifications + +### Ruby Box types + +There are two box types: + +* Root box +* User boxes + +There is the root box, just a single box in a Ruby process. 
Ruby bootstrap runs in the root box, and all builtin classes/modules are defined in the root box. (See "Builtin classes and modules".) + +User boxes are to run user-written programs and libraries loaded from user programs. The user's main program (specified by the `ruby` command line argument) is executed in the "main" box, which is a user box automatically created at the end of Ruby's bootstrap, copied from the root box. + +When `Ruby::Box.new` is called, an "optional" box (a user, non-main box) is created, copied from the root box. All user boxes are flat, copied from the root box. + +### Ruby Box class and instances + +`Ruby::Box` is a class, as a subclass of `Module`. `Ruby::Box` instances are a kind of `Module`. + +### Classes and modules defined in boxes + +The classes and modules, newly defined in a box `box`, are accessible via `box`. For example, if a class `A` is defined in `box`, it is accessible as `box::A` from outside of the box. + +In the box `box`, `A` can be referred to as `A` (and `::A`). + +### Built-in classes and modules reopened in boxes + +In boxes, builtin classes/modules are visible and can be reopened. Those classes/modules can be reopened using `class` or `module` clauses, and class/module definitions can be changed. + +The changed definitions are visible only in the box. In other boxes, builtin classes/modules and those instances work without changed definitions. + +```ruby +# in foo.rb +class String + BLANK_PATTERN = /\A\s*\z/ + def blank? + self.match?(BLANK_PATTERN) + end +end + +module Foo + def self.foo = "foo" + + def self.foo_is_blank? + foo.blank? + end +end + +Foo.foo.blank? #=> false +"foo".blank? #=> false + +# in main.rb +box = Ruby::Box.new +box.require_relative('foo') + +box::Foo.foo_is_blank? #=> false (#blank? called in box) + +"foo".blank? # NoMethodError +String::BLANK_PATTERN # NameError +``` + +The main box and `box` are different boxes, so monkey patches in main are also invisible in `box`. 
+ +### Builtin classes and modules + +In the box context, "builtin" classes and modules are classes and modules: + +* Accessible without any `require` calls in user scripts +* Defined before any user program start running +* Including classes/modules loaded by `prelude.rb` (including RubyGems `Gem`, for example) + +Hereafter, "builtin classes and modules" will be referred to as just "builtin classes". + +### Builtin classes referred via box objects + +Builtin classes in a box `box` can be referred from other boxes. For example, `box::String` is a valid reference, and `String` and `box::String` are identical (`String == box::String`, `String.object_id == box::String.object_id`). + +`box::String`-like reference returns just a `String` in the current box, so its definition is `String` in the box, not in `box`. + +```ruby +# foo.rb +class String + def self.foo = "foo" +end + +# main.rb +box = Ruby::Box.new +box.require_relative('foo') + +box::String.foo # NoMethodError +``` + +### Class instance variables, class variables, constants + +Builtin classes can have different sets of class instance variables, class variables and constants between boxes. + +```ruby +# foo.rb +class Array + @v = "foo" + @@v = "_foo_" + V = "FOO" +end + +Array.instance_variable_get(:@v) #=> "foo" +Array.class_variable_get(:@@v) #=> "_foo_" +Array.const_get(:V) #=> "FOO" + +# main.rb +box = Ruby::Box.new +box.require_relative('foo') + +Array.instance_variable_get(:@v) #=> nil +Array.class_variable_get(:@@v) # NameError +Array.const_get(:V) # NameError +``` + +### Global variables + +In boxes, changes on global variables are also isolated in the boxes. Changes on global variables in a box are visible/applied only in the box. 
+ +```ruby +# foo.rb +$foo = "foo" +$VERBOSE = nil + +puts "This appears: '#{$foo}'" + +# main.rb +p $foo #=> nil +p $VERBOSE #=> false + +box = Ruby::Box.new +box.require_relative('foo') # "This appears: 'foo'" + +p $foo #=> nil +p $VERBOSE #=> false +``` + +### Top level constants + +Usually, top level constants are defined as constants of `Object`. In boxes, top level constants are constants of `Object` in the box. And the box object `box`'s constants are strictly equal to constants of `Object`. + +```ruby +# foo.rb +FOO = 100 + +FOO #=> 100 +Object::FOO #=> 100 + +# main.rb +box = Ruby::Box.new +box.require_relative('foo') + +box::FOO #=> 100 + +FOO # NameError +Object::FOO # NameError +``` + +### Top level methods + +Top level methods are private instance methods of `Object`, in each box. + +```ruby +# foo.rb +def yay = "foo" + +class Foo + def self.say = yay +end + +Foo.say #=> "foo" +yay #=> "foo" + +# main.rb +box = Ruby::Box.new +box.require_relative('foo') + +box::Foo.say #=> "foo" + +yay # NoMethodError +``` + +There is no way to expose top level methods in boxes to others. +(See "Expose top level methods as a method of the box object" in "Discussions" section below) + +### Ruby Box scopes + +Ruby Box works in file scope. One `.rb` file runs in a single box. + +Once a file is loaded in a box `box`, all methods/procs defined/created in the file run in `box`. + +### Utility methods + +Several methods are available for trying/testing Ruby Box. + +* `Ruby::Box.current` returns the current box +* `Ruby::Box.enabled?` returns true/false to represent `RUBY_BOX=1` is specified or not +* `Ruby::Box.root` returns the root box +* `Ruby::Box.main` returns the main box +* `Ruby::Box#eval` evaluates a Ruby code (String) in the receiver box, just like calling `#load` with a file + +## Implementation details + +#### ISeq inline method/constant cache + +As described above in "Ruby Box scopes", an ".rb" file runs in a box. 
So method/constant resolution will be done in a box consistently. + +That means ISeq inline caches work well even with boxes. Otherwise, it's a bug. + +#### Method call global cache (gccct) + +The `rb_funcall()` C function refers to the global cc cache table (gccct), and the cache key is calculated with the current box. + +So, `rb_funcall()` calls have a performance penalty when Ruby Box is enabled. + +#### Current box and loading box + +The current box is the box that the executing code is in. `Ruby::Box.current` returns the current box object. + +The loading box is an internally managed box to determine the box to load newly required/loaded files. For example, `box` is the loading box when `box.require("foo")` is called. + +## Discussions + +#### More builtin methods written in Ruby + +If Ruby Box is enabled by default, builtin methods can be written in Ruby because they can't be overridden by users' monkey patches. Builtin Ruby methods can be JIT-ed, which could bring a performance benefit. + +#### Monkey patching methods called by builtin methods + +Builtin methods sometimes call other builtin methods. For example, `Hash#map` calls `Hash#each` to retrieve entries to be mapped. Without Ruby Box, Ruby users can overwrite `Hash#each` and expect the behavior change of `Hash#map` as a result. + +But with boxes, `Hash#map` runs in the root box. Ruby users can define `Hash#each` only in user boxes, so users cannot change `Hash#map`'s behavior in this case. To achieve it, users should override both `Hash#map` and `Hash#each` (or only `Hash#map`). + +It is a breaking change. + +Users can define methods using `Ruby::Box.root.eval(...)`, but it's clearly not an ideal API. + +#### Assigning values to global variables used by builtin methods + +Similar to monkey patching methods, global variables assigned in a box are separated from the root box. Methods defined in the root box that refer to a global variable can't see the reassigned value. 
+ +#### Context of `$LOAD_PATH` and `$LOADED_FEATURES` + +Global variables `$LOAD_PATH` and `$LOADED_FEATURES` control `require` method behaviors. So those variables are determined by the loading box instead of the current box. + +This could potentially conflict with the user's expectations. We should find the solution. + +#### Expose top level methods as a method of the box object + +Currently, top level methods in boxes are not accessible from outside of the box. But there might be a use case to call other box's top level methods. + +#### Split root and builtin box + +Currently, the single "root" box is the source of classext CoW. And also, the "root" box can load additional files after starting main script evaluation by calling methods which contain lines like `require "openssl"`. + +That means, user boxes can have different sets of definitions according to when it is created. + +``` +[root] + | + |----[main] + | + |(require "openssl" called in root) + | + |----[box1] having OpenSSL + | + |(remove_const called for OpenSSL in root) + | + |----[box2] without OpenSSL +``` + +This could cause unexpected behavior differences between user boxes. It should NOT be a problem because user scripts which refer to `OpenSSL` should call `require "openssl"` by themselves. +But in the worst case, a script (without `require "openssl"`) runs well in `box1`, but doesn't run in `box2`. This situation looks like a "random failure" to users. + +An option possible to prevent this situation is to have "root" and "builtin" boxes. + +* root + * The box for the Ruby process bootstrap, then the source of CoW + * After starting the main box, no code runs in this box +* builtin + * The box copied from the root box at the same time with "main" + * Methods and procs defined in the "root" box run in this box + * Classes and modules required will be loaded in this box + +This design realizes a consistent source of box CoW. 
+ +#### Separate `cc_tbl` and `callable_m_tbl`, `cvc_tbl` for less classext CoW + +The fields of `rb_classext_t` contain several cache(-like) tables: `cc_tbl` (callcache table), `callable_m_tbl` (table of resolved complemented methods) and `cvc_tbl` (class variable cache table). + +The classext CoW is triggered when the contents of `rb_classext_t` are changed, including `cc_tbl`, `callable_m_tbl`, and `cvc_tbl`. But those three tables are changed by just calling methods or referring to class variables. So, currently, classext CoW is triggered much more often than originally expected. + +If we can move those three tables outside of `rb_classext_t`, the number of copied `rb_classext_t` will be much smaller than in the current implementation. diff --git a/doc/language/bsearch.rdoc b/doc/language/bsearch.rdoc new file mode 100644 index 0000000000..90705853d7 --- /dev/null +++ b/doc/language/bsearch.rdoc @@ -0,0 +1,120 @@ += Binary Searching + +A few Ruby methods support binary searching in a collection: + +Array#bsearch:: Returns an element selected via a binary search + as determined by a given block. +Array#bsearch_index:: Returns the index of an element selected via a binary search + as determined by a given block. +Range#bsearch:: Returns an element selected via a binary search + as determined by a given block. + +Each of these methods returns an enumerator if no block is given. + +Given a block, each of these methods returns an element (or element index) from +self+ +as determined by a binary search. +The search finds an element of +self+ which meets +the given condition in <tt>O(log n)</tt> operations, where +n+ is the count of elements. ++self+ should be sorted, but this is not checked. + +There are two search modes: + +Find-minimum mode:: method +bsearch+ returns the first element for which + the block returns +true+; + the block must return +true+ or +false+. +Find-any mode:: method +bsearch+ returns some element, if any, for which + the block returns zero. 
+ the block must return a numeric value. + +The block should not mix the modes by sometimes returning +true+ or +false+ +and other times returning a numeric value, but this is not checked. + +<b>Find-Minimum Mode</b> + +In find-minimum mode, the block must return +true+ or +false+. +The further requirement (though not checked) is that +there are no indexes +i+ and +j+ such that: + +- <tt>0 <= i < j < self.size</tt>. +- The block returns +true+ for <tt>self[i]</tt> and +false+ for <tt>self[j]</tt>. + +Less formally: the block is such that all +false+-evaluating elements +precede all +true+-evaluating elements. + +In find-minimum mode, method +bsearch+ returns the first element +for which the block returns +true+. + +Examples: + + a = [0, 4, 7, 10, 12] + a.bsearch {|x| x >= 4 } # => 4 + a.bsearch {|x| x >= 6 } # => 7 + a.bsearch {|x| x >= -1 } # => 0 + a.bsearch {|x| x >= 100 } # => nil + + r = (0...a.size) + r.bsearch {|i| a[i] >= 4 } #=> 1 + r.bsearch {|i| a[i] >= 6 } #=> 2 + r.bsearch {|i| a[i] >= 8 } #=> 3 + r.bsearch {|i| a[i] >= 100 } #=> nil + r = (0.0...Float::INFINITY) + r.bsearch {|x| Math.log(x) >= 0 } #=> 1.0 + +These blocks make sense in find-minimum mode: + + a = [0, 4, 7, 10, 12] + a.map {|x| x >= 4 } # => [false, true, true, true, true] + a.map {|x| x >= 6 } # => [false, false, true, true, true] + a.map {|x| x >= -1 } # => [true, true, true, true, true] + a.map {|x| x >= 100 } # => [false, false, false, false, false] + +This would not make sense: + + a.map {|x| x == 7 } # => [false, false, true, false, false] + +<b>Find-Any Mode</b> + +In find-any mode, the block must return a numeric value. +The further requirement (though not checked) is that +there are no indexes +i+ and +j+ such that: + +- <tt>0 <= i < j < self.size</tt>. +- The block returns a negative value for <tt>self[i]</tt> + and a positive value for <tt>self[j]</tt>. +- The block returns a negative value for <tt>self[i]</tt> and zero for <tt>self[j]</tt>. 
+- The block returns zero for <tt>self[i]</tt> and a positive value for <tt>self[j]</tt>. + +Less formally: the block is such that: + +- All positive-evaluating elements precede all zero-evaluating elements. +- All positive-evaluating elements precede all negative-evaluating elements. +- All zero-evaluating elements precede all negative-evaluating elements. + +In find-any mode, method +bsearch+ returns some element +for which the block returns zero, or +nil+ if no such element is found. + +Examples: + + a = [0, 4, 7, 10, 12] + a.bsearch {|element| 7 <=> element } # => 7 + a.bsearch {|element| -1 <=> element } # => nil + a.bsearch {|element| 5 <=> element } # => nil + a.bsearch {|element| 15 <=> element } # => nil + + a = [0, 100, 100, 100, 200] + r = (0..4) + r.bsearch {|i| 100 - a[i] } #=> 1, 2 or 3 + r.bsearch {|i| 300 - a[i] } #=> nil + r.bsearch {|i| 50 - a[i] } #=> nil + +These blocks make sense in find-any mode: + + a = [0, 4, 7, 10, 12] + a.map {|element| 7 <=> element } # => [1, 1, 0, -1, -1] + a.map {|element| -1 <=> element } # => [-1, -1, -1, -1, -1] + a.map {|element| 5 <=> element } # => [1, 1, -1, -1, -1] + a.map {|element| 15 <=> element } # => [1, 1, 1, 1, 1] + +This would not make sense: + + a.map {|element| element <=> 7 } # => [-1, -1, 0, 1, 1] diff --git a/doc/language/calendars.rdoc b/doc/language/calendars.rdoc new file mode 100644 index 0000000000..a2540f1c43 --- /dev/null +++ b/doc/language/calendars.rdoc @@ -0,0 +1,62 @@ +== Julian and Gregorian Calendars + +The difference between the +{Julian calendar}[https://en.wikipedia.org/wiki/Julian_calendar] +and the +{Gregorian calendar}[https://en.wikipedia.org/wiki/Gregorian_calendar] +may matter to your program if it uses dates before the switchovers. + +- October 15, 1582. +- September 14, 1752. + +A date will be different in the two calendars, in general. + +=== Different switchover dates + +The reasons for the difference are religious/political histories. 
+ +- On October 15, 1582, several countries changed + from the Julian calendar to the Gregorian calendar; + these included Italy, Poland, Portugal, and Spain. + Other countries in the Western world retained the Julian calendar. +- On September 14, 1752, most of the British empire + changed from the Julian calendar to the Gregorian calendar. + +When your code uses a date before these switchover dates, +it will matter whether it considers the switchover date +to be the earlier date or the later date (or neither). + +See also {a concrete example here}[rdoc-ref:DateTime@When+should+you+use+DateTime+and+when+should+you+use+Time-3F]. + +=== Argument +start+ + +Certain methods in class \Date handle differences in the +{Julian and Gregorian calendars}[rdoc-ref:@Julian+and+Gregorian+Calendars] +by accepting an optional argument +start+, whose value may be: + +- Date::ITALY (the default): the created date is Julian + if before October 15, 1582, Gregorian otherwise: + + d = Date.new(1582, 10, 15) + d.prev_day.julian? # => true + d.julian? # => false + d.gregorian? # => true + +- Date::ENGLAND: the created date is Julian if before September 14, 1752, + Gregorian otherwise: + + d = Date.new(1752, 9, 14, Date::ENGLAND) + d.prev_day.julian? # => true + d.julian? # => false + d.gregorian? # => true + +- Date::JULIAN: the created date is Julian regardless of its value: + + d = Date.new(1582, 10, 15, Date::JULIAN) + d.julian? # => true + +- Date::GREGORIAN: the created date is Gregorian regardless of its value: + + d = Date.new(1752, 9, 14, Date::GREGORIAN) + d.prev_day.gregorian? # => true + diff --git a/doc/language/case_mapping.rdoc b/doc/language/case_mapping.rdoc new file mode 100644 index 0000000000..d40155db03 --- /dev/null +++ b/doc/language/case_mapping.rdoc @@ -0,0 +1,106 @@ += Case Mapping + +Some string-oriented methods use case mapping. + +In String: + +- String#capitalize +- String#capitalize! +- String#casecmp +- String#casecmp? +- String#downcase +- String#downcase! 
+- String#swapcase +- String#swapcase! +- String#upcase +- String#upcase! + +In Symbol: + +- Symbol#capitalize +- Symbol#casecmp +- Symbol#casecmp? +- Symbol#downcase +- Symbol#swapcase +- Symbol#upcase + +== Default Case Mapping + +By default, all of these methods use full Unicode case mapping, +which is suitable for most languages. +See {Section 3.13 (Default Case Algorithms) of the Unicode standard}[https://www.unicode.org/versions/latest/ch03.pdf]. + +Non-ASCII case mapping and folding are supported for UTF-8, +UTF-16BE/LE, UTF-32BE/LE, and ISO-8859-1~16 Strings/Symbols. + +Context-dependent case mapping as described in +{Table 3-17 (Context Specification for Casing) of the Unicode standard}[https://www.unicode.org/versions/latest/ch03.pdf] +is currently not supported. + +In most cases, the case conversion of a string has the same number of characters as before. +There are exceptions (see also +:fold+ below): + + s = "\u00DF" # => "ß" + s.upcase # => "SS" + s = "\u0149" # => "ʼn" + s.upcase # => "ʼN" + +Case mapping may also depend on locale (see also +:turkic+ below): + + s = "\u0049" # => "I" + s.downcase # => "i" # Dot above. + s.downcase(:turkic) # => "ı" # No dot above. + +Case changes may not be reversible: + + s = 'Hello World!' # => "Hello World!" + s.downcase # => "hello world!" + s.downcase.upcase # => "HELLO WORLD!" # Different from original s. + +Case changing methods may not maintain Unicode normalization. +See String#unicode_normalize. + +== Case Mappings + +Except for +casecmp+ and +casecmp?+, +each of the case-mapping methods listed above +accepts an optional argument, <tt>mapping</tt>. + +The argument is one of: + +- +:ascii+: ASCII-only mapping. 
+ Uppercase letters ('A'..'Z') are mapped to lowercase letters ('a'..'z'); + other characters are not changed. + + s = "Foo \u00D8 \u00F8 Bar" # => "Foo Ø ø Bar" + s.upcase # => "FOO Ø Ø BAR" + s.downcase # => "foo ø ø bar" + s.upcase(:ascii) # => "FOO Ø ø BAR" + s.downcase(:ascii) # => "foo Ø ø bar" + +- +:turkic+: Full Unicode case mapping. + For the Turkic languages + that distinguish dotted and dotless I, for example Turkish and Azeri. + + s = 'Türkiye' # => "Türkiye" + s.upcase # => "TÜRKIYE" + s.upcase(:turkic) # => "TÜRKİYE" # Dot above. + + s = 'TÜRKIYE' # => "TÜRKIYE" + s.downcase # => "türkiye" + s.downcase(:turkic) # => "türkıye" # No dot above. + +- +:fold+ (available only for String#downcase, String#downcase!, + and Symbol#downcase). + Unicode case folding, + which is more far-reaching than Unicode case mapping. + + s = "\u00DF" # => "ß" + s.downcase # => "ß" + s.downcase(:fold) # => "ss" + s.upcase # => "SS" + + s = "\uFB04" # => "ﬄ" + s.downcase # => "ﬄ" + s.upcase # => "FFL" + s.downcase(:fold) # => "ffl" diff --git a/doc/language/character_selectors.rdoc b/doc/language/character_selectors.rdoc new file mode 100644 index 0000000000..8bfc9b719b --- /dev/null +++ b/doc/language/character_selectors.rdoc @@ -0,0 +1,100 @@ += Character Selectors + +== Character Selector + +A _character_ _selector_ is a string argument accepted by certain Ruby methods. +Each of these instance methods accepts one or more character selectors: + +- String#tr(selector, replacements): returns a new string. +- String#tr!(selector, replacements): returns +self+ or +nil+. +- String#tr_s(selector, replacements): returns a new string. +- String#tr_s!(selector, replacements): returns +self+ or +nil+. +- String#count(*selectors): returns the count of the specified characters. +- String#delete(*selectors): returns a new string. +- String#delete!(*selectors): returns +self+ or +nil+. +- String#squeeze(*selectors): returns a new string. 
+- String#squeeze!(*selectors): returns +self+ or +nil+. +- String#strip(*selectors): returns a new string. +- String#strip!(*selectors): returns +self+ or +nil+. + +A character selector identifies zero or more characters in +self+ +that are to be operands for the method. + +In this section, we illustrate using method String#delete(selector), +which deletes the selected characters. + +In the simplest case, the characters selected are exactly those +contained in the selector itself: + + 'abracadabra'.delete('a') # => "brcdbr" + 'abracadabra'.delete('ab') # => "rcdr" + 'abracadabra'.delete('abc') # => "rdr" + '0123456789'.delete('258') # => "0134679" + '!@#$%&*()_+'.delete('+&#') # => "!@$%*()_" + 'こんにちは'.delete('に') # => "こんちは" + +Note that order and repetitions do not matter: + + 'abracadabra'.delete('dcab') # => "rr" + 'abracadabra'.delete('aaaa') # => "brcdbr" + +In a character selector, these three characters get special treatment: + +- A leading caret (<tt>'^'</tt>) functions as a "not" operator + for the characters to its right: + + 'abracadabra'.delete('^bc') # => "bcb" + '0123456789'.delete('^852') # => "258" + +- A hyphen (<tt>'-'</tt>) between two other characters + defines a range of characters instead of a plain string of characters: + + 'abracadabra'.delete('a-d') # => "rr" + '0123456789'.delete('4-7') # => "012389" + '!@#$%&*()_+'.delete(' -/') # => "@_" + + # May contain more than one range. + 'abracadabra'.delete('a-cq-t') # => "d" + + # Ranges may be mixed with plain characters. + '0123456789'.delete('67-950-23') # => "4" + + # Ranges may be mixed with negations. 
+ 'abracadabra'.delete('^a-c') # => "abacaaba" + +- A backslash (<tt>'\'</tt>) acts as an escape for a caret, a hyphen, + or another backslash: + + 'abracadabra^'.delete('\^bc') # => "araadara" + 'abracadabra-'.delete('a\-d') # => "brcbr" + "hello\r\nworld".delete("\r") # => "hello\nworld" + "hello\r\nworld".delete("\\r") # => "hello\r\nwold" + "hello\r\nworld".delete("\\\r") # => "hello\nworld" + +== Multiple Character Selectors + +These instance methods accept multiple character selectors: + +- String#count(*selectors): returns the count of the specified characters. +- String#delete(*selectors): returns a new string. +- String#delete!(*selectors): returns +self+ or +nil+. +- String#squeeze(*selectors): returns a new string. +- String#squeeze!(*selectors): returns +self+ or +nil+. +- String#strip(*selectors): returns a new string. +- String#strip!(*selectors): returns +self+ or +nil+. + +In effect, the given selectors are formed into a single selector +consisting of only those characters common to _all_ of the given selectors. + +All forms of selectors may be used, including negations, ranges, and escapes. + +Each of these pairs of method calls is equivalent: + + s.delete('abcde', 'dcbfg') + s.delete('bcd') + + s.delete('^abc', '^def') + s.delete('^abcdef') + + s.delete('a-e', 'c-g') + s.delete('cde') diff --git a/doc/language/dig_methods.rdoc b/doc/language/dig_methods.rdoc new file mode 100644 index 0000000000..366275d451 --- /dev/null +++ b/doc/language/dig_methods.rdoc @@ -0,0 +1,82 @@ += Dig Methods + +Ruby's +dig+ methods are useful for accessing nested data structures. 
+ +Consider this data: + item = { + id: "0001", + type: "donut", + name: "Cake", + ppu: 0.55, + batters: { + batter: [ + {id: "1001", type: "Regular"}, + {id: "1002", type: "Chocolate"}, + {id: "1003", type: "Blueberry"}, + {id: "1004", type: "Devil's Food"} + ] + }, + topping: [ + {id: "5001", type: "None"}, + {id: "5002", type: "Glazed"}, + {id: "5005", type: "Sugar"}, + {id: "5007", type: "Powdered Sugar"}, + {id: "5006", type: "Chocolate with Sprinkles"}, + {id: "5003", type: "Chocolate"}, + {id: "5004", type: "Maple"} + ] + } + +Without a +dig+ method, you can write: + item[:batters][:batter][1][:type] # => "Chocolate" + +With a +dig+ method, you can write: + item.dig(:batters, :batter, 1, :type) # => "Chocolate" + +Without a +dig+ method, you can write, erroneously +(raises <tt>NoMethodError (undefined method `[]' for nil:NilClass)</tt>): + item[:batters][:BATTER][1][:type] + +With a +dig+ method, you can write (still erroneously, but avoiding the exception): + item.dig(:batters, :BATTER, 1, :type) # => nil + +== Why Is +dig+ Better? + +- It has fewer syntactical elements (to get wrong). +- It reads better. +- It does not raise an exception if an item is not found. + +== How Does +dig+ Work? + +The call sequence is: + obj.dig(*identifiers) + +The +identifiers+ define a "path" into the nested data structures: +- For each identifier in +identifiers+, calls method \#dig on a receiver + with that identifier. +- The first receiver is +self+. +- Each successive receiver is the value returned by the previous call to +dig+. +- The value finally returned is the value returned by the last call to +dig+. + +A +dig+ method raises an exception if any receiver does not respond to \#dig: + h = { foo: 1 } + # Raises TypeError (Integer does not have #dig method): + h.dig(:foo, :bar) + +== What Else? + +The structure above has \Hash objects and \Array objects, +both of which have instance method +dig+. 
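As a brief illustration (a sketch added for clarity; the +Point+ struct and the +data+ hash are made up for this example and are not part of the structure above), a single +dig+ call can pass through receivers of different classes along one path:

```ruby
# Hypothetical data: Point and data exist only for this illustration.
Point = Struct.new(:x, :y)
data = {shapes: [{name: "dot", at: Point.new(1, 2)}]}

# The path is resolved by Hash#dig, then Array#dig, then Hash#dig, then Struct#dig.
y = data.dig(:shapes, 0, :at, :y)       # => 2

# Array#dig(1) finds no element and returns nil,
# so the remaining identifiers are never used.
missing = data.dig(:shapes, 1, :at, :y) # => nil
```

Each receiver only needs to respond to \#dig; the classes along the path do not have to match.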
+ +Altogether there are six built-in Ruby classes that have method +dig+, +three in the core classes and three in the standard library. + +In the core: +- Array#dig: the first argument is an \Integer index. +- Hash#dig: the first argument is a key. +- Struct#dig: the first argument is a key. + +In the standard library: +- OpenStruct#dig: the first argument is a \String name. +- CSV::Table#dig: the first argument is an \Integer index or a \String header. +- CSV::Row#dig: the first argument is an \Integer index or a \String header. diff --git a/doc/language/encodings.rdoc b/doc/language/encodings.rdoc new file mode 100644 index 0000000000..683842d3fb --- /dev/null +++ b/doc/language/encodings.rdoc @@ -0,0 +1,482 @@ += Encodings + +== The Basics + +A {character encoding}[https://en.wikipedia.org/wiki/Character_encoding], +often shortened to _encoding_, is a mapping between: + +- A sequence of 8-bit bytes (each byte in the range <tt>0..255</tt>). +- Characters in a specific character set. + +Some character sets contain only 1-byte characters; +{US-ASCII}[https://en.wikipedia.org/wiki/ASCII], for example, has 128 1-byte characters. +This string, encoded in US-ASCII, has six characters that are stored as six bytes: + + s = 'Hello!'.encode(Encoding::US_ASCII) # => "Hello!" + s.encoding # => #<Encoding:US-ASCII> + s.bytes # => [72, 101, 108, 108, 111, 33] + +Other encodings may involve multi-byte characters. +{UTF-8}[https://en.wikipedia.org/wiki/UTF-8], for example, +encodes more than one million characters, encoding each in one to four bytes. +The lowest-valued of these characters correspond to ASCII characters, +and so are 1-byte characters: + + s = 'Hello!' # => "Hello!" + s.bytes # => [72, 101, 108, 108, 111, 33] + +Other characters, such as the Euro symbol, are multi-byte: + + s = "\u20ac" # => "€" + s.bytes # => [226, 130, 172] + +== The \Encoding Class + +=== \Encoding Objects + +Ruby encodings are defined by constants in class \Encoding. 
+There can be only one instance of \Encoding for each of these constants. +Method Encoding.list returns an array of \Encoding objects (one for each constant): + + Encoding.list.size # => 103 + Encoding.list.first.class # => Encoding + Encoding.list.take(3) + # => [#<Encoding:ASCII-8BIT>, #<Encoding:UTF-8>, #<Encoding:US-ASCII>] + +=== Names and Aliases + +Method Encoding#name returns the name of an \Encoding: + + Encoding::ASCII_8BIT.name # => "ASCII-8BIT" + Encoding::WINDOWS_31J.name # => "Windows-31J" + +An \Encoding object has zero or more aliases; +method Encoding#names returns an array containing the name and all aliases: + + Encoding::ASCII_8BIT.names + # => ["ASCII-8BIT", "BINARY"] + Encoding::WINDOWS_31J.names + #=> ["Windows-31J", "CP932", "csWindows31J", "SJIS", "PCK"] + +Method Encoding.aliases returns a hash of all alias/name pairs: + + Encoding.aliases.size # => 71 + Encoding.aliases.take(3) + # => [["BINARY", "ASCII-8BIT"], ["CP437", "IBM437"], ["CP720", "IBM720"]] + +Method Encoding.name_list returns an array of all the encoding names and aliases: + + Encoding.name_list.size # => 175 + Encoding.name_list.take(3) + # => ["ASCII-8BIT", "UTF-8", "US-ASCII"] + +Method +name_list+ returns more entries than method +list+ +because it includes both the names and their aliases. 
+ +Method Encoding.find returns the \Encoding for a given name or alias, if it exists: + + Encoding.find("US-ASCII") # => #<Encoding:US-ASCII> + Encoding.find("US-ASCII").class # => Encoding + +=== Default Encodings + +Method Encoding.find, above, also returns a default \Encoding +for each of these special names: + +- +external+: the default external \Encoding: + + Encoding.find("external") # => #<Encoding:UTF-8> + +- +internal+: the default internal \Encoding (may be +nil+): + + Encoding.find("internal") # => nil + +- +locale+: the default \Encoding for a string from the environment: + + Encoding.find("locale") # => #<Encoding:UTF-8> # Linux + Encoding.find("locale") # => #<Encoding:IBM437> # Windows + +- +filesystem+: the default \Encoding for a string from the filesystem: + + Encoding.find("filesystem") # => #<Encoding:UTF-8> + +Method Encoding.default_external returns the default external \Encoding: + + Encoding.default_external # => #<Encoding:UTF-8> + +Method Encoding.default_external= sets that value: + + Encoding.default_external = Encoding::US_ASCII # => #<Encoding:US-ASCII> + Encoding.default_external # => #<Encoding:US-ASCII> + +Method Encoding.default_internal returns the default internal \Encoding: + + Encoding.default_internal # => nil + +Method Encoding.default_internal= sets the default internal \Encoding: + + Encoding.default_internal = Encoding::US_ASCII # => #<Encoding:US-ASCII> + Encoding.default_internal # => #<Encoding:US-ASCII> + +=== Compatible Encodings + +Method Encoding.compatible? 
returns whether two given objects are encoding-compatible +(that is, whether they can be concatenated); +returns the \Encoding of the concatenated string, or +nil+ if incompatible: + + rus = "\u{442 435 441 442}" + eng = 'text' + Encoding.compatible?(rus, eng) # => #<Encoding:UTF-8> + + s0 = "\xa1\xa1".force_encoding(Encoding::ISO_8859_1) # => "\xA1\xA1" + s1 = "\xa1\xa1".force_encoding(Encoding::EUCJP) # => "\x{A1A1}" + Encoding.compatible?(s0, s1) # => nil + +== \String \Encoding + +A Ruby String object has an encoding that is an instance of class \Encoding. +The encoding may be retrieved by method String#encoding. + +The default encoding for a string literal is the script encoding; +see {Script Encoding}[rdoc-ref:@Script+Encoding]. + + 's'.encoding # => #<Encoding:UTF-8> + +The default encoding for a string created with method String.new is: + +- For no argument, ASCII-8BIT. +- For a \String object argument, the encoding of that string. +- For a string literal, the script encoding; + see {Script Encoding}[rdoc-ref:@Script+Encoding]. + +In either case, any encoding may be specified: + + s = String.new(encoding: Encoding::UTF_8) # => "" + s.encoding # => #<Encoding:UTF-8> + s = String.new('foo', encoding: Encoding::BINARY) # => "foo" + s.encoding # => #<Encoding:BINARY (ASCII-8BIT)> + +The encoding for a string may be changed: + + s = "R\xC3\xA9sum\xC3\xA9" # => "Résumé" + s.encoding # => #<Encoding:UTF-8> + s.force_encoding(Encoding::ISO_8859_1) # => "R\xC3\xA9sum\xC3\xA9" + s.encoding # => #<Encoding:ISO-8859-1> + +Changing the assigned encoding does not alter the content of the string; +it changes only the way the content is to be interpreted: + + s # => "R\xC3\xA9sum\xC3\xA9" + s.force_encoding(Encoding::UTF_8) # => "Résumé" + +The actual content of a string may also be altered; +see {Transcoding a String}[#label-Transcoding+a+String]. + +Here are a couple of useful query methods: + + s = "abc".force_encoding(Encoding::UTF_8) # => "abc" + s.ascii_only? 
# => true + s = "abc\u{6666}".force_encoding(Encoding::UTF_8) # => "abc晦" + s.ascii_only? # => false + + s = "\xc2\xa1".force_encoding(Encoding::UTF_8) # => "¡" + s.valid_encoding? # => true + s = "\xc2".force_encoding(Encoding::UTF_8) # => "\xC2" + s.valid_encoding? # => false + +== \Symbol and \Regexp Encodings + +The string stored in a Symbol or Regexp object also has an encoding; +the encoding may be retrieved by method Symbol#encoding or Regexp#encoding. + +The default encoding for these, however, is: + +- US-ASCII, if all characters are US-ASCII. +- The script encoding, otherwise; + see {Script Encoding}[rdoc-ref:@Script+Encoding]. + +== Filesystem \Encoding + +The filesystem encoding is the default \Encoding for a string from the filesystem: + + Encoding.find("filesystem") # => #<Encoding:UTF-8> + +== Locale \Encoding + +The locale encoding is the default encoding for a string from the environment, +other than from the filesystem: + + Encoding.find('locale') # => #<Encoding:IBM437> + +== Stream Encodings + +Certain stream objects can have two encodings; these objects include instances of: + +- IO. +- File. +- ARGF. +- StringIO. + +The two encodings are: + +- An _external_ _encoding_, which identifies the encoding of the stream. +- An _internal_ _encoding_, which (if not +nil+) specifies the encoding + to be used for the string constructed from the stream. + +=== External \Encoding + +The external encoding, which is an \Encoding object, specifies how bytes read +from the stream are to be interpreted as characters. + +The default external encoding is: + +- UTF-8 for a text stream. +- ASCII-8BIT for a binary stream. + +The default external encoding is returned by method Encoding.default_external, +and may be set by: + +- Ruby command-line options <tt>--external_encoding</tt> or <tt>-E</tt>. 
+ +You can also set the default external encoding using method Encoding.default_external=, +but doing so may cause problems; strings created before and after the change +may have different encodings. + +For an \IO or \File object, the external encoding may be set by: + +- Open options +external_encoding+ or +encoding+, when the object is created; + see {Open Options}[rdoc-ref:IO@Open+Options]. + +For an \IO, \File, \ARGF, or \StringIO object, the external encoding may be set by: + +- Methods +set_encoding+ or (except for \ARGF) +set_encoding_by_bom+. + +=== Internal \Encoding + +The internal encoding, which is an \Encoding object or +nil+, +specifies how characters read from the stream +are to be converted to characters in the internal encoding; +those characters become a string whose encoding is set to the internal encoding. + +The default internal encoding is +nil+ (no conversion). +It is returned by method Encoding.default_internal, +and may be set by: + +- Ruby command-line options <tt>--internal_encoding</tt> or <tt>-E</tt>. + +You can also set the default internal encoding using method Encoding.default_internal=, +but doing so may cause problems; strings created before and after the change +may have different encodings. + +For an \IO or \File object, the internal encoding may be set by: + +- Open options +internal_encoding+ or +encoding+, when the object is created; + see {Open Options}[rdoc-ref:IO@Open+Options]. + +For an \IO, \File, \ARGF, or \StringIO object, the internal encoding may be set by: + +- Method +set_encoding+. + +== Script \Encoding + +A Ruby script has a script encoding, which may be retrieved by: + + __ENCODING__ # => #<Encoding:UTF-8> + +The default script encoding is UTF-8; +a Ruby source file may set its script encoding with a magic comment +on the first line of the file (or second line, if there is a shebang on the first). 
+The comment must contain the word +coding+ or +encoding+, +followed by a colon, space and the Encoding name or alias: + + # encoding: ISO-8859-1 + __ENCODING__ #=> #<Encoding:ISO-8859-1> + +== Transcoding + +_Transcoding_ is the process of changing a sequence of characters +from one encoding to another. + +As far as possible, the characters remain the same, +but the bytes that represent them may change. + +The handling for characters that cannot be represented in the destination encoding +may be specified by @Encoding+Options. + +=== Transcoding a \String + +Each of these methods transcodes a string: + +- String#encode: Transcodes +self+ into a new string + according to given encodings and options. +- String#encode!: Like String#encode, but transcodes +self+ in place. +- String#scrub: Transcodes +self+ into a new string + by replacing invalid byte sequences with a given or default replacement string. +- String#scrub!: Like String#scrub, but transcodes +self+ in place. +- String#unicode_normalize: Transcodes +self+ into a new string + according to Unicode normalization. +- String#unicode_normalize!: Like String#unicode_normalize, + but transcodes +self+ in place. + +== Transcoding a Stream + +Each of these methods may transcode a stream; +whether it does so depends on the external and internal encodings: + +- IO.foreach: Yields each line of given stream to the block. +- IO.new: Creates and returns a new \IO object for the given integer file descriptor. +- IO.open: Creates a new \IO object. +- IO.pipe: Creates a connected pair of reader and writer \IO objects. +- IO.popen: Creates an \IO object to interact with a subprocess. +- IO.read: Returns a string with all or a subset of bytes from the given stream. +- IO.readlines: Returns an array of strings, which are the lines from the given stream. +- IO.write: Writes a given string to the given stream. 
+ +This example writes a string to a file, encoding it as ISO-8859-1, +then reads the file into a new string, encoding it as UTF-8: + + s = "R\u00E9sum\u00E9" + path = 't.tmp' + ext_enc = Encoding::ISO_8859_1 + int_enc = Encoding::UTF_8 + + File.write(path, s, external_encoding: ext_enc) + raw_text = File.binread(path) + + transcoded_text = File.read(path, external_encoding: ext_enc, internal_encoding: int_enc) + + p raw_text + p transcoded_text + +Output: + + "R\xE9sum\xE9" + "Résumé" + +== \Encoding Options + +A number of methods in the Ruby core accept keyword arguments as encoding options. + +Some of the options specify or utilize a _replacement_ _string_, to be used +in certain transcoding operations. +A replacement string may be in any encoding that can be converted +to the encoding of the destination string. + +These keyword-value pairs specify encoding options: + +- For an invalid byte sequence: + + - <tt>:invalid: nil</tt> (default): Raise exception. + - <tt>:invalid: :replace</tt>: Replace each invalid byte sequence + with the replacement string. + + Examples: + + s = "\x80foo\x80" + s.encode(Encoding::ISO_8859_3) # Raises Encoding::InvalidByteSequenceError. + s.encode(Encoding::ISO_8859_3, invalid: :replace) # => "?foo?" + +- For an undefined character: + + - <tt>:undef: nil</tt> (default): Raise exception. + - <tt>:undef: :replace</tt>: Replace each undefined character + with the replacement string. + + Examples: + + s = "\x80foo\x80" + "\x80".encode(Encoding::UTF_8, Encoding::BINARY) # Raises Encoding::UndefinedConversionError. + s.encode(Encoding::UTF_8, Encoding::BINARY, undef: :replace) # => "�foo�" + + +- Replacement string: + + - <tt>:replace: nil</tt> (default): Set replacement string to default value: + <tt>"\uFFFD"</tt> ("�") for a Unicode encoding, <tt>'?'</tt> otherwise. + - <tt>:replace: some_string</tt>: Set replacement string to the given +some_string+; + overrides +:fallback+. 
+ + Examples: + + s = "\xA5foo\xA5" + options = {:undef => :replace, :replace => 'xyzzy'} + s.encode(Encoding::UTF_8, Encoding::ISO_8859_3, **options) # => "xyzzyfooxyzzy" + +- Replacement fallback: + + One of these may be specified: + + - <tt>:fallback: nil</tt> (default): No replacement fallback. + - <tt>:fallback: hash_like_object</tt>: Set replacement fallback to the given + +hash_like_object+; the replacement string is <tt>hash_like_object[X]</tt>. + - <tt>:fallback: method</tt>: Set replacement fallback to the given + +method+; the replacement string is <tt>method(X)</tt>. + - <tt>:fallback: proc</tt>: Set replacement fallback to the given + +proc+; the replacement string is <tt>proc[X]</tt>. + + Examples: + + s = "\u3042foo\u3043" + + hash = {"\u3042" => 'xyzzy'} + hash.default = 'XYZZY' + s.encode(Encoding::US_ASCII, fallback: hash) # => "xyzzyfooXYZZY" + + def (fallback = "U+%.4X").escape(x) + self % x.unpack("U") + end + "\u{3042}".encode(Encoding::US_ASCII, fallback: fallback.method(:escape)) # => "U+3042" + + proc = Proc.new {|x| x == "\u3042" ? 'xyzzy' : 'XYZZY' } + s.encode('ASCII', fallback: proc) # => "XYZZYfooXYZZY" + +- XML entities: + + One of these may be specified: + + - <tt>:xml: nil</tt> (default): No handling for XML entities. + - <tt>:xml: :text</tt>: Treat source text as XML; + replace each undefined character + with its upper-case hexadecimal numeric character reference, + except that: + + - <tt>&</tt> is replaced with <tt>&amp;</tt>. + - <tt><</tt> is replaced with <tt>&lt;</tt>. + - <tt>></tt> is replaced with <tt>&gt;</tt>. + + - <tt>:xml: :attr</tt>: Treat source text as XML attribute value; + replace each undefined character + with its upper-case hexadecimal numeric character reference, + except that: + + - The replacement string <tt>r</tt> is double-quoted (<tt>"r"</tt>). + - Each embedded double-quote is replaced with <tt>&quot;</tt>. + - <tt>&</tt> is replaced with <tt>&amp;</tt>. + - <tt><</tt> is replaced with <tt>&lt;</tt>. 
+ - <tt>></tt> is replaced with <tt>&gt;</tt>. + + Examples: + + s = 'foo"<&>"bar' + "\u3042" + s.encode(Encoding::US_ASCII, xml: :text) # => "foo\"&lt;&amp;&gt;\"bar&#x3042;" + s.encode(Encoding::US_ASCII, xml: :attr) # => "\"foo&quot;&lt;&amp;&gt;&quot;bar&#x3042;\"" + + +- Newlines: + + One of these may be specified: + + - <tt>:cr_newline: true</tt>: Replace each line-feed character (<tt>"\n"</tt>) + with a carriage-return character (<tt>"\r"</tt>). + - <tt>:crlf_newline: true</tt>: Replace each line-feed character (<tt>"\n"</tt>) + with a carriage-return/line-feed string (<tt>"\r\n"</tt>). + - <tt>:universal_newline: true</tt>: Replace each carriage-return + character (<tt>"\r"</tt>) and each carriage-return/line-feed string + (<tt>"\r\n"</tt>) with a line-feed character (<tt>"\n"</tt>). + + Examples: + + s = "\n \r \r\n" # => "\n \r \r\n" + s.encode(Encoding::US_ASCII, cr_newline: true) # => "\r \r \r\r" + s.encode(Encoding::US_ASCII, crlf_newline: true) # => "\r\n \r \r\r\n" + s.encode(Encoding::US_ASCII, universal_newline: true) # => "\n \n \n" diff --git a/doc/language/exceptions.md b/doc/language/exceptions.md new file mode 100644 index 0000000000..5f8f0ece69 --- /dev/null +++ b/doc/language/exceptions.md @@ -0,0 +1,521 @@ +# Exceptions + +Ruby code can raise exceptions. + +Most often, a raised exception is meant to alert the running program +that an unusual (i.e., _exceptional_) situation has arisen, +and may need to be handled. + +Code throughout the Ruby core, Ruby standard library, and Ruby gems generates exceptions +in certain circumstances: + +```rb +File.open('nope.txt') # Raises Errno::ENOENT: "No such file or directory" +``` + +## Raised Exceptions + +A raised exception transfers program execution, one way or another. 
+ +### Unrescued Exceptions + +If an exception is not _rescued_ +(see [Rescued Exceptions](#label-Rescued+Exceptions) below), +execution transfers to code in the Ruby interpreter +that prints a message and exits the program (or thread): + +```console +$ ruby -e "raise" +-e:1:in '<main>': unhandled exception +``` + +### Rescued Exceptions + +An <i>exception handler</i> may determine what is to happen +when an exception is raised; +the handler may _rescue_ an exception, +and may prevent the program from exiting. + +A simple example: + +```rb +begin + raise 'Boom!' # Raises an exception, transfers control. + puts 'Will not get here.' +rescue + puts 'Rescued an exception.' # Control transferred to here; program does not exit. +end +puts 'Got here.' +``` + +Output: + +``` +Rescued an exception. +Got here. +``` + +An exception handler has several elements: + +| Element | Use | +|-----------------------------|------------------------------------------------------------------------------------------| +| Begin clause. | Begins the handler and contains the code whose raised exception, if any, may be rescued. | +| One or more rescue clauses. | Each contains "rescuing" code, which is to be executed for certain exceptions. | +| Else clause (optional). | Contains code to be executed if no exception is raised. | +| Ensure clause (optional). | Contains code to be executed whether or not an exception is raised, or is rescued. | +| <tt>end</tt> statement. | Ends the handler. | + +#### Begin Clause + +The begin clause begins the exception handler: + +- May start with a `begin` statement; + see also [Begin-Less Exception Handlers](#label-Begin-Less+Exception+Handlers). +- Contains code whose raised exception (if any) is covered + by the handler. +- Ends with the first following `rescue` statement. + +#### Rescue Clauses + +A rescue clause: + +- Starts with a `rescue` statement. +- Contains code that is to be executed for certain raised exceptions. 
+- Ends with the first following `rescue`, + `else`, `ensure`, or `end` statement. + +##### Rescued Exceptions + +A `rescue` statement may include one or more classes +that are to be rescued; +if none is given, StandardError is assumed. + +The rescue clause rescues both the specified class +(or StandardError if none given) and any of its subclasses; +see [Built-In Exception Class Hierarchy](rdoc-ref:Exception@Built-In+Exception+Class+Hierarchy). + +```rb +begin + 1 / 0 # Raises ZeroDivisionError, a subclass of StandardError. +rescue + puts "Rescued #{$!.class}" +end +``` + +Output: + +``` +Rescued ZeroDivisionError +``` + +If the `rescue` statement specifies an exception class, +only that class (or one of its subclasses) is rescued; +this example exits with a ZeroDivisionError, +which was not rescued because it is not ArgumentError or one of its subclasses: + +```rb +begin + 1 / 0 +rescue ArgumentError + puts "Rescued #{$!.class}" +end +``` + +A `rescue` statement may specify multiple classes, +which means that its code rescues an exception +of any of the given classes (or their subclasses): + +```rb +begin + 1 / 0 +rescue FloatDomainError, ZeroDivisionError + puts "Rescued #{$!.class}" +end +``` + +##### Multiple Rescue Clauses + +An exception handler may contain multiple rescue clauses; +in that case, the first clause that rescues the exception does so, +and those before and after are ignored: + +```rb +begin + Dir.open('nosuch') +rescue Errno::ENOTDIR + puts "Rescued #{$!.class}" +rescue Errno::ENOENT + puts "Rescued #{$!.class}" +end +``` + +Output: + +``` +Rescued Errno::ENOENT +``` + +##### Capturing the Rescued \Exception + +A `rescue` statement may specify a variable +whose value becomes the rescued exception +(an instance of Exception or one of its subclasses): + +```rb +begin + 1 / 0 +rescue => x + puts x.class + puts x.message +end +``` + +Output: + +``` +ZeroDivisionError +divided by 0 +``` + +##### Global Variables + +Two read-only global variables 
always have `nil` value +except in a rescue clause; +they're: + +- `$!`: contains the rescued exception. +- `$@`: contains its backtrace. + +Example: + +```rb +begin + 1 / 0 +rescue + p $! + p $@ +end +``` + +Output: + +``` +#<ZeroDivisionError: divided by 0> +["t.rb:2:in 'Integer#/'", "t.rb:2:in '<main>'"] +``` + +##### Cause + +In a rescue clause, the method Exception#cause returns the previous value of `$!`, +which may be `nil`; +elsewhere, the method returns `nil`. + +Example: + +```rb +begin + raise('Boom 0') +rescue => x0 + puts "Exception: #{x0.inspect}; $!: #{$!.inspect}; cause: #{x0.cause.inspect}." + begin + raise('Boom 1') + rescue => x1 + puts "Exception: #{x1.inspect}; $!: #{$!.inspect}; cause: #{x1.cause.inspect}." + begin + raise('Boom 2') + rescue => x2 + puts "Exception: #{x2.inspect}; $!: #{$!.inspect}; cause: #{x2.cause.inspect}." + end + end +end +``` + +Output: + +``` +Exception: #<RuntimeError: Boom 0>; $!: #<RuntimeError: Boom 0>; cause: nil. +Exception: #<RuntimeError: Boom 1>; $!: #<RuntimeError: Boom 1>; cause: #<RuntimeError: Boom 0>. +Exception: #<RuntimeError: Boom 2>; $!: #<RuntimeError: Boom 2>; cause: #<RuntimeError: Boom 1>. +``` + +#### Else Clause + +The `else` clause: + +- Starts with an `else` statement. +- Contains code that is to be executed if no exception is raised in the begin clause. +- Ends with the first following `ensure` or `end` statement. + +```rb +begin + puts 'Begin.' +rescue + puts 'Rescued an exception!' +else + puts 'No exception raised.' +end +``` + +Output: + +``` +Begin. +No exception raised. +``` + +#### Ensure Clause + +The ensure clause: + +- Starts with an `ensure` statement. +- Contains code that is to be executed + regardless of whether an exception is raised, + and regardless of whether a raised exception is handled. +- Ends with the first following `end` statement. + +```rb +def foo(boom: false) + puts 'Begin.' + raise 'Boom!' if boom +rescue + puts 'Rescued an exception!' 
+else + puts 'No exception raised.' +ensure + puts 'Always do this.' +end + +foo(boom: true) +foo(boom: false) +``` + +Output: + +``` +Begin. +Rescued an exception! +Always do this. +Begin. +No exception raised. +Always do this. +``` + +#### End Statement + +The `end` statement ends the handler. + +Code following it is reached only if any raised exception is rescued. + +#### Begin-Less \Exception Handlers + +As seen above, an exception handler may be implemented with `begin` and `end`. + +An exception handler may also be implemented as: + +- A method body: + + ```rb + def foo(boom: false) # Serves as beginning of exception handler. + puts 'Begin.' + raise 'Boom!' if boom + rescue + puts 'Rescued an exception!' + else + puts 'No exception raised.' + end # Serves as end of exception handler. + ``` + +- A block: + + ```rb + Dir.chdir('.') do |dir| # Serves as beginning of exception handler. + raise 'Boom!' + rescue + puts 'Rescued an exception!' + end # Serves as end of exception handler. + ``` + +#### Re-Raising an \Exception + +It can be useful to rescue an exception, but allow its eventual effect; +for example, a program can rescue an exception, log data about it, +and then "reinstate" the exception. + +This may be done via the `raise` method, but in a special way; +a rescuing clause: + + - Captures an exception. + - Does whatever is needed concerning the exception (such as logging it). + - Calls method `raise` with no argument, + which raises the rescued exception: + +```rb +begin + 1 / 0 +rescue ZeroDivisionError + # Do needful things (like logging). + raise # Raised exception will be ZeroDivisionError, not RuntimeError. 
+end +``` + +Output: + +``` +ruby t.rb +t.rb:2:in 'Integer#/': divided by 0 (ZeroDivisionError) + from t.rb:2:in '<main>' +``` + +#### Retrying + +It can be useful to retry a begin clause; +for example, if it must access a possibly-volatile resource +(such as a web page), +it can be useful to try the access more than once +(in the hope that it may become available): + +```rb +retries = 0 +begin + puts "Try ##{retries}." + raise 'Boom' +rescue + puts "Rescued retry ##{retries}." + if (retries += 1) < 3 + puts 'Retrying' + retry + else + puts 'Giving up.' + raise + end +end +``` + +``` +Try #0. +Rescued retry #0. +Retrying +Try #1. +Rescued retry #1. +Retrying +Try #2. +Rescued retry #2. +Giving up. +# RuntimeError ('Boom') raised. +``` + +Note that the retry re-executes the entire begin clause, +not just the part after the point of failure. + +## Raising an \Exception + +Method Kernel#raise raises an exception. + +## Custom Exceptions + +To provide additional or alternate information, +you may create custom exception classes. +Each should be a subclass of one of the built-in exception classes +(commonly StandardError or RuntimeError); +see [Built-In Exception Class Hierarchy](rdoc-ref:Exception@Built-In+Exception+Class+Hierarchy). + +```rb +class MyException < StandardError; end +``` + +## Messages + +Every `Exception` object has a message, +which is a string that is set at the time the object is created; +see Exception.new. + +The message cannot be changed, but you can create a similar object with a different message; +see Exception#exception. + +This method returns the message as defined: + +- Exception#message. + +Two other methods return enhanced versions of the message: + +- Exception#detailed_message: adds exception class name, with optional highlighting. +- Exception#full_message: adds exception class name and backtrace, with optional highlighting. 
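As a minimal sketch of how the three message methods relate: the class name `MyError` is hypothetical, and `Exception#detailed_message` is assumed to be available only on Ruby 3.2 or later, hence the `respond_to?` guard. Passing `highlight: false` keeps ANSI codes out of the returned strings.

```ruby
# MyError is an illustrative class, not part of Ruby.
class MyError < StandardError; end

begin
  raise MyError, 'Boom!'
rescue MyError => e
  plain = e.message                         # The message as defined.
  full  = e.full_message(highlight: false)  # Adds class name and backtrace.
  # Exception#detailed_message is assumed to exist only on Ruby 3.2+.
  detailed = e.detailed_message(highlight: false) if e.respond_to?(:detailed_message)
end

puts plain
puts detailed if detailed
puts full
```

The exact text of the enhanced messages depends on the Ruby version and on where the script is run, so no output is shown here.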
+ +Each of the two methods above accepts keyword argument `highlight`; +if the value of keyword `highlight` is `true`, +the returned string includes bolding and underlining ANSI codes (see below) +to enhance the appearance of the message. + +Any exception class (Ruby or custom) may choose to override either of these methods, +and may choose to interpret keyword argument <tt>highlight: true</tt> +to mean that the returned message should contain +[ANSI codes](https://en.wikipedia.org/wiki/ANSI_escape_code) +that specify color, bolding, and underlining. + +Because the enhanced message may be written to a non-terminal device +(e.g., into an HTML page), +it is best to limit the ANSI codes to these widely-supported codes: + +- Begin font color: + + | Color | ANSI Code | + |---------|------------------| + | Red | <tt>\\e[31m</tt> | + | Green | <tt>\\e[32m</tt> | + | Yellow | <tt>\\e[33m</tt> | + | Blue | <tt>\\e[34m</tt> | + | Magenta | <tt>\\e[35m</tt> | + | Cyan | <tt>\\e[36m</tt> | + +<br> + +- Begin font attribute: + + | Attribute | ANSI Code | + |-----------|-----------------| + | Bold | <tt>\\e[1m</tt> | + | Underline | <tt>\\e[4m</tt> | + +<br> + +- End all of the above: + + | Color | ANSI Code | + |-------|-----------------| + | Reset | <tt>\\e[0m</tt> | + +It's also best to craft a message that is conveniently human-readable, +even if the ANSI codes are included "as-is" +(rather than interpreted as font directives). + +## Backtraces + +A _backtrace_ is a record of the methods currently +in the [call stack](https://en.wikipedia.org/wiki/Call_stack); +each such method has been called, but has not yet returned. + +These methods return backtrace information: + +- Exception#backtrace: returns the backtrace as an array of strings or `nil`. +- Exception#backtrace_locations: returns the backtrace as an array + of Thread::Backtrace::Location objects or `nil`. + Each Thread::Backtrace::Location object gives detailed information about a called method. 
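The two backtrace methods listed above can be compared with a short sketch; the method names `inner` and `outer` are hypothetical and exist only to put two frames on the call stack.

```ruby
# inner and outer are illustrative method names.
def inner
  raise 'Boom!'
end

def outer
  inner
end

begin
  outer
rescue => e
  frames    = e.backtrace            # Array of strings.
  locations = e.backtrace_locations  # Array of Thread::Backtrace::Location.
end

puts frames.first
loc = locations.first
puts "#{loc.path}:#{loc.lineno} (#{loc.label})"
```

Each `Thread::Backtrace::Location` exposes the path, line number, and label separately, which avoids parsing the string form.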
+ +By default, Ruby sets the backtrace of the exception to the location where it +was raised. + +The developer might adjust this by either providing `backtrace` argument +to Kernel#raise, or using Exception#set_backtrace. + +Note that: + +- by default, both `backtrace` and `backtrace_locations` represent the same backtrace; +- if the developer sets the backtrace by one of the above methods to an array of + Thread::Backtrace::Location, they still represent the same backtrace; +- if the developer sets the backtrace to a string or an array of strings: + - by Kernel#raise: `backtrace_locations` become `nil`; + - by Exception#set_backtrace: `backtrace_locations` preserve the original + value; +- if the developer sets the backtrace to `nil` by Exception#set_backtrace, + `backtrace_locations` preserve the original value; but if the exception is then + reraised, both `backtrace` and `backtrace_locations` become the location of reraise. diff --git a/doc/language/fiber.md b/doc/language/fiber.md new file mode 100644 index 0000000000..d9011cce2f --- /dev/null +++ b/doc/language/fiber.md @@ -0,0 +1,290 @@ +# Fiber + +Fibers provide a mechanism for cooperative concurrency. + +## Context Switching + +Fibers execute a user-provided block. During the execution, the block may call `Fiber.yield` or `Fiber.transfer` to switch to another fiber. `Fiber#resume` is used to continue execution from the point where `Fiber.yield` was called. + +```rb +#!/usr/bin/env ruby + +puts "1: Start program." + +f = Fiber.new do + puts "3: Entered fiber." + Fiber.yield + puts "5: Resumed fiber." +end + +puts "2: Resume fiber first time." +f.resume + +puts "4: Resume fiber second time." +f.resume + +puts "6: Finished." +``` + +This program demonstrates the flow control of fibers. + +## Scheduler + +The scheduler interface is used to intercept blocking operations. A typical +implementation would be a wrapper for a gem like `EventMachine` or `Async`. 
This +design provides separation of concerns between the event loop implementation +and application code. It also allows for layered schedulers which can perform +instrumentation. + +To set the scheduler for the current thread: + +```rb +Fiber.set_scheduler(MyScheduler.new) +``` + +When the thread exits, there is an implicit call to `set_scheduler`: + +```rb +Fiber.set_scheduler(nil) +``` + +### Design + +The scheduler interface is designed to be an un-opinionated light-weight layer +between user code and blocking operations. The scheduler hooks should avoid +translating or converting arguments or return values. Ideally, the exact same +arguments from the user code are provided directly to the scheduler hook with +no changes. + +### Interface + +This is the interface you need to implement. + +```rb +class Scheduler + # Wait for the specified process ID to exit. + # This hook is optional. + # @parameter pid [Integer] The process ID to wait for. + # @parameter flags [Integer] A bit-mask of flags suitable for `Process::Status.wait`. + # @returns [Process::Status] A process status instance. + def process_wait(pid, flags) + Thread.new do + Process::Status.wait(pid, flags) + end.value + end + + # Wait for the given io readiness to match the specified events within + # the specified timeout. + # @parameter event [Integer] A bit mask of `IO::READABLE`, + # `IO::WRITABLE` and `IO::PRIORITY`. + # @parameter timeout [Numeric] The amount of time to wait for the event in seconds. + # @returns [Integer] The subset of events that are ready. + def io_wait(io, events, timeout) + end + + # Read from the given io into the specified buffer. + # WARNING: Experimental hook! Do not use in production code! + # @parameter io [IO] The io to read from. + # @parameter buffer [IO::Buffer] The buffer to read into. + # @parameter length [Integer] The minimum amount to read. + def io_read(io, buffer, length) + end + + # Write from the given buffer into the specified IO. 
+ # WARNING: Experimental hook! Do not use in production code! + # @parameter io [IO] The io to write to. + # @parameter buffer [IO::Buffer] The buffer to write from. + # @parameter length [Integer] The minimum amount to write. + def io_write(io, buffer, length) + end + + # Sleep the current task for the specified duration, or forever if not + # specified. + # @parameter duration [Numeric] The amount of time to sleep in seconds. + def kernel_sleep(duration = nil) + end + + # Execute the given block. If the block execution exceeds the given timeout, + # the specified exception `klass` will be raised. Typically, only non-blocking + # methods which enter the scheduler will raise such exceptions. + # @parameter duration [Integer] The amount of time to wait, after which an exception will be raised. + # @parameter klass [Class] The exception class to raise. + # @parameter *arguments [Array] The arguments to send to the constructor of the exception. + # @yields {...} The user code to execute. + def timeout_after(duration, klass, *arguments, &block) + end + + # Resolve hostname to an array of IP addresses. + # This hook is optional. + # @parameter hostname [String] Example: "www.ruby-lang.org". + # @returns [Array] An array of IPv4 and/or IPv6 address strings that the hostname resolves to. + def address_resolve(hostname) + end + + # Block the calling fiber. + # @parameter blocker [Object] What we are waiting on, informational only. + # @parameter timeout [Numeric | Nil] The amount of time to wait for in seconds. + # @returns [Boolean] Whether the blocking operation was successful or not. + def block(blocker, timeout = nil) + end + + # Unblock the specified fiber. + # @parameter blocker [Object] What we are waiting on, informational only. + # @parameter fiber [Fiber] The fiber to unblock. + # @reentrant Thread safe. + def unblock(blocker, fiber) + end + + # Intercept the creation of a non-blocking fiber. 
+ # @returns [Fiber] + def fiber(&block) + Fiber.new(blocking: false, &block) + end + + # Invoked when the thread exits. + def close + self.run + end + + def run + # Implement event loop here. + end +end +``` + +Additional hooks may be introduced in the future; we will use feature detection +in order to enable these hooks. + +### Non-blocking Execution + +The scheduler hooks will only be used in special non-blocking execution +contexts. Non-blocking execution contexts introduce non-determinism because the +execution of scheduler hooks may introduce context switching points into your +program. + +#### Fibers + +Fibers can be used to create non-blocking execution contexts. + +```rb +Fiber.new do + puts Fiber.current.blocking? # false + + # May invoke `Fiber.scheduler&.io_wait`. + io.read(...) + + # May invoke `Fiber.scheduler&.io_wait`. + io.write(...) + + # Will invoke `Fiber.scheduler&.kernel_sleep`. + sleep(n) +end.resume +``` + +We also introduce a new method which simplifies the creation of these +non-blocking fibers: + +```rb +Fiber.schedule do + puts Fiber.current.blocking? # false +end +``` + +The purpose of this method is to allow the scheduler to internally decide the +policy for when to start the fiber, and whether to use symmetric or asymmetric +fibers. + +You can also create blocking execution contexts: + +```rb +Fiber.new(blocking: true) do + # Won't use the scheduler: + sleep(n) +end +``` + +However, you should generally avoid this unless you are implementing a scheduler. + +#### IO + +By default, I/O is non-blocking. Not all operating systems support non-blocking +I/O. Windows is a notable example where socket I/O can be non-blocking but pipe +I/O is blocking. Provided that there *is* a scheduler and the current thread *is +non-blocking*, the operation will invoke the scheduler. + +##### `IO#close` + +Closing an IO interrupts all blocking operations on that IO. 
When a thread calls `IO#close`, it first attempts to interrupt any threads or fibers that are blocked on that IO. The closing thread waits until all blocked threads and fibers have been properly interrupted and removed from the IO's blocking list. Each interrupted thread or fiber receives an `IOError` and is cleanly removed from the blocking operation. Only after all blocking operations have been interrupted and cleaned up will the actual file descriptor be closed, ensuring proper resource cleanup and preventing potential race conditions. + +For fibers managed by a scheduler, the interruption process involves calling `rb_fiber_scheduler_fiber_interrupt` on the scheduler. This allows the scheduler to handle the interruption in a way that's appropriate for its event loop implementation. The scheduler can then notify the fiber, which will receive an `IOError` and be removed from the blocking operation. This mechanism ensures that fiber-based concurrency works correctly with IO operations, even when those operations are interrupted by `IO#close`. 
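The thread-level part of this behavior can be observed without a scheduler. The following is a minimal sketch, assuming a pipe whose read end is blocked in one thread while the main thread closes it; the variable names and the polling loop that waits for the reader to block are illustrative only, not part of the documented interface.

```ruby
reader, writer = IO.pipe

# This thread blocks in `read` because nothing is ever written to the pipe.
blocked = Thread.new do
  begin
    reader.read
    :completed
  rescue IOError => error
    error # hand the exception back as the thread's value
  end
end

# Wait until the thread is parked in the blocking read (illustrative polling).
sleep 0.01 until blocked.status == "sleep"

# Closing the IO from another thread interrupts the blocked read.
reader.close

result = blocked.value
writer.close

puts result.class
```

With a fiber scheduler installed the same `IOError` is expected to reach the blocked fiber, but how it gets there depends on the scheduler's own interrupt handling, which this sketch does not cover.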
+ +```mermaid +sequenceDiagram + participant ThreadB + participant ThreadA + participant Scheduler + participant IO + participant Fiber1 + participant Fiber2 + + Note over ThreadA: Thread A has a fiber scheduler + activate Scheduler + ThreadA->>Fiber1: Schedule Fiber 1 + activate Fiber1 + Fiber1->>IO: IO.read + IO->>Scheduler: rb_thread_io_blocking_region + deactivate Fiber1 + + ThreadA->>Fiber2: Schedule Fiber 2 + activate Fiber2 + Fiber2->>IO: IO.read + IO->>Scheduler: rb_thread_io_blocking_region + deactivate Fiber2 + + Note over Fiber1,Fiber2: Both fibers blocked on same IO + + Note over ThreadB: IO.close + activate ThreadB + ThreadB->>IO: thread_io_close_notify_all + Note over ThreadB: rb_mutex_sleep + + IO->>Scheduler: rb_fiber_scheduler_fiber_interrupt(Fiber1) + Scheduler->>Fiber1: fiber_interrupt with IOError + activate Fiber1 + Note over IO: fiber_interrupt causes removal from blocking list + Fiber1->>IO: rb_io_blocking_operation_exit() + IO-->>ThreadB: Wakeup thread + deactivate Fiber1 + + IO->>Scheduler: rb_fiber_scheduler_fiber_interrupt(Fiber2) + Scheduler->>Fiber2: fiber_interrupt with IOError + activate Fiber2 + Note over IO: fiber_interrupt causes removal from blocking list + Fiber2->>IO: rb_io_blocking_operation_exit() + IO-->>ThreadB: Wakeup thread + deactivate Fiber2 + deactivate Scheduler + + Note over ThreadB: Blocking operations list empty + ThreadB->>IO: close(fd) + deactivate ThreadB +``` + +#### Mutex + +The `Mutex` class can be used in a non-blocking context and is fiber-specific. + +#### ConditionVariable + +The `ConditionVariable` class can be used in a non-blocking context and is +fiber-specific. + +#### Queue / SizedQueue + +The `Queue` and `SizedQueue` classes can be used in a non-blocking context and +are fiber-specific. + +#### Thread + +The `Thread#join` operation can be used in a non-blocking context and is +fiber-specific. 
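As a rough illustration of what "fiber-specific" means for `Mutex`, the sketch below assumes Ruby 3.0 or later, where a mutex is owned by the fiber that locked it rather than by the whole thread. No scheduler is involved, and the variable names are illustrative.

```ruby
mutex = Mutex.new

holder = Fiber.new do
  mutex.lock
  Fiber.yield   # suspend while still holding the lock
  mutex.unlock
end

holder.resume   # the fiber now holds the lock and is suspended

# From the main fiber of the same thread:
locked_elsewhere = mutex.locked?   # the mutex is held by some fiber
owned_by_main    = mutex.owned?    # ownership is tracked per fiber
acquired         = mutex.try_lock  # non-blocking attempt to take the lock

holder.resume   # let the fiber release the lock
released = !mutex.locked?

p locked_elsewhere: locked_elsewhere, owned_by_main: owned_by_main,
  acquired: acquired, released: released
```

`try_lock` is used here instead of `lock` so that the main fiber never waits on a lock held by another fiber of the same thread.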
diff --git a/doc/language/format_specifications.rdoc b/doc/language/format_specifications.rdoc new file mode 100644 index 0000000000..a59f210377 --- /dev/null +++ b/doc/language/format_specifications.rdoc @@ -0,0 +1,354 @@ += Format Specifications + +Several Ruby core classes have instance method +printf+ or +sprintf+: + +- ARGF#printf +- IO#printf +- Kernel#printf +- Kernel#sprintf + +Each of these methods takes: + +- Argument +format_string+, which has zero or more + embedded _format_ _specifications_ (see below). +- Arguments <tt>*arguments</tt>, which are zero or more objects to be formatted. + +Each of these methods prints or returns the string +resulting from replacing each +format specification embedded in +format_string+ with a string form +of the corresponding argument among +arguments+. + +A simple example: + + sprintf('Name: %s; value: %d', 'Foo', 0) # => "Name: Foo; value: 0" + +A format specification has the form: + + %[flags][width][.precision]type + +It consists of: + +- A leading percent character. +- Zero or more _flags_ (each is a character). +- An optional _width_ _specifier_ (an integer, or <tt>*</tt>). +- An optional _precision_ _specifier_ (a period followed by a non-negative + integer, or <tt>*</tt>). +- A _type_ _specifier_ (a character). + +Except for the leading percent character, +the only required part is the type specifier, so we begin with that. + +== Type Specifiers + +This section provides a brief explanation of each type specifier. +The links lead to the details and examples. + +=== \Integer Type Specifiers + +- +b+ or +B+: Format +argument+ as a binary integer. + See {Specifiers b and B}[rdoc-ref:@Specifiers+b+and+B]. +- +d+, +i+, or +u+ (all are identical): + Format +argument+ as a decimal integer. + See {Specifier d}[rdoc-ref:@Specifier+d]. +- +o+: Format +argument+ as an octal integer. + See {Specifier o}[rdoc-ref:@Specifier+o]. +- +x+ or +X+: Format +argument+ as a hexadecimal integer. 
+ See {Specifiers x and X}[rdoc-ref:@Specifiers+x+and+X]. + +=== Floating-Point Type Specifiers + +- +a+ or +A+: Format +argument+ as a hexadecimal floating-point number. + See {Specifiers a and A}[rdoc-ref:@Specifiers+a+and+A]. +- +e+ or +E+: Format +argument+ in scientific notation. + See {Specifiers e and E}[rdoc-ref:@Specifiers+e+and+E]. +- +f+: Format +argument+ as a decimal floating-point number. + See {Specifier f}[rdoc-ref:@Specifier+f]. +- +g+ or +G+: Format +argument+ in a "general" format. + See {Specifiers g and G}[rdoc-ref:@Specifiers+g+and+G]. + +=== Other Type Specifiers + +- +c+: Format +argument+ as a character. + See {Specifier c}[rdoc-ref:@Specifier+c]. +- +p+: Format +argument+ as a string via <tt>argument.inspect</tt>. + See {Specifier p}[rdoc-ref:@Specifier+p]. +- +s+: Format +argument+ as a string via <tt>argument.to_s</tt>. + See {Specifier s}[rdoc-ref:@Specifier+s]. +- <tt>%</tt>: Format +argument+ (<tt>'%'</tt>) as a single percent character. + See {Specifier %}[rdoc-ref:@Specifier+-25]. + +== Flags + +The effect of a flag may vary greatly among type specifiers. +These remarks are general in nature. +See {type-specific details}[rdoc-ref:@Type+Specifier+Details+and+Examples]. + +Multiple flags may be given with a single type specifier; +order does not matter. 
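A short sketch of combining flags, using only `sprintf` as described above; the variable names are illustrative. Each pair applies the same flags in a different order and is expected to produce the same string:

```ruby
# '+' (explicit sign) combined with '0' (zero padding), width 8, precision 2.
plus_then_zero = sprintf('%+08.2f', 3.14159)
zero_then_plus = sprintf('%0+8.2f', 3.14159)

# '-' (left-justify) combined with '+' (explicit sign), width 6.
# The trailing '|' only makes the padding visible.
left_then_plus = sprintf('%-+6d|', 42)
plus_then_left = sprintf('%+-6d|', 42)

p plus_then_zero   # => "+0003.14"
p plus_then_zero == zero_then_plus
p left_then_plus == plus_then_left
```

Note that flags must precede the width; only their order relative to each other is free.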
+ +=== <tt>' '</tt> Flag + +Insert a space before a non-negative number: + + sprintf('%d', 10) # => "10" + sprintf('% d', 10) # => " 10" + +Insert a minus sign for a negative value: + + sprintf('%d', -10) # => "-10" + sprintf('% d', -10) # => "-10" + +=== <tt>'#'</tt> Flag + +Use an alternate format; varies among types: + + sprintf('%x', 100) # => "64" + sprintf('%#x', 100) # => "0x64" + +=== <tt>'+'</tt> Flag + +Add a leading plus sign for a non-negative number: + + sprintf('%x', 100) # => "64" + sprintf('%+x', 100) # => "+64" + +=== <tt>'-'</tt> Flag + +Left justify the value in its field: + + sprintf('%6d', 100) # => "   100" + sprintf('%-6d', 100) # => "100   " + +=== <tt>'0'</tt> Flag + +Left-pad with zeros instead of spaces: + + sprintf('%6d', 100) # => "   100" + sprintf('%06d', 100) # => "000100" + +=== <tt>'n$'</tt> Flag + +Format the (1-based) <tt>n</tt>th argument into this field: + + sprintf("%s %s", 'world', 'hello') # => "world hello" + sprintf("%2$s %1$s", 'world', 'hello') # => "hello world" + +== Width Specifier + +In general, a width specifier determines the minimum width (in characters) +of the formatted field: + + sprintf('%10d', 100) # => "       100" + + # Left-justify if negative. + sprintf('%-10d', 100) # => "100       " + + # Ignore if too small. + sprintf('%1d', 100) # => "100" + +If the width specifier is <tt>'*'</tt> instead of an integer, the actual minimum +width is taken from the argument list: + + sprintf('%*d', 20, 14) # => "                  14" + +== Precision Specifier + +A precision specifier is a decimal point followed by zero or more +decimal digits. + +For integer type specifiers, the precision specifies the minimum number of +digits to be written. If the integer has fewer digits than the precision, the result is +padded with leading zeros. 
There is no modification or truncation of the result +if the integer is longer than the precision: + + sprintf('%.3d', 1) # => "001" + sprintf('%.3d', 1000) # => "1000" + + # If the precision is 0 and the value is 0, nothing is written + sprintf('%.d', 0) # => "" + sprintf('%.0d', 0) # => "" + +For the +a+/+A+, +e+/+E+, +f+/+F+ specifiers, the precision specifies +the number of digits after the decimal point to be written: + + sprintf('%.2f', 3.14159) # => "3.14" + sprintf('%.10f', 3.14159) # => "3.1415900000" + + # With no precision specifier, defaults to 6-digit precision. + sprintf('%f', 3.14159) # => "3.141590" + +For the +g+/+G+ specifiers, the precision specifies +the number of significant digits to be written: + + sprintf('%.2g', 123.45) # => "1.2e+02" + sprintf('%.3g', 123.45) # => "123" + sprintf('%.10g', 123.45) # => "123.45" + + # With no precision specifier, defaults to 6 significant digits. + sprintf('%g', 123.456789) # => "123.457" + +For the +s+, +p+ specifiers, the precision specifies +the number of characters to write: + + sprintf('%s', Time.now) # => "2022-05-04 11:59:16 -0400" + sprintf('%.10s', Time.now) # => "2022-05-04" + +If the precision specifier is <tt>'*'</tt> instead of a non-negative integer, +the actual precision is taken from the argument list: + + sprintf('%.*d', 20, 1) # => "00000000000000000001" + +== Type Specifier Details and Examples + +=== Specifiers +a+ and +A+ + +Format +argument+ as a hexadecimal floating-point number: + + sprintf('%a', 3.14159) # => "0x1.921f9f01b866ep+1" + sprintf('%a', -3.14159) # => "-0x1.921f9f01b866ep+1" + sprintf('%a', 4096) # => "0x1p+12" + sprintf('%a', -4096) # => "-0x1p+12" + + # Capital 'A' means that alphabetical characters are printed in upper case. + sprintf('%A', 4096) # => "0X1P+12" + sprintf('%A', -4096) # => "-0X1P+12" + +=== Specifiers +b+ and +B+ + +The two specifiers +b+ and +B+ behave identically +except when flag <tt>'#'</tt> is used. 
+ +Format +argument+ as a binary integer: + + sprintf('%b', 1) # => "1" + sprintf('%b', 4) # => "100" + + # Prefix '..' for negative value. + sprintf('%b', -4) # => "..100" + + # Alternate format. + sprintf('%#b', 4) # => "0b100" + sprintf('%#B', 4) # => "0B100" + +=== Specifier +c+ + +Format +argument+ as a single character: + + sprintf('%c', 'A') # => "A" + sprintf('%c', 65) # => "A" + +This behaves like String#<<, except for raising ArgumentError instead of RangeError. + +=== Specifier +d+ + +Format +argument+ as a decimal integer: + + sprintf('%d', 100) # => "100" + sprintf('%d', -100) # => "-100" + +Flag <tt>'#'</tt> does not apply. + +=== Specifiers +e+ and +E+ + +Format +argument+ in +{scientific notation}[https://en.wikipedia.org/wiki/Scientific_notation]: + + sprintf('%e', 3.14159) # => "3.141590e+00" + sprintf('%E', -3.14159) # => "-3.141590E+00" + +=== Specifier +f+ + +Format +argument+ as a floating-point number: + + sprintf('%f', 3.14159) # => "3.141590" + sprintf('%f', -3.14159) # => "-3.141590" + +Flag <tt>'#'</tt> does not apply. + +=== Specifiers +g+ and +G+ + +Format +argument+ using exponential form (+e+/+E+ specifier) +if the exponent is less than -4 or greater than or equal to the precision. +Otherwise format +argument+ using floating-point form (+f+ specifier): + + sprintf('%g', 100) # => "100" + sprintf('%g', 100.0) # => "100" + sprintf('%g', 3.14159) # => "3.14159" + sprintf('%g', 100000000000) # => "1e+11" + sprintf('%g', 0.000000000001) # => "1e-12" + + # Capital 'G' means use capital 'E'. + sprintf('%G', 100000000000) # => "1E+11" + sprintf('%G', 0.000000000001) # => "1E-12" + + # Alternate format. + sprintf('%#g', 100000000000) # => "1.00000e+11" + sprintf('%#g', 0.000000000001) # => "1.00000e-12" + sprintf('%#G', 100000000000) # => "1.00000E+11" + sprintf('%#G', 0.000000000001) # => "1.00000E-12" + +=== Specifier +o+ + +Format +argument+ as an octal integer. 
+If +argument+ is negative, it will be formatted as a two's complement +prefixed with +..7+: + + sprintf('%o', 16) # => "20" + + # Prefix '..7' for negative value. + sprintf('%o', -16) # => "..760" + + # Prefix zero for alternate format if positive. + sprintf('%#o', 16) # => "020" + sprintf('%#o', -16) # => "..760" + +=== Specifier +p+ + +Format +argument+ as a string via <tt>argument.inspect</tt>: + + t = Time.now + sprintf('%p', t) # => "2022-05-01 13:42:07.1645683 -0500" + +=== Specifier +s+ + +Format +argument+ as a string via <tt>argument.to_s</tt>: + + t = Time.now + sprintf('%s', t) # => "2022-05-01 13:42:07 -0500" + +Flag <tt>'#'</tt> does not apply. + +=== Specifiers +x+ and +X+ + +Format +argument+ as a hexadecimal integer. +If +argument+ is negative, it will be formatted as a two's complement +prefixed with +..f+: + + sprintf('%x', 100) # => "64" + + # Prefix '..f' for negative value. + sprintf('%x', -100) # => "..f9c" + + # Use alternate format. + sprintf('%#x', 100) # => "0x64" + + # Alternate format for negative value. + sprintf('%#x', -100) # => "0x..f9c" + +=== Specifier <tt>%</tt> + +Format +argument+ (<tt>'%'</tt>) as a single percent character: + + sprintf('%d %%', 100) # => "100 %" + +Flags do not apply. + +== Reference by Name + +For more complex formatting, Ruby supports a reference by name. +The <tt>%<name>s</tt> style formats the named value with the type specifier that follows it; +the <tt>%{name}</tt> style substitutes the named value without applying a type specifier. + +Examples: + + sprintf("%<foo>d : %<bar>f", { :foo => 1, :bar => 2 }) # => "1 : 2.000000" + sprintf("%{foo}f", { :foo => 1 }) # => "1f" diff --git a/doc/language/globals.md b/doc/language/globals.md new file mode 100644 index 0000000000..ece950d3d8 --- /dev/null +++ b/doc/language/globals.md @@ -0,0 +1,610 @@ +# Pre-Defined Global Variables + +Some of the pre-defined global variables have synonyms +that are available via module English. +For each of those, the \English synonym is given. 
+ +To use the module: + +```ruby +require 'English' +``` + +## In Brief + +### Exceptions + +| Variable | \English | Contains | Initially | Read-Only | Reset By | +|:--------:|:-----------------:|----------------------------------------|:---------:|:---------:|--------------| +| `$!` | `$ERROR_INFO` | \Exception object or `nil` | `nil` | Yes | Kernel#raise | +| `$@` | `$ERROR_POSITION` | \Array of backtrace positions or `nil` | `nil` | Yes | Kernel#raise | + +### Matched \Data + +| Variable | \English | Contains | Initially | Read-Only | Reset By | +|:---------:|:-------------------:|-----------------------------------|:---------:|:---------:|-----------------| +| `$~` | `$LAST_MATCH_INFO` | \MatchData object or `nil` | `nil` | No | Matcher methods | +| `$&` | `$MATCH` | Matched substring or `nil` | `nil` | No | Matcher methods | +| `` $` `` | `$PREMATCH` | Substring left of match or `nil` | `nil` | No | Matcher methods | +| `$'` | `$POSTMATCH` | Substring right of match or `nil` | `nil` | No | Matcher methods | +| `$+` | `$LAST_PAREN_MATCH` | Last group matched or `nil` | `nil` | No | Matcher methods | +| `$1` | | First group matched or `nil` | `nil` | Yes | Matcher methods | +| `$2` | | Second group matched or `nil` | `nil` | Yes | Matcher methods | +| `$n` | | <i>n</i>th group matched or `nil` | `nil` | Yes | Matcher methods | + +### Separators + +| Variable | \English | Contains | Initially | Read-Only | Reset By | +|:-----------:|:---------------------------:|-------------------------|:---------:|:---------:|----------| +| `$/`, `$-0` | `$INPUT_RECORD_SEPARATOR` | Input record separator | Newline | No | | +| `$\` | `$OUTPUT_RECORD_SEPARATOR` | Output record separator | `nil` | No | | + +### Streams + +| Variable | \English | Contains | Initially | Read-Only | Reset By | +|:---------:|:----------------------------:|---------------------------------------------|:---------:|:---------:|----------------------| +| `$stdin` | | Standard input stream | `STDIN` | No 
| | +| `$stdout` | | Standard output stream | `STDOUT` | No | | +| `$stderr` | | Standard error stream | `STDERR` | No | | +| `$<` | `$DEFAULT_INPUT` | Default standard input | `ARGF` | Yes | | +| `$>` | `$DEFAULT_OUTPUT` | Default standard output | `STDOUT` | No | | +| `$.` | `$INPUT_LINE_NUMBER`, `$NR` | Input position of most recently read stream | 0 | No | Certain read methods | +| `$_` | `$LAST_READ_LINE` | String from most recently read stream | `nil` | No | Certain read methods | + +### Processes + +| Variable | \English | Contains | Initially | Read-Only | Reset By | +|:-------------------------:|:----------------------:|---------------------------------|:-------------:|:---------:|----------| +| `$0`, `$PROGRAM_NAME` | | Program name | Program name | No | | +| `$*` | `$ARGV` | \ARGV array | `ARGV` | Yes | | +| `$$` | `$PROCESS_ID`, `$PID` | Process id | Process PID | Yes | | +| `$?` | `$CHILD_STATUS` | Status of recently exited child | `nil` | Yes | | +| `$LOAD_PATH`, `$:`, `$-I` | | \Array of search paths | Ruby defaults | Yes | | +| `$LOADED_FEATURES`, `$"` | | \Array of load paths | Ruby defaults | Yes | | + +### Debugging + +| Variable | \English | Contains | Initially | Read-Only | Reset By | +|:-----------:|:--------:|--------------------------------------------|:----------------------------:|:---------:|----------| +| `$FILENAME` | | Value returned by method `ARGF.filename` | Command-line argument or '-' | Yes | | +| `$DEBUG` | | Whether option `-d` or `--debug` was given | Command-line option | No | | +| `$VERBOSE` | | Whether option `-v` or `-w` was given | Command-line option | No | | + +### Other Variables + +| Variable | \English | Contains | Initially | Read-Only | Reset By | +|:-----------:|:--------:|-----------------------------------------------|:---------:|:---------:|----------| +| `$-F`, `$;` | | Separator given with command-line option `-F` | | | | +| `$-a` | | Whether option `-a` was given | | Yes | | +| `$-i` | | Extension given with 
command-line option `-i` | | No | | +| `$-l` | | Whether option `-l` was given | | Yes | | +| `$-p` | | Whether option `-p` was given | | Yes | | +| `$F` | | \Array of `$_` split by `$-F` | | | | + +## Exceptions + +### `$!` (\Exception) + +Contains the Exception object set by Kernel#raise: + +```ruby +begin + raise RuntimeError.new('Boo!') +rescue RuntimeError + p $! +end +``` + +Output: + +``` +#<RuntimeError: Boo!> +``` + +English - `$ERROR_INFO` + +### `$@` (Backtrace) + +Same as `$!.backtrace`; +returns an array of backtrace positions: + +```ruby +begin + raise RuntimeError.new('Boo!') +rescue RuntimeError + pp $@.take(4) +end +``` + +Output: + +``` +["(irb):338:in `<top (required)>'", + "/snap/ruby/317/lib/ruby/3.2.0/irb/workspace.rb:119:in `eval'", + "/snap/ruby/317/lib/ruby/3.2.0/irb/workspace.rb:119:in `evaluate'", + "/snap/ruby/317/lib/ruby/3.2.0/irb/context.rb:502:in `evaluate'"] +``` + +English - `$ERROR_POSITION`. + +## Matched \Data + +These global variables store information about the most recent +successful match in the current scope. + +For details and examples, +see {Regexp Global Variables}[rdoc-ref:Regexp@Global+Variables]. + +### `$~` (\MatchData) + +MatchData object created from the match; +thread-local and frame-local. + +English - `$LAST_MATCH_INFO`. + +### `$&` (Matched Substring) + +The matched string. + +English - `$MATCH`. + +### `` $` `` (Pre-Match Substring) +The string to the left of the match. + +English - `$PREMATCH`. + +### `$'` (Post-Match Substring) + +The string to the right of the match. + +English - `$POSTMATCH`. + +### `$+` (Last Matched Group) + +The last group matched. + +English - `$LAST_PAREN_MATCH`. + +### `$1`, `$2`, \Etc. (Matched Group) + +For <tt>$n</tt> the <i>n</i>th group of the match. + +No \English. + +## Separators + +### `$/` (Input Record Separator) + +An input record separator, initially newline. +Set by the [command-line option `-0`]. 
+ +Setting to non-nil value by other than the command-line option is +deprecated. + +English - `$INPUT_RECORD_SEPARATOR`, `$RS`. + +Aliased as `$-0`. + +### `$\` (Output Record Separator) + +An output record separator, initially `nil`. + +Copied from `$/` when the [command-line option `-l`] is +given. + +Setting to non-nil value by other than the command-line option is +deprecated. + +English - `$OUTPUT_RECORD_SEPARATOR`, `$ORS`. + +## Streams + +### `$stdin` (Standard Input) + +The current standard input stream; initially: + +```ruby +$stdin # => #<IO:<STDIN>> +``` + +### `$stdout` (Standard Output) + +The current standard output stream; initially: + +```ruby +$stdout # => #<IO:<STDOUT>> +``` + +### `$stderr` (Standard Error) + +The current standard error stream; initially: + +```ruby +$stderr # => #<IO:<STDERR>> +``` + +### `$<` (\ARGF or $stdin) + +Points to stream ARGF if not empty, else to stream $stdin; read-only. + +English - `$DEFAULT_INPUT`. + +### `$>` (Default Standard Output) + +An output stream, initially `$stdout`. + +English - `$DEFAULT_OUTPUT` + +### `$.` (Input Position) + +The input position (line number) in the most recently read stream. + +English - `$INPUT_LINE_NUMBER`, `$NR` + +### `$_` (Last Read Line) + +The line (string) from the most recently read stream. + +English - `$LAST_READ_LINE`. + +## Processes + +### `$0` + +Initially, contains the name of the script being executed; +may be reassigned. + +### `$*` (\ARGV) + +Points to ARGV. + +English - `$ARGV`. + +### `$$` (Process ID) + +The process ID of the current process. Same as Process.pid. + +English - `$PROCESS_ID`, `$PID`. + +### `$?` (Child Status) + +Initially `nil`, otherwise the Process::Status object +created for the most-recently exited child process; +thread-local. + +English - `$CHILD_STATUS`. + +### `$LOAD_PATH` (Load Path) + +Contains the array of paths to be searched +by Kernel#load and Kernel#require. 
+ +Singleton method `$LOAD_PATH.resolve_feature_path(feature)` +returns: + +- <tt>[:rb, path]</tt>, where `path` is the path to the Ruby file to be + loaded for the given `feature`. +- <tt>[:so, path]</tt>, where `path` is the path to the shared object file + to be loaded for the given `feature`. +- `nil` if there is no such `feature` and `path`. + +Examples: + +```ruby +$LOAD_PATH.resolve_feature_path('timeout') +# => [:rb, "/snap/ruby/317/lib/ruby/3.2.0/timeout.rb"] +$LOAD_PATH.resolve_feature_path('date_core') +# => [:so, "/snap/ruby/317/lib/ruby/3.2.0/x86_64-linux/date_core.so"] +$LOAD_PATH.resolve_feature_path('foo') +# => nil +``` + +Aliased as `$:` and `$-I`. + +### `$LOADED_FEATURES` + +Contains an array of the paths to the loaded files: + +```ruby +$LOADED_FEATURES.take(10) +# => +["enumerator.so", + "thread.rb", + "fiber.so", + "rational.so", + "complex.so", + "ruby2_keywords.rb", + "/snap/ruby/317/lib/ruby/3.2.0/x86_64-linux/enc/encdb.so", + "/snap/ruby/317/lib/ruby/3.2.0/x86_64-linux/enc/trans/transdb.so", + "/snap/ruby/317/lib/ruby/3.2.0/x86_64-linux/rbconfig.rb", + "/snap/ruby/317/lib/ruby/3.2.0/rubygems/compatibility.rb"] +``` + +Aliased as `$"`. + +## Debugging + +### `$FILENAME` + +The value returned by method ARGF.filename. + +### `$DEBUG` + +Initially `true` if [command-line option `-d`] or +[`--debug`][command-line option `-d`] is given, otherwise initially `false`; +may be set to either value in the running program. + +When `true`, prints each raised exception to `$stderr`. + +Aliased as `$-d`. + +### `$VERBOSE` + +Initially `true` if [command-line option `-v`] or +[`-w`][command-line option `-w`] is given, otherwise initially `false`; +may be set to either value, or to `nil`, in the running program. + +When `true`, enables Ruby warnings. + +When `nil`, disables warnings, including those from Kernel#warn. + +Aliased as `$-v` and `$-w`. 
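The three `$VERBOSE` states can be compared with a small sketch. The helper name `capture_warning` is hypothetical; it temporarily swaps `$stderr` for a `StringIO`, which is an illustrative way to observe the output rather than anything this document prescribes.

```ruby
require 'stringio'

# Hypothetical helper: returns whatever Kernel#warn wrote to $stderr
# while $VERBOSE was set to the given value.
def capture_warning(verbose)
  saved_verbose, saved_stderr = $VERBOSE, $stderr
  $stderr = StringIO.new
  $VERBOSE = verbose
  warn 'careful!'
  $stderr.string
ensure
  # Always restore the original settings.
  $VERBOSE, $stderr = saved_verbose, saved_stderr
end

p capture_warning(true)   # => "careful!\n"
p capture_warning(false)  # => "careful!\n"
p capture_warning(nil)    # => ""
```

Only `nil` silences `Kernel#warn`; `false` merely turns off the extra warnings that `true` enables.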
+ +## Other Variables + +### `$-F` + +The default field separator in String#split; must be a String or a +Regexp, and can be set with [command-line option `-F`]. + +Setting to non-nil value by other than the command-line option is +deprecated. + +Aliased as `$;`. + +### `$-a` + +Whether [command-line option `-a`] was given; read-only. + +### `$-i` + +Contains the extension given with [command-line option `-i`], +or `nil` if none. + +An alias of ARGF.inplace_mode. + +### `$-l` + +Whether [command-line option `-l`] was set; read-only. + +### `$-p` + +Whether [command-line option `-p`] was given; read-only. + +### `$F` + +If the [command-line option `-a`] is given, the array +obtained by splitting `$_` by `$-F` is assigned at the start of each +`-l`/`-p` loop. + +## Deprecated + +### `$=` + +### `$,` + +# Pre-Defined Global Constants + +## Summary + +### Streams + +| Constant | Contains | +|:--------:|-------------------------| +| `STDIN` | Standard input stream. | +| `STDOUT` | Standard output stream. | +| `STDERR` | Standard error stream. | + +### Environment + +| Constant | Contains | +|-----------------------|-------------------------------------------------------------------------------| +| `ENV` | Hash of current environment variable names and values. | +| `ARGF` | Virtual concatenation of files given on the command line, or `$stdin` if none. | +| `ARGV` | Array of the given command-line arguments. | +| `TOPLEVEL_BINDING` | Binding of the top level scope. | +| `RUBY_VERSION` | String Ruby version. | +| `RUBY_RELEASE_DATE` | String Ruby release date. | +| `RUBY_PLATFORM` | String Ruby platform. | +| `RUBY_PATCHLEVEL` | Integer Ruby patch level. | +| `RUBY_REVISION` | String Ruby revision. | +| `RUBY_COPYRIGHT` | String Ruby copyright. | +| `RUBY_ENGINE` | String Ruby engine. | +| `RUBY_ENGINE_VERSION` | String Ruby engine version. | +| `RUBY_DESCRIPTION` | String Ruby description. 
| + +### Embedded \Data + +| Constant | Contains | +|:---------------------:|-------------------------------------------------------------------------------| +| `DATA` | File containing embedded data (lines following `__END__`, if any). | + +## Streams + +### `STDIN` + +The standard input stream (the default value for `$stdin`): + +```ruby +STDIN # => #<IO:<STDIN>> +``` + +### `STDOUT` + +The standard output stream (the default value for `$stdout`): + +```ruby +STDOUT # => #<IO:<STDOUT>> +``` + +### `STDERR` + +The standard error stream (the default value for `$stderr`): + +```ruby +STDERR # => #<IO:<STDERR>> +``` + +## Environment + +### `ENV` + +A hash of the current environment variable names and values: + +```ruby +ENV.take(5) +# => +[["COLORTERM", "truecolor"], + ["DBUS_SESSION_BUS_ADDRESS", "unix:path=/run/user/1000/bus"], + ["DESKTOP_SESSION", "ubuntu"], + ["DISPLAY", ":0"], + ["GDMSESSION", "ubuntu"]] +``` + +### `ARGF` + +The virtual concatenation of the files given on the command line, or from +`$stdin` if no files were given, `"-"` is given, or after +all files have been read. + +### `ARGV` + +An array of the given command-line arguments. + +### `TOPLEVEL_BINDING` + +The Binding of the top level scope: + +```ruby +TOPLEVEL_BINDING # => #<Binding:0x00007f58da0da7c0> +``` + +### `RUBY_VERSION` + +The Ruby version: + +```ruby +RUBY_VERSION # => "3.2.2" +``` + +### `RUBY_RELEASE_DATE` + +The release date string: + +```ruby +RUBY_RELEASE_DATE # => "2023-03-30" +``` + +### `RUBY_PLATFORM` + +The platform identifier: + +```ruby +RUBY_PLATFORM # => "x86_64-linux" +``` + +### `RUBY_PATCHLEVEL` + +The integer patch level for this Ruby: + +```ruby +RUBY_PATCHLEVEL # => 53 +``` + +For a development build the patch level will be -1. 
+ +### `RUBY_REVISION` + +The git commit hash for this Ruby: + +```ruby +RUBY_REVISION # => "e51014f9c05aa65cbf203442d37fef7c12390015" +``` + +### `RUBY_COPYRIGHT` + +The copyright string: + +```ruby +RUBY_COPYRIGHT +# => "ruby - Copyright (C) 1993-2023 Yukihiro Matsumoto" +``` + +### `RUBY_ENGINE` + +The name of the Ruby implementation: + +```ruby +RUBY_ENGINE # => "ruby" +``` + +### `RUBY_ENGINE_VERSION` + +The version of the Ruby implementation: + +```ruby +RUBY_ENGINE_VERSION # => "3.2.2" +``` + +### `RUBY_DESCRIPTION` + +The description of the Ruby implementation: + +```ruby +RUBY_DESCRIPTION +# => "ruby 3.2.2 (2023-03-30 revision e51014f9c0) [x86_64-linux]" +``` + +## Embedded \Data + +### `DATA` + +Defined if and only if the program has this line: + +```ruby +__END__ +``` + +When defined, `DATA` is a File object +containing the lines following the `__END__`, +positioned at the first of those lines: + +```ruby +p DATA +DATA.each_line { |line| p line } +__END__ +Foo +Bar +Baz +``` + +Output: + +``` +#<File:t.rb> +"Foo\n" +"Bar\n" +"Baz\n" +``` + + +[command-line option `-0`]: rdoc-ref:language/options.md@0-3A+Set+-24-2F+-28Input+Record+Separator-29 +[command-line option `-F`]: rdoc-ref:language/options.md@F-3A+Set+Input+Field+Separator +[command-line option `-a`]: rdoc-ref:language/options.md@a-3A+Split+Input+Lines+into+Fields +[command-line option `-d`]: rdoc-ref:language/options.md@d-3A+Set+-24DEBUG+to+true +[command-line option `-i`]: rdoc-ref:language/options.md@i-3A+Set+ARGF+In-Place+Mode +[command-line option `-l`]: rdoc-ref:language/options.md@l-3A+Set+Output+Record+Separator-3B+Chop+Lines +[command-line option `-p`]: rdoc-ref:language/options.md@p-3A+-n-2C+with+Printing +[command-line option `-v`]: rdoc-ref:language/options.md@v-3A+Print+Version-3B+Set+-24VERBOSE +[command-line option `-w`]: rdoc-ref:language/options.md@w-3A+Synonym+for+-W1 + diff --git a/doc/language/hash_inclusion.rdoc b/doc/language/hash_inclusion.rdoc new file mode 100644 index 
0000000000..05c2b0932a --- /dev/null +++ b/doc/language/hash_inclusion.rdoc @@ -0,0 +1,31 @@ +== \Hash Inclusion + +A hash is set-like in that it cannot have duplicate entries +(or even duplicate keys). +\Hash inclusion can therefore be based on the idea of +{subset and superset}[https://en.wikipedia.org/wiki/Subset]. + +Two hashes may be tested for inclusion, +based on comparisons of their entries. + +An entry <tt>h0[k0]</tt> in one hash +is equal to an entry <tt>h1[k1]</tt> in another hash +if and only if the two keys are equal (<tt>k0 == k1</tt>) +and their two values are equal (<tt>h0[k0] == h1[k1]</tt>). + +A hash may be a subset or a superset of another hash: + +- Subset (included in or equal to another): + + - \Hash +h0+ is a _subset_ of hash +h1+ (see Hash#<=) + if each entry in +h0+ is equal to an entry in +h1+. + - Further, +h0+ is a <i>proper subset</i> of +h1+ (see Hash#<) + if +h1+ is larger than +h0+. + +- Superset (including or equal to another): + + - \Hash +h0+ is a _superset_ of hash +h1+ (see Hash#>=) + if each entry in +h1+ is equal to an entry in +h0+. + - Further, +h0+ is a <i>proper superset</i> of +h1+ (see Hash#>) + if +h0+ is larger than +h1+. + diff --git a/doc/language/implicit_conversion.rdoc b/doc/language/implicit_conversion.rdoc new file mode 100644 index 0000000000..e244096125 --- /dev/null +++ b/doc/language/implicit_conversion.rdoc @@ -0,0 +1,221 @@ += Implicit Conversions + +Some Ruby methods accept one or more objects +that can be either: + +* <i>Of a given class</i>, and so accepted as is. +* <i>Implicitly convertible to that class</i>, in which case + the called method converts the object. + +For each of the relevant classes, the conversion is done by calling +a specific conversion method: + +* Array: +to_ary+ +* Hash: +to_hash+ +* Integer: +to_int+ +* String: +to_str+ + +== Array-Convertible Objects + +An <i>Array-convertible object</i> is an object that: + +* Has instance method +to_ary+. +* The method accepts no arguments. 
+* The method returns an object +obj+ for which <tt>obj.kind_of?(Array)</tt> returns +true+. + +The Ruby core class that satisfies these requirements is: + +* Array + +The examples in this section use method <tt>Array#replace</tt>, +which accepts an Array-convertible argument. + +This class is Array-convertible: + + class ArrayConvertible + def to_ary + [:foo, 'bar', 2] + end + end + a = [] + a.replace(ArrayConvertible.new) # => [:foo, "bar", 2] + +This class is not Array-convertible (no +to_ary+ method): + + class NotArrayConvertible; end + a = [] + # Raises TypeError (no implicit conversion of NotArrayConvertible into Array) + a.replace(NotArrayConvertible.new) + +This class is not Array-convertible (method +to_ary+ takes arguments): + + class NotArrayConvertible + def to_ary(x) + [:foo, 'bar', 2] + end + end + a = [] + # Raises ArgumentError (wrong number of arguments (given 0, expected 1)) + a.replace(NotArrayConvertible.new) + +This class is not Array-convertible (method +to_ary+ returns non-Array): + + class NotArrayConvertible + def to_ary + :foo + end + end + a = [] + # Raises TypeError (can't convert NotArrayConvertible to Array (NotArrayConvertible#to_ary gives Symbol)) + a.replace(NotArrayConvertible.new) + +== Hash-Convertible Objects + +A <i>Hash-convertible object</i> is an object that: + +* Has instance method +to_hash+. +* The method accepts no arguments. +* The method returns an object +obj+ for which <tt>obj.kind_of?(Hash)</tt> returns +true+. + +The Ruby core class that satisfies these requirements is: + +* Hash + +The examples in this section use method <tt>Hash#merge</tt>, +which accepts a Hash-convertible argument. 
+ +This class is Hash-convertible: + + class HashConvertible + def to_hash + {foo: 0, bar: 1, baz: 2} + end + end + h = {} + h.merge(HashConvertible.new) # => {:foo=>0, :bar=>1, :baz=>2} + +This class is not Hash-convertible (no +to_hash+ method): + + class NotHashConvertible; end + h = {} + # Raises TypeError (no implicit conversion of NotHashConvertible into Hash) + h.merge(NotHashConvertible.new) + +This class is not Hash-convertible (method +to_hash+ takes arguments): + + class NotHashConvertible + def to_hash(x) + {foo: 0, bar: 1, baz: 2} + end + end + h = {} + # Raises ArgumentError (wrong number of arguments (given 0, expected 1)) + h.merge(NotHashConvertible.new) + +This class is not Hash-convertible (method +to_hash+ returns non-Hash): + + class NotHashConvertible + def to_hash + :foo + end + end + h = {} + # Raises TypeError (can't convert NotHashConvertible to Hash (NotHashConvertible#to_hash gives Symbol)) + h.merge(NotHashConvertible.new) + +== Integer-Convertible Objects + +An <i>Integer-convertible object</i> is an object that: + +* Has instance method +to_int+. +* The method accepts no arguments. +* The method returns an object +obj+ for which <tt>obj.kind_of?(Integer)</tt> returns +true+. + +The Ruby core classes that satisfy these requirements are: + +* Integer +* Float +* Complex +* Rational + +The examples in this section use method <tt>Array.new</tt>, +which accepts an Integer-convertible argument. 
+ +This user-defined class is Integer-convertible: + + class IntegerConvertible + def to_int + 3 + end + end + a = Array.new(IntegerConvertible.new).size + a # => 3 + +This class is not Integer-convertible (method +to_int+ takes arguments): + + class NotIntegerConvertible + def to_int(x) + 3 + end + end + # Raises ArgumentError (wrong number of arguments (given 0, expected 1)) + Array.new(NotIntegerConvertible.new) + +This class is not Integer-convertible (method +to_int+ returns non-Integer): + + class NotIntegerConvertible + def to_int + :foo + end + end + # Raises TypeError (can't convert NotIntegerConvertible to Integer (NotIntegerConvertible#to_int gives Symbol)) + Array.new(NotIntegerConvertible.new) + +== String-Convertible Objects + +A <i>String-convertible object</i> is an object that: +* Has instance method +to_str+. +* The method accepts no arguments. +* The method returns an object +obj+ for which <tt>obj.kind_of?(String)</tt> returns +true+. + +The Ruby core class that satisfies these requirements is: + +* String + +The examples in this section use method <tt>String::new</tt>, +which accepts a String-convertible argument. 
+ +This class is String-convertible: + + class StringConvertible + def to_str + 'foo' + end + end + String.new(StringConvertible.new) # => "foo" + +This class is not String-convertible (no +to_str+ method): + + class NotStringConvertible; end + # Raises TypeError (no implicit conversion of NotStringConvertible into String) + String.new(NotStringConvertible.new) + +This class is not String-convertible (method +to_str+ takes arguments): + + class NotStringConvertible + def to_str(x) + 'foo' + end + end + # Raises ArgumentError (wrong number of arguments (given 0, expected 1)) + String.new(NotStringConvertible.new) + +This class is not String-convertible (method +to_str+ returns non-String): + + class NotStringConvertible + def to_str + :foo + end + end + # Raises TypeError (can't convert NotStringConvertible to String (NotStringConvertible#to_str gives Symbol)) + String.new(NotStringConvertible.new) diff --git a/doc/marshal.rdoc b/doc/language/marshal.rdoc index a51f1bf873..740064ade6 100644 --- a/doc/marshal.rdoc +++ b/doc/language/marshal.rdoc @@ -73,7 +73,7 @@ The first byte has the following special values: a positive little-endian integer. "\xfd":: - The total size of the integer is two bytes. The following three bytes are a + The total size of the integer is four bytes. The following three bytes are a negative little-endian integer. "\x04":: @@ -83,7 +83,7 @@ The first byte has the following special values: of stream objects full precision may be used. "\xfc":: - The total size of the integer is two bytes. The following four bytes are a + The total size of the integer is five bytes. The following four bytes are a negative little-endian integer. For compatibility with 32 bit ruby, only Fixnums greater than -10737341824 should be represented this way. For sizes of stream objects full precision may be used. 
@@ -188,9 +188,9 @@ bytes: result += (byte * 2 ** (exp * 8)) end -=== Class and Module +=== +Class+ and +Module+ -"c" represents a Class object, "m" represents a Module and "M" represents +"c" represents a +Class+ object, "m" represents a +Module+ and "M" represents either a class or module (this is an old-style for compatibility). No class or module content is included, this type is only a reference. Following the type byte is a byte sequence which is used to look up an existing class or @@ -301,6 +301,11 @@ sequence containing the user-defined representation of the object. The class method +_load+ is called on the class with a string created from the byte-sequence. +This type is not recommended for newly created classes, because of some +restrictions: + +- cannot have recursive reference + === User Marshal "U" represents an object with a user-defined serialization format using the diff --git a/doc/language/option_dump.md b/doc/language/option_dump.md new file mode 100644 index 0000000000..a156484bf6 --- /dev/null +++ b/doc/language/option_dump.md @@ -0,0 +1,265 @@ +# Option `--dump` + +For other argument values, +see {Option --dump}[options_md.html#label--dump-3A+Dump+Items]. + +For the examples here, we use this program: + +```console +$ cat t.rb +puts 'Foo' +``` + +The supported dump items: + +- `insns`: Instruction sequences: + + ```sh + $ ruby --dump=insns t.rb + == disasm: #<ISeq:<main>@t.rb:1 (1,0)-(1,10)> (catch: FALSE) + 0000 putself ( 1)[Li] + 0001 putstring "Foo" + 0003 opt_send_without_block <calldata!mid:puts, argc:1, FCALL|ARGS_SIMPLE> + 0005 leave + ``` + +- `parsetree`: {Abstract syntax tree}[https://en.wikipedia.org/wiki/Abstract_syntax_tree] + (AST): + + ```console + $ ruby --dump=parsetree t.rb + ########################################################### + ## Do NOT use this node dump for any purpose other than ## + ## debug and research. Compatibility is not guaranteed. 
## + ########################################################### + + # @ NODE_SCOPE (line: 1, location: (1,0)-(1,10)) + # +- nd_tbl: (empty) + # +- nd_args: + # | (null node) + # +- nd_body: + # @ NODE_FCALL (line: 1, location: (1,0)-(1,10))* + # +- nd_mid: :puts + # +- nd_args: + # @ NODE_LIST (line: 1, location: (1,5)-(1,10)) + # +- nd_alen: 1 + # +- nd_head: + # | @ NODE_STR (line: 1, location: (1,5)-(1,10)) + # | +- nd_lit: "Foo" + # +- nd_next: + # (null node) + ``` + +- `yydebug`: Debugging information from yacc parser generator: + + ``` + $ ruby --dump=yydebug t.rb + Starting parse + Entering state 0 + Reducing stack by rule 1 (line 1295): + lex_state: NONE -> BEG at line 1296 + vtable_alloc:12392: 0x0000558453df1a00 + vtable_alloc:12393: 0x0000558453df1a60 + cmdarg_stack(push): 0 at line 12406 + cond_stack(push): 0 at line 12407 + -> $$ = nterm $@1 (1.0-1.0: ) + Stack now 0 + Entering state 2 + Reading a token: + lex_state: BEG -> CMDARG at line 9049 + Next token is token "local variable or method" (1.0-1.4: puts) + Shifting token "local variable or method" (1.0-1.4: puts) + Entering state 35 + Reading a token: Next token is token "string literal" (1.5-1.6: ) + Reducing stack by rule 742 (line 5567): + $1 = token "local variable or method" (1.0-1.4: puts) + -> $$ = nterm operation (1.0-1.4: ) + Stack now 0 2 + Entering state 126 + Reducing stack by rule 78 (line 1794): + $1 = nterm operation (1.0-1.4: ) + -> $$ = nterm fcall (1.0-1.4: ) + Stack now 0 2 + Entering state 80 + Next token is token "string literal" (1.5-1.6: ) + Reducing stack by rule 292 (line 2723): + cmdarg_stack(push): 1 at line 2737 + -> $$ = nterm $@16 (1.4-1.4: ) + Stack now 0 2 80 + Entering state 235 + Next token is token "string literal" (1.5-1.6: ) + Shifting token "string literal" (1.5-1.6: ) + Entering state 216 + Reducing stack by rule 607 (line 4706): + -> $$ = nterm string_contents (1.6-1.6: ) + Stack now 0 2 80 235 216 + Entering state 437 + Reading a token: Next token is token 
"literal content" (1.6-1.9: "Foo") + Shifting token "literal content" (1.6-1.9: "Foo") + Entering state 503 + Reducing stack by rule 613 (line 4802): + $1 = token "literal content" (1.6-1.9: "Foo") + -> $$ = nterm string_content (1.6-1.9: ) + Stack now 0 2 80 235 216 437 + Entering state 507 + Reducing stack by rule 608 (line 4716): + $1 = nterm string_contents (1.6-1.6: ) + $2 = nterm string_content (1.6-1.9: ) + -> $$ = nterm string_contents (1.6-1.9: ) + Stack now 0 2 80 235 216 + Entering state 437 + Reading a token: + lex_state: CMDARG -> END at line 7276 + Next token is token "terminator" (1.9-1.10: ) + Shifting token "terminator" (1.9-1.10: ) + Entering state 508 + Reducing stack by rule 590 (line 4569): + $1 = token "string literal" (1.5-1.6: ) + $2 = nterm string_contents (1.6-1.9: ) + $3 = token "terminator" (1.9-1.10: ) + -> $$ = nterm string1 (1.5-1.10: ) + Stack now 0 2 80 235 + Entering state 109 + Reducing stack by rule 588 (line 4559): + $1 = nterm string1 (1.5-1.10: ) + -> $$ = nterm string (1.5-1.10: ) + Stack now 0 2 80 235 + Entering state 108 + Reading a token: + lex_state: END -> BEG at line 9200 + Next token is token '\n' (1.10-1.10: ) + Reducing stack by rule 586 (line 4541): + $1 = nterm string (1.5-1.10: ) + -> $$ = nterm strings (1.5-1.10: ) + Stack now 0 2 80 235 + Entering state 107 + Reducing stack by rule 307 (line 2837): + $1 = nterm strings (1.5-1.10: ) + -> $$ = nterm primary (1.5-1.10: ) + Stack now 0 2 80 235 + Entering state 90 + Next token is token '\n' (1.10-1.10: ) + Reducing stack by rule 261 (line 2553): + $1 = nterm primary (1.5-1.10: ) + -> $$ = nterm arg (1.5-1.10: ) + Stack now 0 2 80 235 + Entering state 220 + Next token is token '\n' (1.10-1.10: ) + Reducing stack by rule 270 (line 2586): + $1 = nterm arg (1.5-1.10: ) + -> $$ = nterm arg_value (1.5-1.10: ) + Stack now 0 2 80 235 + Entering state 221 + Next token is token '\n' (1.10-1.10: ) + Reducing stack by rule 297 (line 2779): + $1 = nterm arg_value (1.5-1.10: ) + 
-> $$ = nterm args (1.5-1.10: ) + Stack now 0 2 80 235 + Entering state 224 + Next token is token '\n' (1.10-1.10: ) + Reducing stack by rule 772 (line 5626): + -> $$ = nterm none (1.10-1.10: ) + Stack now 0 2 80 235 224 + Entering state 442 + Reducing stack by rule 296 (line 2773): + $1 = nterm none (1.10-1.10: ) + + -> $$ = nterm opt_block_arg (1.10-1.10: ) + Stack now 0 2 80 235 224 + Entering state 441 + Reducing stack by rule 288 (line 2696): + $1 = nterm args (1.5-1.10: ) + $2 = nterm opt_block_arg (1.10-1.10: ) + -> $$ = nterm call_args (1.5-1.10: ) + Stack now 0 2 80 235 + Entering state 453 + Reducing stack by rule 293 (line 2723): + $1 = nterm $@16 (1.4-1.4: ) + $2 = nterm call_args (1.5-1.10: ) + cmdarg_stack(pop): 0 at line 2754 + -> $$ = nterm command_args (1.4-1.10: ) + Stack now 0 2 80 + Entering state 333 + Next token is token '\n' (1.10-1.10: ) + Reducing stack by rule 79 (line 1804): + $1 = nterm fcall (1.0-1.4: ) + $2 = nterm command_args (1.4-1.10: ) + -> $$ = nterm command (1.0-1.10: ) + Stack now 0 2 + Entering state 81 + Next token is token '\n' (1.10-1.10: ) + Reducing stack by rule 73 (line 1770): + $1 = nterm command (1.0-1.10: ) + -> $$ = nterm command_call (1.0-1.10: ) + Stack now 0 2 + Entering state 78 + Reducing stack by rule 51 (line 1659): + $1 = nterm command_call (1.0-1.10: ) + -> $$ = nterm expr (1.0-1.10: ) + Stack now 0 2 + Entering state 75 + Next token is token '\n' (1.10-1.10: ) + Reducing stack by rule 39 (line 1578): + $1 = nterm expr (1.0-1.10: ) + -> $$ = nterm stmt (1.0-1.10: ) + Stack now 0 2 + Entering state 73 + Next token is token '\n' (1.10-1.10: ) + Reducing stack by rule 8 (line 1354): + $1 = nterm stmt (1.0-1.10: ) + -> $$ = nterm top_stmt (1.0-1.10: ) + Stack now 0 2 + Entering state 72 + Reducing stack by rule 5 (line 1334): + $1 = nterm top_stmt (1.0-1.10: ) + -> $$ = nterm top_stmts (1.0-1.10: ) + Stack now 0 2 + Entering state 71 + Next token is token '\n' (1.10-1.10: ) + Shifting token '\n' (1.10-1.10: ) + 
Entering state 311 + Reducing stack by rule 769 (line 5618): + $1 = token '\n' (1.10-1.10: ) + -> $$ = nterm term (1.10-1.10: ) + Stack now 0 2 71 + Entering state 313 + Reducing stack by rule 770 (line 5621): + $1 = nterm term (1.10-1.10: ) + -> $$ = nterm terms (1.10-1.10: ) + Stack now 0 2 71 + Entering state 314 + Reading a token: Now at end of input. + Reducing stack by rule 759 (line 5596): + $1 = nterm terms (1.10-1.10: ) + -> $$ = nterm opt_terms (1.10-1.10: ) + Stack now 0 2 71 + Entering state 312 + Reducing stack by rule 3 (line 1321): + $1 = nterm top_stmts (1.0-1.10: ) + $2 = nterm opt_terms (1.10-1.10: ) + -> $$ = nterm top_compstmt (1.0-1.10: ) + Stack now 0 2 + Entering state 70 + Reducing stack by rule 2 (line 1295): + $1 = nterm $@1 (1.0-1.0: ) + $2 = nterm top_compstmt (1.0-1.10: ) + vtable_free:12426: p->lvtbl->args(0x0000558453df1a00) + vtable_free:12427: p->lvtbl->vars(0x0000558453df1a60) + cmdarg_stack(pop): 0 at line 12428 + cond_stack(pop): 0 at line 12429 + -> $$ = nterm program (1.0-1.10: ) + Stack now 0 + Entering state 1 + Now at end of input. + Shifting token "end-of-input" (1.10-1.10: ) + Entering state 3 + Stack now 0 1 3 + Cleanup: popping token "end-of-input" (1.10-1.10: ) + Cleanup: popping nterm program (1.0-1.10: ) + ``` + +Additional flags can follow dump items. + +- `+comment`: Add comments to AST. +- `+error-tolerant`: Parse in error-tolerant mode. +- `-optimize`: Disable optimizations for instruction sequences. 
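The `insns` dump item has an in-process counterpart. As a rough sketch, assuming CRuby (where `RubyVM::InstructionSequence` is available; other implementations may not provide it), a similar disassembly can be produced from a string of source code:

```ruby
# CRuby-specific: RubyVM is an implementation detail of the reference interpreter.
iseq = RubyVM::InstructionSequence.compile("puts 'Foo'")

# #disasm returns a String similar in shape to the output of `ruby --dump=insns`;
# the exact instructions vary between Ruby versions.
listing = iseq.disasm
puts listing
```

As with the command-line dump, the listing is meant for debugging and research; its format is not guaranteed to be stable.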
diff --git a/doc/language/options.md b/doc/language/options.md new file mode 100644 index 0000000000..3421d73f55 --- /dev/null +++ b/doc/language/options.md @@ -0,0 +1,688 @@ +# Ruby Command-Line Options + +## About the Examples + +Some examples here use command-line option `-e`, +which passes the Ruby code to be executed on the command line itself: + +```console +$ ruby -e 'puts "Hello, World."' +``` + +Some examples here assume that file `desiderata.txt` exists: + +```console +$ cat desiderata.txt +Go placidly amid the noise and the haste, +and remember what peace there may be in silence. +As far as possible, without surrender, +be on good terms with all persons. +``` + +## Options + +### `-0`: Set `$/` (Input Record Separator) + +Option `-0` defines the input record separator `$/` +for the invoked Ruby program. + +The optional argument to the option must be octal digits, +each in the range `0..7`; +these digits are prefixed with digit `0` to form an octal value. + +If no argument is given, the input record separator is `0x00`. + +If an argument is given, it must immediately follow the option +(no intervening whitespace or equal-sign character `'='`); +argument values: + +- `0`: the input record separator is `''`; + see {Special Line Separator Values}[rdoc-ref:IO@Special+Line+Separator+Values]. +- In range `(1..0377)`: + the input record separator `$/` is set to the character value of the argument. +- Any other octal value: the input record separator is `nil`. + +Examples: + +```console +$ ruby -0 -e 'p $/' +"\x00" +$ ruby -00 -e 'p $/' +"" +$ ruby -012 -e 'p $/' +"\n" +$ ruby -015 -e 'p $/' +"\r" +$ ruby -0377 -e 'p $/' +"\xFF" +$ ruby -0400 -e 'p $/' +nil +``` + +See also: + +- {Option -a}[rdoc-ref:@a-3A+Split+Input+Lines+into+Fields]: + Split input lines into fields. +- {Option -F}[rdoc-ref:@F-3A+Set+Input+Field+Separator]: + Set input field separator. 
+- {Option -l}[rdoc-ref:@l-3A+Set+Output+Record+Separator-3B+Chop+Lines]: + Set output record separator; chop lines. +- {Option -n}[rdoc-ref:@n-3A+Run+Program+in+gets+Loop]: + Run program in `gets` loop. +- {Option -p}[rdoc-ref:@p-3A+-n-2C+with+Printing]: + `-n`, with printing. + +### `-a`: Split Input Lines into Fields + +Option `-a`, when given with either of options `-n` or `-p`, +splits the string at `$_` into an array of strings at `$F`: + +```console +$ ruby -an -e 'p $F' desiderata.txt +["Go", "placidly", "amid", "the", "noise", "and", "the", "haste,"] +["and", "remember", "what", "peace", "there", "may", "be", "in", "silence."] +["As", "far", "as", "possible,", "without", "surrender,"] +["be", "on", "good", "terms", "with", "all", "persons."] +``` + +For the splitting, +the default record separator is `$/`, +and the default field separator is `$;`. + +See also: + +- {Option -0}[rdoc-ref:@0-3A+Set+-24-2F+-28Input+Record+Separator-29]: + Set `$/` (input record separator). +- {Option -F}[rdoc-ref:@F-3A+Set+Input+Field+Separator]: + Set input field separator. +- {Option -l}[rdoc-ref:@l-3A+Set+Output+Record+Separator-3B+Chop+Lines]: + Set output record separator; chop lines. +- {Option -n}[rdoc-ref:@n-3A+Run+Program+in+gets+Loop]: + Run program in `gets` loop. +- {Option -p}[rdoc-ref:@p-3A+-n-2C+with+Printing]: + `-n`, with printing. + +### `-c`: Check Syntax + +Option `-c` specifies that the specified Ruby program +should be checked for syntax, but not actually executed: + +```console +$ ruby -e 'puts "Foo"' +Foo +$ ruby -c -e 'puts "Foo"' +Syntax OK +``` + +### `-C`: Set Working Directory + +The argument to option `-C` specifies a working directory +for the invoked Ruby program; +does not change the working directory for the current process: + +```console +$ basename `pwd` +ruby +$ ruby -C lib -e 'puts File.basename(Dir.pwd)' +lib +$ basename `pwd` +ruby +``` + +Whitespace between the option and its argument may be omitted. 
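Inside a program, a comparably scoped change of directory is available through `Dir.chdir` with a block. The sketch below mirrors the console example above; the helper name `basename_in` is made up for illustration, and a temporary directory stands in for the `lib` directory of the example.

```ruby
require "tmpdir"

# Hypothetical helper: runs in +dir+ and reports the basename seen there.
# The working directory is restored when the block exits.
def basename_in(dir)
  Dir.chdir(dir) { File.basename(Dir.pwd) }
end

before = Dir.pwd
Dir.mktmpdir do |tmp|
  lib = File.join(tmp, "lib")
  Dir.mkdir(lib)
  puts basename_in(lib)   # prints "lib"
end
puts Dir.pwd == before    # prints "true"
```

This is an in-process analogy only; option `-C` itself takes effect before the program starts.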
+ +### `-d`: Set `$DEBUG` to `true` + +Some code in (or called by) the Ruby program may include statements or blocks +conditioned by the global variable `$DEBUG` (e.g., `if $DEBUG`); +these commonly write to `$stdout` or `$stderr`. + +The default value for `$DEBUG` is `false`; +option `-d` sets it to `true`: + +```console +$ ruby -e 'p $DEBUG' +false +$ ruby -d -e 'p $DEBUG' +true +``` + +Option `--debug` is an alias for option `-d`. + +### `-e`: Execute Given Ruby Code + +Option `-e` requires an argument, which is Ruby code to be executed; +the option may be given more than once: + +```console +$ ruby -e 'puts "Foo"' -e 'puts "Bar"' +Foo +Bar +``` + +Whitespace between the option and its argument may be omitted. + +The command may include other options, +but should not include arguments (which, if given, are ignored). + +### `-E`: Set Default Encodings + +Option `-E` requires an argument, which specifies either the default external encoding, +or both the default external and internal encodings for the invoked Ruby program: + +```console +# No option -E. +$ ruby -e 'p [Encoding::default_external, Encoding::default_internal]' +[#<Encoding:UTF-8>, nil] +# Option -E with default external encoding. +$ ruby -E cesu-8 -e 'p [Encoding::default_external, Encoding::default_internal]' +[#<Encoding:CESU-8>, nil] +# Option -E with default external and internal encodings. +$ ruby -E utf-8:cesu-8 -e 'p [Encoding::default_external, Encoding::default_internal]' +[#<Encoding:UTF-8>, #<Encoding:CESU-8>] +``` + +Whitespace between the option and its argument may be omitted. + +See also: + +- {Option --external-encoding}[options_md.html#label--external-encoding-3A+Set+Default+External+Encoding]: + Set default external encoding. +- {Option --internal-encoding}[options_md.html#label--internal-encoding-3A+Set+Default+Internal+Encoding]: + Set default internal encoding. + +Option `--encoding` is an alias for option `-E`. 
+ +### `-F`: Set Input Field Separator + +Option `-F`, when given with option `-a`, +specifies that its argument is to be the input field separator to be used for splitting: + +```console +$ ruby -an -Fs -e 'p $F' desiderata.txt +["Go placidly amid the noi", "e and the ha", "te,\n"] +["and remember what peace there may be in ", "ilence.\n"] +["A", " far a", " po", "", "ible, without ", "urrender,\n"] +["be on good term", " with all per", "on", ".\n"] +``` + +The argument may be a regular expression: + +```console +$ ruby -an -F'[.,]\s*' -e 'p $F' desiderata.txt +["Go placidly amid the noise and the haste"] +["and remember what peace there may be in silence"] +["As far as possible", "without surrender"] +["be on good terms with all persons"] +``` + +The argument must immediately follow the option +(no intervening whitespace or equal-sign character `'='`). + +See also: + +- {Option -0}[rdoc-ref:@0-3A+Set+-24-2F+-28Input+Record+Separator-29]: + Set `$/` (input record separator). +- {Option -a}[rdoc-ref:@a-3A+Split+Input+Lines+into+Fields]: + Split input lines into fields. +- {Option -l}[rdoc-ref:@l-3A+Set+Output+Record+Separator-3B+Chop+Lines]: + Set output record separator; chop lines. +- {Option -n}[rdoc-ref:@n-3A+Run+Program+in+gets+Loop]: + Run program in `gets` loop. +- {Option -p}[rdoc-ref:@p-3A+-n-2C+with+Printing]: + `-n`, with printing. + +### `-h`: Print Short Help Message + +Option `-h` prints a short help message +that includes single-hyphen options (e.g. `-I`), +and largely omits double-hyphen options (e.g., `--version`). + +Arguments and additional options are ignored. + +For a longer help message, use option `--help`. 
+ +### `-i`: Set \ARGF In-Place Mode + +Option `-i` sets the \ARGF in-place mode for the invoked Ruby program; +see ARGF#inplace_mode=: + +```console +$ ruby -e 'p ARGF.inplace_mode' +nil +$ ruby -i -e 'p ARGF.inplace_mode' +"" +$ ruby -i.bak -e 'p ARGF.inplace_mode' +".bak" +``` + +### `-I`: Add to `$LOAD_PATH` + +The argument to option `-I` specifies a directory +to be added to the array in global variable `$LOAD_PATH`; +the option may be given more than once: + +```console +$ pushd /tmp +$ ruby -e 'p $LOAD_PATH.size' +8 +$ ruby -I my_lib -I some_lib -e 'p $LOAD_PATH.size' +10 +$ ruby -I my_lib -I some_lib -e 'p $LOAD_PATH.take(2)' +["/tmp/my_lib", "/tmp/some_lib"] +$ popd +``` + +Whitespace between the option and its argument may be omitted. + +### `-l`: Set Output Record Separator; Chop Lines + +Option `-l`, when given with option `-n` or `-p`, +modifies line-ending processing by: + +- Setting global variable output record separator `$\` + to the current value of input record separator `$/`; + this affects line-oriented output (such as the output from Kernel#puts). +- Calling String#chop! on each line read. + +Without option `-l` (unchopped): + +```console +$ ruby -n -e 'p $_' desiderata.txt +"Go placidly amid the noise and the haste,\n" +"and remember what peace there may be in silence.\n" +"As far as possible, without surrender,\n" +"be on good terms with all persons.\n" +``` + +With option `-l` (chopped): + +```console +$ ruby -ln -e 'p $_' desiderata.txt +"Go placidly amid the noise and the haste," +"and remember what peace there may be in silence." +"As far as possible, without surrender," +"be on good terms with all persons." +``` + +See also: + +- {Option -0}[rdoc-ref:@0-3A+Set+-24-2F+-28Input+Record+Separator-29]: + Set `$/` (input record separator). +- {Option -a}[rdoc-ref:@a-3A+Split+Input+Lines+into+Fields]: + Split input lines into fields. +- {Option -F}[rdoc-ref:@F-3A+Set+Input+Field+Separator]: + Set input field separator. 
+- {Option -n}[rdoc-ref:@n-3A+Run+Program+in+gets+Loop]: + Run program in `gets` loop. +- {Option -p}[rdoc-ref:@p-3A+-n-2C+with+Printing]: + `-n`, with printing. + +### `-n`: Run Program in `gets` Loop + +Option `-n` runs your program in a `Kernel#gets` loop: + +```ruby +while gets + # Your Ruby code. +end +``` + +Note that `gets` reads the next line and sets global variable `$_` +to the last read line: + +```console +$ ruby -n -e 'puts $_' desiderata.txt +Go placidly amid the noise and the haste, +and remember what peace there may be in silence. +As far as possible, without surrender, +be on good terms with all persons. +``` + +See also: + +- {Option -0}[rdoc-ref:@0-3A+Set+-24-2F+-28Input+Record+Separator-29]: + Set `$/` (input record separator). +- {Option -a}[rdoc-ref:@a-3A+Split+Input+Lines+into+Fields]: + Split input lines into fields. +- {Option -F}[rdoc-ref:@F-3A+Set+Input+Field+Separator]: + Set input field separator. +- {Option -l}[rdoc-ref:@l-3A+Set+Output+Record+Separator-3B+Chop+Lines]: + Set output record separator; chop lines. +- {Option -p}[rdoc-ref:@p-3A+-n-2C+with+Printing]: + `-n`, with printing. + +### `-p`: `-n`, with Printing + +Option `-p` is like option `-n`, but also prints each line: + +```console +$ ruby -p -e 'puts $_.size' desiderata.txt +42 +Go placidly amid the noise and the haste, +49 +and remember what peace there may be in silence. +39 +As far as possible, without surrender, +35 +be on good terms with all persons. +``` + +See also: + +- {Option -0}[rdoc-ref:@0-3A+Set+-24-2F+-28Input+Record+Separator-29]: + Set `$/` (input record separator). +- {Option -a}[rdoc-ref:@a-3A+Split+Input+Lines+into+Fields]: + Split input lines into fields. +- {Option -F}[rdoc-ref:@F-3A+Set+Input+Field+Separator]: + Set input field separator. +- {Option -l}[rdoc-ref:@l-3A+Set+Output+Record+Separator-3B+Chop+Lines]: + Set output record separator; chop lines. +- {Option -n}[rdoc-ref:@n-3A+Run+Program+in+gets+Loop]: + Run program in `gets` loop. 
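The loop that options `-n` and `-p` wrap around a program can be approximated in plain Ruby. This is only a sketch under stated assumptions: `gets_loop` is a hypothetical helper, not part of Ruby; it reads from any IO-like object rather than from the files on the command line, and it collects values in an array instead of writing to `$stdout` or setting `$_` and `$F`.

```ruby
require "stringio"

# Hypothetical helper (not part of Ruby): approximates the loop that
# options -n and -p wrap around a program.
def gets_loop(io, print_line: false, autosplit: false)
  output = []
  while (line = io.gets)                    # -n: read until end of input
    fields = autosplit ? line.split : nil   # -a: fields, comparable to $F
    output << yield(line, fields)           # the program body
    output << line if print_line            # -p: emit the line after the body
  end
  output
end

text = "Go placidly\nand remember\n"

gets_loop(StringIO.new(text), print_line: true) { |line, _| line.size }
# => [12, "Go placidly\n", 13, "and remember\n"]

gets_loop(StringIO.new(text), autosplit: true) { |_, fields| fields }
# => [["Go", "placidly"], ["and", "remember"]]
```

The first call corresponds to the `-p` example above: the body's result comes first, followed by the line itself.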
+ +### `-r`: Require Library + +The argument to option `-r` specifies a library to be required +before executing the Ruby program; +the option may be given more than once: + +```console +$ ruby -e 'p defined?(JSON); p defined?(CSV)' +nil +nil +$ ruby -r CSV -r JSON -e 'p defined?(JSON); p defined?(CSV)' +"constant" +"constant" +``` + +Whitespace between the option and its argument may be omitted. + +### `-s`: Define Global Variable + +Option `-s` specifies that a "custom option" is to define a global variable +in the invoked Ruby program: + +- The custom option must appear _after_ the program name. +- The custom option must begin with a single hyphen (e.g., `-foo`), + not two hyphens (e.g., `--foo`). +- The name of the global variable is based on the option name: + global variable `$foo` for custom option `-foo`. +- The value of the global variable is the string option argument if given, + `true` otherwise. + +More than one custom option may be given: + +```console +$ cat t.rb +p [$foo, $bar] +$ ruby t.rb +[nil, nil] +$ ruby -s t.rb -foo=baz +["baz", nil] +$ ruby -s t.rb -foo +[true, nil] +$ ruby -s t.rb -foo=baz -bar=bat +["baz", "bat"] +``` + +The option may not be used with +{option -e}[rdoc-ref:@e-3A+Execute+Given+Ruby+Code]. + +### `-S`: Search Directories in `ENV['PATH']` + +Option `-S` specifies that the Ruby interpreter +is to search (if necessary) the directories whose paths are in the program's +`PATH` environment variable; +the program is executed in the shell's current working directory +(not necessarily in the directory where the program is found). 
+ +This example adds path `/tmp` to the `PATH` environment variable: + +```console +$ export PATH=/tmp:$PATH +$ echo "puts File.basename(Dir.pwd)" > /tmp/t.rb +$ ruby -S t.rb +ruby +``` + +### `-v`: Print Version; Set `$VERBOSE` + +Option `-v` prints the Ruby version and sets global variable `$VERBOSE`: + +```console +$ ruby -e 'p $VERBOSE' +false +$ ruby -v -e 'p $VERBOSE' +ruby 3.3.0 (2023-12-25 revision 5124f9ac75) [x64-mingw-ucrt] +true +``` + +### `-w`: Synonym for `-W1` + +Option `-w` (lowercase letter) is equivalent to option `-W1` (uppercase letter). + +### `-W`: Set \Warning Policy + +Any Ruby code can create a <i>warning message</i> by calling method Kernel#warn; +methods in the Ruby core and standard libraries can also create warning messages. +Such a message may be printed on `$stderr` +(or not, depending on certain settings). + +Option `-W` helps determine whether a particular warning message +will be written, +by setting the initial value of global variable `$-W`: + +- `-W0`: Sets `$-W` to `0` (silent; no warnings). +- `-W1`: Sets `$-W` to `1` (moderate verbosity). +- `-W2`: Sets `$-W` to `2` (high verbosity). +- `-W`: Same as `-W2` (high verbosity). +- Option not given: Same as `-W1` (moderate verbosity). 
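The effect of these levels on Kernel#warn can be observed from inside a program through `$VERBOSE`, whose values `nil`, `false`, and `true` correspond to levels 0, 1, and 2. The following is a minimal sketch; the helper name `capture_warning` is hypothetical, and it temporarily replaces `$stderr` with a `StringIO` so the message can be inspected.

```ruby
require "stringio"

# Hypothetical helper: returns what Kernel#warn writes at a given verbosity.
def capture_warning(verbose)
  saved_verbose = $VERBOSE
  saved_stderr = $stderr
  $VERBOSE = verbose
  $stderr = StringIO.new
  warn "careful"
  $stderr.string
ensure
  # Restore the global state even if something above raises.
  $VERBOSE = saved_verbose
  $stderr = saved_stderr
end

capture_warning(nil)    # => ""           (as with -W0)
capture_warning(false)  # => "careful\n"  (as with -W1)
capture_warning(true)   # => "careful\n"  (as with -W2)
```

Messages from `warn` itself appear at both non-silent levels; the difference between the two shows up in warnings that Ruby emits only at high verbosity.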
+ +The value of `$-W`, in turn, determines which warning messages (if any) +are to be printed to `$stderr` (see Kernel#warn): + +```console +$ ruby -W1 -e 'p $foo' +nil +$ ruby -W2 -e 'p $foo' +-e:1: warning: global variable '$foo' not initialized +nil +``` + +Ruby code may also define warnings for certain categories; +these are the default settings for the defined categories: + +```rb +Warning[:experimental] # => true +Warning[:deprecated] # => false +Warning[:performance] # => false +``` + +They may also be set: + +```rb +Warning[:experimental] = false +Warning[:deprecated] = true +Warning[:performance] = true +``` + +You can suppress a category by prefixing `no-` to the category name: + +```console +$ ruby -W:no-experimental -e 'p IO::Buffer.new' +#<IO::Buffer> +``` + +### `-x`: Execute Ruby Code Found in Text + +Option `-x` executes a Ruby program whose code is embedded +in other, non-code, text. + +The Ruby code: + +- Begins after the first line beginning with `'#!'` and containing string `'ruby'`. +- Ends before any one of: + + - End-of-file. + - A line consisting of `'__END__'`. + - Character `Ctrl-D` or `Ctrl-Z`. + +Example: + +```console +$ cat t.txt +Leading garbage. +#!ruby +puts File.basename(Dir.pwd) +__END__ +Trailing garbage. + +$ ruby -x t.txt +ruby +``` + +The optional argument specifies the directory where the text file +is to be found; +the Ruby code is executed in that directory: + +```console +$ cp t.txt /tmp/ +$ ruby -x/tmp t.txt +tmp +$ + +``` + +If an argument is given, it must immediately follow the option +(no intervening whitespace or equal-sign character `'='`). + +### `--backtrace-limit`: Set Backtrace Limit + +Option `--backtrace-limit` sets a limit on the number of entries +to be displayed in a backtrace. + +See Thread::Backtrace.limit. 
+ +### `--copyright`: Print Ruby Copyright + +Option `--copyright` prints a copyright message: + +```console +$ ruby --copyright +ruby - Copyright (C) 1993-2024 Yukihiro Matsumoto +``` + +### `--debug`: Alias for `-d` + +Option `--debug` is an alias for +{option -d}[rdoc-ref:@d-3A+Set+-24DEBUG+to+true]. + +### `--disable`: Disable Features + +Option `--disable` specifies features to be disabled; +the argument is a comma-separated list of the features to be disabled: + +```sh +ruby --disable=gems,rubyopt t.rb +``` + +The supported features: + +- `gems`: Rubygems (default: enabled). +- `did_you_mean`: [`did_you_mean`](https://github.com/ruby/did_you_mean) (default: enabled). +- `rubyopt`: `RUBYOPT` environment variable (default: enabled). +- `frozen-string-literal`: Freeze all string literals (default: disabled). +- `jit`: JIT compiler (default: disabled). + +See also {option --enable}[options_md.html#label--enable-3A+Enable+Features]. + +### `--dump`: Dump Items + +Option `--dump` specifies items to be dumped; +the argument is a comma-separated list of the items. + +Some of the argument values cause the command to behave as if a different +option was given: + +- `--dump=copyright`: + Same as {option \-\-copyright}[options_md.html#label--copyright-3A+Print+Ruby+Copyright]. +- `--dump=help`: + Same as {option \-\-help}[options_md.html#label--help-3A+Print+Help+Message]. +- `--dump=syntax`: + Same as {option -c}[rdoc-ref:@c-3A+Check+Syntax]. +- `--dump=usage`: + Same as {option -h}[rdoc-ref:@h-3A+Print+Short+Help+Message]. +- `--dump=version`: + Same as {option \-\-version}[options_md.html#label--version-3A+Print+Ruby+Version]. + +For other argument values and examples, +see {Option --dump}[option_dump_md.html]. + +### `--enable`: Enable Features + +Option `--enable` specifies features to be enabled; +the argument is a comma-separated list of the features to be enabled. 
+ +```sh +ruby --enable=gems,rubyopt t.rb +``` + +For the features, +see {option --disable}[options_md.html#label--disable-3A+Disable+Features]. + +### `--encoding`: Alias for `-E`. + +Option `--encoding` is an alias for +{option -E}[rdoc-ref:@E-3A+Set+Default+Encodings]. + +### `--external-encoding`: Set Default External \Encoding + +Option `--external-encoding` +sets the default external encoding for the invoked Ruby program; +for values of `encoding`, +see {Encoding: Names and Aliases}[rdoc-ref:encodings.rdoc@Names+and+Aliases]. + +```console +$ ruby -e 'puts Encoding::default_external' +UTF-8 +$ ruby --external-encoding=cesu-8 -e 'puts Encoding::default_external' +CESU-8 +``` + +### `--help`: Print Help Message + +Option `--help` prints a long help message. + +Arguments and additional options are ignored. + +For a shorter help message, use option `-h`. + +### `--internal-encoding`: Set Default Internal \Encoding + +Option `--internal-encoding` +sets the default internal encoding for the invoked Ruby program; +for values of `encoding`, +see {Encoding: Names and Aliases}[rdoc-ref:encodings.rdoc@Names+and+Aliases]. + +```console +$ ruby -e 'puts Encoding::default_internal.nil?' +true +$ ruby --internal-encoding=cesu-8 -e 'puts Encoding::default_internal' +CESU-8 +``` + +### `--jit` + +Option `--jit` is an alias for option `--yjit`, which enables YJIT; +see additional YJIT options in the [YJIT documentation](rdoc-ref:jit/yjit.md). + +### `--verbose`: Set `$VERBOSE` + +Option `--verbose` sets global variable `$VERBOSE` to `true` +and disables input from `$stdin`. + +### `--version`: Print Ruby Version + +Option `--version` prints the version of the Ruby interpreter, then exits. 
+ diff --git a/doc/language/packed_data.rdoc b/doc/language/packed_data.rdoc new file mode 100644 index 0000000000..597db5139f --- /dev/null +++ b/doc/language/packed_data.rdoc @@ -0,0 +1,722 @@ += Packed \Data + +== Quick Reference + +These tables summarize the directives for packing and unpacking. + +=== For Integers + + Directive | Meaning + --------------|--------------------------------------------------------------- + C | 8-bit unsigned (unsigned char) + S | 16-bit unsigned, native endian (uint16_t) + L | 32-bit unsigned, native endian (uint32_t) + Q | 64-bit unsigned, native endian (uint64_t) + J | pointer width unsigned, native endian (uintptr_t) + + c | 8-bit signed (signed char) + s | 16-bit signed, native endian (int16_t) + l | 32-bit signed, native endian (int32_t) + q | 64-bit signed, native endian (int64_t) + j | pointer width signed, native endian (intptr_t) + + S_ S! | unsigned short, native endian + I I_ I! | unsigned int, native endian + L_ L! | unsigned long, native endian + Q_ Q! | unsigned long long, native endian + | (raises ArgumentError if the platform has no long long type) + J! | uintptr_t, native endian (same with J) + + s_ s! | signed short, native endian + i i_ i! | signed int, native endian + l_ l! | signed long, native endian + q_ q! | signed long long, native endian + | (raises ArgumentError if the platform has no long long type) + j! 
| intptr_t, native endian (same with j) + + S> s> S!> s!> | each the same as the directive without >, but big endian + L> l> L!> l!> | S> is the same as n + I!> i!> | L> is the same as N + Q> q> Q!> q!> | + J> j> J!> j!> | + + S< s< S!< s!< | each the same as the directive without <, but little endian + L< l< L!< l!< | S< is the same as v + I!< i!< | L< is the same as V + Q< q< Q!< q!< | + J< j< J!< j!< | + + n | 16-bit unsigned, network (big-endian) byte order + N | 32-bit unsigned, network (big-endian) byte order + v | 16-bit unsigned, VAX (little-endian) byte order + V | 32-bit unsigned, VAX (little-endian) byte order + + U | UTF-8 character + w | BER-compressed integer + +=== For Floats + + Directive | Meaning + ----------|-------------------------------------------------- + D d | double-precision, native format + F f | single-precision, native format + E | double-precision, little-endian byte order + e | single-precision, little-endian byte order + G | double-precision, network (big-endian) byte order + g | single-precision, network (big-endian) byte order + +=== For Strings + + Directive | Meaning + ----------|----------------------------------------------------------------- + A | arbitrary binary string (remove trailing nulls and ASCII spaces) + a | arbitrary binary string + Z | null-terminated string + B | bit string (MSB first) + b | bit string (LSB first) + H | hex string (high nibble first) + h | hex string (low nibble first) + u | UU-encoded string + M | quoted-printable, MIME encoding (see RFC2045) + m | base64 encoded string (RFC 2045) (default) + | (base64 encoded string (RFC 4648) if followed by 0) + P | pointer to a structure (fixed-length string) + p | pointer to a null-terminated string + +=== Additional Directives for Packing + + Directive | Meaning + ----------|---------------------------------------------------------------- + @ | moves to absolute position + X | back up a byte + x | null byte + +=== Additional Directives for Unpacking + + 
Directive | Meaning + ----------|---------------------------------------------------------------- + @ | skip to the offset given by the length argument + X | skip backward one byte + x | skip forward one byte + +== Packing and Unpacking + +Certain Ruby core methods deal with packing and unpacking data: + +- Method Array#pack: + Formats each element in array +self+ into a binary string; + returns that string. +- Method String#unpack: + Extracts data from string +self+, + forming objects that become the elements of a new array; + returns that array. +- Method String#unpack1: + Does the same, but unpacks and returns only the first extracted object. + +Each of these methods accepts a string +template+, +consisting of zero or more _directive_ characters, +each followed by zero or more _modifier_ characters. + +Examples (directive <tt>'C'</tt> specifies 'unsigned character'): + + [65].pack('C') # => "A" # One element, one directive. + [65, 66].pack('CC') # => "AB" # Two elements, two directives. + [65, 66].pack('C') # => "A" # Extra element is ignored. + [65].pack('') # => "" # No directives. + [65].pack('CC') # Extra directive raises ArgumentError. + + 'A'.unpack('C') # => [65] # One character, one directive. + 'AB'.unpack('CC') # => [65, 66] # Two characters, two directives. + 'AB'.unpack('C') # => [65] # Extra character is ignored. + 'A'.unpack('CC') # => [65, nil] # Extra directive generates nil. + 'AB'.unpack('') # => [] # No directives. 
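The three methods can be combined into a small round trip. The sketch below is illustrative only: the two-field "header" layout is an assumption made for the example, and it uses directives <tt>'C'</tt> (8-bit unsigned) and <tt>'n'</tt> (16-bit unsigned, big-endian) from the Quick Reference:

```ruby
# Pack a one-byte field followed by a two-byte big-endian field.
header = [1, 513].pack('Cn')
p header.bytes           # => [1, 2, 1]

# unpack returns every extracted object; unpack1 only the first.
p header.unpack('Cn')    # => [1, 513]
p header.unpack1('Cn')   # => 1

# Packing with more directives than elements raises ArgumentError.
error_class =
  begin
    [1].pack('Cn')
    nil
  rescue ArgumentError => e
    e.class
  end
p error_class            # => ArgumentError
```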
+ +The string +template+ may contain any mixture of valid directives +(directive <tt>'c'</tt> specifies 'signed character'): + + [65, -1].pack('cC') # => "A\xFF" + "A\xFF".unpack('cC') # => [65, 255] + +The string +template+ may contain whitespace (which is ignored) +and comments, each of which begins with character <tt>'#'</tt> +and continues up to and including the next following newline: + + [0,1].pack(" C #foo \n C ") # => "\x00\x01" + "\0\1".unpack(" C #foo \n C ") # => [0, 1] + +Any directive may be followed by either of these modifiers: + +- <tt>'*'</tt> - The directive is to be applied as many times as needed: + + [65, 66].pack('C*') # => "AB" + 'AB'.unpack('C*') # => [65, 66] + +- \Integer +count+ - The directive is to be applied +count+ times: + + [65, 66].pack('C2') # => "AB" + [65, 66].pack('C3') # Raises ArgumentError. + 'AB'.unpack('C2') # => [65, 66] + 'AB'.unpack('C3') # => [65, 66, nil] + + Note: Directives in <tt>%w[A a Z m]</tt> use +count+ differently; + see {\String Directives}[rdoc-ref:@String+Directives]. + +If elements don't fit the provided directive, only least significant bits are encoded: + + [257].pack("C").unpack("C") # => [1] + +== Packing Method + +Method Array#pack accepts optional keyword argument ++buffer+ that specifies the target string (instead of a new string): + + [65, 66].pack('C*', buffer: 'foo') # => "fooAB" + +The method can accept a block: + + # Packed string is passed to the block. + [65, 66].pack('C*') {|s| p s } # => "AB" + +== Unpacking Methods + +Methods String#unpack and String#unpack1 each accept +an optional keyword argument +offset+ that specifies an offset +into the string: + + 'ABC'.unpack('C*', offset: 1) # => [66, 67] + 'ABC'.unpack1('C*', offset: 1) # => 66 + +Both methods can accept a block: + + # Each unpacked object is passed to the block. + ret = [] + "ABCD".unpack("C*") {|c| ret << c } + ret # => [65, 66, 67, 68] + + # The single unpacked object is passed to the block. 
+ 'AB'.unpack1('C*') {|ele| p ele } # => 65 + +== \Integer Directives + +Each integer directive specifies the packing or unpacking +for one element in the input or output array. + +=== 8-Bit \Integer Directives + +- <tt>'c'</tt> - 8-bit signed integer + (like C <tt>signed char</tt>): + + [0, 1, 255].pack('c*') # => "\x00\x01\xFF" + s = [0, 1, -1].pack('c*') # => "\x00\x01\xFF" + s.unpack('c*') # => [0, 1, -1] + +- <tt>'C'</tt> - 8-bit unsigned integer + (like C <tt>unsigned char</tt>): + + [0, 1, 255].pack('C*') # => "\x00\x01\xFF" + s = [0, 1, -1].pack('C*') # => "\x00\x01\xFF" + s.unpack('C*') # => [0, 1, 255] + +=== 16-Bit \Integer Directives + +- <tt>'s'</tt> - 16-bit signed integer, native-endian + (like C <tt>int16_t</tt>): + + [513, -514].pack('s*') # => "\x01\x02\xFE\xFD" + s = [513, 65022].pack('s*') # => "\x01\x02\xFE\xFD" + s.unpack('s*') # => [513, -514] + +- <tt>'S'</tt> - 16-bit unsigned integer, native-endian + (like C <tt>uint16_t</tt>): + + [513, -514].pack('S*') # => "\x01\x02\xFE\xFD" + s = [513, 65022].pack('S*') # => "\x01\x02\xFE\xFD" + s.unpack('S*') # => [513, 65022] + +- <tt>'n'</tt> - 16-bit network integer, big-endian: + + s = [0, 1, -1, 32767, -32768, 65535].pack('n*') + # => "\x00\x00\x00\x01\xFF\xFF\x7F\xFF\x80\x00\xFF\xFF" + s.unpack('n*') + # => [0, 1, 65535, 32767, 32768, 65535] + +- <tt>'v'</tt> - 16-bit VAX integer, little-endian: + + s = [0, 1, -1, 32767, -32768, 65535].pack('v*') + # => "\x00\x00\x01\x00\xFF\xFF\xFF\x7F\x00\x80\xFF\xFF" + s.unpack('v*') + # => [0, 1, 65535, 32767, 32768, 65535] + +=== 32-Bit \Integer Directives + +- <tt>'l'</tt> - 32-bit signed integer, native-endian + (like C <tt>int32_t</tt>): + + s = [67305985, -50462977].pack('l*') + # => "\x01\x02\x03\x04\xFF\xFE\xFD\xFC" + s.unpack('l*') + # => [67305985, -50462977] + +- <tt>'L'</tt> - 32-bit unsigned integer, native-endian + (like C <tt>uint32_t</tt>): + + s = [67305985, 4244504319].pack('L*') + # => "\x01\x02\x03\x04\xFF\xFE\xFD\xFC" + s.unpack('L*') + # 
=> [67305985, 4244504319] + +- <tt>'N'</tt> - 32-bit network integer, big-endian: + + s = [0,1,-1].pack('N*') + # => "\x00\x00\x00\x00\x00\x00\x00\x01\xFF\xFF\xFF\xFF" + s.unpack('N*') + # => [0, 1, 4294967295] + +- <tt>'V'</tt> - 32-bit VAX integer, little-endian: + + s = [0,1,-1].pack('V*') + # => "\x00\x00\x00\x00\x01\x00\x00\x00\xFF\xFF\xFF\xFF" + s.unpack('v*') + # => [0, 0, 1, 0, 65535, 65535] + +=== 64-Bit \Integer Directives + +- <tt>'q'</tt> - 64-bit signed integer, native-endian + (like C <tt>int64_t</tt>): + + s = [578437695752307201, -506097522914230529].pack('q*') + # => "\x01\x02\x03\x04\x05\x06\a\b\xFF\xFE\xFD\xFC\xFB\xFA\xF9\xF8" + s.unpack('q*') + # => [578437695752307201, -506097522914230529] + +- <tt>'Q'</tt> - 64-bit unsigned integer, native-endian + (like C <tt>uint64_t</tt>): + + s = [578437695752307201, 17940646550795321087].pack('Q*') + # => "\x01\x02\x03\x04\x05\x06\a\b\xFF\xFE\xFD\xFC\xFB\xFA\xF9\xF8" + s.unpack('Q*') + # => [578437695752307201, 17940646550795321087] + +=== Platform-Dependent \Integer Directives + +- <tt>'i'</tt> - Platform-dependent width signed integer, + native-endian (like C <tt>int</tt>): + + s = [67305985, -50462977].pack('i*') + # => "\x01\x02\x03\x04\xFF\xFE\xFD\xFC" + s.unpack('i*') + # => [67305985, -50462977] + +- <tt>'I'</tt> - Platform-dependent width unsigned integer, + native-endian (like C <tt>unsigned int</tt>): + + s = [67305985, -50462977].pack('I*') + # => "\x01\x02\x03\x04\xFF\xFE\xFD\xFC" + s.unpack('I*') + # => [67305985, 4244504319] + +- <tt>'j'</tt> - Pointer-width signed integer, native-endian + (like C <tt>intptr_t</tt>): + + s = [67305985, -50462977].pack('j*') + # => "\x01\x02\x03\x04\x00\x00\x00\x00\xFF\xFE\xFD\xFC\xFF\xFF\xFF\xFF" + s.unpack('j*') + # => [67305985, -50462977] + +- <tt>'J'</tt> - Pointer-width unsigned integer, native-endian + (like C <tt>uintptr_t</tt>): + + s = [67305985, 4244504319].pack('J*') + # => "\x01\x02\x03\x04\x00\x00\x00\x00\xFF\xFE\xFD\xFC\x00\x00\x00\x00" + 
s.unpack('J*') + # => [67305985, 4244504319] + +=== Other \Integer Directives + +- <tt>'U'</tt> - UTF-8 character: + + s = [4194304].pack('U*') + # => "\xF8\x90\x80\x80\x80" + s.unpack('U*') + # => [4194304] + +- <tt>'r'</tt> - Signed LEB128-encoded integer + (see {Signed LEB128}[https://en.wikipedia.org/wiki/LEB128#Signed_LEB128]): + + s = [1, 127, -128, 16383, -16384].pack("r*") + # => "\x01\xFF\x00\x80\x7F\xFF\xFF\x00\x80\x80\x7F" + s.unpack('r*') + # => [1, 127, -128, 16383, -16384] + +- <tt>'R'</tt> - Unsigned LEB128-encoded integer + (see {Unsigned LEB128}[https://en.wikipedia.org/wiki/LEB128#Unsigned_LEB128]): + + s = [1, 127, 128, 16383, 16384].pack("R*") + # => "\x01\x7F\x80\x01\xFF\x7F\x80\x80\x01" + s.unpack('R*') + # => [1, 127, 128, 16383, 16384] + +- <tt>'w'</tt> - BER-encoded integer + (see {BER encoding}[https://en.wikipedia.org/wiki/X.690#BER_encoding]): + + s = [1073741823].pack('w*') + # => "\x83\xFF\xFF\xFF\x7F" + s.unpack('w*') + # => [1073741823] + +=== Modifiers for \Integer Directives + +For the following directives, modifier <tt>'!'</tt> or <tt>'_'</tt> may be +suffixed to select the underlying platform's native size. + +- <tt>'i'</tt>, <tt>'I'</tt> - C <tt>int</tt>, always native size. +- <tt>'s'</tt>, <tt>'S'</tt> - C <tt>short</tt>. +- <tt>'l'</tt>, <tt>'L'</tt> - C <tt>long</tt>. +- <tt>'q'</tt>, <tt>'Q'</tt> - C <tt>long long</tt>, if available. +- <tt>'j'</tt>, <tt>'J'</tt> - C <tt>intptr_t</tt>, always native size. + +Native size modifiers are silently ignored for directives that are always native size. + +An endian modifier may also be suffixed to any of the directives above: + +- <tt>'>'</tt> - Big-endian. +- <tt>'<'</tt> - Little-endian. + +== \Float Directives + +Each float directive specifies the packing or unpacking +for one element in the input or output array. 
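Because each float directive encodes one element at a fixed width, the choice between a single-precision and a double-precision directive affects what survives a round trip. A minimal sketch, using the little-endian directives <tt>'e'</tt> and <tt>'E'</tt> from the Quick Reference and an arbitrarily chosen value of 0.1:

```ruby
x = 0.1
single = [x].pack('e').unpack1('e')  # Round trip through 32 bits.
double = [x].pack('E').unpack1('E')  # Round trip through 64 bits.

p single == x                # => false (precision was lost)
p double == x                # => true
p((single - x).abs < 1e-7)   # => true (the error is small)
```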
+ +=== Single-Precision \Float Directives + +- <tt>'F'</tt> or <tt>'f'</tt> - Native format: + + s = [3.0].pack('F') # => "\x00\x00@@" + s.unpack('F') # => [3.0] + +- <tt>'e'</tt> - Little-endian: + + s = [3.0].pack('e') # => "\x00\x00@@" + s.unpack('e') # => [3.0] + +- <tt>'g'</tt> - Big-endian: + + s = [3.0].pack('g') # => "@@\x00\x00" + s.unpack('g') # => [3.0] + +=== Double-Precision \Float Directives + +- <tt>'D'</tt> or <tt>'d'</tt> - Native format: + + s = [3.0].pack('D') # => "\x00\x00\x00\x00\x00\x00\b@" + s.unpack('D') # => [3.0] + +- <tt>'E'</tt> - Little-endian: + + s = [3.0].pack('E') # => "\x00\x00\x00\x00\x00\x00\b@" + s.unpack('E') # => [3.0] + +- <tt>'G'</tt> - Big-endian: + + s = [3.0].pack('G') # => "@\b\x00\x00\x00\x00\x00\x00" + s.unpack('G') # => [3.0] + +A float directive may be infinity or not-a-number: + + inf = 1.0/0.0 # => Infinity + [inf].pack('f') # => "\x00\x00\x80\x7F" + "\x00\x00\x80\x7F".unpack('f') # => [Infinity] + + nan = inf/inf # => NaN + [nan].pack('f') # => "\x00\x00\xC0\x7F" + "\x00\x00\xC0\x7F".unpack('f') # => [NaN] + +== \String Directives + +Each string directive specifies the packing or unpacking +for one byte in the input or output string. + +=== Binary \String Directives + +- <tt>'A'</tt> - Arbitrary binary string (space padded; count is width); + +nil+ is treated as the empty string: + + ['foo'].pack('A') # => "f" + ['foo'].pack('A*') # => "foo" + ['foo'].pack('A2') # => "fo" + ['foo'].pack('A4') # => "foo " + [nil].pack('A') # => " " + [nil].pack('A*') # => "" + [nil].pack('A2') # => " " + [nil].pack('A4') # => " " + + "foo\0".unpack('A') # => ["f"] + "foo\0".unpack('A4') # => ["foo"] + "foo\0bar".unpack('A10') # => ["foo\x00bar"] # Reads past "\0". 
+ "foo ".unpack('A') # => ["f"] + "foo ".unpack('A4') # => ["foo"] + "foo".unpack('A4') # => ["foo"] + + japanese = 'こんにちは' + japanese.size # => 5 + japanese.bytesize # => 15 + [japanese].pack('A') # => "\xE3" + [japanese].pack('A*') # => "\xE3\x81\x93\xE3\x82\x93\xE3\x81\xAB\xE3\x81\xA1\xE3\x81\xAF" + japanese.unpack('A') # => ["\xE3"] + japanese.unpack('A2') # => ["\xE3\x81"] + japanese.unpack('A4') # => ["\xE3\x81\x93\xE3"] + japanese.unpack('A*') # => ["\xE3\x81\x93\xE3\x82\x93\xE3\x81\xAB\xE3\x81\xA1\xE3\x81\xAF"] + +- <tt>'a'</tt> - Arbitrary binary string (null padded; count is width): + + ["foo"].pack('a') # => "f" + ["foo"].pack('a*') # => "foo" + ["foo"].pack('a2') # => "fo" + ["foo\0"].pack('a4') # => "foo\x00" + [nil].pack('a') # => "\x00" + [nil].pack('a*') # => "" + [nil].pack('a2') # => "\x00\x00" + [nil].pack('a4') # => "\x00\x00\x00\x00" + + "foo\0".unpack('a') # => ["f"] + "foo\0".unpack('a4') # => ["foo\x00"] + "foo ".unpack('a4') # => ["foo "] + "foo".unpack('a4') # => ["foo"] + "foo\0bar".unpack('a4') # => ["foo\x00"] # Reads past "\0". + +- <tt>'Z'</tt> - Same as <tt>'a'</tt>, + except that null is added or ignored with <tt>'*'</tt>: + + ["foo"].pack('Z*') # => "foo\x00" + [nil].pack('Z*') # => "\x00" + + "foo\0".unpack('Z*') # => ["foo"] + "foo".unpack('Z*') # => ["foo"] + "foo\0bar".unpack('Z*') # => ["foo"] # Does not read past "\0". 
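The binary string directives are commonly combined with integer directives to build fixed-width records. The sketch below is illustrative only; the record layout (an 8-byte space-padded name followed by a one-byte number) is an assumption made for the example:

```ruby
# 'A8' pads the name with spaces to 8 bytes; 'C' appends one byte.
record = ['alice', 30].pack('A8C')
p record                  # => "alice   \x1E"
p record.bytesize         # => 9

# On unpacking, 'A8' strips the trailing padding again.
p record.unpack('A8C')    # => ["alice", 30]

# With a count, 'a' and 'Z' pack identically but unpack differently.
p ['ab'].pack('a4')       # => "ab\x00\x00"
p ['ab'].pack('Z4')       # => "ab\x00\x00"
p "ab\0\0".unpack('a4')   # => ["ab\x00\x00"]
p "ab\0\0".unpack('Z4')   # => ["ab"]
```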
+ +=== Bit \String Directives + +- <tt>'B'</tt> - Bit string (high byte first): + + ['11111111' + '00000000'].pack('B*') # => "\xFF\x00" + ['10000000' + '01000000'].pack('B*') # => "\x80@" + + ['1'].pack('B0') # => "" + ['1'].pack('B1') # => "\x80" + ['1'].pack('B2') # => "\x80\x00" + ['1'].pack('B3') # => "\x80\x00" + ['1'].pack('B4') # => "\x80\x00\x00" + ['1'].pack('B5') # => "\x80\x00\x00" + ['1'].pack('B6') # => "\x80\x00\x00\x00" + + "\xff\x00".unpack("B*") # => ["1111111100000000"] + "\x01\x02".unpack("B*") # => ["0000000100000010"] + + "".unpack("B0") # => [""] + "\x80".unpack("B1") # => ["1"] + "\x80".unpack("B2") # => ["10"] + "\x80".unpack("B3") # => ["100"] + +- <tt>'b'</tt> - Bit string (low byte first): + + ['11111111' + '00000000'].pack('b*') # => "\xFF\x00" + ['10000000' + '01000000'].pack('b*') # => "\x01\x02" + + ['1'].pack('b0') # => "" + ['1'].pack('b1') # => "\x01" + ['1'].pack('b2') # => "\x01\x00" + ['1'].pack('b3') # => "\x01\x00" + ['1'].pack('b4') # => "\x01\x00\x00" + ['1'].pack('b5') # => "\x01\x00\x00" + ['1'].pack('b6') # => "\x01\x00\x00\x00" + + "\xff\x00".unpack("b*") # => ["1111111100000000"] + "\x01\x02".unpack("b*") # => ["1000000001000000"] + + "".unpack("b0") # => [""] + "\x01".unpack("b1") # => ["1"] + "\x01".unpack("b2") # => ["10"] + "\x01".unpack("b3") # => ["100"] + +=== Hex \String Directives + +- <tt>'H'</tt> - Hex string (high nibble first): + + ['10ef'].pack('H*') # => "\x10\xEF" + ['10ef'].pack('H0') # => "" + ['10ef'].pack('H3') # => "\x10\xE0" + ['10ef'].pack('H5') # => "\x10\xEF\x00" + + ['fff'].pack('H3') # => "\xFF\xF0" + ['fff'].pack('H4') # => "\xFF\xF0" + ['fff'].pack('H5') # => "\xFF\xF0\x00" + ['fff'].pack('H6') # => "\xFF\xF0\x00" + ['fff'].pack('H7') # => "\xFF\xF0\x00\x00" + ['fff'].pack('H8') # => "\xFF\xF0\x00\x00" + + "\x10\xef".unpack('H*') # => ["10ef"] + "\x10\xef".unpack('H0') # => [""] + "\x10\xef".unpack('H1') # => ["1"] + "\x10\xef".unpack('H2') # => ["10"] + "\x10\xef".unpack('H3') # => ["10e"] 
+ "\x10\xef".unpack('H4') # => ["10ef"] + "\x10\xef".unpack('H5') # => ["10ef"] + +- <tt>'h'</tt> - Hex string (low nibble first): + + ['10ef'].pack('h*') # => "\x01\xFE" + ['10ef'].pack('h0') # => "" + ['10ef'].pack('h3') # => "\x01\x0E" + ['10ef'].pack('h5') # => "\x01\xFE\x00" + + ['fff'].pack('h3') # => "\xFF\x0F" + ['fff'].pack('h4') # => "\xFF\x0F" + ['fff'].pack('h5') # => "\xFF\x0F\x00" + ['fff'].pack('h6') # => "\xFF\x0F\x00" + ['fff'].pack('h7') # => "\xFF\x0F\x00\x00" + ['fff'].pack('h8') # => "\xFF\x0F\x00\x00" + + "\x01\xfe".unpack('h*') # => ["10ef"] + "\x01\xfe".unpack('h0') # => [""] + "\x01\xfe".unpack('h1') # => ["1"] + "\x01\xfe".unpack('h2') # => ["10"] + "\x01\xfe".unpack('h3') # => ["10e"] + "\x01\xfe".unpack('h4') # => ["10ef"] + "\x01\xfe".unpack('h5') # => ["10ef"] + +=== Pointer \String Directives + +- <tt>'P'</tt> - Pointer to a structure (fixed-length string): + + s = ['abc'].pack('P') # => "\xE0O\x7F\xE5\xA1\x01\x00\x00" + s.unpack('P*') # => ["abc"] + ".".unpack("P") # => [] + ("\0" * 8).unpack("P") # => [nil] + [nil].pack("P") # => "\x00\x00\x00\x00\x00\x00\x00\x00" + +- <tt>'p'</tt> - Pointer to a null-terminated string: + + s = ['abc'].pack('p') # => "(\xE4u\xE5\xA1\x01\x00\x00" + s.unpack('p*') # => ["abc"] + ".".unpack("p") # => [] + ("\0" * 8).unpack("p") # => [nil] + [nil].pack("p") # => "\x00\x00\x00\x00\x00\x00\x00\x00" + +=== Other \String Directives + +- <tt>'M'</tt> - Quoted printable, MIME encoding; + text mode, but input must use LF and output LF; + (see {RFC 2045}[https://www.ietf.org/rfc/rfc2045.txt]): + + ["a b c\td \ne"].pack('M') # => "a b c\td =\n\ne=\n" + ["\0"].pack('M') # => "=00=\n" + + ["a"*1023].pack('M') == ("a"*73+"=\n")*14+"a=\n" # => true + ("a"*73+"=\na=\n").unpack('M') == ["a"*74] # => true + (("a"*73+"=\n")*14+"a=\n").unpack('M') == ["a"*1023] # => true + + "a b c\td =\n\ne=\n".unpack('M') # => ["a b c\td \ne"] + "=00=\n".unpack('M') # => ["\x00"] + + "pre=31=32=33after".unpack('M') # => ["pre123after"] 
+ "pre=\nafter".unpack('M') # => ["preafter"] + "pre=\r\nafter".unpack('M') # => ["preafter"] + "pre=".unpack('M') # => ["pre="] + "pre=\r".unpack('M') # => ["pre=\r"] + "pre=hoge".unpack('M') # => ["pre=hoge"] + "pre==31after".unpack('M') # => ["pre==31after"] + "pre===31after".unpack('M') # => ["pre===31after"] + +- <tt>'m'</tt> - Base64 encoded string; + count specifies input bytes between each newline, + rounded down to nearest multiple of 3; + if count is zero, no newlines are added; + (see {RFC 4648}[https://www.ietf.org/rfc/rfc4648.txt]): + + [""].pack('m') # => "" + ["\0"].pack('m') # => "AA==\n" + ["\0\0"].pack('m') # => "AAA=\n" + ["\0\0\0"].pack('m') # => "AAAA\n" + ["\377"].pack('m') # => "/w==\n" + ["\377\377"].pack('m') # => "//8=\n" + ["\377\377\377"].pack('m') # => "////\n" + + "".unpack('m') # => [""] + "AA==\n".unpack('m') # => ["\x00"] + "AAA=\n".unpack('m') # => ["\x00\x00"] + "AAAA\n".unpack('m') # => ["\x00\x00\x00"] + "/w==\n".unpack('m') # => ["\xFF"] + "//8=\n".unpack('m') # => ["\xFF\xFF"] + "////\n".unpack('m') # => ["\xFF\xFF\xFF"] + "A\n".unpack('m') # => [""] + "AA\n".unpack('m') # => ["\x00"] + "AA=\n".unpack('m') # => ["\x00"] + "AAA\n".unpack('m') # => ["\x00\x00"] + + [""].pack('m0') # => "" + ["\0"].pack('m0') # => "AA==" + ["\0\0"].pack('m0') # => "AAA=" + ["\0\0\0"].pack('m0') # => "AAAA" + ["\377"].pack('m0') # => "/w==" + ["\377\377"].pack('m0') # => "//8=" + ["\377\377\377"].pack('m0') # => "////" + + "".unpack('m0') # => [""] + "AA==".unpack('m0') # => ["\x00"] + "AAA=".unpack('m0') # => ["\x00\x00"] + "AAAA".unpack('m0') # => ["\x00\x00\x00"] + "/w==".unpack('m0') # => ["\xFF"] + "//8=".unpack('m0') # => ["\xFF\xFF"] + "////".unpack('m0') # => ["\xFF\xFF\xFF"] + +- <tt>'u'</tt> - UU-encoded string: + + [""].pack("u") # => "" + ["a"].pack("u") # => "!80``\n" + ["aaa"].pack("u") # => "#86%A\n" + + "".unpack("u") # => [""] + "#86)C\n".unpack("u") # => ["abc"] + +== Offset Directives + +- <tt>'@'</tt> - Begin packing at the 
given byte offset; + for packing, null fill or shrink if necessary: + + [1, 2].pack("C@0C") # => "\x02" + [1, 2].pack("C@1C") # => "\x01\x02" + [1, 2].pack("C@5C") # => "\x01\x00\x00\x00\x00\x02" + [*1..5].pack("CCCC@2C") # => "\x01\x02\x05" + + For unpacking, cannot move outside the string: + + "\x01\x00\x00\x02".unpack("C@3C") # => [1, 2] + "\x00".unpack("@1C") # => [nil] + "\x00".unpack("@2C") # Raises ArgumentError. + +- <tt>'X'</tt> - For packing, shrink by the given byte offset: + + [0, 1, 2].pack("CCXC") # => "\x00\x02" + [0, 1, 2].pack("CCX2C") # => "\x02" + + For unpacking, rewind the unpacking position by the given byte offset: + + "\x00\x02".unpack("CCXC") # => [0, 2, 2] + + Cannot move outside the string: + + [0, 1, 2].pack("CCX3C") # Raises ArgumentError. + "\x00\x02".unpack("CX3C") # Raises ArgumentError. + +- <tt>'x'</tt> - Begin packing after the given byte offset; + for packing, null fill if necessary: + + [].pack("x0") # => "" + [].pack("x") # => "\x00" + [].pack("x8") # => "\x00\x00\x00\x00\x00\x00\x00\x00" + + For unpacking, cannot move outside the string: + + "\x00\x00\x02".unpack("CxC") # => [0, 2] + "\x00\x00\x02".unpack("x3C") # => [nil] + "\x00\x00\x02".unpack("x4C") # Raises ArgumentError diff --git a/doc/language/ractor.md b/doc/language/ractor.md new file mode 100644 index 0000000000..72fbde6e5a --- /dev/null +++ b/doc/language/ractor.md @@ -0,0 +1,797 @@ +# Ractor - Ruby's Actor-like concurrency abstraction + +Ractors are designed to provide parallel execution of Ruby code without thread-safety concerns. + +## Summary + +### Multiple Ractors in a ruby process + +You can create multiple Ractors which can run ruby code in parallel with each other. + +* `Ractor.new{ expr }` creates a new Ractor and `expr` can run in parallel with other ractors on a multi-core computer. +* Ruby processes start with one ractor (called the *main ractor*). 
+* If the main ractor terminates, all other ractors receive termination requests, similar to how threads behave. +* Each Ractor contains one or more `Thread`s. + * Threads within the same ractor share a ractor-wide global lock (GVL in MRI terminology), so they can't run in parallel with each other (without releasing the GVL explicitly in C extensions). Threads in different ractors can run in parallel. + * The overhead of creating a ractor is slightly above the overhead of creating a thread. + +### Limited sharing between Ractors + +Ractors don't share all objects, unlike threads, which can access any object other than objects stored in another thread's thread-locals. + +* Most objects are *unshareable objects*. Unshareable objects can only be used by the ractor that instantiated them, so you don't need to worry about thread-safety issues resulting from using the object concurrently across ractors. +* Some objects are *shareable objects*. Here is an incomplete list to give you an idea: + * `i = 123`: All `Integer`s are shareable. + * `s = "str".freeze`: Frozen strings are shareable if they have no instance variables that refer to unshareable objects. + * `a = [1, [2], 3].freeze`: `a` is not a shareable object because `a` refers to the unshareable object `[2]` (this Array is not frozen). + * `h = {c: Object}.freeze`: `h` is shareable because `Symbol`s and `Class`es are shareable, and the Hash is frozen. + * Class/Module objects are always shareable, even if they refer to unshareable objects. + * Special shareable objects + * Ractor objects themselves are shareable. + * And more... + +### Communication between Ractors with `Ractor::Port` + +Ractors communicate with each other and synchronize their execution by exchanging messages. The `Ractor::Port` class provides this communication mechanism. + +```ruby +port = Ractor::Port.new + +Ractor.new port do |port| + # Other ractors can send to the port + port << 42 +end + +port.receive # get a message from the port. 
Only the ractor that created the Port can receive from it. +#=> 42 +``` + +All Ractors have a default port, which `Ractor#send`, `Ractor.receive` (etc) will use. + +### Copy & Move semantics when sending objects + +To send unshareable objects to another ractor, objects are either copied or moved. + +* Copy: deep-copies the object to the other ractor. All unshareable objects will be `Kernel#clone`ed. +* Move: moves membership to another ractor. + * The sending ractor can not access the moved object after it moves. + * There is a guarantee that only one ractor can access an unshareable object at once. + +### Thread-safety + +Ractors help to write thread-safe, concurrent programs. They allow sharing of data only through explicit message passing for +unshareable objects. Shareable objects are guaranteed to work correctly across ractors, even if the ractors are running in parallel. +This guarantee, however, only applies across ractors. You still need to use `Mutex`es and other thread-safety tools within a ractor if +you're using multiple ruby `Thread`s. + + * Most objects are unshareable. You can't create data-races across ractors due to the inability to use these objects across ractors. + * Shareable objects are protected by locks (or otherwise don't need to be) so they can be used by more than one ractor at once. + +## Creation and termination + +### `Ractor.new` + +* `Ractor.new { expr }` creates a Ractor. + +```ruby +# Ractor.new with a block creates a new Ractor +r = Ractor.new do + # This block can run in parallel with other ractors +end + +# You can name a Ractor with a `name:` argument. +r = Ractor.new name: 'my-first-ractor' do +end + +r.name #=> 'my-first-ractor' +``` + +### Block isolation + +The Ractor executes `expr` in the given block. +The given block will be isolated from its outer scope. To prevent sharing objects between ractors, outer variables, `self` and other information is isolated from the block. 
+ +This isolation occurs at Ractor creation time (when `Ractor.new` is called). If the given block is not able to be isolated because of outer variables or `self`, an error will be raised. + +```ruby +begin + a = true + r = Ractor.new do + a #=> ArgumentError because this block accesses outer variable `a`. + end + r.join # wait for ractor to finish +rescue ArgumentError +end +``` + +* The `self` of the given block is the `Ractor` object itself. + +```ruby +r = Ractor.new do + p self.class #=> Ractor + self.object_id +end +r.value == self.object_id #=> false +``` + +Arguments passed to `Ractor.new()` become block parameters for the given block. However, Ruby does not pass the objects themselves, but sends them as messages (see below for details). + +```ruby +r = Ractor.new 'ok' do |msg| + msg #=> 'ok' +end +r.value #=> 'ok' +``` + +```ruby +# similar to the last example +r = Ractor.new do + msg = Ractor.receive + msg +end +r.send 'ok' +r.value #=> 'ok' +``` + +### The execution result of the given block + +The return value of the given block becomes an outgoing message (see below for details). + +```ruby +r = Ractor.new do + 'ok' +end +r.value #=> `ok` +``` + +An error in the given block will be propagated to the consumer of the outgoing message. + +```ruby +r = Ractor.new do + raise 'ok' # exception will be transferred to the consumer +end + +begin + r.value +rescue Ractor::RemoteError => e + e.cause.class #=> RuntimeError + e.cause.message #=> 'ok' + e.ractor #=> r +end +``` + +## Communication between Ractors + +Communication between ractors is achieved by sending and receiving messages. There are two ways to communicate: + +* (1) Sending and receiving messages via `Ractor::Port` +* (2) Using shareable container objects. For example, the Ractor::TVar gem ([ko1/ractor-tvar](https://github.com/ko1/ractor-tvar)) + +Users can control program execution timing with (1), but should not control with (2) (only perform critical sections). 
+ +For sending and receiving messages, these are the fundamental APIs: + +* send/receive via `Ractor::Port`. + * `Ractor::Port#send(obj)` (`Ractor::Port#<<(obj)` is an alias) sends a message to the port. Ports are connected to an unbounded incoming queue, so sending never blocks the caller. + * `Ractor::Port#receive` dequeues a message from its own incoming queue. If the incoming queue is empty, `Ractor::Port#receive` will block the execution of the current Thread until a message is sent. + * `Ractor#send` and `Ractor.receive` use ports (their default port) internally, so are conceptually similar to the above. +* You can close a `Ractor::Port` by `Ractor::Port#close`. A port can only be closed by the ractor that created it. + * If a port is closed, you can't `send` to it. Doing so raises an exception. + * When a ractor is terminated, the ractor's ports are automatically closed. +* You can wait for a ractor's termination and receive its return value with `Ractor#value`. This is similar to `Thread#value`. + +There are 3 ways to send an object as a message: + +1) Send a reference: sending a shareable object sends only a reference to the object (fast). + +2) Copy an object: sending an unshareable object by deep-copying it (can be slow). Note that an object that does not support deep copy cannot be sent this way. Some `T_DATA` objects (objects whose class is defined in a C extension, such as `StringIO`) are not supported. + +3) Move an object: sending an unshareable object across ractors with a membership change. The sending Ractor can not access the moved object after moving it, otherwise an exception will be raised. Implementation note: `T_DATA` objects are not supported. + +You can choose between "Copy" and "Move" by the `move:` keyword, `Ractor#send(obj, move: true/false)`. The default is `false` ("Copy"). However, a shareable object is always sent by reference, regardless of `move:`.
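
Which of the three ways applies depends on whether the object being sent is shareable. The following is a minimal sketch, not part of the original text, that uses `Ractor.shareable?` to check this before sending; the sample values and variable names are illustrative assumptions.

```ruby
# Shareable objects: only a reference is sent.
int_shareable    = Ractor.shareable?(42)            # Integer
sym_shareable    = Ractor.shareable?(:sym)          # Symbol
frozen_shareable = Ractor.shareable?('str'.freeze)  # frozen String

# Unshareable objects: copied by default, or moved with `move: true`.
mutable_shareable = Ractor.shareable?('str'.dup)    # unfrozen String

# A frozen container is shareable only if everything it refers to is shareable.
shallow_frozen = Ractor.shareable?(['a'.dup].freeze)    # element is still mutable
deep_frozen    = Ractor.shareable?(['a'.freeze].freeze) # element is frozen too
```

Here the first three checks and `deep_frozen` are expected to be `true`, while `mutable_shareable` and `shallow_frozen` are expected to be `false`.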
+ +### Wait for multiple Ractors with `Ractor.select` + +You can wait for messages on multiple ports at once. +The return value of `Ractor.select()` is `[port, msg]` where `port` is a ready port and `msg` is the received message. + +To make it convenient, `Ractor.select` can also accept ractors. In this case, it waits for their termination. +The return value of `Ractor.select()` is `[r, msg]` where `r` is a terminated Ractor and `msg` is the value of the ractor's block. + +Wait for a single ractor (same as `Ractor#value`): + +```ruby +r1 = Ractor.new{'r1'} + +r, obj = Ractor.select(r1) +r == r1 and obj == 'r1' #=> true +``` + +Wait for two ractors: + +```ruby +r1 = Ractor.new{'r1'} +r2 = Ractor.new{'r2'} +rs = [r1, r2] +values = [] + +while rs.any? + r, obj = Ractor.select(*rs) + rs.delete(r) + values << obj +end + +values.sort == ['r1', 'r2'] #=> true +``` + +NOTE: Using `Ractor.select()` on a very large number of ractors has the same issue as `select(2)` currently. + +### Closing ports + +* `Ractor::Port#close` closes the port (similar to `Queue#close`). + * `port.send(obj)` will raise an exception when the port is closed. + * When the queue connected to the port is empty and port is closed, `Ractor::Port#receive` raises an exception. If the queue is not empty, it dequeues an object without exceptions. +* When a Ractor terminates, the ports are closed automatically. 
+ +Example (try to get a result from closed ractor): + +```ruby +r = Ractor.new do + 'finish' +end +r.join # success (wait for the termination) +r.value # success (will return 'finish') + +# The ractor's termination value has already been given to another ractor +Ractor.new r do |r| + r.value #=> Ractor::Error +end.join +``` + +Example (try to send to closed port): + +```ruby +r = Ractor.new do +end + +r.join # wait for termination, closes default port + +begin + r.send(1) +rescue Ractor::ClosedError + 'ok' +end +``` + +### Send a message by copying + +`Ractor::Port#send(obj)` copies `obj` deeply if `obj` is an unshareable object. + +```ruby +obj = 'str'.dup +r = Ractor.new obj do |msg| + # return received msg's object_id + msg.object_id +end + +obj.object_id == r.value #=> false +``` + +Some objects do not support copying, and raise an exception. + +```ruby +obj = Thread.new{} +begin + Ractor.new obj do |msg| + msg + end +rescue TypeError => e + e.message #=> #<TypeError: allocator undefined for Thread> +end +``` + +### Send a message by moving + +`Ractor::Port#send(obj, move: true)` moves `obj` to the destination Ractor. +If the source ractor uses the moved object (for example, calls a method like `obj.foo()`), it will raise an error. + +```ruby +r = Ractor.new do + obj = Ractor.receive + obj << ' world' +end + +str = 'hello'.dup +r.send str, move: true +# str is now moved, and accessing str from this ractor is prohibited +modified = r.value #=> 'hello world' + + +begin + # Error because it uses moved str. + str << ' exception' # raise Ractor::MovedError +rescue Ractor::MovedError + modified #=> 'hello world' +end +``` + +Some objects do not support moving, and an exception will be raised. + +```ruby +r = Ractor.new do + Ractor.receive +end + +r.send(Thread.new{}, move: true) #=> allocator undefined for Thread (TypeError) +``` + +Once an object has been moved, the source object's class is changed to `Ractor::MovedObject`. 
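
The class change can be observed from the sending side. The sketch below is an illustration rather than part of the original text. It assumes that ordinary method calls on a moved object (including `#class`) raise `Ractor::MovedError`, so it tests the class with `Module#===` instead.

```ruby
r = Ractor.new do
  Ractor.receive # the receiving ractor takes ownership of the moved object
end

str = 'hello'.dup
r.send(str, move: true)

# Ask the class, not the moved object, whether `str` is now a MovedObject.
moved = Ractor::MovedObject === str

# Any ordinary method call on the moved object is rejected.
error_class =
  begin
    str.upcase
    nil
  rescue Ractor::MovedError => e
    e.class
  end
```

After the `send`, `moved` should be `true` and `error_class` should be `Ractor::MovedError`.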
+ +### Shareable objects + +The following is a non-exhaustive list of shareable objects: + +* `Integer`, `Float`, `Complex`, `Rational` +* `Symbol`, frozen `String` objects that don't refer to unshareables, `true`, `false`, `nil` +* `Regexp` objects, if they have no instance variables or their instance variables refer only to shareables +* `Class` and `Module` objects +* `Ractor` and other special objects which deal with synchronization + +To make objects shareable, `Ractor.make_shareable(obj)` is provided. It tries to make the object shareable by freezing `obj` and recursively traversing its references to freeze them all. This method accepts the `copy:` keyword (default value is false). `Ractor.make_shareable(obj, copy: true)` tries to make a deep copy of `obj` and make the copied object shareable. `Ractor.make_shareable(obj, copy: false)` has no effect on an already shareable object. If the object cannot be made shareable, a `Ractor::Error` exception will be raised. + +## Language changes to limit sharing between Ractors + +To isolate unshareable objects across ractors, we introduced additional language semantics for multi-ractor Ruby programs. + +Note that when not using ractors, these additional semantics are not needed (100% compatible with Ruby 2). + +### Global variables + +Only the main Ractor can access global variables. + +```ruby +$gv = 1 +r = Ractor.new do + $gv +end + +begin + r.join +rescue Ractor::RemoteError => e + e.cause.message #=> 'can not access global variables from non-main Ractors' +end +``` + +Note that some special global variables, such as `$stdin`, `$stdout` and `$stderr` are local to each ractor. See [[Bug #17268]](https://bugs.ruby-lang.org/issues/17268) for more details. + +### Instance variables of shareable objects + +Instance variables of classes/modules can be accessed from non-main ractors only if their values are shareable objects.
+ +```ruby +class C + @iv = 1 +end + +p Ractor.new do + class C + @iv + end +end.value #=> 1 +``` + +Otherwise, only the main Ractor can access instance variables of shareable objects. + +```ruby +class C + @iv = [] # unshareable object +end + +Ractor.new do + class C + begin + p @iv + rescue Ractor::IsolationError + p $!.message + #=> "can not get unshareable values from instance variables of classes/modules from non-main Ractors" + end + + begin + @iv = 42 + rescue Ractor::IsolationError + p $!.message + #=> "can not set instance variables of classes/modules by non-main Ractors" + end + end +end.join +``` + +```ruby +shared = Ractor.new{} +shared.instance_variable_set(:@iv, 'str') + +r = Ractor.new shared do |shared| + p shared.instance_variable_get(:@iv) +end + +begin + r.join +rescue Ractor::RemoteError => e + e.cause.message #=> can not access instance variables of shareable objects from non-main Ractors (Ractor::IsolationError) +end +``` + +### Class variables + +Only the main Ractor can access class variables. + +```ruby +class C + @@cv = 'str' +end + +r = Ractor.new do + class C + p @@cv + end +end + + +begin + r.join +rescue => e + e.class #=> Ractor::IsolationError +end +``` + +### Constants + +Only the main Ractor can read constants which refer to an unshareable object. + +```ruby +class C + CONST = 'str'.dup +end +r = Ractor.new do + C::CONST +end +begin + r.join +rescue => e + e.class #=> Ractor::IsolationError +end +``` + +Only the main Ractor can define constants which refer to an unshareable object. + +```ruby +class C +end +r = Ractor.new do + C::CONST = 'str'.dup +end +begin + r.join +rescue => e + e.class #=> Ractor::IsolationError +end +``` + +When creating/updating a library to support ractors, constants should only refer to shareable objects if they are to be used by non-main ractors. + +```ruby +TABLE = {a: 'ko1', b: 'ko2', c: 'ko3'} +``` + +In this case, `TABLE` refers to an unshareable Hash object. 
In order for other ractors to use `TABLE`, we need to make it shareable. We can use `Ractor.make_shareable()` like so: + +```ruby +TABLE = Ractor.make_shareable( {a: 'ko1', b: 'ko2', c: 'ko3'} ) +``` + +To make this easier, Ruby 3.0 introduced a new `shareable_constant_value` file directive. + +```ruby +# shareable_constant_value: literal + +TABLE = {a: 'ko1', b: 'ko2', c: 'ko3'} +#=> Same as: TABLE = Ractor.make_shareable( {a: 'ko1', b: 'ko2', c: 'ko3'} ) +``` + +The `shareable_constant_value` directive accepts the following modes (descriptions use the example: `CONST = expr`): + +* none: Do nothing. Same as: `CONST = expr` +* literal: + * if `expr` consists of literals, replaced with `CONST = Ractor.make_shareable(expr)`. + * otherwise: replaced with `CONST = expr.tap{|o| raise unless Ractor.shareable?(o)}`. +* experimental_everything: replaced with `CONST = Ractor.make_shareable(expr)`. +* experimental_copy: replaced with `CONST = Ractor.make_shareable(expr, copy: true)`. + +Except for the `none` mode (default), it is guaranteed that these constants refer only to shareable objects. + +See [syntax/comments.rdoc](../syntax/comments.rdoc) for more details. + +### Shareable procs + +Procs and lambdas are unshareable objects, even when they are frozen. To create a shareable Proc, you must use `Ractor.shareable_proc { expr }`. Much like during Ractor creation, the proc's block is isolated from its outer environment, so it cannot access variables from the outside scope. `self` is also changed within the Proc to be `nil` by default, although a `self:` keyword can be provided if you want to customize the value to a different shareable object.
+ +```ruby +p = Ractor.shareable_proc { p self } +p.call #=> nil +``` + +```ruby +begin + a = 1 + pr = Ractor.shareable_proc { p a } + pr.call # never gets here +rescue Ractor::IsolationError +end +``` + +In order to dynamically define a method with `Module#define_method` that can be used from different ractors, you must define it with a shareable proc. Alternatively, you can use `Module#class_eval` or `Module#module_eval` with a String. Even though the shareable proc's `self` is initially bound to `nil`, `define_method` will bind `self` to the correct value in the method. + +```ruby +class A + define_method :testing, &Ractor.shareable_proc do + p self + end +end +Ractor.new do + a = A.new + a.testing #=> #<A:0x0000000101acfe10> +end.join +``` + +This isolation must be done to prevent the method from accessing and assigning captured outer variables across ractors. + +### Ractor-local storage + +You can store any object (even unshareables) in ractor-local storage. + +```ruby +r = Ractor.new do + values = [] + Ractor[:threads] = [] + 3.times do |i| + Ractor[:threads] << Thread.new do + values << [Ractor.receive, i+1] # Ractor.receive blocks the current thread in the current ractor until it receives a message + end + end + Ractor[:threads].each(&:join) + values +end + +r << 1 +r << 2 +r << 3 +r.value #=> [[1,1],[2,2],[3,3]] (the order can change with each run) +``` + +## Examples + +### Traditional Ring example in Actor-model + +```ruby +RN = 1_000 +CR = Ractor.current + +r = Ractor.new do + p Ractor.receive + CR << :fin +end + +RN.times{ + r = Ractor.new r do |next_r| + next_r << Ractor.receive + end +} + +p :setup_ok +r << 1 +p Ractor.receive +``` + +### Fork-join + +```ruby +def fib n + if n < 2 + 1 + else + fib(n-2) + fib(n-1) + end +end + +RN = 10 +rs = (1..RN).map do |i| + Ractor.new i do |i| + [i, fib(i)] + end +end + +until rs.empty? 
+ r, v = Ractor.select(*rs) + rs.delete r + p answer: v +end +``` + +### Worker pool + +(1) One ractor has a pool + +```ruby +require 'prime' + +N = 1000 +RN = 10 + +# make RN workers +workers = (1..RN).map do + Ractor.new do |; result_port| + loop do + n, result_port = Ractor.receive + result_port << [n, n.prime?, Ractor.current] + end + end +end + +result_port = Ractor::Port.new +results = [] + +(1..N).each do |i| + if workers.empty? + # receive a result + n, result, w = result_port.receive + results << [n, result] + else + w = workers.pop + end + + # send a task to the idle worker ractor + w << [i, result_port] +end + +# receive a result +while results.size != N + n, result, _w = result_port.receive + results << [n, result] +end + +pp results.sort_by{|n, result| n} +``` + +### Pipeline + +```ruby +# pipeline with send/receive + +r3 = Ractor.new Ractor.current do |cr| + cr.send Ractor.receive + 'r3' +end + +r2 = Ractor.new r3 do |r3| + r3.send Ractor.receive + 'r2' +end + +r1 = Ractor.new r2 do |r2| + r2.send Ractor.receive + 'r1' +end + +r1 << 'r0' +p Ractor.receive #=> "r0r1r2r3" +``` + +### Supervise + +```ruby +# ring example again + +r = Ractor.current +(1..10).map{|i| + r = Ractor.new r, i do |r, i| + r.send Ractor.receive + "r#{i}" + end +} + +r.send "r0" +p Ractor.receive #=> "r0r10r9r8r7r6r5r4r3r2r1" +``` + +```ruby +# ring example with an error + +r = Ractor.current +rs = (1..10).map{|i| + r = Ractor.new r, i do |r, i| + loop do + msg = Ractor.receive + raise if /e/ =~ msg + r.send msg + "r#{i}" + end + end +} + +r.send "r0" +p Ractor.receive #=> "r0r10r9r8r7r6r5r4r3r2r1" +r.send "r0" +p Ractor.select(*rs, Ractor.current) #=> [:receive, "r0r10r9r8r7r6r5r4r3r2r1"] +r.send "e0" +p Ractor.select(*rs, Ractor.current) +#=> +# <Thread:0x000056262de28bd8 run> terminated with exception (report_on_exception is true): +# Traceback (most recent call last): +# 2: from /home/ko1/src/ruby/trunk/test.rb:7:in `block (2 levels) in <main>' +# 1: from 
/home/ko1/src/ruby/trunk/test.rb:7:in `loop' +# /home/ko1/src/ruby/trunk/test.rb:9:in `block (3 levels) in <main>': unhandled exception +# Traceback (most recent call last): +# 2: from /home/ko1/src/ruby/trunk/test.rb:7:in `block (2 levels) in <main>' +# 1: from /home/ko1/src/ruby/trunk/test.rb:7:in `loop' +# /home/ko1/src/ruby/trunk/test.rb:9:in `block (3 levels) in <main>': unhandled exception +# 1: from /home/ko1/src/ruby/trunk/test.rb:21:in `<main>' +# <internal:ractor>:69:in `select': thrown by remote Ractor. (Ractor::RemoteError) +``` + +```ruby +# resend non-error message + +r = Ractor.current +rs = (1..10).map{|i| + r = Ractor.new r, i do |r, i| + loop do + msg = Ractor.receive + raise if /e/ =~ msg + r.send msg + "r#{i}" + end + end +} + +r.send "r0" +p Ractor.receive #=> "r0r10r9r8r7r6r5r4r3r2r1" +r.send "r0" +p Ractor.select(*rs, Ractor.current) +#=> [:receive, "r0r10r9r8r7r6r5r4r3r2r1"] +msg = 'e0' +begin + r.send msg + p Ractor.select(*rs, Ractor.current) +rescue Ractor::RemoteError + msg = 'r0' + retry +end + +#=> <internal:ractor>:100:in `send': The incoming-port is already closed (Ractor::ClosedError) +# because r == rs[-1] is terminated.
+``` + +```ruby +# ring example with supervisor and re-start + +def make_ractor r, i + Ractor.new r, i do |r, i| + loop do + msg = Ractor.receive + raise if /e/ =~ msg + r.send msg + "r#{i}" + end + end +end + +r = Ractor.current +rs = (1..10).map{|i| + r = make_ractor(r, i) +} + +msg = 'e0' # error causing message +begin + r.send msg + p Ractor.select(*rs, Ractor.current) +rescue Ractor::RemoteError + r = rs[-1] = make_ractor(rs[-2], rs.size-1) + msg = 'x0' + retry +end + +#=> [:receive, "x0r9r9r8r7r6r5r4r3r2r1"] +``` diff --git a/doc/language/regexp/methods.rdoc b/doc/language/regexp/methods.rdoc new file mode 100644 index 0000000000..356156ac9a --- /dev/null +++ b/doc/language/regexp/methods.rdoc @@ -0,0 +1,41 @@ +== \Regexp Methods + +Each of these Ruby core methods can accept a regexp as an argument: + +- Enumerable#all? +- Enumerable#any? +- Enumerable#grep +- Enumerable#grep_v +- Enumerable#none? +- Enumerable#one? +- Enumerable#slice_after +- Enumerable#slice_before +- Regexp#=~ +- Regexp#match +- Regexp#match? +- Regexp.new +- Regexp.union +- String#=~ +- String#[]= +- String#byteindex +- String#byterindex +- String#gsub +- String#gsub! +- String#index +- String#match +- String#match? +- String#partition +- String#rindex +- String#rpartition +- String#scan +- String#slice +- String#slice! +- String#split +- String#start_with? +- String#sub +- String#sub! +- Symbol#=~ +- Symbol#match +- Symbol#match? +- Symbol#slice +- Symbol#start_with? diff --git a/doc/language/regexp/unicode_properties.rdoc b/doc/language/regexp/unicode_properties.rdoc new file mode 100644 index 0000000000..94080f7199 --- /dev/null +++ b/doc/language/regexp/unicode_properties.rdoc @@ -0,0 +1,718 @@ +== \Regexps Based on Unicode Properties + +The properties shown here are those currently supported in Ruby. +Older versions may not support all of these. 
+ +=== POSIX brackets + +- <tt>\p{ASCII}</tt> +- <tt>\p{Alnum}</tt> +- <tt>\p{Alphabetic}</tt>, <tt>\p{Alpha}</tt> +- <tt>\p{Blank}</tt> +- <tt>\p{Cntrl}</tt> +- <tt>\p{Digit}</tt> +- <tt>\p{Graph}</tt> +- <tt>\p{Lowercase}</tt>, <tt>\p{Lower}</tt> +- <tt>\p{Print}</tt> +- <tt>\p{Punct}</tt> +- <tt>\p{Space}</tt> +- <tt>\p{Uppercase}</tt>, <tt>\p{Upper}</tt> +- <tt>\p{Word}</tt> +- <tt>\p{XDigit}</tt> +- <tt>\p{XPosixPunct}</tt> + +=== Special + +- <tt>\p{Any}</tt> +- <tt>\p{Assigned}</tt> + +=== Major and General Categories + +- <tt>\p{Cased_Letter}</tt>, <tt>\p{LC}</tt> +- <tt>\p{Close_Punctuation}</tt>, <tt>\p{Pe}</tt> +- <tt>\p{Connector_Punctuation}</tt>, <tt>\p{Pc}</tt> +- <tt>\p{Control}</tt>, <tt>\p{Cc}</tt> +- <tt>\p{Currency_Symbol}</tt>, <tt>\p{Sc}</tt> +- <tt>\p{Dash_Punctuation}</tt>, <tt>\p{Pd}</tt> +- <tt>\p{Decimal_Number}</tt>, <tt>\p{Nd}</tt> +- <tt>\p{Enclosing_Mark}</tt>, <tt>\p{Me}</tt> +- <tt>\p{Final_Punctuation}</tt>, <tt>\p{Pf}</tt> +- <tt>\p{Format}</tt>, <tt>\p{Cf}</tt> +- <tt>\p{Initial_Punctuation}</tt>, <tt>\p{Pi}</tt> +- <tt>\p{Letter}</tt>, <tt>\p{L}</tt> +- <tt>\p{Letter_Number}</tt>, <tt>\p{Nl}</tt> +- <tt>\p{Line_Separator}</tt>, <tt>\p{Zl}</tt> +- <tt>\p{Lowercase_Letter}</tt>, <tt>\p{Ll}</tt> +- <tt>\p{Mark}</tt>, <tt>\p{M}</tt> +- <tt>\p{Math_Symbol}</tt>, <tt>\p{Sm}</tt> +- <tt>\p{Modifier_Letter}</tt>, <tt>\p{Lm}</tt> +- <tt>\p{Modifier_Symbol}</tt>, <tt>\p{Sk}</tt> +- <tt>\p{Nonspacing_Mark}</tt>, <tt>\p{Mn}</tt> +- <tt>\p{Number}</tt>, <tt>\p{N}</tt> +- <tt>\p{Open_Punctuation}</tt>, <tt>\p{Ps}</tt> +- <tt>\p{Other}</tt>, <tt>\p{C}</tt> +- <tt>\p{Other_Letter}</tt>, <tt>\p{Lo}</tt> +- <tt>\p{Other_Number}</tt>, <tt>\p{No}</tt> +- <tt>\p{Other_Punctuation}</tt>, <tt>\p{Po}</tt> +- <tt>\p{Other_Symbol}</tt>, <tt>\p{So}</tt> +- <tt>\p{Paragraph_Separator}</tt>, <tt>\p{Zp}</tt> +- <tt>\p{Private_Use}</tt>, <tt>\p{Co}</tt> +- <tt>\p{Punctuation}</tt>, <tt>\p{P}</tt> +- <tt>\p{Separator}</tt>, <tt>\p{Z}</tt> +- 
<tt>\p{Space_Separator}</tt>, <tt>\p{Zs}</tt> +- <tt>\p{Spacing_Mark}</tt>, <tt>\p{Mc}</tt> +- <tt>\p{Surrogate}</tt>, <tt>\p{Cs}</tt> +- <tt>\p{Symbol}</tt>, <tt>\p{S}</tt> +- <tt>\p{Titlecase_Letter}</tt>, <tt>\p{Lt}</tt> +- <tt>\p{Unassigned}</tt>, <tt>\p{Cn}</tt> +- <tt>\p{Uppercase_Letter}</tt>, <tt>\p{Lu}</tt> + +=== Prop List + +- <tt>\p{ASCII_Hex_Digit}</tt>, <tt>\p{AHex}</tt> +- <tt>\p{Bidi_Control}</tt>, <tt>\p{Bidi_C}</tt> +- <tt>\p{Dash}</tt> +- <tt>\p{Deprecated}</tt>, <tt>\p{Dep}</tt> +- <tt>\p{Diacritic}</tt>, <tt>\p{Dia}</tt> +- <tt>\p{Extender}</tt>, <tt>\p{Ext}</tt> +- <tt>\p{Hex_Digit}</tt>, <tt>\p{Hex}</tt> +- <tt>\p{Hyphen}</tt> +- <tt>\p{IDS_Binary_Operator}</tt>, <tt>\p{IDSB}</tt> +- <tt>\p{IDS_Trinary_Operator}</tt>, <tt>\p{IDST}</tt> +- <tt>\p{IDS_Unary_Operator}</tt>, <tt>\p{IDSU}</tt> +- <tt>\p{ID_Compat_Math_Continue}</tt> +- <tt>\p{ID_Compat_Math_Start}</tt> +- <tt>\p{Ideographic}</tt>, <tt>\p{Ideo}</tt> +- <tt>\p{Join_Control}</tt>, <tt>\p{Join_C}</tt> +- <tt>\p{Logical_Order_Exception}</tt>, <tt>\p{LOE}</tt> +- <tt>\p{Modifier_Combining_Mark}</tt>, <tt>\p{MCM}</tt> +- <tt>\p{Noncharacter_Code_Point}</tt>, <tt>\p{NChar}</tt> +- <tt>\p{Other_Alphabetic}</tt>, <tt>\p{OAlpha}</tt> +- <tt>\p{Other_Default_Ignorable_Code_Point}</tt>, <tt>\p{ODI}</tt> +- <tt>\p{Other_Grapheme_Extend}</tt>, <tt>\p{OGr_Ext}</tt> +- <tt>\p{Other_ID_Continue}</tt>, <tt>\p{OIDC}</tt> +- <tt>\p{Other_ID_Start}</tt>, <tt>\p{OIDS}</tt> +- <tt>\p{Other_Lowercase}</tt>, <tt>\p{OLower}</tt> +- <tt>\p{Other_Math}</tt>, <tt>\p{OMath}</tt> +- <tt>\p{Other_Uppercase}</tt>, <tt>\p{OUpper}</tt> +- <tt>\p{Pattern_Syntax}</tt>, <tt>\p{Pat_Syn}</tt> +- <tt>\p{Pattern_White_Space}</tt>, <tt>\p{Pat_WS}</tt> +- <tt>\p{Prepended_Concatenation_Mark}</tt>, <tt>\p{PCM}</tt> +- <tt>\p{Quotation_Mark}</tt>, <tt>\p{QMark}</tt> +- <tt>\p{Radical}</tt> +- <tt>\p{Regional_Indicator}</tt>, <tt>\p{RI}</tt> +- <tt>\p{Sentence_Terminal}</tt>, <tt>\p{STerm}</tt> +- <tt>\p{Soft_Dotted}</tt>, 
<tt>\p{SD}</tt> +- <tt>\p{Terminal_Punctuation}</tt>, <tt>\p{Term}</tt> +- <tt>\p{Unified_Ideograph}</tt>, <tt>\p{UIdeo}</tt> +- <tt>\p{Variation_Selector}</tt>, <tt>\p{VS}</tt> +- <tt>\p{White_Space}</tt>, <tt>\p{WSpace}</tt> + +=== Derived Core Properties + +- <tt>\p{Alphabetic}</tt>, <tt>\p{Alpha}</tt> +- <tt>\p{Case_Ignorable}</tt>, <tt>\p{CI}</tt> +- <tt>\p{Cased}</tt> +- <tt>\p{Changes_When_Casefolded}</tt>, <tt>\p{CWCF}</tt> +- <tt>\p{Changes_When_Casemapped}</tt>, <tt>\p{CWCM}</tt> +- <tt>\p{Changes_When_Lowercased}</tt>, <tt>\p{CWL}</tt> +- <tt>\p{Changes_When_Titlecased}</tt>, <tt>\p{CWT}</tt> +- <tt>\p{Changes_When_Uppercased}</tt>, <tt>\p{CWU}</tt> +- <tt>\p{Default_Ignorable_Code_Point}</tt>, <tt>\p{DI}</tt> +- <tt>\p{Grapheme_Base}</tt>, <tt>\p{Gr_Base}</tt> +- <tt>\p{Grapheme_Extend}</tt>, <tt>\p{Gr_Ext}</tt> +- <tt>\p{Grapheme_Link}</tt>, <tt>\p{Gr_Link}</tt> +- <tt>\p{ID_Continue}</tt>, <tt>\p{IDC}</tt> +- <tt>\p{ID_Start}</tt>, <tt>\p{IDS}</tt> +- <tt>\p{InCB_Consonant}</tt> +- <tt>\p{InCB_Extend}</tt> +- <tt>\p{InCB_Linker}</tt> +- <tt>\p{Lowercase}</tt>, <tt>\p{Lower}</tt> +- <tt>\p{Math}</tt> +- <tt>\p{Uppercase}</tt>, <tt>\p{Upper}</tt> +- <tt>\p{XID_Continue}</tt>, <tt>\p{XIDC}</tt> +- <tt>\p{XID_Start}</tt>, <tt>\p{XIDS}</tt> + +=== Scripts + +- <tt>\p{Adlam}</tt>, <tt>\p{Adlm}</tt> +- <tt>\p{Ahom}</tt> +- <tt>\p{Anatolian_Hieroglyphs}</tt>, <tt>\p{Hluw}</tt> +- <tt>\p{Arabic}</tt>, <tt>\p{Arab}</tt> +- <tt>\p{Armenian}</tt>, <tt>\p{Armn}</tt> +- <tt>\p{Avestan}</tt>, <tt>\p{Avst}</tt> +- <tt>\p{Balinese}</tt>, <tt>\p{Bali}</tt> +- <tt>\p{Bamum}</tt>, <tt>\p{Bamu}</tt> +- <tt>\p{Bassa_Vah}</tt>, <tt>\p{Bass}</tt> +- <tt>\p{Batak}</tt>, <tt>\p{Batk}</tt> +- <tt>\p{Bengali}</tt>, <tt>\p{Beng}</tt> +- <tt>\p{Beria_Erfe}</tt>, <tt>\p{Berf}</tt> +- <tt>\p{Bhaiksuki}</tt>, <tt>\p{Bhks}</tt> +- <tt>\p{Bopomofo}</tt>, <tt>\p{Bopo}</tt> +- <tt>\p{Brahmi}</tt>, <tt>\p{Brah}</tt> +- <tt>\p{Braille}</tt>, <tt>\p{Brai}</tt> +- <tt>\p{Buginese}</tt>, 
<tt>\p{Bugi}</tt> +- <tt>\p{Buhid}</tt>, <tt>\p{Buhd}</tt> +- <tt>\p{Canadian_Aboriginal}</tt>, <tt>\p{Cans}</tt> +- <tt>\p{Carian}</tt>, <tt>\p{Cari}</tt> +- <tt>\p{Caucasian_Albanian}</tt>, <tt>\p{Aghb}</tt> +- <tt>\p{Chakma}</tt>, <tt>\p{Cakm}</tt> +- <tt>\p{Cham}</tt> +- <tt>\p{Cherokee}</tt>, <tt>\p{Cher}</tt> +- <tt>\p{Chorasmian}</tt>, <tt>\p{Chrs}</tt> +- <tt>\p{Common}</tt>, <tt>\p{Zyyy}</tt> +- <tt>\p{Coptic}</tt>, <tt>\p{Copt}</tt> +- <tt>\p{Cuneiform}</tt>, <tt>\p{Xsux}</tt> +- <tt>\p{Cypriot}</tt>, <tt>\p{Cprt}</tt> +- <tt>\p{Cypro_Minoan}</tt>, <tt>\p{Cpmn}</tt> +- <tt>\p{Cyrillic}</tt>, <tt>\p{Cyrl}</tt> +- <tt>\p{Deseret}</tt>, <tt>\p{Dsrt}</tt> +- <tt>\p{Devanagari}</tt>, <tt>\p{Deva}</tt> +- <tt>\p{Dives_Akuru}</tt>, <tt>\p{Diak}</tt> +- <tt>\p{Dogra}</tt>, <tt>\p{Dogr}</tt> +- <tt>\p{Duployan}</tt>, <tt>\p{Dupl}</tt> +- <tt>\p{Egyptian_Hieroglyphs}</tt>, <tt>\p{Egyp}</tt> +- <tt>\p{Elbasan}</tt>, <tt>\p{Elba}</tt> +- <tt>\p{Elymaic}</tt>, <tt>\p{Elym}</tt> +- <tt>\p{Ethiopic}</tt>, <tt>\p{Ethi}</tt> +- <tt>\p{Garay}</tt>, <tt>\p{Gara}</tt> +- <tt>\p{Georgian}</tt>, <tt>\p{Geor}</tt> +- <tt>\p{Glagolitic}</tt>, <tt>\p{Glag}</tt> +- <tt>\p{Gothic}</tt>, <tt>\p{Goth}</tt> +- <tt>\p{Grantha}</tt>, <tt>\p{Gran}</tt> +- <tt>\p{Greek}</tt>, <tt>\p{Grek}</tt> +- <tt>\p{Gujarati}</tt>, <tt>\p{Gujr}</tt> +- <tt>\p{Gunjala_Gondi}</tt>, <tt>\p{Gong}</tt> +- <tt>\p{Gurmukhi}</tt>, <tt>\p{Guru}</tt> +- <tt>\p{Gurung_Khema}</tt>, <tt>\p{Gukh}</tt> +- <tt>\p{Han}</tt>, <tt>\p{Hani}</tt> +- <tt>\p{Hangul}</tt>, <tt>\p{Hang}</tt> +- <tt>\p{Hanifi_Rohingya}</tt>, <tt>\p{Rohg}</tt> +- <tt>\p{Hanunoo}</tt>, <tt>\p{Hano}</tt> +- <tt>\p{Hatran}</tt>, <tt>\p{Hatr}</tt> +- <tt>\p{Hebrew}</tt>, <tt>\p{Hebr}</tt> +- <tt>\p{Hiragana}</tt>, <tt>\p{Hira}</tt> +- <tt>\p{Imperial_Aramaic}</tt>, <tt>\p{Armi}</tt> +- <tt>\p{Inherited}</tt>, <tt>\p{Zinh}</tt> +- <tt>\p{Inscriptional_Pahlavi}</tt>, <tt>\p{Phli}</tt> +- <tt>\p{Inscriptional_Parthian}</tt>, <tt>\p{Prti}</tt> +- 
<tt>\p{Javanese}</tt>, <tt>\p{Java}</tt> +- <tt>\p{Kaithi}</tt>, <tt>\p{Kthi}</tt> +- <tt>\p{Kannada}</tt>, <tt>\p{Knda}</tt> +- <tt>\p{Katakana}</tt>, <tt>\p{Kana}</tt> +- <tt>\p{Kawi}</tt> +- <tt>\p{Kayah_Li}</tt>, <tt>\p{Kali}</tt> +- <tt>\p{Kharoshthi}</tt>, <tt>\p{Khar}</tt> +- <tt>\p{Khitan_Small_Script}</tt>, <tt>\p{Kits}</tt> +- <tt>\p{Khmer}</tt>, <tt>\p{Khmr}</tt> +- <tt>\p{Khojki}</tt>, <tt>\p{Khoj}</tt> +- <tt>\p{Khudawadi}</tt>, <tt>\p{Sind}</tt> +- <tt>\p{Kirat_Rai}</tt>, <tt>\p{Krai}</tt> +- <tt>\p{Lao}</tt>, <tt>\p{Laoo}</tt> +- <tt>\p{Latin}</tt>, <tt>\p{Latn}</tt> +- <tt>\p{Lepcha}</tt>, <tt>\p{Lepc}</tt> +- <tt>\p{Limbu}</tt>, <tt>\p{Limb}</tt> +- <tt>\p{Linear_A}</tt>, <tt>\p{Lina}</tt> +- <tt>\p{Linear_B}</tt>, <tt>\p{Linb}</tt> +- <tt>\p{Lisu}</tt> +- <tt>\p{Lycian}</tt>, <tt>\p{Lyci}</tt> +- <tt>\p{Lydian}</tt>, <tt>\p{Lydi}</tt> +- <tt>\p{Mahajani}</tt>, <tt>\p{Mahj}</tt> +- <tt>\p{Makasar}</tt>, <tt>\p{Maka}</tt> +- <tt>\p{Malayalam}</tt>, <tt>\p{Mlym}</tt> +- <tt>\p{Mandaic}</tt>, <tt>\p{Mand}</tt> +- <tt>\p{Manichaean}</tt>, <tt>\p{Mani}</tt> +- <tt>\p{Marchen}</tt>, <tt>\p{Marc}</tt> +- <tt>\p{Masaram_Gondi}</tt>, <tt>\p{Gonm}</tt> +- <tt>\p{Medefaidrin}</tt>, <tt>\p{Medf}</tt> +- <tt>\p{Meetei_Mayek}</tt>, <tt>\p{Mtei}</tt> +- <tt>\p{Mende_Kikakui}</tt>, <tt>\p{Mend}</tt> +- <tt>\p{Meroitic_Cursive}</tt>, <tt>\p{Merc}</tt> +- <tt>\p{Meroitic_Hieroglyphs}</tt>, <tt>\p{Mero}</tt> +- <tt>\p{Miao}</tt>, <tt>\p{Plrd}</tt> +- <tt>\p{Modi}</tt> +- <tt>\p{Mongolian}</tt>, <tt>\p{Mong}</tt> +- <tt>\p{Mro}</tt>, <tt>\p{Mroo}</tt> +- <tt>\p{Multani}</tt>, <tt>\p{Mult}</tt> +- <tt>\p{Myanmar}</tt>, <tt>\p{Mymr}</tt> +- <tt>\p{Nabataean}</tt>, <tt>\p{Nbat}</tt> +- <tt>\p{Nag_Mundari}</tt>, <tt>\p{Nagm}</tt> +- <tt>\p{Nandinagari}</tt>, <tt>\p{Nand}</tt> +- <tt>\p{New_Tai_Lue}</tt>, <tt>\p{Talu}</tt> +- <tt>\p{Newa}</tt> +- <tt>\p{Nko}</tt>, <tt>\p{Nkoo}</tt> +- <tt>\p{Nushu}</tt>, <tt>\p{Nshu}</tt> +- <tt>\p{Nyiakeng_Puachue_Hmong}</tt>, 
<tt>\p{Hmnp}</tt> +- <tt>\p{Ogham}</tt>, <tt>\p{Ogam}</tt> +- <tt>\p{Ol_Chiki}</tt>, <tt>\p{Olck}</tt> +- <tt>\p{Ol_Onal}</tt>, <tt>\p{Onao}</tt> +- <tt>\p{Old_Hungarian}</tt>, <tt>\p{Hung}</tt> +- <tt>\p{Old_Italic}</tt>, <tt>\p{Ital}</tt> +- <tt>\p{Old_North_Arabian}</tt>, <tt>\p{Narb}</tt> +- <tt>\p{Old_Permic}</tt>, <tt>\p{Perm}</tt> +- <tt>\p{Old_Persian}</tt>, <tt>\p{Xpeo}</tt> +- <tt>\p{Old_Sogdian}</tt>, <tt>\p{Sogo}</tt> +- <tt>\p{Old_South_Arabian}</tt>, <tt>\p{Sarb}</tt> +- <tt>\p{Old_Turkic}</tt>, <tt>\p{Orkh}</tt> +- <tt>\p{Old_Uyghur}</tt>, <tt>\p{Ougr}</tt> +- <tt>\p{Oriya}</tt>, <tt>\p{Orya}</tt> +- <tt>\p{Osage}</tt>, <tt>\p{Osge}</tt> +- <tt>\p{Osmanya}</tt>, <tt>\p{Osma}</tt> +- <tt>\p{Pahawh_Hmong}</tt>, <tt>\p{Hmng}</tt> +- <tt>\p{Palmyrene}</tt>, <tt>\p{Palm}</tt> +- <tt>\p{Pau_Cin_Hau}</tt>, <tt>\p{Pauc}</tt> +- <tt>\p{Phags_Pa}</tt>, <tt>\p{Phag}</tt> +- <tt>\p{Phoenician}</tt>, <tt>\p{Phnx}</tt> +- <tt>\p{Psalter_Pahlavi}</tt>, <tt>\p{Phlp}</tt> +- <tt>\p{Rejang}</tt>, <tt>\p{Rjng}</tt> +- <tt>\p{Runic}</tt>, <tt>\p{Runr}</tt> +- <tt>\p{Samaritan}</tt>, <tt>\p{Samr}</tt> +- <tt>\p{Saurashtra}</tt>, <tt>\p{Saur}</tt> +- <tt>\p{Sharada}</tt>, <tt>\p{Shrd}</tt> +- <tt>\p{Shavian}</tt>, <tt>\p{Shaw}</tt> +- <tt>\p{Siddham}</tt>, <tt>\p{Sidd}</tt> +- <tt>\p{Sidetic}</tt>, <tt>\p{Sidt}</tt> +- <tt>\p{SignWriting}</tt>, <tt>\p{Sgnw}</tt> +- <tt>\p{Sinhala}</tt>, <tt>\p{Sinh}</tt> +- <tt>\p{Sogdian}</tt>, <tt>\p{Sogd}</tt> +- <tt>\p{Sora_Sompeng}</tt>, <tt>\p{Sora}</tt> +- <tt>\p{Soyombo}</tt>, <tt>\p{Soyo}</tt> +- <tt>\p{Sundanese}</tt>, <tt>\p{Sund}</tt> +- <tt>\p{Sunuwar}</tt>, <tt>\p{Sunu}</tt> +- <tt>\p{Syloti_Nagri}</tt>, <tt>\p{Sylo}</tt> +- <tt>\p{Syriac}</tt>, <tt>\p{Syrc}</tt> +- <tt>\p{Tagalog}</tt>, <tt>\p{Tglg}</tt> +- <tt>\p{Tagbanwa}</tt>, <tt>\p{Tagb}</tt> +- <tt>\p{Tai_Le}</tt>, <tt>\p{Tale}</tt> +- <tt>\p{Tai_Tham}</tt>, <tt>\p{Lana}</tt> +- <tt>\p{Tai_Viet}</tt>, <tt>\p{Tavt}</tt> +- <tt>\p{Tai_Yo}</tt>, <tt>\p{Tayo}</tt> +- 
<tt>\p{Takri}</tt>, <tt>\p{Takr}</tt> +- <tt>\p{Tamil}</tt>, <tt>\p{Taml}</tt> +- <tt>\p{Tangsa}</tt>, <tt>\p{Tnsa}</tt> +- <tt>\p{Tangut}</tt>, <tt>\p{Tang}</tt> +- <tt>\p{Telugu}</tt>, <tt>\p{Telu}</tt> +- <tt>\p{Thaana}</tt>, <tt>\p{Thaa}</tt> +- <tt>\p{Thai}</tt> +- <tt>\p{Tibetan}</tt>, <tt>\p{Tibt}</tt> +- <tt>\p{Tifinagh}</tt>, <tt>\p{Tfng}</tt> +- <tt>\p{Tirhuta}</tt>, <tt>\p{Tirh}</tt> +- <tt>\p{Todhri}</tt>, <tt>\p{Todr}</tt> +- <tt>\p{Tolong_Siki}</tt>, <tt>\p{Tols}</tt> +- <tt>\p{Toto}</tt> +- <tt>\p{Tulu_Tigalari}</tt>, <tt>\p{Tutg}</tt> +- <tt>\p{Ugaritic}</tt>, <tt>\p{Ugar}</tt> +- <tt>\p{Unknown}</tt>, <tt>\p{Zzzz}</tt> +- <tt>\p{Vai}</tt>, <tt>\p{Vaii}</tt> +- <tt>\p{Vithkuqi}</tt>, <tt>\p{Vith}</tt> +- <tt>\p{Wancho}</tt>, <tt>\p{Wcho}</tt> +- <tt>\p{Warang_Citi}</tt>, <tt>\p{Wara}</tt> +- <tt>\p{Yezidi}</tt>, <tt>\p{Yezi}</tt> +- <tt>\p{Yi}</tt>, <tt>\p{Yiii}</tt> +- <tt>\p{Zanabazar_Square}</tt>, <tt>\p{Zanb}</tt> + +=== Blocks + +- <tt>\p{In_Adlam}</tt> +- <tt>\p{In_Aegean_Numbers}</tt> +- <tt>\p{In_Ahom}</tt> +- <tt>\p{In_Alchemical_Symbols}</tt> +- <tt>\p{In_Alphabetic_Presentation_Forms}</tt> +- <tt>\p{In_Anatolian_Hieroglyphs}</tt> +- <tt>\p{In_Ancient_Greek_Musical_Notation}</tt> +- <tt>\p{In_Ancient_Greek_Numbers}</tt> +- <tt>\p{In_Ancient_Symbols}</tt> +- <tt>\p{In_Arabic}</tt> +- <tt>\p{In_Arabic_Extended_A}</tt> +- <tt>\p{In_Arabic_Extended_B}</tt> +- <tt>\p{In_Arabic_Extended_C}</tt> +- <tt>\p{In_Arabic_Mathematical_Alphabetic_Symbols}</tt> +- <tt>\p{In_Arabic_Presentation_Forms_A}</tt> +- <tt>\p{In_Arabic_Presentation_Forms_B}</tt> +- <tt>\p{In_Arabic_Supplement}</tt> +- <tt>\p{In_Armenian}</tt> +- <tt>\p{In_Arrows}</tt> +- <tt>\p{In_Avestan}</tt> +- <tt>\p{In_Balinese}</tt> +- <tt>\p{In_Bamum}</tt> +- <tt>\p{In_Bamum_Supplement}</tt> +- <tt>\p{In_Basic_Latin}</tt> +- <tt>\p{In_Bassa_Vah}</tt> +- <tt>\p{In_Batak}</tt> +- <tt>\p{In_Bengali}</tt> +- <tt>\p{In_Beria_Erfe}</tt> +- <tt>\p{In_Bhaiksuki}</tt> +- 
<tt>\p{In_Block_Elements}</tt> +- <tt>\p{In_Bopomofo}</tt> +- <tt>\p{In_Bopomofo_Extended}</tt> +- <tt>\p{In_Box_Drawing}</tt> +- <tt>\p{In_Brahmi}</tt> +- <tt>\p{In_Braille_Patterns}</tt> +- <tt>\p{In_Buginese}</tt> +- <tt>\p{In_Buhid}</tt> +- <tt>\p{In_Byzantine_Musical_Symbols}</tt> +- <tt>\p{In_CJK_Compatibility}</tt> +- <tt>\p{In_CJK_Compatibility_Forms}</tt> +- <tt>\p{In_CJK_Compatibility_Ideographs}</tt> +- <tt>\p{In_CJK_Compatibility_Ideographs_Supplement}</tt> +- <tt>\p{In_CJK_Radicals_Supplement}</tt> +- <tt>\p{In_CJK_Strokes}</tt> +- <tt>\p{In_CJK_Symbols_and_Punctuation}</tt> +- <tt>\p{In_CJK_Unified_Ideographs}</tt> +- <tt>\p{In_CJK_Unified_Ideographs_Extension_A}</tt> +- <tt>\p{In_CJK_Unified_Ideographs_Extension_B}</tt> +- <tt>\p{In_CJK_Unified_Ideographs_Extension_C}</tt> +- <tt>\p{In_CJK_Unified_Ideographs_Extension_D}</tt> +- <tt>\p{In_CJK_Unified_Ideographs_Extension_E}</tt> +- <tt>\p{In_CJK_Unified_Ideographs_Extension_F}</tt> +- <tt>\p{In_CJK_Unified_Ideographs_Extension_G}</tt> +- <tt>\p{In_CJK_Unified_Ideographs_Extension_H}</tt> +- <tt>\p{In_CJK_Unified_Ideographs_Extension_I}</tt> +- <tt>\p{In_CJK_Unified_Ideographs_Extension_J}</tt> +- <tt>\p{In_Carian}</tt> +- <tt>\p{In_Caucasian_Albanian}</tt> +- <tt>\p{In_Chakma}</tt> +- <tt>\p{In_Cham}</tt> +- <tt>\p{In_Cherokee}</tt> +- <tt>\p{In_Cherokee_Supplement}</tt> +- <tt>\p{In_Chess_Symbols}</tt> +- <tt>\p{In_Chorasmian}</tt> +- <tt>\p{In_Combining_Diacritical_Marks}</tt> +- <tt>\p{In_Combining_Diacritical_Marks_Extended}</tt> +- <tt>\p{In_Combining_Diacritical_Marks_Supplement}</tt> +- <tt>\p{In_Combining_Diacritical_Marks_for_Symbols}</tt> +- <tt>\p{In_Combining_Half_Marks}</tt> +- <tt>\p{In_Common_Indic_Number_Forms}</tt> +- <tt>\p{In_Control_Pictures}</tt> +- <tt>\p{In_Coptic}</tt> +- <tt>\p{In_Coptic_Epact_Numbers}</tt> +- <tt>\p{In_Counting_Rod_Numerals}</tt> +- <tt>\p{In_Cuneiform}</tt> +- <tt>\p{In_Cuneiform_Numbers_and_Punctuation}</tt> +- <tt>\p{In_Currency_Symbols}</tt> +- 
<tt>\p{In_Cypriot_Syllabary}</tt> +- <tt>\p{In_Cypro_Minoan}</tt> +- <tt>\p{In_Cyrillic}</tt> +- <tt>\p{In_Cyrillic_Extended_A}</tt> +- <tt>\p{In_Cyrillic_Extended_B}</tt> +- <tt>\p{In_Cyrillic_Extended_C}</tt> +- <tt>\p{In_Cyrillic_Extended_D}</tt> +- <tt>\p{In_Cyrillic_Supplement}</tt> +- <tt>\p{In_Deseret}</tt> +- <tt>\p{In_Devanagari}</tt> +- <tt>\p{In_Devanagari_Extended}</tt> +- <tt>\p{In_Devanagari_Extended_A}</tt> +- <tt>\p{In_Dingbats}</tt> +- <tt>\p{In_Dives_Akuru}</tt> +- <tt>\p{In_Dogra}</tt> +- <tt>\p{In_Domino_Tiles}</tt> +- <tt>\p{In_Duployan}</tt> +- <tt>\p{In_Early_Dynastic_Cuneiform}</tt> +- <tt>\p{In_Egyptian_Hieroglyph_Format_Controls}</tt> +- <tt>\p{In_Egyptian_Hieroglyphs}</tt> +- <tt>\p{In_Egyptian_Hieroglyphs_Extended_A}</tt> +- <tt>\p{In_Elbasan}</tt> +- <tt>\p{In_Elymaic}</tt> +- <tt>\p{In_Emoticons}</tt> +- <tt>\p{In_Enclosed_Alphanumeric_Supplement}</tt> +- <tt>\p{In_Enclosed_Alphanumerics}</tt> +- <tt>\p{In_Enclosed_CJK_Letters_and_Months}</tt> +- <tt>\p{In_Enclosed_Ideographic_Supplement}</tt> +- <tt>\p{In_Ethiopic}</tt> +- <tt>\p{In_Ethiopic_Extended}</tt> +- <tt>\p{In_Ethiopic_Extended_A}</tt> +- <tt>\p{In_Ethiopic_Extended_B}</tt> +- <tt>\p{In_Ethiopic_Supplement}</tt> +- <tt>\p{In_Garay}</tt> +- <tt>\p{In_General_Punctuation}</tt> +- <tt>\p{In_Geometric_Shapes}</tt> +- <tt>\p{In_Geometric_Shapes_Extended}</tt> +- <tt>\p{In_Georgian}</tt> +- <tt>\p{In_Georgian_Extended}</tt> +- <tt>\p{In_Georgian_Supplement}</tt> +- <tt>\p{In_Glagolitic}</tt> +- <tt>\p{In_Glagolitic_Supplement}</tt> +- <tt>\p{In_Gothic}</tt> +- <tt>\p{In_Grantha}</tt> +- <tt>\p{In_Greek_Extended}</tt> +- <tt>\p{In_Greek_and_Coptic}</tt> +- <tt>\p{In_Gujarati}</tt> +- <tt>\p{In_Gunjala_Gondi}</tt> +- <tt>\p{In_Gurmukhi}</tt> +- <tt>\p{In_Gurung_Khema}</tt> +- <tt>\p{In_Halfwidth_and_Fullwidth_Forms}</tt> +- <tt>\p{In_Hangul_Compatibility_Jamo}</tt> +- <tt>\p{In_Hangul_Jamo}</tt> +- <tt>\p{In_Hangul_Jamo_Extended_A}</tt> +- <tt>\p{In_Hangul_Jamo_Extended_B}</tt> +- 
<tt>\p{In_Hangul_Syllables}</tt> +- <tt>\p{In_Hanifi_Rohingya}</tt> +- <tt>\p{In_Hanunoo}</tt> +- <tt>\p{In_Hatran}</tt> +- <tt>\p{In_Hebrew}</tt> +- <tt>\p{In_High_Private_Use_Surrogates}</tt> +- <tt>\p{In_High_Surrogates}</tt> +- <tt>\p{In_Hiragana}</tt> +- <tt>\p{In_IPA_Extensions}</tt> +- <tt>\p{In_Ideographic_Description_Characters}</tt> +- <tt>\p{In_Ideographic_Symbols_and_Punctuation}</tt> +- <tt>\p{In_Imperial_Aramaic}</tt> +- <tt>\p{In_Indic_Siyaq_Numbers}</tt> +- <tt>\p{In_Inscriptional_Pahlavi}</tt> +- <tt>\p{In_Inscriptional_Parthian}</tt> +- <tt>\p{In_Javanese}</tt> +- <tt>\p{In_Kaithi}</tt> +- <tt>\p{In_Kaktovik_Numerals}</tt> +- <tt>\p{In_Kana_Extended_A}</tt> +- <tt>\p{In_Kana_Extended_B}</tt> +- <tt>\p{In_Kana_Supplement}</tt> +- <tt>\p{In_Kanbun}</tt> +- <tt>\p{In_Kangxi_Radicals}</tt> +- <tt>\p{In_Kannada}</tt> +- <tt>\p{In_Katakana}</tt> +- <tt>\p{In_Katakana_Phonetic_Extensions}</tt> +- <tt>\p{In_Kawi}</tt> +- <tt>\p{In_Kayah_Li}</tt> +- <tt>\p{In_Kharoshthi}</tt> +- <tt>\p{In_Khitan_Small_Script}</tt> +- <tt>\p{In_Khmer}</tt> +- <tt>\p{In_Khmer_Symbols}</tt> +- <tt>\p{In_Khojki}</tt> +- <tt>\p{In_Khudawadi}</tt> +- <tt>\p{In_Kirat_Rai}</tt> +- <tt>\p{In_Lao}</tt> +- <tt>\p{In_Latin_1_Supplement}</tt> +- <tt>\p{In_Latin_Extended_A}</tt> +- <tt>\p{In_Latin_Extended_Additional}</tt> +- <tt>\p{In_Latin_Extended_B}</tt> +- <tt>\p{In_Latin_Extended_C}</tt> +- <tt>\p{In_Latin_Extended_D}</tt> +- <tt>\p{In_Latin_Extended_E}</tt> +- <tt>\p{In_Latin_Extended_F}</tt> +- <tt>\p{In_Latin_Extended_G}</tt> +- <tt>\p{In_Lepcha}</tt> +- <tt>\p{In_Letterlike_Symbols}</tt> +- <tt>\p{In_Limbu}</tt> +- <tt>\p{In_Linear_A}</tt> +- <tt>\p{In_Linear_B_Ideograms}</tt> +- <tt>\p{In_Linear_B_Syllabary}</tt> +- <tt>\p{In_Lisu}</tt> +- <tt>\p{In_Lisu_Supplement}</tt> +- <tt>\p{In_Low_Surrogates}</tt> +- <tt>\p{In_Lycian}</tt> +- <tt>\p{In_Lydian}</tt> +- <tt>\p{In_Mahajani}</tt> +- <tt>\p{In_Mahjong_Tiles}</tt> +- <tt>\p{In_Makasar}</tt> +- <tt>\p{In_Malayalam}</tt> +- 
<tt>\p{In_Mandaic}</tt> +- <tt>\p{In_Manichaean}</tt> +- <tt>\p{In_Marchen}</tt> +- <tt>\p{In_Masaram_Gondi}</tt> +- <tt>\p{In_Mathematical_Alphanumeric_Symbols}</tt> +- <tt>\p{In_Mathematical_Operators}</tt> +- <tt>\p{In_Mayan_Numerals}</tt> +- <tt>\p{In_Medefaidrin}</tt> +- <tt>\p{In_Meetei_Mayek}</tt> +- <tt>\p{In_Meetei_Mayek_Extensions}</tt> +- <tt>\p{In_Mende_Kikakui}</tt> +- <tt>\p{In_Meroitic_Cursive}</tt> +- <tt>\p{In_Meroitic_Hieroglyphs}</tt> +- <tt>\p{In_Miao}</tt> +- <tt>\p{In_Miscellaneous_Mathematical_Symbols_A}</tt> +- <tt>\p{In_Miscellaneous_Mathematical_Symbols_B}</tt> +- <tt>\p{In_Miscellaneous_Symbols}</tt> +- <tt>\p{In_Miscellaneous_Symbols_Supplement}</tt> +- <tt>\p{In_Miscellaneous_Symbols_and_Arrows}</tt> +- <tt>\p{In_Miscellaneous_Symbols_and_Pictographs}</tt> +- <tt>\p{In_Miscellaneous_Technical}</tt> +- <tt>\p{In_Modi}</tt> +- <tt>\p{In_Modifier_Tone_Letters}</tt> +- <tt>\p{In_Mongolian}</tt> +- <tt>\p{In_Mongolian_Supplement}</tt> +- <tt>\p{In_Mro}</tt> +- <tt>\p{In_Multani}</tt> +- <tt>\p{In_Musical_Symbols}</tt> +- <tt>\p{In_Myanmar}</tt> +- <tt>\p{In_Myanmar_Extended_A}</tt> +- <tt>\p{In_Myanmar_Extended_B}</tt> +- <tt>\p{In_Myanmar_Extended_C}</tt> +- <tt>\p{In_NKo}</tt> +- <tt>\p{In_Nabataean}</tt> +- <tt>\p{In_Nag_Mundari}</tt> +- <tt>\p{In_Nandinagari}</tt> +- <tt>\p{In_New_Tai_Lue}</tt> +- <tt>\p{In_Newa}</tt> +- <tt>\p{In_No_Block}</tt> +- <tt>\p{In_Number_Forms}</tt> +- <tt>\p{In_Nushu}</tt> +- <tt>\p{In_Nyiakeng_Puachue_Hmong}</tt> +- <tt>\p{In_Ogham}</tt> +- <tt>\p{In_Ol_Chiki}</tt> +- <tt>\p{In_Ol_Onal}</tt> +- <tt>\p{In_Old_Hungarian}</tt> +- <tt>\p{In_Old_Italic}</tt> +- <tt>\p{In_Old_North_Arabian}</tt> +- <tt>\p{In_Old_Permic}</tt> +- <tt>\p{In_Old_Persian}</tt> +- <tt>\p{In_Old_Sogdian}</tt> +- <tt>\p{In_Old_South_Arabian}</tt> +- <tt>\p{In_Old_Turkic}</tt> +- <tt>\p{In_Old_Uyghur}</tt> +- <tt>\p{In_Optical_Character_Recognition}</tt> +- <tt>\p{In_Oriya}</tt> +- <tt>\p{In_Ornamental_Dingbats}</tt> +- 
<tt>\p{In_Osage}</tt> +- <tt>\p{In_Osmanya}</tt> +- <tt>\p{In_Ottoman_Siyaq_Numbers}</tt> +- <tt>\p{In_Pahawh_Hmong}</tt> +- <tt>\p{In_Palmyrene}</tt> +- <tt>\p{In_Pau_Cin_Hau}</tt> +- <tt>\p{In_Phags_pa}</tt> +- <tt>\p{In_Phaistos_Disc}</tt> +- <tt>\p{In_Phoenician}</tt> +- <tt>\p{In_Phonetic_Extensions}</tt> +- <tt>\p{In_Phonetic_Extensions_Supplement}</tt> +- <tt>\p{In_Playing_Cards}</tt> +- <tt>\p{In_Private_Use_Area}</tt> +- <tt>\p{In_Psalter_Pahlavi}</tt> +- <tt>\p{In_Rejang}</tt> +- <tt>\p{In_Rumi_Numeral_Symbols}</tt> +- <tt>\p{In_Runic}</tt> +- <tt>\p{In_Samaritan}</tt> +- <tt>\p{In_Saurashtra}</tt> +- <tt>\p{In_Sharada}</tt> +- <tt>\p{In_Sharada_Supplement}</tt> +- <tt>\p{In_Shavian}</tt> +- <tt>\p{In_Shorthand_Format_Controls}</tt> +- <tt>\p{In_Siddham}</tt> +- <tt>\p{In_Sidetic}</tt> +- <tt>\p{In_Sinhala}</tt> +- <tt>\p{In_Sinhala_Archaic_Numbers}</tt> +- <tt>\p{In_Small_Form_Variants}</tt> +- <tt>\p{In_Small_Kana_Extension}</tt> +- <tt>\p{In_Sogdian}</tt> +- <tt>\p{In_Sora_Sompeng}</tt> +- <tt>\p{In_Soyombo}</tt> +- <tt>\p{In_Spacing_Modifier_Letters}</tt> +- <tt>\p{In_Specials}</tt> +- <tt>\p{In_Sundanese}</tt> +- <tt>\p{In_Sundanese_Supplement}</tt> +- <tt>\p{In_Sunuwar}</tt> +- <tt>\p{In_Superscripts_and_Subscripts}</tt> +- <tt>\p{In_Supplemental_Arrows_A}</tt> +- <tt>\p{In_Supplemental_Arrows_B}</tt> +- <tt>\p{In_Supplemental_Arrows_C}</tt> +- <tt>\p{In_Supplemental_Mathematical_Operators}</tt> +- <tt>\p{In_Supplemental_Punctuation}</tt> +- <tt>\p{In_Supplemental_Symbols_and_Pictographs}</tt> +- <tt>\p{In_Supplementary_Private_Use_Area_A}</tt> +- <tt>\p{In_Supplementary_Private_Use_Area_B}</tt> +- <tt>\p{In_Sutton_SignWriting}</tt> +- <tt>\p{In_Syloti_Nagri}</tt> +- <tt>\p{In_Symbols_and_Pictographs_Extended_A}</tt> +- <tt>\p{In_Symbols_for_Legacy_Computing}</tt> +- <tt>\p{In_Symbols_for_Legacy_Computing_Supplement}</tt> +- <tt>\p{In_Syriac}</tt> +- <tt>\p{In_Syriac_Supplement}</tt> +- <tt>\p{In_Tagalog}</tt> +- <tt>\p{In_Tagbanwa}</tt> +- 
<tt>\p{In_Tags}</tt> +- <tt>\p{In_Tai_Le}</tt> +- <tt>\p{In_Tai_Tham}</tt> +- <tt>\p{In_Tai_Viet}</tt> +- <tt>\p{In_Tai_Xuan_Jing_Symbols}</tt> +- <tt>\p{In_Tai_Yo}</tt> +- <tt>\p{In_Takri}</tt> +- <tt>\p{In_Tamil}</tt> +- <tt>\p{In_Tamil_Supplement}</tt> +- <tt>\p{In_Tangsa}</tt> +- <tt>\p{In_Tangut}</tt> +- <tt>\p{In_Tangut_Components}</tt> +- <tt>\p{In_Tangut_Components_Supplement}</tt> +- <tt>\p{In_Tangut_Supplement}</tt> +- <tt>\p{In_Telugu}</tt> +- <tt>\p{In_Thaana}</tt> +- <tt>\p{In_Thai}</tt> +- <tt>\p{In_Tibetan}</tt> +- <tt>\p{In_Tifinagh}</tt> +- <tt>\p{In_Tirhuta}</tt> +- <tt>\p{In_Todhri}</tt> +- <tt>\p{In_Tolong_Siki}</tt> +- <tt>\p{In_Toto}</tt> +- <tt>\p{In_Transport_and_Map_Symbols}</tt> +- <tt>\p{In_Tulu_Tigalari}</tt> +- <tt>\p{In_Ugaritic}</tt> +- <tt>\p{In_Unified_Canadian_Aboriginal_Syllabics}</tt> +- <tt>\p{In_Unified_Canadian_Aboriginal_Syllabics_Extended}</tt> +- <tt>\p{In_Unified_Canadian_Aboriginal_Syllabics_Extended_A}</tt> +- <tt>\p{In_Vai}</tt> +- <tt>\p{In_Variation_Selectors}</tt> +- <tt>\p{In_Variation_Selectors_Supplement}</tt> +- <tt>\p{In_Vedic_Extensions}</tt> +- <tt>\p{In_Vertical_Forms}</tt> +- <tt>\p{In_Vithkuqi}</tt> +- <tt>\p{In_Wancho}</tt> +- <tt>\p{In_Warang_Citi}</tt> +- <tt>\p{In_Yezidi}</tt> +- <tt>\p{In_Yi_Radicals}</tt> +- <tt>\p{In_Yi_Syllables}</tt> +- <tt>\p{In_Yijing_Hexagram_Symbols}</tt> +- <tt>\p{In_Zanabazar_Square}</tt> +- <tt>\p{In_Znamenny_Musical_Notation}</tt> + +=== Emoji + +- <tt>\p{Emoji}</tt> +- <tt>\p{Emoji_Component}</tt>, <tt>\p{EComp}</tt> +- <tt>\p{Emoji_Modifier}</tt>, <tt>\p{EMod}</tt> +- <tt>\p{Emoji_Modifier_Base}</tt>, <tt>\p{EBase}</tt> +- <tt>\p{Emoji_Presentation}</tt>, <tt>\p{EPres}</tt> +- <tt>\p{Extended_Pictographic}</tt>, <tt>\p{ExtPict}</tt> + +=== Graphemes + +- <tt>\p{Grapheme_Cluster_Break_CR}</tt> +- <tt>\p{Grapheme_Cluster_Break_Control}</tt> +- <tt>\p{Grapheme_Cluster_Break_Extend}</tt> +- <tt>\p{Grapheme_Cluster_Break_L}</tt> +- <tt>\p{Grapheme_Cluster_Break_LF}</tt> +- 
<tt>\p{Grapheme_Cluster_Break_LV}</tt> +- <tt>\p{Grapheme_Cluster_Break_LVT}</tt> +- <tt>\p{Grapheme_Cluster_Break_Prepend}</tt> +- <tt>\p{Grapheme_Cluster_Break_Regional_Indicator}</tt> +- <tt>\p{Grapheme_Cluster_Break_SpacingMark}</tt> +- <tt>\p{Grapheme_Cluster_Break_T}</tt> +- <tt>\p{Grapheme_Cluster_Break_V}</tt> +- <tt>\p{Grapheme_Cluster_Break_ZWJ}</tt> + +=== Derived Ages + +- <tt>\p{Age_10_0}</tt> +- <tt>\p{Age_11_0}</tt> +- <tt>\p{Age_12_0}</tt> +- <tt>\p{Age_12_1}</tt> +- <tt>\p{Age_13_0}</tt> +- <tt>\p{Age_14_0}</tt> +- <tt>\p{Age_15_0}</tt> +- <tt>\p{Age_15_1}</tt> +- <tt>\p{Age_16_0}</tt> +- <tt>\p{Age_17_0}</tt> +- <tt>\p{Age_1_1}</tt> +- <tt>\p{Age_2_0}</tt> +- <tt>\p{Age_2_1}</tt> +- <tt>\p{Age_3_0}</tt> +- <tt>\p{Age_3_1}</tt> +- <tt>\p{Age_3_2}</tt> +- <tt>\p{Age_4_0}</tt> +- <tt>\p{Age_4_1}</tt> +- <tt>\p{Age_5_0}</tt> +- <tt>\p{Age_5_1}</tt> +- <tt>\p{Age_5_2}</tt> +- <tt>\p{Age_6_0}</tt> +- <tt>\p{Age_6_1}</tt> +- <tt>\p{Age_6_2}</tt> +- <tt>\p{Age_6_3}</tt> +- <tt>\p{Age_7_0}</tt> +- <tt>\p{Age_8_0}</tt> +- <tt>\p{Age_9_0}</tt> diff --git a/doc/signals.rdoc b/doc/language/signals.rdoc index 403eb66549..a82dab81c6 100644 --- a/doc/signals.rdoc +++ b/doc/language/signals.rdoc @@ -17,7 +17,7 @@ for its internal data structures, but it does not know when it is safe for data structures in YOUR code. Ruby implements deferred signal handling by registering short C functions with only {async-signal-safe functions}[http://man7.org/linux/man-pages/man7/signal-safety.7.html] as -signal handlers. These short C functions only do enough tell the VM to +signal handlers. These short C functions only do enough to tell the VM to run callbacks registered via Signal.trap later in the main Ruby Thread. 
== Unsafe methods to call in Signal.trap blocks diff --git a/doc/language/strftime_formatting.rdoc b/doc/language/strftime_formatting.rdoc new file mode 100644 index 0000000000..2bfa6b975e --- /dev/null +++ b/doc/language/strftime_formatting.rdoc @@ -0,0 +1,525 @@ += Formats for Dates and Times + +Several Ruby time-related classes have instance method +strftime+, +which returns a formatted string representing all or part of a date or time: + +- Date#strftime. +- DateTime#strftime. +- Time#strftime. + +Each of these methods takes optional argument +format+, +which has zero or more embedded _format_ _specifications_ (see below). + +Each of these methods returns the string resulting from replacing each +format specification embedded in +format+ with a string form +of one or more parts of the date or time. + +A simple example: + + Time.now.strftime('%H:%M:%S') # => "14:02:07" + +A format specification has the form: + + %[flags][width]conversion + +It consists of: + +- A leading percent character. +- Zero or more _flags_ (each is a character). +- An optional _width_ _specifier_ (an integer). +- A _conversion_ _specifier_ (a character). + +Except for the leading percent character, +the only required part is the conversion specifier, so we begin with that. + +== Conversion Specifiers + +=== \Date (Year, Month, Day) + +- <tt>%Y</tt> - Year including century, zero-padded: + + Time.now.strftime('%Y') # => "2022" + Time.new(-1000).strftime('%Y') # => "-1000" # Before common era. + Time.new(10000).strftime('%Y') # => "10000" # Far future. + Time.new(10).strftime('%Y') # => "0010" # Zero-padded by default. + +- <tt>%y</tt> - Year without century, in range (0..99), zero-padded: + + Time.now.strftime('%y') # => "22" + Time.new(1).strftime('%y') # => "01" # Zero-padded by default. + +- <tt>%C</tt> - Century, zero-padded: + + Time.now.strftime('%C') # => "20" + Time.new(-1000).strftime('%C') # => "-10" # Before common era. + Time.new(10000).strftime('%C') # => "100" # Far future. 
+ Time.new(100).strftime('%C') # => "01" # Zero-padded by default. + +- <tt>%m</tt> - Month of the year, in range (1..12), zero-padded: + + Time.new(2022, 1).strftime('%m') # => "01" # Zero-padded by default. + Time.new(2022, 12).strftime('%m') # => "12" + +- <tt>%B</tt> - Full month name, capitalized: + + Time.new(2022, 1).strftime('%B') # => "January" + Time.new(2022, 12).strftime('%B') # => "December" + +- <tt>%b</tt> - Abbreviated month name, capitalized: + + Time.new(2022, 1).strftime('%b') # => "Jan" + Time.new(2022, 12).strftime('%h') # => "Dec" + +- <tt>%h</tt> - Same as <tt>%b</tt>. + +- <tt>%d</tt> - Day of the month, in range (1..31), zero-padded: + + Time.new(2002, 1, 1).strftime('%d') # => "01" + Time.new(2002, 1, 31).strftime('%d') # => "31" + +- <tt>%e</tt> - Day of the month, in range (1..31), blank-padded: + + Time.new(2002, 1, 1).strftime('%e') # => " 1" + Time.new(2002, 1, 31).strftime('%e') # => "31" + +- <tt>%j</tt> - Day of the year, in range (1..366), zero-padded: + + Time.new(2002, 1, 1).strftime('%j') # => "001" + Time.new(2002, 12, 31).strftime('%j') # => "365" + +=== \Time (Hour, Minute, Second, Subsecond) + +- <tt>%H</tt> - Hour of the day, in range (0..23), zero-padded: + + Time.new(2022, 1, 1, 1).strftime('%H') # => "01" + Time.new(2022, 1, 1, 13).strftime('%H') # => "13" + +- <tt>%k</tt> - Hour of the day, in range (0..23), blank-padded: + + Time.new(2022, 1, 1, 1).strftime('%k') # => " 1" + Time.new(2022, 1, 1, 13).strftime('%k') # => "13" + +- <tt>%I</tt> - Hour of the day, in range (1..12), zero-padded: + + Time.new(2022, 1, 1, 1).strftime('%I') # => "01" + Time.new(2022, 1, 1, 13).strftime('%I') # => "01" + +- <tt>%l</tt> - Hour of the day, in range (1..12), blank-padded: + + Time.new(2022, 1, 1, 1).strftime('%l') # => " 1" + Time.new(2022, 1, 1, 13).strftime('%l') # => " 1" + +- <tt>%P</tt> - Meridian indicator, lowercase: + + Time.new(2022, 1, 1, 1).strftime('%P') # => "am" + Time.new(2022, 1, 1, 13).strftime('%P') # => "pm" + 
+- <tt>%p</tt> - Meridian indicator, uppercase: + + Time.new(2022, 1, 1, 1).strftime('%p') # => "AM" + Time.new(2022, 1, 1, 13).strftime('%p') # => "PM" + +- <tt>%M</tt> - Minute of the hour, in range (0..59), zero-padded: + + Time.new(2022, 1, 1, 1, 0, 0).strftime('%M') # => "00" + +- <tt>%S</tt> - Second of the minute, in range (0..59), zero-padded: + + Time.new(2022, 1, 1, 1, 0, 0, 0).strftime('%S') # => "00" + +- <tt>%L</tt> - Millisecond of the second, in range (0..999), zero-padded: + + Time.new(2022, 1, 1, 1, 0, 0, 0).strftime('%L') # => "000" + +- <tt>%N</tt> - Fractional seconds, default width is 9 digits (nanoseconds): + + t = Time.now # => 2022-06-29 07:10:20.3230914 -0500 + t.strftime('%N') # => "323091400" # Default. + + Use {width specifiers}[rdoc-ref:@Width+Specifiers] + to adjust units: + + t.strftime('%3N') # => "323" # Milliseconds. + t.strftime('%6N') # => "323091" # Microseconds. + t.strftime('%9N') # => "323091400" # Nanoseconds. + t.strftime('%12N') # => "323091400000" # Picoseconds. + t.strftime('%15N') # => "323091400000000" # Femtoseconds. + t.strftime('%18N') # => "323091400000000000" # Attoseconds. + t.strftime('%21N') # => "323091400000000000000" # Zeptoseconds. + t.strftime('%24N') # => "323091400000000000000000" # Yoctoseconds. 
+ +- <tt>%s</tt> - Number of seconds since the epoch: + + Time.now.strftime('%s') # => "1656505136" + +=== Timezone + +- <tt>%z</tt> - Timezone as hour and minute offset from UTC: + + Time.now.strftime('%z') # => "-0500" + +- <tt>%Z</tt> - Timezone name (platform-dependent): + + Time.now.strftime('%Z') # => "Central Daylight Time" + +=== Weekday + +- <tt>%A</tt> - Full weekday name: + + Time.now.strftime('%A') # => "Wednesday" + +- <tt>%a</tt> - Abbreviated weekday name: + + Time.now.strftime('%a') # => "Wed" + +- <tt>%u</tt> - Day of the week, in range (1..7), Monday is 1: + + t = Time.new(2022, 6, 26) # => 2022-06-26 00:00:00 -0500 + t.strftime('%a') # => "Sun" + t.strftime('%u') # => "7" + +- <tt>%w</tt> - Day of the week, in range (0..6), Sunday is 0: + + t = Time.new(2022, 6, 26) # => 2022-06-26 00:00:00 -0500 + t.strftime('%a') # => "Sun" + t.strftime('%w') # => "0" + +=== Week Number + +- <tt>%U</tt> - Week number of the year, in range (0..53), zero-padded, + where each week begins on a Sunday: + + t = Time.new(2022, 6, 26) # => 2022-06-26 00:00:00 -0500 + t.strftime('%a') # => "Sun" + t.strftime('%U') # => "26" + +- <tt>%W</tt> - Week number of the year, in range (0..53), zero-padded, + where each week begins on a Monday: + + t = Time.new(2022, 6, 26) # => 2022-06-26 00:00:00 -0500 + t.strftime('%a') # => "Sun" + t.strftime('%W') # => "25" + +=== Week Dates + +See {ISO 8601 week dates}[https://en.wikipedia.org/wiki/ISO_8601#Week_dates]. 
+ + t0 = Time.new(2023, 1, 1) # => 2023-01-01 00:00:00 -0600 + t1 = Time.new(2024, 1, 1) # => 2024-01-01 00:00:00 -0600 + +- <tt>%G</tt> - Week-based year: + + t0.strftime('%G') # => "2022" + t1.strftime('%G') # => "2024" + +- <tt>%g</tt> - Week-based year without century, in range (0..99), zero-padded: + + t0.strftime('%g') # => "22" + t1.strftime('%g') # => "24" + +- <tt>%V</tt> - Week number of the week-based year, in range (1..53), + zero-padded: + + t0.strftime('%V') # => "52" + t1.strftime('%V') # => "01" + +=== Literals + +- <tt>%n</tt> - Newline character "\n": + + Time.now.strftime('%n') # => "\n" + +- <tt>%t</tt> - Tab character "\t": + + Time.now.strftime('%t') # => "\t" + +- <tt>%%</tt> - Percent character '%': + + Time.now.strftime('%%') # => "%" + +=== Shorthand Conversion Specifiers + +Each shorthand specifier here is shown with its corresponding +longhand specifier. + +- <tt>%c</tt> - \Date and time: + + Time.now.strftime('%c') # => "Wed Jun 29 08:01:41 2022" + Time.now.strftime('%a %b %e %T %Y') # => "Wed Jun 29 08:02:07 2022" + +- <tt>%D</tt> - \Date: + + Time.now.strftime('%D') # => "06/29/22" + Time.now.strftime('%m/%d/%y') # => "06/29/22" + +- <tt>%F</tt> - ISO 8601 date: + + Time.now.strftime('%F') # => "2022-06-29" + Time.now.strftime('%Y-%m-%d') # => "2022-06-29" + +- <tt>%v</tt> - VMS date: + + Time.now.strftime('%v') # => "29-JUN-2022" + Time.now.strftime('%e-%^b-%4Y') # => "29-JUN-2022" + +- <tt>%x</tt> - Same as <tt>%D</tt>. + +- <tt>%X</tt> - Same as <tt>%T</tt>. 
+ +- <tt>%r</tt> - 12-hour time: + + Time.new(2022, 1, 1, 1).strftime('%r') # => "01:00:00 AM" + Time.new(2022, 1, 1, 1).strftime('%I:%M:%S %p') # => "01:00:00 AM" + Time.new(2022, 1, 1, 13).strftime('%r') # => "01:00:00 PM" + Time.new(2022, 1, 1, 13).strftime('%I:%M:%S %p') # => "01:00:00 PM" + +- <tt>%R</tt> - 24-hour time: + + Time.new(2022, 1, 1, 1).strftime('%R') # => "01:00" + Time.new(2022, 1, 1, 1).strftime('%H:%M') # => "01:00" + Time.new(2022, 1, 1, 13).strftime('%R') # => "13:00" + Time.new(2022, 1, 1, 13).strftime('%H:%M') # => "13:00" + +- <tt>%T</tt> - 24-hour time: + + Time.new(2022, 1, 1, 1).strftime('%T') # => "01:00:00" + Time.new(2022, 1, 1, 1).strftime('%H:%M:%S') # => "01:00:00" + Time.new(2022, 1, 1, 13).strftime('%T') # => "13:00:00" + Time.new(2022, 1, 1, 13).strftime('%H:%M:%S') # => "13:00:00" + +- <tt>%+</tt> (not supported in Time#strftime) - \Date and time: + + DateTime.now.strftime('%+') + # => "Wed Jun 29 08:31:53 -05:00 2022" + DateTime.now.strftime('%a %b %e %H:%M:%S %Z %Y') + # => "Wed Jun 29 08:32:18 -05:00 2022" + +== Flags + +Flags may affect certain formatting specifications. + +Multiple flags may be given with a single conversion specifier; +order does not matter. + +=== Padding Flags + +- <tt>0</tt> - Pad with zeroes: + + Time.new(10).strftime('%0Y') # => "0010" + +- <tt>_</tt> - Pad with blanks: + + Time.new(10).strftime('%_Y') # => " 10" + +- <tt>-</tt> - Don't pad: + + Time.new(10).strftime('%-Y') # => "10" + +=== Casing Flags + +- <tt>^</tt> - Upcase result: + + Time.new(2022, 1).strftime('%B') # => "January" # No casing flag. 
+ Time.new(2022, 1).strftime('%^B') # => "JANUARY" + +- <tt>#</tt> - Swapcase result: + + Time.now.strftime('%p') # => "AM" + Time.now.strftime('%^p') # => "AM" + Time.now.strftime('%#p') # => "am" + +=== Timezone Flags + +- <tt>:</tt> - Put timezone as colon-separated hours and minutes: + + Time.now.strftime('%:z') # => "-05:00" + +- <tt>::</tt> - Put timezone as colon-separated hours, minutes, and seconds: + + Time.now.strftime('%::z') # => "-05:00:00" + +== Width Specifiers + +The integer width specifier gives a minimum width for the returned string: + + Time.new(2002).strftime('%Y') # => "2002" # No width specifier. + Time.new(2002).strftime('%10Y') # => "0000002002" + Time.new(2002, 12).strftime('%B') # => "December" # No width specifier. + Time.new(2002, 12).strftime('%10B') # => " December" + Time.new(2002, 12).strftime('%3B') # => "December" # Ignored if too small. + += Specialized Format Strings + +Here are a few specialized format strings, +each based on an external standard. + +== HTTP Format + +The HTTP date format is based on +{RFC 2616}[https://www.rfc-editor.org/rfc/rfc2616], +and treats dates in the format <tt>'%a, %d %b %Y %T GMT'</tt>: + + d = Date.new(2001, 2, 3) # => #<Date: 2001-02-03> + # Return HTTP-formatted string. + httpdate = d.httpdate # => "Sat, 03 Feb 2001 00:00:00 GMT" + # Return new date parsed from HTTP-formatted string. + Date.httpdate(httpdate) # => #<Date: 2001-02-03> + # Return hash parsed from HTTP-formatted string. + Date._httpdate(httpdate) + # => {:wday=>6, :mday=>3, :mon=>2, :year=>2001, :hour=>0, :min=>0, :sec=>0, :zone=>"GMT", :offset=>0} + +== RFC 3339 Format + +The RFC 3339 date format is based on +{RFC 3339}[https://www.rfc-editor.org/rfc/rfc3339]: + + d = Date.new(2001, 2, 3) # => #<Date: 2001-02-03> + # Return 3339-formatted string. + rfc3339 = d.rfc3339 # => "2001-02-03T00:00:00+00:00" + # Return new date parsed from 3339-formatted string. 
+ Date.rfc3339(rfc3339) # => #<Date: 2001-02-03> + # Return hash parsed from 3339-formatted string. + Date._rfc3339(rfc3339) + # => {:year=>2001, :mon=>2, :mday=>3, :hour=>0, :min=>0, :sec=>0, :zone=>"+00:00", :offset=>0} + +== RFC 2822 Format + +The RFC 2822 date format is based on +{RFC 2822}[https://www.rfc-editor.org/rfc/rfc2822], +and treats dates in the format <tt>'%a, %-d %b %Y %T %z'</tt>: + + d = Date.new(2001, 2, 3) # => #<Date: 2001-02-03> + # Return 2822-formatted string. + rfc2822 = d.rfc2822 # => "Sat, 3 Feb 2001 00:00:00 +0000" + # Return new date parsed from 2822-formatted string. + Date.rfc2822(rfc2822) # => #<Date: 2001-02-03> + # Return hash parsed from 2822-formatted string. + Date._rfc2822(rfc2822) + # => {:wday=>6, :mday=>3, :mon=>2, :year=>2001, :hour=>0, :min=>0, :sec=>0, :zone=>"+0000", :offset=>0} + +== JIS X 0301 Format + +The JIS X 0301 format includes the +{Japanese era name}[https://en.wikipedia.org/wiki/Japanese_era_name], +and treats dates in the format <tt>'%Y-%m-%d'</tt> +with the first letter of the romanized era name prefixed: + + d = Date.new(2001, 2, 3) # => #<Date: 2001-02-03> + # Return 0301-formatted string. + jisx0301 = d.jisx0301 # => "H13.02.03" + # Return new date parsed from 0301-formatted string. + Date.jisx0301(jisx0301) # => #<Date: 2001-02-03> + # Return hash parsed from 0301-formatted string. + Date._jisx0301(jisx0301) # => {:year=>2001, :mon=>2, :mday=>3} + +== ISO 8601 Format Specifications + +This section shows format specifications that are compatible with +{ISO 8601}[https://en.wikipedia.org/wiki/ISO_8601]. +Details for various formats may be seen at the links. + +Examples in this section assume: + + t = Time.now # => 2022-06-29 16:49:25.465246 -0500 + +=== Dates + +See {ISO 8601 dates}[https://en.wikipedia.org/wiki/ISO_8601#Dates]. 
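As a reproducible sketch of ISO 8601 date formatting, the snippet below pins a fixed instant instead of calling Time.now. The instant, the local variable +t+, and the hash name +iso_dates+ are illustrative assumptions and are not part of the documented API. The week-date entry uses the ISO week-based year <tt>%G</tt> and ISO week number <tt>%V</tt>.

```ruby
# Fixed instant (an assumed value matching the Time.now sample shown in this section).
t = Time.new(2022, 6, 29, 16, 49, 25, '-05:00')

# Hypothetical helper hash: one ISO 8601 date representation per key.
iso_dates = {
  calendar_basic:    t.strftime('%Y%m%d'),     # YYYYMMDD
  calendar_extended: t.strftime('%Y-%m-%d'),   # YYYY-MM-DD
  ordinal_extended:  t.strftime('%Y-%j'),      # YYYY-DDD
  week_extended:     t.strftime('%G-W%V-%u'),  # YYYY-Www-D
}

iso_dates.each { |name, text| puts "#{name}: #{text}" }
```

Pinning the instant and its UTC offset keeps the output independent of the machine's clock and timezone.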
+ +- {Years}[https://en.wikipedia.org/wiki/ISO_8601#Years]: + + - Basic year (+YYYY+): + + t.strftime('%Y') # => "2022" + + - Expanded year (<tt>±YYYYY</tt>): + + t.strftime('+%5Y') # => "+02022" + t.strftime('-%5Y') # => "-02022" + +- {Calendar dates}[https://en.wikipedia.org/wiki/ISO_8601#Calendar_dates]: + + - Basic date (+YYYYMMDD+): + + t.strftime('%Y%m%d') # => "20220629" + + - Extended date (<tt>YYYY-MM-DD</tt>): + + t.strftime('%Y-%m-%d') # => "2022-06-29" + + - Reduced extended date (<tt>YYYY-MM</tt>): + + t.strftime('%Y-%m') # => "2022-06" + +- {Week dates}[https://en.wikipedia.org/wiki/ISO_8601#Week_dates]: + + - Basic date (+YYYYWww+ or +YYYYWwwD+): + + t.strftime('%GW%V') # => "2022W26" + t.strftime('%GW%V%u') # => "2022W263" + + - Extended date (<tt>YYYY-Www</tt> or <tt>YYYY-Www-D</tt>): + + t.strftime('%G-W%V') # => "2022-W26" + t.strftime('%G-W%V-%u') # => "2022-W26-3" + +- {Ordinal dates}[https://en.wikipedia.org/wiki/ISO_8601#Ordinal_dates]: + + - Basic date (+YYYYDDD+): + + t.strftime('%Y%j') # => "2022180" + + - Extended date (<tt>YYYY-DDD</tt>): + + t.strftime('%Y-%j') # => "2022-180" + +=== Times + +See {ISO 8601 times}[https://en.wikipedia.org/wiki/ISO_8601#Times]. 
+ +- Times: + + - Basic time (+Thhmmss.sss+, +Thhmmss+, +Thhmm+, or +Thh+): + + t.strftime('T%H%M%S.%L') # => "T164925.465" + t.strftime('T%H%M%S') # => "T164925" + t.strftime('T%H%M') # => "T1649" + t.strftime('T%H') # => "T16" + + - Extended time (+Thh:mm:ss.sss+, +Thh:mm:ss+, or +Thh:mm+): + + t.strftime('T%H:%M:%S.%L') # => "T16:49:25.465" + t.strftime('T%H:%M:%S') # => "T16:49:25" + t.strftime('T%H:%M') # => "T16:49" + +- {Time zone designators}[https://en.wikipedia.org/wiki/ISO_8601#Time_zone_designators]: + + - Timezone (+time+ represents a valid time, + +hh+ represents a valid 2-digit hour, + and +mm+ represents a valid 2-digit minute): + + - Basic timezone (<tt>time±hhmm</tt>, <tt>time±hh</tt>, or +timeZ+): + + t.strftime('T%H%M%S%z') # => "T164925-0500" + t.strftime('T%H%M%S%z').slice(0..-3) # => "T164925-05" + t.strftime('T%H%M%SZ') # => "T164925Z" + + - Extended timezone (<tt>time±hh:mm</tt>): + + t.strftime('T%H:%M:%S%:z') # => "T16:49:25-05:00" + + - See also: + + - {Local time (unqualified)}[https://en.wikipedia.org/wiki/ISO_8601#Local_time_(unqualified)]. + - {Coordinated Universal Time (UTC)}[https://en.wikipedia.org/wiki/ISO_8601#Coordinated_Universal_Time_(UTC)]. + - {Time offsets from UTC}[https://en.wikipedia.org/wiki/ISO_8601#Time_offsets_from_UTC]. + +=== Combined \Date and \Time + +See {ISO 8601 Combined date and time representations}[https://en.wikipedia.org/wiki/ISO_8601#Combined_date_and_time_representations]. + +An ISO 8601 combined date and time representation may be any +ISO 8601 date and any ISO 8601 time, +separated by the letter +T+. + +For the relevant +strftime+ formats, see {Dates}[rdoc-ref:@Dates] and {Times}[rdoc-ref:@Times] above. diff --git a/doc/maintainers.md b/doc/maintainers.md new file mode 100644 index 0000000000..46840343ca --- /dev/null +++ b/doc/maintainers.md @@ -0,0 +1,718 @@ +# Maintainers + +This page describes the current branch, module, library, and extension maintainers of Ruby. 
+ +## Branch Maintainers + +A branch maintainer is responsible for backporting commits into stable branches +and publishing Ruby patch releases. + +[The list of current branch maintainers is available in the wiki](https://github.com/ruby/ruby/wiki/Release-Engineering). + +## Module Maintainers + +A module maintainer is responsible for a certain part of Ruby. + +* The maintainer fixes bugs in that part. In particular, they should fix + security vulnerabilities as soon as possible. +* They handle issues related to the module on Redmine or the mailing list. +* They may be discharged by the 3 months rule [[ruby-core:25764]](https://blade.ruby-lang.org/ruby-core/25764). +* They have commit rights to Ruby's repository to modify their part in the + repository. +* They have the "developer" role on Redmine to modify issues. +* They have authority to decide the features of their part, but they should + always respect discussions on ruby-core/ruby-dev. + +A submaintainer of a module is like a maintainer, but the submaintainer does +not have authority to change or add a feature in their part. They need +consensus on ruby-core/ruby-dev before changing or adding one. Some submaintainers +have commit rights; others don't. + +No maintainer means that there is no specific maintainer for the part now. +Members of the Ruby core team can fix issues at any time, but major changes need +consensus on ruby-core/ruby-dev. + +### Language core features including security + +* Yukihiro Matsumoto ([matz]) + +### Evaluator + +* Koichi Sasada ([ko1]) + +### Core classes + +* Yukihiro Matsumoto ([matz]) + +### Standard Library Maintainers + +#### lib/mkmf.rb + +* *No maintainer* + +#### lib/rubygems.rb, lib/rubygems/* + +* Hiroshi SHIBATA ([hsbt]) +* https://github.com/ruby/rubygems + +#### lib/unicode_normalize.rb, lib/unicode_normalize/* + +* Martin J. 
Dürst ([duerst]) + +### Standard Library(Extensions) Maintainers + +#### ext/continuation + +* Koichi Sasada ([ko1]) + +#### ext/coverage + +* Yusuke Endoh ([mame]) + +#### ext/fiber + +* Koichi Sasada ([ko1]) + +#### ext/monitor + +* Koichi Sasada ([ko1]) + +#### ext/objspace + +* *No maintainer* + +#### ext/pty + +* *No maintainer* + +#### ext/ripper + +* *No maintainer* + +#### ext/socket + +* Tanaka Akira ([akr]) +* API change needs matz's approval + +#### ext/win32 + +* NAKAMURA Usaku ([unak]) + +### Default gems(Libraries) Maintainers + +#### lib/bundler.rb, lib/bundler/* + +* Hiroshi SHIBATA ([hsbt]) +* https://github.com/ruby/rubygems +* https://rubygems.org/gems/bundler + +#### lib/cgi/escape.rb + +* *No maintainer* + +#### lib/English.rb + +* *No maintainer* +* https://github.com/ruby/English +* https://rubygems.org/gems/English + +#### lib/delegate.rb + +* *No maintainer* +* https://github.com/ruby/delegate +* https://rubygems.org/gems/delegate + +#### lib/did_you_mean.rb + +* Yuki Nishijima ([yuki24]) +* https://github.com/ruby/did_you_mean +* https://rubygems.org/gems/did_you_mean + +#### ext/digest, ext/digest/* + +* Akinori MUSHA ([knu]) +* https://github.com/ruby/digest +* https://rubygems.org/gems/digest + +#### lib/erb.rb + +* Masatoshi SEKI ([seki]) +* Takashi Kokubun ([k0kubun]) +* https://github.com/ruby/erb +* https://rubygems.org/gems/erb + +#### lib/error_highlight.rb, lib/error_highlight/* + +* Yusuke Endoh ([mame]) +* https://github.com/ruby/error_highlight +* https://rubygems.org/gems/error_highlight + +#### lib/fileutils.rb + +* *No maintainer* +* https://github.com/ruby/fileutils +* https://rubygems.org/gems/fileutils + +#### lib/find.rb + +* Kazuki Tsujimoto ([k-tsj]) +* https://github.com/ruby/find +* https://rubygems.org/gems/find + +#### lib/forwardable.rb + +* Keiju ISHITSUKA ([keiju]) +* https://github.com/ruby/forwardable +* https://rubygems.org/gems/forwardable + +#### lib/ipaddr.rb + +* Akinori MUSHA ([knu]) +* 
https://github.com/ruby/ipaddr +* https://rubygems.org/gems/ipaddr + +#### lib/optparse.rb, lib/optparse/* + +* Nobuyuki Nakada ([nobu]) +* https://github.com/ruby/optparse +* https://rubygems.org/gems/optparse + +#### lib/net/http.rb, lib/net/https.rb + +* NARUSE, Yui ([nurse]) +* https://github.com/ruby/net-http +* https://rubygems.org/gems/net-http + +#### lib/net/protocol.rb + +* *No maintainer* +* https://github.com/ruby/net-protocol +* https://rubygems.org/gems/net-protocol + +#### lib/open3.rb + +* *No maintainer* +* https://github.com/ruby/open3 +* https://rubygems.org/gems/open3 + +#### lib/open-uri.rb + +* Tanaka Akira ([akr]) +* https://github.com/ruby/open-uri +* https://rubygems.org/gems/open-uri + +#### lib/pp.rb + +* Tanaka Akira ([akr]) +* https://github.com/ruby/pp +* https://rubygems.org/gems/pp + +#### lib/prettyprint.rb + +* Tanaka Akira ([akr]) +* https://github.com/ruby/prettyprint +* https://rubygems.org/gems/prettyprint + +#### lib/prism.rb + +* Kevin Newton ([kddnewton]) +* Eileen Uchitelle ([eileencodes]) +* Aaron Patterson ([tenderlove]) +* https://github.com/ruby/prism +* https://rubygems.org/gems/prism + +#### lib/resolv.rb + +* Tanaka Akira ([akr]) +* https://github.com/ruby/resolv +* https://rubygems.org/gems/resolv + +#### lib/securerandom.rb + +* Tanaka Akira ([akr]) +* https://github.com/ruby/securerandom +* https://rubygems.org/gems/securerandom + +#### lib/shellwords.rb + +* Akinori MUSHA ([knu]) +* https://github.com/ruby/shellwords +* https://rubygems.org/gems/shellwords + +#### lib/singleton.rb + +* Yukihiro Matsumoto ([matz]) +* https://github.com/ruby/singleton +* https://rubygems.org/gems/singleton + +#### lib/tempfile.rb + +* *No maintainer* +* https://github.com/ruby/tempfile +* https://rubygems.org/gems/tempfile + +#### lib/time.rb + +* Tanaka Akira ([akr]) +* https://github.com/ruby/time +* https://rubygems.org/gems/time + +#### lib/timeout.rb + +* Yukihiro Matsumoto ([matz]) +* https://github.com/ruby/timeout +* 
https://rubygems.org/gems/timeout + +#### lib/tmpdir.rb + +* *No maintainer* +* https://github.com/ruby/tmpdir +* https://rubygems.org/gems/tmpdir + +#### lib/un.rb + +* WATANABE Hirofumi ([eban]) +* https://github.com/ruby/un +* https://rubygems.org/gems/un + +#### lib/uri.rb, lib/uri/* + +* NARUSE, Yui ([nurse]) +* https://github.com/ruby/uri +* https://rubygems.org/gems/uri + +#### lib/yaml.rb, lib/yaml/* + +* Aaron Patterson ([tenderlove]) +* Hiroshi SHIBATA ([hsbt]) +* https://github.com/ruby/yaml +* https://rubygems.org/gems/yaml + +#### lib/weakref.rb + +* *No maintainer* +* https://github.com/ruby/weakref +* https://rubygems.org/gems/weakref + +### Default gems(Extensions) Maintainers + +#### ext/cgi + +* Nobuyoshi Nakada ([nobu]) + +#### ext/date + +* *No maintainer* +* https://github.com/ruby/date +* https://rubygems.org/gems/date + +#### ext/etc + +* *No maintainer* +* https://github.com/ruby/etc +* https://rubygems.org/gems/etc + +#### ext/fcntl + +* *No maintainer* +* https://github.com/ruby/fcntl +* https://rubygems.org/gems/fcntl + +#### ext/io/console + +* Nobuyuki Nakada ([nobu]) +* https://github.com/ruby/io-console +* https://rubygems.org/gems/io-console + +#### ext/io/nonblock + +* Nobuyuki Nakada ([nobu]) +* https://github.com/ruby/io-nonblock +* https://rubygems.org/gems/io-nonblock + +#### ext/io/wait + +* Nobuyuki Nakada ([nobu]) +* https://github.com/ruby/io-wait +* https://rubygems.org/gems/io-wait + +#### ext/json + +* NARUSE, Yui ([nurse]) +* Hiroshi SHIBATA ([hsbt]) +* Jean Boussier ([byroot]) +* https://github.com/ruby/json +* https://rubygems.org/gems/json + +#### ext/openssl + +* Kazuki Yamaguchi ([rhenium]) +* https://github.com/ruby/openssl +* https://rubygems.org/gems/openssl + +#### ext/pathname + +* Tanaka Akira ([akr]) +* https://github.com/ruby/pathname +* https://rubygems.org/gems/pathname + +#### ext/psych + +* Aaron Patterson ([tenderlove]) +* Hiroshi SHIBATA ([hsbt]) +* https://github.com/ruby/psych +* 
https://rubygems.org/gems/psych + +#### ext/stringio + +* Nobuyuki Nakada ([nobu]) +* https://github.com/ruby/stringio +* https://rubygems.org/gems/stringio + +#### ext/strscan + +* Kouhei Sutou ([kou]) +* https://github.com/ruby/strscan +* https://rubygems.org/gems/strscan + +#### ext/zlib + +* NARUSE, Yui ([nurse]) +* https://github.com/ruby/zlib +* https://rubygems.org/gems/zlib + +## Bundled gems upstream repositories and maintainers + +The maintenance policy of bundled gems differs from that of the Module Maintainers above. +Please check the policies for each repository. + +The Ruby core team tries to maintain the repositories with no maintainers. +It may need to reach consensus on ruby-core/ruby-dev before making major changes. + +### minitest + +* Ryan Davis ([zenspider]) +* https://github.com/minitest/minitest +* https://rubygems.org/gems/minitest + +### power_assert + +* Tsujimoto Kenta ([k-tsj]) +* https://github.com/ruby/power_assert +* https://rubygems.org/gems/power_assert + +### rake + +* Hiroshi SHIBATA ([hsbt]) +* https://github.com/ruby/rake +* https://rubygems.org/gems/rake + +### test-unit + +* Kouhei Sutou ([kou]) +* https://github.com/test-unit/test-unit +* https://rubygems.org/gems/test-unit + +### rexml + +* Kouhei Sutou ([kou]) +* https://github.com/ruby/rexml +* https://rubygems.org/gems/rexml + +### rss + +* Kouhei Sutou ([kou]) +* https://github.com/ruby/rss +* https://rubygems.org/gems/rss + +### net-ftp + +* Shugo Maeda ([shugo]) +* https://github.com/ruby/net-ftp +* https://rubygems.org/gems/net-ftp + +### net-imap + +* Nicholas A.
Evans ([nevans]) +* https://github.com/ruby/net-imap +* https://rubygems.org/gems/net-imap + +### net-pop + +* https://github.com/ruby/net-pop +* https://rubygems.org/gems/net-pop + +### net-smtp + +* TOMITA Masahiro ([tmtm]) +* https://github.com/ruby/net-smtp +* https://rubygems.org/gems/net-smtp + +### matrix + +* Marc-André Lafortune ([marcandre]) +* https://github.com/ruby/matrix +* https://rubygems.org/gems/matrix + +### prime + +* https://github.com/ruby/prime +* https://rubygems.org/gems/prime + +### rbs + +* Soutaro Matsumoto ([soutaro]) +* https://github.com/ruby/rbs +* https://rubygems.org/gems/rbs + +### typeprof + +* Yusuke Endoh ([mame]) +* https://github.com/ruby/typeprof +* https://rubygems.org/gems/typeprof + +### debug + +* Koichi Sasada ([ko1]) +* https://github.com/ruby/debug +* https://rubygems.org/gems/debug + +### racc + +* Yuichi Kaneko ([yui-knk]) +* https://github.com/ruby/racc +* https://rubygems.org/gems/racc + +#### mutex_m + +* https://github.com/ruby/mutex_m +* https://rubygems.org/gems/mutex_m + +#### getoptlong + +* https://github.com/ruby/getoptlong +* https://rubygems.org/gems/getoptlong + +#### base64 + +* Yusuke Endoh ([mame]) +* https://github.com/ruby/base64 +* https://rubygems.org/gems/base64 + +#### bigdecimal + +* Kenta Murata ([mrkn]) +* https://github.com/ruby/bigdecimal +* https://rubygems.org/gems/bigdecimal + +#### observer + +* https://github.com/ruby/observer +* https://rubygems.org/gems/observer + +#### abbrev + +* Akinori MUSHA ([knu]) +* https://github.com/ruby/abbrev +* https://rubygems.org/gems/abbrev + +#### resolv-replace + +* Akira TANAKA ([akr]) +* https://github.com/ruby/resolv-replace +* https://rubygems.org/gems/resolv-replace + +#### rinda + +* Masatoshi SEKI ([seki]) +* https://github.com/ruby/rinda +* https://rubygems.org/gems/rinda + +#### drb + +* Masatoshi SEKI ([seki]) +* https://github.com/ruby/drb +* https://rubygems.org/gems/drb + +#### nkf + +* Naruse Yusuke ([nurse]) +* 
https://github.com/ruby/nkf +* https://rubygems.org/gems/nkf + +#### syslog + +* Akinori Musha ([knu]) +* https://github.com/ruby/syslog +* https://rubygems.org/gems/syslog + +#### csv + +* Kouhei Sutou ([kou]) +* https://github.com/ruby/csv +* https://rubygems.org/gems/csv + +#### ostruct + +* Marc-André Lafortune ([marcandre]) +* https://github.com/ruby/ostruct +* https://rubygems.org/gems/ostruct + +#### pstore + +* https://github.com/ruby/pstore +* https://rubygems.org/gems/pstore + +#### benchmark + +* https://github.com/ruby/benchmark +* https://rubygems.org/gems/benchmark + +#### logger + +* Naotoshi Seo ([sonots]) +* https://github.com/ruby/logger +* https://rubygems.org/gems/logger + +#### rdoc + +* Stan Lo ([st0012]) +* Nobuyoshi Nakada ([nobu]) +* https://github.com/ruby/rdoc +* https://rubygems.org/gems/rdoc + +#### win32ole + +* Masaki Suketa ([suketa]) +* https://github.com/ruby/win32ole +* https://rubygems.org/gems/win32ole + +#### irb + +* Tomoya Ishida ([tompng]) +* Stan Lo ([st0012]) +* Mari Imaizumi ([ima1zumi]) +* HASUMI Hitoshi ([hasumikin]) +* https://github.com/ruby/irb +* https://rubygems.org/gems/irb + +#### reline + +* Tomoya Ishida ([tompng]) +* Stan Lo ([st0012]) +* Mari Imaizumi ([ima1zumi]) +* HASUMI Hitoshi ([hasumikin]) +* https://github.com/ruby/reline +* https://rubygems.org/gems/reline + +#### readline + +* https://github.com/ruby/readline +* https://rubygems.org/gems/readline + +#### fiddle + +* Kouhei Sutou ([kou]) +* https://github.com/ruby/fiddle +* https://rubygems.org/gems/fiddle + +#### repl_type_completor + +* Tomoya Ishida ([tompng]) +* https://github.com/ruby/repl_type_completor +* https://rubygems.org/gems/repl_type_completor + +#### tsort + +* Tanaka Akira ([akr]) +* https://github.com/ruby/tsort +* https://rubygems.org/gems/tsort + +## Platform Maintainers + +### mswin64 (Microsoft Windows) + +* NAKAMURA Usaku ([unak]) + +### mingw32 (Minimalist GNU for Windows) + +* Nobuyoshi Nakada ([nobu]) + +### AIX + +* Yutaka 
Kanemoto ([kanemoto]) + +### FreeBSD + +* Akinori MUSHA ([knu]) + +### Solaris + +* Naohisa Goto ([ngoto]) + +### RHEL, CentOS + +* KOSAKI Motohiro ([kosaki]) + +### macOS + +* Kenta Murata ([mrkn]) + +### OpenBSD + +* Jeremy Evans ([jeremyevans]) + +### cygwin, ... + +* **No maintainer** + +### WebAssembly/WASI + +* Yuta Saito ([kateinoigakukun]) + +[akr]: https://github.com/akr +[byroot]: https://github.com/byroot +[colby-swandale]: https://github.com/colby-swandale +[drbrain]: https://github.com/drbrain +[duerst]: https://github.com/duerst +[eban]: https://github.com/eban +[eileencodes]: https://github.com/eileencodes +[hasumikin]: https://github.com/hasumikin +[hsbt]: https://github.com/hsbt +[ima1zumi]: https://github.com/ima1zumi +[jeremyevans]: https://github.com/jeremyevans +[k-tsj]: https://github.com/k-tsj +[k0kubun]: https://github.com/k0kubun +[kanemoto]: https://github.com/kanemoto +[kateinoigakukun]: https://github.com/kateinoigakukun +[kddnewton]: https://github.com/kddnewton +[keiju]: https://github.com/keiju +[knu]: https://github.com/knu +[ko1]: https://github.com/ko1 +[kosaki]: https://github.com/kosaki +[kou]: https://github.com/kou +[mame]: https://github.com/mame +[marcandre]: https://github.com/marcandre +[matz]: https://github.com/matz +[mrkn]: https://github.com/mrkn +[ngoto]: https://github.com/ngoto +[nobu]: https://github.com/nobu +[nurse]: https://github.com/nurse +[rhenium]: https://github.com/rhenium +[seki]: https://github.com/seki +[suketa]: https://github.com/suketa +[sonots]: https://github.com/sonots +[st0012]: https://github.com/st0012 +[tenderlove]: https://github.com/tenderlove +[tompng]: https://github.com/tompng +[unak]: https://github.com/unak +[yuki24]: https://github.com/yuki24 +[zenspider]: https://github.com/zenspider +[k-tsj]: https://github.com/k-tsj +[nevans]: https://github.com/nevans +[tmtm]: https://github.com/tmtm +[shugo]: https://github.com/shugo +[soutaro]: https://github.com/soutaro +[yui-knk]: 
https://github.com/yui-knk +[hasumikin]: https://github.com/hasumikin +[suketa]: https://github.com/suketa diff --git a/doc/maintainers.rdoc b/doc/maintainers.rdoc deleted file mode 100644 index 1718be1bf9..0000000000 --- a/doc/maintainers.rdoc +++ /dev/null @@ -1,297 +0,0 @@ -= Maintainers - -This page describes the current module, library, and extension maintainers of Ruby. - -== Module Maintainers - -A module maintainer is responsible for a certain part of Ruby. - -* The maintainer fixes bugs of the part. Particularly, they should fix security vulnerabilities as soon as possible. -* They handle issues related the module on the Redmine or ML. -* They may be discharged by the 3 months rule [ruby-core:25764]. -* They have commit right to Ruby's repository to modify their part in the repository. -* They have "developer" role on the Redmine to modify issues. -* They have authority to decide the feature of their part. But they should always respect discussions on ruby-core/ruby-dev. - -A submaintainer of a module is like a maintainer. But The submaintainer does -not have authority to change/add a feature on his/her part. They need consensus -on ruby-core/ruby-dev before changing/adding. Some of submaintainers have -commit right, others don't. 
- -=== Language core features including security - -Yukihiro Matsumoto (matz) - -=== Evaluator - -Koichi Sasada (ko1) - -=== Core classes - -Yukihiro Matsumoto (matz) - -=== Documentation - -Zachary Scott (zzak) - -== Standard Library Maintainers - -=== Libraries - -[lib/English.rb] - _unmaintained_ -[lib/abbrev.rb] - Akinori MUSHA (knu) -[lib/base64.rb] - Yusuke Endoh (mame) -[lib/benchmark.rb] - _unmaintained_ -[lib/cgi.rb, lib/cgi/*] - Takeyuki Fujioka (xibbar) -[lib/drb.rb, lib/drb/*] - Masatoshi SEKI (seki) -[lib/debug.rb] - _unmaintained_ -[lib/delegate.rb] - _unmaintained_ -[lib/erb.rb] - Masatoshi SEKI (seki), Takashi Kokubun (k0kubun) -[lib/find.rb] - Kazuki Tsujimoto (ktsj) -[lib/getoptlong.rb] - _unmaintained_ -[lib/mkmf.rb] - _unmaintained_ -[lib/monitor.rb] - Shugo Maeda (shugo) -[lib/net/ftp.rb] - Shugo Maeda (shugo) -[lib/net/imap.rb] - Shugo Maeda (shugo) -[lib/net/http.rb, lib/net/https.rb] - NARUSE, Yui (naruse) -[lib/net/pop.rb] - _unmaintained_ -[lib/net/protocol.rb] - _unmaintained_ -[lib/net/smtp.rb] - _unmaintained_ -[lib/observer.rb] - _unmaintained_ -[lib/open-uri.rb] - Tanaka Akira (akr) -[lib/open3.rb] - _unmaintained_ -[lib/optparse.rb, lib/optparse/*] - Nobuyuki Nakada (nobu) -[lib/pp.rb] - Tanaka Akira (akr) -[lib/prettyprint.rb] - Tanaka Akira (akr) -[lib/pstore.rb] - _unmaintained_ -[lib/racc/*] - Aaron Patterson (tenderlove) -[lib/reline.rb, lib/reline/*] - aycabta -[lib/resolv-replace.rb] - Tanaka Akira (akr) -[lib/resolv.rb] - Tanaka Akira (akr) -[lib/rinda/*] - Masatoshi SEKI (seki) -[lib/rubygems.rb, lib/rubygems/*] - Eric Hodel (drbrain), Hiroshi SHIBATA (hsbt) - https://github.com/rubygems/rubygems -[lib/set.rb] - Akinori MUSHA (knu) -[lib/securerandom.rb] - Tanaka Akira (akr) -[lib/shellwords.rb] - Akinori MUSHA (knu) -[lib/singleton.rb] - Yukihiro Matsumoto (matz) -[lib/tempfile.rb] - _unmaintained_ -[lib/tmpdir.rb] - _unmaintained_ -[lib/time.rb] - Tanaka Akira (akr) -[lib/timeout.rb] - Yukihiro Matsumoto (matz) 
-[lib/tsort.rb] - Tanaka Akira (akr) -[lib/un.rb] - WATANABE Hirofumi (eban) -[lib/unicode_normalize.rb, lib/unicode_normalize/*] - Martin J. Dürst -[lib/uri.rb, lib/uri/*] - YAMADA, Akira (akira) -[lib/weakref.rb] - _unmaintained_ -[lib/yaml.rb, lib/yaml/*] - Aaron Patterson (tenderlove), Hiroshi SHIBATA (hsbt) - -=== Extensions - -[ext/cgi] - Nobuyoshi Nakada (nobu) -[ext/continuation] - Koichi Sasada (ko1) -[ext/coverage] - Yusuke Endoh (mame) -[ext/digest, ext/digest/*] - Akinori MUSHA (knu) -[ext/fiber] - Koichi Sasada (ko1) -[ext/io/nonblock] - Nobuyuki Nakada (nobu) -[ext/io/wait] - Nobuyuki Nakada (nobu) -[ext/nkf] - NARUSE, Yui (naruse) -[ext/objspace] - _unmaintained_ -[ext/pathname] - Tanaka Akira (akr) -[ext/pty] - _unmaintained_ -[ext/racc] - Aaron Patterson (tenderlove) -[ext/readline] - TAKAO Kouji (kouji) -[ext/ripper] - _unmaintained_ -[ext/socket] - * Tanaka Akira (akr) - * API change needs matz's approval -[ext/syslog] - Akinori MUSHA (knu) -[ext/win32] - NAKAMURA Usaku (usa) -[ext/win32ole] - Masaki Suketa (suke) - -== Default gems Maintainers - -=== Libraries - -[lib/bundler.rb, lib/bundler/*] - Hiroshi SHIBATA (hsbt) - https://github.com/bundler/bundler -[lib/cmath.rb] - _unmaintained_ - https://github.com/ruby/cmath -[lib/csv.rb] - Kenta Murata (mrkn), Kouhei Sutou (kou) - https://github.com/ruby/csv -[lib/e2mmap.rb] - Keiju ISHITSUKA (keiju) -[lib/fileutils.rb] - _unmaintained_ - https://github.com/ruby/fileutils -[lib/forwardable.rb] - Keiju ISHITSUKA (keiju) -[lib/ipaddr.rb] - Akinori MUSHA (knu) -[lib/irb.rb, lib/irb/*] - Keiju ISHITSUKA (keiju) -[lib/logger.rb] - Naotoshi Seo (sonots) -[lib/matrix.rb] - Marc-Andre Lafortune (marcandre) -[lib/mutex_m.rb] - Keiju ISHITSUKA (keiju) -[lib/ostruct.rb] - Marc-Andre Lafortune (marcandre) -[lib/prime.rb] - Yuki Sonoda (yugui) -[lib/racc.rb, lib/racc/*] - Aaron Patterson (tenderlove), Hiroshi SHIBATA (hsbt) - https://github.com/ruby/racc -[lib/rdoc.rb, lib/rdoc/*] - Eric Hodel (drbrain), Hiroshi 
SHIBATA (hsbt) - https://github.com/ruby/rdoc -[lib/rexml/*] - Kouhei Sutou (kou) -[lib/rss.rb, lib/rss/*] - Kouhei Sutou (kou) -[lib/scanf.rb] - David A. Black (dblack) - https://github.com/ruby/scanf -[lib/shell.rb, lib/shell/*] - Keiju ISHITSUKA (keiju) -[lib/sync.rb] - Keiju ISHITSUKA (keiju) -[lib/thwait.rb] - Keiju ISHITSUKA (keiju) -[lib/tracer.rb] - Keiju ISHITSUKA (keiju) -[lib/webrick.rb, lib/webrick/*] - Eric Wong (normalperson) - https://bugs.ruby-lang.org/ - -=== Extensions - -[ext/bigdecimal] - Kenta Murata (mrkn) - https://github.com/ruby/bigdecimal -[ext/date] - _unmaintained_ - https://github.com/ruby/date -[ext/dbm] - _unmaintained_ - https://github.com/ruby/dbm -[ext/etc] - Ruby core team - https://github.com/ruby/etc -[ext/fcntl] - Ruby core team - https://github.com/ruby/fcntl -[ext/fiddle] - Aaron Patterson (tenderlove) - https://github.com/ruby/fiddle -[ext/gdbm] - Yukihiro Matsumoto (matz) - https://github.com/ruby/gdbm -[ext/io/console] - Nobuyuki Nakada (nobu) - https://github.com/ruby/io-console -[ext/json] - NARUSE, Yui (naruse), Hiroshi SHIBATA (hsbt) - https://github.com/flori/json -[ext/openssl] - Kazuki Yamaguchi (rhe) - https://github.com/ruby/openssl -[ext/psych] - Aaron Patterson (tenderlove), Hiroshi SHIBATA (hsbt) - https://github.com/ruby/psych -[ext/sdbm] - Yukihiro Matsumoto (matz) - https://github.com/ruby/sdbm -[ext/stringio] - Nobuyuki Nakada (nobu) - https://github.com/ruby/stringio -[ext/strscan] - Kouhei Sutou (kou) - https://github.com/ruby/strscan -[ext/zlib] - NARUSE, Yui (naruse) - https://github.com/ruby/zlib - -== Bundled gems upstream repositories - -[did_you_mean] - https://github.com/yuki24/did_you_mean -[minitest] - https://github.com/seattlerb/minitest -[net-telnet] - https://github.com/ruby/net-telnet -[power_assert] - https://github.com/k-tsj/power_assert -[rake] - https://github.com/ruby/rake -[test-unit] - https://github.com/test-unit/test-unit -[xmlrpc] - https://github.com/ruby/xmlrpc diff --git 
a/doc/matchdata/begin.rdoc b/doc/matchdata/begin.rdoc new file mode 100644 index 0000000000..6100617e19 --- /dev/null +++ b/doc/matchdata/begin.rdoc @@ -0,0 +1,30 @@ +Returns the offset (in characters) of the beginning of the specified match. + +When non-negative integer argument +n+ is given, +returns the offset of the beginning of the <tt>n</tt>th match: + + m = /(.)(.)(\d+)(\d)/.match("THX1138.") + # => #<MatchData "HX1138" 1:"H" 2:"X" 3:"113" 4:"8"> + m[0] # => "HX1138" + m.begin(0) # => 1 + m[3] # => "113" + m.begin(3) # => 3 + + m = /(ん)(に)(ち)/.match('こんにちは') + # => #<MatchData "んにち" 1:"ん" 2:"に" 3:"ち"> + m[0] # => "んにち" + m.begin(0) # => 1 + m[3] # => "ち" + m.begin(3) # => 3 + +When string or symbol argument +name+ is given, +returns the offset of the beginning for the named match: + + m = /(?<foo>.)(.)(?<bar>.)/.match("hoge") + # => #<MatchData "hog" foo:"h" bar:"g"> + m[:foo] # => "h" + m.begin('foo') # => 0 + m[:bar] # => "g" + m.begin(:bar) # => 2 + +Related: MatchData#end, MatchData#offset, MatchData#byteoffset. diff --git a/doc/matchdata/bytebegin.rdoc b/doc/matchdata/bytebegin.rdoc new file mode 100644 index 0000000000..54e417a7fc --- /dev/null +++ b/doc/matchdata/bytebegin.rdoc @@ -0,0 +1,30 @@ +Returns the offset (in bytes) of the beginning of the specified match.
+ +When non-negative integer argument +n+ is given, +returns the offset of the beginning of the <tt>n</tt>th match: + + m = /(.)(.)(\d+)(\d)/.match("THX1138.") + # => #<MatchData "HX1138" 1:"H" 2:"X" 3:"113" 4:"8"> + m[0] # => "HX1138" + m.bytebegin(0) # => 1 + m[3] # => "113" + m.bytebegin(3) # => 3 + + m = /(ん)(に)(ち)/.match('こんにちは') + # => #<MatchData "んにち" 1:"ん" 2:"に" 3:"ち"> + m[0] # => "んにち" + m.bytebegin(0) # => 3 + m[3] # => "ち" + m.bytebegin(3) # => 9 + +When string or symbol argument +name+ is given, +returns the offset of the beginning for the named match: + + m = /(?<foo>.)(.)(?<bar>.)/.match("hoge") + # => #<MatchData "hog" foo:"h" bar:"g"> + m[:foo] # => "h" + m.bytebegin('foo') # => 0 + m[:bar] # => "g" + m.bytebegin(:bar) # => 2 + +Related: MatchData#byteend, MatchData#byteoffset. diff --git a/doc/matchdata/byteend.rdoc b/doc/matchdata/byteend.rdoc new file mode 100644 index 0000000000..0a03f76208 --- /dev/null +++ b/doc/matchdata/byteend.rdoc @@ -0,0 +1,30 @@ +Returns the offset (in bytes) of the end of the specified match. + +When non-negative integer argument +n+ is given, +returns the offset of the end of the <tt>n</tt>th match: + + m = /(.)(.)(\d+)(\d)/.match("THX1138.") + # => #<MatchData "HX1138" 1:"H" 2:"X" 3:"113" 4:"8"> + m[0] # => "HX1138" + m.byteend(0) # => 7 + m[3] # => "113" + m.byteend(3) # => 6 + + m = /(ん)(に)(ち)/.match('こんにちは') + # => #<MatchData "んにち" 1:"ん" 2:"に" 3:"ち"> + m[0] # => "んにち" + m.byteend(0) # => 12 + m[3] # => "ち" + m.byteend(3) # => 12 + +When string or symbol argument +name+ is given, +returns the offset of the end for the named match: + + m = /(?<foo>.)(.)(?<bar>.)/.match("hoge") + # => #<MatchData "hog" foo:"h" bar:"g"> + m[:foo] # => "h" + m.byteend('foo') # => 1 + m[:bar] # => "g" + m.byteend(:bar) # => 3 + +Related: MatchData#bytebegin, MatchData#byteoffset.
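The distinction between character offsets and byte offsets only shows up when the matched string contains multibyte characters. The following is a minimal sketch, not part of the documentation in this diff, that derives the byte offsets from the character offsets using only String#bytesize and String#byteslice; the variable names are illustrative.

```ruby
s = 'こんにちは'                      # five characters, three bytes each in UTF-8
m = /(ん)(に)(ち)/.match(s)

# Character offsets of the whole match.
char_begin, char_end = m.offset(0)

# Byte offsets, derived by measuring the byte length of the
# text that precedes each character offset.
byte_begin = s[0, char_begin].bytesize
byte_end   = s[0, char_end].bytesize

# Slicing by characters and slicing by bytes select the same text.
by_chars = s[char_begin...char_end]
by_bytes = s.byteslice(byte_begin, byte_end - byte_begin)

puts by_chars
puts by_bytes
```

On Ruby versions that provide MatchData#byteoffset, `m.byteoffset(0)` should return the same pair as `[byte_begin, byte_end]`.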
diff --git a/doc/matchdata/end.rdoc b/doc/matchdata/end.rdoc new file mode 100644 index 0000000000..c43a5428f3 --- /dev/null +++ b/doc/matchdata/end.rdoc @@ -0,0 +1,30 @@ +Returns the offset (in characters) of the end of the specified match. + +When non-negative integer argument +n+ is given, +returns the offset of the end of the <tt>n</tt>th match: + + m = /(.)(.)(\d+)(\d)/.match("THX1138.") + # => #<MatchData "HX1138" 1:"H" 2:"X" 3:"113" 4:"8"> + m[0] # => "HX1138" + m.end(0) # => 7 + m[3] # => "113" + m.end(3) # => 6 + + m = /(ん)(に)(ち)/.match('こんにちは') + # => #<MatchData "んにち" 1:"ん" 2:"に" 3:"ち"> + m[0] # => "んにち" + m.end(0) # => 4 + m[3] # => "ち" + m.end(3) # => 4 + +When string or symbol argument +name+ is given, +returns the offset of the end for the named match: + + m = /(?<foo>.)(.)(?<bar>.)/.match("hoge") + # => #<MatchData "hog" foo:"h" bar:"g"> + m[:foo] # => "h" + m.end('foo') # => 1 + m[:bar] # => "g" + m.end(:bar) # => 3 + +Related: MatchData#begin, MatchData#offset, MatchData#byteoffset. diff --git a/doc/matchdata/offset.rdoc b/doc/matchdata/offset.rdoc new file mode 100644 index 0000000000..4194ef7ef9 --- /dev/null +++ b/doc/matchdata/offset.rdoc @@ -0,0 +1,31 @@ +Returns a 2-element array containing the beginning and ending +offsets (in characters) of the specified match.
+ +When non-negative integer argument +n+ is given, +returns the starting and ending offsets of the <tt>n</tt>th match: + + m = /(.)(.)(\d+)(\d)/.match("THX1138.") + # => #<MatchData "HX1138" 1:"H" 2:"X" 3:"113" 4:"8"> + m[0] # => "HX1138" + m.offset(0) # => [1, 7] + m[3] # => "113" + m.offset(3) # => [3, 6] + + m = /(ん)(に)(ち)/.match('こんにちは') + # => #<MatchData "んにち" 1:"ん" 2:"に" 3:"ち"> + m[0] # => "んにち" + m.offset(0) # => [1, 4] + m[3] # => "ち" + m.offset(3) # => [3, 4] + +When string or symbol argument +name+ is given, +returns the starting and ending offsets for the named match: + + m = /(?<foo>.)(.)(?<bar>.)/.match("hoge") + # => #<MatchData "hog" foo:"h" bar:"g"> + m[:foo] # => "h" + m.offset('foo') # => [0, 1] + m[:bar] # => "g" + m.offset(:bar) # => [2, 3] + +Related: MatchData#byteoffset, MatchData#begin, MatchData#end. diff --git a/doc/math/math.rdoc b/doc/math/math.rdoc new file mode 100644 index 0000000000..2978375564 --- /dev/null +++ b/doc/math/math.rdoc @@ -0,0 +1,117 @@ +Module \Math provides methods for basic trigonometric, +logarithmic, and transcendental functions, and for extracting roots.
+ +You can write its constants and method calls thus: + + Math::PI # => 3.141592653589793 + Math::E # => 2.718281828459045 + Math.sin(0.0) # => 0.0 + Math.cos(0.0) # => 1.0 + +If you include module \Math, you can write simpler forms: + + include Math + PI # => 3.141592653589793 + E # => 2.718281828459045 + sin(0.0) # => 0.0 + cos(0.0) # => 1.0 + +For simplicity, the examples here assume: + + include Math + INFINITY = Float::INFINITY + +The domains and ranges for the methods +are denoted by open or closed intervals, +using, respectively, parentheses or square brackets: + +- An open interval does not include the endpoints: + + (-INFINITY, INFINITY) + +- A closed interval includes the endpoints: + + [-1.0, 1.0] + +- A half-open interval includes one endpoint, but not the other: + + [1.0, INFINITY) + +Many values returned by \Math methods are numerical approximations. +This is because many such values are, in mathematics, +of infinite precision, while in numerical computation +the precision is finite. + +Thus, in mathematics, <i>cos(π/2)</i> is exactly zero, +but in our computation <tt>cos(PI/2)</tt> is a number very close to zero: + + cos(PI/2) # => 6.123031769111886e-17 + +For very large and very small returned values, +we have added formatted numbers for clarity: + + tan(PI/2) # => 1.633123935319537e+16 # 16331239353195370.0 + tan(PI) # => -1.2246467991473532e-16 # -0.0000000000000001 + +See class Float for the constants +that affect Ruby's floating-point arithmetic. + +=== What's Here + +==== Trigonometric Functions + +- ::cos: Returns the cosine of the given argument. +- ::sin: Returns the sine of the given argument. +- ::tan: Returns the tangent of the given argument. + +==== Inverse Trigonometric Functions + +- ::acos: Returns the arc cosine of the given argument. +- ::asin: Returns the arc sine of the given argument. +- ::atan: Returns the arc tangent of the given argument. +- ::atan2: Returns the arc tangent of two given arguments.
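To illustrate why ::atan2 takes two arguments while ::atan takes one, the sketch below compares the two for a point in the third quadrant. The point (-1.0, -1.0) is an arbitrary choice made for this example and does not come from the documentation above.

```ruby
y = -1.0
x = -1.0

# atan sees only the ratio y/x, so the two negative signs cancel
# and the resulting angle lies in the first quadrant.
from_ratio = Math.atan(y / x)

# atan2 receives y and x separately, so it can place the angle
# in the third quadrant, where the point actually lies.
from_pair = Math.atan2(y, x)

puts from_ratio / Math::PI   # angle as a fraction of PI, about 0.25
puts from_pair / Math::PI    # angle as a fraction of PI, about -0.75
```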
+ +==== Hyperbolic Trigonometric Functions + +- ::cosh: Returns the hyperbolic cosine of the given argument. +- ::sinh: Returns the hyperbolic sine of the given argument. +- ::tanh: Returns the hyperbolic tangent of the given argument. + +==== Inverse Hyperbolic Trigonometric Functions + +- ::acosh: Returns the inverse hyperbolic cosine of the given argument. +- ::asinh: Returns the inverse hyperbolic sine of the given argument. +- ::atanh: Returns the inverse hyperbolic tangent of the given argument. + +==== Exponentiation and Logarithmic Functions + +- ::exp: Returns e raised to the power of the given argument. +- ::log: Returns the logarithm of a given value in a given base. +- ::log10: Returns the base 10 logarithm of the given argument. +- ::log2: Returns the base 2 logarithm of the given argument. + +==== Fraction and Exponent Functions + +- ::frexp: Returns the fraction and exponent of the given argument. +- ::ldexp: Returns the value for a given fraction and exponent. + +==== Root Functions + +- ::cbrt: Returns the cube root of the given argument. +- ::sqrt: Returns the square root of the given argument. + +==== Error Functions + +- ::erf: Returns the value of the Gauss error function for the given argument. +- ::erfc: Returns the value of the complementary error function + for the given argument. + +==== Gamma Functions + +- ::gamma: Returns the value of the gamma function for the given argument. +- ::lgamma: Returns the value of the logarithmic gamma function + for the given argument. + +==== Hypotenuse Function + +- ::hypot: Returns <tt>sqrt(a**2 + b**2)</tt> for the given +a+ and +b+.
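Several of the methods listed above can be checked against one another. The sketch below is only an illustration with arbitrarily chosen inputs; because many results are floating-point approximations, it compares most of them within a small tolerance (also an arbitrary choice) rather than for exact equality.

```ruby
# Helper for comparing floating-point approximations; the tolerance is arbitrary.
close = ->(a, b) { (a - b).abs < 1e-12 }

# exp and log (with the default base e) undo each other.
roundtrip_log = Math.log(Math.exp(2.5))

# frexp splits a number into a fraction and an exponent;
# ldexp reassembles the number from that pair.
fraction, exponent = Math.frexp(1234.0)
reassembled = Math.ldexp(fraction, exponent)

# hypot(a, b) agrees with sqrt(a**2 + b**2).
hyp = Math.hypot(3.0, 4.0)

puts close.call(roundtrip_log, 2.5)
puts reassembled == 1234.0
puts close.call(hyp, Math.sqrt(3.0**2 + 4.0**2))
```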
diff --git a/doc/net-http/examples.rdoc b/doc/net-http/examples.rdoc new file mode 100644 index 0000000000..c1366e7ad1 --- /dev/null +++ b/doc/net-http/examples.rdoc @@ -0,0 +1,31 @@ +Examples here assume that <tt>net/http</tt> has been required +(which also requires +uri+): + + require 'net/http' + +Many code examples here use these example websites: + +- https://jsonplaceholder.typicode.com. +- http://example.com. + +Some examples also assume these variables: + + uri = URI('https://jsonplaceholder.typicode.com/') + uri.freeze # Examples may not modify. + hostname = uri.hostname # => "jsonplaceholder.typicode.com" + path = uri.path # => "/" + port = uri.port # => 443 + +So that example requests may be written as: + + Net::HTTP.get(uri) + Net::HTTP.get(hostname, '/index.html') + Net::HTTP.start(hostname) do |http| + http.get('/todos/1') + http.get('/todos/2') + end + +An example that needs a modified URI first duplicates +uri+, then modifies the duplicate: + + _uri = uri.dup + _uri.path = '/todos/1' diff --git a/doc/net-http/included_getters.rdoc b/doc/net-http/included_getters.rdoc new file mode 100644 index 0000000000..7ac327f4b4 --- /dev/null +++ b/doc/net-http/included_getters.rdoc @@ -0,0 +1,3 @@ +This class also includes (indirectly) module Net::HTTPHeader, +which gives access to its +{methods for getting headers}[rdoc-ref:Net::HTTPHeader@Getters]. diff --git a/doc/optparse/.document b/doc/optparse/.document new file mode 100644 index 0000000000..96dfc7779f --- /dev/null +++ b/doc/optparse/.document @@ -0,0 +1 @@ +*.rdoc diff --git a/doc/optparse/argument_converters.rdoc b/doc/optparse/argument_converters.rdoc new file mode 100644 index 0000000000..532729871c --- /dev/null +++ b/doc/optparse/argument_converters.rdoc @@ -0,0 +1,380 @@ +== Argument Converters + +An option can specify that its argument is to be converted +from the default +String+ to an instance of another class. 
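As a rough illustration of that idea, the sketch below declares one option without a converter and two with built-in converters, then parses a fixed argument list. The option names <tt>--count</tt>, <tt>--ratio</tt>, and <tt>--name</tt> are hypothetical and chosen only for this example; they do not appear in the files referenced by this document.

```ruby
require 'optparse'

values = {}
parser = OptionParser.new

# With a converter class, the block receives the converted object.
parser.on('--count N', Integer) { |v| values[:count] = v }
parser.on('--ratio X', Float)   { |v| values[:ratio] = v }

# Without a converter, the block receives the argument as a String.
parser.on('--name NAME') { |v| values[:name] = v }

# Parse an explicit array instead of ARGV so the sketch is self-contained.
parser.parse(%w[--count 0x10 --ratio 1.5 --name ruby])

values.each { |key, value| puts "#{key}: #{value.inspect} (#{value.class})" }
```

Passing an explicit array to +parse+ keeps the example independent of the command line; a real script would typically parse +ARGV+ instead.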
+ +=== Contents + +- {Built-In Argument Converters}[#label-Built-In+Argument+Converters] + - {Date}[#label-Date] + - {DateTime}[#label-DateTime] + - {Time}[#label-Time] + - {URI}[#label-URI] + - {Shellwords}[#label-Shellwords] + - {Integer}[#label-Integer] + - {Float}[#label-Float] + - {Numeric}[#label-Numeric] + - {DecimalInteger}[#label-DecimalInteger] + - {OctalInteger}[#label-OctalInteger] + - {DecimalNumeric}[#label-DecimalNumeric] + - {TrueClass}[#label-TrueClass] + - {FalseClass}[#label-FalseClass] + - {Object}[#label-Object] + - {String}[#label-String] + - {Array}[#label-Array] + - {Regexp}[#label-Regexp] +- {Custom Argument Converters}[#label-Custom+Argument+Converters] + +=== Built-In Argument Converters + ++OptionParser+ has a number of built-in argument converters, +which are demonstrated below. + +==== +Date+ + +File +date.rb+ +defines an option whose argument is to be converted to a +Date+ object. +The argument is converted by method Date#parse. + + :include: ruby/date.rb + +Executions: + + $ ruby date.rb --date 2001-02-03 + [#<Date: 2001-02-03 ((2451944j,0s,0n),+0s,2299161j)>, Date] + $ ruby date.rb --date 20010203 + [#<Date: 2001-02-03 ((2451944j,0s,0n),+0s,2299161j)>, Date] + $ ruby date.rb --date "3rd Feb 2001" + [#<Date: 2001-02-03 ((2451944j,0s,0n),+0s,2299161j)>, Date] + +==== +DateTime+ + +File +datetime.rb+ +defines an option whose argument is to be converted to a +DateTime+ object. +The argument is converted by method DateTime#parse. 
+ + :include: ruby/datetime.rb + +Executions: + + $ ruby datetime.rb --datetime 2001-02-03T04:05:06+07:00 + [#<DateTime: 2001-02-03T04:05:06+07:00 ((2451943j,75906s,0n),+25200s,2299161j)>, DateTime] + $ ruby datetime.rb --datetime 20010203T040506+0700 + [#<DateTime: 2001-02-03T04:05:06+07:00 ((2451943j,75906s,0n),+25200s,2299161j)>, DateTime] + $ ruby datetime.rb --datetime "3rd Feb 2001 04:05:06 PM" + [#<DateTime: 2001-02-03T16:05:06+00:00 ((2451944j,57906s,0n),+0s,2299161j)>, DateTime] + +==== +Time+ + +File +time.rb+ +defines an option whose argument is to be converted to a +Time+ object. +The argument is converted by method Time#httpdate or Time#parse. + + :include: ruby/time.rb + +Executions: + + $ ruby time.rb --time "Thu, 06 Oct 2011 02:26:12 GMT" + [2011-10-06 02:26:12 UTC, Time] + $ ruby time.rb --time 2010-10-31 + [2010-10-31 00:00:00 -0500, Time] + +==== +URI+ + +File +uri.rb+ +defines an option whose argument is to be converted to a +URI+ object. +The argument is converted by method URI#parse. + + :include: ruby/uri.rb + +Executions: + + $ ruby uri.rb --uri https://github.com + [#<URI::HTTPS https://github.com>, URI::HTTPS] + $ ruby uri.rb --uri http://github.com + [#<URI::HTTP http://github.com>, URI::HTTP] + $ ruby uri.rb --uri file://~/var + [#<URI::File file://~/var>, URI::File] + +==== +Shellwords+ + +File +shellwords.rb+ +defines an option whose argument is to be converted to an +Array+ object by method +Shellwords#shellwords. + + :include: ruby/shellwords.rb + +Executions: + + $ ruby shellwords.rb --shellwords "ruby my_prog.rb | less" + [["ruby", "my_prog.rb", "|", "less"], Array] + $ ruby shellwords.rb --shellwords "here are 'two words'" + [["here", "are", "two words"], Array] + +==== +Integer+ + +File +integer.rb+ +defines an option whose argument is to be converted to an +Integer+ object. +The argument is converted by method Kernel#Integer. 
+ + :include: ruby/integer.rb + +Executions: + + $ ruby integer.rb --integer 100 + [100, Integer] + $ ruby integer.rb --integer -100 + [-100, Integer] + $ ruby integer.rb --integer 0100 + [64, Integer] + $ ruby integer.rb --integer 0x100 + [256, Integer] + $ ruby integer.rb --integer 0b100 + [4, Integer] + +==== +Float+ + +File +float.rb+ +defines an option whose argument is to be converted to a +Float+ object. +The argument is converted by method Kernel#Float. + + :include: ruby/float.rb + +Executions: + + $ ruby float.rb --float 1 + [1.0, Float] + $ ruby float.rb --float 3.14159 + [3.14159, Float] + $ ruby float.rb --float 1.234E2 + [123.4, Float] + $ ruby float.rb --float 1.234E-2 + [0.01234, Float] + +==== +Numeric+ + +File +numeric.rb+ +defines an option whose argument is to be converted to an instance +of +Rational+, +Float+, or +Integer+. +The argument is converted by method Kernel#Rational, +Kernel#Float, or Kernel#Integer. + + :include: ruby/numeric.rb + +Executions: + + $ ruby numeric.rb --numeric 1/3 + [(1/3), Rational] + $ ruby numeric.rb --numeric 3.333E-1 + [0.3333, Float] + $ ruby numeric.rb --numeric 3 + [3, Integer] + +==== +DecimalInteger+ + +File +decimal_integer.rb+ +defines an option whose argument is to be converted to an +Integer+ object. +The argument is converted by method Kernel#Integer. + + :include: ruby/decimal_integer.rb + +The argument may not be in a binary or hexadecimal format; +a leading zero is ignored (not parsed as octal). + +Executions: + + $ ruby decimal_integer.rb --decimal_integer 100 + [100, Integer] + $ ruby decimal_integer.rb --decimal_integer -100 + [-100, Integer] + $ ruby decimal_integer.rb --decimal_integer 0100 + [100, Integer] + $ ruby decimal_integer.rb --decimal_integer -0100 + [-100, Integer] + +==== +OctalInteger+ + +File +octal_integer.rb+ +defines an option whose argument is to be converted to an +Integer+ object. +The argument is converted by method Kernel#Integer. 
+ + :include: ruby/octal_integer.rb + +The argument may not be in a binary or hexadecimal format; +it is parsed as octal, regardless of whether it has a leading zero. + +Executions: + + $ ruby octal_integer.rb --octal_integer 100 + [64, Integer] + $ ruby octal_integer.rb --octal_integer -100 + [-64, Integer] + $ ruby octal_integer.rb --octal_integer 0100 + [64, Integer] + +==== +DecimalNumeric+ + +File +decimal_numeric.rb+ +defines an option whose argument is to be converted to an +Integer+ object. +The argument is converted by method Kernel#Integer. + + :include: ruby/decimal_numeric.rb + +The argument may not be in a binary or hexadecimal format; +a leading zero causes the argument to be parsed as octal. + +Executions: + + $ ruby decimal_numeric.rb --decimal_numeric 100 + [100, Integer] + $ ruby decimal_numeric.rb --decimal_numeric -100 + [-100, Integer] + $ ruby decimal_numeric.rb --decimal_numeric 0100 + [64, Integer] + +==== +TrueClass+ + +File +true_class.rb+ +defines an option whose argument is to be converted to +true+ or +false+. +The argument is evaluated by method Object#nil?. + + :include: ruby/true_class.rb + +The argument may be any of those shown in the examples below. + +Executions: + + $ ruby true_class.rb --true_class true + [true, TrueClass] + $ ruby true_class.rb --true_class yes + [true, TrueClass] + $ ruby true_class.rb --true_class + + [true, TrueClass] + $ ruby true_class.rb --true_class false + [false, FalseClass] + $ ruby true_class.rb --true_class no + [false, FalseClass] + $ ruby true_class.rb --true_class - + [false, FalseClass] + $ ruby true_class.rb --true_class nil + [false, FalseClass] + +==== +FalseClass+ + +File +false_class.rb+ +defines an option whose argument is to be converted to +true+ or +false+. +The argument is evaluated by method Object#nil?. + + :include: ruby/false_class.rb + +The argument may be any of those shown in the examples below. 
+ +Executions: + + $ ruby false_class.rb --false_class false + [false, FalseClass] + $ ruby false_class.rb --false_class no + [false, FalseClass] + $ ruby false_class.rb --false_class - + [false, FalseClass] + $ ruby false_class.rb --false_class nil + [false, FalseClass] + $ ruby false_class.rb --false_class true + [true, TrueClass] + $ ruby false_class.rb --false_class yes + [true, TrueClass] + $ ruby false_class.rb --false_class + + [true, TrueClass] + +==== +Object+ + +File +object.rb+ +defines an option whose argument is not to be converted from +String+. + + :include: ruby/object.rb + +Executions: + + $ ruby object.rb --object foo + ["foo", String] + $ ruby object.rb --object nil + ["nil", String] + +==== +String+ + +File +string.rb+ +defines an option whose argument is not to be converted from +String+. + + :include: ruby/string.rb + +Executions: + + $ ruby string.rb --string foo + ["foo", String] + $ ruby string.rb --string nil + ["nil", String] + +==== +Array+ + +File +array.rb+ +defines an option whose argument is to be converted from +String+ +to an array of strings, based on comma-separated substrings. + + :include: ruby/array.rb + +Executions: + + $ ruby array.rb --array "" + [[], Array] + $ ruby array.rb --array foo,bar,baz + [["foo", "bar", "baz"], Array] + $ ruby array.rb --array "foo, bar, baz" + [["foo", " bar", " baz"], Array] + +==== +Regexp+ + +File +regexp.rb+ +defines an option whose argument is to be converted to a +Regexp+ object. + + :include: ruby/regexp.rb + +Executions: + + $ ruby regexp.rb --regexp foo + +=== Custom Argument Converters + +You can create custom argument converters. +To create a custom converter, call OptionParser#accept with: + +- An identifier, which may be any object. +- An optional match pattern, which defaults to <tt>/.*/m</tt>. +- A block that accepts the argument and returns the converted value. + +This custom converter accepts any argument and converts it, +if possible, to a +Complex+ object. 
+ + :include: ruby/custom_converter.rb + +Executions: + + $ ruby custom_converter.rb --complex 0 + [(0+0i), Complex] + $ ruby custom_converter.rb --complex 1 + [(1+0i), Complex] + $ ruby custom_converter.rb --complex 1+2i + [(1+2i), Complex] + $ ruby custom_converter.rb --complex 0.3-0.5i + [(0.3-0.5i), Complex] + +This custom converter accepts any 1-word argument +and capitalizes it, if possible. + + :include: ruby/match_converter.rb + +Executions: + + $ ruby match_converter.rb --capitalize foo + ["Foo", String] + $ ruby match_converter.rb --capitalize "foo bar" + match_converter.rb:9:in '<main>': invalid argument: --capitalize foo bar (OptionParser::InvalidArgument) diff --git a/doc/optparse/creates_option.rdoc b/doc/optparse/creates_option.rdoc new file mode 100644 index 0000000000..ab672d5124 --- /dev/null +++ b/doc/optparse/creates_option.rdoc @@ -0,0 +1,7 @@ +Creates an option from the given parameters +params+. +See {Parameters for New Options}[optparse/option_params.rdoc]. + +The block, if given, is the handler for the created option. +When the option is encountered during command-line parsing, +the block is called with the argument given for the option, if any. +See {Option Handlers}[optparse/option_params.rdoc#label-Option+Handlers]. diff --git a/doc/optparse/option_params.rdoc b/doc/optparse/option_params.rdoc new file mode 100644 index 0000000000..575ee66cdb --- /dev/null +++ b/doc/optparse/option_params.rdoc @@ -0,0 +1,520 @@ +== Parameters for New Options + +Option-creating methods in +OptionParser+ +accept arguments that determine the behavior of a new option: + +- OptionParser#on +- OptionParser#on_head +- OptionParser#on_tail +- OptionParser#define +- OptionParser#define_head +- OptionParser#define_tail +- OptionParser#make_switch + +The code examples on this page use: + +- OptionParser#on, to define options. +- OptionParser#parse!, to parse the command line. +- Built-in option <tt>--help</tt>, to display defined options. 
+ +Contents: + +- {Option Names}[#label-Option+Names] + - {Short Names}[#label-Short+Names] + - {Simple Short Names}[#label-Simple+Short+Names] + - {Short Names with Required Arguments}[#label-Short+Names+with+Required+Arguments] + - {Short Names with Optional Arguments}[#label-Short+Names+with+Optional+Arguments] + - {Short Names from Range}[#label-Short+Names+from+Range] + - {Long Names}[#label-Long+Names] + - {Simple Long Names}[#label-Simple+Long+Names] + - {Long Names with Required Arguments}[#label-Long+Names+with+Required+Arguments] + - {Long Names with Optional Arguments}[#label-Long+Names+with+Optional+Arguments] + - {Long Names with Negation}[#label-Long+Names+with+Negation] + - {Mixed Names}[#label-Mixed+Names] +- {Argument Strings}[#label-Argument+Strings] +- {Argument Values}[#label-Argument+Values] + - {Explicit Argument Values}[#label-Explicit+Argument+Values] + - {Explicit Values in Array}[#label-Explicit+Values+in+Array] + - {Explicit Values in Hash}[#label-Explicit+Values+in+Hash] + - {Argument Value Patterns}[#label-Argument+Value+Patterns] +- {Argument Converters}[#label-Argument+Converters] +- {Descriptions}[#label-Descriptions] +- {Option Handlers}[#label-Option+Handlers] + - {Handler Blocks}[#label-Handler+Blocks] + - {Handler Procs}[#label-Handler+Procs] + - {Handler Methods}[#label-Handler+Methods] + +=== Option Names + +There are two kinds of option names: + +- Short option name, consisting of a single hyphen and a single character. +- Long option name, consisting of two hyphens and one or more characters. + +==== Short Names + +===== Simple Short Names + +File +short_simple.rb+ defines two options: + +- One with short name <tt>-x</tt>. +- The other with two short names, in effect, aliases, <tt>-1</tt> and <tt>-%</tt>. 
+ + :include: ruby/short_simple.rb + +Executions: + + $ ruby short_simple.rb --help + Usage: short_simple [options] + -x One short name + -1, -% Two short names (aliases) + $ ruby short_simple.rb -x + ["-x", true] + $ ruby short_simple.rb -1 -x -% + ["-1 or -%", true] + ["-x", true] + ["-1 or -%", true] + +===== Short Names with Required Arguments + +A short name followed (no whitespace) by a dummy word +defines an option that requires an argument. + +File +short_required.rb+ defines an option <tt>-x</tt> +that requires an argument. + + :include: ruby/short_required.rb + +Executions: + + $ ruby short_required.rb --help + Usage: short_required [options] + -xXXX Short name with required argument + $ ruby short_required.rb -x + short_required.rb:6:in '<main>': missing argument: -x (OptionParser::MissingArgument) + $ ruby short_required.rb -x FOO + ["-x", "FOO"] + +===== Short Names with Optional Arguments + +A short name followed (with whitespace) by a dummy word in square brackets +defines an option that allows an optional argument. + +File +short_optional.rb+ defines an option <tt>-x</tt> +that allows an optional argument. + + :include: ruby/short_optional.rb + +Executions: + + $ ruby short_optional.rb --help + Usage: short_optional [options] + -x [XXX] Short name with optional argument + $ ruby short_optional.rb -x + ["-x", nil] + $ ruby short_optional.rb -x FOO + ["-x", "FOO"] + +===== Short Names from Range + +You can define an option with multiple short names +taken from a range of characters. +The parser yields both the actual character cited and the value. + +File +short_range.rb+ defines an option with short names +for all printable characters from <tt>!</tt> to <tt>~</tt>: + + :include: ruby/short_range.rb + +Executions: + + $ ruby short_range.rb --help + Usage: short_range [options] + -[!-~] Short names in (very large) range + $ ruby short_range.rb -! + ["!-~", "!", nil] + $ ruby short_range.rb -! 
+ ["!-~", "!", nil] + $ ruby short_range.rb -A + ["!-~", "A", nil] + $ ruby short_range.rb -z + ["!-~", "z", nil] + +==== Long Names + +===== Simple Long Names + +File +long_simple.rb+ defines two options: + +- One with long name <tt>--xxx</tt>. +- The other with two long names, in effect, aliases, + <tt>--y1%</tt> and <tt>--z2#</tt>. + + :include: ruby/long_simple.rb + +Executions: + + $ ruby long_simple.rb --help + Usage: long_simple [options] + --xxx One long name + --y1%, --z2# Two long names (aliases) + $ ruby long_simple.rb --xxx + ["--xxx", true] + $ ruby long_simple.rb --y1% --xxx --z2# + ["--y1% or --z2#", true] + ["--xxx", true] + ["--y1% or --z2#", true] + +===== Long Names with Required Arguments + +A long name followed (with whitespace) by a dummy word +defines an option that requires an argument. + +File +long_required.rb+ defines an option <tt>--xxx</tt> +that requires an argument. + + :include: ruby/long_required.rb + +Executions: + + $ ruby long_required.rb --help + Usage: long_required [options] + --xxx XXX Long name with required argument + $ ruby long_required.rb --xxx + long_required.rb:6:in '<main>': missing argument: --xxx (OptionParser::MissingArgument) + $ ruby long_required.rb --xxx FOO + ["--xxx", "FOO"] + +===== Long Names with Optional Arguments + +A long name followed (with whitespace) by a dummy word in square brackets +defines an option that allows an optional argument. + +File +long_optional.rb+ defines an option <tt>--xxx</tt> +that allows an optional argument. + + :include: ruby/long_optional.rb + +Executions: + + $ ruby long_optional.rb --help + Usage: long_optional [options] + --xxx [XXX] Long name with optional argument + $ ruby long_optional.rb --xxx + ["--xxx", nil] + $ ruby long_optional.rb --xxx FOO + ["--xxx", "FOO"] + +===== Long Names with Negation + +A long name may be defined with both positive and negative senses. + +File +long_with_negation.rb+ defines an option that has both senses. 
+ + :include: ruby/long_with_negation.rb + +Executions: + + $ ruby long_with_negation.rb --help + Usage: long_with_negation [options] + --[no-]binary Long name with negation + $ ruby long_with_negation.rb --binary + [true, TrueClass] + $ ruby long_with_negation.rb --no-binary + [false, FalseClass] + +==== Mixed Names + +An option may have both short and long names. + +File +mixed_names.rb+ defines a mixture of short and long names. + + :include: ruby/mixed_names.rb + +Executions: + + $ ruby mixed_names.rb --help +Usage: mixed_names [options] + -x, --xxx Short and long, no argument + -y, --yyyYYY Short and long, required argument + -z, --zzz [ZZZ] Short and long, optional argument + $ ruby mixed_names.rb -x + ["--xxx", true] + $ ruby mixed_names.rb --xxx + ["--xxx", true] + $ ruby mixed_names.rb -y + mixed_names.rb:12:in '<main>': missing argument: -y (OptionParser::MissingArgument) + $ ruby mixed_names.rb -y FOO + ["--yyy", "FOO"] + $ ruby mixed_names.rb --yyy + mixed_names.rb:12:in '<main>': missing argument: --yyy (OptionParser::MissingArgument) + $ ruby mixed_names.rb --yyy BAR + ["--yyy", "BAR"] + $ ruby mixed_names.rb -z + ["--zzz", nil] + $ ruby mixed_names.rb -z BAZ + ["--zzz", "BAZ"] + $ ruby mixed_names.rb --zzz + ["--zzz", nil] + $ ruby mixed_names.rb --zzz BAT + ["--zzz", "BAT"] + +=== Argument Keywords + +As seen above, a given option name string may itself +indicate whether the option has no argument, a required argument, +or an optional argument. + +An alternative is to use a separate symbol keyword, +which is one of <tt>:NONE</tt> (the default), +<tt>:REQUIRED</tt>, <tt>:OPTIONAL</tt>. + +File +argument_keywords.rb+ defines an option with a required argument. 
+ + :include: ruby/argument_keywords.rb + +Executions: + + $ ruby argument_keywords.rb --help + Usage: argument_keywords [options] + -x, --xxx Required argument + $ ruby argument_keywords.rb --xxx + argument_keywords.rb:6:in '<main>': missing argument: --xxx (OptionParser::MissingArgument) + $ ruby argument_keywords.rb --xxx FOO + ["--xxx", "FOO"] + +=== Argument Strings + +Still another way to specify a required argument +is to define it in a string separate from the name string. + +File +argument_strings.rb+ defines an option with a required argument. + + :include: ruby/argument_strings.rb + +Executions: + + $ ruby argument_strings.rb --help + Usage: argument_strings [options] + -x, --xxx=XXX Required argument + $ ruby argument_strings.rb --xxx + argument_strings.rb:9:in '<main>': missing argument: --xxx (OptionParser::MissingArgument) + $ ruby argument_strings.rb --xxx FOO + ["--xxx", "FOO"] + +=== Argument Values + +Permissible argument values may be restricted +either by specifying explicit values +or by providing a pattern that the given value must match. + +==== Explicit Argument Values + +You can specify argument values in either of two ways: + +- Specify values in an array of strings. +- Specify values in a hash. + +===== Explicit Values in Array + +You can specify explicit argument values in an array of strings. +The argument value must be one of those strings, or an unambiguous abbreviation. + +File +explicit_array_values.rb+ defines options with explicit argument values. 
+ + :include: ruby/explicit_array_values.rb + +Executions: + + $ ruby explicit_array_values.rb --help + Usage: explicit_array_values [options] + -xXXX Values for required argument + -y [YYY] Values for optional argument + $ ruby explicit_array_values.rb -x + explicit_array_values.rb:9:in '<main>': missing argument: -x (OptionParser::MissingArgument) + $ ruby explicit_array_values.rb -x foo + ["-x", "foo"] + $ ruby explicit_array_values.rb -x f + ["-x", "foo"] + $ ruby explicit_array_values.rb -x bar + ["-x", "bar"] + $ ruby explicit_array_values.rb -y ba + explicit_array_values.rb:9:in '<main>': ambiguous argument: -y ba (OptionParser::AmbiguousArgument) + $ ruby explicit_array_values.rb -x baz + explicit_array_values.rb:9:in '<main>': invalid argument: -x baz (OptionParser::InvalidArgument) + + +===== Explicit Values in Hash + +You can specify explicit argument values in a hash with string keys. +The value passed must be one of those keys, or an unambiguous abbreviation; +the value yielded will be the value for that key. + +File +explicit_hash_values.rb+ defines options with explicit argument values. 
+ + :include: ruby/explicit_hash_values.rb + +Executions: + + $ ruby explicit_hash_values.rb --help + Usage: explicit_hash_values [options] + -xXXX Values for required argument + -y [YYY] Values for optional argument + $ ruby explicit_hash_values.rb -x + explicit_hash_values.rb:9:in '<main>': missing argument: -x (OptionParser::MissingArgument) + $ ruby explicit_hash_values.rb -x foo + ["-x", 0] + $ ruby explicit_hash_values.rb -x f + ["-x", 0] + $ ruby explicit_hash_values.rb -x bar + ["-x", 1] + $ ruby explicit_hash_values.rb -x baz + explicit_hash_values.rb:9:in '<main>': invalid argument: -x baz (OptionParser::InvalidArgument) + $ ruby explicit_hash_values.rb -y + ["-y", nil] + $ ruby explicit_hash_values.rb -y baz + ["-y", 2] + $ ruby explicit_hash_values.rb -y bat + ["-y", 3] + $ ruby explicit_hash_values.rb -y ba + explicit_hash_values.rb:9:in '<main>': ambiguous argument: -y ba (OptionParser::AmbiguousArgument) + $ ruby explicit_hash_values.rb -y bam + ["-y", nil] + +==== Argument Value Patterns + +You can restrict permissible argument values +by specifying a +Regexp+ that the given argument must match, +or a +Range+ or +Array+ that the converted value must be included in. + +File +matched_values.rb+ defines options with matched argument values. 
+ + :include: ruby/matched_values.rb + +Executions: + + $ ruby matched_values.rb --help + Usage: matched_values [options] + --xxx XXX Matched values + --yyy YYY Check by range + --zzz ZZZ Check by list + $ ruby matched_values.rb --xxx foo + ["--xxx", "foo"] + $ ruby matched_values.rb --xxx FOO + ["--xxx", "FOO"] + $ ruby matched_values.rb --xxx bar + matched_values.rb:12:in '<main>': invalid argument: --xxx bar (OptionParser::InvalidArgument) + $ ruby matched_values.rb --yyy 1 + ["--yyy", 1] + $ ruby matched_values.rb --yyy 4 + matched_values.rb:12:in '<main>': invalid argument: --yyy 4 (OptionParser::InvalidArgument) + $ ruby matched_values.rb --zzz 1 + ["--zzz", 1] + $ ruby matched_values.rb --zzz 2 + matched_values.rb:12:in '<main>': invalid argument: --zzz 2 (OptionParser::InvalidArgument) + +=== Argument Converters + +An option can specify that its argument is to be converted +from the default +String+ to an instance of another class. + +There are a number of built-in converters. +You can also define custom converters. + +See {Argument Converters}[./argument_converters.rdoc]. + +=== Descriptions + +A description parameter is any string parameter +that is not recognized as an +{option name}[#label-Option+Names] or a +{terminator}[#label-Terminators]; +in other words, it does not begin with a hyphen. + +You may give any number of description parameters; +each becomes a line in the text generated by option <tt>--help</tt>. + +File +descriptions.rb+ has six strings in its array +descriptions+. +These are all passed as parameters to OptionParser#on, so that they +all, line for line, become the option's description. + + :include: ruby/descriptions.rb + +Executions: + + $ ruby descriptions.rb --help + Usage: descriptions [options] + --xxx Lorem ipsum dolor sit amet, consectetuer + adipiscing elit. Aenean commodo ligula eget. + Aenean massa. Cum sociis natoque penatibus + et magnis dis parturient montes, nascetur + ridiculus mus. 
Donec quam felis, ultricies + nec, pellentesque eu, pretium quis, sem. + $ ruby descriptions.rb --xxx + ["--xxx", true] + +=== Option Handlers + +The handler for an option is an executable that will be called +when the option is encountered. The handler may be: + +- A block (this is most often seen). +- A proc. +- A method. + +==== Handler Blocks + +An option handler may be a block. + +File +block.rb+ defines an option that has a handler block. + + :include: ruby/block.rb + +Executions: + + $ ruby block.rb --help + Usage: block [options] + --xxx Option with no argument + --yyy YYY Option with required argument + $ ruby block.rb --xxx + ["Handler block for -xxx called with value:", true] + $ ruby block.rb --yyy FOO + ["Handler block for -yyy called with value:", "FOO"] + +==== Handler Procs + +An option handler may be a Proc. + +File +proc.rb+ defines an option that has a handler proc. + + :include: ruby/proc.rb + +Executions: + + $ ruby proc.rb --help + Usage: proc [options] + --xxx Option with no argument + --yyy YYY Option with required argument + $ ruby proc.rb --xxx + ["Handler proc for -xxx called with value:", true] + $ ruby proc.rb --yyy FOO + ["Handler proc for -yyy called with value:", "FOO"] + +==== Handler Methods + +An option handler may be a Method. + +File +method.rb+ defines an option that has a handler method. 
+ + :include: ruby/method.rb + +Executions: + + $ ruby method.rb --help + Usage: method [options] + --xxx Option with no argument + --yyy YYY Option with required argument + $ ruby method.rb --xxx + ["Handler method for -xxx called with value:", true] + $ ruby method.rb --yyy FOO + ["Handler method for -yyy called with value:", "FOO"] diff --git a/doc/optparse/ruby/argument_abbreviation.rb b/doc/optparse/ruby/argument_abbreviation.rb new file mode 100644 index 0000000000..49007ebe69 --- /dev/null +++ b/doc/optparse/ruby/argument_abbreviation.rb @@ -0,0 +1,9 @@ +require 'optparse' +parser = OptionParser.new +parser.on('-x', '--xxx=VALUE', %w[ABC def], 'Argument abbreviations') do |value| + p ['--xxx', value] +end +parser.on('-y', '--yyy=VALUE', {"abc"=>"XYZ", def: "FOO"}, 'Argument abbreviations') do |value| + p ['--yyy', value] +end +parser.parse! diff --git a/doc/optparse/ruby/argument_keywords.rb b/doc/optparse/ruby/argument_keywords.rb new file mode 100644 index 0000000000..8533257c67 --- /dev/null +++ b/doc/optparse/ruby/argument_keywords.rb @@ -0,0 +1,6 @@ +require 'optparse' +parser = OptionParser.new +parser.on('-x', '--xxx', :REQUIRED, 'Required argument') do |value| + p ['--xxx', value] +end +parser.parse! diff --git a/doc/optparse/ruby/argument_strings.rb b/doc/optparse/ruby/argument_strings.rb new file mode 100644 index 0000000000..77861dda30 --- /dev/null +++ b/doc/optparse/ruby/argument_strings.rb @@ -0,0 +1,6 @@ +require 'optparse' +parser = OptionParser.new +parser.on('-x', '--xxx', '=XXX', 'Required argument') do |value| + p ['--xxx', value] +end +parser.parse! 
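The files argument_keywords.rb and argument_strings.rb above, together with the name-string form used in long_required.rb, give three ways to declare a required argument. The following is a minimal sketch comparing them, not part of the patch: the helper +handled_values+ is a hypothetical name, and it parses an explicit array instead of ARGV so that it can be run anywhere.

```ruby
require 'optparse'

# Hypothetical helper (not part of the patch): defines one option from the
# given parameters and returns every value passed to its handler.
def handled_values(params, argv)
  seen = []
  parser = OptionParser.new
  parser.on(*params) { |value| seen << value }
  parser.parse(argv) # parses the given array; ARGV is not touched
  seen
end

# Three ways to say that --xxx requires an argument.
[
  ['--xxx XXX'],          # dummy word in the name string
  ['--xxx', :REQUIRED],   # separate symbol keyword
  ['--xxx', '=XXX'],      # separate argument string
].each do |params|
  p handled_values(params, %w[--xxx FOO])
end
```

Each call should yield the same single value, "FOO". Omitting the argument raises OptionParser::MissingArgument in all three cases, as in the executions shown in the documentation.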
diff --git a/doc/optparse/ruby/argv.rb b/doc/optparse/ruby/argv.rb new file mode 100644 index 0000000000..12495cfa1f --- /dev/null +++ b/doc/optparse/ruby/argv.rb @@ -0,0 +1,2 @@ +p ARGV + diff --git a/doc/optparse/ruby/array.rb b/doc/optparse/ruby/array.rb new file mode 100644 index 0000000000..7c6c14fad4 --- /dev/null +++ b/doc/optparse/ruby/array.rb @@ -0,0 +1,6 @@ +require 'optparse' +parser = OptionParser.new +parser.on('--array=ARRAY', Array) do |value| + p [value, value.class] +end +parser.parse! diff --git a/doc/optparse/ruby/basic.rb b/doc/optparse/ruby/basic.rb new file mode 100644 index 0000000000..91d37627c0 --- /dev/null +++ b/doc/optparse/ruby/basic.rb @@ -0,0 +1,17 @@ +# Require the OptionParser code. +require 'optparse' +# Create an OptionParser object. +parser = OptionParser.new +# Define one or more options. +parser.on('-x', 'Whether to X') do |value| + p ['x', value] +end +parser.on('-y', 'Whether to Y') do |value| + p ['y', value] +end +parser.on('-z', 'Whether to Z') do |value| + p ['z', value] +end +# Parse the command line and return pared-down ARGV. +p parser.parse! + diff --git a/doc/optparse/ruby/block.rb b/doc/optparse/ruby/block.rb new file mode 100644 index 0000000000..c4dfdeb31e --- /dev/null +++ b/doc/optparse/ruby/block.rb @@ -0,0 +1,9 @@ +require 'optparse' +parser = OptionParser.new +parser.on('--xxx', 'Option with no argument') do |value| + p ['Handler block for -xxx called with value:', value] +end +parser.on('--yyy YYY', 'Option with required argument') do |value| + p ['Handler block for -yyy called with value:', value] +end +parser.parse! 
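The comment in basic.rb notes that OptionParser#parse! returns the pared-down ARGV. A small sketch of what that means, assuming a hypothetical helper +split_command_line+ and an explicit array in place of ARGV; the option definitions are borrowed from basic.rb.

```ruby
require 'optparse'

# Hypothetical helper (not part of the patch): returns the flags seen and
# the arguments left over after the options have been removed.
def split_command_line(argv)
  flags = []
  parser = OptionParser.new
  parser.on('-x', 'Whether to X') { flags << 'x' }
  parser.on('-y', 'Whether to Y') { flags << 'y' }
  rest = parser.parse!(argv) # removes recognized options from argv in place
  [flags, rest]
end

argv = %w[-x -y input.txt other.txt]
flags, rest = split_command_line(argv)
p flags # the options that were handled
p rest  # the non-option arguments that remain
p argv  # the same array; parse! modified it in place
```

The non-option arguments are placed last here so that the result does not depend on whether environment variable POSIXLY_CORRECT is set.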
diff --git a/doc/optparse/ruby/collected_options.rb b/doc/optparse/ruby/collected_options.rb new file mode 100644 index 0000000000..2115e03a9a --- /dev/null +++ b/doc/optparse/ruby/collected_options.rb @@ -0,0 +1,8 @@ +require 'optparse' +parser = OptionParser.new +parser.on('-x', '--xxx', 'Short and long, no argument') +parser.on('-yYYY', '--yyy', 'Short and long, required argument') +parser.on('-z [ZZZ]', '--zzz', 'Short and long, optional argument') +options = {} +parser.parse!(into: options) +p options diff --git a/doc/optparse/ruby/custom_converter.rb b/doc/optparse/ruby/custom_converter.rb new file mode 100644 index 0000000000..029da08c46 --- /dev/null +++ b/doc/optparse/ruby/custom_converter.rb @@ -0,0 +1,9 @@ +require 'optparse/date' +parser = OptionParser.new +parser.accept(Complex) do |value| + value.to_c +end +parser.on('--complex COMPLEX', Complex) do |value| + p [value, value.class] +end +parser.parse! diff --git a/doc/optparse/ruby/date.rb b/doc/optparse/ruby/date.rb new file mode 100644 index 0000000000..5994ad6a85 --- /dev/null +++ b/doc/optparse/ruby/date.rb @@ -0,0 +1,6 @@ +require 'optparse/date' +parser = OptionParser.new +parser.on('--date=DATE', Date) do |value| + p [value, value.class] +end +parser.parse! diff --git a/doc/optparse/ruby/datetime.rb b/doc/optparse/ruby/datetime.rb new file mode 100644 index 0000000000..b9b591d5f6 --- /dev/null +++ b/doc/optparse/ruby/datetime.rb @@ -0,0 +1,6 @@ +require 'optparse/date' +parser = OptionParser.new +parser.on('--datetime=DATETIME', DateTime) do |value| + p [value, value.class] +end +parser.parse! 
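File collected_options.rb above collects option values into a hash through keyword argument +into+. The sketch below wraps the same three option definitions in a hypothetical helper, +collect_options+, to show how values given on the command line combine with defaults already present in the hash; it is an illustration only, and it parses an explicit array instead of ARGV.

```ruby
require 'optparse'

# Hypothetical helper (not part of the patch): parses argv and returns the
# collected options, starting from a copy of the given defaults.
def collect_options(argv, defaults = {})
  options = defaults.dup
  parser = OptionParser.new
  parser.on('-x', '--xxx', 'Short and long, no argument')
  parser.on('-yYYY', '--yyy', 'Short and long, required argument')
  parser.on('-z [ZZZ]', '--zzz', 'Short and long, optional argument')
  parser.parse(argv, into: options) # keys are the long names, as symbols
  options
end

p collect_options(%w[-x --yyy FOO])
p collect_options(%w[--yyy FOO], { yyy: 'AAA', zzz: 'BBB' })
```

In the second call the value for +:yyy+ from the command line replaces the default, while the default for +:zzz+ is kept because the option was not given.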
diff --git a/doc/optparse/ruby/decimal_integer.rb b/doc/optparse/ruby/decimal_integer.rb new file mode 100644 index 0000000000..360bd284f8 --- /dev/null +++ b/doc/optparse/ruby/decimal_integer.rb @@ -0,0 +1,7 @@ +require 'optparse' +include OptionParser::Acceptables +parser = OptionParser.new +parser.on('--decimal_integer=DECIMAL_INTEGER', DecimalInteger) do |value| + p [value, value.class] +end +parser.parse! diff --git a/doc/optparse/ruby/decimal_numeric.rb b/doc/optparse/ruby/decimal_numeric.rb new file mode 100644 index 0000000000..954da13561 --- /dev/null +++ b/doc/optparse/ruby/decimal_numeric.rb @@ -0,0 +1,7 @@ +require 'optparse' +include OptionParser::Acceptables +parser = OptionParser.new +parser.on('--decimal_numeric=DECIMAL_NUMERIC', DecimalNumeric) do |value| + p [value, value.class] +end +parser.parse! diff --git a/doc/optparse/ruby/default_values.rb b/doc/optparse/ruby/default_values.rb new file mode 100644 index 0000000000..24c26faea2 --- /dev/null +++ b/doc/optparse/ruby/default_values.rb @@ -0,0 +1,8 @@ +require 'optparse' +parser = OptionParser.new +parser.on('-x', '--xxx', 'Short and long, no argument') +parser.on('-yYYY', '--yyy', 'Short and long, required argument') +parser.on('-z [ZZZ]', '--zzz', 'Short and long, optional argument') +options = {yyy: 'AAA', zzz: 'BBB'} +parser.parse!(into: options) +p options diff --git a/doc/optparse/ruby/descriptions.rb b/doc/optparse/ruby/descriptions.rb new file mode 100644 index 0000000000..9aec80aae2 --- /dev/null +++ b/doc/optparse/ruby/descriptions.rb @@ -0,0 +1,15 @@ +require 'optparse' +parser = OptionParser.new +description = <<-EOT +Lorem ipsum dolor sit amet, consectetuer +adipiscing elit. Aenean commodo ligula eget. +Aenean massa. Cum sociis natoque penatibus +et magnis dis parturient montes, nascetur +ridiculus mus. Donec quam felis, ultricies +nec, pellentesque eu, pretium quis, sem. 
+EOT +descriptions = description.split($/) +parser.on('--xxx', *descriptions) do |value| + p ['--xxx', value] +end +parser.parse! diff --git a/doc/optparse/ruby/explicit_array_values.rb b/doc/optparse/ruby/explicit_array_values.rb new file mode 100644 index 0000000000..64f930a4bc --- /dev/null +++ b/doc/optparse/ruby/explicit_array_values.rb @@ -0,0 +1,9 @@ +require 'optparse' +parser = OptionParser.new +parser.on('-xXXX', ['foo', 'bar'], 'Values for required argument' ) do |value| + p ['-x', value] +end +parser.on('-y [YYY]', ['baz', 'bat'], 'Values for optional argument') do |value| + p ['-y', value] +end +parser.parse! diff --git a/doc/optparse/ruby/explicit_hash_values.rb b/doc/optparse/ruby/explicit_hash_values.rb new file mode 100644 index 0000000000..9c9e6a48ed --- /dev/null +++ b/doc/optparse/ruby/explicit_hash_values.rb @@ -0,0 +1,9 @@ +require 'optparse' +parser = OptionParser.new +parser.on('-xXXX', {foo: 0, bar: 1}, 'Values for required argument' ) do |value| + p ['-x', value] +end +parser.on('-y [YYY]', {baz: 2, bat: 3}, 'Values for optional argument') do |value| + p ['-y', value] +end +parser.parse! diff --git a/doc/optparse/ruby/false_class.rb b/doc/optparse/ruby/false_class.rb new file mode 100644 index 0000000000..04fe335ede --- /dev/null +++ b/doc/optparse/ruby/false_class.rb @@ -0,0 +1,6 @@ +require 'optparse' +parser = OptionParser.new +parser.on('--false_class=FALSE_CLASS', FalseClass) do |value| + p [value, value.class] +end +parser.parse! diff --git a/doc/optparse/ruby/float.rb b/doc/optparse/ruby/float.rb new file mode 100644 index 0000000000..390df7f7bd --- /dev/null +++ b/doc/optparse/ruby/float.rb @@ -0,0 +1,6 @@ +require 'optparse' +parser = OptionParser.new +parser.on('--float=FLOAT', Float) do |value| + p [value, value.class] +end +parser.parse! 
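Files explicit_array_values.rb and explicit_hash_values.rb above restrict an argument to explicit values. A compact sketch of the difference between the two forms follows; the helper +explicit_value+ is hypothetical, and the value lists are the ones used in those files.

```ruby
require 'optparse'

# Hypothetical helper (not part of the patch): defines -x with the given
# explicit values and returns what the handler receives.
def explicit_value(values, argv)
  result = nil
  parser = OptionParser.new
  parser.on('-xXXX', values, 'Values for required argument') do |value|
    result = value
  end
  parser.parse(argv)
  result
end

# Array: the handler receives the matching string itself.
p explicit_value(%w[foo bar], %w[-x f])          # prints "foo"
# Hash: the handler receives the value stored under the matching key.
p explicit_value({ foo: 0, bar: 1 }, %w[-x bar]) # prints 1
```

An argument that matches none of the listed values raises OptionParser::InvalidArgument, as the executions in the documentation show.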
diff --git a/doc/optparse/ruby/help.rb b/doc/optparse/ruby/help.rb new file mode 100644 index 0000000000..95bcde1d77 --- /dev/null +++ b/doc/optparse/ruby/help.rb @@ -0,0 +1,18 @@ +require 'optparse' +parser = OptionParser.new +parser.on( + '-x', '--xxx', + 'Adipiscing elit. Aenean commodo ligula eget.', + 'Aenean massa. Cum sociis natoque penatibus', + ) +parser.on( + '-y', '--yyy YYY', + 'Lorem ipsum dolor sit amet, consectetuer.' +) +parser.on( + '-z', '--zzz [ZZZ]', + 'Et magnis dis parturient montes, nascetur', + 'ridiculus mus. Donec quam felis, ultricies', + 'nec, pellentesque eu, pretium quis, sem.', + ) +parser.parse! diff --git a/doc/optparse/ruby/help_banner.rb b/doc/optparse/ruby/help_banner.rb new file mode 100644 index 0000000000..0943a3e029 --- /dev/null +++ b/doc/optparse/ruby/help_banner.rb @@ -0,0 +1,7 @@ +require 'optparse' +parser = OptionParser.new +parser.banner = "Usage: ruby help_banner.rb" +parser.parse! + + + diff --git a/doc/optparse/ruby/help_format.rb b/doc/optparse/ruby/help_format.rb new file mode 100644 index 0000000000..a2f1e85b00 --- /dev/null +++ b/doc/optparse/ruby/help_format.rb @@ -0,0 +1,25 @@ +require 'optparse' +parser = OptionParser.new( + 'ruby help_format.rb [options]', # Banner + 20, # Width of options field + ' ' * 2 # Indentation +) +parser.on( + '-x', '--xxx', + 'Adipiscing elit. Aenean commodo ligula eget.', + 'Aenean massa. Cum sociis natoque penatibus', + ) +parser.on( + '-y', '--yyy YYY', + 'Lorem ipsum dolor sit amet, consectetuer.' +) +parser.on( + '-z', '--zzz [ZZZ]', + 'Et magnis dis parturient montes, nascetur', + 'ridiculus mus. Donec quam felis, ultricies', + 'nec, pellentesque eu, pretium quis, sem.', + ) +parser.parse! 
+ + + diff --git a/doc/optparse/ruby/help_program_name.rb b/doc/optparse/ruby/help_program_name.rb new file mode 100644 index 0000000000..7b3fbff067 --- /dev/null +++ b/doc/optparse/ruby/help_program_name.rb @@ -0,0 +1,7 @@ +require 'optparse' +parser = OptionParser.new +parser.program_name = 'help_program_name.rb' +parser.parse! + + + diff --git a/doc/optparse/ruby/integer.rb b/doc/optparse/ruby/integer.rb new file mode 100644 index 0000000000..f10656ff1a --- /dev/null +++ b/doc/optparse/ruby/integer.rb @@ -0,0 +1,6 @@ +require 'optparse' +parser = OptionParser.new +parser.on('--integer=INTEGER', Integer) do |value| + p [value, value.class] +end +parser.parse! diff --git a/doc/optparse/ruby/long_names.rb b/doc/optparse/ruby/long_names.rb new file mode 100644 index 0000000000..a49dbda69f --- /dev/null +++ b/doc/optparse/ruby/long_names.rb @@ -0,0 +1,9 @@ +require 'optparse' +parser = OptionParser.new +parser.on('--xxx', 'Long name') do |value| + p ['-xxx', value] +end +parser.on('--y1%', '--z2#', "Two long names") do |value| + p ['--y1% or --z2#', value] +end +parser.parse! diff --git a/doc/optparse/ruby/long_optional.rb b/doc/optparse/ruby/long_optional.rb new file mode 100644 index 0000000000..38dd82166b --- /dev/null +++ b/doc/optparse/ruby/long_optional.rb @@ -0,0 +1,6 @@ +require 'optparse' +parser = OptionParser.new +parser.on('--xxx [XXX]', 'Long name with optional argument') do |value| + p ['--xxx', value] +end +parser.parse! diff --git a/doc/optparse/ruby/long_required.rb b/doc/optparse/ruby/long_required.rb new file mode 100644 index 0000000000..b76c997339 --- /dev/null +++ b/doc/optparse/ruby/long_required.rb @@ -0,0 +1,6 @@ +require 'optparse' +parser = OptionParser.new +parser.on('--xxx XXX', 'Long name with required argument') do |value| + p ['--xxx', value] +end +parser.parse! 
diff --git a/doc/optparse/ruby/long_simple.rb b/doc/optparse/ruby/long_simple.rb new file mode 100644 index 0000000000..4e489c43ed --- /dev/null +++ b/doc/optparse/ruby/long_simple.rb @@ -0,0 +1,9 @@ +require 'optparse' +parser = OptionParser.new +parser.on('--xxx', 'One long name') do |value| + p ['--xxx', value] +end +parser.on('--y1%', '--z2#', 'Two long names (aliases)') do |value| + p ['--y1% or --z2#', value] +end +parser.parse! diff --git a/doc/optparse/ruby/long_with_negation.rb b/doc/optparse/ruby/long_with_negation.rb new file mode 100644 index 0000000000..3f2913c361 --- /dev/null +++ b/doc/optparse/ruby/long_with_negation.rb @@ -0,0 +1,6 @@ +require 'optparse' +parser = OptionParser.new +parser.on('--[no-]binary', 'Long name with negation') do |value| + p [value, value.class] +end +parser.parse! diff --git a/doc/optparse/ruby/match_converter.rb b/doc/optparse/ruby/match_converter.rb new file mode 100644 index 0000000000..13dc5fcb51 --- /dev/null +++ b/doc/optparse/ruby/match_converter.rb @@ -0,0 +1,9 @@ +require 'optparse/date' +parser = OptionParser.new +parser.accept(:capitalize, /\w*/) do |value| + value.capitalize +end +parser.on('--capitalize XXX', :capitalize) do |value| + p [value, value.class] +end +parser.parse! diff --git a/doc/optparse/ruby/matched_values.rb b/doc/optparse/ruby/matched_values.rb new file mode 100644 index 0000000000..a1aba140e6 --- /dev/null +++ b/doc/optparse/ruby/matched_values.rb @@ -0,0 +1,12 @@ +require 'optparse' +parser = OptionParser.new +parser.on('--xxx XXX', /foo/i, 'Matched values') do |value| + p ['--xxx', value] +end +parser.on('--yyy YYY', Integer, 'Check by range', 1..3) do |value| + p ['--yyy', value] +end +parser.on('--zzz ZZZ', Integer, 'Check by list', [1, 3, 4]) do |value| + p ['--zzz', value] +end +parser.parse! 
diff --git a/doc/optparse/ruby/method.rb b/doc/optparse/ruby/method.rb new file mode 100644 index 0000000000..3f02ff5798 --- /dev/null +++ b/doc/optparse/ruby/method.rb @@ -0,0 +1,11 @@ +require 'optparse' +parser = OptionParser.new +def xxx_handler(value) + p ['Handler method for -xxx called with value:', value] +end +parser.on('--xxx', 'Option with no argument', method(:xxx_handler)) +def yyy_handler(value) + p ['Handler method for -yyy called with value:', value] +end +parser.on('--yyy YYY', 'Option with required argument', method(:yyy_handler)) +parser.parse! diff --git a/doc/optparse/ruby/missing_options.rb b/doc/optparse/ruby/missing_options.rb new file mode 100644 index 0000000000..9428463cfd --- /dev/null +++ b/doc/optparse/ruby/missing_options.rb @@ -0,0 +1,12 @@ +require 'optparse' +parser = OptionParser.new +parser.on('-x', '--xxx', 'Short and long, no argument') +parser.on('-yYYY', '--yyy', 'Short and long, required argument') +parser.on('-z [ZZZ]', '--zzz', 'Short and long, optional argument') +options = {} +parser.parse!(into: options) +required_options = [:xxx, :zzz] +missing_options = required_options - options.keys +unless missing_options.empty? + fail "Missing required options: #{missing_options}" +end diff --git a/doc/optparse/ruby/mixed_names.rb b/doc/optparse/ruby/mixed_names.rb new file mode 100644 index 0000000000..67f81e7e8d --- /dev/null +++ b/doc/optparse/ruby/mixed_names.rb @@ -0,0 +1,12 @@ +require 'optparse' +parser = OptionParser.new +parser.on('-x', '--xxx', 'Short and long, no argument') do |value| + p ['--xxx', value] +end +parser.on('-yYYY', '--yyy', 'Short and long, required argument') do |value| + p ['--yyy', value] +end +parser.on('-z [ZZZ]', '--zzz', 'Short and long, optional argument') do |value| + p ['--zzz', value] +end +parser.parse! 
diff --git a/doc/optparse/ruby/name_abbrev.rb b/doc/optparse/ruby/name_abbrev.rb new file mode 100644 index 0000000000..b438c1b3dd --- /dev/null +++ b/doc/optparse/ruby/name_abbrev.rb @@ -0,0 +1,9 @@ +require 'optparse' +parser = OptionParser.new +parser.on('-n', '--dry-run',) do |value| + p ['--dry-run', value] +end +parser.on('-d', '--draft',) do |value| + p ['--draft', value] +end +parser.parse! diff --git a/doc/optparse/ruby/no_abbreviation.rb b/doc/optparse/ruby/no_abbreviation.rb new file mode 100644 index 0000000000..5464492705 --- /dev/null +++ b/doc/optparse/ruby/no_abbreviation.rb @@ -0,0 +1,10 @@ +require 'optparse' +parser = OptionParser.new +parser.on('-n', '--dry-run',) do |value| + p ['--dry-run', value] +end +parser.on('-d', '--draft',) do |value| + p ['--draft', value] +end +parser.require_exact = true +parser.parse! diff --git a/doc/optparse/ruby/numeric.rb b/doc/optparse/ruby/numeric.rb new file mode 100644 index 0000000000..d7021f154a --- /dev/null +++ b/doc/optparse/ruby/numeric.rb @@ -0,0 +1,6 @@ +require 'optparse' +parser = OptionParser.new +parser.on('--numeric=NUMERIC', Numeric) do |value| + p [value, value.class] +end +parser.parse! diff --git a/doc/optparse/ruby/object.rb b/doc/optparse/ruby/object.rb new file mode 100644 index 0000000000..0f5ae8b922 --- /dev/null +++ b/doc/optparse/ruby/object.rb @@ -0,0 +1,6 @@ +require 'optparse' +parser = OptionParser.new +parser.on('--object=OBJECT', Object) do |value| + p [value, value.class] +end +parser.parse! diff --git a/doc/optparse/ruby/octal_integer.rb b/doc/optparse/ruby/octal_integer.rb new file mode 100644 index 0000000000..b9644a076b --- /dev/null +++ b/doc/optparse/ruby/octal_integer.rb @@ -0,0 +1,7 @@ +require 'optparse' +include OptionParser::Acceptables +parser = OptionParser.new +parser.on('--octal_integer=OCTAL_INTEGER', OctalInteger) do |value| + p [value, value.class] +end +parser.parse! 
diff --git a/doc/optparse/ruby/optional_argument.rb b/doc/optparse/ruby/optional_argument.rb new file mode 100644 index 0000000000..456368a8ba --- /dev/null +++ b/doc/optparse/ruby/optional_argument.rb @@ -0,0 +1,9 @@ +require 'optparse' +parser = OptionParser.new +parser.on('-x [XXX]', '--xxx', 'Optional argument via short name') do |value| + p ['--xxx', value] +end +parser.on('-y', '--yyy [YYY]', 'Optional argument via long name') do |value| + p ['--yyy', value] +end +parser.parse! diff --git a/doc/optparse/ruby/parse.rb b/doc/optparse/ruby/parse.rb new file mode 100644 index 0000000000..a5d4329484 --- /dev/null +++ b/doc/optparse/ruby/parse.rb @@ -0,0 +1,13 @@ +require 'optparse' +parser = OptionParser.new +parser.on('--xxx') do |value| + p ['--xxx', value] +end +parser.on('--yyy YYY') do |value| + p ['--yyy', value] +end +parser.on('--zzz [ZZZ]') do |value| + p ['--zzz', value] +end +ret = parser.parse(ARGV) +puts "Returned: #{ret} (#{ret.class})" diff --git a/doc/optparse/ruby/parse_bang.rb b/doc/optparse/ruby/parse_bang.rb new file mode 100644 index 0000000000..567bc733cf --- /dev/null +++ b/doc/optparse/ruby/parse_bang.rb @@ -0,0 +1,13 @@ +require 'optparse' +parser = OptionParser.new +parser.on('--xxx') do |value| + p ['--xxx', value] +end +parser.on('--yyy YYY') do |value| + p ['--yyy', value] +end +parser.on('--zzz [ZZZ]') do |value| + p ['--zzz', value] +end +ret = parser.parse! +puts "Returned: #{ret} (#{ret.class})" diff --git a/doc/optparse/ruby/proc.rb b/doc/optparse/ruby/proc.rb new file mode 100644 index 0000000000..9c669fdc92 --- /dev/null +++ b/doc/optparse/ruby/proc.rb @@ -0,0 +1,13 @@ +require 'optparse' +parser = OptionParser.new +parser.on( + '--xxx', + 'Option with no argument', + ->(value) {p ['Handler proc for -xxx called with value:', value]} +) +parser.on( + '--yyy YYY', + 'Option with required argument', + ->(value) {p ['Handler proc for -yyy called with value:', value]} +) +parser.parse! 
diff --git a/doc/optparse/ruby/regexp.rb b/doc/optparse/ruby/regexp.rb new file mode 100644 index 0000000000..6aba45ce76 --- /dev/null +++ b/doc/optparse/ruby/regexp.rb @@ -0,0 +1,6 @@ +require 'optparse' +parser = OptionParser.new +parser.on('--regexp=REGEXP', Regexp) do |value| + p [value, value.class] +end +parser.parse! diff --git a/doc/optparse/ruby/required_argument.rb b/doc/optparse/ruby/required_argument.rb new file mode 100644 index 0000000000..228a492c3c --- /dev/null +++ b/doc/optparse/ruby/required_argument.rb @@ -0,0 +1,9 @@ +require 'optparse' +parser = OptionParser.new +parser.on('-x XXX', '--xxx', 'Required argument via short name') do |value| + p ['--xxx', value] +end +parser.on('-y', '--y YYY', 'Required argument via long name') do |value| + p ['--yyy', value] +end +parser.parse! diff --git a/doc/optparse/ruby/shellwords.rb b/doc/optparse/ruby/shellwords.rb new file mode 100644 index 0000000000..d181d4a4f6 --- /dev/null +++ b/doc/optparse/ruby/shellwords.rb @@ -0,0 +1,6 @@ +require 'optparse/shellwords' +parser = OptionParser.new +parser.on('--shellwords=SHELLWORDS', Shellwords) do |value| + p [value, value.class] +end +parser.parse! diff --git a/doc/optparse/ruby/short_names.rb b/doc/optparse/ruby/short_names.rb new file mode 100644 index 0000000000..4a756518fa --- /dev/null +++ b/doc/optparse/ruby/short_names.rb @@ -0,0 +1,9 @@ +require 'optparse' +parser = OptionParser.new +parser.on('-x', 'Short name') do |value| + p ['x', value] +end +parser.on('-1', '-%', 'Two short names') do |value| + p ['-1 or -%', value] +end +parser.parse! diff --git a/doc/optparse/ruby/short_optional.rb b/doc/optparse/ruby/short_optional.rb new file mode 100644 index 0000000000..6eebf01c5f --- /dev/null +++ b/doc/optparse/ruby/short_optional.rb @@ -0,0 +1,6 @@ +require 'optparse' +parser = OptionParser.new +parser.on('-x [XXX]', 'Short name with optional argument') do |value| + p ['-x', value] +end +parser.parse! 
diff --git a/doc/optparse/ruby/short_range.rb b/doc/optparse/ruby/short_range.rb new file mode 100644 index 0000000000..f5b870a4bd --- /dev/null +++ b/doc/optparse/ruby/short_range.rb @@ -0,0 +1,6 @@ +require 'optparse' +parser = OptionParser.new +parser.on('-[!-~]', 'Short names in (very large) range') do |name, value| + p ['!-~', name, value] +end +parser.parse! diff --git a/doc/optparse/ruby/short_required.rb b/doc/optparse/ruby/short_required.rb new file mode 100644 index 0000000000..867c02c9f5 --- /dev/null +++ b/doc/optparse/ruby/short_required.rb @@ -0,0 +1,6 @@ +require 'optparse' +parser = OptionParser.new +parser.on('-xXXX', 'Short name with required argument') do |value| + p ['-x', value] +end +parser.parse! diff --git a/doc/optparse/ruby/short_simple.rb b/doc/optparse/ruby/short_simple.rb new file mode 100644 index 0000000000..d3d489e2dc --- /dev/null +++ b/doc/optparse/ruby/short_simple.rb @@ -0,0 +1,9 @@ +require 'optparse' +parser = OptionParser.new +parser.on('-x', 'One short name') do |value| + p ['-x', value] +end +parser.on('-1', '-%', 'Two short names (aliases)') do |value| + p ['-1 or -%', value] +end +parser.parse! diff --git a/doc/optparse/ruby/string.rb b/doc/optparse/ruby/string.rb new file mode 100644 index 0000000000..fee84a17ea --- /dev/null +++ b/doc/optparse/ruby/string.rb @@ -0,0 +1,6 @@ +require 'optparse' +parser = OptionParser.new +parser.on('--string=STRING', String) do |value| + p [value, value.class] +end +parser.parse! diff --git a/doc/optparse/ruby/terminator.rb b/doc/optparse/ruby/terminator.rb new file mode 100644 index 0000000000..c718ac1a97 --- /dev/null +++ b/doc/optparse/ruby/terminator.rb @@ -0,0 +1,6 @@ +require 'optparse' +parser = OptionParser.new +parser.on('--my_option XXX') do |value| + p [value, value.class] +end +parser.parse! 
diff --git a/doc/optparse/ruby/time.rb b/doc/optparse/ruby/time.rb new file mode 100644 index 0000000000..aa8b0cfa16 --- /dev/null +++ b/doc/optparse/ruby/time.rb @@ -0,0 +1,6 @@ +require 'optparse/time' +parser = OptionParser.new +parser.on('--time=TIME', Time) do |value| + p [value, value.class] +end +parser.parse! diff --git a/doc/optparse/ruby/true_class.rb b/doc/optparse/ruby/true_class.rb new file mode 100644 index 0000000000..40db9d07c5 --- /dev/null +++ b/doc/optparse/ruby/true_class.rb @@ -0,0 +1,6 @@ +require 'optparse' +parser = OptionParser.new +parser.on('--true_class=TRUE_CLASS', TrueClass) do |value| + p [value, value.class] +end +parser.parse! diff --git a/doc/optparse/ruby/uri.rb b/doc/optparse/ruby/uri.rb new file mode 100644 index 0000000000..b492835b4f --- /dev/null +++ b/doc/optparse/ruby/uri.rb @@ -0,0 +1,6 @@ +require 'optparse/uri' +parser = OptionParser.new +parser.on('--uri=URI', URI) do |value| + p [value, value.class] +end +parser.parse! diff --git a/doc/optparse/tutorial.rdoc b/doc/optparse/tutorial.rdoc new file mode 100644 index 0000000000..1134f94ddf --- /dev/null +++ b/doc/optparse/tutorial.rdoc @@ -0,0 +1,858 @@ +== Tutorial + +=== Why +OptionParser+? + +When a Ruby program executes, it captures its command-line arguments +and options into variable ARGV. +This simple program just prints its +ARGV+: + + :include: ruby/argv.rb + +Execution, with arguments and options: + + $ ruby argv.rb foo --bar --baz bat bam + ["foo", "--bar", "--baz", "bat", "bam"] + +The executing program is responsible for parsing and handling +the command-line options. + +OptionParser offers methods for parsing and handling those options. + +With +OptionParser+, you can define options so that for each option: + +- The code that defines the option and code that handles that option + are in the same place. +- The option may take no argument, a required argument, or an optional argument. +- The argument may be automatically converted to a specified class. 
+- The argument may be restricted to specified _forms_. +- The argument may be restricted to specified _values_. + +The class also has method #help, which displays automatically-generated help text. + +=== Contents + +- {To Begin With}[#label-To+Begin+With] +- {Defining Options}[#label-Defining+Options] +- {Option Names}[#label-Option+Names] + - {Short Option Names}[#label-Short+Option+Names] + - {Long Option Names}[#label-Long+Option+Names] + - {Mixing Option Names}[#label-Mixing+Option+Names] + - {Option Name Abbreviations}[#label-Option+Name+Abbreviations] +- {Option Arguments}[#label-Option+Arguments] + - {Option with No Argument}[#label-Option+with+No+Argument] + - {Option with Required Argument}[#label-Option+with+Required+Argument] + - {Option with Optional Argument}[#label-Option+with+Optional+Argument] + - {Argument Abbreviations}[#label-Argument+Abbreviations] +- {Argument Values}[#label-Argument+Values] + - {Explicit Argument Values}[#label-Explicit+Argument+Values] + - {Explicit Values in Array}[#label-Explicit+Values+in+Array] + - {Explicit Values in Hash}[#label-Explicit+Values+in+Hash] + - {Argument Value Patterns}[#label-Argument+Value+Patterns] +- {Keyword Argument into}[#label-Keyword+Argument+into] + - {Collecting Options}[#label-Collecting+Options] + - {Checking for Missing Options}[#label-Checking+for+Missing+Options] + - {Default Values for Options}[#label-Default+Values+for+Options] +- {Argument Converters}[#label-Argument+Converters] +- {Help}[#label-Help] +- {Top List and Base List}[#label-Top+List+and+Base+List] +- {Methods for Defining Options}[#label-Methods+for+Defining+Options] +- {Parsing}[#label-Parsing] + - {Method parse!}[#label-Method+parse-21] + - {Method parse}[#label-Method+parse] + - {Method order!}[#label-Method+order-21] + - {Method order}[#label-Method+order] + - {Method permute!}[#label-Method+permute-21] + - {Method permute}[#label-Method+permute] + +=== To Begin With + +To use +OptionParser+: + +1. 
Require the +OptionParser+ code. +2. Create an +OptionParser+ object. +3. Define one or more options. +4. Parse the command line. + +File +basic.rb+ defines three options, <tt>-x</tt>, +<tt>-y</tt>, and <tt>-z</tt>, each with a descriptive string, +and each with a block. + + :include: ruby/basic.rb + +From these defined options, the parser automatically builds help text: + + $ ruby basic.rb --help + Usage: basic [options] + -x Whether to X + -y Whether to Y + -z Whether to Z + +When an option is found during parsing, +the block defined for the option is called with the argument value. +An invalid option raises an exception. + +Method #parse!, which is used most often in this tutorial, +removes from +ARGV+ the options and arguments it finds, +leaving other non-option arguments for the program to handle on its own. +The method returns the possibly-reduced +ARGV+ array. + +Executions: + + $ ruby basic.rb -x -z + ["x", true] + ["z", true] + [] + $ ruby basic.rb -z -y -x + ["z", true] + ["y", true] + ["x", true] + [] + $ ruby basic.rb -x input_file.txt output_file.txt + ["x", true] + ["input_file.txt", "output_file.txt"] + $ ruby basic.rb -a + basic.rb:16:in '<main>': invalid option: -a (OptionParser::InvalidOption) + +=== Defining Options + +A common way to define an option in +OptionParser+ +is with instance method OptionParser#on. + +The method may be called with any number of arguments +(whose order does not matter), +and may also have a trailing optional keyword argument +into+. + +The given arguments determine the characteristics of the new option. +These may include: + +- One or more short option names. +- One or more long option names. +- Whether the option takes no argument, an optional argument, or a required argument. +- Acceptable _forms_ for the argument. +- Acceptable _values_ for the argument. +- A proc or method to be called when the parser encounters the option. +- String descriptions for the option. 
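As a rough sketch of how several of these kinds of arguments can be combined in calls to OptionParser#on: the option names, the +seen+ hash, and the sample argument array below are made up for illustration and are not among the tutorial's example files.

```ruby
require 'optparse'

parser = OptionParser.new
seen = {} # hypothetical collector for the handled values

# Short name, long name with a required argument, a converter class,
# a description string, and a handler block.
parser.on('-c', '--count COUNT', Integer, 'Number of items') do |value|
  seen[:count] = value
end

# Explicit permissible values, given as an array of strings.
parser.on('-l', '--level LEVEL', %w[quiet loud], 'Verbosity') do |value|
  seen[:level] = value
end

# Long name with both positive and negative senses.
parser.on('-v', '--[no-]verbose', 'Chatty output') do |value|
  seen[:verbose] = value
end

# Parse an explicit array instead of ARGV so the sketch is self-contained.
rest = parser.parse(%w[-c 3 --level loud --no-verbose input.txt])
p seen
p rest
```

Each handler block receives the value after any conversion, and the non-option argument is returned rather than passed to a handler.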
+ +=== Option Names + +You can give an option one or more names of two types: + +- Short (1-character) name, beginning with one hyphen (<tt>-</tt>). +- Long (multi-character) name, beginning with two hyphens (<tt>--</tt>). + +==== Short Option Names + +A short option name consists of a hyphen and a single character. + +File +short_names.rb+ +defines an option with a short name, <tt>-x</tt>, +and an option with two short names (aliases, in effect) <tt>-1</tt> and <tt>-%</tt>. + + :include: ruby/short_names.rb + +Executions: + + $ ruby short_names.rb --help + Usage: short_names [options] + -x Short name + -1, -% Two short names + $ ruby short_names.rb -x + ["x", true] + $ ruby short_names.rb -1 + ["-1 or -%", true] + $ ruby short_names.rb -% + ["-1 or -%", true] + +Multiple short names can "share" a hyphen: + + $ ruby short_names.rb -x1% + ["x", true] + ["-1 or -%", true] + ["-1 or -%", true] + +==== Long Option Names + +A long option name consists of two hyphens and one or more characters +(usually two or more characters). + +File +long_names.rb+ +defines an option with a long name, <tt>--xxx</tt>, +and an option with two long names (aliases, in effect) <tt>--y1%</tt> and <tt>--z2#</tt>. + + :include: ruby/long_names.rb + +Executions: + + $ ruby long_names.rb --help + Usage: long_names [options] + --xxx Long name + --y1%, --z2# Two long names + $ ruby long_names.rb --xxx + ["-xxx", true] + $ ruby long_names.rb --y1% + ["--y1% or --z2#", true] + $ ruby long_names.rb --z2# + ["--y1% or --z2#", true] + +A long name may be defined with both positive and negative senses. + +File +long_with_negation.rb+ defines an option that has both senses. 
+ + :include: ruby/long_with_negation.rb + +Executions: + + $ ruby long_with_negation.rb --help + Usage: long_with_negation [options] + --[no-]binary Long name with negation + $ ruby long_with_negation.rb --binary + [true, TrueClass] + $ ruby long_with_negation.rb --no-binary + [false, FalseClass] + +==== Mixing Option Names + +Many developers like to mix short and long option names, +so that a short name is in effect an abbreviation of a long name. + +File +mixed_names.rb+ +defines options that each have both a short and a long name. + + :include: ruby/mixed_names.rb + +Executions: + + $ ruby mixed_names.rb --help + Usage: mixed_names [options] + -x, --xxx Short and long, no argument + -y, --yyyYYY Short and long, required argument + -z, --zzz [ZZZ] Short and long, optional argument + $ ruby mixed_names.rb -x + ["--xxx", true] + $ ruby mixed_names.rb --xxx + ["--xxx", true] + $ ruby mixed_names.rb -y + mixed_names.rb:12:in '<main>': missing argument: -y (OptionParser::MissingArgument) + $ ruby mixed_names.rb -y FOO + ["--yyy", "FOO"] + $ ruby mixed_names.rb --yyy + mixed_names.rb:12:in '<main>': missing argument: --yyy (OptionParser::MissingArgument) + $ ruby mixed_names.rb --yyy BAR + ["--yyy", "BAR"] + $ ruby mixed_names.rb -z + ["--zzz", nil] + $ ruby mixed_names.rb -z BAZ + ["--zzz", "BAZ"] + $ ruby mixed_names.rb --zzz + ["--zzz", nil] + $ ruby mixed_names.rb --zzz BAT + ["--zzz", "BAT"] + +==== Option Name Abbreviations + +By default, abbreviated option names on the command-line are allowed. +An abbreviated name is valid if it is unique among abbreviated option names. 
+ + :include: ruby/name_abbrev.rb + +Executions: + + $ ruby name_abbrev.rb --help + Usage: name_abbrev [options] + -n, --dry-run + -d, --draft + $ ruby name_abbrev.rb -n + ["--dry-run", true] + $ ruby name_abbrev.rb --dry-run + ["--dry-run", true] + $ ruby name_abbrev.rb -d + ["--draft", true] + $ ruby name_abbrev.rb --draft + ["--draft", true] + $ ruby name_abbrev.rb --d + name_abbrev.rb:9:in '<main>': ambiguous option: --d (OptionParser::AmbiguousOption) + $ ruby name_abbrev.rb --dr + name_abbrev.rb:9:in '<main>': ambiguous option: --dr (OptionParser::AmbiguousOption) + $ ruby name_abbrev.rb --dry + ["--dry-run", true] + $ ruby name_abbrev.rb --dra + ["--draft", true] + +You can disable abbreviation using method +require_exact+. + + :include: ruby/no_abbreviation.rb + +Executions: + + $ ruby no_abbreviation.rb --dry-ru + no_abbreviation.rb:10:in '<main>': invalid option: --dry-ru (OptionParser::InvalidOption) + $ ruby no_abbreviation.rb --dry-run + ["--dry-run", true] + +=== Option Arguments + +An option may take no argument, a required argument, or an optional argument. + +==== Option with No Argument + +All the examples above define options with no argument. + +==== Option with Required Argument + +Specify a required argument for an option by adding a dummy word +to its name definition. + +File +required_argument.rb+ defines two options; +each has a required argument because the name definition has a following dummy word. + + :include: ruby/required_argument.rb + +When an option is found, the given argument is yielded. 
+ +Executions: + + $ ruby required_argument.rb --help + Usage: required_argument [options] + -x, --xxx XXX Required argument via short name + -y, --y YYY Required argument via long name + $ ruby required_argument.rb -x AAA + ["--xxx", "AAA"] + $ ruby required_argument.rb -y BBB + ["--yyy", "BBB"] + +Omitting a required argument raises an error: + + $ ruby required_argument.rb -x + required_argument.rb:9:in '<main>': missing argument: -x (OptionParser::MissingArgument) + +==== Option with Optional Argument + +Specify an optional argument for an option by adding a dummy word +enclosed in square brackets to its name definition. + +File +optional_argument.rb+ defines two options; +each has an optional argument because the name definition has a following dummy word +in square brackets. + + :include: ruby/optional_argument.rb + +When an option with an argument is found, the given argument is yielded. + +Executions: + + $ ruby optional_argument.rb --help + Usage: optional_argument [options] + -x, --xxx [XXX] Optional argument via short name + -y, --yyy [YYY] Optional argument via long name + $ ruby optional_argument.rb -x AAA + ["--xxx", "AAA"] + $ ruby optional_argument.rb -y BBB + ["--yyy", "BBB"] + +Omitting an optional argument does not raise an error. + +==== Argument Abbreviations + +Specify an argument list as an Array or a Hash. + + :include: ruby/argument_abbreviation.rb + +When an argument is abbreviated, the expanded argument is yielded. 
+ +Executions: + + $ ruby argument_abbreviation.rb --help + Usage: argument_abbreviation [options] + Usage: argument_abbreviation [options] + -x, --xxx=VALUE Argument abbreviations + -y, --yyy=VALUE Argument abbreviations + $ ruby argument_abbreviation.rb --xxx A + ["--xxx", "ABC"] + $ ruby argument_abbreviation.rb --xxx c + argument_abbreviation.rb:9:in '<main>': invalid argument: --xxx c (OptionParser::InvalidArgument) + $ ruby argument_abbreviation.rb --yyy a --yyy d + ["--yyy", "XYZ"] + ["--yyy", "FOO"] + +=== Argument Values + +Permissible argument values may be restricted +either by specifying explicit values +or by providing a pattern that the given value must match. + +==== Explicit Argument Values + +You can specify argument values in either of two ways: + +- Specify values as an array of strings. +- Specify values as a hash. + +===== Explicit Values in Array + +You can specify explicit argument values in an array of strings. +The argument value must be one of those strings, or an unambiguous abbreviation. + +File +explicit_array_values.rb+ defines options with explicit argument values. 
+ + :include: ruby/explicit_array_values.rb + +Executions: + + $ ruby explicit_array_values.rb --help + Usage: explicit_array_values [options] + -xXXX Values for required argument + -y [YYY] Values for optional argument + $ ruby explicit_array_values.rb -x + explicit_array_values.rb:9:in '<main>': missing argument: -x (OptionParser::MissingArgument) + $ ruby explicit_array_values.rb -x foo + ["-x", "foo"] + $ ruby explicit_array_values.rb -x f + ["-x", "foo"] + $ ruby explicit_array_values.rb -x bar + ["-x", "bar"] + $ ruby explicit_array_values.rb -y ba + explicit_array_values.rb:9:in '<main>': ambiguous argument: -y ba (OptionParser::AmbiguousArgument) + $ ruby explicit_array_values.rb -x baz + explicit_array_values.rb:9:in '<main>': invalid argument: -x baz (OptionParser::InvalidArgument) + + +===== Explicit Values in Hash + +You can specify explicit argument values in a hash with string keys. +The value passed must be one of those keys, or an unambiguous abbreviation; +the value yielded will be the value for that key. + +File +explicit_hash_values.rb+ defines options with explicit argument values. 
+ + :include: ruby/explicit_hash_values.rb + +Executions: + + $ ruby explicit_hash_values.rb --help + Usage: explicit_hash_values [options] + -xXXX Values for required argument + -y [YYY] Values for optional argument + $ ruby explicit_hash_values.rb -x + explicit_hash_values.rb:9:in '<main>': missing argument: -x (OptionParser::MissingArgument) + $ ruby explicit_hash_values.rb -x foo + ["-x", 0] + $ ruby explicit_hash_values.rb -x f + ["-x", 0] + $ ruby explicit_hash_values.rb -x bar + ["-x", 1] + $ ruby explicit_hash_values.rb -x baz + explicit_hash_values.rb:9:in '<main>': invalid argument: -x baz (OptionParser::InvalidArgument) + $ ruby explicit_hash_values.rb -y + ["-y", nil] + $ ruby explicit_hash_values.rb -y baz + ["-y", 2] + $ ruby explicit_hash_values.rb -y bat + ["-y", 3] + $ ruby explicit_hash_values.rb -y ba + explicit_hash_values.rb:9:in '<main>': ambiguous argument: -y ba (OptionParser::AmbiguousArgument) + $ ruby explicit_hash_values.rb -y bam + ["-y", nil] + +==== Argument Value Patterns + +You can restrict permissible argument values +by specifying a Regexp that the given argument must match. + +File +matched_values.rb+ defines options with matched argument values. + + :include: ruby/matched_values.rb + +Executions: + + $ ruby matched_values.rb --help + Usage: matched_values [options] + --xxx XXX Matched values + $ ruby matched_values.rb --xxx foo + ["--xxx", "foo"] + $ ruby matched_values.rb --xxx FOO + ["--xxx", "FOO"] + $ ruby matched_values.rb --xxx bar + matched_values.rb:6:in '<main>': invalid argument: --xxx bar (OptionParser::InvalidArgument) + +=== Keyword Argument +into+ + +In parsing options, you can add keyword option +into+ with a hash-like argument; +each parsed option will be added as a name/value pair. + +This is useful for: + +- Collecting options. +- Checking for missing options. +- Providing default values for options. + +==== Collecting Options + +Use keyword argument +into+ to collect options. 
+ + :include: ruby/collected_options.rb + +Executions: + + $ ruby collected_options.rb --help + Usage: into [options] + -x, --xxx Short and long, no argument + -y, --yyyYYY Short and long, required argument + -z, --zzz [ZZZ] Short and long, optional argument + $ ruby collected_options.rb --xxx + {:xxx=>true} + $ ruby collected_options.rb --xxx --yyy FOO + {:xxx=>true, :yyy=>"FOO"} + $ ruby collected_options.rb --xxx --yyy FOO --zzz Bar + {:xxx=>true, :yyy=>"FOO", :zzz=>"Bar"} + $ ruby collected_options.rb --xxx --yyy FOO --yyy BAR + {:xxx=>true, :yyy=>"BAR"} + +Note in the last execution that the argument value for option <tt>--yyy</tt> +was overwritten. + +==== Checking for Missing Options + +Use the collected options to check for missing options. + + :include: ruby/missing_options.rb + +Executions: + + $ ruby missing_options.rb --help + Usage: missing_options [options] + -x, --xxx Short and long, no argument + -y, --yyyYYY Short and long, required argument + -z, --zzz [ZZZ] Short and long, optional argument + $ ruby missing_options.rb --yyy FOO + missing_options.rb:11:in '<main>': Missing required options: [:xxx, :zzz] (RuntimeError) + +==== Default Values for Options + +Initialize the +into+ argument to define default values for options. + + :include: ruby/default_values.rb + +Executions: + + $ ruby default_values.rb --help + Usage: default_values [options] + -x, --xxx Short and long, no argument + -y, --yyyYYY Short and long, required argument + -z, --zzz [ZZZ] Short and long, optional argument + $ ruby default_values.rb --yyy FOO + {:yyy=>"FOO", :zzz=>"BBB"} + +=== Argument Converters + +An option can specify that its argument is to be converted +from the default +String+ to an instance of another class. +There are a number of built-in converters. + +Example: File +date.rb+ +defines an option whose argument is to be converted to a +Date+ object. +The argument is converted by method Date#parse. 
+ + :include: ruby/date.rb + +Executions: + + $ ruby date.rb --date 2001-02-03 + [#<Date: 2001-02-03 ((2451944j,0s,0n),+0s,2299161j)>, Date] + $ ruby date.rb --date 20010203 + [#<Date: 2001-02-03 ((2451944j,0s,0n),+0s,2299161j)>, Date] + $ ruby date.rb --date "3rd Feb 2001" + [#<Date: 2001-02-03 ((2451944j,0s,0n),+0s,2299161j)>, Date] + +You can also define custom converters. +See {Argument Converters}[./argument_converters.rdoc] +for both built-in and custom converters. + +=== Help + ++OptionParser+ makes automatically generated help text available. + +The help text consists of: + +- A banner, showing the usage. +- Option short and long names. +- Option dummy argument names. +- Option descriptions. + +Example code: + + :include: ruby/help.rb + +The option names and dummy argument names are defined as described above. + +The option description consists of the strings that are not themselves option names; +an option can have more than one description string. +Execution: + + Usage: help [options] + -x, --xxx Adipiscing elit. Aenean commodo ligula eget. + Aenean massa. Cum sociis natoque penatibus + -y, --yyy YYY Lorem ipsum dolor sit amet, consectetuer. + -z, --zzz [ZZZ] Et magnis dis parturient montes, nascetur + ridiculus mus. Donec quam felis, ultricies + nec, pellentesque eu, pretium quis, sem. + +The program name is included in the default banner: +<tt>Usage: #{program_name} [options]</tt>; +you can change the program name. + + :include: ruby/help_program_name.rb + +Execution: + + $ ruby help_program_name.rb --help + Usage: help_program_name.rb [options] + +You can also change the entire banner. + + :include: ruby/help_banner.rb + +Execution: + + $ ruby help_banner.rb --help + Usage: ruby help_banner.rb + +By default, the option names are indented 4 spaces +and the width of the option-names field is 32 spaces. + +You can change these values, along with the banner, +by passing parameters to OptionParser.new. 
+ + :include: ruby/help_format.rb + +Execution: + + $ ruby help_format.rb --help + ruby help_format.rb [options] + -x, --xxx Adipiscing elit. Aenean commodo ligula eget. + Aenean massa. Cum sociis natoque penatibus + -y, --yyy YYY Lorem ipsum dolor sit amet, consectetuer. + -z, --zzz [ZZZ] Et magnis dis parturient montes, nascetur + ridiculus mus. Donec quam felis, ultricies + nec, pellentesque eu, pretium quis, sem. + +=== Top List and Base List + +An +OptionParser+ object maintains a stack of OptionParser::List objects, +each of which has a collection of zero or more options. +It is unlikely that you'll need to add or take away from that stack. + +The stack includes: + +- The <em>top list</em>, given by OptionParser#top. +- The <em>base list</em>, given by OptionParser#base. + +When +OptionParser+ builds its help text, the options in the top list +precede those in the base list. + +=== Methods for Defining Options + +Option-defining methods allow you to create an option, and also append/prepend it +to the top list or append it to the base list. + +Each of these next three methods accepts a sequence of parameter arguments and a block, +creates an option object using method OptionParser#make_switch (see below), +and returns the created option: + +- \Method OptionParser#define appends the created option to the top list. + +- \Method OptionParser#define_head prepends the created option to the top list. + +- \Method OptionParser#define_tail appends the created option to the base list. + +These next three methods are identical to the three above, +except for their return values: + +- \Method OptionParser#on is identical to method OptionParser#define, + except that it returns the parser object +self+. + +- \Method OptionParser#on_head is identical to method OptionParser#define_head, + except that it returns the parser object +self+. + +- \Method OptionParser#on_tail is identical to method OptionParser#define_tail, + except that it returns the parser object +self+. 
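The difference in return values and list placement can be seen in a small sketch; the option names and descriptions here are made up for illustration.

```ruby
require 'optparse'

parser = OptionParser.new

# define returns the created option object.
switch = parser.define('-x', '--xxx', 'Defined with define')

# on returns the parser itself, which allows chaining.
result = parser.on('-y', '--yyy', 'Defined with on')

parser.on_head('-a', '--aaa', 'Prepended to the top list')
parser.on_tail('-z', '--zzz', 'Appended to the base list')

p switch.is_a?(OptionParser::Switch)
p result.equal?(parser)

help_text = parser.help
puts help_text
```

In the help text, the option added with +on_head+ comes first and the option added with +on_tail+ comes last, because the top list is summarized before the base list.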
+ +Though you may never need to call it directly, +here's the core method for defining an option: + +- \Method OptionParser#make_switch accepts an array of parameters and a block. + See {Parameters for New Options}[optparse/option_params.rdoc]. + This method is unlike others here in that it: + - Accepts an <em>array of parameters</em>; + others accept a <em>sequence of parameter arguments</em>. + - Returns an array containing the created option object, + option names, and other values; + others return either the created option object + or the parser object +self+. + +=== Parsing + ++OptionParser+ has six instance methods for parsing. + +Three have names ending with a "bang" (<tt>!</tt>): + +- parse! +- order! +- permute! + +Each of these methods: + +- Accepts an optional array of string arguments +argv+; + if not given, +argv+ defaults to the value of OptionParser#default_argv, + whose initial value is ARGV. +- Accepts an optional keyword argument +into+ + (see {Keyword Argument into}[#label-Keyword+Argument+into]). +- Returns +argv+, possibly with some elements removed. + +The three other methods have names _not_ ending with a "bang": + +- parse +- order +- permute + +Each of these methods: + +- Accepts an array of string arguments + _or_ zero or more string arguments. +- Accepts an optional keyword argument +into+ and its value _into_. + (see {Keyword Argument into}[#label-Keyword+Argument+into]). +- Returns +argv+, possibly with some elements removed. + +==== \Method +parse!+ + +\Method +parse!+: + +- Accepts an optional array of string arguments +argv+; + if not given, +argv+ defaults to the value of OptionParser#default_argv, + whose initial value is ARGV. +- Accepts an optional keyword argument +into+ + (see {Keyword Argument into}[#label-Keyword+Argument+into]). +- Returns +argv+, possibly with some elements removed. + +The method processes the elements in +argv+ beginning at <tt>argv[0]</tt>, +and ending, by default, at the end. 
+ +Otherwise processing ends and the method returns when: + +- The terminator argument <tt>--</tt> is found; + the terminator argument is removed before the return. +- Environment variable +POSIXLY_CORRECT+ is defined + and a non-option argument is found; + the non-option argument is not removed. + Note that the _value_ of that variable does not matter, + as only its existence is checked. + +File +parse_bang.rb+: + + :include: ruby/parse_bang.rb + +Help: + + $ ruby parse_bang.rb --help + Usage: parse_bang [options] + --xxx + --yyy YYY + --zzz [ZZZ] + +Default behavior: + + $ ruby parse_bang.rb input_file.txt output_file.txt --xxx --yyy FOO --zzz BAR + ["--xxx", true] + ["--yyy", "FOO"] + ["--zzz", "BAR"] + Returned: ["input_file.txt", "output_file.txt"] (Array) + +Processing ended by terminator argument: + + $ ruby parse_bang.rb input_file.txt output_file.txt --xxx --yyy FOO -- --zzz BAR + ["--xxx", true] + ["--yyy", "FOO"] + Returned: ["input_file.txt", "output_file.txt", "--zzz", "BAR"] (Array) + +Processing ended by non-option found when +POSIXLY_CORRECT+ is defined: + + $ POSIXLY_CORRECT=true ruby parse_bang.rb --xxx input_file.txt output_file.txt -yyy FOO + ["--xxx", true] + Returned: ["input_file.txt", "output_file.txt", "-yyy", "FOO"] (Array) + +==== \Method +parse+ + +\Method +parse+: + +- Accepts an array of string arguments + _or_ zero or more string arguments. +- Accepts an optional keyword argument +into+ and its value _into_. + (see {Keyword Argument into}[#label-Keyword+Argument+into]). +- Returns +argv+, possibly with some elements removed. + +If given an array +ary+, the method forms array +argv+ as <tt>ary.dup</tt>. +If given zero or more string arguments, those arguments are formed +into array +argv+. + +The method calls + + parse!(argv, into: into) + +Note that environment variable +POSIXLY_CORRECT+ +and the terminator argument <tt>--</tt> are honored. 
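A short sketch contrasting the two methods, assuming +POSIXLY_CORRECT+ is not set (the sketch deletes it from the environment to be sure); the argument list and the array +seen+ are illustrative.

```ruby
require 'optparse'

# Make sure parse/parse! use the default (permuting) behavior in this sketch.
ENV.delete('POSIXLY_CORRECT')

parser = OptionParser.new
seen = [] # illustrative log of handled options
parser.on('--xxx') { |value| seen << ['--xxx', value] }
parser.on('--yyy YYY') { |value| seen << ['--yyy', value] }

args = ['input.txt', '--xxx', '--yyy', 'FOO', '--', '--zzz']

# parse works on a copy, so args is left untouched.
rest = parser.parse(args)
p rest
p args

# parse! removes the processed elements from args itself.
seen_after_parse = seen.dup
rest_bang = parser.parse!(args)
p rest_bang.equal?(args)
p args
```

Use +parse+ when the original array must be preserved, and <tt>parse!</tt> when the caller wants the leftover arguments in place.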
+ +File +parse.rb+: + + :include: ruby/parse.rb + +Help: + + $ ruby parse.rb --help + Usage: parse [options] + --xxx + --yyy YYY + --zzz [ZZZ] + +Default behavior: + + $ ruby parse.rb input_file.txt output_file.txt --xxx --yyy FOO --zzz BAR + ["--xxx", true] + ["--yyy", "FOO"] + ["--zzz", "BAR"] + Returned: ["input_file.txt", "output_file.txt"] (Array) + +Processing ended by terminator argument: + + $ ruby parse.rb input_file.txt output_file.txt --xxx --yyy FOO -- --zzz BAR + ["--xxx", true] + ["--yyy", "FOO"] + Returned: ["input_file.txt", "output_file.txt", "--zzz", "BAR"] (Array) + +Processing ended by non-option found when +POSIXLY_CORRECT+ is defined: + + $ POSIXLY_CORRECT=true ruby parse.rb --xxx input_file.txt output_file.txt -yyy FOO + ["--xxx", true] + Returned: ["input_file.txt", "output_file.txt", "-yyy", "FOO"] (Array) + +==== \Method +order!+ + +Calling method OptionParser#order! gives exactly the same result as +calling method OptionParser#parse! with environment variable ++POSIXLY_CORRECT+ defined. + +==== \Method +order+ + +Calling method OptionParser#order gives exactly the same result as +calling method OptionParser#parse with environment variable ++POSIXLY_CORRECT+ defined. + +==== \Method +permute!+ + +Calling method OptionParser#permute! gives exactly the same result as +calling method OptionParser#parse! with environment variable ++POSIXLY_CORRECT+ _not_ defined. + +==== \Method +permute+ + +Calling method OptionParser#permute gives exactly the same result as +calling method OptionParser#parse with environment variable ++POSIXLY_CORRECT+ _not_ defined. diff --git a/doc/pty/README.expect.ja b/doc/pty/README.expect.ja index 7c0456f24f..a4eb6b01df 100644 --- a/doc/pty/README.expect.ja +++ b/doc/pty/README.expect.ja @@ -1,21 +1,23 @@ - README for expect += README for expect by A. 
Ito, 28 October, 1998 - The Expect library adds functionality similar to the tcl expect package +The Expect library adds functionality similar to the tcl expect package to the IO class. - The added method is used as follows. +The added method is used as follows. - IO#expect(pattern,timeout=9999999) +[IO#expect(pattern,timeout=9999999)] -pattern is an instance of String or Regexp, and timeout is an instance -of Fixnum. timeout may be omitted. - When this method is called without a block, it first waits until -text matching pattern has been read from the receiver IO object. -Once a match is obtained, it returns an array describing that match. -The first element of the array is the string read up to the point -where pattern matched. The second and later elements are the parts -matched by any anchors in the regular expression pattern. -If a timeout occurs, this method returns nil. - When this method is called with a block, the array of matched -elements is passed as the block argument and the block is evaluated. + _pattern_ is an instance of String or Regexp, and _timeout_ is an instance + of Fixnum. _timeout_ may be omitted. + + When this method is called without a block, it first waits until + text matching _pattern_ has been read from the receiver IO object. + Once a match is obtained, it returns an array describing that match. + The first element of the array is the string read up to the point + where _pattern_ matched. The second and later elements are the parts + matched by any anchors in the regular expression _pattern_. + If a timeout occurs, this method returns +nil+. + + When this method is called with a block, the array of matched + elements is passed as the block argument and the block is evaluated. diff --git a/doc/pty/README.ja b/doc/pty/README.ja index 2d83ffa033..a26b4932ff 100644 --- a/doc/pty/README.ja +++ b/doc/pty/README.ja
@@ -1,27 +1,26 @@ -pty extension module version 0.3 by A.ito += pty extension module version 0.3 by A.ito 1. Introduction -This extension module gives ruby the ability to run an arbitrary -command through a pseudo tty (pty). + This extension module gives ruby the ability to run an arbitrary + command through a pseudo tty (pty). 2. Installation -Install it as follows. + Install it as follows. -(1) ruby extconf.rb + 1. <tt>ruby extconf.rb</tt> + Running this generates a Makefile. - Running this generates a Makefile. - -(2) Run make; make install. + 2. Run <tt>make; make install</tt>. 3. What it can do -This extension module defines a module named PTY, which -contains the following module functions. + This extension module defines a module named PTY, which + contains the following module functions. - getpty(command) - spawn(command) + [PTY.getpty(command)] + [PTY.spawn(command)] This function allocates a pseudo tty, runs the specified command on the other side of that pseudo tty, and returns an array. The return value consists of three elements @@ -35,12 +34,7 @@ pty extension module version 0.3 by A.ito only then is an exception raised. The thread monitoring the child process terminates when the block exits. - protect_signal - reset_signal - - These are scheduled to be removed. - - PTY.open + [PTY.open] Allocates a pseudo tty and returns an array of the IO object for the master side and the File object for the slave side. When called with a block, @@ -48,7 +42,7 @@ pty extension module version 0.3 by A.ito returns the result returned by the block. In addition, the master IO and slave File are closed on leaving the block if they have not already been closed. - PTY.check(pid[, raise=false]) + [PTY.check(pid[, raise=false])] Checks the status of the child process specified by pid and returns nil if it is running. If it has exited or stopped and the second argument is false, @@ -57,20 +51,20 @@ pty extension module version 0.3 by A.ito 4. Terms of use -Copyright is held by Akinori Ito. + Copyright is held by Akinori Ito. -Anyone may use, distribute, and modify this software free of charge -and without the copyright holder's permission, provided that the original -copyright notice appears unaltered in the source program or documentation. -The purpose of use is not restricted. + Anyone may use, distribute, and modify this software free of charge + and without the copyright holder's permission, provided that the original + copyright notice appears unaltered in the source program or documentation. + The purpose of use is not restricted. -The author accepts no liability whatsoever for any damage arising from the -use or distribution of this program or from any other act related to it. + The author accepts no liability whatsoever for any damage arising from the + use or distribution of this program or from any other act related to it. 5. Bug reports etc. -Bug reports are welcome. + Bug reports are welcome. aito@ei5sun.yz.yamagata-u.ac.jp -Please send bug reports by e-mail to the address above. + Please send bug reports by e-mail to the address above. diff --git a/doc/regexp.rdoc b/doc/regexp.rdoc deleted file mode 100644 index 9218a75b67..0000000000 --- a/doc/regexp.rdoc +++ /dev/null @@ -1,709 +0,0 @@ -# -*- mode: rdoc; coding: utf-8; fill-column: 74; -*- - -Regular expressions (<i>regexp</i>s) are patterns which describe the -contents of a string. They're used for testing whether a string contains a -given pattern, or extracting the portions that match. They are created -with the <tt>/</tt><i>pat</i><tt>/</tt> and -<tt>%r{</tt><i>pat</i><tt>}</tt> literals or the <tt>Regexp.new</tt> -constructor. - -A regexp is usually delimited with forward slashes (<tt>/</tt>). For -example: - - /hay/ =~ 'haystack' #=> 0 - /y/.match('haystack') #=> #<MatchData "y"> - -If a string contains the pattern it is said to <i>match</i>. A literal -string matches itself.
- -Here 'haystack' does not contain the pattern 'needle', so it doesn't match: - - /needle/.match('haystack') #=> nil - -Here 'haystack' contains the pattern 'hay', so it matches: - - /hay/.match('haystack') #=> #<MatchData "hay"> - -Specifically, <tt>/st/</tt> requires that the string contains the letter -_s_ followed by the letter _t_, so it matches _haystack_, also. - -== <tt>=~</tt> and Regexp#match - -Pattern matching may be achieved by using <tt>=~</tt> operator or Regexp#match -method. - -=== <tt>=~</tt> operator - -<tt>=~</tt> is Ruby's basic pattern-matching operator. When one operand is a -regular expression and the other is a string then the regular expression is -used as a pattern to match against the string. (This operator is equivalently -defined by Regexp and String so the order of String and Regexp do not matter. -Other classes may have different implementations of <tt>=~</tt>.) If a match -is found, the operator returns index of first match in string, otherwise it -returns +nil+. - - /hay/ =~ 'haystack' #=> 0 - 'haystack' =~ /hay/ #=> 0 - /a/ =~ 'haystack' #=> 1 - /u/ =~ 'haystack' #=> nil - -Using <tt>=~</tt> operator with a String and Regexp the <tt>$~</tt> global -variable is set after a successful match. <tt>$~</tt> holds a MatchData -object. Regexp.last_match is equivalent to <tt>$~</tt>. - -=== Regexp#match method - -The #match method returns a MatchData object: - - /st/.match('haystack') #=> #<MatchData "st"> - -== Metacharacters and Escapes - -The following are <i>metacharacters</i> <tt>(</tt>, <tt>)</tt>, -<tt>[</tt>, <tt>]</tt>, <tt>{</tt>, <tt>}</tt>, <tt>.</tt>, <tt>?</tt>, -<tt>+</tt>, <tt>*</tt>. They have a specific meaning when appearing in a -pattern. To match them literally they must be backslash-escaped. To match -a backslash literally, backslash-escape it: <tt>\\\\</tt>. 
- - /1 \+ 2 = 3\?/.match('Does 1 + 2 = 3?') #=> #<MatchData "1 + 2 = 3?"> - /a\\\\b/.match('a\\\\b') #=> #<MatchData "a\\b"> - -Patterns behave like double-quoted strings so can contain the same -backslash escapes. - - /\s\u{6771 4eac 90fd}/.match("Go to 東京都") - #=> #<MatchData " 東京都"> - -Arbitrary Ruby expressions can be embedded into patterns with the -<tt>#{...}</tt> construct. - - place = "東京都" - /#{place}/.match("Go to 東京都") - #=> #<MatchData "東京都"> - -== Character Classes - -A <i>character class</i> is delimited with square brackets (<tt>[</tt>, -<tt>]</tt>) and lists characters that may appear at that point in the -match. <tt>/[ab]/</tt> means _a_ or _b_, as opposed to <tt>/ab/</tt> which -means _a_ followed by _b_. - - /W[aeiou]rd/.match("Word") #=> #<MatchData "Word"> - -Within a character class the hyphen (<tt>-</tt>) is a metacharacter -denoting an inclusive range of characters. <tt>[abcd]</tt> is equivalent -to <tt>[a-d]</tt>. A range can be followed by another range, so -<tt>[abcdwxyz]</tt> is equivalent to <tt>[a-dw-z]</tt>. The order in which -ranges or individual characters appear inside a character class is -irrelevant. - - /[0-9a-f]/.match('9f') #=> #<MatchData "9"> - /[9f]/.match('9f') #=> #<MatchData "9"> - -If the first character of a character class is a caret (<tt>^</tt>) the -class is inverted: it matches any character _except_ those named. - - /[^a-eg-z]/.match('f') #=> #<MatchData "f"> - -A character class may contain another character class. By itself this -isn't useful because <tt>[a-z[0-9]]</tt> describes the same set as -<tt>[a-z0-9]</tt>. However, character classes also support the <tt>&&</tt> -operator which performs set intersection on its arguments. The two can be -combined as follows: - - /[a-w&&[^c-g]z]/ # ([a-w] AND ([^c-g] OR z)) - -This is equivalent to: - - /[abh-w]/ - -The following metacharacters also behave like character classes: - -* <tt>/./</tt> - Any character except a newline.
-* <tt>/./m</tt> - Any character (the +m+ modifier enables multiline mode) -* <tt>/\w/</tt> - A word character (<tt>[a-zA-Z0-9_]</tt>) -* <tt>/\W/</tt> - A non-word character (<tt>[^a-zA-Z0-9_]</tt>). - Please take a look at {Bug #4044}[https://bugs.ruby-lang.org/issues/4044] if - using <tt>/\W/</tt> with the <tt>/i</tt> modifier. -* <tt>/\d/</tt> - A digit character (<tt>[0-9]</tt>) -* <tt>/\D/</tt> - A non-digit character (<tt>[^0-9]</tt>) -* <tt>/\h/</tt> - A hexdigit character (<tt>[0-9a-fA-F]</tt>) -* <tt>/\H/</tt> - A non-hexdigit character (<tt>[^0-9a-fA-F]</tt>) -* <tt>/\s/</tt> - A whitespace character: <tt>/[ \t\r\n\f\v]/</tt> -* <tt>/\S/</tt> - A non-whitespace character: <tt>/[^ \t\r\n\f\v]/</tt> - -POSIX <i>bracket expressions</i> are also similar to character classes. -They provide a portable alternative to the above, with the added benefit -that they encompass non-ASCII characters. For instance, <tt>/\d/</tt> -matches only the ASCII decimal digits (0-9); whereas <tt>/[[:digit:]]/</tt> -matches any character in the Unicode _Nd_ category. - -* <tt>/[[:alnum:]]/</tt> - Alphabetic and numeric character -* <tt>/[[:alpha:]]/</tt> - Alphabetic character -* <tt>/[[:blank:]]/</tt> - Space or tab -* <tt>/[[:cntrl:]]/</tt> - Control character -* <tt>/[[:digit:]]/</tt> - Digit -* <tt>/[[:graph:]]/</tt> - Non-blank character (excludes spaces, control - characters, and similar) -* <tt>/[[:lower:]]/</tt> - Lowercase alphabetical character -* <tt>/[[:print:]]/</tt> - Like [:graph:], but includes the space character -* <tt>/[[:punct:]]/</tt> - Punctuation character -* <tt>/[[:space:]]/</tt> - Whitespace character (<tt>[:blank:]</tt>, newline, - carriage return, etc.) 
-* <tt>/[[:upper:]]/</tt> - Uppercase alphabetical -* <tt>/[[:xdigit:]]/</tt> - Digit allowed in a hexadecimal number (i.e., - 0-9a-fA-F) - -Ruby also supports the following non-POSIX character classes: - -* <tt>/[[:word:]]/</tt> - A character in one of the following Unicode - general categories _Letter_, _Mark_, _Number_, - <i>Connector_Punctuation</i> -* <tt>/[[:ascii:]]/</tt> - A character in the ASCII character set - - # U+06F2 is "EXTENDED ARABIC-INDIC DIGIT TWO" - /[[:digit:]]/.match("\u06F2") #=> #<MatchData "\u{06F2}"> - /[[:upper:]][[:lower:]]/.match("Hello") #=> #<MatchData "He"> - /[[:xdigit:]][[:xdigit:]]/.match("A6") #=> #<MatchData "A6"> - -== Repetition - -The constructs described so far match a single character. They can be -followed by a repetition metacharacter to specify how many times they need -to occur. Such metacharacters are called <i>quantifiers</i>. - -* <tt>*</tt> - Zero or more times -* <tt>+</tt> - One or more times -* <tt>?</tt> - Zero or one times (optional) -* <tt>{</tt><i>n</i><tt>}</tt> - Exactly <i>n</i> times -* <tt>{</tt><i>n</i><tt>,}</tt> - <i>n</i> or more times -* <tt>{,</tt><i>m</i><tt>}</tt> - <i>m</i> or less times -* <tt>{</tt><i>n</i><tt>,</tt><i>m</i><tt>}</tt> - At least <i>n</i> and - at most <i>m</i> times - -At least one uppercase character ('H'), at least one lowercase character -('e'), two 'l' characters, then one 'o': - - "Hello".match(/[[:upper:]]+[[:lower:]]+l{2}o/) #=> #<MatchData "Hello"> - -Repetition is <i>greedy</i> by default: as many occurrences as possible -are matched while still allowing the overall match to succeed. By -contrast, <i>lazy</i> matching makes the minimal amount of matches -necessary for overall success. A greedy metacharacter can be made lazy by -following it with <tt>?</tt>. - -Both patterns below match the string. The first uses a greedy quantifier so -'.+' matches '<a><b>'; the second uses a lazy quantifier so '.+?' 
matches -'<a>': - - /<.+>/.match("<a><b>") #=> #<MatchData "<a><b>"> - /<.+?>/.match("<a><b>") #=> #<MatchData "<a>"> - -A quantifier followed by <tt>+</tt> matches <i>possessively</i>: once it -has matched it does not backtrack. They behave like greedy quantifiers, -but having matched they refuse to "give up" their match even if this -jeopardises the overall match. - -== Capturing - -Parentheses can be used for <i>capturing</i>. The text enclosed by the -<i>n</i><sup>th</sup> group of parentheses can be subsequently referred to -with <i>n</i>. Within a pattern use the <i>backreference</i> -<tt>\n</tt>; outside of the pattern use -<tt>MatchData[</tt><i>n</i><tt>]</tt>. - -'at' is captured by the first group of parentheses, then referred to later -with <tt>\1</tt>: - - /[csh](..) [csh]\1 in/.match("The cat sat in the hat") - #=> #<MatchData "cat sat in" 1:"at"> - -Regexp#match returns a MatchData object which makes the captured text -available with its #[] method: - - /[csh](..) [csh]\1 in/.match("The cat sat in the hat")[1] #=> 'at' - -Capture groups can be referred to by name when defined with the -<tt>(?<</tt><i>name</i><tt>>)</tt> or <tt>(?'</tt><i>name</i><tt>')</tt> -constructs. - - /\$(?<dollars>\d+)\.(?<cents>\d+)/.match("$3.67") - #=> #<MatchData "$3.67" dollars:"3" cents:"67"> - /\$(?<dollars>\d+)\.(?<cents>\d+)/.match("$3.67")[:dollars] #=> "3" - -Named groups can be backreferenced with <tt>\k<</tt><i>name</i><tt>></tt>, -where _name_ is the group name. - - /(?<vowel>[aeiou]).\k<vowel>.\k<vowel>/.match('ototomy') - #=> #<MatchData "ototo" vowel:"o"> - -*Note*: A regexp can't use named backreferences and numbered -backreferences simultaneously. - -When named capture groups are used with a literal regexp on the left-hand -side of an expression and the <tt>=~</tt> operator, the captured text is -also assigned to local variables with corresponding names. 
- - /\$(?<dollars>\d+)\.(?<cents>\d+)/ =~ "$3.67" #=> 0 - dollars #=> "3" - -== Grouping - -Parentheses also <i>group</i> the terms they enclose, allowing them to be -quantified as one <i>atomic</i> whole. - -The pattern below matches a vowel followed by 2 word characters: - - /[aeiou]\w{2}/.match("Caenorhabditis elegans") #=> #<MatchData "aen"> - -Whereas the following pattern matches a vowel followed by a word character, -twice, i.e. <tt>[aeiou]\w[aeiou]\w</tt>: 'enor'. - - /([aeiou]\w){2}/.match("Caenorhabditis elegans") - #=> #<MatchData "enor" 1:"or"> - -The <tt>(?:</tt>...<tt>)</tt> construct provides grouping without -capturing. That is, it combines the terms it contains into an atomic whole -without creating a backreference. This benefits performance at the slight -expense of readability. - -The first group of parentheses captures 'n' and the second 'ti'. The second -group is referred to later with the backreference <tt>\2</tt>: - - /I(n)ves(ti)ga\2ons/.match("Investigations") - #=> #<MatchData "Investigations" 1:"n" 2:"ti"> - -The first group of parentheses is now made non-capturing with '?:', so it -still matches 'n', but doesn't create the backreference. Thus, the -backreference <tt>\1</tt> now refers to 'ti'. - - /I(?:n)ves(ti)ga\1ons/.match("Investigations") - #=> #<MatchData "Investigations" 1:"ti"> - -=== Atomic Grouping - -Grouping can be made <i>atomic</i> with -<tt>(?></tt><i>pat</i><tt>)</tt>. This causes the subexpression <i>pat</i> -to be matched independently of the rest of the expression such that what -it matches becomes fixed for the remainder of the match, unless the entire -subexpression must be abandoned and subsequently revisited. In this -way <i>pat</i> is treated as a non-divisible whole. Atomic grouping is -typically used to optimise patterns so as to prevent the regular -expression engine from backtracking needlessly. 
- -The <tt>"</tt> in the pattern below matches the first character of the string, -then <tt>.*</tt> matches <i>Quote"</i>. This causes the overall match to fail, -so the text matched by <tt>.*</tt> is backtracked by one position, which -leaves the final character of the string available to match <tt>"</tt> - - /".*"/.match('"Quote"') #=> #<MatchData "\"Quote\""> - -If <tt>.*</tt> is grouped atomically, it refuses to backtrack <i>Quote"</i>, -even though this means that the overall match fails - - /"(?>.*)"/.match('"Quote"') #=> nil - -== Subexpression Calls - -The <tt>\g<</tt><i>name</i><tt>></tt> syntax matches the previous -subexpression named _name_, which can be a group name or number, again. -This differs from backreferences in that it re-executes the group rather -than simply trying to re-match the same text. - -This pattern matches a <i>(</i> character and assigns it to the <tt>paren</tt> -group, tries to call that the <tt>paren</tt> sub-expression again but fails, -then matches a literal <i>)</i>: - - /\A(?<paren>\(\g<paren>*\))*\z/ =~ '()' - - - /\A(?<paren>\(\g<paren>*\))*\z/ =~ '(())' #=> 0 - # ^1 - # ^2 - # ^3 - # ^4 - # ^5 - # ^6 - # ^7 - # ^8 - # ^9 - # ^10 - -1. Matches at the beginning of the string, i.e. before the first - character. -2. Enters a named capture group called <tt>paren</tt> -3. Matches a literal <i>(</i>, the first character in the string -4. Calls the <tt>paren</tt> group again, i.e. recurses back to the - second step -5. Re-enters the <tt>paren</tt> group -6. Matches a literal <i>(</i>, the second character in the - string -7. Try to call <tt>paren</tt> a third time, but fail because - doing so would prevent an overall successful match -8. Match a literal <i>)</i>, the third character in the string. - Marks the end of the second recursive call -9. Match a literal <i>)</i>, the fourth character in the string -10. 
Match the end of the string - -== Alternation - -The vertical bar metacharacter (<tt>|</tt>) combines two expressions into -a single one that matches either of the expressions. Each expression is an -<i>alternative</i>. - - /\w(and|or)\w/.match("Feliformia") #=> #<MatchData "form" 1:"or"> - /\w(and|or)\w/.match("furandi") #=> #<MatchData "randi" 1:"and"> - /\w(and|or)\w/.match("dissemblance") #=> nil - -== Character Properties - -The <tt>\p{}</tt> construct matches characters with the named property, -much like POSIX bracket classes. - -* <tt>/\p{Alnum}/</tt> - Alphabetic and numeric character -* <tt>/\p{Alpha}/</tt> - Alphabetic character -* <tt>/\p{Blank}/</tt> - Space or tab -* <tt>/\p{Cntrl}/</tt> - Control character -* <tt>/\p{Digit}/</tt> - Digit -* <tt>/\p{Graph}/</tt> - Non-blank character (excludes spaces, control - characters, and similar) -* <tt>/\p{Lower}/</tt> - Lowercase alphabetical character -* <tt>/\p{Print}/</tt> - Like <tt>\p{Graph}</tt>, but includes the space character -* <tt>/\p{Punct}/</tt> - Punctuation character -* <tt>/\p{Space}/</tt> - Whitespace character (<tt>[:blank:]</tt>, newline, - carriage return, etc.) 
-* <tt>/\p{Upper}/</tt> - Uppercase alphabetical -* <tt>/\p{XDigit}/</tt> - Digit allowed in a hexadecimal number (i.e., 0-9a-fA-F) -* <tt>/\p{Word}/</tt> - A member of one of the following Unicode general - categories: <i>Letter</i>, <i>Mark</i>, <i>Number</i>, - <i>Connector\_Punctuation</i> -* <tt>/\p{ASCII}/</tt> - A character in the ASCII character set -* <tt>/\p{Any}/</tt> - Any Unicode character (including unassigned - characters) -* <tt>/\p{Assigned}/</tt> - An assigned character - -A Unicode character's <i>General Category</i> value can also be matched -with <tt>\p{</tt><i>Ab</i><tt>}</tt> where <i>Ab</i> is the category's -abbreviation as described below: - -* <tt>/\p{L}/</tt> - 'Letter' -* <tt>/\p{Ll}/</tt> - 'Letter: Lowercase' -* <tt>/\p{Lm}/</tt> - 'Letter: Modifier' -* <tt>/\p{Lo}/</tt> - 'Letter: Other' -* <tt>/\p{Lt}/</tt> - 'Letter: Titlecase' -* <tt>/\p{Lu}/</tt> - 'Letter: Uppercase' -* <tt>/\p{M}/</tt> - 'Mark' -* <tt>/\p{Mn}/</tt> - 'Mark: Nonspacing' -* <tt>/\p{Mc}/</tt> - 'Mark: Spacing Combining' -* <tt>/\p{Me}/</tt> - 'Mark: Enclosing' -* <tt>/\p{N}/</tt> - 'Number' -* <tt>/\p{Nd}/</tt> - 'Number: Decimal Digit' -* <tt>/\p{Nl}/</tt> - 'Number: Letter' -* <tt>/\p{No}/</tt> - 'Number: Other' -* <tt>/\p{P}/</tt> - 'Punctuation' -* <tt>/\p{Pc}/</tt> - 'Punctuation: Connector' -* <tt>/\p{Pd}/</tt> - 'Punctuation: Dash' -* <tt>/\p{Ps}/</tt> - 'Punctuation: Open' -* <tt>/\p{Pe}/</tt> - 'Punctuation: Close' -* <tt>/\p{Pi}/</tt> - 'Punctuation: Initial Quote' -* <tt>/\p{Pf}/</tt> - 'Punctuation: Final Quote' -* <tt>/\p{Po}/</tt> - 'Punctuation: Other' -* <tt>/\p{S}/</tt> - 'Symbol' -* <tt>/\p{Sm}/</tt> - 'Symbol: Math' -* <tt>/\p{Sc}/</tt> - 'Symbol: Currency' -* <tt>/\p{Sk}/</tt> - 'Symbol: Modifier' -* <tt>/\p{So}/</tt> - 'Symbol: Other' -* <tt>/\p{Z}/</tt> - 'Separator' -* <tt>/\p{Zs}/</tt> - 'Separator: Space' -* <tt>/\p{Zl}/</tt> - 'Separator: Line' -* <tt>/\p{Zp}/</tt> -
'Separator: Paragraph' -* <tt>/\p{C}/</tt> - 'Other' -* <tt>/\p{Cc}/</tt> - 'Other: Control' -* <tt>/\p{Cf}/</tt> - 'Other: Format' -* <tt>/\p{Cn}/</tt> - 'Other: Not Assigned' -* <tt>/\p{Co}/</tt> - 'Other: Private Use' -* <tt>/\p{Cs}/</tt> - 'Other: Surrogate' - -Lastly, <tt>\p{}</tt> matches a character's Unicode <i>script</i>. The -following scripts are supported: <i>Arabic</i>, <i>Armenian</i>, -<i>Balinese</i>, <i>Bengali</i>, <i>Bopomofo</i>, <i>Braille</i>, -<i>Buginese</i>, <i>Buhid</i>, <i>Canadian_Aboriginal</i>, <i>Carian</i>, -<i>Cham</i>, <i>Cherokee</i>, <i>Common</i>, <i>Coptic</i>, -<i>Cuneiform</i>, <i>Cypriot</i>, <i>Cyrillic</i>, <i>Deseret</i>, -<i>Devanagari</i>, <i>Ethiopic</i>, <i>Georgian</i>, <i>Glagolitic</i>, -<i>Gothic</i>, <i>Greek</i>, <i>Gujarati</i>, <i>Gurmukhi</i>, <i>Han</i>, -<i>Hangul</i>, <i>Hanunoo</i>, <i>Hebrew</i>, <i>Hiragana</i>, -<i>Inherited</i>, <i>Kannada</i>, <i>Katakana</i>, <i>Kayah_Li</i>, -<i>Kharoshthi</i>, <i>Khmer</i>, <i>Lao</i>, <i>Latin</i>, <i>Lepcha</i>, -<i>Limbu</i>, <i>Linear_B</i>, <i>Lycian</i>, <i>Lydian</i>, -<i>Malayalam</i>, <i>Mongolian</i>, <i>Myanmar</i>, <i>New_Tai_Lue</i>, -<i>Nko</i>, <i>Ogham</i>, <i>Ol_Chiki</i>, <i>Old_Italic</i>, -<i>Old_Persian</i>, <i>Oriya</i>, <i>Osmanya</i>, <i>Phags_Pa</i>, -<i>Phoenician</i>, <i>Rejang</i>, <i>Runic</i>, <i>Saurashtra</i>, -<i>Shavian</i>, <i>Sinhala</i>, <i>Sundanese</i>, <i>Syloti_Nagri</i>, -<i>Syriac</i>, <i>Tagalog</i>, <i>Tagbanwa</i>, <i>Tai_Le</i>, -<i>Tamil</i>, <i>Telugu</i>, <i>Thaana</i>, <i>Thai</i>, <i>Tibetan</i>, -<i>Tifinagh</i>, <i>Ugaritic</i>, <i>Vai</i>, and <i>Yi</i>. - -Unicode codepoint U+06E9 is named "ARABIC PLACE OF SAJDAH" and belongs to the -Arabic script: - - /\p{Arabic}/.match("\u06E9") #=> #<MatchData "\u06E9"> - -All character properties can be inverted by prefixing their name with a -caret (<tt>^</tt>). 
- -Letter 'A' is not in the Unicode Ll (Letter; Lowercase) category, so this -match succeeds: - - /\p{^Ll}/.match("A") #=> #<MatchData "A"> - -== Anchors - -Anchors are metacharacters that match the zero-width positions between -characters, <i>anchoring</i> the match to a specific position. - -* <tt>^</tt> - Matches beginning of line -* <tt>$</tt> - Matches end of line -* <tt>\A</tt> - Matches beginning of string. -* <tt>\Z</tt> - Matches end of string. If string ends with a newline, - it matches just before newline -* <tt>\z</tt> - Matches end of string -* <tt>\G</tt> - Matches first matching position: - - In methods like <tt>String#gsub</tt> and <tt>String#scan</tt>, it changes on each iteration. - It initially matches the beginning of subject, and in each following iteration it matches where the last match finished. - - "    a b c".gsub(/ /, '_') #=> "____a_b_c" - "    a b c".gsub(/\G /, '_') #=> "____a b c" - - In methods like <tt>Regexp#match</tt> and <tt>String#match</tt> that take an (optional) offset, it matches where the search begins.
- - "hello, world".match(/,/, 3) #=> #<MatchData ","> - "hello, world".match(/\G,/, 3) #=> nil - -* <tt>\b</tt> - Matches word boundaries when outside brackets; - backspace (0x08) when inside brackets -* <tt>\B</tt> - Matches non-word boundaries -* <tt>(?=</tt><i>pat</i><tt>)</tt> - <i>Positive lookahead</i> assertion: - ensures that the following characters match <i>pat</i>, but doesn't - include those characters in the matched text -* <tt>(?!</tt><i>pat</i><tt>)</tt> - <i>Negative lookahead</i> assertion: - ensures that the following characters do not match <i>pat</i>, but - doesn't include those characters in the matched text -* <tt>(?<=</tt><i>pat</i><tt>)</tt> - <i>Positive lookbehind</i> - assertion: ensures that the preceding characters match <i>pat</i>, but - doesn't include those characters in the matched text -* <tt>(?<!</tt><i>pat</i><tt>)</tt> - <i>Negative lookbehind</i> - assertion: ensures that the preceding characters do not match - <i>pat</i>, but doesn't include those characters in the matched text - -If a pattern isn't anchored it can begin at any point in the string: - - /real/.match("surrealist") #=> #<MatchData "real"> - -Anchoring the pattern to the beginning of the string forces the match to start -there. 'real' doesn't occur at the beginning of the string, so now the match -fails: - - /\Areal/.match("surrealist") #=> nil - -The match below fails because although 'Demand' contains 'and', the pattern -does not occur at a word boundary. 
- - /\band/.match("Demand") - -Whereas in the following example 'and' has been anchored to a non-word -boundary so instead of matching the first 'and' it matches from the fourth -letter of 'demand' instead: - - /\Band.+/.match("Supply and demand curve") #=> #<MatchData "and curve"> - -The pattern below uses positive lookahead and positive lookbehind to match -text appearing in <b></b> tags without including the tags in the match: - - /(?<=<b>)\w+(?=<\/b>)/.match("Fortune favours the <b>bold</b>") - #=> #<MatchData "bold"> - -== Options - -The end delimiter for a regexp can be followed by one or more single-letter -options which control how the pattern can match. - -* <tt>/pat/i</tt> - Ignore case -* <tt>/pat/m</tt> - Treat a newline as a character matched by <tt>.</tt> -* <tt>/pat/x</tt> - Ignore whitespace and comments in the pattern -* <tt>/pat/o</tt> - Perform <tt>#{}</tt> interpolation only once - -<tt>i</tt>, <tt>m</tt>, and <tt>x</tt> can also be applied on the -subexpression level with the -<tt>(?</tt><i>on</i><tt>-</tt><i>off</i><tt>)</tt> construct, which -enables options <i>on</i>, and disables options <i>off</i> for the -expression enclosed by the parentheses: - - /a(?i:b)c/.match('aBc') #=> #<MatchData "aBc"> - /a(?-i:b)c/i.match('ABC') #=> nil - -Additionally, these options can also be toggled for the remainder of the -pattern: - - /a(?i)bc/.match('abC') #=> #<MatchData "abC"> - -Options may also be used with <tt>Regexp.new</tt>: - - Regexp.new("abc", Regexp::IGNORECASE) #=> /abc/i - Regexp.new("abc", Regexp::MULTILINE) #=> /abc/m - Regexp.new("abc # Comment", Regexp::EXTENDED) #=> /abc # Comment/x - Regexp.new("abc", Regexp::IGNORECASE | Regexp::MULTILINE) #=> /abc/mi - -== Free-Spacing Mode and Comments - -As mentioned above, the <tt>x</tt> option enables <i>free-spacing</i> -mode. Literal white space inside the pattern is ignored, and the -octothorpe (<tt>#</tt>) character introduces a comment until the end of -the line. 
This allows the components of the pattern to be organized in a -potentially more readable fashion. - -A contrived pattern to match a number with optional decimal places: - - float_pat = /\A - [[:digit:]]+ # 1 or more digits before the decimal point - (\. # Decimal point - [[:digit:]]+ # 1 or more digits after the decimal point - )? # The decimal point and following digits are optional - \Z/x - float_pat.match('3.14') #=> #<MatchData "3.14" 1:".14"> - -There are a number of strategies for matching whitespace: - -* Use a pattern such as <tt>\s</tt> or <tt>\p{Space}</tt>. -* Use escaped whitespace such as <tt>\ </tt>, i.e. a space preceded by a backslash. -* Use a character class such as <tt>[ ]</tt>. - -Comments can be included in a non-<tt>x</tt> pattern with the -<tt>(?#</tt><i>comment</i><tt>)</tt> construct, where <i>comment</i> is -arbitrary text ignored by the regexp engine. - -Comments in regexp literals cannot include unescaped terminator -characters. - -== Encoding - -Regular expressions are assumed to use the source encoding. This can be -overridden with one of the following modifiers. - -* <tt>/</tt><i>pat</i><tt>/u</tt> - UTF-8 -* <tt>/</tt><i>pat</i><tt>/e</tt> - EUC-JP -* <tt>/</tt><i>pat</i><tt>/s</tt> - Windows-31J -* <tt>/</tt><i>pat</i><tt>/n</tt> - ASCII-8BIT - -A regexp can be matched against a string when they either share an -encoding, or the regexp's encoding is _US-ASCII_ and the string's encoding -is ASCII-compatible. - -If a match between incompatible encodings is attempted an -<tt>Encoding::CompatibilityError</tt> exception is raised. - -The <tt>Regexp#fixed_encoding?</tt> predicate indicates whether the regexp -has a <i>fixed</i> encoding, that is one incompatible with ASCII. 
A -regexp's encoding can be explicitly fixed by supplying -<tt>Regexp::FIXEDENCODING</tt> as the second argument of -<tt>Regexp.new</tt>: - - r = Regexp.new("a".force_encoding("iso-8859-1"),Regexp::FIXEDENCODING) - r =~ "a\u3042" - # raises Encoding::CompatibilityError: incompatible encoding regexp match - # (ISO-8859-1 regexp with UTF-8 string) - -== Special global variables - -Pattern matching sets some global variables : -* <tt>$~</tt> is equivalent to Regexp.last_match; -* <tt>$&</tt> contains the complete matched text; -* <tt>$`</tt> contains string before match; -* <tt>$'</tt> contains string after match; -* <tt>$1</tt>, <tt>$2</tt> and so on contain text matching first, second, etc - capture group; -* <tt>$+</tt> contains last capture group. - -Example: - - m = /s(\w{2}).*(c)/.match('haystack') #=> #<MatchData "stac" 1:"ta" 2:"c"> - $~ #=> #<MatchData "stac" 1:"ta" 2:"c"> - Regexp.last_match #=> #<MatchData "stac" 1:"ta" 2:"c"> - - $& #=> "stac" - # same as m[0] - $` #=> "hay" - # same as m.pre_match - $' #=> "k" - # same as m.post_match - $1 #=> "ta" - # same as m[1] - $2 #=> "c" - # same as m[2] - $3 #=> nil - # no third group in pattern - $+ #=> "c" - # same as m[-1] - -These global variables are thread-local and method-local variables. - -== Performance - -Certain pathological combinations of constructs can lead to abysmally bad -performance. - -Consider a string of 25 <i>a</i>s, a <i>d</i>, 4 <i>a</i>s, and a -<i>c</i>. - - s = 'a' * 25 + 'd' + 'a' * 4 + 'c' - #=> "aaaaaaaaaaaaaaaaaaaaaaaaadaaaac" - -The following patterns match instantly as you would expect: - - /(b|a)/ =~ s #=> 0 - /(b|a+)/ =~ s #=> 0 - /(b|a+)*/ =~ s #=> 0 - -However, the following pattern takes appreciably longer: - - /(b|a+)*c/ =~ s #=> 26 - -This happens because an atom in the regexp is quantified by both an -immediate <tt>+</tt> and an enclosing <tt>*</tt> with nothing to -differentiate which is in control of any particular character. 
The -nondeterminism that results produces super-linear performance. (Consult -<i>Mastering Regular Expressions</i> (3rd ed.), pp 222, by -<i>Jeffery Friedl</i>, for an in-depth analysis). This particular case -can be fixed by use of atomic grouping, which prevents the unnecessary -backtracking: - - (start = Time.now) && /(b|a+)*c/ =~ s && (Time.now - start) - #=> 24.702736882 - (start = Time.now) && /(?>b|a+)*c/ =~ s && (Time.now - start) - #=> 0.000166571 - -A similar case is typified by the following example, which takes -approximately 60 seconds to execute for me: - -Match a string of 29 <i>a</i>s against a pattern of 29 optional <i>a</i>s -followed by 29 mandatory <i>a</i>s: - - Regexp.new('a?' * 29 + 'a' * 29) =~ 'a' * 29 - -The 29 optional <i>a</i>s match the string, but this prevents the 29 -mandatory <i>a</i>s that follow from matching. Ruby must then backtrack -repeatedly so as to satisfy as many of the optional matches as it can -while still matching the mandatory 29. It is plain to us that none of the -optional matches can succeed, but this fact unfortunately eludes Ruby. - -The best way to improve performance is to significantly reduce the amount of -backtracking needed. For this case, instead of individually matching 29 -optional <i>a</i>s, a range of optional <i>a</i>s can be matched all at once -with <i>a{0,29}</i>: - - Regexp.new('a{0,29}' + 'a' * 29) =~ 'a' * 29 - diff --git a/doc/security/command_injection.rdoc b/doc/security/command_injection.rdoc new file mode 100644 index 0000000000..d46e42f7be --- /dev/null +++ b/doc/security/command_injection.rdoc @@ -0,0 +1,15 @@ += Command Injection + +Some Ruby core methods accept string data +that includes text to be executed as a system command. + +They should not be called with unknown or unsanitized commands. + +These methods include: + +- Kernel.exec +- Kernel.spawn +- Kernel.system +- {\`command` (backtick method)}[rdoc-ref:Kernel#`] + (also called by the expression <tt>%x[command]</tt>). 
+- IO.popen (when called with other than <tt>"-"</tt>). diff --git a/doc/security.rdoc b/doc/security/security.rdoc index d7d6464ce1..af9970d336 100644 --- a/doc/security.rdoc +++ b/doc/security/security.rdoc @@ -15,19 +15,6 @@ mailto:security@ruby-lang.org ({the PGP public key}[https://www.ruby-lang.org/security.asc]), which is a private mailing list. Reported problems will be published after fixes. -== <code>$SAFE</code> - -Ruby provides a mechanism to restrict what operations can be performed by Ruby -code in the form of the <code>$SAFE</code> variable. - -However, <code>$SAFE</code> does not provide a secure environment for executing -untrusted code. - -If you need to execute untrusted code, you should use an operating system level -sandboxing mechanism. On Linux, ptrace or LXC can be used to sandbox -potentially malicious code. Other similar mechanisms exist on every major -operating system. - == +Marshal.load+ Ruby's +Marshal+ module provides methods for serializing and deserializing Ruby @@ -50,7 +37,7 @@ programs for configuration and database persistence of Ruby object trees. Similar to +Marshal+, it is able to deserialize into arbitrary Ruby classes. For example, the following YAML data will create an +ERB+ object when -deserialized: +deserialized, using the +unsafe_load+ method: !ruby/object:ERB src: puts `uname` @@ -66,19 +53,16 @@ method, variable and constant names. The reason for this is that symbols are simply integers with names attached to them, so they are faster to look up in hashtables. -Starting in version 2.2, most symbols can be garbage collected; these are -called <i>mortal</i> symbols. Most symbols you create (e.g. by calling -+to_sym+) are mortal. +Most symbols can be garbage collected; these are called _mortal_ +symbols. Most symbols you create (e.g. by calling +to_sym+) are mortal. -<i>Immortal</i> symbols on the other hand will never be garbage collected. +_Immortal_ symbols on the other hand will never be garbage collected. 
They are created when modifying code: * defining a method (e.g. with +define_method+), * setting an instance variable (e.g. with +instance_variable_set+), * creating a variable or constant (e.g. with +const_set+) -C extensions that have not been updated and are still calling `SYM2ID` +C extensions that have not been updated and are still calling +SYM2ID+ will create immortal symbols. -Bugs in 2.2.0: +send+ and +__send__+ also created immortal symbols, -and calling methods with keyword arguments could also create some. Don't create immortal symbols from user inputs. Otherwise, this would allow a user to mount a denial of service attack against your application by @@ -141,12 +125,3 @@ Note that the use of +public_send+ is also dangerous, as +send+ itself is public: 1.public_send("send", "eval", "...ruby code to be executed...") - -== DRb - -As DRb allows remote clients to invoke arbitrary methods, it is not suitable to -expose to untrusted clients. - -When using DRb, try to avoid exposing it over the network if possible. If this -isn't possible and you need to expose DRb to the world, you *must* configure an -appropriate security policy with <code>DRb::ACL</code>. diff --git a/doc/shell.rd.ja b/doc/shell.rd.ja deleted file mode 100644 index a9507fe92a..0000000000 --- a/doc/shell.rd.ja +++ /dev/null @@ -1,335 +0,0 @@ - -- shell.rb - $Release Version: 0.6.0 $ - $Revision$ - by Keiju ISHITSUKA(keiju@ishitsuka.com) - -=begin - -= Purpose - -Run commands and filter their output from Ruby as easily as in sh/csh. -The control statements of sh/csh are provided by Ruby's own facilities. - -= Main classes - -== Shell - -A Shell object has a current directory, and commands are executed -relative to that directory. - ---- Shell#cwd ---- Shell#dir ---- Shell#getwd ---- Shell#pwd - - Returns the current directory. - ---- Shell#system_path - - Returns the command search path as an array. - ---- Shell#umask - - Returns the umask. - -== Filter - -Every result of command execution is returned as a Filter. Filter -includes Enumerable.
- -= Main methods - -== Command definitions - -To run an OS command, first define it as a method of Shell. - -Note: Shell#system can also run a command directly, without defining it. - ---- Shell.def_system_command(command, path = command) - - Registers command as a method of Shell. - - Examples: - Shell.def_system_command "ls" - defines ls - - Shell.def_system_command "sys_sort", "sort" - defines the sort command as sys_sort - ---- Shell.undef_system_command(command) - - Removes command. - ---- Shell.alias_command(ali, command, *opts) {...} - - Defines an alias for command. - - Examples: - Shell.alias_command "lsC", "ls", "-CBF", "--show-control-chars" - Shell.alias_command("lsC", "ls"){|*opts| ["-CBF", "--show-control-chars", *opts]} - ---- Shell.unalias_command(ali) - - Removes the alias for command. - ---- Shell.install_system_commands(pre = "sys_") - - Defines every executable file found on system_path as a method of Shell. - Each method name is the original file name prefixed with pre. - -== Creation - ---- Shell.new - - Creates a Shell object whose current directory is the current - directory of the process. - ---- Shell.cd(path) - - Creates a Shell object whose current directory is path. - -== Process management - ---- Shell#jobs - - Returns the list of scheduled jobs. - ---- Shell#kill sig, job - - Sends signal sig to job. - -== Current directory operations - ---- Shell#cd(path, &block) ---- Shell#chdir - - Changes the current directory to path. When called as an iterator, - the current directory is changed only while the block runs. - ---- Shell#pushd(path = nil, &block) ---- Shell#pushdir - - Pushes the current directory onto the directory stack and changes - the current directory to path. If path is omitted, swaps the current - directory with the top of the directory stack. When called as an - iterator, the pushd applies only while the block runs. - ---- Shell#popd ---- Shell#popdir - - Pops the directory stack and makes the popped directory the current directory.
- -== File and directory operations - ---- Shell#foreach(path = nil, &block) - - If path is a file, File#foreach - If path is a directory, Dir#foreach - ---- Shell#open(path, mode) - - If path is a file, File#open - If path is a directory, Dir#open - ---- Shell#unlink(path) - - If path is a file, File#unlink - If path is a directory, Dir#unlink - ---- Shell#test(command, file1, file2) ---- Shell#[command, file1, file2] - - Same as the file test function test. - Examples: - sh[?e, "foo"] - sh[:e, "foo"] - sh["e", "foo"] - sh[:exists?, "foo"] - sh["exists?", "foo"] - ---- Shell#mkdir(*path) - - Same as Dir.mkdir (accepts multiple paths) - ---- Shell#rmdir(*path) - - Same as Dir.rmdir (accepts multiple paths) - -== Command execution - ---- System#system(command, *opts) - - Runs command. - Examples: - print sh.system("ls", "-l") - sh.system("ls", "-l") | sh.head > STDOUT - ---- System#rehash - - Rehashes. - ---- Shell#transact &block - - Runs the block with the shell as self. - Example: - sh.transact{system("ls", "-l") | head > STDOUT} - ---- Shell#out(dev = STDOUT, &block) - - Calls transact and writes its result to dev. - -== Internal commands - ---- Shell#echo(*strings) ---- Shell#cat(*files) ---- Shell#glob(patten) ---- Shell#tee(file) - - When run, these return a Filter object holding their output. - ---- Filter#each &block - - Passes each line of the filter to block. - ---- Filter#<(src) - - Uses src as the input of the filter. If src is a string it names a - file to read; if it is an IO it is used as the input directly. - ---- Filter#>(to) - - Uses to as the output of the filter. If to is a string it names a - file to write; if it is an IO it is used as the output directly. - ---- Filter#>>(to) - - Appends the output of the filter to to. If to is a string it names a - file; if it is an IO it is used as the output directly. - ---- Filter#|(filter) - - Pipe combination. - ---- Filter#+(filter) - - filter1 + filter2 outputs filter1 first, then filter2.
- ---- Filter#to_a ---- Filter#to_s - -== Built-in commands - ---- Shell#atime(file) ---- Shell#basename(file, *opt) ---- Shell#chmod(mode, *files) ---- Shell#chown(owner, group, *file) ---- Shell#ctime(file) ---- Shell#delete(*file) ---- Shell#dirname(file) ---- Shell#ftype(file) ---- Shell#join(*file) ---- Shell#link(file_from, file_to) ---- Shell#lstat(file) ---- Shell#mtime(file) ---- Shell#readlink(file) ---- Shell#rename(file_from, file_to) ---- Shell#split(file) ---- Shell#stat(file) ---- Shell#symlink(file_from, file_to) ---- Shell#truncate(file, length) ---- Shell#utime(atime, mtime, *file) - - These are the same as the class methods of the same names in the File class. - ---- Shell#blockdev?(file) ---- Shell#chardev?(file) ---- Shell#directory?(file) ---- Shell#executable?(file) ---- Shell#executable_real?(file) ---- Shell#exist?(file)/Shell#exists?(file) ---- Shell#file?(file) ---- Shell#grpowned?(file) ---- Shell#owned?(file) ---- Shell#pipe?(file) ---- Shell#readable?(file) ---- Shell#readable_real?(file) ---- Shell#setgid?(file) ---- Shell#setuid?(file) ---- Shell#size(file)/Shell#size?(file) ---- Shell#socket?(file) ---- Shell#sticky?(file) ---- Shell#symlink?(file) ---- Shell#writable?(file) ---- Shell#writable_real?(file) ---- Shell#zero?(file) - - These are the same as the class methods of the same names in the FileTest class. - ---- Shell#syscopy(filename_from, filename_to) ---- Shell#copy(filename_from, filename_to) ---- Shell#move(filename_from, filename_to) ---- Shell#compare(filename_from, filename_to) ---- Shell#safe_unlink(*filenames) ---- Shell#makedirs(*filenames) ---- Shell#install(filename_from, filename_to, mode) - - These are the same as the class methods of the same names in the FileTools class. - - In addition, the following aliases are defined.
- ---- Shell#cmp <- Shell#compare ---- Shell#mv <- Shell#move ---- Shell#cp <- Shell#copy ---- Shell#rm_f <- Shell#safe_unlink ---- Shell#mkpath <- Shell#makedirs - -= Samples - -== ex1 - - sh = Shell.cd("/tmp") - sh.mkdir "shell-test-1" unless sh.exists?("shell-test-1") - sh.cd("shell-test-1") - for dir in ["dir1", "dir3", "dir5"] - if !sh.exists?(dir) - sh.mkdir dir - sh.cd(dir) do - f = sh.open("tmpFile", "w") - f.print "TEST\n" - f.close - end - print sh.pwd - end - end - -== ex2 - - sh = Shell.cd("/tmp") - sh.transact do - mkdir "shell-test-1" unless exists?("shell-test-1") - cd("shell-test-1") - for dir in ["dir1", "dir3", "dir5"] - if !exists?(dir) - mkdir dir - cd(dir) do - f = open("tmpFile", "w") - f.print "TEST\n" - f.close - end - print pwd - end - end - end - -== ex3 - - sh.cat("/etc/printcap") | sh.tee("tee1") > "tee2" - (sh.cat < "/etc/printcap") | sh.tee("tee11") > "tee12" - sh.cat("/etc/printcap") | sh.tee("tee1") >> "tee2" - (sh.cat < "/etc/printcap") | sh.tee("tee11") >> "tee12" - -== ex4 - - print sh.cat("/etc/passwd").head.collect{|l| l =~ /keiju/} - -=end diff --git a/doc/standard_library.md b/doc/standard_library.md new file mode 100644 index 0000000000..7a477283a9 --- /dev/null +++ b/doc/standard_library.md @@ -0,0 +1,225 @@ +# Ruby Standard Library + +The Ruby Standard Library is a large collection of classes and modules you can +require in your code to gain additional features. + +Below is an overview of the libraries and extensions, followed by a brief description +of each.
+ +## Libraries + +- `MakeMakefile`: A module used to generate a Makefile for C extensions +- `RbConfig`: Information about your Ruby configuration and build +- `Gem`: A package management framework for Ruby + +## Extensions + +- `Coverage`: Provides coverage measurement for Ruby +- `Monitor`: Provides a reentrant mutex +- `objspace`: Extends the ObjectSpace module to add methods for internal statistics +- `PTY`: Creates and manages pseudo-terminals +- `Ripper`: Provides an interface for parsing Ruby programs into S-expressions +- `Socket`: Accesses underlying OS socket implementations + +# Default gems + +- Default gems are shipped with Ruby releases and also available as rubygems. +- Default gems are not uninstallable from the Ruby installation. +- Default gems can be updated using rubygems. + - e.g. `gem update json` +- Default gems can be used with bundler environments like `unbundled_env`. +- Default gems can be used at any version in a Gemfile. + - e.g. `gem "json", ">= 2.6"` + +## Libraries + +- Bundler ([GitHub][bundler]): Manage your Ruby application's gem dependencies +- Delegator ([GitHub][delegate]): Provides three abilities to delegate method calls to an object +- DidYouMean ([GitHub][did_you_mean]): "Did you mean?" experience in Ruby +- English ([GitHub][English]): Provides references to special global variables with less cryptic names +- ERB ([GitHub][erb]): An easy-to-use but powerful templating system for Ruby +- ErrorHighlight ([GitHub][error_highlight]): Highlight error locations in your code +- FileUtils ([GitHub][fileutils]): Several file utility methods for copying, moving, removing, etc. 
+- Find ([GitHub][find]): This module supports top-down traversal of a set of file paths +- Forwardable ([GitHub][forwardable]): Provides delegation of specified methods to a designated object +- IPAddr ([GitHub][ipaddr]): Provides methods to manipulate IPv4 and IPv6 IP addresses +- OptionParser ([GitHub][optparse]): Ruby-oriented class for command-line option analysis +- Net::HTTP ([GitHub][net-http]): HTTP client API for Ruby +- Open3 ([GitHub][open3]): Provides access to stdin, stdout, and stderr when running other programs +- OpenURI ([GitHub][open-uri]): An easy-to-use wrapper for URI::HTTP, URI::HTTPS, and URI::FTP +- PP ([GitHub][pp]): Provides a PrettyPrinter for Ruby objects +- PrettyPrint ([GitHub][prettyprint]): Implements a pretty printing algorithm for readable structure +- Prism ([GitHub][prism]): A portable, error-tolerant Ruby parser +- Resolv ([GitHub][resolv]): Thread-aware DNS resolver library in Ruby +- SecureRandom ([GitHub][securerandom]): Interface for a secure random number generator +- Shellwords ([GitHub][shellwords]): Manipulates strings with the word parsing rules of the UNIX Bourne shell +- Singleton ([GitHub][singleton]): Implementation of the Singleton pattern for Ruby +- Tempfile ([GitHub][tempfile]): A utility class for managing temporary files +- Time ([GitHub][time]): Extends the Time class with methods for parsing and conversion +- Timeout ([GitHub][timeout]): Auto-terminate potentially long-running operations in Ruby +- TmpDir ([GitHub][tmpdir]): Extends the Dir class to manage the OS temporary file path +- UN ([GitHub][un]): Utilities to replace common UNIX commands +- URI ([GitHub][uri]): A Ruby module providing support for Uniform Resource Identifiers +- YAML ([GitHub][yaml]): The Ruby client library for the Psych YAML implementation +- WeakRef ([GitHub][weakref]): Allows a referenced object to be garbage-collected + +## Extensions + +- Date ([GitHub][date]): Represents dates, with a subclass for dates with time and timezones 
+- Digest ([GitHub][digest]): Provides a framework for message digest libraries +- Etc ([GitHub][etc]): Provides access to information typically stored in the UNIX /etc directory +- Fcntl ([GitHub][fcntl]): Loads constants defined in the OS fcntl.h C header file +- IO.console ([GitHub][io-console]): Extensions for the IO class, including `IO.console`, `IO.winsize`, etc. +- IO#nonblock ([GitHub][io-nonblock]): Enable non-blocking mode with IO class. +- IO#wait ([GitHub][io-wait]): Provides the feature for waiting until IO is readable or writable without blocking. +- JSON ([GitHub][json]): Implements JavaScript Object Notation for Ruby +- OpenSSL ([GitHub][openssl]): Provides SSL, TLS, and general-purpose cryptography for Ruby +- Pathname ([GitHub][pathname]): Representation of the name of a file or directory on the filesystem +- Psych ([GitHub][psych]): A YAML parser and emitter for Ruby +- StringIO ([GitHub][stringio]): Pseudo-I/O on String objects +- StringScanner ([GitHub][strscan]): Provides lexical scanning operations on a String +- Zlib ([GitHub][zlib]): Ruby interface for the zlib compression/decompression library + +# Bundled gems + +- Bundled gems are shipped with Ruby releases and also available as rubygems. + - They are only bundled with Ruby releases. + - They can be uninstalled from the Ruby installation. + - They need to be declared in a Gemfile when used with bundler. 
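The distinction drawn above between default gems and bundled gems can be inspected from Ruby itself. The following is a minimal sketch and is not part of the commit: the helper name `gem_kind` is hypothetical, and a standard RubyGems installation is assumed. It relies on `Gem::Specification.find_by_name` and `default_gem?` from RubyGems. The answer for any given gem depends on the Ruby version and on whether the gem has been updated locally, so no output is shown.

```ruby
require "rubygems"

# Hypothetical helper (an illustration, not from the document): reports how
# RubyGems sees a gem.
#   :default   - the spec found is a default gem shipped with this Ruby
#   :installed - the spec found is a regular installed gem (bundled gems land here)
#   :missing   - RubyGems found no spec with that name
def gem_kind(name)
  spec = Gem::Specification.find_by_name(name)
  spec.default_gem? ? :default : :installed
rescue Gem::LoadError
  # find_by_name raises a Gem::LoadError subclass when no spec matches
  :missing
end

%w[json rake].each do |name|
  puts "#{name}: #{gem_kind(name)}"
end
```

A gem reported as `:installed` here would need its own `gem "name"` line in a Gemfile when used with bundler, as the list above notes for bundled gems.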
+ +## Libraries + +- [minitest]: A test library supporting TDD, BDD, mocking, and benchmarking +- [power_assert]: Power Assert for Ruby +- [rake][rake-doc] ([GitHub][rake]): Ruby build program with capabilities similar to make +- [test-unit]: A compatibility layer for MiniTest +- [rexml][rexml-doc] ([GitHub][rexml]): An XML toolkit for Ruby +- [rss]: A family of libraries supporting various XML-based "feeds" +- [net-ftp]: Support for the File Transfer Protocol +- [net-imap]: Ruby client API for the Internet Message Access Protocol +- [net-pop]: Ruby client library for POP3 +- [net-smtp]: Simple Mail Transfer Protocol client library for Ruby +- [matrix]: Represents a mathematical matrix +- [prime]: Prime numbers and factorization library +- [rbs]: RBS is a language to describe the structure of Ruby programs +- [typeprof]: A type analysis tool for Ruby code based on abstract interpretation +- [debug]: Debugging functionality for Ruby +- [racc][racc-doc] ([GitHub][racc]): A LALR(1) parser generator written in Ruby +- [mutex_m]: Mixin to extend objects to be handled like a Mutex +- [getoptlong]: Parse command line options similar to the GNU C getopt_long() +- [base64]: Support for encoding and decoding binary data using a Base64 representation +- [bigdecimal]: Provides arbitrary-precision floating point decimal arithmetic +- [observer]: Provides a mechanism for the publish/subscribe pattern in Ruby +- [abbrev]: Calculates a set of unique abbreviations for a given set of strings +- [resolv-replace]: Replace Socket DNS with Resolv +- [rinda]: The Linda distributed computing paradigm in Ruby +- [drb]: Distributed object system for Ruby +- [nkf]: Ruby extension for the Network Kanji Filter +- [syslog]: Ruby interface for the POSIX system logging facility +- [csv][csv-doc] ([GitHub][csv]): Provides an interface to read and write CSV files and data +- [ostruct]: A class to build custom data structures, similar to a Hash +- [benchmark]: Provides methods to measure and report 
the time used to execute code +- [logger][logger-doc] ([GitHub][logger]): Provides a simple logging utility for outputting messages +- [pstore]: Implements a file-based persistence mechanism based on a Hash +- [win32ole]: Provides an interface for OLE Automation in Ruby +- [reline][reline-doc] ([GitHub][reline]): GNU Readline and Editline in a pure Ruby implementation +- [readline]: Wrapper for the Readline extension and Reline +- [fiddle]: A libffi wrapper for Ruby +- [tsort]: Topological sorting using Tarjan's algorithm + +## Tools + +- [IRB][irb-doc] ([GitHub][irb]): Interactive Ruby command-line tool for REPL (Read Eval Print Loop) +- [RDoc][rdoc-doc] ([GitHub][rdoc]): Documentation generator for Ruby + +[abbrev]: https://github.com/ruby/abbrev +[base64]: https://github.com/ruby/base64 +[benchmark]: https://github.com/ruby/benchmark +[bigdecimal]: https://github.com/ruby/bigdecimal +[bundler]: https://github.com/rubygems/rubygems +[csv]: https://github.com/ruby/csv +[date]: https://github.com/ruby/date +[debug]: https://github.com/ruby/debug +[delegate]: https://github.com/ruby/delegate +[did_you_mean]: https://github.com/ruby/did_you_mean +[digest]: https://github.com/ruby/digest +[drb]: https://github.com/ruby/drb +[English]: https://github.com/ruby/English +[erb]: https://github.com/ruby/erb +[error_highlight]: https://github.com/ruby/error_highlight +[etc]: https://github.com/ruby/etc +[fcntl]: https://github.com/ruby/fcntl +[fiddle]: https://github.com/ruby/fiddle +[fileutils]: https://github.com/ruby/fileutils +[find]: https://github.com/ruby/find +[forwardable]: https://github.com/ruby/forwardable +[getoptlong]: https://github.com/ruby/getoptlong +[io-console]: https://github.com/ruby/io-console +[io-nonblock]: https://github.com/ruby/io-nonblock +[io-wait]: https://github.com/ruby/io-wait +[ipaddr]: https://github.com/ruby/ipaddr +[irb]: https://github.com/ruby/irb +[json]: https://github.com/ruby/json +[logger]: https://github.com/ruby/logger 
+[matrix]: https://github.com/ruby/matrix +[minitest]: https://github.com/seattlerb/minitest +[mutex_m]: https://github.com/ruby/mutex_m +[net-ftp]: https://github.com/ruby/net-ftp +[net-http]: https://github.com/ruby/net-http +[net-imap]: https://github.com/ruby/net-imap +[net-pop]: https://github.com/ruby/net-pop +[net-smtp]: https://github.com/ruby/net-smtp +[nkf]: https://github.com/ruby/nkf +[observer]: https://github.com/ruby/observer +[open-uri]: https://github.com/ruby/open-uri +[open3]: https://github.com/ruby/open3 +[openssl]: https://github.com/ruby/openssl +[optparse]: https://github.com/ruby/optparse +[ostruct]: https://github.com/ruby/ostruct +[pathname]: https://github.com/ruby/pathname +[power_assert]: https://github.com/ruby/power_assert +[pp]: https://github.com/ruby/pp +[prettyprint]: https://github.com/ruby/prettyprint +[prime]: https://github.com/ruby/prime +[prism]: https://github.com/ruby/prism +[pstore]: https://github.com/ruby/pstore +[psych]: https://github.com/ruby/psych +[racc]: https://github.com/ruby/racc +[rake]: https://github.com/ruby/rake +[rbs]: https://github.com/ruby/rbs +[rdoc]: https://github.com/ruby/rdoc +[readline]: https://github.com/ruby/readline +[reline]: https://github.com/ruby/reline +[resolv-replace]: https://github.com/ruby/resolv-replace +[resolv]: https://github.com/ruby/resolv +[rexml]: https://github.com/ruby/rexml +[rinda]: https://github.com/ruby/rinda +[rss]: https://github.com/ruby/rss +[securerandom]: https://github.com/ruby/securerandom +[shellwords]: https://github.com/ruby/shellwords +[singleton]: https://github.com/ruby/singleton +[stringio]: https://github.com/ruby/stringio +[strscan]: https://github.com/ruby/strscan +[syslog]: https://github.com/ruby/syslog +[tempfile]: https://github.com/ruby/tempfile +[test-unit]: https://github.com/test-unit/test-unit +[time]: https://github.com/ruby/time +[timeout]: https://github.com/ruby/timeout +[tmpdir]: https://github.com/ruby/tmpdir +[tsort]: 
https://github.com/ruby/tsort +[typeprof]: https://github.com/ruby/typeprof +[un]: https://github.com/ruby/un +[uri]: https://github.com/ruby/uri +[weakref]: https://github.com/ruby/weakref +[win32ole]: https://github.com/ruby/win32ole +[yaml]: https://github.com/ruby/yaml +[zlib]: https://github.com/ruby/zlib + +[reline-doc]: https://ruby.github.io/reline/ +[rake-doc]: https://ruby.github.io/rake/ +[irb-doc]: https://ruby.github.io/irb/ +[rdoc-doc]: https://ruby.github.io/rdoc/ +[logger-doc]: https://ruby.github.io/logger/ +[racc-doc]: https://ruby.github.io/racc/ +[csv-doc]: https://ruby.github.io/csv/ +[rexml-doc]: https://ruby.github.io/rexml/ diff --git a/doc/standard_library.rdoc b/doc/standard_library.rdoc deleted file mode 100644 index 11aed156b9..0000000000 --- a/doc/standard_library.rdoc +++ /dev/null @@ -1,127 +0,0 @@ -= Ruby Standard Library - -The Ruby Standard Library is a vast collection of classes and modules that you -can require in your code for additional features. - -Below is an overview of libraries and extensions followed by a brief -description. 
- -== Libraries - -Abbrev:: Calculates a set of unique abbreviations for a given set of strings -Base64:: Support for encoding and decoding binary data using a Base64 representation -Benchmark:: Provides methods to measure and report the time used to execute code -CGI:: Support for the Common Gateway Interface protocol -DEBUGGER__:: Debugging functionality for Ruby -Delegator:: Provides three abilities to delegate method calls to an object -DRb:: Distributed object system for Ruby -English.rb:: Require 'English.rb' to reference global variables with less cryptic names -ERB:: An easy to use but powerful templating system for Ruby -Find:: This module supports top-down traversal of a set of file paths -GetoptLong:: Parse command line options similar to the GNU C getopt_long() -MakeMakefile:: Module used to generate a Makefile for C extensions -Monitor:: Provides an object or module to use safely by more than one thread -Net::FTP:: Support for the File Transfer Protocol -Net::HTTP:: HTTP client api for Ruby -Net::IMAP:: Ruby client api for Internet Message Access Protocol -Net::POP3:: Ruby client library for POP3 -Net::SMTP:: Simple Mail Transfer Protocol client library for Ruby -Observable:: Provides a mechanism for publish/subscribe pattern in Ruby -OpenURI:: An easy-to-use wrapper for Net::HTTP, Net::HTTPS and Net::FTP -Open3:: Provides access to stdin, stdout and stderr when running other programs -OptionParser:: Ruby-oriented class for command-line option analysis -PP:: Provides a PrettyPrinter for Ruby objects -PrettyPrinter:: Implements a pretty printing algorithm for readable structure -PStore:: Implements a file based persistence mechanism based on a Hash -RbConfig:: Information of your configure and build of Ruby -resolv-replace.rb:: Replace Socket DNS with Resolv -Resolv:: Thread-aware DNS resolver library in Ruby -Rinda:: The Linda distributed computing paradigm in Ruby -Gem:: Package management framework for Ruby -SecureRandom:: Interface for secure random 
number generator -Set:: Provides a class to deal with collections of unordered, unique values -Shellwords:: Manipulates strings with word parsing rules of UNIX Bourne shell -Singleton:: Implementation of the Singleton pattern for Ruby -Tempfile:: A utility class for managing temporary files -Time:: Extends the Time class with methods for parsing and conversion -Timeout:: Auto-terminate potentially long-running operations in Ruby -tmpdir.rb:: Extends the Dir class to manage the OS temporary file path -TSort:: Topological sorting using Tarjan's algorithm -un.rb:: Utilities to replace common UNIX commands -URI:: A Ruby module providing support for Uniform Resource Identifiers -WeakRef:: Allows a referenced object to be garbage-collected -YAML:: Ruby client library for the Psych YAML implementation - -== Extensions - -Coverage:: Provides coverage measurement for Ruby -Digest:: Provides a framework for message digest libraries -IO:: Extensions for Ruby IO class, including #wait and ::console -NKF:: Ruby extension for Network Kanji Filter -objspace:: Extends ObjectSpace module to add methods for internal statistics -Pathname:: Representation of the name of a file or directory on the filesystem -PTY:: Creates and manages pseudo terminals -Readline:: Provides an interface for GNU Readline and Edit Line (libedit) -Ripper:: Provides an interface for parsing Ruby programs into S-expressions -Socket:: Access underlying OS socket implementations -Syslog:: Ruby interface for the POSIX system logging facility -WIN32OLE:: Provides an interface for OLE Automation in Ruby - -= Default gems - -== Libraries - -Bundler:: Manage your Ruby application's gem dependencies -CMath:: Provides Trigonometric and Transcendental functions for complex numbers -CSV:: Provides an interface to read and write CSV files and data -E2MM:: Module for defining custom exceptions with specific messages -FileUtils:: Several file utility methods for copying, moving, removing, etc -Forwardable:: Provides 
delegation of specified methods to a designated object -IPAddr:: Provides methods to manipulate IPv4 and IPv6 IP addresses -IRB:: Interactive Ruby command-line tool for REPL (Read Eval Print Loop) -Logger:: Provides a simple logging utility for outputting messages -Matrix:: Represents a mathematical matrix. -Mutex_m:: Mixin to extend objects to be handled like a Mutex -OpenStruct:: Class to build custom data structures, similar to a Hash -Prime:: Prime numbers and factorization library -Racc:: A LALR(1) parser generator written in Ruby. -RDoc:: Produces HTML and command-line documentation for Ruby -REXML:: An XML toolkit for Ruby -RSS:: Family of libraries that support various formats of XML "feeds" -Scanf:: A Ruby implementation of the C function scanf(3) -Shell:: An idiomatic Ruby interface for common UNIX shell commands -Synchronizer:: A module that provides a two-phase lock with a counter -ThreadsWait:: Watches for termination of multiple threads -Tracer:: Outputs a source level execution trace of a Ruby program -WEBrick:: An HTTP server toolkit for Ruby - -== Extensions - -BigDecimal:: Provides arbitrary-precision floating point decimal arithmetic -Date:: A subclass of Object includes Comparable module for handling dates -DateTime:: Subclass of Date to handling dates, hours, minutes, seconds, offsets -DBM:: Provides a wrapper for the UNIX-style Database Manager Library -Etc:: Provides access to information typically stored in UNIX /etc directory -Fcntl:: Loads constants defined in the OS fcntl.h C header file -Fiddle:: A libffi wrapper for Ruby -GDBM:: Ruby extension for the GNU dbm (gdbm) library -IO::console:: Console interface -JSON:: Implements Javascript Object Notation for Ruby -OpenSSL:: Provides SSL, TLS and general purpose cryptography for Ruby -Psych:: A YAML parser and emitter for Ruby -SDBM:: Provides a simple file-based key-value store with String keys and values -StringIO:: Pseudo I/O on String objects -StringScanner:: Provides lexical scanning 
operations on a String -Zlib:: Ruby interface for the zlib compression/decompression library - -= Bundled gems - -== Libraries - -DidYouMean:: "Did you mean?" experience in Ruby -MiniTest:: A test suite with TDD, BDD, mocking and benchmarking -Net::Telnet:: Telnet client library for Ruby -PowerAssert:: Power Assert for Ruby. -Rake:: Ruby build program with capabilities similar to make -Test::Unit:: A compatibility layer for MiniTest -XMLRPC:: Remote Procedure Call over HTTP support for Ruby diff --git a/doc/string.rb b/doc/string.rb new file mode 100644 index 0000000000..304ab60c29 --- /dev/null +++ b/doc/string.rb @@ -0,0 +1,421 @@ +# A +String+ object has an arbitrary sequence of bytes, +# typically representing text or binary data. +# A +String+ object may be created using String::new or as literals. +# +# String objects differ from Symbol objects in that Symbol objects are +# designed to be used as identifiers, instead of text or data. +# +# You can create a +String+ object explicitly with: +# +# - A {string literal}[rdoc-ref:syntax/literals.rdoc@String+Literals]. +# - A {heredoc literal}[rdoc-ref:syntax/literals.rdoc@Here+Document+Literals]. +# +# You can convert certain objects to Strings with: +# +# - Method #String. +# +# Some +String+ methods modify +self+. +# Typically, a method whose name ends with <tt>!</tt> modifies +self+ +# and returns +self+; +# often, a similarly named method (without the <tt>!</tt>) +# returns a new string. +# +# In general, if both bang and non-bang versions of a method exist, +# the bang method mutates and the non-bang method does not. +# However, a method without a bang can also mutate, such as String#replace. +# +# == Substitution Methods +# +# These methods perform substitutions: +# +# - String#sub: One substitution (or none); returns a new string. +# - String#sub!: One substitution (or none); returns +self+ if any changes, +# +nil+ otherwise. +# - String#gsub: Zero or more substitutions; returns a new string. 
+# - String#gsub!: Zero or more substitutions; returns +self+ if any changes, +# +nil+ otherwise. +# +# Each of these methods takes: +# +# - A first argument, +pattern+ (String or Regexp), +# that specifies the substring(s) to be replaced. +# +# - Either of the following: +# +# - A second argument, +replacement+ (String or Hash), +# that determines the replacing string. +# - A block that will determine the replacing string. +# +# The examples in this section mostly use the String#sub and String#gsub methods; +# the principles illustrated apply to all four substitution methods. +# +# <b>Argument +pattern+</b> +# +# Argument +pattern+ is commonly a regular expression: +# +# s = 'hello' +# s.sub(/[aeiou]/, '*') # => "h*llo" +# s.gsub(/[aeiou]/, '*') # => "h*ll*" +# s.gsub(/[aeiou]/, '') # => "hll" +# s.sub(/ell/, 'al') # => "halo" +# s.gsub(/xyzzy/, '*') # => "hello" +# 'THX1138'.gsub(/\d+/, '00') # => "THX00" +# +# When +pattern+ is a string, all its characters are treated +# as ordinary characters (not as Regexp special characters): +# +# 'THX1138'.gsub('\d+', '00') # => "THX1138" +# +# <b>+String+ +replacement+</b> +# +# If +replacement+ is a string, that string determines +# the replacing string that is substituted for the matched text. +# +# Each of the examples above uses a simple string as the replacing string. +# +# +String+ +replacement+ may contain back-references to the pattern's captures: +# +# - <tt>\n</tt> (_n_ is a non-negative integer) refers to <tt>$n</tt>. +# - <tt>\k<name></tt> refers to the named capture +name+. +# +# See Regexp for details. +# +# Note that within the string +replacement+, a character combination +# such as <tt>$&</tt> is treated as ordinary text, not as +# a special match variable. +# However, you may refer to some special match variables using these +# combinations: +# +# - <tt>\&</tt> and <tt>\0</tt> correspond to <tt>$&</tt>, +# which contains the complete matched text. 
+# - <tt>\'</tt> corresponds to <tt>$'</tt>, +# which contains the string after the match. +# - <tt>\`</tt> corresponds to <tt>$`</tt>, +# which contains the string before the match. +# - <tt>\\+</tt> corresponds to <tt>$+</tt>, +# which contains the last capture group. +# +# See Regexp for details. +# +# Note that <tt>\\\\</tt> is interpreted as an escape, i.e., a single backslash. +# +# Note also that a string literal consumes backslashes. +# See {String Literals}[rdoc-ref:syntax/literals.rdoc@String+Literals] for details about string literals. +# +# A back-reference is typically preceded by an additional backslash. +# For example, if you want to write a back-reference <tt>\&</tt> in +# +replacement+ with a double-quoted string literal, you need to write +# <tt>"..\\\\&.."</tt>. +# +# If you want to write a non-back-reference string <tt>\&</tt> in +# +replacement+, you need to first escape the backslash to prevent +# this method from interpreting it as a back-reference, and then you +# need to escape the backslashes again to prevent a string literal from +# consuming them: <tt>"..\\\\\\\\&.."</tt>. +# +# You may want to use the block form to avoid excessive backslashes. +# +# <b>\Hash +replacement+</b> +# +# If the argument +replacement+ is a hash, and +pattern+ matches one of its keys, +# the replacing string is the value for that key: +# +# h = {'foo' => 'bar', 'baz' => 'bat'} +# 'food'.sub('foo', h) # => "bard" +# +# Note that a symbol key does not match: +# +# h = {foo: 'bar', baz: 'bat'} +# 'food'.sub('foo', h) # => "d" +# +# <b>Block</b> +# +# In the block form, the current match string is passed to the block; +# the block's return value becomes the replacing string: +# +# s = '@' +# '1234'.gsub(/\d/) { |match| s.succ! } # => "ABCD" +# +# Special match variables such as <tt>$1</tt>, <tt>$2</tt>, <tt>$`</tt>, +# <tt>$&</tt>, and <tt>$'</tt> are set appropriately. 
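The escaping rules for +replacement+ described above can be sketched as follows. This is a minimal illustration, assuming plain Ruby with no extra libraries; the sample strings and variable names are made up for this example and are not part of the patch.

```ruby
# Illustrative sketch only; sample data and variable names are hypothetical.
s = 'hello'

# Back-reference \& (the whole match). A double-quoted literal consumes one
# level of backslashes, so "\\&" reaches the method as \&.
wrapped_double = s.gsub(/l/, "<\\&>")
# A single-quoted literal keeps \& as written, so this is the same replacement.
wrapped_single = s.gsub(/l/, '<\&>')

# A literal backslash followed by "&": escape the backslash once for the
# substitution method and once more for the double-quoted literal.
literal = s.gsub(/l/, "<\\\\&>")

# Named captures are referenced with \k<name>.
swapped = 'John Smith'.sub(/(?<first>\w+) (?<last>\w+)/, '\k<last>, \k<first>')

# Block form: the match is passed in directly, so no backslashes are needed.
bracketed = s.gsub(/l/) { |match| "<#{match}>" }

# Hash replacement: each matched substring is looked up as a key.
leet = s.gsub(/[eo]/, { 'e' => '3', 'o' => '0' })
```

The block form gives the same result as the <tt>\&</tt> back-reference here, which is why it is often the easier choice when the replacement would otherwise need several layers of backslashes.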
+# +# == Whitespace in Strings +# +# In the class +String+, _whitespace_ is defined as a contiguous sequence of characters +# consisting of any mixture of the following: +# +# - NL (null): <tt>"\x00"</tt>, <tt>"\u0000"</tt>. +# - HT (horizontal tab): <tt>"\x09"</tt>, <tt>"\t"</tt>. +# - LF (line feed): <tt>"\x0a"</tt>, <tt>"\n"</tt>. +# - VT (vertical tab): <tt>"\x0b"</tt>, <tt>"\v"</tt>. +# - FF (form feed): <tt>"\x0c"</tt>, <tt>"\f"</tt>. +# - CR (carriage return): <tt>"\x0d"</tt>, <tt>"\r"</tt>. +# - SP (space): <tt>"\x20"</tt>, <tt>" "</tt>. +# +# +# Whitespace is relevant for the following methods: +# +# - #lstrip, #lstrip!: Strip leading whitespace. +# - #rstrip, #rstrip!: Strip trailing whitespace. +# - #strip, #strip!: Strip leading and trailing whitespace. +# +# == What's Here +# +# First, what's elsewhere. Class +String+: +# +# - Inherits from the {Object class}[rdoc-ref:Object@What-27s+Here]. +# - Includes the {Comparable module}[rdoc-ref:Comparable@What-27s+Here]. +# +# Here, class +String+ provides methods that are useful for: +# +# - {Creating a \String}[rdoc-ref:String@Creating+a+String]. +# - {Freezing/Unfreezing a \String}[rdoc-ref:String@Freezing-2FUnfreezing]. +# - {Querying a \String}[rdoc-ref:String@Querying]. +# - {Comparing Strings}[rdoc-ref:String@Comparing]. +# - {Modifying a \String}[rdoc-ref:String@Modifying]. +# - {Converting to a new \String}[rdoc-ref:String@Converting+to+New+String]. +# - {Converting to a non-\String}[rdoc-ref:String@Converting+to+Non--5CString]. +# - {Iterating over a \String}[rdoc-ref:String@Iterating]. +# +# === Creating a \String +# +# - ::new: Returns a new string. +# - ::try_convert: Returns a new string created from a given object. +# +# === Freezing/Unfreezing +# +# - #+@: Returns a string that is not frozen: +self+ if not frozen; +# +self.dup+ otherwise. +# - #-@ (aliased as #dedup): Returns a string that is frozen: +self+ if already frozen; +# +self.freeze+ otherwise. 
+# - #freeze: Freezes +self+ if not already frozen; returns +self+. +# +# === Querying +# +# _Counts_ +# +# - #bytesize: Returns the count of bytes. +# - #count: Returns the count of substrings matching given strings. +# - #empty?: Returns whether the length of +self+ is zero. +# - #length (aliased as #size): Returns the count of characters (not bytes). +# +# _Substrings_ +# +# - #=~: Returns the index of the first substring that matches a given +# Regexp or other object; returns +nil+ if no match is found. +# - #byteindex: Returns the byte index of the first occurrence of a given substring. +# - #byterindex: Returns the byte index of the last occurrence of a given substring. +# - #index: Returns the index of the _first_ occurrence of a given substring; +# returns +nil+ if none found. +# - #rindex: Returns the index of the _last_ occurrence of a given substring; +# returns +nil+ if none found. +# - #include?: Returns +true+ if the string contains a given substring; +false+ otherwise. +# - #match: Returns a MatchData object if the string matches a given Regexp; +nil+ otherwise. +# - #match?: Returns +true+ if the string matches a given Regexp; +false+ otherwise. +# - #start_with?: Returns +true+ if the string begins with any of the given substrings. +# - #end_with?: Returns +true+ if the string ends with any of the given substrings. +# +# _Encodings_ +# +# - #encoding\: Returns the Encoding object that represents the encoding of the string. +# - #unicode_normalized?: Returns +true+ if the string is in Unicode normalized form; +false+ otherwise. +# - #valid_encoding?: Returns +true+ if the string contains only characters that are valid +# for its encoding. +# - #ascii_only?: Returns +true+ if the string has only ASCII characters; +false+ otherwise. +# +# _Other_ +# +# - #sum: Returns a basic checksum for the string: the sum of each byte. +# - #hash: Returns the integer hash code. 
+# +# === Comparing +# +# - #== (aliased as #===): Returns +true+ if a given other string has the same content as +self+. +# - #eql?: Returns +true+ if the content is the same as the given other string. +# - #<=>: Returns -1, 0, or 1 as a given other string is smaller than, +# equal to, or larger than +self+. +# - #casecmp: Ignoring case, returns -1, 0, or 1 as +# +self+ is smaller than, equal to, or larger than a given other string. +# - #casecmp?: Ignoring case, returns whether a given other string is equal to +self+. +# +# === Modifying +# +# Each of these methods modifies +self+. +# +# _Insertion_ +# +# - #insert: Returns +self+ with a given string inserted at a specified offset. +# - #<<: Returns +self+ concatenated with a given string or integer. +# - #append_as_bytes: Returns +self+ concatenated with strings without performing any +# encoding validation or conversion. +# - #prepend: Prefixes to +self+ the concatenation of given other strings. +# +# _Substitution_ +# +# - #bytesplice: Replaces bytes of +self+ with bytes from a given string; returns +self+. +# - #sub!: Replaces the first substring that matches a given pattern with a given replacement string; +# returns +self+ if any changes, +nil+ otherwise. +# - #gsub!: Replaces each substring that matches a given pattern with a given replacement string; +# returns +self+ if any changes, +nil+ otherwise. +# - #succ! (aliased as #next!): Returns +self+ modified to become its own successor. +# - #replace: Returns +self+ with its entire content replaced by a given string. +# - #reverse!: Returns +self+ with its characters in reverse order. +# - #setbyte: Sets the byte at a given integer offset to a given value; returns the argument. +# - #tr!: Replaces specified characters in +self+ with specified replacement characters; +# returns +self+ if any changes, +nil+ otherwise. 
+# - #tr_s!: Replaces specified characters in +self+ with specified replacement characters, +# removing duplicates from the substrings that were modified; +# returns +self+ if any changes, +nil+ otherwise. +# +# _Casing_ +# +# - #capitalize!: Upcases the initial character and downcases all others; +# returns +self+ if any changes, +nil+ otherwise. +# - #downcase!: Downcases all characters; returns +self+ if any changes, +nil+ otherwise. +# - #upcase!: Upcases all characters; returns +self+ if any changes, +nil+ otherwise. +# - #swapcase!: Upcases each downcase character and downcases each upcase character; +# returns +self+ if any changes, +nil+ otherwise. +# +# _Encoding_ +# +# - #encode!: Returns +self+ with all characters transcoded from one encoding to another. +# - #unicode_normalize!: Unicode-normalizes +self+; returns +self+. +# - #scrub!: Replaces each invalid byte with a given character; returns +self+. +# - #force_encoding: Changes the encoding to a given encoding; returns +self+. +# +# _Deletion_ +# +# - #clear: Removes all content, so that +self+ is empty; returns +self+. +# - #slice!, #[]=: Removes a substring determined by a given index, start/length, range, regexp, or substring. +# - #squeeze!: Removes contiguous duplicate characters; returns +self+. +# - #delete!: Removes characters as determined by the intersection of substring arguments. +# - #delete_prefix!: Removes leading prefix; returns +self+ if any changes, +nil+ otherwise. +# - #delete_suffix!: Removes trailing suffix; returns +self+ if any changes, +nil+ otherwise. +# - #lstrip!: Removes leading whitespace; returns +self+ if any changes, +nil+ otherwise. +# - #rstrip!: Removes trailing whitespace; returns +self+ if any changes, +nil+ otherwise. +# - #strip!: Removes leading and trailing whitespace; returns +self+ if any changes, +nil+ otherwise. +# - #chomp!: Removes the trailing record separator, if found; returns +self+ if any changes, +nil+ otherwise. 
+# - #chop!: Removes trailing newline characters if found; otherwise removes the last character; +# returns +self+ if any changes, +nil+ otherwise. +# +# === Converting to New \String +# +# Each of these methods returns a new +String+ based on +self+, +# often just a modified copy of +self+. +# +# _Extension_ +# +# - #*: Returns the concatenation of multiple copies of +self+. +# - #+: Returns the concatenation of +self+ and a given other string. +# - #center: Returns a copy of +self+, centered by specified padding. +# - #concat: Returns the concatenation of +self+ with given other strings. +# - #ljust: Returns a copy of +self+ of a given length, right-padded with a given other string. +# - #rjust: Returns a copy of +self+ of a given length, left-padded with a given other string. +# +# _Encoding_ +# +# - #b: Returns a copy of +self+ with ASCII-8BIT encoding. +# - #scrub: Returns a copy of +self+ with each invalid byte replaced with a given character. +# - #unicode_normalize: Returns a copy of +self+ with each character Unicode-normalized. +# - #encode: Returns a copy of +self+ with all characters transcoded from one encoding to another. +# +# _Substitution_ +# +# - #dump: Returns a printable version of +self+, enclosed in double-quotes. +# - #undump: Inverse of #dump; returns a copy of +self+ with changes of the kinds made by #dump "undone." +# - #sub: Returns a copy of +self+ with the first substring matching a given pattern +# replaced with a given replacement string. +# - #gsub: Returns a copy of +self+ with each substring that matches a given pattern +# replaced with a given replacement string. +# - #succ (aliased as #next): Returns the string that is the successor to +self+. +# - #reverse: Returns a copy of +self+ with its characters in reverse order. +# - #tr: Returns a copy of +self+ with specified characters replaced with specified replacement characters. 
+# - #tr_s: Returns a copy of +self+ with specified characters replaced with +# specified replacement characters, +# removing duplicates from the substrings that were modified. +# - #%: Returns the string resulting from formatting a given object into +self+. +# +# _Casing_ +# +# - #capitalize: Returns a copy of +self+ with the first character upcased +# and all other characters downcased. +# - #downcase: Returns a copy of +self+ with all characters downcased. +# - #upcase: Returns a copy of +self+ with all characters upcased. +# - #swapcase: Returns a copy of +self+ with all upcase characters downcased +# and all downcase characters upcased. +# +# _Deletion_ +# +# - #delete: Returns a copy of +self+ with characters removed. +# - #delete_prefix: Returns a copy of +self+ with a given prefix removed. +# - #delete_suffix: Returns a copy of +self+ with a given suffix removed. +# - #lstrip: Returns a copy of +self+ with leading whitespace removed. +# - #rstrip: Returns a copy of +self+ with trailing whitespace removed. +# - #strip: Returns a copy of +self+ with leading and trailing whitespace removed. +# - #chomp: Returns a copy of +self+ with a trailing record separator removed, if found. +# - #chop: Returns a copy of +self+ with trailing newline characters or the last character removed. +# - #squeeze: Returns a copy of +self+ with contiguous duplicate characters removed. +# - #[] (aliased as #slice): Returns a substring determined by a given index, start/length, range, regexp, or string. +# - #byteslice: Returns a substring determined by a given index, start/length, or range. +# - #chr: Returns the first character. +# +# _Duplication_ +# +# - #to_s (aliased as #to_str): If +self+ is a subclass of +String+, returns +self+ copied into a +String+; +# otherwise, returns +self+. +# +# === Converting to Non-\String +# +# Each of these methods converts the contents of +self+ to a non-+String+. 
+# +# <em>Characters, Bytes, and Clusters</em> +# +# - #bytes: Returns an array of the bytes in +self+. +# - #chars: Returns an array of the characters in +self+. +# - #codepoints: Returns an array of the integer ordinals in +self+. +# - #getbyte: Returns the integer byte at the given index in +self+. +# - #grapheme_clusters: Returns an array of the grapheme clusters in +self+. +# +# _Splitting_ +# +# - #lines: Returns an array of the lines in +self+, as determined by a given record separator. +# - #partition: Returns a 3-element array determined by the first substring that matches +# a given substring or regexp. +# - #rpartition: Returns a 3-element array determined by the last substring that matches +# a given substring or regexp. +# - #split: Returns an array of substrings determined by a given delimiter -- regexp or string -- +# or, if a block is given, passes those substrings to the block. +# +# _Matching_ +# +# - #scan: Returns an array of substrings matching a given regexp or string, or, +# if a block is given, passes each matching substring to the block. +# - #unpack: Returns an array of substrings extracted from +self+ according to a given format. +# - #unpack1: Returns the first substring extracted from +self+ according to a given format. +# +# _Numerics_ +# +# - #hex: Returns the integer value of the leading characters, interpreted as hexadecimal digits. +# - #oct: Returns the integer value of the leading characters, interpreted as octal digits. +# - #ord: Returns the integer ordinal of the first character in +self+. +# - #to_c: Returns the complex value of leading characters, interpreted as a complex number. +# - #to_i: Returns the integer value of leading characters, interpreted as an integer. +# - #to_f: Returns the floating-point value of leading characters, interpreted as a floating-point number. +# - #to_r: Returns the rational value of leading characters, interpreted as a rational. 
+# +# <em>Strings and Symbols</em> +# +# - #inspect: Returns a copy of +self+, enclosed in double quotes, with special characters escaped. +# - #intern (aliased as #to_sym): Returns the symbol corresponding to +self+. +# +# === Iterating +# +# - #each_byte: Calls the given block with each successive byte in +self+. +# - #each_char: Calls the given block with each successive character in +self+. +# - #each_codepoint: Calls the given block with each successive integer codepoint in +self+. +# - #each_grapheme_cluster: Calls the given block with each successive grapheme cluster in +self+. +# - #each_line: Calls the given block with each successive line in +self+, +# as determined by a given record separator. +# - #upto: Calls the given block with each string value returned by successive calls to #succ. + +class String; end diff --git a/doc/string/aref.rdoc b/doc/string/aref.rdoc new file mode 100644 index 0000000000..a9ab8857bc --- /dev/null +++ b/doc/string/aref.rdoc @@ -0,0 +1,96 @@ +Returns the substring of +self+ specified by the arguments. + +<b>Form <tt>self[offset]</tt></b> + +With non-negative integer argument +offset+ given, +returns the 1-character substring found in self at character offset +offset+: + + 'hello'[0] # => "h" + 'hello'[4] # => "o" + 'hello'[5] # => nil + 'こんにちは'[4] # => "は" + +With negative integer argument +offset+ given, +counts backward from the end of +self+: + + 'hello'[-1] # => "o" + 'hello'[-5] # => "h" + 'hello'[-6] # => nil + +<b>Form <tt>self[offset, size]</tt></b> + +With integer arguments +offset+ and +size+ given, +returns a substring of size +size+ characters (as available) +beginning at character offset specified by +offset+. 
+ +If argument +offset+ is non-negative, +the offset is +offset+: + + 'hello'[0, 1] # => "h" + 'hello'[0, 5] # => "hello" + 'hello'[0, 6] # => "hello" + 'hello'[2, 3] # => "llo" + 'hello'[2, 0] # => "" + 'hello'[2, -1] # => nil + +If argument +offset+ is negative, +counts backward from the end of +self+: + + 'hello'[-1, 1] # => "o" + 'hello'[-5, 5] # => "hello" + 'hello'[-1, 0] # => "" + 'hello'[-6, 5] # => nil + +Special case: if +offset+ equals the size of +self+, +returns a new empty string: + + 'hello'[5, 3] # => "" + +<b>Form <tt>self[range]</tt></b> + +With Range argument +range+ given, +forms substring <tt>self[range.start, range.size]</tt>: + + 'hello'[0..2] # => "hel" + 'hello'[0, 3] # => "hel" + + 'hello'[0...2] # => "he" + 'hello'[0, 2] # => "he" + + 'hello'[0, 0] # => "" + 'hello'[0...0] # => "" + +<b>Form <tt>self[regexp, capture = 0]</tt></b> + +With Regexp argument +regexp+ given and +capture+ as zero, +searches for a matching substring in +self+; +updates {Regexp-related global variables}[rdoc-ref:Regexp@Global+Variables]: + + 'hello'[/ell/] # => "ell" + 'hello'[/l+/] # => "ll" + 'hello'[//] # => "" + 'hello'[/nosuch/] # => nil + +With +capture+ as a positive integer +n+, +returns the +n+th matched group: + + 'hello'[/(h)(e)(l+)(o)/] # => "hello" + 'hello'[/(h)(e)(l+)(o)/, 1] # => "h" + $1 # => "h" + 'hello'[/(h)(e)(l+)(o)/, 2] # => "e" + $2 # => "e" + 'hello'[/(h)(e)(l+)(o)/, 3] # => "ll" + 'hello'[/(h)(e)(l+)(o)/, 4] # => "o" + 'hello'[/(h)(e)(l+)(o)/, 5] # => nil + +<b>Form <tt>self[substring]</tt></b> + +With string argument +substring+ given, +returns the matching substring of +self+, if found: + + 'hello'['ell'] # => "ell" + 'hello'[''] # => "" + 'hello'['nosuch'] # => nil + 'こんにちは'['んにち'] # => "んにち" + +Related: see {Converting to New String}[rdoc-ref:String@Converting+to+New+String]. 
diff --git a/doc/string/aset.rdoc b/doc/string/aset.rdoc new file mode 100644 index 0000000000..98c58b59cc --- /dev/null +++ b/doc/string/aset.rdoc @@ -0,0 +1,179 @@ +Returns +self+ with all, a substring, or none of its contents replaced; +returns the argument +other_string+. + +<b>Form <tt>self[index] = other_string</tt></b> + +With non-negative integer argument +index+ given, +searches for the 1-character substring found in self at character offset index: + + s = 'hello' + s[0] = 'foo' # => "foo" + s # => "fooello" + + s = 'hello' + s[4] = 'foo' # => "foo" + s # => "hellfoo" + + s = 'hello' + s[5] = 'foo' # => "foo" + s # => "hellofoo" + + s = 'hello' + s[6] = 'foo' # Raises IndexError: index 6 out of string. + +With negative integer argument +index+ given, +counts backward from the end of +self+: + + s = 'hello' + s[-1] = 'foo' # => "foo" + s # => "hellfoo" + + s = 'hello' + s[-5] = 'foo' # => "foo" + s # => "fooello" + + s = 'hello' + s[-6] = 'foo' # Raises IndexError: index -6 out of string. + +<b>Form <tt>self[start, length] = other_string</tt></b> + +With integer arguments +start+ and +length+ given, +searches for a substring of size +length+ characters (as available) +beginning at character offset specified by +start+. + +If argument +start+ is non-negative, +the offset is +start+: + + s = 'hello' + s[0, 1] = 'foo' # => "foo" + s # => "fooello" + + s = 'hello' + s[0, 5] = 'foo' # => "foo" + s # => "foo" + + s = 'hello' + s[0, 9] = 'foo' # => "foo" + s # => "foo" + + s = 'hello' + s[2, 0] = 'foo' # => "foo" + s # => "hefoollo" + + s = 'hello' + s[2, -1] = 'foo' # Raises IndexError: negative length -1. 
+ +If argument +start+ is negative, +counts backward from the end of +self+: + + s = 'hello' + s[-1, 1] = 'foo' # => "foo" + s # => "hellfoo" + + s = 'hello' + s[-1, 9] = 'foo' # => "foo" + s # => "hellfoo" + + s = 'hello' + s[-5, 2] = 'foo' # => "foo" + s # => "foollo" + + s = 'hello' + s[-3, 0] = 'foo' # => "foo" + s # => "hefoollo" + + s = 'hello' + s[-6, 2] = 'foo' # Raises IndexError: index -6 out of string. + +Special case: if +start+ equals the length of +self+, +the argument is appended to +self+: + + s = 'hello' + s[5, 3] = 'foo' # => "foo" + s # => "hellofoo" + +<b>Form <tt>self[range] = other_string</tt></b> + +With Range argument +range+ given, +equivalent to <tt>self[range.start, range.size] = other_string</tt>: + + s0 = 'hello' + s1 = 'hello' + s0[0..2] = 'foo' # => "foo" + s1[0, 3] = 'foo' # => "foo" + s0 # => "foolo" + s1 # => "foolo" + + s = 'hello' + s[0...2] = 'foo' # => "foo" + s # => "foollo" + + s = 'hello' + s[0...0] = 'foo' # => "foo" + s # => "foohello" + + s = 'hello' + s[9..10] = 'foo' # Raises RangeError: 9..10 out of range + +<b>Form <tt>self[regexp, capture = 0] = other_string</tt></b> + +With Regexp argument +regexp+ given and +capture+ as zero, +searches for a matching substring in +self+; +updates {Regexp-related global variables}[rdoc-ref:Regexp@Global+Variables]: + + s = 'hello' + s[/l/] = 'L' # => "L" + [$`, $&, $'] # => ["he", "l", "lo"] + s[/eLlo/] = 'owdy' # => "owdy" + [$`, $&, $'] # => ["h", "eLlo", ""] + s[/eLlo/] = 'owdy' # Raises IndexError: regexp not matched. 
+ [$`, $&, $'] # => [nil, nil, nil] + +With +capture+ as a positive integer +n+, +searches for the +n+th matched group: + + s = 'hello' + s[/(h)(e)(l+)(o)/] = 'foo' # => "foo" + [$`, $&, $'] # => ["", "hello", ""] + + s = 'hello' + s[/(h)(e)(l+)(o)/, 1] = 'foo' # => "foo" + s # => "fooello" + [$`, $&, $'] # => ["", "hello", ""] + + s = 'hello' + s[/(h)(e)(l+)(o)/, 2] = 'foo' # => "foo" + s # => "hfoollo" + [$`, $&, $'] # => ["", "hello", ""] + + s = 'hello' + s[/(h)(e)(l+)(o)/, 4] = 'foo' # => "foo" + s # => "hellfoo" + [$`, $&, $'] # => ["", "hello", ""] + + s = 'hello' + # => "hello" + s[/(h)(e)(l+)(o)/, 5] = 'foo # Raises IndexError: index 5 out of regexp. + + s = 'hello' + s[/nosuch/] = 'foo' # Raises IndexError: regexp not matched. + +<b>Form <tt>self[substring] = other_string</tt></b> + +With string argument +substring+ given: + + s = 'hello' + s['l'] = 'foo' # => "foo" + s # => "hefoolo" + + s = 'hello' + s['ll'] = 'foo' # => "foo" + s # => "hefooo" + + s = 'ã“ã‚“ã«ã¡ã¯' + s['ã‚“ã«ã¡'] = 'foo' # => "foo" + s # => "ã“fooã¯" + + s['nosuch'] = 'foo' # Raises IndexError: string not matched. + +Related: see {Modifying}[rdoc-ref:String@Modifying]. diff --git a/doc/string/b.rdoc b/doc/string/b.rdoc new file mode 100644 index 0000000000..8abd6d9532 --- /dev/null +++ b/doc/string/b.rdoc @@ -0,0 +1,16 @@ +Returns a copy of +self+ that has ASCII-8BIT encoding; +the underlying bytes are not modified: + + s = "\x99" + s.encoding # => #<Encoding:UTF-8> + t = s.b # => "\x99" + t.encoding # => #<Encoding:ASCII-8BIT> + + s = "\u4095" # => "ä‚•" + s.encoding # => #<Encoding:UTF-8> + s.bytes # => [228, 130, 149] + t = s.b # => "\xE4\x82\x95" + t.encoding # => #<Encoding:ASCII-8BIT> + t.bytes # => [228, 130, 149] + +Related: see {Converting to New String}[rdoc-ref:String@Converting+to+New+String]. 
diff --git a/doc/string/bytes.rdoc b/doc/string/bytes.rdoc new file mode 100644 index 0000000000..6dde0a745d --- /dev/null +++ b/doc/string/bytes.rdoc @@ -0,0 +1,7 @@ +Returns an array of the bytes in +self+: + + 'hello'.bytes # => [104, 101, 108, 108, 111] + 'こんにちは'.bytes + # => [227, 129, 147, 227, 130, 147, 227, 129, 171, 227, 129, 161, 227, 129, 175] + +Related: see {Converting to Non-String}[rdoc-ref:String@Converting+to+Non--5CString]. diff --git a/doc/string/bytesize.rdoc b/doc/string/bytesize.rdoc new file mode 100644 index 0000000000..8d12a0d454 --- /dev/null +++ b/doc/string/bytesize.rdoc @@ -0,0 +1,12 @@ +Returns the count of bytes in +self+. + +Note that the byte count may be different from the character count (returned by #size): + + s = 'foo' + s.bytesize # => 3 + s.size # => 3 + s = 'こんにちは' + s.bytesize # => 15 + s.size # => 5 + +Related: see {Querying}[rdoc-ref:String@Querying]. diff --git a/doc/string/byteslice.rdoc b/doc/string/byteslice.rdoc new file mode 100644 index 0000000000..d70441fb2b --- /dev/null +++ b/doc/string/byteslice.rdoc @@ -0,0 +1,54 @@ +Returns a substring of +self+, or +nil+ if the substring cannot be constructed. 
+ +With integer arguments +offset+ and +length+ given, +returns the substring beginning at the given +offset+ +and of the given +length+ (as available): + + s = '0123456789' # => "0123456789" + s.byteslice(2) # => "2" + s.byteslice(200) # => nil + s.byteslice(4, 3) # => "456" + s.byteslice(4, 30) # => "456789" + +Returns +nil+ if +length+ is negative or +offset+ falls outside of +self+: + + s.byteslice(4, -1) # => nil + s.byteslice(40, 2) # => nil + +Counts backwards from the end of +self+ +if +offset+ is negative: + + s = '0123456789' # => "0123456789" + s.byteslice(-4) # => "6" + s.byteslice(-4, 3) # => "678" + +With Range argument +range+ given, returns +<tt>byteslice(range.begin, range.size)</tt>: + + s = '0123456789' # => "0123456789" + s.byteslice(4..6) # => "456" + s.byteslice(-6..-4) # => "456" + s.byteslice(5..2) # => "" # range.size is zero. + s.byteslice(40..42) # => nil + +The starting and ending offsets need not be on character boundaries: + + s = 'こんにちは' + s.byteslice(0, 3) # => "こ" + s.byteslice(1, 3) # => "\x81\x93\xE3" + +The encodings of +self+ and the returned substring +are always the same: + + s.encoding # => #<Encoding:UTF-8> + s.byteslice(0, 3).encoding # => #<Encoding:UTF-8> + s.byteslice(1, 3).encoding # => #<Encoding:UTF-8> + +But, depending on the character boundaries, +the encoding of the returned substring may not be valid: + + s.valid_encoding? # => true + s.byteslice(0, 3).valid_encoding? # => true + s.byteslice(1, 3).valid_encoding? # => false + +Related: see {Converting to New String}[rdoc-ref:String@Converting+to+New+String]. diff --git a/doc/string/bytesplice.rdoc b/doc/string/bytesplice.rdoc new file mode 100644 index 0000000000..5689ef4a2b --- /dev/null +++ b/doc/string/bytesplice.rdoc @@ -0,0 +1,66 @@ +Replaces <i>target bytes</i> in +self+ with <i>source bytes</i> from the given string +str+; +returns +self+. 
+ +In the first form, arguments +offset+ and +length+ determine the target bytes, +and the source bytes are all of the given +str+: + + '0123456789'.bytesplice(0, 3, 'abc') # => "abc3456789" + '0123456789'.bytesplice(3, 3, 'abc') # => "012abc6789" + '0123456789'.bytesplice(0, 50, 'abc') # => "abc" + '0123456789'.bytesplice(50, 3, 'abc') # Raises IndexError. + +The counts of the target bytes and source bytes may be different: + + '0123456789'.bytesplice(0, 6, 'abc') # => "abc6789" # Shorter source. + '0123456789'.bytesplice(0, 1, 'abc') # => "abc123456789" # Shorter target. + +And either count may be zero (i.e., specifying an empty string): + + '0123456789'.bytesplice(0, 3, '') # => "3456789" # Empty source. + '0123456789'.bytesplice(0, 0, 'abc') # => "abc0123456789" # Empty target. + +In the second form, just as in the first, +arguments +offset+ and +length+ determine the target bytes; +argument +str+ _contains_ the source bytes, +and the additional arguments +str_offset+ and +str_length+ +determine the actual source bytes: + + '0123456789'.bytesplice(0, 3, 'abc', 0, 3) # => "abc3456789" + '0123456789'.bytesplice(0, 3, 'abc', 1, 1) # => "b3456789" # Shorter source. + '0123456789'.bytesplice(0, 1, 'abc', 0, 3) # => "abc123456789" # Shorter target. + '0123456789'.bytesplice(0, 3, 'abc', 1, 0) # => "3456789" # Empty source. + '0123456789'.bytesplice(0, 0, 'abc', 0, 3) # => "abc0123456789" # Empty target. + +In the third form, argument +range+ determines the target bytes +and the source bytes are all of the given +str+: + + '0123456789'.bytesplice(0..2, 'abc') # => "abc3456789" + '0123456789'.bytesplice(3..5, 'abc') # => "012abc6789" + '0123456789'.bytesplice(0..5, 'abc') # => "abc6789" # Shorter source. + '0123456789'.bytesplice(0..0, 'abc') # => "abc123456789" # Shorter target. + '0123456789'.bytesplice(0..2, '') # => "3456789" # Empty source. + '0123456789'.bytesplice(0...0, 'abc') # => "abc0123456789" # Empty target. 
+ +In the fourth form, just as in the third, +argument +range+ determines the target bytes; +argument +str+ _contains_ the source bytes, +and the additional argument +str_range+ +determines the actual source bytes: + + '0123456789'.bytesplice(0..2, 'abc', 0..2) # => "abc3456789" + '0123456789'.bytesplice(3..5, 'abc', 0..2) # => "012abc6789" + '0123456789'.bytesplice(0..2, 'abc', 0..1) # => "ab3456789" # Shorter source. + '0123456789'.bytesplice(0..1, 'abc', 0..2) # => "abc23456789" # Shorter target. + '0123456789'.bytesplice(0..2, 'abc', 0...0) # => "3456789" # Empty source. + '0123456789'.bytesplice(0...0, 'abc', 0..2) # => "abc0123456789" # Empty target. + +In any of the forms, the beginnings and endings of both source and target +must be on character boundaries. + +In these examples, +self+ has five 3-byte characters, +and so has character boundaries at offsets 0, 3, 6, 9, 12, and 15. + + 'こんにちは'.bytesplice(0, 3, 'abc') # => "abcんにちは" + 'こんにちは'.bytesplice(1, 3, 'abc') # Raises IndexError. + 'こんにちは'.bytesplice(0, 2, 'abc') # Raises IndexError. + diff --git a/doc/string/capitalize.rdoc b/doc/string/capitalize.rdoc new file mode 100644 index 0000000000..3a1a2dcb8b --- /dev/null +++ b/doc/string/capitalize.rdoc @@ -0,0 +1,26 @@ +Returns a string containing the characters in +self+, +each with possibly changed case: + +- The first character is made uppercase. +- All other characters are made lowercase. + +Examples: + + 'hello'.capitalize # => "Hello" + 'HELLO'.capitalize # => "Hello" + 'straße'.capitalize # => "Straße" # Lowercase 'ß' not changed. + 'STRAẞE'.capitalize # => "Straße" # Uppercase 'ẞ' downcased to 'ß'. + +Some characters (and some character sets) do not have upcase and downcase versions; +see {Case Mapping}[rdoc-ref:case_mapping.rdoc]: + + s = '1, 2, 3, ...' 
+ s.capitalize == s # => true + s = 'こんにちは' + s.capitalize == s # => true + +The casing is affected by the given +mapping+, +which may be +:ascii+ or +:turkic+ +(+:fold+ is accepted only when downcasing); +see {Case Mappings}[rdoc-ref:case_mapping.rdoc@Case+Mappings]. + +Related: see {Converting to New String}[rdoc-ref:String@Converting+to+New+String]. diff --git a/doc/string/center.rdoc b/doc/string/center.rdoc new file mode 100644 index 0000000000..b86c8b5916 --- /dev/null +++ b/doc/string/center.rdoc @@ -0,0 +1,19 @@ +Returns a centered copy of +self+. + +If integer argument +size+ is greater than the size (in characters) of +self+, +returns a new string of length +size+ that is a copy of +self+, +centered and padded on one or both ends with +pad_string+: + + 'hello'.center(6) # => "hello " # Padded on one end. + 'hello'.center(10) # => "  hello   " # Padded on both ends. + 'hello'.center(20, '-|') # => "-|-|-|-hello-|-|-|-|" # Some padding repeated. + 'hello'.center(10, 'abcdefg') # => "abhelloabc" # Some padding not used. + ' hello '.center(13) # => "    hello    " + 'こんにちは'.center(10) # => "  こんにちは   " # Multi-byte characters. + +If +size+ is less than or equal to the size of +self+, returns an unpadded copy of +self+: + + 'hello'.center(5) # => "hello" + 'hello'.center(-10) # => "hello" + +Related: see {Converting to New String}[rdoc-ref:String@Converting+to+New+String]. diff --git a/doc/string/chars.rdoc b/doc/string/chars.rdoc new file mode 100644 index 0000000000..d4d15bf2ad --- /dev/null +++ b/doc/string/chars.rdoc @@ -0,0 +1,7 @@ +Returns an array of the characters in +self+: + + 'hello'.chars # => ["h", "e", "l", "l", "o"] + 'こんにちは'.chars # => ["こ", "ん", "に", "ち", "は"] + ''.chars # => [] + +Related: see {Converting to Non-String}[rdoc-ref:String@Converting+to+Non--5CString]. 
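The center.rdoc examples above show that an odd amount of padding is not split evenly. A minimal sketch of how the split can be observed follows; the helper `center_padding` is hypothetical, and it assumes +self+ does not itself contain the <tt>'*'</tt> pad character.

```ruby
# Hypothetical helper for illustration; not part of String.
# Reports how many pad characters String#center puts on each side.
# Assumes the string does not contain '*'.
def center_padding(string, size)
  centered = string.center(size, '*')
  left = centered.index(string)                 # pad characters before the text
  right = centered.size - left - string.size    # pad characters after the text
  [left, right]
end

center_padding('hello', 6)   # => [0, 1]
center_padding('hello', 10)  # => [2, 3]
center_padding('hello', 5)   # => [0, 0]
```

Using a visible pad character instead of the default space makes the uneven split easy to see in the result.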
diff --git a/doc/string/chomp.rdoc b/doc/string/chomp.rdoc new file mode 100644 index 0000000000..4efff5c291 --- /dev/null +++ b/doc/string/chomp.rdoc @@ -0,0 +1,31 @@ +Returns a new string copied from +self+, with trailing characters possibly removed. + +When +line_sep+ is <tt>"\n"</tt>, removes the last one or two characters +if they are <tt>"\r"</tt>, <tt>"\n"</tt>, or <tt>"\r\n"</tt> +(but not <tt>"\n\r"</tt>): + + $/ # => "\n" + "abc\r".chomp # => "abc" + "abc\n".chomp # => "abc" + "abc\r\n".chomp # => "abc" + "abc\n\r".chomp # => "abc\n" + "こんにちは\r\n".chomp # => "こんにちは" + +When +line_sep+ is <tt>''</tt> (an empty string), +removes multiple trailing occurrences of <tt>"\n"</tt> or <tt>"\r\n"</tt> +(but not <tt>"\r"</tt> or <tt>"\n\r"</tt>): + + "abc\n\n\n".chomp('') # => "abc" + "abc\r\n\r\n\r\n".chomp('') # => "abc" + "abc\n\n\r\n\r\n\n\n".chomp('') # => "abc" + "abc\n\r\n\r\n\r".chomp('') # => "abc\n\r\n\r\n\r" + "abc\r\r\r".chomp('') # => "abc\r\r\r" + +When +line_sep+ is neither <tt>"\n"</tt> nor <tt>''</tt>, +removes a single trailing line separator if there is one: + + 'abcd'.chomp('cd') # => "ab" + 'abcdcd'.chomp('cd') # => "abcd" + 'abcd'.chomp('xx') # => "abcd" + +Related: see {Converting to New String}[rdoc-ref:String@Converting+to+New+String]. diff --git a/doc/string/chop.rdoc b/doc/string/chop.rdoc new file mode 100644 index 0000000000..d818ba467a --- /dev/null +++ b/doc/string/chop.rdoc @@ -0,0 +1,17 @@ +Returns a new string copied from +self+, with trailing characters possibly removed. + +Removes <tt>"\r\n"</tt> if those are the last two characters. + + "abc\r\n".chop # => "abc" + "こんにちは\r\n".chop # => "こんにちは" + +Otherwise removes the last character if it exists. + + 'abcd'.chop # => "abc" + 'こんにちは'.chop # => "こんにち" + ''.chop # => "" + +If you only need to remove the newline separator at the end of the string, +String#chomp is a better alternative. 
+ +Related: see {Converting to New String}[rdoc-ref:String@Converting+to+New+String]. diff --git a/doc/string/chr.rdoc b/doc/string/chr.rdoc new file mode 100644 index 0000000000..153d5d71c3 --- /dev/null +++ b/doc/string/chr.rdoc @@ -0,0 +1,7 @@ +Returns a string containing the first character of +self+: + + 'hello'.chr # => "h" + 'こんにちは'.chr # => "こ" + ''.chr # => "" + +Related: see {Converting to New String}[rdoc-ref:String@Converting+to+New+String]. diff --git a/doc/string/codepoints.rdoc b/doc/string/codepoints.rdoc new file mode 100644 index 0000000000..22cb22c889 --- /dev/null +++ b/doc/string/codepoints.rdoc @@ -0,0 +1,8 @@ +Returns an array of the codepoints in +self+; +each codepoint is the integer value for a character: + + 'hello'.codepoints # => [104, 101, 108, 108, 111] + 'こんにちは'.codepoints # => [12371, 12435, 12395, 12385, 12399] + ''.codepoints # => [] + +Related: see {Converting to Non-String}[rdoc-ref:String@Converting+to+Non--5CString]. diff --git a/doc/string/concat.rdoc b/doc/string/concat.rdoc new file mode 100644 index 0000000000..92ba664b8c --- /dev/null +++ b/doc/string/concat.rdoc @@ -0,0 +1,11 @@ +Concatenates each object in +objects+ to +self+; returns +self+: + + 'foo'.concat('bar', 'baz') # => "foobarbaz" + +For each given object +object+ that is an integer, +the value is considered a codepoint and converted to a character before concatenation: + + 'foo'.concat(32, 'bar', 32, 'baz') # => "foo bar baz" # Embeds spaces. + 'こん'.concat(12395, 12385, 12399) # => "こんにちは" + +Related: see {Converting to New String}[rdoc-ref:String@Converting+to+New+String]. diff --git a/doc/string/count.rdoc b/doc/string/count.rdoc new file mode 100644 index 0000000000..7a3b9f1e21 --- /dev/null +++ b/doc/string/count.rdoc @@ -0,0 +1,74 @@ +Returns the total number of characters in +self+ that are specified by the given selectors. 
+ +For one 1-character selector, +returns the count of instances of that character: + + s = 'abracadabra' + s.count('a') # => 5 + s.count('b') # => 2 + s.count('x') # => 0 + s.count('') # => 0 + + s = 'よろしくお願いします' + s.count('よ') # => 1 + s.count('し') # => 2 + +For one multi-character selector, +returns the count of instances for all specified characters: + + s = 'abracadabra' + s.count('ab') # => 7 + s.count('abc') # => 8 + s.count('abcd') # => 9 + s.count('abcdr') # => 11 + s.count('abcdrx') # => 11 + +Order and repetition do not matter: + + s.count('ba') == s.count('ab') # => true + s.count('baab') == s.count('ab') # => true + +For multiple selectors, +forms a single selector that is the intersection of characters in all selectors +and returns the count of instances for that selector: + + s = 'abcdefg' + s.count('abcde', 'dcbfg') == s.count('bcd') # => true + s.count('abc', 'def') == s.count('') # => true + +In a character selector, three characters get special treatment: + +- A caret (<tt>'^'</tt>) functions as a _negation_ operator + for the immediately following characters: + + s = 'abracadabra' + s.count('^bc') # => 8 # Count of all except 'b' and 'c'. + +- A hyphen (<tt>'-'</tt>) between two other characters defines a _range_ of characters: + + s = 'abracadabra' + s.count('a-c') # => 8 # Count of all 'a', 'b', and 'c'. + +- A backslash (<tt>'\'</tt>) acts as an escape for a caret, a hyphen, + or another backslash: + + s = 'abracadabra' + s.count('\^bc') # => 3 # Count of '^', 'b', and 'c'. + s.count('a\-c') # => 6 # Count of 'a', '-', and 'c'. + 'foo\bar\baz'.count('\\') # => 2 # Count of '\'. + +These usages may be mixed: + + s = 'abracadabra' + s.count('a-cq-t') # => 10 # Multiple ranges. + s.count('ac-d') # => 7 # Range mixed with plain characters. + s.count('^a-c') # => 3 # Range mixed with negation. + +For multiple selectors, all forms may be used, including negations, ranges, and escapes: 
+ + s = 'abracadabra' + s.count('^abc', '^def') == s.count('^abcdef') # => true + s.count('a-e', 'c-g') == s.count('cde') # => true + s.count('^abc', 'c-g') == s.count('defg') # => true + +Related: see {Querying}[rdoc-ref:String@Querying]. diff --git a/doc/string/delete.rdoc b/doc/string/delete.rdoc new file mode 100644 index 0000000000..1827f177e6 --- /dev/null +++ b/doc/string/delete.rdoc @@ -0,0 +1,75 @@ +Returns a new string that is a copy of +self+ with certain characters removed; +the removed characters are all instances of those specified by the given string +selectors+. + +For one 1-character selector, +removes all instances of that character: + + s = 'abracadabra' + s.delete('a') # => "brcdbr" + s.delete('b') # => "aracadara" + s.delete('x') # => "abracadabra" + s.delete('') # => "abracadabra" + + s = 'よろしくお願いします' + s.delete('よ') # => "ろしくお願いします" + s.delete('し') # => "よろくお願います" + +For one multi-character selector, +removes all instances of the specified characters: + + s = 'abracadabra' + s.delete('ab') # => "rcdr" + s.delete('abc') # => "rdr" + s.delete('abcd') # => "rr" + s.delete('abcdr') # => "" + s.delete('abcdrx') # => "" + +Order and repetition do not matter: + + s.delete('ba') == s.delete('ab') # => true + s.delete('baab') == s.delete('ab') # => true + +For multiple selectors, +forms a single selector that is the intersection of characters in all selectors +and removes all instances of characters specified by that selector: + + s = 'abcdefg' + s.delete('abcde', 'dcbfg') == s.delete('bcd') # => true + s.delete('abc', 'def') == s.delete('') # => true + +In a character selector, three characters get special treatment: + +- A caret (<tt>'^'</tt>) functions as a _negation_ operator + for the immediately following characters: + + s = 'abracadabra' + s.delete('^bc') # => "bcb" # Deletes all except 'b' and 'c'. 
+ +- A hyphen (<tt>'-'</tt>) between two other characters defines a _range_ of characters: + + s = 'abracadabra' + s.delete('a-c') # => "rdr" # Deletes all 'a', 'b', and 'c'. + +- A backslash (<tt>'\'</tt>) acts as an escape for a caret, a hyphen, + or another backslash: + + s = 'abracadabra' + s.delete('\^bc') # => "araadara" # Deletes all '^', 'b', and 'c'. + s.delete('a\-c') # => "brdbr" # Deletes all 'a', '-', and 'c'. + 'foo\bar\baz'.delete('\\') # => "foobarbaz" # Deletes all '\'. + +These usages may be mixed: + + s = 'abracadabra' + s.delete('a-cq-t') # => "d" # Multiple ranges. + s.delete('ac-d') # => "brbr" # Range mixed with plain characters. + s.delete('^a-c') # => "abacaaba" # Range mixed with negation. + +For multiple selectors, all forms may be used, including negations, ranges, and escapes: + + s = 'abracadabra' + s.delete('^abc', '^def') == s.delete('^abcdef') # => true + s.delete('a-e', 'c-g') == s.delete('cde') # => true + s.delete('^abc', 'c-g') == s.delete('defg') # => true + +Related: see {Converting to New String}[rdoc-ref:String@Converting+to+New+String]. diff --git a/doc/string/delete_prefix.rdoc b/doc/string/delete_prefix.rdoc new file mode 100644 index 0000000000..6255e300e3 --- /dev/null +++ b/doc/string/delete_prefix.rdoc @@ -0,0 +1,9 @@ +Returns a copy of +self+ with leading substring +prefix+ removed: + + 'oof'.delete_prefix('o') # => "of" + 'oof'.delete_prefix('oo') # => "f" + 'oof'.delete_prefix('oof') # => "" + 'oof'.delete_prefix('x') # => "oof" + 'こんにちは'.delete_prefix('こん') # => "にちは" + +Related: see {Converting to New String}[rdoc-ref:String@Converting+to+New+String]. 
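Because delete.rdoc and delete_prefix.rdoc sit side by side here, it may help to see how differently the two methods read the same argument: delete_prefix treats it as one leading substring, while delete treats it as a character selector. The helper `prefix_and_selector_results` below is hypothetical and only pairs the two results for comparison.

```ruby
# Hypothetical helper for illustration; not part of String.
# Returns [result of delete_prefix, result of delete] for one argument.
def prefix_and_selector_results(string, argument)
  [string.delete_prefix(argument), string.delete(argument)]
end

prefix_and_selector_results('oof', 'o')      # => ["of", "f"]
prefix_and_selector_results('banana', 'ba')  # => ["nana", "nn"]
```

In the second call, delete_prefix removes the single leading <tt>"ba"</tt>, whereas delete removes every <tt>'b'</tt> and every <tt>'a'</tt>.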
diff --git a/doc/string/delete_suffix.rdoc b/doc/string/delete_suffix.rdoc new file mode 100644 index 0000000000..a4d9a80f85 --- /dev/null +++ b/doc/string/delete_suffix.rdoc @@ -0,0 +1,10 @@ +Returns a copy of +self+ with trailing substring +suffix+ removed: + + 'foo'.delete_suffix('o') # => "fo" + 'foo'.delete_suffix('oo') # => "f" + 'foo'.delete_suffix('foo') # => "" + 'foo'.delete_suffix('f') # => "foo" + 'foo'.delete_suffix('x') # => "foo" + 'こんにちは'.delete_suffix('ちは') # => "こんに" + +Related: see {Converting to New String}[rdoc-ref:String@Converting+to+New+String]. diff --git a/doc/string/downcase.rdoc b/doc/string/downcase.rdoc new file mode 100644 index 0000000000..d5fffa037b --- /dev/null +++ b/doc/string/downcase.rdoc @@ -0,0 +1,20 @@ +Returns a new string containing the downcased characters in +self+: + + 'HELLO'.downcase # => "hello" + 'STRAẞE'.downcase # => "straße" + 'ПРИВЕТ'.downcase # => "привет" + 'RubyGems.org'.downcase # => "rubygems.org" + +Some characters (and some character sets) do not have upcase and downcase versions; +see {Case Mapping}[rdoc-ref:case_mapping.rdoc]: + + s = '1, 2, 3, ...' + s.downcase == s # => true + s = 'こんにちは' + s.downcase == s # => true + +The casing is affected by the given +mapping+, +which may be +:ascii+, +:fold+, or +:turkic+; +see {Case Mappings}[rdoc-ref:case_mapping.rdoc@Case+Mappings]. + +Related: see {Converting to New String}[rdoc-ref:String@Converting+to+New+String]. diff --git a/doc/string/dump.rdoc b/doc/string/dump.rdoc new file mode 100644 index 0000000000..add3c35662 --- /dev/null +++ b/doc/string/dump.rdoc @@ -0,0 +1,89 @@ +For an ordinary string, this method, +String#dump+, +returns a printable ASCII-only version of +self+, enclosed in double-quotes. + +For a dumped string, method String#undump is the inverse of +String#dump+; +it returns a "restored" version of +self+, +where all the dumping changes have been undone. 
+ +In the simplest case, the dumped string contains the original string, +enclosed in double-quotes; +this example is done in +irb+ (interactive Ruby), which uses method +inspect+ to render the results: + + s = 'hello' # => "hello" + s.dump # => "\"hello\"" + s.dump.undump # => "hello" + +Keep in mind that in the second line above: + +- The outer double-quotes are put on by +inspect+, + and _are_ _not_ part of the output of #dump. +- The inner double-quotes _are_ part of the output of +dump+, + and are escaped by +inspect+ because they are within the outer double-quotes. + +To avoid confusion, we'll use this helper method to omit the outer double-quotes: + + def dump(s) + print "String: ", s, "\n" + print "Dumped: ", s.dump, "\n" + print "Undumped: ", s.dump.undump, "\n" + end + +So that for string <tt>'hello'</tt>, we'll see: + + String: hello + Dumped: "hello" + Undumped: hello + +In a dump, certain special characters are escaped: + + String: " + Dumped: "\"" + Undumped: " + + String: \ + Dumped: "\\" + Undumped: \ + +In a dump, unprintable characters are replaced by printable ones; +the unprintable characters are the whitespace characters (other than space itself); +here we see the ordinals for those characters, together with explanatory text: + + h = { + 7 => 'Alert (BEL)', + 8 => 'Backspace (BS)', + 9 => 'Horizontal tab (HT)', + 10 => 'Linefeed (LF)', + 11 => 'Vertical tab (VT)', + 12 => 'Formfeed (FF)', + 13 => 'Carriage return (CR)' + } + +In this example, the dumped output is printed by method #inspect, +and so contains both outer double-quotes and escaped inner double-quotes: + + s = '' + h.keys.each {|i| s << i } # => [7, 8, 9, 10, 11, 12, 13] + s # => "\a\b\t\n\v\f\r" + s.dump # => "\"\\a\\b\\t\\n\\v\\f\\r\"" + +If +self+ is encoded in UTF-8 and contains Unicode characters, +each Unicode character is dumped as a Unicode escape sequence: + + String: こんにちは + Dumped: "\u3053\u3093\u306B\u3061\u306F" + Undumped: こんにちは + +If the encoding of +self+ 
is not ASCII-compatible +(i.e., if <tt>self.encoding.ascii_compatible?</tt> returns +false+), +each ASCII-compatible byte is dumped as an ASCII character, +and all other bytes are dumped as hexadecimal; +also appends <tt>.dup.force_encoding(\"<encoding>\")</tt>, +where <tt><encoding></tt> is <tt>self.encoding.name</tt>: + + String: hello + Dumped: "\xFE\xFF\x00h\x00e\x00l\x00l\x00o".dup.force_encoding("UTF-16") + Undumped: hello + + String: こんにちは + Dumped: "\xFE\xFF0S0\x930k0a0o".dup.force_encoding("UTF-16") + Undumped: こんにちは diff --git a/doc/string/each_byte.rdoc b/doc/string/each_byte.rdoc new file mode 100644 index 0000000000..642d71e84b --- /dev/null +++ b/doc/string/each_byte.rdoc @@ -0,0 +1,15 @@ +With a block given, calls the block with each successive byte from +self+; +returns +self+: + + a = [] + 'hello'.each_byte {|byte| a.push(byte) } # Five 1-byte characters. + a # => [104, 101, 108, 108, 111] + a = [] + 'こんにちは'.each_byte {|byte| a.push(byte) } # Five 3-byte characters. + a # => [227, 129, 147, 227, 130, 147, 227, 129, 171, 227, 129, 161, 227, 129, 175] + +With no block given, returns an enumerator. + +Related: see {Iterating}[rdoc-ref:String@Iterating]. + + diff --git a/doc/string/each_char.rdoc b/doc/string/each_char.rdoc new file mode 100644 index 0000000000..2dd56711d3 --- /dev/null +++ b/doc/string/each_char.rdoc @@ -0,0 +1,17 @@ +With a block given, calls the block with each successive character from +self+; +returns +self+: + + a = [] + 'hello'.each_char do |char| + a.push(char) + end + a # => ["h", "e", "l", "l", "o"] + a = [] + 'こんにちは'.each_char do |char| + a.push(char) + end + a # => ["こ", "ん", "に", "ち", "は"] + +With no block given, returns an enumerator. + +Related: see {Iterating}[rdoc-ref:String@Iterating]. 
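The sentence "With no block given, returns an enumerator" in each_byte.rdoc and each_char.rdoc has no example of its own. One possible use, sketched with a hypothetical helper `indexed_chars`, is chaining the enumerator with Enumerator#with_index:

```ruby
# Hypothetical helper for illustration; not part of String.
# Uses the enumerator returned by String#each_char (no block given)
# to pair each character with its zero-based position.
def indexed_chars(string)
  string.each_char.with_index.map { |char, index| [index, char] }
end

indexed_chars('abc')  # => [[0, "a"], [1, "b"], [2, "c"]]
indexed_chars('')     # => []
```

The same pattern applies to each_byte; for example, <tt>'hello'.each_byte.to_a</tt> gives the same array as <tt>'hello'.bytes</tt>.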
diff --git a/doc/string/each_codepoint.rdoc b/doc/string/each_codepoint.rdoc new file mode 100644 index 0000000000..8e4e7545e6 --- /dev/null +++ b/doc/string/each_codepoint.rdoc @@ -0,0 +1,18 @@ +With a block given, calls the block with each successive codepoint from +self+; +each {codepoint}[https://en.wikipedia.org/wiki/Code_point] is the integer value for a character; +returns +self+: + + a = [] + 'hello'.each_codepoint do |codepoint| + a.push(codepoint) + end + a # => [104, 101, 108, 108, 111] + a = [] + 'こんにちは'.each_codepoint do |codepoint| + a.push(codepoint) + end + a # => [12371, 12435, 12395, 12385, 12399] + +With no block given, returns an enumerator. + +Related: see {Iterating}[rdoc-ref:String@Iterating]. diff --git a/doc/string/each_grapheme_cluster.rdoc b/doc/string/each_grapheme_cluster.rdoc new file mode 100644 index 0000000000..384cd6967d --- /dev/null +++ b/doc/string/each_grapheme_cluster.rdoc @@ -0,0 +1,19 @@ +With a block given, calls the given block with each successive grapheme cluster from +self+ +(see {Unicode Grapheme Cluster Boundaries}[https://www.unicode.org/reports/tr29/#Grapheme_Cluster_Boundaries]); +returns +self+: + + a = [] + 'hello'.each_grapheme_cluster do |grapheme_cluster| + a.push(grapheme_cluster) + end + a # => ["h", "e", "l", "l", "o"] + + a = [] + 'こんにちは'.each_grapheme_cluster do |grapheme_cluster| + a.push(grapheme_cluster) + end + a # => ["こ", "ん", "に", "ち", "は"] + +With no block given, returns an enumerator. + +Related: see {Iterating}[rdoc-ref:String@Iterating]. diff --git a/doc/string/each_line.rdoc b/doc/string/each_line.rdoc new file mode 100644 index 0000000000..217c188e35 --- /dev/null +++ b/doc/string/each_line.rdoc @@ -0,0 +1,66 @@ +With a block given, forms the substrings (lines) +that are the result of splitting +self+ +at each occurrence of the given +record_separator+; +passes each line to the block; +returns +self+. 
+ +With the default +record_separator+: + + $/ # => "\n" + s = <<~EOT + This is the first line. + This is line two. + + This is line four. + This is line five. + EOT + s.each_line {|line| p line } + +Output: + + "This is the first line.\n" + "This is line two.\n" + "\n" + "This is line four.\n" + "This is line five.\n" + +With a different +record_separator+: + + record_separator = ' is ' + s.each_line(record_separator) {|line| p line } + +Output: + + "This is " + "the first line.\nThis is " + "line two.\n\nThis is " + "line four.\nThis is " + "line five.\n" + +With +chomp+ as +true+, removes the trailing +record_separator+ from each line: + + s.each_line(chomp: true) {|line| p line } + +Output: + + "This is the first line." + "This is line two." + "" + "This is line four." + "This is line five." + +With an empty string as +record_separator+, +forms and passes "paragraphs" by splitting at each occurrence +of two or more newlines: + + record_separator = '' + s.each_line(record_separator) {|line| p line } + +Output: + + "This is the first line.\nThis is line two.\n\n" + "This is line four.\nThis is line five.\n" + +With no block given, returns an enumerator. + +Related: see {Iterating}[rdoc-ref:String@Iterating]. diff --git a/doc/string/encode.rdoc b/doc/string/encode.rdoc new file mode 100644 index 0000000000..14b959ffff --- /dev/null +++ b/doc/string/encode.rdoc @@ -0,0 +1,50 @@ +Returns a copy of +self+ transcoded as determined by +dst_encoding+; +see {Encodings}[rdoc-ref:encodings.rdoc]. + +By default, raises an exception if +self+ +contains an invalid byte or a character not defined in +dst_encoding+; +that behavior may be modified by encoding options; see below. 
+ +With no arguments: + +- Uses the same encoding if <tt>Encoding.default_internal</tt> is +nil+ + (the default): + + Encoding.default_internal # => nil + s = "Ruby\x99".force_encoding('Windows-1252') + s.encoding # => #<Encoding:Windows-1252> + s.bytes # => [82, 117, 98, 121, 153] + t = s.encode # => "Ruby\x99" + t.encoding # => #<Encoding:Windows-1252> + t.bytes # => [82, 117, 98, 121, 153] + +- Otherwise, uses the encoding <tt>Encoding.default_internal</tt>: + + Encoding.default_internal = 'UTF-8' + t = s.encode # => "Ruby™" + t.encoding # => #<Encoding:UTF-8> + t.bytes # => [82, 117, 98, 121, 226, 132, 162] + +With only argument +dst_encoding+ given, uses that encoding: + + s = "Ruby\x99".force_encoding('Windows-1252') + s.encoding # => #<Encoding:Windows-1252> + t = s.encode('UTF-8') # => "Ruby™" + t.encoding # => #<Encoding:UTF-8> + +With arguments +dst_encoding+ and +src_encoding+ given, +interprets +self+ using +src_encoding+, encodes the new string using +dst_encoding+: + + s = "Ruby\x99" + t = s.encode('UTF-8', 'Windows-1252') # => "Ruby™" + t.encoding # => #<Encoding:UTF-8> + +Optional keyword arguments +enc_opts+ specify encoding options; +see {Encoding Options}[rdoc-ref:encodings.rdoc@Encoding+Options]. + +Please note that, unless option <code>invalid: :replace</code> is +given, conversion from an encoding +enc+ to the same encoding +enc+ +(independent of whether +enc+ is given explicitly or implicitly) is a +no-op, i.e. the string is simply copied without any changes, and no +exceptions are raised, even if there are invalid bytes. + +Related: see {Converting to New String}[rdoc-ref:String@Converting+to+New+String]. 
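encode.rdoc mentions that encoding options can change the default "raise an exception" behavior but shows no example. The sketch below uses the +undef+, +invalid+, and +replace+ options described under Encoding Options; the helper name `to_ascii_lossy` and the choice of <tt>'?'</tt> as the replacement are assumptions made only for this illustration.

```ruby
# Hypothetical helper for illustration; not part of String.
# Transcodes to US-ASCII, substituting '?' for anything that cannot
# be represented instead of raising an exception.
def to_ascii_lossy(string)
  string.encode('US-ASCII', invalid: :replace, undef: :replace, replace: '?')
end

s = "Ruby\u2122"          # "Ruby" followed by the trademark sign, in UTF-8.
to_ascii_lossy(s)         # => "Ruby?"

# Without the options, the same conversion raises.
begin
  s.encode('US-ASCII')
rescue Encoding::UndefinedConversionError => e
  e.class                 # => Encoding::UndefinedConversionError
end
```

Replacement loses information, so this is suited to display or logging rather than to round-tripping data.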
diff --git a/doc/string/end_with_p.rdoc b/doc/string/end_with_p.rdoc new file mode 100644 index 0000000000..9a95d74fde --- /dev/null +++ b/doc/string/end_with_p.rdoc @@ -0,0 +1,9 @@ +Returns whether +self+ ends with any of the given +strings+: + + 'foo'.end_with?('oo') # => true + 'foo'.end_with?('bar', 'oo') # => true + 'foo'.end_with?('bar', 'baz') # => false + 'foo'.end_with?('') # => true + 'こんにちは'.end_with?('は') # => true + +Related: see {Querying}[rdoc-ref:String@Querying]. diff --git a/doc/string/eql_p.rdoc b/doc/string/eql_p.rdoc new file mode 100644 index 0000000000..85409c5ed6 --- /dev/null +++ b/doc/string/eql_p.rdoc @@ -0,0 +1,18 @@ +Returns whether +self+ and +object+ have the same length and content: + + s = 'foo' + s.eql?('foo') # => true + s.eql?('food') # => false + s.eql?('FOO') # => false + +Returns +false+ if the two strings' encodings are not compatible: + + s0 = "äöü" # => "äöü" + s1 = s0.encode(Encoding::ISO_8859_1) # => "\xE4\xF6\xFC" + s0.encoding # => #<Encoding:UTF-8> + s1.encoding # => #<Encoding:ISO-8859-1> + s0.eql?(s1) # => false + +See {Encodings}[rdoc-ref:encodings.rdoc]. + +Related: see {Querying}[rdoc-ref:String@Querying]. diff --git a/doc/string/force_encoding.rdoc b/doc/string/force_encoding.rdoc new file mode 100644 index 0000000000..a509e67f80 --- /dev/null +++ b/doc/string/force_encoding.rdoc @@ -0,0 +1,21 @@ +Changes the encoding of +self+ to the given +encoding+, +which may be a string encoding name or an Encoding object; +does not change the underlying bytes; +returns +self+: + + s = 'łał' + s.bytes # => [197, 130, 97, 197, 130] + s.encoding # => #<Encoding:UTF-8> + s.valid_encoding? # => true + s.force_encoding('ascii') # => "\xC5\x82a\xC5\x82" + s.encoding # => #<Encoding:US-ASCII> + s.bytes # => [197, 130, 97, 197, 130] + +Makes the change even if the given +encoding+ is invalid +for +self+ (as is the change above): + + s.valid_encoding? # => false + +See {Encodings}[rdoc-ref:encodings.rdoc]. 
+ +Related: see {Modifying}[rdoc-ref:String@Modifying]. diff --git a/doc/string/getbyte.rdoc b/doc/string/getbyte.rdoc new file mode 100644 index 0000000000..1d0ed2a5a4 --- /dev/null +++ b/doc/string/getbyte.rdoc @@ -0,0 +1,23 @@ +Returns the byte at zero-based +index+ as an integer: + + s = 'foo' + s.getbyte(0) # => 102 + s.getbyte(1) # => 111 + s.getbyte(2) # => 111 + +Counts backward from the end if +index+ is negative: + + s.getbyte(-3) # => 102 + +Returns +nil+ if +index+ is out of range: + + s.getbyte(3) # => nil + s.getbyte(-4) # => nil + +More examples: + + s = 'こんにちは' + s.bytes # => [227, 129, 147, 227, 130, 147, 227, 129, 171, 227, 129, 161, 227, 129, 175] + s.getbyte(2) # => 147 + +Related: see {Converting to Non-String}[rdoc-ref:String@Converting+to+Non--5CString]. diff --git a/doc/string/grapheme_clusters.rdoc b/doc/string/grapheme_clusters.rdoc new file mode 100644 index 0000000000..07ea1e318b --- /dev/null +++ b/doc/string/grapheme_clusters.rdoc @@ -0,0 +1,19 @@ +Returns an array of the grapheme clusters in +self+ +(see {Unicode Grapheme Cluster Boundaries}[https://www.unicode.org/reports/tr29/#Grapheme_Cluster_Boundaries]): + + s = "ä-pqr-b̈-xyz-c̈" + s.size # => 16 + s.bytesize # => 19 + s.grapheme_clusters.size # => 13 + s.grapheme_clusters + # => ["ä", "-", "p", "q", "r", "-", "b̈", "-", "x", "y", "z", "-", "c̈"] + +Details: + + s = "ä" + s.grapheme_clusters # => ["ä"] # One grapheme cluster. + s.bytes # => [97, 204, 136] # Three bytes. + s.chars # => ["a", "̈"] # Two characters. + s.chars.map {|char| char.ord } # => [97, 776] # Their values. + +Related: see {Converting to Non-String}[rdoc-ref:String@Converting+to+Non--5CString]. diff --git a/doc/string/hash.rdoc b/doc/string/hash.rdoc new file mode 100644 index 0000000000..fe94770ed9 --- /dev/null +++ b/doc/string/hash.rdoc @@ -0,0 +1,19 @@ +Returns the integer hash value for +self+. 
+ +Two \String objects that have identical content and compatible encodings +also have the same hash value; +see Object#hash and {Encodings}[rdoc-ref:encodings.rdoc]: + + s = 'foo' + h = s.hash # => -569050784 + h == 'foo'.hash # => true + h == 'food'.hash # => false + h == 'FOO'.hash # => false + + s0 = "äöü" + s1 = s0.encode(Encoding::ISO_8859_1) + s0.encoding # => #<Encoding:UTF-8> + s1.encoding # => #<Encoding:ISO-8859-1> + s0.hash == s1.hash # => false + +Related: see {Querying}[rdoc-ref:String@Querying]. diff --git a/doc/string/index.rdoc b/doc/string/index.rdoc new file mode 100644 index 0000000000..c3cff24dac --- /dev/null +++ b/doc/string/index.rdoc @@ -0,0 +1,38 @@ +Returns the integer position of the first substring that matches the given argument +pattern+, +or +nil+ if none found. + +When +pattern+ is a string, +returns the index of the first matching substring in +self+: + + 'foo'.index('f') # => 0 + 'foo'.index('o') # => 1 + 'foo'.index('oo') # => 1 + 'foo'.index('ooo') # => nil + 'ã“ã‚“ã«ã¡ã¯'.index('ã¡') # => 3 + +When +pattern+ is a Regexp, returns the index of the first match in +self+: + + 'foo'.index(/o./) # => 1 + 'foo'.index(/.o/) # => 0 + +When +offset+ is non-negative, begins the search at position +offset+; +the returned index is relative to the beginning of +self+: + + 'bar'.index('r', 0) # => 2 + 'bar'.index('r', 1) # => 2 + 'bar'.index('r', 2) # => 2 + 'bar'.index('r', 3) # => nil + 'bar'.index(/[r-z]/, 0) # => 2 + 'ã“ã‚“ã«ã¡ã¯'.index('ã¡', 2) # => 3 + +With negative integer argument +offset+, selects the search position by counting backward +from the end of +self+: + + 'foo'.index('o', -1) # => 2 + 'foo'.index('o', -2) # => 1 + 'foo'.index('o', -3) # => 1 + 'foo'.index('o', -4) # => nil + 'foo'.index(/o./, -2) # => 1 + 'foo'.index(/.o/, -2) # => 1 + +Related: see {Querying}[rdoc-ref:String@Querying]. 
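A minimal sketch, not part of the patch, of how the +offset+ argument described for String#index can be used to walk through every match. The helper name `all_indexes` is hypothetical and exists only for this illustration:

```ruby
# Collects the character index of every occurrence of pattern in string.
# `all_indexes` is a hypothetical helper, not a String method.
def all_indexes(string, pattern)
  positions = []
  offset = 0
  # String#index returns nil once no match starts at or after offset.
  while (found = string.index(pattern, offset))
    positions << found
    offset = found + 1 # Resume just past the start of the previous match.
  end
  positions
end

all_indexes('abracadabra', 'a') # => [0, 3, 5, 7, 10]
all_indexes('foo', /o/)         # => [1, 2]
all_indexes('foo', 'x')         # => []
```

Because each returned index is relative to the beginning of the string, the positions can be collected directly without adjusting for the offset.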
diff --git a/doc/string/insert.rdoc b/doc/string/insert.rdoc new file mode 100644 index 0000000000..73205f2069 --- /dev/null +++ b/doc/string/insert.rdoc @@ -0,0 +1,15 @@ +Inserts the given +other_string+ into +self+; returns +self+. + +If the given +index+ is non-negative, inserts +other_string+ at offset +index+: + + 'foo'.insert(0, 'bar') # => "barfoo" + 'foo'.insert(1, 'bar') # => "fbaroo" + 'foo'.insert(3, 'bar') # => "foobar" + 'ã“ã‚“ã«ã¡ã¯'.insert(2, 'bar') # => "ã“ã‚“barã«ã¡ã¯" + +If the +index+ is negative, counts backward from the end of +self+ +and inserts +other_string+ _after_ the offset: + + 'foo'.insert(-2, 'bar') # => "fobaro" + +Related: see {Modifying}[rdoc-ref:String@Modifying]. diff --git a/doc/string/inspect.rdoc b/doc/string/inspect.rdoc new file mode 100644 index 0000000000..907828c2af --- /dev/null +++ b/doc/string/inspect.rdoc @@ -0,0 +1,38 @@ +Returns a printable version of +self+, enclosed in double-quotes. + +Most printable characters are rendered simply as themselves: + + 'abc'.inspect # => "\"abc\"" + '012'.inspect # => "\"012\"" + ''.inspect # => "\"\"" + "\u000012".inspect # => "\"\\u000012\"" + 'ã“ã‚“ã«ã¡ã¯'.inspect # => "\"ã“ã‚“ã«ã¡ã¯\"" + +But printable characters double-quote (<tt>'"'</tt>) and backslash and (<tt>'\\'</tt>) are escaped: + + '"'.inspect # => "\"\\\"\"" + '\\'.inspect # => "\"\\\\\"" + +Unprintable characters are the {ASCII characters}[https://en.wikipedia.org/wiki/ASCII] +whose values are in range <tt>0..31</tt>, +along with the character whose value is +127+. + +Most of these characters are rendered thus: + + 0.chr.inspect # => "\"\\x00\"" + 1.chr.inspect # => "\"\\x01\"" + 2.chr.inspect # => "\"\\x02\"" + # ... 
+ +A few, however, have special renderings: + + 7.chr.inspect # => "\"\\a\"" # BEL + 8.chr.inspect # => "\"\\b\"" # BS + 9.chr.inspect # => "\"\\t\"" # TAB + 10.chr.inspect # => "\"\\n\"" # LF + 11.chr.inspect # => "\"\\v\"" # VT + 12.chr.inspect # => "\"\\f\"" # FF + 13.chr.inspect # => "\"\\r\"" # CR + 27.chr.inspect # => "\"\\e\"" # ESC + +Related: see {Converting to Non-String}[rdoc-ref:String@Converting+to+Non--5CString]. diff --git a/doc/string/intern.rdoc b/doc/string/intern.rdoc new file mode 100644 index 0000000000..eded6ac3d7 --- /dev/null +++ b/doc/string/intern.rdoc @@ -0,0 +1,8 @@ +Returns the Symbol object derived from +self+, +creating it if it did not already exist: + + 'foo'.intern # => :foo + 'ã“ã‚“ã«ã¡ã¯'.intern # => :ã“ã‚“ã«ã¡ã¯ + +Related: see {Converting to Non-String}[rdoc-ref:String@Converting+to+Non--5CString]. + diff --git a/doc/string/length.rdoc b/doc/string/length.rdoc new file mode 100644 index 0000000000..eb68edb10c --- /dev/null +++ b/doc/string/length.rdoc @@ -0,0 +1,11 @@ +Returns the count of characters (not bytes) in +self+: + + 'foo'.length # => 3 + 'ã“ã‚“ã«ã¡ã¯'.length # => 5 + +Contrast with String#bytesize: + + 'foo'.bytesize # => 3 + 'ã“ã‚“ã«ã¡ã¯'.bytesize # => 15 + +Related: see {Querying}[rdoc-ref:String@Querying]. diff --git a/doc/string/ljust.rdoc b/doc/string/ljust.rdoc new file mode 100644 index 0000000000..a8ca62ee76 --- /dev/null +++ b/doc/string/ljust.rdoc @@ -0,0 +1,13 @@ +Returns a copy of +self+, left-justified and, if necessary, right-padded with the +pad_string+: + + 'hello'.ljust(10) # => "hello " + ' hello'.ljust(10) # => " hello " + 'hello'.ljust(10, 'ab') # => "helloababa" + 'ã“ã‚“ã«ã¡ã¯'.ljust(10) # => "ã“ã‚“ã«ã¡ã¯ " + +If <tt>width <= self.length</tt>, returns a copy of +self+: + + 'hello'.ljust(5) # => "hello" + 'hello'.ljust(1) # => "hello" # Does not truncate to width. + +Related: see {Converting to New String}[rdoc-ref:String@Converting+to+New+String]. 
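A small sketch, not part of the patch, of a typical use of String#ljust as documented above: padding keys to a common width so that values line up. The sample data in `rows` is made up for the illustration:

```ruby
# Hypothetical sample data; any key/value pairs would do.
rows = [['name', 'Ruby'], ['version', '3.4']]

# Width of the longest key, in characters (String#length, not bytesize).
width = rows.map {|key, _| key.length }.max

lines = rows.map {|key, value| "#{key.ljust(width)} : #{value}" }
lines # => ["name    : Ruby", "version : 3.4"]
```

Using +length+ rather than +bytesize+ for the width matters for multi-byte text, since +ljust+ counts characters.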
diff --git a/doc/string/new.rdoc b/doc/string/new.rdoc new file mode 100644 index 0000000000..e2752d6e1f --- /dev/null +++ b/doc/string/new.rdoc @@ -0,0 +1,51 @@ +Returns a new \String object containing the given +string+. + +The +options+ are optional keyword options (see below). + +With no argument given and keyword +encoding+ also not given, +returns an empty string with the Encoding <tt>ASCII-8BIT</tt>: + + s = String.new # => "" + s.encoding # => #<Encoding:ASCII-8BIT> + +With argument +string+ given and keyword option +encoding+ not given, +returns a new string with the same encoding as +string+: + + s0 = 'foo'.encode(Encoding::UTF_16) + s1 = String.new(s0) + s1.encoding # => #<Encoding:UTF-16 (dummy)> + +(Unlike \String.new, +a {string literal}[rdoc-ref:syntax/literals.rdoc@String+Literals] like <tt>''</tt> or a +{here document literal}[rdoc-ref:syntax/literals.rdoc@Here+Document+Literals] +always has {script encoding}[rdoc-ref:encodings.rdoc@Script+Encoding].) + +With keyword option +encoding+ given, +returns a string with the specified encoding; +the +encoding+ may be an Encoding object, an encoding name, +or an encoding name alias: + + String.new(encoding: Encoding::US_ASCII).encoding # => #<Encoding:US-ASCII> + String.new('', encoding: Encoding::US_ASCII).encoding # => #<Encoding:US-ASCII> + String.new('foo', encoding: Encoding::US_ASCII).encoding # => #<Encoding:US-ASCII> + String.new('foo', encoding: 'US-ASCII').encoding # => #<Encoding:US-ASCII> + String.new('foo', encoding: 'ASCII').encoding # => #<Encoding:US-ASCII> + +The given encoding need not be valid for the string's content, +and its validity is not checked: + + s = String.new('ã“ã‚“ã«ã¡ã¯', encoding: 'ascii') + s.valid_encoding? # => false + +But the given +encoding+ itself is checked: + + String.new('foo', encoding: 'bar') # Raises ArgumentError. 
+ +With keyword option +capacity+ given, +the given value is advisory only, +and may or may not set the size of the internal buffer, +which may in turn affect performance: + + String.new('foo', capacity: 1) # Buffer size is at least 4 (includes terminal null byte). + String.new('foo', capacity: 4096) # Buffer size is at least 4; + # may be equal to, greater than, or less than 4096. diff --git a/doc/string/ord.rdoc b/doc/string/ord.rdoc new file mode 100644 index 0000000000..87b469db02 --- /dev/null +++ b/doc/string/ord.rdoc @@ -0,0 +1,7 @@ +Returns the integer ordinal of the first character of +self+: + + 'h'.ord # => 104 + 'hello'.ord # => 104 + 'ã“ã‚“ã«ã¡ã¯'.ord # => 12371 + +Related: see {Converting to Non-String}[rdoc-ref:String@Converting+to+Non--5CString]. diff --git a/doc/string/partition.rdoc b/doc/string/partition.rdoc new file mode 100644 index 0000000000..614ad029d4 --- /dev/null +++ b/doc/string/partition.rdoc @@ -0,0 +1,43 @@ +Returns a 3-element array of substrings of +self+. + +If +pattern+ is matched, returns the array: + + [pre_match, first_match, post_match] + +where: + +- +first_match+ is the first-found matching substring. +- +pre_match+ and +post_match+ are the preceding and following substrings. + +If +pattern+ is not matched, returns the array: + + [self.dup, "", ""] + +Note that in the examples below, a returned string <tt>'hello'</tt> +is a copy of +self+, not +self+. 
+ +If +pattern+ is a Regexp, performs the equivalent of <tt>self.match(pattern)</tt> +(also setting {matched-data variables}[rdoc-ref:language/globals.md@Matched+Data]): + + 'hello'.partition(/h/) # => ["", "h", "ello"] + 'hello'.partition(/l/) # => ["he", "l", "lo"] + 'hello'.partition(/l+/) # => ["he", "ll", "o"] + 'hello'.partition(/o/) # => ["hell", "o", ""] + 'hello'.partition(/^/) # => ["", "", "hello"] + 'hello'.partition(//) # => ["", "", "hello"] + 'hello'.partition(/$/) # => ["hello", "", ""] + 'hello'.partition(/x/) # => ["hello", "", ""] + +If +pattern+ is not a Regexp, converts it to a string (if it is not already one), +then performs the equivalent of <tt>self.index(pattern)</tt> +(and does _not_ set {matched-data global variables}[rdoc-ref:language/globals.md@Matched+Data]): + + 'hello'.partition('h') # => ["", "h", "ello"] + 'hello'.partition('l') # => ["he", "l", "lo"] + 'hello'.partition('ll') # => ["he", "ll", "o"] + 'hello'.partition('o') # => ["hell", "o", ""] + 'hello'.partition('') # => ["", "", "hello"] + 'hello'.partition('x') # => ["hello", "", ""] + 'ã“ã‚“ã«ã¡ã¯'.partition('ã«') # => ["ã“ã‚“", "ã«", "ã¡ã¯"] + +Related: see {Converting to Non-String}[rdoc-ref:String@Converting+to+Non--5CString]. diff --git a/doc/string/rindex.rdoc b/doc/string/rindex.rdoc new file mode 100644 index 0000000000..2b81c3716d --- /dev/null +++ b/doc/string/rindex.rdoc @@ -0,0 +1,51 @@ +Returns the integer position of the _last_ substring that matches the given argument +pattern+, +or +nil+ if none found. 
+ +When +pattern+ is a string, returns the index of the last matching substring in +self+: + + 'foo'.rindex('f') # => 0 + 'foo'.rindex('o') # => 2 + 'foo'.rindex('oo') # => 1 + 'foo'.rindex('ooo') # => nil + 'こんにちは'.rindex('ち') # => 3 + +When +pattern+ is a Regexp, returns the index of the last match in +self+: + + 'foo'.rindex(/f/) # => 0 + 'foo'.rindex(/o/) # => 2 + 'foo'.rindex(/oo/) # => 1 + 'foo'.rindex(/ooo/) # => nil + +When +offset+ is non-negative, it specifies the last position +at which a match may start: + + 'foo'.rindex('o', 0) # => nil + 'foo'.rindex('o', 1) # => 1 + 'foo'.rindex('o', 2) # => 2 + 'foo'.rindex('o', 3) # => 2 + +With negative integer argument +offset+, +selects the search position by counting backward from the end of +self+: + + 'foo'.rindex('o', -1) # => 2 + 'foo'.rindex('o', -2) # => 1 + 'foo'.rindex('o', -3) # => nil + 'foo'.rindex('o', -4) # => nil + +The last match is the one that starts at the last possible position, +not the last of the longest matches: + + 'foo'.rindex(/o+/) # => 2 + $~ # => #<MatchData "o"> + +To get the last longest match, combine with negative lookbehind: + + 'foo'.rindex(/(?<!o)o+/) # => 1 + $~ # => #<MatchData "oo"> + +Or use String#index with negative lookahead: + + 'foo'.index(/o+(?!.*o)/) # => 1 + $~ # => #<MatchData "oo"> + +Related: see {Querying}[rdoc-ref:String@Querying]. diff --git a/doc/string/rjust.rdoc b/doc/string/rjust.rdoc new file mode 100644 index 0000000000..acd3f198d4 --- /dev/null +++ b/doc/string/rjust.rdoc @@ -0,0 +1,17 @@ +Returns a right-justified copy of +self+.
+ +If integer argument +width+ is greater than the size (in characters) of +self+, +returns a new string of length +width+ that is a copy of +self+, +right justified and padded on the left with +pad_string+: + + 'hello'.rjust(10) # => " hello" + 'hello '.rjust(10) # => " hello " + 'hello'.rjust(10, 'ab') # => "ababahello" + 'ã“ã‚“ã«ã¡ã¯'.rjust(10) # => " ã“ã‚“ã«ã¡ã¯" + +If <tt>width <= self.size</tt>, returns a copy of +self+: + + 'hello'.rjust(5, 'ab') # => "hello" + 'hello'.rjust(1, 'ab') # => "hello" + +Related: see {Converting to New String}[rdoc-ref:String@Converting+to+New+String]. diff --git a/doc/string/rpartition.rdoc b/doc/string/rpartition.rdoc new file mode 100644 index 0000000000..eed03949a5 --- /dev/null +++ b/doc/string/rpartition.rdoc @@ -0,0 +1,47 @@ +Returns a 3-element array of substrings of +self+. + +Searches +self+ for a match of +pattern+, seeking the _last_ match. + +If +pattern+ is not matched, returns the array: + + ["", "", self.dup] + +If +pattern+ is matched, returns the array: + + [pre_match, last_match, post_match] + +where: + +- +last_match+ is the last-found matching substring. +- +pre_match+ and +post_match+ are the preceding and following substrings. + +The pattern used is: + +- +pattern+ itself, if it is a Regexp. +- <tt>Regexp.quote(pattern)</tt>, if +pattern+ is a string. + +Note that in the examples below, a returned string <tt>'hello'</tt> is a copy of +self+, not +self+. 
+ +If +pattern+ is a Regexp, searches for the last matching substring +(also setting {matched-data global variables}[rdoc-ref:language/globals.md@Matched+Data]): + + 'hello'.rpartition(/l/) # => ["hel", "l", "o"] + 'hello'.rpartition(/ll/) # => ["he", "ll", "o"] + 'hello'.rpartition(/h/) # => ["", "h", "ello"] + 'hello'.rpartition(/o/) # => ["hell", "o", ""] + 'hello'.rpartition(//) # => ["hello", "", ""] + 'hello'.rpartition(/x/) # => ["", "", "hello"] + 'ã“ã‚“ã«ã¡ã¯'.rpartition(/ã«/) # => ["ã“ã‚“", "ã«", "ã¡ã¯"] + +If +pattern+ is not a Regexp, converts it to a string (if it is not already one), +then searches for the last matching substring +(and does _not_ set {matched-data global variables}[rdoc-ref:language/globals.md@Matched+Data]): + + 'hello'.rpartition('l') # => ["hel", "l", "o"] + 'hello'.rpartition('ll') # => ["he", "ll", "o"] + 'hello'.rpartition('h') # => ["", "h", "ello"] + 'hello'.rpartition('o') # => ["hell", "o", ""] + 'hello'.rpartition('') # => ["hello", "", ""] + 'ã“ã‚“ã«ã¡ã¯'.rpartition('ã«') # => ["ã“ã‚“", "ã«", "ã¡ã¯"] + +Related: see {Converting to Non-String}[rdoc-ref:String@Converting+to+Non--5CString]. diff --git a/doc/string/scan.rdoc b/doc/string/scan.rdoc new file mode 100644 index 0000000000..04a2b02ff4 --- /dev/null +++ b/doc/string/scan.rdoc @@ -0,0 +1,35 @@ +Matches a pattern against +self+: + +- If +pattern+ is a Regexp, the pattern used is +pattern+ itself. +- If +pattern+ is a string, the pattern used is <tt>Regexp.quote(pattern)</tt>. + +Generates a collection of matching results +and updates {regexp-related global variables}[rdoc-ref:Regexp@Global+Variables]: + +- If the pattern contains no groups, each result is a matched substring. +- If the pattern contains groups, each result is an array + containing a matched substring for each group. 
+ +With no block given, returns an array of the results: + + 'cruel world'.scan(/\w+/) # => ["cruel", "world"] + 'cruel world'.scan(/.../) # => ["cru", "el ", "wor"] + 'cruel world'.scan(/(...)/) # => [["cru"], ["el "], ["wor"]] + 'cruel world'.scan(/(..)(..)/) # => [["cr", "ue"], ["l ", "wo"]] + 'こんにちは'.scan(/../) # => ["こん", "にち"] + 'abracadabra'.scan('ab') # => ["ab", "ab"] + 'abracadabra'.scan('nosuch') # => [] + +With a block given, calls the block with each result; returns +self+: + + 'cruel world'.scan(/\w+/) {|w| p w } + # => "cruel" + # => "world" + 'cruel world'.scan(/(.)(.)/) {|x, y| p [x, y] } + # => ["c", "r"] + # => ["u", "e"] + # => ["l", " "] + # => ["w", "o"] + # => ["r", "l"] + +Related: see {Converting to Non-String}[rdoc-ref:String@Converting+to+Non--5CString]. diff --git a/doc/string/scrub.rdoc b/doc/string/scrub.rdoc new file mode 100644 index 0000000000..5ace376cdb --- /dev/null +++ b/doc/string/scrub.rdoc @@ -0,0 +1,22 @@ +Returns a copy of +self+ with each invalid byte sequence replaced +by the given +replacement_string+. + +With no block given, replaces each invalid sequence +with the given +default_replacement_string+ +(by default, <tt>"�"</tt> for a Unicode encoding, <tt>'?'</tt> otherwise): + + "foo\x81\x81bar".scrub # => "foo��bar" + "foo\x81\x81bar".force_encoding('US-ASCII').scrub # => "foo??bar" + "foo\x81\x81bar".scrub('xyzzy') # => "fooxyzzyxyzzybar" + +With a block given, calls the block with each invalid sequence, +and replaces that sequence with the return value of the block: + + "foo\x81\x81bar".scrub {|sequence| p sequence; 'XYZZY' } # => "fooXYZZYXYZZYbar" + +Output: + + "\x81" + "\x81" + +Related: see {Converting to New String}[rdoc-ref:String@Converting+to+New+String].
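A short sketch, not part of the patch, of how String#scrub is commonly paired with String#valid_encoding? when cleaning input that may contain invalid bytes. The variable names are chosen only for this illustration:

```ruby
raw = "foo\x81\x81bar"  # UTF-8 string containing two invalid bytes.
raw.valid_encoding?     # => false

# Replace each invalid sequence with a fixed string.
clean = raw.scrub('?')
clean                   # => "foo??bar"
clean.valid_encoding?   # => true

# Or use a block to keep a readable trace of the bad bytes.
hex = raw.scrub {|bytes| "<#{bytes.unpack1('H*')}>" }
hex                     # => "foo<81><81>bar"

raw.valid_encoding?     # => false # scrub returns a copy; raw is unchanged.
```

The block form is handy for logging, since the original invalid bytes would otherwise be lost in the cleaned copy.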
diff --git a/doc/string/split.rdoc b/doc/string/split.rdoc new file mode 100644 index 0000000000..1aee1de0a4 --- /dev/null +++ b/doc/string/split.rdoc @@ -0,0 +1,101 @@ +Creates an array of substrings by splitting +self+ +at each occurrence of the given field separator +field_sep+. + +With no arguments given, +splits using the field separator <tt>$;</tt>, +whose default value is +nil+. + +With no block given, returns the array of substrings: + + 'abracadabra'.split('a') # => ["", "br", "c", "d", "br"] + +When +field_sep+ is +nil+ or <tt>' '</tt> (a single space), +splits at each sequence of whitespace: + + 'foo bar baz'.split(nil) # => ["foo", "bar", "baz"] + 'foo bar baz'.split(' ') # => ["foo", "bar", "baz"] + "foo \n\tbar\t\n baz".split(' ') # => ["foo", "bar", "baz"] + 'foo bar baz'.split(' ') # => ["foo", "bar", "baz"] + ''.split(' ') # => [] + +When +field_sep+ is an empty string, +splits at every character: + + 'abracadabra'.split('') # => ["a", "b", "r", "a", "c", "a", "d", "a", "b", "r", "a"] + ''.split('') # => [] + 'ã“ã‚“ã«ã¡ã¯'.split('') # => ["ã“", "ã‚“", "ã«", "ã¡", "ã¯"] + +When +field_sep+ is a non-empty string and different from <tt>' '</tt> (a single space), +uses that string as the separator: + + 'abracadabra'.split('a') # => ["", "br", "c", "d", "br"] + 'abracadabra'.split('ab') # => ["", "racad", "ra"] + ''.split('a') # => [] + 'ã“ã‚“ã«ã¡ã¯'.split('ã«') # => ["ã“ã‚“", "ã¡ã¯"] + +When +field_sep+ is a Regexp, +splits at each occurrence of a matching substring: + + 'abracadabra'.split(/ab/) # => ["", "racad", "ra"] + '1 + 1 == 2'.split(/\W+/) # => ["1", "1", "2"] + 'abracadabra'.split(//) # => ["a", "b", "r", "a", "c", "a", "d", "a", "b", "r", "a"] + +If the \Regexp contains groups, their matches are included +in the returned array: + + '1:2:3'.split(/(:)()()/, 2) # => ["1", ":", "", "", "2:3"] + +Argument +limit+ sets a limit on the size of the returned array; +it also determines whether trailing empty strings are included in the returned array. 
+ +When +limit+ is zero, +there is no limit on the size of the array, +but trailing empty strings are omitted: + + 'abracadabra'.split('', 0) # => ["a", "b", "r", "a", "c", "a", "d", "a", "b", "r", "a"] + 'abracadabra'.split('a', 0) # => ["", "br", "c", "d", "br"] # Empty string after last 'a' omitted. + +When +limit+ is a positive integer, +there is a limit on the size of the array (no more than <tt>limit - 1</tt> splits occur), +and trailing empty strings are included: + + 'abracadabra'.split('', 3) # => ["a", "b", "racadabra"] + 'abracadabra'.split('a', 3) # => ["", "br", "cadabra"] + 'abracadabra'.split('', 30) # => ["a", "b", "r", "a", "c", "a", "d", "a", "b", "r", "a", ""] + 'abracadabra'.split('a', 30) # => ["", "br", "c", "d", "br", ""] + 'abracadabra'.split('', 1) # => ["abracadabra"] + 'abracadabra'.split('a', 1) # => ["abracadabra"] + +When +limit+ is negative, +there is no limit on the size of the array, +and trailing empty strings are included: + + 'abracadabra'.split('', -1) # => ["a", "b", "r", "a", "c", "a", "d", "a", "b", "r", "a", ""] + 'abracadabra'.split('a', -1) # => ["", "br", "c", "d", "br", ""] + +If a block is given, calls the block with each substring and returns +self+: + + 'foo bar baz'.split(' ') {|substring| p substring } + +Output: + + "foo" + "bar" + "baz" + +Note that the above example is functionally equivalent to: + + 'foo bar baz'.split(' ').each {|substring| p substring } + +Output: + + "foo" + "bar" + "baz" + +But the latter: + +- Has poorer performance because it creates an intermediate array. +- Returns an array (instead of +self+). + +Related: see {Converting to Non-String}[rdoc-ref:String@Converting+to+Non--5CString]. diff --git a/doc/string/squeeze.rdoc b/doc/string/squeeze.rdoc new file mode 100644 index 0000000000..1a38c08b32 --- /dev/null +++ b/doc/string/squeeze.rdoc @@ -0,0 +1,33 @@ +Returns a copy of +self+ with each tuple (doubling, tripling, etc.) of specified characters +"squeezed" down to a single character.
+ +The tuples to be squeezed are specified by arguments +selectors+, +each of which is a string; +see {Character Selectors}[rdoc-ref:character_selectors.rdoc@Character+Selectors]. + +A single argument may be a single character: + + 'Noooooo!'.squeeze('o') # => "No!" + 'foo bar baz'.squeeze(' ') # => "foo bar baz" + 'Mississippi'.squeeze('s') # => "Misisippi" + 'Mississippi'.squeeze('p') # => "Mississipi" + 'Mississippi'.squeeze('x') # => "Mississippi" # Unused selector character is ignored. + 'беÑÑонница'.squeeze('Ñ') # => "беÑонница" + 'беÑÑонница'.squeeze('н') # => "беÑÑоница" + +A single argument may be a string of characters: + + 'Mississippi'.squeeze('sp') # => "Misisipi" + 'Mississippi'.squeeze('ps') # => "Misisipi" # Order doesn't matter. + 'Mississippi'.squeeze('nonsense') # => "Misisippi" # Unused selector characters are ignored. + +A single argument may be a range of characters: + + 'Mississippi'.squeeze('a-p') # => "Mississipi" + 'Mississippi'.squeeze('q-z') # => "Misisippi" + 'Mississippi'.squeeze('a-z') # => "Misisipi" + +Multiple arguments are allowed; +see {Multiple Character Selectors}[rdoc-ref:character_selectors.rdoc@Multiple+Character+Selectors]. + +Related: see {Converting to New String}[rdoc-ref:String@Converting+to+New+String]. diff --git a/doc/string/start_with_p.rdoc b/doc/string/start_with_p.rdoc new file mode 100644 index 0000000000..f78edc7fa3 --- /dev/null +++ b/doc/string/start_with_p.rdoc @@ -0,0 +1,16 @@ +Returns whether +self+ starts with any of the given +patterns+. + +For each argument, the pattern used is: + +- The pattern itself, if it is a Regexp. +- <tt>Regexp.quote(pattern)</tt>, if it is a string. 
+ +Returns +true+ if any pattern matches the beginning, +false+ otherwise: + + 'hello'.start_with?('hell') # => true + 'hello'.start_with?(/H/i) # => true + 'hello'.start_with?('heaven', 'hell') # => true + 'hello'.start_with?('heaven', 'paradise') # => false + 'ã“ã‚“ã«ã¡ã¯'.start_with?('ã“') # => true + +Related: see {Querying}[rdoc-ref:String@Querying]. diff --git a/doc/string/sub.rdoc b/doc/string/sub.rdoc new file mode 100644 index 0000000000..ff051ea177 --- /dev/null +++ b/doc/string/sub.rdoc @@ -0,0 +1,33 @@ +Returns a copy of self, possibly with a substring replaced. + +Argument +pattern+ may be a string or a Regexp; +argument +replacement+ may be a string or a Hash. + +Varying types for the argument values makes this method very versatile. + +Below are some simple examples; for many more examples, +see {Substitution Methods}[rdoc-ref:String@Substitution+Methods]. + +With arguments +pattern+ and string +replacement+ given, +replaces the first matching substring with the given replacement string: + + s = 'abracadabra' # => "abracadabra" + s.sub('bra', 'xyzzy') # => "axyzzycadabra" + s.sub(/bra/, 'xyzzy') # => "axyzzycadabra" + s.sub('nope', 'xyzzy') # => "abracadabra" + +With arguments +pattern+ and hash +replacement+ given, +replaces the first matching substring with a value from the given replacement hash, or removes it: + + h = {'a' => 'A', 'b' => 'B', 'c' => 'C'} + s.sub('b', h) # => "aBracadabra" + s.sub(/b/, h) # => "aBracadabra" + s.sub(/d/, h) # => "abracaabra" # 'd' removed. + +With argument +pattern+ and a block given, +calls the block with each matching substring; +replaces that substring with the block’s return value: + + s.sub('b') {|match| match.upcase } # => "aBracadabra" + +Related: see {Converting to New String}[rdoc-ref:String@Converting+to+New+String]. 
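A brief sketch, not part of the patch, of two points the String#sub text above implies but does not show: a string replacement may refer to capture groups, and only the first match is replaced (contrast with String#gsub). The receiver is never modified:

```ruby
s = 'abracadabra'

# Back-references \1 and \2 in the replacement refer to the groups in the pattern.
swapped = s.sub(/(b)(r)/, '\2\1')          # => "arbacadabra"

# sub touches only the first match; gsub touches every match.
first = s.sub(/[aeiou]/) {|m| m.upcase }   # => "Abracadabra"
every = s.gsub(/[aeiou]/) {|m| m.upcase }  # => "AbrAcAdAbrA"

s                                          # => "abracadabra" # Unchanged.
```

Single quotes around the replacement keep the backslashes literal, so <tt>'\2\1'</tt> reaches +sub+ as written.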
diff --git a/doc/string/succ.rdoc b/doc/string/succ.rdoc new file mode 100644 index 0000000000..1b4b936a8e --- /dev/null +++ b/doc/string/succ.rdoc @@ -0,0 +1,52 @@ +Returns the successor to +self+. The successor is calculated by +incrementing characters. + +The first character to be incremented is the rightmost alphanumeric: +or, if no alphanumerics, the rightmost character: + + 'THX1138'.succ # => "THX1139" + '<<koala>>'.succ # => "<<koalb>>" + '***'.succ # => '**+' + 'ã“ã‚“ã«ã¡ã¯'.succ # => "ã“ã‚“ã«ã¡ã°" + +The successor to a digit is another digit, "carrying" to the next-left +character for a "rollover" from 9 to 0, and prepending another digit +if necessary: + + '00'.succ # => "01" + '09'.succ # => "10" + '99'.succ # => "100" + +The successor to a letter is another letter of the same case, +carrying to the next-left character for a rollover, +and prepending another same-case letter if necessary: + + 'aa'.succ # => "ab" + 'az'.succ # => "ba" + 'zz'.succ # => "aaa" + 'AA'.succ # => "AB" + 'AZ'.succ # => "BA" + 'ZZ'.succ # => "AAA" + +The successor to a non-alphanumeric character is the next character +in the underlying character set's collating sequence, +carrying to the next-left character for a rollover, +and prepending another character if necessary: + + s = 0.chr * 3 # => "\x00\x00\x00" + s.succ # => "\x00\x00\x01" + s = 255.chr * 3 # => "\xFF\xFF\xFF" + s.succ # => "\x01\x00\x00\x00" + +Carrying can occur between and among mixtures of alphanumeric characters: + + s = 'zz99zz99' # => "zz99zz99" + s.succ # => "aaa00aa00" + s = '99zz99zz' # => "99zz99zz" + s.succ # => "100aa00aa" + +The successor to an empty +String+ is a new empty +String+: + + ''.succ # => "" + +Related: see {Converting to New String}[rdoc-ref:String@Converting+to+New+String]. 
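A small sketch, not part of the patch, that chains String#succ to show the carrying behavior described above across a mix of letters and digits. The starting label <tt>'az8'</tt> is arbitrary:

```ruby
labels = ['az8']

# Each step asks the previous label for its successor.
3.times { labels << labels.last.succ }

labels # => ["az8", "az9", "ba0", "ba1"]
```

The step from <tt>"az9"</tt> to <tt>"ba0"</tt> shows a digit rollover carrying into a letter, which in turn rolls over and carries into the next letter on the left.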
diff --git a/doc/string/sum.rdoc b/doc/string/sum.rdoc new file mode 100644 index 0000000000..22045e5f4d --- /dev/null +++ b/doc/string/sum.rdoc @@ -0,0 +1,12 @@ +Returns a basic +n+-bit {checksum}[https://en.wikipedia.org/wiki/Checksum] of the characters in +self+; +the checksum is the sum of the binary value of each byte in +self+, +modulo <tt>2**n - 1</tt>: + + 'hello'.sum # => 532 + 'hello'.sum(4) # => 4 + 'hello'.sum(64) # => 532 + 'ã“ã‚“ã«ã¡ã¯'.sum # => 2582 + +This is not a particularly strong checksum. + +Related: see {Querying}[rdoc-ref:String@Querying]. diff --git a/doc/string/swapcase.rdoc b/doc/string/swapcase.rdoc new file mode 100644 index 0000000000..4353c8528a --- /dev/null +++ b/doc/string/swapcase.rdoc @@ -0,0 +1,31 @@ +Returns a string containing the characters in +self+, with cases reversed: + +- Each uppercase character is downcased. +- Each lowercase character is upcased. + +Examples: + + 'Hello'.swapcase # => "hELLO" + 'Straße'.swapcase # => "sTRASSE" + 'RubyGems.org'.swapcase # => "rUBYgEMS.ORG" + +The sizes of +self+ and the upcased result may differ: + + s = 'Straße' + s.size # => 6 + s.swapcase # => "sTRASSE" + s.swapcase.size # => 7 + +Some characters (and some character sets) do not have upcase and downcase versions; +see {Case Mapping}[rdoc-ref:case_mapping.rdoc]: + + s = '1, 2, 3, ...' + s.swapcase == s # => true + s = 'ã“ã‚“ã«ã¡ã¯' + s.swapcase == s # => true + +The casing is affected by the given +mapping+, +which may be +:ascii+, +:fold+, or +:turkic+; +see {Case Mappings}[rdoc-ref:case_mapping.rdoc@Case+Mappings]. + +Related: see {Converting to New String}[rdoc-ref:String@Converting+to+New+String]. diff --git a/doc/string/unicode_normalize.rdoc b/doc/string/unicode_normalize.rdoc new file mode 100644 index 0000000000..5f733c0fb8 --- /dev/null +++ b/doc/string/unicode_normalize.rdoc @@ -0,0 +1,28 @@ +Returns a copy of +self+ with +{Unicode normalization}[https://unicode.org/reports/tr15] applied. 
+ +Argument +form+ must be one of the following symbols +(see {Unicode normalization forms}[https://unicode.org/reports/tr15/#Norm_Forms]): + +- +:nfc+: Canonical decomposition, followed by canonical composition. +- +:nfd+: Canonical decomposition. +- +:nfkc+: Compatibility decomposition, followed by canonical composition. +- +:nfkd+: Compatibility decomposition. + +The encoding of +self+ must be one of: + +- <tt>Encoding::UTF_8</tt>. +- <tt>Encoding::UTF_16BE</tt>. +- <tt>Encoding::UTF_16LE</tt>. +- <tt>Encoding::UTF_32BE</tt>. +- <tt>Encoding::UTF_32LE</tt>. +- <tt>Encoding::GB18030</tt>. +- <tt>Encoding::UCS_2BE</tt>. +- <tt>Encoding::UCS_4BE</tt>. + +Examples: + + "a\u0300".unicode_normalize # => "à" # Lowercase 'a' with grave accent. + "a\u0300".unicode_normalize(:nfd) # => "à" # Same. + +Related: see {Converting to New String}[rdoc-ref:String@Converting+to+New+String]. diff --git a/doc/string/upcase.rdoc b/doc/string/upcase.rdoc new file mode 100644 index 0000000000..ad859e8973 --- /dev/null +++ b/doc/string/upcase.rdoc @@ -0,0 +1,27 @@ +Returns a new string containing the upcased characters in +self+: + + 'hello'.upcase # => "HELLO" + 'straße'.upcase # => "STRASSE" + 'привет'.upcase # => "ПРИВЕТ" + 'RubyGems.org'.upcase # => "RUBYGEMS.ORG" + +The sizes of +self+ and the upcased result may differ: + + s = 'Straße' + s.size # => 6 + s.upcase # => "STRASSE" + s.upcase.size # => 7 + +Some characters (and some character sets) do not have upcase and downcase versions; +see {Case Mapping}[rdoc-ref:case_mapping.rdoc]: + + s = '1, 2, 3, ...' + s.upcase == s # => true + s = 'こんにちは' + s.upcase == s # => true + +The casing is affected by the given +mapping+, +which may be +:ascii+, +:fold+, or +:turkic+; +see {Case Mappings}[rdoc-ref:case_mapping.rdoc@Case+Mappings]. + +Related: see {Converting to New String}[rdoc-ref:String@Converting+to+New+String].
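A minimal sketch, not part of the patch, of how a +mapping+ argument changes the result of String#upcase. Only the +:ascii+ mapping is shown here; it restricts the conversion to the letters a-z, so <tt>'ß'</tt> is left alone and the size does not change:

```ruby
s = 'straße'

full  = s.upcase          # => "STRASSE" # Full Unicode mapping expands 'ß'.
ascii = s.upcase(:ascii)  # => "STRAßE"  # Only ASCII letters are upcased.

full.size                 # => 7
ascii.size                # => 6
```

This is one way to keep the result the same length as the receiver when that matters, at the cost of leaving non-ASCII letters untouched.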
diff --git a/doc/string/upto.rdoc b/doc/string/upto.rdoc new file mode 100644 index 0000000000..f860fe84fe --- /dev/null +++ b/doc/string/upto.rdoc @@ -0,0 +1,38 @@ +With a block given, calls the block with each +String+ value +returned by successive calls to String#succ; +the first value is +self+, the next is <tt>self.succ</tt>, and so on; +the sequence terminates when value +other_string+ is reached; +returns +self+: + + a = [] + 'a'.upto('f') {|c| a.push(c) } + a # => ["a", "b", "c", "d", "e", "f"] + + a = [] + 'Ж'.upto('П') {|c| a.push(c) } + a # => ["Ж", "З", "И", "Й", "К", "Л", "М", "Ð", "О", "П"] + + a = [] + 'よ'.upto('ã‚') {|c| a.push(c) } + a # => ["よ", "ら", "り", "ã‚‹", "れ", "ã‚"] + + a = [] + 'a8'.upto('b6') {|c| a.push(c) } + a # => ["a8", "a9", "b0", "b1", "b2", "b3", "b4", "b5", "b6"] + +If argument +exclusive+ is given as a truthy object, the last value is omitted: + + a = [] + 'a'.upto('f', true) {|c| a.push(c) } + a # => ["a", "b", "c", "d", "e"] + +If +other_string+ would not be reached, does not call the block: + + '25'.upto('5') {|s| fail s } + 'aa'.upto('a') {|s| fail s } + +With no block given, returns a new Enumerator: + + 'a8'.upto('b6') # => #<Enumerator: "a8":upto("b6")> + +Related: see {Iterating}[rdoc-ref:String@Iterating]. diff --git a/doc/string/valid_encoding_p.rdoc b/doc/string/valid_encoding_p.rdoc new file mode 100644 index 0000000000..e1db55174a --- /dev/null +++ b/doc/string/valid_encoding_p.rdoc @@ -0,0 +1,8 @@ +Returns whether +self+ is encoded correctly: + + s = 'Straße' + s.valid_encoding? # => true + s.encoding # => #<Encoding:UTF-8> + s.force_encoding(Encoding::ASCII).valid_encoding? # => false + +Related: see {Querying}[rdoc-ref:String@Querying]. 
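A short sketch, not part of the patch, showing that the result of String#valid_encoding? depends only on how the existing bytes are interpreted. It works on a copy (+t+) so that the original string keeps its encoding; the variable names are illustrative:

```ruby
s = 'Straße'
s.valid_encoding?                  # => true

# Reinterpret the same bytes as US-ASCII; the bytes for 'ß' are not valid ASCII.
t = s.dup.force_encoding(Encoding::US_ASCII)
t.valid_encoding?                  # => false
t.bytes == s.bytes                 # => true # No bytes were changed.

# Restoring the original encoding makes the string valid again.
t.force_encoding(Encoding::UTF_8)
t.valid_encoding?                  # => true
t == s                             # => true
```

Since String#force_encoding modifies its receiver, calling it on a duplicate avoids surprising other code that still holds the original string.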
diff --git a/doc/stringio/each_byte.rdoc b/doc/stringio/each_byte.rdoc new file mode 100644 index 0000000000..65e81c53a7 --- /dev/null +++ b/doc/stringio/each_byte.rdoc @@ -0,0 +1,34 @@ +With a block given, calls the block with each remaining byte in the stream; +positions the stream at end-of-file; +returns +self+: + + bytes = [] + strio = StringIO.new('hello') # Five 1-byte characters. + strio.each_byte {|byte| bytes.push(byte) } + strio.eof? # => true + bytes # => [104, 101, 108, 108, 111] + bytes = [] + strio = StringIO.new('теÑÑ‚') # Four 2-byte characters. + strio.each_byte {|byte| bytes.push(byte) } + bytes # => [209, 130, 208, 181, 209, 129, 209, 130] + bytes = [] + strio = StringIO.new('ã“ã‚“ã«ã¡ã¯') # Five 3-byte characters. + strio.each_byte {|byte| bytes.push(byte) } + bytes # => [227, 129, 147, 227, 130, 147, 227, 129, 171, 227, 129, 161, 227, 129, 175] + +The position in the stream matters: + + bytes = [] + strio = StringIO.new('ã“ã‚“ã«ã¡ã¯') + strio.getc # => "ã“" + strio.pos # => 3 # 3-byte character was read. + strio.each_byte {|byte| bytes.push(byte) } + bytes # => [227, 130, 147, 227, 129, 171, 227, 129, 161, 227, 129, 175] + +If at end-of-file, does not call the block: + + strio.eof? # => true + strio.each_byte {|byte| fail 'Boo!' } + strio.eof? # => true + +With no block given, returns a new {Enumerator}[rdoc-ref:Enumerator]. diff --git a/doc/stringio/each_char.rdoc b/doc/stringio/each_char.rdoc new file mode 100644 index 0000000000..d0b5e4082c --- /dev/null +++ b/doc/stringio/each_char.rdoc @@ -0,0 +1,34 @@ +With a block given, calls the block with each remaining character in the stream; +positions the stream at end-of-file; +returns +self+: + + chars = [] + strio = StringIO.new('hello') + strio.each_char {|char| chars.push(char) } + strio.eof? 
# => true + chars # => ["h", "e", "l", "l", "o"] + chars = [] + strio = StringIO.new('тест') + strio.each_char {|char| chars.push(char) } + chars # => ["т", "е", "с", "т"] + chars = [] + strio = StringIO.new('こんにちは') + strio.each_char {|char| chars.push(char) } + chars # => ["こ", "ん", "に", "ち", "は"] + +Stream position matters: + + chars = [] + strio = StringIO.new('こんにちは') + strio.getc # => "こ" + strio.pos # => 3 # 3-byte character was read. + strio.each_char {|char| chars.push(char) } + chars # => ["ん", "に", "ち", "は"] + +When at end-of-stream, does not call the block: + + strio.eof? # => true + strio.each_char {|char| fail 'Boo!' } + strio.eof? # => true + +With no block given, returns a new {Enumerator}[rdoc-ref:Enumerator]. diff --git a/doc/stringio/each_codepoint.rdoc b/doc/stringio/each_codepoint.rdoc new file mode 100644 index 0000000000..ede16de599 --- /dev/null +++ b/doc/stringio/each_codepoint.rdoc @@ -0,0 +1,36 @@ +With a block given, calls the block with each successive codepoint from +self+; +sets the position to end-of-stream; +returns +self+. + +Each codepoint is the integer value for a character: + + codepoints = [] + strio = StringIO.new('hello') + strio.each_codepoint {|codepoint| codepoints.push(codepoint) } + strio.eof? # => true + codepoints # => [104, 101, 108, 108, 111] + codepoints = [] + strio = StringIO.new('тест') + strio.each_codepoint {|codepoint| codepoints.push(codepoint) } + codepoints # => [1090, 1077, 1089, 1090] + codepoints = [] + strio = StringIO.new('こんにちは') + strio.each_codepoint {|codepoint| codepoints.push(codepoint) } + codepoints # => [12371, 12435, 12395, 12385, 12399] + +Position in the stream matters: + + codepoints = [] + strio = StringIO.new('こんにちは') + strio.getc # => "こ" + strio.pos # => 3 + strio.each_codepoint {|codepoint| codepoints.push(codepoint) } + codepoints # => [12435, 12395, 12385, 12399] + +When at end-of-stream, the block is not called: + + strio.eof?
# => true + strio.each_codepoint {|codepoint| fail 'Boo!' } + strio.eof? # => true + +With no block given, returns a new {Enumerator}[rdoc-ref:Enumerator]. diff --git a/doc/stringio/each_line.md b/doc/stringio/each_line.md new file mode 100644 index 0000000000..e29640a12a --- /dev/null +++ b/doc/stringio/each_line.md @@ -0,0 +1,189 @@ +With a block given, calls the block with each remaining line (see "Position" below) in the stream; +returns `self`. + +Leaves stream position at end-of-stream. + +**No Arguments** + +With no arguments given, +reads lines using the default record separator +(global variable `$/`, whose initial value is `"\n"`). + +```ruby +strio = StringIO.new(TEXT) +strio.each_line {|line| p line } +strio.eof? # => true +``` + +Output: + +``` +"First line\n" +"Second line\n" +"\n" +"Fourth line\n" +"Fifth line\n" +``` + +**Argument `sep`** + +With only string argument `sep` given, +reads lines using that string as the record separator: + +```ruby +strio = StringIO.new(TEXT) +strio.each_line(' ') {|line| p line } +``` + +Output: + +``` +"First " +"line\nSecond " +"line\n\nFourth " +"line\nFifth " +"line\n" +``` + +**Argument `limit`** + +With only integer argument `limit` given, +reads lines using the default record separator; +also limits the size (in characters) of each line to the given limit: + +```ruby +strio = StringIO.new(TEXT) +strio.each_line(10) {|line| p line } +``` + +Output: + +``` +"First line" +"\n" +"Second lin" +"e\n" +"\n" +"Fourth lin" +"e\n" +"Fifth line" +"\n" +``` + +**Arguments `sep` and `limit`** + +With arguments `sep` and `limit` both given, +honors both: + +```ruby +strio = StringIO.new(TEXT) +strio.each_line(' ', 10) {|line| p line } +``` + +Output: + +``` +"First " +"line\nSecon" +"d " +"line\n\nFour" +"th " +"line\nFifth" +" " +"line\n" +``` + +**Position** + +As stated above, method `each_line` calls the block with each _remaining_ line in the stream.
+ +In the examples above each `strio` object starts with its position at beginning-of-stream; +but in other cases the position may be anywhere (see StringIO#pos): + +```ruby +strio = StringIO.new(TEXT) +strio.pos = 30 # Set stream position to byte offset 30. +strio.each_line {|line| p line } +``` + +Output: + +``` +" line\n" +"Fifth line\n" +``` + +In all the examples above, the stream position is at the beginning of a character; +in other cases, that need not be so: + +```ruby +s = 'こんにちは' # Five 3-byte characters. +strio = StringIO.new(s) +strio.pos = 3 # At beginning of second character. +strio.each_line {|line| p line } +strio.pos = 4 # At second byte of second character. +strio.each_line {|line| p line } +strio.pos = 5 # At third byte of second character. +strio.each_line {|line| p line } +``` + +Output: + +``` +"んにちは" +"\x82\x93にちは" +"\x93にちは" +``` + +**Special Record Separators** + +Like some methods in class `IO`, method `each_line` honors two special record separators; +see {Special Line Separators}[https://docs.ruby-lang.org/en/master/IO.html#class-IO-label-Special+Line+Separator+Values]. + +```ruby +strio = StringIO.new(TEXT) +strio.each_line('') {|line| p line } # Read as paragraphs (separated by blank lines). +``` + +Output: + +``` +"First line\nSecond line\n\n" +"Fourth line\nFifth line\n" +``` + +```ruby +strio = StringIO.new(TEXT) +strio.each_line(nil) {|line| p line } # "Slurp"; read it all. +``` + +Output: + +``` +"First line\nSecond line\n\nFourth line\nFifth line\n" +``` + +**Keyword Argument `chomp`** + +With keyword argument `chomp` given as `true` (the default is `false`), +removes trailing newline (if any) from each line: + +```ruby +strio = StringIO.new(TEXT) +strio.each_line(chomp: true) {|line| p line } +``` + +Output: + +``` +"First line" +"Second line" +"" +"Fourth line" +"Fifth line" +``` + +With no block given, returns a new {Enumerator}[https://docs.ruby-lang.org/en/master/Enumerator.html].
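The no-block form described above can be sketched as follows. This is a minimal illustration, not part of the original file: the two-line sample string and the variable names `enum` and `numbered` are assumptions made for the example.

```ruby
require 'stringio'

# Hypothetical two-line sample; any string would do.
strio = StringIO.new("First line\nSecond line\n")

# With no block, each_line returns an Enumerator instead of iterating immediately.
enum = strio.each_line

# Enumerator methods such as with_index then drive the iteration.
numbered = enum.with_index(1).map {|line, i| "#{i}: #{line.chomp}" }
p numbered
```

Because the enumerator reads from the stream when it is consumed, the stream should be at end-of-stream once `map` has finished.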
+ + +Related: StringIO#each_byte, StringIO#each_char, StringIO#each_codepoint. diff --git a/doc/stringio/getbyte.rdoc b/doc/stringio/getbyte.rdoc new file mode 100644 index 0000000000..5e524941bc --- /dev/null +++ b/doc/stringio/getbyte.rdoc @@ -0,0 +1,31 @@ +Reads and returns the next integer byte (not character) from the stream: + + s = 'foo' + s.bytes # => [102, 111, 111] + strio = StringIO.new(s) + strio.getbyte # => 102 + strio.getbyte # => 111 + strio.getbyte # => 111 + +Returns +nil+ if at end-of-stream: + + strio.eof? # => true + strio.getbyte # => nil + +Returns a byte, not a character: + + s = 'Привет' + s.bytes + # => [208, 159, 209, 128, 208, 184, 208, 178, 208, 181, 209, 130] + strio = StringIO.new(s) + strio.getbyte # => 208 + strio.getbyte # => 159 + + s = 'こんにちは' + s.bytes + # => [227, 129, 147, 227, 130, 147, 227, 129, 171, 227, 129, 161, 227, 129, 175] + strio = StringIO.new(s) + strio.getbyte # => 227 + strio.getbyte # => 129 + +Related: #each_byte, #ungetbyte, #getc. diff --git a/doc/stringio/getc.rdoc b/doc/stringio/getc.rdoc new file mode 100644 index 0000000000..b2ab46843c --- /dev/null +++ b/doc/stringio/getc.rdoc @@ -0,0 +1,34 @@ +Reads and returns the next character (or byte; see below) from the stream: + + strio = StringIO.new('foo') + strio.getc # => "f" + strio.getc # => "o" + strio.getc # => "o" + +Returns +nil+ if at end-of-stream: + + strio.eof? # => true + strio.getc # => nil + +Returns characters, not bytes: + + strio = StringIO.new('Привет') + strio.getc # => "П" + strio.getc # => "р" + + strio = StringIO.new('こんにちは') + strio.getc # => "こ" + strio.getc # => "ん" + +In each of the examples above, the stream is positioned at the beginning of a character; +in other cases that need not be true: + + strio = StringIO.new('こんにちは') # Five 3-byte characters. + strio.pos = 3 # => 3 # At beginning of second character; returns character.
+ strio.getc # => "ん" + strio.pos = 4 # => 4 # At second byte of second character; returns byte. + strio.getc # => "\x82" + strio.pos = 5 # => 5 # At third byte of second character; returns byte. + strio.getc # => "\x93" + +Related: #getbyte, #putc, #ungetc. diff --git a/doc/stringio/gets.rdoc b/doc/stringio/gets.rdoc new file mode 100644 index 0000000000..bbefeb008a --- /dev/null +++ b/doc/stringio/gets.rdoc @@ -0,0 +1,99 @@ +Reads and returns a line from the stream; +returns +nil+ if at end-of-stream. + +Side effects: + +- Increments stream position by the number of bytes read. +- Assigns the return value to global variable <tt>$_</tt>. + +With no arguments given, reads a line using the default record separator +(global variable <tt>$/</tt>, whose initial value is <tt>"\n"</tt>): + + strio = StringIO.new(TEXT) + strio.pos # => 0 + strio.gets # => "First line\n" + strio.pos # => 11 + $_ # => "First line\n" + strio.gets # => "Second line\n" + strio.read # => "\nFourth line\nFifth line\n" + strio.eof?
# => true + strio.gets # => nil + + strio = StringIO.new('Привет') # Six 2-byte characters + strio.pos # => 0 + strio.gets # => "Привет" + strio.pos # => 12 + +<b>Argument +sep+</b> + +With only string argument +sep+ given, reads a line using that string as the record separator: + + strio = StringIO.new(TEXT) + strio.gets(' ') # => "First " + strio.gets(' ') # => "line\nSecond " + strio.gets(' ') # => "line\n\nFourth " + +<b>Argument +limit+</b> + +With only integer argument +limit+ given, +reads a line using the default record separator; +limits the size (in characters) of each line to the given limit: + + strio = StringIO.new(TEXT) + strio.gets(10) # => "First line" + strio.gets(10) # => "\n" + strio.gets(10) # => "Second lin" + strio.gets(10) # => "e\n" + +<b>Arguments +sep+ and +limit+</b> + +With arguments +sep+ and +limit+ both given, honors both: + + strio = StringIO.new(TEXT) + strio.gets(' ', 10) # => "First " + strio.gets(' ', 10) # => "line\nSecon" + strio.gets(' ', 10) # => "d " + +<b>Position</b> + +As stated above, method +gets+ reads and returns the next line in the stream. + +In the examples above each +strio+ object starts with its position at beginning-of-stream; +but in other cases the position may be anywhere: + + strio = StringIO.new(TEXT) + strio.pos = 12 + strio.gets # => "econd line\n" + +The position need not be at a character boundary: + + strio = StringIO.new('Привет') # Six 2-byte characters. + strio.pos = 2 # At beginning of second character. + strio.gets # => "ривет" + strio.pos = 3 # In middle of second character. + strio.gets # => "\x80ивет" + +<b>Special Record Separators</b> + +Like some methods in class IO, method +gets+ honors two special record separators; +see {Special Line Separators}[https://docs.ruby-lang.org/en/master/IO.html#class-IO-label-Special+Line+Separator+Values]: + + strio = StringIO.new(TEXT) + strio.gets('') # Read "paragraph" (up to empty line). 
+ # => "First line\nSecond line\n\n" + + strio = StringIO.new(TEXT) + strio.gets(nil) # "Slurp": read all. + # => "First line\nSecond line\n\nFourth line\nFifth line\n" + +<b>Keyword Argument +chomp+</b> + +With keyword argument +chomp+ given as +true+ (the default is +false+), +removes the trailing newline (if any) from the returned line: + + strio = StringIO.new(TEXT) + strio.gets # => "First line\n" + strio.gets(chomp: true) # => "Second line" + +Related: #each_line, #readlines, +{Kernel#puts}[rdoc-ref:Kernel#puts]. diff --git a/doc/stringio/pread.rdoc b/doc/stringio/pread.rdoc new file mode 100644 index 0000000000..2dcbc18ad8 --- /dev/null +++ b/doc/stringio/pread.rdoc @@ -0,0 +1,65 @@ +<b>Note</b>: \Method +pread+ is different from other reading methods +in that it does not modify +self+ in any way; +thus, multiple threads may read safely from the same stream. + +Reads up to +maxlen+ bytes from the stream, +beginning at 0-based byte offset +offset+; +returns a string containing the read bytes. + +The returned string: + +- Contains +maxlen+ bytes from the stream, if available; + otherwise contains all available bytes. +- Has encoding +Encoding::ASCII_8BIT+. + +With only arguments +maxlen+ and +offset+ given, +returns a new string: + + english = 'Hello' # Five 1-byte characters. + strio = StringIO.new(english) + strio.pread(3, 0) # => "Hel" + strio.pread(3, 2) # => "llo" + strio.pread(0, 0) # => "" + strio.pread(50, 0) # => "Hello" + strio.pread(50, 2) # => "llo" + strio.pread(50, 4) # => "o" + strio.pread(0, 0).encoding + # => #<Encoding:BINARY (ASCII-8BIT)> + + russian = 'Привет' # Six 2-byte characters. + strio = StringIO.new(russian) + strio.pread(50, 0) # All 12 bytes. + # => "\xD0\x9F\xD1\x80\xD0\xB8\xD0\xB2\xD0\xB5\xD1\x82" + strio.pread(3, 0) # => "\xD0\x9F\xD1" + strio.pread(3, 3) # => "\x80\xD0\xB8" + strio.pread(0, 0).encoding + # => #<Encoding:BINARY (ASCII-8BIT)> + + japanese = 'こんにちは' # Five 3-byte characters.
+ strio = StringIO.new(japanese) + strio.pread(50, 0) # All 15 bytes. + # => "\xE3\x81\x93\xE3\x82\x93\xE3\x81\xAB\xE3\x81\xA1\xE3\x81\xAF" + strio.pread(6, 0) # => "\xE3\x81\x93\xE3\x82\x93" + strio.pread(1, 2) # => "\x93" + strio.pread(0, 0).encoding + # => #<Encoding:BINARY (ASCII-8BIT)> + +Raises an exception if +offset+ is out-of-range: + + strio = StringIO.new(english) + strio.pread(5, 50) # Raises EOFError: end of file reached + +With string argument +out_string+ given: + +- Reads as above. +- Overwrites the content of +out_string+ with the read bytes. + +Examples: + + out_string = 'Will be overwritten' + out_string.encoding # => #<Encoding:UTF-8> + result = StringIO.new(english).pread(50, 0, out_string) + result.__id__ == out_string.__id__ # => true + out_string # => "Hello" + out_string.encoding # => #<Encoding:BINARY (ASCII-8BIT)> + diff --git a/doc/stringio/putc.rdoc b/doc/stringio/putc.rdoc new file mode 100644 index 0000000000..4636ffa0db --- /dev/null +++ b/doc/stringio/putc.rdoc @@ -0,0 +1,82 @@ +Replaces one or more bytes at position +pos+ +with bytes of the given argument; +advances the position by the count of bytes written; +returns the argument. + +\StringIO object for 1-byte characters. 
+ + strio = StringIO.new('foo') + strio.pos # => 0 + +With 1-byte argument, replaces one byte: + + strio.putc('b') + strio.string # => "boo" + strio.pos # => 1 + strio.putc('a') # => "a" + strio.string # => "bao" + strio.pos # => 2 + strio.putc('r') # => "r" + strio.string # => "bar" + strio.pos # => 3 + strio.putc('n') # => "n" + strio.string # => "barn" + strio.pos # => 4 + +Fills with null characters if necessary: + + strio.pos = 6 + strio.putc('x') # => "x" + strio.string # => "barn\u0000\u0000x" + strio.pos # => 7 + +With integer argument, replaces one byte with the low-order byte of the integer: + + strio = StringIO.new('foo') + strio.putc(70) + strio.string # => "Foo" + strio.putc(79) + strio.string # => "FOo" + strio.putc(79 + 1024) + strio.string # => "FOO" + +\StringIO object for Multi-byte characters: + + greek = 'αβγδε' # Five 2-byte characters. + strio = StringIO.new(greek) + strio.string# => "αβγδε" + strio.string.b # => "\xCE\xB1\xCE\xB2\xCE\xB3\xCE\xB4\xCE\xB5" + strio.string.bytesize # => 10 + strio.string.chars # => ["α", "β", "γ", "δ", "ε"] + strio.string.size # => 5 + +With 1-byte argument, replaces one byte of the string: + + strio.putc(' ') # 1-byte ascii space. 
+ strio.pos # => 1 + strio.string # => " \xB1βγδε" + strio.string.b # => " \xB1\xCE\xB2\xCE\xB3\xCE\xB4\xCE\xB5" + strio.string.bytesize # => 10 + strio.string.chars # => [" ", "\xB1", "β", "γ", "δ", "ε"] + strio.string.size # => 6 + + strio.putc(' ') + strio.pos # => 2 + strio.string # => " βγδε" + strio.string.b # => " \xCE\xB2\xCE\xB3\xCE\xB4\xCE\xB5" + strio.string.bytesize # => 10 + strio.string.chars # => [" ", " ", "β", "γ", "δ", "ε"] + strio.string.size # => 6 + +With 2-byte argument, replaces two bytes of the string: + + strio.rewind + strio.putc('α') + strio.pos # => 2 + strio.string # => "αβγδε" + strio.string.b # => "\xCE\xB1\xCE\xB2\xCE\xB3\xCE\xB4\xCE\xB5" + strio.string.bytesize # => 10 + strio.string.chars # => ["α", "β", "γ", "δ", "ε"] + strio.string.size # => 5 + +Related: #getc, #ungetc. diff --git a/doc/stringio/read.rdoc b/doc/stringio/read.rdoc new file mode 100644 index 0000000000..46b9fa349f --- /dev/null +++ b/doc/stringio/read.rdoc @@ -0,0 +1,83 @@ +Reads and returns a string containing bytes read from the stream, +beginning at the current position; +advances the position by the count of bytes read. + +With no arguments given, +reads all remaining bytes in the stream; +returns a new string containing bytes read: + + strio = StringIO.new('Hello') # Five 1-byte characters. + strio.read # => "Hello" + strio.pos # => 5 + strio.read # => "" + StringIO.new('').read # => "" + +With non-negative argument +maxlen+ given, +reads +maxlen+ bytes as available; +returns a new string containing the bytes read, or +nil+ if none: + + strio.rewind + strio.read(3) # => "Hel" + strio.read(3) # => "lo" + strio.read(3) # => nil + + russian = 'Привет' # Six 2-byte characters. 
+ russian.b + # => "\xD0\x9F\xD1\x80\xD0\xB8\xD0\xB2\xD0\xB5\xD1\x82" + strio = StringIO.new(russian) + strio.read(6) # => "\xD0\x9F\xD1\x80\xD0\xB8" + strio.read(6) # => "\xD0\xB2\xD0\xB5\xD1\x82" + strio.read(6) # => nil + + japanese = 'こんにちは' + japanese.b + # => "\xE3\x81\x93\xE3\x82\x93\xE3\x81\xAB\xE3\x81\xA1\xE3\x81\xAF" + strio = StringIO.new(japanese) + strio.read(9) # => "\xE3\x81\x93\xE3\x82\x93\xE3\x81\xAB" + strio.read(9) # => "\xE3\x81\xA1\xE3\x81\xAF" + strio.read(9) # => nil + +With argument +maxlen+ as +nil+ and string argument +out_string+ given, +reads the remaining bytes in the stream; +clears +out_string+ and writes the bytes into it; +returns +out_string+: + + out_string = 'Will be overwritten' + strio = StringIO.new('Hello') + strio.read(nil, out_string) # => "Hello" + strio.read(nil, out_string) # => "" + +With non-negative argument +maxlen+ and string argument +out_string+ given, +reads up to +maxlen+ bytes from the stream, as available; +clears +out_string+ and writes the bytes into it; +returns +out_string+ if any bytes were read, or +nil+ if none: + + out_string = 'Will be overwritten' + strio = StringIO.new('Hello') + strio.read(3, out_string) # => "Hel" + strio.read(3, out_string) # => "lo" + strio.read(3, out_string) # => nil + + out_string = 'Will be overwritten' + strio = StringIO.new(russian) + strio.read(6, out_string) # => "При" + strio.read(6, out_string) # => "вет" + strio.read(6, out_string) # => nil + strio.rewind + russian.b + # => "\xD0\x9F\xD1\x80\xD0\xB8\xD0\xB2\xD0\xB5\xD1\x82" + strio.read(3) # => "\xD0\x9F\xD1" + strio.read(3) # => "\x80\xD0\xB8" + + out_string = 'Will be overwritten' + strio = StringIO.new(japanese) + strio.read(9, out_string) # => "こんに" + strio.read(9, out_string) # => "ちは" + strio.read(9, out_string) # => nil + strio.rewind + japanese.b + # => "\xE3\x81\x93\xE3\x82\x93\xE3\x81\xAB\xE3\x81\xA1\xE3\x81\xAF" + strio.read(4) # => "\xE3\x81\x93\xE3" + strio.read(4) # => "\x82\x93\xE3\x81" +
+Related: #gets, #readlines. diff --git a/doc/stringio/size.rdoc b/doc/stringio/size.rdoc new file mode 100644 index 0000000000..9323adf8c3 --- /dev/null +++ b/doc/stringio/size.rdoc @@ -0,0 +1,5 @@ +Returns the number of bytes in the string in +self+: + + StringIO.new('hello').size # => 5 # Five 1-byte characters. + StringIO.new('тест').size # => 8 # Four 2-byte characters. + StringIO.new('こんにちは').size # => 15 # Five 3-byte characters. diff --git a/doc/stringio/stringio.md b/doc/stringio/stringio.md new file mode 100644 index 0000000000..8931d1c30c --- /dev/null +++ b/doc/stringio/stringio.md @@ -0,0 +1,700 @@ +\Class \StringIO supports accessing a string as a stream, +similar in some ways to [class IO][io class]. + +You can create a \StringIO instance using: + +- StringIO.new: returns a new \StringIO object containing the given string. +- StringIO.open: passes a new \StringIO object to the given block. + +Like an \IO stream, a \StringIO stream has certain properties: + +- **Read/write mode**: whether the stream may be read, written, appended to, etc.; + see [Read/Write Mode][read/write mode]. +- **Data mode**: text-only or binary; + see [Data Mode][data mode]. +- **Encodings**: internal and external encodings; + see [Encodings][encodings]. +- **Position**: where in the stream the next read or write is to occur; + see [Position][position]. +- **Line number**: a special, line-oriented, "position" (different from the position mentioned above); + see [Line Number][line number]. +- **Open/closed**: whether the stream is open or closed, for reading or writing; + see [Open/Closed Streams][open/closed streams]. +- **BOM**: byte order mark; + see [Byte Order Mark][bom (byte order mark)].
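As a minimal sketch of the two creation methods listed at the top of this page, and of querying a few of the properties just listed: the sample strings and the names `strio`, `content`, and `shouted` are assumptions made for illustration, and the comment about the return value of StringIO.open reflects our understanding of that method rather than anything stated on this page.

```ruby
require 'stringio'

# StringIO.new returns the new stream.
strio = StringIO.new('foo')
content = strio.read

# StringIO.open passes the new stream to the block;
# here the call evaluates to the block's value.
shouted = StringIO.open('bar') {|s| s.read.upcase }

# A fresh stream starts at position zero and line number zero, and is open.
fresh = StringIO.new('baz')
p [fresh.pos, fresh.lineno, fresh.closed?]
```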
+ +## About the Examples + +Examples on this page assume that \StringIO has been required: + +```ruby +require 'stringio' +``` + +And that this constant has been defined: + +```ruby +TEXT = <<EOT +First line +Second line + +Fourth line +Fifth line +EOT +``` + +## Stream Properties + +### Read/Write Mode + +#### Summary + +| Mode | Initial Clear? | Read | Write | +|:--------------------------:|:--------------:|:--------:|:--------:| +| <tt>'r'</tt>: read-only | No | Anywhere | Error | +| <tt>'w'</tt>: write-only | Yes | Error | Anywhere | +| <tt>'a'</tt>: append-only | No | Error | End only | +| <tt>'r+'</tt>: read/write | No | Anywhere | Anywhere | +| <tt>'w+'</tt>: read-write | Yes | Anywhere | Anywhere | +| <tt>'a+'</tt>: read/append | No | Anywhere | End only | + +Each section below describes a read/write mode. + +Any of the modes may be given as a string or as file constants; +example: + +```ruby +strio = StringIO.new('foo', 'a') +strio = StringIO.new('foo', File::WRONLY | File::APPEND) +``` + +#### `'r'`: Read-Only + +Mode specified as one of: + +- String: `'r'`. +- Constant: `File::RDONLY`. + +Initial state: + +```ruby +strio = StringIO.new('foobarbaz', 'r') +strio.pos # => 0 # Beginning-of-stream. +strio.string # => "foobarbaz" # Not cleared. +``` + +May be read anywhere: + +```ruby +strio.gets(3) # => "foo" +strio.gets(3) # => "bar" +strio.pos = 9 +strio.gets(3) # => nil +``` + +May not be written: + +```ruby +strio.write('foo') # Raises IOError: not opened for writing +``` + +#### `'w'`: Write-Only + +Mode specified as one of: + +- String: `'w'`. +- Constant: `File::WRONLY`. + +Initial state: + +```ruby +strio = StringIO.new('foo', 'w') +strio.pos # => 0 # Beginning of stream. +strio.string # => "" # Initially cleared. 
+``` + +May be written anywhere (even past end-of-stream): + +```ruby +strio.write('foobar') +strio.string # => "foobar" +strio.rewind +strio.write('FOO') +strio.string # => "FOObar" +strio.pos = 3 +strio.write('BAR') +strio.string # => "FOOBAR" +strio.pos = 9 +strio.write('baz') +strio.string # => "FOOBAR\u0000\u0000\u0000baz" # Null-padded. +``` + +May not be read: + +```ruby +strio.read # Raises IOError: not opened for reading +``` + +#### `'a'`: Append-Only + +Mode specified as one of: + +- String: `'a'`. +- Constant: `File::WRONLY | File::APPEND`. + +Initial state: + +```ruby +strio = StringIO.new('foo', 'a') +strio.pos # => 0 # Beginning-of-stream. +strio.string # => "foo" # Not cleared. +``` + +May be written only at the end; position does not affect writing: + +```ruby +strio.write('bar') +strio.string # => "foobar" +strio.write('baz') +strio.string # => "foobarbaz" +strio.pos = 400 +strio.write('bat') +strio.string # => "foobarbazbat" +``` + +May not be read: + +```ruby +strio.gets # Raises IOError: not opened for reading +``` + +#### `'r+'`: Read/Write + +Mode specified as one of: + +- String: `'r+'`. +- Constant: `File::RDWR`. + +Initial state: + +```ruby +strio = StringIO.new('foobar', 'r+') +strio.pos # => 0 # Beginning-of-stream. +strio.string # => "foobar" # Not cleared. +``` + +May be written anywhere (even past end-of-stream): + +```ruby +strio.write('FOO') +strio.string # => "FOObar" +strio.write('BAR') +strio.string # => "FOOBAR" +strio.write('BAZ') +strio.string # => "FOOBARBAZ" +strio.pos = 12 +strio.write('BAT') +strio.string # => "FOOBARBAZ\u0000\u0000\u0000BAT" # Null-padded. +``` + +May be read anywhere: + +```ruby +strio.pos = 0 +strio.gets(3) # => "FOO" +strio.pos = 6 +strio.gets(3) # => "BAZ" +strio.pos = 400 +strio.gets(3) # => nil +``` + +#### `'w+'`: Read/Write (Initially Clear) + +Mode specified as one of: + +- String: `'w+'`. +- Constant: `File::RDWR | File::TRUNC`.
+ +Initial state: + +```ruby +strio = StringIO.new('foo', 'w+') +strio.pos # => 0 # Beginning-of-stream. +strio.string # => "" # Truncated. +``` + +May be written anywhere (even past end-of-stream): + +```ruby +strio.write('foobar') +strio.string # => "foobar" +strio.rewind +strio.write('FOO') +strio.string # => "FOObar" +strio.write('BAR') +strio.string # => "FOOBAR" +strio.write('BAZ') +strio.string # => "FOOBARBAZ" +strio.pos = 12 +strio.write('BAT') +strio.string # => "FOOBARBAZ\u0000\u0000\u0000BAT" # Null-padded. +``` + +May be read anywhere: + +```ruby +strio.rewind +strio.gets(3) # => "FOO" +strio.gets(3) # => "BAR" +strio.pos = 12 +strio.gets(3) # => "BAT" +strio.pos = 400 +strio.gets(3) # => nil +``` + +#### `'a+'`: Read/Append + +Mode specified as one of: + +- String: `'a+'`. +- Constant: `File::RDWR | File::APPEND`. + +Initial state: + +```ruby +strio = StringIO.new('foo', 'a+') +strio.pos # => 0 # Beginning-of-stream. +strio.string # => "foo" # Not cleared. +``` + +May be written only at the end; position does not affect writing: + +```ruby +strio.write('bar') +strio.string # => "foobar" +strio.write('baz') +strio.string # => "foobarbaz" +strio.pos = 400 +strio.write('bat') +strio.string # => "foobarbazbat" +``` + +May be read anywhere: + +```ruby +strio.rewind +strio.gets(3) # => "foo" +strio.gets(3) # => "bar" +strio.pos = 9 +strio.gets(3) # => "bat" +strio.pos = 400 +strio.gets(3) # => nil +``` + +### Data Mode + +To specify whether the stream is to be treated as text or as binary data, +either of the following may be suffixed to any of the string read/write modes above: + +- `'t'`: Text; + initializes the encoding as Encoding::UTF_8. +- `'b'`: Binary; + initializes the encoding as Encoding::ASCII_8BIT. + +If neither is given, the stream defaults to text data.
+ +Examples: + +```ruby +strio = StringIO.new('foo', 'rt') +strio.external_encoding # => #<Encoding:UTF-8> +data = "\u9990\u9991\u9992\u9993\u9994" +strio = StringIO.new(data, 'rb') +strio.external_encoding # => #<Encoding:BINARY (ASCII-8BIT)> +``` + +When the data mode is specified, the read/write mode may not be omitted: + +```ruby +StringIO.new(data, 'b') # Raises ArgumentError: invalid access mode b +``` + +A text stream may be changed to binary by calling instance method #binmode; +a binary stream may not be changed to text. + +### Encodings + +A stream has an encoding; see [Encodings][encodings document]. + +The initial encoding for a new or re-opened stream depends on its [data mode][data mode]: + +- Text: `Encoding::UTF_8`. +- Binary: `Encoding::ASCII_8BIT`. + +These instance methods are relevant: + +- #external_encoding: returns the current encoding of the stream as an `Encoding` object. +- #internal_encoding: returns `nil`; a stream does not have an internal encoding. +- #set_encoding: sets the encoding for the stream. +- #set_encoding_by_bom: sets the encoding for the stream to the stream's BOM (byte order mark). + +Examples: + +```ruby +strio = StringIO.new('foo', 'rt') # Text mode. +strio.external_encoding # => #<Encoding:UTF-8> +data = "\u9990\u9991\u9992\u9993\u9994" +strio = StringIO.new(data, 'rb') # Binary mode. +strio.external_encoding # => #<Encoding:BINARY (ASCII-8BIT)> +strio = StringIO.new('foo') +strio.external_encoding # => #<Encoding:UTF-8> +strio.set_encoding('US-ASCII') +strio.external_encoding # => #<Encoding:US-ASCII> +``` + +### Position + +A stream has a _position_, an integer offset (in bytes) into the stream. +The initial position of a stream is zero. + +#### Getting and Setting the Position + +Each of these methods initializes (to zero) the position of a new or re-opened stream: + +- ::new: returns a new stream. +- ::open: passes a new stream to the block. +- #reopen: re-initializes the stream.
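The initialization behavior in the list above can be sketched as follows; the sample strings and the variable names are assumptions made for illustration.

```ruby
require 'stringio'

# A new stream starts at position zero.
strio = StringIO.new('foobar')
pos_new = strio.pos

# A stream passed to a block by StringIO.open also starts at zero.
pos_open = StringIO.open('foobar') {|s| s.pos }

# Move the position, then re-initialize the stream with a different string.
strio.pos = 3
strio.reopen('baz')
pos_reopened = strio.pos
string_reopened = strio.string

p [pos_new, pos_open, pos_reopened, string_reopened]
```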
+ +Each of these methods queries, gets, or sets the position, without otherwise changing the stream: + +- #eof?: returns whether the position is at end-of-stream. +- #pos: returns the position. +- #pos=: sets the position. +- #rewind: sets the position to zero. +- #seek: sets the position. + +Examples: + +```ruby +strio = StringIO.new('foobar') +strio.pos # => 0 +strio.pos = 3 +strio.pos # => 3 +strio.eof? # => false +strio.rewind +strio.pos # => 0 +strio.seek(0, IO::SEEK_END) +strio.pos # => 6 +strio.eof? # => true +``` + +#### Position Before and After Reading + +Except for #pread, a stream reading method (see [Basic Reading][basic reading]) +begins reading at the current position. + +Except for #pread, a read method advances the position past the read substring. + +Examples: + +```ruby +strio = StringIO.new(TEXT) +strio.string # => "First line\nSecond line\n\nFourth line\nFifth line\n" +strio.pos # => 0 +strio.getc # => "F" +strio.pos # => 1 +strio.gets # => "irst line\n" +strio.pos # => 11 +strio.pos = 24 +strio.gets # => "Fourth line\n" +strio.pos # => 36 + +strio = StringIO.new('тест') # Four 2-byte characters. +strio.pos = 0 # At first byte of first character. +strio.read # => "тест" +strio.pos = 1 # At second byte of first character. +strio.read # => "\x82ест" +strio.pos = 2 # At first byte of second character. +strio.read # => "ест" + +strio = StringIO.new(TEXT) +strio.pos = 15 +a = [] +strio.each_line {|line| a.push(line) } +a # => ["nd line\n", "\n", "Fourth line\n", "Fifth line\n"] +strio.pos # => 47 # End-of-stream. +``` + +#### Position Before and After Writing + +Each of these methods begins writing at the current position, +and advances the position to the end of the written substring: + +- #putc: writes the given character. +- #write: writes the given objects as strings. +- [Kernel#puts][kernel#puts]: writes given objects as strings, each followed by newline.
+ +Examples: + +```ruby +strio = StringIO.new('foo') +strio.pos # => 0 +strio.putc('b') +strio.string # => "boo" +strio.pos # => 1 +strio.write('r') +strio.string # => "bro" +strio.pos # => 2 +strio.puts('ew') +strio.string # => "brew\n" +strio.pos # => 5 +strio.pos = 8 +strio.write('foo') +strio.string # => "brew\n\u0000\u0000\u0000foo" +strio.pos # => 11 +``` + +Each of these methods writes _before_ the current position, and decrements the position +so that the written data is next to be read: + +- #ungetbyte: unshifts the given byte. +- #ungetc: unshifts the given character. + +Examples: + +```ruby +strio = StringIO.new('foo') +strio.pos = 2 +strio.ungetc('x') +strio.pos # => 1 +strio.string # => "fxo" +strio.ungetc('x') +strio.pos # => 0 +strio.string # => "xxo" +``` + +This method does not affect the position: + +- #truncate: truncates the stream's string to the given size. + +Examples: + +```ruby +strio = StringIO.new('foobar') +strio.pos # => 0 +strio.truncate(3) +strio.string # => "foo" +strio.pos # => 0 +strio.pos = 500 +strio.truncate(0) +strio.string # => "" +strio.pos # => 500 +``` + +### Line Number + +A stream has a line number, which initially is zero: + +- Method #lineno returns the line number. +- Method #lineno= sets the line number. + +The line number can be affected by reading (but never by writing); +in general, the line number is incremented each time the record separator (default: `"\n"`) is read. 
+ +Examples: + +```ruby +strio = StringIO.new(TEXT) +strio.string # => "First line\nSecond line\n\nFourth line\nFifth line\n" +strio.lineno # => 0 +strio.gets # => "First line\n" +strio.lineno # => 1 +strio.getc # => "S" +strio.lineno # => 1 +strio.gets # => "econd line\n" +strio.lineno # => 2 +strio.gets # => "\n" +strio.lineno # => 3 +strio.gets # => "Fourth line\n" +strio.lineno # => 4 +``` + +Setting the position does not affect the line number: + +```ruby +strio.pos = 0 +strio.lineno # => 4 +strio.gets # => "First line\n" +strio.pos # => 11 +strio.lineno # => 5 +``` + +And setting the line number does not affect the position: + +```ruby +strio.lineno = 10 +strio.pos # => 11 +strio.gets # => "Second line\n" +strio.lineno # => 11 +strio.pos # => 23 +``` + +### Open/Closed Streams + +A new stream is open for either reading or writing, and may be open for both; +see [Read/Write Mode][read/write mode]. + +Each of these methods initializes the read/write mode for a new or re-opened stream: + +- ::new: returns a new stream. +- ::open: passes a new stream to the block. +- #reopen: re-initializes the stream. + +Other relevant methods: + +- #close: closes the stream for both reading and writing. +- #close_read: closes the stream for reading. +- #close_write: closes the stream for writing. +- #closed?: returns whether the stream is closed for both reading and writing. +- #closed_read?: returns whether the stream is closed for reading. +- #closed_write?: returns whether the stream is closed for writing. + +### BOM (Byte Order Mark) + +The string provided for ::new, ::open, or #reopen +may contain an optional [BOM][bom] (byte order mark) at the beginning of the string; +the BOM can affect the stream's encoding. + +The BOM (if provided): + +- Is stored as part of the stream's string. +- Does _not_ immediately affect the encoding. +- Is _initially_ considered part of the stream. 
+
+```ruby
+utf8_bom = "\xEF\xBB\xBF"
+string = utf8_bom + 'foo'
+string.bytes            # => [239, 187, 191, 102, 111, 111]
+string.bytes.take(3)    # => [239, 187, 191] # The BOM.
+strio = StringIO.new(string, 'rb')
+strio.string.bytes      # => [239, 187, 191, 102, 111, 111] # BOM is part of the stored string.
+strio.external_encoding # => #<Encoding:BINARY (ASCII-8BIT)> # Default for a binary stream.
+strio.gets              # => "\xEF\xBB\xBFfoo" # BOM is part of the stream.
+```
+
+You can call instance method #set_encoding_by_bom to "activate" the stored BOM;
+after doing so the BOM:
+
+- Is _still_ stored as part of the stream's string.
+- _Determines_ (and may have changed) the stream's encoding.
+- Is _no longer_ considered part of the stream.
+
+```ruby
+strio.set_encoding_by_bom
+strio.string.bytes      # => [239, 187, 191, 102, 111, 111] # BOM is still part of the stored string.
+strio.external_encoding # => #<Encoding:UTF-8> # The new encoding.
+strio.rewind            # => 0
+strio.gets              # => "foo" # BOM is not part of the stream.
+```
+
+## Basic Stream \IO
+
+### Basic Reading
+
+You can read from the stream using these instance methods:
+
+- #getbyte: reads and returns the next byte.
+- #getc: reads and returns the next character.
+- #gets: reads and returns all or part of the next line.
+- #read: reads and returns all or part of the remaining data in the stream.
+- #readlines: reads the remaining data in the stream and returns an array of its lines.
+- [Kernel#readline][kernel#readline]: like #gets, but raises an exception if at end-of-stream.
+
+You can iterate over the stream using these instance methods:
+
+- #each_byte: reads each remaining byte, passing it to the block.
+- #each_char: reads each remaining character, passing it to the block.
+- #each_codepoint: reads each remaining codepoint, passing it to the block.
+- #each_line: reads all or part of each remaining line, passing the read string to the block.
+
+This instance method is useful in a multi-threaded application:
+
+- #pread: reads and returns all or part of the stream.
+
+### Basic Writing
+
+You can write to the stream, advancing the position, using these instance methods:
+
+- #putc: writes a given character.
+- #write: writes the given objects as strings.
+- [Kernel#puts][kernel#puts]: writes given objects as strings, each followed by newline.
+
+You can "unshift" to the stream using these instance methods;
+each writes _before_ the current position, and decrements the position
+so that the written data is next to be read:
+
+- #ungetbyte: unshifts the given byte.
+- #ungetc: unshifts the given character.
+
+One more writing method:
+
+- #truncate: truncates the stream's string to the given size.
+
+## Line \IO
+
+Reading:
+
+- #gets: reads and returns the next line.
+- [Kernel#readline][kernel#readline]: like #gets, but raises an exception if at end-of-stream.
+- #readlines: reads the remaining data in the stream and returns an array of its lines.
+- #each_line: reads each remaining line, passing it to the block.
+
+Writing:
+
+- [Kernel#puts][kernel#puts]: writes given objects, each followed by newline.
+
+## Character \IO
+
+Reading:
+
+- #each_char: reads each remaining character, passing it to the block.
+- #getc: reads and returns the next character.
+
+Writing:
+
+- #putc: writes the given character.
+- #ungetc: unshifts the given character.
+
+## Byte \IO
+
+Reading:
+
+- #each_byte: reads each remaining byte, passing it to the block.
+- #getbyte: reads and returns the next byte.
+
+Writing:
+
+- #ungetbyte: unshifts the given byte.
+
+## Codepoint \IO
+
+Reading:
+
+- #each_codepoint: reads each remaining codepoint, passing it to the block.
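The character, byte, and codepoint readers listed above differ only in the unit they yield. The following is a minimal sketch of that difference; the helper `read_three_ways` is hypothetical (it is not part of StringIO) and assumes a UTF-8 string:

```ruby
require 'stringio'

# Hypothetical helper (not part of StringIO): reads the same string with
# #each_char, #each_byte, and #each_codepoint, using a fresh stream for each
# so that every reader starts at position zero.
def read_three_ways(string)
  {
    chars:      StringIO.new(string).each_char.to_a,
    bytes:      StringIO.new(string).each_byte.to_a,
    codepoints: StringIO.new(string).each_codepoint.to_a
  }
end

result = read_three_ways("a\u00E9") # "a" plus one 2-byte character.
result[:chars]      # => ["a", "é"]
result[:bytes]      # => [97, 195, 169]
result[:codepoints] # => [97, 233]
```

A separate stream is created for each reader because every one of these iterators advances the position to end-of-stream.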
+ +[bom]: https://en.wikipedia.org/wiki/Byte_order_mark +[encodings document]: https://docs.ruby-lang.org/en/master/language/encodings_rdoc.html +[io class]: https://docs.ruby-lang.org/en/master/IO.html +[kernel#puts]: https://docs.ruby-lang.org/en/master/Kernel.html#method-i-puts +[kernel#readline]: https://docs.ruby-lang.org/en/master/Kernel.html#method-i-readline + +[basic reading]: rdoc-ref:StringIO@Basic+Reading +[basic writing]: rdoc-ref:StringIO@Basic+Writing +[bom (byte order mark)]: rdoc-ref:StringIO@BOM+-28Byte+Order+Mark-29 +[data mode]: rdoc-ref:StringIO@Data+Mode +[encodings]: rdoc-ref:StringIO@Encodings +[end-of-stream]: rdoc-ref:StringIO@End-of-Stream +[line number]: rdoc-ref:StringIO@Line+Number +[open/closed streams]: rdoc-ref:StringIO@Open-2FClosed+Streams +[position]: rdoc-ref:StringIO@Position +[read/write mode]: rdoc-ref:StringIO@Read-2FWrite+Mode diff --git a/doc/strscan/helper_methods.md b/doc/strscan/helper_methods.md new file mode 100644 index 0000000000..9fb1d79bba --- /dev/null +++ b/doc/strscan/helper_methods.md @@ -0,0 +1,124 @@ +## Helper Methods + +These helper methods display values returned by scanner's methods. + +### `put_situation(scanner)` + +Display scanner's situation: + +- Byte position (`#pos`). +- Character position (`#charpos`) +- Target string (`#rest`) and size (`#rest_size`). 
+ +```rb +scanner = StringScanner.new('foobarbaz') +scanner.scan(/foo/) +put_situation(scanner) +# Situation: +# pos: 3 +# charpos: 3 +# rest: "barbaz" +# rest_size: 6 +``` + +### `put_match_values(scanner)` + +Display the scanner's match values: + +```rb +scanner = StringScanner.new('Fri Dec 12 1975 14:39') +pattern = /(?<wday>\w+) (?<month>\w+) (?<day>\d+) / +scanner.match?(pattern) +put_match_values(scanner) +# Basic match values: +# matched?: true +# matched_size: 11 +# pre_match: "" +# matched : "Fri Dec 12 " +# post_match: "1975 14:39" +# Captured match values: +# size: 4 +# captures: ["Fri", "Dec", "12"] +# named_captures: {"wday"=>"Fri", "month"=>"Dec", "day"=>"12"} +# values_at: ["Fri Dec 12 ", "Fri", "Dec", "12", nil] +# []: +# [0]: "Fri Dec 12 " +# [1]: "Fri" +# [2]: "Dec" +# [3]: "12" +# [4]: nil +``` + +### `match_values_cleared?(scanner)` + +Returns whether the scanner's match values are all properly cleared: + +```rb +scanner = StringScanner.new('foobarbaz') +match_values_cleared?(scanner) # => true +put_match_values(scanner) +# Basic match values: +# matched?: false +# matched_size: nil +# pre_match: nil +# matched : nil +# post_match: nil +# Captured match values: +# size: nil +# captures: nil +# named_captures: {} +# values_at: nil +# [0]: nil +scanner.scan(/foo/) +match_values_cleared?(scanner) # => false +``` + +## The Code + +```rb +def put_situation(scanner) + puts '# Situation:' + puts "# pos: #{scanner.pos}" + puts "# charpos: #{scanner.charpos}" + puts "# rest: #{scanner.rest.inspect}" + puts "# rest_size: #{scanner.rest_size}" +end + +def put_match_values(scanner) + puts '# Basic match values:' + puts "# matched?: #{scanner.matched?}" + value = scanner.matched_size || 'nil' + puts "# matched_size: #{value}" + puts "# pre_match: #{scanner.pre_match.inspect}" + puts "# matched : #{scanner.matched.inspect}" + puts "# post_match: #{scanner.post_match.inspect}" + puts '# Captured match values:' + puts "# size: #{scanner.size}" + puts "# 
captures: #{scanner.captures}" + puts "# named_captures: #{scanner.named_captures}" + if scanner.size.nil? + puts "# values_at: #{scanner.values_at(0)}" + puts "# [0]: #{scanner[0]}" + else + puts "# values_at: #{scanner.values_at(*(0..scanner.size))}" + puts "# []:" + scanner.size.times do |i| + puts "# [#{i}]: #{scanner[i].inspect}" + end + end +end + +def match_values_cleared?(scanner) + scanner.matched? == false && + scanner.matched_size.nil? && + scanner.matched.nil? && + scanner.pre_match.nil? && + scanner.post_match.nil? && + scanner.size.nil? && + scanner[0].nil? && + scanner.captures.nil? && + scanner.values_at(0..1).nil? && + scanner.named_captures == {} +end +``` + diff --git a/doc/strscan/link_refs.txt b/doc/strscan/link_refs.txt new file mode 100644 index 0000000000..19f6f7ce5c --- /dev/null +++ b/doc/strscan/link_refs.txt @@ -0,0 +1,17 @@ +[1]: rdoc-ref:StringScanner@Stored+String +[2]: rdoc-ref:StringScanner@Byte+Position+-28Position-29 +[3]: rdoc-ref:StringScanner@Target+Substring +[4]: rdoc-ref:StringScanner@Setting+the+Target+Substring +[5]: rdoc-ref:StringScanner@Traversing+the+Target+Substring +[6]: https://docs.ruby-lang.org/en/master/Regexp.html +[7]: rdoc-ref:StringScanner@Character+Position +[8]: https://docs.ruby-lang.org/en/master/String.html#method-i-5B-5D +[9]: rdoc-ref:StringScanner@Match+Values +[10]: rdoc-ref:StringScanner@Fixed-Anchor+Property +[11]: rdoc-ref:StringScanner@Positions +[13]: rdoc-ref:StringScanner@Captured+Match+Values +[14]: rdoc-ref:StringScanner@Querying+the+Target+Substring +[15]: rdoc-ref:StringScanner@Searching+the+Target+Substring +[16]: https://docs.ruby-lang.org/en/master/Regexp.html#class-Regexp-label-Groups+and+Captures +[17]: rdoc-ref:StringScanner@Matching +[18]: rdoc-ref:StringScanner@Basic+Match+Values diff --git a/doc/strscan/methods/get_byte.md b/doc/strscan/methods/get_byte.md new file mode 100644 index 0000000000..3208d77158 --- /dev/null +++ b/doc/strscan/methods/get_byte.md @@ -0,0 +1,30 @@ 
+call-seq:
+  get_byte -> byte_as_character or nil
+
+Returns the next byte, if available:
+
+- If the [position][2]
+  is not at the end of the [stored string][1]:
+
+  - Returns the next byte.
+  - Increments the [byte position][2].
+  - Adjusts the [character position][7].
+
+  ```rb
+  scanner = StringScanner.new(HIRAGANA_TEXT)
+  # => #<StringScanner 0/15 @ "\xE3\x81\x93\xE3\x82...">
+  scanner.string # => "こんにちは"
+  [scanner.get_byte, scanner.pos, scanner.charpos] # => ["\xE3", 1, 1]
+  [scanner.get_byte, scanner.pos, scanner.charpos] # => ["\x81", 2, 2]
+  [scanner.get_byte, scanner.pos, scanner.charpos] # => ["\x93", 3, 1]
+  [scanner.get_byte, scanner.pos, scanner.charpos] # => ["\xE3", 4, 2]
+  [scanner.get_byte, scanner.pos, scanner.charpos] # => ["\x82", 5, 3]
+  [scanner.get_byte, scanner.pos, scanner.charpos] # => ["\x93", 6, 2]
+  ```
+
+- Otherwise, returns `nil`, and does not change the positions.
+
+  ```rb
+  scanner.terminate
+  [scanner.get_byte, scanner.pos, scanner.charpos] # => [nil, 15, 5]
+  ```
diff --git a/doc/strscan/methods/get_charpos.md b/doc/strscan/methods/get_charpos.md
new file mode 100644
index 0000000000..954fcf5b44
--- /dev/null
+++ b/doc/strscan/methods/get_charpos.md
@@ -0,0 +1,19 @@
+call-seq:
+  charpos -> character_position
+
+Returns the [character position][7] (initially zero),
+which may be different from the [byte position][2]
+given by method #pos:
+
+```rb
+scanner = StringScanner.new(HIRAGANA_TEXT)
+scanner.string # => "こんにちは"
+scanner.getch  # => "こ" # 3-byte character.
+scanner.getch  # => "ん" # 3-byte character.
+put_situation(scanner)
+# Situation:
+#   pos:       6
+#   charpos:   2
+#   rest:      "にちは"
+#   rest_size: 9
+```
diff --git a/doc/strscan/methods/get_pos.md b/doc/strscan/methods/get_pos.md
new file mode 100644
index 0000000000..81bbb2345e
--- /dev/null
+++ b/doc/strscan/methods/get_pos.md
@@ -0,0 +1,14 @@
+call-seq:
+  pos -> byte_position
+
+Returns the integer [byte position][2],
+which may be different from the [character position][7]:
+
+```rb
+scanner = StringScanner.new(HIRAGANA_TEXT)
+scanner.string  # => "こんにちは"
+scanner.pos     # => 0
+scanner.getch   # => "こ" # 3-byte character.
+scanner.charpos # => 1
+scanner.pos     # => 3
+```
diff --git a/doc/strscan/methods/getch.md b/doc/strscan/methods/getch.md
new file mode 100644
index 0000000000..3dd70e4c5b
--- /dev/null
+++ b/doc/strscan/methods/getch.md
@@ -0,0 +1,43 @@
+call-seq:
+  getch -> character or nil
+
+Returns the next (possibly multibyte) character,
+if available:
+
+- If the [position][2]
+  is at the beginning of a character:
+
+  - Returns the character.
+  - Increments the [character position][7] by 1.
+  - Increments the [byte position][2]
+    by the size (in bytes) of the character.
+
+  ```rb
+  scanner = StringScanner.new(HIRAGANA_TEXT)
+  scanner.string # => "こんにちは"
+  [scanner.getch, scanner.pos, scanner.charpos] # => ["こ", 3, 1]
+  [scanner.getch, scanner.pos, scanner.charpos] # => ["ん", 6, 2]
+  [scanner.getch, scanner.pos, scanner.charpos] # => ["に", 9, 3]
+  [scanner.getch, scanner.pos, scanner.charpos] # => ["ち", 12, 4]
+  [scanner.getch, scanner.pos, scanner.charpos] # => ["は", 15, 5]
+  [scanner.getch, scanner.pos, scanner.charpos] # => [nil, 15, 5]
+  ```
+
+- If the [position][2] is within a multi-byte character
+  (that is, not at its beginning),
+  behaves like #get_byte (returns a 1-byte character):
+
+  ```rb
+  scanner.pos = 1
+  [scanner.getch, scanner.pos, scanner.charpos] # => ["\x81", 2, 2]
+  [scanner.getch, scanner.pos, scanner.charpos] # => ["\x93", 3, 1]
+  [scanner.getch, scanner.pos, scanner.charpos] # => ["ん", 6, 2]
+  ```
+
+- If the [position][2] is at the end of the [stored string][1],
+  returns `nil` and does not modify the positions:
+
+  ```rb
+  scanner.terminate
+  [scanner.getch, scanner.pos, scanner.charpos] # => [nil, 15, 5]
+  ```
diff --git a/doc/strscan/methods/scan.md b/doc/strscan/methods/scan.md
new file mode 100644
index 0000000000..22ddd368b6
--- /dev/null
+++ b/doc/strscan/methods/scan.md
@@ -0,0 +1,51 @@
+call-seq:
+  scan(pattern) -> substring or nil
+
+Attempts to [match][17] the given `pattern`
+at the beginning of the [target substring][3].
+
+If the match succeeds:
+
+- Returns the matched substring.
+- Increments the [byte position][2] by <tt>substring.bytesize</tt>,
+  and may increment the [character position][7].
+- Sets [match values][9].
+
+```rb
+scanner = StringScanner.new(HIRAGANA_TEXT)
+scanner.string # => "こんにちは"
+scanner.pos = 6
+scanner.scan(/に/) # => "に"
+put_match_values(scanner)
+# Basic match values:
+#   matched?:       true
+#   matched_size:   3
+#   pre_match:      "こん"
+#   matched  :      "に"
+#   post_match:     "ちは"
+# Captured match values:
+#   size:           1
+#   captures:       []
+#   named_captures: {}
+#   values_at:      ["に", nil]
+#   []:
+#     [0]:          "に"
+#     [1]:          nil
+put_situation(scanner)
+# Situation:
+#   pos:       9
+#   charpos:   3
+#   rest:      "ちは"
+#   rest_size: 6
+```
+
+If the match fails:
+
+- Returns `nil`.
+- Does not increment byte and character positions.
+- Clears match values.
+
+```rb
+scanner.scan(/nope/) # => nil
+match_values_cleared?(scanner) # => true
+```
diff --git a/doc/strscan/methods/scan_until.md b/doc/strscan/methods/scan_until.md
new file mode 100644
index 0000000000..9a8c7c02f6
--- /dev/null
+++ b/doc/strscan/methods/scan_until.md
@@ -0,0 +1,52 @@
+call-seq:
+  scan_until(pattern) -> substring or nil
+
+Attempts to [match][17] the given `pattern`
+anywhere (at any [position][2]) in the [target substring][3].
+
+If the match attempt succeeds:
+
+- Sets [match values][9].
+- Sets the [byte position][2] to the end of the matched substring;
+  may adjust the [character position][7].
+- Returns the matched substring.
+
+
+```rb
+scanner = StringScanner.new(HIRAGANA_TEXT)
+scanner.string # => "こんにちは"
+scanner.pos = 6
+scanner.scan_until(/ち/) # => "にち"
+put_match_values(scanner)
+# Basic match values:
+#   matched?:       true
+#   matched_size:   3
+#   pre_match:      "こんに"
+#   matched  :      "ち"
+#   post_match:     "は"
+# Captured match values:
+#   size:           1
+#   captures:       []
+#   named_captures: {}
+#   values_at:      ["ち", nil]
+#   []:
+#     [0]:          "ち"
+#     [1]:          nil
+put_situation(scanner)
+# Situation:
+#   pos:       12
+#   charpos:   4
+#   rest:      "は"
+#   rest_size: 3
+```
+
+If the match attempt fails:
+
+- Clears match data.
+- Returns `nil`.
+- Does not update positions.
+
+```rb
+scanner.scan_until(/nope/) # => nil
+match_values_cleared?(scanner) # => true
+```
diff --git a/doc/strscan/methods/set_pos.md b/doc/strscan/methods/set_pos.md
new file mode 100644
index 0000000000..3b7abe65e3
--- /dev/null
+++ b/doc/strscan/methods/set_pos.md
@@ -0,0 +1,27 @@
+call-seq:
+  pos = n -> n
+  pointer = n -> n
+
+Sets the [byte position][2] and the [character position][11];
+returns `n`.
+
+Does not affect [match values][9].
+
+For non-negative `n`, sets the position to `n`:
+
+```rb
+scanner = StringScanner.new(HIRAGANA_TEXT)
+scanner.string  # => "こんにちは"
+scanner.pos = 3 # => 3
+scanner.rest    # => "んにちは"
+scanner.charpos # => 1
+```
+
+For negative `n`, counts from the end of the [stored string][1]:
+
+```rb
+scanner.pos = -9 # => -9
+scanner.pos      # => 6
+scanner.rest     # => "にちは"
+scanner.charpos  # => 2
+```
diff --git a/doc/strscan/methods/skip.md b/doc/strscan/methods/skip.md
new file mode 100644
index 0000000000..10a329e0e4
--- /dev/null
+++ b/doc/strscan/methods/skip.md
@@ -0,0 +1,43 @@
+call-seq:
+  skip(pattern) -> match_size or nil
+
+Attempts to [match][17] the given `pattern`
+at the beginning of the [target substring][3].
+
+If the match succeeds:
+
+- Increments the [byte position][2] by the size (in bytes) of the matched substring,
+  and may increment the [character position][7].
+- Sets [match values][9].
+- Returns the size (bytes) of the matched substring.
+
+```rb
+scanner = StringScanner.new(HIRAGANA_TEXT)
+scanner.string # => "こんにちは"
+scanner.pos = 6
+scanner.skip(/に/) # => 3
+put_match_values(scanner)
+# Basic match values:
+#   matched?:       true
+#   matched_size:   3
+#   pre_match:      "こん"
+#   matched  :      "に"
+#   post_match:     "ちは"
+# Captured match values:
+#   size:           1
+#   captures:       []
+#   named_captures: {}
+#   values_at:      ["に", nil]
+#   []:
+#     [0]:          "に"
+#     [1]:          nil
+put_situation(scanner)
+# Situation:
+#   pos:       9
+#   charpos:   3
+#   rest:      "ちは"
+#   rest_size: 6
+
+scanner.skip(/nope/) # => nil
+match_values_cleared?(scanner) # => true
+```
diff --git a/doc/strscan/methods/skip_until.md b/doc/strscan/methods/skip_until.md
new file mode 100644
index 0000000000..b7dacf6da1
--- /dev/null
+++ b/doc/strscan/methods/skip_until.md
@@ -0,0 +1,49 @@
+call-seq:
+  skip_until(pattern) -> matched_substring_size or nil
+
+Attempts to [match][17] the given `pattern`
+anywhere (at any [position][2]) in the [target substring][3];
+advances the positions only if the match attempt succeeds.
+
+If the match attempt succeeds:
+
+- Sets [match values][9].
+- Returns the size (bytes) of the substring from the original position to the end of the matched substring.
+
+```rb
+scanner = StringScanner.new(HIRAGANA_TEXT)
+scanner.string # => "こんにちは"
+scanner.pos = 6
+scanner.skip_until(/ち/) # => 6
+put_match_values(scanner)
+# Basic match values:
+#   matched?:       true
+#   matched_size:   3
+#   pre_match:      "こんに"
+#   matched  :      "ち"
+#   post_match:     "は"
+# Captured match values:
+#   size:           1
+#   captures:       []
+#   named_captures: {}
+#   values_at:      ["ち", nil]
+#   []:
+#     [0]:          "ち"
+#     [1]:          nil
+put_situation(scanner)
+# Situation:
+#   pos:       12
+#   charpos:   4
+#   rest:      "は"
+#   rest_size: 3
+```
+
+If the match attempt fails:
+
+- Clears match values.
+- Returns `nil`.
+
+```rb
+scanner.skip_until(/nope/) # => nil
+match_values_cleared?(scanner) # => true
+```
diff --git a/doc/strscan/methods/terminate.md b/doc/strscan/methods/terminate.md
new file mode 100644
index 0000000000..b03b37d2a2
--- /dev/null
+++ b/doc/strscan/methods/terminate.md
@@ -0,0 +1,30 @@
+call-seq:
+  terminate -> self
+
+Sets the scanner to end-of-string;
+returns +self+:
+
+- Sets both [positions][11] to end-of-stream.
+- Clears [match values][9].
+
+```rb
+scanner = StringScanner.new(HIRAGANA_TEXT)
+scanner.string # => "こんにちは"
+scanner.scan_until(/に/)
+put_situation(scanner)
+# Situation:
+#   pos:       9
+#   charpos:   3
+#   rest:      "ちは"
+#   rest_size: 6
+match_values_cleared?(scanner) # => false
+
+scanner.terminate # => #<StringScanner fin>
+put_situation(scanner)
+# Situation:
+#   pos:       15
+#   charpos:   5
+#   rest:      ""
+#   rest_size: 0
+match_values_cleared?(scanner) # => true
+```
diff --git a/doc/strscan/strscan.md b/doc/strscan/strscan.md
new file mode 100644
index 0000000000..385e92f84e
--- /dev/null
+++ b/doc/strscan/strscan.md
@@ -0,0 +1,544 @@
+\Class `StringScanner` supports processing a stored string as a stream;
+this code creates a new `StringScanner` object with string `'foobarbaz'`:
+
+```rb
+require 'strscan'
+scanner = StringScanner.new('foobarbaz')
+```
+
+## About the Examples
+
+All examples here assume that `StringScanner` has been required:
+
+```rb
+require 'strscan'
+```
+
+Some examples here assume that these constants are defined:
+
+```rb
+MULTILINE_TEXT = <<~EOT
+Go placidly amid the noise and haste,
+and remember what peace there may be in silence.
+EOT
+
+HIRAGANA_TEXT = 'こんにちは'
+
+ENGLISH_TEXT = 'Hello'
+```
+
+Some examples here assume that certain helper methods are defined:
+
+- `put_situation(scanner)`:
+  Displays the values of the scanner's
+  methods #pos, #charpos, #rest, and #rest_size.
+- `put_match_values(scanner)`:
+  Displays the scanner's [match values][9].
+- `match_values_cleared?(scanner)`: + Returns whether the scanner's [match values][9] are cleared. + +See examples at [helper methods](helper_methods.md). + +## The `StringScanner` \Object + +This code creates a `StringScanner` object +(we'll call it simply a _scanner_), +and shows some of its basic properties: + +```rb +scanner = StringScanner.new('foobarbaz') +scanner.string # => "foobarbaz" +put_situation(scanner) +# Situation: +# pos: 0 +# charpos: 0 +# rest: "foobarbaz" +# rest_size: 9 +``` + +The scanner has: + +* A <i>stored string</i>, which is: + + * Initially set by StringScanner.new(string) to the given `string` + (`'foobarbaz'` in the example above). + * Modifiable by methods #string=(new_string) and #concat(more_string). + * Returned by method #string. + + More at [Stored String][1] below. + +* A _position_; + a zero-based index into the bytes of the stored string (_not_ into its characters): + + * Initially set by StringScanner.new to `0`. + * Returned by method #pos. + * Modifiable explicitly by methods #reset, #terminate, and #pos=(new_pos). + * Modifiable implicitly (various traversing methods, among others). + + More at [Byte Position][2] below. + +* A <i>target substring</i>, + which is a trailing substring of the stored string; + it extends from the current position to the end of the stored string: + + * Initially set by StringScanner.new(string) to the given `string` + (`'foobarbaz'` in the example above). + * Returned by method #rest. + * Modified by any modification to either the stored string or the position. + + <b>Most importantly</b>: + the searching and traversing methods operate on the target substring, + which may be (and often is) less than the entire stored string. + + More at [Target Substring][3] below. + +## Stored \String + +The <i>stored string</i> is the string stored in the `StringScanner` object. 
+ +Each of these methods sets, modifies, or returns the stored string: + +| Method | Effect | +|----------------------|-------------------------------------------------| +| ::new(string) | Creates a new scanner for the given string. | +| #string=(new_string) | Replaces the existing stored string. | +| #concat(more_string) | Appends a string to the existing stored string. | +| #string | Returns the stored string. | + +## Positions + +A `StringScanner` object maintains a zero-based <i>byte position</i> +and a zero-based <i>character position</i>. + +Each of these methods explicitly sets positions: + +| Method | Effect | +|--------------------------|-----------------------------------------------------------| +| #reset | Sets both positions to zero (beginning of stored string). | +| #terminate | Sets both positions to the end of the stored string. | +| #pos=(new_byte_position) | Sets byte position; adjusts character position. | + +### Byte Position (Position) + +The byte position (or simply _position_) +is a zero-based index into the bytes in the scanner's stored string; +for a new `StringScanner` object, the byte position is zero. + +When the byte position is: + +* Zero (at the beginning), the target substring is the entire stored string. +* Equal to the size of the stored string (at the end), + the target substring is the empty string `''`. + +To get or set the byte position: + +* \#pos: returns the byte position. +* \#pos=(new_pos): sets the byte position. + +Many methods use the byte position as the basis for finding matches; +many others set, increment, or decrement the byte position: + +```rb +scanner = StringScanner.new('foobar') +scanner.pos # => 0 +scanner.scan(/foo/) # => "foo" # Match found. +scanner.pos # => 3 # Byte position incremented. +scanner.scan(/foo/) # => nil # Match not found. +scanner.pos # => 3 # Byte position not changed. +``` + +Some methods implicitly modify the byte position; +see: + +* [Setting the Target Substring][4]. 
+* [Traversing the Target Substring][5]. + +The values of these methods are derived directly from the values of #pos and #string: + +- \#charpos: the [character position][7]. +- \#rest: the [target substring][3]. +- \#rest_size: `rest.size`. + +### Character Position + +The character position is a zero-based index into the _characters_ +in the stored string; +for a new `StringScanner` object, the character position is zero. + +\Method #charpos returns the character position; +its value may not be reset explicitly. + +Some methods change (increment or reset) the character position; +see: + +* [Setting the Target Substring][4]. +* [Traversing the Target Substring][5]. + +Example (string includes multi-byte characters): + +```rb +scanner = StringScanner.new(ENGLISH_TEXT) # Five 1-byte characters. +scanner.concat(HIRAGANA_TEXT) # Five 3-byte characters +scanner.string # => "Helloã“ã‚“ã«ã¡ã¯" # Twenty bytes in all. +put_situation(scanner) +# Situation: +# pos: 0 +# charpos: 0 +# rest: "Helloã“ã‚“ã«ã¡ã¯" +# rest_size: 20 +scanner.scan(/Hello/) # => "Hello" # Five 1-byte characters. +put_situation(scanner) +# Situation: +# pos: 5 +# charpos: 5 +# rest: "ã“ã‚“ã«ã¡ã¯" +# rest_size: 15 +scanner.getch # => "ã“" # One 3-byte character. +put_situation(scanner) +# Situation: +# pos: 8 +# charpos: 6 +# rest: "ã‚“ã«ã¡ã¯" +# rest_size: 12 +``` + +## Target Substring + +The target substring is the part of the [stored string][1] +that extends from the current [byte position][2] to the end of the stored string; +it is always either: + +- The entire stored string (byte position is zero). +- A trailing substring of the stored string (byte position positive). + +The target substring is returned by method #rest, +and its size is returned by method #rest_size. 
+ +Examples: + +```rb +scanner = StringScanner.new('foobarbaz') +put_situation(scanner) +# Situation: +# pos: 0 +# charpos: 0 +# rest: "foobarbaz" +# rest_size: 9 +scanner.pos = 3 +put_situation(scanner) +# Situation: +# pos: 3 +# charpos: 3 +# rest: "barbaz" +# rest_size: 6 +scanner.pos = 9 +put_situation(scanner) +# Situation: +# pos: 9 +# charpos: 9 +# rest: "" +# rest_size: 0 +``` + +### Setting the Target Substring + +The target substring is set whenever: + +* The [stored string][1] is set (position reset to zero; target substring set to stored string). +* The [byte position][2] is set (target substring adjusted accordingly). + +### Querying the Target Substring + +This table summarizes (details and examples at the links): + +| Method | Returns | +|------------|-----------------------------------| +| #rest | Target substring. | +| #rest_size | Size (bytes) of target substring. | + +### Searching the Target Substring + +A _search_ method examines the target substring, +but does not advance the [positions][11] +or (by implication) shorten the target substring. + +This table summarizes (details and examples at the links): + +| Method | Returns | Sets Match Values? | +|-----------------------|-----------------------------------------------|--------------------| +| #check(pattern) | Matched leading substring or +nil+. | Yes. | +| #check_until(pattern) | Matched substring (anywhere) or +nil+. | Yes. | +| #exist?(pattern) | Matched substring (anywhere) end index. | Yes. | +| #match?(pattern) | Size of matched leading substring or +nil+. | Yes. | +| #peek(size) | Leading substring of given length (bytes). | No. | +| #peek_byte | Integer leading byte or +nil+. | No. | +| #rest | Target substring (from byte position to end). | No. | + +### Traversing the Target Substring + +A _traversal_ method examines the target substring, +and, if successful: + +- Advances the [positions][11]. +- Shortens the target substring. 
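The distinction between a search method and a traversal method can be sketched by applying #check and then #scan with the same pattern. The helper `search_then_traverse` below is hypothetical (it is not part of StringScanner) and exists only to collect the values:

```ruby
require 'strscan'

# Hypothetical helper (not part of StringScanner): applies a search method
# (#check) and then a traversal method (#scan) with the same pattern,
# recording the byte position after each call.
def search_then_traverse(string, pattern)
  scanner = StringScanner.new(string)
  checked = scanner.check(pattern)  # Search: positions are left unchanged.
  pos_after_check = scanner.pos
  scanned = scanner.scan(pattern)   # Traversal: positions advance on success.
  {
    checked: checked,
    pos_after_check: pos_after_check,
    scanned: scanned,
    pos_after_scan: scanner.pos,
    rest: scanner.rest
  }
end

result = search_then_traverse('foobarbaz', /foo/)
result[:checked]         # => "foo"
result[:pos_after_check] # => 0
result[:scanned]         # => "foo"
result[:pos_after_scan]  # => 3
result[:rest]            # => "barbaz"
```

Both calls return the same matched substring; only the traversal shortens the target substring.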
+ + +This table summarizes (details and examples at links): + +| Method | Returns | Sets Match Values? | +|----------------------|------------------------------------------------------|--------------------| +| #get_byte | Leading byte or +nil+. | No. | +| #getch | Leading character or +nil+. | No. | +| #scan(pattern) | Matched leading substring or +nil+. | Yes. | +| #scan_byte | Integer leading byte or +nil+. | No. | +| #scan_until(pattern) | Matched substring (anywhere) or +nil+. | Yes. | +| #skip(pattern) | Matched leading substring size or +nil+. | Yes. | +| #skip_until(pattern) | Position delta to end-of-matched-substring or +nil+. | Yes. | +| #unscan | +self+. | No. | + +## Querying the Scanner + +Each of these methods queries the scanner object +without modifying it (details and examples at links) + +| Method | Returns | +|---------------------|----------------------------------| +| #beginning_of_line? | +true+ or +false+. | +| #charpos | Character position. | +| #eos? | +true+ or +false+. | +| #fixed_anchor? | +true+ or +false+. | +| #inspect | String representation of +self+. | +| #pos | Byte position. | +| #rest | Target substring. | +| #rest_size | Size of target substring. | +| #string | Stored string. | + +## Matching + +`StringScanner` implements pattern matching via Ruby class [Regexp][6], +and its matching behaviors are the same as Ruby's +except for the [fixed-anchor property][10]. + +### Matcher Methods + +Each <i>matcher method</i> takes a single argument `pattern`, +and attempts to find a matching substring in the [target substring][3]. + +| Method | Pattern Type | Matches Target Substring | Success Return | May Update Positions? | +|--------------|-------------------|--------------------------|--------------------|-----------------------| +| #check | Regexp or String. | At beginning. | Matched substring. | No. | +| #check_until | Regexp or String. | Anywhere. | Substring. | No. | +| #match? | Regexp or String. | At beginning. | Match size. | No. 
| +| #exist? | Regexp or String. | Anywhere. | Substring size. | No. | +| #scan | Regexp or String. | At beginning. | Matched substring. | Yes. | +| #scan_until | Regexp or String. | Anywhere. | Substring. | Yes. | +| #skip | Regexp or String. | At beginning. | Match size. | Yes. | +| #skip_until | Regexp or String. | Anywhere. | Substring size. | Yes. | + +<br> + +Which matcher you choose will depend on: + +- Where you want to find a match: + + - Only at the beginning of the target substring: + #check, #match?, #scan, #skip. + - Anywhere in the target substring: + #check_until, #exist?, #scan_until, #skip_until. + +- Whether you want to: + + - Traverse, by advancing the positions: + #scan, #scan_until, #skip, #skip_until. + - Keep the positions unchanged: + #check, #check_until, #match?, #exist?. + +- What you want for the return value: + + - The matched substring: #check, #scan. + - The substring: #check_until, #scan_until. + - The match size: #match?, #skip. + - The substring size: #exist?, #skip_until. + +### Match Values + +The <i>match values</i> in a `StringScanner` object +generally contain the results of the most recent attempted match. + +Each match value may be thought of as: + +* _Clear_: Initially, or after an unsuccessful match attempt: + usually, `false`, `nil`, or `{}`. +* _Set_: After a successful match attempt: + `true`, string, array, or hash. + +Each of these methods clears match values: + +- ::new(string). +- \#reset. +- \#terminate. + +Each of these methods attempts a match based on a pattern, +and either sets match values (if successful) or clears them (if not); + +- \#check(pattern) +- \#check_until(pattern) +- \#exist?(pattern) +- \#match?(pattern) +- \#scan(pattern) +- \#scan_until(pattern) +- \#skip(pattern) +- \#skip_until(pattern) + +#### Basic Match Values + +Basic match values are those not related to captures. 
+ +Each of these methods returns a basic match value: + +| Method | Return After Match | Return After No Match | +|-----------------|----------------------------------------|-----------------------| +| #matched? | +true+. | +false+. | +| #matched_size | Size of matched substring. | +nil+. | +| #matched | Matched substring. | +nil+. | +| #pre_match | Substring preceding matched substring. | +nil+. | +| #post_match | Substring following matched substring. | +nil+. | + +<br> + +See examples below. + +#### Captured Match Values + +Captured match values are those related to [captures][16]. + +Each of these methods returns a captured match value: + +| Method | Return After Match | Return After No Match | +|-----------------|-----------------------------------------|-----------------------| +| #size | Count of captured substrings. | +nil+. | +| #\[\](n) | <tt>n</tt>th captured substring. | +nil+. | +| #captures | Array of all captured substrings. | +nil+. | +| #values_at(*n) | Array of specified captured substrings. | +nil+. | +| #named_captures | Hash of named captures. | <tt>{}</tt>. | + +<br> + +See examples below. 
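As a compact sketch of the captured match values in the table above, the hypothetical helper `captured_values` (not part of StringScanner) attempts one match with #scan and gathers the values in a hash:

```ruby
require 'strscan'

# Hypothetical helper (not part of StringScanner): attempts a match at the
# beginning of the string, then collects the captured match values.
def captured_values(string, pattern)
  scanner = StringScanner.new(string)
  scanner.scan(pattern)
  {
    size: scanner.size,                     # Whole match plus captures, or nil.
    first: scanner[1],                      # First captured substring, or nil.
    captures: scanner.captures,             # Array of captures, or nil.
    named_captures: scanner.named_captures  # Hash of named captures.
  }
end

values = captured_values('Fri Dec 12', /(?<wday>\w+) (?<month>\w+)/)
values[:size]           # => 3
values[:first]          # => "Fri"
values[:captures]       # => ["Fri", "Dec"]
values[:named_captures] # => {"wday"=>"Fri", "month"=>"Dec"}
```

Note that #size counts the whole match as well as the captures, so it is one greater than the length of #captures.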
+ +#### Match Values Examples + +Successful basic match attempt (no captures): + +```rb +scanner = StringScanner.new('foobarbaz') +scanner.exist?(/bar/) +put_match_values(scanner) +# Basic match values: +# matched?: true +# matched_size: 3 +# pre_match: "foo" +# matched : "bar" +# post_match: "baz" +# Captured match values: +# size: 1 +# captures: [] +# named_captures: {} +# values_at: ["bar", nil] +# []: +# [0]: "bar" +# [1]: nil +``` + +Failed basic match attempt (no captures); + +```rb +scanner = StringScanner.new('foobarbaz') +scanner.exist?(/nope/) +match_values_cleared?(scanner) # => true +``` + +Successful unnamed capture match attempt: + +```rb +scanner = StringScanner.new('foobarbazbatbam') +scanner.exist?(/(foo)bar(baz)bat(bam)/) +put_match_values(scanner) +# Basic match values: +# matched?: true +# matched_size: 15 +# pre_match: "" +# matched : "foobarbazbatbam" +# post_match: "" +# Captured match values: +# size: 4 +# captures: ["foo", "baz", "bam"] +# named_captures: {} +# values_at: ["foobarbazbatbam", "foo", "baz", "bam", nil] +# []: +# [0]: "foobarbazbatbam" +# [1]: "foo" +# [2]: "baz" +# [3]: "bam" +# [4]: nil +``` + +Successful named capture match attempt; +same as unnamed above, except for #named_captures: + +```rb +scanner = StringScanner.new('foobarbazbatbam') +scanner.exist?(/(?<x>foo)bar(?<y>baz)bat(?<z>bam)/) +scanner.named_captures # => {"x"=>"foo", "y"=>"baz", "z"=>"bam"} +``` + +Failed unnamed capture match attempt: + +```rb +scanner = StringScanner.new('somestring') +scanner.exist?(/(foo)bar(baz)bat(bam)/) +match_values_cleared?(scanner) # => true +``` + +Failed named capture match attempt; +same as unnamed above, except for #named_captures: + +```rb +scanner = StringScanner.new('somestring') +scanner.exist?(/(?<x>foo)bar(?<y>baz)bat(?<z>bam)/) +match_values_cleared?(scanner) # => false +scanner.named_captures # => {"x"=>nil, "y"=>nil, "z"=>nil} +``` + +## Fixed-Anchor Property + +Pattern matching in `StringScanner` is the same as in 
Ruby's, +except for its fixed-anchor property, +which determines the meaning of `'\A'`: + +* `false` (the default): matches the current byte position. + + ```rb + scanner = StringScanner.new('foobar') + scanner.scan(/\A./) # => "f" + scanner.scan(/\A./) # => "o" + scanner.scan(/\A./) # => "o" + scanner.scan(/\A./) # => "b" + ``` + +* `true`: matches the beginning of the target substring; + never matches unless the byte position is zero: + + ```rb + scanner = StringScanner.new('foobar', fixed_anchor: true) + scanner.scan(/\A./) # => "f" + scanner.scan(/\A./) # => nil + scanner.reset + scanner.scan(/\A./) # => "f" + ``` + +The fixed-anchor property is set when the `StringScanner` object is created, +and may not be modified +(see StringScanner.new); +method #fixed_anchor? returns the setting. + diff --git a/doc/symbol/casecmp.rdoc b/doc/symbol/casecmp.rdoc new file mode 100644 index 0000000000..9c286070b7 --- /dev/null +++ b/doc/symbol/casecmp.rdoc @@ -0,0 +1,27 @@ +Like Symbol#<=>, but case-insensitive; +equivalent to <tt>self.to_s.casecmp(object.to_s)</tt>: + + lower = :abc + upper = :ABC + upper.casecmp(lower) # => 0 + lower.casecmp(lower) # => 0 + lower.casecmp(upper) # => 0 + +Returns nil if +self+ and +object+ have incompatible encodings, +or if +object+ is not a symbol: + + sym = 'äöü'.encode("ISO-8859-1").to_sym + other_sym = 'ÄÖÜ' + sym.casecmp(other_sym) # => nil + :foo.casecmp(2) # => nil + +Unlike Symbol#casecmp?, +case-insensitivity does not work for characters outside of 'A'..'Z' and 'a'..'z': + + lower = :äöü + upper = :ÄÖÜ + upper.casecmp(lower) # => -1 + lower.casecmp(lower) # => 0 + lower.casecmp(upper) # => 1 + +Related: Symbol#casecmp?, String#casecmp. 
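The difference between the two comparison methods described in this file can be seen side by side in a short sketch. The symbols are built from Unicode escapes only so that the snippet does not depend on the source encoding; the variable names are illustrative.

```ruby
lower = "\u00E4\u00F6\u00FC".to_sym   # :äöü
upper = "\u00C4\u00D6\u00DC".to_sym   # :ÄÖÜ

ascii_order   = :ABC.casecmp(:abc)     # ASCII letters are folded
unicode_order = upper.casecmp(lower)   # non-ASCII letters are compared as-is
unicode_equal = upper.casecmp?(lower)  # Unicode case folding is applied
non_symbol    = :foo.casecmp(2)        # the argument is not a symbol
```

Here `ascii_order` is `0`, `unicode_order` is `-1`, `unicode_equal` is `true`, and `non_symbol` is `nil`, matching the results listed in the documentation above.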
diff --git a/doc/symbol/casecmp_p.rdoc b/doc/symbol/casecmp_p.rdoc new file mode 100644 index 0000000000..7102b54289 --- /dev/null +++ b/doc/symbol/casecmp_p.rdoc @@ -0,0 +1,26 @@ +Returns +true+ if +self+ and +object+ are equal after Unicode case folding, +otherwise +false+: + + lower = :abc + upper = :ABC + upper.casecmp?(lower) # => true + lower.casecmp?(lower) # => true + lower.casecmp?(upper) # => true + +Returns nil if +self+ and +object+ have incompatible encodings, +or if +object+ is not a symbol: + + sym = 'äöü'.encode("ISO-8859-1").to_sym + other_sym = 'ÄÖÜ' + sym.casecmp?(other_sym) # => nil + :foo.casecmp?(2) # => nil + +Unlike Symbol#casecmp, works for characters outside of 'A'..'Z' and 'a'..'z': + + lower = :äöü + upper = :ÄÖÜ + upper.casecmp?(lower) # => true + lower.casecmp?(lower) # => true + lower.casecmp?(upper) # => true + +Related: Symbol#casecmp, String#casecmp?. diff --git a/doc/syntax.rdoc b/doc/syntax.rdoc index fe0f98ce4c..a48c83ff15 100644 --- a/doc/syntax.rdoc +++ b/doc/syntax.rdoc @@ -2,6 +2,9 @@ The Ruby syntax is large and is split up into the following sections: +{Code Layout}[rdoc-ref:syntax/layout.rdoc] :: + Breaking code in lines + Literals[rdoc-ref:syntax/literals.rdoc] :: Numbers, Strings, Arrays, Hashes, etc. 
@@ -11,6 +14,9 @@ Assignment[rdoc-ref:syntax/assignment.rdoc] :: {Control Expressions}[rdoc-ref:syntax/control_expressions.rdoc] :: +if+, +unless+, +while+, +until+, +for+, +break+, +next+, +redo+ +{Pattern matching}[rdoc-ref:syntax/pattern_matching.rdoc] :: + Structural pattern matching and variable binding syntax + Methods[rdoc-ref:syntax/methods.rdoc] :: Method and method argument syntax @@ -27,8 +33,13 @@ Precedence[rdoc-ref:syntax/precedence.rdoc] :: Precedence of ruby operators Refinements[rdoc-ref:syntax/refinements.rdoc] :: - Use and behavior of the experimental refinements feature + Use and behavior of the refinements feature Miscellaneous[rdoc-ref:syntax/miscellaneous.rdoc] :: +alias+, +undef+, +BEGIN+, +END+ +Comments[rdoc-ref:syntax/comments.rdoc] :: + Line and block code comments + +Operators[rdoc-ref:syntax/operators.rdoc] :: + Operator method behaviors diff --git a/doc/syntax/assignment.rdoc b/doc/syntax/assignment.rdoc index 7361b7d3bd..3988f82e5f 100644 --- a/doc/syntax/assignment.rdoc +++ b/doc/syntax/assignment.rdoc @@ -8,6 +8,9 @@ example assigns the number five to the local variable +v+: Assignment creates a local variable if the variable was not previously referenced. +An assignment expression result is always the assigned value, including +{assignment methods}[rdoc-ref:@Assignment+Methods]. + == Local Variable Names A local variable name must start with a lowercase US-ASCII letter or a @@ -92,8 +95,9 @@ Now any reference to +big_calculation+ is considered a local variable and will be cached. To call the method, use <code>self.big_calculation</code>. You can force a method call by using empty argument parentheses as shown above -or by using an explicit receiver like <code>self.</code>. Using an explicit -receiver may raise a NameError if the method's visibility is not public. +or by using an explicit receiver like <code>self</code>. 
Using an explicit +receiver may raise a NameError if the method's visibility is not public or the +receiver is the literal <code>self</code>. Another commonly confusing case is when using a modifier +if+: @@ -103,7 +107,7 @@ Rather than printing "true" you receive a NameError, "undefined local variable or method `a'". Since ruby parses the bare +a+ left of the +if+ first and has not yet seen an assignment to +a+ it assumes you wish to call a method. Ruby then sees the assignment to +a+ and will assume you are referencing a local -method. +variable. The confusion comes from the out-of-order execution of the expression. First the local variable is assigned-to then you attempt to call a nonexistent @@ -158,9 +162,7 @@ Here is an example of instance variable usage: p object1.value # prints "some value" p object2.value # prints "other value" -An uninitialized instance variable has a value of +nil+. If you run Ruby with -warnings enabled, you will get a warning when accessing an uninitialized -instance variable. +An uninitialized instance variable has a value of +nil+. The +value+ method has access to the value set by the +initialize+ method, but only for the same object. @@ -277,7 +279,7 @@ An uninitialized global variable has a value of +nil+. Ruby has some special globals that behave differently depending on context such as the regular expression match variables or that have a side-effect when -assigned to. See the {global variables documentation}[rdoc-ref:globals.rdoc] +assigned to. See the {global variables documentation}[rdoc-ref:language/globals.md] for details. == Assignment Methods @@ -341,6 +343,9 @@ This prints: local_variables: @value: 42 +Note that the value returned by an assignment method is ignored whatever, +since an assignment expression result is always the assignment value. + == Abbreviated Assignment You can mix several of the operators and assignment. To add 1 to an object @@ -396,6 +401,10 @@ assigning. 
This is similar to multiple assignment: p a # prints [1, 2, 3] + b = *1 + + p b # prints [1] + You can splat anywhere in the right-hand side of the assignment: a = 1, *[2, 3] diff --git a/doc/syntax/calling_methods.rdoc b/doc/syntax/calling_methods.rdoc index b86d60ad88..76babcc3dc 100644 --- a/doc/syntax/calling_methods.rdoc +++ b/doc/syntax/calling_methods.rdoc @@ -30,7 +30,41 @@ NoMethodError. You may also use <code>::</code> to designate a receiver, but this is rarely used due to the potential for confusion with <code>::</code> for namespaces. -=== Safe navigation operator +=== Chaining Method Calls + +You can "chain" method calls by immediately following one method call with another. + +This example chains methods Array#append and Array#compact: + + a = [:foo, 'bar', 2] + a1 = [:baz, nil, :bam, nil] + a2 = a.append(*a1).compact + a2 # => [:foo, "bar", 2, :baz, :bam] + +Details: + +- First method <tt>append</tt> appends (separately) each element of <tt>a1</tt> + to <tt>a</tt> itself (no copy is made), and returns the modified <tt>a</tt>: + [:foo, "bar", 2, :baz, nil, :bam, nil] +- Chained method <tt>compact</tt> creates a copy of that return value, + removes its <tt>nil</tt>-valued entries, and returns + [:foo, "bar", 2, :baz, :bam] + +You can chain methods that are in different classes. +This example chains methods Hash#to_a and Array#reverse: + + h = {foo: 0, bar: 1, baz: 2} + h.to_a.reverse # => [[:baz, 2], [:bar, 1], [:foo, 0]] + +Details: + +- First method Hash#to_a converts <tt>h</tt> to an \Array, and returns + [[:foo, 0], [:bar, 1], [:baz, 2]] +- Chained method Array#reverse creates a copy of that return value, + reverses it, and returns + [[:baz, 2], [:bar, 1], [:foo, 0]] + +=== Safe Navigation Operator <code>&.</code>, called "safe navigation operator", allows to skip method call when receiver is +nil+.
It returns +nil+ and doesn't evaluate method's arguments @@ -98,7 +132,7 @@ to: If the method definition has a <code>*argument</code> extra positional arguments will be assigned to +argument+ in the method as an Array. -If the method definition doesn't include keyword arguments the keyword or +If the method definition doesn't include keyword arguments, the keyword or hash-type arguments are assigned as a single hash to the last argument: def my_method(options) @@ -172,9 +206,28 @@ like positional arguments: my_method(positional1, keyword1: value1, keyword2: value2) Any keyword arguments not given will use the default value from the method -definition. If a keyword argument is given that the method did not list an +definition. If a keyword argument is given that the method did not list, +and the method definition does not accept arbitrary keyword arguments, an ArgumentError will be raised. +Keyword argument value can be omitted, meaning the value will be fetched +from the context by the name of the key + + keyword1 = 'some value' + my_method(positional1, keyword1:) + # ...is the same as + my_method(positional1, keyword1: keyword1) + +Be aware that when method parenthesis are omitted, too, the parsing order might +be unexpected: + + my_method positional1, keyword1: + + some_other_expression + + # ...is actually parsed as + my_method(positional1, keyword1: some_other_expression) + === Block Argument The block argument sends a closure from the calling scope to the method. @@ -238,16 +291,16 @@ override local arguments outside the block in the caller's scope: This prints: hello main this is block - place is world + place is: world So the +place+ variable in the block is not the same +place+ variable as outside the block. 
Removing <code>; place</code> from the block arguments gives this result: hello main this is block - place is block + place is: block -=== Array to Arguments Conversion +=== Unpacking Positional Arguments Given the following method: @@ -269,14 +322,58 @@ Both are equivalent to: my_method(1, 2, 3) -If the method accepts keyword arguments, the splat operator will convert a -hash at the end of the array into keyword arguments: +The <code>*</code> unpacking operator can be applied to any object, not only +arrays. If the object responds to a <code>#to_a</code> method, this method +is called, and is expected to return an Array, and elements of this array are passed +as separate positional arguments: - def my_method(a, b, c: 3) + class Name + def initialize(name) + @name = name + end + + def to_a = @name.split(' ') end - arguments = [1, 2, { c: 4 }] - my_method(*arguments) + name = Name.new('Jane Doe') + p(*name) + # prints separate values: + # Jane + # Doe + +If the object doesn't have a <code>#to_a</code> method, the object itself is passed +as one argument: + + class Name + def initialize(name) + @name = name + end + end + + name = Name.new('Jane Doe') + p(*name) + # Prints the object itself: + # #<Name:0x00007f9d07bca650 @name="Jane Doe"> + +This allows to handle one or many arguments polymorphically. Note also that <tt>*nil</tt> +is unpacked to an empty list of arguments, so conditional unpacking is possible: + + my_method(*(some_arguments if some_condition?)) + +If <code>#to_a</code> method exists and does not return an Array, it would be an +error on unpacking: + + class Name + def initialize(name) + @name = name + end + + def to_a = @name + end + + name = Name.new('Jane Doe') + p(*name) + # can't convert Name to Array (Name#to_a gives String) (TypeError) You may also use the <code>**</code> (described next) to convert a Hash into keyword arguments. 
@@ -285,16 +382,21 @@ If the number of objects in the Array do not match the number of arguments for the method, an ArgumentError will be raised. If the splat operator comes first in the call, parentheses must be used to -avoid a warning. +avoid an ambiguity of interpretation as an unpacking operator or multiplication +operator. In this case, Ruby issues a warning in verbose mode: -=== Hash to Keyword Arguments Conversion + my_method *arguments # warning: '*' interpreted as argument prefix + my_method(*arguments) # no warning + +=== Unpacking Keyword Arguments Given the following method: def my_method(first: 1, second: 2, third: 3) end -You can turn a Hash into keyword arguments with the <code>**</code> operator: +You can turn a Hash into keyword arguments with the <code>**</code> +(keyword splat) operator: arguments = { first: 3, second: 4, third: 5 } my_method(**arguments) @@ -308,8 +410,38 @@ Both are equivalent to: my_method(first: 3, second: 4, third: 5) -If the method definition uses <code>**</code> to gather arbitrary keyword -arguments, they will not be gathered by <code>*</code>: +The <code>**</code> unpacking operator can be applied to any object, not only +hashes. If the object responds to a <code>#to_hash</code> method, this method +is called, and is expected to return a Hash, and elements of this hash are passed +as keyword arguments: + + class Name + def initialize(name) + @name = name + end + + def to_hash = {first: @name.split(' ').first, last: @name.split(' ').last} + end + + name = Name.new('Jane Doe') + p(**name) + # Prints: {first: "Jane", last: "Doe"} + +Unlike the <code>*</code> operator, <code>**</code> raises an error when used on an +object that doesn't respond to <code>#to_hash</code>. The one exception is ++nil+, which doesn't explicitly define this method, but is still allowed to +be used in <code>**</code> unpacking, not adding any keyword arguments. 
+ +Again, this allows for conditional unpacking: + + my_method(some: params, **(some_extra_params if pass_extra_params?)) + +Like <code>*</code> operator, <code>**</code> raises an error when the object responds +to <code>#to_hash</code>, but it doesn't return a Hash. + +If the method definition uses the keyword splat operator to +gather arbitrary keyword arguments, they will not be gathered +by <code>*</code>: def my_method(*a, **kw) p arguments: a, keywords: kw @@ -319,10 +451,7 @@ arguments, they will not be gathered by <code>*</code>: Prints: - {:arguments=>[1, 2, {"3"=>4}], :keywords=>{:five=>6}} - -Unlike the splat operator described above, the <code>**</code> operator has no -commonly recognized name. + {:arguments=>[1, 2], :keywords=>{'3'=>4, :five=>6}} === Proc to Block Conversion @@ -333,17 +462,17 @@ Given a method that use a block: end You can convert a proc or lambda to a block argument with the <code>&</code> -operator: +(block conversion) operator: argument = proc { |a| puts "#{a.inspect} was yielded" } my_method(&argument) -If the splat operator comes first in the call, parenthesis must be used to -avoid a warning. +If the block conversion operator comes first in the call, parenthesis must be +used to avoid a warning: -Unlike the splat operator described above, the <code>&</code> operator has no -commonly recognized name. + my_method &argument # warning + my_method(&argument) # no warning == Method Lookup diff --git a/doc/syntax/comments.rdoc b/doc/syntax/comments.rdoc new file mode 100644 index 0000000000..cb6829a984 --- /dev/null +++ b/doc/syntax/comments.rdoc @@ -0,0 +1,253 @@ += Code Comments + +Ruby has two types of comments: inline and block. + +Inline comments start with the <code>#</code> character and continue until the +end of the line: + + # On a separate line + class Foo # or at the end of the line + # can be indented + def bar + end + end + +Block comments start with <code>=begin</code> and end with <code>=end</code>. 
+Each should start on a separate line. + + =begin + This is + commented out + =end + + class Foo + end + + =begin some_tag + this works, too + =end + +<code>=begin</code> and <code>=end</code> can not be indented, so this is a +syntax error: + + class Foo + =begin + Will not work + =end + end + +== Magic Comments + +While comments are typically ignored by Ruby, special "magic comments" contain +directives that affect how the code is interpreted. + +Top-level magic comments must appear in the first comment section of a file. + +NOTE: Magic comments affect only the file in which they appear; +other files are unaffected. + + # frozen_string_literal: true + + var = 'hello' + var.frozen? # => true + +=== Alternative syntax + +Magic comments may consist of a single directive (as in the example above). +Alternatively, multiple directives may appear on the same line if separated by ";" +and wrapped between "-*-" (see Emacs' {file variables}[https://www.gnu.org/software/emacs/manual/html_node/emacs/Specifying-File-Variables.html]). + + # emacs-compatible; -*- coding: big5; mode: ruby; frozen_string_literal: true -*- + + p 'hello'.frozen? # => true + p 'hello'.encoding # => #<Encoding:Big5> + +=== +encoding+ Directive + +Indicates which string encoding should be used for string literals, +regexp literals and <code>__ENCODING__</code>: + + # encoding: big5 + + ''.encoding # => #<Encoding:Big5> + +Default encoding is UTF-8. + +Top-level magic comments must start on the first line, or on the second line if +the first line looks like <tt>#! shebang line</tt>. + +The word "coding" may be used instead of "encoding". + +=== +frozen_string_literal+ Directive + +Indicates that string literals should be allocated once at parse time and frozen. + + # frozen_string_literal: true + + 3.times do + p 'hello'.object_id # => prints same number + end + p 'world'.frozen? # => true + +The default is false; this can be changed with <code>--enable=frozen-string-literal</code>. 
+Without the directive, or with <code># frozen_string_literal: false</code>, +the example above would print 3 different numbers and "false". + +Starting in Ruby 3.0, string literals that are dynamic are not frozen nor reused: + + # frozen_string_literal: true + + p "Addition: #{2 + 2}".frozen? # => false + +It must appear in the first comment section of a file. + +=== +warn_indent+ Directive + +This directive can turn on detection of bad indentation for statements that follow it: + + def foo + end # => no warning + + # warn_indent: true + def bar + end # => warning: mismatched indentations at 'end' with 'def' at 6 + +Another way to get these warnings to show is by running Ruby with warnings (<code>ruby -w</code>). Using a directive to set this false will prevent these warnings to show. + +=== +shareable_constant_value+ Directive + +Note: This directive is experimental in Ruby 3.0 and may change in future releases. + +This special directive helps to create constants that hold only immutable objects, or {Ractor-shareable}[rdoc-ref:Ractor@Shareable+and+unshareable+objects] constants. + +The directive can specify special treatment for values assigned to constants: + +* +none+: (default) +* +literal+: literals are implicitly frozen, others must be Ractor-shareable +* +experimental_everything+: all made shareable +* +experimental_copy+: copy deeply and make it shareable + +==== Mode +none+ (default) + +No special treatment in this mode (as in Ruby 2.x): no automatic freezing and no checks. + +It has always been a good idea to deep-freeze constants; Ractor makes this +an even better idea as only the main ractor can access non-shareable constants: + + # shareable_constant_value: none + A = {foo: []} + A.frozen? # => false + Ractor.new { puts A } # => can not access non-shareable objects by non-main Ractor. 
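The deep freezing recommended for mode +none+ can also be done by hand. The sketch below assumes Ruby 3.0 or later and uses `Ractor.make_shareable` and `Ractor.shareable?`; it uses local variables rather than constants only so that it can be run anywhere, and the names are illustrative.

```ruby
plain  = {foo: []}
shared = Ractor.make_shareable({foo: []})

plain_shareable  = Ractor.shareable?(plain)    # a mutable hash is not shareable
shared_shareable = Ractor.shareable?(shared)   # deep-frozen, hence shareable
deeply_frozen    = shared.frozen? && shared[:foo].frozen?
```

With these definitions `plain_shareable` is `false`, while `shared_shareable` and `deeply_frozen` are both `true`.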
+ +==== Mode +literal+ + +In "literal" mode, constants assigned to literals will be deeply-frozen: + + # shareable_constant_value: literal + X = [{foo: []}] # => same as [{foo: [].freeze}.freeze].freeze + +Other values must be shareable: + + # shareable_constant_value: literal + X = Object.new # => cannot assign unshareable object to X + +Note that only literals directly assigned to constants, or recursively held in such literals will be frozen: + + # shareable_constant_value: literal + var = [{foo: []}] + var.frozen? # => false (assignment was made to local variable) + X = var # => cannot assign unshareable object to X + + X = Set[1, 2, {foo: []}].freeze # => cannot assign unshareable object to X + # (`Set[...]` is not a literal and + # `{foo: []}` is an argument to `Set.[]`) + +The method Module#const_set is not affected. + +==== Mode +experimental_everything+ + +In this mode, all values assigned to constants are made shareable. + + # shareable_constant_value: experimental_everything + FOO = Set[1, 2, {foo: []}] + # same as FOO = Ractor.make_shareable(...) + # OR same as `FOO = Set[1, 2, {foo: [].freeze}.freeze].freeze` + + var = [{foo: []}] + var.frozen? # => false (assignment was made to local variable) + X = var # => calls `Ractor.make_shareable(var)` + var.frozen? # => true + +This mode is "experimental", because it might be error prone, for +example by deep-freezing the constants of an external resource which +could cause errors: + + # shareable_constant_value: experimental_everything + FOO = SomeGem::Something::FOO + # => deep freezes the gem's constant! + +This will be revisited before Ruby 3.1 to either allow `everything` +or to instead remove this mode. + +The method Module#const_set is not affected. + +==== Mode +experimental_copy+ + +In this mode, all values assigned to constants are deeply copied and +made shareable. It is safer mode than +experimental_everything+. + + # shareable_constant_value: experimental_copy + var = [{foo: []}] + var.frozen? 
# => false (assignment was made to local variable) + X = var # => calls `Ractor.make_shareable(var, copy: true)` + var.frozen? # => false + Ractor.shareable?(X) #=> true + var.object_id == X.object_id #=> false + +This mode is "experimental" and has not been discussed thoroughly. +This will be revisited before Ruby 3.1 to either allow `copy` +or to instead remove this mode. + +The method Module#const_set is not affected. + +==== Scope + +This directive can be used multiple times in the same file: + + # shareable_constant_value: none + A = {foo: []} + A.frozen? # => false + Ractor.new { puts A } # => can not access non-shareable objects by non-main Ractor. + + # shareable_constant_value: literal + B = {foo: []} + B.frozen? # => true + B[:foo].frozen? # => true + + C = [Object.new] # => cannot assign unshareable object to C (Ractor::IsolationError) + + D = [Object.new.freeze] + D.frozen? # => true + + # shareable_constant_value: experimental_everything + E = Set[1, 2, Object.new] + E.frozen? # => true + E.all?(&:frozen?) # => true + +The directive affects only subsequent constants and only for the current scope: + + module Mod + # shareable_constant_value: literal + A = [1, 2, 3] + module Sub + B = [4, 5] + end + end + + C = [4, 5] + + module Mod + D = [6] + end + p Mod::A.frozen?, Mod::Sub::B.frozen? # => true, true + p C.frozen?, Mod::D.frozen? # => false, false diff --git a/doc/syntax/control_expressions.rdoc b/doc/syntax/control_expressions.rdoc index 65f7b431e3..3de6cd293f 100644 --- a/doc/syntax/control_expressions.rdoc +++ b/doc/syntax/control_expressions.rdoc @@ -144,7 +144,7 @@ expression. == Modifier +if+ and +unless+ +if+ and +unless+ can also be used to modify an expression. When used as a -modifier the left-hand side is the "then" expression and the right-hand side +modifier the left-hand side is the "then" statement and the right-hand side is the "test" expression: a = 0 @@ -164,7 +164,7 @@ This will print 1. This will print 0. 
While the modifier and standard versions have both a "test" expression and a -"then" expression, they are not exact transformations of each other due to +"then" statement, they are not exact transformations of each other due to parse order. Here is an example that shows the difference: p a if a = 0.zero? @@ -189,7 +189,7 @@ The same is true for +unless+. The +case+ expression can be used in two ways. The most common way is to compare an object against multiple patterns. The -patterns are matched using the +===+ method which is aliased to +==+ on +patterns are matched using the <tt>===</tt> method which is aliased to <tt>==</tt> on Object. Other classes must override it to give meaningful behavior. See Module#=== and Regexp#=== for examples. @@ -232,7 +232,7 @@ You may use +then+ after the +when+ condition. This is most frequently used to place the body of the +when+ on a single line. case a - when 1, 2 then puts "a is one or two + when 1, 2 then puts "a is one or two" when 3 then puts "a is three" else puts "I don't know what a is" end @@ -255,6 +255,20 @@ Again, the +then+ and +else+ are optional. The result value of a +case+ expression is the last value executed in the expression. +Since Ruby 2.7, +case+ expressions also provide a more powerful +pattern matching feature via the +in+ keyword: + + case {a: 1, b: 2, c: 3} + in a: Integer => m + "matched: #{m}" + else + "not matched" + end + # => "matched: 1" + +The pattern matching syntax is described on +{its own page}[rdoc-ref:syntax/pattern_matching.rdoc]. + == +while+ Loop The +while+ loop executes while a condition is true: @@ -439,11 +453,69 @@ longer true, now you will receive a SyntaxError when you use +retry+ outside of a +rescue+ block. See {Exceptions}[rdoc-ref:syntax/exceptions.rdoc] for proper usage of +retry+. +== Modifier Statements + +Ruby's grammar differentiates between statements and expressions. 
All +expressions are statements (an expression is a type of statement), but +not all statements are expressions. Some parts of the grammar accept +expressions and not other types of statements, which causes code that +looks similar to be parsed differently. + +For example, when not used as a modifier, +if+, +else+, +while+, +until+, +and +begin+ are expressions (and also statements). However, when +used as a modifier, +if+, +else+, +while+, +until+ and +rescue+ +are statements but not expressions. + + if true; 1 end # expression (and therefore statement) + 1 if true # statement (not expression) + +Statements that are not expressions cannot be used in contexts where an +expression is expected, such as method arguments. + + puts( 1 if true ) #=> SyntaxError + +You can wrap a statement in parentheses to create an expression. + + puts((1 if true)) #=> 1 + +If you put a space between the method name and opening parenthesis, you +do not need two sets of parentheses. + + puts (1 if true) #=> 1, because of optional parentheses for method + +This is because this is parsed similar to a method call without +parentheses. It is equivalent to the following code, without the creation +of a local variable: + + x = (1 if true) + p x + +In a modifier statement, the left-hand side must be a statement and the +right-hand side must be an expression. + +So in <code>a if b rescue c</code>, because <code>b rescue c</code> is a +statement that is not an expression, and therefore is not allowed as the +right-hand side of the +if+ modifier statement, the code is necessarily +parsed as <code>(a if b) rescue c</code>. + +This interacts with operator precedence in such a way that: + + stmt if v = expr rescue x + stmt if v = expr unless x + +are parsed as: + + stmt if v = (expr rescue x) + (stmt if v = expr) unless x + +This is because modifier +rescue+ has higher precedence than <code>=</code>, +and modifier +if+ has lower precedence than <code>=</code>. 
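The parsing rules for modifier statements can be checked with a short sketch. The variable names are illustrative, and `Integer('zz')` is used only as a convenient expression that raises.

```ruby
# A parenthesized modifier statement is an expression, so it can be assigned.
taken     = (1 if true)
not_taken = (1 if false)

# `v = expr rescue x` is parsed as `v = (expr rescue x)`,
# so the fallback value is what gets assigned.
v = Integer('zz') rescue :fallback
```

Here `taken` is `1`, `not_taken` is `nil` (the value of an +if+ whose condition is false), and `v` is `:fallback`.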
+ == Flip-Flop -The flip-flop is a rarely seen conditional expression. It's primary use is -for processing text from ruby one-line programs used with <code>ruby -n</code> -or <code>ruby -p</code>. +The flip-flop is a slightly special conditional expression. One of its +typical uses is processing text from ruby one-line programs used with +<code>ruby -n</code> or <code>ruby -p</code>. The form of the flip-flop is an expression that indicates when the flip-flop turns on, <code>..</code> (or <code>...</code>), then an expression @@ -452,7 +524,6 @@ will continue to evaluate to +true+, and +false+ when off. Here is an example: - selected = [] 0.upto 10 do |value| @@ -461,15 +532,16 @@ Here is an example: p selected # prints [2, 3, 4, 5, 6, 7, 8] -In the above example, the on condition is <code>n==2</code>. The flip-flop -is initially off (false) for 0 and 1, but becomes on (true) for 2 and remains -on through 8. After 8 it turns off and remains off for 9 and 10. +In the above example, the `on' condition is <code>n==2</code>. The flip-flop +is initially `off' (false) for 0 and 1, but becomes `on' (true) for 2 and +remains `on' through 8. After 8 it turns off and remains `off' for 9 and 10. -The flip-flop must be used inside a conditional such as +if+, +while+, -+unless+, +until+ etc. including the modifier forms. +The flip-flop must be used inside a conditional such as <code>!</code>, +<code>? :</code>, +not+, +if+, +while+, +unless+, +until+ etc. including the +modifier forms. -When you use an inclusive range (<code>..</code>), the off condition is -evaluated when the on condition changes: +When you use an inclusive range (<code>..</code>), the `off' condition is +evaluated when the `on' condition changes: selected = [] @@ -483,7 +555,7 @@ Here, both sides of the flip-flop are evaluated so the flip-flop turns on and off only when +value+ equals 2. Since the flip-flop turned on in the iteration it returns true. 
-When you use an exclusive range (<code>...</code>), the off condition is +When you use an exclusive range (<code>...</code>), the `off' condition is evaluated on the following iteration: selected = [] @@ -495,5 +567,73 @@ evaluated on the following iteration: p selected # prints [2, 3, 4, 5] Here, the flip-flop turns on when +value+ equals 2, but doesn't turn off on the -same iteration. The off condition isn't evaluated until the following +same iteration. The `off' condition isn't evaluated until the following iteration and +value+ will never be two again. + +== throw/catch + ++throw+ and +catch+ are used to implement non-local control flow in Ruby. They +operate similarly to exceptions, allowing control to pass directly from the +place where +throw+ is called to the place where the matching +catch+ is +called. The main difference between +throw+/+catch+ and the use of exceptions +is that +throw+/+catch+ are designed for expected non-local control flow, +while exceptions are designed for exceptional control flow situations, such +as handling unexpected errors. + +When using +throw+, you provide 1-2 arguments. The first argument is the +value for the matching +catch+. The second argument is optional (defaults to ++nil+), and will be the value that +catch+ returns if there is a matching ++throw+ inside the +catch+ block. If no matching +throw+ method is called +inside a +catch+ block, the +catch+ method returns the return value of the +block passed to it. + + def a(n) + throw :d, :a if n == 0 + b(n) + end + + def b(n) + throw :d, :b if n == 1 + c(n) + end + + def c(n) + throw :d if n == 2 + end + + 4.times.map do |i| + catch(:d) do + a(i) + :default + end + end + # => [:a, :b, nil, :default] + +If the first argument you pass to +throw+ is not handled by a matching ++catch+, an UncaughtThrowError exception will be raised. 
This is because ++throw+/+catch+ should only be used for expected control flow changes, so +using a value that is not already expected is an error. + ++throw+/+catch+ are implemented as Kernel methods (Kernel#throw and +Kernel#catch), not as keywords. So they are not usable directly if you are +in a BasicObject context. You can use Kernel.throw and Kernel.catch in +this case: + + BasicObject.new.instance_exec do + def a + b + end + + def b + c + end + + def c + ::Kernel.throw :d, :e + end + + result = ::Kernel.catch(:d) do + a + end + result # => :e + end diff --git a/doc/syntax/exceptions.rdoc b/doc/syntax/exceptions.rdoc index a2e75616fb..cdf9d367a7 100644 --- a/doc/syntax/exceptions.rdoc +++ b/doc/syntax/exceptions.rdoc @@ -17,7 +17,14 @@ wish to limit the scope of rescued exceptions: # ... end -The same is true for a +class+ or +module+. +The same is true for a +class+, +module+, and +block+: + + [0, 1, 2].map do |i| + 10 / i + rescue ZeroDivisionError + nil + end + #=> [nil, 10, 5] You can assign the exception to a local variable by using <tt>=> variable_name</tt> at the end of the +rescue+ line: @@ -79,7 +86,7 @@ To always run some code whether an exception was raised or not, use +ensure+: rescue # ... ensure - # this always runs + # this always runs BUT does not implicitly return the last evaluated statement. end You may also run some code when an exception is not raised: @@ -89,7 +96,11 @@ You may also run some code when an exception is not raised: rescue # ... else - # this runs only when no exception was raised + # this runs only when no exception was raised AND return the last evaluated statement ensure - # ... + # this always runs. + # It is evaluated after the evaluation of either the `rescue` or the `else` block. + # It will not return implicitly. end + +NB : Without explicit +return+ in the +ensure+ block, +begin+/+end+ block will return the last evaluated statement before entering in the +ensure+ block. 
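The clause order and return value described in the exceptions.rdoc hunk above can be sketched like this. It is an illustrative example only; `classify_number` and the `log` array are hypothetical names introduced to make the evaluation order visible.

```ruby
# Returns the value of the begin/end expression together with a log
# of which clauses ran.
def classify_number(text)
  log = []
  result = begin
    Integer(text)          # raises ArgumentError for non-numeric text
  rescue ArgumentError
    log << :rescue
    :invalid               # value of the expression when rescued
  else
    log << :else
    :valid                 # value of the expression when nothing was raised
  ensure
    log << :ensure         # always runs; its value is discarded
  end
  [result, log]
end

p classify_number("42")  # prints [:valid, [:else, :ensure]]
p classify_number("abc") # prints [:invalid, [:rescue, :ensure]]
```

Note that when an +else+ clause is present, the value of the main body (`Integer(text)` here) is not the value of the whole expression; the value of the +else+ clause is.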
diff --git a/doc/keywords.rdoc b/doc/syntax/keywords.rdoc index 98bbd5e864..7c368205ef 100644 --- a/doc/keywords.rdoc +++ b/doc/syntax/keywords.rdoc @@ -1,4 +1,4 @@ -== Keywords += Keywords The following keywords are used by Ruby. @@ -76,12 +76,14 @@ for:: expressions}[rdoc-ref:syntax/control_expressions.rdoc] if:: - Used for +if+ and modifier +if+ expressions. See {control + Used for +if+ and modifier +if+ statements. See {control expressions}[rdoc-ref:syntax/control_expressions.rdoc] in:: Used to separate the iterable object and iterator variable in a +for+ loop. See {control expressions}[rdoc-ref:syntax/control_expressions.rdoc] + It also serves as a pattern in a +case+ expression. + See {pattern matching}[rdoc-ref:syntax/pattern_matching.rdoc] module:: Creates or opens a module. See {modules and classes @@ -115,7 +117,9 @@ retry:: handling}[rdoc-ref:syntax/exceptions.rdoc] return:: - Exits a method. See {methods}[rdoc-ref:syntax/methods.rdoc] + Exits a method. See {methods}[rdoc-ref:syntax/methods.rdoc]. + If met in top-level scope, immediately stops interpretation of + the current file. self:: The object the current method is attached to. See @@ -137,7 +141,7 @@ undef:: See {modules and classes}[rdoc-ref:syntax/modules_and_classes.rdoc] unless:: - Used for +unless+ and modifier +unless+ expressions. See {control + Used for +unless+ and modifier +unless+ statements. 
See {control expressions}[rdoc-ref:syntax/control_expressions.rdoc] until:: diff --git a/doc/syntax/layout.rdoc b/doc/syntax/layout.rdoc new file mode 100644 index 0000000000..f07447587b --- /dev/null +++ b/doc/syntax/layout.rdoc @@ -0,0 +1,118 @@ += Code Layout + +Expressions in Ruby are separated by line breaks: + + x = 1 + y = 2 + z = x + y + +Line breaks are also used as logical separators between the headers of some control structures and their bodies: + + if z > 3 # line break ends the condition and starts the body + puts "more" + end + + while x < 3 # line break ends the condition and starts the body + x += 1 + end + +<tt>;</tt> can be used as an expression separator instead of a line break: + + x = 1; y = 2; z = x + y + if z > 3; puts "more"; end + +Traditionally, separating expressions with <tt>;</tt> is used only in short scripts and experiments. + +In some control structures, there is an optional keyword that can be used instead of a line break to separate their elements: + + # if, unless, elsif and case ... when: 'then' is an optional separator: + + if z > 3 then puts "more" end + + case x + when Numeric then "number" + when String then "string" + else "object" + end + + # while and until: 'do' is an optional separator + while x < 3 do x += 1 end + +Also, line breaks can be skipped in some places where this doesn't create any ambiguity. Note in the example above: no line break is needed before +end+, just as no line break is needed after +else+. + +== Breaking expressions in lines + +One expression might be split into several lines when each line can be unambiguously identified as "incomplete" without the next one. 
+ +These work: + + x = # incomplete without something after = + 1 + # incomplete without something after + + 2 + + File.read "test.txt", # incomplete without something after , + encoding: "utf-8" + +These would not: + + # unintended interpretation: + x = 1 # already complete expression + + 2 # interpreted as a separate +2 + + # syntax error: + File.read "test.txt" # already complete expression + , encoding: "utf-8" # attempt to parse as a new expression, SyntaxError + +The exceptions to the rule are lines starting with <tt>.</tt> ("leading dot" style of method calls) or logical operators <tt>&&</tt>/<tt>||</tt> and <tt>and</tt>/<tt>or</tt>: + + # OK, interpreted as a chain of calls + File.read('test.txt') + .strip("\n") + .split("\t") + .sort + + # OK, interpreted as a chain of logical operators: + File.empty?('test.txt') + || File.size('test.txt') < 10 + || File.read('test.txt').strip.empty? + +If the expression is broken into multiple lines in any of the ways described above, comments between separate lines are allowed: + + sum = base_salary + + # see "yearly bonuses section" + yearly_bonus(year) + + # per-employee coefficient is described + # in another module + personal_coeff(employee) + + # We want to short-circuit on empty files + File.empty?('test.txt') + # Or almost empty ones + || File.size('test.txt') < 10 + # Otherwise we check if it is full of spaces + || File.read('test.txt').strip.empty? 
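A small runnable sketch of the trailing-operator and leading-dot rules from the layout.rdoc hunk above. The method names are hypothetical and exist only for illustration; the comment placed between leading-dot lines assumes Ruby 2.7 or later.

```ruby
# A trailing operator leaves the line incomplete, so the expression
# continues on the next line.
def trailing_operator
  x = 1 +
    2
  x
end

# Here the first line is already complete, so "+ 2" is parsed as a
# separate expression (unary plus) and x stays 1.
def leading_operator
  x = 1
  + 2
  x
end

# Lines starting with a dot continue the method chain.
def sorted_fields(text)
  text
    .strip
    # a comment between the chained lines is allowed
    .split(",")
    .sort
end

p trailing_operator          # prints 3
p leading_operator           # prints 1
p sorted_fields(" b,a,c \n") # prints ["a", "b", "c"]
```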
+ +Finally, the code can explicitly tell Ruby that the expression is continued on the next line with <tt>\\</tt>: + + # Unusual, but works + File.read "test.txt" \ + , encoding: "utf-8" + + # More regular usage (joins the strings on parsing instead + # of concatenating them in runtime, as + would do): + TEXT = "One pretty long line" \ + "one more long line" \ + "one other line of the text" + +The <tt>\\</tt> works as a parse time line break escape, so with it, comments can not be inserted between the lines: + + TEXT = "line 1" \ + # here would be line 2: + "line 2" + + # This is interpreted as if there was no line break where \ is, + # i.e. the same as + TEXT = "line 1" # here would be line 2: + "line 2" + + puts TEXT #=> "line 1" diff --git a/doc/syntax/literals.rdoc b/doc/syntax/literals.rdoc index 0cde9447c5..87a891bf2d 100644 --- a/doc/syntax/literals.rdoc +++ b/doc/syntax/literals.rdoc @@ -2,17 +2,33 @@ Literals create objects you can use in your program. Literals include: -* Booleans and nil -* Numbers -* Strings -* Symbols -* Arrays -* Hashes -* Ranges -* Regular Expressions -* Procs - -== Booleans and nil +* {Boolean and Nil Literals}[#label-Boolean+and+Nil+Literals] +* {Numeric Literals}[#label-Numeric+Literals] + + * {Integer Literals}[#label-Integer+Literals] + * {Float Literals}[#label-Float+Literals] + * {Rational Literals}[#label-Rational+Literals] + * {Complex Literals}[#label-Complex+Literals] + +* {String Literals}[#label-String+Literals] +* {Here Document Literals}[#label-Here+Document+Literals] +* {Symbol Literals}[#label-Symbol+Literals] +* {Array Literals}[#label-Array+Literals] +* {Hash Literals}[#label-Hash+Literals] +* {Range Literals}[#label-Range+Literals] +* {Regexp Literals}[#label-Regexp+Literals] +* {Lambda Proc Literals}[#label-Lambda+Proc+Literals] +* {Percent Literals}[#label-Percent+Literals] + + * {%q: Non-Interpolable String Literals}[#label-25q-3A+Non-Interpolable+String+Literals] + * {% and %Q: Interpolable String 
Literals}[#label-25+and+-25Q-3A+Interpolable+String+Literals] + * {%w and %W: String-Array Literals}[#label-25w+and+-25W-3A+String-Array+Literals] + * {%i and %I: Symbol-Array Literals}[#label-25i+and+-25I-3A+Symbol-Array+Literals] + * {%r: Regexp Literals}[#label-25r-3A+Regexp+Literals] + * {%s: Symbol Literals}[#label-25s-3A+Symbol+Literals] + * {%x: Backtick Literals}[#label-25x-3A+Backtick+Literals] + +== Boolean and Nil Literals +nil+ and +false+ are both false values. +nil+ is sometimes used to indicate "no value" or "unknown" but evaluates to +false+ in conditional expressions. @@ -20,10 +36,9 @@ Literals create objects you can use in your program. Literals include: +true+ is a true value. All objects except +nil+ and +false+ evaluate to a true value in conditional expressions. -(There are also the constants +TRUE+, +FALSE+ and +NIL+, but the lowercase -literal forms are preferred.) +== \Numeric Literals -== Numbers +=== \Integer Literals You can write integers of any size as follows: @@ -34,15 +49,6 @@ These numbers have the same value, 1,234. The underscore may be used to enhance readability for humans. You may place an underscore anywhere in the number. -Floating point numbers may be written as follows: - - 12.34 - 1234e-2 - 1.234E1 - -These numbers have the same value, 12.34. You may use underscores in floating -point numbers as well. - You can use a special prefix to write numbers in decimal, hexadecimal, octal or binary formats. For decimal numbers use a prefix of <tt>0d</tt>, for hexadecimal numbers use a prefix of <tt>0x</tt>, for octal numbers use a @@ -71,34 +77,68 @@ Examples: All these numbers have the same decimal value, 170. Like integers and floats you may use an underscore for readability. -=== Rational numbers +=== \Float Literals -Numbers suffixed by +r+ are Rational numbers. +Floating-point numbers may be written as follows: - 12r #=> (12/1) - 12.3r #=> (123/10) + 12.34 + 1234e-2 + 1.234E1 + +These numbers have the same value, 12.34. 
You may use underscores in floating +point numbers as well. -Rational numbers are exact, whereas Float numbers are inexact. +=== \Rational Literals - 0.1r + 0.2r #=> (3/10) - 0.1 + 0.2 #=> 0.30000000000000004 +You can write a Rational literal using a special suffix, <tt>'r'</tt>. + +Examples: -=== Complex numbers + 1r # => (1/1) + 2/3r # => (2/3) # With denominator. + -1r # => (-1/1) # With signs. + -2/3r # => (-2/3) + 2/-3r # => (-2/3) + -2/-3r # => (2/3) + +1/+3r # => (1/3) + 1.2r # => (6/5) # With fractional part. + 1_1/2_1r # => (11/21) # With embedded underscores. + 2/4r # => (1/2) # Automatically reduced. -Numbers suffixed by +i+ are Complex (or imaginary) numbers. +Syntax: + + <rational-literal> = <numerator> [ '/' <denominator> ] 'r' + <numerator> = [ <sign> ] <digits> [ <fractional-part> ] + <fractional-part> = '.' <digits> + <denominator> = [ sign ] <digits> + <sign> = '-' | '+' + <digits> = <digit> { <digit> | '_' <digit> } + <digit> = '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9' + +Note this, which is parsed as \Float numerator <tt>1.2</tt> +divided by \Rational denominator <tt>3r</tt>, +resulting in a \Float: + + 1.2/3r # => 0.39999999999999997 + +=== \Complex Literals + +You can write a Complex number as follows (suffixed +i+): 1i #=> (0+1i) 1i * 1i #=> (-1+0i) -Also Rational numbers may be imaginary numbers. +Also \Rational numbers may be imaginary numbers. 12.3ri #=> (0+(123/10)*i) -+i+ must be placed after +r+, the opposite is not allowed. ++i+ must be placed after +r+; the opposite is not allowed. + + 12.3ir #=> Syntax error - 12.3ir #=> syntax error +== \String Literals -== Strings +=== Double-Quoted \String Literals The most common way of writing strings is using <tt>"</tt>: @@ -110,35 +150,14 @@ Any internal <tt>"</tt> must be escaped: "This string has a quote: \". As you can see, it is escaped" -Double-quote strings allow escaped characters such as <tt>\n</tt> for -newline, <tt>\t</tt> for tab, etc. 
The full list of supported escape -sequences are as follows: +Double-quoted strings allow escape sequences described in +{Escape Sequences}[#label-Escape+Sequences]. - \a bell, ASCII 07h (BEL) - \b backspace, ASCII 08h (BS) - \t horizontal tab, ASCII 09h (TAB) - \n newline (line feed), ASCII 0Ah (LF) - \v vertical tab, ASCII 0Bh (VT) - \f form feed, ASCII 0Ch (FF) - \r carriage return, ASCII 0Dh (CR) - \e escape, ASCII 1Bh (ESC) - \s space, ASCII 20h (SPC) - \\ backslash, \ - \nnn octal bit pattern, where nnn is 1-3 octal digits ([0-7]) - \xnn hexadecimal bit pattern, where nn is 1-2 hexadecimal digits ([0-9a-fA-F]) - \unnnn Unicode character, where nnnn is exactly 4 hexadecimal digits ([0-9a-fA-F]) - \u{nnnn ...} Unicode character(s), where each nnnn is 1-6 hexadecimal digits ([0-9a-fA-F]) - \cx or \C-x control character, where x is an ASCII printable character - \M-x meta character, where x is an ASCII printable character - \M-\C-x meta control character, where x is an ASCII printable character - \M-\cx same as above - \c\M-x same as above - \c? or \C-? delete, ASCII 7Fh (DEL) - -Any other character following a backslash is interpreted as the +In a double-quoted string, +any other character following a backslash is interpreted as the character itself. -Double-quote strings allow interpolation of other values using +Double-quoted strings allow interpolation of other values using <tt>#{...}</tt>: "One plus one is two: #{1 + 1}" @@ -146,8 +165,18 @@ Double-quote strings allow interpolation of other values using Any expression may be placed inside the interpolated section, but it's best to keep the expression small for readability. +You can also use <tt>#@foo</tt>, <tt>#@@foo</tt> and <tt>#$foo</tt> as a +shorthand for, respectively, <tt>#{ @foo }</tt>, <tt>#{ @@foo }</tt> and +<tt>#{ $foo }</tt>. 
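As a sketch of the shorthand interpolation forms just described, the following compares them with the explicit <tt>#{...}</tt> form. The `Greeter` class, the `$planet` global and the `@@salutation` class variable are made-up names used only for this illustration.

```ruby
$planet = "Earth"

class Greeter
  @@salutation = "Hello"

  def initialize(name)
    @name = name
  end

  # Shorthand: #@@var, #@var and #$var without braces.
  def shorthand
    "#@@salutation, #@name of #$planet!"
  end

  # Equivalent explicit form with braces.
  def explicit
    "#{@@salutation}, #{@name} of #{$planet}!"
  end
end

greeter = Greeter.new("Ann")
puts greeter.shorthand # prints Hello, Ann of Earth!
puts greeter.explicit  # prints Hello, Ann of Earth!
```

The shorthand ends at the first character that cannot be part of the variable name, which is why the comma, the space and the exclamation mark are treated as ordinary text.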
+ +See also: + +* {% and %Q: Interpolable String Literals}[#label-25+and+-25Q-3A+Interpolable+String+Literals] + +=== Single-Quoted \String Literals + Interpolation may be disabled by escaping the "#" character or using -single-quote strings: +single-quoted strings: '#{1 + 1}' #=> "\#{1 + 1}" @@ -155,14 +184,15 @@ In addition to disabling interpolation, single-quoted strings also disable all escape sequences except for the single-quote (<tt>\'</tt>) and backslash (<tt>\\\\</tt>). -You may also create strings using <tt>%</tt>: +In a single-quoted string, +any other character following a backslash is interpreted as is: +a backslash and the character itself. + +See also: - %(1 + 1 is #{1 + 1}) #=> "1 + 1 is 2" +* {%q: Non-Interpolable String Literals}[#label-25q-3A+Non-Interpolable+String+Literals] -There are two different types of <tt>%</tt> strings <tt>%q(...)</tt> behaves -like a single-quote string (no interpolation or character escaping), while -<tt>%Q</tt> behaves as a double-quote string. See Percent Strings below for -more discussion of the syntax of percent strings. +=== Literal String Concatenation Adjacent string literals are automatically concatenated by the interpreter: @@ -174,12 +204,14 @@ Any combination of adjacent single-quote, double-quote, percent strings will be concatenated as long as a percent-string is not last. 
%q{a} 'b' "c" #=> "abc" - "a" 'b' %q{c} #=> NameError: uninitialized constant q + "a" 'b' %q{c} #=> NoMethodError: undefined method 'q' for main + +=== Character Literal There is also a character literal notation to represent single character strings, whose syntax is a question mark (<tt>?</tt>) -followed by a single character or escape sequence that corresponds to -a single codepoint in the script encoding: +followed by a single character or escape sequence (except continuation line) +that corresponds to a single codepoint in the script encoding: ?a #=> "a" ?abc #=> SyntaxError @@ -193,7 +225,47 @@ a single codepoint in the script encoding: ?\C-\M-a #=> "\x81", same as above ?あ #=> "あ" -=== Here Documents +=== Escape Sequences + +Some characters can be represented as escape sequences in +double-quoted strings, +character literals, +here document literals (non-quoted, double-quoted, and with backticks), +double-quoted symbols, +double-quoted symbol keys in Hash literals, +Regexp literals, and +several percent literals (<tt>%</tt>, <tt>%Q</tt>, <tt>%W</tt>, <tt>%I</tt>, <tt>%r</tt>, <tt>%x</tt>). + +They allow escape sequences such as <tt>\n</tt> for +newline, <tt>\t</tt> for tab, etc. 
The full list of supported escape +sequences are as follows: + + \a bell, ASCII 07h (BEL) + \b backspace, ASCII 08h (BS) + \t horizontal tab, ASCII 09h (TAB) + \n newline (line feed), ASCII 0Ah (LF) + \v vertical tab, ASCII 0Bh (VT) + \f form feed, ASCII 0Ch (FF) + \r carriage return, ASCII 0Dh (CR) + \e escape, ASCII 1Bh (ESC) + \s space, ASCII 20h (SPC) + \\ backslash, \ + \nnn octal bit pattern, where nnn is 1-3 octal digits ([0-7]) + \xnn hexadecimal bit pattern, where nn is 1-2 hexadecimal digits ([0-9a-fA-F]) + \unnnn Unicode character, where nnnn is exactly 4 hexadecimal digits ([0-9a-fA-F]) + \u{nnnn ...} Unicode character(s), where each nnnn is 1-6 hexadecimal digits ([0-9a-fA-F]) + \cx or \C-x control character, where x is an ASCII printable character + \M-x meta character, where x is an ASCII printable character + \M-\C-x meta control character, where x is an ASCII printable character + \M-\cx same as above + \c\M-x same as above + \c? or \C-? delete, ASCII 7Fh (DEL) + \<newline> continuation line (empty string) + +The last one, <tt>\<newline></tt>, represents an empty string instead of a character. +It is used to fold a line in a string. + +=== Here Document Literals If you are writing a large block of text you may use a "here document" or "heredoc": @@ -219,7 +291,7 @@ You may indent the ending identifier if you place a "-" after <tt><<</tt>: That might span many lines INDENTED_HEREDOC -Note that the while the closing identifier may be indented, the content is +Note that while the closing identifier may be indented, the content is always treated as if it is flush left. If you indent the content those spaces will appear in the output. @@ -237,9 +309,16 @@ the content. Note that empty lines and lines consisting solely of literal tabs and spaces will be ignored for the purposes of determining indentation, but escaped tabs and spaces are considered non-indentation characters. -A heredoc allows interpolation and escaped characters. 
You may disable -interpolation and escaping by surrounding the opening identifier with single -quotes: +For the purpose of measuring an indentation, a horizontal tab is regarded as a +sequence of one to eight spaces such that the column position corresponding to +its end is a multiple of eight. The amount to be removed is counted in terms +of the number of spaces. If the boundary appears in the middle of a tab, that +tab is not removed. + +A heredoc allows interpolation and the escape sequences described in +{Escape Sequences}[#label-Escape+Sequences]. +You may disable interpolation and the escaping by surrounding the opening +identifier with single quotes: expected_result = <<-'EXPECTED' One plus one is #{1 + 1} @@ -273,26 +352,34 @@ read: content for heredoc two TWO -== Symbols +== \Symbol Literals A Symbol represents a name inside the ruby interpreter. See Symbol for more details on what symbols are and when ruby creates them internally. You may reference a symbol using a colon: <tt>:my_symbol</tt>. -You may also create symbols by interpolation: +You may also create symbols by interpolation and escape sequences described in +{Escape Sequences}[#label-Escape+Sequences] with double-quotes: :"my_symbol1" :"my_symbol#{1 + 1}" + :"foo\sbar" -Like strings, a single-quote may be used to disable interpolation: +Like strings, a single-quote may be used to disable interpolation and +escape sequences: :'my_symbol#{1 + 1}' #=> :"my_symbol\#{1 + 1}" When creating a Hash, there is a special syntax for referencing a Symbol as well. 
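The quoted symbol forms above can be checked with a short sketch. The method name `symbol_examples` and the hash keys are hypothetical, chosen only to label each case.

```ruby
# Returns one symbol per quoting style discussed above.
def symbol_examples
  {
    interpolated: :"my_symbol#{1 + 1}",      # double quotes interpolate
    literal:      :'my_symbol#{1 + 1}',      # single quotes do not
    escaped:      :"foo\sbar",               # \s is the space escape
    hash_key:     { "a 1": 1 }.keys.first    # quoted symbol key in a Hash
  }
end

symbol_examples.each { |label, sym| puts "#{label}: #{sym.inspect}" }
```

Here `interpolated` is the same symbol as `:my_symbol2`, while `literal` keeps the text <tt>#{1 + 1}</tt> verbatim.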
-== Arrays +See also: + +* {%s: Symbol Literals}[#label-25s-3A+Symbol+Literals] + + +== \Array Literals An array is created using the objects between <tt>[</tt> and <tt>]</tt>: @@ -303,9 +390,14 @@ You may place expressions inside the array: [1, 1 + 1, 1 + 2] [1, [1 + 1, [1 + 2]]] +See also: + +* {%w and %W: String-Array Literals}[#label-25w+and+-25W-3A+String-Array+Literals] +* {%i and %I: Symbol-Array Literals}[#label-25i+and+-25I-3A+Symbol-Array+Literals] + See Array for the methods you may use with an array. -== Hashes +== \Hash Literals A hash is created using key-value pairs between <tt>{</tt> and <tt>}</tt>: @@ -327,9 +419,17 @@ is equal to { :"a 1" => 1, :"b 2" => 2 } +Hash values can be omitted, meaning that value will be fetched from the context +by the name of the key: + + x = 100 + y = 200 + h = { x:, y: } + #=> {:x=>100, :y=>200} + See Hash for the methods you may use with a hash. -== Ranges +== \Range Literals A range represents an interval of values. The range may include or exclude its ending value. @@ -342,27 +442,31 @@ its ending value. You may create a range of any object. See the Range documentation for details on the methods you need to implement. -== Regular Expressions +== \Regexp Literals -A regular expression is created using "/": +A regular expression may be created using leading and trailing +slash (<tt>'/'</tt>) characters: - /my regular expression/ + re = /foo/ # => /foo/ + re.class # => Regexp -The regular expression may be followed by flags which adjust the matching -behavior of the regular expression. The "i" flag makes the regular expression -case-insensitive: - - /my regular expression/i +The trailing slash may be followed by one or more modifiers characters +that set modes for the regexp. +See {Regexp modes}[rdoc-ref:Regexp@Modes] for details. Interpolation may be used inside regular expressions along with escaped characters. Note that a regular expression may require additional escaped characters than a string. 
+See also: + +* {%r: Regexp Literals}[#label-25r-3A+Regexp+Literals] + See Regexp for a description of the syntax of regular expressions. -== Procs +== Lambda Proc Literals -A proc can be created with <tt>-></tt>: +A lambda proc can be created with <tt>-></tt>: -> { 1 + 1 } @@ -374,27 +478,148 @@ You can require arguments for the proc as follows: This proc will add one to its argument. -== Percent Strings +== Percent Literals + +Each of the literals in described in this section +may use these paired delimiters: + +* <tt>[</tt> and <tt>]</tt>. +* <tt>(</tt> and <tt>)</tt>. +* <tt>{</tt> and <tt>}</tt>. +* <tt><</tt> and <tt>></tt>. +* Non-alphanumeric ASCII character except above, as both beginning and ending delimiters. + +The delimiters can be escaped with a backslash. +However, the first four pairs (brackets, parenthesis, braces, and +angle brackets) are allowed without backslash as far as they are correctly +paired. + +These are demonstrated in the next section. + +=== <tt>%q</tt>: Non-Interpolable String Literals + +You can write a non-interpolable string with <tt>%q</tt>. +The created string is the same as if you created it with single quotes: + + %q[foo bar baz] # => "foo bar baz" # Using []. + %q(foo bar baz) # => "foo bar baz" # Using (). + %q{foo bar baz} # => "foo bar baz" # Using {}. + %q<foo bar baz> # => "foo bar baz" # Using <>. + %q|foo bar baz| # => "foo bar baz" # Using two |. + %q:foo bar baz: # => "foo bar baz" # Using two :. + %q(1 + 1 is #{1 + 1}) # => "1 + 1 is \#{1 + 1}" # No interpolation. + %q[foo[bar]baz] # => "foo[bar]baz" # brackets can be nested. + %q(foo(bar)baz) # => "foo(bar)baz" # parenthesis can be nested. + %q{foo{bar}baz} # => "foo{bar}baz" # braces can be nested. + %q<foo<bar>baz> # => "foo<bar>baz" # angle brackets can be nested. + +This is similar to single-quoted string but only backslashes and +the specified delimiters can be escaped with a backslash. 
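To make the <tt>%q</tt> escaping rule concrete, here is a minimal sketch. The method name `percent_q_examples` is hypothetical; the literals themselves follow the rules stated above.

```ruby
# Each entry pairs a %q literal with a note on how it is read.
def percent_q_examples
  [
    %q(1 + 1 is #{1 + 1}), # no interpolation: the #{...} text is kept
    %q[foo[bar]baz],       # correctly paired delimiters may be nested
    %q(a \) b),            # an escaped delimiter loses its backslash
    %q|a \n b|             # other escapes are kept as backslash + character
  ]
end

percent_q_examples.each { |s| p s }
```

The third entry evaluates to <tt>"a ) b"</tt>, whereas the fourth keeps a literal backslash followed by <tt>n</tt> rather than producing a newline.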
+ +=== <tt>% and %Q</tt>: Interpolable String Literals + +You can write an interpolable string with <tt>%Q</tt> +or with its alias <tt>%</tt>: + + %[foo bar baz] # => "foo bar baz" + %(1 + 1 is #{1 + 1}) # => "1 + 1 is 2" # Interpolation. + +This is similar to double-quoted string. +It allow escape sequences described in +{Escape Sequences}[#label-Escape+Sequences]. +Other escaped characters (a backslash followed by a character) are +interpreted as the character. + +=== <tt>%w and %W</tt>: String-Array Literals + +You can write an array of strings as whitespace-separated words +with <tt>%w</tt> (non-interpolable) or <tt>%W</tt> (interpolable): + + %w[foo bar baz] # => ["foo", "bar", "baz"] + %w[1 % *] # => ["1", "%", "*"] + # Use backslash to embed spaces in the strings. + %w[foo\ bar baz\ bat] # => ["foo bar", "baz bat"] + %W[foo\ bar baz\ bat] # => ["foo bar", "baz bat"] + %w(#{1 + 1}) # => ["\#{1", "+", "1}"] + %W(#{1 + 1}) # => ["2"] + + # The nested delimiters evaluated to a flat array of strings + # (not nested array). + %w[foo[bar baz]qux] # => ["foo[bar", "baz]qux"] + +The following characters are considered as white spaces to separate words: + +* space, ASCII 20h (SPC) +* form feed, ASCII 0Ch (FF) +* newline (line feed), ASCII 0Ah (LF) +* carriage return, ASCII 0Dh (CR) +* horizontal tab, ASCII 09h (TAB) +* vertical tab, ASCII 0Bh (VT) + +The white space characters can be escaped with a backslash to make them +part of a word. + +<tt>%W</tt> allow escape sequences described in +{Escape Sequences}[#label-Escape+Sequences]. +However the continuation line <tt>\<newline></tt> is not usable because +it is interpreted as the escaped newline described above. + +=== <tt>%i and %I</tt>: Symbol-Array Literals + +You can write an array of symbols as whitespace-separated words +with <tt>%i</tt> (non-interpolable) or <tt>%I</tt> (interpolable): + + %i[foo bar baz] # => [:foo, :bar, :baz] + %i[1 % *] # => [:"1", :%, :*] + # Use backslash to embed spaces in the symbols. 
+ %i[foo\ bar baz\ bat] # => [:"foo bar", :"baz bat"] + %I[foo\ bar baz\ bat] # => [:"foo bar", :"baz bat"] + %i(#{1 + 1}) # => [:"\#{1", :+, :"1}"] + %I(#{1 + 1}) # => [:"2"] + +The white space characters and its escapes are interpreted as the same as +string-array literals described in +{%w and %W: String-Array Literals}[#label-25w+and+-25W-3A+String-Array+Literals]. + +=== <tt>%s</tt>: Symbol Literals + +You can write a symbol with <tt>%s</tt>: + + %s[foo] # => :foo + %s[foo bar] # => :"foo bar" + +This is non-interpolable. +No interpolation allowed. +Only backslashes and the specified delimiters can be escaped with a backslash. + +=== <tt>%r</tt>: Regexp Literals + +You can write a regular expression with <tt>%r</tt>; +the character used as the leading and trailing delimiter +may be (almost) any character: + + %r/foo/ # => /foo/ + %r:name/value pair: # => /name\/value pair/ + +A few "symmetrical" character pairs may be used as delimiters: -Besides <tt>%(...)</tt> which creates a String, the <tt>%</tt> may create -other types of object. As with strings, an uppercase letter allows -interpolation and escaped characters while a lowercase letter disables them. + %r[foo] # => /foo/ + %r{foo} # => /foo/ + %r(foo) # => /foo/ + %r<foo> # => /foo/ -These are the types of percent strings in ruby: +The trailing delimiter may be followed by one or more modifier characters +that set modes for the regexp. +See {Regexp modes}[rdoc-ref:Regexp@Modes] for details. 
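One common reason to choose <tt>%r</tt> over slashes is a pattern that itself contains slashes. The sketch below is illustrative only; the method name `path_patterns` and the sample path are assumptions.

```ruby
# Three patterns intended to match the same path prefix.
def path_patterns
  [
    /\/usr\/local\/bin/,  # slash delimiters: every / must be escaped
    %r{/usr/local/bin},   # %r with braces: no escaping needed
    %r{/USR/LOCAL/BIN}i   # a modifier may follow the closing delimiter
  ]
end

path = "/usr/local/bin/ruby"
path_patterns.each { |re| puts "#{re.inspect} matches: #{re.match?(path)}" }
```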
-<tt>%i</tt> :: Array of Symbols -<tt>%q</tt> :: String -<tt>%r</tt> :: Regular Expression -<tt>%s</tt> :: Symbol -<tt>%w</tt> :: Array of Strings -<tt>%x</tt> :: Backtick (capture subshell result) +=== <tt>%x</tt>: Backtick Literals -For the two array forms of percent string, if you wish to include a space in -one of the array entries you must escape it with a "\\" character: +You can write and execute a shell command with <tt>%x</tt>: - %w[one one-hundred\ one] - #=> ["one", "one-hundred one"] + %x(echo 1) # => "1\n" + %x[echo #{1 + 2}] # => "3\n" + %x[echo \u0030] # => "0\n" -If you are using "(", "[", "{", "<" you must close it with ")", "]", "}", ">" -respectively. You may use most other non-alphanumeric characters for percent -string delimiters such as "%", "|", "^", etc. +This is interpolable. +<tt>%x</tt> allow escape sequences described in +{Escape Sequences}[#label-Escape+Sequences]. diff --git a/doc/syntax/methods.rdoc b/doc/syntax/methods.rdoc index a47c1a3cbf..14810a188f 100644 --- a/doc/syntax/methods.rdoc +++ b/doc/syntax/methods.rdoc @@ -11,14 +11,19 @@ A method definition consists of the +def+ keyword, a method name, the body of the method, +return+ value and the +end+ keyword. When called the method will execute the body of the method. This method returns +2+. +Since Ruby 3.0, there is also a shorthand syntax for methods consisting +of exactly one expression: + + def one_plus_one = 1 + 1 + This section only covers defining methods. See also the {syntax documentation on calling methods}[rdoc-ref:syntax/calling_methods.rdoc]. == Method Names Method names may be one of the operators or must start a letter or a character -with the eight bit set. It may contain letters, numbers, an <code>_</code> -(underscore or low line) or a character with the eight bit set. The convention +with the eighth bit set. It may contain letters, numbers, an <code>_</code> +(underscore or low line) or a character with the eighth bit set. 
The convention is to use underscores to separate words in a multiword method name: def method_name @@ -26,7 +31,7 @@ is to use underscores to separate words in a multiword method name: end Ruby programs must be written in a US-ASCII-compatible character set such as -UTF-8, ISO-8859-1 etc. In such character sets if the eight bit is set it +UTF-8, ISO-8859-1 etc. In such character sets if the eighth bit is set it indicates an extended character. Ruby allows method names and other identifiers to contain such characters. Ruby programs cannot contain some characters like ASCII NUL (<code>\x00</code>). @@ -62,9 +67,24 @@ Methods that end with a question mark by convention return boolean, but they may not always return just +true+ or +false+. Often, they will return an object to indicate a true value (or "truthy" value). -Methods that end with an equals sign indicate an assignment method. For -assignment methods, the return value is ignored and the arguments are returned -instead. +Methods that end with an equals sign indicate an assignment method. + + class C + def attr + @attr + end + + def attr=(val) + @attr = val + end + end + + c = C.new + c.attr #=> nil + c.attr = 10 # calls "attr=(10)" + c.attr #=> 10 + +Assignment methods can not be defined using the shorthand syntax. These are method names for the various Ruby operators. Each of these operators accepts only one argument. Following the operator is the typical @@ -80,6 +100,7 @@ operators. <code>/</code> :: divide <code>%</code> :: modulus division, String#% <code>&</code> :: AND +<code>|</code> :: OR <code>^</code> :: XOR (exclusive OR) <code>>></code> :: right-shift <code><<</code> :: left-shift, append @@ -94,8 +115,8 @@ operators. 
<code>></code> :: greater-than <code>>=</code> :: greater-than or equal -To define unary methods minus, plus, tilde and not (<code>!</code>) follow the -operator with an <code>@</code> as in <code>+@</code> or <code>!@</code>: +To define unary methods minus and plus, follow the operator with an +<code>@</code> as in <code>+@</code>: class C def -@ @@ -107,6 +128,13 @@ operator with an <code>@</code> as in <code>+@</code> or <code>!@</code>: -obj # prints "you inverted this object" +The <code>@</code> is needed to differentiate unary minus and plus +operators from binary minus and plus operators. + +You can also follow tilde and not (<code>!</code>) unary methods with +<code>@</code>, but it is not required as there are no binary tilde +and not operators. + Unary methods accept zero arguments. Additionally, methods for element reference and assignment may be defined: @@ -253,6 +281,13 @@ The parentheses around the arguments are optional: value + 1 end +The parentheses are mandatory in shorthand method definitions: + + # OK + def add_one(value) = value + 1 + # SyntaxError + def add_one value = value + 1 + Multiple arguments are separated by a comma: def add_values(a, b) @@ -283,6 +318,25 @@ This will raise a SyntaxError: a + b + c end +Default argument values can refer to arguments that have already been +evaluated as local variables, and argument values are always evaluated +left to right. 
So this is allowed: + + def add_values(a = 1, b = a) + a + b + end + add_values + # => 2 + +But this will raise a +NameError+ (unless there is a method named ++b+ defined): + + def add_values(a = b, b = 1) + a + b + end + add_values + # NameError (undefined local variable or method `b' for main:Object) + === Array Decomposition You can decompose (unpack or extract values from) an Array using extra @@ -355,11 +409,22 @@ converted to an Array: gather_arguments 1, 2, 3 # prints [1, 2, 3] -The array argument must be the last positional argument, it must appear before -any keyword arguments. +The array argument must appear before any keyword arguments. -The array argument will capture a Hash as the last entry if a hash was sent by -the caller after all positional arguments. +It is possible to gather arguments at the beginning or in the middle: + + def gather_arguments(first_arg, *middle_arguments, last_arg) + p middle_arguments + end + + gather_arguments 1, 2, 3, 4 # prints [2, 3] + +The array argument will capture a Hash as the last entry if keywords were +provided by the caller after all positional arguments. + + def gather_arguments(*arguments) + p arguments + end gather_arguments 1, a: 2 # prints [1, {:a=>2}] @@ -377,6 +442,13 @@ Also, note that a bare <code>*</code> can be used to ignore arguments: def ignore_arguments(*) end +You can also use a bare <code>*</code> when calling a method to pass the +arguments directly to another method: + + def delegate_arguments(*) + other_method(*) + end + === Keyword Arguments Keyword arguments are similar to positional arguments with default values: @@ -395,13 +467,56 @@ Arbitrary keyword arguments will be accepted with <code>**</code>: # prints 1 then {:second=>2, :third=>3} When calling a method with keyword arguments the arguments may appear in any -order. If an unknown keyword argument is sent by the caller an ArgumentError -is raised. +order. 
If an unknown keyword argument is sent by the caller, and the method +does not accept arbitrary keyword arguments, an ArgumentError is raised. + +To require a specific keyword argument, do not include a default value +for the keyword argument: + + def add_values(first:, second:) + first + second + end + add_values + # ArgumentError (missing keywords: first, second) + add_values(first: 1, second: 2) + # => 3 When mixing keyword arguments and positional arguments, all positional arguments must appear before any keyword arguments. -== Block Argument +Also, note that <code>**</code> can be used to ignore keyword arguments: + + def ignore_keywords(**) + end + +You can also use <code>**</code> when calling a method to delegate +keyword arguments to another method: + + def delegate_keywords(**) + other_method(**) + end + +To mark a method as explicitly accepting no keywords, you can use +<code>**nil</code>: + + def no_keywords(**nil) + end + +Calling such a method with keywords or a non-empty keyword splat will +result in an ArgumentError. This syntax is supported so that keywords +can be added to the method later without affecting backwards compatibility. 
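The <code>**nil</code> behavior described above can be checked with a minimal sketch. The method names +no_keywords+ (reused from the text) and +keyword_call_error+ are illustrative only, and the return values are chosen just to make the outcome visible:

```ruby
# Illustrative sketch: a method that explicitly accepts no keywords.
def no_keywords(**nil)
  :ok
end

# Hypothetical helper: returns the class of the error raised when the
# method is called with a keyword, or nil if no error is raised.
def keyword_call_error
  no_keywords(a: 1)
  nil
rescue ArgumentError => e
  e.class
end

no_keywords          # => :ok
no_keywords(**{})    # => :ok (an empty keyword splat passes nothing)
keyword_call_error   # => ArgumentError
```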
+ +If a method definition does not accept any keywords, and the +<code>**nil</code> syntax is not used, any keywords provided when calling +the method will be converted to a Hash positional argument: + + def meth(arg) + arg + end + meth(a: 1) + # => {:a=>1} + +=== Block Argument The block argument is indicated by <code>&</code> and must come last: @@ -415,8 +530,15 @@ Most frequently the block argument is used to pass a block to another method: @items.each(&block) end +You are not required to give a name to the block if you will just be passing +it to another method: + + def each_item(&) + @items.each(&) + end + If you are only going to call the block and will not otherwise manipulate it -or send it to another method using <code>yield</code> without an explicit +or send it to another method, using <code>yield</code> without an explicit block parameter is preferred. This method is equivalent to the first method in this section: @@ -424,14 +546,64 @@ in this section: yield self end -There is also a performance benefit to using yield over a calling a block -parameter. When a block argument is assigned to a variable a Proc object is -created which holds the block. When using yield this Proc object is not -created. +=== Argument Forwarding + +Since Ruby 2.7, an all-arguments forwarding syntax is available: + + def concrete_method(*positional_args, **keyword_args, &block) + [positional_args, keyword_args, block] + end + + def forwarding_method(...) + concrete_method(...) + end + + forwarding_method(1, b: 2) { puts 3 } + #=> [[1], {:b=>2}, #<Proc:...skip...>] -If you only need to use the block sometimes you can use Proc.new to create a -proc from the block that was passed to your method. See Proc.new for further -details. +Calling with forwarding <code>...</code> is available only in methods +defined with <code>...</code>. + + def regular_method(arg, **kwarg) + concrete_method(...) 
# Syntax error + end + +Since Ruby 3.0, there can be leading arguments before <code>...</code> +both in definitions and in invocations (but in definitions they can be +only positional arguments without default values). + + def request(method, path, **headers) + puts "#{method.upcase} #{path} #{headers}" + end + + def get(...) + request(:GET, ...) # leading argument in invoking + end + + get('http://ruby-lang.org', 'Accept' => 'text/html') + # Prints: GET http://ruby-lang.org {"Accept"=>"text/html"} + + def logged_get(msg, ...) # leading argument in definition + puts "Invoking #get: #{msg}" + get(...) + end + + logged_get('Ruby site', 'http://ruby-lang.org') + # Prints: + # Invoking #get: Ruby site + # GET http://ruby-lang.org {} + +Note that omitting parentheses in forwarding calls may lead to +unexpected results: + + def log(...) + puts ... # This would be treated as `puts()...', + # i.e. endless range from puts result + end + + log("test") + # Prints: warning: ... at EOL, should be parenthesized? + # ...and then empty line == Exception Handling @@ -454,6 +626,28 @@ May be written as: # handle exception end +Similarly, if you wish to always run code even if an exception is raised, +you can use +ensure+ without +begin+ and +end+: + + def my_method + # code that may raise an exception + ensure + # code that runs even if previous code raised an exception + end + +You can also combine +rescue+ with +ensure+ and/or +else+, without ++begin+ and +end+: + + def my_method + # code that may raise an exception + rescue + # handle exception + else + # only run if no exception raised above + ensure + # code that runs even if previous code raised an exception + end + If you wish to rescue an exception for only part of your method, use +begin+ and +end+. For more details see the page on {exception handling}[rdoc-ref:syntax/exceptions.rdoc]. 
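The order in which the +rescue+, +else+ and +ensure+ clauses of a method body run can be observed with a small sketch. The method name +record_flow+ and its parameters are hypothetical; the method appends a symbol to the given array in each clause it passes through:

```ruby
# Hypothetical example: records which clauses of the method body run.
def record_flow(events, should_fail)
  events << :body
  raise ArgumentError, "boom" if should_fail
rescue ArgumentError
  events << :rescue   # runs only when the body raised
else
  events << :else     # runs only when the body did not raise
ensure
  events << :ensure   # always runs
end

ok = []
record_flow(ok, false)
ok      # => [:body, :else, :ensure]

failed = []
record_flow(failed, true)
failed  # => [:body, :rescue, :ensure]
```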
diff --git a/doc/syntax/miscellaneous.rdoc b/doc/syntax/miscellaneous.rdoc index d5691f8d60..d5cfd3e474 100644 --- a/doc/syntax/miscellaneous.rdoc +++ b/doc/syntax/miscellaneous.rdoc @@ -13,7 +13,7 @@ most frequently used with <code>ruby -e</code>. Ruby does not require any indentation. Typically, ruby programs are indented two spaces. -If you run ruby with warnings enabled and have an indentation mis-match, you +If you run ruby with warnings enabled and have an indentation mismatch, you will receive a warning. == +alias+ @@ -83,6 +83,36 @@ Using the specific reflection methods such as instance_variable_defined? for instance variables or const_defined? for constants is less error prone than using +defined?+. ++defined?+ handles some regexp global variables specially based on whether +there is an active regexp match and how many capture groups there are: + + /b/ =~ 'a' + defined?($~) # => "global-variable" + defined?($&) # => nil + defined?($`) # => nil + defined?($') # => nil + defined?($+) # => nil + defined?($1) # => nil + defined?($2) # => nil + + /./ =~ 'a' + defined?($~) # => "global-variable" + defined?($&) # => "global-variable" + defined?($`) # => "global-variable" + defined?($') # => "global-variable" + defined?($+) # => nil + defined?($1) # => nil + defined?($2) # => nil + + /(.)/ =~ 'a' + defined?($~) # => "global-variable" + defined?($&) # => "global-variable" + defined?($`) # => "global-variable" + defined?($') # => "global-variable" + defined?($+) # => "global-variable" + defined?($1) # => "global-variable" + defined?($2) # => nil + == +BEGIN+ and +END+ +BEGIN+ defines a block that is run before any other code in the current file. 
diff --git a/doc/syntax/modules_and_classes.rdoc b/doc/syntax/modules_and_classes.rdoc index dd70d4ac21..9e05c5c774 100644 --- a/doc/syntax/modules_and_classes.rdoc +++ b/doc/syntax/modules_and_classes.rdoc @@ -40,9 +40,9 @@ functionality: remove_method :my_method end -Reopening classes is a very powerful feature of Ruby, but it is best to only -reopen classes you own. Reopening classes you do not own may lead to naming -conflicts or difficult to diagnose bugs. +Reopening modules (or classes) is a very powerful feature of Ruby, but it is +best to only reopen modules you own. Reopening modules you do not own may lead +to naming conflicts or difficult to diagnose bugs. == Nesting @@ -155,8 +155,8 @@ Ruby has three types of visibility. The default is +public+. A public method may be called from any other object. The second visibility is +protected+. When calling a protected method the -sender must be a subclass of the receiver or the receiver must be a subclass of -the sender. Otherwise a NoMethodError will be raised. +sender must inherit the Class or Module which defines the method. Otherwise a +NoMethodError will be raised. Protected visibility is most frequently used to define <code>==</code> and other comparison methods where the author does not wish to expose an object's @@ -190,9 +190,41 @@ Here is an example: b.n b #=> 1 -- m called on defining class a.n b # raises NoMethodError A is not a subclass of B -The third visibility is +private+. A private method may not be called with a -receiver, not even +self+. If a private method is called with a receiver a -NoMethodError will be raised. +The third visibility is +private+. A private method may only be called from +inside the owner class without a receiver, or with a literal +self+ +as a receiver. If a private method is called with a +receiver other than a literal +self+, a NoMethodError will be raised. 
+ + class A + def without + m + end + + def with_self + self.m + end + + def with_other + A.new.m + end + + def with_renamed + copy = self + copy.m + end + + def m + 1 + end + + private :m + end + + a = A.new + a.without #=> 1 + a.with_self #=> 1 + a.with_other # NoMethodError (private method `m' called for #<A:0x0000559c287f27d0>) + a.with_renamed # NoMethodError (private method `m' called for #<A:0x0000559c285f8330>) === +alias+ and +undef+ @@ -227,6 +259,28 @@ includes a minimum of built-in methods. You can use BasicObject to create an independent inheritance structure. See the BasicObject documentation for further details. +Just like modules, classes can also be reopened. You can omit its superclass +when you reopen a class. Specifying a different superclass than the previous +definition will raise an error. + + class C + end + + class D < C + end + + # OK + class D < C + end + + # OK + class D + end + + # TypeError: superclass mismatch for class D + class D < String + end + == Inheritance Any method defined on a class is callable from its subclass: diff --git a/doc/syntax/operators.rdoc b/doc/syntax/operators.rdoc new file mode 100644 index 0000000000..d3045ac99e --- /dev/null +++ b/doc/syntax/operators.rdoc @@ -0,0 +1,75 @@ += Operators + +In Ruby, operators such as <code>+</code>, are defined as methods on the class. +Literals[rdoc-ref:syntax/literals.rdoc] define their methods within the lower +level, C language. String class, for example. + +Ruby objects can define or overload their own implementation for most operators. + +Here is an example: + + class Foo < String + def +(str) + self.concat(str).concat("another string") + end + end + + foobar = Foo.new("test ") + puts foobar + "baz " + +This prints: + + test baz another string + +What operators are available is dependent on the implementing class. + +== Operator Behavior + +How a class behaves to a given operator is specific to that class, since +operators are method implementations. 
+ +When using an operator, it's the expression on the left-hand side of the +operation that specifies the behavior. + + 'a' * 3 #=> "aaa" + 3 * 'a' # TypeError: String can't be coerced into Integer + +== Logical Operators + +Logical operators are not methods, and therefore cannot be +redefined/overloaded. They are tokenized at a lower level. + +Short-circuit logical operators (<code>&&</code>, <code>||</code>, +<code>and</code>, and <code>or</code>) do not always result in a boolean value. +Similar to blocks, it's the last evaluated expression that defines the result +of the operation. + +=== <code>&&</code>, <code>and</code> + +Both <code>&&</code>/<code>and</code> operators provide short-circuiting by executing each +side of the operator, left to right, and stopping at the first occurrence of a +falsey expression. The expression that defines the result is the last one +executed, whether it be the final expression, or the first occurrence of a falsey +expression. + +Some examples: + + true && 9 && "string" #=> "string" + (1 + 2) && nil && "string" #=> nil + (a = 1) && (b = false) && (c = "string") #=> false + + puts a #=> 1 + puts b #=> false + puts c #=> nil + +In this last example, <code>c</code> was initialized, but not defined. + +=== <code>||</code>, <code>or</code> + +The means by which <code>||</code>/<code>or</code> short-circuits, is to return the result of +the first expression that is truthy. + +Some examples: + + (1 + 2) || true || "string" #=> 3 + false || nil || "string" #=> "string" diff --git a/doc/syntax/pattern_matching.rdoc b/doc/syntax/pattern_matching.rdoc new file mode 100644 index 0000000000..06aae26d49 --- /dev/null +++ b/doc/syntax/pattern_matching.rdoc @@ -0,0 +1,528 @@ += Pattern matching + +Pattern matching is a feature allowing deep matching of structured values: checking the structure and binding the matched parts to local variables. 
+ +Pattern matching in Ruby is implemented with the +case+/+in+ expression: + + case <expression> + in <pattern1> + ... + in <pattern2> + ... + in <pattern3> + ... + else + ... + end + +(Note that +in+ and +when+ branches can NOT be mixed in one +case+ expression.) + +Or with the <code>=></code> operator and the +in+ operator, which can be used in a standalone expression: + + <expression> => <pattern> + + <expression> in <pattern> + +The +case+/+in+ expression is _exhaustive_: if the value of the expression does not match any branch of the +case+ expression (and the +else+ branch is absent), +NoMatchingPatternError+ is raised. + +Therefore, the +case+ expression might be used for conditional matching and unpacking: + + config = {db: {user: 'admin', password: 'abc123'}} + + case config + in db: {user:} # matches subhash and puts matched value in variable user + puts "Connect with user '#{user}'" + in connection: {username: } + puts "Connect with user '#{username}'" + else + puts "Unrecognized structure of config" + end + # Prints: "Connect with user 'admin'" + +whilst the <code>=></code> operator is most useful when the expected data structure is known beforehand, to just unpack parts of it: + + config = {db: {user: 'admin', password: 'abc123'}} + + config => {db: {user:}} # will raise if the config's structure is unexpected + + puts "Connect with user '#{user}'" + # Prints: "Connect with user 'admin'" + +<code><expression> in <pattern></code> is the same as <code>case <expression>; in <pattern>; true; else false; end</code>. +You can use it when you only want to know if a pattern has been matched or not: + + users = [{name: "Alice", age: 12}, {name: "Bob", age: 23}] + users.any? {|user| user in {name: /B/, age: 20..} } #=> true + +See below for more examples and explanations of the syntax. 
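The error raised on a failed match, mentioned above for +case+/+in+ without an +else+ branch and for <code>=></code>, can be sketched as follows. The method names +classify+ and +user_of+ are hypothetical, and the sketch assumes a Ruby version where <code>=></code> pattern matching is available (3.0 or later):

```ruby
# Hypothetical example: case/in without an else branch raises
# NoMatchingPatternError when no pattern matches.
def classify(value)
  case value
  in Integer then "integer"
  in String then "string"
  end
rescue NoMatchingPatternError
  "no match"
end

# Hypothetical example: `=>` raises when the structure is unexpected.
# Rescuing NoMatchingPatternError also covers its subclasses.
def user_of(config)
  config => {db: {user:}}
  user
rescue NoMatchingPatternError
  nil
end

classify(1)                        # => "integer"
classify(1.5)                      # => "no match"
user_of({db: {user: 'admin'}})     # => "admin"
user_of({})                        # => nil
```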
+ +== Patterns + +Patterns can be: + +* any Ruby object (matched by the <code>===</code> operator, like in +when+); (<em>Value pattern</em>) +* array pattern: <code>[<subpattern>, <subpattern>, <subpattern>, ...]</code>; (<em>Array pattern</em>) +* find pattern: <code>[*variable, <subpattern>, <subpattern>, <subpattern>, ..., *variable]</code>; (<em>Find pattern</em>) +* hash pattern: <code>{key: <subpattern>, key: <subpattern>, ...}</code>; (<em>Hash pattern</em>) +* combination of patterns with <code>|</code>; (<em>Alternative pattern</em>) +* variable capture: <code><pattern> => variable</code> or <code>variable</code>; (<em>As pattern</em>, <em>Variable pattern</em>) + +Any pattern can be nested inside array/find/hash patterns where <code><subpattern></code> is specified. + +Array patterns and find patterns match arrays, or objects that respond to +deconstruct+ (see below about the latter). +Hash patterns match hashes, or objects that respond to +deconstruct_keys+ (see below about the latter). Note that only symbol keys are supported for hash patterns. + +An important difference between array and hash pattern behavior is that arrays match only a _whole_ array: + + case [1, 2, 3] + in [Integer, Integer] + "matched" + else + "not matched" + end + #=> "not matched" + +while the hash matches even if there are other keys besides the specified part: + + case {a: 1, b: 2, c: 3} + in {a: Integer} + "matched" + else + "not matched" + end + #=> "matched" + +<code>{}</code> is the only exclusion from this rule. 
It matches only if an empty hash is given: + + case {a: 1, b: 2, c: 3} + in {} + "matched" + else + "not matched" + end + #=> "not matched" + + case {} + in {} + "matched" + else + "not matched" + end + #=> "matched" + +There is also a way to specify there should be no other keys in the matched hash except those explicitly specified by the pattern, with <code>**nil</code>: + + case {a: 1, b: 2} + in {a: Integer, **nil} # this will not match the pattern having keys other than a: + "matched a part" + in {a: Integer, b: Integer, **nil} + "matched a whole" + else + "not matched" + end + #=> "matched a whole" + +Both array and hash patterns support "rest" specification: + + case [1, 2, 3] + in [Integer, *] + "matched" + else + "not matched" + end + #=> "matched" + + case {a: 1, b: 2, c: 3} + in {a: Integer, **} + "matched" + else + "not matched" + end + #=> "matched" + +Parentheses around both kinds of patterns could be omitted: + + case [1, 2] + in Integer, Integer + "matched" + else + "not matched" + end + #=> "matched" + + case {a: 1, b: 2, c: 3} + in a: Integer + "matched" + else + "not matched" + end + #=> "matched" + + [1, 2] => a, b + [1, 2] in a, b + + {a: 1, b: 2, c: 3} => a: + {a: 1, b: 2, c: 3} in a: + +Find pattern is similar to array pattern but it can be used to check if the given object has any elements that match the pattern: + + case ["a", 1, "b", "c", 2] + in [*, String, String, *] + "matched" + else + "not matched" + end + +== Variable binding + +Besides deep structural checks, one of the very important features of the pattern matching is the binding of the matched parts to local variables. 
The basic form of binding is just specifying <code>=> variable_name</code> after the matched (sub)pattern (one might find this similar to storing exceptions in local variables in a <code>rescue ExceptionClass => var</code> clause): + + case [1, 2] + in Integer => a, Integer + "matched: #{a}" + else + "not matched" + end + #=> "matched: 1" + + case {a: 1, b: 2, c: 3} + in a: Integer => m + "matched: #{m}" + else + "not matched" + end + #=> "matched: 1" + +If no additional check is required, for only binding some part of the data to a variable, a simpler form could be used: + + case [1, 2] + in a, Integer + "matched: #{a}" + else + "not matched" + end + #=> "matched: 1" + + case {a: 1, b: 2, c: 3} + in a: m + "matched: #{m}" + else + "not matched" + end + #=> "matched: 1" + +For hash patterns, even a simpler form exists: key-only specification (without any sub-pattern) binds the local variable with the key's name, too: + + case {a: 1, b: 2, c: 3} + in a: + "matched: #{a}" + else + "not matched" + end + #=> "matched: 1" + +\Binding works for nested patterns as well: + + case {name: 'John', friends: [{name: 'Jane'}, {name: 'Rajesh'}]} + in name:, friends: [{name: first_friend}, *] + "matched: #{first_friend}" + else + "not matched" + end + #=> "matched: Jane" + +The "rest" part of a pattern also can be bound to a variable: + + case [1, 2, 3] + in a, *rest + "matched: #{a}, #{rest}" + else + "not matched" + end + #=> "matched: 1, [2, 3]" + + case {a: 1, b: 2, c: 3} + in a:, **rest + "matched: #{a}, #{rest}" + else + "not matched" + end + #=> "matched: 1, {b: 2, c: 3}" + +\Binding to variables currently does NOT work for alternative patterns joined with <code>|</code>: + + case {a: 1, b: 2} + in {a: } | Array + # ^ SyntaxError (variable capture in alternative pattern) + "matched: #{a}" + else + "not matched" + end + +Variables that start with <code>_</code> are the only exclusions from this rule: + + case {a: 1, b: 2} + in {a: _, b: _foo} | Array + "matched: #{_}, 
#{_foo}" + else + "not matched" + end + # => "matched: 1, 2" + +It is not advised, though, to reuse the bound value, as this pattern's goal is to signify a discarded value. + +== Variable pinning + +Due to the variable binding feature, an existing local variable cannot be used directly as a sub-pattern: + + expectation = 18 + + case [1, 2] + in expectation, *rest + "matched. expectation was: #{expectation}" + else + "not matched. expectation was: #{expectation}" + end + # expected: "not matched. expectation was: 18" + # real: "matched. expectation was: 1" -- local variable just rewritten + +For this case, the pin operator <code>^</code> can be used, to tell Ruby "just use this value as part of the pattern": + + expectation = 18 + case [1, 2] + in ^expectation, *rest + "matched. expectation was: #{expectation}" + else + "not matched. expectation was: #{expectation}" + end + #=> "not matched. expectation was: 18" + +One important usage of variable pinning is specifying that the same value should occur in the pattern several times: + + jane = {school: 'high', schools: [{id: 1, level: 'middle'}, {id: 2, level: 'high'}]} + john = {school: 'high', schools: [{id: 1, level: 'middle'}]} + + case jane + in school:, schools: [*, {id:, level: ^school}] # select the last school, level should match + "matched. school: #{id}" + else + "not matched" + end + #=> "matched. school: 2" + + case john # the specified school level is "high", but last school does not match + in school:, schools: [*, {id:, level: ^school}] + "matched. 
school: #{id}" + else + "not matched" + end + #=> "not matched" + +In addition to pinning local variables, you can also pin instance, global, and class variables: + + $gvar = 1 + class A + @ivar = 2 + @@cvar = 3 + case [1, 2, 3] + in ^$gvar, ^@ivar, ^@@cvar + "matched" + else + "not matched" + end + #=> "matched" + end + +You can also pin the result of arbitrary expressions using parentheses: + + a = 1 + b = 2 + case 3 + in ^(a + b) + "matched" + else + "not matched" + end + #=> "matched" + +== Matching non-primitive objects: +deconstruct+ and +deconstruct_keys+ + +As already mentioned above, array, find, and hash patterns will, besides literal arrays and hashes, try to match any object implementing +deconstruct+ (for array/find patterns) or +deconstruct_keys+ (for hash patterns). + + class Point + def initialize(x, y) + @x, @y = x, y + end + + def deconstruct + puts "deconstruct called" + [@x, @y] + end + + def deconstruct_keys(keys) + puts "deconstruct_keys called with #{keys.inspect}" + {x: @x, y: @y} + end + end + + case Point.new(1, -2) + in px, Integer # sub-patterns and variable binding work + "matched: #{px}" + else + "not matched" + end + # prints "deconstruct called" + "matched: 1" + + case Point.new(1, -2) + in x: 0.. => px + "matched: #{px}" + else + "not matched" + end + # prints: deconstruct_keys called with [:x] + #=> "matched: 1" + ++keys+ are passed to +deconstruct_keys+ to provide room for optimization in the matched class: if calculating a full hash representation is expensive, one may calculate only the necessary subhash. When the <code>**rest</code> pattern is used, +nil+ is passed as a +keys+ value: + + case Point.new(1, -2) + in x: 0.. 
=> px, **rest + "matched: #{px}" + else + "not matched" + end + # prints: deconstruct_keys called with nil + #=> "matched: 1" + +Additionally, when matching custom classes, the expected class can be specified as part of the pattern and is checked with <code>===</code>: + + class SuperPoint < Point + end + + case Point.new(1, -2) + in SuperPoint(x: 0.. => px) + "matched: #{px}" + else + "not matched" + end + #=> "not matched" + + case SuperPoint.new(1, -2) + in SuperPoint[x: 0.. => px] # [] or () parentheses are allowed + "matched: #{px}" + else + "not matched" + end + #=> "matched: 1" + +These core and library classes implement deconstruction: + +* MatchData#deconstruct and MatchData#deconstruct_keys; +* Time#deconstruct_keys, Date#deconstruct_keys, DateTime#deconstruct_keys. + +== Guard clauses + ++if+ can be used to attach an additional condition (guard clause) when the pattern matches in +case+/+in+ expressions. +This condition may use bound variables: + + case [1, 2] + in a, b if b == a*2 + "matched" + else + "not matched" + end + #=> "matched" + + case [1, 1] + in a, b if b == a*2 + "matched" + else + "not matched" + end + #=> "not matched" + ++unless+ works, too: + + case [1, 1] + in a, b unless b == a*2 + "matched" + else + "not matched" + end + #=> "matched" + +Note that the <code>=></code> and +in+ operators cannot have a guard clause. +The following example is parsed as a standalone expression with modifier +if+: + + [1, 2] in a, b if b == a*2 + +== Appendix A. Pattern syntax + +Approximate syntax is: + + pattern: value_pattern + | variable_pattern + | alternative_pattern + | as_pattern + | array_pattern + | find_pattern + | hash_pattern + + value_pattern: literal + | Constant + | ^local_variable + | ^instance_variable + | ^class_variable + | ^global_variable + | ^(expression) + + variable_pattern: variable + + alternative_pattern: pattern | pattern | ... 
+ + as_pattern: pattern => variable + + array_pattern: [pattern, ..., *variable] + | Constant(pattern, ..., *variable) + | Constant[pattern, ..., *variable] + + find_pattern: [*variable, pattern, ..., *variable] + | Constant(*variable, pattern, ..., *variable) + | Constant[*variable, pattern, ..., *variable] + + hash_pattern: {key: pattern, key:, ..., **variable} + | Constant(key: pattern, key:, ..., **variable) + | Constant[key: pattern, key:, ..., **variable] + +== Appendix B. Some undefined behavior examples + +To leave room for optimization in the future, the specification contains some undefined behavior. + +Use of a variable in an unmatched pattern: + + case [0, 1] + in [a, 2] + "not matched" + in b + "matched" + in c + "not matched" + end + a #=> undefined + c #=> undefined + +Number of +deconstruct+, +deconstruct_keys+ method calls: + + $i = 0 + ary = [0] + def ary.deconstruct + $i += 1 + self + end + case ary + in [0, 1] + "not matched" + in [0] + "matched" + end + $i #=> undefined diff --git a/doc/syntax/precedence.rdoc b/doc/syntax/precedence.rdoc index 515626c74f..f0ca92b571 100644 --- a/doc/syntax/precedence.rdoc +++ b/doc/syntax/precedence.rdoc @@ -49,10 +49,14 @@ Unary <code>+</code> and unary <code>-</code> are for <code>+1</code>, <code>-1</code> or <code>-(a + b)</code>. Modifier-if, modifier-unless, etc. are for the modifier versions of those -keywords. For example, this is a modifier-unless expression: +keywords. For example, this is a modifier-unless statement: a += 1 unless a.zero? +Note that <code>(a if b rescue c)</code> is parsed as <code>((a if b) rescue +c)</code> due to reasons not related to precedence. See {modifier +statements}[control_expressions.rdoc#label-Modifier+Statements]. + <code>{ ... }</code> blocks have priority below all listed operations, but <code>do ... end</code> blocks have lower priority. 
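The different priority of <code>{ ... }</code> and <code>do ... end</code> blocks noted above decides which method receives the block in a call without parentheses. A minimal sketch, with hypothetical method names +inner+ and +outer+ that only report whether they were given a block:

```ruby
# Hypothetical helpers: each reports whether it received a block.
def inner
  block_given?
end

def outer(inner_result)
  { inner_got_block: inner_result, outer_got_block: block_given? }
end

# Braces bind to the nearest method call, here `inner`.
with_braces = outer inner { }
# do ... end binds to the leftmost method call, here `outer`.
with_do_end = outer inner do end

with_braces  # => {inner_got_block: true, outer_got_block: false}
with_do_end  # => {inner_got_block: false, outer_got_block: true}
```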
diff --git a/doc/syntax/refinements.rdoc b/doc/syntax/refinements.rdoc index fc554bb476..4095977284 100644 --- a/doc/syntax/refinements.rdoc +++ b/doc/syntax/refinements.rdoc @@ -212,10 +212,7 @@ all refinements from the same module are active when a refined method When looking up a method for an instance of class +C+ Ruby checks: -* If refinements are active for +C+, in the reverse order they were activated: - * The prepended modules from the refinement for +C+ - * The refinement for +C+ - * The included modules from the refinement for +C+ +* The refinements of +C+, in reverse order of activation * The prepended modules of +C+ * +C+ * The included modules of +C+ @@ -245,7 +242,8 @@ When +super+ is invoked method lookup checks: Note that +super+ in a method of a refinement invokes the method in the refined class even if there is another refinement which has been activated in -the same context. +the same context. This is only true for +super+ in a method of a refinement; it +does not apply to +super+ in a method in a module that is included in a refinement. == Methods Introspection @@ -278,6 +276,6 @@ Refinements in descendants have higher precedence than those of ancestors. == Further Reading -See https://bugs.ruby-lang.org/projects/ruby-trunk/wiki/RefinementsSpec for the +See https://github.com/ruby/ruby/wiki/Refinements-Spec for the current specification for implementing refinements. The specification also contains more details.
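The lookup order above (refinements first, then the class itself) and the behavior of +super+ inside a refinement can be illustrated with a short sketch. The names +Greeter+, +LoudGreeter+ and +RefinementDemo+ are hypothetical, and the refinement is activated with +using+ inside a module body so that its scope is easy to see:

```ruby
# Hypothetical class and refinement, for illustration only.
class Greeter
  def greet
    "hello"
  end
end

module LoudGreeter
  refine Greeter do
    def greet
      super.upcase   # super invokes Greeter#greet in the refined class
    end
  end
end

module RefinementDemo
  using LoudGreeter              # active until the end of this module body
  REFINED = Greeter.new.greet    # looked up in the refinement first
end

RefinementDemo::REFINED  # => "HELLO"
Greeter.new.greet        # => "hello" (refinement not active here)
```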
