Age | Commit message | Author |
|
|
|
The result of String#b is a string with an ASCII_8BIT/BINARY encoding. That encoding is ASCII-compatible and has no byte sequences that are invalid for the encoding. If we know the receiver's code range, we can derive the resulting string's code range without needing to perform a full code range scan.
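The code range itself is an internal flag, but its effect can be observed from Ruby. The following is a minimal sketch (the variable names are illustrative, not taken from the patch) showing why the result's code range can be derived from the receiver: a 7-bit receiver stays 7-bit, and anything else is still valid in BINARY.

```ruby
# Illustrative only: observes String#b results through public predicates.
ascii  = "hello"                            # 7-bit only
utf8   = "h\u00E9llo"                       # contains a multibyte character
broken = "\xFF".dup.force_encoding("UTF-8") # invalid as UTF-8

ascii_b  = ascii.b
utf8_b   = utf8.b
broken_b = broken.b

p ascii_b.encoding == Encoding::BINARY  # true
p ascii_b.ascii_only?                   # true: a 7-bit receiver stays 7-bit
p utf8_b.ascii_only?                    # false: the high bytes are still there
p utf8_b.valid_encoding?                # true: BINARY has no invalid sequences
p broken.valid_encoding?                # false as UTF-8
p broken_b.valid_encoding?              # true once viewed as BINARY
```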
Notes:
Merged: https://github.com/ruby/ruby/pull/6183
|
|
If the RHS has valid encoding, and both strings have the same
encoding, we can use the fast path.
However, we need to update the LHS coderange.
```
compare-ruby: ruby 3.2.0dev (2022-07-21T14:46:32Z master cdbb9b8555) [arm64-darwin21]
built-ruby: ruby 3.2.0dev (2022-07-25T07:25:41Z string-concat-vali.. 11a2772bdd) [arm64-darwin21]
warming up...
| |compare-ruby|built-ruby|
|:-------------------|-----------:|---------:|
|binary_concat_7bit | 554.816k| 556.460k|
| | -| 1.00x|
|utf8_concat_7bit | 556.367k| 555.101k|
| | 1.00x| -|
|utf8_concat_UTF8 | 412.555k| 556.824k|
| | -| 1.35x|
```
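As a rough illustration of why the LHS coderange must be updated, consider a 7-bit UTF-8 string that receives a valid multibyte UTF-8 string. This is only a sketch of the Ruby-visible behavior; the fast path itself lives in C and is not shown here.

```ruby
# Illustrative only: both strings are UTF-8 and the RHS is valid.
lhs = "abc".dup      # UTF-8, 7-bit
rhs = "d\u00E9f"     # UTF-8, valid, contains a multibyte character

before = lhs.ascii_only?   # true: the LHS starts out 7-bit
lhs << rhs
after = lhs.ascii_only?    # false: the LHS is now valid but no longer 7-bit

p before
p after
p lhs.valid_encoding?               # true
p lhs.encoding == Encoding::UTF_8   # true
```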
Notes:
Merged: https://github.com/ruby/ruby/pull/6163
|
|
[Misc #18891]
Notes:
Merged: https://github.com/ruby/ruby/pull/6094
|
|
Previously, it was including one newline when chomp was used,
which is inconsistent with IO#each_line behavior. This makes
behavior consistent with IO#each_line, chomping all paragraph
separators (multiple consecutive newlines), but not single
newlines.
Partially Fixes [Bug #18768]
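A small sketch of paragraph mode, for reference. The sample text is made up, and the exact chomped output depends on whether the running Ruby includes this fix, so it is printed rather than stated.

```ruby
# Illustrative only: an empty separator selects paragraph mode.
text = "a\n\nb\n"

with_separators = text.each_line("").to_a
chomped         = text.each_line("", chomp: true).to_a

p with_separators   # ["a\n\n", "b\n"]
p chomped           # version-dependent; with this fix, no paragraph separator remains
```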
Notes:
Merged: https://github.com/ruby/ruby/pull/5960
|
|
Not having to fetch the rb_encoding saves a significant
amount of time.
Additionally, even when we have to fetch it, we can do
it faster using `ENCODING_GET` rather than `rb_enc_get`.
```
compare-ruby: ruby 3.2.0dev (2022-07-19T08:41:40Z master cb9fd920a3) [arm64-darwin21]
built-ruby: ruby 3.2.0dev (2022-07-21T11:16:16Z faster-buffer-conc.. 4f001f0748) [arm64-darwin21]
warming up...
| |compare-ruby|built-ruby|
|:---------------------|-----------:|---------:|
|binary_concat_utf8 | 510.580k| 565.600k|
| | -| 1.11x|
|binary_concat_binary | 512.653k| 571.483k|
| | -| 1.11x|
|utf8_concat_utf8 | 511.396k| 566.879k|
| | -| 1.11x|
```
Notes:
Merged: https://github.com/ruby/ruby/pull/6160
|
|
rb_str_modify clears the coderange, which in this case isn't
necessary.
```
compare-ruby: ruby 3.2.0dev (2022-07-12T15:01:11Z master 71aec68566) [arm64-darwin21]
built-ruby: ruby 3.2.0dev (2022-07-19T07:17:01Z faster-buffer-conc.. 3cad62aab4) [arm64-darwin21]
warming up...
| |compare-ruby|built-ruby|
|:---------------------|-----------:|---------:|
|binary_concat_utf8 | 360.617k| 605.091k|
| | -| 1.68x|
|binary_concat_binary | 446.650k| 605.053k|
| | -| 1.35x|
|utf8_concat_utf8 | 454.166k| 597.311k|
| | -| 1.32x|
```
```
| |compare-ruby|built-ruby|
|:-----------|-----------:|---------:|
|erb_render | 1.790M| 2.045M|
| | -| 1.14x|
```
Notes:
Merged: https://github.com/ruby/ruby/pull/6120
|
|
If the LHS is ASCII-compatible and the RHS is 7BIT,
we can concatenate directly without being concerned about
anything else.
Benchmark:
```
compare-ruby: ruby 3.2.0dev (2022-07-12T15:01:11Z master 71aec68566) [arm64-darwin21]
built-ruby: ruby 3.2.0dev (2022-07-13T10:13:53Z faster-buffer-conc.. a04c10476d) [arm64-darwin21]
warming up...
| |compare-ruby|built-ruby|
|:---------------------|-----------:|---------:|
|binary_append_utf8 | 385.315k| 573.663k|
| | -| 1.49x|
|binary_append_binary | 446.579k| 574.898k|
| | -| 1.29x|
|utf8_append_utf8 | 430.936k| 573.394k|
| | -| 1.33x|
```
Note that in the benchmark, the RHS always has a precomputed
coderange, so the benchmark never enters the slow path of having to
scan the RHS. However, it's extremely likely that we'll end
up scanning it anyway in rb_enc_cr_str_buf_cat.
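The condition can be seen from Ruby as well. The sketch below (buffer contents are arbitrary examples) appends a 7-bit string to an ASCII-compatible buffer in a different encoding; the append succeeds and the buffer keeps its encoding.

```ruby
# Illustrative only: a 7-bit RHS appended to an ASCII-compatible LHS.
buffer = "\xFF\x00".b   # BINARY buffer containing a high byte
chunk  = "hello"        # UTF-8, 7-bit only

buffer << chunk         # no Encoding::CompatibilityError is raised

p buffer.encoding == Encoding::BINARY   # true
p buffer.bytesize                       # 7

utf8 = "d\u00E9j\u00E0 ".dup
utf8 << "vu".b          # 7-bit BINARY RHS onto a UTF-8 LHS
p utf8.encoding == Encoding::UTF_8      # true
p utf8.valid_encoding?                  # true
```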
Notes:
Merged: https://github.com/ruby/ruby/pull/6120
|
|
Otherwise it's way too easy to confuse it with US_ASCII.
Notes:
Merged: https://github.com/ruby/ruby/pull/6127
|
|
Correct call-seq directive in string.c
Notes:
Merged-By: BurdetteLamar <BurdetteLamar@Yahoo.com>
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/5867
|
|
This function was added to a public header in [1] probably
unintentionally since it's not used anywhere, exposes implementation
details, and isn't related to the goals of that pull request.
[1]: 56cc3e99b6b9ec004255280337f6b8353f5e5b06
Notes:
Merged: https://github.com/ruby/ruby/pull/6023
Merged-By: XrXr
|
|
|
|
And re-embed any strings that can now fit inside the slot they've been
moved to.
Notes:
Merged: https://github.com/ruby/ruby/pull/5986
|
|
* Add missing space for `String#start_with?`.
* Add missing pluses for `String#tr` and
`Methods for Converting to New String` label.
* Move quote into the tag for `Whitespace in Strings` label.
Notes:
Merged-By: BurdetteLamar <BurdetteLamar@Yahoo.com>
|
|
This reverts commit 9d927204e7b86eb00bfd07a060a6383139edf741.
Notes:
Merged: https://github.com/ruby/ruby/pull/5981
|
|
... only when the message string has a newline.
`p StandardError.new("foo\nbar")` now prints `#<StandardError: "foo\nbar">`
instead of:
#<StandardError: foo
bar>
[Bug #18170]
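To check which behavior a given Ruby has, something like the following can be run. It is only a sketch; the printed form depends on whether the interpreter includes this change, so no output is stated.

```ruby
# Illustrative only: inspect an exception whose message contains a newline.
error = StandardError.new("foo\nbar")

p error                           # p calls #inspect
puts error.inspect.lines.size     # 1 when the newline is escaped, 2 otherwise
p error.message == "foo\nbar"     # true: the message itself is unchanged
```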
Notes:
Merged: https://github.com/ruby/ruby/pull/4857
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/5583
|
|
|
|
Treats:
#[]
#length
#empty?
#upcase
#downcase
#capitalize
#swapcase
#start_with?
#end_with?
#encoding
::all_symbols
Notes:
Merged-By: BurdetteLamar <BurdetteLamar@Yahoo.com>
|
|
|
|
Treats:
#==
#inspect
#name
#to_s
#to_sym
#to_proc
#succ
#<=>
#casecmp
#casecmp?
#=~
#match
#match?
Notes:
Merged-By: BurdetteLamar <BurdetteLamar@Yahoo.com>
|
|
Notes:
Merged-By: BurdetteLamar <BurdetteLamar@Yahoo.com>
|
|
Notes:
Merged-By: BurdetteLamar <BurdetteLamar@Yahoo.com>
|
|
Creates file doc/string/slices.rdoc that the string slicing methods can link to.
Notes:
Merged-By: BurdetteLamar <BurdetteLamar@Yahoo.com>
|
|
Notes:
Merged-By: BurdetteLamar <BurdetteLamar@Yahoo.com>
|
|
Treats:
#length
#bytesize
Notes:
Merged-By: BurdetteLamar <BurdetteLamar@Yahoo.com>
|
|
Adds to doc for String.new, also making it compliant with documentation_guide.rdoc.
Fixes some broken links in io.c (that I failed to correct yesterday).
Notes:
Merged-By: BurdetteLamar <BurdetteLamar@Yahoo.com>
|
|
Treats:
#force_encoding
#b
#valid_encoding?
#ascii_only?
#scrub
#scrub!
#unicode_normalized?
Plus a couple of minor tweaks.
Notes:
Merged-By: BurdetteLamar <BurdetteLamar@Yahoo.com>
|
|
Repaired What's Here sections for Range, String, Symbol, Struct.
Notes:
Merged-By: BurdetteLamar <BurdetteLamar@Yahoo.com>
|
|
Treats:
#start_with?
#end_with?
#delete_prefix
#delete_prefix!
#delete_suffix
#delete_suffix!
Notes:
Merged-By: BurdetteLamar <BurdetteLamar@Yahoo.com>
|
|
Treats:
#ljust
#rjust
#center
#partition
#rpartition
Notes:
Merged-By: BurdetteLamar <BurdetteLamar@Yahoo.com>
|
|
Treats:
#scan
#hex
#oct
#crypt
#ord
#sum
Notes:
Merged-By: BurdetteLamar <BurdetteLamar@Yahoo.com>
|
|
|
|
Treats:
#lstrip
#lstrip!
#rstrip
#rstrip!
#strip
#strip!
Adds section Whitespace in Strings.
Notes:
Merged-By: BurdetteLamar <BurdetteLamar@Yahoo.com>
|
|
Method references can not only be marked up as code but also
reflect the `--show-hash` option.
The bug that prevented the old rdoc from correctly parsing these
methods was fixed last month.
|
|
Treated:
#chomp
#chomp!
#chop
#chop!
Notes:
Merged-By: BurdetteLamar <BurdetteLamar@Yahoo.com>
|
|
Treats:
#chars
#codepoints
#each_char
#each_codepoint
#each_grapheme_cluster
#grapheme_clusters
Also, corrects a passage in #unicode_normalize that mentioned module UnicodeNormalize, whose doc (:nodoc:, actually) says not to mention it.
Notes:
Merged-By: BurdetteLamar <BurdetteLamar@Yahoo.com>
|
|
As @peterzhu2118 and @duerst have pointed out, putting a String method's RDoc into doc/ (which allows non-ASCII in examples) makes the "click to toggle source" feature not work for that method.
This PR moves the primary method doc back into string.c, then includes RDoc from doc/string/*.rdoc, and also removes doc/string.rdoc.
The affected methods are:
::new
#bytes
#each_byte
#each_line
#split
The call-seq is in string.c because it works there; it did not work when the call-seq was in doc/string/*.rdoc.
This PR also updates the relevant guidance in doc/documentation_guide.rdoc.
Notes:
Merged-By: BurdetteLamar <BurdetteLamar@Yahoo.com>
|
|
Treats:
#split
#each_line
#lines
#each_byte
#bytes
Notes:
Merged-By: BurdetteLamar <BurdetteLamar@Yahoo.com>
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/5584
|
|
* Enhanced RDoc for String#split
Notes:
Merged-By: BurdetteLamar <BurdetteLamar@Yahoo.com>
|
|
Assume that all platforms where only `crypt` is available, but
not `crypt_r`, are POSIX-based.
|
|
Treats:
#count
#delete
#delete!
#squeeze
#squeeze!
Adds section "Multiple Character Selectors" to doc/character_selectors.rdoc.
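For readers unfamiliar with multiple selectors, a brief sketch: when several selectors are given, the methods act on the intersection of the selected character sets. The sample strings are arbitrary.

```ruby
# Illustrative only: multiple selectors are intersected.
s = "hello world"

p s.count("lo", "o")             # 2: only "o" is in both selectors
p s.delete("l", "lo")            # "heo word": only "l" is in both selectors
p "aaabbb".squeeze("a-c", "b")   # "aaab": only runs of "b" are squeezed
```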
Co-authored-by: Peter Zhu <peter@peterzhu.ca>
Notes:
Merged-By: BurdetteLamar <BurdetteLamar@Yahoo.com>
|
|
Treats:
#tr (revised to link to "Character Selectors" document)
#tr!
#tr_s
#tr_s!
Also renames doc/character_selector.rdoc to match its title.
Notes:
Merged-By: BurdetteLamar <BurdetteLamar@Yahoo.com>
|
|
|
|
Notes:
Merged-By: BurdetteLamar <BurdetteLamar@Yahoo.com>
|
|
Otherwise, an empty entry will be generated as `String::new` along
with the one from doc/string.rb.
|
|
* String#getbyte returns `nil` if `index` is out of range.
* Add String#getbyte example with nil output.
* Modify String#getbyte example to use negative index.
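The documented behavior can be summarized with a short sketch (the sample string is arbitrary):

```ruby
# Illustrative only: String#getbyte with in-range, negative, and out-of-range indexes.
s = "abc"

p s.getbyte(0)    # 97
p s.getbyte(-1)   # 99: negative indexes count from the end
p s.getbyte(3)    # nil: out of range
p s.getbyte(-4)   # nil: out of range
```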
Notes:
Merged: https://github.com/ruby/ruby/pull/5586
Merged-By: nobu <nobu@ruby-lang.org>
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/5410
|