author Burdette Lamar <BurdetteLamar@Yahoo.com> 2022-02-27 15:43:23 -0600
committer GitHub <noreply@github.com> 2022-02-27 15:43:23 -0600
commit 28ee1ca74831a9265ff40c81d14ff327837af757
tree 987f98beb9ff67c4f31419fcda3139114290a791
parent 7f4345639b09395f2ab423d1cdac6f2ddf0707de
[DOC] Enhanced RDoc for encoding (#5603)
Additions and corrections for external/internal encodings.
Notes: Merged-By: BurdetteLamar <BurdetteLamar@Yahoo.com>
-rw-r--r-- doc/encoding.rdoc 69
1 files changed, 43 insertions, 26 deletions
diff --git a/doc/encoding.rdoc b/doc/encoding.rdoc
index fcbbf3afa5..3c6d1f2889 100644
--- a/doc/encoding.rdoc
+++ b/doc/encoding.rdoc
@@ -205,57 +205,74 @@ other than from the filesystem:
Encoding.find('locale') # => #<Encoding:IBM437>
-=== \IO Encodings
+=== Stream Encodings
-An IO object (an input/output stream), and by inheritance a File object,
-has at least one, and sometimes two, encodings:
+Certain stream objects can have two encodings; these objects include instances of:
-- Its _external_ _encoding_ identifies the encoding of the stream.
-- Its _internal_ _encoding_, if not +nil+, specifies the encoding
+- IO.
+- File.
+- ARGF.
+- StringIO.
+
+The two encodings are:
+
+- An _external_ _encoding_, which identifies the encoding of the stream.
+- An _internal_ _encoding_, which (if not +nil+) specifies the encoding
to be used for the string constructed from the stream.
==== External \Encoding
-Bytes read from the stream are decoded into characters via the external encoding;
-by default (that is, if the internal encoding is +nil),
-those characters become a string whose encoding is set to the external encoding.
+The external encoding, which is an \Encoding object, specifies how bytes read
+from the stream are to be interpreted as characters.
The default external encoding is:
- UTF-8 for a text stream.
- ASCII-8BIT for a binary stream.
- f = File.open('t.rus', 'rb')
- f.external_encoding # => #<Encoding:ASCII-8BIT>
+The default external encoding is returned by method Encoding.default_external,
+and may be set by:
+
+- Ruby command-line options <tt>--external_encoding</tt> or <tt>-E</tt>.
+
+You can also set the default external encoding using method Encoding.default_external=,
+but doing so may cause problems; strings created before and after the change
+may have different encodings.
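The caveat about changing the default external encoding can be illustrated with a minimal sketch. It assumes that Encoding.default_internal is +nil+ (the documented default) and uses a hypothetical scratch file created with Tempfile; neither the file name nor the chosen ISO-8859-1 encoding comes from the commit.

```ruby
require 'tempfile'

# Hypothetical scratch file, used only for illustration.
tmp = Tempfile.new('encoding_demo')
tmp.write('abc')
tmp.close

original = Encoding.default_external

begin
  # Change the process-wide default; streams opened for reading
  # from now on use it as their external encoding.
  Encoding.default_external = Encoding::ISO_8859_1
  during = File.open(tmp.path) { |f| f.read.encoding }
ensure
  # Always restore the previous default.
  Encoding.default_external = original
end

# A string read after the restore carries the original default again,
# so 'during' and 'after' may differ: the problem the text warns about.
after = File.open(tmp.path) { |f| f.read.encoding }

tmp.unlink
```

Restoring the saved value in an +ensure+ clause keeps the change from leaking into the rest of the program, which is one way to limit the mixed-encoding problem.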
-The external encoding may be set by the open option +external_encoding+:
+For an \IO or \File object, the external encoding may be set by:
- f = File.open('t.txt', external_encoding: 'ASCII-8BIT')
- f.external_encoding # => #<Encoding:ASCII-8BIT>
+- Open options +external_encoding+ or +encoding+, when the object is created;
+ see {Open Options}[rdoc-ref:IO@Open+Options].
-The external encoding may also set by method #set_encoding:
+For an \IO, \File, \ARGF, or \StringIO object, the external encoding may be set by:
- f = File.open('t.txt')
- f.set_encoding('ASCII-8BIT')
- f.external_encoding # => #<Encoding:ASCII-8BIT>
+- \Methods +set_encoding+ or (except for \ARGF) +set_encoding_by_bom+.
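Because this change drops the inline examples, here is a minimal sketch of the two ways of setting the external encoding listed above. The scratch file and the ISO-8859-1 choice are assumptions made for illustration only.

```ruby
require 'tempfile'
require 'stringio'

# Hypothetical scratch file, used only for illustration.
tmp = Tempfile.new('encoding_demo')
tmp.write("hello\n")
tmp.close

# 1. Open option: set the external encoding when the stream is created.
f = File.open(tmp.path, external_encoding: 'ISO-8859-1')
opened_with = f.external_encoding   # Encoding::ISO_8859_1
f.close

# 2. set_encoding: change the external encoding of an open stream.
f = File.open(tmp.path)
f.set_encoding('ASCII-8BIT')
set_later = f.external_encoding     # Encoding::ASCII_8BIT
f.close

# StringIO has no open options, but it does support set_encoding.
sio = StringIO.new(String.new('hello'))
sio.set_encoding(Encoding::ISO_8859_1)
sio_encoding = sio.external_encoding  # Encoding::ISO_8859_1

tmp.unlink
```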
==== Internal \Encoding
-If not +nil+, the internal encoding specifies that the characters read
-from the stream are to be converted to characters in the internal encoding;
+The internal encoding, which is an \Encoding object or +nil+,
+specifies how characters read from the stream
+are to be converted to characters in the internal encoding;
those characters become a string whose encoding is set to the internal encoding.
The default internal encoding is +nil+ (no conversion).
-The internal encoding may set by the open option +internal_encoding+:
+It is returned by method Encoding.default_internal,
+and may be set by:
+
+- Ruby command-line options <tt>--internal_encoding</tt> or <tt>-E</tt>.
+
+You can also set the default internal encoding using method Encoding.default_internal=,
+but doing so may cause problems; strings created before and after the change
+may have different encodings.
+
+For an \IO or \File object, the internal encoding may be set by:
- f = File.open('t.txt', internal_encoding: 'ASCII-8BIT')
- f.internal_encoding # => #<Encoding:ASCII-8BIT>
+- Open options +internal_encoding+ or +encoding+, when the object is created;
+ see {Open Options}[rdoc-ref:IO@Open+Options].
-The internal encoding may also set by method #set_encoding:
+For an \IO, \File, \ARGF, or \StringIO object, the internal encoding may be set by:
- f = File.open('t.txt')
- f.set_encoding('UTF-8', 'ASCII-8BIT')
- f.internal_encoding # => #<Encoding:ASCII-8BIT>
+- \Method +set_encoding+.
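The conversion performed by an internal encoding can be sketched as follows. The scratch file, its contents, and the UTF-8 to ISO-8859-1 pairing are assumptions chosen for illustration. The external encoding is given explicitly so that the result does not depend on the default external encoding of the machine running the example.

```ruby
require 'tempfile'

# Hypothetical scratch file holding "café" as UTF-8 bytes
# (the final character occupies two bytes: 0xC3 0xA9).
tmp = Tempfile.new('encoding_demo')
tmp.close
File.binwrite(tmp.path, "caf\u00E9")

# Open options: bytes are interpreted as UTF-8 (external),
# then converted to ISO-8859-1 (internal) as they are read.
f = File.open(tmp.path, external_encoding: 'UTF-8', internal_encoding: 'ISO-8859-1')
text = f.read
f.close
text_encoding = text.encoding   # Encoding::ISO_8859_1
text_bytes = text.bytes         # the final character is now the single byte 0xE9

# set_encoding with two arguments: external first, then internal.
f = File.open(tmp.path)
f.set_encoding('UTF-8', 'ISO-8859-1')
internal = f.internal_encoding  # Encoding::ISO_8859_1
f.close

tmp.unlink
```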
=== Script \Encoding