diff options
Diffstat (limited to 'ruby_2_2/doc/marshal.rdoc')
-rw-r--r-- | ruby_2_2/doc/marshal.rdoc | 313 |
1 files changed, 0 insertions, 313 deletions
diff --git a/ruby_2_2/doc/marshal.rdoc b/ruby_2_2/doc/marshal.rdoc deleted file mode 100644 index f97db00a49..0000000000 --- a/ruby_2_2/doc/marshal.rdoc +++ /dev/null @@ -1,313 +0,0 @@ -= Marshal Format - -The Marshal format is used to serialize ruby objects. The format can store -arbitrary objects through three user-defined extension mechanisms. - -For documentation on using Marshal to serialize and deserialize objects, see -the Marshal module. - -This document calls a serialized set of objects a stream. The Ruby -implementation can load a set of objects from a String, an IO or an object -that implements a +getc+ method. - -== Stream Format - -The first two bytes of the stream contain the major and minor version, each as -a single byte encoding a digit. The version implemented in Ruby is 4.8 -(stored as "\x04\x08") and is supported by ruby 1.8.0 and newer. - -Different major versions of the Marshal format are not compatible and cannot -be understood by other major versions. Lesser minor versions of the format -can be understood by newer minor versions. Format 4.7 can be loaded by a 4.8 -implementation but format 4.8 cannot be loaded by a 4.7 implementation. - -Following the version bytes is a stream describing the serialized object. The -stream contains nested objects (the same as a Ruby object) but objects in the -stream do not necessarily have a direct mapping to the Ruby object model. - -Each object in the stream is described by a byte indicating its type followed -by one or more bytes describing the object. When "object" is mentioned below -it means any of the types below that defines a Ruby object. - -=== true, false, nil - -These objects are each one byte long. "T" is represents +true+, "F" -represents +false+ and "0" represents +nil+. - -=== Fixnum and long - -"i" represents a signed 32 bit value using a packed format. One through five -bytes follows the type. The value loaded will always be a Fixnum. On -32 bit platforms (where the precision of a Fixnum is less than 32 bits) -loading large values will cause overflow on CRuby. - -The fixnum type is used to represent both ruby Fixnum objects and the sizes of -marshaled arrays, hashes, instance variables and other types. In the -following sections "long" will mean the format described below, which supports -full 32 bit precision. - -The first byte has the following special values: - -"\x00":: - The value of the integer is 0. No bytes follow. - -"\x01":: - The total size of the integer is two bytes. The following byte is a - positive integer in the range of 0 through 255. Only values between 123 - and 255 should be represented this way to save bytes. - -"\xff":: - The total size of the integer is two bytes. The following byte is a - negative integer in the range of -1 through -256. - -"\x02":: - The total size of the integer is three bytes. The following two bytes are a - positive little-endian integer. - -"\xfe":: - The total size of the integer is three bytes. The following two bytes are a - negative little-endian integer. - -"\x03":: - The total size of the integer is four bytes. The following three bytes are - a positive little-endian integer. - -"\xfd":: - The total size of the integer is two bytes. The following three bytes are a - negative little-endian integer. - -"\x04":: - The total size of the integer is five bytes. The following four bytes are a - positive little-endian integer. For compatibility with 32 bit ruby, - only Fixnums less than 1073741824 should be represented this way. For sizes - of stream objects full precision may be used. - -"\xfc":: - The total size of the integer is two bytes. The following four bytes are a - negative little-endian integer. For compatibility with 32 bit ruby, - only Fixnums greater than -10737341824 should be represented this way. For - sizes of stream objects full precision may be used. - -Otherwise the first byte is a sign-extended eight-bit value with an offset. -If the value is positive the value is determined by subtracting 5 from the -value. If the value is negative the value is determined by adding 5 to the -value. - -There are multiple representations for many values. CRuby always outputs the -shortest representation possible. - -=== Symbols and Byte Sequence - -":" represents a real symbol. A real symbol contains the data needed to -define the symbol for the rest of the stream as future occurrences in the -stream will instead be references (a symbol link) to this one. The reference -is a zero-indexed 32 bit value (so the first occurrence of <code>:hello</code> -is 0). - -Following the type byte is byte sequence which consists of a long indicating -the number of bytes in the sequence followed by that many bytes of data. Byte -sequences have no encoding. - -For example, the following stream contains the Symbol <code>:hello</code>: - - "\x04\x08:\x0ahello" - -";" represents a Symbol link which references a previously defined Symbol. -Following the type byte is a long containing the index in the lookup table for -the linked (referenced) Symbol. - -For example, the following stream contains <code>[:hello, :hello]</code>: - - "\x04\b[\a:\nhello;\x00" - -When a "symbol" is referenced below it may be either a real symbol or a -symbol link. - -=== Object References - -Separate from but similar to symbol references, the stream contains only one -copy of each object (as determined by #object_id) for all objects except -true, false, nil, Fixnums and Symbols (which are stored separately as -described above) a one-indexed 32 bit value will be stored and reused when the -object is encountered again. (The first object has an index of 1). - -"@" represents an object link. Following the type byte is a long giving the -index of the object. - -For example, the following stream contains an Array of the object -<code>"hello"</code> twice: - - "\004\b[\a\"\nhello@\006" - -=== Instance Variables - -"I" indicates that instance variables follow the next object. An object -follows the type byte. Following the object is a length indicating the number -of instance variables for the object. Following the length is a set of -name-value pairs. The names are symbols while the values are objects. The -symbols must be instance variable names (<code>:@name</code>). - -An Object ("o" type, described below) uses the same format for its instance -variables as described here. - -For a String and Regexp (described below) a special instance variable -<code>:E</code> is used to indicate the Encoding. - -=== Extended - -"e" indicates that the next object is extended by a module. An object follows -the type byte. Following the object is a symbol that contains the name of the -module the object is extended by. - -=== Array - -"[" represents an Array. Following the type byte is a long indicating the -number of objects in the array. The given number of objects follow the -length. - -=== Bignum - -"l" represents a Bignum which is composed of three parts: - -sign:: - A single byte containing "+" for a positive value or "-" for a negative - value. -length:: - A long indicating the number of bytes of Bignum data follows, divided by - two. Multiply the length by two to determine the number of bytes of data - that follow. -data:: - Bytes of Bignum data representing the number. - -The following ruby code will reconstruct the Bignum value from an array of -bytes: - - result = 0 - - bytes.each_with_index do |byte, exp| - result += (byte * 2 ** (exp * 8)) - end - -=== Class and Module - -"c" represents a Class object, "m" represents a Module and "M" represents -either a class or module (this is an old-style for compatibility). No class -or module content is included, this type is only a reference. Following the -type byte is a byte sequence which is used to look up an existing class or -module, respectively. - -Instance variables are not allowed on a class or module. - -If no class or module exists an exception should be raised. - -For "c" and "m" types, the loaded object must be a class or module, -respectively. - -=== Data - -"d" represents a Data object. (Data objects are wrapped pointers from ruby -extensions.) Following the type byte is a symbol indicating the class for the -Data object and an object that contains the state of the Data object. - -To dump a Data object Ruby calls _dump_data. To load a Data object Ruby calls -_load_data with the state of the object on a newly allocated instance. - -=== Float - -"f" represents a Float object. Following the type byte is a byte sequence -containing the float value. The following values are special: - -"inf":: - Positive infinity - -"-inf":: - Negative infinity - -"nan":: - Not a Number - -Otherwise the byte sequence contains a C double (loadable by strtod(3)). -Older minor versions of Marshal also stored extra mantissa bits to ensure -portability across platforms but 4.8 does not include these. See -[ruby-talk:69518] for some explanation. - -=== Hash and Hash with Default Value - -"{" represents a Hash object while "}" represents a Hash with a default value -set (<code>Hash.new 0</code>). Following the type byte is a long indicating -the number of key-value pairs in the Hash, the size. Double the given number -of objects follow the size. - -For a Hash with a default value, the default value follows all the pairs. - -=== Module and Old Module - -=== Object - -"o" represents an object that doesn't have any other special form (such as -a user-defined or built-in format). Following the type byte is a symbol -containing the class name of the object. Following the class name is a long -indicating the number of instance variable names and values for the object. -Double the given number of pairs of objects follow the size. - -The keys in the pairs must be symbols containing instance variable names. - -=== Regular Expression - -"/" represents a regular expression. Following the type byte is a byte -sequence containing the regular expression source. Following the type byte is -a byte containing the regular expression options (case-insensitive, etc.) as a -signed 8-bit value. - -Regular expressions can have an encoding attached through instance variables -(see above). If no encoding is attached escapes for the following regexp -specials not present in ruby 1.8 must be removed: g-m, o-q, u, y, E, F, H-L, -N-V, X, Y. - -=== String - -'"' represents a String. Following the type byte is a byte sequence -containing the string content. When dumped from ruby 1.9 an encoding instance -variable (<code>:E</code> see above) should be included unless the encoding is -binary. - -=== Struct - -"S" represents a Struct. Following the type byte is a symbol containing the -name of the struct. Following the name is a long indicating the number of -members in the struct. Double the number of objects follow the member count. -Each member is a pair containing the member's symbol and an object for the -value of that member. - -If the struct name does not match a Struct subclass in the running ruby an -exception should be raised. - -If there is a mismatch between the struct in the currently running ruby and -the member count in the marshaled struct an exception should be raised. - -=== User Class - -"C" represents a subclass of a String, Regexp, Array or Hash. Following the -type byte is a symbol containing the name of the subclass. Following the name -is the wrapped object. - -=== User Defined - -"u" represents an object with a user-defined serialization format using the -+_dump+ instance method and +_load+ class method. Following the type byte is -a symbol containing the class name. Following the class name is a byte -sequence containing the user-defined representation of the object. - -The class method +_load+ is called on the class with a string created from the -byte-sequence. - -=== User Marshal - -"U" represents an object with a user-defined serialization format using the -+marshal_dump+ and +marshal_load+ instance methods. Following the type byte -is a symbol containing the class name. Following the class name is an object -containing the data. - -Upon loading a new instance must be allocated and +marshal_load+ must be -called on the instance with the data. - |