* Cross-patch from Ruby CVS; mostly Nabu edits.

* Fixes ticket:68. ***** Note that this is an API change!!! ***** Entity declarations in the doctype now generate events that carry two arguments, not one.
* Implements ticket:15, using gwrite's suggestion. This allows Element to be subclassed.
* Fixed namespace handling in XPath and element. ***** Note that this is an API change!!! ***** Element.namespaces() now returns a hash of the namespace mappings that are relevant for that node.
* Fixes a bug in multiple decodings.
* The changeset 1230:1231 was bad. The default behavior is *not* to use the native REXML encodings, but rather to use ICONV. I'll have to think of a better way of managing translations, but the REXML codecs are (a) less reliable than ICONV and, more importantly, (b) slower. The real solution is to use ICONV by default, but allow users to specify that they want to use the pure Ruby codecs.
* Fixes ticket:61 (xpath_parser).
* Fixes ticket:63 (UTF-16; UNILE decoding was bad).
* Improves parsing error messages a little.
* Adds the ability to override the encoding detection in Source construction.
* Fixes an edge case in Functions::string, where document nodes weren't correctly converted.
* Fixes Functions::string() for Element and Document nodes.
* Fixes some problems in entity handling.
* Addresses ticket:66.
* Fixes ticket:71.
* Addresses ticket:78. NOTE that this also fixes what is technically another bug in REXML. REXML's XPath parser used to allow exponential notation in numbers. The XPath spec is specific about what a number is, and scientific notation is not included. Therefore, this has been fixed.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/branches/ruby_1_8@11315 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
author: ser <ser@b2dd03c8-39d4-4d8f-98ff-823fe69b080e> 2006-12-01 02:20:08 +0000
committer: ser <ser@b2dd03c8-39d4-4d8f-98ff-823fe69b080e> 2006-12-01 02:20:08 +0000
commit: f114b85d89cf98cf4a11731615df77e50901d0c1 (patch)
tree: 5d0fe1b4da60aaa23cde90cbd793e2584b7762a4 /lib/rexml/text.rb
parent: d2205c869eb959e6ddb8b08c4ab7318d30ea62af (diff)
1 files changed, 24 insertions, 23 deletions
diff --git a/lib/rexml/text.rb b/lib/rexml/text.rb
index 55bc9f50f8..3de9170623 100644
--- a/lib/rexml/text.rb
+++ b/lib/rexml/text.rb
@@ -42,6 +42,7 @@ module REXML
     # Use this field if you have entities defined for some text, and you don't
     # want REXML to escape that text in output.
     #   Text.new( "<&", false, nil, false ) #-> "&lt;&amp;"
+    #   Text.new( "&lt;&amp;", false, nil, false ) #-> "&amp;lt;&amp;amp;"
     #   Text.new( "<&", false, nil, true )  #-> Parse exception
     #   Text.new( "&lt;&amp;", false, nil, true )  #-> "&lt;&amp;"
     #   # Assume that the entity "s" is defined to be "sean"
@@ -172,17 +173,6 @@ module REXML
       end
       @unnormalized = Text::unnormalize( @string, doctype )
     end
-     
-     def wrap(string, width, addnewline=false)
-       # Recursivly wrap string at width.
-       return string if string.length <= width
-       place = string.rindex(' ', width) # Position in string with last ' ' before cutoff
-       if addnewline then
-         return "\n" + string[0,place] + "\n" + wrap(string[place+1..-1], width)
-       else
-         return string[0,place] + "\n" + wrap(string[place+1..-1], width)
-       end
-     end
 
     # Sets the contents of this text node.  This expects the text to be 
     # unnormalized.  It returns self.
@@ -198,17 +188,28 @@ module REXML
       @raw = false
     end
  
-     def indent_text(string, level=1, style="\t", indentfirstline=true)
-      return string if level < 0
-       new_string = ''
-       string.each { |line|
-         indent_string = style * level
-         new_line = (indent_string + line).sub(/[\s]+$/,'')
-         new_string << new_line
-       }
-       new_string.strip! unless indentfirstline
-       return new_string
+     def wrap(string, width, addnewline=false)
+       # Recursivly wrap string at width.
+       return string if string.length <= width
+       place = string.rindex(' ', width) # Position in string with last ' ' before cutoff
+       if addnewline then
+         return "\n" + string[0,place] + "\n" + wrap(string[place+1..-1], width)
+       else
+         return string[0,place] + "\n" + wrap(string[place+1..-1], width)
+       end
      end
+
+    def indent_text(string, level=1, style="\t", indentfirstline=true)
+      return string if level < 0
+      new_string = ''
+      string.each { |line|
+        indent_string = style * level
+        new_line = (indent_string + line).sub(/[\s]+$/,'')
+        new_string << new_line
+      }
+      new_string.strip! unless indentfirstline
+      return new_string
+    end
  
     def write( writer, indent=-1, transitive=false, ie_hack=false ) 
       s = to_s()
@@ -286,9 +287,10 @@ module REXML
     def Text::normalize( input, doctype=nil, entity_filter=nil )
       copy = input
       # Doing it like this rather than in a loop improves the speed
+      #copy = copy.gsub( EREFERENCE, '&amp;' )
+      copy = copy.gsub( "&", "&amp;" )
       if doctype
         # Replace all ampersands that aren't part of an entity
-        copy = copy.gsub( EREFERENCE, '&amp;' )
         doctype.entities.each_value do |entity|
           copy = copy.gsub( entity.value, 
             "&#{entity.name};" ) if entity.value and 
@@ -296,7 +298,6 @@ module REXML
         end
       else
         # Replace all ampersands that aren't part of an entity
-        copy = copy.gsub( EREFERENCE, '&amp;' )
         DocType::DEFAULT_ENTITIES.each_value do |entity|
           copy = copy.gsub(entity.value, "&#{entity.name};" )
         end
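
The last hunk moves the ampersand escape ahead of the doctype check and makes it unconditional, so every "&" is escaped before entity values are substituted. The following is a minimal standalone sketch of that order of operations, not the REXML implementation itself: `SKETCH_ENTITIES` and `sketch_normalize` are hypothetical names, and the entity table is an assumed stand-in for `DocType::DEFAULT_ENTITIES`.

```ruby
# Hypothetical stand-in for the default entity table (name => replacement text).
# This is an assumption for illustration, not REXML's actual constant.
SKETCH_ENTITIES = { "lt" => "<", "gt" => ">", "quot" => '"', "apos" => "'" }.freeze

# Hypothetical helper mirroring the order of operations in the patched hunk.
def sketch_normalize(input, entities = SKETCH_ENTITIES)
  # Step 1: escape every ampersand, including ones that already start an
  # entity reference.
  copy = input.gsub("&", "&amp;")
  # Step 2: replace each entity's literal value with its reference.
  entities.each do |name, value|
    copy = copy.gsub(value, "&#{name};")
  end
  copy
end

puts sketch_normalize("<&")          # &lt;&amp;
puts sketch_normalize("&lt;&amp;")   # &amp;lt;&amp;amp;
```

Under these assumptions, already-escaped input is escaped a second time, which matches the `Text.new( "&lt;&amp;", false, nil, false )` line added to the documentation comment in the first hunk.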