author normal <normal@b2dd03c8-39d4-4d8f-98ff-823fe69b080e> 2017-02-24 01:01:23 +0000
committer normal <normal@b2dd03c8-39d4-4d8f-98ff-823fe69b080e> 2017-02-24 01:01:23 +0000
commit 4e90dcc9d77d27ab2e2ec96c01e2f9087f1b3499 (patch)
tree 8f266624ef08d5d1bfc33cf173b4e5801e9015e4 /string.c
parent 6323c8b824aa1294cab87e8e92d8930801c650f7 (diff)
string.c (str_uminus): deduplicate strings
This exposes the rb_fstring internal function to return a
deduped and frozen string when a non-frozen string is given.

This is useful for all sorts of record processing where arbitrary
keys and values may be stored, but certain keys and values are
often duplicated at a high frequency, so the memory savings can
be noticeable.

Use cases are many:

* email/NNTP header processing

  There are some standard header keys everybody uses
  (From/To/Cc/Date/Subject/Received/Message-ID/References/In-Reply-To),
  as well as common ones specific to certain lists
  (ruby-core has X-Redmine-* headers).
  It is also useful to dedupe values, as most inboxes have
  multiple messages from the same sender or MUA.

* package management systems - things like RubyGems store
  identical strings for licenses, dependency names, author
  names/emails, etc.

* HTTP headers/trailers - standard headers
  (Host/Accept/Accept-Encoding/User-Agent/...) are common, but
  there are also uncommon ones. Values may be deduped as well,
  as it is likely a user agent will make multiple/parallel
  requests to the same server.

* version control systems - this can be useful for deduplicating
  names of frequent committers (like "nobu" :)

  In linux.git and git.git, there are also common trailers such
  as Signed-off-by/Acked-by/Reviewed-by/Fixes/... as well as
  less common ones.

* audio metadata - there are commonly used tags
  (Artist/Album/Title/Tracknumber), but Vorbis comments allow
  arbitrary key values to be stored. Music collections contain
  songs by the same artist or multiple songs from the same
  album, so deduplicating values will be helpful there, too.

* JSON, YAML, XML, HTML processing

  Certain fields, tags and attributes are commonly used within
  a single document and across multiple documents.

There is no security concern about this becoming a DoS vector by
creating immortal strings. The fstring table is not a GC root
and is not walked during the mark phase. GC-able dynamic symbols
(since Ruby 2.2) are handled in the same manner, and that
implementation also relies on the non-immortality of fstrings.

[Feature #13077] [ruby-core:79663]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57698 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
Diffstat (limited to 'string.c')
-rw-r--r-- string.c 2
1 files changed, 1 insertions, 1 deletions
diff --git a/string.c b/string.c
index 5e1769a792..91040b14f8 100644
--- a/string.c
+++ b/string.c
@@ -2530,7 +2530,7 @@ str_uminus(VALUE str)
         return str;
     }
     else {
-        return rb_str_freeze(rb_str_dup(str));
+        return rb_fstring(str);
     }
 }
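
The hunk above changes what String#-@ returns for an unfrozen receiver. A minimal Ruby-level sketch of the observable effect follows, assuming an interpreter that includes this change (it shipped in Ruby 2.5). The variable names are illustrative only and do not come from the commit.

```ruby
# Build two distinct, unfrozen String objects with equal content at run time,
# so that any deduplication of string literals does not interfere.
a = ["Message", "ID"].join("-")
b = ["Message", "ID"].join("-")

# String#-@ is implemented by str_uminus in string.c.
x = -a
y = -b

distinct_inputs = !a.equal?(b)  # separate objects before deduplication
same_object     = x.equal?(y)   # both results are the one entry in the fstring table
frozen_result   = x.frozen?     # the returned string is frozen
input_untouched = !a.frozen?    # the receiver itself is left unfrozen

puts [distinct_inputs, same_object, frozen_result, input_untouched].inspect
```

Before this commit, x and y would have been two separate frozen copies. With rb_fstring they are the same object, which is where the memory savings for repeated keys and values come from.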