summaryrefslogtreecommitdiff
path: root/transcode.c
diff options
context:
space:
mode:
authorJeremy Evans <code@jeremyevans.net>2021-06-26 12:32:39 -0700
committerGitHub <noreply@github.com>2021-06-26 12:32:39 -0700
commite86c1f6fc53433ef5c82ed2b7a4cc9a12c153e4c (patch)
treec72cf93eaf31977d2cd80ecda12683c4017e3b55 /transcode.c
parent391abc543cea118a9cd7d6310acadbfa352668ef (diff)
Work around issue transcoding issue with non-ASCII compatible encodings and xml escaping
When using a non-ASCII compatible source and destination encoding and xml escaping (the :xml option to String#encode), the resulting string was broken, as it used the correct non-ASCII compatible encoding, but contained data that was ASCII-compatible instead of compatible with the string's encoding. Work around this issue by detecting the case where both the source and destination encoding are non-ASCII compatible, and transcoding the source string from the non-ASCII compatible encoding to UTF-8. The xml escaping code will correctly handle the UTF-8 source string and the return the correctly encoded and escaped value. Fixes [Bug #12052] Co-authored-by: Nobuyoshi Nakada <nobu@ruby-lang.org>
Notes
Notes: Merged: https://github.com/ruby/ruby/pull/4605 Merged-By: jeremyevans <code@jeremyevans.net>
Diffstat (limited to 'transcode.c')
-rw-r--r--transcode.c6
1 files changed, 6 insertions, 0 deletions
diff --git a/transcode.c b/transcode.c
index 505c8177fb..a452448d99 100644
--- a/transcode.c
+++ b/transcode.c
@@ -2719,6 +2719,12 @@ str_transcode0(int argc, VALUE *argv, VALUE *self, int ecflags, VALUE ecopts)
}
}
else {
+ if (senc && denc && !rb_enc_asciicompat(senc) && !rb_enc_asciicompat(denc)) {
+ rb_encoding *utf8 = rb_utf8_encoding();
+ str = rb_str_conv_enc(str, senc, utf8);
+ senc = utf8;
+ sname = "UTF-8";
+ }
if (encoding_equal(sname, dname)) {
sname = "";
dname = "";