Use end of char boundary in start_with?

Previously we used the next character following the found prefix to determine if the match ended on a broken character.

This caused surprising behaviour when a valid character was followed by a UTF-8 continuation byte.

This commit changes the behaviour to instead look for the end of the last character in the prefix.

[Bug #19784]

Co-authored-by: ywenc <ywenc@github.com>
Co-authored-by: Nobuyoshi Nakada <nobu@ruby-lang.org>
author: John Hawthorn <john@hawthorn.email> 2023-08-31 15:12:47 -0700
committer: John Hawthorn <john@hawthorn.email> 2023-09-01 16:23:28 -0700
commit: d89b15cdce8a2fa36fc2a150551f0dd8e58814d7 (patch)
tree: 789da818c90a706c659de182207e02a9cc5b1e40 /string.c
parent: 2ca0f01015d076d966ab1b0f28700a4424b86da6 (diff)
1 files changed, 2 insertions, 2 deletions
diff --git a/string.c b/string.c
index 5af5fc4a40..deeed4a12a 100644
--- a/string.c
+++ b/string.c
@@ -10472,7 +10472,7 @@ rb_str_start_with(int argc, VALUE *argv, VALUE str)
             p = RSTRING_PTR(str);
             e = p + slen;
             s = p + tlen;
-            if (!at_char_boundary(p, s, e, enc))
+            if (!at_char_right_boundary(p, s, e, enc))
                 continue;
             if (memcmp(p, RSTRING_PTR(tmp), tlen) == 0)
                 return Qtrue;
@@ -10554,7 +10554,7 @@ deleted_prefix_length(VALUE str, VALUE prefix)
         }
         const char *strend = strptr + olen;
         const char *after_prefix = strptr + prefixlen;
-        if (!at_char_boundary(strptr, after_prefix, strend, enc)) {
+        if (!at_char_right_boundary(strptr, after_prefix, strend, enc)) {
             /* prefix does not end at char-boundary */
             return 0;
         }
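
The difference between the two checks swapped in the hunks above can be illustrated with a simplified, UTF-8-only model. This is a sketch under stated assumptions, not Ruby's implementation: the real helpers are encoding-generic and take an `rb_encoding *`, and the names `utf8_is_cont`, `utf8_expected_len`, `utf8_left_head`, `boundary_by_next_char`, and `boundary_by_prev_char_end` are hypothetical, introduced only for this example. The length lookup is also deliberately naive and does not validate sequences.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: true for UTF-8 continuation bytes (10xxxxxx). */
bool utf8_is_cont(unsigned char b)
{
    return (b & 0xC0) == 0x80;
}

/* Hypothetical helper: length implied by a lead byte (no validation). */
size_t utf8_expected_len(unsigned char lead)
{
    if ((lead & 0xE0) == 0xC0) return 2;
    if ((lead & 0xF0) == 0xE0) return 3;
    if ((lead & 0xF8) == 0xF0) return 4;
    return 1; /* ASCII, stray continuation byte, or invalid lead */
}

/* Walk back from p to the nearest non-continuation byte, never past s. */
const char *utf8_left_head(const char *s, const char *p)
{
    while (p > s && utf8_is_cont((unsigned char)*p)) p--;
    return p;
}

/* Models the old check: does the byte at p start a character?
 * A stray continuation byte at p makes this fail even when the
 * character before p is complete. */
bool boundary_by_next_char(const char *s, const char *p, const char *e)
{
    if (p >= e) return true; /* end of string counts as a boundary */
    return utf8_left_head(s, p) == p;
}

/* Models the new check: does the character before p end exactly at p? */
bool boundary_by_prev_char_end(const char *s, const char *p, const char *e)
{
    if (p >= e) return true; /* end of string counts as a boundary */
    const char *head = utf8_left_head(s, p);
    if (head < p) {
        size_t len = utf8_expected_len((unsigned char)*head);
        if ((size_t)(e - head) < len) len = (size_t)(e - head);
        head += len; /* one past the end of the character found */
    }
    return head == p;
}
```

In this model, for the bytes `"a\x80"` with `p` one byte in, `boundary_by_next_char` reports no boundary because `\x80` is a continuation byte, while `boundary_by_prev_char_end` reports a boundary because `a` is a complete one-byte character ending at `p`. Both reject a position inside a genuine multibyte character such as `"\xE3\x81\x82"`.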