From 56242ba495246e95dd5178f2ec101c1005c10afc Mon Sep 17 00:00:00 2001
From: Earlopain <14981592+Earlopain@users.noreply.github.com>
Date: Tue, 14 Jan 2025 20:20:05 +0100
Subject: Better handle regexp in the parser translator
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Turns out, it was already almost correct. If you disregard \c and \M style escapes, only a single character is allowed to be escaped in a regex so most tests passed already.

There was also a mistake where the wrong value was constructed for the ast, this is now fixed.
One test fails because of this, but I'm fairly sure it is because of a parser bug. For `/\“/`, the backslash is supposed to be removed because it is a multibyte character. But tbh,
I don't entirely understand all the rules.

Fixes more than half of the remaining ast differences for rubocop tests
---
 test/prism/ruby/parser_test.rb | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

(limited to 'test')

diff --git a/test/prism/ruby/parser_test.rb b/test/prism/ruby/parser_test.rb
index 8d791da369..fae8ec8dec 100644
--- a/test/prism/ruby/parser_test.rb
+++ b/test/prism/ruby/parser_test.rb
@@ -69,13 +69,15 @@ module Prism
 
       # https://github.com/whitequark/parser/issues/950
       "whitequark/dedenting_interpolating_heredoc_fake_line_continuation.txt",
+
+      # Contains an escaped multibyte character. This is supposed to drop to backslash
+      "seattlerb/regexp_escape_extended.txt",
     ]
 
     # These files are either failing to parse or failing to translate, so we'll
     # skip them for now.
     skip_all = skip_incorrect | [
       "unescaping.txt",
-      "seattlerb/bug190.txt",
       "seattlerb/heredoc_with_extra_carriage_returns_windows.txt",
       "seattlerb/heredoc_with_only_carriage_returns_windows.txt",
       "seattlerb/heredoc_with_only_carriage_returns.txt",
-- 
cgit v1.2.3