From 56242ba495246e95dd5178f2ec101c1005c10afc Mon Sep 17 00:00:00 2001
From: Earlopain <14981592+Earlopain@users.noreply.github.com>
Date: Tue, 14 Jan 2025 20:20:05 +0100
Subject: Better handle regexp in the parser translator
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Turns out, it was already almost correct. If you disregard \c and \M
style escapes, only a single character is allowed to be escaped in a
regex so most tests passed already.

There was also a mistake where the wrong value was constructed for the
ast, this is now fixed.

One test fails because of this, but I'm fairly sure it is because of a
parser bug. For `/\“/`, the backslash is supposed to be removed because
it is a multibyte character. But tbh, I don't entirely understand all
the rules.

Fixes more than half of the remaining ast differences for rubocop tests
---
 test/prism/ruby/parser_test.rb | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

(limited to 'test')

diff --git a/test/prism/ruby/parser_test.rb b/test/prism/ruby/parser_test.rb
index 8d791da369..fae8ec8dec 100644
--- a/test/prism/ruby/parser_test.rb
+++ b/test/prism/ruby/parser_test.rb
@@ -69,13 +69,15 @@ module Prism
 
       # https://github.com/whitequark/parser/issues/950
       "whitequark/dedenting_interpolating_heredoc_fake_line_continuation.txt",
+
+      # Contains an escaped multibyte character. This is supposed to drop to backslash
+      "seattlerb/regexp_escape_extended.txt",
     ]
 
     # These files are either failing to parse or failing to translate, so we'll
     # skip them for now.
     skip_all = skip_incorrect | [
       "unescaping.txt",
-      "seattlerb/bug190.txt",
       "seattlerb/heredoc_with_extra_carriage_returns_windows.txt",
       "seattlerb/heredoc_with_only_carriage_returns_windows.txt",
       "seattlerb/heredoc_with_only_carriage_returns.txt",
-- 
cgit v1.2.3