[ruby/prism] Fix a token incompatibility for `Prism::Translation::Parser::Lexer`

This PR fixes a token incompatibility between Parser gem and `Prism::Translation::Parser` for double splat argument. ## Parser gem (Expected) Returns `tDSTAR` token: ```console $ bundle exec ruby -Ilib -rparser/ruby33 -ve \ 'buf = Parser::Source::Buffer.new("example.rb"); buf.source = "def f(**foo) end"; p Parser::Ruby33.new.tokenize(buf)[2]' ruby 3.4.0dev (2024-09-01T11:00:13Z master https://github.com/ruby/prism/commit/eb144ef91e) [x86_64-darwin23] [[:kDEF, ["def", #<Parser::Source::Range example.rb 0...3>]], [:tIDENTIFIER, ["f", #<Parser::Source::Range example.rb 4...5>]], [:tLPAREN2, ["(", #<Parser::Source::Range example.rb 5...6>]], [:tDSTAR, ["**", #<Parser::Source::Range example.rb 6...8>]], [:tIDENTIFIER, ["foo", #<Parser::Source::Range example.rb 8...11>]], [:tRPAREN, [")", #<Parser::Source::Range example.rb 11...12>]], [:kEND, ["end", #<Parser::Source::Range example.rb 13...16>]]] ``` ## `Prism::Translation::Parser` (Actual) Previously, the parser returned `tPOW` token when parsing the following: ```console $ bundle exec ruby -Ilib -rprism -rprism/translation/parser33 -ve \ 'buf = Parser::Source::Buffer.new("example.rb"); buf.source = "def f(**foo) end"; p Prism::Translation::Parser33.new.tokenize(buf)[2]' ruby 3.4.0dev (2024-09-01T11:00:13Z master https://github.com/ruby/prism/commit/eb144ef91e) [x86_64-darwin23] [[:kDEF, ["def", #<Parser::Source::Range example.rb 0...3>]], [:tIDENTIFIER, ["f", #<Parser::Source::Range example.rb 4...5>]], [:tLPAREN2, ["(", #<Parser::Source::Range example.rb 5...6>]], [:tPOW, ["**", #<Parser::Source::Range example.rb 6...8>]], [:tIDENTIFIER, ["foo", #<Parser::Source::Range example.rb 8...11>]], [:tRPAREN, [")", #<Parser::Source::Range example.rb 11...12>]], [:kEND, ["end", #<Parser::Source::Range example.rb 13...16>]]] ``` After the update, the parser now returns `tDSTAR` token for the same input: ```console $ bundle exec ruby -Ilib -rprism -rprism/translation/parser33 -ve \ 'buf = Parser::Source::Buffer.new("example.rb"); buf.source = "def f(**foo) end"; p Prism::Translation::Parser33.new.tokenize(buf)[2]' ruby 3.4.0dev (2024-09-01T11:00:13Z master https://github.com/ruby/prism/commit/eb144ef91e) [x86_64-darwin23] [[:kDEF, ["def", #<Parser::Source::Range example.rb 0...3>]], [:tIDENTIFIER, ["f", #<Parser::Source::Range example.rb 4...5>]], [:tLPAREN2, ["(", #<Parser::Source::Range example.rb 5...6>]], [:tDSTAR, ["**", #<Parser::Source::Range example.rb 6...8>]], [:tIDENTIFIER, ["foo", #<Parser::Source::Range example.rb 8...11>]], [:tRPAREN, [")", #<Parser::Source::Range example.rb 11...12>]], [:kEND, ["end", #<Parser::Source::Range example.rb 13...16>]]] ``` With this change, the following code could be removed from test/prism/ruby/parser_test.rb: ```diff - when :tPOW - actual_token[0] = expected_token[0] if expected_token[0] == :tDSTAR ``` `tPOW` is the token type for the behavior of `a ** b`, and its behavior remains unchanged: ```console $ bundle exec ruby -Ilib -rprism -rprism/translation/parser33 -ve \ 'buf = Parser::Source::Buffer.new("example.rb"); buf.source = "a ** b"; p Prism::Translation::Parser33.new.tokenize(buf)[2]' ruby 3.4.0dev (2024-09-01T11:00:13Z master https://github.com/ruby/prism/commit/eb144ef91e) [x86_64-darwin23] [[:tIDENTIFIER, ["a", #<Parser::Source::Range example.rb 0...1>]], [:tPOW, ["**", #<Parser::Source::Range example.rb 2...4>]], [:tIDENTIFIER, ["b", #<Parser::Source::Range example.rb 5...6>]]] ``` https://github.com/ruby/prism/commit/66bde35a44
author: Koichi ITO <koic.ito@gmail.com> 2024-09-08 18:25:47 +0900
committer: git <svn-admin@ruby-lang.org> 2024-09-09 19:01:30 +0000
commit: 7a6533452807aa432f097db4e637e4c480645d6b (patch)
tree: 93ef81c733a849b7f4ddd2e00efd905b8e348c4a
parent: 1205f17125e5e07d7649c8c9fedcf4b74079a54d (diff)
2 files changed, 3 insertions, 6 deletions
diff --git a/lib/prism/translation/parser/lexer.rb b/lib/prism/translation/parser/lexer.rb
index f687360789..5ca507f7e7 100644
--- a/lib/prism/translation/parser/lexer.rb
+++ b/lib/prism/translation/parser/lexer.rb
@@ -173,7 +173,7 @@ module Prism
           UMINUS_NUM: :tUNARY_NUM,
           UPLUS: :tUPLUS,
           USTAR: :tSTAR,
-          USTAR_STAR: :tPOW,
+          USTAR_STAR: :tDSTAR,
           WORDS_SEP: :tSPACE
         }
 
diff --git a/test/prism/ruby/parser_test.rb b/test/prism/ruby/parser_test.rb
index f443342045..87749efdda 100644
--- a/test/prism/ruby/parser_test.rb
+++ b/test/prism/ruby/parser_test.rb
@@ -268,11 +268,8 @@ module Prism
           # There are a lot of tokens that have very specific meaning according
           # to the context of the parser. We don't expose that information in
           # prism, so we need to normalize these tokens a bit.
-          case actual_token[0]
-          when :kDO
-            actual_token[0] = expected_token[0] if %i[kDO_BLOCK kDO_LAMBDA].include?(expected_token[0])
-          when :tPOW
-            actual_token[0] = expected_token[0] if expected_token[0] == :tDSTAR
+          if actual_token[0] == :kDO && %i[kDO_BLOCK kDO_LAMBDA].include?(expected_token[0])
+            actual_token[0] = expected_token[0]
           end
 
           # Now we can assert that the tokens are actually equal.
author	Koichi ITO <koic.ito@gmail.com>	2024-09-08 18:25:47 +0900
committer	git <svn-admin@ruby-lang.org>	2024-09-09 19:01:30 +0000
commit	7a6533452807aa432f097db4e637e4c480645d6b (patch)
tree	93ef81c733a849b7f4ddd2e00efd905b8e348c4a
parent	1205f17125e5e07d7649c8c9fedcf4b74079a54d (diff)