path: root/lib/prism/translation/parser/lexer.rb
Age | Commit message | Author
2024-03-16 | [ruby/prism] Fix token incompatibility for `Prism::Translation::Parser::Lexer` | Koichi ITO

This PR fixes token incompatibility for `Prism::Translation::Parser::Lexer` when using a backquoted heredoc identifier:

```ruby
<<-` FOO`
a
b
 FOO
```

## Parser gem (Expected)

Returns `tXSTRING_BEG` as the first token:

```console
$ bundle exec ruby -Ilib -rparser/ruby33 -ve \
  'buf = Parser::Source::Buffer.new("example.rb"); buf.source = File.read("example.rb"); p Parser::Ruby33.new.tokenize(buf)'
ruby 3.3.0 (2023-12-25 revision https://github.com/ruby/prism/commit/5124f9ac75) [x86_64-darwin22]
[s(:xstr, s(:str, "a\n"), s(:str, "b\n")), [], [[:tXSTRING_BEG, ["<<`", #<Parser::Source::Range example.rb 0...10>]], [:tSTRING_CONTENT, ["a\n", #<Parser::Source::Range example.rb 11...13>]], [:tSTRING_CONTENT, ["b\n", #<Parser::Source::Range example.rb 13...15>]], [:tSTRING_END, [" FOO", #<Parser::Source::Range example.rb 15...23>]], [:tNL, [nil, #<Parser::Source::Range example.rb 10...11>]]]]
```

## `Prism::Translation::Parser` (Actual)

Previously, the tokens returned by `Prism::Translation::Parser` differed from those returned by the Parser gem. The first token was `tSTRING_BEG` with the value `<<"` instead of `tXSTRING_BEG` with the value ``<<` ``, and the value of the `tSTRING_END` token did not match either:

```console
$ bundle exec ruby -Ilib -rprism -rprism/translation/parser33 -ve \
  'buf = Parser::Source::Buffer.new("example.rb"); buf.source = File.read("example.rb"); p Prism::Translation::Parser33.new.tokenize(buf)'
ruby 3.3.0 (2023-12-25 revision https://github.com/ruby/prism/commit/5124f9ac75) [x86_64-darwin22]
[s(:xstr, s(:str, "a\n"), s(:str, "b\n")), [], [[:tSTRING_BEG, ["<<\"", #<Parser::Source::Range example.rb 0...10>]], [:tSTRING_CONTENT, ["a\n", #<Parser::Source::Range example.rb 11...13>]], [:tSTRING_CONTENT, ["b\n", #<Parser::Source::Range example.rb 13...15>]], [:tSTRING_END, ["` FOO`", #<Parser::Source::Range example.rb 15...23>]], [:tNL, [nil, #<Parser::Source::Range example.rb 10...11>]]]]
```

After this correction, the AST and tokens match those returned by the Parser gem:

```console
$ bundle exec ruby -Ilib -rprism -rprism/translation/parser33 -ve \
  'buf = Parser::Source::Buffer.new("example.rb"); buf.source = File.read("example.rb"); p Prism::Translation::Parser33.new.tokenize(buf)'
ruby 3.3.0 (2023-12-25 revision https://github.com/ruby/prism/commit/5124f9ac75) [x86_64-darwin22]
[s(:xstr, s(:str, "a\n"), s(:str, "b\n")), [], [[:tXSTRING_BEG, ["<<`", #<Parser::Source::Range example.rb 0...10>]], [:tSTRING_CONTENT, ["a\n", #<Parser::Source::Range example.rb 11...13>]], [:tSTRING_CONTENT, ["b\n", #<Parser::Source::Range example.rb 13...15>]], [:tSTRING_END, [" FOO", #<Parser::Source::Range example.rb 15...23>]], [:tNL, [nil, #<Parser::Source::Range example.rb 10...11>]]]]
```

https://github.com/ruby/prism/commit/308f8d85a1
2024-03-15 | [ruby/prism] Fix token incompatibility for `Prism::Translation::Parser::Lexer` | Koichi ITO

This PR fixes token incompatibility for `Prism::Translation::Parser::Lexer` when using an escaped backslash in a string literal:

```ruby
"\\ foo \\ bar"
```

## Parser gem (Expected)

```console
$ bundle exec ruby -Ilib -rparser/ruby33 -ve \
  'buf = Parser::Source::Buffer.new("example.rb"); buf.source = File.read("example.rb"); p Parser::Ruby33.new.tokenize(buf)'
ruby 3.3.0 (2023-12-25 revision https://github.com/ruby/prism/commit/5124f9ac75) [x86_64-darwin22]
[s(:str, "\\ foo \\ bar"), [], [[:tSTRING, ["\\ foo \\ bar", #<Parser::Source::Range example.rb 0...15>]], [:tNL, [nil, #<Parser::Source::Range example.rb 15...16>]]]]
```

## `Prism::Translation::Parser` (Actual)

Previously, the tokens returned by `Prism::Translation::Parser` differed from those returned by the Parser gem. The escaped backslashes in the value of the `tSTRING` token did not match:

```console
$ bundle exec ruby -Ilib -rprism -rprism/translation/parser33 -ve \
  'buf = Parser::Source::Buffer.new("example.rb"); buf.source = File.read("example.rb"); p Prism::Translation::Parser33.new.tokenize(buf)'
ruby 3.3.0 (2023-12-25 revision https://github.com/ruby/prism/commit/5124f9ac75) [x86_64-darwin22]
[s(:str, "\\ foo \\ bar"), [], [[:tSTRING, ["\\\\ foo \\\\ bar", #<Parser::Source::Range example.rb 0...15>]], [:tNL, [nil, #<Parser::Source::Range example.rb 15...16>]]]]
```

After this correction, the AST and tokens match those returned by the Parser gem:

```console
$ bundle exec ruby -Ilib -rprism -rprism/translation/parser33 -ve \
  'buf = Parser::Source::Buffer.new("example.rb"); buf.source = File.read("example.rb"); p Prism::Translation::Parser33.new.tokenize(buf)'
ruby 3.3.0 (2023-12-25 revision https://github.com/ruby/prism/commit/5124f9ac75) [x86_64-darwin22]
[s(:str, "\\ foo \\ bar"), [], [[:tSTRING, ["\\ foo \\ bar", #<Parser::Source::Range example.rb 0...15>]], [:tNL, [nil, #<Parser::Source::Range example.rb 15...16>]]]]
```

A reproduction case for this already exists in strings.txt:

https://github.com/ruby/prism/blob/v0.24.0/test/prism/fixtures/strings.txt#L79

However, strings.txt has not yet been restored as a test case because other issues remain in it.

https://github.com/ruby/prism/commit/2c44e7e307
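The mismatch described in this commit comes down to the raw source text of a string versus its unescaped value. The Ruby sketch below illustrates that difference only; it is not the code from the commit, the names `raw` and `unescaped` are made up for illustration, and it handles only the `\\` escape.

```ruby
# Hypothetical illustration, not the actual lexer code.
# `raw` holds the text between the quotes exactly as written in example.rb,
# where every backslash of the value is spelled as two backslashes.
raw = '\\\\ foo \\\\ bar'

# Collapse each escaped pair into the single backslash it denotes.
# The block form sidesteps gsub's special treatment of backslashes in
# replacement strings. Only the `\\` escape is covered here.
unescaped = raw.gsub('\\\\') { '\\' }

puts raw        # the raw form, which the lexer reported before the fix
puts unescaped  # the unescaped form, which the Parser gem reports
```

A real lexer has to deal with every escape sequence Ruby supports, so this is best read as a picture of the symptom rather than of the fix.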
2024-03-15 | [ruby/prism] Fix a token incompatibility for `Prism::Translation::Parser::Lexer` | Koichi ITO

This PR fixes a token incompatibility between Parser gem and `Prism::Translation::Parser` for the heredocs_leading_whitespace.txt test.

https://github.com/ruby/prism/commit/7d45fb1eed
2024-03-15 | [ruby/prism] Fix an AST and token incompatibility for `Prism::Translation::Parser` | Koichi ITO

This PR fixes an AST and token incompatibility between Parser gem and `Prism::Translation::Parser` for a dstring literal that spans lines:

```ruby
"foo
  #{bar}"
```

## Parser gem (Expected)

```console
$ bundle exec ruby -Ilib -rparser/ruby33 -ve \
  'buf = Parser::Source::Buffer.new("example.rb"); buf.source = File.read("example.rb"); p Parser::Ruby33.new.tokenize(buf)'
ruby 3.3.0 (2023-12-25 revision https://github.com/ruby/prism/commit/5124f9ac75) [x86_64-darwin22]
[s(:dstr, s(:str, "foo\n"), s(:str, "  "), s(:begin, s(:send, nil, :bar))), [], [[:tSTRING_BEG, ["\"", #<Parser::Source::Range example.rb 0...1>]], [:tSTRING_CONTENT, ["foo\n", #<Parser::Source::Range example.rb 1...5>]], [:tSTRING_CONTENT, ["  ", #<Parser::Source::Range example.rb 5...7>]], [:tSTRING_DBEG, ["\#{", #<Parser::Source::Range example.rb 7...9>]], [:tIDENTIFIER, ["bar", #<Parser::Source::Range example.rb 9...12>]], [:tSTRING_DEND, ["}", #<Parser::Source::Range example.rb 12...13>]], [:tSTRING_END, ["\"", #<Parser::Source::Range example.rb 13...14>]], [:tNL, [nil, #<Parser::Source::Range example.rb 14...15>]]]]
```

## `Prism::Translation::Parser` (Actual)

Previously, the AST and tokens returned by `Prism::Translation::Parser` differed from those returned by the Parser gem. In this case, the `dstr` node should not be nested:

```console
$ bundle exec ruby -Ilib -rprism -rprism/translation/parser33 -ve \
  'buf = Parser::Source::Buffer.new("example.rb"); buf.source = File.read("example.rb"); p Prism::Translation::Parser33.new.tokenize(buf)'
ruby 3.3.0 (2023-12-25 revision https://github.com/ruby/prism/commit/5124f9ac75) [x86_64-darwin22]
[s(:dstr, s(:dstr, s(:str, "foo\n"), s(:str, "  ")), s(:begin, s(:send, nil, :bar))), [], [[:tSTRING_BEG, ["\"", #<Parser::Source::Range example.rb 0...1>]], [:tSTRING_CONTENT, ["foo\n", #<Parser::Source::Range example.rb 1...5>]], [:tSTRING_CONTENT, ["  ", #<Parser::Source::Range example.rb 5...7>]], [:tSTRING_DBEG, ["\#{", #<Parser::Source::Range example.rb 7...9>]], [:tIDENTIFIER, ["bar", #<Parser::Source::Range example.rb 9...12>]], [:tSTRING_DEND, ["}", #<Parser::Source::Range example.rb 12...13>]], [:tSTRING_END, ["\"", #<Parser::Source::Range example.rb 13...14>]], [:tNL, [nil, #<Parser::Source::Range example.rb 14...15>]]]]
```

After this correction, the AST and tokens match those returned by the Parser gem:

```console
$ bundle exec ruby -Ilib -rprism -rprism/translation/parser33 -ve \
  'buf = Parser::Source::Buffer.new("example.rb"); buf.source = File.read("example.rb"); p Prism::Translation::Parser33.new.tokenize(buf)'
ruby 3.3.0 (2023-12-25 revision https://github.com/ruby/prism/commit/5124f9ac75) [x86_64-darwin22]
[s(:dstr, s(:str, "foo\n"), s(:str, "  "), s(:begin, s(:send, nil, :bar))), [], [[:tSTRING_BEG, ["\"", #<Parser::Source::Range example.rb 0...1>]], [:tSTRING_CONTENT, ["foo\n", #<Parser::Source::Range example.rb 1...5>]], [:tSTRING_CONTENT, ["  ", #<Parser::Source::Range example.rb 5...7>]], [:tSTRING_DBEG, ["\#{", #<Parser::Source::Range example.rb 7...9>]], [:tIDENTIFIER, ["bar", #<Parser::Source::Range example.rb 9...12>]], [:tSTRING_DEND, ["}", #<Parser::Source::Range example.rb 12...13>]], [:tSTRING_END, ["\"", #<Parser::Source::Range example.rb 13...14>]], [:tNL, [nil, #<Parser::Source::Range example.rb 14...15>]]]]
```

https://github.com/ruby/prism/commit/c1652a9ee7
2024-03-13 | [ruby/prism] Fix an AST and token incompatibility for `Prism::Translation::Parser` | Koichi ITO

This PR fixes an AST and token incompatibility between Parser gem and `Prism::Translation::Parser` for an empty xstring literal.

## Parser gem (Expected)

```console
$ bundle exec ruby -Ilib -rparser/ruby33 -ve \
  'buf = Parser::Source::Buffer.new("/tmp/s.rb"); buf.source = "``"; p Parser::Ruby33.new.tokenize(buf)'
ruby 3.3.0 (2023-12-25 revision https://github.com/ruby/prism/commit/5124f9ac75) [x86_64-darwin22]
[s(:xstr), [], [[:tXSTRING_BEG, ["`", #<Parser::Source::Range /tmp/s.rb 0...1>]], [:tSTRING_END, ["`", #<Parser::Source::Range /tmp/s.rb 1...2>]]]]
```

## `Prism::Translation::Parser` (Actual)

Previously, the AST and tokens returned by `Prism::Translation::Parser` differed from those returned by the Parser gem:

```console
$ bundle exec ruby -Ilib -rprism -rprism/translation/parser33 -ve \
  'buf = Parser::Source::Buffer.new("/tmp/s.rb"); buf.source = "``"; p Prism::Translation::Parser33.new.tokenize(buf)'
ruby 3.3.0 (2023-12-25 revision https://github.com/ruby/prism/commit/5124f9ac75) [x86_64-darwin22]
[s(:xstr, s(:str, "")), [], [[:tBACK_REF2, ["`", #<Parser::Source::Range /tmp/s.rb 0...1>]], [:tSTRING_END, ["`", #<Parser::Source::Range /tmp/s.rb 1...2>]]]]
```

After this correction, the AST and tokens match those returned by the Parser gem:

```console
$ bundle exec ruby -Ilib -rprism -rprism/translation/parser33 -ve \
  'buf = Parser::Source::Buffer.new("/tmp/s.rb"); buf.source = "``"; p Prism::Translation::Parser33.new.tokenize(buf)'
ruby 3.3.0 (2023-12-25 revision https://github.com/ruby/prism/commit/5124f9ac75) [x86_64-darwin22]
[s(:xstr), [], [[:tXSTRING_BEG, ["`", #<Parser::Source::Range /tmp/s.rb 0...1>]], [:tSTRING_END, ["`", #<Parser::Source::Range /tmp/s.rb 1...2>]]]]
```

https://github.com/ruby/prism/commit/4ac89dcbb5
2024-03-12 | [ruby/prism] Fix a token incompatibility for `Prism::Translation::Parser::Lexer` | Koichi ITO

In practice, `BACKTICK` is mapped to either `:tXSTRING_BEG` or `:tBACK_REF2`. The former is used in xstrings like `` `foo` ``, while the latter is used as a back reference in contexts like `` A::` ``. This PR differentiates between the two uses of `BACKTICK`.

This mistake was discovered while investigating the xstring.txt file. The PR runs the tests from xstring.txt except for `` `f\oo` ``, which still fails and is therefore split out into the xstring_with_backslash.txt file so that it can be fixed separately.

https://github.com/ruby/prism/commit/49ad8df40a
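The distinction between the two uses of a backtick can be sketched as a small decision on the preceding token. This is a hedged illustration only: `backtick_token_type` is a hypothetical helper, and the rule it applies (only a preceding `::` marks a method-name position) is a simplifying assumption, not the criterion used in lexer.rb.

```ruby
# Hypothetical helper, not the actual lexer code: pick the Parser gem token
# type for a backtick from the type of the token that precedes it.
# Assumption: only `::` puts the backtick in a method-name position.
def backtick_token_type(previous_type)
  previous_type == :tCOLON2 ? :tBACK_REF2 : :tXSTRING_BEG
end

p backtick_token_type(:tCOLON2) # the backtick in A::`
p backtick_token_type(nil)      # the backtick that opens `foo`
```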
2024-03-12 | [ruby/prism] Fix some whitequark/parser lexer compatibilities | Kevin Newton
https://github.com/ruby/prism/commit/34e521d071
2024-03-12 | [ruby/prism] Fix a token incompatibility for `Prism::Translation::Parser::Lexer` | Koichi ITO

This PR fixes a token incompatibility between Parser gem and `Prism::Translation::Parser` for `tBACK_REF2`:

## Parser gem (Expected)

Returns `tBACK_REF2` token:

```console
$ bundle exec ruby -Ilib -rparser/ruby33 -ve \
  'buf = Parser::Source::Buffer.new("example.rb"); buf.source = "A::`"; p Parser::Ruby33.new.tokenize(buf)[2]'
ruby 3.3.0 (2023-12-25 revision https://github.com/ruby/prism/commit/5124f9ac75) [x86_64-darwin22]
[[:tCONSTANT, ["A", #<Parser::Source::Range example.rb 0...1>]], [:tCOLON2, ["::", #<Parser::Source::Range example.rb 1...3>]], [:tBACK_REF2, ["`", #<Parser::Source::Range example.rb 3...4>]]]
```

## `Prism::Translation::Parser` (Actual)

Previously, the parser returned `tXSTRING_BEG` token when parsing the following:

```console
$ bundle exec ruby -Ilib -rprism -rprism/translation/parser33 -ve \
  'buf = Parser::Source::Buffer.new("example.rb"); buf.source = "A::`"; p Prism::Translation::Parser33.new.tokenize(buf)[2]'
ruby 3.3.0 (2023-12-25 revision https://github.com/ruby/prism/commit/5124f9ac75) [x86_64-darwin22]
[[:tCONSTANT, ["A", #<Parser::Source::Range example.rb 0...1>]], [:tCOLON2, ["::", #<Parser::Source::Range example.rb 1...3>]], [:tXSTRING_BEG, ["`", #<Parser::Source::Range example.rb 3...4>]]]
```

After the update, the parser now returns `tBACK_REF2` token for the same input:

```console
$ bundle exec ruby -Ilib -rprism -rprism/translation/parser33 -ve \
  'buf = Parser::Source::Buffer.new("example.rb"); buf.source = "A::`"; p Prism::Translation::Parser33.new.tokenize(buf)[2]'
ruby 3.3.0 (2023-12-25 revision https://github.com/ruby/prism/commit/5124f9ac75) [x86_64-darwin22]
[[:tCONSTANT, ["A", #<Parser::Source::Range example.rb 0...1>]], [:tCOLON2, ["::", #<Parser::Source::Range example.rb 1...3>]], [:tBACK_REF2, ["`", #<Parser::Source::Range example.rb 3...4>]]]
```

This correction enables the restoration of `constants.txt` as a test case.

https://github.com/ruby/prism/commit/7f63b28f98
2024-03-12 | [ruby/prism] Fix a token incompatibility for `Prism::Translation::Parser::Lexer` | Koichi ITO

This PR fixes a token incompatibility between Parser gem and `Prism::Translation::Parser` for beginless range:

## Parser gem (Expected)

Returns `tBDOT2` token:

```console
$ bundle exec ruby -Ilib -rparser/ruby33 -ve \
  'buf = Parser::Source::Buffer.new("example.rb"); buf.source = "..42"; p Parser::Ruby33.new.tokenize(buf)[2]'
ruby 3.3.0 (2023-12-25 revision https://github.com/ruby/prism/commit/5124f9ac75) [x86_64-darwin22]
[[:tBDOT2, ["..", #<Parser::Source::Range example.rb 0...2>]], [:tINTEGER, [42, #<Parser::Source::Range example.rb 2...4>]]]
```

## `Prism::Translation::Parser` (Actual)

Previously, the parser returned `tDOT2` token when parsing the following:

```console
$ bundle exec ruby -Ilib -rprism -rprism/translation/parser33 -ve \
  'buf = Parser::Source::Buffer.new("example.rb"); buf.source = "..42"; p Prism::Translation::Parser33.new.tokenize(buf)[2]'
ruby 3.3.0 (2023-12-25 revision https://github.com/ruby/prism/commit/5124f9ac75) [x86_64-darwin22]
[[:tDOT2, ["..", #<Parser::Source::Range example.rb 0...2>]], [:tINTEGER, [42, #<Parser::Source::Range example.rb 2...4>]]]
```

After the update, the parser now returns `tBDOT2` token for the same input:

```console
$ bundle exec ruby -Ilib -rprism -rprism/translation/parser33 -ve \
  'buf = Parser::Source::Buffer.new("example.rb"); buf.source = "..42"; p Prism::Translation::Parser33.new.tokenize(buf)[2]'
ruby 3.3.0 (2023-12-25 revision https://github.com/ruby/prism/commit/5124f9ac75) [x86_64-darwin22]
[[:tBDOT2, ["..", #<Parser::Source::Range example.rb 0...2>]], [:tINTEGER, [42, #<Parser::Source::Range example.rb 2...4>]]]
```

This correction enables the restoration of `endless_range_in_conditional.txt` as a test case.

https://github.com/ruby/prism/commit/f624b99ab0
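For readers unfamiliar with the construct being lexed here, the following short Ruby snippet shows what a beginless range is at runtime (supported since Ruby 2.7). It says nothing about how either lexer is implemented; the variable name `range` is just for illustration.

```ruby
# A beginless range has no lower bound: `..` starts the expression,
# which is the position where the Parser gem emits tBDOT2 rather than tDOT2.
range = (..42)

p range.begin       # no lower bound, so this is nil
p range.end
p range.cover?(10)  # anything up to and including 42 is covered
p range.cover?(43)
```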
2024-03-12 | [ruby/prism] Fix an AST and token incompatibility for `Prism::Translation::Parser` | Koichi ITO

Fixes https://github.com/ruby/prism/pull/2515.

This PR fixes an AST and token incompatibility between Parser gem and `Prism::Translation::Parser` for string literal with line breaks.

https://github.com/ruby/prism/commit/c58466e5bf
2024-03-08 | [ruby/prism] Fix a token incompatibility for `Prism::Translation::Parser` | Koichi ITO

Fixes https://github.com/ruby/prism/pull/2512.

This PR fixes a token incompatibility between Parser gem and `Prism::Translation::Parser` for `HEREDOC_END` with a newline.

https://github.com/ruby/prism/commit/b67d1e0c6f
2024-03-08 | Fix an error for `Prism::Translation::Parser::Lexer` | Koichi ITO

This PR fixes the following error for `Prism::Translation::Parser::Lexer` on the main branch:

```console
$ cat example.rb
'a' # aあ
" #{x} "
$ bundle exec rubocop
Parser::Source::Range: end_pos must not be less than begin_pos
/Users/koic/.rbenv/versions/3.0.4/lib/ruby/gems/3.0.0/gems/parser-3.3.0.5/lib/parser/source/range.rb:39:in `initialize'
/Users/koic/src/github.com/ruby/prism/lib/prism/translation/parser/lexer.rb:299:in `new'
/Users/koic/src/github.com/ruby/prism/lib/prism/translation/parser/lexer.rb:299:in `block in to_a'
/Users/koic/src/github.com/ruby/prism/lib/prism/translation/parser/lexer.rb:297:in `map'
/Users/koic/src/github.com/ruby/prism/lib/prism/translation/parser/lexer.rb:297:in `to_a'
/Users/koic/src/github.com/ruby/prism/lib/prism/translation/parser.rb:263:in `build_tokens'
/Users/koic/src/github.com/ruby/prism/lib/prism/translation/parser.rb:92:in `tokenize'
```

The error was introduced by the change in https://github.com/ruby/prism/pull/2557. It seems that `offset_cache` was not applied to `start_offset`, which left the Range with inconsistent offsets.
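The failure mode can be reproduced in miniature: once the source contains a multibyte character, byte offsets and character offsets diverge, and converting only one end of a range can put the end before the start. The sketch below assumes a cache that maps byte offsets to character offsets; `build_offset_cache` is a hypothetical stand-in written for this example, not the method in `Prism::Translation::Parser`.

```ruby
# Hypothetical stand-in for an offset cache: maps the byte offset of every
# character boundary to the corresponding character offset.
def build_offset_cache(source)
  cache = {}
  byte_offset = 0
  source.each_char.with_index do |char, char_offset|
    cache[byte_offset] = char_offset
    byte_offset += char.bytesize
  end
  cache[byte_offset] = source.length # boundary after the last character
  cache
end

source = "# aあ\nx" # "あ" occupies 3 bytes but counts as 1 character
cache = build_offset_cache(source)

# Byte offsets of the final token `x`.
byte_start = source.bytesize - 1
byte_end = source.bytesize

# Converting only the end offset mixes the two units.
mixed = [byte_start, cache[byte_end]]
# Converting both offsets keeps the range consistent.
converted = [cache[byte_start], cache[byte_end]]

p mixed     # end is smaller than begin here
p converted
```

With the mixed pair, a range constructor that validates its arguments would reject the input in the same way as the `end_pos must not be less than begin_pos` error in the trace.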
2024-03-07 | [ruby/prism] Fix an AST and token incompatibility for `Prism::Translation::Parser` | Koichi ITO

Fixes https://github.com/ruby/prism/pull/2506.

This PR fixes an AST and token incompatibility between Parser gem and `Prism::Translation::Parser` for symbols quoted with line breaks.

https://github.com/ruby/prism/commit/06ab4df8cd
2024-03-06 | [ruby/prism] Use the diagnostic types in the parser translation layer | Kevin Newton
https://github.com/ruby/prism/commit/1a8a0063dc
2024-03-06 | [ruby/prism] Fix some type-checking errors by using different method calls | Ufuk Kayserilioglu

For example, use `.fetch` or `.dig` instead of `[]`, and use `===` instead of `is_a?` for checking types of objects.

https://github.com/ruby/prism/commit/548b54915f
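The substitutions named in this commit message look roughly like the following in plain Ruby. The hash and its keys are invented for this example and do not come from lexer.rb; the snippet only shows how the standard `Hash#fetch`, `Hash#dig`, and `Module#===` calls behave.

```ruby
# Hypothetical data for illustration only.
token = { type: :tSTRING, location: { start: 0, finish: 15 } }

# Hash#fetch raises KeyError for a missing key instead of returning nil,
# so the result is never silently nil.
type = token.fetch(:type)

# Hash#dig walks nested keys and returns nil as soon as one is missing.
start = token.dig(:location, :start)
missing = token.dig(:comments, :first)

# Module#=== performs the same class check as `is_a?`.
string_like = String === "foo"

p type, start, missing, string_like
```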
2024-03-04 | [ruby/prism] Fix up some minor parser incompatibilities | Kevin Newton
https://github.com/ruby/prism/commit/c6c771d1fa
2024-02-16 | [ruby/prism] Fix lexing of `foo!` when it's a first thing to parse | Max Prokopiev
https://github.com/ruby/prism/commit/7597aca76a
2024-01-27 | [ruby/prism] Add parser translation | Kevin Newton
https://github.com/ruby/prism/commit/8cdec8070c