diff options
author | Koichi ITO <koic.ito@gmail.com> | 2024-03-16 01:20:58 +0900 |
---|---|---|
committer | git <svn-admin@ruby-lang.org> | 2024-03-15 18:08:39 +0000 |
commit | e3a82d79fd727c90638ef697cb1b5ad73c7e62c0 (patch) | |
tree | 6a4b7539b7577f46143d163b98dad5ef536cb617 /lib/prism/translation/parser/lexer.rb | |
parent | c9da8d67fdb9fab82f76d583239f5b9761f60350 (diff) |
[ruby/prism] Fix token incompatibility for `Prism::Translation::Parser::Lexer`
This PR fixes token incompatibility for `Prism::Translation::Parser::Lexer`
when using escaped backslash in string literal:
```ruby
"\\ foo \\ bar"
```
## Parser gem (Expected)
```console
$ bundle exec ruby -Ilib -rparser/ruby33 -ve \
'buf = Parser::Source::Buffer.new("example.rb"); buf.source = File.read("example.rb"); p Parser::Ruby33.new.tokenize(buf)'
ruby 3.3.0 (2023-12-25 revision https://github.com/ruby/prism/commit/5124f9ac75) [x86_64-darwin22]
[s(:str, "\\ foo \\ bar"), [], [[:tSTRING, ["\\ foo \\ bar", #<Parser::Source::Range example.rb 0...15>]],
[:tNL, [nil, #<Parser::Source::Range example.rb 15...16>]]]]
```
## `Prism::Translation::Parser` (Actual)
Previously, the tokens returned by the Parser gem were different. The escaped backslash does not match in the `tSTRING` token:
```console
$ bundle exec ruby -Ilib -rprism -rprism/translation/parser33 -ve \
'buf = Parser::Source::Buffer.new("example.rb"); buf.source = File.read("example.rb"); p Prism::Translation::Parser33.new.tokenize(buf)'
ruby 3.3.0 (2023-12-25 revision https://github.com/ruby/prism/commit/5124f9ac75) [x86_64-darwin22]
[s(:str, "\\ foo \\ bar"), [], [[:tSTRING, ["\\\\ foo \\\\ bar", #<Parser::Source::Range example.rb 0...15>]],
[:tNL, [nil, #<Parser::Source::Range example.rb 15...16>]]]]
```
After this correction, the AST and tokens returned by the Parser gem are the same:
```console
$ bundle exec ruby -Ilib -rprism -rprism/translation/parser33 -ve \
'buf = Parser::Source::Buffer.new("example.rb"); buf.source = File.read("example.rb"); p Prism::Translation::Parser33.new.tokenize(buf)'
ruby 3.3.0 (2023-12-25 revision https://github.com/ruby/prism/commit/5124f9ac75) [x86_64-darwin22]
[s(:str, "\\ foo \\ bar"), [], [[:tSTRING, ["\\ foo \\ bar", #<Parser::Source::Range example.rb 0...15>]],
[:tNL, [nil, #<Parser::Source::Range example.rb 15...16>]]]]
```
The reproduction test is based on the following strings.txt and exists:
https://github.com/ruby/prism/blob/v0.24.0/test/prism/fixtures/strings.txt#L79
But, the restoration has not yet been performed due to remaining other issues in strings.txt.
https://github.com/ruby/prism/commit/2c44e7e307
Diffstat (limited to 'lib/prism/translation/parser/lexer.rb')
-rw-r--r-- | lib/prism/translation/parser/lexer.rb | 2 |
1 files changed, 1 insertions, 1 deletions
diff --git a/lib/prism/translation/parser/lexer.rb b/lib/prism/translation/parser/lexer.rb index 9e5d27ef29..cb23fe8ac0 100644 --- a/lib/prism/translation/parser/lexer.rb +++ b/lib/prism/translation/parser/lexer.rb @@ -289,7 +289,7 @@ module Prism elsif ["\"", "'"].include?(value) && (next_token = lexed[index][0]) && next_token.type == :STRING_CONTENT && next_token.value.lines.count <= 1 && (next_next_token = lexed[index + 1][0]) && next_next_token.type == :STRING_END next_location = token.location.join(next_next_token.location) type = :tSTRING - value = next_token.value + value = next_token.value.gsub("\\\\", "\\") location = Range.new(source_buffer, offset_cache[next_location.start_offset], offset_cache[next_location.end_offset]) index += 2 elsif value.start_with?("<<") |