| Age | Commit message (Collapse) | Author |
|
The same also applies to `break`/`next`.
https://bugs.ruby-lang.org/issues/21540
https://github.com/ruby/prism/commit/3a38b192e3
|
|
Generally I have been good about safely accessing the tokens but failed
to properly guard against no tokens in places
where it could theoretically happen through invalid syntax.
I added a test case for one occurance, other changes are theoretical only.
https://github.com/ruby/prism/commit/4a3866af19
|
|
https://docs.ruby-lang.org/en/master/syntax/literals_rdoc.html#label-25w+and+-25W-3A+String-Array+Literals
> %W allow escape sequences described in Escape Sequences. However the continuation line <newline> is not usable because it is interpreted as the escaped newline described above.
https://github.com/ruby/prism/commit/f5c7460ad5
|
|
Instead, prefer `scan_byte` over `get_byte` since that already returns the byte as an integer, sidestepping conversion issues.
Fixes https://github.com/ruby/prism/issues/3582
https://github.com/ruby/prism/commit/7f3008b2b5
|
|
https://github.com/ruby/prism/commit/12af4e144e
|
|
`StringNode` and `SymbolNode` don't have the same shape
(`content` vs `value`) and that wasn't handled.
I believe the logic for the common case can be reused.
I simply left the special handling for implicit nodes in pattern matching
and fall through otherwise.
NOTE: patterns.txt is not actually tested at the moment,
because it contains syntax that `parser` mistakenly rejects.
But I checked manually that this doesn't introduce other failures.
https://github.com/whitequark/parser/pull/1060
https://github.com/ruby/prism/commit/55adfaa895
|
|
|
|
Mostly around newlines and line continuation.
* percent arrays need special backslash handling in the ast
* Fix offset issue for heredocs with many line continuations (used wrong variable as index access)
* More refined rules on when to simplify string tokens
* Handle line continuations in squiggly heredocs
* Correctly dedent squiggly heredocs with interpolation
* Consider `':foo:` and `%s[foo]` to not be interpolation
https://github.com/ruby/prism/commit/4edfe9d981
|
|
https://github.com/ruby/prism/commit/422d5c4c64
|
|
I see `Array.include?` as 2.4% runtime. Probably because of `LPAREN_CONVERSION_TOKEN_TYPES` but
the others will be faster as well.
Also remove some inline array checks. They are specifically optimized in Ruby since 3.4, but for now prism is for >= 2.7
https://github.com/ruby/prism/commit/ca9500a3fc
|
|
`Integer#chr` performs some validation that we don't want/need. Octal escapes can go above 255, where it will then raise trying to convert.
`append_as_bytes` actually allows to pass a number, so we can just skip that call.
Although, on older rubies of course we still need to handle this in the polyfill.
I don't really like using `pack` but don't know of another way to do so.
For the utf-8 escapes, this is not an issue. Invalid utf-8 in these is simply a syntax error.
https://github.com/ruby/prism/commit/161c606b1f
|
|
https://github.com/ruby/prism/commit/09c59a3aa5
|
|
Mostly around newlines and line continuation.
* percent arrays need special backslash handling in the ast
* Fix offset issue for heredocs with many line continuations (used wrong variable as index access)
* More refined rules on when to simplify string tokens
* Handle line continuations in squiggly heredocs
* Correctly dedent squiggly heredocs with interpolation
* Consider `':foo:` and `%s[foo]` to not be interpolation
https://github.com/ruby/prism/commit/4edfe9d981
|
|
Turns out, it was already almost correct. If you disregard \c and \M style escapes, only a single character is allowed to be escaped in a regex so most tests passed already.
There was also a mistake where the wrong value was constructed for the ast, this is now fixed.
One test fails because of this, but I'm fairly sure it is because of a parser bug. For `/\“/`, the backslash is supposed to be removed because it is a multibyte character. But tbh,
I don't entirely understand all the rules.
Fixes more than half of the remaining ast differences for rubocop tests
https://github.com/ruby/prism/commit/e1c75f304b
|
|
Also fixes a token incompatibility for the word separator. parser only considers whitespace until the first newline
https://github.com/ruby/prism/commit/bd3dd2b62a
|
|
The offset cache contains an entry for each byte so it can't be accessed via the string length.
Adds tests for all variants except for this:
```
"fo
o" "ba
’"
```
For some reason, this still has the wrong offset.
https://github.com/ruby/prism/commit/a651126458
|
|
Mostly around newlines and line continuation.
* percent arrays need special backslash handling in the ast
* Fix offset issue for heredocs with many line continuations (used wrong variable as index access)
* More refined rules on when to simplify string tokens
* Handle line continuations in squiggly heredocs
* Correctly dedent squiggly heredocs with interpolation
* Consider `':foo:` and `%s[foo]` to not be interpolation
https://github.com/ruby/prism/commit/4edfe9d981
|
|
1. The string starts out as binary
2. `ち` is appended, forcing it back into utf-8
3. Some invalid byte sequences are tried to append
> incompatible character encodings: UTF-8 and BINARY (ASCII-8BIT)
This makes use of my wish to use `append_as_bytes`. Unfortunatly that method is rather new
so it needs a fallback
https://github.com/ruby/prism/commit/e31e94a775
|
|
https://github.com/ruby/prism/commit/422d5c4c64
|
|
Avoids an array allocation which matters more and more
the larger the file is.
I have it at 14% of runtime.
https://github.com/ruby/prism/commit/f65b90f27d
|
|
I see `Array.include?` as 2.4% runtime. Probably because of `LPAREN_CONVERSION_TOKEN_TYPES` but
the others will be faster as well.
Also remove some inline array checks. They are specifically optimized in Ruby since 3.4, but for now prism is for >= 2.7
https://github.com/ruby/prism/commit/ca9500a3fc
|
|
Turns out, it was already almost correct. If you disregard \c and \M style escapes, only a single character is allowed to be escaped in a regex so most tests passed already.
There was also a mistake where the wrong value was constructed for the ast, this is now fixed.
One test fails because of this, but I'm fairly sure it is because of a parser bug. For `/\“/`, the backslash is supposed to be removed because it is a multibyte character. But tbh,
I don't entirely understand all the rules.
Fixes more than half of the remaining ast differences for rubocop tests
|
|
It falsely considered it to be a single backtick command
https://github.com/ruby/prism/commit/dd762be590
|
|
```rb
<<~F
foo
#{}
bar
F
```
has zero common whitespace.
https://github.com/ruby/prism/commit/1f3c222a06
|
|
Tests worked around this but the incompatibility is not hard to fix.
This fixes 17 token incompatibilies in tests here that were previously passing
https://github.com/ruby/prism/commit/101962526d
|
|
multiline bytes
A rather silly issue with a rather simple fix.
The ranges already use the offset cache, this effectivly double-encoded them.
https://github.com/ruby/prism/commit/66b65634c0
|
|
In https://github.com/ruby/prism/pull/3393 I made a mistake.
When there is no previous token, it wraps around to -1. Oops
Additionally, if a comment has no newline then the offset should be kept as is
https://github.com/ruby/prism/commit/3c266f1de4
|
|
There appear to be a bunch of rules, changing behaviour for
inline comments, multiple comments after another, etc.
This seems to line up with reality pretty closely, token differences for RuboCop tests go from 1129 to 619 which seems pretty impressive
https://github.com/ruby/prism/commit/2e1b92670c
|
|
strings and word arrays
These are not line continuations. They either should be taken literally,
or allow the word array to contain the following whitespace (newlines in this case)
Before:
```
0...1: tSTRING_BEG => "'"
1...12: tSTRING_CONTENT => "foobar\\\n"
12...16: tSTRING_CONTENT => "baz\n"
16...17: tSTRING_END => "'"
17...18: tNL => nil
```
After:
```
0...1: tSTRING_BEG => "'"
1...6: tSTRING_CONTENT => "foo\\\n"
6...12: tSTRING_CONTENT => "bar\\\n"
12...16: tSTRING_CONTENT => "baz\n"
16...17: tSTRING_END => "'"
17...18: tNL => nil
```
https://github.com/ruby/prism/commit/b6554ad64e
|
|
This leaves `\c` and `\M` escaping but I don't understand how these should even work yet. Maybe later.
https://github.com/ruby/prism/commit/13db3e8cb9
|
|
The offset cache contains an entry for each byte so it can't be accessed via the string length.
Adds tests for all variants except for this:
```
"fo
o" "ba
’"
```
For some reason, this still has the wrong offset.
https://github.com/ruby/prism/commit/a651126458
|
|
`Prism::Translation::Parser::Lexer`
## Summary
This PR fixes `kDO_LAMBDA` token incompatibility between Parser gem and `Prism::Translation::Parser` for lambda `do` block.
### Parser gem (Expected)
Returns `kDO_LAMBDA` token:
```console
$ bundle exec ruby -Ilib -rparser/ruby33 -ve \
'buf = Parser::Source::Buffer.new("example.rb"); buf.source = "-> do end"; p Parser::Ruby33.new.tokenize(buf)[2]'
ruby 3.4.0dev (2024-09-01T11:00:13Z master https://github.com/ruby/prism/commit/eb144ef91e) [x86_64-darwin23]
[[:tLAMBDA, ["->", #<Parser::Source::Range example.rb 0...2>]], [:kDO_LAMBDA, ["do", #<Parser::Source::Range example.rb 3...5>]],
[:kEND, ["end", #<Parser::Source::Range example.rb 6...9>]]]
```
### `Prism::Translation::Parser` (Actual)
Previously, the parser returned `kDO` token when parsing the following:
```console
$ bundle exec ruby -Ilib -rprism -rprism/translation/parser33 -ve \
'buf = Parser::Source::Buffer.new("example.rb"); buf.source = "-> do end"; p Prism::Translation::Parser33.new.tokenize(buf)[2]'
ruby 3.4.0dev (2024-09-01T11:00:13Z master https://github.com/ruby/prism/commit/eb144ef91e) [x86_64-darwin23]
[[:tLAMBDA, ["->", #<Parser::Source::Range example.rb 0...2>]], [:kDO, ["do", #<Parser::Source::Range example.rb 3...5>]],
[:kEND, ["end", #<Parser::Source::Range example.rb 6...9>]]]
```
After the update, the parser now returns `kDO_LAMBDA` token for the same input:
```console
$ bundle exec ruby -Ilib -rprism -rprism/translation/parser33 -ve \
'buf = Parser::Source::Buffer.new("example.rb"); buf.source = "-> do end"; p Prism::Translation::Parser33.new.tokenize(buf)[2]'
ruby 3.4.0dev (2024-09-01T11:00:13Z master https://github.com/ruby/prism/commit/eb144ef91e) [x86_64-darwin23]
[[:tLAMBDA, ["->", #<Parser::Source::Range example.rb 0...2>]], [:kDO_LAMBDA, ["do", #<Parser::Source::Range example.rb 3...5>]],
[:kEND, ["end", #<Parser::Source::Range example.rb 6...9>]]]
```
## Additional Information
Unfortunately, this kind of edge case doesn't work as expected; `kDO` is returned instead of `kDO_LAMBDA`.
However, since `kDO` is already being returned in this case, there is no change in behavior.
### Parser gem
Returns `tLAMBDA` token:
```console
$ bundle exec ruby -Ilib -rparser/ruby33 -ve \
'buf = Parser::Source::Buffer.new("example.rb"); buf.source = "-> (foo = -> (bar) {}) do end"; p Parser::Ruby33.new.tokenize(buf)[2]'
ruby 3.3.5 (2024-09-03 revision https://github.com/ruby/prism/commit/ef084cc8f4) [x86_64-darwin23]
[[:tLAMBDA, ["->", #<Parser::Source::Range example.rb 0...2>]], [:tLPAREN2, ["(", #<Parser::Source::Range example.rb 3...4>]],
[:tIDENTIFIER, ["foo", #<Parser::Source::Range example.rb 4...7>]], [:tEQL, ["=", #<Parser::Source::Range example.rb 8...9>]],
[:tLAMBDA, ["->", #<Parser::Source::Range example.rb 10...12>]], [:tLPAREN2, ["(", #<Parser::Source::Range example.rb 13...14>]],
[:tIDENTIFIER, ["bar", #<Parser::Source::Range example.rb 14...17>]], [:tRPAREN, [")", #<Parser::Source::Range example.rb 17...18>]],
[:tLAMBEG, ["{", #<Parser::Source::Range example.rb 19...20>]], [:tRCURLY, ["}", #<Parser::Source::Range example.rb 20...21>]],
[:tRPAREN, [")", #<Parser::Source::Range example.rb 21...22>]], [:kDO_LAMBDA, ["do", #<Parser::Source::Range example.rb 23...25>]],
[:kEND, ["end", #<Parser::Source::Range example.rb 26...29>]]]
```
### `Prism::Translation::Parser`
Returns `kDO` token:
```console
$ bundle exec ruby -Ilib -rprism -rprism/translation/parser33 -ve \
'buf = Parser::Source::Buffer.new("example.rb"); buf.source = "-> (foo = -> (bar) {}) do end"; p Prism::Translation::Parser33.new.tokenize(buf)[2]'
ruby 3.3.5 (2024-09-03 revision https://github.com/ruby/prism/commit/ef084cc8f4) [x86_64-darwin23]
[[:tLAMBDA, ["->", #<Parser::Source::Range example.rb 0...2>]], [:tLPAREN2, ["(", #<Parser::Source::Range example.rb 3...4>]],
[:tIDENTIFIER, ["foo", #<Parser::Source::Range example.rb 4...7>]], [:tEQL, ["=", #<Parser::Source::Range example.rb 8...9>]],
[:tLAMBDA, ["->", #<Parser::Source::Range example.rb 10...12>]], [:tLPAREN2, ["(", #<Parser::Source::Range example.rb 13...14>]],
[:tIDENTIFIER, ["bar", #<Parser::Source::Range example.rb 14...17>]], [:tRPAREN, [")", #<Parser::Source::Range example.rb 17...18>]],
[:tLAMBEG, ["{", #<Parser::Source::Range example.rb 19...20>]], [:tRCURLY, ["}", #<Parser::Source::Range example.rb 20...21>]],
[:tRPAREN, [")", #<Parser::Source::Range example.rb 21...22>]], [:kDO, ["do", #<Parser::Source::Range example.rb 23...25>]],
[:kEND, ["end", #<Parser::Source::Range example.rb 26...29>]]]
```
As the intention is not to address such special cases at this point, a comment has been left indicating that this case still returns `kDO`.
In other words, `kDO_LAMBDA` will now be returned except for edge cases after this PR.
https://github.com/ruby/prism/commit/2ee480654c
|
|
This PR fixes a token incompatibility between Parser gem and `Prism::Translation::Parser` for double splat argument.
## Parser gem (Expected)
Returns `tDSTAR` token:
```console
$ bundle exec ruby -Ilib -rparser/ruby33 -ve \
'buf = Parser::Source::Buffer.new("example.rb"); buf.source = "def f(**foo) end"; p Parser::Ruby33.new.tokenize(buf)[2]'
ruby 3.4.0dev (2024-09-01T11:00:13Z master https://github.com/ruby/prism/commit/eb144ef91e) [x86_64-darwin23]
[[:kDEF, ["def", #<Parser::Source::Range example.rb 0...3>]], [:tIDENTIFIER, ["f", #<Parser::Source::Range example.rb 4...5>]],
[:tLPAREN2, ["(", #<Parser::Source::Range example.rb 5...6>]], [:tDSTAR, ["**", #<Parser::Source::Range example.rb 6...8>]],
[:tIDENTIFIER, ["foo", #<Parser::Source::Range example.rb 8...11>]], [:tRPAREN, [")", #<Parser::Source::Range example.rb 11...12>]],
[:kEND, ["end", #<Parser::Source::Range example.rb 13...16>]]]
```
## `Prism::Translation::Parser` (Actual)
Previously, the parser returned `tPOW` token when parsing the following:
```console
$ bundle exec ruby -Ilib -rprism -rprism/translation/parser33 -ve \
'buf = Parser::Source::Buffer.new("example.rb"); buf.source = "def f(**foo) end"; p Prism::Translation::Parser33.new.tokenize(buf)[2]'
ruby 3.4.0dev (2024-09-01T11:00:13Z master https://github.com/ruby/prism/commit/eb144ef91e) [x86_64-darwin23]
[[:kDEF, ["def", #<Parser::Source::Range example.rb 0...3>]], [:tIDENTIFIER, ["f", #<Parser::Source::Range example.rb 4...5>]],
[:tLPAREN2, ["(", #<Parser::Source::Range example.rb 5...6>]], [:tPOW, ["**", #<Parser::Source::Range example.rb 6...8>]],
[:tIDENTIFIER, ["foo", #<Parser::Source::Range example.rb 8...11>]], [:tRPAREN, [")", #<Parser::Source::Range example.rb 11...12>]],
[:kEND, ["end", #<Parser::Source::Range example.rb 13...16>]]]
```
After the update, the parser now returns `tDSTAR` token for the same input:
```console
$ bundle exec ruby -Ilib -rprism -rprism/translation/parser33 -ve \
'buf = Parser::Source::Buffer.new("example.rb"); buf.source = "def f(**foo) end"; p Prism::Translation::Parser33.new.tokenize(buf)[2]'
ruby 3.4.0dev (2024-09-01T11:00:13Z master https://github.com/ruby/prism/commit/eb144ef91e) [x86_64-darwin23]
[[:kDEF, ["def", #<Parser::Source::Range example.rb 0...3>]], [:tIDENTIFIER, ["f", #<Parser::Source::Range example.rb 4...5>]],
[:tLPAREN2, ["(", #<Parser::Source::Range example.rb 5...6>]], [:tDSTAR, ["**", #<Parser::Source::Range example.rb 6...8>]],
[:tIDENTIFIER, ["foo", #<Parser::Source::Range example.rb 8...11>]], [:tRPAREN, [")", #<Parser::Source::Range example.rb 11...12>]],
[:kEND, ["end", #<Parser::Source::Range example.rb 13...16>]]]
```
With this change, the following code could be removed from test/prism/ruby/parser_test.rb:
```diff
- when :tPOW
- actual_token[0] = expected_token[0] if expected_token[0] == :tDSTAR
```
`tPOW` is the token type for the behavior of `a ** b`, and its behavior remains unchanged:
```console
$ bundle exec ruby -Ilib -rprism -rprism/translation/parser33 -ve \
'buf = Parser::Source::Buffer.new("example.rb"); buf.source = "a ** b"; p Prism::Translation::Parser33.new.tokenize(buf)[2]'
ruby 3.4.0dev (2024-09-01T11:00:13Z master https://github.com/ruby/prism/commit/eb144ef91e) [x86_64-darwin23]
[[:tIDENTIFIER, ["a", #<Parser::Source::Range example.rb 0...1>]], [:tPOW, ["**", #<Parser::Source::Range example.rb 2...4>]],
[:tIDENTIFIER, ["b", #<Parser::Source::Range example.rb 5...6>]]]
```
https://github.com/ruby/prism/commit/66bde35a44
|
|
This PR fixes a token incompatibility between Parser gem and `Prism::Translation::Parser` for left parenthesis.
## Parser gem (Expected)
Returns `tLPAREN2` token:
```console
$ bundle exec ruby -Ilib -rparser/ruby33 \
-ve 'buf = Parser::Source::Buffer.new("example.rb"); buf.source = "foo(:bar)"; p Parser::Ruby33.new.tokenize(buf)[2]'
ruby 3.4.0dev (2024-09-01T11:00:13Z master https://github.com/ruby/prism/commit/eb144ef91e) [x86_64-darwin23]
[[:tIDENTIFIER, ["foo", #<Parser::Source::Range example.rb 0...3>]], [:tLPAREN2, ["(", #<Parser::Source::Range example.rb 3...4>]],
[:tSYMBOL, ["bar", #<Parser::Source::Range example.rb 4...8>]], [:tRPAREN, [")", #<Parser::Source::Range example.rb 8...9>]]]
```
## `Prism::Translation::Parser` (Actual)
Previously, the parser returned `tLPAREN` token when parsing the following:
```console
$ bundle exec ruby -Ilib -rprism -rprism/translation/parser33 -ve \
'buf = Parser::Source::Buffer.new("example.rb"); buf.source = "foo(:bar)"; p Prism::Translation::Parser33.new.tokenize(buf)[2]'
ruby 3.4.0dev (2024-09-01T11:00:13Z master https://github.com/ruby/prism/commit/eb144ef91e) [x86_64-darwin23]
[[:tIDENTIFIER, ["foo", #<Parser::Source::Range example.rb 0...3>]], [:tLPAREN, ["(", #<Parser::Source::Range example.rb 3...4>]],
[:tSYMBOL, ["bar", #<Parser::Source::Range example.rb 4...8>]], [:tRPAREN, [")", #<Parser::Source::Range example.rb 8...9>]]]
```
After the update, the parser now returns `tLPAREN2` token for the same input:
```console
$ bundle exec ruby -Ilib -rprism -rprism/translation/parser33 -ve \
'buf = Parser::Source::Buffer.new("example.rb"); buf.source = "foo(:bar)"; p Prism::Translation::Parser33.new.tokenize(buf)[2]'
ruby 3.4.0dev (2024-09-01T11:00:13Z master https://github.com/ruby/prism/commit/eb144ef91e) [x86_64-darwin23]
[[:tIDENTIFIER, ["foo", #<Parser::Source::Range example.rb 0...3>]], [:tLPAREN2, ["(", #<Parser::Source::Range example.rb 3...4>]],
[:tSYMBOL, ["bar", #<Parser::Source::Range example.rb 4...8>]], [:tRPAREN, [")", #<Parser::Source::Range example.rb 8...9>]]]
```
The `PARENTHESIS_LEFT` token in Prism is classified as either `tLPAREN` or `tLPAREN2` in the Parser gem.
The tokens that were previously all classified as `tLPAREN` are now also classified to `tLPAREN2`.
With this change, the following code could be removed from `test/prism/ruby/parser_test.rb`:
```diff
- when :tLPAREN
- actual_token[0] = expected_token[0] if expected_token[0] == :tLPAREN2
```
https://github.com/ruby/prism/commit/04d6f3478d
|
|
https://github.com/ruby/prism/commit/5985ab7687
|
|
This PR fixes token incompatibility for `Prism::Translation::Parser::Lexer` when using backquoted heredoc indetiner:
```ruby
<<-` FOO`
a
b
FOO
```
## Parser gem (Expected)
Returns `tXSTRING_BEG` as the first token:
```console
$ bundle exec ruby -Ilib -rparser/ruby33 -ve \
'buf = Parser::Source::Buffer.new("example.rb"); buf.source = File.read("example.rb"); p Parser::Ruby33.new.tokenize(buf)'
ruby 3.3.0 (2023-12-25 revision https://github.com/ruby/prism/commit/5124f9ac75) [x86_64-darwin22]
[s(:xstr,
s(:str, "a\n"),
s(:str, "b\n")), [], [[:tXSTRING_BEG, ["<<`", #<Parser::Source::Range example.rb 0...10>]],
[:tSTRING_CONTENT, ["a\n", #<Parser::Source::Range example.rb 11...13>]],
[:tSTRING_CONTENT, ["b\n", #<Parser::Source::Range example.rb 13...15>]],
[:tSTRING_END, [" FOO", #<Parser::Source::Range example.rb 15...23>]], [:tNL, [nil, #<Parser::Source::Range example.rb 10...11>]]]]
```
## `Prism::Translation::Parser` (Actual)
Previously, the tokens returned by the Parser gem were different. The escaped backslash does not match in the `tSTRING_BEG` token and
value of `tSTRING_END` token.
```console
$ bundle exec ruby -Ilib -rprism -rprism/translation/parser33 -ve \
'buf = Parser::Source::Buffer.new("example.rb"); buf.source = File.read("example.rb"); p Prism::Translation::Parser33.new.tokenize(buf)'
ruby 3.3.0 (2023-12-25 revision https://github.com/ruby/prism/commit/5124f9ac75) [x86_64-darwin22]
[s(:xstr,
s(:str, "a\n"),
s(:str, "b\n")), [], [[:tSTRING_BEG, ["<<\"", #<Parser::Source::Range example.rb 0...10>]],
[:tSTRING_CONTENT, ["a\n", #<Parser::Source::Range example.rb 11...13>]],
[:tSTRING_CONTENT, ["b\n", #<Parser::Source::Range example.rb 13...15>]],
[:tSTRING_END, ["` FOO`", #<Parser::Source::Range example.rb 15...23>]], [:tNL, [nil, #<Parser::Source::Range example.rb 10...11>]]]]
```
After this correction, the AST and tokens returned by the Parser gem are the same:
```console
$ bunlde exec ruby -Ilib -rprism -rprism/translation/parser33 -ve \
'buf = Parser::Source::Buffer.new("example.rb"); buf.source = File.read("example.rb"); p Prism::Translation::Parser33.new.tokenize(buf)'
ruby 3.3.0 (2023-12-25 revision https://github.com/ruby/prism/commit/5124f9ac75) [x86_64-darwin22]
[s(:xstr,
s(:str, "a\n"),
s(:str, "b\n")), [], [[:tXSTRING_BEG, ["<<`", #<Parser::Source::Range example.rb 0...10>]],
[:tSTRING_CONTENT, ["a\n", #<Parser::Source::Range example.rb 11...13>]],
[:tSTRING_CONTENT, ["b\n", #<Parser::Source::Range example.rb 13...15>]],
[:tSTRING_END, [" FOO", #<Parser::Source::Range example.rb 15...23>]], [:tNL, [nil, #<Parser::Source::Range example.rb 10...11>]]]]
```
https://github.com/ruby/prism/commit/308f8d85a1
|
|
This PR fixes token incompatibility for `Prism::Translation::Parser::Lexer`
when using escaped backslash in string literal:
```ruby
"\\ foo \\ bar"
```
## Parser gem (Expected)
```console
$ bundle exec ruby -Ilib -rparser/ruby33 -ve \
'buf = Parser::Source::Buffer.new("example.rb"); buf.source = File.read("example.rb"); p Parser::Ruby33.new.tokenize(buf)'
ruby 3.3.0 (2023-12-25 revision https://github.com/ruby/prism/commit/5124f9ac75) [x86_64-darwin22]
[s(:str, "\\ foo \\ bar"), [], [[:tSTRING, ["\\ foo \\ bar", #<Parser::Source::Range example.rb 0...15>]],
[:tNL, [nil, #<Parser::Source::Range example.rb 15...16>]]]]
```
## `Prism::Translation::Parser` (Actual)
Previously, the tokens returned by the Parser gem were different. The escaped backslash does not match in the `tSTRING` token:
```console
$ bundle exec ruby -Ilib -rprism -rprism/translation/parser33 -ve \
'buf = Parser::Source::Buffer.new("example.rb"); buf.source = File.read("example.rb"); p Prism::Translation::Parser33.new.tokenize(buf)'
ruby 3.3.0 (2023-12-25 revision https://github.com/ruby/prism/commit/5124f9ac75) [x86_64-darwin22]
[s(:str, "\\ foo \\ bar"), [], [[:tSTRING, ["\\\\ foo \\\\ bar", #<Parser::Source::Range example.rb 0...15>]],
[:tNL, [nil, #<Parser::Source::Range example.rb 15...16>]]]]
```
After this correction, the AST and tokens returned by the Parser gem are the same:
```console
$ bundle exec ruby -Ilib -rprism -rprism/translation/parser33 -ve \
'buf = Parser::Source::Buffer.new("example.rb"); buf.source = File.read("example.rb"); p Prism::Translation::Parser33.new.tokenize(buf)'
ruby 3.3.0 (2023-12-25 revision https://github.com/ruby/prism/commit/5124f9ac75) [x86_64-darwin22]
[s(:str, "\\ foo \\ bar"), [], [[:tSTRING, ["\\ foo \\ bar", #<Parser::Source::Range example.rb 0...15>]],
[:tNL, [nil, #<Parser::Source::Range example.rb 15...16>]]]]
```
The reproduction test is based on the following strings.txt and exists:
https://github.com/ruby/prism/blob/v0.24.0/test/prism/fixtures/strings.txt#L79
But, the restoration has not yet been performed due to remaining other issues in strings.txt.
https://github.com/ruby/prism/commit/2c44e7e307
|
|
This PR fixes a token incompatibility between Parser gem and `Prism::Translation::Parser`
for the heredocs_leading_whitespace.txt test.
https://github.com/ruby/prism/commit/7d45fb1eed
|
|
`Prism::Translation::Parser`
This PR fixes an AST and token incompatibility between Parser gem and `Prism::Translation::Parser`
for dstring literal:
```ruby
"foo
#{bar}"
```
## Parser gem (Expected)
```console
$ bundle exec ruby -Ilib -rparser/ruby33 -ve \
'buf = Parser::Source::Buffer.new("example.rb"); buf.source = File.read("example.rb"); p Parser::Ruby33.new.tokenize(buf)'
ruby 3.3.0 (2023-12-25 revision https://github.com/ruby/prism/commit/5124f9ac75) [x86_64-darwin22]
[s(:dstr,
s(:str, "foo\n"),
s(:str, " "),
s(:begin,
s(:send, nil, :bar))), [], [[:tSTRING_BEG, ["\"", #<Parser::Source::Range example.rb 0...1>]],
[:tSTRING_CONTENT, ["foo\n", #<Parser::Source::Range example.rb 1...5>]],
[:tSTRING_CONTENT, [" ", #<Parser::Source::Range example.rb 5...7>]],
[:tSTRING_DBEG, ["\#{", #<Parser::Source::Range example.rb 7...9>]],
[:tIDENTIFIER, ["bar", #<Parser::Source::Range example.rb 9...12>]],
[:tSTRING_DEND, ["}", #<Parser::Source::Range example.rb 12...13>]],
[:tSTRING_END, ["\"", #<Parser::Source::Range example.rb 13...14>]], [:tNL, [nil, #<Parser::Source::Range example.rb 14...15>]]]]
```
## `Prism::Translation::Parser` (Actual)
Previously, the AST and tokens returned by the Parser gem were different. In this case, `dstr` node should not be nested:
```console
$ bundle exec ruby -Ilib -rprism -rprism/translation/parser33 -ve \
'buf = Parser::Source::Buffer.new("example.rb"); buf.source = File.read("example.rb"); p Prism::Translation::Parser33.new.tokenize(buf)'
ruby 3.3.0 (2023-12-25 revision https://github.com/ruby/prism/commit/5124f9ac75) [x86_64-darwin22]
[s(:dstr,
s(:dstr,
s(:str, "foo\n"),
s(:str, " ")),
s(:begin,
s(:send, nil, :bar))), [], [[:tSTRING_BEG, ["\"", #<Parser::Source::Range example.rb 0...1>]],
[:tSTRING_CONTENT, ["foo\n", #<Parser::Source::Range example.rb 1...5>]],
[:tSTRING_CONTENT, [" ", #<Parser::Source::Range example.rb 5...7>]],
[:tSTRING_DBEG, ["\#{", #<Parser::Source::Range example.rb 7...9>]],
[:tIDENTIFIER, ["bar", #<Parser::Source::Range example.rb 9...12>]],
[:tSTRING_DEND, ["}", #<Parser::Source::Range example.rb 12...13>]],
[:tSTRING_END, ["\"", #<Parser::Source::Range example.rb 13...14>]], [:tNL, [nil, #<Parser::Source::Range example.rb 14...15>]]]]
```
After this correction, the AST and tokens returned by the Parser gem are the same:
```console
$ bundle exec ruby -Ilib -rprism -rprism/translation/parser33 -ve \
'buf = Parser::Source::Buffer.new("example.rb"); buf.source = File.read("example.rb"); p Prism::Translation::Parser33.new.tokenize(buf)'
ruby 3.3.0 (2023-12-25 revision https://github.com/ruby/prism/commit/5124f9ac75) [x86_64-darwin22]
[s(:dstr,
s(:str, "foo\n"),
s(:str, " "),
s(:begin,
s(:send, nil, :bar))), [], [[:tSTRING_BEG, ["\"", #<Parser::Source::Range example.rb 0...1>]],
[:tSTRING_CONTENT, ["foo\n", #<Parser::Source::Range example.rb 1...5>]],
[:tSTRING_CONTENT, [" ", #<Parser::Source::Range example.rb 5...7>]],
[:tSTRING_DBEG, ["\#{", #<Parser::Source::Range example.rb 7...9>]],
[:tIDENTIFIER, ["bar", #<Parser::Source::Range example.rb 9...12>]],
[:tSTRING_DEND, ["}", #<Parser::Source::Range example.rb 12...13>]],
[:tSTRING_END, ["\"", #<Parser::Source::Range example.rb 13...14>]], [:tNL, [nil, #<Parser::Source::Range example.rb 14...15>]]]]
```
https://github.com/ruby/prism/commit/c1652a9ee7
|
|
`Prism::Translation::Parser`
This PR fixes an AST and token incompatibility between Parser gem and `Prism::Translation::Parser`
for empty xstring literal.
## Parser gem (Expected)
```console
$ bundle exec ruby -Ilib -rparser/ruby33 -ve \
'buf = Parser::Source::Buffer.new("/tmp/s.rb"); buf.source = "``"; p Parser::Ruby33.new.tokenize(buf)'
ruby 3.3.0 (2023-12-25 revision https://github.com/ruby/prism/commit/5124f9ac75) [x86_64-darwin22]
[s(:xstr), [], [[:tXSTRING_BEG, ["`", #<Parser::Source::Range /tmp/s.rb 0...1>]],
[:tSTRING_END, ["`", #<Parser::Source::Range /tmp/s.rb 1...2>]]]]
```
## `Prism::Translation::Parser` (Actual)
Previously, the AST and tokens returned by the Parser gem were different:
```console
$ bunele exec ruby -Ilib -rprism -rprism/translation/parser33 -ve \
'buf = Parser::Source::Buffer.new("/tmp/s.rb"); buf.source = "``"; p Prism::Translation::Parser33.new.tokenize(buf)'
ruby 3.3.0 (2023-12-25 revision https://github.com/ruby/prism/commit/5124f9ac75) [x86_64-darwin22]
[s(:xstr, s(:str, "")), [], [[:tBACK_REF2, ["`", #<Parser::Source::Range /tmp/s.rb 0...1>]],
[:tSTRING_END, ["`", #<Parser::Source::Range /tmp/s.rb 1...2>]]]]
```
After this correction, the AST and tokens returned by the Parser gem are the same:
```console
$ bundle exec ruby -Ilib -rprism -rprism/translation/parser33 -ve \
'buf = Parser::Source::Buffer.new("/tmp/s.rb"); buf.source = "``"; p Prism::Translation::Parser33.new.tokenize(buf)'
ruby 3.3.0 (2023-12-25 revision https://github.com/ruby/prism/commit/5124f9ac75) [x86_64-darwin22]
[s(:xstr), [], [[:tXSTRING_BEG, ["`", #<Parser::Source::Range /tmp/s.rb 0...1>]],
[:tSTRING_END, ["`", #<Parser::Source::Range /tmp/s.rb 1...2>]]]]
```
https://github.com/ruby/prism/commit/4ac89dcbb5
|
|
In practice, the `BACKTICK` is mapped either as `:tXSTRING_BEG` or `:tBACK_REF2`.
The former is used in xstrings like `` `foo` ``, while the latter is utilized as
a back reference in contexts like `` A::` ``.
This PR will make corrections to differentiate the use of `BACKTICK`.
This mistake was discovered through the investigation of xstring.txt file.
The PR will run tests from xstring.txt file except for `` `f\oo` ``, which will still fail,
hence it will be separated into xstring_with_backslash.txt file.
This separation will facilitate addressing the correction at a different time.
https://github.com/ruby/prism/commit/49ad8df40a
|
|
https://github.com/ruby/prism/commit/34e521d071
|
|
This PR fixes a token incompatibility between Parser gem and `Prism::Translation::Parser` for `tBACK_REF2`:
## Parser gem (Expected)
Returns `tBACK_REF2` token:
```console
$ bundle exec ruby -Ilib -rparser/ruby33 -ve \
'buf = Parser::Source::Buffer.new("example.rb"); buf.source = "A::`"; p Parser::Ruby33.new.tokenize(buf)[2]'
ruby 3.3.0 (2023-12-25 revision https://github.com/ruby/prism/commit/5124f9ac75) [x86_64-darwin22]
[[:tCONSTANT, ["A", #<Parser::Source::Range example.rb 0...1>]], [:tCOLON2, ["::", #<Parser::Source::Range example.rb 1...3>]],
[:tBACK_REF2, ["`", #<Parser::Source::Range example.rb 3...4>]]]
```
## `Prism::Translation::Parser` (Actual)
Previously, the parser returned `tXSTRING_BEG` token when parsing the following:
```console
$ bundle exec ruby -Ilib -rprism -rprism/translation/parser33 -ve \
'buf = Parser::Source::Buffer.new("example.rb"); buf.source = "A::`"; p Prism::Translation::Parser33.new.tokenize(buf)[2]'
ruby 3.3.0 (2023-12-25 revision https://github.com/ruby/prism/commit/5124f9ac75) [x86_64-darwin22]
[[:tCONSTANT, ["A", #<Parser::Source::Range example.rb 0...1>]], [:tCOLON2, ["::", #<Parser::Source::Range example.rb 1...3>]],
[:tXSTRING_BEG, ["`", #<Parser::Source::Range example.rb 3...4>]]]
```
After the update, the parser now returns `tBACK_REF2` token for the same input:
```console
$ bundle exec ruby -Ilib -rprism -rprism/translation/parser33 -ve \
'buf = Parser::Source::Buffer.new("example.rb"); buf.source = "A::`"; p Prism::Translation::Parser33.new.tokenize(buf)[2]'
ruby 3.3.0 (2023-12-25 revision https://github.com/ruby/prism/commit/5124f9ac75) [x86_64-darwin22]
[[:tCONSTANT, ["A", #<Parser::Source::Range example.rb 0...1>]], [:tCOLON2, ["::", #<Parser::Source::Range example.rb 1...3>]],
[:tBACK_REF2, ["`", #<Parser::Source::Range example.rb 3...4>]]]
```
This correction enables the restoration of `constants.txt` as a test case.
https://github.com/ruby/prism/commit/7f63b28f98
|
|
This PR fixes a token incompatibility between Parser gem and `Prism::Translation::Parser` for
beginless range:
## Parser gem (Expected)
Returns `tBDOT2` token:
```console
$ bundle exec ruby -Ilib -rparser/ruby33 -ve \
'buf = Parser::Source::Buffer.new("example.rb"); buf.source = "..42"; p Parser::Ruby33.new.tokenize(buf)[2]'
ruby 3.3.0 (2023-12-25 revision https://github.com/ruby/prism/commit/5124f9ac75) [x86_64-darwin22]
[[:tBDOT2, ["..", #<Parser::Source::Range example.rb 0...2>]], [:tINTEGER, [42, #<Parser::Source::Range example.rb 2...4>]]]
```
## `Prism::Translation::Parser` (Actual)
Previously, the parser returned `tDOT2` token when parsing the following:
```console
$ bundle exec ruby -Ilib -rprism -rprism/translation/parser33 -ve \
'buf = Parser::Source::Buffer.new("example.rb"); buf.source = "..42"; p Prism::Translation::Parser33.new.tokenize(buf)[2]'
ruby 3.3.0 (2023-12-25 revision https://github.com/ruby/prism/commit/5124f9ac75) [x86_64-darwin22]
[[:tDOT2, ["..", #<Parser::Source::Range example.rb 0...2>]], [:tINTEGER, [42, #<Parser::Source::Range example.rb 2...4>]]]
```
After the update, the parser now returns `tBDOT2` token for the same input:
```console
$ bundle exec ruby -Ilib -rprism -rprism/translation/parser33 -ve \
'buf = Parser::Source::Buffer.new("example.rb"); buf.source = "..42"; p Prism::Translation::Parser33.new.tokenize(buf)[2]'
ruby 3.3.0 (2023-12-25 revision https://github.com/ruby/prism/commit/5124f9ac75) [x86_64-darwin22]
[[:tBDOT2, ["..", #<Parser::Source::Range example.rb 0...2>]], [:tINTEGER, [42, #<Parser::Source::Range example.rb 2...4>]]]
```
This correction enables the restoration of `endless_range_in_conditional.txt` as a test case.
https://github.com/ruby/prism/commit/f624b99ab0
|
|
`Prism::Translation::Parser`
Fixes https://github.com/ruby/prism/pull/2515.
This PR fixes an AST and token incompatibility between Parser gem and `Prism::Translation::Parser`
for string literal with line breaks.
https://github.com/ruby/prism/commit/c58466e5bf
|
|
Fixes https://github.com/ruby/prism/pull/2512.
This PR fixes a token incompatibility between Parser gem and `Prism::Translation::Parser`
for `HEREDOC_END` with a newline.
https://github.com/ruby/prism/commit/b67d1e0c6f
|
|
This PR fixes the following error for `Prism::Translation::Parser::Lexer` on the main branch:
```console
$ cat example.rb
'a' # aあ
"
#{x}
"
$ bundle exec rubocop
Parser::Source::Range: end_pos must not be less than begin_pos
/Users/koic/.rbenv/versions/3.0.4/lib/ruby/gems/3.0.0/gems/parser-3.3.0.5/lib/parser/source/range.rb:39:in `initialize'
/Users/koic/src/github.com/ruby/prism/lib/prism/translation/parser/lexer.rb:299:in `new'
/Users/koic/src/github.com/ruby/prism/lib/prism/translation/parser/lexer.rb:299:in `block in to_a'
/Users/koic/src/github.com/ruby/prism/lib/prism/translation/parser/lexer.rb:297:in `map'
/Users/koic/src/github.com/ruby/prism/lib/prism/translation/parser/lexer.rb:297:in `to_a'
/Users/koic/src/github.com/ruby/prism/lib/prism/translation/parser.rb:263:in `build_tokens'
/Users/koic/src/github.com/ruby/prism/lib/prism/translation/parser.rb:92:in `tokenize'
```
This change was made in https://github.com/ruby/prism/pull/2557, and it seems there was
an inconsistency in Range due to forgetting to apply `offset_cache` to `start_offset`.
|
|
`Prism::Translation::Parser`
Fixes https://github.com/ruby/prism/pull/2506.
This PR fixes an AST and token incompatibility between Parser gem and `Prism::Translation::Parser`
for symbols quoted with line breaks.
https://github.com/ruby/prism/commit/06ab4df8cd
|
|
https://github.com/ruby/prism/commit/1a8a0063dc
|
|
For example, use `.fetch` or `.dig` instead of `[]`, and use `===` instead of `is_a?` for checking types of objects.
https://github.com/ruby/prism/commit/548b54915f
|