| Age | Commit message (Collapse) | Author |
|
Tests worked around this but the incompatibility is not hard to fix.
This fixes 17 token incompatibilies in tests here that were previously passing
https://github.com/ruby/prism/commit/101962526d
|
|
Skipping detecting the encoding is almost always right, just for binary it should actually happen.
A symbol containing escapes that are invalid
in utf-8 would fail to parse since symbols must be valid in the script encoding.
Additionally, the parser gem would raise an exception somewhere during string handling
https://github.com/ruby/prism/commit/fa0154d9e4
|
|
This leaves `\c` and `\M` escaping but I don't understand how these should even work yet. Maybe later.
https://github.com/ruby/prism/commit/13db3e8cb9
|
|
Slightly tweaking the import script becaues of backtrace format changes in Ruby 3.4
Most tests pass in all parsers, with only a handful of failures overall
https://github.com/ruby/prism/commit/9b5b785aa4
|
|
translator
Much of this logic should be shared between interpolated symbols and regexps.
It's also incorrect when the node contains a literal `\\n` (same as for plain string nodes at the moment)
https://github.com/ruby/prism/commit/561914f99b
|
|
Turns out, the vast majority of work was already done with handling the same for heredocs
I'm confident this should also apply to actual string nodes (there's even a todo for it) but
no tests change if I apply it there too, so I can't say for sure if the logic would be correct.
The individual test files are a bit too large, maybe something else would break that currently passes.
Leaving it for later to look more closely into that.
https://github.com/ruby/prism/commit/6bba1c54e1
|
|
The offset cache contains an entry for each byte so it can't be accessed via the string length.
Adds tests for all variants except for this:
```
"fo
o" "ba
’"
```
For some reason, this still has the wrong offset.
https://github.com/ruby/prism/commit/a651126458
|
|
That issue is exactly about what this test file contains:
A single-quoted heredocs with backslashes
https://github.com/ruby/prism/commit/4820a44c7b
|
|
Heredocs that contain "\\n" don't start a new string node.
https://github.com/ruby/prism/commit/61d9d3a15e
|
|
https://github.com/ruby/prism/commit/5ea6042408
|
|
Introduce StringQuery to provide methods to access some metadata
about the Ruby lexer.
https://github.com/ruby/prism/commit/d3f55b67b9
|
|
Calculating code unit offsets for a source can be very expensive,
especially when the source is large. This commit introduces a new
class that wraps the source and desired encoding into a cache that
reuses pre-computed offsets. It performs quite a bit better.
There are still some problems with this approach, namely character
boundaries and the fact that the cache is unbounded, but both of
these may be addressed in subsequent commits.
https://github.com/ruby/prism/commit/2e3e1a4d4d
|
|
https://github.com/ruby/prism/commit/343197e4ff
|
|
https://github.com/ruby/prism/commit/25a4cf6794
Co-authored-by: Kevin Newton <kddnewton@users.noreply.github.com>
|
|
`Prism::Translation::Parser::Lexer`
## Summary
This PR fixes `kDO_LAMBDA` token incompatibility between Parser gem and `Prism::Translation::Parser` for lambda `do` block.
### Parser gem (Expected)
Returns `kDO_LAMBDA` token:
```console
$ bundle exec ruby -Ilib -rparser/ruby33 -ve \
'buf = Parser::Source::Buffer.new("example.rb"); buf.source = "-> do end"; p Parser::Ruby33.new.tokenize(buf)[2]'
ruby 3.4.0dev (2024-09-01T11:00:13Z master https://github.com/ruby/prism/commit/eb144ef91e) [x86_64-darwin23]
[[:tLAMBDA, ["->", #<Parser::Source::Range example.rb 0...2>]], [:kDO_LAMBDA, ["do", #<Parser::Source::Range example.rb 3...5>]],
[:kEND, ["end", #<Parser::Source::Range example.rb 6...9>]]]
```
### `Prism::Translation::Parser` (Actual)
Previously, the parser returned `kDO` token when parsing the following:
```console
$ bundle exec ruby -Ilib -rprism -rprism/translation/parser33 -ve \
'buf = Parser::Source::Buffer.new("example.rb"); buf.source = "-> do end"; p Prism::Translation::Parser33.new.tokenize(buf)[2]'
ruby 3.4.0dev (2024-09-01T11:00:13Z master https://github.com/ruby/prism/commit/eb144ef91e) [x86_64-darwin23]
[[:tLAMBDA, ["->", #<Parser::Source::Range example.rb 0...2>]], [:kDO, ["do", #<Parser::Source::Range example.rb 3...5>]],
[:kEND, ["end", #<Parser::Source::Range example.rb 6...9>]]]
```
After the update, the parser now returns `kDO_LAMBDA` token for the same input:
```console
$ bundle exec ruby -Ilib -rprism -rprism/translation/parser33 -ve \
'buf = Parser::Source::Buffer.new("example.rb"); buf.source = "-> do end"; p Prism::Translation::Parser33.new.tokenize(buf)[2]'
ruby 3.4.0dev (2024-09-01T11:00:13Z master https://github.com/ruby/prism/commit/eb144ef91e) [x86_64-darwin23]
[[:tLAMBDA, ["->", #<Parser::Source::Range example.rb 0...2>]], [:kDO_LAMBDA, ["do", #<Parser::Source::Range example.rb 3...5>]],
[:kEND, ["end", #<Parser::Source::Range example.rb 6...9>]]]
```
## Additional Information
Unfortunately, this kind of edge case doesn't work as expected; `kDO` is returned instead of `kDO_LAMBDA`.
However, since `kDO` is already being returned in this case, there is no change in behavior.
### Parser gem
Returns `tLAMBDA` token:
```console
$ bundle exec ruby -Ilib -rparser/ruby33 -ve \
'buf = Parser::Source::Buffer.new("example.rb"); buf.source = "-> (foo = -> (bar) {}) do end"; p Parser::Ruby33.new.tokenize(buf)[2]'
ruby 3.3.5 (2024-09-03 revision https://github.com/ruby/prism/commit/ef084cc8f4) [x86_64-darwin23]
[[:tLAMBDA, ["->", #<Parser::Source::Range example.rb 0...2>]], [:tLPAREN2, ["(", #<Parser::Source::Range example.rb 3...4>]],
[:tIDENTIFIER, ["foo", #<Parser::Source::Range example.rb 4...7>]], [:tEQL, ["=", #<Parser::Source::Range example.rb 8...9>]],
[:tLAMBDA, ["->", #<Parser::Source::Range example.rb 10...12>]], [:tLPAREN2, ["(", #<Parser::Source::Range example.rb 13...14>]],
[:tIDENTIFIER, ["bar", #<Parser::Source::Range example.rb 14...17>]], [:tRPAREN, [")", #<Parser::Source::Range example.rb 17...18>]],
[:tLAMBEG, ["{", #<Parser::Source::Range example.rb 19...20>]], [:tRCURLY, ["}", #<Parser::Source::Range example.rb 20...21>]],
[:tRPAREN, [")", #<Parser::Source::Range example.rb 21...22>]], [:kDO_LAMBDA, ["do", #<Parser::Source::Range example.rb 23...25>]],
[:kEND, ["end", #<Parser::Source::Range example.rb 26...29>]]]
```
### `Prism::Translation::Parser`
Returns `kDO` token:
```console
$ bundle exec ruby -Ilib -rprism -rprism/translation/parser33 -ve \
'buf = Parser::Source::Buffer.new("example.rb"); buf.source = "-> (foo = -> (bar) {}) do end"; p Prism::Translation::Parser33.new.tokenize(buf)[2]'
ruby 3.3.5 (2024-09-03 revision https://github.com/ruby/prism/commit/ef084cc8f4) [x86_64-darwin23]
[[:tLAMBDA, ["->", #<Parser::Source::Range example.rb 0...2>]], [:tLPAREN2, ["(", #<Parser::Source::Range example.rb 3...4>]],
[:tIDENTIFIER, ["foo", #<Parser::Source::Range example.rb 4...7>]], [:tEQL, ["=", #<Parser::Source::Range example.rb 8...9>]],
[:tLAMBDA, ["->", #<Parser::Source::Range example.rb 10...12>]], [:tLPAREN2, ["(", #<Parser::Source::Range example.rb 13...14>]],
[:tIDENTIFIER, ["bar", #<Parser::Source::Range example.rb 14...17>]], [:tRPAREN, [")", #<Parser::Source::Range example.rb 17...18>]],
[:tLAMBEG, ["{", #<Parser::Source::Range example.rb 19...20>]], [:tRCURLY, ["}", #<Parser::Source::Range example.rb 20...21>]],
[:tRPAREN, [")", #<Parser::Source::Range example.rb 21...22>]], [:kDO, ["do", #<Parser::Source::Range example.rb 23...25>]],
[:kEND, ["end", #<Parser::Source::Range example.rb 26...29>]]]
```
As the intention is not to address such special cases at this point, a comment has been left indicating that this case still returns `kDO`.
In other words, `kDO_LAMBDA` will now be returned except for edge cases after this PR.
https://github.com/ruby/prism/commit/2ee480654c
|
|
This PR fixes a token incompatibility between Parser gem and `Prism::Translation::Parser` for double splat argument.
## Parser gem (Expected)
Returns `tDSTAR` token:
```console
$ bundle exec ruby -Ilib -rparser/ruby33 -ve \
'buf = Parser::Source::Buffer.new("example.rb"); buf.source = "def f(**foo) end"; p Parser::Ruby33.new.tokenize(buf)[2]'
ruby 3.4.0dev (2024-09-01T11:00:13Z master https://github.com/ruby/prism/commit/eb144ef91e) [x86_64-darwin23]
[[:kDEF, ["def", #<Parser::Source::Range example.rb 0...3>]], [:tIDENTIFIER, ["f", #<Parser::Source::Range example.rb 4...5>]],
[:tLPAREN2, ["(", #<Parser::Source::Range example.rb 5...6>]], [:tDSTAR, ["**", #<Parser::Source::Range example.rb 6...8>]],
[:tIDENTIFIER, ["foo", #<Parser::Source::Range example.rb 8...11>]], [:tRPAREN, [")", #<Parser::Source::Range example.rb 11...12>]],
[:kEND, ["end", #<Parser::Source::Range example.rb 13...16>]]]
```
## `Prism::Translation::Parser` (Actual)
Previously, the parser returned `tPOW` token when parsing the following:
```console
$ bundle exec ruby -Ilib -rprism -rprism/translation/parser33 -ve \
'buf = Parser::Source::Buffer.new("example.rb"); buf.source = "def f(**foo) end"; p Prism::Translation::Parser33.new.tokenize(buf)[2]'
ruby 3.4.0dev (2024-09-01T11:00:13Z master https://github.com/ruby/prism/commit/eb144ef91e) [x86_64-darwin23]
[[:kDEF, ["def", #<Parser::Source::Range example.rb 0...3>]], [:tIDENTIFIER, ["f", #<Parser::Source::Range example.rb 4...5>]],
[:tLPAREN2, ["(", #<Parser::Source::Range example.rb 5...6>]], [:tPOW, ["**", #<Parser::Source::Range example.rb 6...8>]],
[:tIDENTIFIER, ["foo", #<Parser::Source::Range example.rb 8...11>]], [:tRPAREN, [")", #<Parser::Source::Range example.rb 11...12>]],
[:kEND, ["end", #<Parser::Source::Range example.rb 13...16>]]]
```
After the update, the parser now returns `tDSTAR` token for the same input:
```console
$ bundle exec ruby -Ilib -rprism -rprism/translation/parser33 -ve \
'buf = Parser::Source::Buffer.new("example.rb"); buf.source = "def f(**foo) end"; p Prism::Translation::Parser33.new.tokenize(buf)[2]'
ruby 3.4.0dev (2024-09-01T11:00:13Z master https://github.com/ruby/prism/commit/eb144ef91e) [x86_64-darwin23]
[[:kDEF, ["def", #<Parser::Source::Range example.rb 0...3>]], [:tIDENTIFIER, ["f", #<Parser::Source::Range example.rb 4...5>]],
[:tLPAREN2, ["(", #<Parser::Source::Range example.rb 5...6>]], [:tDSTAR, ["**", #<Parser::Source::Range example.rb 6...8>]],
[:tIDENTIFIER, ["foo", #<Parser::Source::Range example.rb 8...11>]], [:tRPAREN, [")", #<Parser::Source::Range example.rb 11...12>]],
[:kEND, ["end", #<Parser::Source::Range example.rb 13...16>]]]
```
With this change, the following code could be removed from test/prism/ruby/parser_test.rb:
```diff
- when :tPOW
- actual_token[0] = expected_token[0] if expected_token[0] == :tDSTAR
```
`tPOW` is the token type for the behavior of `a ** b`, and its behavior remains unchanged:
```console
$ bundle exec ruby -Ilib -rprism -rprism/translation/parser33 -ve \
'buf = Parser::Source::Buffer.new("example.rb"); buf.source = "a ** b"; p Prism::Translation::Parser33.new.tokenize(buf)[2]'
ruby 3.4.0dev (2024-09-01T11:00:13Z master https://github.com/ruby/prism/commit/eb144ef91e) [x86_64-darwin23]
[[:tIDENTIFIER, ["a", #<Parser::Source::Range example.rb 0...1>]], [:tPOW, ["**", #<Parser::Source::Range example.rb 2...4>]],
[:tIDENTIFIER, ["b", #<Parser::Source::Range example.rb 5...6>]]]
```
https://github.com/ruby/prism/commit/66bde35a44
|
|
This PR fixes a token incompatibility between Parser gem and `Prism::Translation::Parser` for left parenthesis.
## Parser gem (Expected)
Returns `tLPAREN2` token:
```console
$ bundle exec ruby -Ilib -rparser/ruby33 \
-ve 'buf = Parser::Source::Buffer.new("example.rb"); buf.source = "foo(:bar)"; p Parser::Ruby33.new.tokenize(buf)[2]'
ruby 3.4.0dev (2024-09-01T11:00:13Z master https://github.com/ruby/prism/commit/eb144ef91e) [x86_64-darwin23]
[[:tIDENTIFIER, ["foo", #<Parser::Source::Range example.rb 0...3>]], [:tLPAREN2, ["(", #<Parser::Source::Range example.rb 3...4>]],
[:tSYMBOL, ["bar", #<Parser::Source::Range example.rb 4...8>]], [:tRPAREN, [")", #<Parser::Source::Range example.rb 8...9>]]]
```
## `Prism::Translation::Parser` (Actual)
Previously, the parser returned `tLPAREN` token when parsing the following:
```console
$ bundle exec ruby -Ilib -rprism -rprism/translation/parser33 -ve \
'buf = Parser::Source::Buffer.new("example.rb"); buf.source = "foo(:bar)"; p Prism::Translation::Parser33.new.tokenize(buf)[2]'
ruby 3.4.0dev (2024-09-01T11:00:13Z master https://github.com/ruby/prism/commit/eb144ef91e) [x86_64-darwin23]
[[:tIDENTIFIER, ["foo", #<Parser::Source::Range example.rb 0...3>]], [:tLPAREN, ["(", #<Parser::Source::Range example.rb 3...4>]],
[:tSYMBOL, ["bar", #<Parser::Source::Range example.rb 4...8>]], [:tRPAREN, [")", #<Parser::Source::Range example.rb 8...9>]]]
```
After the update, the parser now returns `tLPAREN2` token for the same input:
```console
$ bundle exec ruby -Ilib -rprism -rprism/translation/parser33 -ve \
'buf = Parser::Source::Buffer.new("example.rb"); buf.source = "foo(:bar)"; p Prism::Translation::Parser33.new.tokenize(buf)[2]'
ruby 3.4.0dev (2024-09-01T11:00:13Z master https://github.com/ruby/prism/commit/eb144ef91e) [x86_64-darwin23]
[[:tIDENTIFIER, ["foo", #<Parser::Source::Range example.rb 0...3>]], [:tLPAREN2, ["(", #<Parser::Source::Range example.rb 3...4>]],
[:tSYMBOL, ["bar", #<Parser::Source::Range example.rb 4...8>]], [:tRPAREN, [")", #<Parser::Source::Range example.rb 8...9>]]]
```
The `PARENTHESIS_LEFT` token in Prism is classified as either `tLPAREN` or `tLPAREN2` in the Parser gem.
The tokens that were previously all classified as `tLPAREN` are now also classified to `tLPAREN2`.
With this change, the following code could be removed from `test/prism/ruby/parser_test.rb`:
```diff
- when :tLPAREN
- actual_token[0] = expected_token[0] if expected_token[0] == :tLPAREN2
```
https://github.com/ruby/prism/commit/04d6f3478d
|
|
https://github.com/ruby/prism/commit/9e4fb665ee
|
|
https://github.com/ruby/prism/commit/5985ab7687
|
|
https://github.com/ruby/prism/commit/f7faedfb3f
|
|
https://github.com/ruby/prism/commit/1528d3c019
|
|
https://github.com/ruby/prism/commit/d218e65561
|
|
https://github.com/ruby/prism/commit/100340bc6b
|
|
https://github.com/ruby/prism/commit/4a9a7a62af
Co-authored-by: Jason Kim <jasonkim@github.com>
Co-authored-by: Adam Hess <HParker@github.com>
|
|
https://github.com/ruby/prism/commit/09ba678066
|
|
https://github.com/ruby/prism/commit/85b4a5f804
|
|
https://github.com/ruby/prism/commit/461aa5e658
|
|
https://github.com/ruby/prism/commit/c49efdf824
|
|
https://github.com/ruby/prism/commit/12e079c97e
|
|
translation""
This reverts commit https://github.com/ruby/prism/commit/d8ae19d0334a.
https://github.com/ruby/prism/commit/df1eda2811
|
|
This reverts commit https://github.com/ruby/prism/commit/823e931ff230.
https://github.com/ruby/prism/commit/d8ae19d033
|
|
https://github.com/ruby/prism/commit/823e931ff2
|
|
https://github.com/ruby/prism/commit/a4e164e22b
|
|
https://github.com/ruby/prism/commit/840185110f
|
|
in parser
https://github.com/ruby/prism/commit/beed43922c
|
|
https://github.com/ruby/prism/commit/f2a327449a
|
|
https://github.com/ruby/prism/commit/785de2c39d
|
|
https://github.com/ruby/prism/commit/6f886be0a4
|