diff options
| author | Koichi ITO <koic.ito@gmail.com> | 2024-03-21 01:46:53 +0900 |
|---|---|---|
| committer | git <svn-admin@ruby-lang.org> | 2024-03-25 12:16:32 +0000 |
| commit | 56a2fad2a4578987a371f7a5563812b52ed8e9c6 (patch) | |
| tree | 8120119e81dbb98f4f96243185a0fcb364ec50fd /include/ruby/encoding.h | |
| parent | 9b921f662285e785ddbd22d0bcd540fa35151b08 (diff) | |
[ruby/prism] Fix incorrect paring when using invalid regexp options
Fixes https://github.com/ruby/prism/pull/2617.
There was an issue with the lexer as follows.
The following are valid regexp options:
```console
$ bundle exec ruby -Ilib -rprism -ve 'p Prism.lex("/x/io").value.map {|token| token[0].type }'
ruby 3.3.0 (2023-12-25 revision https://github.com/ruby/prism/commit/5124f9ac75) [x86_64-darwin22]
[:REGEXP_BEGIN, :STRING_CONTENT, :REGEXP_END, :EOF]
```
The following are invalid regexp options. Unnecessary the `IDENTIFIER` token is appearing:
```console
$ bundle exec ruby -Ilib -rprism -ve 'p Prism.lex("/x/az").value.map {|token| token[0].type }'
ruby 3.3.0 (2023-12-25 revision https://github.com/ruby/prism/commit/5124f9ac75) [x86_64-darwin22]
[:REGEXP_BEGIN, :STRING_CONTENT, :REGEXP_END, :IDENTIFIER, :EOF]
```
As a behavior of Ruby, when given `A` to `Z` and `a` to `z`, they act as invalid regexp options. e.g.,
```console
$ ruby -e '/regexp/az'
-e:1: unknown regexp options - az
/regexp/az
-e: compile error (SyntaxError)
```
Thus, it should probably not be construed as `IDENTIFIER` token.
Therefore, `pm_byte_table` has been adapted to accept those invalid regexp option values.
Whether it is a valid regexp option or not is checked by `pm_regular_expression_flags_create`.
For invalid regexp options, `PM_ERR_REGEXP_UNKNOWN_OPTIONS` is added to diagnostics.
https://github.com/ruby/prism/commit/d2a6096fcf
Diffstat (limited to 'include/ruby/encoding.h')
0 files changed, 0 insertions, 0 deletions
