path: root/ext/json/parser/parser.c
Age | Commit message | Author
2025-12-11 | [ruby/json] Add `allow_control_characters` parsing option | Jean Boussier
Control characters are not allowed by the spec, but some parsers such as Oj do accept them, and rejecting them can block a transition. Having this option can help people migrate. https://github.com/ruby/json/commit/3459499cb3
2025-12-10 | [ruby/json] Add a specific error for unescaped newlines | Jean Boussier
It's the most likely control character so it's worth giving a better error message for it. https://github.com/ruby/json/commit/1da3fd9233
2025-12-04 | [ruby/json] Fix a regression in parsing of unicode surrogate pairs | Jean Boussier
Fix: https://github.com/ruby/json/issues/912 In the case of surrogate pairs we consume two backslashes, so `json_next_backslash` needs to ensure it's not sending us back in the stream. https://github.com/ruby/json/commit/0fce370c41
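For context, a surrogate pair is written in JSON as two consecutive `\uXXXX` escapes, which is why two backslashes are consumed for one code point. The following is a minimal sketch of how the two 16-bit halves combine, using the standard UTF-16 formula. The function names are hypothetical and are not taken from `parser.c`; returning 0 for an invalid pair is an illustrative choice only.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helpers, not the names used in parser.c. */
static bool is_high_surrogate(uint32_t unit) { return unit >= 0xD800 && unit <= 0xDBFF; }
static bool is_low_surrogate(uint32_t unit)  { return unit >= 0xDC00 && unit <= 0xDFFF; }

/* Combine the values of two consecutive \uXXXX escapes into one code point.
 * Returns 0 when the pair is invalid; error handling is left to the caller. */
static uint32_t combine_surrogates(uint32_t high, uint32_t low)
{
    if (!is_high_surrogate(high) || !is_low_surrogate(low)) {
        return 0;
    }
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00);
}
```

With this sketch, the escapes `\ud83d\ude00` combine to U+1F600.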
2025-12-03 | [ruby/json] Fix macro arguments | Nobuyoshi Nakada
`ALWAYS_INLINE()` and `NOINLINE()` are defined with one argument. https://github.com/ruby/json/commit/8fb727901e
2025-11-22 | [ruby/json] parser.c: Record escape positions while parsing | Jean Boussier
We can then pass them to the decoder to save having to parse the string again.
```
== Parsing activitypub.json (58160 bytes)
ruby 3.4.6 (2025-09-16 revision dbd83256b1) +YJIT +PRISM [arm64-darwin24]
Warming up --------------------------------------
               after     1.275k i/100ms
Calculating -------------------------------------
               after     12.774k (± 0.8%) i/s (78.29 μs/i) - 65.025k in 5.090834s

Comparison:
              before:    12314.3 i/s
               after:    12773.8 i/s - 1.04x faster

== Parsing twitter.json (567916 bytes)
ruby 3.4.6 (2025-09-16 revision dbd83256b1) +YJIT +PRISM [arm64-darwin24]
Warming up --------------------------------------
               after     143.000 i/100ms
Calculating -------------------------------------
               after     1.441k (± 0.2%) i/s (693.86 μs/i) - 7.293k in 5.060345s

Comparison:
              before:    1430.1 i/s
               after:    1441.2 i/s - 1.01x faster

== Parsing citm_catalog.json (1727030 bytes)
ruby 3.4.6 (2025-09-16 revision dbd83256b1) +YJIT +PRISM [arm64-darwin24]
Warming up --------------------------------------
               after     69.000 i/100ms
Calculating -------------------------------------
               after     695.919 (± 0.4%) i/s (1.44 ms/i) - 3.519k in 5.056691s

Comparison:
              before:    687.8 i/s
               after:    695.9 i/s - 1.01x faster
```
https://github.com/ruby/json/commit/4f4551f993
2025-11-22 | [ruby/json] Fix the parser to not accept invalid escapes | Jean Boussier
Only `"\/bfnrtu` are valid after a backslash. https://github.com/ruby/json/commit/f7f8f552ed
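A minimal sketch of the rule stated above, written as a standalone predicate. The function name is hypothetical and the actual check in `parser.c` may be structured differently.

```c
#include <stdbool.h>

/* Hypothetical helper: returns true when `c` may legally follow a
 * backslash inside a JSON string, per the list in the commit message. */
static bool json_valid_escape_char(char c)
{
    switch (c) {
      case '"': case '\\': case '/':
      case 'b': case 'f': case 'n': case 'r': case 't':
      case 'u':
        return true;
      default:
        return false;
    }
}
```

Under this rule an input such as `"\a"` or `"\x41"` is rejected, while `"\n"` and `"\u0041"` are accepted.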
2025-11-22 | [ruby/json] Use booleans in string_scan | Jean Boussier
https://github.com/ruby/json/commit/256cad5def
2025-11-21 | [ruby/json] Move RUBY_TYPED_FROZEN_SHAREABLE macro to json.h | Étienne Barrié
https://github.com/ruby/json/commit/2a4ebe8250
2025-11-21 | [ruby/json] Ractor-shareable JSON::Coder | Étienne Barrié
https://github.com/ruby/json/commit/58d60d6b76
2025-11-20 | [ruby/json] Remove unused symbols | Étienne Barrié
https://github.com/ruby/json/commit/9364d0c761
2025-11-18 | [ruby/json] parser.c: Remove unused JSON_ParserStruct.parsing_name | Jean Boussier
https://github.com/ruby/json/commit/ab5efca015
2025-11-17 | strnlen is not used now (tag: v4.0.0-preview2) | NARUSE, Yui
2025-11-04 | [ruby/json] Tentative fix for RHEL8 compiler | Jean Boussier
```
parser.c:87:77: error: missing binary operator before token "("
 #if JSON_CPU_LITTLE_ENDIAN_64BITS && defined(__has_builtin) && __has_builtin(__builtin_bswap64)
```
https://github.com/ruby/json/commit/fce1c7e84a
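The error above is typical of preprocessors that do not provide `__has_builtin`: the whole `#if` expression is parsed before `&&` can short-circuit, so `__has_builtin(...)` is a syntax error even when `defined(__has_builtin)` is false. The sketch below shows one common workaround, nesting the test inside its own `#if`. It is not necessarily the change made in the linked commit, and the `EXAMPLE_` macro and function name are hypothetical.

```c
#include <stdint.h>

/* The inner #if is only evaluated by preprocessors that define
 * __has_builtin; others skip the whole group without parsing it. */
#if defined(__has_builtin)
#  if __has_builtin(__builtin_bswap64)
#    define EXAMPLE_HAVE_BSWAP64 1
#  endif
#endif
#ifndef EXAMPLE_HAVE_BSWAP64
#  define EXAMPLE_HAVE_BSWAP64 0
#endif

/* Hypothetical wrapper: reverse the byte order of a 64-bit word. */
static inline uint64_t example_bswap64(uint64_t v)
{
#if EXAMPLE_HAVE_BSWAP64
    return __builtin_bswap64(v);
#else
    /* Portable fallback: swap bytes, then 16-bit pairs, then 32-bit halves. */
    v = ((v & UINT64_C(0x00FF00FF00FF00FF)) << 8)  | ((v >> 8)  & UINT64_C(0x00FF00FF00FF00FF));
    v = ((v & UINT64_C(0x0000FFFF0000FFFF)) << 16) | ((v >> 16) & UINT64_C(0x0000FFFF0000FFFF));
    return (v << 32) | (v >> 32);
#endif
}
```

Either branch produces the same result, so older compilers only lose the builtin, not the behavior.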
2025-11-04 | [ruby/json] Micro-optimize `rstring_cache_fetch` | Jean Boussier
Closes: https://github.com/ruby/json/pull/888
- Mark it as `inline`.
- Use `RSTRING_GETMEM`, instead of `RSTRING_LEN` and `RSTRING_PTR`.
- Use an inlinable version of `memcmp`.
```
== Parsing activitypub.json (58160 bytes)
ruby 3.4.6 (2025-09-16 revision dbd83256b1) +YJIT +PRISM [arm64-darwin24]
Comparison:
              before:    11766.6 i/s
               after:    12272.1 i/s - 1.04x faster

== Parsing twitter.json (567916 bytes)
ruby 3.4.6 (2025-09-16 revision dbd83256b1) +YJIT +PRISM [arm64-darwin24]
Comparison:
              before:    1333.2 i/s
               after:    1422.0 i/s - 1.07x faster

== Parsing citm_catalog.json (1727030 bytes)
ruby 3.4.6 (2025-09-16 revision dbd83256b1) +YJIT +PRISM [arm64-darwin24]
Comparison:
              before:    656.3 i/s
               after:    673.1 i/s - 1.03x faster

== Parsing float parsing (2251051 bytes)
ruby 3.4.6 (2025-09-16 revision dbd83256b1) +YJIT +PRISM [arm64-darwin24]
Comparison:
              before:    276.8 i/s
               after:    276.4 i/s - same-ish: difference falls within error
```
https://github.com/ruby/json/commit/a67d1a1af4
Co-Authored-By: Scott Myron <samyron@gmail.com>
2025-11-04 | [ruby/json] Extract `JSON_CPU_LITTLE_ENDIAN_64BITS` definition | Jean Boussier
Only apply these definitions on 64-bit archs, as it's unclear whether they have performance benefits or compatibility issues on 32-bit archs. https://github.com/ruby/json/commit/ddad00b746
2025-11-03 | [ruby/json] parser.c: Always inline `json_eat_whitespace` | Jean Boussier
```
== Parsing activitypub.json (58160 bytes)
ruby 3.4.6 (2025-09-16 revision dbd83256b1) +YJIT +PRISM [arm64-darwin24]
Warming up --------------------------------------
               after     1.174k i/100ms
Calculating -------------------------------------
               after     11.756k (± 0.9%) i/s (85.06 μs/i) - 59.874k in 5.093438s

Comparison:
              before:    11078.6 i/s
               after:    11756.1 i/s - 1.06x faster

== Parsing twitter.json (567916 bytes)
ruby 3.4.6 (2025-09-16 revision dbd83256b1) +YJIT +PRISM [arm64-darwin24]
Warming up --------------------------------------
               after     130.000 i/100ms
Calculating -------------------------------------
               after     1.340k (± 0.3%) i/s (746.06 μs/i) - 6.760k in 5.043432s

Comparison:
              before:    1191.1 i/s
               after:    1340.4 i/s - 1.13x faster

== Parsing citm_catalog.json (1727030 bytes)
ruby 3.4.6 (2025-09-16 revision dbd83256b1) +YJIT +PRISM [arm64-darwin24]
Warming up --------------------------------------
               after     68.000 i/100ms
Calculating -------------------------------------
               after     689.451 (± 1.6%) i/s (1.45 ms/i) - 3.468k in 5.031470s

Comparison:
              before:    630.3 i/s
               after:    689.5 i/s - 1.09x faster

== Parsing float parsing (2251051 bytes)
ruby 3.4.6 (2025-09-16 revision dbd83256b1) +YJIT +PRISM [arm64-darwin24]
Warming up --------------------------------------
               after     27.000 i/100ms
Calculating -------------------------------------
               after     248.265 (± 0.8%) i/s (4.03 ms/i) - 1.242k in 5.003185s

Comparison:
              before:    232.7 i/s
               after:    248.3 i/s - 1.07x faster
```
https://github.com/ruby/json/commit/043880f6ab
Co-Authored-By: Scott Myron <samyron@gmail.com>
2025-11-03 | [ruby/json] parser.c: use `rb_str_to_interned_str` over `rb_funcall` | Jean Boussier
https://github.com/ruby/json/commit/21284ea649
2025-11-03 | [ruby/json] parser.c: Extract `json_string_cacheable_p` | Jean Boussier
We can share that logic between the two functions. https://github.com/ruby/json/commit/ac580458e0
2025-11-03 | [ruby/json] parser.c: simplify sorted insert loop in rstring_cache_fetch | Jean Boussier
https://github.com/ruby/json/commit/31453b8e95 Co-Authored-By: Scott Myron <samyron@gmail.com>
2025-11-03 | [ruby/json] parser.c: Skip checking for escape sequences in `rstring_cache_fetch` | Jean Boussier
The caller already knows whether the string contains escape sequences, so this check is redundant. Also stop calling `rstring_cache_fetch` from `json_string_unescape`, as we know it won't match anyway.
```
== Parsing twitter.json (567916 bytes)
ruby 3.4.6 (2025-09-16 revision dbd83256b1) +YJIT +PRISM [arm64-darwin24]
Warming up --------------------------------------
               after     122.000 i/100ms
Calculating -------------------------------------
               after     1.226k (± 0.3%) i/s (815.85 μs/i) - 6.222k in 5.076282s

Comparison:
              before:    1206.2 i/s
               after:    1225.7 i/s - 1.02x faster
```
https://github.com/ruby/json/commit/b8cdf3282d
Co-Authored-By: Scott Myron <samyron@gmail.com>
2025-11-03 | [ruby/json] Centralize macro definitions | Jean Boussier
https://github.com/ruby/json/commit/1576ea7d47
2025-11-01 | [ruby/json] Enable JSON_DEBUG for parser/extconf.rb | Jean Boussier
https://github.com/ruby/json/commit/82b030f294
2025-11-01 | [ruby/json] parser.c: Appease GCC warning | Jean Boussier
```
../../../../../../ext/json/ext/parser/parser.c:1142:40: warning: suggest parentheses around ‘&&’ within ‘||’ [-Wparentheses]
 1142 |     if (RB_UNLIKELY(first_digit == '0' && mantissa_digits > 1 || negative && mantissa_digits == 0)) {
      |                     ~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~
```
https://github.com/ruby/json/commit/ded62a5122
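A small sketch of the usual way to silence this warning: add parentheses that spell out the grouping the compiler already applies, since `&&` binds tighter than `||`. The function name is hypothetical, and the parameter names simply mirror the variables in the warning. Reading the condition as "leading zero followed by more digits, or a minus sign with no digits" is an interpretation, not something the commit states.

```c
#include <stdbool.h>

/* Hypothetical helper wrapping the condition from the warning. */
static bool number_is_malformed(char first_digit, int mantissa_digits, bool negative)
{
    /* The parentheses do not change behavior; they only make the
     * existing precedence explicit so -Wparentheses stays quiet. */
    return (first_digit == '0' && mantissa_digits > 1)
        || (negative && mantissa_digits == 0);
}
```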
2025-11-01 | [ruby/json] parser.c: Use SWAR to skip consecutive spaces | Jean Boussier
Closes: https://github.com/ruby/json/pull/881
If we encounter a newline, the document is likely pretty-printed, so the newline is probably followed by multiple spaces. In such cases we can use SWAR to count up to eight consecutive spaces at once.
```
== Parsing activitypub.json (58160 bytes)
ruby 3.4.6 (2025-09-16 revision dbd83256b1) +YJIT +PRISM [arm64-darwin24]
Warming up --------------------------------------
               after     1.118k i/100ms
Calculating -------------------------------------
               after     11.223k (± 0.7%) i/s (89.10 μs/i) - 57.018k in 5.080522s

Comparison:
              before:    10834.4 i/s
               after:    11223.4 i/s - 1.04x faster

== Parsing twitter.json (567916 bytes)
ruby 3.4.6 (2025-09-16 revision dbd83256b1) +YJIT +PRISM [arm64-darwin24]
Warming up --------------------------------------
               after     118.000 i/100ms
Calculating -------------------------------------
               after     1.188k (± 1.0%) i/s (841.62 μs/i) - 6.018k in 5.065355s

Comparison:
              before:    1094.8 i/s
               after:    1188.2 i/s - 1.09x faster

== Parsing citm_catalog.json (1727030 bytes)
ruby 3.4.6 (2025-09-16 revision dbd83256b1) +YJIT +PRISM [arm64-darwin24]
Warming up --------------------------------------
               after     58.000 i/100ms
Calculating -------------------------------------
               after     570.506 (± 3.7%) i/s (1.75 ms/i) - 2.900k in 5.091529s

Comparison:
              before:    419.6 i/s
               after:    570.5 i/s - 1.36x faster

== Parsing float parsing (2251051 bytes)
ruby 3.4.6 (2025-09-16 revision dbd83256b1) +YJIT +PRISM [arm64-darwin24]
Warming up --------------------------------------
               after     22.000 i/100ms
Calculating -------------------------------------
               after     212.010 (± 1.9%) i/s (4.72 ms/i) - 1.078k in 5.086885s

Comparison:
              before:    189.4 i/s
               after:    212.0 i/s - 1.12x faster
```
https://github.com/ruby/json/commit/b3fd7b26be
Co-Authored-By: Scott Myron <samyron@gmail.com>
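To illustrate the idea, here is a simplified SWAR (SIMD within a register) sketch that tests eight bytes for spaces with a single 64-bit comparison. It is deliberately more basic than what the commit describes: it only skips whole eight-space words and finishes the remainder byte by byte, which keeps it independent of byte order. The function name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: count the spaces at the start of [p, end). */
static size_t count_leading_spaces(const char *p, const char *end)
{
    const char *start = p;
    const uint64_t eight_spaces = UINT64_C(0x2020202020202020);

    /* Word-at-a-time loop: one comparison covers eight bytes. */
    while (end - p >= 8) {
        uint64_t chunk;
        memcpy(&chunk, p, sizeof(chunk)); /* safe unaligned load */
        if (chunk != eight_spaces) break;
        p += 8;
    }

    /* Scalar tail: fewer than eight bytes left, or a mixed word. */
    while (p < end && *p == ' ') p++;

    return (size_t)(p - start);
}
```

On a deeply indented, pretty-printed document most indentation runs are consumed by the word loop, which is where the saving would come from.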
2025-11-01 | [ruby/json] Use SWAR for parsing integers on little endian machines | Jean Boussier
Closes: https://github.com/ruby/json/pull/878
```
== Parsing float parsing (2251051 bytes)
ruby 3.4.6 (2025-09-16 revision dbd83256b1) +YJIT +PRISM [arm64-darwin24]
Warming up --------------------------------------
               after     23.000 i/100ms
Calculating -------------------------------------
               after     214.382 (± 0.5%) i/s (4.66 ms/i) - 1.081k in 5.042555s

Comparison:
              before:    189.5 i/s
               after:    214.4 i/s - 1.13x faster
```
https://github.com/ruby/json/commit/6348ff0891
Co-Authored-By: Scott Myron <samyron@gmail.com>
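For readers unfamiliar with the technique, the sketch below shows a widely circulated SWAR routine that converts eight ASCII digits to an integer with three multiplications. It may differ from the code in the linked commit. The function names are hypothetical, and the conversion assumes a little-endian machine, matching the commit title.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: true when the 8 bytes at `p` are all '0'..'9'.
 * Each byte must have high nibble 3, and still have it after adding 6. */
static inline bool is_eight_digits(const char *p)
{
    uint64_t val;
    memcpy(&val, p, sizeof(val));
    return ((val & UINT64_C(0xF0F0F0F0F0F0F0F0)) |
            (((val + UINT64_C(0x0606060606060606)) & UINT64_C(0xF0F0F0F0F0F0F0F0)) >> 4))
           == UINT64_C(0x3333333333333333);
}

/* Hypothetical helper: parse 8 ASCII digits at `p`. Little-endian only.
 * Each step merges adjacent lanes: digits -> pairs -> quads -> full value. */
static inline uint64_t parse_eight_digits(const char *p)
{
    uint64_t val;
    memcpy(&val, p, sizeof(val));
    val = (val & UINT64_C(0x0F0F0F0F0F0F0F0F)) * UINT64_C(2561) >> 8;           /* 10 * 2^8 + 1 */
    val = (val & UINT64_C(0x00FF00FF00FF00FF)) * UINT64_C(6553601) >> 16;       /* 100 * 2^16 + 1 */
    return (val & UINT64_C(0x0000FFFF0000FFFF)) * UINT64_C(42949672960001) >> 32; /* 10000 * 2^32 + 1 */
}
```

A caller would check `is_eight_digits` first and fall back to a byte-by-byte loop for shorter or mixed input.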
2025-11-01 | [ruby/json] parser.c: Introduce `rest()` helper | Jean Boussier
https://github.com/ruby/json/commit/11f4e7b7be
2025-11-01 | [ruby/json] parser.c: Introduce `peek()` and `eos()` helpers | Jean Boussier
Encapsulate pointer arithmetic to reduce possibility of mistakes. https://github.com/ruby/json/commit/8b39407225
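A rough sketch of what such helpers can look like. Only the names `peek()`, `eos()` and `rest()` come from the commit titles; the struct, its fields, and the exact signatures are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical scanner state; the real parser state has more fields. */
typedef struct {
    const char *cursor; /* next byte to read */
    const char *end;    /* one past the last byte */
} scan_state;

/* True once the cursor has reached the end of the input. */
static inline bool eos(const scan_state *state)
{
    return state->cursor >= state->end;
}

/* Number of bytes left to read. */
static inline size_t rest(const scan_state *state)
{
    return eos(state) ? 0 : (size_t)(state->end - state->cursor);
}

/* Current byte, or '\0' at end of input, so callers never read past `end`. */
static inline char peek(const scan_state *state)
{
    return eos(state) ? '\0' : *state->cursor;
}
```

Keeping the bounds check in one place means call sites compare characters rather than pointers, which is the kind of mistake reduction the commit message refers to.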
2025-11-01 | [ruby/json] parser.c: Extract json_parse_digits | Jean Boussier
https://github.com/ruby/json/commit/1bf405ecc6
2025-11-01 | [ruby/json] parser.c: Extract `json_parse_number` | Jean Boussier
https://github.com/ruby/json/commit/2681b23b87
2025-10-30 | [ruby/json] Add ryu float parser. | Josef Šimánek
https://github.com/ruby/json/commit/9c4db31908 Co-Authored-By: Jean Boussier <jean.boussier@gmail.com>
2025-10-27 | [ruby/json] parser.c: Fix indentation in json_decode_integer | Jean Boussier
https://github.com/ruby/json/commit/f228b30635
2025-10-27 | [ruby/json] Use locale independent version of `isalpha` | Jean Boussier
https://github.com/ruby/json/commit/1ba1e9bef9
2025-09-19 | [ruby/json] parser: Reject invalid surrogate pairs more consistently. | Jean Boussier
https://github.com/ruby/json/commit/5855f4f603
2025-08-27 | JSON.generate: warn or raise on duplicated key | Jean Boussier
Because string and symbol keys are serialized the same way, it has always been possible to generate documents with duplicated keys:
```ruby
>> puts JSON.generate({ foo: 1, "foo" => 2 })
{"foo":1,"foo":2}
```
This is pretty much always a mistake and can cause various issues, because it's not guaranteed how various JSON parsers will handle this. Until now I didn't think it was possible to catch such cases without tanking performance, which is why I only made the parser stricter. But I finally found a way to check for duplicated keys cheaply enough.
2025-08-27 | [ruby/json] parser.c: Remove useless dereference | Jean Boussier
https://github.com/ruby/json/commit/2d63648c0a
2025-08-18 | Fix typos | Douglas Eichelberger
2025-07-28 | [ruby/json] Fix duplicated key warning location | Jean Boussier
Followup: https://github.com/ruby/json/pull/818 Now the warning should point at the `JSON.parse` caller, and not inside the json gem itself. https://github.com/ruby/json/commit/cd51557387
2025-07-28 | [ruby/json] Improve duplicate key warning and errors to include the key name | Jean Boussier
Followup: https://github.com/ruby/json/pull/818 https://github.com/ruby/json/commit/e3de4cc59c
2025-07-07 | [ruby/json] Improve consistency of code style | Jean Boussier
https://github.com/ruby/json/commit/a497c71960
2025-07-01 | [ruby/json] Remove trailing spaces [ci skip] | Nobuyoshi Nakada
https://github.com/ruby/json/commit/68ee9cf188
2025-06-30 | Optimize 'json_parse_string' using SIMD. | Scott Myron
2025-06-24 | [ruby/json] Deprecate duplicate keys in object | Jean Boussier
There are few legitimate use cases for duplicate keys, and they can in some cases be exploited. Rather than always silently accepting them, we should emit a warning, and in the future require explicitly allowing them. https://github.com/ruby/json/commit/06f00a42e8
2025-05-13 | [ruby/json] Further improve parsing errors | Jean Boussier
Report EOF when applicable instead of an empty fragment. Also stop fragment extraction on first whitespace. https://github.com/ruby/json/commit/cc1daba860 Notes: Merged: https://github.com/ruby/ruby/pull/13310
2025-05-13 | [ruby/json] Add missing single quotes in error messages | Jean Boussier
https://github.com/ruby/json/commit/f3dde3cb2f Notes: Merged: https://github.com/ruby/ruby/pull/13310
2025-05-13 | [ruby/json] parser.c: include line and column in error messages | Jean Boussier
https://github.com/ruby/json/commit/30e35b9ba5 Notes: Merged: https://github.com/ruby/ruby/pull/13310
2025-05-13 | [ruby/json] parser.c: refactor `raise_parse_error` to have document start | Jean Boussier
https://github.com/ruby/json/commit/832b5b1a4c Notes: Merged: https://github.com/ruby/ruby/pull/13310
2025-03-28 | [ruby/json] Move `create_additions` logic in Ruby. | Jean Boussier
By leveraging the `on_load` callback we can move all this logic out of the parser, which means we no longer have to duplicate that logic in both parsers, and that we'll later be able to extract it entirely from the gem. https://github.com/ruby/json/commit/f411ddf1ce Notes: Merged: https://github.com/ruby/ruby/pull/13004
2025-03-28 | [ruby/json] JSON.load invoke the proc callback directly from the parser. | Jean Boussier
And substitute the return value like `Marshal.load` does, which I can only assume was the intent. This also opens the door to re-implementing all the `create_addition` logic in `json/common.rb`. https://github.com/ruby/json/commit/73d2137fd3 Notes: Merged: https://github.com/ruby/ruby/pull/13004
2025-03-28 | [ruby/json] Remove `Class#json_creatable?` monkey patch. | Jean Boussier
https://github.com/ruby/json/commit/1ca7efed1f Notes: Merged: https://github.com/ruby/ruby/pull/13004
2025-03-13 | [ruby/json] Fix potential out of bound read in `json_string_unescape`. | Jean Boussier
https://github.com/ruby/json/commit/cf242d89a0