| Age | Commit message (Collapse) | Author |
|
This optimization is based on a few assumptions:
- Most strings are ASCII only.
- Most strings had their coderange scanned already.
If the above is true, then by checking the string coderange, we can
use a much more streamlined function to encode ASCII strings.
Before:
```
== Encoding twitter.json (466906 bytes)
ruby 3.4.0preview2 (2024-10-07 master 32c733f57b) +YJIT +PRISM [arm64-darwin23]
Warming up --------------------------------------
json 140.000 i/100ms
oj 230.000 i/100ms
rapidjson 108.000 i/100ms
Calculating -------------------------------------
json 1.464k (± 1.4%) i/s (682.83 μs/i) - 7.420k in 5.067573s
oj 2.338k (± 1.5%) i/s (427.64 μs/i) - 11.730k in 5.017336s
rapidjson 1.075k (± 1.6%) i/s (930.40 μs/i) - 5.400k in 5.025469s
Comparison:
json: 1464.5 i/s
oj: 2338.4 i/s - 1.60x faster
rapidjson: 1074.8 i/s - 1.36x slower
```
After:
```
== Encoding twitter.json (466906 bytes)
ruby 3.4.0preview2 (2024-10-07 master 32c733f57b) +YJIT +PRISM [arm64-darwin23]
Warming up --------------------------------------
json 189.000 i/100ms
oj 228.000 i/100ms
rapidjson 108.000 i/100ms
Calculating -------------------------------------
json 1.903k (± 1.2%) i/s (525.55 μs/i) - 9.639k in 5.066521s
oj 2.306k (± 1.3%) i/s (433.71 μs/i) - 11.628k in 5.044096s
rapidjson 1.069k (± 2.4%) i/s (935.38 μs/i) - 5.400k in 5.053794s
Comparison:
json: 1902.8 i/s
oj: 2305.7 i/s - 1.21x faster
rapidjson: 1069.1 i/s - 1.78x slower
```
|
|
All these macros are available on Ruby 2.3+
https://github.com/ruby/json/commit/227885f460
|
|
All of these are for rubies older than 2.3.
https://github.com/ruby/json/commit/811297f86a
|
|
If we assume that most of the time the `opts` hash is small
it's faster to go over the provided keys with a `case` than
to test all possible keys one by one.
Before:
```
== Encoding small nested array (121 bytes)
ruby 3.4.0preview2 (2024-10-07 master 32c733f57b) +YJIT +PRISM [arm64-darwin23]
Warming up --------------------------------------
json 156.832k i/100ms
oj 209.769k i/100ms
rapidjson 162.922k i/100ms
Calculating -------------------------------------
json 1.599M (± 2.5%) i/s (625.34 ns/i) - 7.998M in 5.005110s
oj 2.137M (± 1.5%) i/s (467.99 ns/i) - 10.698M in 5.007806s
rapidjson 1.677M (± 3.5%) i/s (596.31 ns/i) - 8.472M in 5.059515s
Comparison:
json: 1599141.2 i/s
oj: 2136785.3 i/s - 1.34x faster
rapidjson: 1676977.2 i/s - same-ish: difference falls within error
== Encoding small hash (65 bytes)
ruby 3.4.0preview2 (2024-10-07 master 32c733f57b) +YJIT +PRISM [arm64-darwin23]
Warming up --------------------------------------
json 216.464k i/100ms
oj 661.328k i/100ms
rapidjson 324.434k i/100ms
Calculating -------------------------------------
json 2.301M (± 1.7%) i/s (434.57 ns/i) - 11.689M in 5.081278s
oj 7.244M (± 1.2%) i/s (138.05 ns/i) - 36.373M in 5.021985s
rapidjson 3.323M (± 2.9%) i/s (300.96 ns/i) - 16.871M in 5.081696s
Comparison:
json: 2301142.2 i/s
oj: 7243770.3 i/s - 3.15x faster
rapidjson: 3322673.0 i/s - 1.44x faster
```
After:
```
== Encoding small nested array (121 bytes)
ruby 3.4.0preview2 (2024-10-07 master 32c733f57b) +YJIT +PRISM [arm64-darwin23]
Warming up --------------------------------------
json 168.087k i/100ms
oj 208.872k i/100ms
rapidjson 149.909k i/100ms
Calculating -------------------------------------
json 1.761M (± 1.1%) i/s (567.90 ns/i) - 8.909M in 5.059794s
oj 2.144M (± 0.9%) i/s (466.37 ns/i) - 10.861M in 5.065903s
rapidjson 1.692M (± 1.7%) i/s (591.04 ns/i) - 8.545M in 5.051808s
Comparison:
json: 1760868.2 i/s
oj: 2144205.9 i/s - 1.22x faster
rapidjson: 1691941.1 i/s - 1.04x slower
== Encoding small hash (65 bytes)
ruby 3.4.0preview2 (2024-10-07 master 32c733f57b) +YJIT +PRISM [arm64-darwin23]
Warming up --------------------------------------
json 242.957k i/100ms
oj 675.217k i/100ms
rapidjson 355.040k i/100ms
Calculating -------------------------------------
json 2.569M (± 1.5%) i/s (389.22 ns/i) - 12.877M in 5.013095s
oj 7.128M (± 2.3%) i/s (140.30 ns/i) - 35.787M in 5.023594s
rapidjson 3.656M (± 3.1%) i/s (273.50 ns/i) - 18.462M in 5.054558s
Comparison:
json: 2569217.5 i/s
oj: 7127705.6 i/s - 2.77x faster
rapidjson: 3656285.0 i/s - 1.42x faster
```
|
|
Most of these classes and modules don't need to be global variables
https://github.com/ruby/json/commit/b783445ec9
|
|
This helps very marginally with allocation speed.
https://github.com/ruby/json/commit/25db79dfaa
|
|
`JSON.dump` looks terrible on micro-benchmarks because the way it
handles arguments is quite allocation heavy compared to the actual
JSON generation work.
Profiling the `small hash` benchmarked show 14% of time spent in `Array#compact`
and `34%` time spent in `JSON::Ext::GeneratorState.new`. Only `41%` in the
actual `generate` function.
By micro-optimizing `JSON.dump`, it can look much better:
Before:
```
== Encoding small nested array (121 bytes)
ruby 3.4.0preview2 (2024-10-07 master 32c733f57b) +YJIT +PRISM [arm64-darwin23]
Warming up --------------------------------------
json 91.687k i/100ms
oj 205.309k i/100ms
rapidjson 161.648k i/100ms
Calculating -------------------------------------
json 941.965k (± 1.4%) i/s (1.06 μs/i) - 4.768M in 5.062573s
oj 2.138M (± 1.2%) i/s (467.82 ns/i) - 10.881M in 5.091254s
rapidjson 1.678M (± 1.9%) i/s (596.04 ns/i) - 8.406M in 5.011931s
Comparison:
json: 941964.8 i/s
oj: 2137586.5 i/s - 2.27x faster
rapidjson: 1677737.1 i/s - 1.78x faster
== Encoding small hash (65 bytes)
ruby 3.4.0preview2 (2024-10-07 master 32c733f57b) +YJIT +PRISM [arm64-darwin23]
Warming up --------------------------------------
json 141.737k i/100ms
oj 676.871k i/100ms
rapidjson 373.266k i/100ms
Calculating -------------------------------------
json 1.491M (± 1.0%) i/s (670.78 ns/i) - 7.512M in 5.039463s
oj 7.226M (± 1.4%) i/s (138.39 ns/i) - 36.551M in 5.059475s
rapidjson 3.729M (± 2.2%) i/s (268.15 ns/i) - 18.663M in 5.007182s
Comparison:
json: 1490798.2 i/s
oj: 7225766.2 i/s - 4.85x faster
rapidjson: 3729192.2 i/s - 2.50x faster
```
After:
```
== Encoding small nested array (121 bytes)
ruby 3.4.0preview2 (2024-10-07 master 32c733f57b) +YJIT +PRISM [arm64-darwin23]
Warming up --------------------------------------
json 156.832k i/100ms
oj 209.769k i/100ms
rapidjson 162.922k i/100ms
Calculating -------------------------------------
json 1.599M (± 2.5%) i/s (625.34 ns/i) - 7.998M in 5.005110s
oj 2.137M (± 1.5%) i/s (467.99 ns/i) - 10.698M in 5.007806s
rapidjson 1.677M (± 3.5%) i/s (596.31 ns/i) - 8.472M in 5.059515s
Comparison:
json: 1599141.2 i/s
oj: 2136785.3 i/s - 1.34x faster
rapidjson: 1676977.2 i/s - same-ish: difference falls within error
== Encoding small hash (65 bytes)
ruby 3.4.0preview2 (2024-10-07 master 32c733f57b) +YJIT +PRISM [arm64-darwin23]
Warming up --------------------------------------
json 216.464k i/100ms
oj 661.328k i/100ms
rapidjson 324.434k i/100ms
Calculating -------------------------------------
json 2.301M (± 1.7%) i/s (434.57 ns/i) - 11.689M in 5.081278s
oj 7.244M (± 1.2%) i/s (138.05 ns/i) - 36.373M in 5.021985s
rapidjson 3.323M (± 2.9%) i/s (300.96 ns/i) - 16.871M in 5.081696s
Comparison:
json: 2301142.2 i/s
oj: 7243770.3 i/s - 3.15x faster
rapidjson: 3322673.0 i/s - 1.44x faster
```
Now profiles of the `small hash` benchmark show 44% in `generate` and
`45%` in `GeneratorState` allocation.
|
|
This speeds up `JSON.generate` by about 12% in a benchmark.
https://github.com/ruby/json/commit/4329e30826
|
|
This speeds up `JSON.generate` by about 4% in a benchmark.
https://github.com/ruby/json/commit/6471710cfc
|
|
Also, remove static functions that are no longer used.
This speeds up `JSON.generate` by about 5% in a benchmark.
https://github.com/ruby/json/commit/4c984b2017
|
|
This speeds up `JSON.generate` by about 4% in a benchmark
https://github.com/ruby/json/commit/ed47a10e4f
|
|
The purpose of this change is to exploit `fbuffer_append_char` that is
faster than `fbuffer_append`.
`array_delim` was a buffer that concatenated a single comma with
`array_nl`. However, in the typical use case (`JSON.generate(data)`),
`array_nl` is empty. This means that `array_delim` was a
single-character buffer in many cases.
`fbuffer_append(buffer, array_delim)` used `memcpy` to copy one byte,
which was not so efficient.
Rather, this change uses `fbuffer_append_char(buffer, ',')` and then
`fbuffer_append(buffer, array_nl)` only when `array_nl` is not NULL.
This speeds up `JSON.generate` by about 9% in a benchmark.
https://github.com/ruby/json/commit/445de6e459
|
|
... instead of `generate_json`.
Since the object key is already confirmed to be a string, using a
generic dispatch function brings an unnecessary overhead.
This speeds up `JSON.generate` by about 3% in a benchmark.
https://github.com/ruby/json/commit/e125072130
|
|
Dispatching based on Ruby's VALUE structure is more efficient than
simply cascaded "if ... else if ..." checks.
This speeds up `JSON.generate` by about 5% in a benchmark.
https://github.com/ruby/json/commit/4f9180debb
|
|
It is safe to use `RARRAY_AREF` here because no Ruby code is executed
between `RARRAY_LEN` and `RARRAY_AREF`.
This speeds up `JSON.generate` by about 4% in a benchmark.
https://github.com/ruby/json/commit/c5d80f9fd4
|
|
https://github.com/ruby/json/commit/81092639e8
|
|
https://github.com/ruby/json/commit/0f9564104f
|
|
signs
https://github.com/ruby/json/commit/c372dc9268
|
|
|
|
https://github.com/ruby/json/commit/53409bcc74
|
|
```
generator.c:69:27: warning: comparison of integers of different signs: 'short' and 'unsigned long' [-Wsign-compare]
for (i = 1; i < ch_len; i++) {
```
https://github.com/ruby/json/commit/ff8edcd47c
|
|
https://github.com/ruby/json/commit/62301c0bc3
|
|
I, Luke T. Shumaker, am the sole author of the added code.
I did not reference CVTUTF when writing it. I did reference the
Unicode standard (15.0.0), the Wikipedia article on UTF-8, and the
Wikipedia article on UTF-16. When I saw some tests fail, I did
reference the old deleted code (but a JSON-specific part, inherently
not as based on CVTUTF) to determine that script_safe should also
escape U+2028 and U+2029.
I targeted simplicity and clarity when writing the code--it can likely
be optimized. In my mind, the obvious next optimization is to have it
combine contiguous non-escaped characters into just one call to
fbuffer_append(), instead of calling fbuffer_append() for each
character.
Regarding the use of the "modern" types `uint32_t`, `uint16_t`, and
`bool`:
- ruby.h is guaranteed to give us uint32_t and uint16_t.
- Since Ruby 3.0.0, ruby.h is guaranteed to give us bool... but we
support down to Ruby 2.3. But, ruby.h is guaranteed to give us
HAVE_STDBOOL_H for the C99 stdbool.h; so use that to include
stdbool.h if we can, and if not then fall back to a copy of the
same bool definition that Ruby 3.0.5 uses with C89.
https://github.com/ruby/json/commit/c96351f874
|
|
I did this based on manual inspection, comparing the code to my re-created
history of CVTUTF at https://git.lukeshu.com/2git/cvtutf/ (created by the
scripts at https://git.lukeshu.com/2git/cvtutf-make/)
https://github.com/ruby/json/commit/0819553144
|
|
https://github.com/ruby/json/commit/1edfeb8f10
|
|
Rather than checking the class we can check the type.
This is very subtly different for String subclasses, but I think it's
OK.
We also save on checking the type again in the fast path.
https://github.com/flori/json/commit/772a0201ab
Notes:
Merged: https://github.com/ruby/ruby/pull/11775
|
|
On my `JSON.dump` benchmark it shows up as 6% of runtime, compared
to 40% for `convert_UTF8_to_JSON`.
Since the vast majority of the time this function is called we
still have some buffer capacity, we might as well check that
first and skip the expensive loop etc.
With this change my profiler now report this function as 0.7%,
so almost 10x better.
https://github.com/flori/json/commit/a7206bf2db
Notes:
Merged: https://github.com/ruby/ruby/pull/11775
|
|
Given that we called `rb_enc_str_asciionly_p`, if the string encoding
isn't valid UTF-8, we can't know it very cheaply by checking the
encoding and coderange that was just computed by Ruby, rather than
to do it ourselves.
Also Ruby might have already computed that earlier.
https://github.com/flori/json/commit/4b04c469d5
Notes:
Merged: https://github.com/ruby/ruby/pull/11775
|
|
`json` requires Ruby 2.3, so `HAVE_RUBY_ENCODING_H` and `HAVE_RB_ENC_RAISE`
are always true.
https://github.com/flori/json/commit/5c8dc6b70a
|
|
* Using the benchmark from https://github.com/flori/json/pull/580
$ ruby benchmarks/bench.rb dump pure
JSON::Pure::Generator
truffleruby 24.0.0, like ruby 3.2.2, Oracle GraalVM Native [x86_64-linux]
Warming up --------------------------------------
JSON.dump(obj) 116.000 i/100ms
JSON.dump(obj) 235.000 i/100ms
JSON.dump(obj) 317.000 i/100ms
JSON.dump(obj) 372.000 i/100ms
JSON.dump(obj) 374.000 i/100ms
Calculating -------------------------------------
JSON.dump(obj) 3.735k (± 0.9%) i/s (267.76 μs/i) - 18.700k in 5.007526s
JSON.dump(obj) 3.738k (± 0.7%) i/s (267.49 μs/i) - 18.700k in 5.002252s
JSON.dump(obj) 3.743k (± 0.7%) i/s (267.18 μs/i) - 19.074k in 5.096375s
JSON.dump(obj) 3.747k (± 0.5%) i/s (266.87 μs/i) - 19.074k in 5.090463s
JSON.dump(obj) 3.746k (± 0.5%) i/s (266.96 μs/i) - 19.074k in 5.092069s
$ ruby benchmarks/bench.rb dump ext
JSON::Ext::Generator
truffleruby 24.0.0, like ruby 3.2.2, Oracle GraalVM Native [x86_64-linux]
Warming up --------------------------------------
JSON.dump(obj) 19.000 i/100ms
JSON.dump(obj) 18.000 i/100ms
JSON.dump(obj) 18.000 i/100ms
JSON.dump(obj) 18.000 i/100ms
JSON.dump(obj) 21.000 i/100ms
Calculating -------------------------------------
JSON.dump(obj) 221.260 (±10.8%) i/s (4.52 ms/i) - 1.092k in 5.004381s
JSON.dump(obj) 221.983 (± 8.1%) i/s (4.50 ms/i) - 1.113k in 5.055574s
JSON.dump(obj) 221.446 (± 8.6%) i/s (4.52 ms/i) - 1.113k in 5.073167s
JSON.dump(obj) 226.452 (± 7.9%) i/s (4.42 ms/i) - 1.134k in 5.048568s
JSON.dump(obj) 227.795 (± 8.3%) i/s (4.39 ms/i) - 1.134k in 5.025187s
https://github.com/flori/json/commit/8256455cdc
|
|
The json gem now requires Ruby 2.3, so there is no point keeping
compatibility code for older releases that don't have the
TypedData API.
https://github.com/flori/json/commit/45c86e153f
|
|
|
|
https://github.com/flori/json/commit/036944acc6
|
|
https://github.com/flori/json/commit/fff285968d
|
|
https://github.com/flori/json/commit/b507f9e404
|
|
If an exception is raised the FBuffer is leaked.
For example, the following script leaks memory:
o = Object.new
def o.to_json(a) = raise
10.times do
100_000.times do
begin
JSON(o)
rescue
end
end
puts `ps -o rss= -p #{$$}`
end
Before:
31824
35696
40240
44304
47424
50944
54000
58384
62416
65296
After:
24416
24640
24640
24736
24736
24736
24736
24736
24736
24736
https://github.com/flori/json/commit/44df509dc2
|
|
https://github.com/flori/json/commit/202ffe2335
|
|
This avoids pinning an id to the symbol used if a dynamic symbol is
passed in as a hash key.
rb_sym2str is available in Ruby 2.2+ and json depends on >= 2.3.
https://github.com/flori/json/commit/5cbafb8dbe
|
|
https://github.com/flori/json/commit/a1af7a308c
|
|
|
|
https://github.com/flori/json/commit/3ef57b5b39
|
|
https://github.com/flori/json/commit/11b31210ac
|
|
(https://github.com/flori/json/pull/557)
* RDoc for additions
* Update lib/json/add/time.rb
Co-authored-by: Hiroshi SHIBATA <hsbt@ruby-lang.org>
---------
https://github.com/flori/json/commit/3f2efd60f7
Co-authored-by: Hiroshi SHIBATA <hsbt@ruby-lang.org>
|
|
https://github.com/flori/json/commit/41c2712a3b
|
|
https://github.com/flori/json/commit/936f280f9f
|
|
Fix: https://github.com/flori/json/issues/553
We can never add keyword arguments to `dump` otherwise
existing code using unenclosed hash will break.
https://github.com/flori/json/commit/8e0076a3f2
|
|
> https://github.com/flori/json/pull/525
> Rename escape_slash in script_safe and also escape E+2028 and E+2029
Co-authored-by: Jean Boussier <jean.boussier@gmail.com>
> https://github.com/flori/json/pull/454
> Remove unnecessary initialization of create_id in JSON.parse()
Co-authored-by: Watson <watson1978@gmail.com>
|
|
It is rather common to directly interpolate JSON string inside
<script> tags in HTML as to provide configuration or parameters to a
script.
However this may lead to XSS vulnerabilities, to prevent that 3
characters need to be escaped:
- `/` (forward slash)
- `U+2028` (LINE SEPARATOR)
- `U+2029` (PARAGRAPH SEPARATOR)
The forward slash need to be escaped to prevent closing the script
tag early, and the other two are valid JSON but invalid Javascript
and can be used to break JS parsing.
Given that the intent of escaping forward slash is the same than escaping
U+2028 and U+2029, I chos to rename and repurpose the existing `escape_slash`
option.
|
|
They are allocated with ruby_xmalloc, they should be freed with
ruby_xfree.
|
|
https://github.com/flori/json/commit/ca546128f2
|