<feed xmlns='http://www.w3.org/2005/Atom'>
<title>ruby.git/benchmark/int_to_s.yml, branch master</title>
<subtitle>The Ruby Programming Language</subtitle>
<link rel='alternate' type='text/html' href='https://git.ruby-lang.org/ruby.git/'/>
<entry>
<title>Speed up Integer#to_s with a two digit lookup table (#16719)</title>
<updated>2026-05-08T17:10:12+00:00</updated>
<author>
<name>Chris Hasiński</name>
<email>krzysztof.hasinski@gmail.com</email>
</author>
<published>2026-05-08T17:10:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.ruby-lang.org/ruby.git/commit/?id=4a0072d5f29befde814ea0d9a83c711e1f049564'/>
<id>4a0072d5f29befde814ea0d9a83c711e1f049564</id>
<content type='text'>
* numeric: emit two decimal digits per iteration in rb_fix2str

Replace the digit-at-a-time loop in rb_fix2str with the standard
itoa 2-digit lookup table for base 10.  Each iteration now
writes two digits using a single (u % 100, u / 100) pair, so the
number of loop iterations is halved for multi-digit integers.
The classic per-digit loop is kept for non-base-10 conversion.

Benchmark (Apple M-series, 5M-10M ops, best of 3 runs):

  case            base       patch      delta
  ---------       -----      -----      -----
  1-digit   (5)   64 ns/op   64 ns/op    -0%
  2-digit   (42)  64 ns/op   65 ns/op    +2%  (noise)
  3-digit   (400) 66 ns/op   64 ns/op    -3%
  5-digit   (12345)          69 ns/op   67 ns/op    -3%
  10-digit  (1234567890)     77 ns/op   67 ns/op   -13%
  19-digit  (2^62-1)        111 ns/op   75 ns/op   -33%

The crossover is at ~3 digits: below that the constant setup
dominates and the benefit is within noise, above that the halved
iteration count shows up linearly.  Typical Rails payloads mix
short IDs (1-5 digits) and longer values (timestamps, nanos,
large counts), so the win is workload-dependent but strictly
non-negative for real code.

Correctness: 100k random fuzz across the full fixnum range plus
targeted edges (0, ±1, ±99, ±100, 2^30-1, 2^62-1, etc.) all pass.
make test-all shows 34694 tests, 7325860 assertions, 0 new
failures (same pre-existing TestArgf#test_puts flake as on
master) — test_integer.rb alone runs 38 tests / 421628 assertions
of which Integer#to_s exercises the bulk, all pass.

The 200-byte lookup table sits in .rodata and fits in a single
cache line of its own (3 lines for the whole table).  No change
to public API, no change to bignum conversion, no change to
non-base-10 conversion paths.

* bignum: emit two decimal digits per iteration in big2str_2bdigits

Extend the 2-digit lookup-table itoa optimisation from rb_fix2str to
the inner conversion loop used by Bignum#to_s.  big2str_2bdigits has
two code paths — a leading-chunk path that emits variable-length
digits, and a recursive-chunk path that emits a fixed-width zero-
padded block — and both gain from the halved division count.  The
classic per-digit loop is preserved for non-base-10 conversion.

Moves the ruby_decimal_digit_pairs table from a file-static in
numeric.c to bignum.c next to ruby_digitmap, and exposes it through
internal/bignum.h so both files share the same 200-byte .rodata
instance.

Benchmark (Apple M-series, best of 3 runs, measures bignum-only
speedup against the preceding fixnum commit):

  case            base      patch     delta
  ---------       -----     -----     -----
  big_20dig   10^19+...  146 ns/op 124 ns/op  -15%
  big_40dig   10^39+...  174 ns/op 152 ns/op  -13%
  big_100dig  10^99+42   236 ns/op 213 ns/op  -10%
  big_500dig  10^499+7  1119 ns/op 1086 ns/op  -3%
  big_1000dig 10^999    3490 ns/op 3459 ns/op  -1%
  fix_19dig   2^62-1      76 ns/op   76 ns/op   0% (unchanged path)

Wins concentrate in the 20-100 digit range where big2str_2bdigits
is the dominant cost.  Above ~500 digits the Karatsuba divmod
recursion dominates and the digit-emission saving shrinks to the
noise floor.  The 20-100 range is what actual Ruby code exercises
(financial high-precision sums, nanosecond timestamps, large
counters); crypto-size (1000+ digit) bignums are rare in to_s paths.

Correctness: 100k random fixnum fuzz unchanged, 500 random bignum
fuzz up to 2^256 with cross-check against sprintf("%d"), bases
2/8/16/36 round-trip, plus edge cases (0, just-above-fixnum, ±2^100,
20-digit strings near the fixnum boundary).  test/ruby/test_integer.rb
stays at 38 tests / 421628 assertions / 0 failures, test_bignum.rb
passes 74 / 607 / 0 failures, full make test-all reports 34694
tests / 0 new failures (same TestArgf#test_puts pre-existing flake
as master).

* benchmark: add int_to_s yaml for Integer#to_s

Reproducible benchmark for the two preceding commits.  Covers:

- 1/2/3/5/10/19-digit positive fixnums (spans the break-even point
  and the two large-number wins at the top)
- A negative fixnum (exercises the minus-sign prepend path)
- 20/40/100-digit bignums (spans the big2str_2bdigits win range)
- Two string-interpolation scenarios, so reviewers can see how much
  of the Integer#to_s speedup reaches real code that allocates the
  result string too

Intended to be consumed by benchmark-driver against master vs
int-to-s-twodigit for A/B comparison.  Matches the numbers in the
commit messages of 5bfb7e02a2 and c5df6de835.

---------

Co-authored-by: tomoya ishida &lt;tomoyapenguin@gmail.com&gt;</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
* numeric: emit two decimal digits per iteration in rb_fix2str

Replace the digit-at-a-time loop in rb_fix2str with the standard
itoa 2-digit lookup table for base 10.  Each iteration now
writes two digits using a single (u % 100, u / 100) pair, so the
number of loop iterations is halved for multi-digit integers.
The classic per-digit loop is kept for non-base-10 conversion.

Benchmark (Apple M-series, 5M-10M ops, best of 3 runs):

  case            base       patch      delta
  ---------       -----      -----      -----
  1-digit   (5)   64 ns/op   64 ns/op    -0%
  2-digit   (42)  64 ns/op   65 ns/op    +2%  (noise)
  3-digit   (400) 66 ns/op   64 ns/op    -3%
  5-digit   (12345)          69 ns/op   67 ns/op    -3%
  10-digit  (1234567890)     77 ns/op   67 ns/op   -13%
  19-digit  (2^62-1)        111 ns/op   75 ns/op   -33%

The crossover is at ~3 digits: below that the constant setup
dominates and the benefit is within noise, above that the halved
iteration count shows up linearly.  Typical Rails payloads mix
short IDs (1-5 digits) and longer values (timestamps, nanos,
large counts), so the win is workload-dependent but strictly
non-negative for real code.

Correctness: 100k random fuzz across the full fixnum range plus
targeted edges (0, ±1, ±99, ±100, 2^30-1, 2^62-1, etc.) all pass.
make test-all shows 34694 tests, 7325860 assertions, 0 new
failures (same pre-existing TestArgf#test_puts flake as on
master) — test_integer.rb alone runs 38 tests / 421628 assertions
of which Integer#to_s exercises the bulk, all pass.

The 200-byte lookup table sits in .rodata and fits in a single
cache line of its own (3 lines for the whole table).  No change
to public API, no change to bignum conversion, no change to
non-base-10 conversion paths.

* bignum: emit two decimal digits per iteration in big2str_2bdigits

Extend the 2-digit lookup-table itoa optimisation from rb_fix2str to
the inner conversion loop used by Bignum#to_s.  big2str_2bdigits has
two code paths — a leading-chunk path that emits variable-length
digits, and a recursive-chunk path that emits a fixed-width zero-
padded block — and both gain from the halved division count.  The
classic per-digit loop is preserved for non-base-10 conversion.

Moves the ruby_decimal_digit_pairs table from a file-static in
numeric.c to bignum.c next to ruby_digitmap, and exposes it through
internal/bignum.h so both files share the same 200-byte .rodata
instance.

Benchmark (Apple M-series, best of 3 runs, measures bignum-only
speedup against the preceding fixnum commit):

  case            base      patch     delta
  ---------       -----     -----     -----
  big_20dig   10^19+...  146 ns/op 124 ns/op  -15%
  big_40dig   10^39+...  174 ns/op 152 ns/op  -13%
  big_100dig  10^99+42   236 ns/op 213 ns/op  -10%
  big_500dig  10^499+7  1119 ns/op 1086 ns/op  -3%
  big_1000dig 10^999    3490 ns/op 3459 ns/op  -1%
  fix_19dig   2^62-1      76 ns/op   76 ns/op   0% (unchanged path)

Wins concentrate in the 20-100 digit range where big2str_2bdigits
is the dominant cost.  Above ~500 digits the Karatsuba divmod
recursion dominates and the digit-emission saving shrinks to the
noise floor.  The 20-100 range is what actual Ruby code exercises
(financial high-precision sums, nanosecond timestamps, large
counters); crypto-size (1000+ digit) bignums are rare in to_s paths.

Correctness: 100k random fixnum fuzz unchanged, 500 random bignum
fuzz up to 2^256 with cross-check against sprintf("%d"), bases
2/8/16/36 round-trip, plus edge cases (0, just-above-fixnum, ±2^100,
20-digit strings near the fixnum boundary).  test/ruby/test_integer.rb
stays at 38 tests / 421628 assertions / 0 failures, test_bignum.rb
passes 74 / 607 / 0 failures, full make test-all reports 34694
tests / 0 new failures (same TestArgf#test_puts pre-existing flake
as master).

* benchmark: add int_to_s yaml for Integer#to_s

Reproducible benchmark for the two preceding commits.  Covers:

- 1/2/3/5/10/19-digit positive fixnums (spans the break-even point
  and the two large-number wins at the top)
- A negative fixnum (exercises the minus-sign prepend path)
- 20/40/100-digit bignums (spans the big2str_2bdigits win range)
- Two string-interpolation scenarios, so reviewers can see how much
  of the Integer#to_s speedup reaches real code that allocates the
  result string too

Intended to be consumed by benchmark-driver against master vs
int-to-s-twodigit for A/B comparison.  Matches the numbers in the
commit messages of 5bfb7e02a2 and c5df6de835.

---------

Co-authored-by: tomoya ishida &lt;tomoyapenguin@gmail.com&gt;</pre>
</div>
</content>
</entry>
</feed>
