<feed xmlns='http://www.w3.org/2005/Atom'>
<title>ruby.git/lib/bundler/fetcher/gem_remote_fetcher.rb, branch master</title>
<subtitle>The Ruby Programming Language</subtitle>
<link rel='alternate' type='text/html' href='https://git.ruby-lang.org/ruby.git/'/>
<entry>
<title>[ruby/rubygems] Normalize the number of workers:</title>
<updated>2026-03-18T10:55:45+00:00</updated>
<author>
<name>Edouard CHIN</name>
<email>chin.edouard@gmail.com</email>
</author>
<published>2026-03-16T16:05:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.ruby-lang.org/ruby.git/commit/?id=a0efe15163ecf45cb3748a4bdb1051dd2f7019b7'/>
<id>a0efe15163ecf45cb3748a4bdb1051dd2f7019b7</id>
<content type='text'>
- ### Problem

  I'd like to normalize the number of workers when downloading gems
  and use the `BUNDLE_JOBS` configuration (or default to
  `Etc.nprocessors`).
  Right now the number of workers when doing parallel work seems a bit
  random.
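
  As a rough illustration of the idea (not the code in this commit), the
  worker count could be resolved once and shared by every parallel
  download. The helper names `worker_count` and `download_all` are
  hypothetical, and reading only the `BUNDLE_JOBS` environment variable is
  a simplifying assumption, since Bundler settings can also come from its
  config files.

  ```ruby
  require "etc"

  # Hypothetical helper: use BUNDLE_JOBS when it is a positive integer,
  # otherwise fall back to the number of processors.
  def worker_count(env = ENV)
    jobs = Integer(env["BUNDLE_JOBS"], exception: false).to_i
    jobs.positive? ? jobs : Etc.nprocessors
  end

  # Hypothetical helper: run the given block for each name on a fixed
  # number of worker threads and collect the results (in completion order).
  def download_all(names, workers: worker_count)
    queue = Queue.new
    names.each { |name| queue.push(name) }
    queue.close # pop returns nil once the queue is drained

    results = Queue.new
    threads = Array.new(workers) do
      Thread.new do
        while (name = queue.pop)
          results.push(yield(name))
        end
      end
    end
    threads.each { |thread| thread.join }

    Array.new(results.size) { results.pop }
  end
  ```

  With this shape, every call site that does parallel work would pass the
  same `worker_count` instead of its own hardcoded number.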

  ### Benchmarks

  **Downloading 40 git gems**

  === Comparison Summary ===

  Scenario: git-gems (40 gems)
                             Cold     +/-                        Warm     +/-
  ------------------------------------------------------------------------------
  more-downloads            7.94s   1.44s  baseline             5.02s   0.31s  baseline
  master                   14.59s   1.67s  83.7% slower         5.72s   0.30s  13.9% slower

  _________________________________

  **Downloading 249 gems from a fake gemserver with a 300ms latency**

  === Comparison Summary ===

  Scenario: no-deps (249 gems)
                             Cold     +/-                        Warm     +/-
  ------------------------------------------------------------------------------
  more-downloads           11.11s   0.66s  baseline             1.23s   0.14s  baseline
  master                   16.89s   0.60s  52.0% slower         1.03s   0.09s  16.2% faster

  ### Context

  I originally added those worker counts in:

  1. https://github.com/ruby/rubygems/pull/9087/changes#diff-524173391e40a96577540013a1ad749433454155f79aa05c5d0832235b0bdad1R11
  2. https://github.com/ruby/rubygems/pull/9100/changes#diff-04ae823e98259f697c78d2d0b4eab0ced6a83a84a986578703eb2837d6db1a32R1105

  For 1. (downloading gems from RubyGems.org), I opted to go with a
  hardcoded worker count of 5 and not anything higher, as I was
  worried that we could hammer RubyGems.org. I think this concern
  is not valid, because requests to download gems don't even hit
  the RubyGems.org server, as there is the Fastly CDN in front of the S3
  bucket.

  For 2. I went with a worker count of 5 to match, without giving this
  a second thought.

https://github.com/ruby/rubygems/commit/170c9d75c2
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
- ### Problem

  I'd like to normalize the number of workers when downloading gems
  and use the `BUNDLE_JOBS` configuration (or default to
  `Etc.nprocessors`).
  Right now the number of workers when doing parallel work seems a bit
  random.

  ### Benchmarks

  **Downloading 40 git gems**

  === Comparison Summary ===

  Scenario: git-gems (40 gems)
                             Cold     +/-                        Warm     +/-
  ------------------------------------------------------------------------------
  more-downloads            7.94s   1.44s  baseline             5.02s   0.31s  baseline
  master                   14.59s   1.67s  83.7% slower         5.72s   0.30s  13.9% slower

  _________________________________

  **Downloading 249 gems from a fake gemserver with a 300ms latency**

  === Comparison Summary ===

  Scenario: no-deps (249 gems)
                             Cold     +/-                        Warm     +/-
  ------------------------------------------------------------------------------
  more-downloads           11.11s   0.66s  baseline             1.23s   0.14s  baseline
  master                   16.89s   0.60s  52.0% slower         1.03s   0.09s  16.2% faster

  ### Context

  I originally added those worker counts in:

  1. https://github.com/ruby/rubygems/pull/9087/changes#diff-524173391e40a96577540013a1ad749433454155f79aa05c5d0832235b0bdad1R11
  2. https://github.com/ruby/rubygems/pull/9100/changes#diff-04ae823e98259f697c78d2d0b4eab0ced6a83a84a986578703eb2837d6db1a32R1105

  For 1. (downloading gems from RubyGems.org), I opted to go with a
  hardcoded worker count of 5 and not anything higher, as I was
  worried that we could hammer RubyGems.org. I think this concern
  is not valid, because requests to download gems don't even hit
  the RubyGems.org server, as there is the Fastly CDN in front of the S3
  bucket.

  For 2. I went with a worker count of 5 to match, without giving this
  a second thought.

https://github.com/ruby/rubygems/commit/170c9d75c2
</pre>
</div>
</content>
</entry>
<entry>
<title>[ruby/rubygems] Increase connection pool to allow for up to 70% speed increase:</title>
<updated>2025-12-04T06:47:46+00:00</updated>
<author>
<name>Edouard CHIN</name>
<email>chin.edouard@gmail.com</email>
</author>
<published>2025-11-16T23:18:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.ruby-lang.org/ruby.git/commit/?id=932762f29457ad1def6fbab7eca7bcbeeb58ea5c'/>
<id>932762f29457ad1def6fbab7eca7bcbeeb58ea5c</id>
<content type='text'>
- ### TL;DR

  Bundler is heavily limited by the connection pool, which manages a
  single connection. By increasing the number of connections, we can
  drastically speed up the installation process when many gems need
  to be downloaded and installed.

  ### Benchmark

  There are various factors that are hard to control, such as
  compilation time and network speed, but after dozens of tests I
  can consistently get around a 70% speed increase when downloading and
  installing 472 gems, most having no native extensions (on purpose).

  ```
  # Before
  bundle install  28.60s user 12.70s system 179% cpu 23.014 total

  # After
  bundle install  30.09s user 15.90s system 281% cpu 16.317 total
  ```

  This gist shows how this was benchmarked and the Gemfile
  used: https://gist.github.com/Edouard-chin/c8e39148c0cdf324dae827716fbe24a0

  ### Context

  A while ago in #869, Aaron introduced a connection pool which
  greatly improved Bundler speed. It was noted in the PR description
  that managing one connection was already good enough and it wasn't
  clear whether we needed more connections. Aaron also had the
  intuition that we may need to increase the pool for downloading
  gems and he was right.

  &gt; We need to study how RubyGems uses connections and make a decision
  &gt; based on request usage (e.g. only use one connection for many small
  &gt; requests like bundler API, and maybe many connections for
  &gt; downloading gems)

  When Bundler downloads and installs gems in parallel https://github.com/ruby/rubygems/blob/4f85e02fdd89ee28852722dfed42a13c9f5c9193/bundler/lib/bundler/installer/parallel_installer.rb#L128
  most threads have to wait for the only connection in the pool to be
  available, which is not efficient.
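
  The contention can be reproduced with a toy model. This is only a
  sketch under assumptions: `TinyPool` and `timed_downloads` are
  hypothetical names, the "connections" are plain strings, and `sleep`
  stands in for a network round trip. It is not the pool implementation
  Bundler uses.

  ```ruby
  # Hypothetical stand-in for a pool of persistent HTTP connections.
  class TinyPool
    def initialize(size)
      @slots = Queue.new
      size.times { |i| @slots.push("conn-#{i}") }
    end

    # Blocks until a connection is free, yields it, then hands it back.
    def with
      conn = @slots.pop
      yield conn
    ensure
      @slots.push(conn) if conn
    end
  end

  # Runs `threads` simulated downloads against a pool of `pool_size`
  # connections and returns the elapsed wall-clock time in seconds.
  def timed_downloads(pool_size:, threads: 5, latency: 0.05)
    pool = TinyPool.new(pool_size)
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    workers = Array.new(threads) do
      Thread.new { pool.with { |_conn| sleep(latency) } }
    end
    workers.each { |worker| worker.join }
    Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  end

  single = timed_downloads(pool_size: 1) # downloads run one after another
  pooled = timed_downloads(pool_size: 5) # downloads overlap
  puts format("pool of 1: %.2fs, pool of 5: %.2fs", single, pooled)
  ```

  With a single slot the five simulated downloads are serialized, so the
  total time grows with the number of threads; with five slots they
  overlap and the total stays close to one round trip.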

  ### Solution

  This commit modifies the pool size for the fetcher that Bundler
  uses. RubyGems fetcher will continue to use a single connection.

  The bundler fetcher is used in 2 places.

  1. When downloading gems https://github.com/ruby/rubygems/blob/4f85e02fdd89ee28852722dfed42a13c9f5c9193/bundler/lib/bundler/source/rubygems.rb#L481-L484
  2. When grabbing the index (not the compact index) using the
    `bundle install --full-index` flag.
    https://github.com/ruby/rubygems/blob/4f85e02fdd89ee28852722dfed42a13c9f5c9193/bundler/lib/bundler/fetcher/index.rb#L9

  Having more connections in 2) is not useful, but tweaking the
  size based on where the fetcher is used is a bit tricky, so I opted
  to modify it at the class level.
  I fiddled with the pool size and found that 5 seems to be the sweet
  spot, at least for my environment.

https://github.com/ruby/rubygems/commit/6063fd9963
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
- ### TL;DR

  Bundler is heavily limited by the connection pool, which manages a
  single connection. By increasing the number of connections, we can
  drastically speed up the installation process when many gems need
  to be downloaded and installed.

  ### Benchmark

  There are various factors that are hard to control, such as
  compilation time and network speed, but after dozens of tests I
  can consistently get around a 70% speed increase when downloading and
  installing 472 gems, most having no native extensions (on purpose).

  ```
  # Before
  bundle install  28.60s user 12.70s system 179% cpu 23.014 total

  # After
  bundle install  30.09s user 15.90s system 281% cpu 16.317 total
  ```

  This gist shows how this was benchmarked and the Gemfile
  used: https://gist.github.com/Edouard-chin/c8e39148c0cdf324dae827716fbe24a0

  ### Context

  A while ago in #869, Aaron introduced a connection pool which
  greatly improved Bundler speed. It was noted in the PR description
  that managing one connection was already good enough and it wasn't
  clear whether we needed more connections. Aaron also had the
  intuition that we may need to increase the pool for downloading
  gems and he was right.

  &gt; We need to study how RubyGems uses connections and make a decision
  &gt; based on request usage (e.g. only use one connection for many small
  &gt; requests like bundler API, and maybe many connections for
  &gt; downloading gems)

  When Bundler downloads and installs gems in parallel https://github.com/ruby/rubygems/blob/4f85e02fdd89ee28852722dfed42a13c9f5c9193/bundler/lib/bundler/installer/parallel_installer.rb#L128
  most threads have to wait for the only connection in the pool to be
  available, which is not efficient.

  ### Solution

  This commit modifies the pool size for the fetcher that Bundler
  uses. RubyGems fetcher will continue to use a single connection.

  The bundler fetcher is used in 2 places.

  1. When downloading gems https://github.com/ruby/rubygems/blob/4f85e02fdd89ee28852722dfed42a13c9f5c9193/bundler/lib/bundler/source/rubygems.rb#L481-L484
  2. When grabbing the index (not the compact index) using the
    `bundle install --full-index` flag.
    https://github.com/ruby/rubygems/blob/4f85e02fdd89ee28852722dfed42a13c9f5c9193/bundler/lib/bundler/fetcher/index.rb#L9

  Having more connections in 2) is not useful, but tweaking the
  size based on where the fetcher is used is a bit tricky, so I opted
  to modify it at the class level.
  I fiddled with the pool size and found that 5 seems to be the sweet
  spot, at least for my environment.

https://github.com/ruby/rubygems/commit/6063fd9963
</pre>
</div>
</content>
</entry>
<entry>
<title>[rubygems/rubygems] User bundler UA when downloading gems</title>
<updated>2023-11-15T08:33:14+00:00</updated>
<author>
<name>Samuel Giddins</name>
<email>segiddins@segiddins.me</email>
</author>
<published>2023-10-22T20:25:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.ruby-lang.org/ruby.git/commit/?id=b69bbf588a3dd167d62dbb89f0cef25ebae4a7ea'/>
<id>b69bbf588a3dd167d62dbb89f0cef25ebae4a7ea</id>
<content type='text'>
Gem::RemoteFetcher uses Gem::Request, which adds the RubyGems UA.
Gem::RemoteFetcher is used to download gems, as well as the full index.
We would like the bundler UA to be used whenever bundler is making
requests.

This PR also avoids unsafely mutating the headers hash on the shared
`Gem::RemoteFetcher.fetcher` instance, which could cause corruption or
incorrect headers when making parallel requests. Instead, we create one
remote fetcher per rubygems remote, which is similar to the connection
segregation bundler is already doing.

https://github.com/rubygems/rubygems/commit/f0e8dacdec
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Gem::RemoteFetcher uses Gem::Request, which adds the RubyGems UA.
Gem::RemoteFetcher is used to download gems, as well as the full index.
We would like the bundler UA to be used whenever bundler is making
requests.

This PR also avoids unsafely mutating the headers hash on the shared
`Gem::RemoteFetcher.fetcher` instance, which could cause corruption or
incorrect headers when making parallel requests. Instead, we create one
remote fetcher per rubygems remote, which is similar to the connection
segregation bundler is already doing.

https://github.com/rubygems/rubygems/commit/f0e8dacdec
</pre>
</div>
</content>
</entry>
</feed>
