<feed xmlns='http://www.w3.org/2005/Atom'>
<title>ruby.git/yjit/src/asm/x86_64/tests.rs, branch v4.0.3</title>
<subtitle>The Ruby Programming Language</subtitle>
<link rel='alternate' type='text/html' href='https://git.ruby-lang.org/ruby.git/'/>
<entry>
<title>YJIT: x86: Fix panic writing 32-bit number with top bit set</title>
<updated>2025-06-11T10:49:49+00:00</updated>
<author>
<name>Alan Wu</name>
<email>XrXr@users.noreply.github.com</email>
</author>
<published>2025-06-10T11:52:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.ruby-lang.org/ruby.git/commit/?id=e5c7f1695e8cf774d073e7b103c1d9289cad56ee'/>
<id>e5c7f1695e8cf774d073e7b103c1d9289cad56ee</id>
<content type='text'>
Previously, `asm.mov(m32, imm32)` panicked when `imm32 &gt; 0x80000000`. It
attempted to split imm32 into a register before doing the store, but
then the register size didn't match the destination size.

Instead of splitting, use the `MOV r/m32, imm32` form which works for
all 32-bit values. Adjust asserts that assumed that all forms undergo
sign extension, which is not true for this case.

See: 54edc930f9f0a658da45cfcef46648d1b6f82467
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Previously, `asm.mov(m32, imm32)` panicked when `imm32 &gt; 0x80000000`. It
attempted to split imm32 into a register before doing the store, but
then the register size didn't match the destination size.

Instead of splitting, use the `MOV r/m32, imm32` form which works for
all 32-bit values. Adjust asserts that assumed that all forms undergo
sign extension, which is not true for this case.

See: 54edc930f9f0a658da45cfcef46648d1b6f82467
</pre>
</div>
</content>
</entry>
<entry>
<title>YJIT: Use u32 for CodePtr to save 4 bytes each</title>
<updated>2023-11-07T22:43:43+00:00</updated>
<author>
<name>Alan Wu</name>
<email>XrXr@users.noreply.github.com</email>
</author>
<published>2023-10-16T22:35:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.ruby-lang.org/ruby.git/commit/?id=a1c61f0ae5f5ecaa7d8289942b78e6b0c77118fe'/>
<id>a1c61f0ae5f5ecaa7d8289942b78e6b0c77118fe</id>
<content type='text'>
We've long had a size restriction on the code memory region such that a
u32 could refer to everything. This commit capitalizes on this
restriction by shrinking the size of `CodePtr` to be 4 bytes from 8.

To derive a full raw pointer from a `CodePtr`, one needs a base pointer.
Both `CodeBlock` and `VirtualMemory` can be used for this purpose. The
base pointer is readily available everywhere, except for in the case of
the `jit_return` "branch". Generalize lea_label() to lea_jump_target()
in the IR to delay deriving the `jit_return` address until `compile()`,
when the base pointer is available.

On railsbench, this yields roughly a 1% reduction to `yjit_alloc_size`
(58,397,765 to 57,742,248).
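
As a rough illustration of the offset-plus-base idea described above, here
is a minimal sketch. Only the names `CodePtr` and `VirtualMemory` come from
this message; the field `region_start` and the method `raw_addr` are
hypothetical and are not taken from the YJIT source.

```rust
// Hypothetical stand-ins: only the names `CodePtr` and `VirtualMemory`
// appear in the commit message. The field and method are assumptions.

/// A position in the code region, stored as a 4-byte offset from the base.
#[derive(Clone, Copy)]
struct CodePtr(u32);

/// Owner of the region's base address.
#[derive(Clone, Copy)]
struct VirtualMemory {
    region_start: u64,
}

impl CodePtr {
    /// Recover the full 64-bit address by adding the offset to the base.
    fn raw_addr(self, base: VirtualMemory) -> u64 {
        base.region_start + u64::from(self.0)
    }
}

fn main() {
    let base = VirtualMemory { region_start: 0x7f00_0000_0000 };
    let ptr = CodePtr(0x1234);
    // The offset alone is meaningless; the base turns it into an address.
    assert_eq!(ptr.raw_addr(base), 0x7f00_0000_1234);
    println!("{:#x}", ptr.raw_addr(base));
}
```

The sketch also shows why a base has to be in scope wherever an address is
needed, which is the constraint the `jit_return` change works around.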
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
We've long had a size restriction on the code memory region such that a
u32 could refer to everything. This commit capitalizes on this
restriction by shrinking the size of `CodePtr` to be 4 bytes from 8.

To derive a full raw pointer from a `CodePtr`, one needs a base pointer.
Both `CodeBlock` and `VirtualMemory` can be used for this purpose. The
base pointer is readily available everywhere, except for in the case of
the `jit_return` "branch". Generalize lea_label() to lea_jump_target()
in the IR to delay deriving the `jit_return` address until `compile()`,
when the base pointer is available.

On railsbench, this yields roughly a 1% reduction to `yjit_alloc_size`
(58,397,765 to 57,742,248).
</pre>
</div>
</content>
</entry>
<entry>
<title>YJIT: implement imul instruction encoding in x86 assembler (#8191)</title>
<updated>2023-08-09T17:12:21+00:00</updated>
<author>
<name>Maxime Chevalier-Boisvert</name>
<email>maxime.chevalierboisvert@shopify.com</email>
</author>
<published>2023-08-09T17:12:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.ruby-lang.org/ruby.git/commit/?id=c9b30f9d76ec7c726a703a7f8aad95b5998e7d6c'/>
<id>c9b30f9d76ec7c726a703a7f8aad95b5998e7d6c</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>YJIT: expand bitwise shift support in x86 assembler (#8174)</title>
<updated>2023-08-04T18:57:56+00:00</updated>
<author>
<name>Maxime Chevalier-Boisvert</name>
<email>maxime.chevalierboisvert@shopify.com</email>
</author>
<published>2023-08-04T18:57:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.ruby-lang.org/ruby.git/commit/?id=8d7861e3daf64e0bd30b2f9fe56f94eadfde5d3f'/>
<id>8d7861e3daf64e0bd30b2f9fe56f94eadfde5d3f</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>Fix typos in YJIT [ci skip]</title>
<updated>2023-02-02T21:16:45+00:00</updated>
<author>
<name>Alan Wu</name>
<email>alanwu@ruby-lang.org</email>
</author>
<published>2023-02-02T21:16:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.ruby-lang.org/ruby.git/commit/?id=92ac5f686b72942c9709a8f3e07f45f6a44ebc6b'/>
<id>92ac5f686b72942c9709a8f3e07f45f6a44ebc6b</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>Fix YJIT backend to account for unsigned int immediates (#6789)</title>
<updated>2022-11-23T15:48:17+00:00</updated>
<author>
<name>Jemma Issroff</name>
<email>jemmaissroff@gmail.com</email>
</author>
<published>2022-11-23T15:48:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.ruby-lang.org/ruby.git/commit/?id=e82b15b6603ddc6754f4cfa7a189c0acb0ccce71'/>
<id>e82b15b6603ddc6754f4cfa7a189c0acb0ccce71</id>
<content type='text'>
YJIT: x86_64: Fix cmp with number where sign bit is set

Before this commit, we were unconditionally treating unsigned ints as
signed ints when counting the number of bits required for representing
the immediate in machine code. When the size of the immediate matches
the size of the other operand, no sign extension happens, so this was
incorrect. `asm.cmp(opnd64, 0x8000_0000)` panicked even though it's
encodable as `CMP r/m32, imm32`. Large shape ids were impacted by this
issue.
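
To make the signed versus unsigned distinction concrete, here is a minimal
sketch of the two ways of counting bits. The helper names `sig_imm_size`
and `unsig_imm_size` appear elsewhere in this history, but the signatures
and bodies below are assumptions, not the actual YJIT code.

```rust
// Assumed signatures and bodies; not copied from the YJIT source.

/// Bits needed when the immediate will be sign-extended to the operand size.
fn sig_imm_size(imm: i64) -> u8 {
    match imm {
        -0x80..=0x7f => 8,
        -0x8000..=0x7fff => 16,
        -0x8000_0000..=0x7fff_ffff => 32,
        _ => 64,
    }
}

/// Bits needed when the immediate is stored as-is, with no sign extension.
fn unsig_imm_size(imm: u64) -> u8 {
    match imm {
        0..=0xff => 8,
        0x100..=0xffff => 16,
        0x1_0000..=0xffff_ffff => 32,
        _ => 64,
    }
}

fn main() {
    let imm: u64 = 0x8000_0000;
    // Counted as a signed value, the set top bit pushes it past 32 bits.
    assert_eq!(sig_imm_size(imm as i64), 64);
    // Stored verbatim in a 32-bit slot, it fits.
    assert_eq!(unsig_imm_size(imm), 32);
}
```

Counting with the signed rule alone rejects `0x8000_0000` as a 32-bit
immediate, which matches the panic described above.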

Co-authored-by: Aaron Patterson &lt;tenderlove@ruby-lang.org&gt;
Co-authored-by: Alan Wu &lt;alanwu@ruby-lang.org&gt;</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
YJIT: x86_64: Fix cmp with number where sign bit is set

Before this commit, we were unconditionally treating unsigned ints as
signed ints when counting the number of bits required for representing
the immediate in machine code. When the size of the immediate matches
the size of the other operand, no sign extension happens, so this was
incorrect. `asm.cmp(opnd64, 0x8000_0000)` panicked even though it's
encodable as `CMP r/m32, imm32`. Large shape ids were impacted by this
issue.

Co-authored-by: Aaron Patterson &lt;tenderlove@ruby-lang.org&gt;
Co-authored-by: Alan Wu &lt;alanwu@ruby-lang.org&gt;</pre>
</div>
</content>
</entry>
<entry>
<title>YJIT: Always encode Opnd::Value in 64 bits on x86_64 for GC offsets (#6733)</title>
<updated>2022-11-15T23:23:20+00:00</updated>
<author>
<name>Takashi Kokubun</name>
<email>takashikkbn@gmail.com</email>
</author>
<published>2022-11-15T23:23:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.ruby-lang.org/ruby.git/commit/?id=41b0f641ef0671d8cde397e56b1eb3c6b8e0f0db'/>
<id>41b0f641ef0671d8cde397e56b1eb3c6b8e0f0db</id>
<content type='text'>
* YJIT: Always encode Opnd::Value in 64 bits on x86_64 for GC offsets

* Introduce heap_object_p

* Leave original mov intact

* Remove unneeded branches

* Add a test for movabs

Co-authored-by: Alan Wu &lt;alansi.xingwu@shopify.com&gt;</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
* YJIT: Always encode Opnd::Value in 64 bits on x86_64 for GC offsets

* Introduce heap_object_p

* Leave original mov intact

* Remove unneeded branches

* Add a test for movabs

Co-authored-by: Alan Wu &lt;alansi.xingwu@shopify.com&gt;</pre>
</div>
</content>
</entry>
<entry>
<title>YJIT: fold the "asm_comments" feature into "disasm" (#6591)</title>
<updated>2022-10-19T18:03:07+00:00</updated>
<author>
<name>Alan Wu</name>
<email>XrXr@users.noreply.github.com</email>
</author>
<published>2022-10-19T18:03:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.ruby-lang.org/ruby.git/commit/?id=5ca23caa2057fc4760fbefab6087371b11c4bc6c'/>
<id>5ca23caa2057fc4760fbefab6087371b11c4bc6c</id>
<content type='text'>
Previously, enabling only "disasm" didn't actually build. Since these
two features are closely related and we don't really use one without the
other, let's simplify and merge the two features together.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Previously, enabling only "disasm" didn't actually build. Since these
two features are closely related and we don't really use one without the
other, let's simplify and merge the two features together.</pre>
</div>
</content>
</entry>
<entry>
<title>* Arm64 Beginnings (https://github.com/Shopify/ruby/pull/291)</title>
<updated>2022-08-29T15:46:54+00:00</updated>
<author>
<name>Maxime Chevalier-Boisvert</name>
<email>maxime.chevalierboisvert@shopify.com</email>
</author>
<published>2022-06-15T17:10:13+00:00</published>
<link rel='alternate' type='text/html' href='https://git.ruby-lang.org/ruby.git/commit/?id=a1b8c947380716a5ffca2b1888a6310e8132b00c'/>
<id>a1b8c947380716a5ffca2b1888a6310e8132b00c</id>
<content type='text'>
* Initial setup for aarch64

* ADDS and SUBS

* ADD and SUB for immediates

* Revert moved code

* Documentation

* Rename Arm64* to A64*

* Comments on shift types

* Share sig_imm_size and unsig_imm_size
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
* Initial setup for aarch64

* ADDS and SUBS

* ADD and SUB for immediates

* Revert moved code

* Documentation

* Rename Arm64* to A64*

* Comments on shift types

* Share sig_imm_size and unsig_imm_size
</pre>
</div>
</content>
</entry>
<entry>
<title>YJIT: On-demand executable memory allocation; faster boot (#5944)</title>
<updated>2022-06-14T14:23:13+00:00</updated>
<author>
<name>Alan Wu</name>
<email>XrXr@users.noreply.github.com</email>
</author>
<published>2022-06-14T14:23:13+00:00</published>
<link rel='alternate' type='text/html' href='https://git.ruby-lang.org/ruby.git/commit/?id=9f09397bfe6762bf19ef47b2f60988e49b80560d'/>
<id>9f09397bfe6762bf19ef47b2f60988e49b80560d</id>
<content type='text'>
This commit makes YJIT allocate memory for generated code gradually as
needed. Previously, YJIT allocated all the memory it needed on boot in
one go, leading to higher than necessary resident set size (RSS) and
time spent on boot initializing the memory with a large memset().

Users should no longer need to search for a magic number to pass to
`--yjit-exec-mem` since physical memory consumption should now more
accurately reflect the requirement of the workload.

YJIT now reserves a range of addresses on boot. This region starts out
with no access permission at all, so buggy attempts to jump to the region
crash just as they did before this change. To get this hardening at finer
granularity than the page size, we fill each page with trapping
instructions when we first allocate physical memory for the page.

Most of the time applications don't need 256 MiB of executable code, so
allocating on-demand ends up doing less total work than before. Case in
point, a simple `ruby --yjit-call-threshold=1 -eitself` takes about
half as long after this change. In terms of memory consumption, here is
a table to give a rough summary of the impact:

    | Peak RSS in MiB | -eitself example | railsbench once |
    | :-------------: | ---------------: | --------------: |
    |     before      |              265 |             377 |
    |      after      |               11 |             143 |
    |     no YJIT     |               10 |             101 |

A new module is introduced to handle allocation bookkeeping.
`CodePtr` is moved into the module since it has a close relationship
with the new `VirtualMemory` struct. This new interface has a slightly
smaller surface than before in that marking a region as writable is no
longer a public operation.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This commit makes YJIT allocate memory for generated code gradually as
needed. Previously, YJIT allocated all the memory it needed on boot in
one go, leading to higher than necessary resident set size (RSS) and
time spent on boot initializing the memory with a large memset().

Users should no longer need to search for a magic number to pass to
`--yjit-exec-mem` since physical memory consumption should now more
accurately reflect the requirement of the workload.

YJIT now reserves a range of addresses on boot. This region starts out
with no access permission at all, so buggy attempts to jump to the region
crash just as they did before this change. To get this hardening at finer
granularity than the page size, we fill each page with trapping
instructions when we first allocate physical memory for the page.

Most of the time applications don't need 256 MiB of executable code, so
allocating on-demand ends up doing less total work than before. Case in
point, a simple `ruby --yjit-call-threshold=1 -eitself` takes about
half as long after this change. In terms of memory consumption, here is
a table to give a rough summary of the impact:

    | Peak RSS in MiB | -eitself example | railsbench once |
    | :-------------: | ---------------: | --------------: |
    |     before      |              265 |             377 |
    |      after      |               11 |             143 |
    |     no YJIT     |               10 |             101 |

A new module is introduced to handle allocation bookkeeping.
`CodePtr` is moved into the module since it has a close relationship
with the new `VirtualMemory` struct. This new interface has a slightly
smaller surface than before in that marking a region as writable is no
longer a public operation.</pre>
</div>
</content>
</entry>
</feed>
