
JIT: Extend cast removal pattern for bitwise ops: OP(CAST(x), CAST(y)) → CAST(OP(x, y))#126671

Open
Copilot wants to merge 11 commits into main from copilot/extend-jit-cast-removal-pattern


Conversation

Contributor

Copilot AI commented Apr 8, 2026

Description

PR #120980 introduced range assertions that can eliminate outer casts on bitwise operations when the result is provably in range. This defeats the existing fgSimpleLowerCastOfSmpOp pattern, which rewrites CAST(OP(CAST(x), CAST(y))) → CAST(OP(x, y)). When the outer cast is removed first, we're left with OP(CAST(x), CAST(y)) — two casts where at most one is needed.

New function fgSimpleLowerSmpOpCasts handles the complementary case for AND, OR, XOR:

AND(CAST_T(x), CAST_T(y))  →  CAST_T(AND(x, y))

This is valid because bitwise ops are bit-independent: ext_n(x) OP ext_n(y) = ext_n(x OP y) for sign/zero extension with AND, OR, XOR.

Diffs.

When range assertions remove an outer cast from CAST(OP(CAST(x), CAST(y))),
we're left with OP(CAST(x), CAST(y)) with two redundant casts. For bitwise
ops (AND, OR, XOR), we can transform this to CAST(OP(x, y)) since these
operations are bit-independent: sign/zero extension commutes with bitwise ops.

Agent-Logs-Url: https://github.com/dotnet/runtime/sessions/7a203210-db47-47e7-9074-92b298f13949

Co-authored-by: adamperlin <10533886+adamperlin@users.noreply.github.com>
Copilot AI requested review from Copilot and removed request for Copilot April 8, 2026 23:21
Copilot AI requested review from Copilot and removed request for Copilot April 8, 2026 23:37
Copilot AI changed the title from "[WIP] Extend JIT cast removal pattern for simple ops" to "Extend JIT cast removal pattern for bitwise ops: OP(CAST(x), CAST(y)) → CAST(OP(x, y))" Apr 8, 2026
Copilot AI requested a review from adamperlin April 8, 2026 23:38
@jkoritzinsky jkoritzinsky added area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI and removed area-Infrastructure labels Apr 9, 2026
@dotnet-policy-service
Contributor

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Copilot AI review requested due to automatic review settings May 8, 2026 18:21
@github-actions
Contributor

github-actions Bot commented May 8, 2026

🤖 Copilot Code Review — PR #126671

Note

This review was generated by GitHub Copilot using multi-model analysis (Claude Opus 4.6 primary, with GPT-5.3-Codex and Claude Haiku 4.5 sub-agents).

Holistic Assessment

Motivation: The optimization is well-motivated. When range assertions remove the outer cast from CAST(OP(CAST(x), CAST(y))), the remaining OP(CAST(x), CAST(y)) has two redundant casts that can be folded into one. This is a natural complement to the existing fgSimpleLowerCastOfSmpOp which handles the inverse pattern (removing inner casts when the outer cast is present).

Approach: The approach is sound. For bitwise operations (AND, OR, XOR), truncation/extension commutes with the operation — the lower bits of the result depend only on the lower bits of the inputs. The transformation OP(CAST_small(x), CAST_small(y)) → CAST_small(OP(x, y)) reduces two CAST nodes to one. Performing this in rationalization (during HIR→LIR transition) correctly avoids the NOL (normalized-on-load) issues that would make this unsafe in morph, consistent with the existing similar optimizations.

Summary: ⚠️ Needs Human Review. The transformation logic is mathematically correct and the implementation follows established patterns in the codebase. However, there are a few points that warrant human maintainer attention: missing benchmark evidence for an optimization PR, no regression tests, and a minor robustness concern around flag handling. None are likely to cause miscompilation, but a JIT maintainer should confirm.


Detailed Findings

✅ Correctness — Bitwise commutativity with truncation is valid

The core claim that OP(CAST_small(x), CAST_small(y)) → CAST_small(OP(x, y)) is valid for bitwise AND/OR/XOR is correct. For zero-extension: (x & mask) OP (y & mask) = (x OP y) & mask. For sign-extension, the lower bits are identical, and the CAST at the end re-extends correctly. The guards are appropriate:

  • Both casts must target the same small integer type (castToType match)
  • No overflow-checked casts
  • Cast sources must have the same actual type as the op

✅ Type safety — No type mismatch for consumers

One sub-agent flagged a type mismatch concern, but this is a false positive. GenTreeCast for small type targets has node type TYP_INT (not the small type), as confirmed by how casts are constructed throughout the JIT (e.g., gtNewCastNode(TYP_INT, op1, false, castToType) in gentree.cpp:15834). Consumers of the replaced node will continue to see TYP_INT, matching the original op's type. The CastToType() carries the narrowing semantics separately.

✅ GTF_UNSIGNED flag — Not meaningful for small type casts

The GTF_UNSIGNED flag on the repurposed cast1 is technically stale (it reflects the original cast source, not the new OP(x,y) source). However, per the GenTreeCast documentation (gentree.h:4128-4131), for unchecked casts to small types, IsUnsigned is meaningless — the extension behavior is determined entirely by varTypeIsUnsigned(CastToType()) (see IsZeroExtending() at gentree.h:4166-4168). This is not a correctness issue.

✅ LIR ordering — Correct linear sequence maintained

After transformation, the LIR order is [..., x, y, OP(x,y), CAST(OP(x,y)), ...] which is a valid def-use chain. The range.Remove and range.InsertAfter calls correctly maintain this.

⚠️ Missing benchmark evidence

Per dotnet/runtime conventions, optimization PRs should include benchmark evidence (BenchmarkDotNet or equivalent) showing the improvement is real and measurable. The PR description doesn't reference any benchmarks or codegen diffs. While removing a CAST node is a straightforward win, it would be helpful to show:

  1. An example of before/after codegen for a real-world pattern
  2. Evidence that this pattern actually occurs in practice (e.g., from SPMI/JITDUMP analysis)

This is non-blocking but a JIT maintainer should evaluate whether evidence is needed.

⚠️ No regression tests added

The PR adds no tests. While this is an optimization (not a bug fix), and existing tests should cover correctness of AND/OR/XOR on small types, it would be valuable to add at least one test method that exercises the specific pattern OP(CAST_small(x), CAST_small(y)) for the three bitwise ops, to ensure the optimization fires and produces correct results. This could be a simple [MethodImpl(MethodImplOptions.NoInlining)] method with small-type parameters.

💡 Side-effect flag propagation — Consistent with existing patterns

After repurposing cast1 (changing CastOp() from x to OP(x,y)), the side-effect flags (GTF_ALL_EFFECT) on cast1 are not recomputed to reflect the new subtree. However, the existing fgSimpleLowerCastOfSmpOp and fgSimpleLowerBswap16 follow the same pattern — they don't update flags either. In LIR, execution order is governed by linear ordering rather than tree structure, so this is unlikely to cause issues. Noting for completeness.

💡 Naming consistency — Minor observation

The JITDUMP message says "Lower - Simple Op Casts" which follows the fgSimpleLower* naming convention. This is fine and consistent, though it could briefly confuse someone since this runs during rationalization, not the lowering phase proper. The existing functions have the same naming pattern, so this is established convention.


Models contributing to this review: Claude Opus 4.6 (primary analysis), GPT-5.3-Codex (correctness verification), Claude Haiku 4.5 (flag/type analysis — some findings were false positives, noted above).

Generated by Code Review for issue #126671 ·

Contributor

Copilot AI left a comment


Pull request overview

This PR adds a new rationalization-time (LIR) cast-reduction transform for bitwise ops so that AND/OR/XOR(CAST_T(x), CAST_T(y)) can be rewritten as CAST_T(AND/OR/XOR(x, y)), reducing two inner casts down to one outer cast after certain earlier optimizations remove an outer cast.

Changes:

  • Add Compiler::fgSimpleLowerSmpOpCasts in flowgraph.cpp to fold two operand casts into a single outer cast for GT_AND/GT_OR/GT_XOR.
  • Invoke this new helper during rationalization for GT_AND/GT_OR/GT_XOR, replacing the parent use when a rewrite occurs.
  • Declare the new helper in compiler.h.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

File: Description
src/coreclr/jit/rationalize.cpp: Calls the new cast-folding helper for GT_AND/GT_OR/GT_XOR and replaces the use edge when it rewrites.
src/coreclr/jit/flowgraph.cpp: Implements fgSimpleLowerSmpOpCasts to transform OP(CAST, CAST) into CAST(OP) in LIR.
src/coreclr/jit/compiler.h: Adds the declaration for fgSimpleLowerSmpOpCasts.

Comment thread src/coreclr/jit/rationalize.cpp Outdated
Comment thread src/coreclr/jit/flowgraph.cpp
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings May 8, 2026 20:27
Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

Comment thread src/coreclr/jit/rationalize.cpp Outdated
Comment thread src/coreclr/jit/flowgraph.cpp
Copilot AI review requested due to automatic review settings May 11, 2026 21:41
Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

Comment thread src/coreclr/jit/rationalize.cpp Outdated
Comment thread src/coreclr/jit/flowgraph.cpp
Copilot AI review requested due to automatic review settings May 12, 2026 01:25
Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.

Comment thread src/tests/JIT/opt/And/IntAnd.cs Outdated
Comment thread src/coreclr/jit/flowgraph.cpp
Comment on lines +1847 to +1854
case GT_AND:
case GT_OR:
case GT_XOR:
{
    GenTree* parent = parentStack.Height() >= 2 ? parentStack.Top(1) : nullptr;
    if ((parent == nullptr) || !parent->OperIs(GT_CAST))
    {
        GenTree* replacement = m_compiler->fgSimpleLowerSmpOpCasts(BlockRange(), node->AsOp());
@adamperlin adamperlin requested a review from EgorBo May 12, 2026 16:53
@adamperlin
Contributor

adamperlin commented May 12, 2026

I'm not totally sure if this opt is worth it. Scanning for this specific pattern while processing any AND/OR/XOR operation during rationalization seems a bit overkill for small-ish diffs. I'm curious if you have any thoughts @EgorBo!

@EgorBo
Member

EgorBo commented May 12, 2026

I'm not totally sure if this opt is worth it.

Not sure either, seems like it mostly only improves win-x64.

Any particular reason you've done it in Rationalizer? Normally, we either do such transforms in gtFoldExpr, morph or Lower

@adamperlin
Contributor

adamperlin commented May 12, 2026

Any particular reason you've done it in Rationalizer? Normally, we either do such transforms in gtFoldExpr, morph or Lower

There is a comment on fgSimpleLowerCastOfSmpOp, a similar transform, which says that doing the transformation in morph causes problems with normalized-on-load locals in value numbering, so I figured we might run into a similar issue here when removing inner casts. Do you have any idea whether that might be a problem here?


Labels

area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI

Projects

Status: No status

Development

Successfully merging this pull request may close these issues.

Extend JIT Cast Removal Pattern for Simple Ops

5 participants