[release/10.0] Fix ARM64 interface dispatch cache torn read#126689

Open
github-actions[bot] wants to merge 3 commits into release/10.0 from backport/pr-126346-to-release/10.0

@github-actions github-actions bot commented Apr 9, 2026

Backport of #126346 to release/10.0

/cc @MichalStrehovsky

Customer Impact

  • Customer reported
  • Found internally

Reported by a first party. A torn read in ARM64 interface dispatch can cause a dispatch to a null address.

An interface dispatch cell is a pair of a MethodTable and a code address. If the MethodTable matches the type of this, we call the code address. The pair is written atomically; however, the read is not atomic, and on ARM64 a torn read can happen: the MethodTable matches, but the reader still sees a 0 code address.
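To make the failure mode concrete, here is a minimal C++ model of the unguarded probe. It is a sketch, not the runtime's code: CacheEntrySnapshot, Probe and UnguardedProbe are hypothetical names, and the snapshot stands for whatever pair of values the reader happened to observe, torn or not.

```cpp
#include <cstdint>

// Hypothetical model of what a reader observed in one cache entry.
// The two fields mirror the MethodTable / code address pair described above.
struct CacheEntrySnapshot {
    std::uintptr_t instanceType;  // MethodTable the entry was populated for
    std::uintptr_t targetCode;    // code address to branch to on a match
};

enum class ProbeResult { Miss, BranchToTarget };

struct Probe {
    ProbeResult result;
    std::uintptr_t target;  // only meaningful when result == BranchToTarget
};

// Unguarded probe: a type match is taken as proof that targetCode is valid.
Probe UnguardedProbe(const CacheEntrySnapshot& seen, std::uintptr_t thisType) {
    if (seen.instanceType != thisType) {
        return {ProbeResult::Miss, 0};
    }
    // If the observation was torn (new type, old target), this branches to 0.
    return {ProbeResult::BranchToTarget, seen.targetCode};
}
```

In this model a torn observation such as {type = T, target = 0} still reports BranchToTarget, which corresponds to the dispatch to null described above.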

Regression

  • Yes
  • No

Not a regression; this bug has existed ever since ARM64 support was added to .NET Native for UWP apps in 2017 or so.

Testing

This is a race condition that requires a lot of luck to hit. Testing is "code review" basically.

Risk

The risk is low: instead of loading two pointers individually, we load them together and add an extra check for the race condition case.

IMPORTANT: If this backport is for a servicing release, please verify that:

  • For .NET 8 and .NET 9: The PR target branch is release/X.0-staging, not release/X.0.
  • For .NET 10+: The PR target branch is release/X.0 (no -staging suffix).

Package authoring no longer needed in .NET 9

IMPORTANT: Starting with .NET 9, you no longer need to edit a NuGet package's csproj to enable building and bump the version.
Keep in mind that we still need package authoring in .NET 8 and older versions.

MichalStrehovsky and others added 3 commits April 9, 2026 04:09
On ARM64, the CHECK_CACHE_ENTRY macro read m_pInstanceType and m_pTargetCode
from a cache entry using two separate ldr instructions separated by a control
dependency (cmp/bne). ARM64's weak memory model does not order loads across
control dependencies, so the hardware can speculatively satisfy the second
load (target) before the first (type) commits. When a concurrent thread
atomically populates the entry via stlxp/casp (UpdateCacheEntryAtomically),
the reader can observe the new m_pInstanceType but the old m_pTargetCode (0),
then br to address 0.

Fix by using ldp to load both fields in a single instruction (single-copy
atomic on FEAT_LSE2 / ARMv8.4+ hardware), plus a cbz guard to catch torn
reads on pre-LSE2 hardware where ldp pair atomicity is not architecturally
guaranteed.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
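
The logic of the fix in the commit above can be sketched in C++ as follows. This is a model of the control flow only, not the CHECK_CACHE_ENTRY assembly macro: EntryPair and GuardedProbe are hypothetical names, the by-value EntryPair parameter stands for the two registers filled by one ldp, and treating a zero target as a plain miss is an assumption about where the cbz guard branches.

```cpp
#include <cstdint>

// Hypothetical model of the two values loaded together by a single ldp.
struct EntryPair {
    std::uintptr_t instanceType;  // corresponds to m_pInstanceType
    std::uintptr_t targetCode;    // corresponds to m_pTargetCode
};

// Returns true and writes *target only when the entry is safe to dispatch through.
// Returns false (leaving *target untouched) on a type mismatch or on a torn read.
bool GuardedProbe(EntryPair loaded, std::uintptr_t thisType, std::uintptr_t* target) {
    if (loaded.instanceType != thisType) {
        return false;  // models the cmp/bne: not our entry
    }
    if (loaded.targetCode == 0) {
        return false;  // models the cbz guard: type matched but target is still 0
    }
    *target = loaded.targetCode;
    return true;
}
```

On hardware where the paired load is single-copy atomic the second check never fires for a populated entry; on older hardware it turns a torn read into a cache miss instead of a branch to address 0.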
Instead of emitting an add instruction per cache entry when the ldp
offset exceeds [-512,504], rebase x9 once when the threshold is
crossed. This keeps the per-entry probe to a single ldp for all
entries in the 32/64 slot stubs.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
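
The rebasing arithmetic in the commit above can be illustrated with a small C++ planner. It is a sketch under stated assumptions: PlanProbes and ProbeStep are hypothetical names, entries are assumed to be 16 bytes (two 8-byte pointers), firstEntryOffset is a made-up distance from the base register to entry 0 (assumed to be a multiple of 8 and at least -512), and rebasing so that the current entry lands at offset 0 is one possible policy, not necessarily the one the stub generator uses.

```cpp
#include <cstdint>
#include <vector>

// ldp with 64-bit registers takes a signed immediate scaled by 8 bytes.
constexpr int kLdpMinOffset = -512;
constexpr int kLdpMaxOffset = 504;
constexpr int kEntrySize = 16;  // assumption: two 8-byte pointers per entry

struct ProbeStep {
    bool rebaseBefore;  // emit one add on the base register before this probe
    int rebaseAmount;   // bytes added to the base register (0 if no rebase)
    int ldpOffset;      // immediate used by this entry's ldp
};

// Plans one ldp per entry, advancing the base register only when the
// offset would otherwise exceed the ldp immediate range.
std::vector<ProbeStep> PlanProbes(int firstEntryOffset, int entryCount) {
    std::vector<ProbeStep> plan;
    int baseAdvance = 0;  // total bytes the base register has moved so far
    for (int i = 0; i < entryCount; ++i) {
        int absolute = firstEntryOffset + i * kEntrySize;
        int relative = absolute - baseAdvance;
        ProbeStep step{false, 0, relative};
        if (relative > kLdpMaxOffset) {
            // Threshold crossed: move the base to this entry once, so the
            // following entries use small positive offsets again.
            step.rebaseBefore = true;
            step.rebaseAmount = relative;
            step.ldpOffset = 0;
            baseAdvance += relative;
        }
        plan.push_back(step);
    }
    return plan;
}
```

With these assumptions and a first-entry offset of 16, a 64-entry plan needs only two base adjustments, while every probe stays a single ldp with an in-range offset.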
Apple ARM64 platforms all have FEAT_LSE2, which makes ldp single-copy
atomic for 16-byte aligned pairs. The cbz torn-read guard is
unnecessary there.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@dotnet-policy-service

Tagging subscribers to this area: @agocke
See info in area-owners.md if you want to be subscribed.

@MichalStrehovsky

/azp run runtime-nativeaot-outerloop

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).
