fix(runtime): avoid bounded dispatch channel stalls by NathanFlurry · Pull Request #5015 · rivet-dev/rivet

NathanFlurry · 2026-05-11T12:05:41Z

Description

Please include a summary of the changes and the related issue. Please also include relevant motivation and context.

Type of change

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
This change requires a documentation update

How Has This Been Tested?

Please describe the tests that you ran to verify your changes.

Checklist:

My code follows the style guidelines of this project
I have performed a self-review of my code
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation
My changes generate no new warnings
I have added tests that prove my fix is effective or that my feature works
New and existing unit tests pass locally with my changes

NathanFlurry · 2026-05-11T12:05:57Z

Warning

This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.

This stack of pull requests is managed by Graphite. Learn more about stacking.

claude · 2026-05-11T12:09:01Z

Code Review: fix(runtime): avoid bounded dispatch channel stalls

Overview

This PR converts 12 bounded mpsc::channel usages to mpsc::unbounded_channel across the UniversalDB Postgres and RocksDB transaction drivers, pegboard-gateway and pegboard-gateway2, the UDB CLI, and the Python client. The motivation is to eliminate backpressure-induced stalls. The mechanical changes (removing .await on sends, updating type signatures) are all correct.

Transaction channels (Postgres + RocksDB) — correct fix

Each transaction command is paired with a one-shot response channel, so the caller already blocks waiting for the reply before issuing the next command. Natural backpressure exists at the application level, so the bounded channel served no real purpose and only introduced a deadlock surface if the transaction task was busy and the queue filled up. Unbounded is right here.
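The request/reply shape described above can be sketched as follows. This is a minimal illustration, not the driver code: it uses std::sync::mpsc and a thread as stand-ins for the async channels and task in the PR, and TxnCommand, spawn_txn_task, get, and set are hypothetical names.

```rust
use std::collections::HashMap;
use std::sync::mpsc;
use std::thread;

// Hypothetical command type: each request carries its own reply sender,
// playing the role of the one-shot response channel.
pub enum TxnCommand {
    Get { key: String, reply: mpsc::Sender<Option<String>> },
    Set { key: String, value: String, reply: mpsc::Sender<()> },
}

// Spawns a stand-in transaction task that owns the state and serves
// commands from an unbounded queue until every sender is dropped.
pub fn spawn_txn_task() -> mpsc::Sender<TxnCommand> {
    let (tx, rx) = mpsc::channel::<TxnCommand>(); // unbounded
    thread::spawn(move || {
        let mut store: HashMap<String, String> = HashMap::new();
        for cmd in rx {
            match cmd {
                TxnCommand::Get { key, reply } => {
                    // Ignore the error if the caller stopped waiting.
                    let _ = reply.send(store.get(&key).cloned());
                }
                TxnCommand::Set { key, value, reply } => {
                    store.insert(key, value);
                    let _ = reply.send(());
                }
            }
        }
    });
    tx
}

// The caller waits for the reply before it can issue another command,
// so each caller has at most one command queued at a time.
pub fn get(tx: &mpsc::Sender<TxnCommand>, key: &str) -> Option<String> {
    let (reply, reply_rx) = mpsc::channel();
    tx.send(TxnCommand::Get { key: key.to_string(), reply }).ok()?;
    reply_rx.recv().ok()?
}

pub fn set(tx: &mpsc::Sender<TxnCommand>, key: &str, value: &str) -> bool {
    let (reply, reply_rx) = mpsc::channel();
    let cmd = TxnCommand::Set { key: key.to_string(), value: value.to_string(), reply };
    if tx.send(cmd).is_err() {
        return false;
    }
    reply_rx.recv().is_ok()
}
```

Because get and set do not return until the reply arrives, the queue depth is limited by the number of concurrent callers rather than by the channel capacity.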

Pegboard gateway message forwarding — acceptable, with a note

The in-flight request channels were bounded at 128. Converting to unbounded means that if a WebSocket client is slow (network congestion, backpressure from the browser), tunnel messages pile up with no limit. The drop_rx watch is a liveness check, not a flow-control mechanism. This is probably fine in practice since messages are already bounded by what the actor sends, and stalled bounded sends would have caused worse problems by blocking the gateway dispatch loop. Consider a high-watermark log or metric so runaway queues are observable in production.
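One possible shape for the suggested high-watermark metric is a thin wrapper that counts queued messages. The sketch below is an assumption-laden illustration using std::sync::mpsc instead of the gateway's async channels; QueueStats, MeteredSender, MeteredReceiver, and metered_channel are hypothetical names, and a real version would export the counters to whatever metrics system the gateway already uses.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::mpsc;
use std::sync::Arc;

// Shared counters: current queue depth and the largest depth seen so far.
#[derive(Default)]
pub struct QueueStats {
    depth: AtomicUsize,
    high_watermark: AtomicUsize,
}

impl QueueStats {
    pub fn depth(&self) -> usize {
        self.depth.load(Ordering::Relaxed)
    }
    pub fn high_watermark(&self) -> usize {
        self.high_watermark.load(Ordering::Relaxed)
    }
}

pub struct MeteredSender<T> {
    tx: mpsc::Sender<T>,
    stats: Arc<QueueStats>,
}

pub struct MeteredReceiver<T> {
    rx: mpsc::Receiver<T>,
    stats: Arc<QueueStats>,
}

pub fn metered_channel<T>() -> (MeteredSender<T>, MeteredReceiver<T>, Arc<QueueStats>) {
    let (tx, rx) = mpsc::channel(); // unbounded
    let stats = Arc::new(QueueStats::default());
    (
        MeteredSender { tx, stats: Arc::clone(&stats) },
        MeteredReceiver { rx, stats: Arc::clone(&stats) },
        stats,
    )
}

impl<T> MeteredSender<T> {
    pub fn send(&self, msg: T) -> Result<(), mpsc::SendError<T>> {
        // Count before sending so the receiver never decrements first.
        let depth = self.stats.depth.fetch_add(1, Ordering::Relaxed) + 1;
        self.stats.high_watermark.fetch_max(depth, Ordering::Relaxed);
        self.tx.send(msg).map_err(|err| {
            // The message was not queued, so undo the depth increment.
            self.stats.depth.fetch_sub(1, Ordering::Relaxed);
            err
        })
    }
}

impl<T> MeteredReceiver<T> {
    pub fn try_recv(&self) -> Option<T> {
        let msg = self.rx.try_recv().ok()?;
        self.stats.depth.fetch_sub(1, Ordering::Relaxed);
        Some(msg)
    }
}
```

A log line or alert could then fire whenever high_watermark crosses a chosen threshold, which makes a runaway queue visible without reintroducing a blocking send.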

UDB changelog entry channel — most concerning

The old bound was 10,000 entries. This channel sits between a reader task (fast: scanning FDB changelog) and a writer task (slower: re-applying entries in txn-bounded batches). If the writer falls behind on a large replay, entries now accumulate without bound. For large datasets this could cause significant memory pressure. A semaphore-based permit system could cap in-flight entries while still avoiding blocking.
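A rough sketch of the permit idea, under stated assumptions: it uses an atomic counter as a non-blocking semaphore and std::sync::mpsc in place of the real reader/writer tasks, and Permits, Permit, Entry, and try_enqueue are hypothetical names. Each entry carries its permit through the unbounded channel, and the permit is returned when the writer drops the entry.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::mpsc;
use std::sync::Arc;

// Non-blocking semaphore: a counter of remaining in-flight slots.
pub struct Permits {
    available: AtomicUsize,
}

impl Permits {
    pub fn new(max_in_flight: usize) -> Arc<Self> {
        Arc::new(Self { available: AtomicUsize::new(max_in_flight) })
    }

    // Returns None instead of waiting when the cap has been reached.
    pub fn try_acquire(self: &Arc<Self>) -> Option<Permit> {
        self.available
            .fetch_update(Ordering::AcqRel, Ordering::Acquire, |n| n.checked_sub(1))
            .ok()
            .map(|_| Permit { pool: Arc::clone(self) })
    }

    pub fn available(&self) -> usize {
        self.available.load(Ordering::Acquire)
    }
}

// Holding a Permit reserves one slot; dropping it gives the slot back.
pub struct Permit {
    pool: Arc<Permits>,
}

impl Drop for Permit {
    fn drop(&mut self) {
        self.pool.available.fetch_add(1, Ordering::AcqRel);
    }
}

// A changelog entry bundled with the permit that accounts for it.
pub struct Entry {
    pub payload: Vec<u8>,
    _permit: Permit,
}

// Queues the payload if a slot is free; otherwise hands it back so the
// reader can pause scanning and retry later.
pub fn try_enqueue(
    pool: &Arc<Permits>,
    tx: &mpsc::Sender<Entry>,
    payload: Vec<u8>,
) -> Result<(), Vec<u8>> {
    match pool.try_acquire() {
        Some(permit) => tx
            .send(Entry { payload, _permit: permit })
            .map_err(|err| err.0.payload),
        None => Err(payload),
    }
}
```

With this shape the channel itself stays unbounded, so a send never stalls, while the number of entries held in memory stays at or below the configured cap.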

Minor issues

forward_tunnel_message in pegboard-gateway2/src/shared_state.rs:854 is still marked async but has no .await points after this change. The async keyword should be removed; #[tracing::instrument] does not require it.

CLAUDE.md notes that actor-owned dispatch producers should use try_reserve helpers and return actor.overloaded rather than awaiting bounded sends. These channels are not actor lifecycle channels, but the same principle — non-blocking sends with structured error responses — is worth considering for the gateway forwarding paths where backpressure matters.
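The non-blocking-send principle can be illustrated with a bounded channel whose producer never waits. This is a std-only sketch, not the project's helper: try_dispatch and DispatchError are hypothetical names, and try_send on std's SyncSender stands in for the try_reserve helpers mentioned above.

```rust
use std::sync::mpsc::{self, TrySendError};

// Structured outcome the caller can turn into an error response.
#[derive(Debug, PartialEq, Eq)]
pub enum DispatchError {
    // The queue is full; the caller should report overload, not wait.
    Overloaded,
    // The receiving side is gone.
    Closed,
}

// Attempts to queue a message without ever blocking the caller.
pub fn try_dispatch<T>(tx: &mpsc::SyncSender<T>, msg: T) -> Result<(), DispatchError> {
    tx.try_send(msg).map_err(|err| match err {
        TrySendError::Full(_) => DispatchError::Overloaded,
        TrySendError::Disconnected(_) => DispatchError::Closed,
    })
}
```

The dispatch loop keeps running in every case, and memory stays bounded; the trade-off is that a full queue surfaces as an explicit error that the caller has to handle.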

Summary

Change	Assessment
UniversalDB transaction channels	Correct fix — natural backpressure from one-shot responses
Pegboard gateway request message channels	Acceptable — consider high-watermark log or metric
UDB changelog entry channel (10k to unbounded)	Risk — could cause unbounded memory growth on large replays
Python client event channels	Low risk — event rate is typically controlled
`forward_tunnel_message` still `async`	Minor cleanup — drop the `async` keyword

The core fix (avoiding blocked sends on full bounded channels) is sound. The main gap is the UDB entry channel having no memory bound during replays.

fix(runtime): avoid bounded dispatch channel stalls

9c656c0

This was referenced May 11, 2026

WIP: counter latency monitor #5011

Draft

feat(rivetkit-core): expose metrics endpoint #5013

Draft

fix(sqlite): fence remote requests by generation #5014

Draft

This was referenced May 11, 2026

fix(rivetkit): expose client actor metadata #5016

Draft

feat(kitchen-sink): add load testing harness #5017

Draft

docs(serverless): document timeout tuning #5018

Draft

This was referenced May 11, 2026

feat(rivetkit): expose low-cardinality metrics routes #5019

Draft

feat(rivetkit): add current actor metrics #5020

Draft

fix(rivetkit): require engine ping for health #5022

Draft

abcxff mentioned this pull request May 11, 2026

temp: patch wasm-pack to point to latest binary #5037

Draft

11 tasks

fix(runtime): avoid bounded dispatch channel stalls #5015
NathanFlurry wants to merge 1 commit into counter-latency/sqlite-generation-fence from counter-latency/unbounded-dispatch-channels
