Skip to content

chore: clean up err tracker wf, ignore silenced wf#4979

Open
MasterPtato wants to merge 1 commit into
mainfrom
05-05-chore_clean_up_err_tracker_wf_ignore_silenced_wf
Open

chore: clean up err tracker wf, ignore silenced wf#4979
MasterPtato wants to merge 1 commit into
mainfrom
05-05-chore_clean_up_err_tracker_wf_ignore_silenced_wf

Conversation

@MasterPtato
Copy link
Copy Markdown
Contributor

Description

Please include a summary of the changes and the related issue. Please also include relevant motivation and context.

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

Please describe the tests that you ran to verify your changes.

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

@railway-app
Copy link
Copy Markdown

railway-app Bot commented May 5, 2026

🚅 Deployed to the rivet-pr-4979 environment in rivet-frontend

Service Status Web Updated (UTC)
kitchen-sink ❌ Build Failed (View Logs) Web May 9, 2026 at 11:21 pm
website 😴 Sleeping (View Logs) Web May 9, 2026 at 3:33 am
mcp-hub ✅ Success (View Logs) Web May 5, 2026 at 8:02 pm
ladle ❌ Build Failed (View Logs) Web May 5, 2026 at 8:00 pm
frontend-cloud ❌ Build Failed (View Logs) Web May 5, 2026 at 8:00 pm
frontend-inspector ❌ Build Failed (View Logs) Web May 5, 2026 at 8:00 pm

Copy link
Copy Markdown
Contributor Author

MasterPtato commented May 5, 2026

@MasterPtato MasterPtato marked this pull request as ready for review May 5, 2026 19:58
@MasterPtato MasterPtato requested a review from NathanFlurry May 5, 2026 19:58
@claude
Copy link
Copy Markdown
Contributor

claude Bot commented May 5, 2026

PR #4979 Review: chore: clean up err tracker wf, ignore silenced wf

Overview

This PR makes four targeted fixes across the workflow engine and actor runtime:

  1. Gasoline KV – silenced workflows are now skipped in the main workflow poll/dispatch path
  2. Guard – removes a duplicate rustls ring crypto-provider installation
  3. Actor2 runtime – clears reschedule_ts on successful allocation (mirrors existing sleep_ts reset)
  4. Runner pool – captures the error-tracker workflow ID and sends it a Shutdown signal when the runner pool workflow exits

Change-by-change notes

engine/packages/gasoline/src/db/kv/mod.rs

Good fix. The lease-expiry path at line 568 already skips silenced workflows; this change brings the main workflow-fetch path into parity. The implementation is correct:

  • SilenceTsKey is read in the existing parallel tokio::try_join!, so no extra serial roundtrip is added.
  • The destructuring position aligns with the new future.
  • Log message uses tracing::warn! consistent with the adjacent has_output warning.
  • The combined if has_output || is_silenced lease-clear block avoids duplicating the cleanup.

One minor note: silence_ts_entry is fetched on every poll iteration even for non-silenced workflows. This is fine since it is parallel with the other reads, but worth keeping in mind if silenced-workflow volume ever grows high.

engine/packages/guard/src/lib.rs

Clean removal. rivet_pools::Pools::new already calls install_rustls_provider() (pools.rs:106–108), so the call in guard::start was redundant. This is safe to drop — pools are constructed before start is entered.

engine/packages/pegboard/src/workflows/actor2/runtime.rs

Correct. When allocate finds a slot, both sleep_ts and reschedule_ts should be cleared so stale retry timers don't fire after a successful allocation. Previously only sleep_ts was reset, leaving reschedule_ts pointing at an obsolete backoff timestamp.

engine/packages/pegboard/src/workflows/runner_pool.rs

Bug fix with one migration caveat.

The error tracker runs in a lupe loop that exits only on a Shutdown signal. Before this PR, the runner pool dispatched the tracker but never sent Shutdown when it exited, leaving the tracker running indefinitely. Capturing error_tracker_wf_id and sending Shutdown at the end of the workflow is the right fix.

Migration caveat: runner pool workflows that were dispatched and completed before this code is deployed will have orphaned error-tracker workflows stuck waiting for a signal that will never arrive. These will need to be cleaned up via the silence mechanism or manual intervention. This is a timing limitation, not a flaw in the fix itself.

The ctx.v(2) version marker on the new signal send is safe. Gasoline tracks each op by its cursor position in history, so reusing version 2 for the new terminal op does not conflict with the earlier ctx.v(2).workflow(error_tracker).dispatch() call.


Summary

All four changes are correct and consistent with surrounding patterns. No blocking concerns. The orphaned-tracker migration side effect is the only non-obvious consequence, and it is limited in scope.

@MasterPtato MasterPtato force-pushed the 05-05-chore_clean_up_err_tracker_wf_ignore_silenced_wf branch from b5ccc8d to 7496d90 Compare May 9, 2026 23:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant