
docs(serverless): document timeout tuning #5018

Draft
NathanFlurry wants to merge 1 commit into counter-latency/kitchen-sink-load-harness from counter-latency/serverless-docs

@NathanFlurry
Member

Description

Please include a summary of the changes and the related issue. Please also include relevant motivation and context.

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

Please describe the tests that you ran to verify your changes.

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

@claude
Contributor

claude Bot commented May 11, 2026

Code Review

Overview

This PR adds two new checklist items to the serverless production checklist (requestLifespan tuning and drainGracePeriod guidance) and rewrites the existing concurrency bullet to be more precise. The .agent/notes/ file also carries unrelated test-run entries.


Issues

1. Inconsistency with runtime-modes.mdx

runtime-modes.mdx line 82 still reads:

Set this to match your platform's function timeout (e.g. requestLifespan: 3600 for Vercel Pro).

The new checklist says 3595 for Vercel Pro. The 5-second buffer (here) and the exact-limit value (runtime-modes.mdx) cannot both be correct. The checklist's 3595 is the safer recommendation; runtime-modes.mdx should be updated to match so readers are not confused by the discrepancy.

2. Cloud Run "15-min cap" is inaccurate

Cloud Run services have a configurable request timeout up to 3600 s (60 min); there is no platform-enforced 15-minute cap. The value 840 = 900 - 60 is only correct if the user has already configured Cloud Run's timeout to exactly 900 s. The example should say something like "Cloud Run configured at 900 s (15 min)" to make clear this is a deployment choice, not a platform constraint. Users running the Cloud Run default (300 s) will silently misconfigure if they copy 840.
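To illustrate the point, here is a minimal sketch of deriving requestLifespan from the timeout a deployment has actually configured, rather than from a hard-coded figure. The helper names `requestLifespanFor` and `assertFitsPlatform` are hypothetical and not part of RivetKit; the buffers (5 s and 60 s) are the ones implied by the checklist examples.

```typescript
// Hypothetical helper (not a RivetKit API): derive requestLifespan in seconds
// from the timeout configured on the platform, minus a safety buffer.
function requestLifespanFor(configuredTimeoutSec: number, bufferSec: number): number {
  if (!Number.isInteger(configuredTimeoutSec) || configuredTimeoutSec <= 0) {
    throw new RangeError("configuredTimeoutSec must be a positive integer");
  }
  if (!Number.isInteger(bufferSec) || bufferSec < 0 || bufferSec >= configuredTimeoutSec) {
    throw new RangeError("bufferSec must be an integer in [0, configuredTimeoutSec)");
  }
  return configuredTimeoutSec - bufferSec;
}

// Hypothetical guard: reject a copied value that does not fit under the
// timeout the platform is really configured with.
function assertFitsPlatform(requestLifespanSec: number, configuredTimeoutSec: number): void {
  if (requestLifespanSec >= configuredTimeoutSec) {
    throw new RangeError(
      `requestLifespan ${requestLifespanSec}s does not fit under the configured timeout ${configuredTimeoutSec}s`,
    );
  }
}

console.log(requestLifespanFor(3600, 5)); // 3595 (Vercel Pro example)
console.log(requestLifespanFor(900, 60)); // 840 (Cloud Run configured at 900 s)
```

With a guard like this, copying 840 into a service still running the Cloud Run default of 300 s fails loudly (`assertFitsPlatform(840, 300)` throws) instead of misconfiguring silently.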

3. drainGracePeriod default attribution

"Default is 30 minutes from the engine"

The 30-minute default is the runner-config-level drain_grace_period (confirmed at engine/packages/api-types/src/namespaces/runner_configs.rs:71). However, runtime-modes.mdx describes a separate value, serverless_drain_grace_period, that defaults to 10 seconds (engine-wide config at engine/packages/config/src/config/pegboard.rs:316). Linking this bullet to runtime-modes.mdx#timeouts risks readers conflating the two. Either point to /docs/actors/limits (which correctly shows "Runner config drain grace period: 30 min") or add a brief clarifier distinguishing the runner-level setting from the engine-level one.
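As a rough illustration of the distinction, the two defaults can be written side by side. This is only a sketch: the object, type, and function names below are hypothetical, and the values are the ones cited in this review (30 minutes for the runner-config `drain_grace_period`, 10 seconds for the engine-wide `serverless_drain_grace_period`).

```typescript
// Defaults as cited in this review, in seconds. Names here are illustrative,
// not identifiers from the engine or RivetKit.
const DRAIN_GRACE_DEFAULTS_SEC = {
  runnerConfig: 30 * 60, // runner-config drain_grace_period
  engineServerless: 10, // engine-wide serverless_drain_grace_period
} as const;

type DrainGraceScope = keyof typeof DRAIN_GRACE_DEFAULTS_SEC;

// Format a default for display so a doc bullet can name its scope explicitly.
function describeDrainGraceDefault(scope: DrainGraceScope): string {
  const seconds = DRAIN_GRACE_DEFAULTS_SEC[scope];
  return seconds >= 60 ? `${seconds / 60} min` : `${seconds} s`;
}

console.log(describeDrainGraceDefault("runnerConfig")); // "30 min"
console.log(describeDrainGraceDefault("engineServerless")); // "10 s"
```

Naming the scope next to each number is the same clarifier the checklist bullet needs: a reader should be able to tell which of the two settings "30 minutes" refers to.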


Minor Points

  • Unrelated file in this PR. .agent/notes/driver-test-progress.md records internal test-run results unrelated to the serverless timeout docs. Consider dropping it so the diff stays focused.
  • configurePool link target. The link [configurePool](/docs/general/registry-configuration) lands on a page without a configurePool heading or anchor. Consider deep-linking to the specific section.
  • New concurrency bullet is a clear improvement. The explanation that "each serverless instance hosts one actor per in-flight /api/rivet/start request" is concrete and actionable.

Summary

The core guidance is correct and valuable. Main fixes before merging: (1) sync the Vercel Pro requestLifespan example in runtime-modes.mdx to 3595, (2) reframe the Cloud Run example so the 15-min figure is clearly a user-configured timeout, not a platform cap, and (3) clarify the drainGracePeriod default attribution to avoid conflating the runner-config and engine-config grace periods.
