Skip to content

feat(rivetkit-core): expose metrics endpoint#5013

Draft
NathanFlurry wants to merge 1 commit into
05-09-wip_counter_latency_monitorfrom
counter-latency/core-metrics-endpoint
Draft

feat(rivetkit-core): expose metrics endpoint#5013
NathanFlurry wants to merge 1 commit into
05-09-wip_counter_latency_monitorfrom
counter-latency/core-metrics-endpoint

Conversation

@NathanFlurry
Copy link
Copy Markdown
Member

Description

Please include a summary of the changes and the related issue. Please also include relevant motivation and context.

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

Please describe the tests that you ran to verify your changes.

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

@railway-app
Copy link
Copy Markdown

railway-app Bot commented May 11, 2026

🚅 Deployed to the rivet-pr-5013 environment in rivet-frontend

Service Status Web Updated (UTC)
website 😴 Sleeping (View Logs) Web May 11, 2026 at 12:16 pm
mcp-hub ✅ Success (View Logs) Web May 11, 2026 at 12:07 pm
frontend-cloud ❌ Build Failed (View Logs) Web May 11, 2026 at 12:06 pm
ladle ❌ Build Failed (View Logs) Web May 11, 2026 at 12:06 pm
frontend-inspector ❌ Build Failed (View Logs) Web May 11, 2026 at 12:06 pm
kitchen-sink ❌ Build Failed (View Logs) Web May 11, 2026 at 12:06 pm

@claude
Copy link
Copy Markdown
Contributor

claude Bot commented May 11, 2026

Code Review: feat(rivetkit-core): expose metrics endpoint

Overview

This PR adds a /metrics endpoint that serves Prometheus metrics in text format. It's implemented across three files: a new shared metrics_endpoint.rs module, registry/http.rs for the native HTTP path, and serverless.rs for the serverless path. Authentication uses constant-time bearer token comparison gated by two env vars.


Issues

Dead / unreachable code in registry/http.rs

The early return before actor lookup handles all Metrics routes:

if matches!(route, RegistryHttpRoute::Framework(FrameworkHttpRoute::Metrics)) {
    return handle_metrics_fetch(&request);
}
let instance = match self.active_actor(actor_id).await { ... }

This means FrameworkHttpRoute::Metrics => handle_metrics_fetch(&request) in the downstream match arm is unreachable. Calling the real handler there silently hides the invariant. Per the project convention (CLAUDE.md) of avoiding _ => catch-alls, this arm must exist for exhaustiveness — but it should signal the invariant explicitly:

FrameworkHttpRoute::Metrics => unreachable!("metrics handled before actor lookup"),

Missing WWW-Authenticate header on 401

RFC 7235 §3.1 requires a WWW-Authenticate header on 401 responses. Without it, some HTTP clients and scraper agents won't recognize the response as a bearer-auth challenge. Add:

WWW-Authenticate: Bearer realm="rivetkit-metrics"

Env vars read on every request

configured_metrics_token() calls std::env::var on every metrics request. Prometheus scrapers typically hit this every 15 seconds so the overhead is negligible in practice, but reading env vars in a hot path is an avoidable allocation. Consider caching with a OnceLock<Option<String>> at process startup, or at minimum document the behavior so future readers don't wonder.

Two-var opt-in can silently do nothing

If RIVETKIT_METRICS_ENABLED=1 is set but RIVETKIT_METRICS_TOKEN is missing or empty, the endpoint returns 403 "metrics not enabled" with no diagnostic. Operators may believe they configured it correctly but get no metrics. Consider a tracing::warn! at first request (or startup) when enabled but token is absent.

403 for disabled endpoint leaks existence

Returning 403 Forbidden when metrics are not configured tells scrapers the endpoint exists but is locked. 404 Not Found is more conservative if the intent is to hide the endpoint entirely when not configured. Worth an intentional decision either way.


Minor suggestions

bearer_token_from_authorization byte-slice approach. value.get(..6) is safe here because "bearer" is ASCII, but returns None on a multibyte boundary before position 6 rather than a parse failure. Using strip_prefix makes the intent clearer and removes the implicit byte count:

fn bearer_token_from_authorization(value: &str) -> Option<&str> {
    let trimmed = value.trim_start();
    let rest = if trimmed.len() >= 7 && trimmed[..7].eq_ignore_ascii_case("bearer ") {
        &trimmed[7..]
    } else {
        return None;
    };
    let token = rest.trim_start();
    if token.is_empty() { None } else { Some(token) }
}

Content-type header key inconsistency. serverless.rs uses the raw string "content-type" while registry/http.rs uses http::header::CONTENT_TYPE.to_string(). Both work, but using the typed constant in both paths is more consistent.


What's good

  • Using subtle::ConstantTimeEq for token comparison is the right call — it prevents timing side-channels.
  • Separating auth/render logic into metrics_endpoint.rs and sharing it across both HTTP paths avoids duplicating the security-sensitive bits.
  • pub(crate) visibility on the module is appropriate.
  • The design requiring explicit opt-in (RIVETKIT_METRICS_ENABLED=1) before the endpoint activates is a good security default.

No tests

The PR checklist notes tests are missing. At minimum, unit tests for bearer_token_from_authorization and authorize_metrics_request in metrics_endpoint.rs would cover the auth boundary. Per CLAUDE.md, Rust tests belong in tests/ rather than inline #[cfg(test)] modules.


This PR is marked Draft — flagging the above before it moves to review.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant