Inject sampled traceparent on workflow execute calls #459

Merged
arnaudde merged 6 commits into main from wfl-927-traceparent-injection
Mar 31, 2026

@arnaudde arnaudde commented Mar 31, 2026

Problem

External workers return WF_1500 on /trace/otel, /trace/summary, and /trace/events because no sampled traceparent reaches the worker.

Root cause

Workers use ParentBasedTraceIdRatio: if the parent span is unsampled, every worker span becomes a NonRecordingSpan — nothing reaches the OTEL collector. Without a traceparent header, the worker inherits the API's HTTP span context via Temporal. In production, those spans are frequently unsampled.
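
The sampling decision described above can be modeled roughly as follows. This is a simplified, stdlib-only sketch of parent-based trace-ID-ratio sampling, not the OpenTelemetry SDK code; the function names `ratio_samples` and `parent_based_decision` are made up for illustration, and the model assumes the default behavior where a child span simply follows its parent's sampled flag.

```python
from typing import Optional

# Only the low 64 bits of the trace ID take part in the ratio comparison.
TRACE_ID_LIMIT = (1 << 64) - 1


def ratio_samples(trace_id: int, ratio: float) -> bool:
    """Ratio decision: sample when the low 64 bits fall under the bound."""
    bound = round(ratio * (TRACE_ID_LIMIT + 1))
    return (trace_id & TRACE_ID_LIMIT) < bound


def parent_based_decision(
    parent_sampled: Optional[bool], trace_id: int, ratio: float
) -> bool:
    """Return True if the span would be recorded.

    parent_sampled is None for a root span; otherwise it is the parent's
    sampled flag, and the ratio is never consulted.
    """
    if parent_sampled is None:
        return ratio_samples(trace_id, ratio)
    return parent_sampled


# An unsampled parent wins even when the worker's ratio is 100%.
print(parent_based_decision(False, 0x1234, 1.0))
```

Under this model the worker's own ratio is irrelevant once a parent context exists, which is why an unsampled API span silences every worker span beneath it.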

Fix

A BeforeRequestHook (TraceparentInjectionHook) is registered for all SDK instances. On any request whose path ends with /execute, it injects a sampled W3C traceparent header: forwarding the active OTEL span context if it is already sampled, otherwise generating a fresh sampled one. An explicitly set traceparent header is never overwritten.
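
A minimal sketch of the injection logic, under stated assumptions: it works on a plain path string and a headers dict instead of the SDK's hook and request types, and `sampled_traceparent`, `inject_traceparent`, and the `(trace_id, span_id, trace_flags)` tuple are hypothetical stand-ins for whatever TraceparentInjectionHook does internally. The header layout `00-<32 hex trace ID>-<16 hex span ID>-01` follows the W3C Trace Context format, where the final `01` is the sampled flag.

```python
import secrets
from typing import Dict, Optional, Tuple

SAMPLED_FLAG = 0x01


def _random_hex(nbytes: int) -> str:
    """Random lowercase hex ID; all-zero IDs are invalid, so retry on them."""
    value = secrets.token_hex(nbytes)
    while int(value, 16) == 0:
        value = secrets.token_hex(nbytes)
    return value


def sampled_traceparent(active: Optional[Tuple[int, int, int]] = None) -> str:
    """Build a sampled traceparent value.

    active is a hypothetical (trace_id, span_id, trace_flags) tuple for the
    current span, or None when no span is active.
    """
    if active is not None:
        trace_id, span_id, flags = active
        if trace_id and span_id and flags & SAMPLED_FLAG:
            # Forward the existing context; only the sampled bit is kept.
            return f"00-{trace_id:032x}-{span_id:016x}-01"
    # No usable sampled context: start a fresh sampled trace.
    return f"00-{_random_hex(16)}-{_random_hex(8)}-01"


def inject_traceparent(
    path: str,
    headers: Dict[str, str],
    active: Optional[Tuple[int, int, int]] = None,
) -> Dict[str, str]:
    """Return headers with a sampled traceparent added on /execute paths."""
    if not path.endswith("/execute"):
        return headers  # no-op on other paths
    if any(name.lower() == "traceparent" for name in headers):
        return headers  # never overwrite an explicit header
    return {**headers, "traceparent": sampled_traceparent(active)}


# The path below is illustrative only.
print(inject_traceparent("/v1/workflows/example/execute", {}))
```

Returning a new dict rather than mutating the input keeps the sketch easy to test; a real hook would set the header on the outgoing request object instead.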

Testing

9 unit tests covering: no-op on non-execute paths, header injection, explicit header preservation, OTEL context propagation, fallback for unsampled/absent spans, and ID uniqueness. All pass locally.

Fixes: WFL-927
Slack: https://mistralai.slack.com/archives/C09BLDVF57C/p1774892554307479?thread_ts=1774890508.653139&cid=C09BLDVF57C

Without a traceparent header, external workers inherit the API's HTTP span
context via Temporal. In production the API's spans are frequently unsampled
(ParentBasedTraceIdRatio), so workers produce no-op spans and traces never
reach the collector — /trace/otel returns WF_1500.

Adds a BeforeRequestHook that fires on any /execute path and injects a
sampled W3C traceparent: forwarding the active OTEL span if it is already
sampled, otherwise generating a fresh sampled one. An explicitly set
traceparent header is never overwritten.

Covers: no-op on non-execute paths, sampled header injection, explicit
header preservation, OTEL context propagation, fallback for unsampled
or absent spans, and uniqueness of generated IDs.

- Shorten TraceparentInjectionHook docstring to one line
- Remove module docstring from test file
- Drop low-ROI uniqueness test

Remove unused opentelemetry.trace import (ruff F401) and add
isinstance assertions so pyright can narrow Union[Request, Exception]
before accessing .headers.
@arnaudde arnaudde marked this pull request as ready for review March 31, 2026 07:43
Matching on request.url.path.endswith("/execute") would affect any
future endpoint that happens to share that suffix. Keying on the
operation ID is explicit and safe.
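
The difference between the two matching strategies can be illustrated with a small sketch. The operation IDs and paths here are hypothetical placeholders, not the SDK's real identifiers.

```python
# Hypothetical operation ID for the workflow execute call.
EXECUTE_OPERATION_IDS = frozenset({"hypothetical_execute_workflow"})


def matches_by_suffix(path: str) -> bool:
    """Old behavior: any path ending in /execute is affected."""
    return path.endswith("/execute")


def matches_by_operation(operation_id: str) -> bool:
    """New behavior: only explicitly listed operations are affected."""
    return operation_id in EXECUTE_OPERATION_IDS


# A future, unrelated endpoint that happens to share the suffix.
other_path = "/v1/reports/example/execute"
other_operation = "hypothetical_execute_report"

print(matches_by_suffix(other_path))          # suffix match catches it
print(matches_by_operation(other_operation))  # operation ID match does not
```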
@arnaudde arnaudde requested a review from mistralai-nfau March 31, 2026 07:59
@arnaudde arnaudde merged commit 3336a78 into main Mar 31, 2026
6 checks passed