Skip to content

fix: measure latencySecond with monotonic high-resolution clock#322

Open
cre8ivejp wants to merge 1 commit intomainfrom
fix-invalid-duration
Open

fix: measure latencySecond with monotonic high-resolution clock#322
cre8ivejp wants to merge 1 commit intomainfrom
fix-invalid-duration

Conversation

@cre8ivejp
Copy link
Copy Markdown
Member

Summary

Replace Date.now()-based latency measurement in ApiClient.getEvaluations with performance.now() so that sub-millisecond network responses no longer report latencySecond: 0 and get rejected by the backend with duration is nil and latencySecond is 0: gateway: metrics event has invalid duration.

Why

The JS SDK measured latency as:

const start = Date.now();
const res = await postInternal(...);
const finish = Date.now();
return { type: 'success', seconds: (finish - start) / 1000, ... };

Date.now() is wall-clock and integer-millisecond resolution, so any operation that completes in less than 1 ms produces seconds: 0, the SDK ships latencySecond: 0, and the backend correctly flags it as invalid (proto3 double has no field-presence, so 0 is indistinguishable from "unset"):

// pkg/api/api/metrics_event.go
if ev.Duration == nil && ev.LatencySecond == 0 {
    return fmt.Errorf("duration is nil and latencySecond is 0: %w", MetricsSaveErrInvalidDuration)
}

The same root cause has already been fixed in the Node and Android SDKs in their respective repos. A 100 000-iteration sweep confirmed Date.now() - Date.now() returning 0 ms in 99 997 / 100 000 consecutive calls. In production we observed latencySecond: 0 events for sourceId=JAVASCRIPT flowing through the same code path.

This PR brings the JS SDK in line with the iOS / Go SDKs, both of which already use higher-resolution monotonic timers.

What changed

  • src/internal/utils/time.ts (new): add latencyStartMillis() and latencySecondsSince(start) backed by performance.now() (W3C High Resolution Time spec). The helper works in every target this SDK builds for — browser, Node 16+, and React Native — without runtime branching, because performance.now() is part of the global API in all three.
  • src/internal/remote/ApiClient.ts::getEvaluations: switch the network-call timer to the new helpers and pass the result directly to seconds.
  • test/internal/utils/time.spec.ts (new): 5 unit tests, including a regression test that asserts latencySecondsSince(...) > 0 for an awaited microtask across 100 iterations (the exact pattern the SDK runs around await postInternal(...)), plus a sub-millisecond-resolution test that the previous Date.now() clock would have been physically incapable of satisfying.

Clock (used for event timestamps, not latency) is intentionally untouched — its consumers want wall-clock seconds, not monotonic time.

No backend, proto, or public-API changes are required.

Test plan

  • pnpm test:node --run — 327/327 passing, including the 5 new internal/utils/time tests.
  • pnpm test:browser --run — 327/327 passing.
  • pnpm typecheck:lib and pnpm typecheck:test — both clean.
  • End-to-end runtime evidence: the existing BKTClient integration tests, which exercise the real ApiClient -> EventInteractor -> register_events pipeline, now ship payloads like "latencySecond": 0.0005245419999999968 (~525 µs) where they previously would have shipped "latencySecond": 0. Verified by inspecting MSW request bodies during the test run.

Signed-off-by: Alessandro Yuichi Okimoto <yuichijpn@gmail.com>
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Updates the SDK’s evaluation-fetch latency measurement to use a monotonic, high-resolution clock so very fast responses no longer emit latencySecond: 0 and get rejected by the backend.

Changes:

  • Added latencyStartMillis() / latencySecondsSince() helpers backed by performance.now().
  • Switched ApiClientImpl.getEvaluations latency timing from Date.now() to the new helpers.
  • Added unit tests covering the new latency helpers and the original latencySecond is 0 regression.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

File Description
src/internal/utils/time.ts Introduces performance.now()-based helpers for latency measurement.
src/internal/remote/ApiClient.ts Uses the new helpers to compute seconds for evaluation fetch latency.
test/internal/utils/time.spec.ts Adds tests intended to validate non-zero/sub-ms latency measurement behavior.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 59 to +75
@@ -71,7 +72,7 @@ export class ApiClientImpl implements ApiClient {
return {
type: 'success',
sizeByte: contentLength,
seconds: (finish - start) / 1000,
seconds,
Comment on lines +70 to +88
test('latencySecondsSince matches a parallel performance.now() reading', () => {
// Sanity: the helper actually divides the performance.now() diff by
// 1000. Compute the same interval independently and confirm the
// helper's value is consistent with it.
const start = latencyStartMillis()
const perfStart = performance.now()
// tiny burn of work so the interval is non-zero
let acc = 0
for (let i = 0; i < 1000; i++) {
acc += Math.sqrt(i)
}
expect(acc).toBeGreaterThan(0)
const second = latencySecondsSince(start)
const perfDiffSec = (performance.now() - perfStart) / 1000
expect(second).toBeGreaterThan(0)
// helper measured BEFORE the second performance.now() read, so it
// must be <= the independently-computed value (with a small ε for
// jitter).
expect(second).toBeLessThanOrEqual(perfDiffSec + 1e-6)
Comment on lines +32 to +67
test('latencySecondsSince > 0 for an awaited microtask (regression for "latencySecond is 0")', async () => {
// The most realistic SDK scenario: a fetch call that resolves quickly.
// `await` schedules a microtask, which always takes >> 1ns. With the
// old `Date.now()` clock this measurement routinely came back 0; with
// `performance.now()` it must be strictly > 0 every time.
for (let i = 0; i < 100; i++) {
const start = latencyStartMillis()
await Promise.resolve()
const second = latencySecondsSince(start)
expect(
second,
`iteration ${i}: expected latencySecondsSince > 0 for an awaited microtask, got ${second}`,
).toBeGreaterThan(0)
}
})

test('latencySecondsSince has sub-millisecond resolution (proves the fix)', async () => {
// The pre-fix `Date.now()` timer has 1ms granularity, so this assertion
// would have been impossible to satisfy. Show that the new helper can
// measure intervals smaller than 1 millisecond. We sample several
// microtasks; at least one should come back below 1ms on any
// reasonable hardware.
let sawSubMs = false
for (let i = 0; i < 50; i++) {
const start = latencyStartMillis()
await Promise.resolve()
const second = latencySecondsSince(start)
if (second > 0 && second < 0.001) {
sawSubMs = true
break
}
}
expect(
sawSubMs,
'expected at least one awaited microtask to measure < 1ms with the new helper',
).toBe(true)
Copy link
Copy Markdown
Collaborator

@duyhungtnn duyhungtnn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cre8ivejp it looks great.

after merge this PR, I'd like to merge this following PR
https://github.com/bucketeer-io/javascript-client-sdk/pull/325/changes

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants