Skip to content

Sync wait for streaming#5261

Open
pbarejko wants to merge 2 commits intoisaac-sim:developfrom
pbarejko:pbarejko/sync-wait-for-streaming
Open

Sync wait for streaming#5261
pbarejko wants to merge 2 commits intoisaac-sim:developfrom
pbarejko:pbarejko/sync-wait-for-streaming

Conversation

@pbarejko
Copy link
Copy Markdown

Description

Please include a summary of the change and which issue is fixed. Please also include relevant motivation and context.
List any dependencies that are required for this change.

Fixes # (issue)

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (existing functionality will not work without user modification)
  • Documentation update

Screenshots

Please attach before and after screenshots of the change if applicable.

Checklist

  • I have read and understood the contribution guidelines
  • I have run the pre-commit checks with ./isaaclab.sh --format
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • I have updated the changelog and the corresponding version in the extension's config/extension.toml file
  • I have added my name to the CONTRIBUTORS.md or my name already exists there

@github-actions github-actions bot added the isaac-lab Related to Isaac Lab team label Apr 14, 2026
Copy link
Copy Markdown

@isaaclab-review-bot isaaclab-review-bot bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 Isaac Lab Review Bot

Summary

This PR replaces the asynchronous event-callback mechanism for tracking RTX streaming status with a synchronous polling approach via omni.usd.UsdContext.get_stage_loading_status() / get_stage_streaming_status(). The change eliminates a race condition where the _streaming_is_busy flag could be stale relative to the latest app.update() tick, and removes five module-level mutable globals plus two helper functions. Tests are fully rewritten to match the new approach.

Design Assessment

Design is sound. The old approach subscribed to omni.streamingstatus:streaming_status events via the carb event dispatcher, then checked a _streaming_is_busy flag — inherently racy because event delivery is asynchronous and the flag could be stale at the point of inspection. The new approach queries omni.usd directly, so the result is always consistent with the most recent app.update() tick. This is simpler, more reliable, and eliminates all the subscription lifecycle management (subscribe, reset on new SimulationContext, idempotency guards).

The scope of additional status checking (get_stage_loading_status in addition to streaming) is a reasonable broadening — waiting for both file loading and texture streaming before capturing a frame ensures downstream annotators see fully-resolved content.

Findings

🔵 Suggestion: Minor behavioral delta in the idle pathisaac_rtx_renderer_utils.py:51-52

The old _wait_for_streaming_complete always issued a final app.update() even when not busy (the function was only called when _streaming_is_busy was True, but the final app.update() ran unconditionally at the end). The new version returns early on idle with no app.update(). This is correct since ensure_isaac_rtx_render_update already pumps app.update() immediately before calling this function, so there is no missing frame — but worth calling out as an intentional behavioral change.

🔵 Suggestion: Consider a debug-level log when streaming wait is enteredisaac_rtx_renderer_utils.py:55

Currently the function only logs on timeout (warning) or non-trivial completion (info). A logger.debug("Waiting for RTX streaming to complete...") at entry would help during development/debugging without adding noise at default log levels.

Test Coverage

The tests are well-structured and cover the important paths:

  • TestIsStageLoadingOrStreaming: idle, loading, streaming, and short-circuit behavior — good coverage of the new polling function.
  • TestWaitForStreamingComplete: immediate idle, pump-until-idle, timeout, warning/info logging — covers all branches.
  • The _patch_busy helper with sequence-based mocking is a clean pattern for controlling multi-call behavior.
  • The _mock_omni_usd fixture properly handles module injection and teardown.

No critical test gaps identified. The MOCK_QUERIES_BEFORE_IDLE constant (line 28) is defined but unused — it appears to be a leftover from the old MOCK_ITERATIONS_BEFORE_IDLE.

CI Status

  • pre-commit: ✅ pass
  • Check for Broken Links: ✅ pass
  • Detect Changes: ✅ pass
  • license-check / Build: ⏳ pending

Verdict

Ship it — Clean simplification that fixes a real race condition. The synchronous polling approach is more reliable, the code is simpler (net -27 lines of production code, 5 fewer globals), and the tests adequately cover the new behavior. The one unused constant in tests is trivial.

# how many app.update() iterations before the mock becomes idle
MOCK_ITERATIONS_BEFORE_IDLE = 3
MOCK_QUERIES_BEFORE_IDLE = 3

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔵 Nit: MOCK_QUERIES_BEFORE_IDLE is defined but never referenced in any test. Looks like a leftover from the old MOCK_ITERATIONS_BEFORE_IDLE. Consider removing it to avoid confusion.

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps bot commented Apr 14, 2026

Greptile Summary

This PR adds _is_stage_loading_or_streaming() and _wait_for_streaming_complete() helpers to isaac_rtx_renderer_utils.py and wires the wait into ensure_isaac_rtx_render_update() so that additional app.update() calls are pumped until RTX texture/geometry streaming is idle (or a 30 s timeout fires) before proceeding.

  • Missing CHANGELOG entry: source/isaaclab_physx/docs/CHANGELOG.rst was not updated and config/extension.toml was not version-bumped. Per project guidelines, every change to source/ requires a new versioned CHANGELOG entry (never appending to an existing released version). A new 0.5.17 (or higher) entry with the streaming-wait fix should be added.

Confidence Score: 4/5

  • Safe to merge after addressing the missing CHANGELOG/version-bump; the streaming-wait logic and tests are sound.
  • The two new helper functions are well-tested and the wait loop is correct given the (0,0)-idle contract. The only blocking concern per project guidelines is the absent CHANGELOG entry and missing version bump in extension.toml — both are required for every change touching source/. Remaining findings are P2 style/documentation improvements that do not block correctness.
  • source/isaaclab_physx/docs/CHANGELOG.rst and source/isaaclab_physx/config/extension.toml — both need a new version entry for this change.

Important Files Changed

Filename Overview
source/isaaclab_physx/isaaclab_physx/renderers/isaac_rtx_renderer_utils.py Adds _is_stage_loading_or_streaming() and _wait_for_streaming_complete() helpers, then calls the wait after the initial app.update() in ensure_isaac_rtx_render_update(). Logic looks sound for the (0,0)-idle API contract, but the final app.update() on timeout is inconsistent with its docstring and may be confusing to future readers.
source/isaaclab_physx/test/renderers/test_isaac_rtx_renderer_utils.py New unit-test suite covering idle/loading/streaming detection and wait-loop behaviour; uses careful mock teardown for omni.usd and correctly verifies call counts, timeout exit, and log emission.

Sequence Diagram

sequenceDiagram
    participant Cam as Camera/TiledCamera
    participant Utils as ensure_isaac_rtx_render_update()
    participant Wait as _wait_for_streaming_complete()
    participant USD as omni.usd.UsdContext
    participant App as omni.kit.app

    Cam->>Utils: call (per physics step)
    Utils->>Utils: check dedup key (id(sim), step_count)
    alt already pumped this step
        Utils-->>Cam: return (no-op)
    else first call this step
        Utils->>App: "set playSimulations=False"
        Utils->>App: app.update()  (initial pump)
        Utils->>Wait: _wait_for_streaming_complete()
        Wait->>USD: _is_stage_loading_or_streaming()
        alt idle from the start
            Wait-->>Utils: return immediately
        else streaming/loading in progress
            loop until idle or timeout (30 s)
                Wait->>App: app.update()
                Wait->>USD: _is_stage_loading_or_streaming()
            end
            Wait->>App: app.update()  (final flush)
            Wait-->>Utils: return
        end
        Utils->>App: "set playSimulations=True"
        Utils->>Utils: store dedup key
        Utils-->>Cam: return
    end
Loading

Reviews (1): Last reviewed commit: "Reformat" | Re-trigger Greptile

Comment on lines +63 to +71
if _is_stage_loading_or_streaming():
logger.warning(
"RTX streaming did not complete within %.1f s proceeding anyway.",
"RTX streaming did not complete within %.1f s -- proceeding anyway.",
_STREAMING_WAIT_TIMEOUT_S,
)
elif elapsed > 0.01:
logger.info("RTX streaming completed in %.2f s.", elapsed)

omni.kit.app.get_app().update()
app.update()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Final app.update() called on timeout, contradicting docstring

The docstring says the trailing app.update() is issued "after streaming finishes", implying it is a post-completion flush. However, the call is unconditional — it fires even when the loop exited due to timeout (streaming still busy). While the extra pump is harmless and may even be desirable in that case, the mismatch between the documented intent and the actual control-flow can confuse future maintainers.

Consider adding a guard or clarifying the docstring:

Suggested change
if _is_stage_loading_or_streaming():
logger.warning(
"RTX streaming did not complete within %.1f s – proceeding anyway.",
"RTX streaming did not complete within %.1f s -- proceeding anyway.",
_STREAMING_WAIT_TIMEOUT_S,
)
elif elapsed > 0.01:
logger.info("RTX streaming completed in %.2f s.", elapsed)
omni.kit.app.get_app().update()
app.update()
elapsed = time.monotonic() - start
if _is_stage_loading_or_streaming():
logger.warning(
"RTX streaming did not complete within %.1f s -- proceeding anyway.",
_STREAMING_WAIT_TIMEOUT_S,
)
elif elapsed > 0.01:
logger.info("RTX streaming completed in %.2f s.", elapsed)
# Issue one final pump regardless of whether streaming completed or timed out,
# so annotators get the latest rendered state.
app.update()

Comment on lines +36 to +39
_, files_loaded, total_files = usd_context.get_stage_loading_status()
if files_loaded or total_files:
return True
return bool(usd_context.get_stage_streaming_status())
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Loading-complete state may not be detected correctly

The check if files_loaded or total_files: return True returns True (busy) whenever either count is non-zero — including the case where files_loaded == total_files > 0, which could mean loading has just finished rather than being still in progress. The test suite only exercises the (0, 0) idle state, so if the Omniverse API reports (N, N) on completion rather than resetting to (0, 0), the helper will treat a completed load as still-in-progress and spin until timeout.

It is worth confirming the exact contract of omni.usd.UsdContext.get_stage_loading_status() — specifically whether it resets both counts to zero once all files are loaded — and adding a comment or test case documenting that assumption.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

isaac-lab Related to Isaac Lab team

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant