feat: align perf_tracer with task hierarchy #569
Conversation
**Summary of Changes (Gemini Code Assist):** This pull request significantly refactors the performance tracing system to provide a more structured and insightful view of asynchronous workflows. By introducing a task-scoped session hierarchy, streamlining event tracking, and enhancing client-side limit enforcement, the changes aim to capture clearer lifecycle metrics and improve diagnostic capabilities. The addition of a dedicated visualization tool further empowers users to analyze performance data effectively, ensuring that rollout diagnostics are actionable and precise.
Code Review
This pull request is a significant and well-executed refactoring of the performance tracing system to align with a new task/session hierarchy. The changes introduce task-scoped session tracing, simplify the event lifecycle, and add powerful visualization tools. The code is well-structured, and the documentation has been updated thoroughly to reflect these changes. I have one piece of feedback regarding a potential loss of diagnostic information in an exception handling path. Overall, this is a high-quality contribution that greatly improves the observability of the system.
Pull Request Overview
This PR introduces a hierarchical task-session tracing model to improve performance diagnostics for distributed RL training. It refactors the session tracking system to align with the task hierarchy, where each dataset-level task can spawn multiple sample-level sessions (when n_samples > 1).
Key changes:
- Introduces a task-session hierarchy with separate `task_id` (dataset-level) and `session_id` (sample-level) tracking
- Refactors session lifecycle events from the multi-stage model (enqueued → execution_start → execution_end → consumed) to a simplified model with `mark_finalized`
- Adds `toolcall` phase tracking alongside the existing `generate` and `reward` phases
- Includes a new visualization tool (`plot_session_trace.py`) for analyzing session traces
- Improves OpenAI client token limit handling
Reviewed Changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| docs/best_practices/perf_profiling.md | Updates documentation to reflect task-session hierarchy and new event model with code examples |
| areal/workflow/vision_rlvr.py | Refactors to use per-sample session registration and tracing within _collect_samples method |
| areal/workflow/rlvr.py | Similar refactoring to vision_rlvr.py for task-session hierarchy support |
| areal/utils/perf_tracer.py | Core tracer refactoring with task/session hierarchy, simplified event model, and context variable support |
| areal/tools/plot_session_trace.py | New visualization tool (1360 lines) for generating interactive HTML reports from session traces |
| areal/tools/perf_trace_converter.py | Adds intelligent output path inference based on input type (file/directory/glob) |
| areal/tests/test_perf_tracer.py | Updates tests to use new task-session registration API and FINALIZED event |
| areal/tests/test_perf_trace_converter.py | Adds tests for new output path inference behavior |
| areal/experimental/openai/client.py | Fixes max_tokens handling to avoid None type errors and adds default fallback |
| areal/core/workflow_executor.py | Updates to register tasks instead of sessions and use mark_finalized event |
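The `max_tokens` fix in `areal/experimental/openai/client.py` can be illustrated with a small sketch. The function name, the default value, and the clamping rule below are assumptions; the PR only states that `None` type errors are avoided and a default fallback is added:

```python
# Hypothetical None-safe token-budget resolution: never compare or
# clamp against None, and fall back to a default when nothing is set.
DEFAULT_MAX_TOKENS = 512  # assumed fallback value


def resolve_max_tokens(requested=None, server_limit=None):
    """Resolve an effective max_tokens from an optional request value
    and an optional server-side limit."""
    if requested is None and server_limit is None:
        return DEFAULT_MAX_TOKENS
    if requested is None:
        return server_limit
    if server_limit is None:
        return requested
    # Both present: honor the stricter of the two limits.
    return min(requested, server_limit)
```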
```python
import plotly.graph_objects as go
from plotly.colors import qualitative
from plotly.subplots import make_subplots
```
Maybe we should add plotly to pyproject.toml?
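If plotly ends up as an optional dependency instead, a guarded import at the top of `plot_session_trace.py` is one common pattern; this is a sketch, not code from the PR:

```python
# Sketch: fail with a clear message when the optional plotly
# dependency is not installed, instead of a raw ImportError.
def import_plotly():
    try:
        import plotly.graph_objects as go
        from plotly.subplots import make_subplots
    except ImportError as exc:
        raise SystemExit(
            "plot_session_trace.py requires plotly; "
            "install it with `pip install plotly`."
        ) from exc
    return go, make_subplots
```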
Description
Introduces task-scoped session tracing to retire direct session plumbing and capture clearer lifecycle metrics and reasons. Infers default trace outputs, hardens OpenAI client limits, and adds visualization tooling to keep rollout diagnostics actionable.
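The "infers default trace outputs" part (output path inference in `perf_trace_converter.py`, keyed on whether the input is a file, directory, or glob) might look roughly like the following. The output filenames and exact rules here are assumptions for illustration:

```python
# Hypothetical output-path inference for a trace converter:
# glob pattern -> merged file beside the pattern's directory,
# directory    -> merged file inside it,
# single file  -> sibling file with a derived suffix.
from pathlib import Path


def infer_output_path(input_spec: str) -> Path:
    p = Path(input_spec)
    if any(ch in input_spec for ch in "*?["):
        # Glob pattern: merge matches into one file next to them.
        return p.parent / "trace_merged.json"
    if p.is_dir():
        # Directory input: place the merged trace inside it.
        return p / "trace_merged.json"
    # Single file: derive the output name from the input file.
    return p.with_suffix(".trace.json")
```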