
refactor: restrict the usage scope of the rollout_batch method#567

Merged
nuzant merged 2 commits into main from fw/rm-rollout-batch on Nov 14, 2025

Conversation

@garrett4wade (Collaborator)

Description

rollout_batch provides a synchronous rollout method that is convenient for debugging and writing tests. However, it is incompatible with dynamic filtering: since this method doesn't actively submit new rollouts while waiting, it will hang indefinitely if any requests are filtered out during the wait. Therefore, this method should only be used for debugging and testing, not in production experiments.
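The failure mode above can be made concrete with a small self-contained sketch. The function names below are illustrative stand-ins, not the AReaL API: the point is only that a fixed-wait collector stalls once any result is filtered, while a resubmitting collector does not.

```python
# Hypothetical sketch (names are illustrative, not the AReaL API): why a
# synchronous batch wait hangs under dynamic filtering, and why
# resubmission avoids it.
import itertools

def collect_sync(batch_size, in_flight, accept):
    """rollout_batch-style: wait on a fixed set of in-flight requests,
    never submitting replacements for filtered-out results."""
    accepted = [r for r in in_flight if accept(r)]
    # If anything was filtered, fewer than batch_size results can ever
    # arrive, so the real synchronous wait would block forever.
    would_hang = len(accepted) < batch_size
    return accepted, would_hang

def collect_with_resubmit(batch_size, request_stream, accept):
    """prepare_batch-style: keep pulling new rollouts until enough pass
    the filter, so filtered requests cannot stall the batch."""
    accepted = []
    for r in request_stream:
        if accept(r):
            accepted.append(r)
            if len(accepted) == batch_size:
                return accepted

keep_even = lambda r: r % 2 == 0
_, hangs = collect_sync(8, range(8), keep_even)
print(hangs)  # True: the synchronous wait can never complete
print(len(collect_with_resubmit(8, itertools.count(), keep_even)))  # 8
```

With a filter that rejects half the results, the fixed-set collector ends up short of a full batch with no way to recover, which is exactly the indefinite hang described above.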

This PR removes the async_training configuration option from TrainEngine and enforces the use of prepare_batch in scripts. Documentation added in #558 explains how to achieve synchronous training behavior by setting the CLI config rollout.max_head_offpolicyness.
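For reference, synchronous behavior is recovered through the CLI override rather than a code path. The entry-point script below is a placeholder; only the `rollout.max_head_offpolicyness` key comes from the AReaL docs referenced above.

```shell
# Illustrative invocation; the script name and any other flags are
# placeholders. Setting max_head_offpolicyness to 0 keeps rollout
# generation fully in step with training, recovering synchronous
# behavior without relying on rollout_batch.
python training_script.py rollout.max_head_offpolicyness=0
```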

Type of Change

  • Bug fix (non-breaking change that fixes an issue)
  • New feature (non-breaking change that adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not
    work as expected)
  • Documentation update
  • Code refactoring (no functional changes)
  • Performance improvement
  • Test coverage improvement

Checklist

  • I have read the Contributing Guide
  • I have run formatting tools (pre-commit or manual)
  • I have run relevant unit tests and they pass
  • I have added tests for new functionality
  • I have updated documentation if needed
  • My branch is up to date with main
  • This PR introduces breaking changes (if yes, fill out details below)
  • If this PR changes documentation, I have built and previewed it locally with
    jb build docs
  • No critical issues raised by AI reviewers (/gemini review)

Breaking Change Details (if applicable):

The rollout_batch method no longer supports the should_accept_fn argument. Since passing it could cause an indefinite hang, this should be classified as a bug fix rather than a breaking change.
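A migration sketch for call sites that previously passed the filter to rollout_batch. RolloutStub and these signatures are stand-ins, not the real AReaL engine API; only the method names rollout_batch / prepare_batch and the should_accept_fn parameter come from this PR.

```python
# Illustrative migration sketch, not the AReaL implementation.

class RolloutStub:
    def rollout_batch(self, data):
        # Debug/test path: synchronous, no filtering support anymore.
        return list(data)

    def prepare_batch(self, data, should_accept_fn=None):
        # Training path: filtering lives here, where rejected samples can
        # be replaced by newly submitted rollouts instead of hanging.
        if should_accept_fn is None:
            return list(data)
        return [d for d in data if should_accept_fn(d)]

engine = RolloutStub()
# Before this PR: engine.rollout_batch(data, should_accept_fn=keep)
# After: pass the filter to prepare_batch instead.
batch = engine.prepare_batch(range(4), should_accept_fn=lambda x: x < 2)
print(batch)  # [0, 1]
```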

@gemini-code-assist (Contributor)

Summary of Changes

Hello @garrett4wade, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request refactors the rollout mechanism to restrict the usage of the synchronous rollout_batch method, which was prone to issues with dynamic filtering in production. The changes enforce the use of prepare_batch for all training scripts, ensuring compatibility with asynchronous operations and dynamic filtering. This also involved removing the now redundant async_training configuration option and updating all relevant documentation and examples to guide users towards the correct and robust approach for data collection.

Highlights

  • Deprecation of rollout_batch for Production: The rollout_batch method is now explicitly marked for debugging and testing only, as it is synchronous and incompatible with dynamic filtering, which can lead to indefinite hangs in production experiments.
  • Enforced prepare_batch Usage: The prepare_batch method is now the standard for asynchronous rollout and dynamic filtering, and its usage is enforced across various training scripts and documentation examples.
  • Removal of async_training Configuration: The async_training configuration option has been removed from TrainEngine and related YAML files, as the prepare_batch method inherently handles asynchronous behavior.
  • Updated Documentation and Examples: Extensive updates have been made to the documentation, including the README.md, debugging.md, cli_reference.md, and customization guides, to reflect the changes and provide clear guidance on using prepare_batch and configuring synchronous training via rollout.max_head_offpolicyness.
  • Refined AReaL Architecture Description: The README.md has been significantly refactored to clarify AReaL's 'algorithm-first' design philosophy, component overview, and the roles of various layers, including new backend engines and customization components.

@gemini-code-assist (Bot) left a comment
Code Review

This pull request is a significant and well-executed refactoring of the data rollout mechanism. It restricts the usage of the synchronous rollout_batch method to debugging and testing, and enforces the use of the asynchronous prepare_batch method in all training scripts. This is achieved by removing the async_training configuration option and updating all example scripts and documentation accordingly. The should_accept_fn parameter has been correctly removed from rollout_batch to prevent it from hanging when used with dynamic filtering. The changes are consistent and thorough across the codebase. The accompanying documentation updates, especially in areal/README.md and docs/lite/gsm8k_grpo.md, are excellent and greatly improve the clarity of the library's architecture and usage. I have a couple of minor suggestions to further improve the clarity of the debugging documentation.

Comment thread docs/best_practices/debugging.md Outdated
Comment thread docs/best_practices/debugging.md
…f rollout_batch in training scripts; remove the async_training option
@garrett4wade (Collaborator, Author)

/gemini review

@gemini-code-assist (Bot) left a comment

Code Review

This pull request effectively refactors the usage of rollout_batch and prepare_batch, restricting rollout_batch to debugging and testing scenarios and removing the async_training configuration. The changes are consistently applied across the codebase, including API definitions, training scripts, documentation, and tests. This simplification improves the API's clarity and enforces the recommended asynchronous training path.

I've included a couple of suggestions to further improve code clarity by removing a redundant argument in prepare_batch calls. These are minor but would make the example scripts cleaner.

Comment thread areal/experimental/trainer/rl.py
Comment thread examples/camel/train.py
@garrett4wade garrett4wade added the safe-to-test Ready to run unit-tests in a PR. label Nov 13, 2025
@garrett4wade garrett4wade changed the title from "[wip] refactor: restrict the usage scope of the rollout_batch method" to "refactor: restrict the usage scope of the rollout_batch method" Nov 13, 2025
@garrett4wade garrett4wade added safe-to-test Ready to run unit-tests in a PR. and removed safe-to-test Ready to run unit-tests in a PR. labels Nov 13, 2025
@nuzant nuzant merged commit 63b046c into main Nov 14, 2025
4 checks passed
@nuzant nuzant deleted the fw/rm-rollout-batch branch November 14, 2025 02:55
Bruce-rl-hw pushed a commit to Bruce-rl-hw/AReaL-vllm that referenced this pull request Dec 4, 2025
…lusionAI#567)

* remove should_accept_fn argument in rollout_batch; remove the usage of rollout_batch in training scripts; remove the async_training option

* fix test
leandermaben pushed a commit to leandermaben/AReaL that referenced this pull request Mar 24, 2026
…lusionAI#567)

* remove should_accept_fn argument in rollout_batch; remove the usage of rollout_batch in training scripts; remove the async_training option

* fix test

Labels

safe-to-test Ready to run unit-tests in a PR.
