The implementation of RolloutController #469
Replies: 3 comments
In general LGTM. However, I have two questions about detailed design and implementation:
1. LocalInferenceEngine Architecture
The
Implementing a
2. RolloutController Implementation
What is the role of
Implementation Plan Update
About
Overview
This plan implements the RolloutController following the single-controller architecture design. The implementation is broken down into 4 major phases.
Phase 1: Implement Local SGLang/vLLM Engine
Step 1.1: Create LocalSGLangEngine class
File: areal/engine/sglang_local.py
Create a local inference engine that runs SGLang within the same process:
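A minimal structural sketch of what such an in-process engine wrapper could look like. Everything here is an assumption made for illustration: the method names (initialize, agenerate, destroy), the backend_factory parameter, and FakeBackend, which is a stand-in and does not reflect the actual SGLang API. In a real implementation the factory would construct the SGLang runtime instead.

```python
import asyncio
from typing import Any, Callable, Dict, Optional


class FakeBackend:
    """Stand-in for an in-process inference runtime (NOT the SGLang API)."""

    async def async_generate(self, prompt: str, sampling_params: Dict[str, Any]) -> str:
        await asyncio.sleep(0)  # yield to the event loop, as a real backend would
        return f"echo:{prompt}"

    def shutdown(self) -> None:
        pass


class LocalSGLangEngine:
    """Hypothetical wrapper that owns an inference backend inside this process."""

    def __init__(self, backend_factory: Callable[[], Any]) -> None:
        self._backend_factory = backend_factory
        self._backend: Optional[Any] = None

    def initialize(self) -> None:
        # Build the backend in the current process instead of launching a server.
        self._backend = self._backend_factory()

    async def agenerate(
        self, prompt: str, sampling_params: Optional[Dict[str, Any]] = None
    ) -> str:
        if self._backend is None:
            raise RuntimeError("initialize() must be called before agenerate()")
        return await self._backend.async_generate(prompt, sampling_params or {})

    def destroy(self) -> None:
        # Release the backend so GPU memory could be reclaimed in a real engine.
        if self._backend is not None:
            self._backend.shutdown()
            self._backend = None


engine = LocalSGLangEngine(FakeBackend)
engine.initialize()
print(asyncio.run(engine.agenerate("hello")))  # prints "echo:hello"
engine.destroy()
```

Injecting the backend through a factory keeps the wrapper testable without GPUs; the same shape would apply to a vLLM-backed variant.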
Notes:
- LocalSGLangEngine may occupy multiple GPUs or even span multiple nodes.
- areal/experimental/sglang_engine.py, which requires significant refactoring before use.
Step 1.2: Create LocalvLLMEngine class
File: areal/engine/vllm_local.py
Similar structure to LocalSGLangEngine but using vLLM's AsyncLLMEngine.
Phase 2: Refactor WorkflowExecutor to Isolate Async Thread Logic
Step 2.1: Create generic AsyncTaskRunner
File: areal/core/async_task_runner.py
Create a generic, reusable async task executor with NO AReaL-specific logic:
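A minimal sketch of a generic runner along these lines, assuming a design where a background thread owns an asyncio event loop and finished results flow back through a thread-safe queue. The method names (start, submit, wait, stop) and the timeout parameter are assumptions for illustration, not the interface from the plan.

```python
import asyncio
import queue
import threading
import time
from typing import Any, Awaitable, Callable, List, Tuple


class AsyncTaskRunner:
    """Runs submitted coroutines on an event loop owned by a background thread."""

    def __init__(self) -> None:
        self._loop = asyncio.new_event_loop()
        # Unbounded, thread-safe queue of (succeeded, value_or_exception) pairs.
        self._results: "queue.Queue[Tuple[bool, Any]]" = queue.Queue()
        self._thread = threading.Thread(target=self._run_loop, daemon=True)

    def _run_loop(self) -> None:
        asyncio.set_event_loop(self._loop)
        self._loop.run_forever()

    def start(self) -> None:
        self._thread.start()

    def submit(self, fn: Callable[..., Awaitable[Any]], *args: Any) -> None:
        """Schedule fn(*args) on the background loop; callable from any thread."""

        async def _run() -> None:
            try:
                self._results.put((True, await fn(*args)))
            except Exception as exc:  # hand the failure to whoever calls wait()
                self._results.put((False, exc))

        asyncio.run_coroutine_threadsafe(_run(), self._loop)

    def _requeue(self, values: List[Any]) -> None:
        for value in values:
            self._results.put((True, value))

    def wait(self, count: int, timeout: float = 10.0) -> List[Any]:
        """Block until `count` tasks succeed; re-raise the first failure seen."""
        deadline = time.monotonic() + timeout
        done: List[Any] = []
        while len(done) < count:
            remaining = max(deadline - time.monotonic(), 0.0)
            try:
                ok, value = self._results.get(timeout=remaining)
            except queue.Empty:
                self._requeue(done)  # keep finished results for a later wait()
                raise TimeoutError(f"only {len(done)} of {count} tasks finished") from None
            if not ok:
                self._requeue(done)
                raise value
            done.append(value)
        return done

    def stop(self) -> None:
        self._loop.call_soon_threadsafe(self._loop.stop)
        self._thread.join()
        self._loop.close()


async def square(x: int) -> int:
    await asyncio.sleep(0.01)
    return x * x


runner = AsyncTaskRunner()
runner.start()
for i in range(4):
    runner.submit(square, i)
print(sorted(runner.wait(count=4)))  # prints [0, 1, 4, 9]
runner.stop()
```

Nothing in the class refers to workflows, engines, or rollouts, so any domain-specific policy stays in the code that calls submit and wait.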
Step 2.2: Refactor WorkflowExecutor to use AsyncTaskRunner
File: areal/api/workflow_api.py
WorkflowExecutor composes AsyncTaskRunner and handles all AReaL-specific logic externally. Key changes:
Usage Examples:
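A hedged sketch of how this composition might look in use. Only the arun_episode(engine, data) call shape comes from the plan; the TaskRunner protocol, InlineRunner (a synchronous stand-in so the example runs on its own), EchoEngine, LengthFilterWorkflow, and the convention that a workflow returns None to reject a sample are all assumptions made for illustration.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, List, Optional, Protocol


class TaskRunner(Protocol):
    """The only interface the executor needs from a generic runner."""

    def submit(self, fn: Callable[..., Awaitable[Any]], *args: Any) -> None: ...
    def wait(self, count: int) -> List[Any]: ...


class InlineRunner:
    """Stand-in runner: executes each task immediately with asyncio.run."""

    def __init__(self) -> None:
        self._done: List[Any] = []

    def submit(self, fn: Callable[..., Awaitable[Any]], *args: Any) -> None:
        self._done.append(asyncio.run(fn(*args)))

    def wait(self, count: int) -> List[Any]:
        if count > len(self._done):
            raise RuntimeError("not enough finished tasks")
        out, self._done = self._done[:count], self._done[count:]
        return out


class WorkflowExecutor:
    """Holds the domain logic; delegates scheduling to the injected runner."""

    def __init__(self, engine: Any, runner: TaskRunner) -> None:
        self.engine = engine
        self.runner = runner
        self._pending = 0

    def submit(self, data: Dict[str, Any], workflow: Any) -> None:
        self.runner.submit(workflow.arun_episode, self.engine, data)
        self._pending += 1

    def wait(self, count: int) -> List[Dict[str, Any]]:
        accepted: List[Dict[str, Any]] = []
        while len(accepted) < count:
            if self._pending == 0:
                raise RuntimeError("no pending episodes left to satisfy wait()")
            (result,) = self.runner.wait(1)
            self._pending -= 1
            if result is not None:  # assumed convention: None means "rejected"
                accepted.append(result)
        return accepted


class EchoEngine:
    async def agenerate(self, prompt: str) -> str:
        return prompt.upper()


class LengthFilterWorkflow:
    async def arun_episode(self, engine: Any, data: Dict[str, Any]) -> Optional[Dict[str, Any]]:
        text = await engine.agenerate(data["prompt"])
        if len(text) <= 2:
            return None  # reject very short completions
        return {"prompt": data["prompt"], "completion": text}


executor = WorkflowExecutor(EchoEngine(), InlineRunner())
for prompt in ["hi", "hello", "world"]:
    executor.submit({"prompt": prompt}, LengthFilterWorkflow())
print([r["completion"] for r in executor.wait(count=2)])  # prints ['HELLO', 'WORLD']
```

Because the executor depends only on submit and wait, the stand-in runner could be swapped for a threaded one without touching the filtering logic.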
Phase 3: Extend RPCClient/RPCServer
async calls.
RPCServer called /exec_workflow, which instantiates a workflow object, runs workflow.arun_episode(self.engine, data), stores the produced data locally, and returns the metadata to the client.
Phase 4: Implement RolloutController
RolloutController is a pure composition of refactored components.
Step 4.1: Create RolloutController class
File: areal/controller/rollout_controller.py
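A minimal sketch of how a controller could be composed from such pieces, under stated assumptions. InProcessRPCServer is a stand-in that mimics the exec_workflow behaviour described in Phase 3 without any networking; the round-robin dispatch, the metadata layout, and the names FakeEngine, ReverseWorkflow, rollout_batch, and arollout_batch are hypothetical.

```python
import asyncio
import itertools
from typing import Any, Callable, Dict, List


class FakeEngine:
    async def agenerate(self, prompt: str) -> str:
        return prompt[::-1]


class ReverseWorkflow:
    async def arun_episode(self, engine: Any, data: Dict[str, Any]) -> Dict[str, Any]:
        completion = await engine.agenerate(data["prompt"])
        return {"prompt": data["prompt"], "completion": completion}


class InProcessRPCServer:
    """Stand-in for a worker-side RPC server (no real network transport)."""

    def __init__(self, name: str, engine: Any) -> None:
        self.name = name
        self.engine = engine
        self.store: Dict[str, Any] = {}

    async def exec_workflow(
        self, workflow_factory: Callable[[], Any], data: Dict[str, Any]
    ) -> Dict[str, str]:
        workflow = workflow_factory()  # instantiate the workflow on the worker
        result = await workflow.arun_episode(self.engine, data)
        key = f"{self.name}/{len(self.store)}"
        self.store[key] = result  # produced data stays local to the worker
        return {"worker": self.name, "key": key}  # only metadata goes back


class RolloutController:
    """Dispatches episodes to workers and gathers the returned metadata."""

    def __init__(self, servers: List[InProcessRPCServer]) -> None:
        self.servers = list(servers)

    async def arollout_batch(
        self, batch: List[Dict[str, Any]], workflow_factory: Callable[[], Any]
    ) -> List[Dict[str, str]]:
        targets = itertools.cycle(self.servers)  # assumed round-robin placement
        calls = [next(targets).exec_workflow(workflow_factory, d) for d in batch]
        return list(await asyncio.gather(*calls))

    def rollout_batch(
        self, batch: List[Dict[str, Any]], workflow_factory: Callable[[], Any]
    ) -> List[Dict[str, str]]:
        return asyncio.run(self.arollout_batch(batch, workflow_factory))


workers = [InProcessRPCServer("w0", FakeEngine()), InProcessRPCServer("w1", FakeEngine())]
controller = RolloutController(workers)
metadata = controller.rollout_batch(
    [{"prompt": p} for p in ["abc", "de", "f"]], ReverseWorkflow
)
print([m["worker"] for m in metadata])  # prints ['w0', 'w1', 'w0']
```

The controller holds no generation logic of its own; it only routes requests and tracks where each result lives, which matches the plan's description of it as a composition of the other components.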