[BUG] Boba GRPO with vllm training failed - AttributeError: 'GRPOConfig' object has no attribute 'weight_update_mode'. #482

@zhshgmail

Description

Checklist

  • [NA] The error occurs when using our provided Docker image.
  • [Checked] I can consistently reproduce the bug across multiple trials or random seeds.
  • [Checked] If the error causes experiment abortion, I've verified that this error is the root
    cause, not a secondary error caused by peer workers.

We use conda to prepare the environment instead of the provided Docker image, but the root cause is straightforward and not related to the environment.

Detailed Information

Describe the bug

The Boba GRPO scripts with vLLM in the examples folder fail because of a configuration type mismatch.

20251025-06:18:12.733 RayLauncher ERROR: Job trainer:3 failed with error: ray::run_func() (pid=3820227, ip=90.91.103.32)
  File "/home/z00637938/workspace/AReaL/areal/launcher/ray.py", line 65, in run_func
    return function(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/z00637938/workspace/AReaL/examples/math/boba_grpo.py", line 127, in main
    weight_update_meta = get_model_update_meta(config)
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/z00637938/workspace/AReaL/areal/utils/model.py", line 50, in get_model_update_meta
    if config.weight_update_mode == "disk":
       ^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'GRPOConfig' object has no attribute 'weight_update_mode'.

Root cause

examples.math.boba_grpo.main calls areal.utils.model.get_model_update_meta with config as a parameter. The type of config is areal.api.cli_args.GRPOConfig. However, get_model_update_meta() has no type hints and tries to access the attribute weight_update_mode directly on config. In fact, weight_update_mode only exists on config.actor, which is a PPOActorConfig (a subclass of TrainEngineConfig).

The process therefore fails with AttributeError because the attribute weight_update_mode cannot be found on GRPOConfig.

The suggested fix is to add type hints and access weight_update_mode via config.actor instead of config. The current implementation is:

def get_model_update_meta(config):
    if config.weight_update_mode == "disk":
        return WeightUpdateMeta.from_disk(
            config.experiment_name, config.trial_name, config.cluster.fileroot
        )
    else:
        return WeightUpdateMeta.from_fsdp_xccl(
            AllocationMode.from_str(config.allocation_mode)
        )
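A minimal, self-contained sketch of the suggested fix. The stub dataclasses below stand in for the real areal.api.cli_args.GRPOConfig and PPOActorConfig (they are illustrative assumptions, not the actual AReaL classes, which carry many more fields), and the string return values stand in for the WeightUpdateMeta constructors:

```python
from dataclasses import dataclass, field

# Stub configs mirroring only the relevant shape of areal.api.cli_args
# (illustrative; the real classes have many more fields).
@dataclass
class PPOActorConfig:
    weight_update_mode: str = "disk"

@dataclass
class GRPOConfig:
    actor: PPOActorConfig = field(default_factory=PPOActorConfig)

def get_model_update_meta(config: GRPOConfig) -> str:
    # Fixed lookup: weight_update_mode lives on config.actor, not on config.
    if config.actor.weight_update_mode == "disk":
        return "disk"       # stand-in for WeightUpdateMeta.from_disk(...)
    return "fsdp_xccl"      # stand-in for WeightUpdateMeta.from_fsdp_xccl(...)

# The reported bug: GRPOConfig itself has no such attribute.
assert not hasattr(GRPOConfig(), "weight_update_mode")
# The fixed function dispatches via the nested actor config instead.
assert get_model_update_meta(GRPOConfig()) == "disk"
```

With the type hint in place, a static checker (e.g. mypy) would also flag the original `config.weight_update_mode` access at analysis time instead of at runtime.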

Expected behavior

The example scripts should run without this type mismatch.

Full logs


(run_func pid=3820226) 20251025-06:18:08.596 Launcher Utils INFO: Found 2 rollout servers: 90.91.103.32:11451, 90.91.103.32:38927
(run_func pid=3820226) 20251025-06:18:08.596 [Remote Inference Engine Rank 2] INFO: Get server addresses from name_resolve.
(run_func pid=3820226) 20251025-06:18:08.596 [Remote Inference Engine Rank 2] INFO: Waiting for server ready...
(run_func pid=3820226) 20251025-06:18:08.602 [Remote Inference Engine Rank 2] INFO: Servers are all ready!
(run_func pid=3820225) [rank1]:[W1025 06:18:09.742729016 ProcessGroupNCCL.cpp:1538] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator())
(run_func pid=3820224) /root/miniconda3/envs/areal_async_vllm_zheng/lib/python3.11/site-packages/megatron/core/optimizer/clip_grads.py:29: UserWarning: Transformer Engine and Apex are not installed. Falling back to local implementations of multi_tensor_applier, multi_tensor_l2norm, and multi_tensor_scale [repeated 6x across cluster]
(run_func pid=3820224) warnings.warn( [repeated 12x across cluster]
(run_func pid=3820224) /root/miniconda3/envs/areal_async_vllm_zheng/lib/python3.11/site-packages/megatron/core/models/gpt/gpt_layer_specs.py:67: UserWarning: Apex is not installed. Falling back to Torch Norm [repeated 6x across cluster]
(run_func pid=3820224) warnings.warn("Apex is not installed. Falling back to Torch Norm") [repeated 6x across cluster]
20251025-06:18:12.731 RayLauncher ERROR: Job trainer:0 failed with error: ray::run_func() (pid=3820224, ip=90.91.103.32)
  File "/home/z00637938/workspace/AReaL/areal/launcher/ray.py", line 65, in run_func
    return function(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/z00637938/workspace/AReaL/examples/math/boba_grpo.py", line 127, in main
    weight_update_meta = get_model_update_meta(config)
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/z00637938/workspace/AReaL/areal/utils/model.py", line 50, in get_model_update_meta
    if config.weight_update_mode == "disk":
       ^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'GRPOConfig' object has no attribute 'weight_update_mode'.

To Reproduce

Run the Boba GRPO vLLM script in the examples folder. It's 100% reproducible.

Commit ID

main's tip: 4a4abc6

Environment

torch 2.8.0
nvidia-cuda-cupti-cu12 12.8.90
nvidia-cuda-nvrtc-cu12 12.8.93
nvidia-cuda-runtime-cu12 12.8.90

Script

python3 -m areal.launcher.ray examples/math/boba_grpo.py --config examples/math/boba_grpo_vllm.yaml experiment_name=boba_grpo_vllm_16_gpus trial_name=trail_0
