[Feature] Add patch to accelerate SGLang weight loading by nuzant · Pull Request #324 · inclusionAI/AReaL

nuzant · 2025-09-11T09:51:51Z

What does this PR do?

This PR adds two options that apply a patch to SGLang to accelerate its weight loading.

Option 1 enable_multithread_load: SGLang already ships a native multithreaded weight-loading path, but the original SGLang code does not use it when updating weights from disk. The patch in this PR enables it for that case. This option is available for all models.
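
As a rough illustration of why multithreaded loading helps (this is not SGLang's implementation), the sketch below reads several checkpoint shard files concurrently with a thread pool. The names read_shard and load_shards_concurrently, and the shard file layout, are assumptions made for this example only.

```python
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def read_shard(path: Path) -> bytes:
    # Hypothetical stand-in for deserializing one checkpoint shard from disk.
    return path.read_bytes()


def load_shards_concurrently(paths: list[Path], max_workers: int = 8) -> dict[str, bytes]:
    # File reads are I/O-bound and release the GIL, so threads can overlap them.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so names and payloads stay aligned.
        payloads = list(pool.map(read_shard, paths))
    return {p.name: data for p, data in zip(paths, payloads)}


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        shards = []
        for i in range(4):
            shard = Path(tmp) / f"shard-{i}.bin"
            shard.write_bytes(bytes([i]) * 16)
            shards.append(shard)
        loaded = load_shards_concurrently(shards)
        print(sorted(loaded))
```

The actual speedup depends on storage bandwidth and on how much of the load time is spent in deserialization rather than I/O.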

Option 2 enable_fast_load: This enables an optimized, custom weight-loading implementation that the patch in this PR adds to SGLang. It is faster than enable_multithread_load, but is only available for Qwen3 and Qwen3MoE models.

Why do we need this PR?

Disk weight loading is simpler and more flexible than NCCL weight loading. This makes it better suited to complex scenarios we expect to support in the future, such as RL with elastic inference servers or heterogeneous hardware.

Example Usage

Set either option in the YAML config or on the command line: sglang.enable_multithread_load=true or sglang.enable_fast_load=true.
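
To make the dotted option syntax concrete, here is a minimal sketch of how an override such as sglang.enable_fast_load=true could map onto a nested config object. The classes SGLangOptions and Config and the helper apply_override are hypothetical and only illustrate the idea; AReaL's real argument parsing is not shown in this PR description and may differ.

```python
from dataclasses import dataclass, field


@dataclass
class SGLangOptions:
    # Field names mirror the options in this PR; the class itself is hypothetical.
    enable_multithread_load: bool = False
    enable_fast_load: bool = False


@dataclass
class Config:
    sglang: SGLangOptions = field(default_factory=SGLangOptions)


def apply_override(cfg: Config, override: str) -> None:
    # Split "section.key=value" into a dotted path and a raw string value.
    dotted, raw = override.split("=", 1)
    *parents, leaf = dotted.split(".")
    target = cfg
    for name in parents:
        target = getattr(target, name)
    if not hasattr(target, leaf):
        raise AttributeError(f"unknown option: {dotted}")
    # Only boolean options are handled in this sketch.
    setattr(target, leaf, raw.strip().lower() == "true")


if __name__ == "__main__":
    cfg = Config()
    apply_override(cfg, "sglang.enable_multithread_load=true")
    print(cfg)
```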

Performance and Correctness

On Qwen3-30B-A3B with allocation mode sglang:d4p1t4+megatron:(attn:d1p4t2c2|ffn:d1p4t2e2), this PR reduces weight update time from ~60s to ~30s while maintaining correctness.
Performance under other configurations has not yet been tested.

Update

The patch has been updated for and tested on SGLang v0.5.2, and its performance matches the earlier results on v0.4.9.post2.

garrett4wade · 2025-10-13T05:04:57Z

/gemini review

gemini-code-assist

Code Review

This pull request introduces an effective optimization for SGLang weight loading by applying a custom patch. The changes are well-structured, adding new configuration options in cli_args.py and the patching logic in launcher.py. The performance improvement from ~60s to ~30s for weight updates is significant.

My review focuses on the integration of the patch. I've identified a critical issue in the patching logic for editable installations that could cause it to fail, along with a couple of medium-severity suggestions to improve robustness and logging. Once these points are addressed, this PR will be a great addition to accelerate model loading.
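
For readers unfamiliar with this kind of patching, the following is a minimal sketch of locating an installed package and building a patch command for it. It is not the code in launcher.py: the helper names find_package_dir and build_patch_command are hypothetical, and the -p1 strip level is an assumption about how the patch file paths are written.

```python
import importlib.util
from pathlib import Path


def find_package_dir(package: str) -> Path:
    # Resolve the directory the interpreter would import the package from.
    spec = importlib.util.find_spec(package)
    if spec is None or not spec.submodule_search_locations:
        raise ModuleNotFoundError(f"cannot locate package: {package}")
    return Path(next(iter(spec.submodule_search_locations)))


def build_patch_command(target_dir: Path, patch_file: Path, dry_run: bool = True) -> list[str]:
    # -p1 strips the leading path component (assumed patch layout),
    # -N skips hunks that look already applied, -d sets the working directory.
    cmd = ["patch", "-p1", "-N", "-d", str(target_dir), "-i", str(patch_file)]
    if dry_run:
        # Check that the patch applies cleanly without modifying any files.
        cmd.append("--dry-run")
    return cmd


if __name__ == "__main__":
    # "json" is used only because it is always importable.
    print(build_patch_command(find_package_dir("json"), Path("example.patch")))
```

Running the command with --dry-run first and only then for real is one way to avoid leaving a package half-patched; the resulting list can be passed to subprocess.run.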

add patch to optimize sglang loading

78fb614

nuzant requested review from garrett4wade and rchardx September 11, 2025 09:51

nuzant had a problem deploying to AReaL-unittests September 11, 2025 09:51 — with GitHub Actions Failure

garrett4wade reviewed Sep 11, 2025

Comment thread areal/api/cli_args.py Outdated

Comment thread areal/api/cli_args.py Outdated

Comment thread patch/sglang/v0.4.9.post2.patch Outdated

merge

f75fdcd

nuzant had a problem deploying to AReaL-unittests October 10, 2025 08:23 — with GitHub Actions Error

.

e05bf74

nuzant had a problem deploying to AReaL-unittests October 10, 2025 09:06 — with GitHub Actions Error

nuzant and others added 3 commits October 10, 2025 17:22

.

d408ba8

.

5f8e990

.

8102814

nuzant had a problem deploying to AReaL-unittests October 10, 2025 11:07 — with GitHub Actions Error

nuzant requested a deployment to AReaL-unittests October 10, 2025 11:07 — with GitHub Actions In progress

nuzant had a problem deploying to AReaL-unittests October 10, 2025 11:07 — with GitHub Actions Error

nuzant and others added 2 commits October 10, 2025 19:08

Merge remote-tracking branch 'origin/mzy/optimize-sglang-load' into mzy/antcode/optimize-sglang-load

5664c12

.

635fad3

nuzant had a problem deploying to AReaL-unittests October 10, 2025 11:31 — with GitHub Actions Error

nuzant and others added 2 commits October 10, 2025 19:31

Merge remote-tracking branch 'origin/mzy/optimize-sglang-load' into mzy/antcode/optimize-sglang-load

f62ecc3

filename bins

677bd21

nuzant had a problem deploying to AReaL-unittests October 11, 2025 03:18 — with GitHub Actions Error

nuzant requested a deployment to AReaL-unittests October 11, 2025 03:18 — with GitHub Actions In progress

nuzant had a problem deploying to AReaL-unittests October 13, 2025 03:45 — with GitHub Actions Failure

format

683f95f

nuzant had a problem deploying to AReaL-unittests October 13, 2025 03:46 — with GitHub Actions Failure

garrett4wade mentioned this pull request Oct 13, 2025

fix update weights from disk in FSDP engine #443

Merged

gemini-code-assist Bot reviewed Oct 13, 2025

Comment thread areal/utils/launcher.py Outdated

Comment thread areal/utils/launcher.py Outdated

Comment thread areal/utils/launcher.py Outdated

garrett4wade reviewed Oct 13, 2025

Comment thread areal/experimental/megatron_engine.py Outdated

.

1848ea0

nuzant had a problem deploying to AReaL-unittests October 13, 2025 05:14 — with GitHub Actions Error

use diff instead of git diff

e078360

nuzant had a problem deploying to AReaL-unittests October 13, 2025 05:24 — with GitHub Actions Failure

nuzant had a problem deploying to AReaL-unittests October 13, 2025 05:24 — with GitHub Actions Error

fix patch

20ca314

nuzant had a problem deploying to AReaL-unittests October 13, 2025 05:34 — with GitHub Actions Failure

.

c2f13b4

nuzant had a problem deploying to AReaL-unittests October 13, 2025 05:37 — with GitHub Actions Failure

garrett4wade approved these changes Oct 13, 2025

garrett4wade merged commit 0ff615d into main Oct 13, 2025
1 of 4 checks passed

garrett4wade deleted the mzy/optimize-sglang-load branch October 13, 2025 06:37

leandermaben pushed a commit to leandermaben/AReaL that referenced this pull request Mar 24, 2026

[Feature] Add patch to accelerate SGLang weight loading (inclusionAI#324)
* add patch to optimize sglang 0.5.2 weights loading
* use patch/diff instead of git apply/diff

080c0df

garrett4wade merged 35 commits into main from mzy/optimize-sglang-load