feat: enable rec fast sampler for llm beam search. by RobbieLeung · Pull Request #1224 · jd-opensource/xllm

RobbieLeung · 2026-04-08T08:12:44Z

No description provided.

gemini-code-assist

Code Review

This pull request introduces a CUDA-optimized fast path for the RecSampler and integrates it into the LLM beam search logic. It adds two new configuration options, enable_block_copy_kernel and enable_rec_fast_sampler, and ensures their propagation across the distributed runtime and worker processes. The implementation includes a new CUDA kernel for efficient top-k post-processing and refactors the RecSampler to support instance-scoped fast path toggling. Review feedback identifies several violations of the repository style guide, including the use of plain int instead of fixed-width integers, the use of auto for simple types, and the presence of relative include paths.
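
To make "instance-scoped fast path toggling" concrete, here is a minimal C++ sketch. It is not the PR's code: `SamplerOptions`, `SamplerSketch`, `picks_fast_path`, and the `top_k` field are hypothetical names introduced for illustration. Only the flag name `enable_rec_fast_sampler` comes from the summary above, and its default value here is an assumption.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical options struct. Only the flag name comes from the PR summary;
// the default value and the top_k field are assumptions for illustration.
struct SamplerOptions {
  bool enable_rec_fast_sampler = false;
  int32_t top_k = 1;  // fixed-width integer, in line with the style feedback
};

// Hypothetical stand-in for a sampler, NOT the real RecSampler.
class SamplerSketch {
 public:
  explicit SamplerSketch(const SamplerOptions& options) : options_(options) {
    assert(options_.top_k > 0);
  }

  // The fast path is taken only when this instance opted in and the input
  // lives on a CUDA device; otherwise the generic path is used.
  bool use_fast_path(bool input_on_cuda) const {
    return options_.enable_rec_fast_sampler && input_on_cuda;
  }

 private:
  SamplerOptions options_;  // per-instance state, no process-wide global
};

// Helper for quick checks: builds a sampler and reports the chosen path.
inline bool picks_fast_path(bool enabled, bool input_on_cuda) {
  SamplerOptions options;
  options.enable_rec_fast_sampler = enabled;
  const SamplerSketch sampler(options);
  return sampler.use_fast_path(input_on_cuda);
}
```

Because the flag lives on the instance, two samplers in the same process can make different choices, which a single global switch would not allow.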

DragonFive

LGTM

RobbieLeung · 2026-04-15T07:02:32Z

/gemini review

gemini-code-assist

Code Review

This pull request introduces a fast-path sampling mechanism for LLM beam search, primarily targeting CUDA environments. Key changes include the refactoring of RecSampler to support instance-scoped fast-path enablement, the addition of a specialized CUDA kernel for top-k post-processing, and the integration of these components into the LLMWorkerImpl. Additionally, the PR includes new unit tests and improved hardware-specific logging. Feedback on the implementation highlights a critical need to extend the CUDA kernel's type support; currently, it uses a dispatch macro limited to float and double, which should be expanded to include Half and BFloat16 to support standard production models.
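
The type-support concern can be illustrated with a small, framework-free C++ sketch. Everything in it is hypothetical (`DType`, `Half`, `BFloat16`, and both dispatch helpers are stand-ins, not xllm or PyTorch APIs); it only shows how a dispatcher limited to `float` and `double` rejects reduced-precision inputs, and how a widened dispatcher accepts them. The review does not name the macro in question. If the kernel uses PyTorch's ATen macros (an assumption), the analogous change would be moving from `AT_DISPATCH_FLOATING_TYPES` to `AT_DISPATCH_FLOATING_TYPES_AND2` with `at::ScalarType::Half` and `at::ScalarType::BFloat16`.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <utility>

// Hypothetical dtype tag standing in for the framework's scalar-type enum.
enum class DType : int8_t { kFloat, kDouble, kHalf, kBFloat16 };

// Placeholder 16-bit storage types; real code would use the framework's own.
struct Half { uint16_t bits; };
struct BFloat16 { uint16_t bits; };

// Narrow dispatch: float and double only, like the macro the review flags.
template <typename Fn>
int64_t dispatch_floating(DType dtype, Fn&& fn) {
  switch (dtype) {
    case DType::kFloat:  return fn(float{});
    case DType::kDouble: return fn(double{});
    default:
      throw std::invalid_argument("dtype not supported by narrow dispatch");
  }
}

// Widened dispatch: additionally covers Half and BFloat16.
template <typename Fn>
int64_t dispatch_floating_and_reduced(DType dtype, Fn&& fn) {
  switch (dtype) {
    case DType::kHalf:     return fn(Half{});
    case DType::kBFloat16: return fn(BFloat16{});
    default:               return dispatch_floating(dtype, std::forward<Fn>(fn));
  }
}

// Example "kernel launcher": reports the element size of the selected type.
inline int64_t element_size(DType dtype) {
  const int64_t size = dispatch_floating_and_reduced(
      dtype,
      [](auto tag) -> int64_t { return static_cast<int64_t>(sizeof(tag)); });
  assert(size > 0);
  return size;
}

// True if the narrow (float/double only) dispatch accepts the dtype.
inline bool narrow_dispatch_supports(DType dtype) {
  try {
    dispatch_floating(dtype, [](auto) -> int64_t { return 0; });
    return true;
  } catch (const std::invalid_argument&) {
    return false;
  }
}
```

With the narrow dispatcher, a half-precision or bfloat16 input fails at runtime instead of reaching the kernel; the widened dispatcher routes all four types to the same templated body.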

LMX-xin · 2026-04-15T08:01:45Z

LGTM

RobbieLeung requested review from DongheJin, JimHsiung, XuZhang99, liutongxuan, walsonyang and yq33victor as code owners April 8, 2026 08:12

gemini-code-assist bot reviewed Apr 8, 2026

DragonFive previously approved these changes Apr 8, 2026

feat: enable rec fast sampler for llm beam search.

d6936d4

RobbieLeung dismissed DragonFive’s stale review via d6936d4 April 15, 2026 07:01

RobbieLeung force-pushed the feat/beam_sample_kernel branch from 8b186cb to d6936d4 April 15, 2026 07:01

gemini-code-assist bot reviewed Apr 15, 2026

Comment thread xllm/core/kernels/cuda/topk_postprocess.cu

LMX-xin approved these changes Apr 15, 2026

RobbieLeung wants to merge 1 commit into jd-opensource:main from RobbieLeung:feat/beam_sample_kernel

3 participants
