Commit 31efda7

docs: update speculative decoding guide to use LLM API / PyTorch backend
Replace the legacy TRT engine backend approach (trtllm-build, inflight_batcher_llm, fill_template.py) with the modern LLM API / PyTorch backend workflow. Update EAGLE section to use EAGLE 3 with Llama-3.1-8B-Instruct, add deprecation notice for MEDUSA (unsupported on PyTorch backend), and update Draft Model section to use DraftTargetDecodingConfig via model.yaml.

Signed-off-by: Faradawn Yang <73060648+faradawn@users.noreply.github.com>
1 parent 6b48ce4 commit 31efda7

1 file changed

Lines changed: 95 additions & 287 deletions

File tree

  • Feature_Guide/Speculative_Decoding/TRT-LLM
