
Commit 29d3075
committed: address comments
1 parent c6776a3
3 files changed
Lines changed: 5 additions & 5 deletions

Feature_Guide/Speculative_Decoding/README.md

Lines changed: 1 addition & 1 deletion
@@ -54,6 +54,6 @@ may prove simpler than generating a summary for an article. [Spec-Bench](https:/
 shows the performance of different speculative decoding approaches on different tasks.
 
 ## Speculative Decoding with Triton Inference Server
-Triton Inference Server supports speculative decoding on different types of Triton backend. See what a Triton backend is [here](https://github.com/triton-inference-server/tensorrtllm_backend).
+Triton Inference Server supports speculative decoding on different types of Triton backends. See what a Triton backend is [here](https://github.com/triton-inference-server/tensorrtllm_backend).
 - Follow [here](TRT-LLM/README.md) to learn how Triton Inference Server supports speculative decoding with [TensorRT-LLM Backend](https://github.com/triton-inference-server/tensorrtllm_backend).
 - Follow [here](vLLM/README.md) to learn how Triton Inference Server supports speculative decoding with [vLLM Backend](https://github.com/triton-inference-server/vllm_backend).
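The hunk above only touches wording, but the draft-then-verify loop the README describes can be sketched in a few lines. This is a toy illustration with stand-in "models" (simple integer functions, not real LLMs, and not Triton's or any backend's actual implementation): a cheap draft model proposes k tokens, the target model checks them, and the longest agreeing prefix is accepted, so one verification pass can yield several tokens.

```python
def draft_model(prefix):
    # Stand-in draft model: always predicts the next integer (fast but imperfect).
    return prefix[-1] + 1

def target_model(prefix):
    # Stand-in target model: next integer, but wraps to 0 after 5 (the "correct" model).
    nxt = prefix[-1] + 1
    return 0 if nxt > 5 else nxt

def speculative_step(prefix, k=4):
    # 1) Draft k tokens autoregressively with the cheap model.
    drafted = []
    cur = list(prefix)
    for _ in range(k):
        tok = draft_model(cur)
        drafted.append(tok)
        cur.append(tok)
    # 2) Verify: accept drafted tokens while the target model agrees.
    accepted = []
    cur = list(prefix)
    for tok in drafted:
        if target_model(cur) == tok:
            accepted.append(tok)
            cur.append(tok)
        else:
            break
    # 3) On the first mismatch (or after all drafts are accepted),
    #    emit one token from the target model itself.
    accepted.append(target_model(cur))
    return list(prefix) + accepted

print(speculative_step([0]))  # [0, 1, 2, 3, 4, 5]: all four drafted tokens accepted, plus one target token
print(speculative_step([4]))  # [4, 5, 0]: draft and target disagree after 5, so only one drafted token survives
```

Real backends verify all k drafted tokens in a single batched target forward pass rather than one call per token; that batching is where the speedup comes from.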

Feature_Guide/Speculative_Decoding/TRT-LLM/README.md

Lines changed: 3 additions & 3 deletions
@@ -49,7 +49,7 @@ EAGLE ([paper](https://arxiv.org/pdf/2401.15077) | [github](https://github.com/S
 ### Acquiring EAGLE Model and its Base Model
 
 In this example, we will be using the [EAGLE-Vicuna-7B-v1.3](https://huggingface.co/yuhuili/EAGLE-Vicuna-7B-v1.3) model.
-More types of EAGLE models could be found [here](https://huggingface.co/yuhuili). The base model [Vicuna-7B-v1.3](https://huggingface.co/lmsys/vicuna-7b-v1.3) is also needed for EAGLE to work.
+More types of EAGLE models can be found [here](https://huggingface.co/yuhuili). The base model [Vicuna-7B-v1.3](https://huggingface.co/lmsys/vicuna-7b-v1.3) is also needed for EAGLE to work.
 
 To download both models, run the following command:
 ```bash
@@ -66,7 +66,7 @@ Launch Triton docker container with TensorRT-LLM backend.
 Note that we're mounting the downloaded EAGLE and base models to `/hf-models` in the docker container.
 Make an `engines` folder outside docker to reuse engines for future runs.
 Please, make sure to replace <xx.yy> with the version of Triton that you want
-to use (must be >= 25.01). The latest Triton Server container is recommended and could be found [here](https://catalog.ngc.nvidia.com/orgs/nvidia/containers/tritonserver/tags).
+to use (must be >= 25.01). The latest Triton Server container is recommended and can be found [here](https://catalog.ngc.nvidia.com/orgs/nvidia/containers/tritonserver/tags).
 
 ```bash
 docker run --rm -it --net host --shm-size=2g \
@@ -226,7 +226,7 @@ format required by Gen-AI Perf. Note that MT-bench could not be used since Gen-A
 ```bash
 wget https://raw.githubusercontent.com/SafeAILab/EAGLE/main/eagle/data/humaneval/question.jsonl
 
-# dataset-converter.py file can be found in the parent folder as this README.
+# dataset-converter.py file can be found in the parent folder of this README.
 python3 dataset-converter.py --input_file question.jsonl --output_file converted_humaneval.jsonl
 ```
 

Feature_Guide/Speculative_Decoding/vLLM/README.md

Lines changed: 1 addition & 1 deletion
@@ -45,7 +45,7 @@ EAGLE ([paper](https://arxiv.org/pdf/2401.15077) | [github](https://github.com/S
 ### Acquiring EAGLE Model and its Base Model
 
 In this example, we will be using the [EAGLE-LLaMA3-Instruct-8B](https://huggingface.co/yuhuili/EAGLE-LLaMA3-Instruct-8B) model.
-More types of EAGLE models could be found [here](https://huggingface.co/yuhuili). The base model [Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) is also needed for EAGLE to work.
+More types of EAGLE models can be found [here](https://huggingface.co/yuhuili). The base model [Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) is also needed for EAGLE to work.
 
 To download both models, run the following command:
 ```bash
