Commit 29a12cf

cherry-pick speculative decoding related PRs #133 and #135 (#136)
* docs: move Constrained_Decoding and Function_Calling to Feature_Guide | rm AI_Agents_Guide folder (#135)
* docs: Add EAGLE/SpS Speculative Decoding support with vLLM (#133)
1 parent f6fd598 commit 29a12cf

20 files changed: 356 additions & 79 deletions

File tree

AI_Agents_Guide/README.md

Lines changed: 0 additions & 62 deletions
This file was deleted.

AI_Agents_Guide/Constrained_Decoding/artifacts/client.py renamed to Feature_Guide/Constrained_Decoding/artifacts/client.py

File renamed without changes.

AI_Agents_Guide/Constrained_Decoding/artifacts/client_utils.py renamed to Feature_Guide/Constrained_Decoding/artifacts/client_utils.py

File renamed without changes.

AI_Agents_Guide/Constrained_Decoding/artifacts/utils.py renamed to Feature_Guide/Constrained_Decoding/artifacts/utils.py

File renamed without changes.

AI_Agents_Guide/Function_Calling/artifacts/client_utils.py renamed to Feature_Guide/Function_Calling/artifacts/client_utils.py

File renamed without changes.

AI_Agents_Guide/Function_Calling/artifacts/system_prompt_schema.yml renamed to Feature_Guide/Function_Calling/artifacts/system_prompt_schema.yml

File renamed without changes.

Feature_Guide/Speculative_Decoding/README.md

Lines changed: 3 additions & 1 deletion
```diff
@@ -54,4 +54,6 @@ may prove simpler than generating a summary for an article. [Spec-Bench](https:/
 shows the performance of different speculative decoding approaches on different tasks.
 
 ## Speculative Decoding with Triton Inference Server
-Follow [here](TRT-LLM/README.md) to learn how Triton Inference Server supports speculative decoding with [TensorRT-LLM](https://github.com/NVIDIA/TensorRT-LLM).
+Triton Inference Server supports speculative decoding on different types of Triton backends. See what a Triton backend is [here](https://github.com/triton-inference-server/backend).
+- Follow [here](TRT-LLM/README.md) to learn how Triton Inference Server supports speculative decoding with [TensorRT-LLM Backend](https://github.com/triton-inference-server/tensorrtllm_backend).
+- Follow [here](vLLM/README.md) to learn how Triton Inference Server supports speculative decoding with [vLLM Backend](https://github.com/triton-inference-server/vllm_backend).
```
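For context on the vLLM path this commit documents: the Triton vLLM backend is typically configured through a `model.json` whose fields map to vLLM engine arguments. The sketch below is an assumption, not taken from this commit — the field names (`speculative_model`, `num_speculative_tokens`, `gpu_memory_utilization`) follow vLLM's engine arguments and should be checked against the vLLM and backend versions in use, and the model names are placeholders.

```python
import json

# Hedged sketch of a model.json for draft-model (SpS-style) speculative
# decoding with the Triton vLLM backend. Field names mirror vLLM engine
# arguments and may differ across versions; model paths are placeholders.
model_json = {
    "model": "facebook/opt-6.7b",              # target (verifier) model
    "speculative_model": "facebook/opt-125m",  # smaller draft model
    "num_speculative_tokens": 5,               # draft tokens proposed per step
    "gpu_memory_utilization": 0.8,             # fraction of GPU memory to use
}

# The backend would read this file from the model repository,
# e.g. model_repository/vllm_model/1/model.json.
print(json.dumps(model_json, indent=2))
```

The general pattern is that the draft model is much smaller than the target model, so proposing `num_speculative_tokens` cheap draft tokens per step and verifying them in one target-model pass can reduce latency when acceptance rates are high.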
