Commit aaef80b

fix: move build_tool_calls invocation
build_tool_calls was called before the streaming block and its result was then unused in the streaming case. Rearrange the condition and the call to avoid the wasted call.

Signed-off-by: Mark Sturdevant <mark.sturdevant@ibm.com>
1 parent d4dcc21 commit aaef80b

1 file changed

Lines changed: 17 additions & 19 deletions

cli/serve/app.py

@@ -177,6 +177,23 @@ async def endpoint(request: ChatCompletionRequest):
         model_options=model_options,
     )
 
+    # Leave as None since we don't track backend config fingerprints yet
+    system_fingerprint = None
+
+    # Handle streaming response
+    if request.stream:
+        return StreamingResponse(
+            stream_chat_completion_chunks(
+                output=output,
+                completion_id=completion_id,
+                model=request.model,
+                created=created_timestamp,
+                stream_options=request.stream_options,
+                system_fingerprint=system_fingerprint,
+            ),
+            media_type="text/event-stream",
+        )
+
     # Extract tool calls from the ModelOutputThunk if available
     tool_calls_list = build_tool_calls(output)
     tool_calls = (
@@ -203,25 +220,6 @@ async def endpoint(request: ChatCompletionRequest):
         | None
     ) = "tool_calls" if tool_calls else "stop"
 
-    # system_fingerprint represents backend config hash, not model name
-    # The model name is already in response.model (line 73)
-    # Leave as None since we don't track backend config fingerprints yet
-    system_fingerprint = None
-
-    # Handle streaming response
-    if request.stream:
-        return StreamingResponse(
-            stream_chat_completion_chunks(
-                output=output,
-                completion_id=completion_id,
-                model=request.model,
-                created=created_timestamp,
-                stream_options=request.stream_options,
-                system_fingerprint=system_fingerprint,
-            ),
-            media_type="text/event-stream",
-        )
-
     return ChatCompletion(
         id=completion_id,
         model=request.model,
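
The effect of the reordering can be shown in isolation. The following is a minimal sketch, not the code in cli/serve/app.py: `Request`, `stream_chunks`, the counting `build_tool_calls`, and the dict-shaped `output` are all hypothetical stand-ins, chosen only to show that returning from the streaming branch first means the tool-call extraction never runs on that path.

```python
from dataclasses import dataclass

# Hypothetical stand-ins for illustration; none of these are the real
# objects from cli/serve/app.py.
calls = {"build_tool_calls": 0}


@dataclass
class Request:
    stream: bool


def build_tool_calls(output):
    """Stand-in extractor that counts how often it is invoked."""
    calls["build_tool_calls"] += 1
    return output.get("tool_calls")


def stream_chunks(output):
    """Stand-in generator for the streaming path."""
    yield from output["text"].split()


def endpoint(request, output):
    # The streaming branch returns first, so nothing below runs for it.
    if request.stream:
        return stream_chunks(output)

    # Only the non-streaming path pays for tool-call extraction.
    tool_calls = build_tool_calls(output)
    finish_reason = "tool_calls" if tool_calls else "stop"
    return {
        "content": output["text"],
        "tool_calls": tool_calls,
        "finish_reason": finish_reason,
    }


output = {"text": "hello world", "tool_calls": None}

streamed = list(endpoint(Request(stream=True), output))
after_stream = calls["build_tool_calls"]  # extractor not invoked yet

full = endpoint(Request(stream=False), output)
after_full = calls["build_tool_calls"]  # invoked exactly once
```

In this sketch the counter stays at zero after the streaming request and reaches one only after the non-streaming request, which is the saving the commit message describes.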
