.Net: feat(connectors): Support ImageContent in tool/function results #13431
Cozmopolit wants to merge 11 commits into microsoft:main
Enable ImageContent preservation in function results for multimodal-capable connectors (Gemini 3+). Non-supporting connectors return clear error message.

Changes:
- FunctionCallsProcessor: Return object instead of string, preserve ImageContent
- Gemini: Native support via FunctionResponse.Parts with inlineData
- OpenAI/Bedrock Agents: Error handling with ImageContentNotSupportedErrorMessage

Includes 5 new unit tests for ImageContent handling.

Fixes microsoft#13430
@Cozmopolit Thanks for the contribution! With the latest merges into the Google connector, this PR now has a conflict. I'd appreciate it if you could take a look.
Done, test file conflict resolved.
Why is this stalling?
Pull request overview
This PR updates the .NET function-calling infrastructure so that kernel functions returning ImageContent can be preserved as native multimodal content (instead of being JSON-serialized), enabling connectors like Gemini (3+) to forward image tool results to multimodal models while returning a clear error for connectors/APIs that don’t support multimodal tool outputs.
Changes:
- Updated shared function-result processing to preserve `ImageContent` (changing `ProcessFunctionResult` from `string` to `object`) and introduced a shared “not supported” error message constant.
- Extended the Google Gemini request model to emit `functionResponse.parts[].inlineData` for `ImageContent` tool results, with validation and unit tests.
- Added OpenAI and agent-specific handling that converts `ImageContent` tool results into a clear error message (plus targeted unit tests).
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| dotnet/src/SemanticKernel.UnitTests/Utilities/AIConnectors/FunctionCallsProcessorTests.cs | Adds unit coverage verifying ImageContent is preserved rather than serialized. |
| dotnet/src/InternalUtilities/connectors/AI/FunctionCalling/FunctionCallsProcessor.cs | Preserves ImageContent in processed function results; introduces a shared “not supported” error message constant. |
| dotnet/src/Connectors/Connectors.OpenAI/Core/ClientCore.ChatCompletion.cs | Detects ImageContent in tool results and emits a connector-friendly error message instead of attempting to pass images. |
| dotnet/src/Connectors/Connectors.Google/Core/Gemini/Models/GeminiRequest.cs | Converts ImageContent tool results into Gemini’s native functionResponse.parts[].inlineData format with validation. |
| dotnet/src/Connectors/Connectors.Google/Core/Gemini/Models/GeminiPart.cs | Extends Gemini function-response modeling to support nested multimodal parts (parts). |
| dotnet/src/Connectors/Connectors.Google.UnitTests/Core/Gemini/GeminiRequestTests.cs | Adds tests for inlineData generation and required validation failures (missing data/mime type). |
| dotnet/src/Agents/UnitTests/OpenAI/Internal/AssistantMessageFactoryTests.cs | Adds a unit test ensuring OpenAI Assistants path returns the expected error text for ImageContent in function results. |
| dotnet/src/Agents/OpenAI/Internal/AssistantMessageFactory.cs | Adds helper to convert function results to string while handling ImageContent as “not supported”. |
| dotnet/src/Agents/Bedrock/Extensions/BedrockAgentInvokeExtensions.cs | Adds helper to convert function results to string while handling ImageContent as “not supported”. |
Followup to PR microsoft#13431 closing the only validated regression introduced by the shared FunctionCallsProcessor.ProcessFunctionResult signature change (string to object) plus small housekeeping items.

Changes:
- ResponseThreadActions: add internal GetFunctionResultAsString helper that returns ImageContentNotSupportedErrorMessage for ImageContent results; replace both fr.Result?.ToString() call sites. Without this, ImageContent tool results would surface as the literal type name string under the OpenAI Responses API path.
- GeminiRequest: extract the synthetic functionResponse envelope to a named s_imageFunctionResponseEnvelope constant for testability and single-source maintenance.
- FunctionCallsProcessor: tighten ProcessFunctionResult XML doc to enumerate the three possible return categories (string, ImageContent, JSON serialized).
- GeminiRequestTests: fix CS8602 nullability warning that was failing the dotnet-build-and-test pipeline.
- Add ResponseThreadActionsTests covering ImageContent rejection, string passthrough, and null handling.
Locks in the new ImageContent rejection branch in CreateRequestMessages: a tool message whose FunctionResultContent.Result is ImageContent must be sent as ImageContentNotSupportedErrorMessage rather than serialized JSON or the type name.
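The rejection path described in the two commit messages above can be sketched roughly as follows. This is an illustrative stand-alone sketch, not the PR's code: the `ImageContent` class here is a minimal stand-in for the Semantic Kernel type, the error message wording is hypothetical, and returning an empty string for `null` is an assumption (the thread only says null handling is tested).

```csharp
using System;
using System.Text.Json;

// Stand-in for Microsoft.SemanticKernel.ImageContent so the sketch compiles on its own.
public sealed class ImageContent
{
    public ReadOnlyMemory<byte>? Data { get; init; }
    public string? MimeType { get; init; }
}

public static class FunctionResultText
{
    // Hypothetical wording; the actual constant text is not shown in this thread.
    public const string ImageContentNotSupportedErrorMessage =
        "Error: this connector does not support image content in function results.";

    // Converts a function result to the string sent back to a text-only API.
    public static string GetFunctionResultAsString(object? result) => result switch
    {
        null => string.Empty,                                 // assumption: null maps to an empty string
        string text => text,                                  // strings pass through unchanged
        ImageContent => ImageContentNotSupportedErrorMessage, // images become a clear error, not a type name
        _ => JsonSerializer.Serialize(result),                // everything else is JSON-serialized
    };
}
```

Calling `FunctionResultText.GetFunctionResultAsString(new ImageContent())` returns the error constant, whereas a plain `result?.ToString()` would yield the type name, which is the regression the followup commit describes.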
GHSA-pggp-6c3x-2xmx) (microsoft#13960)

### Motivation and Context

`MongoDB.Driver 3.5.2` transitively introduces `Snappier 1.0.0`, which carries a high-severity vulnerability (GHSA-pggp-6c3x-2xmx): infinite loop during SnappyStream decompression of malformed framed input. All Snappier versions ≤ 1.3.0 are affected; 1.3.1 is the first patched release. This was blocking the merge queue via NU1903.

### Description

- **`dotnet/Directory.Packages.props`**: Add `PackageVersion` entry pinning `Snappier` to `1.3.1`.
- **`dotnet/src/VectorData/MongoDB/MongoDB.csproj`**: Add explicit `PackageReference` for `Snappier` (versionless, resolved via CPM) so NuGet treats it as a direct dependency at 1.3.1, overriding the transitive 1.0.0 from `MongoDB.Driver`.
- **`dotnet/src/VectorData/CosmosMongoDB/CosmosMongoDB.csproj`**: Same override for the CosmosMongoDB connector.

In NuGet's resolution algorithm, a direct reference at depth 1 wins over a transitive reference at depth 2, so this cleanly forces 1.3.1 without changing the `MongoDB.Driver` pin itself.

### Contribution Checklist

- [x] The code builds clean without any errors or warnings
- [x] The PR follows the [SK Contribution Guidelines](https://github.com/microsoft/semantic-kernel/blob/main/CONTRIBUTING.md) and the [pre-submission formatting script](https://github.com/microsoft/semantic-kernel/blob/main/CONTRIBUTING.md#development-scripts) raises no violations
- [x] All unit tests pass, and I have added new tests where possible
- [x] I didn't break anyone 😄

<!-- START COPILOT ORIGINAL PROMPT -->
<details>
<summary>Original prompt</summary>

Please investigate a solution to snappier vulnerability fix (update version) and propose a PR with the bump, ideally following the first immediate a non breaking version bump

<analysis>

**Chronological Review:**

1. User asked to analyze PR microsoft#13431 (Gemini multimodal tool results in microsoft/semantic-kernel) for gaps
2. Initial analysis identified ~11 gaps with broad scope including OpenAI Assistants, AzureAI, Responses API, MistralAI
3. User invoked ouroboros skill, asked to reassess scoping to Google package only
4. Reassessment: Google connector bypasses FunctionCallsProcessor; only validated regression is OpenAI Responses API
5. User invoked ouroboros to seed the observation; created seed file `pr-13431-followup.seed.yaml`
6. User asked to execute seed + identify pipeline error
7. Implemented 5 file changes; identified pipeline error CS8602 in GeminiRequestTests.cs:809
8. All builds/tests passed; user asked about string→object impact in plan mode
9. Investigation revealed `FunctionCallsProcessor` is `internal sealed`, source-distributed, so blast radius small
10. User confirmed all implemented; ran CI-parity dotnet format via WSL2+Docker (all pass)
11. User asked to commit and push - committed `de08bce99` and pushed to Cozmopolit fork
12. User asked to check PR comments - found 2 Copilot bot review comments
13. User said: add OpenAI test (item 1), reply out-of-scope (item 2)
14. Added OpenAI ChatCompletion test, committed `e9f27d21a`, pushed, replied to both bot comments
15. User asked PR number (13431)
16. User invoked /auto pr_task - PR was green, no action needed
17. User said merge queue failed, asked to investigate
18. Investigation found: NU1903 Snappier 1.0.0 vulnerability blocking merge queue; not caused by our PR
19. **Most recent: User asked "Do we have a fix in main for this already?"**
20. **Investigation confirmed: NO fix in main** - origin/main HEAD `1a5065e5c` unchanged, `MongoDB.Driver 3.5.2` still pinned, no Snappier override, no PRs/issues for Snappier or NU1903 in the repo recently
21. **Offered to open a small fix PR pinning Snappier to a patched version**

**Intent Mapping:**

- User wants PR microsoft#13431 fully landed
- User wants to understand whether merge queue failure is in scope
- User now considering whether to fix the Snappier vulnerability separately

**Technical Inventory:**

- .NET 10.0, semantic-kernel repo
- WSL2 + Docker (mcr.microsoft.com/dotnet/sdk:10.0) for CI parity
- gh CLI for PR/CI operations
- Git remotes: origin=microsoft, roger=rogerbarreto fork, cozmopolit=Cozmopolit fork (added)
- Branch: `fix/multimodal-tool-results`
- PR head fork: Cozmopolit, maintainerCanModify=true

**Code Archaeology:**

Files changed in commits `de08bce99` + `e9f27d21a`:

- `dotnet/src/Agents/OpenAI/Internal/ResponseThreadActions.cs`: added GetFunctionResultAsString helper
- `dotnet/src/Agents/UnitTests/OpenAI/Internal/ResponseThreadActionsTests.cs`: new (3 tests)
- `dotnet/src/Connectors/Connectors.Google/Core/Gemini/Models/GeminiRequest.cs`: extracted s_imageFunctionResponseEnvelope
- `dotnet/src/Connectors/Connectors.Google.UnitTests/Core/Gemini/GeminiRequestTests.cs`: CS8602 fix
- `dotnet/src/InternalUtilities/connectors/AI/FunctionCalling/FunctionCallsProcessor.cs`: XML-doc tightening
- `dotnet/src/Connectors/Connectors.OpenAI.UnitTests/Services/OpenAIChatCompletionServiceTests.cs`: added ItSendsImageContentNotSupportedErrorWhenToolResultIsImageContentAsync

Identified vulnerable package: `MongoDB.Driver 3.5.2` in `dotnet/Directory.Packages.props:173` brings in transitive `Snappier 1.0.0`

**Progress Assessment:**

- ✅ All planned implementation completed
- ✅ Both bot review comments replied to
- ✅ All 7 todos done
- ⚠️ PR cannot merge due to environmental Snappier vulnerability (not our PR's fault)
- 🔲 User considering whether to open a separate Snappier fix PR

**Context Validation:**

- PR state: OPEN, MERGEABLE, CLEAN, APPROVED (markwallace-microsoft)
- HEAD SHA: `e9f27d21a85a8b97ee5f29619b23b82a03e313ff`
- All required CI checks pass on PR head
- Merge queue attempt failed at 2026-05-07T10:01:31, removed by github-merge-queue[bot] at 10:08:05
- Failed merge_group run: 25489152862

**Recent Commands Analysis:**

Most recent two tool batches:

1. `git fetch origin main; git log origin/main --since="2026-05-06T15:00:00Z" --oneline` → only `1a5065e5c` (yesterday's commit)
2. `git show origin/main:dotnet/Directory.Packages.props | grep Snappier|MongoDB.Driver` → only `MongoDB.Driver 3.5.2`, no Snappier override
3. `gh search prs/issues --state all "Snappier"` → command failed (invalid `all` state)
4. `gh api search/issues?q=Snappier+repo:microsoft/semantic-kernel` → empty results
5. `gh api search/issues?q=NU1903+repo:microsoft/semantic-kernel` → 3 historical results from 2024, all unrelated

Final response: confirmed NO fix in main, offered to open a small fix PR pinning Snappier.

</analysis>

<summary>
1. Conversation Overview:
   - Primary Objectives:
     - Analyze SK PR microsoft#13431 (G...
</details>
<!-- START COPILOT CODING AGENT SUFFIX -->

Created from Copilot CLI via the copilot delegate command.

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: rogerbarreto <19890735+rogerbarreto@users.noreply.github.com>
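The Snappier override described in this commit message amounts to two small MSBuild fragments. The following is a sketch of what those entries would look like under Central Package Management; the surrounding `ItemGroup` placement is an assumption, since the actual diff is not shown here.

```xml
<!-- dotnet/Directory.Packages.props: central version pin -->
<ItemGroup>
  <PackageVersion Include="Snappier" Version="1.3.1" />
</ItemGroup>

<!-- MongoDB.csproj and CosmosMongoDB.csproj: versionless direct reference,
     resolved through the central pin above -->
<ItemGroup>
  <PackageReference Include="Snappier" />
</ItemGroup>
```

Because the project now references `Snappier` directly, NuGet resolves it at 1.3.1 instead of the 1.0.0 pulled in transitively by `MongoDB.Driver`.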
Motivation and Context
Fixes #13430
Currently, when a Semantic Kernel function returns `ImageContent`, it gets serialized to JSON, losing the binary image data and preventing multimodal-capable models from processing the image.
This PR enables `ImageContent` preservation in tool/function results, allowing connectors with multimodal capabilities (Gemini 3+, Anthropic) to pass images natively to the model. This is essential for agentic workflows where tools generate or process images that the model needs to analyze.
Description
FunctionCallsProcessor (shared infrastructure)
- `ProcessFunctionResult()` return type changed from `string` to `object`
- `ImageContent` results are returned unchanged to preserve them for multimodal-capable connectors
- `ImageContentNotSupportedErrorMessage` constant for consistent error messaging
Google Gemini Connector (native support)
- `FunctionResponsePart` with `Parts` property for nested multimodal content
- `FunctionResponsePartContent` class with `InlineData` support
- `CreateImageFunctionResponsePart()` to convert `ImageContent` to Gemini's native `inlineData` format
OpenAI Connector (error handling)
- `ImageContent` check with clear error message (API does not support images in tool results)
OpenAI Agents (error handling)
- `GetFunctionResultAsString()` helper with `ImageContent` error handling
Amazon Bedrock Agents (error handling)
- `GetFunctionResultAsString()` helper with `ImageContent` error handling
New Unit Tests
| Test | File |
|---|---|
| ItShouldPreserveImageContentWithoutSerialization | FunctionCallsProcessorTests.cs |
| FromChatHistoryImageContentInToolResultCreatesInlineDataPart | GeminiRequestTests.cs |
| FromChatHistoryImageContentWithoutDataThrowsInvalidOperationException | GeminiRequestTests.cs |
| FromChatHistoryImageContentWithoutMimeTypeThrowsInvalidOperationException | GeminiRequestTests.cs |
| VerifyAssistantMessageAdapterGetMessageWithImageContentInFunctionResult | AssistantMessageFactoryTests.cs |
Notes
- The `FunctionResultContent.Result` property is already `object?`.
- … `functionResponse.parts`, the API will return an appropriate error.
- Only `ImageContent` with binary data is supported. URI-based `ImageContent` will throw `InvalidOperationException`.
- … `ProcessFunctionResult` implementation.
Contribution Checklist
sk-pr-MultimodalToolResults.md
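To make the Gemini side of the description concrete, here is a rough sketch of converting binary image data into a `functionResponse` part with nested `parts[].inlineData`. It is not the PR's `CreateImageFunctionResponsePart()`: the class and method names, the exception messages, and the placeholder `response` envelope are all hypothetical, and anonymous objects stand in for the connector's model classes.

```csharp
using System;
using System.Text.Json;

public static class GeminiImagePartSketch
{
    // Builds the JSON for one functionResponse part carrying an inline image.
    public static string CreateImageFunctionResponsePartJson(
        string functionName, ReadOnlyMemory<byte>? data, string? mimeType)
    {
        // Mirrors the validation described in the Notes: binary data and a MIME type are required.
        if (data is null || data.Value.IsEmpty)
        {
            throw new InvalidOperationException("Image function results must carry binary data.");
        }

        if (string.IsNullOrEmpty(mimeType))
        {
            throw new InvalidOperationException("Image function results must specify a MIME type.");
        }

        var part = new
        {
            functionResponse = new
            {
                name = functionName,
                // Hypothetical envelope; the PR's actual envelope content is not shown in this thread.
                response = new { result = "image attached" },
                parts = new[]
                {
                    new
                    {
                        inlineData = new
                        {
                            mimeType,
                            data = Convert.ToBase64String(data.Value.Span), // image bytes as base64
                        },
                    },
                },
            },
        };

        return JsonSerializer.Serialize(part);
    }
}
```

A call such as `GeminiImagePartSketch.CreateImageFunctionResponsePartJson("get_chart", new byte[] { 1, 2, 3 }, "image/png")` produces a single JSON object whose `functionResponse.parts[0].inlineData` holds the MIME type and base64 payload; passing `null` data or an empty MIME type throws `InvalidOperationException`.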