Merged
41 changes: 27 additions & 14 deletions docs/decisions/0022-chat-history-persistence-consistency.md
@@ -42,7 +42,7 @@ The persistence timing and `FunctionResultContent` trimming behaviors are interr
## Considered Options

- Option 1: Per-run persistence with opt-in FRC (FunctionResultContent) trimming
- Option 2: Opt-in per-service-call persistence (via `SimulateServiceStoredChatHistory`)
- Option 2: Opt-in per-service-call persistence (via `RequirePerServiceCallChatHistoryPersistence`)

## Pros and Cons of the Options

@@ -57,12 +57,12 @@ Keep the current default behavior of persisting chat history only at the end of
- Bad, because if the process crashes mid-loop, all intermediate progress from the current run is lost, not satisfying driver C.
- Bad, because this option alone does not provide a way for users to opt into per-service-call persistence, not satisfying driver E.

### Option 2: Opt-in per-service-call persistence (via `SimulateServiceStoredChatHistory`)
### Option 2: Opt-in per-service-call persistence (via `RequirePerServiceCallChatHistoryPersistence`)

Introduce an optional SimulateServiceStoredChatHistory setting to persist chat history after each individual service call within the FIC loop, matching the AI service's behavior. Trailing `FunctionResultContent` trimming is unnecessary with this approach (it is naturally handled).
Introduce an optional RequirePerServiceCallChatHistoryPersistence setting to persist chat history after each individual service call within the FIC loop, matching the AI service's behavior. Trailing `FunctionResultContent` trimming is unnecessary with this approach (it is naturally handled).

Settings:
- `SimulateServiceStoredChatHistory` = `true`
- `RequirePerServiceCallChatHistoryPersistence` = `true`

- Good, because the stored history matches the service's behavior when opting in for both timing and content, fully satisfying driver A.
- Good, because intermediate progress is preserved if the process is interrupted, satisfying driver C.
@@ -73,36 +73,49 @@ Settings:

## Decision Outcome

Chosen option: **Option 2: Opt-in per-service-call persistence (via `SimulateServiceStoredChatHistory`)**. The existing per-run persistence behavior is retained as-is, requiring no changes from users. Per-service-call persistence is available as an opt-in feature via the `SimulateServiceStoredChatHistory` setting. This satisfies drivers B (atomicity) and D (simplicity) for the common case, while fully satisfying driver A (consistency) for users who opt into simulated service-stored behavior. Users who need per-service-call persistence for recoverability (driver C) can enable it explicitly.
Chosen option: **Option 2: Opt-in per-service-call persistence (via `RequirePerServiceCallChatHistoryPersistence`)**. The existing per-run persistence behavior is retained as-is, requiring no changes from users. Per-service-call persistence is available as an opt-in feature via the `RequirePerServiceCallChatHistoryPersistence` setting. This satisfies drivers B (atomicity) and D (simplicity) for the common case, while fully satisfying driver A (consistency) for users who opt into simulated service-stored behavior. Users who need per-service-call persistence for recoverability (driver C) can enable it explicitly.

### Configuration Matrix

The behavior depends on the combination of `UseProvidedChatClientAsIs` and `SimulateServiceStoredChatHistory`:
The behavior depends on the combination of `UseProvidedChatClientAsIs` and `RequirePerServiceCallChatHistoryPersistence`:

| `UseProvidedChatClientAsIs` | `SimulateServiceStoredChatHistory` | Behavior |
| `UseProvidedChatClientAsIs` | `RequirePerServiceCallChatHistoryPersistence` | Behavior |
|---|---|---|
| `false` (default) | `false` (default) | **Per-run persistence.** Messages are persisted at the end of the full agent run via the `ChatHistoryProvider`. |
| `false` | `true` | **Per-service-call persistence (simulated).** A `ServiceStoredSimulatingChatClient` middleware is automatically injected into the chat client pipeline between `FunctionInvokingChatClient` and the leaf `IChatClient`. Messages are persisted after each service call. A sentinel `ConversationId` causes FIC to treat the conversation as service-managed. |
| `false` | `true` | **Per-service-call persistence (simulated).** A `PerServiceCallChatHistoryPersistingChatClient` middleware is automatically injected into the chat client pipeline between `FunctionInvokingChatClient` and the leaf `IChatClient`. Messages are persisted after each service call. A sentinel `ConversationId` causes FIC to treat the conversation as service-managed. |
| `true` | `false` | **Per-run persistence.** No middleware is injected because the user has provided a custom chat client stack. Messages are persisted at the end of the run. |
| `true` | `true` | **User responsibility.** The system checks whether the custom chat client stack includes a `ServiceStoredSimulatingChatClient`. If not, a warning is emitted — the user is expected to have added their own per-service-call persistence mechanism. End-of-run persistence is skipped. |
| `true` | `true` | **User responsibility.** The system checks whether the custom chat client stack includes a `PerServiceCallChatHistoryPersistingChatClient`. If not, a warning is emitted — the user is expected to have added their own per-service-call persistence mechanism. End-of-run persistence is skipped. |
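
The four rows of the matrix can be read as a small decision function. The following is a minimal sketch for illustration only, not the framework's implementation: the `PersistenceMode` enum, the `ConfigurationMatrix` class, and the `Resolve` method are hypothetical names. Only the two option names come from this ADR.

```csharp
using System;

// Hypothetical illustration of the configuration matrix; not framework code.
enum PersistenceMode
{
    PerRun,                  // persist once, at the end of the full agent run
    PerServiceCallInjected,  // framework injects the persisting middleware
    PerServiceCallUserOwned  // custom stack must persist; end-of-run persistence is skipped
}

static class ConfigurationMatrix
{
    // Maps the two settings to the behavior described in the matrix.
    public static PersistenceMode Resolve(
        bool useProvidedChatClientAsIs,
        bool requirePerServiceCallChatHistoryPersistence)
    {
        if (!requirePerServiceCallChatHistoryPersistence)
        {
            // Rows 1 and 3: per-run persistence regardless of the chat client stack.
            return PersistenceMode.PerRun;
        }

        // Rows 2 and 4: per-service-call persistence; who provides it differs.
        return useProvidedChatClientAsIs
            ? PersistenceMode.PerServiceCallUserOwned
            : PersistenceMode.PerServiceCallInjected;
    }
}

static class Program
{
    static void Main()
    {
        foreach (bool asIs in new[] { false, true })
        {
            foreach (bool perCall in new[] { false, true })
            {
                Console.WriteLine($"{asIs}, {perCall} -> {ConfigurationMatrix.Resolve(asIs, perCall)}");
            }
        }
    }
}
```

The sketch leaves out the warning emitted in the last row when the custom stack contains no `PerServiceCallChatHistoryPersistingChatClient`, since detecting that depends on framework internals not shown here.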

### Consequences

- Good, because per-run persistence is atomic by default — chat history is only updated when the full run succeeds, satisfying driver B.
- Good, because the default mental model is simple: one run = one history update, satisfying driver D.
- Good, because users who opt into `SimulateServiceStoredChatHistory` get stored history that matches the service's behavior for both timing and content, fully satisfying driver A.
- Good, because users who opt into `RequirePerServiceCallChatHistoryPersistence` get stored history that matches the service's behavior for both timing and content, fully satisfying driver A.
- Good, because per-service-call persistence preserves intermediate progress if the process is interrupted, satisfying driver C when opted in.
- Good, because no separate `FunctionResultContent` trimming logic is needed when per-service-call persistence is active — it is naturally handled.
- Good, because conflict detection (configurable via `ThrowOnChatHistoryProviderConflict`, `WarnOnChatHistoryProviderConflict`, `ClearOnChatHistoryProviderConflict`) prevents misconfiguration when a service returns a `ConversationId` alongside a configured `ChatHistoryProvider`.
- Bad, because per-service-call persistence (when opted in) may leave chat history in an incomplete state if the run fails mid-loop (e.g., `FunctionCallContent` stored without corresponding `FunctionResultContent`), requiring manual recovery in rare cases.
- Neutral, because users who want per-service-call consistency can opt in via `SimulateServiceStoredChatHistory = true`, satisfying driver E.
- Neutral, because users who want per-service-call consistency can opt in via `RequirePerServiceCallChatHistoryPersistence = true`, satisfying driver E.
- Neutral, because increased write frequency from per-service-call persistence may impact performance for some storage backends; this can be mitigated with a caching decorator.

### Implementation Notes

#### Conversation ID Consistency

We should introduce a separate `ConversationIdPersistingChatClient`, middleware which allows us to
persist response `ConversationIds` during the FICC loop. This could be used with or without
`ServiceStoredSimulatingChatClient`.
When `RequirePerServiceCallChatHistoryPersistence` is enabled, the `PerServiceCallChatHistoryPersistingChatClient`
decorator also updates `session.ConversationId` after each service call. This handles two scenarios:

1. **Framework-managed chat history** — the decorator sets a sentinel `ConversationId` on the response
so that `FunctionInvokingChatClient` treats the conversation as service-managed (clearing accumulated
history between iterations and not injecting duplicate `FunctionCallContent` during approval processing).

2. **Service-stored chat history** — when the service returns a real `ConversationId`, the decorator
updates `session.ConversationId` immediately after each service call, rather than deferring the update
to the end of the run. This ensures intermediate ConversationId changes are captured even if the
process is interrupted mid-loop.

For some service-stored scenarios (e.g., the Conversations API with the Responses API), there is only
one thread with one ID, so every service call returns the same ConversationId and this per-call update
makes no practical difference. Enabling `RequirePerServiceCallChatHistoryPersistence` ensures consistent
per-service-call behavior across all service types regardless of how they manage ConversationIds.

@@ -1,15 +1,15 @@
// Copyright (c) Microsoft. All rights reserved.

// This sample demonstrates how the ChatClientAgent persists chat history after each individual
// call to the AI service, using the SimulateServiceStoredChatHistory option.
// call to the AI service, using the RequirePerServiceCallChatHistoryPersistence option.
// When an agent uses tools, FunctionInvokingChatClient may loop multiple times
// (service call → tool execution → service call), and intermediate messages (tool calls and
// results) are persisted after each service call. This allows you to inspect or recover them
// even if the process is interrupted mid-loop, but may also result in chat history that is not
// yet finalized (e.g., tool calls without results) being persisted, which may be undesirable in some cases.
//
// To use end-of-run persistence instead (atomic run semantics), remove the
// SimulateServiceStoredChatHistory = true setting (or set it to false). End-of-run
// RequirePerServiceCallChatHistoryPersistence = true setting (or set it to false). End-of-run
// persistence is the default behavior.
//
// The sample runs two multi-turn conversations: one using non-streaming (RunAsync) and one
@@ -54,7 +54,7 @@ static string GetTime([Description("The city name.")] string city) =>
_ => $"{city}: time data not available."
};

// Create the agent — per-service-call persistence is enabled via SimulateServiceStoredChatHistory.
// Create the agent — per-service-call persistence is enabled via RequirePerServiceCallChatHistoryPersistence.
// The in-memory ChatHistoryProvider is used by default when the service does not require service stored chat
// history, so for those cases, we can inspect the chat history via session.TryGetInMemoryChatHistory().
IChatClient chatClient = string.Equals(store, "TRUE", StringComparison.OrdinalIgnoreCase) ?
@@ -64,7 +64,7 @@ static string GetTime([Description("The city name.")] string city) =>
new ChatClientAgentOptions
{
Name = "WeatherAssistant",
SimulateServiceStoredChatHistory = true,
RequirePerServiceCallChatHistoryPersistence = true,
ChatOptions = new()
{
Instructions = "You are a helpful assistant. When asked about multiple cities, call the appropriate tool for each city.",
@@ -1,19 +1,19 @@
# In-Function-Loop Checkpointing

This sample demonstrates how `ChatClientAgent` can persist chat history after each individual call to the AI service using the `SimulateServiceStoredChatHistory` option. This per-service-call persistence ensures intermediate progress is saved during the function invocation loop.
This sample demonstrates how `ChatClientAgent` can persist chat history after each individual call to the AI service using the `RequirePerServiceCallChatHistoryPersistence` option. This per-service-call persistence ensures intermediate progress is saved during the function invocation loop.

## What This Sample Shows

When an agent uses tools, the `FunctionInvokingChatClient` loops multiple times (service call → tool execution → service call → …). By enabling `SimulateServiceStoredChatHistory = true`, chat history is persisted after each service call via the `ServiceStoredSimulatingChatClient` decorator:
When an agent uses tools, the `FunctionInvokingChatClient` loops multiple times (service call → tool execution → service call → …). By enabling `RequirePerServiceCallChatHistoryPersistence = true`, chat history is persisted after each service call via the `PerServiceCallChatHistoryPersistingChatClient` decorator:

- A `ServiceStoredSimulatingChatClient` decorator is inserted into the chat client pipeline
- A `PerServiceCallChatHistoryPersistingChatClient` decorator is inserted into the chat client pipeline
- Before each service call, the decorator loads history from the `ChatHistoryProvider` and prepends it to the request
- After each service call, the decorator notifies the `ChatHistoryProvider` (and any `AIContextProvider` instances) with the new messages
- Only **new** messages are sent to providers on each notification — messages that were already persisted in an earlier call within the same run are deduplicated automatically

By default (without `SimulateServiceStoredChatHistory`), chat history is persisted at the end of the full agent run instead. To use per-service-call persistence, set `SimulateServiceStoredChatHistory = true` on `ChatClientAgentOptions`.
By default (without `RequirePerServiceCallChatHistoryPersistence`), chat history is persisted at the end of the full agent run instead. To use per-service-call persistence, set `RequirePerServiceCallChatHistoryPersistence = true` on `ChatClientAgentOptions`.

With `SimulateServiceStoredChatHistory` = true, the behavior matches that of chat history stored in the underlying AI service exactly.
With `RequirePerServiceCallChatHistoryPersistence` = true, the behavior matches that of chat history stored in the underlying AI service exactly.

Per-service-call persistence is useful for:
- **Crash recovery** — if the process is interrupted mid-loop, the intermediate tool calls and results are already persisted
@@ -29,7 +29,7 @@ The sample asks the agent about the weather and time in three cities. The model
```
ChatClientAgent
└─ FunctionInvokingChatClient (handles tool call loop)
└─ ServiceStoredSimulatingChatClient (persists after each service call)
└─ PerServiceCallChatHistoryPersistingChatClient (persists after each service call)
└─ Leaf IChatClient (Azure OpenAI)
```
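
The decorator layer in the diagram above can be approximated with a small, self-contained sketch. Everything here is hypothetical and simplified for illustration: `ISimpleChatClient`, `EchoLeafClient`, and `PersistAfterEachCallClient` are stand-ins, not the `IChatClient` or `PerServiceCallChatHistoryPersistingChatClient` types from the framework, and a plain `List<string>` stands in for a `ChatHistoryProvider`. The sketch also assumes the caller passes only messages it has not sent before, so it does not implement deduplication.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a chat client; NOT the framework's IChatClient.
interface ISimpleChatClient
{
    IReadOnlyList<string> GetResponse(IReadOnlyList<string> messages);
}

// Leaf client that fakes a service reply to the last message it receives.
sealed class EchoLeafClient : ISimpleChatClient
{
    public IReadOnlyList<string> GetResponse(IReadOnlyList<string> messages) =>
        new[] { $"assistant: reply to '{messages[messages.Count - 1]}'" };
}

// Decorator: prepends stored history to each request and persists after every call.
sealed class PersistAfterEachCallClient : ISimpleChatClient
{
    private readonly ISimpleChatClient _inner;
    private readonly List<string> _store; // stands in for a chat history provider

    public PersistAfterEachCallClient(ISimpleChatClient inner, List<string> store)
    {
        _inner = inner;
        _store = store;
    }

    public IReadOnlyList<string> GetResponse(IReadOnlyList<string> messages)
    {
        // Before the call: load stored history and prepend it to the request.
        List<string> request = _store.Concat(messages).ToList();

        IReadOnlyList<string> response = _inner.GetResponse(request);

        // After the call: persist the new request messages and the response.
        _store.AddRange(messages);
        _store.AddRange(response);
        return response;
    }
}

static class Program
{
    static void Main()
    {
        var store = new List<string>();
        ISimpleChatClient client = new PersistAfterEachCallClient(new EchoLeafClient(), store);

        // Two service calls, as a tool-calling loop would make.
        client.GetResponse(new[] { "user: weather in Paris?" });
        client.GetResponse(new[] { "tool: Paris is sunny" });

        Console.WriteLine(store.Count); // 4
    }
}
```

Because the store is updated inside `GetResponse`, an interruption between the two calls would still leave the first exchange persisted, which is the crash-recovery property the sample highlights.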
