fix(sessions): prevent PydanticSerializationError when session state contains non-serializable objects#4748
Conversation
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly enhances the robustness of session state management by addressing a critical Pydantic serialization error. It ensures that agent session data, even when containing complex or non-serializable Python objects, can be reliably persisted without causing application crashes. Furthermore, it introduces a comprehensive, SQL-backed memory service with a flexible scratchpad feature, empowering agents with durable storage for their working memory and facilitating more complex, stateful interactions. Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Changelog
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
|
Hello @Raman369AI, thank you for creating this PR! This PR is a bug fix, could you please provide logs or a screenshot after the fix is applied? This information will help reviewers to review your PR more efficiently. Thanks! Response from ADK Triaging Agent |
There was a problem hiding this comment.
Code Review
This pull request addresses a PydanticSerializationError by making session state JSON-serializable. The fix in event_actions.py is a good approach to prevent crashes. However, this PR also introduces a very large new feature: the DatabaseMemoryService with its own storage schema, search backend, agent tools, and extensive tests. This feature is not mentioned in the pull request title or description, which makes the PR's scope unclear and hard to review. It is strongly recommended to split this into two separate pull requests: one for the PydanticSerializationError fix, and a separate, larger PR for the new DatabaseMemoryService feature with an updated title and description. This separation will make the changes easier to review, test, and merge safely. I have added a couple of specific comments on the implementation details.
| existing = await sql.get( | ||
| StorageScratchpadKV, (app_name, user_id, session_id, key) | ||
| ) | ||
| if existing is not None: | ||
| existing.value_json = value | ||
| else: | ||
| sql.add( | ||
| StorageScratchpadKV( | ||
| app_name=app_name, | ||
| user_id=user_id, | ||
| session_id=session_id, | ||
| key=key, | ||
| value_json=value, | ||
| ) | ||
| ) |
There was a problem hiding this comment.
The current implementation for set_scratchpad uses a get followed by an add or update. This pattern is not atomic and can lead to a race condition. If two coroutines attempt to set a value for the same new key concurrently, both might find that the key doesn't exist and then both will attempt to add it. This will cause the second transaction to fail with a primary key constraint violation when it's committed.
To ensure this upsert operation is atomic and safe from race conditions, I recommend using session.merge(). It's the idiomatic SQLAlchemy ORM way to handle upserts portably across different database backends.
await sql.merge(
StorageScratchpadKV(
app_name=app_name,
user_id=user_id,
session_id=session_id,
key=key,
value_json=value,
)
)| if isinstance(obj, dict): | ||
| return {k: _make_json_serializable(v) for k, v in obj.items()} | ||
| if isinstance(obj, (list, tuple)): | ||
| return [_make_json_serializable(v) for v in obj] | ||
| try: | ||
| json.dumps(obj) | ||
| return obj | ||
| except (TypeError, ValueError): | ||
| return f'<not serializable: {type(obj).__name__}>' |
There was a problem hiding this comment.
The current implementation of _make_json_serializable doesn't handle Pydantic BaseModel instances, which are not directly serializable by the standard json library. If a Pydantic model is present in the state, it will be incorrectly marked as non-serializable (<not serializable: ...>), leading to data that could have been persisted being lost.
To make this more robust, you should add a specific check for BaseModel instances and serialize them using obj.model_dump() before proceeding with the recursive serialization. This will ensure that Pydantic models within the state are correctly persisted.
| if isinstance(obj, dict): | |
| return {k: _make_json_serializable(v) for k, v in obj.items()} | |
| if isinstance(obj, (list, tuple)): | |
| return [_make_json_serializable(v) for v in obj] | |
| try: | |
| json.dumps(obj) | |
| return obj | |
| except (TypeError, ValueError): | |
| return f'<not serializable: {type(obj).__name__}>' | |
| if isinstance(obj, BaseModel): | |
| return _make_json_serializable(obj.model_dump()) | |
| if isinstance(obj, dict): | |
| return {k: _make_json_serializable(v) for k, v in obj.items()} | |
| if isinstance(obj, (list, tuple)): | |
| return [_make_json_serializable(v) for v in obj] | |
| try: | |
| json.dumps(obj) | |
| return obj | |
| except (TypeError, ValueError): | |
| return f'<not serializable: {type(obj).__name__}>' |
… non-serializable objects When Python callables or other non-JSON-serializable values are stored in session state (e.g. via MCP tool callbacks), EventActions.state_delta and agent_state would cause DatabaseSessionService.append_event to crash with PydanticSerializationError during event.model_dump(mode="json"). Add field serializers for state_delta and agent_state that recursively convert non-serializable leaf values to a descriptive string, allowing the event to be persisted without data loss for serializable fields. Fixes google#4724
d98a2f9 to
d661e31
Compare
Reproduction & fix verificationBefore the fix — running the minimal reproduction from #4724: from google.adk.events.event_actions import EventActions
def my_callback(): pass
actions = EventActions(state_delta={"user": "alice", "callback": my_callback})
actions.model_dump(mode="json")
# PydanticSerializationError: Unable to serialize unknown type: <class 'function'>After the fix: from google.adk.events.event_actions import EventActions
def my_callback(): pass
actions = EventActions(state_delta={"user": "alice", "callback": my_callback})
result = actions.model_dump(mode="json")
print(result["state_delta"])
# {"user": "alice", "callback": "<not serializable: function>"}
# ✅ No crash — serializable values are preserved, non-serializable ones are gracefully replacedThe Also note: the previous push accidentally included an unrelated |
Summary
Fixes #4724
DatabaseSessionService.append_eventcrashes withPydanticSerializationError: Unable to serialize unknown type: <class 'function'>when session state contains non-JSON-serializable objects (e.g. Python callables stored via MCP tool callbacks).Root cause:
EventActions.state_deltais typed asdict[str, object]andagent_stateasdict[str, Any]— both accept arbitrary Python objects. WhenStorageEvent.from_event()callsevent.model_dump(mode="json"), Pydantic cannot serialize callables and raisesPydanticSerializationError, crashing the agent.Fix: Add Pydantic
@field_serializerdecorators forstate_deltaandagent_stateinEventActions. A helper_make_json_serializable()recursively walks the dict/list structure and replaces any non-JSON-serializable leaf value with a descriptive string (<not serializable: typename>), so the event is safely persisted without crashing and without losing any serializable data.Changes
src/google/adk/events/event_actions.py_make_json_serializable()helper@field_serializer('state_delta')onEventActions@field_serializer('agent_state')onEventActionsTest plan
DatabaseSessionService; confirmappend_eventno longer crashes when state contains callable values