fix: bridge model.generate() to agenerate() for custom columns in async engine#545
fix: bridge model.generate() to agenerate() for custom columns in async engine#545andreatgretel wants to merge 1 commit intomainfrom
Conversation
…ync engine Custom column generators that call model.generate() fail under the async engine because the sync HTTP client is unavailable. Add an _AsyncBridgedModelFacade proxy in _build_models_dict() that intercepts the sync-client RuntimeError and schedules agenerate() on the engine's persistent event loop via run_coroutine_threadsafe. Includes a deadlock guard for async custom columns running on the event loop.
Code Review: PR #545 — fix: bridge model.generate() to agenerate() for custom columns in async engineSummaryThis PR adds an Files changed: 2 (1 source, 1 test) — +160 / -3 lines. FindingsCorrectness
Design
Testing
Style / Conventions
VerdictApprove with minor suggestions. The implementation is well-structured, follows project conventions, and includes solid test coverage. The proxy pattern is clean, the deadlock guard is a thoughtful addition, and the tests cover the important scenarios. The main concern is the exact-string error matching (Finding #1), which creates a fragile coupling to an upstream error message. This is acceptable for a short-term fix but should be tracked for improvement — ideally by introducing a typed exception in the HTTP client layer. The missing kwargs assertion in the bridge test (Finding #2) is minor but easy to fix. No blocking issues found. |
Greptile SummaryIntroduces
|
| Filename | Overview |
|---|---|
| packages/data-designer-engine/src/data_designer/engine/column_generators/generators/custom.py | Adds _AsyncBridgedModelFacade proxy and updates _build_models_dict to wrap all facades; logic is correct — sync pass-through, async bridge via run_coroutine_threadsafe, and deadlock guard all behave as intended. |
| packages/data-designer-engine/tests/engine/column_generators/generators/test_custom.py | Adds four targeted tests (sync pass-through, async bridge, non-client-error propagation, deadlock guard); coverage is complete for the new proxy class and the patch target is correct. |
Sequence Diagram
sequenceDiagram
participant UC as User Custom Column
participant P as _AsyncBridgedModelFacade
participant F as ModelFacade
participant L as Engine Event Loop<br/>(ensure_async_engine_loop)
Note over UC,L: Sync Engine Path
UC->>P: model.generate(prompt)
P->>F: facade.generate(prompt)
F-->>P: (result, metadata)
P-->>UC: (result, metadata)
Note over UC,L: Async Engine Path (worker thread via asyncio.to_thread)
UC->>P: model.generate(prompt)
P->>F: facade.generate(prompt)
F-->>P: RuntimeError("Sync methods are not available...")
P->>P: asyncio.get_running_loop() → RuntimeError (no loop in worker thread)
P->>L: ensure_async_engine_loop()
L-->>P: persistent event loop
P->>L: run_coroutine_threadsafe(facade.agenerate(prompt))
L->>F: await facade.agenerate(prompt)
F-->>L: (result, metadata)
L-->>P: future resolved
P-->>UC: (result, metadata)
Note over UC,L: Deadlock Guard (called from event loop)
UC->>P: model.generate(prompt)
P->>F: facade.generate(prompt)
F-->>P: RuntimeError("Sync methods are not available...")
P->>P: asyncio.get_running_loop() → loop (running!)
P-->>UC: RuntimeError("Use 'await model.agenerate()'")
Reviews (1): Last reviewed commit: "feat: bridge model.generate() to agenera..." | Re-trigger Greptile
📋 Summary
Custom column generators that call
model.generate()inside their function body fail under theasync engine because the sync HTTP client is unavailable. This adds a transparent proxy that
bridges to
model.agenerate()viarun_coroutine_threadsafe, so user code works unchangedin both engines.
🔗 Related Issue
N/A - discovered via NVIDIA-NeMo/Anonymizer#119 where a
model_compat.model_generate()workaroundwas needed; this PR moves the fix into DD so all consumers get it for free.
🔄 Changes
_AsyncBridgedModelFacadeproxy class that intercepts the sync-clientRuntimeErrorandschedules
agenerate()on the engine's persistent event loop(
custom.py#L25-L72)_build_models_dict()so the bridge is transparent to user codeHttpModelClientsync-mode error (not substring matching)🧪 Testing
make testpasses (197 column generator tests)✅ Checklist