Bug: Duplicate key constraint violation for immutable entities when batches are appended with memoization enabled
Submitter's note: This issue was researched and written with the assistance of Claude Code. While I cannot personally vouch for the identification of the root cause or the proposed solution, I can confirm that the proposed workaround (GRAPH_STORE_WRITE_BATCH_MEMOIZE=false) is working and resolves the issue in our environment.
Summary
When GRAPH_STORE_WRITE_BATCH_MEMOIZE=true (the default), immutable entities created in appended batches are not added to the last_mod memoization map. This causes last_op() to return None for these entities, leading to load() returning null even when the entity exists in the pending batch. The subgraph then creates a duplicate entity, resulting in a PostgreSQL constraint violation.
Affected Subgraph
This bug was discovered while troubleshooting subgraph QmQhHo4B63yqxLsqFTMjEF6VQJ6xYorn6k5r5a8takVnJQ (a Uniswap V3 fork on BSC using the Messari subgraph methodology). The subgraph uses ActiveAccount entities marked as @entity(immutable: true) for tracking daily/hourly active users.
The issue was identified and root-caused with the assistance of Claude Code (Anthropic's CLI tool).
Error Message
duplicate key value violates unique constraint "active_account_id_key"
Key (id)=(\x...) already exists
Environment
- graph-node version: v0.41.1
- GRAPH_STORE_WRITE_BATCH_MEMOIZE=true (default)
- GRAPH_STORE_WRITE_BATCH_SIZE=50000
- Batching enabled (567 blocks per batch observed)
Root Cause Analysis
The Bug Location
File: graph/src/components/store/write.rs
Problem 1: append_row() doesn't update last_mod for immutable entities (lines 446-459)
fn append_row(&mut self, row: EntityModification) -> Result<(), StoreError> {
    if self.immutable {
        match row {
            EntityModification::Insert { .. } => {
                self.rows.push(row); // BUG: Does NOT call push_row(), so last_mod is not updated!
            }
            // ...
        }
        return Ok(());
    }
    // ... non-immutable path uses push_row() correctly
}
Compare with push_row() which correctly updates last_mod (lines 528-531):
fn push_row(&mut self, row: EntityModification) {
    self.last_mod.insert(row.id().clone(), self.rows.len()); // Updates memoization map
    self.rows.push(row);
}
Problem 2: last_op() returns early when entity not in last_mod (lines 386-398)
pub fn last_op(&self, key: &EntityKey, at: BlockNumber) -> Option<EntityOp<'_>> {
    if ENV_VARS.store.write_batch_memoize {
        let idx = *self.last_mod.get(&key.entity_id)?; // BUG: The ? returns None immediately!
        // ... quick lookup using memoized index
    }
    // Fallback scan - NEVER REACHED if entity not in last_mod!
    self.rows
        .iter()
        .rev()
        .filter(|emod| emod.id() == &key.entity_id)
        .find(|emod| emod.block() <= at)
        .map(|emod| emod.as_entity_op(at))
}
The ? operator causes an early return of None when the entity is not in last_mod, completely bypassing the fallback scan that would have found the entity in rows.
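The early return can be reproduced outside graph-node with a few lines of Rust. This is only a minimal sketch of the `?` behaviour: `find_block`, the `(String, i32)` row shape, and the `index` map are hypothetical stand-ins, not graph-node types.

```rust
use std::collections::HashMap;

/// Looks up the block number recorded for `id`.
/// `index` plays the role of `last_mod`, `rows` the role of `rows`.
fn find_block(index: &HashMap<String, usize>, rows: &[(String, i32)], id: &str) -> Option<i32> {
    // `?` returns `None` from `find_block` itself when `id` is not indexed,
    // so nothing below this line runs for such an id.
    let idx = *index.get(id)?;
    if let Some((_, block)) = rows.get(idx) {
        return Some(*block);
    }
    // Fallback scan: only reachable when `id` is indexed but `idx` is out of range.
    rows.iter()
        .rev()
        .find(|(row_id, _)| row_id.as_str() == id)
        .map(|(_, block)| *block)
}
```

Calling `find_block` with an id that is present in `rows` but missing from `index` yields `None`, which is the same shape of failure as described above for `last_op()`.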
Bug Scenario
- Block N processes, creates immutable entity X
  - as_modifications() calls push() → X added to Batch1's rows AND last_mod
  - transact_block_operations() pushes Batch1 to queue
- Block N+1 processes, creates immutable entity Y
  - as_modifications() calls push() → Y added to Batch2's rows AND last_mod
  - transact_block_operations() calls push_write(Batch2)
  - Since batching is enabled, Batch1.append(Batch2) is called
  - append_row() for Y (immutable): rows.push(Y) but last_mod NOT updated
  - Batch1 now has Y in rows but NOT in last_mod
- Block N+K processes, handler calls load(Y) (same entity ID, e.g., same user same day)
  - EntityCache.get() → Queue.get() → Batch.last_op(Y)
  - last_mod.get(Y) returns None → ? returns None immediately
  - Fallback scan is never executed
  - load(Y) returns null even though Y exists in rows
- Handler creates duplicate new ActiveAccount(Y).save()
  - Insert(Y) is added to the current block's modifications
- Database write fails with duplicate key constraint violation
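The scenario can be modelled with a toy row group. This is a sketch under stated assumptions, not the graph-node implementation: `ToyRowGroup`, `Insert`, `append_row_buggy`, and the explicit `memoize` parameter are hypothetical names that mirror only the `rows`/`last_mod` interaction described in this report.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for an `EntityModification::Insert`.
#[derive(Clone, Debug, PartialEq)]
struct Insert {
    id: String,
    block: i32,
}

/// Toy model with the two fields discussed in this report.
#[derive(Default)]
struct ToyRowGroup {
    rows: Vec<Insert>,
    last_mod: HashMap<String, usize>, // entity id -> index into `rows`
}

impl ToyRowGroup {
    /// Mirrors push_row(): records the index, then stores the row.
    fn push_row(&mut self, row: Insert) {
        self.last_mod.insert(row.id.clone(), self.rows.len());
        self.rows.push(row);
    }

    /// Mirrors the immutable branch of append_row(): the row is stored,
    /// but `last_mod` is left untouched.
    fn append_row_buggy(&mut self, row: Insert) {
        self.rows.push(row);
    }

    /// Mirrors the shape of last_op(); `memoize` stands in for the env flag.
    fn last_op(&self, id: &str, at: i32, memoize: bool) -> Option<&Insert> {
        if memoize {
            let idx = *self.last_mod.get(id)?; // early return on a miss
            if let Some(row) = self.rows.get(idx).filter(|r| r.block <= at) {
                return Some(row);
            }
        }
        self.rows
            .iter()
            .rev()
            .filter(|r| r.id.as_str() == id)
            .find(|r| r.block <= at)
    }
}
```

In this model, pushing X with `push_row` and appending Y with `append_row_buggy` leaves Y findable only when `memoize` is `false`, which matches both the failure and the effect of the workaround.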
Steps to Reproduce
- Deploy a subgraph with an immutable entity type (e.g., ActiveAccount @entity(immutable: true))
- Ensure batching is enabled (default when subgraph is behind chain head)
- Ensure memoization is enabled (GRAPH_STORE_WRITE_BATCH_MEMOIZE=true, the default)
- Index blocks where the same immutable entity ID would be created across different blocks
  - Example: daily-${userId}-${day} pattern where same user transacts in multiple blocks on the same day
- With enough blocks being batched together, the bug manifests as duplicate key errors
Workaround
Disable batch memoization:
export GRAPH_STORE_WRITE_BATCH_MEMOIZE=false
This forces last_op() to always use the fallback scan, which correctly finds entities in rows regardless of whether they're in last_mod. Performance may be impacted for large batches.
Proposed Fix
Option 1: Fix append_row() to update last_mod for immutable entities
fn append_row(&mut self, row: EntityModification) -> Result<(), StoreError> {
    if self.immutable {
        match row {
            EntityModification::Insert { .. } => {
                self.push_row(row); // Use push_row() which updates last_mod
            }
            // ...
        }
        return Ok(());
    }
    // ...
}
Option 2: Fix last_op() to fall through to scan when entity not in last_mod
pub fn last_op(&self, key: &EntityKey, at: BlockNumber) -> Option<EntityOp<'_>> {
    if ENV_VARS.store.write_batch_memoize {
        if let Some(&idx) = self.last_mod.get(&key.entity_id) { // Use if-let instead of ?
            if let Some(op) = self.rows.get(idx).and_then(|emod| {
                if emod.block() <= at {
                    Some(emod.as_entity_op(at))
                } else {
                    None
                }
            }) {
                return Some(op);
            }
        }
        // Fall through to scan if not in last_mod
    }
    // Fallback scan (always executed if memoization fails)
    self.rows
        .iter()
        .rev()
        .filter(|emod| emod.id() == &key.entity_id)
        .find(|emod| emod.block() <= at)
        .map(|emod| emod.as_entity_op(at))
}
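Option 1 is preferred as it maintains the performance benefit of memoization: last_mod stays complete, so a miss in the map still means the entity is not in the batch. With Option 2, every lookup for an ID missing from last_mod, including IDs that are genuinely absent from the batch, falls back to a linear scan of rows.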
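A regression check for Option 1 could assert that `last_mod` indexes every row after an append. The sketch below does this on a toy model; `ToyRowGroup` and `last_mod_is_consistent` are hypothetical names, and the real `RowGroup` carries more state than shown here.

```rust
use std::collections::HashMap;

/// Toy stand-in for `RowGroup`; only the two fields discussed in this report.
#[derive(Default)]
struct ToyRowGroup {
    rows: Vec<(String, i32)>,         // (entity id, block number)
    last_mod: HashMap<String, usize>, // entity id -> index into `rows`
}

impl ToyRowGroup {
    /// Mirrors push_row(): records the index, then stores the row.
    fn push_row(&mut self, id: &str, block: i32) {
        self.last_mod.insert(id.to_string(), self.rows.len());
        self.rows.push((id.to_string(), block));
    }

    /// Option 1 applied to the toy: appended rows go through push_row().
    fn append_row(&mut self, id: &str, block: i32) {
        self.push_row(id, block);
    }

    /// Hypothetical invariant: for every id, `last_mod` points at the
    /// last row in `rows` that carries that id.
    fn last_mod_is_consistent(&self) -> bool {
        let mut expected = HashMap::new();
        for (idx, (id, _block)) in self.rows.iter().enumerate() {
            expected.insert(id.clone(), idx); // later rows overwrite earlier ones
        }
        expected == self.last_mod
    }
}
```

The invariant holds after `push_row` and the fixed `append_row`, and fails as soon as a row is pushed onto `rows` directly, which is what the current immutable branch does.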
Additional Notes
- The write_batch_memoize feature was added relatively recently with a comment suggesting removal "after 2025-07-01 if there have been no issues with it" (see graph/src/env/store.rs:134-135)
- This bug only affects immutable entities when batches are appended (i.e., when batching is enabled and multiple blocks are combined into a single write)
- The bug is timing-dependent: it requires the same immutable entity ID to be created in different blocks that end up in the same combined batch
Related Code Paths
- graph/src/components/store/write.rs: RowGroup::append_row(), RowGroup::push_row(), RowGroup::last_op()
- store/postgres/src/writable.rs: Queue::get(), BlockTracker::find_map(), Queue::push_write()
- graph/src/components/store/entity_cache.rs: EntityCache::get(), EntityCache::as_modifications()