Crash when trying to replicate a ref table that is actually dropped #6592

@agedemenli

Description
Steps to reproduce:

CREATE TABLE ref_table (key int);
SELECT create_reference_table('ref_table');
SET citus.enable_ddl_propagation TO OFF;
DROP TABLE ref_table CASCADE; -- citus_drop_all_shards doesn't drop shards and metadata

SELECT 1 FROM master_add_node('localhost', 9700, groupId => 0); -- a new node without the ref table
CREATE TABLE citus_local_table(id int, other_column int);
SELECT citus_add_local_table_to_metadata('citus_local_table'); -- triggers EnsureReferenceTablesExistOnAllNodes
-- ref_table has placements on both workers, but not on the coordinator
-- EnsureReferenceTablesExistOnAllNodes tries to replicate it to the coordinator

DROP TABLE triggers some other drop functions:

  • master_remove_distributed_table_metadata_from_workers
  • citus_drop_all_shards
  • master_remove_partition_metadata

but these functions return early without dropping anything, because enable_ddl_propagation is off.
The actual table is dropped on the coordinator, but the shards remain on the workers. Nothing happens to pg_dist_partition, pg_dist_shard, or pg_dist_placement either. So the metadata still claims we have a reference table with shard placements on both workers.
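The stale metadata is easy to observe. A quick inspection after the DROP above (a sketch; note that logicalrelid is a regclass, so the dropped table's entry shows up as a bare oid rather than a name):

```sql
-- Inspect the leftover metadata after the DROP above; the dropped table's
-- row is still present in each catalog:
SELECT logicalrelid, partmethod, repmodel FROM pg_dist_partition;
SELECT shardid, logicalrelid FROM pg_dist_shard;
SELECT shardid, groupid FROM pg_dist_placement;
```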

When master_add_node is called, it introduces a new node which doesn't have the reference table on it.

citus_add_local_table_to_metadata calls EnsureReferenceTablesExistOnAllNodes, which fetches all reference table records from pg_dist_partition, including the dropped one. Then, because pg_dist_placement shows no placement for the reference table on the new node, it tries to replicate the reference table there by calling citus_copy_shard_placement.
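Conceptually (this is a sketch, not the actual implementation), the check that triggers the replication boils down to finding node groups that lack a placement of the reference table's shard; `<ref_shard_id>` below is a placeholder for the stale shard id found in pg_dist_shard:

```sql
-- Sketch of the placement-gap check that drives the replication attempt.
-- <ref_shard_id> is a placeholder, not a value from the repro above.
SELECT groupid
FROM pg_dist_node
WHERE groupid NOT IN (
    SELECT groupid FROM pg_dist_placement WHERE shardid = <ref_shard_id>
);
```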

citus_copy_shard_placement crashes while recreating the table on the target node, because it can't resolve the name/schema for the given relationId: the table no longer exists.

In short, I think we have two separate issues here:

  1. DROP TABLE doesn't drop shards or metadata when DDL propagation is off. We could arguably leave the shards on the workers, but we should delete the pg_dist_* entries for the table, since the reference/distributed table no longer exists. That alone prevents the crash, because there will be no entry left in pg_dist_partition for the table.
    The shards left on the workers can either be cleaned up right away, ignoring the GUC, or we can create cleanup records for them so they are dropped once the GUC is turned back on. We could also consider removing the GUC in the future.
  2. We should add checks and errors to the citus_copy_shard_placement logic so that it never crashes in cases like this but errors out gracefully.
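Until (1) is fixed, a manual cleanup along these lines should unblock a cluster already in this state. This is a rough sketch that deletes the stale catalog rows directly; `<dropped_oid>` is a placeholder for the dropped table's oid as identified in pg_dist_partition, so run it with care:

```sql
-- Manual cleanup sketch: remove the stale pg_dist_* rows for the dropped
-- table. <dropped_oid> is a placeholder, not a value from the repro above.
DELETE FROM pg_dist_placement
WHERE shardid IN (SELECT shardid FROM pg_dist_shard
                  WHERE logicalrelid = <dropped_oid>);
DELETE FROM pg_dist_shard WHERE logicalrelid = <dropped_oid>;
DELETE FROM pg_dist_partition WHERE logicalrelid = <dropped_oid>;
```

The orphaned shards on the workers would still need to be dropped there separately.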