[spanner-to-sourcedb] Add Integration Tests for retryDLQ and retryAllDLQ mode for sharded and non-sharded setup #3564

Merged
aasthabharill merged 27 commits into main from aastha-dlq-it on Apr 14, 2026

@aasthabharill aasthabharill commented Mar 26, 2026

b/457948107

This PR introduces comprehensive integration tests for the Dead Letter Queue (DLQ) retry mechanisms in the Spanner-to-Source Dataflow template. It adds end-to-end verification for both the concurrent batch retry flow (retryDLQ) and the streaming retry flow (retryAllDLQ), covering both non-sharded and multi-shard schema routing scenarios.

Tests Added

  1. SpannerToSourceDBMySQLRetryDLQIT: Tests the retryDLQ batch job. Validates that it correctly processes and retries DLQ events alongside an actively running streaming pipeline by utilizing the active dlqPubSubConsumer flow. Uses the overrides file.
  2. SpannerToSourceDBMySQLRetryAllDLQIT: Tests the retryAllDLQ batch job. Validates that it correctly processes and retries ALL DLQ events offline when the main pipeline has been safely drained or stopped, utilizing the file-based consumer. Uses the overrides file.
  3. SpannerToSourceDBShardedMySQLRetryDLQIT: The sharded equivalent of retryDLQ. Validates that the pipeline successfully evaluates native migration_shard_id columns in the source Spanner database to reliably route DLQ/retry events across multiple distributed target instances. Uses the session file.
  4. SpannerToSourceDBShardedMySQLRetryAllDLQIT: The sharded equivalent of retryAllDLQ. Validates the behavior of dynamic custom shard routing (where ShardIdColumn is not present) by relying on a custom Java shard ID fetcher. Uses overrides file.
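
The custom shard ID fetcher used by the fourth test is not shown in this description. As a rough illustration only, the sketch below assumes a hypothetical `ShardFetcher` interface and a modulo rule on a numeric key column; the shard names, the column name `id`, and the routing rule are all assumptions, and the template's real interface and the `CustomShardIdFetcherForRetryIT` class may look different.

```java
import java.util.List;
import java.util.Map;

public class ShardRoutingSketch {

  /** Hypothetical stand-in for the template's custom shard ID fetcher interface. */
  public interface ShardFetcher {
    String getShardId(String tableName, Map<String, Object> spannerRecord);
  }

  /** Assumed rule: pick a shard from the numeric key modulo the number of shards. */
  public static final class ModuloShardFetcher implements ShardFetcher {
    private final List<String> shardIds;
    private final String keyColumn;

    public ModuloShardFetcher(List<String> shardIds, String keyColumn) {
      this.shardIds = List.copyOf(shardIds);
      this.keyColumn = keyColumn;
    }

    @Override
    public String getShardId(String tableName, Map<String, Object> spannerRecord) {
      Object key = spannerRecord.get(keyColumn);
      if (!(key instanceof Number)) {
        throw new IllegalArgumentException("Missing or non-numeric key column: " + keyColumn);
      }
      // floorMod keeps the index non-negative even for negative keys.
      long index = Math.floorMod(((Number) key).longValue(), (long) shardIds.size());
      return shardIds.get((int) index);
    }
  }

  public static void main(String[] args) {
    ShardFetcher fetcher = new ModuloShardFetcher(List.of("shardA", "shardB"), "id");
    System.out.println(fetcher.getShardId("Users", Map.of("id", 7L)));
  }
}
```

Because the shard is derived from the record itself rather than from a stored column, a retried DLQ event would be routed the same way as the original event, as long as the rule is deterministic.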

Features & Edge Cases Covered

  • DLQ State Integrity: Verifying that fixed items are written to MySQL on retry, while genuinely unfixable items are routed back into the appropriate error bucket (severe/ or retry/) without stalling progress.
  • Handling Database Constraints: Generating and resolving retriable database exceptions, including testing resilience against missing parent rows (Foreign Key Violations).
  • Handling Logic/Processing Errors: Simulating unrecoverable severe errors originating inside the runtime execution pipeline (failed custom transformations).
  • Active User-Intervention Simulation: Orchestrating mid-flight environmental repairs. Before retry pipelines are run, the tests actively "fix" the previously generated errors by manually injecting parent rows via JDBC queries and swapping the custom pipeline transformer from bad mode to semi-fixed mode.
  • Multi-Shard Target Routing: Validating sharding logic by testing both ShardIdColumn flow and Custom Sharding Jar flow.
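
The split between the retry/ and severe/ buckets described above can be pictured with a small classifier. This is a sketch under assumptions, not the template's actual logic: the `classify` method and `Bucket` enum are hypothetical, and the only hard fact used is that MySQL reports a missing parent row as vendor error code 1452.

```java
import java.sql.SQLException;

public class DlqBucketSketch {

  /** Mirrors the two DLQ directories named in the description (retry/ and severe/). */
  public enum Bucket { RETRY, SEVERE }

  // MySQL error 1452: a child row references a parent row that does not exist.
  public static final int MYSQL_FK_PARENT_MISSING = 1452;

  /** Assumed rule: a missing parent row is retriable, everything else is severe. */
  public static Bucket classify(Throwable error) {
    if (error instanceof SQLException
        && ((SQLException) error).getErrorCode() == MYSQL_FK_PARENT_MISSING) {
      return Bucket.RETRY;
    }
    return Bucket.SEVERE;
  }

  public static void main(String[] args) {
    Throwable fkViolation = new SQLException("foreign key constraint fails", "23000", 1452);
    Throwable badTransform = new IllegalStateException("custom transformation failed");
    System.out.println(classify(fkViolation));
    System.out.println(classify(badTransform));
  }
}
```

Under this reading, inserting the parent row lets a later retry of the same event succeed, whereas a failing custom transformation keeps failing until the transformer itself is replaced.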

Test Setup & Simulated Schema Divergences

To accurately simulate volatile production environments, these integration tests operate against highly divergent test schemas between the Spanner source and MySQL target:

  • Type Mapping: Validating type mappings against an extensive, heavily populated AllDataTypes reference table.
  • Mismatched Primary Keys: Verifying that pipelines run correctly when the Spanner and source primary keys intentionally differ.
  • Altered / Divergent Columns: Accommodating architectures where columns have been asynchronously added, deleted, or explicitly renamed across Spanner / Source database platforms.
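
To make the column divergence concrete, here is a minimal sketch of translating a Spanner row into the source schema when some columns are renamed and others exist only on the Spanner side. The column names and the `toSourceRow` helper are hypothetical; in the tests themselves this mapping comes from the overrides or session file rather than from code like this.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

public class ColumnDivergenceSketch {

  /** Renames mapped columns and drops columns that exist only in Spanner. */
  public static Map<String, Object> toSourceRow(
      Map<String, Object> spannerRow, Map<String, String> renames, Set<String> spannerOnly) {
    Map<String, Object> sourceRow = new LinkedHashMap<>();
    for (Map.Entry<String, Object> entry : spannerRow.entrySet()) {
      if (spannerOnly.contains(entry.getKey())) {
        continue; // column was never added to (or was deleted from) the source table
      }
      String sourceName = renames.getOrDefault(entry.getKey(), entry.getKey());
      sourceRow.put(sourceName, entry.getValue());
    }
    return sourceRow;
  }

  public static void main(String[] args) {
    Map<String, Object> spannerRow = new LinkedHashMap<>();
    spannerRow.put("id", 1L);
    spannerRow.put("full_name", "Ada");   // hypothetical: renamed to "name" in MySQL
    spannerRow.put("audit_note", "n/a");  // hypothetical: exists only in Spanner
    System.out.println(
        toSourceRow(spannerRow, Map.of("full_name", "name"), Set.of("audit_note")));
  }
}
```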

@gemini-code-assist

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the Dead Letter Queue (DLQ) handling capabilities for DataStreamToSpanner and SpannerToSourceDb templates. It introduces a new 'one-shot' re-consumer for severe errors and a retryAllDLQ run mode, providing more flexible and robust options for error recovery. These changes allow users to either concurrently process severe errors alongside a running pipeline or perform a full drain of all DLQ errors when the main pipeline is offline, improving the overall resilience and manageability of migration pipelines.

Highlights

  • Enhanced DLQ Re-consumption: Introduced a new 'one-shot' DLQ re-consumer (dlqOneShotReconsumer) in DeadLetterQueueManager and FileBasedDeadLetterQueueReconsumer to process severe errors from the DLQ directory for a specific time range.
  • New retryAllDLQ Run Mode: Added a new retryAllDLQ run mode for both the DataStreamToSpanner and SpannerToSourceDb templates. This mode allows draining both retryable and severe DLQ errors when the main pipeline is not running, providing a comprehensive error recovery mechanism.
  • Updated Documentation for DLQ Modes: Revised the README.md files for sourcedb-to-spanner and spanner-to-sourcedb to clearly explain the new retryAllDLQ mode, clarify the usage of retryDLQ mode, and provide guidance on end-state monitoring for DLQ processing.
  • Constants for Run Modes: Refactored run mode string literals into dedicated constants within Constants.java to improve code readability and maintainability across the templates.
  • New Integration Test for SpannerToSourceDb Retry: Added a new integration test (SpannerToSrcDBMySQLAllDataTypesRetryIT) to validate the retry logic and the new retryAllDLQ mode for the SpannerToSourceDb template, including custom transformations and partial fixes.

codecov bot commented Mar 26, 2026

Codecov Report

❌ Patch coverage is 0% with 65 lines in your changes missing coverage. Please review.
✅ Project coverage is 52.49%. Comparing base (5baa012) to head (203cfdb).
⚠️ Report is 91 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...m/custom/SpannerToSourceDbRetryTransformation.java | 0.00% | 44 Missing ⚠️ |
| ...ava/com/custom/CustomShardIdFetcherForRetryIT.java | 0.00% | 21 Missing ⚠️ |

Additional details and impacted files
@@             Coverage Diff              @@
##               main    #3564      +/-   ##
============================================
+ Coverage     52.20%   52.49%   +0.28%     
- Complexity     6060     6225     +165     
============================================
  Files          1040     1062      +22     
  Lines         63059    64136    +1077     
  Branches       6912     7089     +177     
============================================
+ Hits          32923    33666     +743     
- Misses        27909    28188     +279     
- Partials       2227     2282      +55     
| Components | Coverage Δ | |
|---|---|---|
| spanner-templates | 72.16% <0.00%> (+<0.01%) | ⬆️ |
| spanner-import-export | 68.89% <ø> (-0.04%) | ⬇️ |
| spanner-live-forward-migration | 80.87% <ø> (+0.49%) | ⬆️ |
| spanner-live-reverse-replication | 77.54% <0.00%> (-0.30%) | ⬇️ |
| spanner-bulk-migration | 89.32% <ø> (+0.13%) | ⬆️ |
| gcs-spanner-dv | 85.75% <ø> (+0.40%) | ⬆️ |

| Files with missing lines | Coverage Δ | |
|---|---|---|
| ...ava/com/custom/CustomShardIdFetcherForRetryIT.java | 0.00% <0.00%> (ø) | |
| ...m/custom/SpannerToSourceDbRetryTransformation.java | 0.00% <0.00%> (ø) | |

... and 42 files with indirect coverage changes

@aasthabharill aasthabharill changed the title from "DLQ Integration Tests" to "[spanner-to-sourcedb] Add Integration Tests for retryDLQ and retryAllDLQ mode" Mar 28, 2026
@aasthabharill aasthabharill marked this pull request as ready for review March 28, 2026 11:13
@aasthabharill aasthabharill requested a review from a team as a code owner March 28, 2026 11:13
@aasthabharill aasthabharill marked this pull request as draft March 28, 2026 15:20
@aasthabharill aasthabharill marked this pull request as ready for review March 28, 2026 17:41
@aasthabharill aasthabharill changed the title from "[spanner-to-sourcedb] Add Integration Tests for retryDLQ and retryAllDLQ mode" to "[spanner-to-sourcedb] Add Integration Tests for retryDLQ and retryAllDLQ mode for sharded and non-sharded setup" Mar 28, 2026
@aasthabharill aasthabharill force-pushed the aastha-dlq-it branch 5 times, most recently from 71cf1b5 to 46eaeea Compare April 7, 2026 12:25
@darshan-sj darshan-sj left a comment

Few minor comments

@darshan-sj darshan-sj left a comment

LGTM!

Thank you!

@aasthabharill aasthabharill merged commit 72a145d into main Apr 14, 2026
55 checks passed
@aasthabharill aasthabharill deleted the aastha-dlq-it branch April 14, 2026 08:30