fix: resilient manifest parsing for disabled field and unknown versions#95
Open
Sourabhchrs93 wants to merge 1 commit intomainfrom
Open
fix: resilient manifest parsing for disabled field and unknown versions#95Sourabhchrs93 wants to merge 1 commit intomainfrom
Sourabhchrs93 wants to merge 1 commit intomainfrom
Conversation
…ions - Add fallback that strips unused `disabled` field on Pydantic validation failure — the field uses strict discriminated unions that break on dbt Cloud schema changes but is never consumed by downstream wrappers - Add forward-compatibility for unknown manifest versions (e.g. v13+) by attempting parse with latest known class instead of hard failing - Refactor version routing to dict lookup for clarity Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
| return _try_parse_manifest(manifest, model_class) | ||
|
|
||
| # Forward-compatibility: unknown manifest version — try latest known class | ||
| match = _MANIFEST_VERSION_RE.match(dbt_schema_version) |
There was a problem hiding this comment.
Bug: The code may raise a TypeError when parsing a manifest. re.match is called on dbt_schema_version, which can be None if the manifest contains a null value for it.
Severity: MEDIUM
Suggested Fix
Add a check to ensure dbt_schema_version is a non-None string before passing it to re.match(). This validation can be added directly before the call or within the get_dbt_schema_version utility function.
Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent.
Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not
valid.
Location: src/vendor/dbt_artifacts_parser/parser.py#L159
Potential issue: The function `get_dbt_schema_version` can return `None` if an input
manifest contains `"metadata": {"dbt_schema_version": null}`. This `None` value is then
passed to `re.match()` within the `parse_manifest` function. Since `re.match()` expects
a string or bytes-like object, this will raise a `TypeError` and crash the parser.
Although manifests with a null schema version may be uncommon, the corresponding
Pydantic model explicitly defines this field as optional, making this a possible edge
case that is not handled.
Did we get this right? 👍 / 👎 to inform future reviews.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
dbt_ingestioncelery task failure (ManifestV12810 validation errors ondisabledfield)disabledfield uses strict Pydantic discriminated unions (DisabledthroughDisabled13) that break when dbt Cloud changes its manifest schema. This field is never consumed by any downstream wrapper or backend code.disabledfield and retries parsingValueErrorRoot Cause
dbt Cloud updated their manifest artifact, and the auto-generated
ManifestV12Pydantic model'sdisabledfield couldn't validate the new structure. Each disabled node failed against all 14 union variants, producing 810 validation errors.Changes
_try_parse_manifest(): attempts strict parse first, falls back to stripping unused fields_strip_unused_fields(): removesdisabled(and any future unused strict fields) from manifest dictmanifest/vN.jsonURLs and tries latest known parserTest plan
🤖 Generated with Claude Code