Skip to content

Commit a58f303

Browse files
authored
Merge pull request #2564 from keboola/PSGO-227/fix-multiple-parents-variables-templates
fix(relations): two-pass link+validate to prevent order-dependent multiple-parents crash [PSGO-227]
2 parents de9fe61 + 58f0ca4 commit a58f303

10 files changed

Lines changed: 341 additions & 23 deletions

File tree

docs/relations-validation.md

Lines changed: 97 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,97 @@
1+
# Variables Relations Validation
2+
3+
## Background: the keboola.variables parent-child model
4+
5+
In the Keboola platform, variable configs (`keboola.variables` component) belong to a parent config (e.g. a transformation). In the local CLI directory layout, the variables config lives *inside* the parent's folder:
6+
7+
```
8+
my-transformation/
9+
variables/
10+
config.json ← keboola.variables config
11+
values/
12+
default/ ← variables values row
13+
```
14+
15+
This parent-child ownership is expressed through relations stored on the objects:
16+
17+
| Relation type | Stored on | Direction | Storage |
18+
|---|---|---|---|
19+
| `variablesFor` | variables config | → parent config | manifest (local only) |
20+
| `variablesFrom` | parent config | → variables config | API |
21+
| `variablesValuesFor` | values row | → parent config | manifest |
22+
| `variablesValuesFrom` | parent config | → values row | API |
23+
24+
The `variablesFor` relation on the variables config determines which parent folder it is placed in during sync. Because each object can have only one file-system path, **each variables config can have exactly one `variablesFor` relation** (i.e. one parent). This is enforced by `OneToXRelations()` in `internal/pkg/model/relation.go`.
25+
26+
## Why multiple parents can appear in the API
27+
28+
Keboola's Storage API allows a variables config to be referenced by more than one consumer via the project's configuration metadata. If a user creates such a setup through the API or the UI directly, the `keboola.variables` config will have multiple `variablesFrom` back-references, which the mapper translates into multiple `variablesFor` relations during the linking step.
29+
30+
## The `multiple parents` guard
31+
32+
`Relations.ParentKey()` (`internal/pkg/model/relation.go:66`) collects all relations that define a parent key. If more than one is found it returns a fatal error:
33+
34+
```
35+
unexpected state: multiple parents defined by "relations" in <config desc>
36+
```
37+
38+
This guard exists to protect the CLI sync engine: without it, path generation would be ambiguous and the local directory structure would be corrupted.
39+
40+
## The ordering bug that broke Templates
41+
42+
The relation mapper (`internal/pkg/mapper/relations/link.go`) previously processed objects in a single pass: link → validate per object. This is order-dependent:
43+
44+
1. If the variables config Y is iterated **before** its consumers X and Z, `validateRelations(Y)` runs when Y has zero `variablesFor` relations — nothing to clean up.
45+
2. Later iterations for X and Z call `linkRelations`, which adds `VariablesForRelation` entries to Y.
46+
3. After the mapper loop, `PathsGenerator.Invoke()` (called in `remote/manager.go`) calls `Y.ParentKey()`, finds two parents, and returns the fatal error.
47+
48+
CLI `pull` was unaffected in practice because the API happened to return consumer configs before variables configs, making consumers iterated first. The Templates service load path had a different iteration order and hit the bug.
49+
50+
## The fix: two-pass link and validate
51+
52+
`AfterRemoteOperation` and `AfterLocalOperation` were refactored into two explicit passes:
53+
54+
**Pass 1 — link all objects**
55+
Run `linkRelations` for every loaded object. This creates all other-side relations (including adding `variablesFor` entries to every variables config) before any validation happens.
56+
57+
**Pass 2 — validate all objects**
58+
Run `validateRelations` for every loaded object. With the complete relation graph now in place, duplicates are correctly detected, removed, and logged as warnings — regardless of the order objects appear in the API response.
59+
60+
This makes behaviour order-independent: both CLI and the Templates service emit a warning and continue when a variables config has multiple parents, instead of crashing.
61+
62+
## Silent skip in VariablesValuesForRelation.NewOtherSideRelation
63+
64+
A values row's `variablesValuesFor` relation is linked by looking up the parent variables config's single `variablesFor` relation to determine the target consumer config. In the two-pass approach, Pass 1 runs before validation, so when `linkRelations(values_row)` runs, the parent variables config Y may already hold two `variablesFor` entries (added earlier in the same pass by the consumer configs).
65+
66+
Calling `GetOneByType(VariablesForRelType)` on Y at this point returns an error ("only one expected, but found 2"). If that error were propagated, it would produce a duplicate "invalid config Y" message alongside the one already generated by Pass 2 validation of Y itself.
67+
68+
To avoid the duplicate, `NewOtherSideRelation` checks `len(GetByType(VariablesForRelType)) > 1` before calling `GetOneByType`. When multiple relations exist, it returns `(nil, nil, nil)` — deferring to Pass 2 to emit the canonical warning. The `variablesValuesFor` relation on the values row is left in place; the values row is then transitively excluded from the pull output by the ignore mapper (see below).
69+
70+
### Edge case: values row iterated before consumers
71+
72+
If the Loaded() order is `vars_config → values_row → consumer1 → consumer2`, then when `linkRelations(values_row)` runs in Pass 1, Y has zero `variablesFor` entries (consumers have not been processed yet). The `> 1` guard does not fire. `GetOneByType` returns `nil` (no relation found), and `NewOtherSideRelation` returns the "missing relation variablesFor" error — producing an extra warning in the output alongside the "found 2" warning from Pass 2.
73+
74+
This double-warning is a **known limitation of this fix**, not a pre-existing bug. Before this PR, the values-row-first ordering would have led to a different failure mode depending on whether consumers appeared later. The `> 1` guard eliminates the crash and the more common duplicate-warning case; this remaining ordering is accepted as a minor trade-off. In practice it does not arise because consumer configs always appear in `Loaded()` before their variables values rows (the Storage API returns consumers first, and values rows are discovered via the consumer's `variablesValuesFrom` relations). Both warnings correctly describe the malformed setup, so no data is lost.
75+
76+
## Why the ignore mapper excludes orphaned variables configs
77+
78+
After `AfterRemoteOperation` completes, the ignore mapper (`internal/pkg/mapper/ignore/remote.go`) runs. It marks a `keboola.variables` config as ignored when it has no `variablesFor` relation. Config rows are **transitively ignored** when their parent config is ignored. So after Pass 2 removes Y's duplicate `variablesFor` entries (leaving Y with zero), both Y and its values rows are removed from `changes.Loaded()` and never written to the local filesystem.
79+
80+
`PathsGenerator` only processes objects in `changes.Loaded()`, so it never encounters Y with multiple parents — making the fatal guard in `Relations.ParentKey()` safe to keep as-is.
81+
82+
## Why this is warning-only, not a hard error
83+
84+
The `validateRelations` function removes all duplicate relations and logs a warning. The invalid `variablesFor` entries are dropped and the variables config (along with its rows) is excluded from the local sync output. The project can still be synced — it just does not represent the shared-variables scenario in the local directory hierarchy.
85+
86+
Templates only reads remote state to discover existing configs and apply template changes on top. It never generates or syncs a local directory hierarchy, so the "ambiguous path" concern does not apply. Making this a hard error in the Templates code path only served to block users from using templates in projects with this configuration.
87+
88+
## Related files
89+
90+
| File | Role |
91+
|---|---|
92+
| `internal/pkg/mapper/relations/link.go` | Two-pass AfterRemoteOperation / AfterLocalOperation |
93+
| `internal/pkg/model/relation.go` | `Relations.ParentKey()`, `OneToXRelations()` |
94+
| `internal/pkg/model/variables.go` | `VariablesValuesForRelation.NewOtherSideRelation` — silent skip for multiple variablesFor |
95+
| `internal/pkg/mapper/ignore/remote.go` | Transitive ignore of orphaned variables configs and their rows |
96+
| `internal/pkg/state/remote/manager.go` | Sequencing of AfterRemoteOperation → PathsGenerator |
97+
| `test/cli/pull/variables-used-twice/` | E2E test covering the shared-variables warning |

internal/pkg/mapper/relations/link.go

Lines changed: 40 additions & 20 deletions
Original file line numberDiff line numberDiff line change
@@ -12,9 +12,27 @@ import (
1212
func (m *relationsMapper) AfterLocalOperation(ctx context.Context, changes *model.LocalChanges) error {
1313
errs := errors.NewMultiError()
1414
allObjects := m.state.LocalObjects()
15-
for _, objectState := range changes.Loaded() {
16-
if err := m.linkAndValidateRelations(objectState.LocalState(), allObjects); err != nil {
17-
errs.Append(err)
15+
loaded := changes.Loaded()
16+
17+
// Pass 1: link all objects so every other-side relation exists before validation.
18+
// Processing link and validate in a single loop would cause order-dependent behaviour:
19+
// a variables config iterated before its consumers would be validated before the consumer
20+
// linking steps add VariablesForRelation entries to it, leaving duplicates undetected until
21+
// PathsGenerator runs and hits a fatal "multiple parents" error.
22+
for _, objectState := range loaded {
23+
if o, ok := objectState.LocalState().(model.ObjectWithRelations); ok {
24+
if err := m.linkRelations(o, allObjects); err != nil {
25+
errs.Append(err)
26+
}
27+
}
28+
}
29+
30+
// Pass 2: validate all objects now that the relation graph is complete.
31+
for _, objectState := range loaded {
32+
if o, ok := objectState.LocalState().(model.ObjectWithRelations); ok {
33+
if err := m.validateRelations(o); err != nil {
34+
errs.Append(errors.PrefixErrorf(err, "invalid %s", objectState.LocalState().Desc()))
35+
}
1836
}
1937
}
2038

@@ -30,9 +48,24 @@ func (m *relationsMapper) AfterLocalOperation(ctx context.Context, changes *mode
3048
func (m *relationsMapper) AfterRemoteOperation(ctx context.Context, changes *model.RemoteChanges) error {
3149
errs := errors.NewMultiError()
3250
allObjects := m.state.RemoteObjects()
33-
for _, objectState := range changes.Loaded() {
34-
if err := m.linkAndValidateRelations(objectState.RemoteState(), allObjects); err != nil {
35-
errs.Append(err)
51+
loaded := changes.Loaded()
52+
53+
// Pass 1: link all objects so every other-side relation exists before validation.
54+
// See AfterLocalOperation for the reasoning behind the two-pass approach.
55+
for _, objectState := range loaded {
56+
if o, ok := objectState.RemoteState().(model.ObjectWithRelations); ok {
57+
if err := m.linkRelations(o, allObjects); err != nil {
58+
errs.Append(err)
59+
}
60+
}
61+
}
62+
63+
// Pass 2: validate all objects now that the relation graph is complete.
64+
for _, objectState := range loaded {
65+
if o, ok := objectState.RemoteState().(model.ObjectWithRelations); ok {
66+
if err := m.validateRelations(o); err != nil {
67+
errs.Append(errors.PrefixErrorf(err, "invalid %s", objectState.RemoteState().Desc()))
68+
}
3669
}
3770
}
3871

@@ -44,20 +77,7 @@ func (m *relationsMapper) AfterRemoteOperation(ctx context.Context, changes *mod
4477
return nil
4578
}
4679

47-
func (m *relationsMapper) linkAndValidateRelations(object model.Object, allObjects model.Objects) error {
48-
errs := errors.NewMultiError()
49-
if o, ok := object.(model.ObjectWithRelations); ok {
50-
if err := m.linkRelations(o, allObjects); err != nil {
51-
errs.Append(err)
52-
}
53-
if err := m.validateRelations(o); err != nil {
54-
errs.Append(errors.PrefixErrorf(err, "invalid %s", object.Desc()))
55-
}
56-
}
57-
return errs.ErrorOrNil()
58-
}
59-
60-
// lintRelations finds the other side of the relation and create a corresponding relation on the other side.
80+
// linkRelations finds the other side of the relation and creates a corresponding relation on the other side.
6181
func (m *relationsMapper) linkRelations(object model.ObjectWithRelations, allObjects model.Objects) error {
6282
errs := errors.NewMultiError()
6383
relations := object.GetRelations()

internal/pkg/mapper/relations/link_test.go

Lines changed: 94 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -4,6 +4,7 @@ import (
44
"strings"
55
"testing"
66

7+
"github.com/keboola/keboola-sdk-go/v2/pkg/keboola"
78
"github.com/stretchr/testify/assert"
89
"github.com/stretchr/testify/require"
910

@@ -66,6 +67,99 @@ func TestRelationsMapperLinkRelations(t *testing.T) {
6667
}, object2.Local.Relations)
6768
}
6869

70+
// TestRelationsMapperVariablesSharedAcrossConsumers verifies that when a variables config
71+
// is loaded before its consumers in changes.Loaded(), the two-pass approach still produces
72+
// exactly one warning instead of crashing with "multiple parents defined by relations".
73+
// It also exercises the > 1 guard in VariablesValuesForRelation.NewOtherSideRelation by
74+
// including a values row loaded after both consumers so that, during Pass 1, the variables
75+
// config already holds two variablesFor relations when linkRelations(valuesRow) runs.
76+
func TestRelationsMapperVariablesSharedAcrossConsumers(t *testing.T) {
77+
t.Parallel()
78+
state, d := createStateWithMapper(t)
79+
logger := d.DebugLogger()
80+
81+
branchID := keboola.BranchID(1)
82+
consumerCompID := keboola.ComponentID("ex-generic-v2")
83+
84+
// Variables config (Y) — added to state and loaded FIRST to exercise the ordering
85+
// that previously caused a fatal "multiple parents" crash in PathsGenerator.
86+
varsKey := model.ConfigKey{BranchID: branchID, ComponentID: keboola.VariablesComponentID, ID: "vars"}
87+
varsConfig := &model.ConfigState{
88+
ConfigManifest: &model.ConfigManifest{ConfigKey: varsKey},
89+
Remote: &model.Config{ConfigKey: varsKey},
90+
}
91+
require.NoError(t, state.Set(varsConfig))
92+
93+
// Consumer 1 — variablesFrom and variablesValuesFrom relations pointing to varsKey.
94+
// VariablesValuesFromRelation causes linkRelations(consumer1) to add a VariablesValuesFor
95+
// relation to the values row, which in turn exercises the > 1 guard when
96+
// linkRelations(valuesRow) runs later in Pass 1.
97+
valuesRowID := keboola.RowID("val1")
98+
consumer1Key := model.ConfigKey{BranchID: branchID, ComponentID: consumerCompID, ID: "consumer1"}
99+
consumer1 := &model.ConfigState{
100+
ConfigManifest: &model.ConfigManifest{ConfigKey: consumer1Key},
101+
Remote: &model.Config{
102+
ConfigKey: consumer1Key,
103+
Relations: model.Relations{
104+
&model.VariablesFromRelation{VariablesID: varsKey.ID},
105+
&model.VariablesValuesFromRelation{VariablesValuesID: valuesRowID},
106+
},
107+
},
108+
}
109+
require.NoError(t, state.Set(consumer1))
110+
111+
// Consumer 2 — also variablesFrom relation pointing to the same varsKey.
112+
consumer2Key := model.ConfigKey{BranchID: branchID, ComponentID: consumerCompID, ID: "consumer2"}
113+
consumer2 := &model.ConfigState{
114+
ConfigManifest: &model.ConfigManifest{ConfigKey: consumer2Key},
115+
Remote: &model.Config{
116+
ConfigKey: consumer2Key,
117+
Relations: model.Relations{
118+
&model.VariablesFromRelation{VariablesID: varsKey.ID},
119+
},
120+
},
121+
}
122+
require.NoError(t, state.Set(consumer2))
123+
124+
// Values row — loaded LAST so that when linkRelations(valuesRow) runs in Pass 1, the
125+
// variables config already holds 2 variablesFor relations (added by consumer1 and
126+
// consumer2). The > 1 guard in VariablesValuesForRelation.NewOtherSideRelation fires
127+
// and returns (nil, nil, nil), preventing a duplicate "invalid config Y" error.
128+
valuesRowKey := model.ConfigRowKey{BranchID: branchID, ComponentID: keboola.VariablesComponentID, ConfigID: varsKey.ID, ID: valuesRowID}
129+
valuesRow := &model.ConfigRowState{
130+
ConfigRowManifest: &model.ConfigRowManifest{ConfigRowKey: valuesRowKey},
131+
Remote: &model.ConfigRow{ConfigRowKey: valuesRowKey},
132+
}
133+
require.NoError(t, state.Set(valuesRow))
134+
135+
// Variables config is first in Loaded(), values row is last — both orderings that
136+
// previously caused problems are exercised in a single pass.
137+
changes := model.NewRemoteChanges()
138+
changes.AddLoaded(varsConfig)
139+
changes.AddLoaded(consumer1)
140+
changes.AddLoaded(consumer2)
141+
changes.AddLoaded(valuesRow)
142+
143+
// Must return nil — the duplicate is a warning, not a fatal error.
144+
require.NoError(t, state.Mapper().AfterRemoteOperation(t.Context(), changes))
145+
146+
// Exactly one warning about the duplicate variablesFor relation.
147+
allTxt := logger.AllMessagesTxt()
148+
assert.Equal(t, 1, strings.Count(allTxt, `Only one relation "variablesFor" expected, but found 2`))
149+
assert.Empty(t, logger.ErrorMessages())
150+
151+
// The > 1 guard in VariablesValuesForRelation.NewOtherSideRelation prevented
152+
// linkRelations(valuesRow) from adding a VariablesValuesFromRelation to any consumer.
153+
// consumer1 retains its original single VariablesValuesFromRelation (no duplicate was
154+
// added from the values-row side), and consumer2 has none at all.
155+
assert.Len(t, consumer1.Remote.Relations.GetByType(model.VariablesValuesFromRelType), 1)
156+
assert.Empty(t, consumer2.Remote.Relations.GetByType(model.VariablesValuesFromRelType))
157+
158+
// AfterLocalOperation is structurally identical to AfterRemoteOperation (same two-pass
159+
// logic). It is not re-tested here because the shared-variables scenario only arises
160+
// from the remote API path, where object order is non-deterministic.
161+
}
162+
69163
func TestRelationsMapperOtherSideMissing(t *testing.T) {
70164
t.Parallel()
71165
state, d := createStateWithMapper(t)

internal/pkg/model/variables.go

Lines changed: 8 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -178,6 +178,14 @@ func (t *VariablesValuesForRelation) NewOtherSideRelation(relationDefinedOn Obje
178178
return nil, nil, errors.Errorf(`%s not found, referenced from %s, by relation "%s"`, variablesConfigKey.Desc(), relationDefinedOn.Desc(), t.Type())
179179
}
180180
variablesConfig := variablesConfigRaw.(*Config)
181+
// When the variables config temporarily holds multiple variablesFor relations (shared
182+
// across consumers), skip linking here — Pass 2 validation will detect and warn about
183+
// the duplicates and remove them. The variablesValuesFor relation on the row is
184+
// intentionally left in place; the ignore mapper excludes the parent variables config
185+
// (and this row transitively) after Pass 2 removes all variablesFor entries.
186+
if len(variablesConfig.Relations.GetByType(VariablesForRelType)) > 1 {
187+
return nil, nil, nil
188+
}
181189
variablesForRaw, err := variablesConfig.Relations.GetOneByType(VariablesForRelType)
182190
if err != nil {
183191
return nil, nil, errors.PrefixErrorf(err, "invalid %s", variablesConfig.Desc())

test/cli/pull/variables-used-twice/expected-stderr

Lines changed: 0 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -3,6 +3,3 @@ Warning:
33
- Only one relation "variablesFor" expected, but found 2:
44
- {"componentId":"ex-generic-v2","configId":"%s"}
55
- {"componentId":"ex-generic-v2","configId":"%s"}
6-
- Missing relation "variablesFor" in config "branch:%s/component:keboola.variables/config:%s":
7-
- Referenced from config row "branch:%s/component:keboola.variables/config:%s/row:%s".
8-
- By relation "variablesValuesFor".
Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1 @@
1+
200
Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,3 @@
1+
{
2+
"instances": []
3+
}
Lines changed: 9 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,9 @@
1+
{
2+
"path": "/v1/project/%%TEST_BRANCH_MAIN_ID%%/instances",
3+
"method": "GET",
4+
"headers": {
5+
"Content-Type": "application/json",
6+
"X-StorageApi-Token": "%%TEST_KBC_STORAGE_API_TOKEN%%"
7+
},
8+
"body": {}
9+
}
Lines changed: 64 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,64 @@
1+
{
2+
"branches": [
3+
{
4+
"branch": {
5+
"name": "Main",
6+
"description": "",
7+
"isDefault": true
8+
},
9+
"configs": [
10+
{
11+
"componentId": "keboola.variables",
12+
"name": "Variables",
13+
"description": "test fixture",
14+
"configuration": {
15+
"variables": [
16+
{
17+
"name": "foo",
18+
"type": "string"
19+
}
20+
]
21+
},
22+
"rows": [
23+
{
24+
"name": "Default values",
25+
"description": "test fixture",
26+
"isDisabled": false,
27+
"configuration": {
28+
"values": [
29+
{
30+
"name": "foo",
31+
"value": "bar"
32+
}
33+
]
34+
}
35+
}
36+
],
37+
"isDisabled": false
38+
},
39+
{
40+
"componentId": "ex-generic-v2",
41+
"name": "With Variables 1",
42+
"description": "test fixture",
43+
"configuration": {
44+
"variables_id": "%s",
45+
"variables_values_id": "%s"
46+
},
47+
"rows": [],
48+
"isDisabled": false
49+
},
50+
{
51+
"componentId": "ex-generic-v2",
52+
"name": "With Variables 2",
53+
"description": "test fixture",
54+
"configuration": {
55+
"variables_id": "%s",
56+
"variables_values_id": "%s"
57+
},
58+
"rows": [],
59+
"isDisabled": false
60+
}
61+
]
62+
}
63+
]
64+
}

0 commit comments

Comments
 (0)