FIX: Stable random sampling in DatasetConfiguration #1697
Open
adrian-gavrila wants to merge 1 commit into microsoft:main from
Conversation
Memoize get_seed_groups() and get_all_seeds() so the random subset selected when max_dataset_size is set is stable for the lifetime of the configuration. Reassigning max_dataset_size invalidates the cache. Without this, baseline and strategy atomic attacks each call get_all_seed_attack_groups() independently and receive different random subsets of objectives, making baseline-vs-strategy comparison meaningless. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
rlundeen2 reviewed May 7, 2026
```python
self._scenario_strategies = scenario_strategies
self._resolved_groups_cache: Optional[dict[str, list[SeedGroup]]] = None
self._resolved_seeds_cache: Optional[list[Seed]] = None
self._max_dataset_size: Optional[int] = None
```
Contributor
Could we simplify this?
Instead of a cache, what if we added a baseline scenario technique that is just `PromptSending`? We get rid of this in `initialize`:

```python
if self._include_baseline:
    baseline_attack = self._get_baseline()
    self._atomic_attacks.insert(0, baseline_attack)
```

and

```python
def _get_baseline(self) -> AtomicAttack:
```

And instead add a tag in `_get_attack_technique_factories` that adds a `PromptSending` technique as baseline?
`_build_display_group` would also likely need to be updated to support baseline?
There might be some hiccups, but it feels like a more natural place to include it as an additional technique vs trying to cache the datasets
Description
When a `Scenario` runs with `include_default_baseline=True` and a `DatasetConfiguration` whose `max_dataset_size` is set, the baseline atomic attack ended up evaluating a different random subset of objectives than the strategy-based atomic attacks. Baseline-vs-strategy success-rate comparisons measured two different populations and were meaningless.
Root cause: `random.sample` ran fresh on every call to `DatasetConfiguration.get_seed_groups()` (Path 1, used by most scenarios) and `get_all_seeds()` (Path 2, used by `EncodingDatasetConfiguration`). `Scenario._get_atomic_attacks_async` and `Scenario._get_baseline_data` each called these methods independently and got different samples.

Fix: memoize both methods. The resolved sample is cached for the lifetime of the configuration object, and reassigning `max_dataset_size` invalidates the cache. Returns are defensive container copies, so callers can mutate without poisoning the cache. `max_dataset_size` is now a property whose setter re-validates the value (mirroring `__init__`). Subclasses inherit the fix automatically when they use the base resolution methods. A short subclassing note in the class docstring flags the two methods that any future override must memoize itself.
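The memoize-plus-invalidate pattern described above can be sketched as follows. This is an illustrative simplification, not the PR's actual code: the class and attribute names mirror the PR, but the seed type and validation details are assumptions.

```python
import random
from typing import Optional


class DatasetConfiguration:
    """Simplified sketch of the memoization fix (seeds are plain strings here)."""

    def __init__(self, seeds: list[str], max_dataset_size: Optional[int] = None):
        self._seeds = seeds
        self._resolved_seeds_cache: Optional[list[str]] = None
        self._max_dataset_size = max_dataset_size

    @property
    def max_dataset_size(self) -> Optional[int]:
        return self._max_dataset_size

    @max_dataset_size.setter
    def max_dataset_size(self, value: Optional[int]) -> None:
        # Re-validate on assignment, mirroring __init__ (assumed validation rule)
        if value is not None and value <= 0:
            raise ValueError("max_dataset_size must be a positive integer")
        self._max_dataset_size = value
        self._resolved_seeds_cache = None  # invalidate the memoized sample

    def get_all_seeds(self) -> list[str]:
        if self._resolved_seeds_cache is None:
            seeds = self._seeds
            if self._max_dataset_size is not None and len(seeds) > self._max_dataset_size:
                seeds = random.sample(seeds, self._max_dataset_size)
            self._resolved_seeds_cache = list(seeds)
        # Defensive copy: callers can mutate the result without poisoning the cache
        return list(self._resolved_seeds_cache)
```

With this shape, a baseline attack and a strategy attack that each call `get_all_seeds()` on the same configuration object see the identical subset, which is the property the fix restores.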
Tests and Documentation
- `TestDatasetConfigurationMemoization` and `TestDatasetConfigurationMaxDatasetSizeSetter` classes in `test_dataset_configuration.py`, covering both call paths, multi-dataset stability, cache invalidation, setter validation, and defensive-copy semantics. All randomness-sensitive tests patch `random.sample` for determinism.
- `test_encoding.py` (the override routes through `get_all_seeds`, which is why both paths needed memoization).
- `test_scenario.py`, asserting `set(baseline.objectives) == set(strategy.objectives)` after `initialize_async` with `max_dataset_size` set.

Verified by stashing the production change and watching the new tests fail (7 failures), then restoring and watching them pass.