feat:metadata package #3919
Merged
73 commits
17e74a9
feat(metadata): scaffold zarr-metadata package structure
d-v-b 8da469f
build(metadata): depend on zarr-metadata via local uv workspace source
d-v-b e43bd36
feat(metadata): add JSON, NamedConfig, NamedRequiredConfig primitives
d-v-b 28efcde
feat(metadata): add v3 array metadata types
d-v-b 23aed89
feat(metadata): add v3 consolidated metadata type
d-v-b 43eefba
feat(metadata): add v3 group metadata type
d-v-b f26e1bd
feat(metadata): wire up zarr_metadata.v3 re-exports
d-v-b 27b000c
feat(metadata): add faithful v2 array metadata types
d-v-b 2a45d2d
feat(metadata): add v2 group metadata type
d-v-b d530cb4
feat(metadata): add v2 consolidated metadata type (canonical impl, no…
d-v-b 266a8eb
feat(metadata): wire up zarr_metadata.v2 re-exports
d-v-b e431dee
feat(metadata): add ArrayMetadata, GroupMetadata version-polymorphic …
d-v-b c547f55
feat(metadata): add Codec envelope and blosc codec configurations
d-v-b 1517cd8
feat(metadata): add dtype types (DType, LengthBytesConfig, FixedLengt…
d-v-b b90fb68
test(metadata): smoke + structural tests for the package
d-v-b bb0183c
refactor(common): re-export JSON, NamedConfig, NamedRequiredConfig fr…
d-v-b fc09be6
refactor(metadata): re-export v3 types from zarr-metadata
d-v-b 33bfc99
refactor(metadata): re-export faithful v2 array metadata type
d-v-b 1098718
refactor(codecs): re-export blosc codec configurations from zarr-meta…
d-v-b b437812
refactor(abc): re-export CodecJSON from zarr-metadata
d-v-b 2578ad8
refactor(dtype): re-export DTypeJSON from zarr-metadata
d-v-b 51a1df3
refactor(dtype): re-export LengthBytesConfig from zarr-metadata
d-v-b d06fad4
refactor(dtype): re-export FixedLengthBytesConfig from zarr-metadata
d-v-b a2c2960
refactor(dtype): re-export TimeConfig from zarr-metadata
d-v-b d42a508
refactor(metadata): use tuple[int, ...] for fixed-length fields + typ…
d-v-b e7ff23c
refactor(metadata): fix explicit re-exports and complete DateTimeUnit…
d-v-b 99b2571
refactor(metadata): extract primitives to common.py to break import c…
d-v-b a88716b
fix(metadata): address review findings
d-v-b 7571dbc
refactor(metadata): remove consolidated_metadata from GroupMetadataV3
d-v-b bb98cde
chore(metadata): don't track zarr-metadata's uv.lock
d-v-b 08c7643
feat(metadata): add v3 codec types for bytes, crc32c, gzip, zstd, tra…
d-v-b 2feb4be
refactor(metadata): define codec envelope TypedDicts explicitly
d-v-b a12fe70
feat(metadata): add Final string constants for codec names and enum-v…
d-v-b ec22950
docs(metadata): say "codec metadata" instead of "codec envelope"
d-v-b 275bf55
docs(metadata): use single-backtick markdown code formatting
d-v-b 374181a
feat(metadata): add v3 spec data type metadata
d-v-b 9554ca3
refactor(metadata): per-dtype modules with fill-value types and valid…
d-v-b 8d2bd63
refactor(metadata): per-grid and per-encoding modules for chunk_grid …
d-v-b e6139a6
refactor(metadata): move codec/ and dtype/ under v3/
d-v-b ac0304c
refactor(metadata): rename v3/dtype/ -> v3/data_type/
d-v-b ef4b773
Merge branch 'main' into refactor/metadata-package
d-v-b b7b055e
feat: add zarr-metadata package
d-v-b 0ae8db9
Merge branch 'main' of github.com:zarr-developers/zarr-python into re…
d-v-b 1b62c4c
Merge branch 'main' into refactor/metadata-package
d-v-b c6fcde9
test(metadata): drop tests that don't actually test anything
d-v-b c90c9a0
build(metadata): lower minimum Python to 3.11
d-v-b 331ea93
Merge branch 'main' into refactor/metadata-package
d-v-b 33c9a80
Merge branch 'refactor/metadata-package' of https://github.com/d-v-b/…
d-v-b 700d916
refactor: remove generic base metadata
d-v-b 84d5ca1
refactor: clean up codecs init
d-v-b c30a768
fix: correct v2 structured dtype spec
d-v-b fa66cc9
refactor: drop readonly for numcodecs config
d-v-b cf12cdc
docs: improve docstring
d-v-b 82e10d6
fix: use empty typeddict for crc32c config
d-v-b 4b5bd11
fix: remove arbitrary json from consolidated model
d-v-b 6a0be8c
Merge branch 'main' into refactor/metadata-package
d-v-b e6e5920
fix: don't depend on zarr-metadata yet
d-v-b a732fb2
fix: typesize is not required
d-v-b 8691138
fix: re-wire zarr-metadata up as a dependency for zarr-python
d-v-b a6d0e5e
chore: revert changes to src/zarr
d-v-b 039fd7e
Merge branch 'main' into refactor/metadata-package
d-v-b b8d67fe
chore: mypy ignore the new package
d-v-b cdedda2
Merge branch 'refactor/metadata-package' of https://github.com/d-v-b/…
d-v-b df7a4f5
Merge branch 'main' into refactor/metadata-package
d-v-b 842eec7
refactor: rename extra field
d-v-b e9a7967
refactor: we do a little refactoring
d-v-b 5e7838c
test: more dtype tests
d-v-b 167f7db
chore: add ci
d-v-b c32e51a
chore: use typing extensions typeddict
d-v-b 38e5e47
fix: unbreak ci
d-v-b 32ce2bf
chore: clean up top level exports
d-v-b 3182e23
chore: clarify extension fields and rename type
d-v-b 689fc76
allow must_understand: true, and add canonical nan strings
d-v-b
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,99 @@ | ||
| name: zarr-metadata | ||
|
|
||
| on: | ||
| push: | ||
| branches: [main] | ||
| paths: | ||
| - 'packages/zarr-metadata/**' | ||
| - '.github/workflows/zarr-metadata.yml' | ||
| pull_request: | ||
| paths: | ||
| - 'packages/zarr-metadata/**' | ||
| - '.github/workflows/zarr-metadata.yml' | ||
| workflow_dispatch: | ||
|
|
||
| permissions: | ||
| contents: read | ||
|
|
||
| concurrency: | ||
| group: ${{ github.workflow }}-${{ github.ref }} | ||
| cancel-in-progress: true | ||
|
|
||
| jobs: | ||
| test: | ||
| name: pytest py=${{ matrix.python-version }} | ||
| runs-on: ubuntu-latest | ||
| defaults: | ||
| run: | ||
| shell: bash | ||
| working-directory: packages/zarr-metadata | ||
| strategy: | ||
| fail-fast: false | ||
| matrix: | ||
| python-version: ['3.11', '3.12', '3.13', '3.14'] | ||
| steps: | ||
| - uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2 | ||
| with: | ||
| persist-credentials: false | ||
| - name: Install uv | ||
| uses: astral-sh/setup-uv@08807647e7069bb48b6ef5acd8ec9567f424441b # v8.1.0 | ||
| with: | ||
| enable-cache: true | ||
| - name: Set up Python ${{ matrix.python-version }} | ||
| run: uv python install ${{ matrix.python-version }} | ||
| - name: Sync test dependency group | ||
| run: uv sync --group test --python ${{ matrix.python-version }} | ||
| - name: Run pytest | ||
| run: uv run --group test pytest tests | ||
|
|
||
| ruff: | ||
| name: ruff | ||
| runs-on: ubuntu-latest | ||
| defaults: | ||
| run: | ||
| shell: bash | ||
| working-directory: packages/zarr-metadata | ||
| steps: | ||
| - uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2 | ||
| with: | ||
| persist-credentials: false | ||
| - name: Install uv | ||
| uses: astral-sh/setup-uv@08807647e7069bb48b6ef5acd8ec9567f424441b # v8.1.0 | ||
| - name: Run ruff | ||
| run: uvx ruff check . | ||
|
|
||
| pyright: | ||
| name: pyright | ||
| runs-on: ubuntu-latest | ||
| defaults: | ||
| run: | ||
| shell: bash | ||
| working-directory: packages/zarr-metadata | ||
| steps: | ||
| - uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2 | ||
| with: | ||
| persist-credentials: false | ||
| - name: Install uv | ||
| uses: astral-sh/setup-uv@08807647e7069bb48b6ef5acd8ec9567f424441b # v8.1.0 | ||
| with: | ||
| enable-cache: true | ||
| - name: Set up Python | ||
| run: uv python install 3.11 | ||
| - name: Sync test dependency group | ||
| run: uv sync --group test --python 3.11 | ||
| - name: Run pyright | ||
| run: uv run --group test --with pyright pyright src | ||
|
|
||
| zarr-metadata-complete: | ||
| name: zarr-metadata complete | ||
| needs: [test, ruff, pyright] | ||
| if: always() | ||
| runs-on: ubuntu-latest | ||
| steps: | ||
| - name: Check failure | ||
| if: | | ||
| contains(needs.*.result, 'failure') || | ||
| contains(needs.*.result, 'cancelled') | ||
| run: exit 1 | ||
| - name: Success | ||
| run: echo Success! |
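
The final `zarr-metadata complete` job runs with `if: always()` and exits non-zero when any job listed in `needs` reports `failure` or `cancelled`. A rough Python sketch of that decision follows. The function name `gate_passes` is hypothetical, and the result strings are assumed to be the ones GitHub Actions reports (`success`, `failure`, `cancelled`, `skipped`).

```python
def gate_passes(results: dict[str, str]) -> bool:
    """Return True unless any needed job failed or was cancelled.

    Mirrors the workflow condition:
        contains(needs.*.result, 'failure') || contains(needs.*.result, 'cancelled')
    """
    blocking = {"failure", "cancelled"}
    return not any(result in blocking for result in results.values())


if __name__ == "__main__":
    # All three upstream jobs succeeded, so the gate passes.
    print(gate_passes({"test": "success", "ruff": "success", "pyright": "success"}))
    # One matrix leg failed, so the gate fails.
    print(gate_passes({"test": "failure", "ruff": "success", "pyright": "success"}))
```

Under this condition a `skipped` upstream job does not trip the gate, since only `failure` and `cancelled` are checked.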
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,21 @@ | ||
| The MIT License (MIT) | ||
|
|
||
| Copyright (c) 2015-2025 Zarr Developers <https://github.com/zarr-developers> | ||
|
|
||
| Permission is hereby granted, free of charge, to any person obtaining a copy | ||
| of this software and associated documentation files (the "Software"), to deal | ||
| in the Software without restriction, including without limitation the rights | ||
| to use, copy, modify, merge, publish, distribute, sublicense, and/or sell | ||
| copies of the Software, and to permit persons to whom the Software is | ||
| furnished to do so, subject to the following conditions: | ||
|
|
||
| The above copyright notice and this permission notice shall be included in all | ||
| copies or substantial portions of the Software. | ||
|
|
||
| THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR | ||
| IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, | ||
| FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE | ||
| AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER | ||
| LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, | ||
| OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE | ||
| SOFTWARE. |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,52 @@ | ||
| # zarr-metadata | ||
|
|
||
| Python type definitions for Zarr v2 and v3 metadata. | ||
|
|
||
| ## What this is | ||
|
|
||
| A typed-data package: `TypedDict` definitions and `Literal` aliases for the | ||
| JSON shapes specified by the [Zarr v2](https://zarr-specs.readthedocs.io/en/latest/v2/v2.0.html) | ||
| and [Zarr v3](https://zarr-specs.readthedocs.io/en/latest/v3/core/index.html) | ||
| specifications, plus types for [`zarr-extensions`](https://github.com/zarr-developers/zarr-extensions/) | ||
| and a few widely-used-but-unspecified entities (e.g. consolidated metadata). | ||
|
|
||
| ## What this is for | ||
|
|
||
| These types describe the JSON shape of Zarr metadata. They are | ||
| intended for libraries that **read, write, validate, or transform** | ||
| Zarr metadata. Pair them with a runtime validator like | ||
| [pydantic](https://docs.pydantic.dev/) to check JSON loaded from disk: | ||
|
|
||
| ```python | ||
| import json | ||
| from pydantic import TypeAdapter | ||
| from zarr_metadata.v3.array import ArrayMetadataV3 | ||
|
|
||
| with open("zarr.json", "rb") as f: | ||
|     raw = json.load(f) | ||
|
|
||
| metadata = TypeAdapter(ArrayMetadataV3).validate_python(raw) | ||
| ``` | ||
|
|
||
| ## What this is *not* | ||
|
|
||
| - Not a parser or builder. There are no `make_array_metadata(...)` factories — | ||
| that surface belongs to consumer libraries. | ||
| - Not a runtime validator on its own. Pair with `pydantic`, `msgspec`, or | ||
| similar to enforce shapes at decode time. | ||
|
|
||
| Even with a runtime validator, these types only describe **structural** | ||
| shape — they will not flag *semantically* invalid metadata, like a 3D v3 | ||
| array whose `dimension_names` has 4 entries instead of 3. That's a job | ||
| for downstream validator routines. | ||
|
|
||
| ## Scope | ||
|
|
||
| At minimum, this library supports what Zarr-Python needs: the complete | ||
| Zarr v2 and v3 specs, consolidated metadata, and a subset of the metadata | ||
| defined in `zarr-extensions`. We are generally open to contributions that | ||
| add types for Zarr metadata with a published spec. | ||
|
|
||
| ## License | ||
|
|
||
| [MIT](./LICENSE.txt) |
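
The README notes that structural types will not catch semantically invalid metadata, such as a 3D v3 array whose `dimension_names` has 4 entries. A downstream check for that specific case might look roughly like the sketch below. It works on a plain mapping rather than importing `zarr_metadata`, and the function name `check_dimension_names` is hypothetical, not part of this package.

```python
from collections.abc import Mapping


def check_dimension_names(metadata: Mapping[str, object]) -> None:
    """Raise ValueError if `dimension_names` does not match the rank of `shape`."""
    shape = metadata["shape"]
    names = metadata.get("dimension_names")
    if names is None:
        return  # `dimension_names` is optional in v3 array metadata
    if not isinstance(shape, (list, tuple)) or not isinstance(names, (list, tuple)):
        raise ValueError("shape and dimension_names must be sequences")
    if len(names) != len(shape):
        raise ValueError(
            f"dimension_names has {len(names)} entries "
            f"but shape has {len(shape)} dimensions"
        )


if __name__ == "__main__":
    # Structurally valid and semantically consistent: no exception.
    check_dimension_names({"shape": [10, 20, 30], "dimension_names": ["z", "y", "x"]})
    try:
        # Structurally valid, but 4 names for a 3D array.
        check_dimension_names(
            {"shape": [10, 20, 30], "dimension_names": ["t", "z", "y", "x"]}
        )
    except ValueError as exc:
        print(f"rejected: {exc}")
```

A check like this would run after structural validation, so it can assume the keys it reads have already been confirmed to exist with roughly the right types.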
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,68 @@ | ||
| [build-system] | ||
| requires = ["hatchling>=1.29.0"] | ||
| build-backend = "hatchling.build" | ||
|
|
||
| [project] | ||
| name = "zarr-metadata" | ||
| version = "0.1.0" | ||
| description = "Spec-defined metadata types for Zarr v2 and v3." | ||
| readme = "README.md" | ||
| requires-python = ">=3.11" | ||
| license = "MIT" | ||
| license-files = ["LICENSE.txt"] | ||
| authors = [ | ||
| { name = "Davis Bennett", email = "davis.v.bennett@gmail.com" }, | ||
| ] | ||
| classifiers = [ | ||
| "Development Status :: 4 - Beta", | ||
| "Intended Audience :: Developers", | ||
| "License :: OSI Approved :: MIT License", | ||
| "Programming Language :: Python", | ||
| "Programming Language :: Python :: 3", | ||
| "Programming Language :: Python :: 3.11", | ||
| "Programming Language :: Python :: 3.12", | ||
| "Programming Language :: Python :: 3.13", | ||
| "Programming Language :: Python :: 3.14", | ||
| "Typing :: Typed", | ||
| ] | ||
| dependencies = [ | ||
| "typing_extensions>=4.13", | ||
| ] | ||
|
|
||
| [dependency-groups] | ||
| test = ["pytest", "pydantic>=2"] | ||
|
|
||
| [tool.hatch.build.targets.wheel] | ||
| packages = ["src/zarr_metadata"] | ||
|
|
||
| [tool.ruff] | ||
| extend = "../../pyproject.toml" | ||
| target-version = "py311" | ||
|
|
||
| [tool.pytest.ini_options] | ||
| minversion = "7" | ||
| testpaths = ["tests"] | ||
| xfail_strict = true | ||
| addopts = ["-ra", "--strict-config", "--strict-markers"] | ||
| filterwarnings = [ | ||
| "error", | ||
| # pydantic warns about ReadOnly TypedDict items not being enforced at runtime. | ||
| # That's expected here — we rely on type-checker enforcement, not pydantic mutation guards. | ||
| "ignore::UserWarning:pydantic._internal._generate_schema", | ||
| ] | ||
|
|
||
| [tool.numpydoc_validation] | ||
| checks = [ | ||
| "GL10", | ||
| "SS04", | ||
| "PR02", | ||
| "PR03", | ||
| "PR05", | ||
| "PR06", | ||
| ] | ||
|
|
||
| [tool.pyright] | ||
| include = ["src"] | ||
| enableExperimentalFeatures = true | ||
| typeCheckingMode = "strict" | ||
| pythonVersion = "3.11" | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,36 @@ | ||
| from zarr_metadata._common import NamedConfig | ||
| from zarr_metadata.v2.array import ( | ||
| ArrayDimensionSeparatorV2, | ||
| ArrayMetadataV2, | ||
| ArrayOrderV2, | ||
| DataTypeMetadataV2, | ||
| ) | ||
| from zarr_metadata.v2.codec import CodecMetadataV2 | ||
| from zarr_metadata.v2.consolidated import ConsolidatedMetadataV2 | ||
| from zarr_metadata.v2.group import GroupMetadataV2 | ||
| from zarr_metadata.v3._common import MetadataFieldV3 | ||
| from zarr_metadata.v3.array import ArrayMetadataV3, ExtensionFieldV3 | ||
| from zarr_metadata.v3.consolidated import ConsolidatedMetadataV3 | ||
| from zarr_metadata.v3.group import GroupMetadataV3 | ||
|
|
||
| __version__ = "0.1.0" | ||
| """Hardcoded package version. Must match the `version` field in | ||
| `pyproject.toml`; the sync is enforced by `tests/test_version.py`.""" | ||
|
|
||
|
|
||
| __all__ = [ | ||
| "ArrayDimensionSeparatorV2", | ||
| "ArrayMetadataV2", | ||
| "ArrayMetadataV3", | ||
| "ArrayOrderV2", | ||
| "CodecMetadataV2", | ||
| "ConsolidatedMetadataV2", | ||
| "ConsolidatedMetadataV3", | ||
| "DataTypeMetadataV2", | ||
| "ExtensionFieldV3", | ||
| "GroupMetadataV2", | ||
| "GroupMetadataV3", | ||
| "MetadataFieldV3", | ||
| "NamedConfig", | ||
| "__version__", | ||
| ] |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,25 @@ | ||
| """ | ||
| Top-level cross-version primitives for Zarr metadata. | ||
|
|
||
| Version-specific types live under `zarr_metadata.v2` and `zarr_metadata.v3`. | ||
| Codec and dtype spec types live under `zarr_metadata.v3.codec` and | ||
| `zarr_metadata.v3.data_type`. | ||
| """ | ||
|
|
||
| from collections.abc import Mapping | ||
| from typing import NotRequired | ||
|
|
||
| from typing_extensions import TypedDict | ||
|
|
||
|
|
||
| class NamedConfig(TypedDict): | ||
| """ | ||
| Externally-tagged union member for a metadata field. | ||
|
|
||
| The `configuration` mapping holds arbitrary JSON-encodable values; | ||
| it is typed as `Mapping[str, object]` because the type system cannot | ||
| express or verify JSON-encodability. | ||
| """ | ||
|
|
||
| name: str | ||
| configuration: NotRequired[Mapping[str, object]] |
Empty file.
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,21 @@ | ||
| """Zarr v2 metadata types.""" | ||
|
|
||
| from zarr_metadata.v2.array import ( | ||
| ArrayDimensionSeparatorV2, | ||
| ArrayMetadataV2, | ||
| ArrayOrderV2, | ||
| DataTypeMetadataV2, | ||
| ) | ||
| from zarr_metadata.v2.codec import CodecMetadataV2 | ||
| from zarr_metadata.v2.consolidated import ConsolidatedMetadataV2 | ||
| from zarr_metadata.v2.group import GroupMetadataV2 | ||
|
|
||
| __all__ = [ | ||
| "ArrayDimensionSeparatorV2", | ||
| "ArrayMetadataV2", | ||
| "ArrayOrderV2", | ||
| "CodecMetadataV2", | ||
| "ConsolidatedMetadataV2", | ||
| "DataTypeMetadataV2", | ||
| "GroupMetadataV2", | ||
| ] |