add serializable protocol #3799

Open
d-v-b wants to merge 2 commits into zarr-developers:main from d-v-b:feat/serializable-protocol

Contributor

d-v-b commented Mar 20, 2026

This PR defines a JSONSerializable protocol in zarr.abc.serializable that our metadata classes can use. The protocol is generic with two type parameters: one declares the type the class can be deserialized from, and the other declares the type the class serializes to. This allows our metadata classes to have generous input types and narrow output types.

The JSONSerializable protocol has 3 methods:

  • to_json (serialization)
  • from_json (deserialization from typed input)
  • try_from_json (deserialization from arbitrary input; performs type checking before calling from_json)
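
To make the shape of the proposal concrete, here is a minimal, self-contained sketch of a protocol with those three methods. It is not the code in this PR: `ToyMetadata`, `ToyJSON`, `ToyJSONLike`, the variance of the type variables, and the `TypeError` raised on bad input are all assumptions made for illustration.

```python
from __future__ import annotations

from collections.abc import Mapping, Sequence
from dataclasses import dataclass
from typing import TYPE_CHECKING, Protocol, TypedDict, TypeVar, cast

if TYPE_CHECKING:
    from typing import Self

# Assumed variance: the input type is only consumed, the output type only produced.
T_in = TypeVar("T_in", contravariant=True)
T_out = TypeVar("T_out", covariant=True)


class JSONSerializable(Protocol[T_in, T_out]):
    def to_json(self) -> T_out: ...

    @classmethod
    def from_json(cls, data: T_in) -> Self: ...

    @classmethod
    def try_from_json(cls, data: object) -> Self: ...


class ToyJSON(TypedDict):
    """Narrow output type (hypothetical)."""

    shape: list[int]


class ToyJSONLike(TypedDict):
    """Generous input type (hypothetical): a bare int is also accepted."""

    shape: int | Sequence[int]


@dataclass(frozen=True)
class ToyMetadata:
    shape: tuple[int, ...]

    def to_json(self) -> ToyJSON:
        return {"shape": list(self.shape)}

    @classmethod
    def from_json(cls, data: ToyJSONLike) -> Self:
        raw = data["shape"]
        return cls(shape=(raw,) if isinstance(raw, int) else tuple(raw))

    @classmethod
    def try_from_json(cls, data: object) -> Self:
        # Check the untyped input first, then delegate to the typed path.
        if not isinstance(data, Mapping) or "shape" not in data:
            raise TypeError("expected a mapping with a 'shape' key")
        raw = data["shape"]
        is_int_seq = (
            isinstance(raw, Sequence)
            and not isinstance(raw, str)
            and all(isinstance(x, int) for x in raw)
        )
        if not (isinstance(raw, int) or is_int_seq):
            raise TypeError(f"invalid shape: {raw!r}")
        return cls.from_json(cast(ToyJSONLike, data))
```

In this sketch `ToyMetadata` structurally satisfies `JSONSerializable[ToyJSONLike, ToyJSON]` without inheriting from anything, which is the property the protocol approach relies on.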

I implement these methods on the ArrayV3Metadata class, which involves the following changes:

  • Defining the ArrayMetadataJSONLike_V3 type, which is effectively the signature of ArrayV3Metadata.__init__
  • Widening the type of the data_type parameter that ArrayV3Metadata will accept. It now accepts string or object declarations of ZDType instances, instead of requiring ZDType instances.
  • Widening the type of the shape parameter that ArrayV3Metadata will accept. It now accepts a ShapeLike input.
  • Defining new ChunkGridLike and CodecLike aliases for the inputs ArrayV3Metadata.__init__ accepts
  • Adding DimensionNamesLike as an alias for DimensionNames
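
As a rough illustration of what "widening" the data_type parameter means, the sketch below normalizes three kinds of input to one internal type. `ToyDType` and `parse_dtype` are hypothetical stand-ins, not zarr's ZDType machinery, and the `{"name": ...}` object form is an assumption about what an object declaration looks like.

```python
from __future__ import annotations

from collections.abc import Mapping
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyDType:
    """Hypothetical stand-in for a data type object."""

    name: str


def parse_dtype(value: str | Mapping[str, object] | ToyDType) -> ToyDType:
    # Already the strict type: pass it through unchanged.
    if isinstance(value, ToyDType):
        return value
    # String declaration, e.g. "int8".
    if isinstance(value, str):
        return ToyDType(name=value)
    # Object declaration, assumed here to carry a string "name" field.
    if isinstance(value, Mapping):
        name = value.get("name")
        if isinstance(name, str):
            return ToyDType(name=name)
    raise TypeError(f"cannot interpret {value!r} as a data type")
```

The constructor would call a function like this once, so every caller further up the stack can pass whichever form it already has.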

The goal is to make our array metadata class more useful by more clearly defining the types it accepts and creates. If we accept these changes, we can remove metadata parsing at higher levels in the codebase, and move towards making the ArrayV3Metadata class part of our public API.

related:
#3786
#3795

Edit: I have reduced the scope of this PR to make it easier to review. Now it's simply a protocol definition and an implementation on ArrayV3Metadata.

@d-v-b d-v-b requested a review from a team March 20, 2026 14:22
Contributor Author

d-v-b commented Mar 20, 2026

For context, we have a Metadata ABC, but it has a serious flaw: if we define a class method like from_dict(obj: T) -> Self, no subclass can narrow the type T to something more specific without violating Liskov substitution. So the base class approach is a dead end if we ever want separate metadata objects to clearly indicate the type they deserialize from, and IMO we do want that.
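
The flaw can be shown with a small, simplified sketch. The `Metadata` class here is a stand-in with an assumed `from_dict` signature, and `ToyGroupJSON`, `ToyGroupMetadata`, `Deserializable` and `ToyGroup` are hypothetical names. The code runs, but a static type checker such as mypy or pyright would be expected to reject the narrowed override in the subclass.

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from typing import TYPE_CHECKING, Protocol, TypedDict, TypeVar

if TYPE_CHECKING:
    from typing import Self


class Metadata(ABC):
    """Simplified stand-in for a shared base class."""

    @classmethod
    @abstractmethod
    def from_dict(cls, obj: dict[str, object]) -> Self: ...


class ToyGroupJSON(TypedDict):
    node_type: str


class ToyGroupMetadata(Metadata):
    def __init__(self, node_type: str) -> None:
        self.node_type = node_type

    @classmethod
    def from_dict(cls, obj: ToyGroupJSON) -> Self:
        # Narrowing the parameter type breaks substitutability: code holding
        # a `type[Metadata]` may pass any dict[str, object].
        return cls(obj["node_type"])


# Protocol alternative: each class states its own input type, with no
# inherited signature to conflict with.
T_in = TypeVar("T_in", contravariant=True)


class Deserializable(Protocol[T_in]):
    @classmethod
    def from_json(cls, data: T_in) -> Self: ...


class ToyGroup:
    def __init__(self, node_type: str) -> None:
        self.node_type = node_type

    @classmethod
    def from_json(cls, data: ToyGroupJSON) -> Self:
        return cls(data["node_type"])
```

Here `ToyGroup` satisfies `Deserializable[ToyGroupJSON]` structurally, so the specific input type is part of its public signature rather than a violation of a parent's contract.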

Comment thread src/zarr/core/common.py Outdated
Comment thread src/zarr/abc/serializable.py Outdated
@github-actions github-actions Bot added and removed the "needs release notes" label (automatically applied to PRs which haven't added release notes) Mar 26, 2026
Comment thread src/zarr/core/metadata/v3.py Outdated
return dict(data) # type: ignore[arg-type]


class ArrayMetadataJSON_V3(TypedDict, extra_items=AllowedExtraField): # type: ignore[call-arg]
Contributor Author

this should eventually be a public type

Comment thread src/zarr/core/metadata/v3.py Outdated
)


def check_array_metadata_like(data: object) -> ArrayMetadataJSONLike_V3:
Contributor Author

I would like this function to eventually be public

Comment thread src/zarr/core/metadata/v3.py Outdated
return cast(ArrayMetadataJSONLike_V3, data)


class ArrayMetadataJSONLike_V3(TypedDict, extra_items=AllowedExtraField): # type: ignore[call-arg]
Contributor Author

this should eventually be public

Comment thread src/zarr/core/metadata/v3.py Outdated
Comment on lines +384 to +385
shape: ShapeLike,
data_type: ZDTypeLike,
Contributor Author

metadata now parses shapes and data types.

Comment thread tests/test_metadata/test_v3.py Outdated


@dataclass(frozen=True)
class Expect[TIn, TOut]:
Contributor Author

I am liking this pattern for simple test case definition. We could consider using this elsewhere in the codebase.
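
Only the header of `Expect` is visible in this thread, so the following is a guess at how such a pattern might be used, not the test code from the PR. The fields `input`, `output` and `id`, the `normalize_shape` function under test, and the use of `Generic` instead of the `class Expect[TIn, TOut]` syntax are assumptions.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Generic, TypeVar

TIn = TypeVar("TIn")
TOut = TypeVar("TOut")


@dataclass(frozen=True)
class Expect(Generic[TIn, TOut]):
    """One test case: an input, the expected output, and a readable id."""

    input: TIn
    output: TOut
    id: str


def normalize_shape(shape: int | tuple[int, ...]) -> tuple[int, ...]:
    # Hypothetical function under test.
    return (shape,) if isinstance(shape, int) else tuple(shape)


CASES = [
    Expect(input=3, output=(3,), id="int"),
    Expect(input=(2, 3), output=(2, 3), id="tuple"),
]


def run_cases() -> None:
    for case in CASES:
        assert normalize_shape(case.input) == case.output, case.id
```

With pytest, the same list could be passed to `pytest.mark.parametrize`, using each case's `id` for the `ids` argument so failures are labelled by case.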

Comment thread tests/test_metadata/test_v3.py Outdated
@d-v-b d-v-b requested a review from maxrjones March 27, 2026 19:16
Contributor Author

d-v-b commented Mar 27, 2026

This is ready for another round of review.

If this pattern looks good for array metadata, I would start extending it to all our metadata classes. The end result would be a pair of public types for each metadata class: the generous input JSON-like type, and the strict output JSON-like type, and functions that parse unknown values into both types, or error. I think this would be very useful for libraries that want type-safe routines for working with array metadata documents.
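
A sketch of what that pair of types and checking functions could look like for a downstream library. All names here (`ToyArrayJSON`, `ToyArrayJSONLike`, `check_toy_array_json_like`, `check_toy_array_json`) are hypothetical, the two-field document is far smaller than real array metadata, and the choice of exceptions is an assumption.

```python
from __future__ import annotations

from collections.abc import Mapping, Sequence
from typing import TypedDict, cast


class ToyArrayJSON(TypedDict):
    """Strict output form."""

    zarr_format: int
    shape: list[int]


class ToyArrayJSONLike(TypedDict):
    """Generous input form: shape may be an int or any sequence of ints."""

    zarr_format: int
    shape: int | Sequence[int]


def _is_int_sequence(value: object) -> bool:
    return (
        isinstance(value, Sequence)
        and not isinstance(value, (str, bytes))
        and all(isinstance(x, int) for x in value)
    )


def check_toy_array_json_like(data: object) -> ToyArrayJSONLike:
    if not isinstance(data, Mapping):
        raise TypeError(f"expected a mapping, got {type(data).__name__}")
    if data.get("zarr_format") != 3:
        raise ValueError("zarr_format must be 3")
    shape = data.get("shape")
    if not (isinstance(shape, int) or _is_int_sequence(shape)):
        raise TypeError(f"invalid shape: {shape!r}")
    return cast(ToyArrayJSONLike, data)


def check_toy_array_json(data: object) -> ToyArrayJSON:
    # The strict form is the generous form plus tighter constraints.
    like = check_toy_array_json_like(data)
    if not isinstance(like["shape"], list):
        raise TypeError("strict form requires shape to be a list of ints")
    return cast(ToyArrayJSON, like)
```

A caller holding an arbitrary decoded JSON value gets either a value the type checker understands or an exception, with no unchecked casts in its own code.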


codecov Bot commented Apr 6, 2026

Codecov Report

❌ Patch coverage is 40.00000% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 93.08%. Comparing base (866aa8d) to head (286a047).

Files with missing lines        Patch %   Lines
src/zarr/abc/serializable.py    0.00%     3 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3799      +/-   ##
==========================================
- Coverage   93.11%   93.08%   -0.03%     
==========================================
  Files          85       86       +1     
  Lines       11365    11370       +5     
==========================================
+ Hits        10582    10584       +2     
- Misses        783      786       +3     
Files with missing lines        Coverage Δ
src/zarr/core/metadata/v3.py    93.75% <100.00%> (+0.03%) ⬆️
src/zarr/abc/serializable.py    0.00% <0.00%> (ø)

@maxrjones
Member

@d-v-b I'm trying to make my way through a backlog of review requests, but I think it'd help to have the merge conflicts resolved on this first

Introduces a generic `JSONSerializable[T_co]` protocol in
`zarr.abc.serializable` parameterized on the JSON output type. Adds a
`to_json` method to `ArrayV3Metadata` that returns an
`ArrayMetadataJSON_V3` payload, demonstrating the protocol.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@d-v-b d-v-b force-pushed the feat/serializable-protocol branch from 61336ee to b1984f6 Compare April 22, 2026 19:19
Contributor Author

d-v-b commented Apr 23, 2026

@maxrjones I have reduced the scope of this PR to make it easier to review. Now it's simply a protocol definition for one method (to_json), and an implementation of that method on ArrayV3Metadata.
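
For readers skimming the thread, a minimal sketch of the reduced, single-method design: a protocol generic over the JSON output type with only `to_json`. The toy class, the `ToyJSON` payload type and the `dump` helper are assumptions for illustration, not the contents of zarr.abc.serializable.

```python
from __future__ import annotations

import json
from dataclasses import dataclass
from typing import Protocol, TypedDict, TypeVar

T_co = TypeVar("T_co", covariant=True)


class JSONSerializable(Protocol[T_co]):
    def to_json(self) -> T_co: ...


class ToyJSON(TypedDict):
    shape: list[int]


@dataclass(frozen=True)
class ToyMetadata:
    shape: tuple[int, ...]

    def to_json(self) -> ToyJSON:
        return {"shape": list(self.shape)}


def dump(obj: JSONSerializable[object]) -> str:
    # Works for any object whose to_json returns something json can encode.
    return json.dumps(obj.to_json())
```

Because the type parameter is covariant, a `JSONSerializable[ToyJSON]` is accepted wherever a `JSONSerializable[object]` is expected, so helpers like `dump` need no knowledge of the concrete metadata class.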
