PyTensorStore.jl provides a wrapper around the Python package tensorstore. A future TensorStore.jl may wrap the C++ API directly.
For more detailed information on the underlying library, see the official TensorStore documentation.
This package is primarily being developed to test Zarr.jl.
- Julia Array Interface: Support for `size`, `ndims`, `eltype`, `axes`, and 1-based indexing.
- Read/Write Operations: Synchronous and asynchronous read/write support.
- Indexing & Domain Manipulation:
  - Labeled indexing (e.g., `w[lat=1:10, lon=1:5]`).
  - Domain operations: `translate_by`, `translate_to`, and `label`.
- Transactions: Atomic multi-write operations with an idiomatic Julia context manager.
- Contexts: Share resources (like cache pools) across multiple TensorStore handles.
- Specs & Schemas: Programmatic access to TensorStore `Spec`, `Schema`, and `ChunkLayout`.
julia> using PyTensorStore
Precompiling PyTensorStore...
1 dependency successfully precompiled in 2 seconds. 50 already precompiled.
julia> d = Dict(
"driver" => "n5",
"kvstore" => Dict(
"driver" => "file",
"path" => "tmp/dataset/",
),
"metadata" => Dict(
"compression" => Dict(
"type" => "gzip"
),
"dataType" => "uint32",
"dimensions" => [1000, 20000],
"blockSize" => [100, 100],
),
"create" => true,
"delete_existing" => true
)
Dict{String, Any} with 5 entries:
"driver" => "n5"
"create" => true
"kvstore" => Dict("driver"=>"file", "path"=>"tmp/dataset/")
"metadata" => Dict{String, Any}("blockSize"=>[100, 100], "dataType"=>"…
"delete_existing" => true
julia> A = PyTensorStore.open(d).result()
PyTensorStore.TensorStoreWrapper(<py TensorStore({
'context': {
'cache_pool': {},
'data_copy_concurrency': {},
'file_io_concurrency': {},
'file_io_sync': True,
},
'driver': 'n5',
'dtype': 'uint32',
'kvstore': {'driver': 'file', 'path': 'tmp/dataset/'},
'metadata': {
'blockSize': [100, 100],
'compression': {'level': -1, 'type': 'gzip', 'useZlib': False},
'dataType': 'uint32',
'dimensions': [1000, 20000],
},
'transform': {
'input_exclusive_max': [[1000], [20000]],
'input_inclusive_min': [0, 0],
},
})>)
julia> A[1:100, 1:100]
PyTensorStore.TensorStoreWrapper(<py TensorStore({
'context': {
'cache_pool': {},
'data_copy_concurrency': {},
'file_io_concurrency': {},
'file_io_sync': True,
},
'driver': 'n5',
'dtype': 'uint32',
'kvstore': {'driver': 'file', 'path': 'tmp/dataset/'},
'metadata': {
'blockSize': [100, 100],
'compression': {'level': -1, 'type': 'gzip', 'useZlib': False},
'dataType': 'uint32',
'dimensions': [1000, 20000],
},
'transform': {
'input_exclusive_max': [100, 100],
'input_inclusive_min': [0, 0],
},
})>)
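The `transform` above shows how the wrapper maps Julia's 1-based, inclusive ranges onto TensorStore's 0-based, half-open index domains: the Julia slice `A[1:100, 1:100]` becomes `input_inclusive_min = [0, 0]` and `input_exclusive_max = [100, 100]`. A minimal sketch of that conversion (the helper name is my own, for illustration; it is not part of the package):

```python
def julia_range_to_domain(start, stop):
    """Convert a Julia-style 1-based inclusive range start:stop to
    TensorStore's 0-based half-open (inclusive_min, exclusive_max)."""
    return start - 1, stop

# Julia's 1:100 covers the same 100 elements as TensorStore's [0, 100).
print(julia_range_to_domain(1, 100))  # -> (0, 100)
```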
julia> A[1:100, 1:100].write(ones(UInt32, 100, 100)*UInt32(5)).result()
Python: None
julia> A[1:100, 1:100].read().result()
100×100 PyArray{UInt32, 2}:
0x00000005 0x00000005 0x00000005 … 0x00000005 0x00000005 0x00000005
0x00000005 0x00000005 0x00000005 0x00000005 0x00000005 0x00000005
0x00000005 0x00000005 0x00000005 0x00000005 0x00000005 0x00000005
0x00000005 0x00000005 0x00000005 0x00000005 0x00000005 0x00000005
0x00000005 0x00000005 0x00000005 0x00000005 0x00000005 0x00000005
0x00000005 0x00000005 0x00000005 … 0x00000005 0x00000005 0x00000005
0x00000005 0x00000005 0x00000005 0x00000005 0x00000005 0x00000005
0x00000005 0x00000005 0x00000005 0x00000005 0x00000005 0x00000005
0x00000005 0x00000005 0x00000005 0x00000005 0x00000005 0x00000005
0x00000005 0x00000005 0x00000005 0x00000005 0x00000005 0x00000005
0x00000005 0x00000005 0x00000005 … 0x00000005 0x00000005 0x00000005
0x00000005 0x00000005 0x00000005 0x00000005 0x00000005 0x00000005
0x00000005 0x00000005 0x00000005 0x00000005 0x00000005 0x00000005
0x00000005 0x00000005 0x00000005 0x00000005 0x00000005 0x00000005
⋮ ⋱
0x00000005 0x00000005 0x00000005 0x00000005 0x00000005 0x00000005
0x00000005 0x00000005 0x00000005 0x00000005 0x00000005 0x00000005
0x00000005 0x00000005 0x00000005 0x00000005 0x00000005 0x00000005
0x00000005 0x00000005 0x00000005 0x00000005 0x00000005 0x00000005
0x00000005 0x00000005 0x00000005 … 0x00000005 0x00000005 0x00000005
0x00000005 0x00000005 0x00000005 0x00000005 0x00000005 0x00000005
0x00000005 0x00000005 0x00000005 0x00000005 0x00000005 0x00000005
0x00000005 0x00000005 0x00000005 0x00000005 0x00000005 0x00000005
0x00000005 0x00000005 0x00000005 0x00000005 0x00000005 0x00000005
0x00000005 0x00000005 0x00000005 … 0x00000005 0x00000005 0x00000005
0x00000005 0x00000005 0x00000005 0x00000005 0x00000005 0x00000005
0x00000005 0x00000005 0x00000005 0x00000005 0x00000005 0x00000005
0x00000005 0x00000005 0x00000005 0x00000005 0x00000005 0x00000005
0x00000005 0x00000005 0x00000005 0x00000005 0x00000005 0x00000005
julia> A[1,1].write(9).result()
Python: None
julia> A[1,1].read().result()
0-dimensional PyArray{UInt32, 0}:
0x00000009

If your TensorStore has dimension labels, you can index using keywords:
# Open with labels in schema
spec["schema"] = Dict("domain" => Dict("labels" => ["x", "y"]))
w = PyTensorStore.open(spec).result()
# Index by dimension label
sub_w = w[x=1:5, y=10:15]

Atomic multi-write operations can be performed using an idiomatic Julia context manager:
PyTensorStore.transaction() do txn
w_txn = w.with_transaction(txn)
w_txn[1, 1] = 42
w_txn[2, 2] = 100
# Changes are committed automatically when the block exits successfully.
end

Contexts allow you to manage shared resources like cache pools:
# Create a context with a 1GB cache limit
ctx = PyTensorStore.context(Dict(
"cache_pool" => Dict("total_bytes_limit" => 10^9)
))
# Use the context when opening a store
w = PyTensorStore.open(spec, context=ctx).result()

Domain operations let you manipulate a store's coordinate system:

# Shift the domain coordinate system
shifted_w = PyTensorStore.translate_by(w, 10, 20)
# Move the origin to a specific coordinate
centered_w = PyTensorStore.translate_to(w, 1, 1)
# Re-label dimensions
labeled_w = PyTensorStore.label(w, "lat", "lon")
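The difference between the two translation operations is that `translate_by` shifts every dimension's origin by a relative offset, while `translate_to` moves the origin to an absolute coordinate. A minimal pure-Python sketch of those semantics on a plain origin tuple (illustrative only, not the package's implementation):

```python
def translate_by(origin, *offsets):
    """Shift each dimension's origin by the given relative offset."""
    return tuple(o + d for o, d in zip(origin, offsets))

def translate_to(origin, *targets):
    """Move each dimension's origin to the given absolute coordinate."""
    return tuple(targets)

origin = (0, 0)
print(translate_by(origin, 10, 20))  # -> (10, 20)
print(translate_to(origin, 1, 1))    # -> (1, 1)
```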