2025-03-21
a reason
want to share, use, publish, distribute, host and find data together
are part of orthogonal groups, e.g.: aircraft, institution, country, campaign
an opportunity
content addressed & distributed
bytes
and link
types
IPFS CIDs refer to immutable content.
A naming system is required for updates, e.g.:
flowchart LR
    IPNS -- name --> IPFS -- CID --> content
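The relation between a mutable name and immutable content can be illustrated with a toy model. This is only a sketch: `ToyStore` and its methods are hypothetical names, and a plain SHA-256 hex digest stands in for a real CID, which is encoded differently.

```python
import hashlib


class ToyStore:
    """Toy content-addressed store with a mutable name table (not real IPFS)."""

    def __init__(self):
        self.blocks = {}  # address -> bytes, never modified once written
        self.names = {}   # name -> address, a pointer that may be moved

    def add(self, content: bytes) -> str:
        # The address is derived from the content itself,
        # so identical content always gets the identical address.
        address = hashlib.sha256(content).hexdigest()
        self.blocks[address] = content
        return address

    def publish(self, name: str, address: str) -> None:
        # An update only moves the pointer; old content stays addressable.
        self.names[name] = address

    def resolve(self, name: str) -> bytes:
        # name -> address -> content, as in the flowchart.
        return self.blocks[self.names[name]]


store = ToyStore()
v1 = store.add(b"dataset, version 1")
store.publish("latest", v1)
v2 = store.add(b"dataset, version 2")
store.publish("latest", v2)
```

After the second `publish`, `store.resolve("latest")` returns version 2, while version 1 remains reachable under `v1`. Whoever controls the name table decides what "latest" means, which is where trust enters.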
requires trust
latest.orcestra-campaign.org
QmPmvSb767cgHXKtTC2X9im7ftfZDT4NJigxveqt6yf9PW - meta/
QmT1cpAaBAppRo1EAnpXRonQbRyPeAQx4yM7yjib6cRFa7 - products/
QmUsJPneXEdDKHP75JrnqoABWQxcbPUemsSBZWLdc3x6UH - raw/
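A domain name such as the one above can point to a CID through DNSLink, where a TXT record at `_dnslink.<domain>` holds a value of the form `dnslink=/ipfs/<CID>`. Whether this deployment uses exactly that mechanism is an assumption here. The sketch below only parses a record value and does no network lookup; `parse_dnslink` and `dnslink_record_name` are hypothetical helper names, and the record value in the example is illustrative, not a verified DNS answer.

```python
def dnslink_record_name(domain: str) -> str:
    """Return the DNS name under which a DNSLink TXT record is looked up."""
    return f"_dnslink.{domain}"


def parse_dnslink(txt_value: str) -> str:
    """Extract the content path from a DNSLink TXT record value."""
    prefix = "dnslink="
    # TXT values are often shown in double quotes; strip them if present.
    value = txt_value.strip().strip('"')
    if not value.startswith(prefix):
        raise ValueError(f"not a DNSLink record: {txt_value!r}")
    path = value[len(prefix):]
    if not path.startswith(("/ipfs/", "/ipns/")):
        raise ValueError(f"unsupported DNSLink target: {path!r}")
    return path


# Illustrative record value (assumed, not fetched from DNS).
example = '"dnslink=/ipfs/QmenSJd5QnrikC92MFDaFFjTvkBSTjvD1dBggvzvKLh1DT"'
path = parse_dnslink(example)
```

The returned path starts with `/ipfs/` for a direct CID or with `/ipns/` when the record delegates to another name.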
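import matplotlib.pyplot as plt
import xarray as xr
# ipfs:// and ipns:// URLs need an fsspec IPFS backend (e.g. ipfsspec) to be installed.
# root = "ipns://latest.orcestra-campaign.org"  # mutable name, follows updates
root = "ipfs://QmenSJd5QnrikC92MFDaFFjTvkBSTjvD1dBggvzvKLh1DT"  # immutable CID, one fixed version
ds = xr.open_dataset(f"{root}/products/HALO/dropsondes/Level_3/PERCUSION_Level_3.zarr", engine="zarr")
# Scatter of aircraft longitude/latitude, colored by ds.iwv.
plt.scatter(ds.aircraft_longitude, ds.aircraft_latitude, c=ds.iwv, cmap="viridis_r", vmin=45, vmax=70, s=2)
plt.colorbar()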
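Where no IPFS-aware client is available, the same paths can usually be reached through an HTTP gateway using the path layout `<gateway>/ipfs/<CID>/<path>`. The helper below is a minimal sketch; `to_gateway_url` is a hypothetical name, and the default gateway `https://ipfs.io` is only one possible choice (a local node's gateway would work the same way).

```python
def to_gateway_url(url: str, gateway: str = "https://ipfs.io") -> str:
    """Rewrite an ipfs:// or ipns:// URL into a path-style gateway URL."""
    scheme, sep, rest = url.partition("://")
    if not sep or scheme not in ("ipfs", "ipns"):
        raise ValueError(f"expected an ipfs:// or ipns:// URL, got {url!r}")
    # Path gateways expose content as /ipfs/<CID>/... and /ipns/<name>/...
    return f"{gateway.rstrip('/')}/{scheme}/{rest}"


root = "ipfs://QmenSJd5QnrikC92MFDaFFjTvkBSTjvD1dBggvzvKLh1DT"
http_root = to_gateway_url(root)
```

Going through a third-party gateway means trusting that gateway to return the right bytes, whereas a client that verifies content against the CID does not need that trust.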
flowchart LR
    data -- ipfs add --> sc[subtree CID] -- pull request --> tree[tree.yaml]
    subgraph "data flow"
        tree -- MFS --> root[root CID]
        root -- pin --> pins[pinning service]
        root -- publish --> dns["DNS (latest...)"]
    end
    subgraph "index flow"
        root -- scan --> dcid["dataset CID(s)"] -- extract metadata --> stac_item[STAC item]
        stac_item -- collect --> stac_index[STAC index]
    end
    stac_item & stac_index --> browser
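The "extract metadata" step of the index flow could look roughly like the following. This is a sketch under assumptions, not the project's actual indexer: `make_stac_item` is a hypothetical helper, the item id, bounding box, and timestamp are placeholder values, and only the core fields of a STAC Item are filled in.

```python
def make_stac_item(item_id, cid, path, bbox, datetime_iso):
    """Build a minimal STAC Item (a GeoJSON Feature) for one dataset."""
    west, south, east, north = bbox
    return {
        "type": "Feature",
        "stac_version": "1.0.0",
        "id": item_id,
        "bbox": [west, south, east, north],
        # Footprint as a closed polygon ring around the bounding box.
        "geometry": {
            "type": "Polygon",
            "coordinates": [[
                [west, south], [east, south], [east, north],
                [west, north], [west, south],
            ]],
        },
        "properties": {"datetime": datetime_iso},
        "links": [],
        # The asset points at immutable content via its CID.
        "assets": {"data": {"href": f"ipfs://{cid}/{path}"}},
    }


item = make_stac_item(
    item_id="example-dropsondes",           # placeholder id
    cid="QmenSJd5QnrikC92MFDaFFjTvkBSTjvD1dBggvzvKLh1DT",
    path="products/HALO/dropsondes/Level_3/PERCUSION_Level_3.zarr",
    bbox=(-60.0, 0.0, -20.0, 20.0),         # placeholder extent
    datetime_iso="2024-08-01T00:00:00Z",    # placeholder timestamp
)
```

Because the `href` contains a CID, an index entry keeps pointing at exactly the bytes it was generated from, even after the root is republished.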
We still need to gather more datasets.
IPFS ❤️ nice datasets
After a long planning period, a project for the actual implementation of a distributed HaloDB is forming.
CS3 Sync & Share 2025-03-21