EO-LINCS

xcube Multi-Source Data Store

xcube-multistore is a Python package designed to create a Multi-Source Data Store that enables the seamless integration of data from multiple sources into a unified data model. This approach simplifies the data fusion process while ensuring transparency and reproducibility through well-defined configurations.

The package utilizes xcube’s data access, implemented via data store plugins, along with additional functionalities from xcube, to manipulate and harmonize datasets according to user-defined specifications.

The workflow includes the following steps:

  1. Data access through xcube data stores
  2. Data harmonization (e.g. subsetting, resampling, or reprojecting a dataset)
  3. Optional data fusion (e.g. combining multiple data sources into one data cube)

This process results in either a single, unified data cube with all datasets aligned to a consistent grid or a catalog of separate datasets.
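
The three steps above can be illustrated with a small, library-free sketch. It is not how xcube-multistore is implemented: the helper names (`subset`, `resample_nearest`, `fuse`), the one-dimensional grids, and the sample values are all assumptions made up for illustration. It only shows the idea of bringing two differently sampled sources onto one target grid before combining them.

```python
# Toy illustration of access -> harmonize -> fuse on 1-D grids.
# All names and values are hypothetical; none are part of xcube-multistore.

def subset(coords, values, lo, hi):
    """Keep only the samples whose coordinate lies in [lo, hi]."""
    kept = [(c, v) for c, v in zip(coords, values) if lo <= c <= hi]
    return [c for c, _ in kept], [v for _, v in kept]


def resample_nearest(coords, values, target_coords):
    """Nearest-neighbour resampling of values onto target_coords."""
    out = []
    for t in target_coords:
        nearest = min(range(len(coords)), key=lambda i: abs(coords[i] - t))
        out.append(values[nearest])
    return out


def fuse(target_coords, **variables):
    """Combine variables that share one grid into a single 'cube' (a dict)."""
    for name, vals in variables.items():
        if len(vals) != len(target_coords):
            raise ValueError(f"{name} is not aligned to the target grid")
    return {"coords": list(target_coords), "data_vars": dict(variables)}


# Step 1: "access" two sources on different grids (hard-coded here).
temp_x, temp = [0.0, 1.0, 2.0, 3.0, 4.0], [10.0, 11.0, 12.0, 13.0, 14.0]
ndvi_x, ndvi = [0.0, 1.5, 4.0], [0.2, 0.4, 0.6]

# Step 2: harmonize both sources onto a common target grid.
target = [1.0, 2.0, 3.0]
sub_x, sub_v = subset(temp_x, temp, 1.0, 3.0)
temp_on_target = resample_nearest(sub_x, sub_v, target)
ndvi_on_target = resample_nearest(ndvi_x, ndvi, target)

# Step 3: fuse the aligned variables into one data cube.
cube = fuse(target, temperature=temp_on_target, ndvi=ndvi_on_target)
```

Skipping step 3 and keeping `temp_on_target` and `ndvi_on_target` as separate objects corresponds to the "catalog of separate datasets" outcome.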

Overview

The Multi-Source Data Store is configured via a YAML file. You can find an example configuration in examples/config.yml.

For more detailed guidance on creating a configuration file, please refer to the Configuration Guide.

Once the configuration file is ready, the Multi-Source Data Store can be started with a single line of code, as shown below:

from xcube_multistore.multistore import MultiSourceDataStore

msds = MultiSourceDataStore("config.yml")

For further examples, see the examples folder.
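
Because everything is driven by the configuration file, it can help to fail early with a clear message when the path is wrong. The wrapper below is a minimal sketch: `open_multistore` is a hypothetical helper, not part of the package, and it only adds a file check in front of the constructor call shown above.

```python
from pathlib import Path


def open_multistore(config_path):
    """Check that the YAML file exists, then build the store (hypothetical helper)."""
    path = Path(config_path)
    if not path.is_file():
        raise FileNotFoundError(f"Configuration file not found: {path}")
    # Imported lazily so the path check works even if the package is missing.
    from xcube_multistore.multistore import MultiSourceDataStore

    return MultiSourceDataStore(str(path))
```

Calling `open_multistore("config.yml")` then behaves like the one-liner above, except that a missing file is reported before the package is imported.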

Features

IMPORTANT:
The xcube-multistore package is currently in the early stages of development.
The following features are available so far:

  • subset of dataset (defined by grid mapping)
  • resample and reproject dataset (defined by grid mapping)
  • grid mapping may be defined by the user or by a dataset
  • extract a time series at a single spatial point by interpolating between the neighbouring grid points
  • allow data fusion, where data variables in one xr.Dataset refer to different data sources
  • support spatial cutout of an area around a defined spatial point
  • support preload API for xcube-clms and xcube-zenodo
  • write output to NetCDF and Zarr
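
To make the point-based features in the list above more concrete, here is a minimal sketch of extracting a time series at one point by bilinear interpolation, together with the bounding box of a square cutout around that point. The package's actual interpolation method and parameters are not documented here, so bilinear interpolation and the names `bilinear_at_point`, `point_time_series`, and `cutout_bounds` are assumptions for illustration only.

```python
# Hypothetical helpers; not part of the xcube-multistore API.

def bilinear_at_point(xs, ys, grid, x, y):
    """Bilinearly interpolate grid[j][i] (rows follow ys, columns follow xs)
    at (x, y). xs and ys must be strictly ascending with at least two entries."""
    if not (xs[0] <= x <= xs[-1] and ys[0] <= y <= ys[-1]):
        raise ValueError("point lies outside the grid")
    # Index of the lower-left corner of the cell that encloses the point.
    i = max(k for k in range(len(xs) - 1) if xs[k] <= x)
    j = max(k for k in range(len(ys) - 1) if ys[k] <= y)
    tx = (x - xs[i]) / (xs[i + 1] - xs[i])
    ty = (y - ys[j]) / (ys[j + 1] - ys[j])
    bottom = (1 - tx) * grid[j][i] + tx * grid[j][i + 1]
    top = (1 - tx) * grid[j + 1][i] + tx * grid[j + 1][i + 1]
    return (1 - ty) * bottom + ty * top


def point_time_series(xs, ys, cube, x, y):
    """Interpolate every time slice of cube (a list of 2-D grids) at (x, y)."""
    return [bilinear_at_point(xs, ys, grid, x, y) for grid in cube]


def cutout_bounds(x, y, half_width):
    """Bounding box (x_min, y_min, x_max, y_max) of a square cutout."""
    return (x - half_width, y - half_width, x + half_width, y + half_width)


xs, ys = [0.0, 1.0], [0.0, 1.0]
cube = [
    [[0.0, 1.0], [2.0, 3.0]],      # time step 0
    [[10.0, 10.0], [10.0, 10.0]],  # time step 1
]
series = point_time_series(xs, ys, cube, 0.5, 0.5)  # [1.5, 10.0]
bbox = cutout_bounds(10.0, 50.0, 0.5)               # (9.5, 49.5, 10.5, 50.5)
```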

The following features will be implemented in the future:

  • auxiliary functions that help to set up a configuration YAML file
  • interpolate along the time axis
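
As an illustration of what interpolation along the time axis usually means, the sketch below linearly interpolates a value between two time stamps. It says nothing about how the planned feature will be implemented; the function name `interpolate_in_time` and the sample values are assumptions.

```python
from datetime import datetime


def interpolate_in_time(times, values, target):
    """Linearly interpolate values, sampled at strictly ascending times, at target."""
    if len(times) < 2:
        raise ValueError("at least two time steps are required")
    if not times[0] <= target <= times[-1]:
        raise ValueError("target time lies outside the sampled range")
    for t0, t1, v0, v1 in zip(times, times[1:], values, values[1:]):
        if t0 <= target <= t1:
            # Dividing two timedelta objects yields a plain float weight.
            weight = (target - t0) / (t1 - t0)
            return v0 + weight * (v1 - v0)


times = [datetime(2024, 1, 1), datetime(2024, 1, 3)]
values = [10.0, 20.0]
midpoint_value = interpolate_in_time(times, values, datetime(2024, 1, 2))  # 15.0
```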

License

The package is open source and released under the MIT license.