EO-LINCS

xcube Multi-Source Data Store

xcube-multistore is a Python package designed to create a Multi-Source Data Store that enables the seamless integration of data from multiple sources into a unified data model. This approach simplifies the data fusion process while ensuring transparency and reproducibility through well-defined configurations.

The package utilizes xcube’s data access, implemented via data store plugins, along with additional functionalities from xcube-resampling, to manipulate and harmonize datasets according to user-defined specifications.

The workflow includes the following steps:

  1. Data access through xcube data stores
  2. Data harmonization (e.g. subset, resample, reproject a dataset)
  3. Optional data fusion (e.g. combining multiple data sources into one data cube)

This process results in either a single, unified data cube with all datasets aligned to a consistent grid or a catalog of separate datasets.
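The three steps above can be pictured with a small, purely conceptual sketch. It uses plain Python dictionaries and a one-dimensional grid instead of xcube data stores and xr.Dataset objects. The function names `access`, `harmonize` and `fuse`, the in-memory `store`, and the nearest-neighbour resampling are all assumptions made for illustration; none of them is part of the xcube-multistore API.

```python
# Conceptual sketch only: plain Python stand-ins, not the xcube-multistore API.


def access(store, data_id):
    """Step 1: fetch a dataset from a (hypothetical) in-memory store."""
    return store[data_id]


def harmonize(dataset, target_x):
    """Step 2: nearest-neighbour resampling of a 1-D dataset onto target_x."""
    src_x, values = dataset["x"], dataset["values"]
    resampled = []
    for x in target_x:
        # Index of the source coordinate closest to the target coordinate.
        nearest = min(range(len(src_x)), key=lambda i: abs(src_x[i] - x))
        resampled.append(values[nearest])
    return {"x": list(target_x), "values": resampled}


def fuse(datasets):
    """Step 3: combine harmonized datasets into one cube on the shared grid."""
    first = next(iter(datasets.values()))
    cube = {"x": first["x"]}
    for name, ds in datasets.items():
        if ds["x"] != cube["x"]:
            raise ValueError(f"{name} is not aligned to the common grid")
        cube[name] = ds["values"]
    return cube


# Two toy sources on different grids.
store = {
    "temperature": {"x": [0.0, 1.0, 2.0, 3.0], "values": [10.0, 11.0, 12.0, 13.0]},
    "elevation": {"x": [0.0, 1.5, 3.0], "values": [100.0, 150.0, 200.0]},
}
target_x = [0.0, 1.0, 2.0, 3.0]

harmonized = {name: harmonize(access(store, name), target_x) for name in store}
cube = fuse(harmonized)
```

Skipping the final `fuse` call and keeping `harmonized` as it is corresponds to the second outcome, a catalog of separate datasets.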

Overview

The Multi-Source Data Store is configured through a YAML file. Examples are available in the examples folder. For more detailed guidance on creating a configuration file, please refer to the Configuration Guide.

Once the configuration file is ready, the Multi-Source Data Store can be started with a single line of code, as shown below:

from xcube_multistore.multistore import MultiSourceDataStore

msds = MultiSourceDataStore("config.yml")

Note: If the generation of one data cube fails, the system continues with the next dataset. This ensures that all configured datasets are processed and that a single failing dataset does not interrupt the entire workflow.
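The behaviour described in this note follows a common pattern: wrap each dataset in its own try/except, record the failure, and move on. The sketch below shows that pattern in isolation. `generate_all` and `fake_generate` are hypothetical names, and the code does not reflect how xcube-multistore implements or reports failures internally.

```python
import logging

logger = logging.getLogger("multistore_sketch")


def generate_all(dataset_ids, generate):
    """Run generate() for every dataset and keep going when one fails."""
    succeeded, failed = [], {}
    for data_id in dataset_ids:
        try:
            generate(data_id)
        except Exception as exc:  # one bad dataset must not stop the loop
            logger.warning("Skipping %s: %s", data_id, exc)
            failed[data_id] = str(exc)
        else:
            succeeded.append(data_id)
    return succeeded, failed


def fake_generate(data_id):
    """Hypothetical generator that fails for one dataset."""
    if data_id == "broken":
        raise ValueError("source unavailable")


succeeded, failed = generate_all(["a", "broken", "b"], fake_generate)
```

Returning the failures alongside the successes lets the caller inspect afterwards which datasets need attention.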

Features

  • subset a dataset (defined by a grid mapping)
  • resample and reproject a dataset (defined by a grid mapping)
  • resample along the time axis
  • define the grid mapping explicitly or derive it from a dataset
  • extract a time series at a single spatial point by interpolating the neighbouring points
  • fuse data, so that the data variables in one xr.Dataset refer to different data sources
  • cut out an area around a defined spatial point
  • support the preload API for xcube-clms and xcube-zenodo
  • write output to NetCDF and Zarr
  • auxiliary functions that help to set up a configuration YAML file
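To illustrate what interpolating the neighbouring points at a single spatial location can look like, here is a minimal sketch of bilinear interpolation on a regular grid. Bilinear interpolation is an assumption chosen for the example; the interpolation method the package actually applies is not stated here. The helper `interpolate_point` is hypothetical and works on plain lists rather than an xr.Dataset.

```python
from bisect import bisect_right


def interpolate_point(xs, ys, grid, x, y):
    """Bilinear interpolation at (x, y).

    grid[j][i] holds the value at (xs[i], ys[j]); xs and ys must be
    strictly increasing and contain at least two coordinates each.
    """
    if not (xs[0] <= x <= xs[-1] and ys[0] <= y <= ys[-1]):
        raise ValueError("point lies outside the grid")
    # Lower-left corner of the cell containing the point, clamped so that
    # points on the upper boundary still fall into the last cell.
    i = min(bisect_right(xs, x) - 1, len(xs) - 2)
    j = min(bisect_right(ys, y) - 1, len(ys) - 2)
    # Fractional position of the point inside the cell.
    tx = (x - xs[i]) / (xs[i + 1] - xs[i])
    ty = (y - ys[j]) / (ys[j + 1] - ys[j])
    # Interpolate along x on both cell edges, then along y between them.
    lower = (1 - tx) * grid[j][i] + tx * grid[j][i + 1]
    upper = (1 - tx) * grid[j + 1][i] + tx * grid[j + 1][i + 1]
    return (1 - ty) * lower + ty * upper


xs = [0.0, 1.0, 2.0]
ys = [0.0, 1.0]
grid = [
    [0.0, 1.0, 2.0],     # values along y = 0.0
    [10.0, 11.0, 12.0],  # values along y = 1.0
]
value = interpolate_point(xs, ys, grid, 0.5, 0.5)
```

Applying the same function to every time step of a data cube would yield a time series at that point.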

License

The package is open source and released under the MIT license.