EO-LINCS

xcube Multi-Source Data Store

xcube-multistore is a Python package designed to create a Multi-Source Data Store that enables the seamless integration of data from multiple sources into a unified data model. This approach simplifies the data fusion process while ensuring transparency and reproducibility through well-defined configurations.

The package utilizes xcube’s data access, implemented via data store plugins, along with additional functionalities from xcube, to manipulate and harmonize datasets according to user-defined specifications.

The workflow includes the following steps:

  1. Data access through xcube data stores
  2. Data harmonization (e.g. subsetting, resampling, or reprojecting a dataset)
  3. Optional data fusion (e.g. combining multiple data sources into one data cube)

This process results in either a single, unified data cube with all datasets aligned to a consistent grid or a catalog of separate datasets.
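The three steps can be illustrated with a toy example in plain Python. This is only a conceptual sketch and does not use xcube-multistore or xcube: the helper names `resample_linear` and `fuse`, the variable names, and the sample values are all made up for illustration. Two variables sampled on different 1-D grids are resampled onto one common target grid and then combined into a single mapping, which stands in for the unified data cube.

```python
from bisect import bisect_right


def resample_linear(src_coords, src_values, target_coords):
    """Linearly interpolate values from a sorted source grid onto a target grid.

    Target points outside the source range take the nearest edge value.
    """
    out = []
    for x in target_coords:
        if x <= src_coords[0]:
            out.append(src_values[0])
        elif x >= src_coords[-1]:
            out.append(src_values[-1])
        else:
            # Index of the first source coordinate strictly greater than x.
            i = bisect_right(src_coords, x)
            x0, x1 = src_coords[i - 1], src_coords[i]
            w = (x - x0) / (x1 - x0)
            out.append((1 - w) * src_values[i - 1] + w * src_values[i])
    return out


def fuse(target_coords, sources):
    """Resample every source onto target_coords and collect them in one dict."""
    cube = {"coords": list(target_coords)}
    for name, (coords, values) in sources.items():
        cube[name] = resample_linear(coords, values, target_coords)
    return cube


# Step 1 (data access) is mimicked by two in-memory sources on different grids.
sources = {
    "temperature": ([0.0, 2.0, 4.0], [10.0, 14.0, 18.0]),
    "humidity": ([0.0, 1.0, 2.0, 3.0, 4.0], [50.0, 52.0, 54.0, 56.0, 58.0]),
}

# Steps 2 and 3: harmonize onto a common grid, then fuse into one structure.
cube = fuse([0.0, 1.0, 2.0, 3.0, 4.0], sources)
```

In the real package, the target grid would presumably come from a grid mapping in the configuration rather than from a hard-coded list.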

Overview

The Multi-Source Data Store is configured via a YAML file. You can find an example configuration in examples/config.yml.

For more detailed guidance on creating a configuration file, please refer to the Configuration Guide.

Once the configuration file is ready, the Multi-Source Data Store can be started with a single line of code, as shown below:

from xcube_multistore.multistore import MultiSourceDataStore

msds = MultiSourceDataStore("config.yml")

For further examples, see the examples folder.

Features

IMPORTANT:
The xcube-multistore package is currently in the early stages of development.
The following features are available so far:

  • subset a dataset (extent defined by a grid mapping)
  • resample and reproject a dataset (target grid defined by a grid mapping)
  • define the grid mapping either explicitly or by taking it from a dataset
  • extract a time series at a single spatial point by interpolating the neighbouring grid points
  • fuse data so that the data variables in one xr.Dataset refer to different data sources
  • cut out an area around a defined spatial point
  • support the preload API for xcube-clms and xcube-zenodo
  • write output to NetCDF and Zarr
  • auxiliary functions that help set up a config YAML file
  • interpolate along the time axis
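To make the single-point feature more concrete, the sketch below estimates a value at an arbitrary location from the four surrounding grid points using bilinear interpolation. The source does not state which interpolation method xcube-multistore applies, so bilinear interpolation is an assumption chosen as one common option; the function name `bilinear` and the sample coordinates are hypothetical.

```python
def bilinear(x, y, x0, x1, y0, y1, v00, v10, v01, v11):
    """Bilinearly interpolate at (x, y) inside the cell [x0, x1] x [y0, y1].

    vIJ is the value at the grid point (xI, yJ), e.g. v10 is the value at (x1, y0).
    """
    tx = (x - x0) / (x1 - x0)  # fractional position along x, 0..1
    ty = (y - y0) / (y1 - y0)  # fractional position along y, 0..1
    bottom = (1 - tx) * v00 + tx * v10  # interpolate along x at y0
    top = (1 - tx) * v01 + tx * v11     # interpolate along x at y1
    return (1 - ty) * bottom + ty * top  # interpolate along y


# Value at the centre of a grid cell spanning lon 10..11 and lat 50..51.
centre = bilinear(10.5, 50.5, 10.0, 11.0, 50.0, 51.0, 1.0, 2.0, 3.0, 4.0)
```

Applying the same computation to every time step of a gridded variable yields a time series at the chosen point.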

Configuration Generator GUI

The Configuration Generator GUI provides an interactive interface for creating and editing the configuration YAML, making the setup process more intuitive and less error-prone.

Key features (in development):

  • Display of all available fields for each configuration section
  • Dynamic fetching and updating of valid parameters and inputs
  • Dropdown menus that show only supported options
  • Autofill assistance for large option sets (e.g., thousands of data IDs)
  • Built-in configuration validator/checker
  • Geolocation visualization to help define bounding boxes

Note: This feature is under active development, and only a minimal working example is currently available.
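As an illustration of what a configuration check might involve, the sketch below verifies that every dataset entry refers to a data store and grid mapping that are actually defined. It is not the package's validator. The keys used here (`datasets`, `data_stores`, `grid_mappings`, `identifier`, `store`, `grid_mapping`) are assumptions made for this example, so consult the Configuration Guide for the real schema. The configuration is given as a Python dict, as it would look after loading a YAML file.

```python
def check_references(config):
    """Return a list of problems found in a (hypothetical) configuration dict.

    An empty list means every dataset points at a defined store and grid mapping.
    """
    problems = []
    store_ids = {s["identifier"] for s in config.get("data_stores", [])}
    grid_ids = {g["identifier"] for g in config.get("grid_mappings", [])}

    for ds in config.get("datasets", []):
        name = ds.get("identifier", "<unnamed>")
        store = ds.get("store")
        if store not in store_ids:
            problems.append(f"dataset {name!r}: unknown store {store!r}")
        # In this sketch a grid mapping is optional, but must exist if given.
        grid = ds.get("grid_mapping")
        if grid is not None and grid not in grid_ids:
            problems.append(f"dataset {name!r}: unknown grid mapping {grid!r}")
    return problems


# Hypothetical configuration with one valid and one broken dataset entry.
config = {
    "data_stores": [{"identifier": "store_a"}],
    "grid_mappings": [{"identifier": "grid_a"}],
    "datasets": [
        {"identifier": "ok", "store": "store_a", "grid_mapping": "grid_a"},
        {"identifier": "broken", "store": "store_b", "grid_mapping": "grid_b"},
    ],
}
problems = check_references(config)
```

Collecting all problems in a list, rather than raising on the first one, lets a user fix several configuration mistakes in one pass.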

To launch the GUI, run the following command from the package root:

panel serve xcube_multistore/gui/app.py --dev

License

The package is open source and released under the MIT license. ❤