Demonstration of function resample_in_space()¶
This notebook demonstrates the functionality of the resample_in_space function. We show how it can be used for three different purposes:
- Rectification
- Reprojection to another Coordinate Reference System
- Up- and Downsampling
import xarray as xr
from xcube_resampling.gridmapping import GridMapping
from xcube_resampling import resample_in_space
from zarr.storage import ZipStore
First we open a Sentinel-3 OLCI Level-1B product.
%%time
store = ZipStore("inputdata/S3-OLCI-L2A.zarr.zip", mode="r")
source_ds = xr.open_zarr(store, consolidated=False)
source_ds
CPU times: user 42.7 ms, sys: 2.9 ms, total: 45.6 ms Wall time: 44.8 ms
<xarray.Dataset> Size: 72MB
Dimensions: (y: 1890, x: 1189)
Coordinates:
lat (y, x) float64 18MB dask.array<chunksize=(512, 512), meta=np.ndarray>
lon (y, x) float64 18MB dask.array<chunksize=(512, 512), meta=np.ndarray>
Dimensions without coordinates: y, x
Data variables:
quality_flags (y, x) uint32 9MB dask.array<chunksize=(512, 512), meta=np.ndarray>
rtoa_3 (y, x) float32 9MB dask.array<chunksize=(512, 512), meta=np.ndarray>
rtoa_6 (y, x) float32 9MB dask.array<chunksize=(512, 512), meta=np.ndarray>
rtoa_8 (y, x) float32 9MB dask.array<chunksize=(512, 512), meta=np.ndarray>
Attributes:
Conventions: CF-1.4
product_type: C2RCC_OLCI
start_date: 04-JUL-2018 09:21:55.677316
stop_date: 04-JUL-2018 09:23:18.811790Next we can plot the raw data.
%%time
source_ds.rtoa_8.plot(vmax=0.25)
CPU times: user 661 ms, sys: 132 ms, total: 793 ms Wall time: 734 ms
<matplotlib.collections.QuadMesh at 0x725a8c7e0d70>
The dataset uses 2D coordinates lon and lat, each defined over two dimensions (x and y) to represent the satellite viewing geometry. These coordinates are expressed in the EPSG:4326 geographic coordinate system (WGS84).
Rectification¶
We can create a GridMapping object from the dataset, which contains all the information required by the rectification algorithm.
%%time
source_gm = GridMapping.from_dataset(source_ds)
source_gm
CPU times: user 55.5 ms, sys: 5.13 ms, total: 60.6 ms Wall time: 50.2 ms
class: Coords2DGridMapping
- is_regular: False
- is_j_axis_up: False
- is_lon_360: False
- crs: EPSG:4326
- xy_res: (0.00447132, 0.00255235) estimated
- xy_bbox: (12.6920653, 55.1989038, 20.0071197, 60.6394662)
- ij_bbox: (0, 0, 1189, 1890)
- xy_dim_names: ('x', 'y')
- xy_var_names: ('lon', 'lat')
- size: (1189, 1890)
- tile_size: (512, 512)
We aim to rectify this dataset so that it uses a regular EPSG:4326 CRS with 1D coordinates lon and lat having constant spacing. To do this, we first create a new regular target grid mapping as follows:
%%time
target_gm = source_gm.to_regular(tile_size=512)
target_gm
CPU times: user 1.1 ms, sys: 16 μs, total: 1.12 ms Wall time: 650 μs
class: RegularGridMapping
- is_regular: True
- is_j_axis_up: False
- is_lon_360: False
- crs: EPSG:4326
- xy_res: (0.00447132, 0.00255235)
- xy_bbox: (12.6920653, 55.1989038, 20.0071448, 60.640514)
- ij_bbox: (0, 0, 1636, 2132)
- xy_dim_names: ('lon', 'lat')
- xy_var_names: ('lon', 'lat')
- size: (1636, 2132)
- tile_size: (512, 512)
The target grid mapping is created lazily: the large 2D x and y coordinates used for rectification are stored as Dask arrays.
%%time
x_coords, y_coords = target_gm.xy_coords
y_coords
CPU times: user 3.4 ms, sys: 0 ns, total: 3.4 ms Wall time: 3.4 ms
<xarray.DataArray 'xy_coords' (lat: 2132, lon: 1636)> Size: 28MB dask.array<getitem, shape=(2132, 1636), dtype=float64, chunksize=(512, 512), chunktype=numpy.ndarray> Dimensions without coordinates: lat, lon
We can now perform resample_in_space to rectify the dataset.
%%time
target_ds = resample_in_space(source_ds, source_gm=source_gm, target_gm=target_gm, interp_methods="nearest")
target_ds
CPU times: user 860 ms, sys: 55.5 ms, total: 915 ms Wall time: 793 ms
<xarray.Dataset> Size: 56MB
Dimensions: (lon: 1636, lat: 2132)
Coordinates:
* lon (lon) float64 13kB 12.69 12.7 12.7 12.71 ... 20.0 20.0 20.0
* lat (lat) float64 17kB 60.64 60.64 60.63 ... 55.21 55.2 55.2
spatial_ref int64 8B 0
Data variables:
quality_flags (lat, lon) uint32 14MB dask.array<chunksize=(512, 512), meta=np.ndarray>
rtoa_3 (lat, lon) float32 14MB dask.array<chunksize=(512, 512), meta=np.ndarray>
rtoa_6 (lat, lon) float32 14MB dask.array<chunksize=(512, 512), meta=np.ndarray>
rtoa_8 (lat, lon) float32 14MB dask.array<chunksize=(512, 512), meta=np.ndarray>
Attributes:
Conventions: CF-1.4
product_type: C2RCC_OLCI
start_date: 04-JUL-2018 09:21:55.677316
stop_date: 04-JUL-2018 09:23:18.811790%%time
target_ds.rtoa_8.plot(vmax=0.25)
CPU times: user 12.4 s, sys: 3.08 s, total: 15.4 s Wall time: 2.03 s
<matplotlib.collections.QuadMesh at 0x725a74f0a850>
Reprojection¶
We can also switch to a different coordinate reference system, for example UTM zone 33N, also known as EPSG:32633.
Since we need the bounding box and the target resolution in UTM coordinates, we first transform the source grid mapping. This produces an irregular grid mapping with 2D UTM coordinates.
%%time
temp_utm_target_gm = source_gm.transform("EPSG:32633")
temp_utm_target_gm
CPU times: user 338 ms, sys: 6.62 ms, total: 345 ms Wall time: 176 ms
class: Coords2DGridMapping
- is_regular: False
- is_j_axis_up: False
- is_lon_360: True
- crs: EPSG:32633
- xy_res: (264.534, 285.506) estimated
- xy_bbox: (355331, 6120157, 779093, 6722808)
- ij_bbox: (0, 0, 1189, 1890)
- xy_dim_names: ('x', 'y')
- xy_var_names: ('transformed_x', 'transformed_y')
- size: (1189, 1890)
- tile_size: (512, 512)
In the next cell, we convert the irregular grid into a regular grid:
%%time
utm_target_gm = temp_utm_target_gm.to_regular()
utm_target_gm
CPU times: user 92 μs, sys: 0 ns, total: 92 μs Wall time: 95.1 μs
class: RegularGridMapping
- is_regular: True
- is_j_axis_up: False
- is_lon_360: False
- crs: EPSG:32633
- xy_res: (264.534, 285.506)
- xy_bbox: (355330.733, 6120157.247, 779114.267, 6722860.753)
- ij_bbox: (0, 0, 1602, 2111)
- xy_dim_names: ('x', 'y')
- xy_var_names: ('x', 'y')
- size: (1602, 2111)
- tile_size: (512, 512)
%%time
x_coords, y_coords = utm_target_gm.xy_coords
y_coords
CPU times: user 3.1 ms, sys: 23 μs, total: 3.12 ms Wall time: 3.13 ms
<xarray.DataArray 'xy_coords' (y: 2111, x: 1602)> Size: 27MB dask.array<getitem, shape=(2111, 1602), dtype=float64, chunksize=(512, 512), chunktype=numpy.ndarray> Dimensions without coordinates: y, x
Next, we apply resample_in_space to resample the dataset onto the newly defined target grid.
%%time
transformed_target_ds = resample_in_space(
source_ds, source_gm=source_gm, target_gm=utm_target_gm, interp_methods=0
)
transformed_target_ds
CPU times: user 2.41 s, sys: 82.7 ms, total: 2.49 s Wall time: 1.14 s
<xarray.Dataset> Size: 54MB
Dimensions: (x: 1602, y: 2111)
Coordinates:
* x (x) float64 13kB 3.555e+05 3.557e+05 ... 7.787e+05 7.79e+05
* y (y) float64 17kB 6.723e+06 6.722e+06 ... 6.121e+06 6.12e+06
spatial_ref int64 8B 0
Data variables:
quality_flags (y, x) uint32 14MB dask.array<chunksize=(512, 512), meta=np.ndarray>
rtoa_3 (y, x) float32 14MB dask.array<chunksize=(512, 512), meta=np.ndarray>
rtoa_6 (y, x) float32 14MB dask.array<chunksize=(512, 512), meta=np.ndarray>
rtoa_8 (y, x) float32 14MB dask.array<chunksize=(512, 512), meta=np.ndarray>
Attributes:
Conventions: CF-1.4
product_type: C2RCC_OLCI
start_date: 04-JUL-2018 09:21:55.677316
stop_date: 04-JUL-2018 09:23:18.811790%%time
transformed_target_ds.rtoa_8.plot(vmax=0.25)
CPU times: user 8.91 s, sys: 957 ms, total: 9.87 s Wall time: 2.22 s
<matplotlib.collections.QuadMesh at 0x725a6385f610>
The next examples show how to we can apply up- and down-sampling.
Downsampling¶
%%time
downsampled_target_gm = target_gm.scale(0.25, tile_size=128)
downsampled_target_gm
CPU times: user 99 μs, sys: 14 μs, total: 113 μs Wall time: 116 μs
class: RegularGridMapping
- is_regular: True
- is_j_axis_up: False
- is_lon_360: False
- crs: EPSG:4326
- xy_res: (0.01788528, 0.0102094)
- xy_bbox: (12.6920653, 55.1989038, 20.0071448, 60.640514)
- ij_bbox: (0, 0, 409, 533)
- xy_dim_names: ('lon', 'lat')
- xy_var_names: ('lon', 'lat')
- size: (409, 533)
- tile_size: (128, 128)
%%time
x_coords, y_coords = downsampled_target_gm.xy_coords
y_coords
CPU times: user 3.27 ms, sys: 41 μs, total: 3.31 ms Wall time: 3.33 ms
<xarray.DataArray 'xy_coords' (lat: 533, lon: 409)> Size: 2MB dask.array<getitem, shape=(533, 409), dtype=float64, chunksize=(128, 128), chunktype=numpy.ndarray> Dimensions without coordinates: lat, lon
%%time
downsampled_ds = resample_in_space(
target_ds, source_gm=target_gm, target_gm=downsampled_target_gm
)
downsampled_ds
CPU times: user 200 ms, sys: 4.24 ms, total: 204 ms Wall time: 199 ms
<xarray.Dataset> Size: 3MB
Dimensions: (lat: 533, lon: 409)
Coordinates:
* lat (lat) float64 4kB 60.64 60.63 60.61 60.6 ... 55.22 55.21 55.2
* lon (lon) float64 3kB 12.7 12.72 12.74 12.75 ... 19.96 19.98 20.0
spatial_ref int64 8B 0
Data variables:
quality_flags (lat, lon) uint32 872kB dask.array<chunksize=(128, 128), meta=np.ndarray>
rtoa_3 (lat, lon) float32 872kB dask.array<chunksize=(128, 128), meta=np.ndarray>
rtoa_6 (lat, lon) float32 872kB dask.array<chunksize=(128, 128), meta=np.ndarray>
rtoa_8 (lat, lon) float32 872kB dask.array<chunksize=(128, 128), meta=np.ndarray>
Attributes:
Conventions: CF-1.4
product_type: C2RCC_OLCI
start_date: 04-JUL-2018 09:21:55.677316
stop_date: 04-JUL-2018 09:23:18.811790%%time
downsampled_ds.rtoa_6.plot(vmax=0.25)
/home/konstantin/micromamba/envs/xcube-multistore/lib/python3.13/site-packages/xcube_resampling/coarsen.py:103: RuntimeWarning: Mean of empty slice return nan_reducer(block, axis)
CPU times: user 11.6 s, sys: 1.68 s, total: 13.3 s Wall time: 2.19 s
<matplotlib.collections.QuadMesh at 0x725a63369f90>
Upsampling¶
%%time
upsampled_target_gm = target_gm.scale(4, tile_size=1024)
upsampled_target_gm
CPU times: user 101 μs, sys: 15 μs, total: 116 μs Wall time: 118 μs
class: RegularGridMapping
- is_regular: True
- is_j_axis_up: False
- is_lon_360: False
- crs: EPSG:4326
- xy_res: (0.00111783, 0.0006380875)
- xy_bbox: (12.6920653, 55.1989038, 20.0071448, 60.640514)
- ij_bbox: (0, 0, 6544, 8528)
- xy_dim_names: ('lon', 'lat')
- xy_var_names: ('lon', 'lat')
- size: (6544, 8528)
- tile_size: (1024, 1024)
%%time
x_coords, y_coords = upsampled_target_gm.xy_coords
y_coords
CPU times: user 4.71 ms, sys: 0 ns, total: 4.71 ms Wall time: 4.69 ms
<xarray.DataArray 'xy_coords' (lat: 8528, lon: 6544)> Size: 446MB dask.array<getitem, shape=(8528, 6544), dtype=float64, chunksize=(1024, 1024), chunktype=numpy.ndarray> Dimensions without coordinates: lat, lon
%%time
upsampled_target_ds = resample_in_space(
target_ds, source_gm=target_gm, target_gm=upsampled_target_gm
)
upsampled_target_ds
CPU times: user 91.5 ms, sys: 1.57 ms, total: 93.1 ms Wall time: 88.6 ms
<xarray.Dataset> Size: 893MB
Dimensions: (lat: 8528, lon: 6544)
Coordinates:
* lat (lat) float64 68kB 60.64 60.64 60.64 60.64 ... 55.2 55.2 55.2
* lon (lon) float64 52kB 12.69 12.69 12.69 ... 20.0 20.01 20.01
spatial_ref int64 8B 0
Data variables:
quality_flags (lat, lon) uint32 223MB dask.array<chunksize=(1024, 1024), meta=np.ndarray>
rtoa_3 (lat, lon) float32 223MB dask.array<chunksize=(1024, 1024), meta=np.ndarray>
rtoa_6 (lat, lon) float32 223MB dask.array<chunksize=(1024, 1024), meta=np.ndarray>
rtoa_8 (lat, lon) float32 223MB dask.array<chunksize=(1024, 1024), meta=np.ndarray>
Attributes:
Conventions: CF-1.4
product_type: C2RCC_OLCI
start_date: 04-JUL-2018 09:21:55.677316
stop_date: 04-JUL-2018 09:23:18.811790%%time
upsampled_target_ds.rtoa_6[::4, ::4].plot(vmax=0.25)
CPU times: user 17.9 s, sys: 1.62 s, total: 19.6 s Wall time: 2.13 s
<matplotlib.collections.QuadMesh at 0x725a6330b890>