core.interpolate

class core.interpolate.Index(values)[source]

Bases: object

get_indexer(coords)[source]
class core.interpolate.Interpolator(data: Dataset, **kwargs)[source]

Bases: BlockProcessor

Interpolate/select a Dataset onto new coordinates.

This class is similar to xr.interp and xr.sel, but:
  • Supports dask-based coordinate inputs without triggering immediate computation, as xr.interp does

  • Supports combinations of selection and interpolation. This is faster and more memory efficient than performing the selection and the interpolation independently.

  • Supports pointwise indexing/interpolation using dask arrays (see https://docs.xarray.dev/en/latest/user-guide/indexing.html#more-advanced-indexing)

  • Supports per-dimension options (nearest neighbour selection, linear/spline interpolation, out-of-bounds behaviour, cyclic dimensions…)

  • Works with full Datasets, allowing interpolation of multiple variables at once.

The interp function is a convenience wrapper around this class for single DataArrays.

Parameters:
  • data (xr.Dataset) – The input Dataset containing variables to interpolate.

  • **kwargs

    definition of the selection/interpolation coordinates for each dimension, using the following classes:

    • Linear: linear interpolation (like xr.DataArray.interp)

    • Nearest: nearest neighbour selection (like xr.DataArray.sel)

    • Index: integer index selection (like xr.DataArray.isel)

    These classes store the coordinate data in their .values attribute and have a .get_indexer method which returns an indexer for the passed coordinates.

Example

>>> interpolator = Interpolator(
...     # input Dataset with variables ozone and wind, dimensions (lat, lon)
...     data,
...     # perform linear interpolation along dimension `lat`
...     # using variable "latitude" in coords_dataset
...     # clip out-of-bounds values to the axis min/max.
...     lat = Linear("latitude", bounds='clip'),
...     # perform nearest neighbour selection along
...     # dimension `lon`, using variable "longitude"
...     # in coords
...     lon = Nearest("longitude"),
... )
>>> result = interpolator.map_blocks(coords_dataset)  # coords_dataset has lat and lon variables
... # returns a Dataset with variables "ozone" and "wind", interpolated over the
... # coordinates lat and lon provided in coords_dataset
auto_template() bool[source]

Returns whether “automatic templating” should be activated.

Automatic templating calls process_block on mocked-up (empty) data to determine the types and dimensions of the output variables.

auto_template is disabled by default.

check(ds: Dataset)[source]

Validate the input dataset for this processor.

This method can be overridden by subclasses to perform custom validation on the input dataset before processing. By default, it does nothing.

Parameters:

ds (xr.Dataset) – The input dataset to validate.

created_dims() Dict[str, Any][source]

Describes new dimensions to be created by this processor.

Each dimension is defined either by:
  • An int: specifying the dimension size

  • An array-like: providing coordinates for the dimension

Returns:

Dictionary mapping dimension names to their size (int) or coordinates (array-like)

Return type:

Dict[str, int or array-like]

Examples

>>> def created_dims(self):
...     return {'z': 10, 'wavelength': [400, 500, 600, 700]}
created_vars() List[Var][source]

Variables that this processor will create.

Provide Var definitions: name, dtype, and dims or dims_like (plus, optionally, flags or other attributes). Declare any new dims in created_dims(). If dtype is not specified, or if neither dims nor dims_like is specified, the missing information is inferred by running process_block on mock data.

Use dims_like to specify that the dimensions should match those of an existing variable in the dataset (e.g., dims_like='input_var').

Example

>>> def created_vars(self):
...     return [Var('sum', 'float64', ('x', 'y'))]

input_vars() List[Var][source]

Input variables required by this processor.

Example

>>> def input_vars(self):
...     return [Var('a'), Var('b'), Var('flags')]

process_block(block: Dataset)[source]

Interpolate the data over the coordinates provided in block.

For example, block may be an xr.Dataset with the variables latitude (x, y) and longitude (x, y).

Returns interpolated_data, interpolated over the coordinates provided in block. interpolated_data is of the same type as data.

class core.interpolate.Linear(var: str | DataArray | None, bounds: Literal['error', 'nan', 'clip', 'cycle'] = 'error', spacing: Literal['regular', 'irregular', 'auto'] | Callable[[float], float] = 'auto', period: float | None = None)[source]

Bases: object

get_indexer(coords: DataArray)[source]
class core.interpolate.Linear_Indexer(coords: ndarray[tuple[Any, ...], dtype[_ScalarT]], bounds: str, spacing, period=None)[source]

Bases: object

class core.interpolate.Locator(coords: ndarray[tuple[Any, ...], dtype[_ScalarT]], bounds: str)[source]

Bases: object

The purpose of these classes is to locate values in coordinate axes.

handle_oob(values: ndarray)[source]

Handle out-of-bounds values.

Note: when bounds == 'cycle', this method does nothing.
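
The out-of-bounds modes that appear in the signatures of this module ('error', 'nan', 'clip', 'cycle') can be illustrated with a small pure-Python sketch. The function name handle_oob_sketch and its scalar interface are hypothetical; the actual method operates on arrays and its exact behaviour may differ. As in the note above, 'cycle' leaves the value untouched here.

```python
import math


def handle_oob_sketch(value, vmin, vmax, bounds="error"):
    """Hypothetical scalar illustration of out-of-bounds handling."""
    if bounds == "cycle" or vmin <= value <= vmax:
        # In range, or cyclic axis: nothing to do at this stage.
        return value
    if bounds == "clip":
        # Clamp to the axis minimum/maximum.
        return min(max(value, vmin), vmax)
    if bounds == "nan":
        # Mark the value as missing.
        return math.nan
    if bounds == "error":
        raise ValueError(f"value {value} is outside [{vmin}, {vmax}]")
    raise ValueError(f"unknown bounds mode: {bounds!r}")
```

For instance, handle_oob_sketch(12, 0, 10, "clip") clamps the value to the axis maximum, while the default "error" mode raises ValueError.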

locate_index(values)[source]
locate_index_weight(values)[source]

Find the indices and normalized distances (dist) of values in self.coords, for linear and spline interpolation.

Returns a list of indices, dist (float between 0 and 1) and oob.

class core.interpolate.Locator_Regular(coords, bounds: str, inversion_func: Callable | None = None, period=None)[source]

Bases: Locator

locate_index(values)[source]
locate_index_weight(values)[source]

Optimized version of locate_index_weight for regular grid coords
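
On a regularly spaced axis, the interval index can be computed arithmetically instead of by searching, which is presumably what makes this version faster. The following is a minimal sketch of that idea, not the class's actual code: locate_regular is a hypothetical name, and it assumes evenly spaced coords with at least two points and in-range values.

```python
import math


def locate_regular(coords, value):
    """Return (index, dist) such that value lies between coords[index]
    and coords[index + 1], with dist the normalized position in [0, 1]."""
    start = coords[0]
    step = coords[1] - coords[0]  # constant spacing (assumption)
    pos = (value - start) / step  # fractional position along the axis
    # Keep the last point inside the final interval (dist == 1.0).
    index = min(int(math.floor(pos)), len(coords) - 2)
    dist = pos - index
    return index, dist
```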

class core.interpolate.Nearest(var: str | DataArray | None, tolerance: float | None = None, spacing: Literal['auto'] | Callable[[float], float] = 'auto')[source]

Bases: object

get_indexer(coords: DataArray)[source]
class core.interpolate.Nearest_Indexer(coords: ndarray[tuple[Any, ...], dtype[_ScalarT]], tolerance: float | None, spacing: str | Callable = 'auto')[source]

Bases: object

class core.interpolate.Spline(var: str | DataArray | None, tension=0.5, bounds: Literal['error', 'nan', 'clip'] = 'error', spacing: Literal['regular', 'irregular', 'auto'] | Callable[[float], float] = 'auto')[source]

Bases: object

get_indexer(coords: DataArray)[source]
class core.interpolate.Spline_Indexer(coords: ndarray[tuple[Any, ...], dtype[_ScalarT]], bounds: str, spacing, tension: float)[source]

Bases: object

core.interpolate.align_lists(lists: list[list]) list[source]

Align items from several lists into a single list that respects the order in each sublist. Raises ValueError if the lists cannot be aligned (e.g., due to cycles).

Parameters:

lists – List of lists, where each sublist defines a partial order.

Returns:

A single list with items aligned according to the partial orders.

Example

align_lists([['x'], ['x', 'y'], ['y', 'z']]) -> ['x', 'y', 'z']
align_lists([['x', 'y'], ['y', 'x']]) -> ValueError
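
One way to obtain this behaviour is a topological sort over the "comes before" relations implied by each sublist. The sketch below uses Kahn's algorithm; align_lists_sketch is a hypothetical name, and the library's own implementation may differ.

```python
def align_lists_sketch(lists):
    """Merge lists into one order compatible with every sublist."""
    order = []       # items in order of first appearance
    successors = {}  # item -> set of items that must come after it
    indegree = {}    # item -> number of distinct direct predecessors
    for sub in lists:
        for item in sub:
            if item not in indegree:
                indegree[item] = 0
                successors[item] = set()
                order.append(item)
        for before, after in zip(sub, sub[1:]):
            if after not in successors[before]:
                successors[before].add(after)
                indegree[after] += 1

    result = []
    ready = [item for item in order if indegree[item] == 0]
    while ready:
        item = ready.pop(0)
        result.append(item)
        # Visit successors in first-appearance order for determinism.
        for nxt in order:
            if nxt in successors[item]:
                indegree[nxt] -= 1
                if indegree[nxt] == 0:
                    ready.append(nxt)
    if len(result) != len(order):
        # Some items were never released: the constraints form a cycle.
        raise ValueError("lists cannot be aligned (cyclic ordering)")
    return result
```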

core.interpolate.broadcast_numpy(ds: Dataset, out_dims: list) Dict[source]

Returns all data variables in ds as numpy arrays broadcastable against each other, with dimensions ordered as in out_dims (new single-element dimensions are added where needed).

This requires the input to be broadcast to common dimensions.

core.interpolate.broadcast_shapes(ds: Dataset, dims: list) Dict[source]

For each data variable in ds, returns the shape for broadcasting in the dimensions defined by dims
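
As an illustration of what a "shape for broadcasting" can look like, the sketch below works on plain dictionaries rather than an xr.Dataset. It assumes that a dimension absent from a variable is given size 1; both this assumption and the name broadcast_shapes_sketch are hypothetical and not taken from the library.

```python
def broadcast_shapes_sketch(var_sizes, dims):
    """var_sizes maps each variable name to a {dim name: size} dict.

    Returns {variable name: shape tuple aligned with dims}, using
    size 1 for every dimension the variable does not have.
    """
    return {
        name: tuple(sizes.get(dim, 1) for dim in dims)
        for name, sizes in var_sizes.items()
    }
```

With var_sizes = {'a': {'x': 2, 'y': 3}, 'b': {'y': 3}} and dims = ['x', 'y', 'z'], variable 'a' gets shape (2, 3, 1) and 'b' gets (1, 3, 1), so the two can be broadcast against each other.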

core.interpolate.create_locator(coords, bounds: str, spacing, period: float | None = None) Locator[source]

Locator factory.

The purpose of this function is to instantiate the appropriate “Locator” class.

The arguments are passed from the indexers.

core.interpolate.determine_output_dimensions(data: DataArray, mapping: dict, coordinates: Dataset) list[source]

Determine output dimensions for an interpolated/selected DataArray.

This function implements NumPy’s advanced indexing rules to determine the final dimension ordering of the output DataArray after interpolation/selection operations.

The key principle is that when advanced indexing is applied to some dimensions of an array, those dimensions are replaced by the dimensions of the indexing arrays, and these new dimensions are inserted at the position of the first indexed dimension.

Parameters:
  • data (xr.DataArray) – The input DataArray being interpolated/selected.

  • mapping (dict) – A dictionary mapping dimension names in data (keys) to variable names in coordinates (values). Only dimensions being interpolated/selected should be included.

  • coordinates (xr.Dataset) – The Dataset containing coordinate variables referenced in mapping.

Returns:

A list of dimension names for the output DataArray, in the order they will appear.

Return type:

list
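
The principle stated above can be sketched on dimension names alone. This is a simplified illustration, not the function's code: output_dims_sketch is a hypothetical name, and it assumes that all indexing arrays share the same dimensions and that these do not overlap with the non-indexed dimensions of the data.

```python
def output_dims_sketch(data_dims, indexed_dims, indexer_dims):
    """Replace the indexed dims by the indexer dims, inserted at the
    position of the first indexed dimension."""
    out = []
    inserted = False
    for dim in data_dims:
        if dim in indexed_dims:
            if not inserted:
                # First indexed dimension: the indexer dims go here.
                out.extend(indexer_dims)
                inserted = True
            # Later indexed dimensions simply disappear.
        else:
            out.append(dim)
    return out
```

For data with dimensions (a, b, c) indexed along a and b by arrays of dimensions (x, y), the sketch returns ['x', 'y', 'c'].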

core.interpolate.find_indices(grid, xi) tuple[ndarray, ndarray][source]

Multi-dimensional grid interpolation preprocessing.

Parameters:
  • grid – Tuple of 1D arrays defining grid coordinates for each dimension

  • xi – 2D array where each row represents a dimension and each column a query point

Returns:

  • indices – Grid interval indices for each query point in each dimension

  • distances – Normalized distances within each interval

Return type:

tuple[ndarray, ndarray]
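
The preprocessing described here can be sketched in pure Python with a binary search per dimension. The names find_interval and find_indices_sketch are hypothetical, the sketch uses lists instead of arrays, and it assumes ascending axes with at least two points and in-range query values.

```python
from bisect import bisect_right


def find_interval(axis, x):
    """Return (i, t) with axis[i] <= x <= axis[i + 1] and t the
    normalized distance of x within that interval."""
    i = bisect_right(axis, x) - 1
    i = max(0, min(i, len(axis) - 2))  # keep the last point in the last interval
    t = (x - axis[i]) / (axis[i + 1] - axis[i])
    return i, t


def find_indices_sketch(grid, xi):
    """grid: one ascending 1D sequence per dimension.
    xi: one row per dimension, one column per query point."""
    indices, distances = [], []
    for axis, row in zip(grid, xi):
        pairs = [find_interval(axis, x) for x in row]
        indices.append([i for i, _ in pairs])
        distances.append([t for _, t in pairs])
    return indices, distances
```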

core.interpolate.interp(da: DataArray, **kwargs) DataArray[source]

Interpolate/select a DataArray onto new coordinates.

This function is similar to xr.interp and xr.sel, but:
  • Supports dask-based coordinate inputs without triggering immediate computation, as xr.interp does

  • Supports combinations of selection and interpolation. This is faster and more memory efficient than performing the selection and the interpolation independently.

  • Supports pointwise indexing/interpolation using dask arrays (see https://docs.xarray.dev/en/latest/user-guide/indexing.html#more-advanced-indexing)

  • Supports per-dimension options (nearest neighbour selection, linear/spline interpolation, out-of-bounds behaviour, cyclic dimensions…)

Parameters:
  • da (xr.DataArray) – The input DataArray

  • **kwargs

    definition of the selection/interpolation coordinates for each dimension, using the following classes:

    • Linear: linear interpolation (like xr.DataArray.interp)

    • Nearest: nearest neighbour selection (like xr.DataArray.sel)

    • Index: integer index selection (like xr.DataArray.isel)

    These classes store the coordinate data in their .values attribute and have a .get_indexer method which returns an indexer for the passed coordinates.

Example

>>> interp(
...     data,  # input DataArray with dimensions (a, b, c)
...     a = Linear(           # perform linear interpolation along dimension `a`
...          a_values,        # `a_values` is a DataArray with dimension (x, y);
...          bounds='clip'),  # clip out-of-bounds values to the axis min/max.
...     b = Nearest(b_values), # perform nearest neighbour selection along
...                            # dimension `b`; `b_values` is a DataArray
...                            # with dimension (x, y)
... ) # returns a DataArray with dimensions (x, y, c)
No interpolation or selection is performed along dimension `c`, so it is
left as-is.
Returns:

DataArray on the new coordinates.

Return type:

xr.DataArray

core.interpolate.interp_var(da: DataArray, indices_weights: dict, w_shape: dict, current_out_dims: list, current_dims: list, shared_dims_info: dict | None = None) ndarray[tuple[Any, ...], dtype[Any]][source]

Interpolate a single variable da, with indices and weights

shared_dims_info maps dim names that are shared between da and the coordinate variables to their position in the index arrays. These dims need element-wise range indexing instead of slice(None).

core.interpolate.product_dict(**kwargs) Iterable[Dict][source]

Cartesian product of a dictionary of lists
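
A common way to build such a product uses itertools.product. The sketch below is an illustration under the hypothetical name product_dict_sketch and is not necessarily how the function is implemented.

```python
from itertools import product


def product_dict_sketch(**kwargs):
    """Yield one dict per combination of the values in each list."""
    keys = list(kwargs)
    for combo in product(*kwargs.values()):
        yield dict(zip(keys, combo))
```

For example, list(product_dict_sketch(a=[1, 2], b=[3, 4])) yields four dictionaries, starting with {'a': 1, 'b': 3} and ending with {'a': 2, 'b': 4}.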