API reference
earthcarekit.utils.xarray_utils
Utilities based on xarray.
concat_datasets
Concatenate two xarray.Dataset objects along a specified dimension, padding other dimensions to match.
Pads all non-concatenation dimensions in both datasets to the maximum size among them (if they differ) before concatenating. Integer variables are padded with -9999 or the data type-specific minimum value (e.g., -128 for int8); non-integer variables are padded with NaN.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| ds1 | Dataset | The first dataset to concatenate. | required |
| ds2 | Dataset | The second dataset to concatenate. | required |
| dim | str | The name of the dimension to concatenate along. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| Dataset | Dataset | A new dataset resulting from the concatenation. |
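The padding rule described for `concat_datasets` can be illustrated on bare NumPy arrays. This is a minimal sketch under stated assumptions, not the library's implementation: `pad_fill_value` and `pad_and_concat` are hypothetical helpers, and the rule for choosing between -9999 and the type minimum (use -9999 whenever it is representable in the integer type) is an assumption.

```python
import numpy as np


def pad_fill_value(dtype):
    """Pick a fill value for padding (hypothetical helper).

    Assumption: integers use -9999 when it is representable, otherwise the
    minimum of the type (e.g. -128 for int8); other types use NaN.
    """
    dtype = np.dtype(dtype)
    if np.issubdtype(dtype, np.integer):
        info = np.iinfo(dtype)
        return -9999 if info.min <= -9999 else info.min
    return np.nan


def pad_and_concat(a, b, axis=0):
    """Pad all axes except `axis` to a common size, then concatenate."""
    # Target size per axis is the larger of the two input sizes.
    target = [max(sa, sb) for sa, sb in zip(a.shape, b.shape)]

    def pad(arr):
        # Pad only at the end of each non-concatenation axis.
        widths = [
            (0, 0) if ax == axis else (0, target[ax] - size)
            for ax, size in enumerate(arr.shape)
        ]
        return np.pad(arr, widths, mode="constant",
                      constant_values=pad_fill_value(arr.dtype))

    return np.concatenate([pad(a), pad(b)], axis=axis)


a = np.ones((2, 3))  # float data with 3 columns
b = np.ones((1, 5))  # float data with 5 columns
out = pad_and_concat(a, b, axis=0)
print(out.shape)
```

Here the shorter array is padded with NaN along the second axis before both are stacked along the first, which mirrors the behaviour described for non-concatenation dimensions.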
Source code in earthcarekit/utils/xarray_utils/_concat.py
convert_scalar_var_to_str
Converts a given scalar variable inside an xarray.Dataset to a string.
Source code in earthcarekit/utils/xarray_utils/_scalars.py
filter_index
filter_index(
ds: Dataset,
index: int | slice | NDArray | Sequence,
along_track_dim: str = ALONG_TRACK_DIM,
trim_index_offset_var: str = "trim_index_offset",
pad_idxs: int = 0,
) -> Dataset
Filters a dataset given an along-track index number, list/array or range/slice.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| ds | Dataset | Input dataset with along-track dimension. | required |
| index | int \| slice \| NDArray \| Sequence | Index(es) to filter. | required |
| along_track_dim | str | Dimension along which to apply filtering. Defaults to ALONG_TRACK_DIM. | ALONG_TRACK_DIM |
| pad_idxs | int | Number of additional samples added at both sides of the selection. This input is ignored when | 0 |
Returns:
| Name | Type | Description |
|---|---|---|
| Dataset | Dataset | Filtered dataset. |
Examples:
>>> import earthcarekit as eck
>>> fp = "ECA_EXBC_CPR_FMR_2A_20260108T030403Z_20260108T042349Z_09167F.h5"
>>> with eck.read_product(fp) as ds:
>>>     ds_filtered = eck.filter_index(ds, 123)
>>>     print(ds_filtered.sizes)
Frozen({'along_track': 1, 'vertical': 218})
>>>     ds_filtered = eck.filter_index(ds, slice(0, 1000))
>>>     print(ds_filtered.sizes)
Frozen({'along_track': 1000, 'vertical': 218})
>>>     ds_filtered = eck.filter_index(ds, (0, 1000))
>>>     print(ds_filtered.sizes)
Frozen({'along_track': 2, 'vertical': 218})
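The difference between an integer, a slice and a sequence of indices shown in the examples can be sketched in plain Python. This is an illustrative sketch only: `resolve_index` is a hypothetical helper and not part of earthcarekit, it assumes slices with a positive step, and it leaves explicit index sequences unpadded because the source text does not say how padding applies to them.

```python
def resolve_index(n, index, pad_idxs=0):
    """Return the positions selected along a dimension of length n."""
    if isinstance(index, int):
        # A single position becomes a one-element range.
        if index < 0:
            index += n
        index = slice(index, index + 1)
    if isinstance(index, slice):
        start, stop, step = index.indices(n)
        # Widen the range on both sides, clipped to the valid extent.
        start = max(0, start - pad_idxs)
        stop = min(n, stop + pad_idxs)
        return list(range(start, stop, step))
    # Any other sequence is read as a list of individual positions.
    return [int(i) for i in index]


print(len(resolve_index(5000, 123)))             # one sample
print(len(resolve_index(5000, slice(0, 1000))))  # a contiguous range
print(len(resolve_index(5000, (0, 1000))))       # two individual samples
```

Note that a tuple such as `(0, 1000)` selects two separate samples rather than a range, which matches the sizes printed in the examples.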
Source code in earthcarekit/utils/xarray_utils/_filter_index.py
filter_latitude
filter_latitude(
ds: Dataset,
lat_range: NumericPairNoneLike,
start_before_pole: bool = True,
end_before_pole: bool = True,
only_center: bool = False,
lat_var: str = TRACK_LAT_VAR,
along_track_dim: str = ALONG_TRACK_DIM,
trim_index_offset_var: str = "trim_index_offset",
pad_idxs: int = 0,
shift_idxs: int = 0,
) -> Dataset
Filters a dataset to include only points within a specified latitude range.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| ds | Dataset | Input dataset with geolocation data. | required |
| lat_range | NumericPairNoneLike | A pair of latitude values (min_lat, max_lat) defining the selection range. | required |
| start_before_pole | bool | If True, selection starts before the pole when the track crosses one. Defaults to True. | True |
| end_before_pole | bool | If True, selection ends before the pole when the track crosses one. Defaults to True. | True |
| only_center | bool | If True, only the sample at the center index of the selection is returned. Defaults to False. | False |
| lat_var | str | Name of the latitude variable. Defaults to TRACK_LAT_VAR. | TRACK_LAT_VAR |
| along_track_dim | str | Dimension along which to apply filtering. Defaults to ALONG_TRACK_DIM. | ALONG_TRACK_DIM |
| pad_idxs | int | Number of additional samples added at both sides of the selection. Defaults to 0. | 0 |
| shift_idxs | int | Number of samples by which to shift the selection. Defaults to 0. | 0 |
Raises:
| Type | Description |
|---|---|
| ValueError | If selection is empty. |

Returns:
| Type | Description |
|---|---|
| Dataset | Filtered dataset containing only points within the specified latitude range. |
Examples:
>>> import earthcarekit as eck
>>> fp = "ECA_EXBC_CPR_FMR_2A_20260108T030403Z_20260108T042349Z_09167F.h5"
>>> with eck.read_product(fp) as ds:
>>>     print(ds.latitude.values)
[-22.50316844 -22.51202978 -22.52089178 ... -67.48243216 -67.49074691 -67.49906148]
>>>     ds_filtered = eck.filter_latitude(ds, (-40, -30))
>>>     print(ds_filtered.latitude.values)
[-30.0036885 -30.01258957 -30.02149091 ... -39.98112826 -39.98962597 -39.99812425]
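How `lat_range`, `pad_idxs` and `shift_idxs` might interact can be sketched on a plain list of latitudes. This is a simplified sketch, not the library's implementation: `latitude_slice` is a hypothetical helper, it ignores pole crossings (`start_before_pole`, `end_before_pole`), and treating `None` as an open bound is an assumption suggested only by the type name `NumericPairNoneLike`.

```python
def latitude_slice(lats, lat_range, pad_idxs=0, shift_idxs=0):
    """Return a slice covering the samples inside lat_range."""
    lat_min, lat_max = lat_range
    # Assumption: None means "no bound on this side".
    lat_min = -90.0 if lat_min is None else lat_min
    lat_max = 90.0 if lat_max is None else lat_max

    inside = [i for i, lat in enumerate(lats) if lat_min <= lat <= lat_max]
    if not inside:
        raise ValueError("selection is empty")

    # Take the run from the first to the last matching sample, widen it
    # by pad_idxs on both sides and move it by shift_idxs.
    start = inside[0] - pad_idxs + shift_idxs
    stop = inside[-1] + 1 + pad_idxs + shift_idxs
    return slice(max(0, start), min(len(lats), stop))


# A synthetic southbound track from -20 to -69 degrees in 1 degree steps.
lats = [-20.0 - i for i in range(50)]
sel = latitude_slice(lats, (-40, -30))
print(lats[sel][0], lats[sel][-1])
```

Padding widens the selection symmetrically, whereas shifting moves both ends in the same direction, so the two parameters can be combined.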
Source code in earthcarekit/utils/xarray_utils/_filter_latitude.py
filter_radius
filter_radius(
ds: Dataset,
radius_km: float = 100.0,
center_lat: float | None = None,
center_lon: float | None = None,
site: GroundSite | str | None = None,
lat_var: str = TRACK_LAT_VAR,
lon_var: str = TRACK_LON_VAR,
along_track_dim: str = ALONG_TRACK_DIM,
method: Literal["geodesic", "haversine"] = "geodesic",
closest: bool = False,
trim_index_offset_var: str = "trim_index_offset",
pad_idxs: int = 0,
shift_idxs: int = 0,
) -> Dataset
Filters a dataset to include only points within a specified radius of a geographic location.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| ds | Dataset | Input dataset with geolocation data. | required |
| radius_km | float | Radius (in kilometers) around the center location. | 100.0 |
| site | GroundSite or str | GroundSite object or name from which center location will be retrieved, alternatively | None |
| center_lat | float | Latitude of the center point, alternatively | None |
| center_lon | float | Longitude of the center point, alternatively | None |
| lat_var | str | Name of the latitude variable. Defaults to TRACK_LAT_VAR. | TRACK_LAT_VAR |
| lon_var | str | Name of the longitude variable. Defaults to TRACK_LON_VAR. | TRACK_LON_VAR |
| along_track_dim | str | Dimension along which to apply filtering. Defaults to ALONG_TRACK_DIM. | ALONG_TRACK_DIM |
| method | Literal['geodesic', 'haversine'] | Distance calculation method. Defaults to "geodesic". | 'geodesic' |
| closest | bool | If True, only the single closest sample is returned, otherwise all samples within the radius. Defaults to False. | False |
| trim_index_offset_var | str | Dataset variable that keeps track of index offsets caused by dataset trimming/filtering. Defaults to "trim_index_offset". | 'trim_index_offset' |
| pad_idxs | int | Number of additional samples added at both sides of the selection. Defaults to 0. | 0 |
| shift_idxs | int | Number of samples by which to shift the selection. Defaults to 0. | 0 |
Returns:
| Type | Description |
|---|---|
| Dataset | Filtered dataset containing only points within the specified radius. |

Raises:
| Type | Description |
|---|---|
| EmptyFilterResultError | If no data points are found within the radius. |
| ValueError | If the |
Examples:
>>> import earthcarekit as eck
>>> fp = "ECA_EXBB_ATL_EBD_2A_20240902T210023Z_20251107T142547Z_01508B.h5"
>>> with eck.read_product(fp) as ds:
>>>     print(ds.sizes)
Frozen({'along_track': 5143, 'vertical': 242, 'layer': 25, 'n_state': 351})
>>>     ds_filtered = eck.filter_radius(ds, site="dushanbe")
>>>     print(ds_filtered.sizes)
Frozen({'along_track': 197, 'vertical': 242, 'layer': 25, 'n_state': 351})
>>>     ds_filtered = eck.filter_radius(ds, site="dushanbe", radius_km=200)
>>>     print(ds_filtered.sizes)
Frozen({'along_track': 399, 'vertical': 242, 'layer': 25, 'n_state': 351})
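For readers unfamiliar with the `"haversine"` option, the following sketch shows a radius selection based on the standard haversine great-circle distance. It is not taken from the library: `haversine_km` and `indices_within_radius` are hypothetical helpers, the mean Earth radius of 6371.0088 km is an assumption, `ValueError` stands in for `EmptyFilterResultError`, and choosing the closest sample only among those inside the radius is also an assumption.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; an assumption of this sketch


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    # Guard against rounding slightly above 1 for near-antipodal points.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def indices_within_radius(lats, lons, center_lat, center_lon,
                          radius_km=100.0, closest=False):
    """Return indices of samples within radius_km of the center point."""
    dists = [haversine_km(lat, lon, center_lat, center_lon)
             for lat, lon in zip(lats, lons)]
    inside = [i for i, d in enumerate(dists) if d <= radius_km]
    if not inside:
        # Stand-in for the library's EmptyFilterResultError.
        raise ValueError("no samples within radius")
    if closest:
        # Keep only the sample with the smallest distance.
        return [min(inside, key=dists.__getitem__)]
    return inside


# A synthetic track along the equator, one sample per degree of longitude.
lats = [0.0] * 10
lons = [float(i) for i in range(10)]
print(indices_within_radius(lats, lons, 0.0, 4.2, radius_km=150.0))
```

The haversine formula treats the Earth as a sphere, so its results can differ slightly from an ellipsoidal geodesic distance.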
Source code in earthcarekit/utils/xarray_utils/_filter_radius.py
filter_time
filter_time(
ds: Dataset,
time_range: TimeRangeLike | Iterable | None = None,
timestamp: TimestampLike | None = None,
only_center: bool = False,
time_var: str = TIME_VAR,
along_track_dim: str = ALONG_TRACK_DIM,
trim_index_offset_var: str = "trim_index_offset",
pad_idxs: int = 0,
shift_idxs: int = 0,
) -> Dataset
Filters an xarray Dataset to include only samples within a given time range.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| ds | Dataset | The input dataset containing a time coordinate. | required |
| time_range | TimeRangeLike \| Iterable \| None | Start and end time of the range to filter, as strings or pandas timestamps. Defaults to None. | None |
| timestamp | TimestampLike \| None | A single timestamp; the sample closest to it is returned. Defaults to None. | None |
| only_center | bool | If True, only the sample at the center index of the selection is returned. Defaults to False. | False |
| time_var | str | Name of the time variable in | TIME_VAR |
| along_track_dim | str | Dimension name along which time is defined. Defaults to ALONG_TRACK_DIM. | ALONG_TRACK_DIM |
| pad_idxs | int | Number of additional samples added at both sides of the selection. Defaults to 0. | 0 |
| shift_idxs | int | Number of samples by which to shift the selection. Defaults to 0. | 0 |
Returns:
| Type | Description |
|---|---|
| Dataset | Subset of |
Examples:
>>> import earthcarekit as eck
>>> fp = "ECA_EXBC_CPR_FMR_2A_20260108T030403Z_20260108T042349Z_09167F.h5"
>>> with eck.read_product(fp) as ds:
>>>     print(ds.time.values[[0, -1]])
['2026-01-08T03:04:08.393852288' '2026-01-08T03:15:57.401298304']
>>>     ds_filtered = eck.filter_time(ds, time_range=("2026-01-08 03:10", "2026-01-08 03:12"))
>>>     print(ds_filtered.time.values[[0, -1]])
['2026-01-08T03:10:00.115605248' '2026-01-08T03:11:59.985651712']
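The two selection modes, a time range and a single timestamp, can be sketched with the standard library. This is an illustrative sketch rather than the library's code: `select_by_time` is a hypothetical helper, and three points are assumptions: `timestamp` takes precedence over `time_range`, `None` acts as an open bound, and the "center" of a selection is its middle element.

```python
from datetime import datetime, timedelta


def select_by_time(times, time_range=None, timestamp=None, only_center=False):
    """Return indices of samples matching a time range or a timestamp."""
    if timestamp is not None:
        # Assumption: a timestamp wins over a range; pick the closest sample.
        return [min(range(len(times)), key=lambda i: abs(times[i] - timestamp))]
    if time_range is None:
        return list(range(len(times)))

    start, end = time_range
    # Assumption: None on either side leaves that side of the range open.
    selected = [
        i for i, t in enumerate(times)
        if (start is None or t >= start) and (end is None or t <= end)
    ]
    if only_center and selected:
        # Assumption: the center is the middle element of the selection.
        selected = [selected[len(selected) // 2]]
    return selected


# Ten synthetic samples, one per second.
t0 = datetime(2026, 1, 8, 3, 4, 8)
times = [t0 + timedelta(seconds=i) for i in range(10)]
window = (t0 + timedelta(seconds=2), t0 + timedelta(seconds=4))
print(select_by_time(times, time_range=window))
```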
Source code in earthcarekit/utils/xarray_utils/_filter_time.py
insert_var
insert_var(
ds: Dataset,
var: str,
data: Any,
index: int | None = None,
before_var: str | None = None,
after_var: str | None = None,
) -> Dataset
Inserts a new variable into an xarray.Dataset before or after a given variable or at a given index.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| ds | Dataset | The original dataset to which the variable will be added. | required |
| var | str | Name of the new variable to be added. | required |
| data | Any | Data stored in the new variable. | required |
| index | int \| None | Index at which the new variable will be added. Will be ignored when either | None |
| before_var | str \| None | Name of the variable before which the new variable should be inserted. Defaults to None. | None |
| after_var | str \| None | Name of the variable after which the new variable should be inserted. Will be ignored when | None |
|
Returns:
| Name | Type | Description |
|---|---|---|
| Dataset | Dataset | The original dataset with the new variable inserted. |
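The positioning logic can be sketched on an ordinary dictionary, which preserves insertion order just as a dataset keeps its variables in order. This is a sketch under assumptions, not the library's implementation: `insert_key` is a hypothetical helper, and the precedence used here (`before` over `after` over `index`) is a guess, since the parameter descriptions above are cut off at that point.

```python
def insert_key(mapping, key, value, index=None, before=None, after=None):
    """Return a new dict with (key, value) inserted at the chosen position."""
    names = list(mapping)
    # Assumed precedence: before, then after, then index, else append.
    if before is not None:
        pos = names.index(before)
    elif after is not None:
        pos = names.index(after) + 1
    elif index is not None:
        pos = index
    else:
        pos = len(names)

    items = list(mapping.items())
    items.insert(pos, (key, value))
    return dict(items)


variables = {"time": 1, "latitude": 2, "longitude": 3}
print(list(insert_key(variables, "height", 0, after="time")))
```

The input mapping is left unchanged and a new one is returned, so the original order stays available to the caller.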
Source code in earthcarekit/utils/xarray_utils/_insert_var.py
merge_datasets
Merges two datasets while keeping all global attributes from one dataset.
Source code in earthcarekit/utils/xarray_utils/_merge.py
remove_dims
Drops a list of dimensions and all associated variables and coordinates from a given xarray.Dataset.