How to Select and Subset Data#

This guide covers various methods for extracting subsets of unstructured grid data based on spatial criteria.

Coordinate-Based Selection#

Select data at specific coordinates using .ugrid.sel():

import xugrid as xu

uda = xu.data.elevation_nl()

# Select at a single x coordinate (returns cross-section)
cross_section = uda.ugrid.sel(x=155_000)

# Select at a single y coordinate
cross_section = uda.ugrid.sel(y=463_000)

# Select a single point (returns scalar)
value = uda.ugrid.sel(x=155_000, y=463_000)

Point Extraction#

Extract values at multiple specific points:

import numpy as np

# Define points of interest
x_points = np.array([155_000, 160_000, 165_000])
y_points = np.array([463_000, 465_000, 467_000])

# Extract values at those points
values = uda.ugrid.sel_points(x=x_points, y=y_points)

Control behavior for points outside the grid:

# Raise error for out-of-bounds points (default)
values = uda.ugrid.sel_points(x=x_points, y=y_points, out_of_bounds="raise")

# Return NaN for out-of-bounds points
values = uda.ugrid.sel_points(x=x_points, y=y_points, out_of_bounds="warn")

# Silently return NaN
values = uda.ugrid.sel_points(x=x_points, y=y_points, out_of_bounds="ignore")

# Drop out-of-bounds points from result
values = uda.ugrid.sel_points(x=x_points, y=y_points, out_of_bounds="drop")

Bounding Box Selection#

Clip data to a rectangular region:

# Define bounding box (xmin, ymin, xmax, ymax)
subset = uda.ugrid.clip_box(
    xmin=100_000,
    ymin=450_000,
    xmax=200_000,
    ymax=500_000
)

The result includes all faces that intersect the bounding box.

Line Intersection#

Extract data along a line (cross-section):

from shapely.geometry import LineString

# Define a line
line = LineString([(100_000, 450_000), (200_000, 500_000)])

# Get intersection
section = uda.ugrid.intersect_line(
    start=line.coords[0],
    end=line.coords[1]
)

For complex paths with multiple segments:

# Multi-segment line
linestring = LineString([
    (100_000, 450_000),
    (150_000, 475_000),
    (200_000, 500_000)
])

section = uda.ugrid.intersect_linestring(linestring)

Selection with Polygon Mask#

Use vector geometries to mask data:

from shapely.geometry import box, Polygon
import geopandas as gpd

# Create a polygon
polygon = box(100_000, 450_000, 200_000, 500_000)

# Or load from file
gdf = gpd.read_file("region.gpkg")
polygon = gdf.geometry.iloc[0]

# Burn polygon to grid (creates mask)
mask = xu.burn_vector_geometry(polygon, uda)

# Apply mask
subset = uda.where(mask)

Boolean Indexing#

Use boolean arrays for custom selection:

# Select faces where elevation > 0
above_sea_level = uda.where(uda > 0)

# Select faces within distance of a point
grid = uda.ugrid.grid
center_x, center_y = 155_000, 463_000
distance = np.sqrt(
    (grid.face_x - center_x)**2 +
    (grid.face_y - center_y)**2
)
nearby = uda.where(distance < 10_000)

Temporal and Non-Spatial Selection#

For multi-dimensional data, use standard xarray indexing for non-spatial dimensions:

# Load time-varying data
uds = xu.open_dataset("model_output.nc")
temp = uds["temperature"]

# Select specific time
temp_t0 = temp.isel(time=0)

# Select time range
temp_range = temp.sel(time=slice("2020-01", "2020-06"))

# Combine spatial and temporal selection
subset = temp.sel(time="2020-03-15").ugrid.clip_box(
    xmin=100_000, ymin=450_000,
    xmax=200_000, ymax=500_000
)

Selecting by Grid Topology#

Select based on grid connectivity:

grid = uda.ugrid.grid

# Get face indices that touch a specific node
node_idx = 100
connected_faces = grid.node_face_connectivity[node_idx]
connected_faces = connected_faces[connected_faces != grid.fill_value]

# Select those faces
subset = uda.isel({grid.face_dimension: connected_faces})

Getting Indices Instead of Data#

Sometimes you need indices for further processing:

# Get face indices within bounding box
grid = uda.ugrid.grid
face_x = grid.face_x
face_y = grid.face_y

in_box = (
    (face_x >= 100_000) & (face_x <= 200_000) &
    (face_y >= 450_000) & (face_y <= 500_000)
)
indices = np.where(in_box)[0]

# Use indices for multiple arrays
subset_a = uda_a.isel({grid.face_dimension: indices})
subset_b = uda_b.isel({grid.face_dimension: indices})