dataset
The public PyTorch Dataset consumed by downstream user code. For the
narrative overview see Dataset API.
MOOSEDataset: PyTorch Dataset over processed Zarr simulation stores.
Supports three representation modes for different PhysicsNeMo model families:
"graph"        GNN / MeshGraphNet
─────────────────────────────────────────────────────────────────────────
coords         float32 [N, D]       node spatial coordinates
edge_index     int64   [2, M]       COO edge list (src / dst)
node_fields    float32 [T, N, F]    per-node fields (interpolated from elements)
elem_fields    float32 [T, E, F]    per-element fields (raw)
probe_data     dict                 probe_name → float32 [Np, C]

"point_cloud"  PointNet / Transformer
─────────────────────────────────────────────────────────────────────────
coords         float32 [N, D]       node spatial coordinates
node_fields    float32 [T, N, F]    per-node fields

"grid"         CNN (U-Net, FNO)
─────────────────────────────────────────────────────────────────────────
grid_x         float32 [Nx]             column x-coordinates
grid_y         float32 [Ny]             row y-coordinates
grid_fields    float32 [T, Nx, Ny, F]   fields on regular grid
All modes also include:
field_names    list[str]        field name for each F index
norm_stats     dict             field_name → {"mean": float, "std": float}
sim_name       str              unique simulation identifier
time_steps     float32 [T]
If time_idx is given (≥ 0), only that time step is returned (T-dim removed).
Denormalization
───────────────
Call dataset.denormalize("pressure", tensor) to recover a tensor in
original physical units.
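As a rough illustration of what denormalization involves, the sketch below assumes z-score normalization, that is, that each field was stored as (x - mean) / std using the values in norm_stats. The mean/std entries suggest this scheme, but this page does not confirm it. The helper denormalize_sketch and the sample statistics are hypothetical and are not the library's code.

```python
import torch


def denormalize_sketch(field_name, tensor, norm_stats):
    """Undo an assumed z-score normalization: x = x_norm * std + mean."""
    stats = norm_stats[field_name]
    return tensor * stats["std"] + stats["mean"]


# Hypothetical statistics in the documented norm_stats layout.
norm_stats = {"pressure": {"mean": 101325.0, "std": 250.0}}

normalized = torch.tensor([-1.0, 0.0, 2.0])
physical = denormalize_sketch("pressure", normalized, norm_stats)
```

With the real dataset, the documented call dataset.denormalize("pressure", tensor) takes the place of the helper.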
- class dataset.moose_dataset.MOOSEDataset(zarr_dir, mode='graph', time_idx=-1)
Bases: Dataset
Dataset over a directory of processed MOOSE Zarr stores.
- Parameters:
- MODES = ('graph', 'point_cloud', 'grid')
- dataset.moose_dataset.to_tensor(arr)
Convert a zarr array to a float32 torch tensor.
- Return type:
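One plausible way to perform this conversion is sketched below. It assumes the input can be materialized with np.asarray, with a NumPy array standing in for a zarr array. The name to_tensor_sketch is hypothetical, and the actual implementation may differ.

```python
import numpy as np
import torch


def to_tensor_sketch(arr):
    """Materialize an array-like in memory and cast it to float32."""
    # np.asarray reads the data into a NumPy array; zarr arrays are
    # assumed to support this conversion.
    return torch.as_tensor(np.asarray(arr), dtype=torch.float32)


# A float64 NumPy array stands in for a zarr array here.
raw = np.arange(6, dtype=np.float64).reshape(2, 3)
t = to_tensor_sketch(raw)
```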
- dataset.moose_dataset.load_fields(fields_grp, field_names)
Load and stack element fields from a zarr group → [T, E, F].
- Return type:
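The stacking step can be pictured as follows. This sketch assumes each field is stored as a separate [T, E] array addressed by name, and it uses a plain dict in place of a zarr group. Both the storage layout and the name load_fields_sketch are assumptions made for illustration.

```python
import numpy as np
import torch


def load_fields_sketch(fields_grp, field_names):
    """Stack per-field [T, E] arrays along a new last axis -> [T, E, F]."""
    per_field = [
        torch.as_tensor(np.asarray(fields_grp[name]), dtype=torch.float32)
        for name in field_names
    ]
    # The order of field_names fixes the meaning of each F index.
    return torch.stack(per_field, dim=-1)


T, E = 4, 5
# A dict of NumPy arrays stands in for the zarr group (hypothetical data).
fields_grp = {"temperature": np.zeros((T, E)), "pressure": np.ones((T, E))}
stacked = load_fields_sketch(fields_grp, ["temperature", "pressure"])
```

Because the F axis follows the order of field_names, the same list can be used to look up which index holds which field.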
- dataset.moose_dataset.slice_time(tensor, time_idx)
If time_idx >= 0, select that time step and remove the T dimension.
- Return type:
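The described behavior can be sketched in a few lines, assuming T is the leading dimension, as in the [T, N, F] and [T, Nx, Ny, F] shapes listed earlier. The name slice_time_sketch is hypothetical.

```python
import torch


def slice_time_sketch(tensor, time_idx):
    """Select one time step if time_idx >= 0, else return all steps."""
    if time_idx >= 0:
        # Integer indexing drops the leading T dimension.
        return tensor[time_idx]
    return tensor


# [T=2, N=4, F=3] tensor of per-node fields (made-up values).
node_fields = torch.arange(24, dtype=torch.float32).reshape(2, 4, 3)
one_step = slice_time_sketch(node_fields, 1)
all_steps = slice_time_sketch(node_fields, -1)
```

The default time_idx=-1 in the MOOSEDataset signature therefore corresponds to keeping the full time axis.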
- dataset.moose_dataset.elem_to_node(elem_fields, connectivity, n_nodes)
Average element fields onto nodes (scatter mean over element→node map).
elem_fields  : [..., E, F] (leading dims may include T)
connectivity : [E, K] 0-indexed node indices
n_nodes      : N
Returns node_fields : [..., N, F]
- Return type:
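A scatter mean of this kind can be written with index_add_, as in the sketch below. It assumes an unweighted average, a fixed number K of nodes per element, and a value of zero for nodes that no element touches. None of these details are confirmed by this page, and elem_to_node_sketch is a hypothetical name.

```python
import torch


def elem_to_node_sketch(elem_fields, connectivity, n_nodes):
    """Average element values onto the nodes each element touches."""
    *lead, E, F = elem_fields.shape
    K = connectivity.shape[1]
    # Flatten the element->node map: entry e*K + k is node k of element e.
    flat_nodes = connectivity.reshape(-1)                      # [E*K]
    # Repeat each element's value once per node it touches.
    contrib = (
        elem_fields.unsqueeze(-2)
        .expand(*lead, E, K, F)
        .reshape(*lead, E * K, F)
    )
    # Sum the contributions arriving at each node.
    sums = torch.zeros(*lead, n_nodes, F, dtype=elem_fields.dtype)
    sums.index_add_(sums.dim() - 2, flat_nodes, contrib)
    # Count how many elements touch each node.
    counts = torch.zeros(n_nodes, dtype=elem_fields.dtype)
    counts.index_add_(0, flat_nodes, torch.ones(E * K, dtype=elem_fields.dtype))
    # Clamp so untouched nodes divide by 1 and stay at zero.
    return sums / counts.clamp(min=1).unsqueeze(-1)


# Two triangles sharing the edge between nodes 1 and 2.
connectivity = torch.tensor([[0, 1, 2], [1, 2, 3]])
elem_fields = torch.tensor([[[1.0], [3.0]]])                   # [T=1, E=2, F=1]
node_fields = elem_to_node_sketch(elem_fields, connectivity, n_nodes=4)
```

In this example nodes 1 and 2 belong to both triangles and receive the mean of the two element values, while nodes 0 and 3 each inherit the value of their single element.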