Alpha-D Case
This page mirrors src/cases/alpha_d/README.md for the docs site. The
α_D surrogate predicts a per-station Darcy resistance coefficient along
a pipe contraction-expansion as a function of (Re, Dr, Lr, z). Two
model variants share the same ETL + feature pipeline:
MLP (train_mlp): a pointwise FullyConnected model that predicts one row at a time. HPO over ~10 hyperparameters is enabled by default.
Conv1D profile (train_conv1d): a 1D convolutional surrogate that consumes the full 50-station profile per case. No HPO by default.
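The difference between the two variants is mostly one of input shape. The sketch below illustrates it with NumPy, assuming rows are stored case by case and station by station, and assuming the channels-first (cases, features, stations) layout that 1D convolutions conventionally take. The helper name `to_profile_batch` and the sample data are hypothetical and are not taken from the case code.

```python
import numpy as np

N_STATIONS = 50  # stations per case, as stated above


def to_profile_batch(rows: np.ndarray, n_stations: int = N_STATIONS) -> np.ndarray:
    """Reshape a flat (n_cases * n_stations, n_features) table into
    (n_cases, n_features, n_stations)."""
    n_rows, n_features = rows.shape
    if n_rows % n_stations != 0:
        raise ValueError(f"{n_rows} rows is not a multiple of {n_stations} stations")
    n_cases = n_rows // n_stations
    # Assumption: rows are ordered case by case, then station by station.
    return rows.reshape(n_cases, n_stations, n_features).transpose(0, 2, 1)


# Hypothetical data: 3 cases, 4 features (for example Re, Dr, Lr, z).
flat = np.arange(3 * N_STATIONS * 4, dtype=float).reshape(3 * N_STATIONS, 4)
profiles = to_profile_batch(flat)

print(flat.shape)      # (150, 4): rows seen one at a time by the MLP
print(profiles.shape)  # (3, 4, 50): whole profiles seen by the Conv1D model
```

The pointwise model treats each of the 150 rows as an independent sample, while the profile model sees 3 samples whose stations stay in order along the last axis.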
Layout
flowchart LR
R["cases/alpha_d/<br/>README.md"] --> Cfg["configs/<br/>train_mlp · train_conv1d · etl · pycaret"]
R --> DS["datasets/<br/>profile.py (AlphaDProfileDataset)"]
R --> ETL["etl/<br/>source · transform · sink"]
R --> Phys["physics/<br/>baseline · targets"]
R --> Exp["experiment.py<br/>throat loss · decode + baseline plot hooks"]
R --> FD["feature_data.py<br/>ALLOWLIST · engineered features"]
R --> Met["metrics.py<br/>per-region MSE/RMSE · Δp eval"]
R --> Tr["transforms.py<br/>signed-log1p residual target"]
R --> Run["run_etl.py · train.py"]
Tree form (matches the on-disk listing):
cases/alpha_d/
├── configs/ # Hydra YAMLs (train_mlp, train_conv1d, etl, pycaret)
├── datasets/ # AlphaDProfileDataset + build_dataset entry point
├── etl/ # PhysicsNeMo Curator pipeline (source, transform, sink)
├── physics/ # baseline, targets — alpha_D encoding + analytical baseline
├── experiment.py # AlphaDExperiment — throat-weighted loss + decode/baseline plot hooks
├── feature_data.py # ALLOWLIST, GROUPED_FEATURES, engineered_features_spec
├── metrics.py # extended metrics (per-region MSE/RMSE, Δp evaluation)
├── transforms.py # alpha_d_residual_transform (target = signed-log1p residual)
├── run_etl.py # ETL entry point
├── train.py # discoverability wrapper around the shared trainer
└── README.md # source-of-truth, also rendered here
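The tree describes the target in transforms.py as a signed-log1p residual. The exact definition is not shown on this page, so the following is only a sketch of the usual form of that transform, sign(r) · log1p(|r|), under the assumption that the residual r is the raw α_D value minus the analytical baseline. The function names here are illustrative and are not the ones in transforms.py.

```python
import math


def signed_log1p(x: float) -> float:
    """Compress magnitude while keeping the sign: sign(x) * log(1 + |x|)."""
    return math.copysign(math.log1p(abs(x)), x)


def signed_expm1(y: float) -> float:
    """Inverse of signed_log1p: sign(y) * (exp(|y|) - 1)."""
    return math.copysign(math.expm1(abs(y)), y)


def encode_residual(alpha_d: float, baseline: float) -> float:
    # Assumption: the residual is "raw value minus analytical baseline".
    return signed_log1p(alpha_d - baseline)


def decode_residual(encoded: float, baseline: float) -> float:
    # Undo the transform, then add the baseline back.
    return baseline + signed_expm1(encoded)


# Hypothetical values: a large negative alpha_D against a small baseline.
encoded = encode_residual(-250.0, 3.0)
decoded = decode_residual(encoded, 3.0)
print(encoded, decoded)
```

A transform of this shape stays close to linear near zero and grows logarithmically for large magnitudes of either sign, which is the usual motivation for applying it to residuals that span several orders of magnitude.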
End-to-end (from src/)
1. ETL: MOOSE to per-case Zarr
python cases/alpha_d/run_etl.py \
etl.source.input_dir=../data/flow_contraction_expansion/parametric_study \
etl.sink.output_dir=../data/flow_contraction_expansion/parametric_study/processed
Writes one {case_name}.zarr per simulation, each with a 50-station
feature/target matrix plus per-case metadata. See the
Alpha-D Surrogate Tutorial for the Zarr
layout and feature reference.
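Because the sink writes one {case_name}.zarr per simulation, a quick way to confirm the ETL produced what you expect is to list the stores in the output directory. The sketch below uses only the standard library and assumes the stores sit directly under the output directory. `list_case_stores` is a hypothetical helper, and the demo directory names are made up.

```python
import tempfile
from pathlib import Path


def list_case_stores(output_dir):
    """Return {case_name: path} for every *.zarr entry directly under output_dir."""
    root = Path(output_dir)
    return {p.stem: p for p in sorted(root.glob("*.zarr"))}


# Demo on a throwaway directory with two fake stores and one unrelated file.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("case_002.zarr", "case_001.zarr"):
        (Path(tmp) / name).mkdir()
    (Path(tmp) / "notes.txt").write_text("not a store")
    case_names = list(list_case_stores(tmp))

print(case_names)  # ['case_001', 'case_002']
```

Comparing the number of stores found against the number of simulations in the input directory is a cheap sanity check before moving on to feature selection or training.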
2. PyCaret feature selection (required for MLP, skip for Conv1D)
python cases/alpha_d/run_feature_selection_pycaret.py
Reads the Zarr stores, runs PyCaret regression with the
ALLOWLIST-constrained candidate set, and writes
selected_features.txt.
MLP (train_mlp.yaml) pulls its input columns from data.input_columns_file: …/selected_features.txt, so this step must run first (or you must override data.input_columns=[…] and set data.input_columns_file=null from the CLI).
Conv1D (train_conv1d.yaml) hard-codes its input_columns list in the YAML and does not read input_columns_file, so the Conv1D path skips this step entirely.
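The two ways of supplying input columns can be pictured with a small resolver. This is a sketch only, not the trainer's actual logic: it assumes the file takes precedence whenever input_columns_file is set (which would explain why the override above also nulls it) and that selected_features.txt holds one feature name per line. `resolve_input_columns` and the sample feature names are hypothetical.

```python
import tempfile
from pathlib import Path
from typing import List, Optional, Sequence


def resolve_input_columns(
    input_columns: Optional[Sequence[str]],
    input_columns_file: Optional[str],
) -> List[str]:
    """Pick the model's input columns from a file or an explicit list."""
    if input_columns_file is not None:
        path = Path(input_columns_file)
        if not path.is_file():
            raise FileNotFoundError(
                f"{path} not found: run feature selection first, "
                "or pass an explicit column list and null the file"
            )
        # Assumed file format: one feature name per line, blank lines ignored.
        return [ln.strip() for ln in path.read_text().splitlines() if ln.strip()]
    if input_columns:
        return list(input_columns)
    raise ValueError("neither input_columns nor input_columns_file is set")


# Demo: the MLP-style path (file) and the Conv1D-style path (explicit list).
with tempfile.TemporaryDirectory() as tmp:
    features_file = Path(tmp) / "selected_features.txt"
    features_file.write_text("Re\nDr\n\nz\n")
    from_file = resolve_input_columns(None, str(features_file))

from_cli = resolve_input_columns(["Re", "z"], None)
print(from_file)  # ['Re', 'Dr', 'z']
print(from_cli)   # ['Re', 'z']
```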
3. Train
MLP with HPO (needs Step 2 output):
python train.py --config-path cases/alpha_d/configs --config-name train_mlp
MLP without HPO (needs Step 2 output):
python train.py --config-path cases/alpha_d/configs --config-name train_mlp hpo=null
Conv1D (does not need Step 2):
python train.py --config-path cases/alpha_d/configs --config-name train_conv1d
A discoverability wrapper, cases/alpha_d/train.py, is also provided. It
defaults to --config-name train_mlp for this case but is otherwise
equivalent to the top-level train.py; both honour an hpo block in the config:
python cases/alpha_d/train.py # MLP with HPO (default config)
python cases/alpha_d/train.py hpo=null # MLP, skip HPO
python cases/alpha_d/train.py --config-name train_conv1d # Conv1D
4. Evaluate
python evaluate.py --config-path cases/alpha_d/configs --config-name train_mlp
The run_meta.json written alongside the checkpoint lets the evaluation
reconstruct the exact dataset, split, and target_transform, so it
reproduces the training conditions.
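To inspect what a run recorded before evaluating it, the metadata file can be read directly. The sketch below assumes run_meta.json is a JSON object with top-level keys named dataset, split, and target_transform. Those key names are inferred from the sentence above rather than from the file schema, and `load_run_meta` and the demo values are hypothetical.

```python
import json
import tempfile
from pathlib import Path

# Assumed key names; check them against a real run_meta.json.
REQUIRED_KEYS = ("dataset", "split", "target_transform")


def load_run_meta(checkpoint_dir):
    """Load run_meta.json from a checkpoint directory and check the expected keys."""
    meta_path = Path(checkpoint_dir) / "run_meta.json"
    meta = json.loads(meta_path.read_text())
    missing = [key for key in REQUIRED_KEYS if key not in meta]
    if missing:
        raise KeyError(f"{meta_path} is missing keys: {missing}")
    return meta


# Demo with a throwaway checkpoint directory and made-up values.
with tempfile.TemporaryDirectory() as tmp:
    example = {
        "dataset": {"name": "alpha_d"},
        "split": {"seed": 0},
        "target_transform": "alpha_d_residual_transform",
    }
    (Path(tmp) / "run_meta.json").write_text(json.dumps(example))
    meta = load_run_meta(tmp)

print(sorted(meta))  # ['dataset', 'split', 'target_transform']
```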
Further reading
Alpha-D Surrogate Tutorial — full walkthrough with feature reference and config knobs.
Hyperparameter Optimization — HPO study layout and CLI overrides.
Case Distribution Analysis — pre-training data audit.