|
1 | | -# Hermes workflow toolkit |
| 1 | +# Hermes Workflow Toolkit |
2 | 2 |
|
3 | | -The `hermesWorkflowToolkit` is the base class for simulation toolkits that support workflow pipelines: |
4 | | - |
5 | | -| Feature | Description | |
6 | | -|---------|-------------| |
7 | | -| Workflow groups | Organize simulations into named groups | |
8 | | -| Chained steps | Chain multiple simulation steps (preprocessing → solving → post-processing) | |
9 | | -| Template support | Use templates for reproducible case setup | |
| 3 | +The `hermesWorkflowToolkit` is the database-backed workflow orchestrator that wraps the [Hermes](https://github.com/KaplanOpenSource/hermes) library. It manages the full lifecycle of simulation workflows: creating them from JSON, storing them in MongoDB with auto-generated names, building them into Luigi task DAGs, executing them, and comparing parameter variations. |
10 | 4 |
|
11 | 5 | `OFToolkit` inherits from `hermesWorkflowToolkit`, gaining workflow support automatically. |
| 6 | + |
| 7 | +--- |
| 8 | + |
| 9 | +## Architecture overview |
| 10 | + |
| 11 | +### Relationship to Hermes |
| 12 | + |
| 13 | +The toolkit is a **wrapper** around the Hermes workflow engine: |
| 14 | + |
| 15 | +| Layer | Class | Role | |
| 16 | +|-------|-------|------| |
| 17 | +| **Hera toolkit** | `hermesWorkflowToolkit` | MongoDB storage, naming, retrieval, comparison, execution orchestration | |
| 18 | +| **Hermes workflow** | `hermes.workflow` | JSON → task DAG construction, parameter extraction, node management | |
| 19 | +| **Hermes engine** | `hermes.engines.luigi.builder` | Task DAG → Luigi Python module code generation | |
| 20 | +| **Hermes wrapper** | `hermes.taskwrapper.wrapper` | Wraps each node into a TaskWrapper with dependencies, parameters, and executer resolution | |
| 21 | +| **Luigi** | `luigi.Task` | Dependency-based execution engine with target-file state tracking | |
| 22 | + |
| 23 | +### Class hierarchy |
| 24 | + |
| 25 | +``` |
| 26 | +abstractToolkit |
| 27 | + ↓ |
| 28 | +hermesWorkflowToolkit (hera/simulations/hermesWorkflowToolkit.py) |
| 29 | + ├─ OFToolkit (hera/simulations/openFoam/toolkit.py) |
| 30 | + │ └─ adds OFObjectHome, Analysis, Presentation, solver extensions |
| 31 | + └─ workflowToolkit [LSM variant] (hera/simulations/LSM/hermesWorkflowToolkit.py) |
| 32 | + └─ imports hermes handlers (handler_build, handler_expand, handler_execute) |
| 33 | +``` |
| 34 | + |
| 35 | +### Key constants |
| 36 | + |
| 37 | +```python |
| 38 | +DOCTYPE_WORKFLOW = "hermesWorkflow" # MongoDB document type |
| 39 | +DESC_GROUPNAME = "groupName" # desc field keys |
| 40 | +DESC_GROUPID = "groupID" |
| 41 | +DESC_WORKFLOWNAME = "workflowName" |
| 42 | +DESC_PARAMETERS = "parameters" |
| 43 | +``` |
| 44 | + |
| 45 | +--- |
| 46 | + |
| 47 | +## Workflow JSON structure |
| 48 | + |
| 49 | +A workflow is a directed acyclic graph (DAG) of **nodes**, where each node represents a simulation step (copy files, generate mesh, run solver, etc.): |
| 50 | + |
| 51 | +```json |
| 52 | +{ |
| 53 | + "workflow": { |
| 54 | + "solver": "simpleFoam", |
| 55 | + "root": "finalnode_xx", |
| 56 | + "nodeList": ["blockMesh", "decomposePar", "simpleFoam", "reconstructPar"], |
| 57 | + "nodes": { |
| 58 | + "blockMesh": { |
| 59 | + "type": "openFOAM.mesh.BlockMesh", |
| 60 | + "Execution": { |
| 61 | + "input_parameters": { |
| 62 | + "vertices": [[0,0,0], [100,0,0], ...], |
| 63 | + "cellCount": [50, 50, 20] |
| 64 | + } |
| 65 | + }, |
| 66 | + "requires": [] |
| 67 | + }, |
| 68 | + "simpleFoam": { |
| 69 | + "type": "openFOAM.system.ControlDict", |
| 70 | + "Execution": { |
| 71 | + "input_parameters": { |
| 72 | + "executeDir": "{blockMesh.output}", |
| 73 | + "endTime": 1000 |
| 74 | + } |
| 75 | + }, |
| 76 | + "requires": ["blockMesh", "decomposePar"] |
| 77 | + } |
| 78 | + } |
| 79 | + } |
| 80 | +} |
| 81 | +``` |
| 82 | + |
| 83 | +### Parameter referencing |
| 84 | + |
| 85 | +Nodes reference outputs from other nodes using curly-brace syntax: |
| 86 | + |
| 87 | +| Pattern | Meaning | |
| 88 | +|---------|---------| |
| 89 | +| `{nodeName.param}` | Output parameter from another node | |
| 90 | +| `{workflow.param}` | Workflow-level parameter | |
| 91 | +| `{WebGui.formData.field}` | GUI form data (Hermes workbench) | |
| 92 | + |
| 93 | +These references are resolved by the `TaskWrapper` to build the dependency graph automatically. |
| 94 | + |
| 95 | +--- |
| 96 | + |
| 97 | +## Workflow lifecycle |
| 98 | + |
| 99 | +### 1. Create: load workflow from JSON |
| 100 | + |
| 101 | +```python |
| 102 | +# Dynamic class resolution based on solver field. |
| 103 | +# If solver is null → hermes.workflow (generic) |
| 104 | +# If solver is "simpleFoam" → hera.simulations.openFoam.OFWorkflow.workflow_simpleFoam |
| 105 | +wf = toolkit.getHermesWorkflowFromJSON( |
| 106 | + workflow="path/to/workflow.json", # or dict or JSON string |
| 107 | + name="my_workflow", |
| 108 | + resource="/path/to/case", |
| 109 | +) |
| 110 | +``` |
| 111 | + |
| 112 | +Internally, `pydoc.locate()` dynamically imports the workflow class based on the `solver` field. This allows solver-specific workflow subclasses to add custom node types and validation. |
| 113 | + |
| 114 | +### 2. Add to database |
| 115 | + |
| 116 | +```python |
| 117 | +doc = toolkit.addWorkflowToGroup( |
| 118 | + workflowJSON="path/to/workflow.json", |
| 119 | + groupName="dispersion", |
| 120 | + writeWorkflowToFile=False, |
| 121 | + resource=None, # auto-generated if None |
| 122 | +) |
| 123 | +# Creates document: dispersion_0001 (auto-incremented) |
| 124 | +``` |
| 125 | + |
| 126 | +**Idempotent add:** The method first queries the DB by parameter content. If an identical workflow already exists, it returns the existing document instead of creating a duplicate. |
| 127 | + |
| 128 | +**What gets stored:** |
| 129 | +```python |
| 130 | +doc = self.addSimulationsDocument( |
| 131 | + resource=resource, |
| 132 | + dataFormat=datatypes.STRING, |
| 133 | + type="hermesWorkflow", |
| 134 | + desc={ |
| 135 | + "groupName": groupName, |
| 136 | + "groupID": groupID, |
| 137 | + "workflowName": workflowName, # e.g. "dispersion_0001" |
| 138 | + "solver": hermesWF.solver, |
| 139 | + "workflow": hermesWF.json, # full workflow JSON |
| 140 | + "parameters": hermesWF.parametersJSON, # flattened for querying |
| 141 | + } |
| 142 | +) |
| 143 | +``` |
| 144 | + |
| 145 | +The `parameters` are stored separately from the full `workflow` JSON to enable efficient MongoDB queries via `dictToMongoQuery()` without parsing the workflow tree. |
| 146 | + |
| 147 | +### 3. Build: generate Luigi tasks |
| 148 | + |
| 149 | +```python |
| 150 | +# Inside executeWorkflowFromDB(): |
| 151 | +build = hermesWF.build(buildername=workflow.BUILDER_LUIGI) |
| 152 | +``` |
| 153 | + |
| 154 | +The build process: |
| 155 | + |
| 156 | +1. **`hermes.workflow._buildNetwork()`** traverses the JSON and creates `TaskWrapper` objects for each node |
| 157 | +2. Each `TaskWrapper` extracts its dependencies by scanning `{node.param}` references in input parameters |
| 158 | +3. **`LuigiBuilder.buildWorkflow()`** converts the task graph into a Python module containing Luigi Task classes |
| 159 | +4. The generated module has one class per task (e.g. `blockMesh_0`, `simpleFoam_0`) with: |
| 160 | + - `requires()` → returns upstream task instances |
| 161 | + - `output()` → returns `luigi.LocalTarget` for state tracking |
| 162 | + - `run()` → calls the Hermes executer for that node type |
| 163 | + |
| 164 | +### 4. Execute: run Luigi pipeline |
| 165 | + |
| 166 | +```python |
| 167 | +toolkit.executeWorkflowFromDB("dispersion_0001") |
| 168 | +``` |
| 169 | + |
| 170 | +Execution steps: |
| 171 | + |
| 172 | +1. Retrieve workflow document from DB |
| 173 | +2. Reconstruct `hermes.workflow` object from stored JSON |
| 174 | +3. Build Luigi task DAG (generates Python module) |
| 175 | +4. Write workflow JSON + Python module to `filesDirectory` |
| 176 | +5. **Clean previous targets** — remove `{name}_targetFiles/` to force re-execution |
| 177 | +6. **Run Luigi** via command line: `python3 -m luigi --module {name} finalnode_xx_0 --local-scheduler` |
| 178 | +7. Clean up generated Python module (JSON stays) |
| 179 | + |
| 180 | +The `finalnode_xx` is a synthetic terminal node that depends on all actual nodes — executing it triggers the entire DAG. |
| 181 | + |
| 182 | +### 5. Compare and retrieve |
| 183 | + |
| 184 | +**Cascading search** — `getWorkflowDocumentFromDB(input)` tries four strategies in order: |
| 185 | + |
| 186 | +1. **By name** — exact match on `workflowName` (e.g. `"dispersion_0001"`) |
| 187 | +2. **By resource** — match by file/directory path |
| 188 | +3. **By group name** — returns all workflows in the group (e.g. `"dispersion"`) |
| 189 | +4. **By JSON content** — parses input as JSON, extracts parameters, queries by flattened parameter values via `dictToMongoQuery()` |
| 190 | + |
| 191 | +**Comparison:** |
| 192 | + |
| 193 | +```python |
| 194 | +# Compare all workflows in a group |
| 195 | +diff = toolkit.compareWorkflowInGroup("dispersion") |
| 196 | +# Returns DataFrame: rows = differing parameters, columns = workflow names |
| 197 | + |
| 198 | +# Compare specific workflows |
| 199 | +diff = toolkit.compareWorkflows(["dispersion_0001", "dispersion_0002"]) |
| 200 | + |
| 201 | +# List all workflows in a group |
| 202 | +toolkit.listWorkflows("dispersion", listNodes=True, listParameters=True) |
| 203 | +``` |
| 204 | + |
| 205 | +--- |
| 206 | + |
| 207 | +## MongoDB storage |
| 208 | + |
| 209 | +### Document schema |
| 210 | + |
| 211 | +```json |
| 212 | +{ |
| 213 | + "_id": "ObjectId", |
| 214 | + "_cls": "Metadata.Simulations", |
| 215 | + "projectName": "MY_PROJECT", |
| 216 | + "type": "hermesWorkflow", |
| 217 | + "resource": "/path/to/dispersion_0001", |
| 218 | + "dataFormat": "string", |
| 219 | + "desc": { |
| 220 | + "groupName": "dispersion", |
| 221 | + "groupID": 1, |
| 222 | + "workflowName": "dispersion_0001", |
| 223 | + "solver": "simpleFoam", |
| 224 | + "workflow": { "workflow": { "nodeList": [...], "nodes": {...} } }, |
| 225 | + "parameters": { |
| 226 | + "blockMesh": { "vertices": [...], "cellCount": [...] }, |
| 227 | + "simpleFoam": { "endTime": 1000 } |
| 228 | + } |
| 229 | + } |
| 230 | +} |
| 231 | +``` |
| 232 | + |
| 233 | +### Querying |
| 234 | + |
| 235 | +All queries filter by `type="hermesWorkflow"`. Parameters are queried using `dictToMongoQuery()` which flattens nested dicts into MongoEngine's double-underscore notation: |
| 236 | + |
| 237 | +```python |
| 238 | +# {"parameters": {"simpleFoam": {"endTime": 1000}}} |
| 239 | +# → {"parameters__simpleFoam__endTime": 1000} |
| 240 | +``` |
| 241 | + |
| 242 | +--- |
| 243 | + |
| 244 | +## Group naming convention |
| 245 | + |
| 246 | +**Format:** `<groupName>_<zero-padded-4-digit-id>` |
| 247 | + |
| 248 | +```python |
| 249 | +# Examples: dispersion_0001, flow_0042, lsm_0003 |
| 250 | +formatted_number = "{0:04d}".format(flowID) |
| 251 | +return f"{baseName}_{formatted_number}" |
| 252 | +``` |
| 253 | + |
| 254 | +**Counter mechanism:** Each group has its own counter stored in the project config via `getCounterAndAdd(groupName)`. The counter auto-increments on each `addWorkflowToGroup()` call. |
| 255 | + |
| 256 | +**Parsing names:** |
| 257 | +```python |
| 258 | +base, id = toolkit.splitWorkflowName("dispersion_0001") |
| 259 | +# base = "dispersion", id = "0001" |
| 260 | +``` |
| 261 | + |
| 262 | +--- |
| 263 | + |
| 264 | +## LSM variant |
| 265 | + |
| 266 | +**Source:** `hera/simulations/LSM/hermesWorkflowToolkit.py` |
| 267 | + |
| 268 | +The LSM `workflowToolkit` extends the base with additional handler imports for the expand → build → execute pipeline: |
| 269 | + |
| 270 | +| Handler | Purpose | |
| 271 | +|---------|---------| |
| 272 | +| `handler_expand` | Expand workflow templates with meteorological data | |
| 273 | +| `handler_build` | Build the workflow into executable tasks | |
| 274 | +| `handler_buildExecute` | Combined build + execute in one step | |
| 275 | +| `handler_execute` | Execute a pre-built workflow | |
| 276 | + |
| 277 | +This supports LSM's pattern where workflows are first **expanded** with site-specific meteorological data before being built and executed. The base toolkit only uses build → execute. |
| 278 | + |
| 279 | +--- |
| 280 | + |
| 281 | +## Key methods reference |
| 282 | + |
| 283 | +### Workflow creation |
| 284 | + |
| 285 | +| Method | Purpose | |
| 286 | +|--------|---------| |
| 287 | +| `getHermesWorkflowFromJSON(workflow, name, resource)` | Create workflow object from JSON (file/dict/string) | |
| 288 | +| `getHemresWorkflowFromDocument(documentList, returnFirst)` | Reconstruct workflow from DB document | |
| 289 | +| `updateDocumentWorkflow(document, workflow)` | Update stored JSON + parameters | |
| 290 | + |
| 291 | +### Workflow storage |
| 292 | + |
| 293 | +| Method | Purpose | |
| 294 | +|--------|---------| |
| 295 | +| `addWorkflowToGroup(workflowJSON, groupName, ...)` | Add workflow to DB with auto-naming | |
| 296 | +| `addWorkflowFileInGroup(workflowFilePath, write_file)` | Add from file, infer group from filename | |
| 297 | +| `findAvailableName(simulationGroup)` | Get next available ID + name | |
| 298 | +| `getworkFlowName(baseName, flowID)` | Format `<base>_<padded_id>` | |
| 299 | +| `splitWorkflowName(workflow_name)` | Parse `<base>_<id>` | |
| 300 | + |
| 301 | +### Workflow retrieval |
| 302 | + |
| 303 | +| Method | Purpose | |
| 304 | +|--------|---------| |
| 305 | +| `getHermesWorkflowFromDB(input, returnFirst)` | Retrieve as hermes.workflow object | |
| 306 | +| `getWorkflowDocumentFromDB(input, doctype, dockind)` | Retrieve raw document (cascading search) | |
| 307 | +| `getWorkflowDocumentByName(name, doctype, dockind)` | Retrieve by exact name | |
| 308 | +| `getWorkflowListDocumentFromDB(input)` | Retrieve as list of documents | |
| 309 | +| `getWorkflowDocumentsInGroup(groupName)` | All documents in a group | |
| 310 | +| `getWorkflowListOfSolvers(solverName)` | All documents for a solver | |
| 311 | + |
| 312 | +### Comparison and listing |
| 313 | + |
| 314 | +| Method | Purpose | |
| 315 | +|--------|---------| |
| 316 | +| `compareWorkflows(Workflow, longFormat, transpose)` | Compare workflows by name or list | |
| 317 | +| `compareWorkflowInGroup(workflowGroup)` | Compare all in a group | |
| 318 | +| `compareWorkflowObj(workflowList)` | Compare workflow objects directly | |
| 319 | +| `listWorkflows(workflowGroup, listNodes, listParameters)` | List workflows with optional details | |
| 320 | +| `listGroups(solver, workflowName)` | List all groups in the project | |
| 321 | +| `workflowTable(workflowGroup)` | Alias for `compareWorkflowInGroup` | |
| 322 | + |
| 323 | +### Execution and cleanup |
| 324 | + |
| 325 | +| Method | Purpose | |
| 326 | +|--------|---------| |
| 327 | +| `executeWorkflowFromDB(input)` | Build + execute via Luigi | |
| 328 | +| `deleteWorkflowInGroup(workflowGroup, deepDelete, resetCounter)` | Delete workflows, optionally remove files | |
| 329 | + |
| 330 | +### Templates |
| 331 | + |
| 332 | +| Method | Purpose | |
| 333 | +|--------|---------| |
| 334 | +| `listHermesSolverTemplates(solverName)` | List loaded templates for a solver | |
| 335 | +| `getHermesFlowTemplate(hermesFlowName)` | Get a flow template | |
| 336 | +| `listHermesNodesTemplates()` | List all node templates | |
| 337 | +| `getHermesNodeTemplate(hermesNodeName)` | Get a node template | |
| 338 | + |
| 339 | +--- |
| 340 | + |
| 341 | +## Cross-references |
| 342 | + |
| 343 | +| What | Where | |
| 344 | +|------|-------| |
| 345 | +| User guide | [Toolkits > Simulations > Hermes Workflows](../../toolkits/simulations/workflows.md) | |
| 346 | +| OpenFOAM toolkit (inherits from hermesWorkflowToolkit) | [Simulations > OpenFOAM](openfoam.md) | |
| 347 | +| LSM toolkit | [Simulations > LSM](lsm.md) | |
| 348 | +| Hermes library | `pyHermes/hermes/` (workflow, engines, taskwrapper) | |
| 349 | +| API reference | [API > Simulations](../api/simulations.md) | |
| 350 | +| Workflow examples | [Examples > Workflows](../../examples/workflows.md) | |
0 commit comments