Feat/sof 7938#337
Conversation
|
Check out this pull request on See visual diffs & provide feedback on Jupyter Notebooks. Powered by ReviewNB |
|
Warning Review limit reached
More reviews will be available in 48 minutes and 46 seconds. Learn how PR review limits work. Your organization has run out of usage credits. Purchase more credits in the billing tab to continue. ⌛ How to resolve this issue?After more reviews become available, a review can be triggered using the To avoid repeated limits, reduce automatic review volume by pausing incremental auto-reviews earlier, using label-based review opt-in, excluding WIP or generated PR titles, or requesting reviews manually when the PR is ready. If your team needs uninterrupted high-volume reviews, an organization admin can enable usage-based credits. 🚦 How do rate limits work?CodeRabbit enforces per-developer PR review limits for each organization. Most developers receive the normal plan review availability. For paid Pro and Pro+ PR reviews, CodeRabbit uses adaptive limits for sustained high-volume activity. When a developer's recent PR review activity reaches the 95th percentile or higher among CodeRabbit users, additional reviews become available more gradually as earlier reviews age out of the rolling window. Please see our Fair Usage Limits Policy for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (3)
📝 WalkthroughWalkthroughAdds a new spin-resolved band-structure notebook, updates QE input patching utilities, extends the existing band-structure notebook to patch extra parameters, and broadens optional dependency sets. ChangesBand Structure Workflow Additions
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 3 | ❌ 2❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 4
🧹 Nitpick comments (2)
src/py/mat3ra/notebooks_utils/workflow.py (1)
41-43: 🩺 Stability & Availability | 🔵 Trivial | ⚡ Quick winValidate dict-based input items before regex patching.
If
contentis missing/non-string, Line 54 raises a generic runtime error. Add explicit type checks and raise a clearValueErrorfor malformed dict input items.Proposed hardening
for input_item in getattr(unit, "input", []): if isinstance(input_item, dict): template_name = input_item.get("name") content = input_item.get("content") + if template_name is None or not isinstance(content, str): + raise ValueError( + f"Malformed dict input item for unit '{unit_name}': expected keys " + "'name' and string 'content'." + ) else:Also applies to: 64-66
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/py/mat3ra/notebooks_utils/workflow.py` around lines 41 - 43, After extracting the "content" value from the dict-based input_item using input_item.get("content") on line 43, add explicit validation to check that content is not None and is a string type before it is used in regex operations on line 54. If content is missing or not a string, raise a clear ValueError with a descriptive message indicating the malformed dict input. Apply the same validation pattern to the similar code block mentioned at lines 64-66 to ensure consistent error handling across both branches.other/materials_designer/workflows/band_structure_magn.ipynb (1)
508-513: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low valueRemove leftover empty code cells.
Cells with ids
30(Line 513),46(Line 709), and47(Line 717) have emptysource. Drop them before merge to keep the tutorial clean.Also applies to: 704-717
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@other/materials_designer/workflows/band_structure_magn.ipynb` around lines 508 - 513, Remove the three empty code cell objects from the notebook that have no source content. Specifically, delete the cell with id "30" (found around line 513), the cell with id "46" (found around line 704-709), and the cell with id "47" (found around line 717). These cells appear as complete cell objects with "cell_type": "code" but contain empty "source" arrays and serve no purpose in the tutorial. Simply remove the entire cell object definitions for each of these ids to clean up the notebook structure.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@other/materials_designer/workflows/band_structure_magn.ipynb`:
- Around line 592-602: The cluster selection logic has a bug where if
CLUSTER_NAME is provided but does not match any hostname in the clusters list,
the next() call returns None, causing a subsequent AttributeError when accessing
compute.cluster.hostname in the print statement. Add a guard after the cluster
assignment that checks if cluster is None when CLUSTER_NAME is set, and either
raise a clear error message indicating that the specified CLUSTER_NAME could not
be found in available clusters, or fall back to clusters[0] as a default. This
ensures the code handles both the case where CLUSTER_NAME matches nothing and
provides appropriate feedback or fallback behavior.
- Around line 473-498: Add validation checks before calling
species_names.index() in both the starting_magnetization list comprehension and
the HUBBARD_U loop to ensure all atomic_species keys from STARTING_MAGNETIZATION
and HUBBARD_U dictionaries exist in the species_names list. Validate each key
upfront and raise a clear, descriptive error message if any key is not found,
rather than allowing the opaque ValueError to propagate. This prevents crashes
when the magnetization/Hubbard labels do not match the actual material species.
- Around line 122-128: The KPATH definition uses the Cyrillic character Г
(U+0413) instead of the correct Quantum ESPRESSO k-point label. Replace all
instances of the Cyrillic "Г" with the Latin letter "G" in the KPATH list where
the "point" key is set to represent the Gamma point. This fix applies to all
three dictionaries in the KPATH array that currently reference the Cyrillic
character, ensuring the k-point path matches Quantum ESPRESSO's label
expectations.
In `@src/py/mat3ra/notebooks_utils/workflow.py`:
- Around line 37-39: The loop iterating through unit_names currently silently
skips any unit that cannot be found via subworkflow.get_unit_by_name(), which
allows typos or invalid unit names to go unnoticed and leave the QE input
unpatched. Instead of using continue when a unit is not found, raise an
exception to fail fast and immediately alert the user that a requested unit name
does not exist. This will ensure that typos in notebook parameters are caught
rather than silently ignored.
---
Nitpick comments:
In `@other/materials_designer/workflows/band_structure_magn.ipynb`:
- Around line 508-513: Remove the three empty code cell objects from the
notebook that have no source content. Specifically, delete the cell with id "30"
(found around line 513), the cell with id "46" (found around line 704-709), and
the cell with id "47" (found around line 717). These cells appear as complete
cell objects with "cell_type": "code" but contain empty "source" arrays and
serve no purpose in the tutorial. Simply remove the entire cell object
definitions for each of these ids to clean up the notebook structure.
In `@src/py/mat3ra/notebooks_utils/workflow.py`:
- Around line 41-43: After extracting the "content" value from the dict-based
input_item using input_item.get("content") on line 43, add explicit validation
to check that content is not None and is a string type before it is used in
regex operations on line 54. If content is missing or not a string, raise a
clear ValueError with a descriptive message indicating the malformed dict input.
Apply the same validation pattern to the similar code block mentioned at lines
64-66 to ensure consistent error handling across both branches.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 287aab0c-10db-4066-b002-e8de33d39a37
📒 Files selected for processing (4)
other/materials_designer/workflows/band_structure.ipynbother/materials_designer/workflows/band_structure_magn.ipynbpyproject.tomlsrc/py/mat3ra/notebooks_utils/workflow.py
| "starting_magnetization = [\n", | ||
| " {\"index\": species_names.index(atomic_species) + 1, \"value\": value}\n", | ||
| " for atomic_species, value in STARTING_MAGNETIZATION.items()\n", | ||
| "]\n", | ||
| "magnetization_context = {\n", | ||
| " \"collinearMagnetization\": {\n", | ||
| " \"isTotalMagnetization\": IS_TOTAL_MAGNETIZATION,\n", | ||
| " \"totalMagnetization\": TOTAL_MAGNETIZATION,\n", | ||
| " \"startingMagnetization\": starting_magnetization,\n", | ||
| " },\n", | ||
| " \"isCollinearMagnetizationEdited\": True,\n", | ||
| "}\n", | ||
| "for unit_name in magn_unit_names:\n", | ||
| " unit = bs_subworkflow.get_unit_by_name(name=unit_name)\n", | ||
| " if unit:\n", | ||
| " unit.add_context(magnetization_context)\n", | ||
| " bs_subworkflow.set_unit(unit)\n", | ||
| "\n", | ||
| "system_patch = {}\n", | ||
| "if USE_DFT_U and HUBBARD_U:\n", | ||
| " system_patch[\"lda_plus_u\"] = True\n", | ||
| " system_patch[\"lda_plus_u_kind\"] = 0\n", | ||
| " system_patch[\"U_projection_type\"] = \"ortho-atomic\"\n", | ||
| " for atomic_species, hubbard_u_value in HUBBARD_U.items():\n", | ||
| " index = species_names.index(atomic_species) + 1\n", | ||
| " system_patch[f\"Hubbard_U({index})\"] = hubbard_u_value\n", |
There was a problem hiding this comment.
🩺 Stability & Availability | 🟠 Major | ⚡ Quick win
species_names.index(...) raises an opaque ValueError when a magnetization/Hubbard label is absent.
Both the starting_magnetization comprehension (Lines 473-476) and the HUBBARD_U loop (Lines 496-498) call species_names.index(atomic_species). If any key in STARTING_MAGNETIZATION or HUBBARD_U is not among the material's species/labels, this throws ValueError: 'Fe1' is not in list. This is reachable with the shipped defaults: MATERIAL_NAME = "FeO" but the magnetization/Hubbard dicts are keyed by Fe1/Fe2 sublabels, so a default "Run All" will crash unless the loaded FeO actually carries those labels. Consider validating keys up front with a clear message (and aligning the default material with the labeled example).
🛡️ Example guard
+missing = [s for s in {**STARTING_MAGNETIZATION, **(HUBBARD_U if USE_DFT_U else {})} if s not in species_names]
+if missing:
+ raise ValueError(f"Species/labels {missing} not found among {species_names}. "
+ f"Assign numeric labels (e.g. Fe1, Fe2) in Materials Designer first.")
starting_magnetization = [
{"index": species_names.index(atomic_species) + 1, "value": value}
for atomic_species, value in STARTING_MAGNETIZATION.items()
]🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@other/materials_designer/workflows/band_structure_magn.ipynb` around lines
473 - 498, Add validation checks before calling species_names.index() in both
the starting_magnetization list comprehension and the HUBBARD_U loop to ensure
all atomic_species keys from STARTING_MAGNETIZATION and HUBBARD_U dictionaries
exist in the species_names list. Validate each key upfront and raise a clear,
descriptive error message if any key is not found, rather than allowing the
opaque ValueError to propagate. This prevents crashes when the
magnetization/Hubbard labels do not match the actual material species.
| "if CLUSTER_NAME:\n", | ||
| " cluster = next((c for c in clusters if CLUSTER_NAME in c[\"hostname\"]), None)\n", | ||
| "else:\n", | ||
| " cluster = clusters[0]\n", | ||
| "\n", | ||
| "compute = Compute(\n", | ||
| " cluster=cluster,\n", | ||
| " queue=QUEUE_NAME,\n", | ||
| " ppn=PPN\n", | ||
| ")\n", | ||
| "print(f\"Using cluster: {compute.cluster.hostname}, queue: {QUEUE_NAME}, ppn: {PPN}\")\n", |
There was a problem hiding this comment.
🩺 Stability & Availability | 🟠 Major | ⚡ Quick win
cluster can be None when CLUSTER_NAME matches nothing, crashing the next line.
With the default CLUSTER_NAME = "002", next(..., None) returns None if no hostname contains the substring. Line 602 then dereferences compute.cluster.hostname and raises a confusing AttributeError. The fall-back to clusters[0] only happens when CLUSTER_NAME is empty, so the "default to first cluster" behavior described in the summary never triggers on a bad match.
🛡️ Proposed guard
if CLUSTER_NAME:
cluster = next((c for c in clusters if CLUSTER_NAME in c["hostname"]), None)
+ if cluster is None:
+ raise ValueError(
+ f"No cluster matching '{CLUSTER_NAME}' in {[c['hostname'] for c in clusters]}"
+ )
else:
cluster = clusters[0]🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@other/materials_designer/workflows/band_structure_magn.ipynb` around lines
592 - 602, The cluster selection logic has a bug where if CLUSTER_NAME is
provided but does not match any hostname in the clusters list, the next() call
returns None, causing a subsequent AttributeError when accessing
compute.cluster.hostname in the print statement. Add a guard after the cluster
assignment that checks if cluster is None when CLUSTER_NAME is set, and either
raise a clear error message indicating that the specified CLUSTER_NAME could not
be found in available clusters, or fall back to clusters[0] as a default. This
ensures the code handles both the case where CLUSTER_NAME matches nothing and
provides appropriate feedback or fallback behavior.
| for unit_name in unit_names: | ||
| if not (unit := subworkflow.get_unit_by_name(name=unit_name)): | ||
| continue |
There was a problem hiding this comment.
🎯 Functional Correctness | 🟠 Major | ⚡ Quick win
Fail fast when requested QE units are not found.
Line 39 currently skips unknown unit names silently. With per-unit patching now driven by notebook params, a typo becomes a silent no-op and the QE input stays unpatched.
Proposed fix
def patch_workflow_qe_input(
@@
) -> Workflow:
@@
- for subworkflow in workflow.subworkflows:
+ requested_units = set(unit_names)
+ patched_units = set()
+ for subworkflow in workflow.subworkflows:
for unit_name in unit_names:
if not (unit := subworkflow.get_unit_by_name(name=unit_name)):
continue
+ patched_units.add(unit_name)
@@
subworkflow.set_unit(unit)
+ missing_units = requested_units - patched_units
+ if missing_units:
+ raise ValueError(f"Units not found: {sorted(missing_units)}")
return workflow🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/py/mat3ra/notebooks_utils/workflow.py` around lines 37 - 39, The loop
iterating through unit_names currently silently skips any unit that cannot be
found via subworkflow.get_unit_by_name(), which allows typos or invalid unit
names to go unnoticed and leave the QE input unpatched. Instead of using
continue when a unit is not found, raise an exception to fail fast and
immediately alert the user that a requested unit name does not exist. This will
ensure that typos in notebook parameters are caught rather than silently
ignored.
Summary by CodeRabbit