Clean up notebook outputs and improve documentation #847

Closed
mtauraso wants to merge 6 commits into main from claude/review-stale-docs-TnxCG

@mtauraso
Collaborator

Change Description

This PR cleans up pre-executed Jupyter notebooks by removing cell outputs and execution counts, and improves documentation across multiple files.
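The PR does not say which tool was used to strip the notebooks. As a rough illustration of what "removing cell outputs and execution counts" means at the file level, here is a minimal sketch using only the standard library. The function name `strip_notebook_outputs` and the example notebook are hypothetical, not part of this PR.

```python
import json


def strip_notebook_outputs(notebook: dict) -> dict:
    """Return a copy of a parsed .ipynb with code-cell outputs cleared.

    Hypothetical helper for illustration; the PR may have used a
    different tool to achieve the same result.
    """
    # Deep copy via a JSON round-trip so the input is left untouched.
    cleaned = json.loads(json.dumps(notebook))
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []  # drops stdout, stderr, and rich outputs
            cell["execution_count"] = None  # serialized as null in the .ipynb
    return cleaned


# A tiny made-up notebook with one markdown cell and one executed code cell.
example = {
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Demo"]},
        {
            "cell_type": "code",
            "execution_count": 2,
            "metadata": {},
            "outputs": [
                {"output_type": "stream", "name": "stderr", "text": ["warning\n"]}
            ],
            "source": ["print('hi')"],
        },
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 5,
}

cleaned = strip_notebook_outputs(example)
```

Only code cells are touched, since markdown cells carry neither `outputs` nor `execution_count` in the notebook format.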

Solution Description

Notebook Cleanup

  • train_model.ipynb: Removed execution outputs and stderr from cells 2 and 4, reset execution counts to null. Updated code to use h.set_config() API instead of direct dictionary assignment (e.g., h.set_config("model.name", ...) instead of h.config["model"]["name"] = ...). Fixed widget value formatting (float to int).
  • mpr_demo.ipynb: Removed execution outputs, reset execution counts. Updated markdown text to reference "Hyrax" instead of "FIBAD". Fixed widget value formatting.
  • custom_dataset.ipynb: Removed execution outputs and stderr. Simplified imports by removing unnecessary Dataset inheritance and torch.utils.data imports. Fixed widget value formatting.
  • using_umap.ipynb: Removed execution outputs and reset execution counts.
  • hyraxql_demo.ipynb: Updated markdown title and description.
  • hyrax_hats_cutouts.ipynb: Minor markdown updates.
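To illustrate the difference between a dotted-key setter such as `h.set_config("model.name", ...)` and direct mutation via `h.config["model"]["name"] = ...`, here is a small self-contained sketch. `ConfigHolder` is a hypothetical stand-in and not Hyrax's implementation. In particular, rejecting unknown keys is an assumption of this sketch, included to show one reason a setter can be preferable to raw dictionary assignment.

```python
class ConfigHolder:
    """Hypothetical stand-in for an object exposing a nested config dict."""

    def __init__(self, config: dict):
        self.config = config

    def set_config(self, dotted_key: str, value) -> None:
        """Set a nested value addressed by a dotted key like 'model.name'.

        Assumption of this sketch: unknown keys raise KeyError instead of
        silently creating a new entry, which is what plain dict assignment
        would do.
        """
        *parents, leaf = dotted_key.split(".")
        table = self.config
        for part in parents:
            if not isinstance(table.get(part), dict):
                raise KeyError(f"No config table named {part!r} in {dotted_key!r}")
            table = table[part]
        if leaf not in table:
            raise KeyError(f"No config key named {leaf!r} in {dotted_key!r}")
        table[leaf] = value


# The model names below are placeholders, not real Hyrax model names.
h = ConfigHolder({"model": {"name": "OldModel"}})
h.set_config("model.name", "NewModel")

# A misspelled key is caught by the setter ...
try:
    h.set_config("model.nmae", "Oops")
    typo_caught = False
except KeyError:
    typo_caught = True

# ... whereas direct mutation would silently add a stray entry:
# h.config["model"]["nmae"] = "Oops"
```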

Documentation Improvements

  • verbs.rst: Completely restructured and expanded verb documentation with clearer descriptions, better formatting, and consistent code examples. Added documentation for previously undocumented verbs (test, model, download, rebuild_manifest, lookup, save_to_database, database_connection, to_onnx, engine). Improved existing verb descriptions with return types and usage context.
  • data_flow.rst: Improved clarity and conciseness of data flow explanations.
  • index.rst: Minor updates to example code.
  • dev_guide.rst: Updated Python version recommendation from 3.10 to 3.11.
  • reference_and_faq.rst: Commented out empty FAQ section.
  • science_examples.rst: Added note about science workflows in development.
  • HYRAX_GUIDE.md and CLAUDE.md: Minor documentation updates.
  • notebooks.rst: Minor formatting updates.

Code Quality

  • Documentation changes follow the project style
  • Notebook outputs removed to reduce repository size
  • Code examples updated to use current API patterns
  • No functional code changes to source implementation

https://claude.ai/code/session_01XXeVghMRKhSmMQ5XVxdSC4

claude and others added 6 commits March 27, 2026 21:24
Adds :doc: and :ref: links throughout the 15 RST documentation files so
readers can navigate between related concepts. Key connections include:
verbs ↔ configuration, data flow ↔ dataset/model class references,
required inputs ↔ class reference pages, configuration system ↔
external package setup, and concept pages ↔ hands-on workflow notebooks.

https://claude.ai/code/session_018Ni98cUN4gA2ymfJvxptR9

Adopts 9 of 11 unique cross-references identified by the other agent:
- architecture_overview: link to getting_started and science_examples
- configuration: immutable config → model_comparison
- configuration_system: back-link to configuration; validation → dataset_class_reference
- dataset_class_reference: fields list → data_flow pipeline overview
- dataset_splits: config editing primer → configuration
- external_libraries: → required_input for minimum requirements
- getting_started: data_request → dataset_class_reference contract
- model_class_reference: train_batch metrics → model_comparison

Skipped: dataset_class_reference metadata→verbs (weak connection to
a legacy path) and model_class_reference checklist→external_library_package
(already linked a few lines later in __init__ section).

https://claude.ai/code/session_018Ni98cUN4gA2ymfJvxptR9
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Red-light fixes:
- CLAUDE.md / HYRAX_GUIDE.md: train_step → train_batch (matches model_registry.py)
- dev_guide.rst: Python 3.10 → 3.11 (matches pyproject.toml requires-python)
- notebooks.rst: Remove dead link to nonexistent export_model notebook
- using_umap.ipynb: n_epochs → epochs (correct config key)

Yellow-light fixes:
- index.rst: Replace incorrect h.search_by_vector() with actual verb chain, add h.umap()
- model_comparison.rst: f.config → h.config
- required_input.rst: Clarify prepare_inputs is a model @staticmethod, not a free function
- reference_and_faq.rst: Comment out empty FAQ "TBD" section
- science_examples.rst: Add note that more science workflows are in development
- HYRAX_GUIDE.md: Add missing dataset classes to built-in list
- using_tensorboard_and_mlflow.ipynb: Fix "TenosrBoard" and "reactiviating" typos
- custom_dataset.ipynb: Remove unnecessary torch Dataset dual-inheritance
- train_model.ipynb: Standardize on h.set_config() instead of direct dict mutation
- hyraxql_demo.ipynb: Rename title from misleading "GraphQL alternative"
- mpr_demo.ipynb: Replace stale "FIBAD" project name with "Hyrax"
- hyrax_hats_cutouts.ipynb: Add intro noting LSSTDataset-specific config pattern, fix typo

https://claude.ai/code/session_01XXeVghMRKhSmMQ5XVxdSC4

verbs.rst:
- Document all 15 verbs (was 6, with a nonexistent "index" verb)
- Add: test, save_to_database, database_connection, lookup, model,
  to_onnx, engine, download, rebuild_manifest, search
- Remove nonexistent "index" verb
- Mark notebook-only verbs (visualize, prepare, model, database_connection)
- Show return types for notebook context

data_flow.rst:
- Remove incorrect claim that inference uses ONNX (it uses PyTorch)
- Clarify that ONNX is an optional export path via the engine verb
- Document that prepare_inputs returns numpy (not tensors)
- Add data format summary table showing types at each pipeline stage
- Clarify the numpy→tensor conversion is automatic

https://claude.ai/code/session_01XXeVghMRKhSmMQ5XVxdSC4
@review-notebook-app

Check out this pull request on ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.

Base automatically changed from claude/add-doc-cross-references-1RrfM to main March 30, 2026 19:56
@drewoldag
Collaborator

@mtauraso I think that we have finished reorganizing the docs for the time being. If you want to reengage Claude on this, now is a good time.

@mtauraso closed this Apr 2, 2026