
Pystamps #18

Open
sirbastiano wants to merge 14 commits into main from pystamps

Conversation

@sirbastiano
Collaborator


Add a burst-native SNAP workflow notebook that uses phidown burst
search, validation, and on-demand CDSE downloads instead of full
SLC products.

Also add notebook support helpers and focused tests, bump the
phidown dependency floor, and make the notebook accept local CDSE
credential aliases plus shared-home temp paths for SNAP execution.
Add a runtime warning before raising the missing-s1isp import
error and include the uv install command alongside the existing
pip installation guidance.
Expose the processing outputs as a reusable notebook variable and add
a final quicklook section that renders product summaries, wrapped
phase, coherence, terrain support rasters, and a coherence
histogram from the generated ENVI products.
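
The warn-then-raise behaviour described for the missing s1isp import could be sketched as a small generic helper. This is illustrative only: the function name, messages, and install hints are assumptions, not sarpyx's actual code in decode.py.

```python
import importlib
import warnings


def require_optional(module_name: str, install_hint: str):
    """Import an optional dependency, emitting a runtime warning with
    install guidance before re-raising the ImportError."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        warnings.warn(
            f"{module_name} is not installed; related features are disabled. "
            f"{install_hint}",
            RuntimeWarning,
            stacklevel=2,
        )
        raise ImportError(
            f"{module_name} is required for this feature. {install_hint}"
        ) from exc
```

A caller would then write something like `require_optional("s1isp", "Install it with `pip install s1isp` or `uv pip install s1isp`.")` at the point where decoding is invoked, so users see actionable guidance before the traceback.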

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 05b8b1d6dd


Comment on lines +116 to +118

```python
if master_id or slave_id:
    if not master_id or not slave_id:
        raise ValueError("master_id and slave_id must be provided together")
```

P2: Reject identical master/slave override IDs

The explicit override path only checks that both IDs are present, so passing the same UUID for master_id and slave_id is accepted and returns a pair built from the same row. That bypasses the automatic branch’s "at least two acquisitions" safeguard and can drive downstream processing with a self-pair (or a single cached archive), which is not a valid interferometric pair.


```python
"start_date": config["search"]["start_date"],
"end_date": config["search"]["end_date"],
"polarisation": config["search"]["polarisation"],
"relative_orbit": 117,
```

P2: Derive relative orbit from notebook data

The generated report hard-codes relative_orbit to 117, so the "Run Context" card will show incorrect metadata whenever the notebook is run on a different orbit. Because this page is presented as the latest run summary, embedding a constant here can silently misreport which track was actually processed.


Contributor

Copilot AI left a comment


Pull request overview

Adds new SNAPflow “burst” helpers and publishes a static HTML “Latest Results” report page into the documentation site, alongside dependency updates to support the workflow.

Changes:

  • Introduce sarpyx.snapflow.burst_utils for selecting a burst pair from search results and safely extracting .SAFE roots from burst archives, with new pytest coverage.
  • Add a docs generator to produce a static “SNAPflow v2 Results” HTML report and wire it into the docs navigation.
  • Update Python dependencies (add seaborn, bump phidown, and remove the sar-decode extra / lockfile entries for s1isp).

Reviewed changes

Copilot reviewed 8 out of 13 changed files in this pull request and generated 3 comments.

Show a summary per file

  • sarpyx/snapflow/burst_utils.py: New utilities for burst pair selection and archive extraction.
  • tests/test_snapflow_burst_utils.py: New tests validating burst pair selection logic and archive extraction behavior.
  • docs/generate_snapflow_results.py: New script to build the static SNAPflow v2 results report from notebook outputs.
  • docs/generate_static_site.py: Adds “Results” to the nav template used for static site generation.
  • docs/index.html: Adds “Results” nav link and a “Latest Results” card on the landing page.
  • docs/_site/index.html: Updates the generated docs landing page to include “Latest Results”.
  • sarpyx/processor/core/decode.py: Improves messaging when s1isp is missing and decode functionality is invoked.
  • pyproject.toml: Adds seaborn, bumps phidown, removes the sar-decode optional extra.
  • uv.lock: Lockfile update reflecting dependency graph changes (e.g., boto3/botocore, seaborn, phidown bump).
  • .gitignore: Un-ignores docs/results/** and docs/_site/results/** so the published report can be committed.


Comment on lines +111 to +115

```python
def build_page(nb: dict[str, Any]) -> str:
    config = parse_stream_json(nb["cells"][3])
    env_info = parse_stream_json(nb["cells"][1])
    burst_info = parse_stream_json(nb["cells"][10])
    process_info = parse_stream_json(nb["cells"][14])
```

Copilot AI Mar 27, 2026


build_page() depends on hard-coded notebook cell indices (nb["cells"][3], [10], [14]), which is brittle (any notebook reordering will break report generation). Also, burst_info and process_info are parsed but never used. Prefer locating these outputs via headings/metadata (similar to cells_by_heading) and remove unused variables.

Suggested change (replacing the hard-coded index lookups shown above):

```python
def _find_stream_json(nb: dict[str, Any], predicate) -> dict[str, Any]:
    """
    Find the first cell whose stream JSON, parsed via ``parse_stream_json``,
    satisfies the given ``predicate``.
    """
    for cell in nb.get("cells", []):
        try:
            data = parse_stream_json(cell)
        except Exception:
            continue
        try:
            if predicate(data):
                return data
        except Exception:
            # If the predicate itself fails (e.g. missing keys), skip this cell
            continue
    raise KeyError("No notebook cell matched the requested JSON structure")


def build_page(nb: dict[str, Any]) -> str:
    config = _find_stream_json(
        nb,
        lambda d: isinstance(d, dict)
        and isinstance(d.get("search"), dict)
        and all(
            k in d["search"]
            for k in ("aoi_wkt", "start_date", "end_date", "polarisation")
        ),
    )
    env_info = _find_stream_json(
        nb,
        lambda d: isinstance(d, dict)
        and all(
            k in d
            for k in ("workspace", "gpt_exists", "output_dir", "process_dir")
        ),
    )
```

Comment on lines +173 to +179
with ZipFile(archive) as zf:
for member in zf.namelist():
destination = (target_dir / member).resolve(strict=False)
root = target_dir.resolve(strict=False)
if destination != root and root not in destination.parents:
raise ValueError(f"Unsafe archive member path: {member}")
zf.extract(member, target_dir)

Copilot AI Mar 27, 2026


extract_burst_archive guards against path traversal, but it still extracts ZIP members that may be symlinks. Python's zipfile can materialize symlinks on Unix, which can be abused (e.g., a symlink created inside target_dir that points outside the tree, which later writes or reads will follow). Consider rejecting symlink entries via ZipInfo metadata (or extracting with a routine that forbids symlinks) before calling extract.
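
One way to implement that check, sketched here rather than taken from the PR: for archives created on Unix, the upper 16 bits of `ZipInfo.external_attr` carry the original file's `st_mode`, so `stat.S_ISLNK` can flag symlink entries before extraction. The helper names below are hypothetical.

```python
import stat
from zipfile import ZipFile, ZipInfo


def is_symlink_member(info: ZipInfo) -> bool:
    """True if a ZIP entry was stored as a Unix symlink."""
    # The upper 16 bits of external_attr hold the Unix st_mode
    # for archives created on Unix platforms.
    return stat.S_ISLNK(info.external_attr >> 16)


def safe_members(zf: ZipFile):
    """Yield archive members, rejecting symlink entries outright."""
    for info in zf.infolist():
        if is_symlink_member(info):
            raise ValueError(f"Symlink entry not allowed: {info.filename}")
        yield info
```

extract_burst_archive could then iterate `safe_members(zf)` instead of `zf.namelist()`, keeping the existing path-traversal check for the remaining regular entries.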

Copilot uses AI. Check for mistakes.
Comment on lines +134 to +139

```python
config_subset = {
    "aoi_wkt": config["search"]["aoi_wkt"],
    "start_date": config["search"]["start_date"],
    "end_date": config["search"]["end_date"],
    "polarisation": config["search"]["polarisation"],
    "relative_orbit": 117,
```

Copilot AI Mar 27, 2026


config_subset hardcodes relative_orbit to 117, which will produce an incorrect report whenever the executed notebook uses a different orbit. This should be derived from the notebook outputs (e.g., from the JSON emitted by the config/selection cells) rather than a constant.

Suggested change (replacing the hard-coded constant shown above):

```python
relative_orbit = config["search"].get("relative_orbit")
if relative_orbit is None:
    # Fallback for older configs that may store relative_orbit elsewhere or not at all.
    relative_orbit = config.get("relative_orbit", 117)
config_subset = {
    "aoi_wkt": config["search"]["aoi_wkt"],
    "start_date": config["search"]["start_date"],
    "end_date": config["search"]["end_date"],
    "polarisation": config["search"]["polarisation"],
    "relative_orbit": relative_orbit,
```
