Skip to content

Fix wrong V-pol (IV) imaging gradient: chisqgrad_vvis drops the rho gradient#296

Merged
achael merged 1 commit into
devfrom
fix/vvis-grad-rho
Jun 12, 2026
Merged

Fix wrong V-pol (IV) imaging gradient: chisqgrad_vvis drops the rho gradient#296
achael merged 1 commit into
devfrom
fix/vvis-grad-rho

Conversation

@rohandahale

@rohandahale rohandahale commented Jun 10, 2026

Copy link
Copy Markdown
Collaborator

Fixes a real correctness bug in IV (Stokes I + circular pol) imaging: the analytic gradient chisqgrad_vvis dropped the physical ρ-gradient, making the V-angle gradient ~99% wrong. NumPy IV reconstructions have been converging against a bad gradient.

Root cause

pol_solve masks physical gradient rows inside chisqgrad_vvis, but the vcv transform is non-diagonal — its Jacobian couples the V solver variable (vprime) to both physical ρ and ψ:

vcv_grad.out[2] = drho_dvprime * grad_rho  +  dpsi_dvprime * grad_psi

For pol='IV' (pol_solve=[1,0,0,1]), chisqgrad_vvis computed only rows 0 and 3 — leaving grad_rho (row 1) = 0 — so the dominant drho_dvprime · grad_rho term was dropped.

Why the other pol modes escaped:

  • IP (mcv): the un-computed row is ψ, but its coupling coefficient dpsi_dmprime = 0 at vfrac=0 → harmless.
  • IPV (polcv): the transform is diagonal and pol_solve[1]=1 anyway → ρ-gradient is computed.

So the bug is IV-specific: only vcv has a nonzero coupling to the row pol_solve skipped.

Fix

Compute the physical ρ-gradient whenever V is solved (pol_solve[1] or pol_solve[3]), in both chisqgrad_vvis and its nfft twin chisqgrad_vvis_nfft. Two-line condition change + comments.

Validation

  • IV objgrad now matches central finite differences to median 2.8e-10, max 1.6e-9 (was ~99% wrong).
  • New FD-based regression test test_iv_gradient_matches_finite_difference.
  • The 26 existing pol chisqgrad parity tests (direct + nfft) and the IP recon test are unchanged (the fix is consistent across direct + nfft, so parity holds). Ruff clean.

How it was found

Surfaced by the JAX autodiff objective port (#295): jax autodiff and finite differences agreed, while the hand-written analytic disagreed — the same discovery mechanism as the stv_pol_grad fix (#240).

Notes

  • This affects released numpy IV imaging, so like Fix factor-of-2 bug in stv_pol_grad gradient #240 it targets dev for a fast main release. Rebased onto dev locally so the diff is just the fix (not the jax development on dev-backend); once merged it can propagate dev→dev-backend→dev-backend-mixpol and be cherry-picked to main.
  • Audit done (chisqgrad_p/chisqgrad_m, follow-up to this fix): IP (mcv), IQUV (polcv), and IV (vcv, post-fix) all match finite differences to ~1e-9. The same-class fragility in chisqgrad_p/m is latent onlymcv is always used at vfrac=0, where its psi coupling coefficient is exactly 0 — so they are correct in every real mode and no code change is needed. This PR is scoped to the reproduced IV fix only.

@rohandahale rohandahale requested a review from achael June 10, 2026 09:42
@rohandahale rohandahale assigned achael and rohandahale and unassigned achael Jun 10, 2026
@rohandahale rohandahale added this to the 2.0 milestone Jun 10, 2026
@rohandahale rohandahale marked this pull request as ready for review June 10, 2026 09:43
@codecov

codecov Bot commented Jun 10, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 0% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 43.75%. Comparing base (cec8b7d) to head (d9c0527).

Files with missing lines Patch % Lines
ehtim/imaging/pol_imager_utils.py 0.00% 0 Missing and 2 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##              dev     #296      +/-   ##
==========================================
- Coverage   43.75%   43.75%   -0.01%     
==========================================
  Files          52       52              
  Lines       26289    26289              
  Branches     4473     4473              
==========================================
- Hits        11504    11503       -1     
  Misses      13523    13523              
- Partials     1262     1263       +1     

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@achael achael left a comment

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch -- another bug from my major 2024 reorg or the polarization imaging code that didn't surface because it's on a non-standard imaging path. 2 questions.

  1. I think that the pol_solve entries are meant to directly correspond to the Poincare sphere slots (I,rho,phi,psi). So the logic in IV imaging I think should not be to change the if statements in chisqgrad_vvis, but to change which pol_solve entries are assigned to IV imaging to be (1,1,0,1). Does that make sense?

  2. we now have 4 versions of the code that all should in principle get this fix -- the main branch (stable, 1.3.1), the dev branch (1.4, pre-jax), the dev-backend branch (jax) and the dev-backend-mixpol branch. I agree all of these should get the fix, but I don't think any of them should have a major change of scope yet. So it would be useful to think about what the best strategy is to propagate this fix -- should we just have separate PRs for each or is there a better way?

@rohandahale

Copy link
Copy Markdown
Collaborator Author

@achael Thanks!

  1. pol_solve=(1,1,0,1) for IV: I tried it, and it isn't clean as-is, because which_solve is doing two things: pack_imarr builds the solver vector from its nonzero slots, and that same array is what gets passed to chisqgrad_vvis as pol_solve. So changing index 1 on to get the rho-gradient also makes m a solved variable:
  • IV which_solve = [1,0,0,1] --> 2 solver images (I + V)
  • (1,1,0,1) --> 3 solver images (I + m + V)

Since vcv holds m = P/I fixed, that extra m is a unnecessary in IV imaging.

I agree the Poincaré method is the right way. The issue is that one array currently means both "which slots are solver variables" and "which physical gradients to compute". IV wants those to differ (solve I + v′, but compute the physical rho-gradient that vcv_grad uses). The if pol_solve[1] or pol_solve[3] change gives that with no extra variable. Doing it your way cleanly would mean splitting which_solve into a solver selection array vs an active physical slots array. I am happy to file that as a follow-up (it is like with the imager-config NamedTuple work).

  1. It's a two line fix (chisqgrad_vvis + the nfft variant), so rather than four hand-written PRs I would do what we did for Fix factor-of-2 bug in stv_pol_grad gradient #240: a minimal fix on dev, let it merge dev --> dev-backend --> dev-backend-mixpol, and if you want, cherry-pick to main (1.3.1). I can retarget this PR from dev-backend to dev.

achael
achael previously approved these changes Jun 11, 2026

@achael achael left a comment

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

great, makes sense -- i think your proposed merge sequence also makes. sense.

@achael achael changed the base branch from dev-backend to dev June 11, 2026 15:29
@achael achael changed the base branch from dev to dev-backend June 11, 2026 15:30
@achael achael dismissed their stale review June 11, 2026 15:30

The base branch was changed.

@achael

achael commented Jun 11, 2026

Copy link
Copy Markdown
Owner

@rohandahale it looks like rebasing it from github to dev introduces a bunch of other changes from the jax development, so i'll wait for you to do that locally. Than we propagate the change from dev->dev-backend and from dev->dev-mixedpol, and we cherry pick for main.

@rohandahale rohandahale changed the base branch from dev-backend to dev June 11, 2026 19:33
@rohandahale

Copy link
Copy Markdown
Collaborator Author

@achael Done, ready to merge. Then we can do dev-->dev-backend & dev->dev-backend-mixpol and only these two lines into main.

@achael achael merged commit b3f47ac into dev Jun 12, 2026
7 of 8 checks passed
@achael achael deleted the fix/vvis-grad-rho branch June 16, 2026 13:52
achael added a commit that referenced this pull request Jun 16, 2026
* Add physical_grad_slots helper

Maps the Stokes DOF mask to the physical gradout slots the chisq/reg
kernels must fill, centralizing the mcv/vcv cross-coupling that mirrors
transform_gradients' Jacobian sparsity. Not yet wired in. + unit tests.

* Wire physical_grad_slots into chisq and reg gradient dicts

Feed the cross-coupling-aware mask to the pol gradient kernels in both
compute_chisqgrad_dict and compute_reggrad_dict. Behavior-identical for
now (kernels still carry the or-patches). Guard physical_grad_slots
against sub-4-wide single-pol masks (Stokes-I carries 'mcv' inertly).
+ regression test.

* Revert vvis kernels to diagonal pol_solve gating

The mcv/vcv cross-coupling now lives in physical_grad_slots, so drop the
'or pol_solve[3]' patches (#296) in chisqgrad_vvis / chisqgrad_vvis_nfft;
each physical slot keys on its own bit again. Note in each pol chisqgrad
docstring that pol_solve flags required physical gradients, not DOFs.

* Fix reggrad_ptv first-row/col boundary masking + epsilon_tv

Zero the back-neighbor (m2/m3) terms on the first row/column in
reggrad_ptv slots 0/1/3 (the back-neighbor is the zero pad), matching
reggrad_vtv/reggrad_tv. Pre-fix the whole first row+col of those slots
was wrong (corner ~4x off vs FD). Add epsilon_tv to reg_ptv/reggrad_ptv
denominators (default 0, byte-identical) for #295 parity. Note pol_solve
= physical-gradient slots in the 8 pol reggrad docstrings.

Add full-grid boundary FD regression tests for ptv, vtv, and Stokes-I tv.

* Note pol_solve semantics in polchisqgrad docstring

polchisqgrad is a legacy shim (parity tests only); document that its
pol_solve is a physical-gradient mask, not a raw DOF mask.

* Drive pol regularizer FD with all four physical slots

_pol_solve_for now returns [1,1,1,1] so the previously-blind cross-
coupling slots are FD-checked: reggrad_ptv psi (3), reggrad_vflux/l1v/
l2v/vtv rho (1), and slot 0 for every pol reg. Proves the reg-grad
slots are individually correct against finite differences.

* Add pol chisq FD + cross-ttype tests in test_chisquared.py

New pol coverage in its final-home file: TestPolChisqGradFD checks
chisqgrad_p/m/vvis against finite differences of the chisq value in all
four physical slots (pol_solve=[1,1,1,1]) for direct+nfft, asserting
vvis slot 2 (EVPA) is identically zero. TestPolChisq{,Grad}Consistency
check direct-vs-nfft agreement. Closes the m / p-slot-3 blind spots.

* Add parametrized pol objective-FD sampling the polarization DOF block

TestObjectiveGradPolarimetricFD checks objgrad vs FD for IP/IV/IQUV x
{direct,nfft}, with each case bundling its pol data terms + a pol reg, so
both the chisq and reg gradient paths through physical_grad_slots are
exercised. Samples the pol DOF block (past the Stokes-I block), where the
mcv/vcv cross-coupling lives -- the existing global-sampling FD tests
missed it (the dropped IP slot-3 term is ~4% off FD at V=0.02*I, ~430% at
V=0.2*I). Comments out the now-subsumed test_fd_matches_analytic_polarimetric
(backend) and test_iv_gradient_matches_finite_difference (e2e).

* Use an asymmetric image for chisq/regularizer/gradient FD fixtures

Add make_asym_image (broad offset/elongated/rotated double-Gaussian) and
switch the Stokes-I FD fixtures (chisq_setup, reg_setup, mfreg_setup,
grad_setup) to it. Breaking the reflection/rotation/x<->y symmetry of the
centered Gaussian surfaces boundary/axis-ordering bugs a symmetric image
hides. Blobs kept broad (grid-filling, no dead pixels) so the |.|-kink TV
gradients stay FD-well-conditioned at epsilon_tv=0; all tolerances unchanged.

* Use asymmetric + spatially-varying pol in pol FD test fixtures

chisq_setup_pol and a new asym_pol_setup build on make_asym_image and use
add_random_pol (ccorr>0) so EVPA, vfrac, rho, and psi all vary spatially
instead of a constant pol fraction. polreg_setup switches its Stokes I to
the asymmetric image (keeping the per-pixel pol jitter that keeps TV
denominators non-degenerate). TestObjectiveGradPolarimetricFD now uses the
structured-pol obs.

Widen the pol chisq FD check to a median+max split (median 1e-5, max 1e-3):
the structured-pol imcur has sharper local curvature, so 2nd-order FD
truncation pushes a few small-gradient pixels to ~2.6e-4 -- well below any
real pol-gradient bug (%-level), which the tight median still catches.

* Comment cleanup + epsilon_tv consistency in pol_imager_utils

Manual review pass: per-slot dR/dX labels, docstrings on the reg kernels,
a module-level CONVENTIONS block (imarr = [I, rho, phi=2chi, psi]), and
removal of stale TODOs. Two behavior touches, both byte-identical at the
defaults:
- reg_vtv / reggrad_vtv now honor epsilon_tv (kwargs, default 0) like the
  ptv pair, instead of the value ignoring it while the grad pinned it to 0.
- reggrad_ptv masks the chi-slot back-neighbor terms (c2/c3) too, for
  uniformity (they already self-zero at the pad).
Plus an mcv_r exception-message fix and ruff-clean whitespace.

* Comment cleanup in imager_utils (no behavior change)

Manual review pass: docstrings on the Stokes-I reg kernels, per-block
comments, 'fourier/transform matrices' labels on the diag Amatrices
unpacking, and removal of dead commented-out systematic-noise code in the
bispectrum data functions (the intent is now documented in
apply_systematic_noise_snrcut). Purely cosmetic; ruff-clean.

* fixed lint errors in test_regularizers and test_chisquared
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants