From 40ee0b56a6931ed482c6e9a23b41bd94c1f0c797 Mon Sep 17 00:00:00 2001
From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com>
Date: Mon, 30 Mar 2026 10:58:20 -0400
Subject: [PATCH 1/4] build(deps): bump codecov/codecov-action from 5 to 6
 (#295)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Bumps
[codecov/codecov-action](https://github.com/codecov/codecov-action) from
5 to 6.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/codecov/codecov-action/releases">codecov/codecov-action's
releases</a>.</em></p>
<blockquote>
<h2>v6.0.0</h2>
<h2>⚠️ This version introduces support for node24 which make cause
breaking changes for systems that do not currently support node24.
⚠️</h2>
<h2>What's Changed</h2>
<ul>
<li>Revert &quot;Revert &quot;build(deps): bump actions/github-script
from 7.0.1 to 8.0.0&quot;&quot; by <a
href="https://github.com/thomasrockhu-codecov"><code>@​thomasrockhu-codecov</code></a>
in <a
href="https://redirect.github.com/codecov/codecov-action/pull/1929">codecov/codecov-action#1929</a></li>
<li>Th/6.0.0 by <a
href="https://github.com/thomasrockhu-codecov"><code>@​thomasrockhu-codecov</code></a>
in <a
href="https://redirect.github.com/codecov/codecov-action/pull/1928">codecov/codecov-action#1928</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/codecov/codecov-action/compare/v5.5.4...v6.0.0">https://github.com/codecov/codecov-action/compare/v5.5.4...v6.0.0</a></p>
<h2>v5.5.4</h2>
<p>This is a mirror of <code>v5.5.2</code>. <code>v6</code> will be
released which requires <code>node24</code></p>
<h2>What's Changed</h2>
<ul>
<li>Revert &quot;build(deps): bump actions/github-script from 7.0.1 to
8.0.0&quot; by <a
href="https://github.com/thomasrockhu-codecov"><code>@​thomasrockhu-codecov</code></a>
in <a
href="https://redirect.github.com/codecov/codecov-action/pull/1926">codecov/codecov-action#1926</a></li>
<li>chore(release): 5.5.4 by <a
href="https://github.com/thomasrockhu-codecov"><code>@​thomasrockhu-codecov</code></a>
in <a
href="https://redirect.github.com/codecov/codecov-action/pull/1927">codecov/codecov-action#1927</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/codecov/codecov-action/compare/v5.5.3...v5.5.4">https://github.com/codecov/codecov-action/compare/v5.5.3...v5.5.4</a></p>
<h2>v5.5.3</h2>
<h2>What's Changed</h2>
<ul>
<li>build(deps): bump actions/github-script from 7.0.1 to 8.0.0 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/codecov/codecov-action/pull/1874">codecov/codecov-action#1874</a></li>
<li>chore(release): bump to 5.5.3 by <a
href="https://github.com/thomasrockhu-codecov"><code>@​thomasrockhu-codecov</code></a>
in <a
href="https://redirect.github.com/codecov/codecov-action/pull/1922">codecov/codecov-action#1922</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/codecov/codecov-action/compare/v5.5.2...v5.5.3">https://github.com/codecov/codecov-action/compare/v5.5.2...v5.5.3</a></p>
<h2>v5.5.2</h2>
<h2>What's Changed</h2>
<ul>
<li>check gpg only when skip-validation = false by <a
href="https://github.com/maxweng-sentry"><code>@​maxweng-sentry</code></a>
in <a
href="https://redirect.github.com/codecov/codecov-action/pull/1894">codecov/codecov-action#1894</a></li>
<li>chore: <code>disable_search</code> alignment by <a
href="https://github.com/freemanzMrojo"><code>@​freemanzMrojo</code></a>
in <a
href="https://redirect.github.com/codecov/codecov-action/pull/1881">codecov/codecov-action#1881</a></li>
<li>chore(release): 5.5.2 by <a
href="https://github.com/thomasrockhu-codecov"><code>@​thomasrockhu-codecov</code></a>
in <a
href="https://redirect.github.com/codecov/codecov-action/pull/1902">codecov/codecov-action#1902</a></li>
</ul>
<h2>New Contributors</h2>
<ul>
<li><a
href="https://github.com/maxweng-sentry"><code>@​maxweng-sentry</code></a>
made their first contribution in <a
href="https://redirect.github.com/codecov/codecov-action/pull/1894">codecov/codecov-action#1894</a></li>
<li><a
href="https://github.com/freemanzMrojo"><code>@​freemanzMrojo</code></a>
made their first contribution in <a
href="https://redirect.github.com/codecov/codecov-action/pull/1881">codecov/codecov-action#1881</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/codecov/codecov-action/compare/v5.5.1...v5.5.2">https://github.com/codecov/codecov-action/compare/v5.5.1...v5.5.2</a></p>
<h2>v5.5.1</h2>
<h2>What's Changed</h2>
<ul>
<li>build(deps): bump ossf/scorecard-action from 2.4.1 to 2.4.2 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/codecov/codecov-action/pull/1833">codecov/codecov-action#1833</a></li>
<li>build(deps): bump github/codeql-action from 3.28.18 to 3.29.9 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/codecov/codecov-action/pull/1861">codecov/codecov-action#1861</a></li>
<li>Document a <code>codecov-cli</code> version reference example by <a
href="https://github.com/webknjaz"><code>@​webknjaz</code></a> in <a
href="https://redirect.github.com/codecov/codecov-action/pull/1774">codecov/codecov-action#1774</a></li>
<li>docs: fix typo in README by <a
href="https://github.com/datalater"><code>@​datalater</code></a> in <a
href="https://redirect.github.com/codecov/codecov-action/pull/1866">codecov/codecov-action#1866</a></li>
<li>fix: update to use local app/ dir by <a
href="https://github.com/thomasrockhu-codecov"><code>@​thomasrockhu-codecov</code></a>
in <a
href="https://redirect.github.com/codecov/codecov-action/pull/1872">codecov/codecov-action#1872</a></li>
<li>build(deps): bump github/codeql-action from 3.29.9 to 3.29.11 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/codecov/codecov-action/pull/1867">codecov/codecov-action#1867</a></li>
<li>build(deps): bump actions/checkout from 4.2.2 to 5.0.0 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/codecov/codecov-action/pull/1868">codecov/codecov-action#1868</a></li>
<li>fix: overwrite pr number on fork by <a
href="https://github.com/thomasrockhu-codecov"><code>@​thomasrockhu-codecov</code></a>
in <a
href="https://redirect.github.com/codecov/codecov-action/pull/1871">codecov/codecov-action#1871</a></li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/codecov/codecov-action/blob/main/CHANGELOG.md">codecov/codecov-action's
changelog</a>.</em></p>
<blockquote>
<h2>v5.5.2</h2>
<h3>What's Changed</h3>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/codecov/codecov-action/compare/v5.5.1..v5.5.2">https://github.com/codecov/codecov-action/compare/v5.5.1..v5.5.2</a></p>
<h2>v5.5.1</h2>
<h3>What's Changed</h3>
<ul>
<li>fix: overwrite pr number on fork by <a
href="https://github.com/thomasrockhu-codecov"><code>@​thomasrockhu-codecov</code></a>
in <a
href="https://redirect.github.com/codecov/codecov-action/pull/1871">codecov/codecov-action#1871</a></li>
<li>build(deps): bump actions/checkout from 4.2.2 to 5.0.0 by
<code>@​app/dependabot</code> in <a
href="https://redirect.github.com/codecov/codecov-action/pull/1868">codecov/codecov-action#1868</a></li>
<li>build(deps): bump github/codeql-action from 3.29.9 to 3.29.11 by
<code>@​app/dependabot</code> in <a
href="https://redirect.github.com/codecov/codecov-action/pull/1867">codecov/codecov-action#1867</a></li>
<li>fix: update to use local app/ dir by <a
href="https://github.com/thomasrockhu-codecov"><code>@​thomasrockhu-codecov</code></a>
in <a
href="https://redirect.github.com/codecov/codecov-action/pull/1872">codecov/codecov-action#1872</a></li>
<li>docs: fix typo in README by <a
href="https://github.com/datalater"><code>@​datalater</code></a> in <a
href="https://redirect.github.com/codecov/codecov-action/pull/1866">codecov/codecov-action#1866</a></li>
<li>Document a <code>codecov-cli</code> version reference example by <a
href="https://github.com/webknjaz"><code>@​webknjaz</code></a> in <a
href="https://redirect.github.com/codecov/codecov-action/pull/1774">codecov/codecov-action#1774</a></li>
<li>build(deps): bump github/codeql-action from 3.28.18 to 3.29.9 by
<code>@​app/dependabot</code> in <a
href="https://redirect.github.com/codecov/codecov-action/pull/1861">codecov/codecov-action#1861</a></li>
<li>build(deps): bump ossf/scorecard-action from 2.4.1 to 2.4.2 by
<code>@​app/dependabot</code> in <a
href="https://redirect.github.com/codecov/codecov-action/pull/1833">codecov/codecov-action#1833</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/codecov/codecov-action/compare/v5.5.0..v5.5.1">https://github.com/codecov/codecov-action/compare/v5.5.0..v5.5.1</a></p>
<h2>v5.5.0</h2>
<h3>What's Changed</h3>
<ul>
<li>feat: upgrade wrapper to 0.2.4 by <a
href="https://github.com/jviall"><code>@​jviall</code></a> in <a
href="https://redirect.github.com/codecov/codecov-action/pull/1864">codecov/codecov-action#1864</a></li>
<li>Pin actions/github-script by Git SHA by <a
href="https://github.com/martincostello"><code>@​martincostello</code></a>
in <a
href="https://redirect.github.com/codecov/codecov-action/pull/1859">codecov/codecov-action#1859</a></li>
<li>fix: check reqs exist by <a
href="https://github.com/joseph-sentry"><code>@​joseph-sentry</code></a>
in <a
href="https://redirect.github.com/codecov/codecov-action/pull/1835">codecov/codecov-action#1835</a></li>
<li>fix: Typo in README by <a
href="https://github.com/spalmurray"><code>@​spalmurray</code></a> in <a
href="https://redirect.github.com/codecov/codecov-action/pull/1838">codecov/codecov-action#1838</a></li>
<li>docs: Refine OIDC docs by <a
href="https://github.com/spalmurray"><code>@​spalmurray</code></a> in <a
href="https://redirect.github.com/codecov/codecov-action/pull/1837">codecov/codecov-action#1837</a></li>
<li>build(deps): bump github/codeql-action from 3.28.17 to 3.28.18 by
<code>@​app/dependabot</code> in <a
href="https://redirect.github.com/codecov/codecov-action/pull/1829">codecov/codecov-action#1829</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/codecov/codecov-action/compare/v5.4.3..v5.5.0">https://github.com/codecov/codecov-action/compare/v5.4.3..v5.5.0</a></p>
<h2>v5.4.3</h2>
<h3>What's Changed</h3>
<ul>
<li>build(deps): bump github/codeql-action from 3.28.13 to 3.28.17 by
<code>@​app/dependabot</code> in <a
href="https://redirect.github.com/codecov/codecov-action/pull/1822">codecov/codecov-action#1822</a></li>
<li>fix: OIDC on forks by <a
href="https://github.com/joseph-sentry"><code>@​joseph-sentry</code></a>
in <a
href="https://redirect.github.com/codecov/codecov-action/pull/1823">codecov/codecov-action#1823</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/codecov/codecov-action/compare/v5.4.2..v5.4.3">https://github.com/codecov/codecov-action/compare/v5.4.2..v5.4.3</a></p>
<h2>v5.4.2</h2>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="https://github.com/codecov/codecov-action/commit/57e3a136b779b570ffcdbf80b3bdc90e7fab3de2"><code>57e3a13</code></a>
Th/6.0.0 (<a
href="https://redirect.github.com/codecov/codecov-action/issues/1928">#1928</a>)</li>
<li><a
href="https://github.com/codecov/codecov-action/commit/f67d33dda8a42b51c42a8318a1f66468119e898b"><code>f67d33d</code></a>
Revert &quot;Revert &quot;build(deps): bump actions/github-script from
7.0.1 to 8.0.0&quot;&quot;...</li>
<li>See full diff in <a
href="https://github.com/codecov/codecov-action/compare/v5...v6">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=codecov/codecov-action&package-manager=github_actions&previous-version=5&new-version=6)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)


</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
---
 .github/workflows/coverage.yaml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/.github/workflows/coverage.yaml b/.github/workflows/coverage.yaml
index b14db8cf..2bd2b9af 100644
--- a/.github/workflows/coverage.yaml
+++ b/.github/workflows/coverage.yaml
@@ -63,7 +63,7 @@ jobs:
           CASKADE_BACKEND: jax
 
       - name: Upload coverage reports to Codecov with GitHub Action
-        uses: codecov/codecov-action@v5
+        uses: codecov/codecov-action@v6
         env:
           CODECOV_TOKEN: ${{ secrets.CODECOV_TOKEN }}
         with:

From 12f528245b9492c701121dcd277f732f0d9b4f46 Mon Sep 17 00:00:00 2001
From: "Connor Stone, PhD" <connorstone628@gmail.com>
Date: Thu, 2 Apr 2026 15:46:42 -0400
Subject: [PATCH 2/4] More detailed and accurate parameter descriptions (#297)

---
 astrophot/fit/lm.py                           |  4 --
 astrophot/models/airy.py                      | 59 ++++++++---------
 astrophot/models/basis_psf.py                 | 41 +++++++++---
 astrophot/models/batch_model_object.py        | 12 +++-
 astrophot/models/bilinear_sky.py              | 22 ++++---
 astrophot/models/flatsky.py                   |  4 +-
 astrophot/models/func/gaussian.py             |  7 +-
 astrophot/models/gaussian_ellipsoid.py        |  2 -
 astrophot/models/group_psf_model.py           | 31 +++++++--
 astrophot/models/mixins/exponential.py        | 24 +++----
 astrophot/models/mixins/gaussian.py           | 22 +++----
 astrophot/models/mixins/sample.py             | 65 +++++++++++++++++--
 astrophot/models/mixins/sersic.py             | 18 ++---
 astrophot/models/mixins/spline.py             | 12 ++--
 astrophot/models/mixins/transform.py          | 24 ++++---
 astrophot/models/multi_gaussian_expansion.py  |  4 +-
 astrophot/models/pixelated_model.py           |  6 +-
 astrophot/models/pixelated_psf.py             |  7 +-
 astrophot/models/planesky.py                  | 17 +++--
 astrophot/models/point_source.py              | 14 ++--
 astrophot/models/psf_model_object.py          | 40 ++++++------
 astrophot/models/sky_model_object.py          |  2 +-
 docs/source/tutorials/AdvancedPSFModels.ipynb |  3 -
 docs/source/tutorials/GettingStarted.ipynb    |  3 +-
 docs/source/tutorials/ModelZoo.ipynb          |  2 +-
 tests/test_model.py                           |  4 +-
 tests/test_psfmodel.py                        | 18 ++---
 27 files changed, 271 insertions(+), 196 deletions(-)

diff --git a/astrophot/fit/lm.py b/astrophot/fit/lm.py
index adca7c97..fa1aa2da 100644
--- a/astrophot/fit/lm.py
+++ b/astrophot/fit/lm.py
@@ -1,7 +1,6 @@
 # Levenberg-Marquardt algorithm
 from typing import Sequence, Optional
 
-import torch
 import numpy as np
 
 from .base import BaseOptimizer
@@ -238,7 +237,6 @@ def poisson_2nll_ndf(self):
         M = self.forward(self.current_state)
         return 2 * backend.sum(M - self.Y * backend.log(M + 1e-10)) / self.ndf
 
-    @torch.no_grad()
     def fit(self, update_uncertainty=True) -> BaseOptimizer:
         """This performs the fitting operation. It iterates the LM step
         function until convergence is reached. Includes a message
@@ -366,7 +364,6 @@ def check_convergence(self) -> bool:
         return False
 
     @property
-    @torch.no_grad()
     def covariance_matrix(self) -> ArrayLike:
         """The covariance matrix for the model at the current
         parameters. This can be used to construct a full Gaussian PDF for the
@@ -391,7 +388,6 @@ def covariance_matrix(self) -> ArrayLike:
             self._covariance_matrix = backend.linalg.pinv(hess)
         return self._covariance_matrix
 
-    @torch.no_grad()
     def update_uncertainty(self) -> None:
         """Call this function after optimization to set the uncertainties for
         the parameters. This will use the diagonal of the covariance
diff --git a/astrophot/models/airy.py b/astrophot/models/airy.py
index 6e93cd2e..c6233ac7 100644
--- a/astrophot/models/airy.py
+++ b/astrophot/models/airy.py
@@ -1,6 +1,3 @@
-import torch
-import numpy as np
-
 from ..utils.decorators import ignore_numpy_warnings, combine_docstrings
 from .psf_model_object import PSFModel
 from .mixins import RadialMixin
@@ -15,9 +12,13 @@ class AiryPSF(RadialMixin, PSFModel):
     """The Airy disk is an analytic description of the diffraction pattern
     for a circular aperture.
 
-    The diffraction pattern is described exactly by the configuration
-    of the lens system under the assumption that all elements are
-    perfect. This expression goes as:
+    WARNING: This model does not work in JAX (it doesn't have the required Bessel function implemented)
+
+    WARNING: PyTorch appears to have an issue with gradients wrt the R1 parameter. Optimization doesn't seem to work for this model (maybe try scipy optimize?).
+
+    The diffraction pattern is described exactly by the configuration of the
+    lens system under the assumption that all elements are perfect. This
+    expression goes as:
 
     .. math::
 
@@ -27,50 +28,46 @@ class AiryPSF(RadialMixin, PSFModel):
 
        x = ka\\sin(\\theta) = \\frac{2\\pi a r}{\\lambda R}
 
-    where :math:`I(\\theta)` is the intensity as a function of the
-    angular position within the diffraction system along its main
-    axis, :math:`I_0` is the central intensity of the airy disk,
-    :math:`J_1` is the Bessel function of the first kind of order one,
-    :math:`k = \\frac{2\\pi}{\\lambda}` is the wavenumber of the
-    light, :math:`a` is the aperture radius, :math:`r` is the radial
-    position from the center of the pattern, :math:`R` is the distance
+    where :math:`I(\\theta)` is the intensity as a function of the angular
+    position within the diffraction system along its main axis, :math:`I_0` is
+    the central intensity of the airy disk, :math:`J_1` is the Bessel function
+    of the first kind of order one, :math:`k = \\frac{2\\pi}{\\lambda}` is the
+    wavenumber of the light, :math:`a` is the aperture radius, :math:`r` is the
+    radial position from the center of the pattern, :math:`R` is the distance
     from the circular aperture to the observation plane.
 
-    In the ``Airy_PSF`` class we combine the parameters
-    :math:`a,R,\\lambda` into a single ratio to be optimized (or fixed
-    by the optical configuration).
+    In the ``AiryPSF`` class we combine the parameters :math:`a,R,\\lambda` and
+    scale based on the first zero of the :math:`J_1` function. This way you can
+    work with the more intuitive radius parameter.
 
-    :param I0: The central intensity of the airy disk in flux/arcsec^2.
-    :param aRL: The ratio of the aperture radius to the
-        product of the wavelength and the distance from the aperture to the
-        observation plane, :math:`\\frac{a}{R \\lambda}`.
+    :param I0: The central intensity of the airy disk in flux/pix^2.
+    :param R1: The radius of the first zero of the airy disk in pix.
 
     """
 
     _model_type = "airy"
     _parameter_specs = {
         "I0": {
-            "units": "flux/arcsec^2",
+            "units": "flux/pix^2",
             "value": 1.0,
             "shape": (),
             "dynamic": False,
-            "description": "The central intensity of the airy disk in flux/arcsec^2.",
+            "description": "The central intensity of the airy disk in flux/pix^2.",
         },
-        "aRL": {
-            "units": "a/(R lambda)",
+        "R1": {
+            "units": "pix",
             "shape": (),
             "dynamic": True,
-            "description": "The ratio of the aperture radius to the product of the wavelength and the distance from the aperture to the observation plane.",
+            "description": "The radius of the first zero of the airy disk in pix.",
         },
     }
     usable = True
 
-    @torch.no_grad()
     @ignore_numpy_warnings
     def initialize(self):
         super().initialize()
 
-        if self.I0.initialized and self.aRL.initialized:
+        if self.I0.initialized and self.R1.initialized:
             return
         icenter = self.target.targpixel_to_mypixel(*self.center.value)
 
@@ -80,10 +77,10 @@ def initialize(self):
                 int(icenter[1]) - 2 : int(icenter[1]) + 2,
             ]
             self.I0.value = backend.mean(mid_chunk) / self.target.upsample**2
-        if not self.aRL.initialized:
-            self.aRL.value = (5.0 / 8.0) * 2
+        if not self.R1.initialized:
+            self.R1.value = 2.0
 
     @forward
-    def radial_model(self, R: ArrayLike, I0: ArrayLike, aRL: ArrayLike) -> ArrayLike:
-        x = 2 * np.pi * aRL * R
+    def radial_model(self, R: ArrayLike, I0: ArrayLike, R1: ArrayLike) -> ArrayLike:
+        x = 3.8317 * R / R1
         return I0 * (2 * backend.bessel_j1(x) / x) ** 2
diff --git a/astrophot/models/basis_psf.py b/astrophot/models/basis_psf.py
index 71ef9fea..85886474 100644
--- a/astrophot/models/basis_psf.py
+++ b/astrophot/models/basis_psf.py
@@ -16,13 +16,28 @@
 
 @combine_docstrings
 class PixelBasisPSF(PSFModel):
-    """point source model which uses multiple images as a basis for the
-    PSF as its representation for point sources. Using bilinear interpolation it
-    will shift the PSF within a pixel to accurately represent the center
-    location of a point source. There is no functional form for this object type
-    as any image can be supplied. Bilinear interpolation is very fast and
-    accurate for smooth models, so it is possible to do the expensive
-    interpolation before optimization and save time.
+    """A point source defined by a linear combination of basis images.
+
+    Point source model which uses multiple images as a basis for the PSF as its
+    representation for point sources. Using bilinear interpolation it will shift
+    the PSF within a pixel to accurately represent the center location of a
+    point source. There is no functional form for this object type as any image
+    can be supplied. Bilinear interpolation is very fast and accurate for smooth
+    models, so it is possible to do the expensive interpolation before
+    optimization and save time.
+
+    The initialization of the weights is currently done by setting random
+    values. This almost certainly produces a bad initial model. You may either
+    set weights manually, or use a fitting step to get good starting weights.
+
+    Note: The resulting PSF from the combined basis set will be normalized
+        before being used as a PSF model, so the sum of the `weights` does not
+        need to be restricted to any particular value.
+
+    Note: It is possible for the basis elements to combine to give a PSF model
+        that is negative in some areas. This is likely not desired, if this is a
+        concern then use a non-negative basis and set the valid range of the
+        weights to be `(0, None)`.
 
     :param weights: The weights of the basis set of images in units of flux.
     """
@@ -67,9 +82,15 @@ def initialize(self):
             self.basis = func.zernike_basis(order, N) / self.target.pixel_area
 
         if not self.weights.initialized:
-            w = np.zeros(self.basis.shape[0])
-            w[0] = 1.0
-            self.weights.value = w
+            w = backend.as_array(
+                1 / np.arange(1, self.basis.shape[0] + 1),
+                dtype=config.DTYPE,
+                device=config.DEVICE,
+            )
+            scale = backend.mean(self.target[self.window].data) / backend.mean(
+                backend.sum(w[:, None, None] * self.basis, dim=0)
+            )
+            self.weights.value = w * scale
 
     @forward
     def brightness(self, x: ArrayLike, y: ArrayLike, weights: ArrayLike) -> ArrayLike:
diff --git a/astrophot/models/batch_model_object.py b/astrophot/models/batch_model_object.py
index 273aa5a1..da8894c8 100644
--- a/astrophot/models/batch_model_object.py
+++ b/astrophot/models/batch_model_object.py
@@ -19,9 +19,17 @@ class BatchModel(GradMixin, SampleMixin, Model):
     model). If you want to model the same object in multiple images, see the
     BatchSceneModel instead.
 
+    Once placed in a BatchModel, parameters for the base Model may now be given
+    values with an extra dimension. This extra dimension is the batch dimension
+    that will be vectorized over. For example, with a Gaussian Model, the
+    `sigma` parameter is normally a single value scalar. You may now set it with
+    a vector of values, and the length of that vector determines how many
+    Gaussians the BatchModel will generate. Of course, every dynamic parameter
+    that is batched must have the same size for its batch dimension.
+
     **Note:** any model parameters that you wish to batch over must be set to
-        dynamic=True. See [caskade hierarchical
-        models](https://caskade.readthedocs.io/en/latest/notebooks/HierarchicalModels.html)
+        dynamic=True. See `caskade hierarchical models
+        <https://caskade.readthedocs.io/en/latest/notebooks/HierarchicalModels.html>`_
         for more details.
     """
 
diff --git a/astrophot/models/bilinear_sky.py b/astrophot/models/bilinear_sky.py
index 2d24bfc8..3ea88914 100644
--- a/astrophot/models/bilinear_sky.py
+++ b/astrophot/models/bilinear_sky.py
@@ -1,6 +1,5 @@
 from typing import Tuple
 import numpy as np
-import torch
 
 from .sky_model_object import SkyModel
 from ..utils.decorators import ignore_numpy_warnings, combine_docstrings
@@ -10,13 +9,19 @@
 from . import func
 from ..utils.initialize import polar_decomposition
 
-__all__ = ["BilinearSky"]
+__all__ = ("BilinearSky",)
 
 
 @combine_docstrings
 class BilinearSky(SkyModel):
     """Sky background model using a coarse bilinear grid for the sky flux.
 
+    This allows for modelling more complex sky surfaces, such as dust or
+    galactic cirrus, without needing to specify a functional form. It is
+    possible to specify a position angle and grid scale to control how it is
+    oriented relative to the model target. By default it will just align with
+    the image.
+
     :param I: sky brightness grid
     :param PA: position angle of the sky grid in radians.
     :param scale: scale of the sky grid in arcseconds per grid unit.
@@ -34,13 +39,13 @@ class BilinearSky(SkyModel):
         "PA": {
             "units": "radians",
             "shape": (),
-            "dynamic": True,
+            "dynamic": False,
             "description": "position angle of the sky grid in radians",
         },
         "scale": {
             "units": "arcsec/grid-unit",
             "shape": (),
-            "dynamic": True,
+            "dynamic": False,
             "description": "scale of the sky grid in arcseconds per grid unit",
         },
     }
@@ -51,7 +56,6 @@ def __init__(self, *args, nodes: Tuple[int, int] = (3, 3), **kwargs):
         super().__init__(*args, **kwargs)
         self.nodes = nodes
 
-    @torch.no_grad()
     @ignore_numpy_warnings
     def initialize(self):
         super().initialize()
@@ -59,20 +63,20 @@ def initialize(self):
         if self.I.initialized:
             self.nodes = tuple(self.I.value.shape)
 
+        target_area = self.target[self.window]
         if not self.PA.initialized:
             R, _ = polar_decomposition(self.target.CD.npvalue)
             self.PA.value = np.arccos(np.abs(R[0, 0]))
         if not self.scale.initialized:
             self.scale.value = (
-                self.target.pixelscale.item() * self.target._data.shape[0] / self.nodes[0]
+                self.target.pixelscale.item() * target_area._data.shape[0] / self.nodes[0]
             )
 
         if self.I.initialized:
             return
 
-        target_dat = self.target[self.window]
-        dat = backend.to_numpy(target_dat._data).copy()
-        mask = backend.to_numpy(target_dat._mask).copy()
+        dat = backend.to_numpy(target_area._data).copy()
+        mask = backend.to_numpy(target_area._mask).copy()
         dat[mask] = np.nanmedian(dat)
         iS = dat.shape[0] // self.nodes[0]
         jS = dat.shape[1] // self.nodes[1]
diff --git a/astrophot/models/flatsky.py b/astrophot/models/flatsky.py
index 5c2c19a1..d418b746 100644
--- a/astrophot/models/flatsky.py
+++ b/astrophot/models/flatsky.py
@@ -14,7 +14,7 @@ class FlatSky(SkyModel):
     """Model for the sky background in which all values across the image
     are the same.
 
-    :param I0: brightness for the sky, represented as the log of the brightness over pixel scale squared, this is proportional to a surface brightness
+    :param I0: brightness for the sky in flux/arcsec^2
 
     """
 
@@ -24,7 +24,7 @@ class FlatSky(SkyModel):
             "units": "flux/arcsec^2",
             "shape": (),
             "dynamic": True,
-            "description": "brightness for the sky, proportional to a surface brightness",
+            "description": "brightness for the sky in flux/arcsec^2",
         }
     }
     usable = True
diff --git a/astrophot/models/func/gaussian.py b/astrophot/models/func/gaussian.py
index 7a4085e1..ea48d435 100644
--- a/astrophot/models/func/gaussian.py
+++ b/astrophot/models/func/gaussian.py
@@ -1,9 +1,6 @@
-import torch
 from ...backend_obj import backend, ArrayLike
 import numpy as np
 
-sq_2pi = np.sqrt(2 * np.pi)
-
 
 def gaussian(R: ArrayLike, sigma: ArrayLike, flux: ArrayLike) -> ArrayLike:
     """Gaussian 1d profile function, specifically designed for pytorch
@@ -12,6 +9,6 @@ def gaussian(R: ArrayLike, sigma: ArrayLike, flux: ArrayLike) -> ArrayLike:
     **Args:**
     -  `R`: Radii tensor at which to evaluate the gaussian function
     -  `sigma`: Standard deviation of the gaussian in the same units as R
-    -  `flux`: Central surface density
+    -  `flux`: Total flux of the Gaussian
     """
-    return (flux / (sq_2pi * sigma)) * backend.exp(-0.5 * (R / sigma) ** 2)
+    return (flux / (2 * np.pi * sigma**2)) * backend.exp(-0.5 * (R / sigma) ** 2)
diff --git a/astrophot/models/gaussian_ellipsoid.py b/astrophot/models/gaussian_ellipsoid.py
index 6399e599..a86b7de7 100644
--- a/astrophot/models/gaussian_ellipsoid.py
+++ b/astrophot/models/gaussian_ellipsoid.py
@@ -1,4 +1,3 @@
-import torch
 import numpy as np
 
 from .model_object import ComponentModel
@@ -104,7 +103,6 @@ class GaussianEllipsoid(ComponentModel):
     }
     usable = True
 
-    @torch.no_grad()
     @ignore_numpy_warnings
     def initialize(self):
         super().initialize()
diff --git a/astrophot/models/group_psf_model.py b/astrophot/models/group_psf_model.py
index 92b48bcf..0930bfaf 100644
--- a/astrophot/models/group_psf_model.py
+++ b/astrophot/models/group_psf_model.py
@@ -8,14 +8,23 @@
 
 class PSFGroupModel(GroupModel):
     """
-    A group of PSF models. Behaves similarly to a `GroupModel`, but specifically designed for PSF models.
+    A group of PSF models. Behaves similarly to a `GroupModel`, but specifically
+    designed for PSF models. Note that there is no concept of a PSFImageList, so
+    they always represent a single PSF model.
+
+    When sampling, a PSFGroupModel tells each sub-PSFModel (including nested
+    sub-PSFGroupModels) to sample without normalization. This way they can fit
+    with relative strengths. The final top-level PSFGroupModel will normalize
+    the resulting PSF, so that the image that gets passed to the regular model
+    objects for the purpose of convolution is always normalized. This means that
+    the sub-PSFModels in a PSFGroupModel should have their brightness parameters
+    (i.e., `I0` for the MoffatPSF) set to dynamic so they can participate in the
+    fit. Though this is not strictly a requirement (say you already know the
+    relative brightnesses).
     """
 
     _model_type = "psf"
     usable = True
-    normalize_psf = True
-
-    _options = ("normalize_psf",)
 
     @property
     def target(self):
@@ -36,15 +45,23 @@ def target(self, target):
         self._target = target
 
     @forward
-    def sample(self, *args, **kwargs):
+    def sample(self, normalize_psf=True) -> PSFImage:
         """Sample the PSF group model on the target image."""
         image = self.target.model_image(self.window)
 
         for model in self.models:
-            model_image = model()
+            model_image = model(normalize_psf=False)
             self._ensure_vmap_compatible(image, model_image)
             image += model_image
 
-        if self.normalize_psf:
+        if normalize_psf:
             image.normalize()
         return image
+
+    @forward
+    def __call__(
+        self,
+        normalize_psf=True,
+    ) -> PSFImage:
+
+        return self.sample(normalize_psf=normalize_psf)
diff --git a/astrophot/models/mixins/exponential.py b/astrophot/models/mixins/exponential.py
index f7bf8d83..1a14848d 100644
--- a/astrophot/models/mixins/exponential.py
+++ b/astrophot/models/mixins/exponential.py
@@ -25,8 +25,8 @@ class ExponentialMixin:
     Ie is the brightness at the effective radius, and Re is the effective
     radius. :math:`b_1` is a constant that ensures :math:`I_e` is the brightness at :math:`R_e`.
 
-    :param Re: effective radius in arcseconds
-    :param Ie: effective surface density in flux/arcsec^2
+    :param Re: effective radius, radius enclosing half the total light
+    :param Ie: effective surface density, brightness at the effective radius
     """
 
     _model_type = "exponential"
@@ -36,14 +36,14 @@ class ExponentialMixin:
             "valid": (0, None),
             "shape": (),
             "dynamic": True,
-            "description": "effective radius in arcseconds",
+            "description": "effective radius, radius enclosing half the total light",
         },
         "Ie": {
             "units": "flux/arcsec^2",
             "valid": (0, None),
             "shape": (),
             "dynamic": True,
-            "description": "effective surface density in flux/arcsec^2",
+            "description": "effective surface density, brightness at the effective radius",
         },
     }
 
@@ -81,8 +81,8 @@ class iExponentialMixin:
     ``Re`` and ``Ie`` are batched by their first dimension, allowing for multiple
     exponential profiles to be defined at once.
 
-    :param Re: effective radius in arcseconds
-    :param Ie: effective surface density in flux/arcsec^2
+    :param Re: effective radius, radius enclosing half the total light
+    :param Ie: effective surface density, brightness at the effective radius
     """
 
     _model_type = "exponential"
@@ -92,14 +92,14 @@ class iExponentialMixin:
             "valid": (0, None),
             "shape": (None,),
             "dynamic": True,
-            "description": "effective radius in arcseconds",
+            "description": "effective radius, radius enclosing half the total light",
         },
         "Ie": {
             "units": "flux/arcsec^2",
             "valid": (0, None),
             "shape": (None,),
             "dynamic": True,
-            "description": "effective surface density in flux/arcsec^2",
+            "description": "effective surface density, brightness at the effective radius",
         },
     }
 
@@ -135,8 +135,8 @@ class ExponentialPSFMixin:
     Ie is the brightness at the effective radius, and Re is the effective
     radius. :math:`b_1` is a constant that ensures :math:`I_e` is the brightness at :math:`R_e`.
 
-    :param Re: effective radius in pixels
-    :param Ie: effective surface density in flux/pix^2
+    :param Re: effective radius, radius enclosing half the total light
+    :param Ie: effective surface density, brightness at the effective radius
     """
 
     _model_type = "exponential"
@@ -146,7 +146,7 @@ class ExponentialPSFMixin:
             "valid": (0, None),
             "shape": (),
             "dynamic": True,
-            "description": "effective radius in pixels",
+            "description": "effective radius, radius enclosing half the total light",
         },
         "Ie": {
             "units": "flux/pix^2",
@@ -154,7 +154,7 @@ class ExponentialPSFMixin:
             "shape": (),
             "dynamic": False,
             "value": 1.0,
-            "description": "effective surface density in flux/pix^2",
+            "description": "effective surface density, brightness at the effective radius",
         },
     }
 
diff --git a/astrophot/models/mixins/gaussian.py b/astrophot/models/mixins/gaussian.py
index 75712a5a..bef13d68 100644
--- a/astrophot/models/mixins/gaussian.py
+++ b/astrophot/models/mixins/gaussian.py
@@ -20,13 +20,13 @@ class GaussianMixin:
 
     .. math::
 
-       I(R) = \\frac{{\\rm flux}}{\\sqrt{2\\pi}\\sigma} \\exp(-R^2 / (2 \\sigma^2))
+       I(R) = \\frac{{\\rm flux}}{2\\pi\\sigma^2} \\exp(-R^2 / (2 \\sigma^2))
 
-    where ``I_0`` is the intensity at the center of the profile and ``sigma`` is the
+    where ``flux`` is the total flux of the profile and ``sigma`` is the
     standard deviation which controls the width of the profile.
 
     :param sigma: Standard deviation of the Gaussian profile in arcseconds.
-    :param flux: Total flux of the Gaussian profile.
+    :param flux: Total flux of an unperturbed Gaussian profile (use model.total_flux() for general results).
     """
 
     _model_type = "gaussian"
@@ -43,7 +43,7 @@ class GaussianMixin:
             "valid": (0, None),
             "shape": (),
             "dynamic": True,
-            "description": "Total flux of the Gaussian profile.",
+            "description": "Total flux of an unperturbed Gaussian profile (use model.total_flux() for general results).",
         },
     }
 
@@ -73,7 +73,7 @@ class iGaussianMixin:
 
     .. math::
 
-       I(R) = \\frac{{\\rm flux}}{\\sqrt{2\\pi}\\sigma} \\exp(-R^2 / (2 \\sigma^2))
+       I(R) = \\frac{{\\rm flux}}{2\\pi\\sigma^2} \\exp(-R^2 / (2 \\sigma^2))
 
     where ``sigma`` is the standard deviation which controls the width of the
     profile and ``flux`` gives the total flux of the profile (assuming no
@@ -83,7 +83,7 @@ class iGaussianMixin:
     multiple Gaussian profiles to be defined at once.
 
     :param sigma: Standard deviation of the Gaussian profile in arcseconds.
-    :param flux: Total flux of the Gaussian profile.
+    :param flux: Total flux of an unperturbed Gaussian profile (use model.total_flux() for general results).
     """
 
     _model_type = "gaussian"
@@ -100,7 +100,7 @@ class iGaussianMixin:
             "valid": (0, None),
             "shape": (None,),
             "dynamic": True,
-            "description": "Total flux of the Gaussian profile.",
+            "description": "Total flux of an unperturbed Gaussian profile (use model.total_flux() for general results).",
         },
     }
 
@@ -131,13 +131,13 @@ class GaussianPSFMixin:
 
     .. math::
 
-       I(R) = \\frac{{\\rm flux}}{\\sqrt{2\\pi}\\sigma} \\exp(-R^2 / (2 \\sigma^2))
+       I(R) = \\frac{{\\rm flux}}{2\\pi\\sigma^2} \\exp(-R^2 / (2 \\sigma^2))
 
-    where ``I_0`` is the intensity at the center of the profile and ``sigma`` is the
+    where ``flux`` is the total flux of the profile and ``sigma`` is the
     standard deviation which controls the width of the profile.
 
     :param sigma: Standard deviation of the Gaussian profile in pixels.
-    :param flux: Total flux of the Gaussian profile.
+    :param flux: Total flux of an unperturbed Gaussian profile (use model.total_flux() for general results).
     """
 
     _model_type = "gaussian"
@@ -155,7 +155,7 @@ class GaussianPSFMixin:
             "shape": (),
             "dynamic": False,
             "value": 1.0,
-            "description": "Total flux of the Gaussian profile.",
+            "description": "Total flux of an unperturbed Gaussian profile (use model.total_flux() for general results).",
         },
     }
 
diff --git a/astrophot/models/mixins/sample.py b/astrophot/models/mixins/sample.py
index f3448b0c..5a99be67 100644
--- a/astrophot/models/mixins/sample.py
+++ b/astrophot/models/mixins/sample.py
@@ -13,12 +13,65 @@
 
 class SampleMixin:
     """
-    :param sampling_mode: The method used to sample the model in image pixels. Options are: `auto`: Automatically choose the sampling method based on the image size (default). `midpoint`: Use midpoint sampling, evaluate the brightness at the center of each pixel. `simpsons`: Use Simpson's rule for sampling integrating each pixel. `upsample:x` upsample the pixel in a regular grid of size x (odd positive integer), generally less accurate than quad:x. `quad:x`: Use quadrature sampling with order x, where x is an odd positive integer to integrate each pixel.
-    :param integrate_mode: The method used to select pixels to integrate further where the model varies significantly. Options are: `none`: No extra integration is performed (beyond the sampling_mode). `bright`: Select the brightest pixels for further integration (default). `threshold`: Select pixels which show signs of significant higher order derivatives.
-    :param integrate_fraction: The fraction of the pixels to super sample during integration (default: 0.05).
-    :param integrate_max_depth: The maximum depth of the integration method (default: 2).
-    :param integrate_gridding: The gridding used for the integration method to super-sample a pixel at each iteration (default: 5).
-    :param integrate_quad_order: The order of the quadrature used for the integration method on the super sampled pixels (default: 3).
+    Methods for integrating the model from a smooth model defined in the tangent
+    plane into individual pixel fluxes. This is done in a two step process.
+    First the model is sampled at a set of points within each pixel, and then an
+    adaptive integration method is used to further integrate pixels where it has
+    identified the need for additional accuracy.
+
+    The `sampling_mode` option controls this first step. It determines at what
+    level of depth every pixel is integrated. The midpoint option is the least
+    accurate (and fastest) which just samples the center of each pixel. After
+    that, each method trades more compute for more accuracy. The `quad:x` method
+    is the most accurate, which uses Gaussian quadrature integration with x
+    points per pixel. Note that `quad:5` means that each pixel will be sampled
+    at 25 points (5^2) to determine the flux in that pixel. `simpsons` is often
+    a good middle ground. Note that for models over a small number of pixels you
+    will likely not notice the runtime difference between midpoint and some
+    higher accuracy method, since other aspects of the fitting process also take
+    up some time.
+
+    The `integrate_mode` option controls the second step, which is an adaptive
+    integration method that identifies and integrates pixels where the model
+    needs extra accuracy. The default method is `bright`, which identifies the
+    brightest pixels and then uses quadrature integration to further integrate
+    those pixels. The default parameters are to recursively integrate the
+    brightest 5% of pixels up to a maximum depth of 2 recursive levels. Each
+    level does a 5x upsampling and then uses 3rd order quadrature to integrate
+    the super sampled pixels. This means that the most highly integrated pixels
+    will be 5x upsampled twice and the 3x sampled for the quadrature,
+    effectively like upsampling 75 times the starting resolution for those
+    pixels, but only for 0.25% of the pixels. Doing this roughly doubles the
+    amount of compute needed to sample an image relative to midpoint sampling,
+    but gives a massive boost in accuracy for models which change rapidly across
+    a pixel.
+
+    Note: JAX does not play nicely with the adaptive integration methods, so it
+        massively slows down the jit compilation and the final sampling speed.
+        With JAX it is generally better to set `integrate_mode` to `none` and use
+        a higher accuracy `sampling_mode` such as `quad:5`.
+
+    :param sampling_mode: The method used to sample the model in image pixels.
+        Options are: `auto`: Automatically choose the sampling method based on
+        the image size (default). `midpoint`: Use midpoint sampling, evaluate
+        the brightness at the center of each pixel. `simpsons`: Use Simpson's
+        rule for sampling integrating each pixel. `upsample:x` upsample the
+        pixel in a regular grid of size x (odd positive integer), generally less
+        accurate than quad:x. `quad:x`: Use quadrature sampling with order x,
+        where x is an odd positive integer to integrate each pixel.
+    :param integrate_mode: The method used to select pixels to integrate further
+        where the model varies significantly. Options are: `none`: No extra
+        integration is performed (beyond the sampling_mode). `bright`: Select
+        the brightest pixels for further integration (default). `curvature`:
+        Select pixels which show signs of significant higher order derivatives.
+    :param integrate_fraction: The fraction of the pixels to super sample during
+        integration (default: 0.05).
+    :param integrate_max_depth: The maximum depth of the integration method
+        (default: 2).
+    :param integrate_gridding: The gridding used for the integration method to
+        super-sample a pixel at each iteration (default: 5).
+    :param integrate_quad_order: The order of the quadrature used for the
+        integration method on the super sampled pixels (default: 3).
     """
 
     integrate_fraction = 0.05  # fraction of the pixels to super sample
diff --git a/astrophot/models/mixins/sersic.py b/astrophot/models/mixins/sersic.py
index 93607048..d4431bc2 100644
--- a/astrophot/models/mixins/sersic.py
+++ b/astrophot/models/mixins/sersic.py
@@ -29,7 +29,7 @@ class SersicMixin:
     ``n=0.5`` being a Gaussian profile.
 
     :param n: Sersic index which controls the shape of the brightness profile
-    :param Re: half light radius [arcsec]
+    :param Re: half light radius, also called effective radius [arcsec]
     :param Ie: intensity at the half light radius [flux/arcsec^2]
     """
 
@@ -47,7 +47,7 @@ class SersicMixin:
             "valid": (0, None),
             "shape": (),
             "dynamic": True,
-            "description": "half light radius [arcsec]",
+            "description": "half light radius, also called effective radius [arcsec]",
         },
         "Ie": {
             "units": "flux/arcsec^2",
@@ -92,7 +92,7 @@ class iSersicMixin:
     multiple Sersic profiles to be defined at once.
 
     :param n: Sersic index which controls the shape of the brightness profile
-    :param Re: half light radius [arcsec]
+    :param Re: half light radius, also called effective radius [arcsec]
     :param Ie: intensity at the half light radius [flux/arcsec^2]
     """
 
@@ -110,14 +110,14 @@ class iSersicMixin:
             "valid": (0, None),
             "shape": (None,),
             "dynamic": True,
-            "description": "half light radius [arcsec]",
+            "description": "half light radius, also called effective radius",
         },
         "Ie": {
             "units": "flux/arcsec^2",
             "valid": (0, None),
             "shape": (None,),
             "dynamic": True,
-            "description": "intensity at the half light radius [flux/arcsec^2]",
+            "description": "intensity at the half light radius, also called effective intensity",
         },
     }
 
@@ -160,8 +160,8 @@ class SersicPSFMixin:
     ``n=0.5`` being a Gaussian profile.
 
     :param n: Sersic index which controls the shape of the brightness profile
-    :param Re: half light radius [pix]
-    :param Ie: intensity at the half light radius [flux/pix^2]
+    :param Re: half light radius, also called effective radius [pix]
+    :param Ie: intensity at the half light radius, also called effective intensity [flux/pix^2]
     """
 
     _model_type = "sersic"
@@ -178,7 +178,7 @@ class SersicPSFMixin:
             "valid": (0, None),
             "shape": (),
             "dynamic": True,
-            "description": "half light radius [pix]",
+            "description": "half light radius, also called effective radius",
         },
         "Ie": {
             "units": "flux/pix^2",
@@ -186,7 +186,7 @@ class SersicPSFMixin:
             "shape": (),
             "dynamic": False,
             "value": 1.0,
-            "description": "intensity at the half light radius [flux/pix^2]",
+            "description": "intensity at the half light radius, also called effective intensity",
         },
     }
 
diff --git a/astrophot/models/mixins/spline.py b/astrophot/models/mixins/spline.py
index d6c59e5b..25efa6d9 100644
--- a/astrophot/models/mixins/spline.py
+++ b/astrophot/models/mixins/spline.py
@@ -17,7 +17,7 @@ class SplineMixin:
     that contains the radial profile of the brightness in units of
     flux/arcsec^2. The radius of each node is determined from ``I_R.prof``.
 
-    :param I_R: Tensor of radial brightness values in units of flux/arcsec^2.
+    :param I_R: Array of radial brightness values in units of flux/arcsec^2.
     """
 
     _model_type = "spline"
@@ -27,7 +27,7 @@ class SplineMixin:
             "valid": (0, None),
             "shape": (None,),
             "dynamic": True,
-            "description": "Tensor of radial brightness values in units of flux/arcsec^2.",
+            "description": "Array of radial brightness values in units of flux/arcsec^2.",
         }
     }
 
@@ -74,7 +74,7 @@ class iSplineMixin:
     multiple spline profiles to be defined at once. Each individual spline model
     is then ``I_R[i]`` and ``I_R.prof[i]`` where ``i`` indexes the profiles.
 
-    :param I_R: Tensor of radial brightness values in units of flux/arcsec^2.
+    :param I_R: Array of radial brightness values in units of flux/arcsec^2.
     """
 
     _model_type = "spline"
@@ -84,7 +84,7 @@ class iSplineMixin:
             "valid": (0, None),
             "shape": (None, None),
             "dynamic": True,
-            "description": "Tensor of radial brightness values in units of flux/arcsec^2.",
+            "description": "Array of radial brightness values in units of flux/arcsec^2.",
         }
     }
 
@@ -138,7 +138,7 @@ class SplinePSFMixin:
     that contains the radial profile of the brightness in units of
     flux/pix^2. The radius of each node is determined from ``I_R.prof``.
 
-    :param I_R: Tensor of radial brightness values in units of flux/pix^2.
+    :param I_R: Array of radial brightness values in units of flux/pix^2.
     """
 
     _model_type = "spline"
@@ -148,7 +148,7 @@ class SplinePSFMixin:
             "valid": (0, None),
             "shape": (None,),
             "dynamic": True,
-            "description": "Tensor of radial brightness values in units of flux/pix^2.",
+            "description": "Array of radial brightness values in units of flux/pix^2.",
         }
     }
 
diff --git a/astrophot/models/mixins/transform.py b/astrophot/models/mixins/transform.py
index 35ce6a42..71f4a0ad 100644
--- a/astrophot/models/mixins/transform.py
+++ b/astrophot/models/mixins/transform.py
@@ -176,11 +176,11 @@ class FourierEllipseMixin:
     should consider carefully why the Fourier modes are being used for the
     science case at hand.
 
-    :param am: Tensor of amplitudes for the Fourier modes, indicates the strength
+    :param am: Array of amplitudes for the Fourier modes, indicates the strength
             of each mode.
-    :param phim: Tensor of phases for the Fourier modes, adjusts the
+    :param phim: Array of phases for the Fourier modes, adjusts the
             orientation of the mode perturbation relative to the major axis. It
-            is cyclically defined in the range [0,2pi)
+            is cyclically defined in the range [0,2pi/m)
     :param modes: Tuple of integers indicating which Fourier modes to use.
     """
 
@@ -190,7 +190,7 @@ class FourierEllipseMixin:
             "units": "none",
             "shape": (None,),
             "dynamic": True,
-            "description": "Tensor of amplitudes for the Fourier modes, indicates the strength of each mode.",
+            "description": "Array of amplitudes for the Fourier modes, indicates the strength of each mode.",
         },
         "phim": {
             "units": "radians",
@@ -198,7 +198,7 @@ class FourierEllipseMixin:
             "cyclic": True,
             "shape": (None,),
             "dynamic": False,
-            "description": "Tensor of phases for the Fourier modes, adjusts the orientation of the mode perturbation relative to the major axis.",
+            "description": "Array of phases for the Fourier modes, adjusts the orientation of the mode perturbation relative to the major axis.",
         },
     }
     _options = ("modes",)
@@ -230,6 +230,7 @@ def initialize(self):
             self.am.value = np.zeros(len(self.modes)) + 0.0001
         if not self.phim.initialized:
             self.phim.value = np.zeros(len(self.modes)) + 0.0001
+            self.phim.valid = (np.zeros(len(self.modes)), 2 * np.pi / backend.to_numpy(self.modes))
 
 
 class WarpMixin:
@@ -258,8 +259,8 @@ class WarpMixin:
     original coordinates X, Y. This is achieved by making PA and q a spline
     profile.
 
-    :param q_R: Tensor of axis ratio values for axis ratio spline
-    :param PA_R: Tensor of position angle values as input to the spline
+    :param q_R: Array of axis ratio values for axis ratio spline
+    :param PA_R: Array of position angle values as input to the spline
 
     """
 
@@ -270,7 +271,7 @@ class WarpMixin:
             "valid": (0, 1),
             "shape": (None,),
             "dynamic": True,
-            "description": "Tensor of axis ratio values for axis ratio spline",
+            "description": "Array of axis ratio values for axis ratio spline",
         },
         "PA_R": {
             "units": "radians",
@@ -278,7 +279,7 @@ class WarpMixin:
             "cyclic": True,
             "shape": (None,),
             "dynamic": True,
-            "description": "Tensor of position angle values as input to the spline",
+            "description": "Array of position angle values as input to the spline",
         },
     }
 
@@ -360,7 +361,10 @@ def initialize(self):
         super().initialize()
         if not self.Rt.initialized:
             prof = default_prof(self.window.shape, self.target.pixelscale, 2, 0.2)
-            self.Rt.value = prof[len(prof) // 2]
+            if self.outer_truncation:
+                self.Rt.value = prof[-1]
+            else:
+                self.Rt.value = prof[0]
 
     @forward
     def radial_model(self, R: ArrayLike, Rt: ArrayLike, St: ArrayLike) -> ArrayLike:
diff --git a/astrophot/models/multi_gaussian_expansion.py b/astrophot/models/multi_gaussian_expansion.py
index b6fa5808..e59dd136 100644
--- a/astrophot/models/multi_gaussian_expansion.py
+++ b/astrophot/models/multi_gaussian_expansion.py
@@ -24,7 +24,7 @@ class MultiGaussianExpansion(ComponentModel):
     where :math:`R_i` is a radius computed using :math:`q_i` and :math:`PA_i` for that component. All components share the same center.
 
     :param q: axis ratio to scale minor axis from the ratio of the minor/major axis b/a, this parameter is unitless, it is restricted to the range (0,1)
-    :param PA: position angle of the semi-major axis relative to the image positive x-axis in radians, it is a cyclic parameter in the range [0,pi)
+    :param PA: position angle of the semi-major axis East of North, it is a cyclic parameter in the range [0,pi)
     :param sigma: standard deviation of each Gaussian
     :param flux: amplitude of each Gaussian
     """
@@ -43,7 +43,7 @@ class MultiGaussianExpansion(ComponentModel):
             "valid": (0, np.pi),
             "cyclic": True,
             "dynamic": True,
-            "description": "position angle of the semi-major axis relative to the image positive x-axis in radians",
+            "description": "position angle of the semi-major axis East of North, it is a cyclic parameter in the range [0,pi)",
         },  # No shape for PA since there are two options, use with caution
         "sigma": {
             "units": "arcsec",
diff --git a/astrophot/models/pixelated_model.py b/astrophot/models/pixelated_model.py
index d4bda5ba..885031b7 100644
--- a/astrophot/models/pixelated_model.py
+++ b/astrophot/models/pixelated_model.py
@@ -1,4 +1,3 @@
-import torch
 import numpy as np
 
 from .model_object import ComponentModel
@@ -25,7 +24,7 @@ class Pixelated(ComponentModel):
     The PA and scale are also parameters of this model, so one could alternately
     fix the pixels to some image and just fit the PA and scale.
 
-    :param I: the total flux within each pixel, represented as the log of the flux.
+    :param I: the total flux within each pixel, in units of flux/arcsec^2.
     :param PA: the position angle of the model, in radians.
     :param scale: the scale of the model, in arcsec per grid unit.
 
@@ -37,7 +36,7 @@ class Pixelated(ComponentModel):
             "units": "flux/arcsec^2",
             "shape": (None, None),
             "dynamic": True,
-            "description": "the total flux within each pixel, represented as the log of the flux",
+            "description": "the total flux within each pixel, in units of flux/arcsec^2",
         },
         "PA": {
             "units": "radians",
@@ -62,7 +61,6 @@ def __init__(
         )
         self.scale = scale
 
-    @torch.no_grad()
     @ignore_numpy_warnings
     def initialize(self):
         super().initialize()
diff --git a/astrophot/models/pixelated_psf.py b/astrophot/models/pixelated_psf.py
index e270edb1..2bf33b8f 100644
--- a/astrophot/models/pixelated_psf.py
+++ b/astrophot/models/pixelated_psf.py
@@ -1,5 +1,3 @@
-import torch
-
 from .psf_model_object import PSFModel
 from ..utils.decorators import ignore_numpy_warnings, combine_docstrings
 from ..utils.interpolate import interp2d
@@ -33,7 +31,7 @@ class PixelatedPSF(PSFModel):
     (essentially just divide the pixelscale by the upsampling factor
     you used).
 
-    :param pixels: the total flux within each pixel, represented as the log of the flux.
+    :param pixels: the total flux within each pixel, in units of flux/pix^2.
 
     """
 
@@ -43,12 +41,11 @@ class PixelatedPSF(PSFModel):
             "units": "flux/pix^2",
             "shape": (None, None),
             "dynamic": True,
-            "description": "the total flux within each pixel, represented as the log of the flux",
+            "description": "the total flux within each pixel, in units of flux/pix^2",
         }
     }
     usable = True
 
-    @torch.no_grad()
     @ignore_numpy_warnings
     def initialize(self):
         super().initialize()
diff --git a/astrophot/models/planesky.py b/astrophot/models/planesky.py
index 71516cba..509ba0f7 100644
--- a/astrophot/models/planesky.py
+++ b/astrophot/models/planesky.py
@@ -1,28 +1,28 @@
 import numpy as np
-import torch
 
 from .sky_model_object import SkyModel
 from ..utils.decorators import ignore_numpy_warnings, combine_docstrings
 from ..param import forward
 from ..backend_obj import backend, ArrayLike
 
-__all__ = ["PlaneSky"]
+__all__ = ("PlaneSky",)
 
 
 @combine_docstrings
 class PlaneSky(SkyModel):
-    """Sky background model using a tilted plane for the sky flux. The brightness for each pixel is defined as:
+    """Sky background model using a tilted plane for the sky flux. The
+    brightness for each pixel is defined as:
 
     .. math::
 
        I(X, Y) = I_0 + X*\\delta_x + Y*\\delta_y
 
-    where :math:`I(X,Y)` is the brightness as a function of image position :math:`X, Y`,
-    :math:`I_0` is the central sky brightness value, and :math:`\\delta_x, \\delta_y` are the slopes of
-    the sky brightness plane.
+    where :math:`I(X,Y)` is the brightness as a function of image position
+    :math:`X, Y`, :math:`I_0` is the central sky brightness value, and
+    :math:`\\delta_x, \\delta_y` are the slopes of the sky brightness plane.
 
     :param I0: central sky brightness value
-    :param delta: Tensor for slope of the sky brightness in each image dimension
+    :param delta: An array for slope of the sky brightness in each image dimension
 
     """
 
@@ -38,12 +38,11 @@ class PlaneSky(SkyModel):
             "units": "flux/arcsec",
             "shape": (2,),
             "dynamic": True,
-            "description": "Tensor for slope of the sky brightness in each image dimension",
+            "description": "An array for slope of the sky brightness in each image dimension",
         },
     }
     usable = True
 
-    @torch.no_grad()
     @ignore_numpy_warnings
     def initialize(self):
         super().initialize()
diff --git a/astrophot/models/point_source.py b/astrophot/models/point_source.py
index 6e7cf3ad..6467e403 100644
--- a/astrophot/models/point_source.py
+++ b/astrophot/models/point_source.py
@@ -1,14 +1,10 @@
-from typing import Optional
-
-import torch
 import numpy as np
 
 from .base import Model
 from .model_object import ComponentModel
-from ..image import ModelImage
 from ..utils.decorators import ignore_numpy_warnings, combine_docstrings
 from ..utils.interpolate import interp2d
-from ..image import Window, PSFImage
+from ..image import PSFImage
 from ..errors import SpecificationConflict
 from ..param import forward
 from ..backend_obj import backend, ArrayLike
@@ -21,10 +17,9 @@
 @combine_docstrings
 class PointSource(ComponentModel):
     """Describes a point source in the image, this is a delta function at
-    some position in the sky. This is typically used to describe
-    stars, supernovae, very small galaxies, quasars, asteroids or any
-    other object which can essentially be entirely described by a
-    position and total flux (no structure).
+    some position in the sky. This is typically used to describe stars,
+    supernovae, quasars, asteroids or any other object which can essentially be
+    entirely described by a position and total flux (no structure).
 
     :param flux: The total flux of the point source
 
@@ -46,7 +41,6 @@ class PointSource(ComponentModel):
     def __init__(self, *args, integrate_mode="none", **kwargs):
         super().__init__(*args, integrate_mode=integrate_mode, **kwargs)
 
-    @torch.no_grad()
     @ignore_numpy_warnings
     def initialize(self):
         super().initialize()
diff --git a/astrophot/models/psf_model_object.py b/astrophot/models/psf_model_object.py
index 3e7bc620..68f3ac66 100644
--- a/astrophot/models/psf_model_object.py
+++ b/astrophot/models/psf_model_object.py
@@ -11,19 +11,25 @@
 
 
 class PSFModel(GradMixin, SampleMixin, Model):
-    """Prototype point source (typically a star) model, to be subclassed
+    """Prototype point source (e.g., a star) model, to be subclassed
     by other point source models which define specific behavior.
 
-    PSF models behave differently than component models. Their target image
-    must be a ``PSFImage`` object instead of a ``TargetImage`` object.
-    PSF models do not fit a free ``center`` parameter; their center is
-    always ``(0, 0)`` in pixel coordinates, matching the convention of a
-    ``PSFImage``. A PSF model is never convolved with another PSF model.
-
-    :param center: Center of the PSF in pixel coordinates ``[x, y]``.
-        Fixed at ``(0, 0)`` by default and not included in the fit.
-    :param normalize_psf: When ``True`` (default) the sampled PSF is
-        normalised so that its total flux within the fitting window equals 1.
+    PSF models behave differently than component models. Their target image must
+    be a ``PSFImage`` object instead of a ``TargetImage`` object. PSF models do
+    not fit a free ``center`` parameter; their center is always ``(0, 0)`` in
+    pixel coordinates, matching the convention of a ``PSFImage``. A PSF model is
+    never convolved with another PSF model.
+
+    Instead of units of arcsec for most length scales, PSFModel objects use
+    `pix` units. This corresponds to the width of a pixel in the data target
+    image; so if the PSFModel has an upsample factor of 2 then `1 pix`
+    corresponds to two pixels in the image that the PSFModel outputs. This way,
+    two PSFModels with different upsample factors, but applied to the same data
+    target image, should still have the same parameter values for their shape
+    parameters.
+
+    :param center: Center of the PSF in pixel coordinates ``[x, y]``. Fixed at
+        ``(0, 0)`` by default and not included in the fit.
     """
 
     _parameter_specs = {
@@ -38,12 +44,6 @@ class PSFModel(GradMixin, SampleMixin, Model):
     _model_type = "psf"
     usable = False
 
-    # The sampled PSF will be normalized to a total flux of 1 within the window
-    normalize_psf = True
-
-    # Parameters which are treated specially by the model object and should not be updated directly when initializing
-    _options = ("normalize_psf",)
-
     def initialize(self):
         pass
 
@@ -124,10 +124,10 @@ def target(self, target):
         self._target = target
 
     @forward
-    def __call__(self) -> PSFImage:
+    def __call__(self, normalize_psf=True) -> PSFImage:
         working_image = self.target.model_image(self.window)
         i, j = self._pixel_meshgridder(self.target, self.window, 0, 1)
         working_image._data = self.sample(i, j)
-        if self.normalize_psf:
-            working_image._data = working_image._data / backend.sum(working_image._data)
+        if normalize_psf:
+            working_image.normalize()
         return working_image
diff --git a/astrophot/models/sky_model_object.py b/astrophot/models/sky_model_object.py
index 63b0dac6..8b12e2d4 100644
--- a/astrophot/models/sky_model_object.py
+++ b/astrophot/models/sky_model_object.py
@@ -1,7 +1,7 @@
 from .model_object import ComponentModel
 from ..utils.decorators import combine_docstrings
 
-__all__ = ["SkyModel"]
+__all__ = ("SkyModel",)
 
 
 @combine_docstrings
diff --git a/docs/source/tutorials/AdvancedPSFModels.ipynb b/docs/source/tutorials/AdvancedPSFModels.ipynb
index cf58558f..e9ade442 100644
--- a/docs/source/tutorials/AdvancedPSFModels.ipynb
+++ b/docs/source/tutorials/AdvancedPSFModels.ipynb
@@ -123,7 +123,6 @@
     "    I0=2,  # essentially controls relative flux of this component\n",
     "    PA=0,\n",
     "    q=0.2,\n",
-    "    normalize_psf=False,  # sub components shouldnt be individually normalized\n",
     "    target=psf_target,\n",
     ")\n",
     "psf_model2 = ap.Model(\n",
@@ -134,7 +133,6 @@
     "    Ie=1,\n",
     "    PA=np.pi / 2,\n",
     "    q=0.2,\n",
-    "    normalize_psf=False,\n",
     "    target=psf_target,\n",
     ")\n",
     "psf_group_model = ap.Model(\n",
@@ -142,7 +140,6 @@
     "    model_type=\"psf group model\",\n",
     "    target=psf_target,\n",
     "    models=[psf_model1, psf_model2],\n",
-    "    normalize_psf=True,  # group model should normalize the combined PSF\n",
     ")\n",
     "psf_group_model.initialize()\n",
     "fig, ax = plt.subplots(1, 3, figsize=(15, 5))\n",
diff --git a/docs/source/tutorials/GettingStarted.ipynb b/docs/source/tutorials/GettingStarted.ipynb
index 61e7cfc4..55cb8e4b 100644
--- a/docs/source/tutorials/GettingStarted.ipynb
+++ b/docs/source/tutorials/GettingStarted.ipynb
@@ -330,7 +330,8 @@
     "print(\"Parameter units, sersic Re:\", model3.Re.units)\n",
     "print(\"Expected parameter shape, Re:\", model3.Re.shape)\n",
     "print(\"and for center it is:\", model3.center.shape)\n",
-    "print(\"Parameter dynamic state, Re:\", model3.Re.dynamic, \"so it will be optimized by a fitter\")"
+    "print(\"Parameter dynamic state, Re:\", model3.Re.dynamic, \"so it will be optimized by a fitter\")\n",
+    "print(\"Parameter description, Re:\", model3.Re.description)"
    ]
   },
   {
diff --git a/docs/source/tutorials/ModelZoo.ipynb b/docs/source/tutorials/ModelZoo.ipynb
index 86350343..452c3d5b 100644
--- a/docs/source/tutorials/ModelZoo.ipynb
+++ b/docs/source/tutorials/ModelZoo.ipynb
@@ -299,7 +299,7 @@
    "source": [
     "M = ap.Model(\n",
     "    model_type=\"airy psf model\",\n",
-    "    aRL=1.0 / 20,\n",
+    "    R1=15.5,\n",
     "    target=psf_target,\n",
     ")\n",
     "M.initialize()\n",
diff --git a/tests/test_model.py b/tests/test_model.py
index c6cac710..19bd5698 100644
--- a/tests/test_model.py
+++ b/tests/test_model.py
@@ -142,9 +142,9 @@ def test_all_model_sample(model_type):
     ):
         pytest.skip("JAX version doesnt support these models yet, difficulty with gradients")
 
-    if any(t in model_type for t in ["warp", "fourier"]):
+    if any(t in model_type for t in ["warp", "fourier"]) and "gaussian" not in model_type:
         pytest.skip("Warp and Fourier models are complex and slow to fit, skipping for now")
-    if model_type.startswith("truncated"):
+    if model_type.startswith("truncated") and "gaussian" not in model_type:
         pytest.skip("Testing truncated models is redundant")
 
     target = make_basic_sersic()
diff --git a/tests/test_psfmodel.py b/tests/test_psfmodel.py
index b3d21349..c007122f 100644
--- a/tests/test_psfmodel.py
+++ b/tests/test_psfmodel.py
@@ -12,10 +12,8 @@
 @pytest.mark.parametrize("model_type", ap.models.PSFModel.List_Models(usable=True, types=True))
 def test_all_psfmodel_sample(model_type):
     if model_type == "airy psf model" and ap.backend.backend == "jax":
-        pytest.skip(
-            "Skipping airy psf model, JAX does not support bessel_j1 with finite derivatives it seems"
-        )
-    if any(t in model_type for t in ["warp", "fourier"]):
+        pytest.skip("Skipping airy psf model, JAX can't use airy.")
+    if any(t in model_type for t in ["warp", "fourier"]) and not "gaussian" in model_type:
         pytest.skip("Skipping warp and fourier psf models, which are slow")
 
     target = make_basic_gaussian_psf(pixelscale=0.8)
@@ -23,15 +21,9 @@ def test_all_psfmodel_sample(model_type):
         name="test_model",
         model_type=model_type,
         target=target,
-        normalize_psf=False,
     )
-    for p in MODEL.all_params:
-        if p.units in ["flux", "flux/pix^2"]:
-            p.to_dynamic(None)
     MODEL.initialize()
     for p in MODEL.all_params:
-        if p.units in ["flux", "flux/pix^2"]:
-            p.to_dynamic(p.value * 1.5)
         if p.units == "pix" and not p.name == "center":
             p.to_dynamic(p.value + 0.5)
     print(MODEL)
@@ -47,7 +39,7 @@ def test_all_psfmodel_sample(model_type):
 
     if model_type == "pixelated psf model":
         psf = ap.utils.initialize.gaussian_psf(3 * 0.8, 25, 0.8)
-        MODEL.pixels.value = psf / np.sum(psf)
+        MODEL.pixels = psf / np.sum(psf)
 
     assert ap.backend.all(
         ap.backend.isfinite(MODEL.jacobian().data)
@@ -55,9 +47,11 @@ def test_all_psfmodel_sample(model_type):
 
     res = ap.fit.LM(MODEL, max_iter=5).fit()
 
+    if model_type == "airy psf model":
+        return  # Airy gradients dont work, so just run the code to make sure no crashes
     assert len(res.loss_history) >= 2, "Optimizer must be able to find steps to improve the model"
 
-    if res.message == "success":
+    if res.message == "success" or model_type in ["nuker psf model"]:
         # Be less strict if fit succeeded quickly
         assert res.loss_history[-1] < res.loss_history[0], (
             f"Model {model_type} should fit to the target image, but did not. "

From e85be4c82bd4b37fcb78091be1991a7f0120473c Mon Sep 17 00:00:00 2001
From: "Connor Stone, PhD" <connorstone628@gmail.com>
Date: Thu, 2 Apr 2026 16:09:16 -0400
Subject: [PATCH 3/4] Add BatchLM fitter to perform many fits simultaneously
 (#298)

Batching fits gives huge compute gains. The LM fitter for the batched
version is slightly simplified to account for the fact that it is
running many fits in parallel, and so should generally be used for
simpler models. But that is exactly the use case for batching, many
small simple models to be fit at once.
---
 astrophot/fit/__init__.py              |   2 +
 astrophot/fit/batch_lm.py              | 263 +++++++++++++++++++++++++
 astrophot/fit/func/__init__.py         |   3 +-
 astrophot/fit/func/lm.py               |  84 ++++++--
 astrophot/fit/lm.py                    |   6 +-
 astrophot/models/batch_model_object.py |   6 +-
 docs/source/tutorials/ImageTypes.ipynb |  91 ++++++++-
 7 files changed, 436 insertions(+), 19 deletions(-)
 create mode 100644 astrophot/fit/batch_lm.py

diff --git a/astrophot/fit/__init__.py b/astrophot/fit/__init__.py
index b788aa7a..80553b35 100644
--- a/astrophot/fit/__init__.py
+++ b/astrophot/fit/__init__.py
@@ -1,4 +1,5 @@
 from .lm import LM, LMConstraint
+from .batch_lm import BatchLM
 from .gradient import Grad, Slalom
 from .iterative import Iter, IterParam
 from .scipy_fit import ScipyFit
@@ -10,6 +11,7 @@
 __all__ = [
     "LM",
     "LMConstraint",
+    "BatchLM",
     "Grad",
     "Iter",
     "MALA",
diff --git a/astrophot/fit/batch_lm.py b/astrophot/fit/batch_lm.py
new file mode 100644
index 00000000..f8a53470
--- /dev/null
+++ b/astrophot/fit/batch_lm.py
@@ -0,0 +1,263 @@
+import numpy as np
+from ..models import Model
+from ..image import TargetImageBatch, WindowBatch
+from .base import BaseOptimizer
+from ..backend_obj import backend, ArrayLike
+from .. import config
+from ..errors import OptimizeStopSuccess
+from ..param import ValidContext
+from . import func
+
+
+class BatchLM(BaseOptimizer):
+
+    def __init__(
+        self,
+        model: Model,
+        batch_target: TargetImageBatch,
+        batch_window: WindowBatch,
+        max_iter: int = 100,
+        relative_tolerance: float = 1e-5,
+        Lup=11.0,
+        Ldn=9.0,
+        L0=1.0,
+        max_step_iter: int = 3,
+        likelihood="gaussian",
+        **kwargs,
+    ):
+
+        super().__init__(
+            model=model,
+            initial_state=model.get_values(),
+            max_iter=max_iter,
+            relative_tolerance=relative_tolerance,
+            **kwargs,
+        )
+
+        self.max_step_iter = max_step_iter
+
+        # Likelihood
+        self.likelihood = likelihood
+        if self.likelihood not in ["gaussian", "poisson"]:
+            raise ValueError(
+                f"Unsupported likelihood: {self.likelihood}, should be one of: 'gaussian' or 'poisson'"
+            )
+
+        # mask
+        mask = backend.flatten(batch_target[batch_window].mask, 1, -1)
+        self.mask = ~mask
+        if backend.sum(self.mask).item() == 0:
+            raise OptimizeStopSuccess("No data to fit. All pixels are masked")
+
+        # data
+        self.data = backend.flatten(batch_target[batch_window].data, 1, -1)
+
+        # Weight
+        self.weight = backend.flatten(batch_target[batch_window].weight, 1, -1)
+
+        # WCS
+        crtan = batch_target.crtan
+        shift = backend.as_array(
+            batch_window.origin_shifter(self.model.window), dtype=config.DTYPE, device=config.DEVICE
+        )
+        crpix = batch_target[batch_window].crpix + shift
+        CD = batch_target.CD
+        psf = batch_target.psf_stack
+        psf_batch = None if psf is None else 0
+
+        # Forward
+        vmodel = backend.vmap(
+            lambda cd, crt, crp, psf, params: backend.flatten(
+                self.model(cd, crt, crp, psf, params=params).data
+            ),
+            in_dims=(0, 0, 0, psf_batch, 0),
+        )
+        self.forward = lambda x: vmodel(CD, crtan, crpix, psf, x)
+
+        # Jacobian
+        vjac = backend.vmap(
+            backend.jacfwd(
+                lambda cd, crt, crp, psf, params: backend.flatten(
+                    self.model(cd, crt, crp, psf, params=params).data
+                ),
+                argnums=4,
+            ),
+            in_dims=(0, 0, 0, psf_batch, 0),
+        )
+        self.jacobian = lambda x: vjac(CD, crtan, crpix, psf, x)
+
+        # ndf
+        self.ndf = backend.clamp(
+            backend.sum(self.mask, dim=1) - self.current_state.shape[1], backend.as_array(1), None
+        )
+
+        # LM parameters
+        self.Lup = Lup
+        self.Ldn = Ldn
+        self.L = L0 * backend.ones(
+            self.current_state.shape[0], dtype=config.DTYPE, device=config.DEVICE
+        )
+
+    def chi2_ndf(self):
+        return (
+            backend.sum(
+                self.weight * self.mask * (self.data - self.forward(self.current_state)) ** 2,
+                dim=1,
+            )
+            / self.ndf
+        )
+
+    def poisson_2nll_ndf(self):
+        M = self.forward(self.current_state)
+        return (
+            2 * backend.sum((M - self.data * backend.log(M + 1e-10)) * self.mask, dim=1) / self.ndf
+        )
+
+    def fit(self, update_uncertainty=True):
+        if self.current_state.shape[1] == 0:
+            if self.verbose > 0:
+                config.logger.warning("No parameters to optimize. Exiting fit")
+            self.message = "No parameters to optimize. Exiting fit"
+            return self
+
+        if self.likelihood == "gaussian":
+            quantity = "Chi^2/DoF"
+            self.loss_history = [backend.to_numpy(self.chi2_ndf())]
+        elif self.likelihood == "poisson":
+            quantity = "2NLL/DoF"
+            self.loss_history = [backend.to_numpy(self.poisson_2nll_ndf())]
+        self._covariance_matrix = None
+        self.L_history = [backend.to_numpy(self.L)]
+        self.lambda_history = [backend.to_numpy(backend.copy(self.current_state))]
+        if self.verbose > 0:
+            config.logger.info(
+                f"==Starting LM fit for '{self.model.name}' with batch of {self.current_state.shape[0]} images with {self.current_state.shape[1]} dynamic parameters and {self.data.shape[1]} pixels=="
+            )
+
+        for _ in range(self.max_iter):
+            if self.verbose > 0:
+                config.logger.info(f"{quantity}: {self.loss_history[-1]}, L: {self.L_history[-1]}")
+
+            if self.fit_valid:
+                with ValidContext(self.model):
+                    res = func.batch_lm_step(
+                        x=self.model.to_valid(self.current_state),
+                        data=self.data,
+                        model=self.forward,
+                        weight=self.weight,
+                        mask=self.mask,
+                        jacobian=self.jacobian,
+                        L=self.L,
+                        Lup=self.Lup,
+                        Ldn=self.Ldn,
+                        likelihood=self.likelihood,
+                        max_step_iter=self.max_step_iter,
+                    )
+                self.current_state = self.model.from_valid(backend.copy(res["x"]))
+            else:
+                res = func.batch_lm_step(
+                    x=self.current_state,
+                    data=self.data,
+                    model=self.forward,
+                    weight=self.weight,
+                    mask=self.mask,
+                    jacobian=self.jacobian,
+                    L=self.L,
+                    Lup=self.Lup,
+                    Ldn=self.Ldn,
+                    likelihood=self.likelihood,
+                    max_step_iter=self.max_step_iter,
+                )
+                self.current_state = backend.copy(res["x"])
+
+            self.L = backend.clamp(res["L"], backend.as_array(1e-9), backend.as_array(1e9))
+            self.L_history.append(backend.to_numpy(self.L))
+            self.loss_history.append(2 * res["nll"] / backend.to_numpy(self.ndf))
+            self.lambda_history.append(backend.to_numpy(backend.copy(self.current_state)))
+
+            if self.check_convergence():
+                break
+        else:
+            self.message = self.message + "fail. Maximum iterations"
+
+        if self.verbose > 0:
+            config.logger.info(
+                f"Final {quantity}: {self.loss_history[-1]}, L: {self.L_history[-1]}. Converged: {self.message}"
+            )
+
+        self.model.set_values(self.current_state)
+        if update_uncertainty:
+            self.update_uncertainty()
+
+        return self
+
+    def check_convergence(self) -> bool:
+        """Check if the optimization has converged based on the last
+        iteration's chi^2 and the relative tolerance.
+        """
+        if len(self.loss_history) < 3:
+            return False
+        if np.all(
+            (self.loss_history[-2] - self.loss_history[-1]) / self.loss_history[-1]
+            < self.relative_tolerance
+        ) and np.all(backend.to_numpy(self.L) < 0.1):
+            self.message = self.message + "success"
+            return True
+        if len(self.loss_history) < 10:
+            return False
+        if np.all(
+            (self.loss_history[-10] - self.loss_history[-1]) / self.loss_history[-1]
+            < self.relative_tolerance
+        ):
+            self.message = self.message + "success by immobility. Convergence not guaranteed"
+            return True
+        return False
+
+    @property
+    def covariance_matrix(self) -> ArrayLike:
+        """The covariance matrix for the model at the current
+        parameters. This can be used to construct a full Gaussian PDF for the
+        parameters using: $\\mathcal{N}(\\mu,\\Sigma)$ where $\\mu$ is the
+        optimized parameters and $\\Sigma$ is the covariance matrix.
+
+        """
+
+        if self._covariance_matrix is not None:
+            return self._covariance_matrix
+        J = self.jacobian(self.current_state) * self.mask.reshape(self.mask.shape + (1,))
+        if self.likelihood == "gaussian":
+            hess = backend.vmap(func.hessian)(J, self.weight * self.mask)
+        elif self.likelihood == "poisson":
+            hess = backend.vmap(func.hessian_poisson)(
+                J, self.data * self.mask, self.forward(self.current_state) * self.mask
+            )
+        try:
+            self._covariance_matrix = backend.vmap(backend.linalg.inv)(hess)
+        except:
+            config.logger.warning(
+                "WARNING: Hessian is singular, likely at least one parameter is non-physical. Will use pseudo-inverse of Hessian to continue but results should be inspected."
+            )
+            self._covariance_matrix = backend.vmap(backend.linalg.pinv)(hess)
+        return self._covariance_matrix
+
+    def update_uncertainty(self) -> None:
+        """Call this function after optimization to set the uncertainties for
+        the parameters. This will use the diagonal of the covariance
+        matrix to update the uncertainties. See the covariance_matrix
+        function for the full representation of the uncertainties.
+
+        """
+        # set the uncertainty for each parameter
+        cov = self.covariance_matrix
+        if backend.all(backend.isfinite(cov)):
+            try:
+                self.model.set_values(
+                    backend.sqrt(backend.abs(backend.vmap(backend.diag)(cov))),
+                    attribute="uncertainty",
+                )
+            except RuntimeError as e:
+                config.logger.warning(f"Unable to update uncertainty due to: {e}")
+        else:
+            config.logger.warning(
+                "Unable to update uncertainty due to non finite covariance matrix"
+            )
diff --git a/astrophot/fit/func/__init__.py b/astrophot/fit/func/__init__.py
index 58da703e..7d13dcd7 100644
--- a/astrophot/fit/func/__init__.py
+++ b/astrophot/fit/func/__init__.py
@@ -1,9 +1,10 @@
-from .lm import lm_step, hessian, gradient, hessian_poisson, gradient_poisson
+from .lm import lm_step, hessian, gradient, hessian_poisson, gradient_poisson, batch_lm_step
 from .slalom import slalom_step
 from .mala import mala
 
 __all__ = [
     "lm_step",
+    "batch_lm_step",
     "hessian",
     "gradient",
     "slalom_step",
diff --git a/astrophot/fit/func/lm.py b/astrophot/fit/func/lm.py
index 2e88c816..47632043 100644
--- a/astrophot/fit/func/lm.py
+++ b/astrophot/fit/func/lm.py
@@ -12,7 +12,7 @@ def nll(D, M, W):
     M: model prediction
     W: weights
     """
-    return 0.5 * backend.sum(W * (D - M) ** 2)
+    return 0.5 * backend.sum(W * (D - M) ** 2, dim=-1)
 
 
 def nll_poisson(D, M):
@@ -21,23 +21,23 @@ def nll_poisson(D, M):
     D: data
     M: model prediction
     """
-    return backend.sum(M - D * backend.log(M + 1e-10))  # Adding small value to avoid log(0)
+    return backend.sum(M - D * backend.log(M + 1e-10), dim=-1)  # Adding small value to avoid log(0)
 
 
 def gradient(J, W, D, M):
-    return J.T @ (W * (D - M))[:, None]
+    return J.T @ (W * (D - M)).reshape(D.shape + (1,))
 
 
 def gradient_poisson(J, D, M):
-    return J.T @ (D / M - 1)[:, None]
+    return J.T @ (D / M - 1).reshape(D.shape + (1,))
 
 
 def hessian(J, W):
-    return J.T @ (W[:, None] * J)
+    return J.T @ (W.reshape(W.shape + (1,)) * J)
 
 
 def hessian_poisson(J, D, M):
-    return J.T @ ((D / (M**2 + 1e-10))[:, None] * J)
+    return J.T @ ((D / (M**2 + 1e-10)).reshape(D.shape + (1,)) * J)
 
 
 def damp_hessian(hess, L):
@@ -46,6 +46,10 @@ def damp_hessian(hess, L):
     return hess * (I + D / (1 + L)) + L * I * backend.diag(hess)
 
 
+def rho(nll0, nll1, h, hessD, grad):
+    return (nll0 - nll1) / backend.abs(h.T @ hessD @ h - 2 * grad.T @ h)
+
+
 def solve(hess, grad, L):
     hessD = damp_hessian(hess, L)  # (N, N)
     while True:
@@ -116,15 +120,15 @@ def lm_step(
             break
 
         # actual nll improvement vs expected from linearization
-        rho = (nll0 - nll1) / backend.abs(h.T @ hessD @ h - 2 * grad.T @ h).item()
+        _rho = rho(nll0, nll1, h, hessD, grad).item()
 
-        if (nll1 < (nll0 + tolerance) and abs(rho - 1) < abs(scary["rho"] - 1)) or (
-            nll1 < scary["nll"] and rho > -10
+        if (nll1 < (nll0 + tolerance) and abs(_rho - 1) < abs(scary["rho"] - 1)) or (
+            nll1 < scary["nll"] and _rho > -10
         ):
-            scary = {"x": x + h.squeeze(1), "nll": nll1, "L": L0, "rho": rho}
+            scary = {"x": x + h.squeeze(1), "nll": nll1, "L": L0, "rho": _rho}
 
         # Avoid highly non-linear regions
-        if rho < 0.1 or rho > 2:
+        if _rho < 0.1 or _rho > 2:
             L *= Lup
             if improving is True:
                 break
@@ -156,3 +160,61 @@ def lm_step(
         raise OptimizeStopFail("Could not find step to improve chi^2")
 
     return best
+
+
+def batch_lm_step(
+    x,
+    data,
+    model,
+    weight,
+    mask,
+    jacobian,
+    L=1.0,
+    Lup=9.0,
+    Ldn=11.0,
+    tolerance=1e-4,
+    likelihood="gaussian",
+    max_step_iter=3,
+):
+    L0 = L  # (D,)
+    M0 = backend.detach(model(x))  # (D, M)
+    J = backend.detach(jacobian(x))  # (D, M, N)
+    data = data * mask
+    M0 = M0 * mask
+    weight = weight * mask
+    J = J * mask.reshape(mask.shape + (1,))
+
+    if likelihood == "gaussian":
+        nll0 = nll(data, M0, weight)  # (D,)
+        grad = backend.vmap(gradient)(J, weight, data, M0)  # (D, N, 1)
+        hess = backend.vmap(hessian)(J, weight)  # (D, N, N)
+    elif likelihood == "poisson":
+        nll0 = nll_poisson(data, M0)  # (D,)
+        grad = backend.vmap(gradient_poisson)(J, data, M0)  # (D, N, 1)
+        hess = backend.vmap(hessian_poisson)(J, data, M0)  # (D, N, N)
+    else:
+        raise ValueError(f"Unsupported likelihood: {likelihood}")
+
+    del J
+
+    new_x = backend.copy(x)
+    new_nll = backend.copy(nll0)
+    new_L = backend.copy(L)
+
+    for _ in range(max_step_iter):
+        hessD, h = backend.vmap(solve)(hess, grad, new_L)  # (D, N, N), (D, N, 1)
+        M1 = model(x + h.squeeze(2))  # (D, M)
+        if likelihood == "gaussian":
+            nll1 = nll(data, M1, weight)  # (D,)
+        elif likelihood == "poisson":
+            nll1 = nll_poisson(data, M1)  # (D,)
+
+        # actual nll improvement vs expected from linearization
+        _rho = backend.vmap(rho)(nll0, nll1, h, hessD, grad).reshape(-1)  # (D,)
+
+        good = backend.isfinite(nll1) & (nll1 < new_nll) & (_rho > 0.1) & (_rho < 2)
+        new_x = backend.where(good[:, None], x + h.squeeze(2), new_x)
+        new_nll = backend.where(good, nll1, new_nll)
+        new_L = backend.where(good, new_L / Ldn, new_L * Lup)
+
+    return {"x": new_x, "nll": backend.to_numpy(new_nll), "L": new_L}
diff --git a/astrophot/fit/lm.py b/astrophot/fit/lm.py
index fa1aa2da..1168f7f5 100644
--- a/astrophot/fit/lm.py
+++ b/astrophot/fit/lm.py
@@ -158,8 +158,6 @@ def __init__(
             relative_tolerance=relative_tolerance,
             **kwargs,
         )
-        # Maximum number of iterations of the algorithm
-        self.max_iter = max_iter
         # Maximum number of steps while searching for chi^2 improvement on a single jacobian evaluation
         self.max_step_iter = max_step_iter
         self.Lup = Lup
@@ -167,7 +165,9 @@ def __init__(
         self.L = L0
         self.likelihood = likelihood
         if self.likelihood not in ["gaussian", "poisson"]:
-            raise ValueError(f"Unsupported likelihood: {self.likelihood}")
+            raise ValueError(
+                f"Unsupported likelihood: {self.likelihood}, should be one of: 'gaussian' or 'poisson'"
+            )
         self.constraint = constraint
 
         # mask
diff --git a/astrophot/models/batch_model_object.py b/astrophot/models/batch_model_object.py
index da8894c8..87630c58 100644
--- a/astrophot/models/batch_model_object.py
+++ b/astrophot/models/batch_model_object.py
@@ -176,12 +176,12 @@ def window(self, window):
     @forward
     def __call__(self, model_params=None, model_dims=None, **kwargs):
         working_image = self.target.model_image(self.window)
-        crtan = self.target.crtan
+        crtan = working_image.crtan
         shift = backend.as_array(
             self.window.origin_shifter(self.model.window), dtype=config.DTYPE, device=config.DEVICE
         )
-        crpix = self.target.crpix + shift
-        CD = self.target.CD
+        crpix = working_image.crpix + shift
+        CD = working_image.CD
         psf = self.target.psf_stack
         psf_batch = None if psf is None else 0
         working_image._data = backend.vmap(
diff --git a/docs/source/tutorials/ImageTypes.ipynb b/docs/source/tutorials/ImageTypes.ipynb
index f6a71d3e..41152f07 100644
--- a/docs/source/tutorials/ImageTypes.ipynb
+++ b/docs/source/tutorials/ImageTypes.ipynb
@@ -20,7 +20,8 @@
     "import astrophot as ap\n",
     "import torch\n",
     "import matplotlib.pyplot as plt\n",
-    "from matplotlib.patches import Rectangle"
+    "from matplotlib.patches import Rectangle\n",
+    "import numpy as np"
    ]
   },
   {
@@ -151,6 +152,94 @@
    "cell_type": "markdown",
    "id": "11",
    "metadata": {},
+   "source": [
+    "# Batch Images\n",
+    "\n",
+    "An `ImageBatch` is much like an `ImageList` except that it has certain restrictions applied to it. Namely that all the images must have the same size, and they must be of the \"regular\" image type, not `SIP` or `CMOS`. This extra constraint means they can't be used as generally as other image types, but the trade off is extra computational advantages. You can see the `batch scene model` in the model zoo for one of the primary advantages of the `ImageBatch` image type. Another advantage is that it can be used to vectorize the fitting of images. Let's say we have some simple model (either just a component model, or a full group model) and we want to use that same model to represent a bunch of similar images. We can use batched images to make this much more efficient. Rather than looping through the images one at a time, we can create a batch and set all their calculations to the hardware (CPU or GPU) at once. Here is an example of what this looks like, but the real advantage is when you seriously scale it up.\n",
+    "\n",
+    "Note: Every dynamic parameter for the model must be batched, these are supposed to be independent models and so every parameter is unique for each model."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "12",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "base_target = ap.TargetImage(data=np.zeros((64, 64)))\n",
+    "model = ap.Model(\n",
+    "    model_type=\"sersic galaxy model\",\n",
+    "    name=\"batch_demo\",\n",
+    "    center=(32, 40),\n",
+    "    q=0.6,\n",
+    "    PA=3.14 / 3,\n",
+    "    n=1,\n",
+    "    Re=10,\n",
+    "    Ie=1,\n",
+    "    target=base_target,\n",
+    "    integrate_mode=\"none\",\n",
+    "    sampling_mode=\"quad:3\",\n",
+    ")\n",
+    "model.initialize()\n",
+    "\n",
+    "np.random.seed(42)\n",
+    "target1 = ap.TargetImage(\n",
+    "    name=\"target1\", data=model().data + np.random.normal(scale=0.5, size=(64, 64))\n",
+    ")\n",
+    "model.center = (15, 15)\n",
+    "model.PA = -3.14 / 4\n",
+    "target2 = ap.TargetImage(\n",
+    "    name=\"target2\", data=model().data + np.random.normal(scale=0.5, size=(64, 64))\n",
+    ")\n",
+    "model.center = (45, 20)\n",
+    "model.Re = 15\n",
+    "target3 = ap.TargetImage(\n",
+    "    name=\"target3\", data=model().data + np.random.normal(scale=0.5, size=(64, 64))\n",
+    ")\n",
+    "\n",
+    "batch_target = ap.TargetImageBatch([target1, target2, target3])\n",
+    "fig, axarr = plt.subplots(1, 3, figsize=(15, 5))\n",
+    "ap.plots.target_image(fig, axarr, batch_target)\n",
+    "plt.show()\n",
+    "\n",
+    "# Every parameter must be batched\n",
+    "# Notice we set them a bit off the true values\n",
+    "model.center = ((30, 42), (16, 16), (46, 21))\n",
+    "model.q = (0.6, 0.6, 0.6)\n",
+    "model.PA = (3.14 / 3, -3.14 / 5, -3.14 / 4)\n",
+    "model.n = (1.1, 0.9, 1)\n",
+    "model.Re = (11, 11, 14)\n",
+    "model.Ie = (1, 0.8, 1.2)\n",
+    "\n",
+    "res = ap.fit.BatchLM(model, batch_target, batch_target.window).fit()\n",
+    "print(\"Best-fit parameters:\", res.current_state.detach().cpu().numpy())\n",
+    "\n",
+    "fig, axarr = plt.subplots(2, 3, figsize=(15, 5))\n",
+    "for i in range(3):\n",
+    "    model.set_values(res.current_state[i])\n",
+    "    model.target = batch_target.images[i]\n",
+    "    ap.plots.model_image(fig, axarr[0, i], model)\n",
+    "    ap.plots.residual_image(fig, axarr[1, i], model)\n",
+    "plt.show()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "13",
+   "metadata": {},
+   "source": [
+    "And there we have it! Three models fit at once. Really, just doing three isn't a big deal computationally. But if you have dozens or hundreds of these little 64x64 pixel images then grouping them all together can get huge computational speedups. If you are able to run these big batched fits on a GPU you can get several orders of magnitude runtime reduction!\n",
+    "\n",
+    "Happy fitting!"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "14",
+   "metadata": {},
+   "outputs": [],
    "source": []
   }
  ],

From 5691ab1a3d388c19f8ec03cdd703ad6df7fa631a Mon Sep 17 00:00:00 2001
From: Connor Stone <connorstone628@gmail.com>
Date: Thu, 2 Apr 2026 16:35:23 -0400
Subject: [PATCH 4/4] fixes from copilot suggestions

---
 astrophot/models/func/gaussian.py    | 5 ++---
 astrophot/models/mixins/transform.py | 3 +--
 2 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/astrophot/models/func/gaussian.py b/astrophot/models/func/gaussian.py
index ea48d435..5189f078 100644
--- a/astrophot/models/func/gaussian.py
+++ b/astrophot/models/func/gaussian.py
@@ -3,11 +3,10 @@
 
 
 def gaussian(R: ArrayLike, sigma: ArrayLike, flux: ArrayLike) -> ArrayLike:
-    """Gaussian 1d profile function, specifically designed for pytorch
-    operations.
+    """Gaussian 2d profile function.
 
     **Args:**
-    -  `R`: Radii tensor at which to evaluate the gaussian function
+    -  `R`: Radii array at which to evaluate the gaussian function
     -  `sigma`: Standard deviation of the gaussian in the same units as R
     -  `flux`: Total flux of the Gaussian
     """
diff --git a/astrophot/models/mixins/transform.py b/astrophot/models/mixins/transform.py
index 71f4a0ad..0211a53a 100644
--- a/astrophot/models/mixins/transform.py
+++ b/astrophot/models/mixins/transform.py
@@ -180,7 +180,7 @@ class FourierEllipseMixin:
             of each mode.
     :param phim: Array of phases for the Fourier modes, adjusts the
             orientation of the mode perturbation relative to the major axis. It
-            is cyclically defined in the range [0,2pi/m)
+            is cyclically defined in the range [0,2pi)
     :param modes: Tuple of integers indicating which Fourier modes to use.
     """
 
@@ -230,7 +230,6 @@ def initialize(self):
             self.am.value = np.zeros(len(self.modes)) + 0.0001
         if not self.phim.initialized:
             self.phim.value = np.zeros(len(self.modes)) + 0.0001
-            self.phim.valid = (np.zeros(len(self.modes)), 2 * np.pi / backend.to_numpy(self.modes))
 
 
 class WarpMixin: