Skip to content

Merge 0.2.0 into main: propagate_iterative non-convergence fix + 0.2.0 release#7

Merged
dwgoon merged 4 commits into
mainfrom
0.2.0
Jun 11, 2026
Merged

Merge 0.2.0 into main: propagate_iterative non-convergence fix + 0.2.0 release#7
dwgoon merged 4 commits into
mainfrom
0.2.0

Conversation

@dwgoon

@dwgoon dwgoon commented Jun 6, 2026

Copy link
Copy Markdown
Owner

Brings the 0.2.0 release branch into main. Includes a new review fix
plus the two 0.2.0 commits not yet on main.

Bug fix (new)

  • algorithms/sp: propagate_iterative returned the second-to-last
    iterate when lim_iter was exhausted (the ping-pong swap left the final
    swap in place), and returned an uninitialized buffer for lim_iter == 0.
    It now always returns the last computed iterate, which also equals
    trajectory[-1].
  • _blas_ctypes.solve: snapshot and restore the backend thread count so a
    one-off num_threads no longer leaks process-wide.
  • control/influence: raise ValueError on an unknown device; unify the
    convergence comparison to <= tol (dense + sparse); remove the dead
    np.inf self-influence assignment that was always overwritten.
  • tests: cover the non-converged and lim_iter == 0 paths.

Also carried from 0.2.0 (not yet on main)

  • Distribute 0.2.0 via GitHub Releases while PyPI publish is unavailable.
  • Fix wheel name: CUDA cells produced sfa instead of sfa-cuXYZ.

Verification

  • 25 tests pass, including 16 CUDA tests on an RTX 4090 (sm_89): FP64/FP32/FP16
    influence and signal-propagation parity with the CPU path, the TF32 precision
    check, and the new non-convergence regression tests.

dwgoon added 4 commits June 6, 2026 21:43
The maintainer's PyPI account is currently unavailable, so the
publish-to-pypi job in wheels.yml is moved back behind `if: false` and
a sibling publish-github-release job is added. The new job runs on
the same `v*` tag push, downloads every build artifact (CPU universal
wheel, sdist, plus all per-Python / per-OS / per-CUDA wheels), and
uses softprops/action-gh-release@v2 to attach them to the
corresponding GitHub Release. End users install with

  pip install https://github.com/dwgoon/sfa/releases/download/v0.2.0/<wheel>

The wheels.yml matrix itself is unchanged from the 6/6-green dry run
at 27057416779; only the tail-end publish path changes. When the
PyPI account is restored, re-enabling that channel is a one-line
edit to flip `if: false` back to the tag-ref check; the GitHub
Releases channel can stay on permanently.

README and INSTALL.md gain a "Install from GitHub Releases" section
documenting the wheel-filename pattern
`sfa_cuXYZ-VER-cpPP-cpPP-PLAT.whl` so users can construct the right
URL for their Python version / OS / CUDA combination, plus example
commands for CPU, CUDA 13.2 Linux, CUDA 12.8 Windows, sdist, and the
git-tag source fallback.
Downloading an artifact from the green dry run (27057416779) revealed
every CUDA cell's wheels were named `sfa-0.2.0.dev0-...whl` instead
of `sfa_cu128-...` / `sfa_cu132-...`. The wheel metadata Name was
`sfa` for every cell, so a release in that state would have collapsed
the three planned distributions into a single namespace collision.

Root cause: setup.py read SFA_PACKAGE_NAME from the env and passed it
as `name=` to setup(), but PEP 621 makes pyproject.toml's
`[project] name` table authoritative whenever it is present, and our
[project] had a static `name = "sfa"`. setup.py's name argument was
silently ignored, so SFA_PACKAGE_NAME never reached the wheel.

PEP 621 also forbids declaring `name` as dynamic, ruling out the
"setup.py supplies it" route. The fix:

- Keep `[project] name = "sfa"` static (CPU + sdist default). For each
  CUDA cell, the new "Rename wheel project to ..." step sed-rewrites
  that line to the matrix wheel name before cibuildwheel runs. The
  same source tree therefore yields `sfa`, `sfa-cu128`, and
  `sfa-cu132` wheels by simply changing one line in pyproject.toml.
- Move `dependencies` to `dynamic` in pyproject.toml. With no entry
  under `[tool.setuptools.dynamic]`, setuptools reads the resolved
  list from setup.py's `install_requires`, which keeps the
  SFA_CUDA_RUNTIME_REQUIRES env-var path alive so each CUDA wheel
  pins `nvidia-cublas-cu1X`, `nvidia-cuda-runtime-cu1X`, etc.

setup.py is trimmed to just what pyproject.toml does not cover
(install_requires composition, the CUDA build extension, and the
build_ext command class). Verified locally:

  sed name="sfa-cu132" + SFA_CUDA_RUNTIME_REQUIRES=... build ->
    file:    sfa_cu132-0.2.0-py3-none-any.whl
    METADATA Name: sfa-cu132
    METADATA Requires-Dist includes nvidia-cublas-cu13<13.3,>=13.2

The CPU and sdist jobs lose the now-redundant
`SFA_PACKAGE_NAME: "sfa"` env entries; the CUDA matrix loses
`SFA_PACKAGE_NAME=...` from both CIBW_ENVIRONMENT_LINUX and
CIBW_ENVIRONMENT_WINDOWS because the sed step has already changed
the in-tree name by the time cibuildwheel starts.
- algorithms/sp: the ping-pong swap left the final swap in place when
  lim_iter was exhausted, so the returned x was the second-to-last
  iterate and disagreed with trajectory[-1]. Track convergence and
  return the correct buffer; this also stops lim_iter==0 from returning
  an uninitialized array. Returned x now always equals trajectory[-1].
- _blas_ctypes.solve: snapshot and restore the backend thread count so a
  one-off num_threads no longer leaks process-wide.
- control/influence: raise ValueError on an unknown device; unify the
  convergence comparison to '<= tol' (dense + sparse); drop the dead
  np.inf self-influence assignment that was always overwritten.
- tests: cover the non-converged and lim_iter==0 propagate_iterative paths.
Bring the propagate_iterative non-convergence fix (and the BLAS thread
restore, influence device/comparison cleanups, and regression tests)
into the 0.2.0 release branch.
@dwgoon dwgoon changed the title Distribute 0.2.0 via GitHub Releases Merge 0.2.0 into main: propagate_iterative non-convergence fix + 0.2.0 release Jun 11, 2026
@dwgoon dwgoon merged commit 4ff875e into main Jun 11, 2026
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant