Conversation
The maintainer's PyPI account is currently unavailable, so the publish-to-pypi job in wheels.yml is moved back behind `if: false` and a sibling publish-github-release job is added. The new job runs on the same `v*` tag push, downloads every build artifact (CPU universal wheel, sdist, plus all per-Python / per-OS / per-CUDA wheels), and uses softprops/action-gh-release@v2 to attach them to the corresponding GitHub Release. End users install with pip install https://github.com/dwgoon/sfa/releases/download/v0.2.0/<wheel> The wheels.yml matrix itself is unchanged from the 6/6-green dry run at 27057416779; only the tail-end publish path changes. When the PyPI account is restored, re-enabling that channel is a one-line edit to flip `if: false` back to the tag-ref check; the GitHub Releases channel can stay on permanently. README and INSTALL.md gain a "Install from GitHub Releases" section documenting the wheel-filename pattern `sfa_cuXYZ-VER-cpPP-cpPP-PLAT.whl` so users can construct the right URL for their Python version / OS / CUDA combination, plus example commands for CPU, CUDA 13.2 Linux, CUDA 12.8 Windows, sdist, and the git-tag source fallback.
Downloading an artifact from the green dry run (27057416779) revealed
every CUDA cell's wheels were named `sfa-0.2.0.dev0-...whl` instead
of `sfa_cu128-...` / `sfa_cu132-...`. The wheel metadata Name was
`sfa` for every cell, so a release in that state would have collapsed
the three planned distributions into a single namespace collision.
Root cause: setup.py read SFA_PACKAGE_NAME from the env and passed it
as `name=` to setup(), but PEP 621 makes pyproject.toml's
`[project] name` table authoritative whenever it is present, and our
[project] had a static `name = "sfa"`. setup.py's name argument was
silently ignored, so SFA_PACKAGE_NAME never reached the wheel.
PEP 621 also forbids declaring `name` as dynamic, ruling out the
"setup.py supplies it" route. The fix:
- Keep `[project] name = "sfa"` static (CPU + sdist default). For each
CUDA cell, the new "Rename wheel project to ..." step sed-rewrites
that line to the matrix wheel name before cibuildwheel runs. The
same source tree therefore yields `sfa`, `sfa-cu128`, and
`sfa-cu132` wheels by simply changing one line in pyproject.toml.
- Move `dependencies` to `dynamic` in pyproject.toml. With no entry
under `[tool.setuptools.dynamic]`, setuptools reads the resolved
list from setup.py's `install_requires`, which keeps the
SFA_CUDA_RUNTIME_REQUIRES env-var path alive so each CUDA wheel
pins `nvidia-cublas-cu1X`, `nvidia-cuda-runtime-cu1X`, etc.
setup.py is trimmed to just what pyproject.toml does not cover
(install_requires composition, the CUDA build extension, and the
build_ext command class). Verified locally:
sed name="sfa-cu132" + SFA_CUDA_RUNTIME_REQUIRES=... build ->
file: sfa_cu132-0.2.0-py3-none-any.whl
METADATA Name: sfa-cu132
METADATA Requires-Dist includes nvidia-cublas-cu13<13.3,>=13.2
The CPU and sdist jobs lose the now-redundant
`SFA_PACKAGE_NAME: "sfa"` env entries; the CUDA matrix loses
`SFA_PACKAGE_NAME=...` from both CIBW_ENVIRONMENT_LINUX and
CIBW_ENVIRONMENT_WINDOWS because the sed step has already changed
the in-tree name by the time cibuildwheel starts.
- algorithms/sp: the ping-pong swap left the final swap in place when lim_iter was exhausted, so the returned x was the second-to-last iterate and disagreed with trajectory[-1]. Track convergence and return the correct buffer; this also stops lim_iter==0 from returning an uninitialized array. Returned x now always equals trajectory[-1]. - _blas_ctypes.solve: snapshot and restore the backend thread count so a one-off num_threads no longer leaks process-wide. - control/influence: raise ValueError on an unknown device; unify the convergence comparison to '<= tol' (dense + sparse); drop the dead np.inf self-influence assignment that was always overwritten. - tests: cover the non-converged and lim_iter==0 propagate_iterative paths.
Bring the propagate_iterative non-convergence fix (and the BLAS thread restore, influence device/comparison cleanups, and regression tests) into the 0.2.0 release branch.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Brings the
0.2.0release branch intomain. Includes a new review fixplus the two 0.2.0 commits not yet on main.
Bug fix (new)
propagate_iterativereturned the second-to-lastiterate when
lim_iterwas exhausted (the ping-pong swap left the finalswap in place), and returned an uninitialized buffer for
lim_iter == 0.It now always returns the last computed iterate, which also equals
trajectory[-1].one-off
num_threadsno longer leaks process-wide.ValueErroron an unknown device; unify theconvergence comparison to
<= tol(dense + sparse); remove the deadnp.infself-influence assignment that was always overwritten.lim_iter == 0paths.Also carried from 0.2.0 (not yet on main)
sfainstead ofsfa-cuXYZ.Verification
influence and signal-propagation parity with the CPU path, the TF32 precision
check, and the new non-convergence regression tests.