Spectral clustering: support pre-calculated affinity matrix, revamp API by ClaudiaComito · Pull Request #1835 · helmholtz-analytics/heat

ClaudiaComito · 2025-03-21T09:25:52Z

Due Diligence

General:
- title of the PR is suitable to appear in the Release Notes
Implementation:
- unit tests: all split configurations tested
- unit tests: multiple dtypes tested
- benchmarks: created for new functionality
- benchmarks: performance improved or maintained
- documentation updated where needed

Description

This PR introduces several changes to the spectral clustering class (see below). The work is connected to the parallelization efforts for the SCIMES package with @dcolombo.

Issue/s resolved: #1740

Changes proposed:

- change class name to SpectralClustering (formerly Spectral) to match the API of the current scikit-learn version
- rename the metric parameter to affinity to match scikit-learn
- support use of a pre-calculated affinity matrix
- introduce new parameter eigen_solver, with options lanczos and zolotarev (default: zolotarev); see "Implement polar decomposition" #1697 and "Features/1723 Symmetric Eigenvalue Decomposition (eigh) and full SVD (svd) based on Zolotarev Polar Decomposition" #1824
- introduce new parameters n_components and random_state to match the scikit-learn API
- output of __spectral_embedding is now just the embedding, as in scikit-learn (instead of eigenvalues and eigenvectors)
- introduce method DNDarray.diagonal() for convenience and to match the numpy API
- adapt documentation
- adapt tests
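
For readers unfamiliar with the precomputed-affinity workflow, the idea behind the list above can be sketched in plain NumPy. This is not Heat's implementation: the function name `spectral_embedding`, the choice of the symmetrically normalized Laplacian, and the toy affinity matrix are assumptions made for illustration only. The sketch returns just the embedding (the leading `n_components` eigenvectors), mirroring the new return convention described above.

```python
import numpy as np


def spectral_embedding(affinity: np.ndarray, n_components: int) -> np.ndarray:
    """Hypothetical helper: embed samples from a precomputed symmetric affinity matrix.

    Assumes every sample has a strictly positive degree (row sum).
    """
    degree = affinity.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(degree)
    # Symmetrically normalized Laplacian: L = I - D^{-1/2} A D^{-1/2}
    normalized = d_inv_sqrt[:, None] * affinity * d_inv_sqrt[None, :]
    laplacian = np.eye(affinity.shape[0]) - normalized
    # eigh returns eigenvalues in ascending order, so the first columns
    # belong to the smallest eigenvalues of the Laplacian.
    _, eigenvectors = np.linalg.eigh(laplacian)
    return eigenvectors[:, :n_components]


# Toy affinity with two disconnected groups: samples {0, 1} and {2, 3}.
affinity = np.array(
    [
        [1.0, 0.9, 0.0, 0.0],
        [0.9, 1.0, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.9],
        [0.0, 0.0, 0.9, 1.0],
    ]
)
embedding = spectral_embedding(affinity, n_components=2)
```

In this toy case, samples of the same group map to the same row of `embedding`, so a subsequent k-means step on the rows would separate the two groups.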

Type of change

Memory requirements

NA

Performance

NA

Does this change modify the behaviour of other functions? If so, which?

no

github-actions · 2025-03-21T09:35:59Z

Thank you for the PR!

github-actions · 2025-04-04T09:41:01Z

Thank you for the PR!

codecov · 2025-04-04T10:17:44Z

Codecov Report

❌ Patch coverage is 86.20690% with 8 lines in your changes missing coverage. Please review.
✅ Project coverage is 91.61%. Comparing base (206a523) to head (fb921ea).

Files with missing lines	Patch %	Lines
heat/cluster/spectral.py	85.71%	8 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1835      +/-   ##
==========================================
+ Coverage   88.76%   91.61%   +2.84%     
==========================================
  Files          89       89              
  Lines       14012    14037      +25     
==========================================
+ Hits        12438    12860     +422     
+ Misses       1574     1177     -397

Flag	Coverage Δ
unit	`91.61% <86.20%> (+2.84%)`	⬆️

mrfh92 · 2025-04-07T08:46:30Z

see #1824 for the eigh function prototype

mrfh92 · 2025-04-07T09:10:52Z

            raise NotImplementedError("Not implemented for other splitting-axes")

-        _, eigenvectors = self._spectral_embedding(x)
+        _, eigenvectors = self.spectral_embedding(x, self.eigen_solver)


Here, the eigenvalue problem is solved a second time, now for the input data of predict (the "test data"). However, I think the spectral embedding of the "training data" (the input of fit) should be reused here.

mrfh92 · 2025-04-07T09:13:18Z

            raise NotImplementedError("Not implemented for other splitting-axes")
        # 2. Embed Dataset into lower-dimensional Eigenvector space
-        eigenvalues, eigenvectors = self._spectral_embedding(x)
+        eigenvalues, eigenvectors = self.spectral_embedding(x, self.eigen_solver)


Since this is by far the most expensive step of the algorithm, it could make sense to save its result as an underscore-prefixed attribute of the clusterer object for later reuse, e.g., in predict.
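
The caching idea in this comment could look roughly like the following pure-Python sketch. The class `CachingClusterer`, its attribute `_embedding`, the method `labels_from_cache`, and the toy `expensive_embed` function are all hypothetical names chosen for illustration; none of them is taken from the Heat code base.

```python
from typing import Callable, List, Optional, Sequence


class CachingClusterer:
    """Hypothetical clusterer that stores the expensive embedding after fit()."""

    def __init__(self, embed: Callable[[Sequence[float]], List[float]]) -> None:
        self._embed = embed  # stands in for the costly eigendecomposition
        self._embedding: Optional[List[float]] = None  # cached result

    def fit(self, x: Sequence[float]) -> "CachingClusterer":
        # The expensive step runs exactly once per call to fit().
        self._embedding = self._embed(x)
        return self

    def labels_from_cache(self) -> List[int]:
        # Reuse the stored embedding instead of recomputing it.
        if self._embedding is None:
            raise RuntimeError("call fit() before requesting labels")
        return [int(value > 0.0) for value in self._embedding]

    def fit_predict(self, x: Sequence[float]) -> List[int]:
        return self.fit(x).labels_from_cache()


calls: List[int] = []


def expensive_embed(x: Sequence[float]) -> List[float]:
    """Toy stand-in for the embedding: center the data and count invocations."""
    calls.append(1)
    mean = sum(x) / len(x)
    return [value - mean for value in x]


model = CachingClusterer(expensive_embed).fit([0.0, 0.1, 5.0, 5.2])
first = model.labels_from_cache()
second = model.labels_from_cache()
```

Both label requests read the cached embedding, so the stand-in for the eigendecomposition is invoked only once.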

mrfh92

The API-changes look fine to me 👍

I have only the two comments attached directly to the code, both about reusing the (expensive) results of the eigenvalue decomposition. I suggest saving them in the object to avoid recomputation.

mrfh92 · 2025-04-07T09:17:34Z

Addition to my review: I just saw that scikit-learn's SpectralClustering does not have a predict method of its own, only fit_predict.

github-actions · 2025-05-16T08:18:45Z

Thank you for the PR!

ClaudiaComito · 2025-12-02T09:28:07Z

Update after #1964 is merged

ClaudiaComito

various updates after merging #1457

github-actions · 2026-05-18T05:41:48Z

This pull request is stale because it has been open for 60 days with no activity.

brownbaerchen · 2026-05-18T06:13:41Z

+
+fileignoreconfig:
+- filename: .github/workflows/ci.yaml
+  checksum: 3bf095a6ff388eb7aacb80e3b33ed600b377bbce6a97f01ac1579dc702024a13
+- filename: .github/workflows/ci_full.yaml
+  checksum: 54389cc7b9ddd0010ed99d1194da724dbcc51922358f739a5aa30e65e8e2c0e0
+version: "1.0"


Suggested change

fileignoreconfig:
- filename: .github/workflows/ci.yaml
  checksum: 3bf095a6ff388eb7aacb80e3b33ed600b377bbce6a97f01ac1579dc702024a13
- filename: .github/workflows/ci_full.yaml
  checksum: 54389cc7b9ddd0010ed99d1194da724dbcc51922358f739a5aa30e65e8e2c0e0
version: "1.0"

ClaudiaComito added 6 commits January 14, 2025 13:02

refactor API, add precomputed affinity

3fd75a9

adapt tests

3114725

Merge branch 'main' into features/1740-Support_pre-calculated_affinity_matrix_in_ht_cluster_Spectral

075f70f

Merge branch 'main' into features/1740-Support_pre-calculated_affinity_matrix_in_ht_cluster_Spectral

5764632

introduce eigen_solver parameter

441ad2c

adapt tests

bd80333

github-actions Bot added cluster features labels Mar 21, 2025

ClaudiaComito added this to the 1.6 milestone Mar 31, 2025

github-project-automation Bot added this to Roadmap Mar 31, 2025

github-project-automation Bot moved this to Todo in Roadmap Mar 31, 2025

ClaudiaComito requested a review from mrfh92 March 31, 2025 09:46

ClaudiaComito and others added 2 commits April 4, 2025 11:34

Merge branch 'main' into features/1740-Support_pre-calculated_affinity_matrix_in_ht_cluster_Spectral

39209a2

[pre-commit.ci] auto fixes from pre-commit.com hooks

0ec780e

for more information, see https://pre-commit.ci

mrfh92 reviewed Apr 7, 2025

mrfh92 requested changes Apr 7, 2025

github-project-automation Bot moved this from Todo to In Progress in Roadmap Apr 7, 2025

ClaudiaComito added 4 commits April 30, 2025 14:15

Merge branch 'main' into features/1740-Support_pre-calculated_affinity_matrix_in_ht_cluster_Spectral

8663b0d

add fit_predict

1fb1fb0

Merge branch 'main' into features/1740-Support_pre-calculated_affinity_matrix_in_ht_cluster_Spectral

29a995e

Merge branch 'features/1723-Symmetric_eigenvalue_decomposition_and_SVD_based_on_Zolotarev-polar_decomposition' into features/1740-Support_pre-calculated_affinity_matrix_in_ht_cluster_Spectral

a5be8d8

ClaudiaComito added 2 commits May 20, 2025 10:46

support eigh decomposition

8868d34

add argument n_components

631a4aa

github-actions Bot added the stale label Nov 3, 2025

ClaudiaComito removed the stale label Nov 3, 2025

ClaudiaComito modified the milestones: 1.7.0, 1.8.0 Dec 2, 2025

ClaudiaComito modified the milestones: 1.8.0, 1.9.0 Mar 3, 2026

ClaudiaComito added 2 commits March 9, 2026 13:07

Merge branch 'main' into features/1740-Support_pre-calculated_affinity_matrix_in_ht_cluster_Spectral

0e96ddb

Merge branch 'main' into features/1740-Support_pre-calculated_affinity_matrix_in_ht_cluster_Spectral

fb921ea

ClaudiaComito commented Mar 10, 2026

ClaudiaComito and others added 16 commits March 10, 2026 10:15

Update docstring and args defaults

ee42e63

[pre-commit.ci] auto fixes from pre-commit.com hooks

ad63d2c

for more information, see https://pre-commit.ci

Update __spectral_embedding docs

3414f19

Edits

3b51b3c

[pre-commit.ci] auto fixes from pre-commit.com hooks

3b386a9

for more information, see https://pre-commit.ci

Remove dead code

116ecef

Remove dead code

82b0eb6

Remove outdated reference to linalg.polar

1c40715

remove dead code

bee92cc

sklearn API consistency

f71a075

__spectral_embedding returns embedding only

03bbc01

Change return type from Tuple to DNDarray for the function.

assess n_components before decomposition

870dc7a

[pre-commit.ci] auto fixes from pre-commit.com hooks

673d43d

for more information, see https://pre-commit.ci

apply standard normalization

d5ca6d0

Update error message

ee9bdab

Refactor comments in spectral.py for clarity

68a31c4

Removed outdated notes about eigenvalues and added comments regarding the Laplacian matrix properties.

github-actions Bot added the stale label May 18, 2026

brownbaerchen reviewed May 18, 2026

github-actions Bot removed the stale label May 25, 2026
