Skip to content

Solver Testing and Addressed Bugs Along the Way#134

Merged
vincent-maillou merged 21 commits into
devfrom
fix_distributed_sellinv
Dec 3, 2025
Merged

Solver Testing and Addressed Bugs Along the Way#134
vincent-maillou merged 21 commits into
devfrom
fix_distributed_sellinv

Conversation

@vincent-maillou

@vincent-maillou vincent-maillou commented Oct 27, 2025

Copy link
Copy Markdown
Collaborator

Added testing suite for sequential and distributed solvers. Tests are separated in two parts: unit/ and component_integration/.

  • Fixed the factorization in scipy solver (faked Cholesky -> LU)
  • Fixed problems in distributed structured solver RHS (solve / mapping / gather)
  • Added "automated" runner script. To be seen with GitHub action if we keep this, if it's needed or not.

Regarding #129 , the tests are passing for CPU/GPU arrays using MPI (not tested with mpi cuda aware and NCCL, need environment development from branch installation_environements), it might still be that the exact case that fail is not tested for and / or I have overlooked something.

Closes #72 , as now the testing suite structure will be defined and about 300–500 tests are available. A lot of work remaining to test the entire framework → will be done incrementally

The PR: Mpi issues #124 has been merged in subsequent "AR1/gamma_prior" branches, so this is gonna be dealt with in rebasing to these branches. For this reason, I would suggest closing Mpi issues #124

a non-permuted LU factorization isn't actually numerically correct.
Modified the implementation (and interfaces of the solvers) to use
LU and the appropriate triangular solve calls. Other solvers still relies
on Cholesky when available.
… tests for Serinv.

- Figured out fixture structure
- Added fixtures factory for testing utils functions
- Added structured solver testing for solver.factorize workflow
- This raised an unexpected behavior in the `_strutured_to_spmatrix` mapping.
- Separated sequential / distributed tests related fixtures
- Solved several issues in the implementation of the distributed solver
- Fixed the structured_to_sparse mapping function that simply wasn't working
- Adapted `runner.sh`
- Updated `README.md` with new test descriptions

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR implements a comprehensive testing suite for DALIA's solver implementations and fixes several critical bugs discovered during test development. The testing infrastructure is organized into unit tests and component integration tests, covering sequential and distributed solver configurations across CPU and GPU backends.

Key changes:

  • Added 300-500 tests for dense, sparse, and structured solvers (sequential and distributed)
  • Fixed scipy sparse solver to use LU decomposition instead of fake Cholesky
  • Corrected distributed structured solver issues with RHS handling, array gathering, and mapping
  • Included automated test runner script for convenient backend and test category selection

Reviewed Changes

Copilot reviewed 49 out of 49 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
tests/runner.sh Shell script for automated test execution with backend detection and selection
tests/test_config.py Test configuration constants (tolerances, random seed)
tests/structured_solvers_utils.py Utility functions for structured solver test generation and validation
tests/component_integration/solvers/utils.py Utility functions for component integration tests
tests/unit/solvers/structured_solvers/sequential/test_spmatrix_mapping.py Unit tests for sequential structured solver matrix mapping
tests/unit/solvers/structured_solvers/distributed/test_spmatrix_mapping.py Unit tests for distributed structured solver matrix mapping
tests/component_integration/solvers/structured_solvers/sequential/test_*.py Component integration tests for sequential structured solvers
tests/component_integration/solvers/structured_solvers/distributed/test_*.py Component integration tests for distributed structured solvers
tests/component_integration/solvers/sparse_solvers/sequential/test_*.py Component integration tests for sparse solvers
tests/component_integration/solvers/dense_solvers/sequential/test_*.py Component integration tests for dense solvers
src/dalia/solvers/sparse_solver.py Fixed to use LU decomposition with proper solve and logdet implementations
src/dalia/solvers/distributed_structured_solver.py Fixed RHS handling, gather operations, and multiple RHS support
src/dalia/solvers/structured_solver.py Added symmetrize parameter to _structured_to_spmatrix
src/dalia/solvers/dense_solver.py Renamed cholesky to factorize
src/dalia/core/solver.py Updated abstract method from cholesky to factorize
src/dalia/utils/multiprocessing.py Fixed allgather to properly return list of arrays instead of concatenating

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread tests/README.md Outdated
Comment thread tests/component_integration/solvers/sparse_solvers/sequential/test_factorize.py Outdated
Comment thread src/dalia/solvers/dense_solver.py Outdated
Comment thread src/dalia/core/solver.py Outdated
Comment thread src/dalia/solvers/distributed_structured_solver.py Outdated
@vincent-maillou vincent-maillou self-assigned this Oct 27, 2025
@vincent-maillou vincent-maillou linked an issue Oct 27, 2025 that may be closed by this pull request

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 49 out of 49 changed files in this pull request and generated 8 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread tests/unit/solvers/structured_solvers/sequential/test_spmatrix_mapping.py Outdated
Comment thread tests/unit/solvers/structured_solvers/sequential/test_spmatrix_mapping.py Outdated
Comment thread tests/unit/solvers/structured_solvers/distributed/test_spmatrix_mapping.py Outdated
Comment thread tests/unit/solvers/structured_solvers/distributed/test_spmatrix_mapping.py Outdated
Comment thread tests/structured_solvers_utils.py Outdated
Comment thread tests/runner.sh Outdated
Comment thread src/dalia/solvers/sparse_solver.py Outdated
Comment thread src/dalia/solvers/sparse_solver.py Outdated
@@ -763,16 +793,17 @@ def _structured_to_spmatrix(

# TODO: Need to communicate to agregates/Map the local B matrix to all ranks

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

someone once told me i shouldnt have any TODOs left in my PRs.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what are your thoughts on this "symmetrize" option? do you want it to be enabled by default? when is it relevant / not relevant?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

after the discussions of the last week, i wonder whether we should have a non-diagonally dominant test case?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can't you test the solver by checking if L is lower triangular and L*L^T = A?

Comment thread tests/component_integration/solvers/utils.py Outdated
- Removed DiagDom caracteristic in dense solver testing
- Minor modifications in the sparse solver implementation
- Improved test for the scipy sparse solver
@vincent-maillou vincent-maillou merged commit e230e42 into dev Dec 3, 2025
@vincent-maillou vincent-maillou deleted the fix_distributed_sellinv branch December 3, 2025 13:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add Testing Suite

3 participants