Improve LinearOpeator performance for FermionOperator#663
Conversation
kevinsung
left a comment
There was a problem hiding this comment.
Thanks Inho, here are a few initial comments.
|
|
||
| import ffsim | ||
| from ffsim import FermionOperator | ||
| from ffsim._slow.fermion_operator import FermionOperator as SlowFermionOperator |
There was a problem hiding this comment.
We shouldn't import anything from _slow since that's subject to deletion at any time. Instead, copy and paste the old implementation as a function in this test file.
| ) | ||
| fermion_term_data: list[_FermionTermData] = [] | ||
| # Split each term into alpha and beta transition data. | ||
| for term, coeff in bounded_operator.normal_ordered().items(): |
There was a problem hiding this comment.
In the main branch you can now do bound_operator.normal_ordered(group_by_spin=True) (see #668). Then, you can get rid of the inversion counting logic.
| alpha=_make_spin_term_data(tuple(alpha_term), norb=norb, nocc=nelec[0]), | ||
| beta=_make_spin_term_data(tuple(beta_term), norb=norb, nocc=nelec[1]), |
There was a problem hiding this comment.
Nit: ffsim style is to unpack n_alpha, n_beta = nelec near the top of the function.
| """Test linear operator.""" | ||
| norb = 5 | ||
| nelec = (2, 2) | ||
| op = FermionOperator( |
There was a problem hiding this comment.
Instead of hard-coding operators, we should generate them randomly. In the main branch, you can now do
op = ffsim.random.random_fermion_operator(norb, num_and_spin_conserving=True)There was a problem hiding this comment.
Let's move all the code specific to FermionOperator to python/ffsim/operators/fermion_operator.py and just import the _fermion_operator_to_linear_operator in this file.
Summary
Resolve #50
Details
This PR replaces the generic$z$ .
FermionOperatoraction used bylinear_operator. The change aplies when the input object is aFermionOperatorthat conserves particle number and spinPreviously, the implementation evaluated
This used to store ladder-operator tuple and called the PySCF creation or annihilation for each elementary action. But this make every
matvecrepeated the same intermediate sector transversal, determinant admissibility checks, address computations, and ferminoic sign computation.This implementation moves the determinant combinatorics to$0, \dots, n_{orb}-1$ , since such terms act as zero on the selected finite orbital space. The remaining operator is normal ordered and rewritten as
LinearOperatorconsatruction. The Python first applies the same symmetry check. Then removes terms whose orbital labels lie outside the active rangeThe coefficient$\eta_t$ contains the normal-ordering signs and the sign from separating alpha and beta ladder operators into fixed spin blocks.
Each spin block is compiled once into a fixed-sector transition map
The source address$u_{t \sigma r}$ , labels an input determinant in the fixed $n_\sigma$ -electron spin sector. The destination address $d_{t \sigma r}$ labels the determinant reached after applying the spin block. The phase $\phi_{t \sigma r} \in \{ \pm 1 \}$ is the product of the elementtary fermionic signs obtained during the reverse traversal of that block. Empty spin blocks are represented as identities
For a mixed alpha-beta term, the packed action on the tensor$X$ is
Alpha-only, beta-only, scalar terms are the corresponding reductions of this update. The Python layer stores scalar coefficients, compressed one-spin transitions, mixed-term coefficients, mixed transition arrays, and per-term index pointers. One-spin transitions with equal source and destination addresses are summed before entering the accumulation kernel. Mixed terms remain factored into alpha and beta maps, so the full tenrsor product list of flattened addresses is not materialized.
Rust is implement for hot accumulation loop. So druing$(D_\alpha, D_\beta)$ , allocates an output tensor, and passes both tensors with the packed arrays to Rust. Rust then contracts scalar terms, alpha-only transitions, beta-only transitions, and mixed transitions into the output buffer. The mixed contraction chooses the output spin loop by comparing the number of alpha and beta transitions for the term.
matvec, Python converts the input vector to a contiguous complex array, reshape it toThe
rmatvecimplementation reuses the same packed data and updateand update adjoint
The Rust kernel implements this rule by conjugating the stored scalar and swapping source and destination addresses when the
reverseflag is set.Algorithm 1: Staged fixed-sector action for a fermionic operator
Require:$\hat{O} = \sum_{t} c_{t} \hat{g_t}$ , active orbital count $n_{orb}$ , sector $(n_\alpha, n_\beta)$ , amplitues $X$ , direction $d \in \{forward,adjoint\}$
Notation:$|u \rangle, |d \rangle$ denote fixed-spin determinants, $\phi \in \{ \pm 1 \}$ is the fermionic sign, and $M_{t \sigma}$ is the transition map for spin block $m_{t\sigma}$ . $\mathcal{S}, \mathcal{A}, \mathcal{B}$ , and $\mathcal{M}$ denote scalar, alpha-only, beta-only, and mixed alpha-beta terms. The tensor indices are $a$ for alpha addresses and $b$ for beta addresses.
Benchamrk
The benchmark has been implemented compared to
fqe.Note that the benchmark was made in Macbook m4 air, single core.
ffsim
fqe