1. Ordinary Persistence Image (Ord PI) Calculation

\begin{algorithm}[H]
\caption{Computation of Ordinary Persistence Image (Ord PI)}
\label{alg:ord_pi}
\begin{algorithmic}[1]
\REQUIRE Persistence Diagram $D = \{ (b_i, d_i) \}_{i \in I}$, Weight function $w: \mathbb{R}^2 \to \mathbb{R}$, Gaussian kernel parameter $\sigma$, Grid resolution $R$
\ENSURE Persistence Image Vector $V_{PI}$

\STATE Define transformation $T: \mathbb{R}^2 \to \mathbb{R}^2$ as $T(b, d) = (b, d-b)$;
\STATE $P \leftarrow \emptyset$; \hfill \COMMENT{▷ Transformed points (birth, persistence)}
\FOR{each pair $(b_i, d_i) \in D$}
\IF{$d_i < \infty$}
\STATE $P \leftarrow P \cup \{ T(b_i, d_i) \}$;
\ENDIF
\ENDFOR

\STATE Define surface function $\rho: \mathbb{R}^2 \to \mathbb{R}$ initialized to 0;
\FOR{each point $u = (x_u, y_u) \in P$}
\STATE $w_u \leftarrow w(x_u, y_u)$; \hfill \COMMENT{▷ Calculate weight (e.g., $w_u = y_u$)}
\STATE $\rho(z) \leftarrow \rho(z) + w_u \cdot \frac{1}{2\pi\sigma^2} e^{-\frac{\|z-u\|^2}{2\sigma^2}}$; \hfill \COMMENT{▷ Sum of weighted Gaussian kernels}
\ENDFOR

\STATE $V_{PI} \leftarrow \text{Array of size } R \times R$;
\STATE Define grid pixels $\{ \pi_{j,k} \}_{1 \le j,k \le R}$;
\FOR{each pixel $\pi_{j,k}$}
\STATE $V_{PI}[j, k] \leftarrow \iint_{\pi_{j,k}} \rho(z) \, dz$; \hfill \COMMENT{▷ Discretize by integration over pixel area}
\ENDFOR

\RETURN Flatten($V_{PI}$);
\end{algorithmic}
\end{algorithm}
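
As a rough illustration of the algorithm above, the following Python sketch evaluates the pixel integrals in closed form instead of by numerical quadrature. Because the isotropic Gaussian factorizes into two 1D normal densities, its integral over a rectangular pixel is a product of two differences of the normal CDF. The function names, the explicit `birth_range` and `pers_range` arguments, and the default weight $w(b, p) = p$ are assumptions made for this sketch; they are not taken from the notebooks.

```python
import math


def normal_cdf(x, mu, sigma):
    """CDF of the 1D normal distribution N(mu, sigma^2) at x."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def persistence_image(diagram, sigma, resolution, birth_range, pers_range,
                      weight=lambda b, p: p):
    """Return the flattened resolution x resolution persistence image."""
    # Map finite (birth, death) pairs to (birth, persistence); drop infinite bars.
    points = [(b, d - b) for b, d in diagram if math.isfinite(d)]

    (b_lo, b_hi), (p_lo, p_hi) = birth_range, pers_range
    db = (b_hi - b_lo) / resolution  # pixel width along the birth axis
    dp = (p_hi - p_lo) / resolution  # pixel height along the persistence axis

    image = [[0.0] * resolution for _ in range(resolution)]
    for x_u, y_u in points:
        w_u = weight(x_u, y_u)
        # Gaussian mass falling into each birth column and persistence row.
        col = [normal_cdf(b_lo + (j + 1) * db, x_u, sigma)
               - normal_cdf(b_lo + j * db, x_u, sigma) for j in range(resolution)]
        row = [normal_cdf(p_lo + (k + 1) * dp, y_u, sigma)
               - normal_cdf(p_lo + k * dp, y_u, sigma) for k in range(resolution)]
        for j in range(resolution):
            for k in range(resolution):
                image[j][k] += w_u * col[j] * row[k]

    # Flatten in row-major order, corresponding to Flatten(V_PI).
    return [value for line in image for value in line]
```

With the linear weight, a single finite point of persistence $p$ whose Gaussian lies almost entirely inside the grid contributes a total mass of approximately $p$ to the image, which is a convenient sanity check.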

2. Sixpack Data Calculation

\begin{algorithm}[H]
\caption{Computation of Sixpack Features}
\label{alg:sixpack}
\begin{algorithmic}[1]
\REQUIRE Point set $P = \{ (x_k, y_k, c_k) \}_{k=1}^N$ (spatial coordinates and scalar value $c_k$), Threshold $c_{th}$, Grid resolution $R$, Correlation range $r_{max}$
\ENSURE Feature Vector $V_{Sixpack}$

\STATE Let $T = \{ (x_k, y_k) \mid c_k > c_{th} \}$ and $V = \{ (x_k, y_k) \mid c_k \le c_{th} \}$; \hfill \COMMENT{▷ Active/Passive phases based on threshold}

\STATE $F_{void} \leftarrow$ Area fraction of voids (connected components of $V$);
\STATE $N_{clusters} \leftarrow$ Number of connected components of $T$;
\STATE $L_{interface} \leftarrow$ Total length of boundary between $T$ and $V$;

\STATE Calculate Nearest Neighbor Distance Distribution (NND) for centroids of clusters in $T$;
\STATE Calculate Pair Correlation Function $g(r)$ for $r \in (0, r_{max}]$; \hfill \COMMENT{▷ Probability of finding a point at distance $r$}

\STATE $V_{Sixpack} \leftarrow [F_{void}, N_{clusters}, L_{interface}, \text{Mean}(NND), \text{Var}(NND), \int_0^{r_{max}} g(r) \, dr]$; \hfill \COMMENT{▷ Combine scalar metrics}

\RETURN $V_{Sixpack}$;
\end{algorithmic}
\end{algorithm}
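
The first five entries of the feature vector can be sketched in Python as follows. This sketch assumes the scalar values have already been rasterized onto a regular grid (`field`), uses 4-connectivity for connected components, and measures the interface as the number of grid edges separating active from passive cells; none of these choices is stated in the algorithm, and the names `label_components` and `grid_features` are hypothetical. The integral of $g(r)$ is left out here.

```python
from collections import deque
import math
import statistics


def label_components(mask):
    """Return the 4-connected components of True cells as lists of (row, col)."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            queue, cells = deque([(r, c)]), []
            seen[r][c] = True
            while queue:  # breadth-first flood fill
                i, j = queue.popleft()
                cells.append((i, j))
                for ni, nj in ((i + 1, j), (i - 1, j), (i, j + 1), (i, j - 1)):
                    if (0 <= ni < rows and 0 <= nj < cols
                            and mask[ni][nj] and not seen[ni][nj]):
                        seen[ni][nj] = True
                        queue.append((ni, nj))
            components.append(cells)
    return components


def grid_features(field, c_th, cell_size=1.0):
    """Return [F_void, N_clusters, L_interface, Mean(NND), Var(NND)]."""
    rows, cols = len(field), len(field[0])
    active = [[value > c_th for value in line] for line in field]
    clusters = label_components(active)

    # Fraction of cells at or below the threshold.
    f_void = 1.0 - sum(len(cells) for cells in clusters) / (rows * cols)

    # Count grid edges whose two cells lie in different phases.
    edges = 0
    for r in range(rows):
        for c in range(cols):
            if c + 1 < cols and active[r][c] != active[r][c + 1]:
                edges += 1
            if r + 1 < rows and active[r][c] != active[r + 1][c]:
                edges += 1

    # Nearest-neighbor distances between cluster centroids.
    centroids = [(statistics.fmean(i for i, _ in cells),
                  statistics.fmean(j for _, j in cells)) for cells in clusters]
    nnd = []
    for a, ca in enumerate(centroids):
        others = [math.dist(ca, cb) for b, cb in enumerate(centroids) if b != a]
        if others:
            nnd.append(min(others) * cell_size)
    mean_nnd = statistics.fmean(nnd) if nnd else 0.0
    var_nnd = statistics.pvariance(nnd) if nnd else 0.0

    return [f_void, len(clusters), edges * cell_size, mean_nnd, var_nnd]
```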

3. Mixup Barcode Calculation

\begin{algorithm}[H]
\caption{Computation of Mixup Barcodes via Canonical Matching (Wagner et al. 2024)}
\label{alg:mixup_barcode}
\begin{algorithmic}[1]
\REQUIRE Point clouds $A, B$, Filtration parameter $\epsilon$
\ENSURE Mixup Barcode $B_{mixup}$

\STATE Construct Vietoris-Rips filtrations $K_A, K_{A \cup B}$ from $A, A \cup B$;
\STATE Apply infinitesimal perturbation $\eta$ to all filtration values; \hfill \COMMENT{▷ Ensure function values are unique}
\STATE $B_{dom} \leftarrow$ Compute Persistent Homology of $K_A$ (e.g., using standard reduction); \hfill \COMMENT{▷ Domain Persistence}
\STATE $B_{im} \leftarrow$ Compute image persistence of the inclusion $f: K_A \rightarrow K_{A \cup B}$, i.e., of $\operatorname{im}(f)$; \hfill \COMMENT{▷ Image Persistence}

\STATE $B_{mixup} \leftarrow \emptyset$; \hfill \COMMENT{▷ Initialize resulting barcode}

\FOR{each bar $\gamma = [b_{im}, d_{im}] \in B_{im}$}
\STATE $\sigma \leftarrow$ Identify the unique simplex in $K_A$ that creates the cycle at $b_{im}$;
\STATE Find the corresponding bar $\delta = [b_{dom}, d_{dom}] \in B_{dom}$ generated by $\sigma$;
\STATE $B_{mixup} \leftarrow B_{mixup} \cup \{ (d_{dom}, d_{im}) \}$; \hfill \COMMENT{▷ Form pairs, or triples $(b_{im}, d_{im}, d_{dom})$}
\ENDFOR

\RETURN $B_{mixup}$;
\end{algorithmic}
\end{algorithm}
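
The matching loop at the end of the algorithm amounts to a join on the creator simplex. The Python sketch below covers only that step and assumes the domain and image barcodes have already been computed elsewhere and are given as dictionaries keyed by an identifier of the creating simplex; this data layout and the name `match_mixup` are hypothetical.

```python
def match_mixup(image_bars, domain_bars):
    """Join image and domain bars that share a creator simplex.

    Both arguments map a creator-simplex id (any hashable value) to a
    (birth, death) pair. Returns sorted triples (b_im, d_im, d_dom).
    """
    triples = []
    for simplex, (b_im, d_im) in image_bars.items():
        # The perturbation step makes filtration values unique, so each
        # image bar has exactly one creator and hence one domain partner.
        _b_dom, d_dom = domain_bars[simplex]
        triples.append((b_im, d_im, d_dom))
    return sorted(triples)


# Toy input: one class dies earlier in the image than in the domain.
image_bars = {"e1": (0.0, 0.4), "t7": (0.5, 0.9)}
domain_bars = {"e1": (0.0, 1.0), "t7": (0.5, 0.9), "e2": (0.0, 0.3)}
mixup = match_mixup(image_bars, domain_bars)
```

Domain bars without a partner in the image barcode (here `"e2"`) are simply not emitted, mirroring the loop over $B_{im}$.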

4. Experimental Framework Structure (Original Notebooks)

The experimental framework is organized into three main stages, implemented across separate notebooks:

Stage 1: Order Parameter Feature Extraction

Notebook: Original Notebooks/Compute OP Features.ipynb

This stage calculates statistical mechanics descriptors based on the spatial distribution of cell types.

Input: Raw simulation data (Pos_*.dat, Types_*.dat)
Methodology:
1. Angular Distribution Order Parameters (OPs):
  - $\Theta(\theta)$: Overall angular distribution
  - $\Theta_B(\theta), \Theta_O(\theta)$: Type-specific angular distributions (Blue/Orange)
  - Measures the directional alignment of neighbors.
2. Radial Distribution Function (RDF):
  - $R(r)$: Overall pair correlation function
  - $R_B(r), R_O(r)$: Type-specific pair correlations
  - Measures the probability of finding a neighbor at distance $r$.
Output: Vector of concatenated OP values.

Stage 2: Topological Feature Extraction

Notebook: Original Notebooks/Compute TDA Features.ipynb

This stage computes topological descriptors using Persistent Homology.

Input: Raw simulation data
Methodology:
1. Vietoris-Rips Filtration: Constructed on point clouds of:
  - Green cells (Type 2)
  - Red cells (Type 1)
  - All cells combined
2. Ordinary Persistence Images (PIs):
  - Computes PIs for Homology dimensions $H_0, H_1, H_2$.
  - Parameters:
    - Weight Function: Linear Persistence ($w = p$)
    - Kernel: Gaussian ($\sigma=0.05$)
    - Resolution: Defined by pixel_size and max_eps
Output: .npy files containing stacked Persistence Image vectors for each dimension and sub-population.

Stage 3: Evaluation (Embedding & Classification)

Notebook: Original Notebooks/Embedding and Classification.ipynb

This stage combines features and evaluates their discriminative power.

Input: Extracted OP vectors and TDA vectors.
Methodology:
1. Dimensionality Reduction:
  - PCA (Principal Component Analysis)
  - t-SNE (t-Distributed Stochastic Neighbor Embedding) for 2D visualization.
2. Classification:
  - Train classifiers (e.g., SVM, Random Forest) on the feature vectors.
  - Compare accuracy between OP features and TDA features.

4. Experimental Framework Structure and Methodology

The experimental framework is organized into three stages: Order Parameter (OP) feature extraction, topological (TDA) feature extraction, and comparative evaluation using supervised learning.

Stage 1: Order Parameter (OP) Feature Extraction

Notebook: Original Notebooks/Compute OP Features.ipynb

This stage quantifies spatial ordering using statistical mechanics descriptors derived from particle positions and types.

Input: Raw simulation data (Pos_*.dat, Types_*.dat) representing cell centers and types (Red/Type 1, Green/Type 2).
Methodology:
1. Angular Distribution Order Parameters ($\Theta$):
  - Captures the directional alignment of neighboring cells.
  - Computed for the overall population ($\Theta(\theta)$) and type-specific subpopulations ($\Theta_B(\theta)$ for Blue/Green, $\Theta_O(\theta)$ for Orange/Red).
2. Radial Distribution Function ($R(r)$):
  - Measures the probability of finding a particle at a distance $r$ from a reference particle, normalized by the ideal-gas density.
  - Homotypic RDFs: $R_B(r)$ (Green-Green), $R_O(r)$ (Red-Red).
  - Overall and heterotypic RDFs: $R(r)$ (All-All) and, potentially, $R_{BO}(r)$ (Green-Red cross-correlation).
Output: A high-dimensional feature vector concatenated from discretized $\Theta(\theta)$ and $R(r)$ distributions.
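
A minimal Python sketch of the discretized radial distribution function is given below. It assumes a 2D square simulation box of side `box` with periodic boundaries (minimum-image convention) and `r_max` no larger than half the box side; the notebook's actual boundary handling and binning are not described here, so these are assumptions, as is the name `radial_distribution`.

```python
import math


def radial_distribution(points, box, r_max, n_bins):
    """Histogram estimate of g(r) for 2D points in a periodic square box."""
    n = len(points)
    dr = r_max / n_bins
    counts = [0] * n_bins
    for i in range(n):
        for j in range(i + 1, n):
            dx = points[i][0] - points[j][0]
            dy = points[i][1] - points[j][1]
            # Minimum-image convention: use the nearest periodic copy.
            dx -= box * round(dx / box)
            dy -= box * round(dy / box)
            r = math.hypot(dx, dy)
            if r < r_max:
                counts[min(int(r / dr), n_bins - 1)] += 1

    density = n / (box * box)  # ideal-gas reference density
    g = []
    for k, pair_count in enumerate(counts):
        shell_area = math.pi * (((k + 1) * dr) ** 2 - (k * dr) ** 2)
        # Each unordered pair counts for both particles, hence the factor 2.
        g.append(2.0 * pair_count / (n * density * shell_area))
    return g
```

Type-specific curves such as $R_B(r)$ follow from the same function by passing only the positions of one cell type.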

Stage 2: Topological Feature Extraction

Notebook: Original Notebooks/Compute TDA Features.ipynb

This stage extracts robust topological features using Persistent Homology and vectorizes them into Persistence Images (PIs).

Input: Point clouds of Green cells, Red cells, and All cells.
Methodology:
1. Vietoris-Rips Filtration:
  - Constructed independently for three point cloud sets: Green (Type 2), Red (Type 1), and Combined.
2. Persistent Homology Calculation:
  - Computes persistence diagrams $D_k$ for homology dimensions $k \in \{0, 1, 2\}$.
  - $H_0$: Connected components.
  - $H_1$: Loops/Cycles.
  - $H_2$: Voids/Cavities.
3. Persistence Image (PI) Vectorization:
  - Converts persistence diagrams into vector-space compatible images.
  - Metric Settings:
    - Weight Function: Linear persistence weighting ($w(b, p) = p$) to emphasize high-persistence features.
    - Kernel: Gaussian kernel with $\sigma = 0.05$.
    - Resolution: Pixel size $p_{res} = 0.1$.
  - Filtration Ranges:
    - Dimension 0: Birth $[0, 1]$, Persistence $[0, 10]$.
    - Dimension 1 & 2: Birth $[0, 10]$, Persistence $[0, 5]$ (Max $\epsilon = 10$).
Output: Stacked vectors of Persistence Images for $H_0, H_1, H_2$ across all subpopulations.
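
The ranges and pixel size listed above determine the length of each Persistence Image vector. The short Python calculation below assumes each axis is divided into `range / pixel_size` pixels with no padding; the library used in the notebook may treat the endpoints differently, so the counts are indicative only. The helper name `pi_grid_shape` is hypothetical.

```python
def pi_grid_shape(birth_range, pers_range, pixel_size):
    """Number of pixels along the birth and persistence axes."""
    n_birth = round((birth_range[1] - birth_range[0]) / pixel_size)
    n_pers = round((pers_range[1] - pers_range[0]) / pixel_size)
    return n_birth, n_pers


# Dimension 0: Birth [0, 1], Persistence [0, 10], pixel size 0.1.
shape_h0 = pi_grid_shape((0.0, 1.0), (0.0, 10.0), 0.1)
# Dimensions 1 and 2: Birth [0, 10], Persistence [0, 5], pixel size 0.1.
shape_h12 = pi_grid_shape((0.0, 10.0), (0.0, 5.0), 0.1)

pixels_h0 = shape_h0[0] * shape_h0[1]
pixels_h12 = shape_h12[0] * shape_h12[1]
```

Under these assumptions a dimension-0 image has 10 x 100 pixels and a dimension-1 or dimension-2 image has 100 x 50 pixels.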

Stage 3: Evaluation (Embedding & Classification)

Notebook: Sixpack_Chroma_비교평가.ipynb (and Embedding and Classification.ipynb)

This stage evaluates the discriminative power of the extracted features using several classifiers under stratified cross-validation.

Dimensionality Reduction:
- PCA (Principal Component Analysis): Used for initial feature compression and visualization.
- PHATE: Applied for manifold visualization to reveal intrinsic data geometry.
Classification Models: A suite of classifiers is trained to benchmark feature performance:
1. Support Vector Machines (SVM):
  - RBF Kernel: Tested with $C \in \{0.5, 1.0, 2.0\}$ and gamma='scale'.
  - Linear Kernel: Tested with $C=1.0$.
2. Random Forest:
  - Ensemble of 100 decision trees (n_estimators=100, random_state=42).
Validation Protocol:
- Stratified k-Fold Cross-Validation: $k=5$ splits with shuffling (random_state=42) to ensure class balance in every fold.
Performance Metrics:
1. Strict Accuracy: Standard classification accuracy.
2. Soft Accuracy: Custom metric that accepts adjacent phases as correct predictions (accounting for continuous phase transitions in the simulation space).
3. F1-Score: Weighted average F1-score to handle potential class imbalances.
4. Reporting: Mean and Standard Deviation of F1-scores across folds to assess model stability.
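
The strict and soft accuracy metrics and the per-fold reporting can be sketched in Python as follows. The sketch assumes the phases are encoded as ordered integers and that "adjacent" means the labels differ by at most one; if adjacency is defined on a two-dimensional phase diagram, the comparison would need an explicit neighbor table instead. The population standard deviation is used, which matches NumPy's default for `np.std`; the notebook's choice is not stated. All function names are hypothetical.

```python
import statistics


def strict_accuracy(y_true, y_pred):
    """Fraction of predictions that equal the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def soft_accuracy(y_true, y_pred, tolerance=1):
    """Fraction of predictions within `tolerance` phases of the true label."""
    return sum(abs(t - p) <= tolerance
               for t, p in zip(y_true, y_pred)) / len(y_true)


def summarize(fold_scores):
    """Mean and population standard deviation of per-fold scores."""
    return statistics.fmean(fold_scores), statistics.pstdev(fold_scores)


# Toy example with four samples and phases labeled 0..3.
y_true = [0, 1, 2, 3]
y_pred = [0, 2, 0, 3]
strict = strict_accuracy(y_true, y_pred)
soft = soft_accuracy(y_true, y_pred)
mean_score, std_score = summarize([0.8, 0.9, 1.0])
```

In the toy example the prediction 2 for true phase 1 counts as correct under soft accuracy, while the prediction 0 for true phase 2 does not.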
