A secure, Kaggle-style GNN competition for molecular property prediction
ENIGMA is a competitive benchmarking platform for Graph Neural Networks on molecular graph classification. Participants build GNN models to predict BACE-1 enzyme inhibition — a target relevant to Alzheimer's disease drug discovery — while all submissions are protected by RSA-2048 encryption.
Given a molecular graph $G = (V, E)$:

- Nodes $V$ represent atoms with features $\mathbf{x}_v \in \mathbb{R}^d$ encoding atomic properties
- Edges $E$ represent chemical bonds with features encoding bond types

Learn a graph-level representation and predict a binary label $y \in \{0, 1\}$ indicating whether the molecule is an active inhibitor of BACE-1 (Beta-secretase 1).
ENIGMA is built on top of the OGB MolBACE dataset, but the competition design departs significantly from the vanilla OGB benchmark. The table below summarises the key differences, highlighting where ENIGMA introduces something new.
| Aspect | OGB MolBACE (Standard) | ENIGMA |
|---|---|---|
| Evaluation Metric | ROC-AUC (single metric) | Macro F1 (primary) + Efficiency Score + Cliff Accuracy — a multi-dimensional evaluation that rewards balanced classification, computational frugality, and robustness to activity cliffs |
| Submission Security | Open upload to OGB evaluation server | RSA-2048 encrypted submissions via GitHub Pull Requests — predictions are chunked and encrypted with OAEP/SHA-256 padding; only CI runners with the private key can decrypt them |
| Scoring Infrastructure | OGB central evaluation server | Fully automated GitHub Actions CI pipeline — decryption → validation → scoring → leaderboard update, all within an ephemeral runner. No central server required |
| Data Splits | Scaffold split only | Scaffold split + MMP-OOD (Matched Molecular Pair Out-of-Distribution) — an additional stress-test split where test molecules are drawn from activity cliff pairs and training molecules are scaffold-excluded, evaluating true OOD generalisation |
| Activity Cliff Evaluation | Not evaluated | Pairwise cliff accuracy — the model is scored on classifying both molecules of each activity cliff pair correctly |
| Test Label Access | Publicly evaluable via OGB API | Hidden labels injected at CI time — test labels are stored in a GitHub Secret (`TEST_LABELS_CSV`) or a private repository, never committed to the public repo |
| Submission Attempts | Unlimited | One submission per team — enforced programmatically by `validate_submission.py`. This encourages careful model selection over leaderboard probing |
| Graph Data Format | Implicit via PyG (`edge_index`, `data.x`) | Explicit dense adjacency matrices stored as `.npz` files, alongside the standard PyG format |
| Efficiency Metric | Not measured | Efficiency Score — log-scaled hardware metrics (inference time, parameter count) combined with Macro F1, shown alongside the primary metric on the leaderboard |
| Robustness Testing | Not tested | Adversarial graph perturbations (random edge flips, gradient-based edge removal, feature noise, feature masking) measured via Attack Success Rate |
| Uncertainty Quantification | Not evaluated | MC Dropout, Conformal Prediction, Temperature Scaling — tools provided to measure epistemic uncertainty, calibration error, and prediction-set coverage |
| Baseline Suite | Single GCN baseline | Six baseline architectures: GCN, GIN, GraphSAGE (starter), plus D-MPNN and Spectral GNN with Chebyshev convolutions and Laplacian regularisation (advanced) |
| Leaderboard | Static OGB leaderboard | Interactive HTML/JS leaderboard with Pareto-front visualisation (F1 vs. Efficiency) auto-generated on every merge |
- Macro F1 over ROC-AUC — With ~30% positive class, ROC-AUC can be misleadingly high even when the minority class is poorly predicted. Macro F1 forces balanced performance across both classes.
- Encrypted submissions — In open benchmarks, participants can reverse-engineer test labels by submitting carefully constructed probes. RSA-2048 encryption eliminates this attack vector entirely.
- MMP-OOD evaluation — Standard scaffold splits can still leak structural similarity. Activity-cliff pairs are the hardest cases in drug discovery; evaluating on them reveals whether a model has truly learned molecular SAR (Structure–Activity Relationships).
- Efficiency scoring — Encourages participants to build lightweight, deployable models rather than scaling to impractically large architectures.
- One-shot submission — Mimics real-world drug-discovery decisions where you commit resources to a single model. Prevents overfitting to the test set through repeated submissions.
- How ENIGMA Differs from Standard OGB MolBACE
- Dataset
- Graph Specification
- Evaluation Metric
- Security Architecture
- Getting Started
- Submission Process
- Baseline Architectures
- Advanced Architectures
- Evaluation Dimensions
- Repository Structure
- Rules
- References
We use the OGB MolBACE dataset from the Open Graph Benchmark:
| Property | Value |
|---|---|
| Source | OGB MolBACE |
| Task | Binary classification (BACE-1 inhibitor: yes/no) |
| Split | Scaffold-based (prevents structural leakage) |
| Class balance | ~30% positive (imbalanced) |
| Split | Molecules | Labels | Description |
|---|---|---|---|
| Train | 1,210 | ✅ Provided | Model training |
| Valid | 151 | ✅ Provided | Hyperparameter tuning |
| Test | 152 | 🔒 Hidden | Final evaluation (CI-only) |
Each molecule is a graph with:

- Node features: 9-dimensional vectors $\mathbf{x}_v \in \mathbb{R}^9$ — atomic number, chirality, degree, formal charge, hydrogen count, hybridization, aromaticity, ring membership
- Edge features: 3-dimensional vectors — bond type, stereochemistry, conjugation
The scaffold split groups molecules by Bemis-Murcko scaffolds, ensuring structurally different molecules in train/test. This simulates real-world drug discovery where novel molecular scaffolds must be classified.
With ~30% positive class, a naive all-zeros classifier achieves ~70% accuracy but poor F1. Participants are encouraged to use class weighting, focal loss, or oversampling.
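As a minimal illustration of class weighting (a sketch, not part of the starter code), PyTorch's `BCEWithLogitsLoss` accepts a positive-class weight:

```python
import torch
import torch.nn as nn

# With ~30% positives, up-weight the positive class by n_neg / n_pos ≈ 0.7 / 0.3
pos_weight = torch.tensor([0.70 / 0.30])
criterion = nn.BCEWithLogitsLoss(pos_weight=pos_weight)

# Toy example: raw model outputs (logits) and binary labels
logits = torch.randn(8, 1)
labels = torch.randint(0, 2, (8, 1)).float()
loss = criterion(logits, labels)
```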
Every molecular graph is pre-computed and stored as dense NumPy matrices in data/graphs/:
| File | Molecules | Contents |
|---|---|---|
| `data/graphs/train_graphs.npz` | 1,210 | `indices`, per-molecule `adj_i`, `x_i`, `y_i` arrays |
| `data/graphs/valid_graphs.npz` | 151 | `indices`, per-molecule `adj_i`, `x_i`, `y_i` arrays |
| `data/graphs/test_graphs.npz` | 152 | `indices`, per-molecule `adj_i`, `x_i` arrays (labels hidden) |
For each molecule $i$, the archive stores a binary adjacency matrix $\mathbf{A}_i \in \{0, 1\}^{n_i \times n_i}$ and a node feature matrix $\mathbf{X}_i \in \mathbb{R}^{n_i \times 9}$, where $n_i$ is the number of atoms in molecule $i$.
Loading example (Python):

```python
import numpy as np

data = np.load('data/graphs/train_graphs.npz', allow_pickle=False)
indices = data['indices']  # molecule IDs
A = data['adj_2']          # (n, n) binary adjacency matrix
X = data['x_2']            # (n, 9) node feature matrix
y = data['y_2']            # label: 0 or 1
print(f"Molecule 2: {A.shape[0]} atoms, label = {y[0]}")
```

The same data is available via the OGB API (`data.edge_index`, `data.x`). See `data/graphs/README_graphs.md` for the full format specification.
Primary metric: Macro F1 Score — equally weights performance on both classes:

$$\text{Macro F1} = \frac{1}{2}\left(\text{F1}_0 + \text{F1}_1\right)$$

where $\text{F1}_c = \frac{2 \cdot P_c \cdot R_c}{P_c + R_c}$ is the harmonic mean of precision $P_c$ and recall $R_c$ for class $c$.

Why Macro F1? — Treats both classes equally regardless of sample size. Penalises poor performance on the minority class. Standard in molecular property prediction benchmarks.

Secondary metric: Efficiency Score — combines prediction quality with computational cost (see `competition/metrics.py` for the exact formula). Logarithmic scaling ensures diminishing-return hardware gains; squaring F1 heavily rewards prediction quality. The leaderboard shows both metrics.
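For local checks, Macro F1 matches scikit-learn's macro-averaged F1 (a quick sanity check, not the official scorer):

```python
from sklearn.metrics import f1_score

y_true = [0, 0, 1, 1, 0, 1]
y_pred = [0, 1, 1, 1, 0, 0]
print(f1_score(y_true, y_pred, average='macro'))  # mean of per-class F1 scores
```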
ENIGMA uses a layered security model to ensure fair, tamper-proof evaluation.
```text
┌──────────────┐  public_key.pem    ┌──────────────┐  GitHub Secret     ┌──────────────┐
│              │ ─────────────────▶ │              │ ─────────────────▶ │              │
│ Participant  │  encrypt.py        │  GitHub PR   │  RSA_PRIVATE_KEY   │  CI Runner   │
│ predictions  │  (RSA-2048, OAEP,  │ (.enc file)  │  decrypt.py        │ (plaintext)  │
│ (.csv)       │  SHA-256, chunked) │              │                    │              │
└──────────────┘                    └──────────────┘                    └──────────────┘
                                                                               │
                                                                         score + rank
                                                                               │
                                                                               ▼
                                                                      Public Leaderboard
```
How it works:

- Encrypt — You run `encryption/encrypt.py` with the public key. Your CSV is split into 190-byte chunks, each encrypted with OAEP/SHA-256 padding. The `.enc` file is unreadable without the private key.
- Submit — You open a Pull Request containing only the `.enc` file. Other participants cannot see your predictions.
- Decrypt — GitHub Actions injects the private key from a repository secret, decrypts, scores, and deletes the key — all within an ephemeral CI runner.
- Publish — Only the final score and rank appear on the public leaderboard.
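For intuition, the 190-byte chunk size follows from the RSA-OAEP plaintext limit (256 − 2·32 − 2 = 190 bytes for RSA-2048 with SHA-256). A minimal sketch of this scheme using the `cryptography` package — illustrative only; the canonical tool is `encryption/encrypt.py`:

```python
from cryptography.hazmat.primitives import hashes, serialization
from cryptography.hazmat.primitives.asymmetric import padding

CHUNK = 190  # max plaintext per RSA-2048/OAEP-SHA256 block: 256 - 2*32 - 2

with open('encryption/public_key.pem', 'rb') as f:
    pub = serialization.load_pem_public_key(f.read())

oaep = padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                    algorithm=hashes.SHA256(), label=None)

plaintext = open('predictions.csv', 'rb').read()
ciphertext = b''.join(pub.encrypt(plaintext[i:i + CHUNK], oaep)
                      for i in range(0, len(plaintext), CHUNK))

with open('predictions.enc', 'wb') as f:
    f.write(ciphertext)
```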
Test labels are never committed to this repository. During CI they are injected via:

- GitHub Secret `TEST_LABELS_CSV` (base64-encoded CSV) — preferred
- Private repository `enigma-private` (cloned with `PRIVATE_REPO_TOKEN`) — fallback
```bash
git clone https://github.com/muuki2/enigma.git
cd enigma

python -m venv venv
source venv/bin/activate        # On Windows: venv\Scripts\activate
pip install -r starter_code/requirements.txt

cd starter_code
python baseline.py --all        # run GCN, GIN, GraphSAGE
python baseline.py --model gcn  # or a specific model
```

This downloads OGB MolBACE, trains for 50 epochs, generates `submissions/{model}_submission.csv`, and reports validation F1.
| Model | Validation Macro F1 |
|---|---|
| GCN | 0.6153 |
| GIN | 0.6103 |
| GraphSAGE | 0.5835 |
```python
from ogb.graphproppred import PygGraphPropPredDataset

dataset = PygGraphPropPredDataset(name='ogbg-molbace')
graph = dataset[0]
print(f"Nodes: {graph.num_nodes}, Edges: {graph.num_edges}")
print(f"Node features: {graph.x.shape}, Label: {graph.y.item()}")
```

Submissions are plain CSV files with two columns:

```csv
id,y_pred
0,1
1,0
6,1
...
```

- `id`: molecule index from `data/public/test.csv`
- `y_pred`: binary prediction (0 or 1). The legacy column name `target` is still accepted.
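For example, predictions could be written in this format with pandas (a sketch; it assumes `data/public/test.csv` exposes an `id` column):

```python
import pandas as pd

test_ids = pd.read_csv('data/public/test.csv')['id']  # hypothetical column name
preds = [0] * len(test_ids)                           # replace with model outputs
pd.DataFrame({'id': test_ids, 'y_pred': preds}).to_csv('predictions.csv', index=False)
```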
Encrypt your predictions and open a PR:

```bash
python encryption/encrypt.py \
  submissions/inbox/my_team/run_01/predictions.csv \
  encryption/public_key.pem \
  submissions/inbox/my_team/run_01/predictions.enc

git add submissions/inbox/my_team/run_01/predictions.enc
git commit -m "Submission: My Team Name"
git push origin my-branch && gh pr create  # or open PR on GitHub
```

What happens after you submit:

- Decrypt — CI decrypts your `.enc` using the private key from GitHub Secrets
- Validate — `competition/validate_submission.py` checks format
- Score — `competition/evaluate.py` computes Macro F1 against hidden labels
- Comment — A bot comments on your PR with the score
- Leaderboard — Leaderboard and interactive board are updated automatically
Include `metadata.json` alongside your submission to appear with efficiency metrics:

```json
{
  "team_name": "alice",
  "model_name": "MyGNN",
  "submission_type": "human",
  "efficiency_metrics": {"inference_time_ms": 5.2, "total_params": 45000}
}
```

Use `evaluation/speed_benchmark.py` to measure these values. See `schema/submission_metadata.json` for the full schema.
```text
submissions/inbox/<team>/<run_id>/
├── predictions.enc   # Required (RSA-encrypted)
└── metadata.json     # Optional (efficiency + model info)
```
| Rank | Team | Macro-F1 | Efficiency | Params |
|---|---|---|---|---|
| 🥇 1 | Baseline-Spectral | 0.7215 | 0.6360 | 40.4K |
| 🥈 2 | Baseline-DMPNN | 0.6674 | 0.0833 | 53.6K |
| 🥉 3 | Baseline-GCN | 0.6153 | - | - |
The competition provides three baseline GNN architectures. Below are their message-passing formulations.
GCN (Kipf & Welling, 2017) performs spectral graph convolutions using a first-order approximation:

$$\mathbf{H}^{(l+1)} = \sigma\left(\tilde{\mathbf{D}}^{-1/2} \tilde{\mathbf{A}} \tilde{\mathbf{D}}^{-1/2} \mathbf{H}^{(l)} \mathbf{W}^{(l)}\right)$$

where $\tilde{\mathbf{A}} = \mathbf{A} + \mathbf{I}$ is the adjacency matrix with self-loops and $\tilde{\mathbf{D}}$ is its diagonal degree matrix.
GraphSAGE (Hamilton et al., 2017) learns to aggregate neighborhood features:

$$\mathbf{h}_v^{(l+1)} = \sigma\left(\mathbf{W}^{(l)} \left[\mathbf{h}_v^{(l)} \,\Vert\, \mathrm{AGG}\left(\{\mathbf{h}_u^{(l)} : u \in \mathcal{N}(v)\}\right)\right]\right)$$

where AGG can be mean, max-pool, or LSTM aggregation. Our baseline uses mean aggregation.
GIN (Xu et al., 2019) achieves maximal expressive power among message-passing GNNs:

$$\mathbf{h}_v^{(l+1)} = \mathrm{MLP}^{(l)}\left(\left(1 + \epsilon^{(l)}\right) \mathbf{h}_v^{(l)} + \sum_{u \in \mathcal{N}(v)} \mathbf{h}_u^{(l)}\right)$$

where $\epsilon^{(l)}$ is a learnable (or fixed) scalar.
All models use global mean pooling for graph-level prediction:

$$\mathbf{h}_G = \frac{1}{|V|} \sum_{v \in V} \mathbf{h}_v^{(L)}$$

followed by a linear classifier:

$$\hat{y} = \mathrm{softmax}\left(\mathbf{W} \mathbf{h}_G + \mathbf{b}\right)$$
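Putting these pieces together, a minimal PyTorch Geometric classifier in this style might look as follows (an illustrative sketch, not the actual `starter_code/baseline.py`; it assumes float node features, whereas OGB's categorical atom features would normally pass through an `AtomEncoder` first):

```python
import torch
import torch.nn.functional as F
from torch_geometric.nn import GCNConv, global_mean_pool

class MiniGCN(torch.nn.Module):
    """Two GCN layers -> global mean pooling -> linear classifier."""
    def __init__(self, in_dim=9, hidden=64, num_classes=2):
        super().__init__()
        self.conv1 = GCNConv(in_dim, hidden)
        self.conv2 = GCNConv(hidden, hidden)
        self.lin = torch.nn.Linear(hidden, num_classes)

    def forward(self, x, edge_index, batch):
        h = F.relu(self.conv1(x, edge_index))
        h = F.relu(self.conv2(h, edge_index))
        h_G = global_mean_pool(h, batch)  # graph-level representation
        return self.lin(h_G)              # class logits
```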
Beyond the baselines, we provide two advanced architectures with stronger mathematical foundations.
D-MPNN (Yang et al., 2019) is an edge-centric GNN designed for molecular graphs that prevents "message backflow" — a key limitation of standard MPNNs.
Message Passing (hidden states $\mathbf{h}_{vw}^{(t)}$ live on directed edges $v \to w$):

$$\mathbf{m}_{vw}^{(t+1)} = \sum_{u \in \mathcal{N}(v) \setminus \{w\}} \mathbf{h}_{uv}^{(t)}, \qquad \mathbf{h}_{vw}^{(t+1)} = \mathrm{ReLU}\left(\mathbf{h}_{vw}^{(0)} + \mathbf{W}\, \mathbf{m}_{vw}^{(t+1)}\right)$$

Excluding $u = w$ from the sum is what prevents a message from flowing straight back to its source.
Key Features:
- Messages flow along directed edges
- Prevents information from immediately flowing back to source
- Edge features are first-class citizens
- Particularly effective for molecular property prediction
Implementation: advanced_baselines/dmpnn.py
Our Spectral GNN operates in the graph frequency domain using Chebyshev polynomial approximations.
Chebyshev Convolution:

$$\mathbf{H}' = \sum_{k=0}^{K-1} \theta_k\, T_k(\tilde{\mathbf{L}})\, \mathbf{X}$$

where:

- $\tilde{\mathbf{L}} = \frac{2}{\lambda_{max}} \mathbf{L} - \mathbf{I}$ is the scaled Laplacian
- $T_k$ are Chebyshev polynomials: $T_0 = 1$, $T_1 = x$, $T_k = 2xT_{k-1} - T_{k-2}$
- $\theta_k$ are learnable spectral coefficients
Laplacian Regularization Loss:

We minimize the Dirichlet energy to encourage smoothness:

$$\mathcal{L}_{\mathrm{lap}} = \mathrm{tr}\left(\mathbf{H}^\top \mathbf{L}\, \mathbf{H}\right) = \frac{1}{2} \sum_{u,v} \mathbf{A}_{uv} \lVert \mathbf{h}_u - \mathbf{h}_v \rVert^2$$
Laplacian Positional Encodings:

Optional positional features from Laplacian eigenvectors:

$$\mathbf{L} = \mathbf{U} \boldsymbol{\Lambda} \mathbf{U}^\top$$

The first $k$ nontrivial eigenvectors (the columns of $\mathbf{U}$ with the smallest nonzero eigenvalues) are concatenated to the node features as positional encodings.
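A minimal NumPy sketch of computing these encodings on the dense adjacency format (illustrative only):

```python
import numpy as np

def laplacian_pe(A, k=4):
    """First k nontrivial Laplacian eigenvectors as positional encodings.
    A: (n, n) binary adjacency matrix, as stored in data/graphs/*.npz."""
    L = np.diag(A.sum(axis=1)) - A        # combinatorial Laplacian L = D - A
    eigvals, eigvecs = np.linalg.eigh(L)  # eigenvalues in ascending order
    return eigvecs[:, 1:k + 1]            # skip the trivial constant eigenvector
```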
Implementation: advanced_baselines/spectral_gnn.py
We evaluate submissions along multiple dimensions beyond raw accuracy.
Macro F1 Score is the primary ranking metric (see Evaluation Metrics).
Tracked via the efficiency formula above. We record:
- Inference time (ms per batch)
- Parameter count
- Memory usage
- FLOPs estimate
Use the profiler in evaluation/speed_benchmark.py to measure your model.
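As a rough illustration of what such a profiler records (a sketch; `evaluation/speed_benchmark.py` is the authoritative tool, and `quick_profile` is a hypothetical helper):

```python
import time
import torch

def quick_profile(model, batches, n_warmup=3):
    """Hypothetical helper: parameter count and mean inference time per batch."""
    n_params = sum(p.numel() for p in model.parameters())
    model.eval()
    times = []
    with torch.no_grad():
        for i, args in enumerate(batches):
            t0 = time.perf_counter()
            model(*args)
            if i >= n_warmup:  # discard warm-up iterations
                times.append((time.perf_counter() - t0) * 1000.0)
    return n_params, sum(times) / len(times)  # params, ms per batch
```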
Good models should know when they don't know. We provide tools to evaluate:
MC Dropout: Epistemic uncertainty via $T$ stochastic forward passes with dropout enabled:

$$\hat{p}(y \mid \mathbf{x}) \approx \frac{1}{T} \sum_{t=1}^{T} p\left(y \mid \mathbf{x}, \boldsymbol{\theta}_t\right)$$

Conformal Prediction: Distribution-free prediction sets with coverage guarantees:

$$C(\mathbf{x}) = \left\{ y : s(\mathbf{x}, y) \le \hat{q} \right\}, \qquad P\left(y \in C(\mathbf{x})\right) \ge 1 - \alpha$$

where $s$ is a nonconformity score and $\hat{q}$ is the $(1 - \alpha)$ quantile of the calibration scores.

Temperature Scaling: Post-hoc calibration via:

$$\hat{p} = \mathrm{softmax}\left(\mathbf{z} / T\right)$$

with temperature $T > 0$ fitted on the validation set.
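A minimal MC Dropout sketch under these definitions (illustrative; the provided tooling lives in `evaluation/uncertainty.py`, and the `model(*inputs)` signature is hypothetical):

```python
import torch

@torch.no_grad()
def mc_dropout_predict(model, inputs, T=30):
    """Mean and spread of class probabilities over T stochastic forward passes."""
    model.train()  # keep dropout layers active at inference time
    probs = torch.stack([torch.softmax(model(*inputs), dim=-1) for _ in range(T)])
    return probs.mean(dim=0), probs.std(dim=0)  # predictive mean, epistemic spread
```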
Metrics:
- Expected Calibration Error (ECE)
- Brier Score
- Empirical Coverage at 90%
Implementation: evaluation/uncertainty.py
We evaluate model robustness to graph perturbations:
Attack Types:
- Random Edge Perturbation: Add/remove random edges
- Gradient-Based Attack: Remove high-importance edges
- Feature Noise: Gaussian noise on node features
- Feature Masking: Zero out random features
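For instance, random edge perturbation on the dense adjacency format might look like this (a sketch; the evaluated attacks are implemented in `evaluation/adversarial.py`):

```python
import numpy as np

def random_edge_flip(A, n_flips=5, seed=0):
    """Flip n_flips random off-diagonal entries of a symmetric adjacency matrix."""
    rng = np.random.default_rng(seed)
    A = A.copy()
    n = A.shape[0]
    for _ in range(n_flips):
        i, j = rng.integers(n), rng.integers(n)
        if i != j:
            A[i, j] = A[j, i] = 1 - A[i, j]  # add or remove the edge
    return A
```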
Metrics:
- Robust Accuracy under attack
- Attack Success Rate (ASR)
Implementation: evaluation/adversarial.py
We visualize the accuracy-efficiency trade-off as a Pareto front (F1 vs. Efficiency).
A model is Pareto optimal if no other model is:
- Better in accuracy AND equally efficient, OR
- Equally accurate AND more efficient, OR
- Better in both
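A small sketch of flagging Pareto-optimal models under this definition (illustrative; the official plot comes from `visualization/pareto_plot.py`):

```python
import numpy as np

def pareto_mask(f1, efficiency):
    """True where no other model dominates (higher is better on both axes)."""
    pts = np.column_stack([f1, efficiency])
    mask = np.ones(len(pts), dtype=bool)
    for i, p in enumerate(pts):
        others = np.delete(pts, i, axis=0)
        # dominated: some other point is >= on both metrics and > on at least one
        dominated = np.any(np.all(others >= p, axis=1) & np.any(others > p, axis=1))
        mask[i] = not dominated
    return mask
```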
Hypervolume Indicator:

$$\mathrm{HV}(S) = \lambda\left(\bigcup_{\mathbf{s} \in S} [\mathbf{r}, \mathbf{s}]\right)$$

the volume (Lebesgue measure $\lambda$) of the objective-space region dominated by the solution set $S$ and bounded by a reference point $\mathbf{r}$. Higher hypervolume indicates better overall performance.
Visualization: visualization/pareto_plot.py
- GAT (Graph Attention Network) — attention-weighted message passing
- MPNN (Message Passing Neural Network) — edge-conditioned convolutions
- AttentiveFP — designed specifically for molecular property prediction
- D-MPNN — see our implementation in `advanced_baselines/dmpnn.py`
- Spectral GNN — see our implementation in `advanced_baselines/spectral_gnn.py`
- Ensemble methods — combine multiple architectures
- Class weighting — address class imbalance via weighted cross-entropy
- Focal loss — down-weight easy examples, focus on hard ones
- Laplacian regularization — encourage smooth representations (see Spectral GNN)
- Data augmentation — random edge dropping, node feature masking
- Different pooling — sum pooling, attention-based pooling, Set2Set
- Virtual nodes — add a global node connected to all atoms
- Positional encodings — Laplacian eigenvectors, random walk features
- Learning rate scheduling — cosine annealing, warm restarts
- Early stopping — monitor validation F1 to prevent overfitting
- Speed benchmark: `evaluation/speed_benchmark.py` — profile inference time
- Uncertainty: `evaluation/uncertainty.py` — MC Dropout, Conformal Prediction
- Adversarial: `evaluation/adversarial.py` — robustness testing
- Visualization: `visualization/pareto_plot.py` — Pareto front analysis
- PyTorch Geometric Documentation
- OGB Leaderboard for MolBACE
- Graph Neural Networks: A Review
- GraphSAGE Paper
- GIN Paper
```text
enigma/
├── competition/                  # Competition infrastructure
│   ├── config.yaml               # Single source of truth for all settings
│   ├── evaluate.py               # Scoring entry-point (CI)
│   ├── metrics.py                # Metric computation (Macro-F1, Efficiency)
│   ├── validate_submission.py    # Submission format validation
│   └── render_leaderboard.py     # Leaderboard renderer (Markdown + JS)
├── encryption/                   # RSA-2048 submission encryption
│   ├── encrypt.py                # Encrypt predictions (participant-facing)
│   ├── decrypt.py                # Decrypt submissions (CI-only)
│   └── public_key.pem            # RSA public key (safe to publish)
├── data/
│   ├── public/                   # Public data for participants
│   │   ├── train.csv, valid.csv, test.csv
│   ├── graphs/                   # Pre-computed A and X matrices (.npz)
│   │   ├── train_graphs.npz, valid_graphs.npz, test_graphs.npz
│   │   └── README_graphs.md
│   ├── mmp_split/                # MMP-OOD activity-cliff split
│   └── ogb/                      # OGB dataset (auto-downloaded)
├── submissions/inbox/            # Submit here: inbox/<team>/<run_id>/
├── leaderboard/                  # Authoritative CSV + auto-generated Markdown
├── docs/                         # GitHub Pages interactive leaderboard
├── starter_code/                 # Baseline GNNs (GCN, GIN, GraphSAGE)
├── advanced_baselines/           # D-MPNN + Spectral GNN
├── evaluation/                   # Speed, uncertainty, adversarial, MMP-OOD
├── visualization/                # Pareto front analysis
├── scripts/                      # Label generation, local tests, MMP evaluation
├── schema/                       # Submission metadata JSON schema
├── .github/workflows/            # CI: decrypt → validate → score → leaderboard
├── requirements.txt              # CI dependencies
└── README.md                     # This file
```
- No external data: Use only the provided OGB MolBACE dataset
- No pre-trained models: Train from scratch; pre-trained molecular embeddings are not allowed
- One submission per team: Each team may submit only once — make it count!
- One submission per PR: Each pull request should contain exactly one predictions file
- Code sharing encouraged: You may share code and ideas, but submit individually
- Fair play: Do not attempt to access test labels or exploit the evaluation system
- Submission privacy: All submissions must be encrypted using the provided RSA public key. Only final scores and ranks appear on the public leaderboard — private submissions must not be visible
- LLM usage restriction: Large Language Models must not be used to fully design the competition, including dataset creation, task definition, or evaluation logic. This competition's dataset (OGB MolBACE) was created by the academic community, and the evaluation logic was designed by the organizer independently
- Computational affordability: Full model training must not exceed 3 hours on CPU. The provided dataset (1,210 training molecules, ~30 atoms each) and baseline models (~40K–54K parameters) train in minutes on CPU. Participants should keep model complexity within this budget
- Kaggle-style ranking: Tied scores share the same rank on the leaderboard (min method). The next rank after a tie skips accordingly
Q: Can I use libraries other than PyTorch Geometric?
Yes. You can use DGL, Spektral, JAX, or any other framework. Ensure your final predictions follow the CSV format.
Q: How do I test locally before submitting?
Use the validation set to evaluate your model locally. Training labels are available via OGB; only test labels are hidden.
Q: Can I submit multiple times?
No. Each team is limited to one submission — make it count! If you need to correct an error, contact the organisers.
Q: How does the automated scoring work?
When you open a PR, GitHub Actions fetches the hidden test labels from a private repository, runs the scoring script, and comments on your PR with the result.
Q: When does the competition end?
This is an ongoing challenge. Top performers will be contacted for the research opportunity.
- Dataset: Open Graph Benchmark
- Original BACE data: MoleculeNet
If you use this challenge or the methods implemented here, please cite the following:
Open Graph Benchmark (OGB)

```bibtex
@article{hu2020ogb,
title={Open Graph Benchmark: Datasets for Machine Learning on Graphs},
author={Hu, Weihua and Fey, Matthias and Zitnik, Marinka and Dong, Yuxiao and Ren, Hongyu and Liu, Bowen and Catasta, Michele and Leskovec, Jure},
journal={Advances in Neural Information Processing Systems},
volume={33},
pages={22118--22133},
year={2020}
}
```

MoleculeNet

```bibtex
@article{wu2018moleculenet,
title={MoleculeNet: A Benchmark for Molecular Machine Learning},
author={Wu, Zhenqin and Ramsundar, Bharath and Feinberg, Evan N and Gomes, Joseph and Geniesse, Caleb and Pappu, Aneesh S and Leswing, Karl and Pande, Vijay},
journal={Chemical Science},
volume={9},
number={2},
pages={513--530},
year={2018},
publisher={Royal Society of Chemistry}
}
```

GraphSAGE

```bibtex
@inproceedings{hamilton2017inductive,
title={Inductive Representation Learning on Large Graphs},
author={Hamilton, William L and Ying, Rex and Leskovec, Jure},
booktitle={Advances in Neural Information Processing Systems},
volume={30},
year={2017}
}
```

Graph Convolutional Networks (GCN)

```bibtex
@inproceedings{kipf2017semi,
title={Semi-Supervised Classification with Graph Convolutional Networks},
author={Kipf, Thomas N and Welling, Max},
booktitle={International Conference on Learning Representations},
year={2017}
}
```

Graph Isomorphism Network (GIN)

```bibtex
@inproceedings{xu2019powerful,
title={How Powerful are Graph Neural Networks?},
author={Xu, Keyulu and Hu, Weihua and Leskovec, Jure and Jegelka, Stefanie},
booktitle={International Conference on Learning Representations},
year={2019}
}
```

Directed Message Passing Neural Network (D-MPNN)

```bibtex
@article{yang2019analyzing,
title={Analyzing Learned Molecular Representations for Property Prediction},
author={Yang, Kevin and Swanson, Kyle and Jin, Wengong and Coley, Connor and
Eiden, Philipp and Gao, Hua and Guzman-Perez, Angel and Hopper, Timothy and
Kelley, Brian and Mathea, Miriam and others},
journal={Journal of Chemical Information and Modeling},
volume={59},
number={8},
pages={3370--3388},
year={2019},
publisher={ACS Publications}
}
```

Spectral Graph Theory

```bibtex
@book{chung1997spectral,
title={Spectral Graph Theory},
author={Chung, Fan RK},
year={1997},
publisher={American Mathematical Society}
}
```

Chebyshev Spectral Convolutions

```bibtex
@inproceedings{defferrard2016convolutional,
title={Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering},
author={Defferrard, Micha{\"e}l and Bresson, Xavier and Vandergheynst, Pierre},
booktitle={Advances in Neural Information Processing Systems},
volume={29},
year={2016}
}
```

Conformal Prediction

```bibtex
@article{romano2020classification,
title={Classification with Valid and Adaptive Coverage},
author={Romano, Yaniv and Sesia, Matteo and Candes, Emmanuel},
journal={Advances in Neural Information Processing Systems},
volume={33},
pages={3581--3591},
year={2020}
}
```

PyTorch Geometric

```bibtex
@inproceedings{fey2019fast,
title={Fast Graph Representation Learning with PyTorch Geometric},
author={Fey, Matthias and Lenssen, Jan Eric},
booktitle={ICLR Workshop on Representation Learning on Graphs and Manifolds},
year={2019}
}
```

- Jure Leskovec (Stanford University) — Open Graph Benchmark, GraphSAGE
- Weihua Hu (Stanford University) — Open Graph Benchmark
- Zhenqin Wu and Vijay Pande (Stanford University) — MoleculeNet
- William L. Hamilton, Rex Ying, Jure Leskovec — GraphSAGE
- Thomas N. Kipf, Max Welling — Graph Convolutional Networks
- Keyulu Xu, Weihua Hu, Jure Leskovec, Stefanie Jegelka — Graph Isomorphism Network
- Matthias Fey, Jan Eric Lenssen — PyTorch Geometric
- Deep Graph Library (DGL) Team — DGL Framework
- BASIRA Lab — Research collaboration and support
- Prof. Islem Rekik (Imperial College London) — Mentorship and guidance
- Murat Kolic — Sarajevo, Bosnia and Herzegovina
For questions or issues, please open a GitHub Issue.
- Organizer: Murat Kolic (@muuki2)
- Affiliation: BASIRA Lab
- Location: Sarajevo, Bosnia and Herzegovina
ENIGMA — Encrypted Neural Inference on Graphs for Molecular Analysis