Proposal: Add comprehensive Link Prediction tutorial to documentation.

1. The Setup & Data Splitting
I would start by loading a standard dataset, such as a small citation network from MLDatasets.jl.
The Code Goal: Use rand_edge_split(g, 0.1) to hold out 10% of the edges for testing.
The Tutorial Explanation: Explain why we split edges (e.g., "To evaluate our model, we pretend some edges don't exist during training. The model must learn to predict those hidden edges from the remaining graph structure.").
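A minimal sketch of the split, assuming GraphNeuralNetworks.jl is installed. A random directed graph from rand_graph stands in for the citation dataset so the snippet needs no download; the sizes and the bidirected = false setting are assumptions for illustration. As I read the docstring, the fraction passed to rand_edge_split applies to the first returned graph, so with 0.1 the first graph is the held-out test set. That is worth double-checking against the installed version.

```julia
using GraphNeuralNetworks
using Random

Random.seed!(42)

# Stand-in for the citation network: 100 nodes, 400 directed edges (illustrative sizes).
g = rand_graph(100, 400; bidirected = false)

# The first returned graph receives the given fraction of the edges,
# so with 0.1 it plays the role of the held-out test graph.
test_g, train_g = rand_edge_split(g, 0.1; bidirected = false)

# Both graphs keep every node; only the edges are partitioned.
println("train edges: ", train_g.num_edges, ", test edges: ", test_g.num_edges)
```

For an undirected dataset, the bidirected keyword would presumably be left at its default so that an edge and its reverse end up in the same split.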
2. Negative Sampling 
I would need to generate "fake" (negative) edges: pairs of nodes that are not connected in the graph.
The Code Goal: Write a small function that randomly samples pairs of nodes that do not currently have an edge between them.
The Tutorial Explanation: Explain that for every "positive" edge (label 1) in the training set, we need a "negative" edge (label 0) so that the loss sees both classes.
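The sampling function could look roughly like the sketch below, written in plain Julia with rejection sampling. The name sample_negative_edges and its arguments are hypothetical, and edges are treated as directed pairs; an undirected graph would also need to reject the reversed pair.

```julia
using Random

# Hypothetical helper (not a library function): draw `k` node pairs that are
# not edges of the graph given by source/target index vectors `s` and `t`.
function sample_negative_edges(rng, s, t, num_nodes, k)
    existing = Set(zip(s, t))        # fast membership test for real edges
    neg = Set{Tuple{Int,Int}}()      # a Set also prevents duplicate negatives
    while length(neg) < k
        u, v = rand(rng, 1:num_nodes), rand(rng, 1:num_nodes)
        # Reject self-loops and pairs that are already connected.
        if u != v && (u, v) ∉ existing
            push!(neg, (u, v))
        end
    end
    pairs = collect(neg)
    return first.(pairs), last.(pairs)
end

rng = MersenneTwister(0)
s = [1, 2, 3, 4]    # toy directed edges 1→2, 2→3, 3→4, 4→1
t = [2, 3, 4, 1]
neg_s, neg_t = sample_negative_edges(rng, s, t, 4, length(s))
```

The loop only terminates if at least k unconnected pairs exist, which holds for sparse graphs such as citation networks. GraphNeuralNetworks.jl also ships a negative_sample function that the tutorial could mention as the built-in alternative.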
3. The Encoder-Decoder Model
Link prediction usually uses a two-part architecture. I would use Flux.jl for this.
The Encoder (GraphNeuralNetworks.jl): A standard 2-layer graph neural network built from GCNConv (Graph Convolutional Network) or SAGEConv (GraphSAGE) layers. It takes the node features and the training graph and outputs node embeddings.
The Decoder (Standard Julia/Flux): A function that takes the embeddings of two nodes (a source and a target) and computes their dot product. The dot product is treated as a logit: the higher it is, the more confident the model is that an edge exists.
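A sketch of the two parts, assuming Flux.jl and GraphNeuralNetworks.jl are installed. The layer sizes, the random graph, and the random features are illustrative assumptions, and dot_decoder is a hypothetical helper name rather than a library function.

```julia
using Flux
using GraphNeuralNetworks

# Encoder: two GCN layers mapping node features to embeddings.
# Sizes (16 input features, 32 hidden units, 8-dim embeddings) are illustrative.
encoder = GNNChain(GCNConv(16 => 32, relu),
                   GCNConv(32 => 8))

# Decoder: dot product of source and target embeddings for each candidate edge.
# `z` is an (embedding_dim × num_nodes) matrix; `s` and `t` are node index vectors.
dot_decoder(z, s, t) = vec(sum(z[:, s] .* z[:, t]; dims = 1))

g = rand_graph(10, 30)           # random stand-in graph
x = rand(Float32, 16, 10)        # random stand-in node features (features × nodes)
z = encoder(g, x)                # 8 × 10 embedding matrix
s, t = edge_index(g)
logits = dot_decoder(z, s, t)    # one raw score (logit) per edge
```

The decoder has no trainable parameters, so all learning happens in the encoder. Returning raw logits rather than probabilities keeps it compatible with a logit-based loss.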
4. The Loss Function & Training Loop
The Loss: Because we are predicting 1 (edge) or 0 (no edge), we use Binary Cross-Entropy with Logits (Flux.logitbinarycrossentropy).
The Loop:
  1. Pass the training graph through the Encoder to get node embeddings.
  2. Pass the positive and negative edge indices to the Decoder to get predictions.
  3. Calculate the loss against the true labels (1s for real edges, 0s for fake edges).
  4. Update the GNN weights.
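The four steps above might be sketched as follows, assuming a recent Flux.jl (with Flux.setup and Flux.withgradient) and GraphNeuralNetworks.jl. The random graph, random features, layer sizes, learning rate, and epoch count are all placeholder assumptions, and the negatives are drawn naively here to keep the snippet short.

```julia
using Flux
using GraphNeuralNetworks
using Random

Random.seed!(0)

# Stand-in data: a random directed graph and random features (illustrative only).
num_nodes, num_feats = 50, 16
train_g = rand_graph(num_nodes, 200; bidirected = false)
x = rand(Float32, num_feats, num_nodes)

model = GNNChain(GCNConv(num_feats => 32, relu), GCNConv(32 => 16))
decode(z, s, t) = vec(sum(z[:, s] .* z[:, t]; dims = 1))

pos_s, pos_t = edge_index(train_g)
# Naive negatives: uniformly random pairs. A few may collide with real edges
# or be self-loops; a real tutorial would filter those out.
neg_s = rand(1:num_nodes, length(pos_s))
neg_t = rand(1:num_nodes, length(pos_s))

# Labels: 1 for real edges, 0 for sampled non-edges.
labels = vcat(ones(Float32, length(pos_s)), zeros(Float32, length(neg_s)))

function link_loss(m)
    z = m(train_g, x)                                   # step 1: encode
    logits = vcat(decode(z, pos_s, pos_t),              # step 2: decode
                  decode(z, neg_s, neg_t))
    return Flux.logitbinarycrossentropy(logits, labels) # step 3: loss
end

opt_state = Flux.setup(Adam(1.0f-2), model)
losses = Float32[]
for epoch in 1:50
    loss, grads = Flux.withgradient(link_loss, model)
    Flux.update!(opt_state, model, grads[1])            # step 4: update weights
    push!(losses, loss)
end
```

Resampling the negative edges at every epoch, instead of fixing them once as done here, is a common variation that the tutorial could discuss.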
5. Evaluation
I would calculate the Area Under the ROC Curve (AUC-ROC) on the test edges held out in Step 1, scored alongside sampled negative edges, since AUC needs examples of both classes.
Would this be a suitable example?
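For the evaluation step, AUC-ROC can be computed without extra packages by using its pairwise interpretation: the probability that a randomly chosen real edge receives a higher score than a randomly chosen non-edge. The function name auc_roc and the scores below are made up for illustration; this is a sketch rather than an optimized implementation.

```julia
# AUC-ROC as the fraction of (positive, negative) pairs in which the positive
# edge scores higher; ties count as one half. O(P*N), fine for a sketch.
function auc_roc(pos_scores, neg_scores)
    total = 0.0
    for p in pos_scores, n in neg_scores
        total += p > n ? 1.0 : (p == n ? 0.5 : 0.0)
    end
    return total / (length(pos_scores) * length(neg_scores))
end

pos_scores = [2.1, 0.3, 1.5]     # made-up decoder outputs for held-out real edges
neg_scores = [-0.7, 0.9, -1.2]   # made-up decoder outputs for sampled non-edges
auc = auc_roc(pos_scores, neg_scores)
println("AUC-ROC: ", round(auc; digits = 3))
```

In this toy case 8 of the 9 pairs are ranked correctly, so the AUC is 8/9. A value of 0.5 corresponds to random ranking and 1.0 to a perfect one.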

Proposal: Add comprehensive Link Prediction tutorial to documentation. #669
