Welcome to PROVEN-GNN (PROgram Vulnerability Examination Network using Graph Neural Networks), a competition designed to compare human-built Graph Neural Network (GNN) solutions against Large Language Model (LLM)-based approaches.
The objective is to classify code functions as vulnerable or non-vulnerable using graph representations of source code.
| Aspect | Details |
|---|---|
| Task Type | Binary Graph Classification |
| Evaluation Metric | Macro F1-Score |
| Dataset | Inspired by DiverseVul |
- Original Dataset: DiverseVul Dataset
- Citation: Chen, Yizheng, et al. "DiverseVul: A New Vulnerable Source Code Dataset for Deep Learning-Based Vulnerability Detection." Proceedings of the 26th International Symposium on Research in Attacks, Intrusions and Defenses (RAID), 2023.
The dataset was adapted by generating graph representations for a subset of code functions. It includes 2,487 training graphs and 622 test graphs. The class distribution is imbalanced, with vulnerable code representing only 29% of the data.
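Since only 29% of graphs are vulnerable, a model trained with an unweighted loss can drift toward the majority class, which macro F1 penalizes. A minimal sketch of inverse-frequency class weighting (NumPy only; `balanced_class_weights` is an illustrative helper, not part of the starter code):

```python
import numpy as np

def balanced_class_weights(labels):
    """Inverse-frequency weights: w_c = N / (n_classes * N_c)."""
    labels = np.asarray(labels)
    classes, counts = np.unique(labels, return_counts=True)
    weights = labels.size / (classes.size * counts)
    return dict(zip(classes.tolist(), weights.tolist()))

# With the competition's ~29% vulnerable split:
labels = np.array([1] * 29 + [0] * 71)
print(balanced_class_weights(labels))  # ≈ {0: 0.70, 1: 1.72}
```

The resulting weights can be passed to a weighted cross-entropy loss so that errors on the vulnerable class cost proportionally more.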
We used Joern to construct Code Property Graphs (CPGs), which combine:
- Abstract Syntax Tree (AST)
- Control Flow Graph (CFG)
- Program Dependence Graph (PDG)
- Control Dependence Graph (CDG)
- Data Dependence Graph (DDG)
The following figure shows an example of a generated CPG.

Each node carries two groups of features:

- **Node Type** (15 features): encodes the type of node (e.g., method, function call, variable declaration).
- **Code Embedding** (512 features): dense embedding representing the code snippet associated with the node.

Edge features represent the type of relationship between nodes (e.g., AST, CFG, PDG, CDG, DDG).
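As a sketch of how these per-node features might be assembled into a single feature matrix, assuming the two blocks are simply concatenated (the helper name and layout are illustrative, not taken from the starter code):

```python
import numpy as np

NUM_NODE_TYPES = 15   # one-hot node-type block (per the description above)
EMBED_DIM = 512       # code-embedding block

def build_node_features(type_ids, embeddings):
    """Concatenate a one-hot node-type block with the code embedding,
    giving 15 + 512 = 527 features per node."""
    type_ids = np.asarray(type_ids)
    one_hot = np.eye(NUM_NODE_TYPES, dtype=np.float32)[type_ids]
    return np.concatenate([one_hot, np.asarray(embeddings, np.float32)], axis=1)

# Three nodes of types 0, 3, and 7 with random code embeddings:
x = build_node_features([0, 3, 7], np.random.rand(3, EMBED_DIM))
print(x.shape)  # (3, 527)
```

A matrix of this shape, together with an edge index and per-edge relation types, maps directly onto the graph objects expected by common GNN libraries.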
```
PROVEN-GNN
├── data/
│   └── public/
│       ├── train_data.csv (to be downloaded)
│       ├── test_data.csv (to be downloaded)
│       ├── test_ids.csv
│       ├── sample_submission.csv
│       └── README.md
├── competition/
│   ├── config.yaml
│   ├── update_leaderboard.py
│   ├── evaluate.py
│   ├── render_leaderboard.py
│   └── validate_submission.py
├── submissions/ (place submission files here)
│   └── README.md
├── leaderboard/
│   ├── leaderboard.csv
│   └── leaderboard.md
├── docs/
│   ├── leaderboard.html
│   ├── leaderboard.css
│   └── leaderboard.js
└── .github/workflows/
    ├── score_submission.yml
    └── publish_leaderboard.yml
```
Clone the repository and install the dependencies:

```bash
git clone https://github.com/abdksm/PROVEN-GNN.git
cd PROVEN-GNN
pip install -r starter_code/requirements.txt
```

Download the data:

```bash
cd data/public
pip install gdown
gdown --id 1kUNwo7WjVpJ2D1GPsotiNO5FJnCqt--9 -O train_data.parquet
gdown --id 1xhg62LTAJm5ityBKiXKv8Rsg0eSl9TJC -O test_data.parquet
```

Run the baseline:

```bash
cd ../../starter_code
python baseline.py
```

Your `predictions.csv` must follow this format:

```
id,y_pred
0,1
1,0
2,1
...
```

Each submission must also include a `metadata.json` file:

```json
{
  "team": "example_team",
  "run_id": "example_run_id",
  "type": "human",
  "model": "GAT",
  "notes": "Additional notes"
}
```

`type` must be one of:

- `"human"`
- `"llm-only"`
- `"human+llm"`
You should be in the root directory, and the file `predictions.csv` should be inside the `submissions/` directory:

```bash
# For Windows users
.\extra\encrypt_win.ps1 submissions\predictions.csv

# For Linux users
bash extra/encrypt_linux.sh submissions/predictions.csv
```

To submit:

- Fork this repository
- Train your model and generate predictions
- Add `predictions.csv` and `metadata.json` to the `submissions/` directory
- Run the encryption script (a critical step!)
- Create a Pull Request
- GitHub Actions automatically evaluates your submission
- Results are posted as a comment and added to the leaderboard
The evaluation metric is the Macro F1-Score.
Rankings are sorted by descending score.
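For reference, macro F1 is the unweighted mean of the per-class F1 scores, which for two classes is equivalent to scikit-learn's `f1_score(y_true, y_pred, average="macro")`. A dependency-free sketch:

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over the two classes (0 and 1)."""
    scores = []
    for c in (0, 1):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        scores.append(f1)
    return sum(scores) / len(scores)

print(round(macro_f1([0, 0, 1, 1, 0], [0, 1, 1, 0, 0]), 4))  # 0.5833
```

Because each class contributes equally regardless of its frequency, always predicting the majority class scores poorly under this metric.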
After submission, scores are automatically added to:
- `leaderboard/leaderboard.csv`
- `leaderboard/leaderboard.md`
An interactive leaderboard is available here:
Live Leaderboard
- ❌ No external or private datasets
- ❌ No manual labeling of test data
- ❌ No modification of evaluation scripts
- ✅ Unlimited offline training is allowed

⚠️ Only one submission per user is allowed. Violations may result in disqualification.
```bibtex
@misc{proven_gnn_2025,
  title={PROVEN-GNN: Program Vulnerability Examination Network},
  author={Abderrahmane Kasmi},
  year={2025},
  url={https://github.com/abdksm/PROVEN-GNN}
}
```

This project is licensed under the MIT License. See the LICENSE file for details.
