The goal of this project is to perform model selection and evaluation for a research paper, as part of the Model Selection for Large Scale Learning course at Grenoble INP - Ensimag, year 2021/2022.
This directory is for simulating fair and robust sample selection on the synthetic dataset. The program needs PyTorch, Jupyter Notebook, and CUDA.
The directory contains a total of 6 files and 2 child directories:
- this README
- 4 python files:
  - FairRobustSampler.py defines the FairRobust sampler and a PyTorch dataset for sensitive data
  - models.py contains logistic regression and SVM architectures, a test function and a plotting function
  - utils.py contains utility functions for data generation
  - main.py
- a report as a Jupyter notebook
- 2 child directories:
  - synthetic_data contains 11 numpy files for synthetic data. The synthetic data is composed of a training set, a validation set, and a test set.
  - datasets contains an xls file, related to the real credit card clients dataset
To simulate the algorithm, please use the jupyter notebook, which contains detailed instructions, or main.py.
The jupyter notebook will load the data and train the models with two different fairness metrics: equalized odds and demographic parity.
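For readers unfamiliar with the two fairness metrics, the sketch below shows one common way to quantify them as gaps between sensitive groups. It is an illustration only: the function names `demographic_parity_gap` and `equalized_odds_gap` are hypothetical and are not taken from this repository, and the exact disparity definitions used in the notebook may differ.

```python
import numpy as np

def demographic_parity_gap(y_pred, z):
    """Largest difference in positive-prediction rate between sensitive groups."""
    y_pred, z = np.asarray(y_pred), np.asarray(z)
    rates = [y_pred[z == g].mean() for g in np.unique(z)]
    return float(max(rates) - min(rates))

def equalized_odds_gap(y_true, y_pred, z):
    """Same gap as above, computed separately for each true label; the worst case is returned."""
    y_true, y_pred, z = (np.asarray(a) for a in (y_true, y_pred, z))
    gaps = []
    for label in np.unique(y_true):
        same_label = y_true == label
        # Positive-prediction rate per group, restricted to samples with this true label.
        rates = [
            y_pred[same_label & (z == g)].mean()
            for g in np.unique(z)
            if np.any(same_label & (z == g))
        ]
        gaps.append(max(rates) - min(rates))
    return float(max(gaps))

# Toy data (made up for illustration): binary labels, predictions, and a binary sensitive attribute.
y_true = np.array([1, 1, 0, 0, 1, 1, 0, 0])
y_pred = np.array([1, 1, 1, 0, 1, 0, 0, 0])
z = np.array([0, 0, 0, 0, 1, 1, 1, 1])

dp_gap = demographic_parity_gap(y_pred, z)
eo_gap = equalized_odds_gap(y_true, y_pred, z)
print("demographic parity gap:", dp_gap)
print("equalized odds gap:", eo_gap)
```

A gap of 0 means the groups are treated identically under that metric; larger values indicate more disparity.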
Each training run uses the FairRobust sampler described in the paper: the PyTorch dataloader serves batches to the model through this sampler. After training, the test accuracy and fairness are shown.
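As a rough sketch of how a PyTorch dataloader can serve batches through a custom sampler, the example below passes a batch sampler to `DataLoader` via its `batch_sampler` argument. The classes `SensitiveDataset` and `RandomBatchSampler` are hypothetical stand-ins, not the code in FairRobustSampler.py: the placeholder sampler simply draws random indices, whereas the real sampler selects samples according to the algorithm in the paper.

```python
import torch
from torch.utils.data import DataLoader, Dataset, Sampler

class SensitiveDataset(Dataset):
    """Hypothetical dataset returning (features, label, sensitive attribute)."""
    def __init__(self, x, y, z):
        self.x, self.y, self.z = x, y, z

    def __len__(self):
        return len(self.y)

    def __getitem__(self, idx):
        return self.x[idx], self.y[idx], self.z[idx]

class RandomBatchSampler(Sampler):
    """Placeholder batch sampler: yields lists of random indices.

    This is NOT the FairRobust selection rule; it only shows where
    such a rule would plug into the dataloader.
    """
    def __init__(self, n_samples, batch_size, n_batches, seed=0):
        self.n_samples = n_samples
        self.batch_size = batch_size
        self.n_batches = n_batches
        self.generator = torch.Generator().manual_seed(seed)

    def __iter__(self):
        for _ in range(self.n_batches):
            perm = torch.randperm(self.n_samples, generator=self.generator)
            yield perm[: self.batch_size].tolist()

    def __len__(self):
        return self.n_batches

# Made-up data: 100 samples with 2 features, binary labels and sensitive attribute.
x = torch.randn(100, 2)
y = torch.randint(0, 2, (100,))
z = torch.randint(0, 2, (100,))

dataset = SensitiveDataset(x, y, z)
sampler = RandomBatchSampler(n_samples=len(dataset), batch_size=16, n_batches=5)
loader = DataLoader(dataset, batch_sampler=sampler)

batches = list(loader)
for xb, yb, zb in batches:
    print(xb.shape, yb.shape, zb.shape)
```

Because the sampler decides which indices form each batch, the training loop itself does not need to change when the selection rule does.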