We used a policy gradient method with multi-head MLP and RNN policy networks to optimize the hyperparameters of an MLP architecture. We obtained test losses very similar to those of the baseline model while optimizing 4 hyperparameters: learning rate, hidden size, weight decay, and batch size. The optimized tasks are regression and classification on tabular data: the Wine dataset from UCI and the Letter Recognition multi-class classification task. More experiments could be done on other data modalities and architectures, other hyperparameters could be added in the future, and a CNN architecture could be experimented with given more compute resources.
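For orientation, here is a minimal sketch of the underlying idea, not the repo's exact code: a multi-head MLP policy emits one categorical distribution per hyperparameter, a configuration is sampled, the reward is the (negated) validation loss of the child model, and the policy is updated with REINFORCE. All names (`PolicyMLP`, `evaluate_config`, `CHOICES`) and the candidate values are illustrative assumptions, and `evaluate_config` uses a toy reward in place of actually training the child network.

```python
# Hedged sketch of policy-gradient hyperparameter search (REINFORCE);
# names and candidate values are illustrative assumptions.
import torch
import torch.nn as nn

# Discretized search space: one list of candidate values per hyperparameter.
CHOICES = {
    "learning_rate": [1e-4, 1e-3, 1e-2],
    "hidden_size":   [32, 64, 128],
    "weight_decay":  [0.0, 1e-5, 1e-4],
    "batch_size":    [16, 32, 64],
}

class PolicyMLP(nn.Module):
    """Multi-head MLP: a shared trunk and one categorical head per hyperparameter."""
    def __init__(self, hidden=32):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(1, hidden), nn.ReLU())
        self.heads = nn.ModuleDict(
            {name: nn.Linear(hidden, len(vals)) for name, vals in CHOICES.items()}
        )

    def forward(self, x):
        h = self.trunk(x)
        return {name: torch.distributions.Categorical(logits=head(h))
                for name, head in self.heads.items()}

def evaluate_config(config):
    """Stand-in for the expensive step: build and train the child network with
    `config`, then return e.g. minus the validation loss.
    Toy reward here, favouring one configuration, for illustration only."""
    return float(config["learning_rate"] == 1e-3) + float(config["hidden_size"] == 64)

policy = PolicyMLP()
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-2)
baseline = 0.0  # running-mean baseline to reduce gradient variance

for step in range(100):
    dists = policy(torch.zeros(1, 1))                  # fixed dummy input
    actions = {k: d.sample() for k, d in dists.items()}
    config = {k: CHOICES[k][a.item()] for k, a in actions.items()}
    reward = evaluate_config(config)
    baseline = 0.9 * baseline + 0.1 * reward
    log_prob = sum(dists[k].log_prob(a) for k, a in actions.items())
    loss = -(reward - baseline) * log_prob.mean()      # REINFORCE objective
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

An RNN policy, as listed in the modules below, would typically sample the choices sequentially instead, conditioning each hyperparameter on the previously sampled ones.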
Major modules implemented in the code
- Environment class
- Multi-head MLP policy network
- RNN policy network
- Building the neural architecture
- Baseline using the grid search method
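As a rough illustration of how the Environment class ties these pieces together, below is a hedged sketch with assumed names and simplified internals, not the repo's actual implementation: one environment step builds a child MLP from a sampled configuration, trains it briefly, and returns minus the validation loss as the reward.

```python
# Hedged sketch of the Environment's role (assumed names, simplified internals;
# not the repo's actual implementation).
import torch
import torch.nn as nn

class Environment:
    """Wraps the data and the child-model training loop. One step = build an
    MLP from the sampled hyperparameters, train it briefly, and return the
    reward (minus the validation loss)."""

    def __init__(self, X_train, y_train, X_val, y_val):
        self.X_train, self.y_train = X_train, y_train
        self.X_val, self.y_val = X_val, y_val

    def step(self, config):
        # Build the child architecture from the sampled hyperparameters.
        model = nn.Sequential(
            nn.Linear(self.X_train.shape[1], config["hidden_size"]),
            nn.ReLU(),
            nn.Linear(config["hidden_size"], 1),
        )
        opt = torch.optim.Adam(model.parameters(),
                               lr=config["learning_rate"],
                               weight_decay=config["weight_decay"])
        loss_fn = nn.MSELoss()
        # Full-batch training for brevity; the sampled batch_size would
        # normally drive minibatch loading.
        for _ in range(20):
            opt.zero_grad()
            loss_fn(model(self.X_train), self.y_train).backward()
            opt.step()
        with torch.no_grad():
            val_loss = loss_fn(model(self.X_val), self.y_val).item()
        return -val_loss  # reward: higher is better

# Toy usage with random regression data:
X, y = torch.randn(64, 8), torch.randn(64, 1)
env = Environment(X[:48], y[:48], X[48:], y[48:])
print(env.step({"learning_rate": 1e-3, "hidden_size": 32, "weight_decay": 0.0}))
```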
Article reference for more details
- There is no need to perform any preprocessing of the data beforehand, except for label encoding of the target and scaling, which are already done in the code.
- Add the URL of the data and the path to the saved model in the config file (a hypothetical sketch follows this list).
- Results are saved in the results file, where you can visualize them later.
- You can run the baseline model to compare results.
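As a hypothetical illustration of the config entries mentioned above (the actual variable names in src/config.py may differ):

```python
# Hypothetical sketch of entries in src/config.py; actual names may differ.
DATA_URL = "https://archive.ics.uci.edu/ml/machine-learning-databases/wine/wine.data"  # UCI Wine
MODEL_PATH = "results/saved_model.pt"   # where the trained model is saved/loaded
TASK = "regression"                     # or "classification"
```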
Below is how to set up the main function and get results.
git clone https://github.com/AMNAALMGLY/HypOptRL.git
pip3 install -r requirements.txt

Then go to src > config.py to set the configuration.
E.g., to train the regression task with the MLP policy and the MLP neural architecture, run:
python main.py --task regression --model_type MLP --policy MLP

To run the baseline for comparison:

python -m src.baseline

Future work:
- Evaluate on more datasets, hyperparameters, and architectures
- Experiment with longer training (more epochs)
- Experiment with the actor-critic (A2C) algorithm (see the sketch after this list)
- Improve documentation
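On the A2C item, a hedged sketch of the core change (all names and shapes are assumptions, not the repo's code): the running-mean baseline of REINFORCE is replaced by a learned critic, and the advantage (reward minus predicted value) scales the policy gradient.

```python
# Hedged illustration of the A2C idea: a learned critic replaces the
# running-mean baseline. Names and shapes are assumptions, not the repo's code.
import torch
import torch.nn as nn

value_head = nn.Linear(32, 1)  # critic on top of the policy's trunk features

def a2c_loss(log_prob, reward, features):
    value = value_head(features).squeeze(-1)
    advantage = reward - value                    # critic-based baseline
    policy_loss = -(advantage.detach() * log_prob).mean()
    value_loss = advantage.pow(2).mean()          # regress critic toward reward
    return policy_loss + 0.5 * value_loss

# Toy call with random tensors, just to show the shapes:
print(a2c_loss(torch.randn(4), torch.randn(4), torch.randn(4, 32)))
```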
(names in alphabetical order)