Switch to AdamW optimizer and expose dropout / weight_decay parameters

 ## Problem
 
 `ChemPropLightningModule` currently uses `torch.optim.Adam` with no weight decay, and the FFN head has no dropout. With only ~4,000–12,000 training records, this risks overfitting during fine-tuning, particularly in the encoder parameters.
 
 ## Proposed Solution
 
 1. Switch to AdamW:
 ```python
 # moal/model.py — configure_optimizers()
 optimizer = torch.optim.AdamW(
     [
         {"params": encoder_params, "lr": self.lr_encoder, "weight_decay": self.weight_decay_encoder},
         {"params": head_params,    "lr": self.lr_head,    "weight_decay": self.weight_decay_head},
     ]
 )
```
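The snippet above does not show where `encoder_params` and `head_params` come from. One possible way to build them is to split `named_parameters()` by name prefix. The sketch below is a minimal illustration, not the project's implementation: `TinyModel`, `build_optimizer`, the `"encoder."` prefix, and the learning-rate defaults are all hypothetical stand-ins, and only the weight-decay defaults come from this issue.

```python
import torch
from torch import nn


class TinyModel(nn.Module):
    """Hypothetical stand-in for the real module: an encoder plus an FFN head."""

    def __init__(self, dropout: float = 0.0):
        super().__init__()
        self.encoder = nn.Linear(16, 8)  # placeholder for the pretrained MPNN
        self.head = nn.Sequential(nn.Dropout(dropout), nn.Linear(8, 1))


def build_optimizer(
    model: nn.Module,
    lr_encoder: float = 1e-5,  # assumed value, not from the issue
    lr_head: float = 1e-3,  # assumed value, not from the issue
    weight_decay_encoder: float = 1e-5,
    weight_decay_head: float = 1e-4,
) -> torch.optim.AdamW:
    # Split trainable parameters by name prefix. "encoder." is an assumed
    # attribute name and must match the real module layout.
    encoder_params, head_params = [], []
    for name, param in model.named_parameters():
        if not param.requires_grad:
            continue  # skip frozen parameters
        if name.startswith("encoder."):
            encoder_params.append(param)
        else:
            head_params.append(param)

    # One param group per component, each with its own lr and weight decay.
    return torch.optim.AdamW(
        [
            {"params": encoder_params, "lr": lr_encoder, "weight_decay": weight_decay_encoder},
            {"params": head_params, "lr": lr_head, "weight_decay": weight_decay_head},
        ]
    )


optimizer = build_optimizer(TinyModel(dropout=0.1))
for group in optimizer.param_groups:
    print(len(group["params"]), group["lr"], group["weight_decay"])
```

Inspecting `optimizer.param_groups` this way is a cheap check that every parameter landed in the intended group before starting a training run.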
2. Expose constructor parameters:
```python
 class ChemPropLightningModule(pl.LightningModule):
     def __init__(
         self,
         ...
         dropout: float = 0.0,
         weight_decay_encoder: float = 1e-5,
         weight_decay_head: float = 1e-4,
     ):
```

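3. Pass `dropout` to the ChemProp MPNN (the chemprop API already supports it).

Existing call sites keep working because all three parameters have defaults. `dropout=0.0` leaves the model unchanged. The weight-decay defaults (`1e-5` for the encoder, `1e-4` for the head) are conservative but nonzero, so training will differ slightly from the current `Adam` setup; setting both to `0.0` reproduces it.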

## Files

 - moal/model.py

## Notes

Weight decay for the encoder should be smaller than for the head (or zero) to avoid disrupting the pretrained CheMeleon features.
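
For reference, the decoupled update that AdamW applies can be written per parameter group as below. This is the standard AdamW rule; the group index $g$ (encoder or head) is notation introduced here, with $\eta_g$ the group learning rate, $\lambda_g$ the group weight decay, and $\hat m_t$, $\hat v_t$ the bias-corrected Adam moment estimates.

```math
\theta_t = \theta_{t-1} - \eta_g \left( \frac{\hat m_t}{\sqrt{\hat v_t} + \epsilon} + \lambda_g \, \theta_{t-1} \right)
```

The decay term shrinks each weight by a factor of $(1 - \eta_g \lambda_g)$ per step, so the effective shrinkage depends on the group's learning rate as well as its `weight_decay`. An encoder group with a small learning rate is therefore decayed more gently than its `weight_decay` value alone suggests.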
