Switch to AdamW optimizer and expose dropout / weight_decay parameters #19

@smcolby

Problem

ChemPropLightningModule currently uses torch.optim.Adam with no weight decay, and the FFN head has no dropout. For datasets of roughly 4,000–12,000 training records, this risks overfitting during fine-tuning, particularly in the encoder parameters.

Proposed Solution

1. Switch to AdamW:

# moal/model.py — configure_optimizers()
optimizer = torch.optim.AdamW(
    [
        {"params": encoder_params, "lr": self.lr_encoder, "weight_decay": self.weight_decay_encoder},
        {"params": head_params,    "lr": self.lr_head,    "weight_decay": self.weight_decay_head},
    ]
)

2. Expose constructor parameters:

class ChemPropLightningModule(pl.LightningModule):
    def __init__(
        self,
        ...
        dropout: float = 0.0,
        weight_decay_encoder: float = 1e-5,
        weight_decay_head: float = 1e-4,
    ):

3. Pass dropout to ChemProp MPNN (already supported in the chemprop API). The default dropout=0.0 preserves current behavior. The weight decay defaults (1e-5 for the encoder, 1e-4 for the head) are conservative, though not strictly identical to the current Adam setup, which applies no decay; setting both to 0.0 removes the decay term entirely.
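
The snippet in step 1 leaves encoder_params and head_params undefined. One way they might be built is sketched below, assuming the encoder parameters can be identified by a name prefix. The helper name build_param_groups and the prefix "message_passing." are hypothetical and would need to match the actual attribute names in moal/model.py.

```python
from typing import Any, Dict, Iterable, List, Tuple


def build_param_groups(
    named_params: Iterable[Tuple[str, Any]],
    lr_encoder: float,
    lr_head: float,
    weight_decay_encoder: float = 1e-5,
    weight_decay_head: float = 1e-4,
    encoder_prefix: str = "message_passing.",  # assumption: adjust to the real module name
) -> List[Dict[str, Any]]:
    """Split (name, parameter) pairs into encoder and head optimizer groups."""
    encoder_params: List[Any] = []
    head_params: List[Any] = []
    for name, param in named_params:
        # Anything not under the encoder prefix is treated as part of the head.
        if name.startswith(encoder_prefix):
            encoder_params.append(param)
        else:
            head_params.append(param)
    return [
        {"params": encoder_params, "lr": lr_encoder, "weight_decay": weight_decay_encoder},
        {"params": head_params, "lr": lr_head, "weight_decay": weight_decay_head},
    ]
```

Inside configure_optimizers() the returned list could then be passed straight to torch.optim.AdamW, for example with self.named_parameters() as the first argument. If the chemprop model is stored under an attribute of the Lightning module, the parameter names carry that attribute as an extra leading component, so the prefix has to include it.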

Files

  • moal/model.py

Notes

Weight decay for the encoder should be smaller than for the head (or zero) to avoid disrupting the pretrained CheMeleon features.
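
To get a rough feel for how strongly each setting pulls weights toward zero: AdamW applies decoupled decay by multiplying each parameter by (1 - lr * weight_decay) at every step, before the Adam update. The sketch below computes the cumulative shrinkage from that term alone, ignoring gradients. The learning rates (1e-4 encoder, 1e-3 head), the constant schedule, and the 10,000-step count are assumptions for illustration, not values from this repository.

```python
def decay_shrink_factor(lr: float, weight_decay: float, steps: int) -> float:
    """Cumulative multiplicative shrinkage from AdamW's decoupled decay alone.

    Ignores gradient updates and assumes a constant learning rate.
    """
    return (1.0 - lr * weight_decay) ** steps


# Assumed learning rates and step count; weight decay values are the proposed defaults.
encoder = decay_shrink_factor(lr=1e-4, weight_decay=1e-5, steps=10_000)
head = decay_shrink_factor(lr=1e-3, weight_decay=1e-4, steps=10_000)
print(f"encoder: {encoder:.6f}, head: {head:.6f}")
```

Under these assumptions the encoder weights shrink by about 0.001% over the run and the head weights by about 0.1%, so the proposed defaults leave the pretrained features nearly untouched by the decay term.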
