Thanks for sharing the code!
I have a question about choosing the optimizer when training diffmask. The paper 'How do Decisions Emerge across Layers in Neural Models? Interpretation with Differentiable Masking' uses Lookahead RMSprop, but this work uses RMSprop instead. Why did you change the optimizer? Does the choice of optimizer affect the results much?
It would help me a lot if I could get some advice!