Labels
enhancement (New feature or request)
Description
The documentation for this library is very limited, and it is hard to work out how it should be applied to different problems. For instance, many RL algorithms such as PPO and MPO employ entropy regularization terms or KL constraints between policies to stabilise the standard RL objective. Would it be possible to give a practical example of how to use this library to solve the dual problem with Lagrangian multipliers in such applications?
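For concreteness, here is a minimal sketch of the kind of problem meant here, written in plain Python and deliberately not using this library, since its API is exactly what is unclear. It assumes a toy 3-armed bandit, a softmax policy, and a constraint KL(pi || pi_old) <= eps; the Lagrangian L(theta, lam) = -E_pi[r] + lam * (KL - eps) is handled with simultaneous gradient descent on the logits and projected gradient ascent on the multiplier. The function names, step sizes, and the bandit setup are illustrative assumptions, not anything taken from PPO, MPO, or this library.

```python
import math


def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]


def kl_divergence(p, q):
    # KL(p || q) for two strictly positive categorical distributions.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def solve_kl_constrained(rewards, old_policy, eps,
                         lr_primal=0.5, lr_dual=0.5, steps=5000):
    # Hypothetical helper: maximise expected reward subject to
    # KL(pi || old_policy) <= eps via primal descent / dual ascent.
    logits = [math.log(q) for q in old_policy]  # start at the old policy
    lam = 1.0  # Lagrange multiplier, kept non-negative
    for _ in range(steps):
        pi = softmax(logits)
        expected_reward = sum(p * r for p, r in zip(pi, rewards))
        kl = kl_divergence(pi, old_policy)
        # Gradient of L = -expected_reward + lam * (kl - eps) w.r.t. logits.
        grad = [
            p * (-(r - expected_reward) + lam * (math.log(p / q) - kl))
            for p, r, q in zip(pi, rewards, old_policy)
        ]
        # Primal step: descend on the logits.
        logits = [x - lr_primal * g for x, g in zip(logits, grad)]
        # Dual step: ascend on lam, then project onto lam >= 0.
        lam = max(0.0, lam + lr_dual * (kl - eps))
    return softmax(logits), lam


if __name__ == "__main__":
    policy, lam = solve_kl_constrained([1.0, 0.0, 0.0], [1 / 3] * 3, eps=0.05)
    print("policy:", policy)
    print("multiplier:", lam)
    print("KL to old policy:", kl_divergence(policy, [1 / 3] * 3))
```

An example in the documentation showing how the same primal and dual updates would be expressed with this library's own abstractions would answer the question.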
Many thanks.