This repository was archived by the owner on Apr 8, 2026. It is now read-only.
According to the code (https://github.com/openai/phasic-policy-gradient/blob/master/phasic_policy_gradient/train.py#L14), the 'detach' arch seems to correspond to the single-network variant described in Section 3.6 of the paper. According to the paper and the comment in the code, the value function should not be detached from the encoder during the aux phase. However, the value function (vfvec) appears to be always detached in the code:
phasic-policy-gradient/phasic_policy_gradient/ppg.py, lines 148 to 153 in 7295473
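To make the distinction concrete, here is a minimal plain-Python sketch of what I understand detaching to mean. It is a hypothetical scalar model with hand-written gradients, not the repository's PyTorch code; the names `value_loss_grads`, `w_enc`, and `w_vf` are my own and do not appear in the repository. It only illustrates that a detached value head still trains its own weights but sends no gradient to the shared encoder.

```python
# Toy scalar model (hypothetical, for illustration only):
#   encoder:    h = w_enc * x
#   value head: v = w_vf * h
#   loss:       (v - target) ** 2


def value_loss_grads(w_enc, w_vf, x, target, detach_value_head):
    """Return (dloss/dw_enc, dloss/dw_vf) for the toy model above."""
    h = w_enc * x                      # shared encoder output
    v = w_vf * h                       # value prediction
    dloss_dv = 2.0 * (v - target)      # derivative of the squared error
    grad_w_vf = dloss_dv * h           # the value head always gets a gradient
    # Detaching h treats it as a constant, so nothing flows back to the encoder.
    dloss_dh = 0.0 if detach_value_head else dloss_dv * w_vf
    grad_w_enc = dloss_dh * x
    return grad_w_enc, grad_w_vf


detached = value_loss_grads(0.5, 2.0, 3.0, 1.0, detach_value_head=True)
attached = value_loss_grads(0.5, 2.0, 3.0, 1.0, detach_value_head=False)
print("detached:", detached)
print("attached:", attached)
```

If the value head were detached in the aux phase as well, the encoder gradient from the value loss would be zero in both phases, which is the behaviour I am asking about.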
Can you clarify whether it should be detached or not in the aux phase and whether it affects the results reported in the paper?
Thanks