Temporal Difference Flows

This repository provides the code for Conditional Flow Matching and Temporal Difference Flows.

Experiments

The experiments were performed for the following tasks in the PointMass Maze environment (from dm_control):

Reach Top Left: The agent must navigate to the upper-left corner of the maze.
Reach Top Right: The agent must navigate to the upper-right corner of the maze.

These tasks evaluate the ability of Temporal Difference Flows to capture the discounted occupancy distribution (Successor Measure).
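As a rough illustration of what the discounted occupancy distribution means, the sketch below draws one sample from it along a recorded trajectory: a future offset k is drawn with probability (1 - gamma) * gamma**k, and the state at that offset is returned. The function name, the toy trajectory, and the clipping at the end of the trajectory are assumptions made for illustration and are not taken from this repository.

```python
import random

def sample_discounted_future_state(trajectory, t, gamma, rng=random):
    """Draw one future state of trajectory[t] with P(offset k) = (1 - gamma) * gamma**k.

    k = 0 corresponds to the next state trajectory[t + 1]. Offsets that would run
    past the end of the trajectory are clipped to the last state (an assumption
    made here to keep the toy example finite).
    """
    if not 0 <= t < len(trajectory) - 1:
        raise ValueError("t must index a state that has at least one successor")
    k = 0
    # Continue one more step with probability gamma, stop with probability 1 - gamma.
    while rng.random() < gamma:
        k += 1
    index = min(t + 1 + k, len(trajectory) - 1)
    return trajectory[index]

# Toy 2D trajectory (hypothetical data, not produced by the expert policy).
trajectory = [(0.1 * i, 0.1 * i) for i in range(50)]
sample = sample_discounted_future_state(trajectory, t=0, gamma=0.9, rng=random.Random(0))
```

With gamma close to 1 the samples spread far along the trajectory; with gamma equal to 0 the function always returns the immediate next state.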

Installation

uv sync

Data Setup

download_exorl.sh

Usage

1. Expert Policy Training

To train the expert policies, run demo_td3.ipynb once for each task: reach_top_right and reach_top_left.

For each task, this produces td3_point_mass_expert_{task}.zip and agent_trajectory_{task}.gif.

This step is required to reproduce the results from the article.

2. Run Training

To launch the training process, run the following command in your terminal:

python3 -m train --task reach_top_left --num_epochs 100 --loss_type td2_cfm

Arguments:

--task: The target environment task: either reach_top_left (default) or reach_top_right.

--num_epochs: Number of training epochs (integer). Default: 100.

--loss_type: The objective function used for training: either td_cfm or td2_cfm.
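The flags listed above could be declared with argparse roughly as follows. This is a sketch of the documented interface, not the contents of train.py; the default for --loss_type is not stated in this README, so the sketch marks the flag as required rather than guessing one.

```python
import argparse

def build_train_parser():
    """Build a parser that mirrors the training flags documented in this README."""
    parser = argparse.ArgumentParser(prog="train")
    parser.add_argument(
        "--task",
        choices=["reach_top_left", "reach_top_right"],
        default="reach_top_left",
    )
    parser.add_argument("--num_epochs", type=int, default=100)
    # Assumption: no default is documented for --loss_type, so it is required here.
    parser.add_argument("--loss_type", choices=["td_cfm", "td2_cfm"], required=True)
    return parser

args = build_train_parser().parse_args(
    ["--task", "reach_top_right", "--num_epochs", "500", "--loss_type", "td2_cfm"]
)
```

Restricting --task and --loss_type with choices makes a mistyped value fail at startup instead of partway through training.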

For PointMass Maze tasks, we recommend at least 500 epochs to achieve high-fidelity Successor Measure approximations as described in the original article.

Training writes the following checkpoint files:

checkpoints/{loss_type}_model_{task}_epoch_{epoch}.pth
checkpoints/{loss_type}_model_{task}.pth
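When loading these files from your own scripts, the naming pattern can be wrapped in a small helper. The function below is a hypothetical convenience, not part of the repository; it only reproduces the two file name templates shown above.

```python
from pathlib import Path

def checkpoint_path(loss_type, task, epoch=None, root="checkpoints"):
    """Return the checkpoint path for a given objective, task, and optional epoch.

    epoch=None selects the file without an epoch suffix.
    """
    if epoch is None:
        name = f"{loss_type}_model_{task}.pth"
    else:
        name = f"{loss_type}_model_{task}_epoch_{epoch}.pth"
    return Path(root) / name

final_ckpt = checkpoint_path("td2_cfm", "reach_top_left")
epoch_ckpt = checkpoint_path("td_cfm", "reach_top_right", epoch=5)
```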

3. Alternative Training

Alternatively, run demo_tdflow.ipynb in Google Colab, providing the necessary configuration.

4. Evaluation

To launch evaluation, run the following command in your terminal:

python3 -m evaluate --task reach_top_left --model td2_cfm --epoch None

Arguments:

--task: The target environment task. Either reach_top_left (default) or reach_top_right.

--model: The model to evaluate, named after the objective function used for training: either td_cfm or td2_cfm (default).

--epoch: The checkpoint epoch to load (checkpoints are provided for multiples of 5 epochs). Set it to None to load the final model.

The evaluation run reports metrics (with standard deviations) for the selected task.
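On the command line, --epoch None arrives as the string "None", so the value has to be converted before use. How evaluate.py handles this is not shown in this README; the sketch below is one possible way to do it with an argparse type function, and the name epoch_or_none is hypothetical.

```python
import argparse

def epoch_or_none(value):
    """Convert a command-line string to an int epoch, or to None for the final model."""
    if value.lower() == "none":
        return None
    epoch = int(value)  # a non-numeric string raises ValueError, which argparse reports
    if epoch <= 0:
        raise argparse.ArgumentTypeError("epoch must be a positive integer or None")
    return epoch

parser = argparse.ArgumentParser(prog="evaluate")
parser.add_argument("--epoch", type=epoch_or_none, default=None)

final = parser.parse_args(["--epoch", "None"]).epoch
fifth = parser.parse_args(["--epoch", "5"]).epoch
```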

Additional Notebooks

To demonstrate that Conditional Flow Matching is implemented correctly, we provide a conditional 2D Gaussian mixture example in conditional_flow_matching.ipynb.
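For readers new to the method, the regression target used in Conditional Flow Matching with a straight-line path between a noise sample and a data sample can be sketched in a few lines. This is the common linear-interpolation variant written in plain Python; whether conditional_flow_matching.ipynb uses exactly this path is an assumption, and the function name is illustrative.

```python
import random

def cfm_training_pair(x0, x1, t):
    """Return (x_t, target_velocity) for the straight path from noise x0 to data x1."""
    # Point on the path at time t: x_t = (1 - t) * x0 + t * x1.
    x_t = tuple((1.0 - t) * a + t * b for a, b in zip(x0, x1))
    # Velocity of the straight path, constant in t: x1 - x0.
    target = tuple(b - a for a, b in zip(x0, x1))
    return x_t, target

rng = random.Random(0)
x0 = (rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0))  # noise sample
x1 = (2.0, -1.0)                                  # data sample, e.g. near a mixture mode
t = rng.random()                                  # time drawn uniformly from [0, 1)
x_t, target = cfm_training_pair(x0, x1, t)
# A velocity network v(x_t, t) would be trained to minimise ||v(x_t, t) - target||^2.
```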

More examples of Flow Matching performance can be found at https://github.com/GerasimovSergey2001/FlowMatching

Model Storage

Model weights are stored at https://huggingface.co/SergeiGerasimov/TDFlow

Model losses are logged at https://wandb.ai/gerasimov-serf/TDFlow-Project/table?nw=nwusergerasimovserf
