
ciao-group/biomech-rl-tutorial

 
 


MyoSuite Finger Pose Tutorial

This project provides a simple tutorial for setting up, customizing, training, and visualizing a musculoskeletal finger model using MyoSuite, Gymnasium, and Stable-Baselines3.

Google Colab

Open In Colab

Click the badge above to open the tutorial notebook directly in Google Colab. The notebook will automatically clone this repository and install all dependencies. Just run the cells sequentially!

Project Structure

  • Colab_Tutorial.ipynb: An interactive Jupyter Notebook to run the entire workflow in Google Colab.
  • reward_tutorial.py: Defines a custom environment class (CustomRewardPoseEnv) that inherits from MyoSuite's PoseEnvV0. It demonstrates how to override the reward function and inject custom rewards (e.g., an "efficiency" reward).
  • train.py: Trains a Proximal Policy Optimization (PPO) agent using Stable-Baselines3 on the custom environment defined in reward_tutorial.py.
  • visualize.py: Loads the trained PPO model, runs it in the custom environment, and renders the episode. The rendered frames are saved as an MP4 video in the video/ directory.
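The override pattern described for reward_tutorial.py can be sketched roughly as follows. This is a minimal illustration, not the repository's code: StandInPoseEnv is a hypothetical stand-in for MyoSuite's PoseEnvV0 so the example runs without MyoSuite installed, and the keys "pose_err", "act", "pose", "efficiency", and "dense", as well as the rwd_weights attribute, are assumptions made for the example. The real base class's method signature and reward keys may differ.

```python
import math


class StandInPoseEnv:
    """Hypothetical stand-in for PoseEnvV0 (NOT the real MyoSuite class)."""

    def __init__(self, weights):
        # Mapping from reward-term name to its weight (assumed layout).
        self.rwd_weights = dict(weights)

    def get_reward_dict(self, obs_dict):
        # Base reward: negative Euclidean norm of the pose error.
        pose_err = math.sqrt(sum(e * e for e in obs_dict["pose_err"]))
        rwd = {"pose": -pose_err}
        rwd["dense"] = self._weighted_sum(rwd)
        return rwd

    def _weighted_sum(self, rwd):
        # Only terms that are both weighted and present contribute.
        return sum(w * rwd[k] for k, w in self.rwd_weights.items() if k in rwd)


class EfficiencyPoseEnvSketch(StandInPoseEnv):
    """Adds a hypothetical 'efficiency' term on top of the base reward."""

    def get_reward_dict(self, obs_dict):
        rwd = super().get_reward_dict(obs_dict)
        act = obs_dict["act"]
        # Penalize mean squared muscle activation (one possible notion of effort).
        rwd["efficiency"] = -sum(a * a for a in act) / len(act)
        # Recompute the scalar reward so the new term is included.
        rwd["dense"] = self._weighted_sum(rwd)
        return rwd


env = EfficiencyPoseEnvSketch({"pose": 1.0, "efficiency": 0.5})
reward = env.get_reward_dict({"pose_err": [3.0, 4.0], "act": [1.0, 1.0]})
print(reward)
```

The subclass calls the parent first and only then adds its own term, so the base reward logic stays untouched and the extra term can be switched off by setting its weight to zero.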

Installation

This project uses uv as its package manager for extremely fast dependency resolution and environment isolation.

  1. Install uv if you haven't already:
pip install uv
  2. Sync the environment and install dependencies defined in pyproject.toml:
uv sync

Alternatively, skip the explicit sync and run a script directly with uv run, which sets up the project environment automatically before executing it:

uv run python train.py

Workflow: Human-in-the-Loop Reward Tuning

The setup lets you change what the agent "cares about", retrain, and then inspect the effect in a rendered video.

1. Modify the Reward

Open reward_tutorial.py. Inside the __init__ method, you can adjust the default_weights or modify the custom_reward_weights passed in train.py. You can also inject entirely new reward logic inside the get_reward_dict function.
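One way to think about the relationship between default_weights and custom_reward_weights is as a dictionary merge in which the custom values override the defaults. The sketch below assumes that behaviour; the function name resolve_reward_weights, the term names, and the numeric values are all hypothetical and are not taken from reward_tutorial.py.

```python
# Hypothetical default weights; the real values live in reward_tutorial.py.
DEFAULT_WEIGHTS = {"pose": 1.0, "bonus": 4.0, "efficiency": 0.0}


def resolve_reward_weights(default_weights, custom_reward_weights=None):
    """Return a new dict where custom weights override the defaults."""
    weights = dict(default_weights)  # copy so the defaults are never mutated
    for key, value in (custom_reward_weights or {}).items():
        if key not in weights:
            # Fail loudly on unknown keys so a typo does not silently do nothing.
            raise KeyError(f"unknown reward term: {key!r}")
        weights[key] = float(value)
    return weights


weights = resolve_reward_weights(DEFAULT_WEIGHTS, {"efficiency": 0.1})
print(weights)
```

Rejecting unknown keys is a design choice for this sketch: a new reward term has to be registered in the defaults (for example with weight 0.0) before it can be tuned from the training script.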

2. Train the Agent

Run the training script to train a new agent based on your customized reward logic:

uv run python train.py

This will train the agent for 100,000 timesteps across 10 parallel environments and save the model to models/ppo_myofinger_custom_reward.zip.
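To get a feel for what these numbers mean in practice, the following sketch works out how many rollouts PPO would collect. It assumes train.py leaves Stable-Baselines3's n_steps at its default of 2048 per environment, which this README does not confirm, and it relies on the fact that Stable-Baselines3 always finishes a full rollout before checking the timestep budget. The helper name ppo_rollout_budget is made up for this example.

```python
import math


def ppo_rollout_budget(total_timesteps, n_envs, n_steps=2048):
    """Return (steps per rollout, number of rollouts, timesteps actually run).

    n_steps=2048 is the Stable-Baselines3 PPO default; train.py may override it.
    """
    steps_per_rollout = n_envs * n_steps
    # Training stops only after a whole rollout, so round up.
    n_rollouts = math.ceil(total_timesteps / steps_per_rollout)
    return steps_per_rollout, n_rollouts, n_rollouts * steps_per_rollout


per_rollout, rollouts, actual = ppo_rollout_budget(100_000, n_envs=10)
print(per_rollout, rollouts, actual)  # 20480 5 102400
```

Under these assumptions the agent receives only five policy updates phases' worth of data, so reward changes with a subtle effect may need a larger timestep budget before they show up in the video.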

3. Visualize the Results

Once training is complete, run the visualization script to watch the agent's behavior:

uv run python visualize.py

This will run the trained policy and render an off-screen video. The resulting video will be saved as video/episode_custom_reward.mp4. Open this file to see how your reward changes affected the finger's movement!

Note on Rendering

MyoSuite handles rendering through the MuJoCo physics engine natively (env.mj_renderer.render_offscreen). The visualize.py script automatically manages fetching these frames and compiling them into a video using imageio.
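The frame-collection loop that visualize.py is described as performing could look roughly like the sketch below. It is not the script's actual code: collect_frames and save_video are hypothetical helpers, the environment is assumed to follow the Gymnasium reset/step API, and render_frame is a caller-supplied function (in the real script it would wrap the off-screen renderer mentioned above, whose arguments are not shown in this README).

```python
def collect_frames(env, policy, render_frame, max_steps=200):
    """Run one episode and return the rendered frames.

    env          : object with Gymnasium-style reset() and step(action)
    policy       : callable mapping an observation to an action
    render_frame : callable mapping env to one image (e.g. an HxWx3 array)
    """
    frames = []
    obs, _info = env.reset()
    for _ in range(max_steps):
        action = policy(obs)
        obs, _reward, terminated, truncated, _info = env.step(action)
        frames.append(render_frame(env))
        if terminated or truncated:
            break
    return frames


def save_video(frames, path, fps=30):
    """Write frames to a video file; assumes imageio with an ffmpeg backend."""
    import imageio  # imported lazily so collect_frames works without it

    imageio.mimsave(path, frames, fps=fps)
```

With a trained Stable-Baselines3 model, the policy argument could be `lambda obs: model.predict(obs, deterministic=True)[0]`. Keeping rendering behind a callable makes the loop easy to test with a dummy environment before a full MuJoCo setup is available.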
