Added video logging to Tensorboard to enable visual monitoring on remote servers by TarekSaati · Pull Request #109 · aidudezzz/deepworlds

TarekSaati · 2025-04-06T08:53:22Z

Hi everyone!
Since Deepworlds offers robotic training examples in virtualized environments that depend on intense trial and error, it is preferable to run simulations on a remote (headless) GPU server, where one might not be able to open X sessions. Even on desktop OS variants, visual monitoring is costly, and training is often run with the no-render option for speed. In addition, performance metrics and parameters are monitored and updated periodically through Tensorboard logging. In light of this, it is more convenient for learners and developers to monitor agent training visually by periodically logging not only scalars but also videos captured by a front camera mounted on the robot. This feature opens the door not only to visual monitoring, but also to visual navigation using advanced CNN-based RL models!

Here is an example video of the resulting visualization tab added to Tensorboard.

Changes (find_and_avoid_v2_robot_supervisor.py):

add camera device to FindAndAvoidV2RobotSupervisor __init__:

# Camera setup
# Add a camera device to capture frames
self.camera = self.getDevice("camera")  # Ensure you have a camera device named "camera" in your Webots world

implement render function in FindAndAvoidV2RobotSupervisor class:

    def render(self, mode='rgb_array'):
        '''
        We expect `render()` to return a uint8 array with values in [0, 255] or a float array
        with values in [0, 1]. The shape of the image must be (channel, height, width).
        '''
        if mode == 'rgb_array':
            # Capture the current frame from the camera
            frame_str = self.camera.getImage()      # Do NOT use getImageArray()
            H, W = self.camera.getHeight(), self.camera.getWidth()
            frame = np.zeros(shape=(3, H, W), dtype=np.uint8)
            if frame_str is not None:
                for x in range(W):
                    for y in range(H):
                        frame[0][y][x] = self.camera.imageGetRed(frame_str, W, x, y)
                        frame[1][y][x] = self.camera.imageGetGreen(frame_str, W, x, y)
                        frame[2][y][x] = self.camera.imageGetBlue(frame_str, W, x, y)
            return frame

        elif mode == 'human':
            # Display the frame in a window (optional, requires OpenCV)
            frame = self.render(mode='rgb_array')
            if frame is not None:
                import cv2
                # OpenCV expects (height, width, channel) in BGR order,
                # so reorder the (channel, height, width) RGB frame first
                bgr = np.ascontiguousarray(frame.transpose(1, 2, 0)[:, :, ::-1])
                cv2.imshow("Webots Simulation", bgr)
                cv2.waitKey(1)
        else:
            raise NotImplementedError(f"Render mode '{mode}' is not supported.")
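
The per-pixel loop above makes three Python calls for every pixel, which can become slow for larger camera resolutions. As a possible alternative, here is a minimal sketch of a vectorized conversion. It assumes the buffer returned by `getImage()` holds 4 bytes per pixel in B, G, R, A order, row by row; the helper name `bgra_bytes_to_chw_rgb` is hypothetical and not part of this PR, and the snippet uses a synthetic buffer instead of a real Webots camera.

```python
import numpy as np


def bgra_bytes_to_chw_rgb(image_bytes, width, height):
    """Convert a raw BGRA byte buffer into a (3, height, width) uint8 RGB array."""
    # Assumption: 4 bytes per pixel, ordered B, G, R, A, stored row by row.
    pixels = np.frombuffer(image_bytes, dtype=np.uint8).reshape(height, width, 4)
    rgb = pixels[:, :, [2, 1, 0]]  # drop alpha and reorder BGR -> RGB
    return np.ascontiguousarray(rgb.transpose(2, 0, 1))  # (H, W, C) -> (C, H, W)


# Synthetic 2x1 image (width=2, height=1): one pure-red pixel, then one pure-blue pixel.
fake_image = bytes([0, 0, 255, 255, 255, 0, 0, 255])
frame = bgra_bytes_to_chw_rgb(fake_image, width=2, height=1)
```

Inside `render()`, the same call could replace the nested loops if the assumption about the buffer layout holds for your Webots version.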

Changes (training.py):

1- Import Video from logger:

from stable_baselines3.common.logger import HParam, Video

2- Add necessary variables to AdditionalInfoCallback class:
add in the constructor:

        self.frames = []  # List to store frames for GIF creation
        self.episode_cnt = 1
        self.record = False
        self.render_interval = render_interval

3- Implement the on_step() event handler to record frames periodically:

if self.env.done:
    if self.episode_cnt % self.render_interval == 0:
        self.env.camera.enable(self.env.timestep * 10)  # basic time step = 32
        self.record = True
    else:
        self.env.camera.disable()
        self.record = False
    self.episode_cnt += 1
    self.frames = []
    print(f'Starting Episode {self.episode_cnt} ...')
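
To make the gating above easier to reason about, here is a small standalone sketch that replays the same counter logic without Webots. It assumes `episode_cnt` starts at 1, as in the constructor shown earlier; the function name `episodes_to_record` is hypothetical and only used for illustration.

```python
def episodes_to_record(num_episodes, render_interval):
    """Return the numbers of the episodes during which recording is switched on."""
    episode_cnt = 1
    recorded = []
    for _ in range(num_episodes):
        # This point corresponds to `self.env.done` being true at the end of an episode.
        if episode_cnt % render_interval == 0:
            # The camera is enabled here, so the *next* episode is the one recorded.
            recorded.append(episode_cnt + 1)
        episode_cnt += 1
    return recorded


recorded = episodes_to_record(num_episodes=10, render_interval=3)
```

Under these assumptions, with `render_interval=100` the first recorded episode would be number 101, since the check runs when episode 100 finishes.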

4- Add video creation and logging to tensorboard to on_rollout_end() event:

if self.record:
    # Save the frames to tensorboard
    frame = self.env.render(mode='rgb_array')  # (c, h, w)
    self.frames.append(frame)
    video = np.asarray([self.frames])  # (1, t, c, h, w)
    self.logger.record("visualization",
                       Video(torch.from_numpy(video), fps=30),
                       exclude=("stdout", "log", "json", "csv"))
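
The array passed to `Video` is laid out as (batch, time, channel, height, width). The following sketch shows that stacking step in isolation, using synthetic frames and NumPy only; the helper name `frames_to_video` is hypothetical and not part of this PR.

```python
import numpy as np


def frames_to_video(frames):
    """Stack a list of (C, H, W) uint8 frames into a (1, T, C, H, W) array."""
    if not frames:
        raise ValueError("no frames recorded")
    video = np.stack(frames, axis=0)  # (T, C, H, W)
    return video[np.newaxis, ...]     # add the leading batch axis


# Six synthetic 3-channel frames of height 4 and width 5.
frames = [np.full((3, 4, 5), i, dtype=np.uint8) for i in range(6)]
video = frames_to_video(frames)
```

The result has the same shape as `np.asarray([self.frames])` in the snippet above, and the explicit check avoids logging an empty video when no frames were collected.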

5- Add render parameters to run() function:

run(... ,
        log_interval=4,
        render_interval=100)

Notes:

This implementation was tested on an Ubuntu 22.04 server with an NVIDIA GeForce RTX 4090.
The following steps are performed on a remote GPU server where you should have sudo and SSH access privileges.
Webots must be installed on your system beforehand (tested with Webots R2023a).

Steps to run:

First, let's properly install deepbots. Connect and log in to your remote machine, then create a conda environment:

conda create --name webots python=3.8.20
conda activate webots

Install Deepbots and xvfb wrapper to handle headless operation:

sudo apt install xvfb
pip install setuptools==65.5.0 pip==21 xvfbwrapper wheel==0.38.0
pip install deepbots==1.0.0

Run the following to install the video-enabled Deepworlds:

git clone https://github.com/TarekSaati/deepworlds.git
cd deepworlds
pip install -r requirements/requirements.txt

Now move to the example directory with video-enabled logging:

cd examples/find_and_avoid_with_video

Run Webots in headless mode using the xvfb-run command:

xvfb-run --auto-servernum webots --mode=fast --no-rendering --stdout --stderr --minimize --batch ./worlds/find_avoid_v2_world.wbt

You can now monitor your agent while it trains at a given difficulty level.
In a new prompt, run Tensorboard with the --host 0.0.0.0 option so that clients can access it:

tensorboard --logdir=./trained_agent/diff_0_0/ --host 0.0.0.0

Have fun monitoring your training agents!

tsampazk

Thank you @TarekSaati! Amazing work! I see you worked around the current installation issues as well, good job. I discussed with @eakirtas about working on fixing the issues in the very near future and converting everything to gymnasium instead of the old gym version that's causing the issues.

This means #101 will be completed along with the one on deepbots. I suggest we go ahead and merge this as-is, since it is very nicely documented, and we will take over integrating with gymnasium and any other minor fixes, if any, in #101.

The only thing I need to ask before merging is that you rename the root dir to 'find_and_avoid_v2_video', just to make it a little clearer that it derives from the v2 and to avoid confusion in the future.

tsampazk · 2025-04-07T10:56:26Z

Sorry, I added my comment, deleted it, and re-added it as a review.

tsampazk reviewed Apr 7, 2025

renamed example

869d175

TarekSaati closed this Apr 7, 2025

TarekSaati force-pushed the main branch from 37477b8 to 869d175 · April 7, 2025 11:57

tsampazk linked an issue Apr 8, 2025 that may be closed by this pull request

Video-enabled find_avoid example project #108

Closed

tsampazk mentioned this pull request Apr 8, 2025

[Re-opened] Added video logging to Tensorboard to enable visual monitoring on remote servers #110

Merged

TarekSaati wants to merge 1 commit into aidudezzz:dev from TarekSaati:main