
Please go through the whole process on an Ubuntu system.

Tutorial Videos

We've released tutorial videos for training and deploying a reinforcement learning policy. Please check them out on Bilibili or YouTube!

Sim-to-Sim

# segfault debugging tools (backward-cpp) install
sudo apt-get install libdw-dev
wget https://raw.githubusercontent.com/bombela/backward-cpp/master/backward.hpp
sudo mv backward.hpp /usr/include

# dependency install (Python 3.10)
pip install pybullet "numpy < 2.0" mujoco
git clone --recurse-submodules https://github.com/DeepRoboticsLab/Lite3_rl_deploy.git

# compile
cd Lite3_rl_deploy
mkdir build && cd build
cmake .. -DBUILD_PLATFORM=x86 -DBUILD_SIM=ON -DSEND_REMOTE=OFF

# Explanation:
# -DBUILD_PLATFORM: target platform (x86 for an Ubuntu PC, arm for the quadruped)
# -DBUILD_SIM: whether to use the simulator; set to OFF when deploying on the real robot
make -j
# run (open 2 terminals)
# Terminal 1 (PyBullet):
cd interface/robot/simulation
python pybullet_simulation.py

# Terminal 1 (MuJoCo, alternative to PyBullet):
cd interface/robot/simulation
python mujoco_simulation.py

# Terminal 2 
cd build
./rl_deploy

Usage (Terminal 2)

Tip: right-click the simulator window and select "always on top".

  • z: return to the default standing position
  • c: switch to RL control
  • w/a/s/d: move forward/leftward/backward/rightward
  • q/e: turn clockwise/counterclockwise
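The bindings above can be sketched as a mapping from keys to body-frame velocity commands. This is an illustration only: the actual handling lives in the project's keyboard interface, and the velocity magnitudes below are placeholders, not the project's real values.

```python
# Illustrative key-to-command table; tuples are (vx, vy, yaw_rate).
# Magnitudes are made-up placeholders for this sketch.
KEY_COMMANDS = {
    "z": "default position",
    "c": "switch to RL control",
    "w": (+0.5, 0.0, 0.0),   # forward
    "s": (-0.5, 0.0, 0.0),   # backward
    "a": (0.0, +0.3, 0.0),   # leftward
    "d": (0.0, -0.3, 0.0),   # rightward
    "q": (0.0, 0.0, +0.8),   # turn clockwise
    "e": (0.0, 0.0, -0.8),   # turn counterclockwise
}

def command_for(key):
    """Return the command bound to a key, or None if unbound."""
    return KEY_COMMANDS.get(key)

print(command_for("w"))  # (0.5, 0.0, 0.0)
```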

Sim-to-Real

This process is almost identical to sim-to-sim. You only need the extra step of connecting to the robot's Wi-Fi to transfer data, and then adjust the compilation options. By default, the real robot is controlled with a Retroid gamepad. If you need keyboard mode instead, change line 121 of state_machine/state_machine.hpp to

uc_ptr_ = std::make_shared<KeyboardInterface>();

Modify the file jy_exe/conf/network.toml to the following content:

ip = '192.168.2.1'
target_port = 43897
local_port = 43893

ips = ['192.168.1.103']
ports = [43897]
# apply the modification above

# the computer and gamepad should both connect to the robot's Wi-Fi
# Wi-Fi SSID: Lite*******
# Password: 12345678 (if this is wrong, contact technical support)

# scp to transfer files to quadruped (open a terminal on your local computer)
scp -r ~/Lite3_rl_deploy ysc@192.168.2.1:~/

# ssh connect for remote development
# Username	Password
# ysc		' (a single quote)
ssh ysc@192.168.2.1
# enter the password; the terminal is now a shell on the quadruped's onboard computer

# compile
cd Lite3_rl_deploy
mkdir build && cd build
cmake .. -DBUILD_PLATFORM=arm -DBUILD_SIM=OFF -DSEND_REMOTE=OFF 
# Explanation:
# -DBUILD_PLATFORM: target platform (x86 for an Ubuntu PC, arm for the quadruped)
# -DBUILD_SIM: whether to use the simulator; set to OFF when deploying on the real robot
make -j
./rl_deploy

Usage (Retroid gamepad)

Please refer to https://github.com/DeepRoboticsLab/gamepad

Model Conversion

To run the policy file trained with RL, you need to link the onnxruntime library, which supports models in the .onnx format. Therefore, you must manually convert the .pt model to the .onnx format.

You can convert the .pt model to the .onnx model by running the pt2onnx.py file in the policy folder. Pay attention to the program output to compare the consistency between the two models.

First, configure and verify the program runtime environment:

pip install torch numpy onnx onnxruntime

python3 -c 'import torch, numpy, onnx, onnxruntime; print("All modules OK")'

Then, run the program:

cd your/path/to/Lite3_rl_deploy/policy/

python pt2onnx.py

Afterward, you will see the corresponding .onnx model file in the current folder.
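The consistency check mentioned above boils down to comparing the two models' outputs on the same input. A minimal numpy sketch of that comparison (the real pt2onnx.py runs the .pt and .onnx models to obtain these arrays; the function name and tolerance here are illustrative):

```python
import numpy as np

def outputs_consistent(pt_out, onnx_out, atol=1e-5):
    """Return True if the two model outputs agree elementwise within atol."""
    pt_out = np.asarray(pt_out, dtype=np.float32)
    onnx_out = np.asarray(onnx_out, dtype=np.float32)
    return bool(np.max(np.abs(pt_out - onnx_out)) <= atol)

# Identical outputs pass; a visible discrepancy fails.
print(outputs_consistent([0.1, -0.2], [0.1, -0.2]))  # True
print(outputs_consistent([0.1, -0.2], [0.1, -0.3]))  # False
```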

state_machine

graph LR
A(Idle) -->B(StandUp) --> C(RL) 
C-->D(JointDamping)
B-->D
D-->A


The state_machine module is where Lite3 switches between states; the states represent the following functions:

1. Idle: idle state, in which the robot's joints are not enabled.

2. StandUp: stand-up state, the motion of the robot from sitting to standing.

3. RL: RL control state, in which the robot executes the actions output by the policy.

4. JointDamping: joint damping state, in which the robot's joints are under damping control.
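The transitions in the diagram above can be sketched as a small table. This is only an illustration of the graph, not the project's actual C++ state machine:

```python
# Allowed transitions from the state diagram:
# Idle -> StandUp -> RL -> JointDamping -> Idle, plus StandUp -> JointDamping.
TRANSITIONS = {
    "Idle": {"StandUp"},
    "StandUp": {"RL", "JointDamping"},
    "RL": {"JointDamping"},
    "JointDamping": {"Idle"},
}

def can_transition(src, dst):
    """Check whether the diagram allows moving from src to dst."""
    return dst in TRANSITIONS.get(src, set())

print(can_transition("StandUp", "RL"))  # True
print(can_transition("Idle", "RL"))     # False
```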

interface

graph LR
A(Interface) -->B(Robot)
A --> C(User Command)
B-->D(simulation)
B-->E(hardware)
C-->F(gamepad)
C-->G(keyboard)


The interface module provides the robot's data receiving/sending interface and the operator's control input. The robot side is split into simulation and hardware, and the user-command side is split into keyboard and gamepad control.
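The split in the diagram maps naturally onto an abstract-interface hierarchy. A Python sketch of that shape (the real project implements this in C++; these class and method names are illustrative):

```python
from abc import ABC, abstractmethod

class RobotInterface(ABC):
    """Data receive/send channel to the robot (simulation or hardware)."""
    @abstractmethod
    def read_state(self): ...

class SimulationRobot(RobotInterface):
    def read_state(self):
        return "state from simulator"

class HardwareRobot(RobotInterface):
    def read_state(self):
        return "state from UDP link"

class UserCommandInterface(ABC):
    """Velocity commands from the operator (gamepad or keyboard)."""
    @abstractmethod
    def get_command(self): ...

class KeyboardCommand(UserCommandInterface):
    def get_command(self):
        return (0.0, 0.0, 0.0)  # placeholder (vx, vy, yaw_rate)

# Callers program against the abstract types, so swapping
# simulation for hardware does not touch the control loop.
robot: RobotInterface = SimulationRobot()
print(robot.read_state())  # state from simulator
```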

run_policy

graph LR
A(policy_runner_base) -->B(policy_runner)



This section executes the output of the RL policy; new policies can be implemented by inheriting from policy_runner_base.
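Adding a policy by inheriting from the base class can be sketched as follows (a Python stand-in for the C++ policy_runner_base; the method name and observation/action sizes are illustrative, not the project's real signatures):

```python
from abc import ABC, abstractmethod

class PolicyRunnerBase(ABC):
    """Base class: subclasses define how an observation maps to an action."""
    @abstractmethod
    def run(self, observation):
        ...

class MyPolicyRunner(PolicyRunnerBase):
    """A new policy: here just a zero-action placeholder."""
    def run(self, observation):
        return [0.0] * 12  # 12 joint targets for a quadruped

runner = MyPolicyRunner()
print(len(runner.run(observation=[0.0] * 45)))  # 12
```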

The policy runner is compiled as a separate static library (librun_policy.a), so editing the policy recompiles only that library; the state machine and main.cpp object files are untouched, making incremental rebuilds much faster.

The onnxruntime C++ API headers are included only inside run_policy/lite3_test_policy_runner_onnx.cpp (via a PIMPL pattern). Callers only see the thin lite3_test_policy_runner_onnx.h header, which has no onnxruntime dependency.

Faster Linking

The default GNU linker (ld) is single-threaded. To speed up the link step, install mold and enable it with a CMake flag:

sudo apt install mold

# add -DUSE_MOLD_LINKER=ON to your cmake command, e.g.:
cmake .. -DBUILD_PLATFORM=x86 -DBUILD_SIM=ON -DUSE_MOLD_LINKER=ON
make -j

mold is a parallel linker and is typically 5–10× faster than ld for this project.