Skip to content

Added multi cartpole example#48

Open
veds12 wants to merge 20 commits into
aidudezzz:devfrom
eellak:dev
Open

Added multi cartpole example#48
veds12 wants to merge 20 commits into
aidudezzz:devfrom
eellak:dev

Conversation

@veds12

@veds12 veds12 commented Aug 17, 2021

Copy link
Copy Markdown

Description:

  • The multi-cartpole grid example consists of 9 cartpole robots each present in one block of a 3x3 grid
  • Each cartpole robot functions independently of each other
  • The supervisor communicated with each of these 9 robots through separate emitter-receiver channels, i.e. there are 9 pairs of emitter-receiver channels
  • The robots are trained using Proximal Policy Optimisation
  • The supervisor controller can work for more / less number of robots by changing the value of the parameter num_robots while initializing it. The robot controller is agnostic to the number of robots. Hence, this world can be easily extended to different number of robots.

TO DO:

  • 9 cartpole grid env
  • Simultaneous communication using emmiter-receiver scheme working
  • Training
  • Clean the code
  • Change communication scheme
  • Move all the SolidBox nodes to a group node
  • Add reward plotting

@KelvinYang0320

Copy link
Copy Markdown
Member

@veds12 Thank you for this great and interesting example.:thumbsup:
I will do a code review as soon as possible.

Could you move all SolidBox nodes to a group node.
By doing so, I think it will be easier for end-users to understand the scene tree in this example.

@KelvinYang0320 KelvinYang0320 self-requested a review August 17, 2021 14:26

@KelvinYang0320 KelvinYang0320 left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the great code quality!
I've left a few comments.
Also, please add a readme with showcase & plot to this example as we have done for each example in deepworlds.
Cheers!

Comment thread examples/multi_cartpole/controllers/supervisor_manager/PPO_runner.py Outdated
Comment thread examples/multi_cartpole/controllers/supervisor_manager/PPO_runner.py Outdated
Comment thread examples/multi_cartpole/controllers/supervisor_manager/PPO_runner.py Outdated
Comment thread examples/multi_cartpole/controllers/supervisor_manager/supervisor_controller.py Outdated
@veds12

veds12 commented Aug 18, 2021

Copy link
Copy Markdown
Author

Thanks for the review @KelvinYang0320. I'll make the changes and update the PR as soon as possible

@veds12

veds12 commented Jun 26, 2022

Copy link
Copy Markdown
Author

@KelvinYang0320 I have addressed the requested changes. Please take a look

@KelvinYang0320

Copy link
Copy Markdown
Member

@veds12 Did you use wandb in multi_cartpole?
I think you added evolutionary/cartpole_evolutionary/controllers/supervisor_manager/wandb by mistake.

Also, please add a readme with showcase & plot to this example as we have done for each example in deepworlds.

Please add a readme here.

Could you also fix the coordinate system (NUE → ENU) in this PR? Thank you!

@veds12

veds12 commented Jun 27, 2022

Copy link
Copy Markdown
Author

@KelvinYang0320 Yeah I left the wandb logs by mistake. Forgot to update the gitignore. I'll make the other changes.
Thanks!

@veds12

veds12 commented Jul 3, 2022

Copy link
Copy Markdown
Author
  • I have added a readme and changed the coordinate system (in the controller code).
  • I tried training the example but for some reason it's taking too long to converge / not converging at all. Currently looking into it. I'll add the training plot when I fix this.

poleAngle = [0.0 for _ in range(self.num_robots)]

# Angular velocity x of endpoint
endpointVelocity = [normalizeToRange(self.poleEndpoint[i].getVelocity()[3], -1.5, 1.5, -1.0, 1.0, clip=True) for i in range(self.num_robots)]

@KelvinYang0320 KelvinYang0320 Jul 5, 2022

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might be the reason why it's taking too long to converge / not converging.

Suggested change
endpointVelocity = [normalizeToRange(self.poleEndpoint[i].getVelocity()[3], -1.5, 1.5, -1.0, 1.0, clip=True) for i in range(self.num_robots)]
endpointVelocity = [normalizeToRange(self.poleEndpoint[i].getVelocity()[4], -1.5, 1.5, -1.0, 1.0, clip=True) for i in range(self.num_robots)]

@KelvinYang0320 KelvinYang0320 left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@veds12 I have left a comment about the training issue.
Could you format your code with PEP8?
I will do a review again after that.
Thank you!

@tsampazk

tsampazk commented Jul 7, 2022

Copy link
Copy Markdown
Member

@veds12 I have left a comment about the training issue. Could you format your code with PEP8? I will do a review again after that. Thank you!

Hey @veds12, just a clarification because we have been discussing it with @KelvinYang0320, i think that the only differences from PEP8 are that we use a line limit of 120, and prefer to use mixedCase (sometimes also called lowerCamelCase) instead of snake_case in variable names.

@veds12

veds12 commented Jul 7, 2022

Copy link
Copy Markdown
Author

@tsampazk thanks for the clarification! I'll address the changes this weekend

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants