Bug: `RoboticArm` and `AgenticNavigation` sub-task labels are swapped (L4 / Goal-DrivenExecution)

Hi, thanks for open-sourcing this project — the benchmark design and task taxonomy are genuinely well organized, and the released resources are very helpful for the community. I really appreciate the effort your team put into building and maintaining this benchmark.

I noticed a possible labeling issue in the L4 / Goal-DrivenExecution tasks: the `RoboticArm` and `AgenticNavigation` sub-task labels appear to be swapped. Samples labeled `AgenticNavigation` use the `manipulateeval` metric and ask about robot-arm end-effector motion, while samples labeled `RoboticArm` use `agenticnaveval` and ask for a visual navigation action sequence. The table below summarizes the two affected ranges; could you double-check the annotations for these two categories?

| `spatree2` label | session_id range (count) | `metricfunc` | question text | actual task |
|---|---|---|---|---|
| `AgenticNavigation` | 5618–5867 (250) | `manipulateeval` | "…translation and rotation for the **robot arm's end-effector**… decompose the **end-effector movement** into 7 steps" | **robotic arm** |
| `RoboticArm` | 5868–6117 (250) | `agenticnaveval` | "**Task: Visual Navigation Action Sequence Generation** … expert **visual navigation agent** … **navigate a robot** …" | **navigation** |
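In case it helps with verification, here is a minimal sketch of how the mismatch could be detected automatically. It assumes each sample is available as a dict with `session_id`, `spatree2`, and `metricfunc` keys. The expected label-to-metric pairing in `EXPECTED_METRIC` is my own inference from the names, and `find_label_metric_mismatches` is a hypothetical helper, not part of the released code. The records are synthetic stand-ins that mirror the ranges in the table, not the real dataset.

```python
from collections import Counter

# Assumed pairing of sub-task label to metric function, inferred from the
# names only; the benchmark's intended mapping may differ.
EXPECTED_METRIC = {
    "RoboticArm": "manipulateeval",
    "AgenticNavigation": "agenticnaveval",
}


def find_label_metric_mismatches(records):
    """Count (label, metric) pairs that disagree with EXPECTED_METRIC."""
    mismatches = Counter()
    for rec in records:
        label = rec.get("spatree2")
        metric = rec.get("metricfunc")
        expected = EXPECTED_METRIC.get(label)
        # Labels outside the assumed mapping are skipped rather than flagged.
        if expected is not None and metric != expected:
            mismatches[(label, metric)] += 1
    return mismatches


# Synthetic records mirroring the session_id ranges reported in the table.
records = [
    {"session_id": i, "spatree2": "AgenticNavigation", "metricfunc": "manipulateeval"}
    for i in range(5618, 5868)
] + [
    {"session_id": i, "spatree2": "RoboticArm", "metricfunc": "agenticnaveval"}
    for i in range(5868, 6118)
]

mismatches = find_label_metric_mismatches(records)
for (label, metric), count in sorted(mismatches.items()):
    print(f"{label} paired with {metric}: {count} samples")
```

Running the same check on the actual annotation file (after loading it into a list of dicts) should show whether all 500 samples in these two ranges are affected or only a subset.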

Thanks again for releasing such a valuable benchmark and for your contributions to the community.
