Yangbadger222/VLN

VLN NaVIDA Deployment

This repo prepares a Jetson vehicle runtime that sends camera frames to a remote RTX 4070 inference service running waynechu/NaVIDA, then publishes safe geometry_msgs/msg/Twist commands to the chassis.

Runtime Chain

  1. Jetson camera node publishes /navida/camera/image_raw.
  2. Jetson controller posts recent camera history + current image + instruction to your remote inference endpoint.
  3. The 4070 service runs the NaVIDA backend and returns action chunks.
  4. Jetson executes the first safe action chunk as a bounded /cmd_vel pulse.
  5. serial_twistctl subscribes to /cmd_vel and writes STM32 serial commands such as vcx=0.200,wc=0.800.
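
To make the last step concrete, here is a minimal sketch of how a Twist could be rendered into the serial string shown above. It is not the serial_twistctl source: the function name is hypothetical, the three-decimal formatting and the mapping of vcx to linear.x and wc to angular.z are inferred from the single example vcx=0.200,wc=0.800, and any line terminator or framing the STM32 expects is not covered.

```python
def format_chassis_command(linear_x: float, angular_z: float) -> str:
    """Render forward speed and yaw rate as 'vcx=<v>,wc=<w>'.

    Assumption: three decimal places, inferred from the README example.
    """
    return f"vcx={linear_x:.3f},wc={angular_z:.3f}"


print(format_chassis_command(0.2, 0.8))  # vcx=0.200,wc=0.800
```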

The copied chassis bridge lives under ros2_ws/src/sensor_drivers/serial_twistctl, with its local serial dependency in ros2_ws/src/sensor_drivers/serial.

Local Smoke Test

python3 -m pip install -e ".[dev]"
python3 scripts/run_inference_server.py --backend mock --host 127.0.0.1 --port 50051

In another shell:

python3 scripts/run_http_client.py
python3 scripts/run_ros2_node.py
python3 -m pytest -q

4070 Inference Host

Run this on your remote GPU inference host:

git clone https://github.com/Yangbadger222/VLN.git
cd VLN
python3 -m venv .venv
source .venv/bin/activate
python -m pip install -U pip
python -m pip install --index-url https://download.pytorch.org/whl/cu121 torch torchvision torchaudio
python -m pip install -e ".[server]"
python scripts/run_inference_server.py --backend hf --host 0.0.0.0 --port 50051 --model-id waynechu/NaVIDA --device cuda --load-in-4bit

Health check:

curl http://127.0.0.1:50051/health
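
The same check can be scripted, for example before starting the vehicle stack. The sketch below assumes only that /health answers with HTTP 200 when the service is up; the response body is not documented here, so it is ignored. The helper name is hypothetical.

```python
import urllib.error
import urllib.request


def is_healthy(base_url: str, timeout_s: float = 2.0) -> bool:
    """Return True if GET <base_url>/health answers with HTTP 200."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout_s) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or a non-2xx status.
        return False


# Example: is_healthy("http://127.0.0.1:50051")
```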

Use --backend mock first if CUDA/model dependencies are not ready yet.

Jetson Vehicle Host

Run this on your Jetson vehicle host:

git clone https://github.com/Yangbadger222/VLN.git
cd VLN
python3 -m pip install --user -U "pip>=24" "setuptools>=68,<80" wheel
python3 -m pip install --user --no-build-isolation -e .
sudo apt update
sudo apt install -y python3-opencv
cd ros2_ws
rosdep install --from-paths src --ignore-src -r -y
colcon build --symlink-install
source install/setup.bash
ros2 launch navida_vehicle navida_jetson.launch.py \
  inference_url:=http://REMOTE_INFERENCE_HOST:50051/v1/infer \
  serial_port:=/dev/serial_twistctl \
  instruction:="Go to the target object in front of you. Approach slowly and stop when close." \
  history_size:=4

camera_device now defaults to auto, which probes /dev/video0 through /dev/video5 and picks the first device that can return a frame. Override it explicitly with camera_device:=/dev/video4 when you already know the correct capture node.
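
The auto-probe can be pictured roughly as follows. This is a sketch of the idea, not the camera_publisher code: both function names are hypothetical, and the reader callback is injectable so the loop can be exercised without a camera attached.

```python
from typing import Callable, Iterable, Optional


def can_read_frame(device: str) -> bool:
    """Try to grab a single frame from `device` with OpenCV."""
    import cv2  # installed on the Jetson via python3-opencv

    cap = cv2.VideoCapture(device)
    try:
        if not cap.isOpened():
            return False
        ok, _frame = cap.read()
        return bool(ok)
    finally:
        cap.release()


def probe_camera(
    candidates: Iterable[str] = tuple(f"/dev/video{i}" for i in range(6)),
    can_read: Callable[[str], bool] = can_read_frame,
) -> Optional[str]:
    """Return the first candidate that yields a frame, or None if none do."""
    for device in candidates:
        if can_read(device):
            return device
    return None
```

Returning None instead of raising leaves it to the caller to decide whether a missing camera is fatal.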

The Jetson-side bridge now JPEG-compresses ROS image frames before posting them to the 4070. It also sends a rolling history of prior compressed frames (history_size, default 4) plus the current frame so NaVIDA can reason over observation history instead of a single still image. inference_timeout_s defaults to 20.0 and can be raised further during first on-car tests.
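
A rolling history of this kind is naturally a bounded deque. The sketch below assumes frames arrive already JPEG-encoded and are sent base64-encoded in a JSON body; the class name and the field names instruction, history, and image are illustrative assumptions, not the documented /v1/infer schema.

```python
import base64
import json
from collections import deque


class FrameHistory:
    """Keep the last `history_size` JPEG frames seen before the current one."""

    def __init__(self, history_size: int = 4) -> None:
        # A deque with maxlen drops the oldest frame automatically.
        self._frames = deque(maxlen=history_size)

    def build_payload(self, current_jpeg: bytes, instruction: str) -> str:
        """Return a JSON body with prior frames plus the current frame."""

        def encode(jpeg: bytes) -> str:
            return base64.b64encode(jpeg).decode("ascii")

        payload = {
            "instruction": instruction,                      # hypothetical field
            "history": [encode(f) for f in self._frames],    # hypothetical field
            "image": encode(current_jpeg),                   # hypothetical field
        }
        # The current frame becomes history for the next request.
        self._frames.append(current_jpeg)
        return json.dumps(payload)
```

With history_size=4, the first request carries no history and every request from the fifth onward carries exactly four prior frames.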

Jetson uses ROS 2 Humble's colcon-core, which currently requires setuptools<80. Keep the user-level setuptools>=68,<80 pin above; it supports editable installs without breaking colcon build.

If /dev/serial_twistctl does not exist yet, launch with the actual device, for example serial_port:=/dev/ttyUSB0. If turning is reversed, add angular_z_scale:=-1.0.

ROS 2 Topics

  • /navida/camera/image_raw: sensor_msgs/msg/Image, published by navida_vehicle camera_publisher.
  • /cmd_vel: geometry_msgs/msg/Twist, published by navida_vehicle remote_controller.
  • serial_twistctl_node subscribes to /cmd_vel and sends serial chassis commands.

Safety Defaults

  • max_linear_x: 0.2
  • max_angular_z: 0.45
  • command_timeout_s: 0.5
  • every non-stop command is followed by an explicit zero Twist after step_duration_s
  • visual_servo_enabled: true
  • target_forward_speed: 0.1
  • target_max_angular_z: 0.25
  • target_stop_area: 0.3
  • history_size: 4
  • inference failure immediately publishes zero Twist

Tune these in ros2_ws/src/navida_vehicle/config/navida_jetson.yaml or through launch arguments.
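
The clamp and pulse-stop defaults combine roughly as in the sketch below. It is illustrative rather than the remote_controller implementation: the names are hypothetical, and timing is left out, so the caller is assumed to wait step_duration_s between the two commands of a pulse.

```python
from dataclasses import dataclass
from typing import List, Tuple

Twist2D = Tuple[float, float]  # (linear_x, angular_z)


@dataclass(frozen=True)
class SafetyLimits:
    max_linear_x: float = 0.2
    max_angular_z: float = 0.45


def clamp(value: float, limit: float) -> float:
    """Restrict value to the symmetric range [-limit, +limit]."""
    return max(-limit, min(limit, value))


def safe_twist(linear_x: float, angular_z: float,
               limits: SafetyLimits = SafetyLimits()) -> Twist2D:
    """Apply the speed clamps to a requested command."""
    return (clamp(linear_x, limits.max_linear_x),
            clamp(angular_z, limits.max_angular_z))


def pulse(linear_x: float, angular_z: float,
          limits: SafetyLimits = SafetyLimits()) -> List[Twist2D]:
    """Return the commands for one pulse: a clamped command, then a stop."""
    cmd = safe_twist(linear_x, angular_z, limits)
    stop = (0.0, 0.0)
    return [cmd] if cmd == stop else [cmd, stop]
```

For example, pulse(1.0, 0.0) yields [(0.2, 0.0), (0.0, 0.0)]: the request is clamped to max_linear_x and followed by the explicit zero Twist.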

Paper-Faithful NaVIDA Mode

By default, the 4070 service asks NaVIDA to use historical observations plus the current observation and return compact action chunks:

{"actions":[{"action":"forward","repeat":1}]}

The Jetson executes action chunks through the existing speed clamps and pulse-stop safety layer. This is the primary runtime path for paper-faithful VLN/VLA behavior.
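
Parsing such a response might look like the sketch below. Only the forward action appears in this README; the other action names and the velocity assigned to each are assumptions chosen to match the safety defaults, and the function name is hypothetical. Anything malformed or unknown is treated as a stop, in line with the fail-safe behaviour described under Safety Defaults.

```python
import json
from typing import List, Tuple

# Hypothetical mapping from action name to (linear_x, angular_z).
ACTION_TO_TWIST = {
    "forward": (0.2, 0.0),
    "left": (0.0, 0.45),
    "right": (0.0, -0.45),
    "stop": (0.0, 0.0),
}


def first_action_twists(response_text: str) -> List[Tuple[float, float]]:
    """Expand the first action chunk into one twist per repeat.

    Malformed JSON, a missing or empty "actions" list, or an unknown
    action name all yield a single stop command.
    """
    stop = [(0.0, 0.0)]
    try:
        chunk = json.loads(response_text)["actions"][0]
        twist = ACTION_TO_TWIST[chunk["action"]]
        repeat = int(chunk.get("repeat", 1))
    except (ValueError, KeyError, IndexError, TypeError, AttributeError):
        return stop
    return [twist] * repeat if repeat > 0 else stop
```

Each returned twist would still pass through the speed clamps and the pulse-stop layer before reaching /cmd_vel.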

Optional Target Detector Fallback

For debugging semantic goals such as "go to the box/chair/door", you can opt into an open-vocabulary detector fallback:

python scripts/run_inference_server.py --backend hf --host 0.0.0.0 --port 50051 \
  --model-id waynechu/NaVIDA --device cuda --load-in-4bit \
  --target-detector-model-id google/owlvit-base-patch32 \
  --target-detector-fallback

When fallback is enabled, the response can include target metadata for visual servoing:

{"target":{"visible":true,"center_x":0.50,"area":0.10,"confidence":0.80},"actions":["forward"]}

When this metadata is present, Jetson uses center_x and area to turn slowly toward the target, drive when centered, and stop when the target is close. If metadata is missing, the controller falls back to action chunks.
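
The turn/drive/stop behaviour could be sketched as a small proportional controller. The speed and area constants are the documented safety defaults; the centering tolerance, the turn gain, and the function name are assumptions, as is the reading of center_x and area as fractions of the image in [0, 1] with center_x = 0 at the left edge. Positive angular_z is taken to mean a left turn, following the usual ROS convention.

```python
from typing import Optional, Tuple

TARGET_FORWARD_SPEED = 0.1   # from Safety Defaults
TARGET_MAX_ANGULAR_Z = 0.25  # from Safety Defaults
TARGET_STOP_AREA = 0.3       # from Safety Defaults
CENTER_TOLERANCE = 0.1       # assumption: how far off-centre still counts as centred
TURN_GAIN = 1.0              # assumption: proportional gain on the centring error


def servo_command(target: Optional[dict]) -> Optional[Tuple[float, float]]:
    """Return (linear_x, angular_z), or None to fall back to action chunks."""
    if not target or not target.get("visible"):
        return None
    if target["area"] >= TARGET_STOP_AREA:
        return (0.0, 0.0)  # close enough: stop
    error = 0.5 - target["center_x"]  # > 0 means the target is left of centre
    if abs(error) <= CENTER_TOLERANCE:
        return (TARGET_FORWARD_SPEED, 0.0)  # centred: drive forward
    angular_z = max(-TARGET_MAX_ANGULAR_Z,
                    min(TARGET_MAX_ANGULAR_Z, TURN_GAIN * error))
    return (0.0, angular_z)  # off-centre: turn in place toward the target
```

With the example metadata above (center_x 0.50, area 0.10), this sketch returns a slow forward command of (0.1, 0.0).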
