SplatSLAM is a novel approach to dense Simultaneous Localization and Mapping (SLAM) that achieves photo-realistic 3D reconstructions using 3D Gaussian Splatting, operating in real-time directly from standard RGB video.
Developed as part of the excellence program at Sapienza University of Rome, this project pushes the boundaries of visual SLAM by generating highly detailed and visually compelling 3D models without relying on depth sensors.
- Photo-realistic Quality: Generates dense 3D maps with significantly higher fidelity than traditional mesh or point cloud methods, thanks to 3D Gaussian Splatting.
- RGB-Only Operation: Eliminates the need for depth (RGB-D) sensors, making the system more versatile, robust to sensor noise, and compatible with standard cameras.
- Real-time SLAM Adaptation: Adapts Nerfstudio's `splatfacto` method, originally designed for offline rendering, into a real-time, incremental mapping pipeline.
- Robust Tracking: Implements a photometric error-based tracking module to maintain accurate camera pose estimation, even under fast motion and occlusions.
These videos showcase the dense, photo-realistic 3D scenes generated by SplatSLAM after just a few minutes of processing standard RGB video sequences.
- Room Scene Reconstruction: `room.webm`
- Kitchen Scene Reconstruction: `kitchen.mp4`
- Living Room (Large-Scale Generalization): `living_room.mp4`
SplatSLAM operates in a continuous loop, integrating new video frames to simultaneously track the camera's position and refine the 3D map.
- Input: A stream of RGB images from a camera.
- Tracking: Estimates the camera's pose by minimizing the photometric error between the current frame and a rendered view from the existing map (see the sketch after this list).
- Mapping (Optimization): The core of the system. Keyframes are selected using a covisibility heuristic and used to optimize the parameters of the 3D Gaussian Splatting representation, incrementally building and refining the map.
- Rendering: The final 3D map can be rendered from any viewpoint to produce photo-realistic images.
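To make the tracking and keyframing steps concrete, here is a minimal PyTorch sketch. It is illustrative only, not this repository's implementation: the `render_rgb` stub, the pose parameterization, and the covisibility threshold are all assumptions.

```python
import torch

def render_rgb(gaussian_map, pose):
    """Differentiable view synthesis from the Gaussian map.

    Placeholder: in the real pipeline this rasterizes the 3D Gaussians
    from the given camera pose; it is stubbed here to keep the sketch
    self-contained.
    """
    raise NotImplementedError("provided by the splatting backend")

def track_frame(gaussian_map, init_pose, frame_rgb, iters=50, lr=1e-3):
    """Estimate the camera pose of `frame_rgb` by minimizing photometric error.

    `init_pose` is a tensor parameterizing the camera pose (e.g. an se(3)
    vector); gradient descent on the L1 image difference refines it.
    """
    pose = init_pose.clone().requires_grad_(True)
    optimizer = torch.optim.Adam([pose], lr=lr)
    for _ in range(iters):
        optimizer.zero_grad()
        rendered = render_rgb(gaussian_map, pose)
        loss = (rendered - frame_rgb).abs().mean()  # photometric (L1) error
        loss.backward()
        optimizer.step()
    return pose.detach()

def is_keyframe(covisible_fraction, threshold=0.6):
    """Covisibility heuristic: promote the frame to a keyframe when it
    shares too little content with existing keyframes (the threshold
    value is an illustrative assumption)."""
    return covisible_fraction < threshold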
- Python 3.8+
- A CUDA-compatible NVIDIA GPU (highly recommended).
Our project is built as an extension to the powerful Nerfstudio library.
- Install Nerfstudio: First, follow the official Nerfstudio installation guide to set up the base framework and its dependencies. We strongly recommend using a virtual environment.

- Clone and Install SplatSLAM: Clone this repository and install our package in editable mode. This registers `splatslam` as a new method within your Nerfstudio environment.

  ```bash
  git clone https://github.com/alessandro-potenza/Gaussian_Splatting_SLAM.git
  cd Gaussian_Splatting_SLAM

  # Install our package
  pip install -e .

  # Update Nerfstudio's CLI
  ns-install-cli
  ```

- Verify the Installation: Run the `ns-train` help command. You should see `splatslam` listed as an available method.

  ```bash
  ns-train -h
  # Look for "splatslam" in the list of available methods
  ```
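As background on what the editable install does: Nerfstudio discovers third-party methods through entry points in the `nerfstudio.method_configs` group, each pointing to a `MethodSpecification` object. The sketch below shows the general shape of such a registration; the module path `splatslam/config.py` and the trainer settings are hypothetical, not this repository's actual code.

```python
# splatslam/config.py -- hypothetical registration sketch; Nerfstudio loads
# this object via a `nerfstudio.method_configs` entry point in pyproject.toml.
from nerfstudio.engine.trainer import TrainerConfig
from nerfstudio.pipelines.base_pipeline import VanillaPipelineConfig
from nerfstudio.plugins.types import MethodSpecification

splatslam_method = MethodSpecification(
    config=TrainerConfig(
        method_name="splatslam",           # the name shown by `ns-train -h`
        pipeline=VanillaPipelineConfig(),  # real config would wire in the SLAM pipeline
        max_num_iterations=30000,          # placeholder value
    ),
    description="Real-time RGB-only SLAM with 3D Gaussian Splatting.",
)
```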
To train the SplatSLAM model on your own RGB video data, simply use the `ns-train` command.

Basic Training Command:

```bash
# Point to your dataset processed with tools like COLMAP or Polycam
ns-train splatslam --data /path/to/your/dataset
```

For more details on data preparation and advanced options, please refer to the Nerfstudio documentation.
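If you are starting from a raw video rather than an already-processed dataset, Nerfstudio's `ns-process-data` tool can produce one, e.g. `ns-process-data video --data your_video.mp4 --output-dir /path/to/your/dataset`, which extracts frames and estimates camera poses with COLMAP. Exact flags can vary between Nerfstudio versions, so consult `ns-process-data --help`.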
This research was conducted as part of the excellence program at Sapienza University of Rome. We extend our gratitude to the Nerfstudio team for their exceptional open-source library, which provided a robust foundation for our work.
If you find this work useful for your research, please consider citing our paper (coming soon).
```bibtex
@article{potenza2024splatslam,
  title={SplatSLAM: Real-time 3D Mapping from RGB Video},
  author={Potenza, Alessandro and [Other Authors]},
  journal={arXiv preprint arXiv:XXXX.XXXXX},
  year={2024}
}
```