
IGGT: Instance-Grounded Geometry Transformer for Semantic 3D Reconstruction

This is the official repository for the paper:

IGGT: Instance-Grounded Geometry Transformer for Semantic 3D Reconstruction

Hao Li*, Zhengyu Zou*, Fangfu Liu*, Xuanyang Zhang, Fangzhou Hong, Yukang Cao, Yushi Lan, Manyuan Zhang, Gang Yu, Dingwen Zhang, and Ziwei Liu

*Equal Contribution, Project Leader, ✉ Corresponding author.

arXiv Paper   |   Website   |   HuggingFace Data   |   HuggingFace Benchmark   |   HuggingFace Model

IGGT Demo

🔍 Overview

IGGT introduces a novel transformer-based architecture for semantic 3D reconstruction that grounds instance-level understanding in geometric representations. Our method achieves state-of-the-art performance on multiple benchmarks while maintaining computational efficiency.

Key Features:

  • 🎯 Instance-grounded 3D feature learning
  • 🏗️ Geometry-aware transformer architecture
  • 📊 State-of-the-art performance on ScanNet and InsScene-15K
  • ⚡ Efficient inference with multi-view consistency

📝 To-Do List

  • [x] Release project paper
  • [x] Release Benchmark (Segmentation, Track)
  • [x] Release InsScene-15K dataset
  • [ ] Release codebase
    • [x] Release model code
    • [ ] Release downstream task scripts
  • [x] Release pretrained models

🚀 Quick Start

Installation

To set up the environment for this project, please follow these steps:

  1. Create a new Conda environment with Python 3.10.0:

     conda create -n iggt python=3.10.0
     conda activate iggt

  2. Install the required dependencies:

     pip install -r requirements.txt

Note: To significantly accelerate clustering (DBSCAN), we highly recommend installing cuML from RAPIDS. Please refer to the official installation guide to choose the version appropriate for your system.
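Because cuML's DBSCAN mirrors scikit-learn's constructor arguments, it can serve as a drop-in GPU replacement. Below is a minimal sketch of that fallback pattern; `get_dbscan` is a hypothetical helper (not part of this repository) and assumes scikit-learn as the CPU backend.

```python
def get_dbscan(eps=0.01, min_samples=100):
    """Return a DBSCAN estimator, preferring GPU cuML when installed.

    Hypothetical helper, not part of the IGGT codebase. cuML's DBSCAN
    accepts the same eps/min_samples arguments as scikit-learn's, so
    either backend can be used interchangeably here.
    """
    try:
        from cuml.cluster import DBSCAN  # GPU-accelerated (RAPIDS)
    except ImportError:
        from sklearn.cluster import DBSCAN  # CPU fallback
    return DBSCAN(eps=eps, min_samples=min_samples)
```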

Running the Demo

We provide demo.py to demonstrate IGGT's capabilities in 3D scene reconstruction and segmentation.

1. Data Organization

We provide sample scenes in the iggt_demo directory (e.g., iggt_demo/demo1 to iggt_demo/demo9). For your own data, please organize it with the following structure:

scene_name/
└── images/           # Input images (sorted by filename)
    ├── 00000.jpg
    ├── 00001.jpg
    └── ...

(Optional) For evaluation against ground truth:

scene_name/
├── depth/            # Ground truth depth maps
└── cam/              # Camera parameters (.npz files)
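Before running the demo on your own data, you can sanity-check this layout with a small helper. The sketch below is hypothetical (`list_scene_images` is not part of the repository); it simply collects the input images in the filename order the demo relies on.

```python
from pathlib import Path

def list_scene_images(scene_dir):
    """Return the scene's input images sorted by filename.

    Hypothetical helper, not part of the IGGT codebase; it mirrors the
    directory layout described above (scene_name/images/*.jpg).
    """
    images_dir = Path(scene_dir) / "images"
    if not images_dir.is_dir():
        raise FileNotFoundError(f"expected an 'images/' folder in {scene_dir}")
    exts = {".jpg", ".jpeg", ".png"}
    images = sorted(p for p in images_dir.iterdir() if p.suffix.lower() in exts)
    if not images:
        raise FileNotFoundError(f"no images found in {images_dir}")
    return images
```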

2. Usage

Configure the paths in demo.py:

  • MODEL_PATH: Path to the pretrained checkpoint.
  • TARGET_DIR: Path to your input data directory.
  • SAVE_DIR: Path where results will be saved.

You can also adjust the CLUSTERING_CONFIG in demo.py to optimize segmentation results:

  • eps: DBSCAN epsilon parameter (default: 0.01). Controls the maximum distance between points to be considered neighbors.
  • min_samples: Minimum samples for a core point (default: 100).
  • min_cluster_size: Minimum size for a valid cluster (default: 500).
  • knn_k: Number of neighbors for spatial smoothing (default: 20).
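Taken together, these parameters might look like the following. This is a sketch that assumes CLUSTERING_CONFIG is a plain dictionary of the documented defaults; the actual structure in demo.py may differ.

```python
# Hypothetical sketch of CLUSTERING_CONFIG using the documented defaults;
# the actual structure in demo.py may differ.
CLUSTERING_CONFIG = {
    "eps": 0.01,              # DBSCAN neighborhood radius
    "min_samples": 100,       # minimum samples for a DBSCAN core point
    "min_cluster_size": 500,  # discard clusters smaller than this
    "knn_k": 20,              # neighbors used for spatial smoothing
}
```

A smaller eps yields more, tighter clusters; raising min_cluster_size suppresses spurious fragments.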

Then run the script:

python demo.py

The script will generate:

  • 3D Visualizations: .glb files for RGB, Mask, and PCA features.
  • Depth Maps: Visualizations with various colormaps in pred_depths/.
  • Segmentation: DBSCAN and PCA masks in dbscan_masks/ and colored_pca/.

Figure: Example 3D scene segmentation and reconstruction by IGGT.

✏️ Citation

If you find our code or paper helpful, please consider starring ⭐ us and citing:

@article{li2025iggt,
  title={IGGT: Instance-Grounded Geometry Transformer for Semantic 3D Reconstruction},
  author={Li, Hao and Zou, Zhengyu and Liu, Fangfu and Zhang, Xuanyang and Hong, Fangzhou and Cao, Yukang and Lan, Yushi and Zhang, Manyuan and Yu, Gang and Zhang, Dingwen and others},
  journal={arXiv preprint arXiv:2510.22706},
  year={2025}
}

📄 License

This project is released under the MIT License. See LICENSE for details.
