This is the official repository for the paper:
IGGT: Instance-Grounded Geometry Transformer for Semantic 3D Reconstruction
Hao Li*, Zhengyu Zou*, Fangfu Liu*, Xuanyang Zhang, Fangzhou Hong, Yukang Cao, Yushi Lan, Manyuan Zhang, Gang Yu, Dingwen Zhang†, and Ziwei Liu
IGGT introduces a novel transformer-based architecture for semantic 3D reconstruction that grounds instance-level understanding in geometric representations. Our method achieves state-of-the-art performance on multiple benchmarks while maintaining computational efficiency.
Key Features:
- 🎯 Instance-grounded 3D feature learning
- 🏗️ Geometry-aware transformer architecture
- 📊 State-of-the-art performance on ScanNet and InsScene-15K
- ⚡ Efficient inference with multi-view consistency
- [x] Release project paper
- [x] Release Benchmark (Segmentation, Track)
- [x] Release InsScene-15K dataset
- [ ] Release codebase
- [x] Release model code
- [ ] Release downstream task scripts
- [x] Release pretrained models
To set up the environment for this project, please follow these steps:

1. Create a new Conda environment with Python 3.10.0:

```bash
conda create -n iggt python=3.10.0
conda activate iggt
```

2. Install the required dependencies:

```bash
pip install -r requirements.txt
```

Note: To significantly accelerate clustering (DBSCAN), we highly recommend installing `cuml` from RAPIDS. Please refer to the official installation guide to choose the appropriate version for your system.
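As a minimal illustration of this optional speed-up, the snippet below (our own sketch, not part of the repository) checks at runtime whether cuML is installed and falls back to scikit-learn otherwise:

```python
import importlib.util

def pick_dbscan_backend():
    """Return the module providing DBSCAN: cuML (GPU) if installed, else scikit-learn (CPU)."""
    if importlib.util.find_spec("cuml") is not None:
        return "cuml.cluster"   # RAPIDS installed: GPU-accelerated DBSCAN
    return "sklearn.cluster"    # CPU fallback with a compatible estimator interface

print(pick_dbscan_backend())
```

Because cuML's `DBSCAN` mirrors scikit-learn's estimator API (`fit`, `labels_`), swapping between the two backends usually requires no other code changes.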
We provide `demo.py` to demonstrate IGGT's capabilities in 3D scene reconstruction and segmentation.
We provide sample scenes in the `iggt_demo` directory (e.g., `iggt_demo/demo1` to `iggt_demo/demo9`).
For your own data, please organize it with the following structure:
```
scene_name/
└── images/          # Input images (sorted by filename)
    ├── 00000.jpg
    ├── 00001.jpg
    └── ...
```
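For reference, frames in this layout can be collected with a short helper like the one below (a sketch we provide here for illustration; `demo.py` may load images differently):

```python
from pathlib import Path

def list_scene_images(scene_dir):
    """Return image paths under scene_dir/images, sorted by filename."""
    img_dir = Path(scene_dir) / "images"
    exts = {".jpg", ".jpeg", ".png"}
    return sorted(p for p in img_dir.iterdir() if p.suffix.lower() in exts)
```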
(Optional) For evaluation against ground truth:

```
scene_name/
├── depth/           # Ground truth depth maps
└── cam/             # Camera parameters (.npz files)
```
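The camera `.npz` files can be inspected with NumPy. The helper below is our own sketch: it simply reports whatever arrays a given archive stores, since the exact key names depend on how the data was exported:

```python
import numpy as np

def load_camera(npz_path):
    """Load all arrays from a camera .npz archive into a dict."""
    data = np.load(npz_path)
    print("available keys:", list(data.files))  # inspect what the file actually stores
    return {k: data[k] for k in data.files}
```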
Configure the paths in `demo.py`:

- `MODEL_PATH`: Path to the pretrained checkpoint.
- `TARGET_DIR`: Path to your input data directory.
- `SAVE_DIR`: Path where results will be saved.
You can also adjust the `CLUSTERING_CONFIG` in `demo.py` to optimize segmentation results:

- `eps`: DBSCAN epsilon parameter (default: 0.01). Controls the maximum distance between points to be considered neighbors.
- `min_samples`: Minimum samples for a core point (default: 100).
- `min_cluster_size`: Minimum size for a valid cluster (default: 500).
- `knn_k`: Number of neighbors for spatial smoothing (default: 20).
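To illustrate how a parameter like `min_cluster_size` can act on raw DBSCAN output, here is a hypothetical post-processing step that discards undersized clusters (the actual logic in `demo.py` may differ):

```python
import numpy as np

def filter_small_clusters(labels, min_cluster_size):
    """Relabel clusters smaller than min_cluster_size as noise (-1)."""
    labels = np.asarray(labels).copy()
    ids, counts = np.unique(labels[labels >= 0], return_counts=True)
    for cid, cnt in zip(ids, counts):
        if cnt < min_cluster_size:
            labels[labels == cid] = -1  # too small to be a valid instance
    return labels
```

With the default `min_cluster_size` of 500, any DBSCAN cluster containing fewer than 500 points would be treated as noise rather than a segmented instance.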
Then run the script:

```bash
python demo.py
```

The script will generate:

- 3D Visualizations: `.glb` files for RGB, Mask, and PCA features.
- Depth Maps: Visualizations with various colormaps in `pred_depths/`.
- Segmentation: DBSCAN and PCA masks in `dbscan_masks/` and `colored_pca/`.
Figure: Example 3D scene segmentation and reconstruction by IGGT.
If you find our code or paper helpful, please consider starring ⭐ us and citing:
@article{li2025iggt,
title={IGGT: Instance-Grounded Geometry Transformer for Semantic 3D Reconstruction},
author={Li, Hao and Zou, Zhengyu and Liu, Fangfu and Zhang, Xuanyang and Hong, Fangzhou and Cao, Yukang and Lan, Yushi and Zhang, Manyuan and Yu, Gang and Zhang, Dingwen and others},
journal={arXiv preprint arXiv:2510.22706},
year={2025}
}

This project is released under the MIT License. See LICENSE for details.

