
Depth-Aware-And-RGB-3D-Modeling

With RGB-D cameras, we can capture multiple RGB and depth images and easily convert them to point clouds. Leveraging this, a single object can be reconstructed from multi-view RGB and depth images. To achieve this, the point clouds from the different views need to be registered. This task is known as point registration, whose goal is to find the transformation matrix between source and target point clouds. The alignment consists of two stages: initial alignment and alignment refinement.

Dataset Formatting

This project uses the RGB-D dataset from the University of Washington. Once downloaded, the dataset is formatted using the provided format_repo.py script to organize the files into the required structure.

How to Use

  1. Download the dataset: Download the RGB-D dataset from the University of Washington RGB-D Object Dataset.

  2. Format the dataset: Use the format_repo.py script to format the dataset as required for this repository. Update the source_directory and target_directory variables in the script to point to your dataset location. The simplest workflow is to place each object in the train directory, update the directory variables in the code, and run the script.

    if __name__ == "__main__":
        source_directory = r"train\apple_1"  # Update this path
        target_directory = source_directory  # Update this path
        reorganize_dataset(source_directory, target_directory)
  3. Run the script: Execute the script to format the dataset.

    python format_repo.py

RGB-D Camera Spec

  • Model: Intel RealSense D415
  • Intrinsic parameters for the 640x480 RGB and depth images:
K = [[597.522, 0.0, 312.885],
     [0.0, 597.522, 239.870],
     [0.0, 0.0, 1.0]]
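With these intrinsics, a depth pixel (u, v) can be back-projected to a 3D camera-frame point, which is how the point clouds in this project are formed. A minimal numpy sketch (the values come from the K matrix above; the helper names are illustrative, not from the repository's scripts):

```python
import numpy as np

# Intrinsics taken from the K matrix above (fx = fy for this camera).
FX, FY = 597.522, 597.522
CX, CY = 312.885, 239.870

def backproject(u, v, depth_m):
    """Back-project a single pixel (u, v) with depth in meters to a 3D point."""
    x = (u - CX) * depth_m / FX
    y = (v - CY) * depth_m / FY
    return np.array([x, y, depth_m])

def backproject_depth_image(depth, fx=FX, fy=FY, cx=CX, cy=CY):
    """Vectorized version: HxW depth image (meters) -> (H*W, 3) point array."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.stack([x, y, depth], axis=-1).reshape(-1, 3)
```

The principal point (CX, CY) maps onto the optical axis, so at depth 1 m it back-projects to (0, 0, 1).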

Requirements

myenv Virtual Environment Setup

Below are the steps for setting up a virtual environment with the necessary dependencies.

Create the virtual environment:

python -m venv myenv

Activate the environment:

On Windows:

./myenv/Scripts/activate

On Unix or MacOS:

source myenv/bin/activate

Then, create a file named requirements.txt with the following content:

Note: pyrealsense2 only works with Python 3.9.


open3d
pyrealsense2
opencv-python
numpy
kornia

Finally, install the dependencies using:

pip install -r requirements.txt

Align RGB and Depth Image & Depth Filtering

Because the RGB and depth lenses sit at different positions, the two images must be aligned to obtain accurate point clouds. This project uses the alignment function offered by the pyrealsense2 package. Raw depth data is noisy enough that depth filtering is also needed. The pyrealsense2 library, developed by Intel, offers several depth-image filters; this project uses the spatial filter, which smooths noise while preserving edge components in the depth image.

capture_aligned_images.py: Captures RGB and depth images and aligns them. Set the image paths for saving the RGB and depth images, then press 'S' to capture the scene.
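The core of such a capture loop can be sketched with pyrealsense2's alignment and spatial-filter objects. This is a minimal illustration, not the repository's actual script: the stream settings assume the D415 defaults implied above (640x480), the output filenames are placeholders, and a connected RealSense camera is required, so it cannot run headless:

```python
import numpy as np
import cv2
import pyrealsense2 as rs

# Start depth + color streams at 640x480 (matching the intrinsics above).
pipeline = rs.pipeline()
config = rs.config()
config.enable_stream(rs.stream.depth, 640, 480, rs.format.z16, 30)
config.enable_stream(rs.stream.color, 640, 480, rs.format.bgr8, 30)
pipeline.start(config)

align = rs.align(rs.stream.color)   # align the depth frame to the color camera
spatial = rs.spatial_filter()       # edge-preserving smoothing of the depth map

try:
    while True:
        frames = align.process(pipeline.wait_for_frames())
        depth_frame = spatial.process(frames.get_depth_frame())
        color_frame = frames.get_color_frame()
        if not depth_frame or not color_frame:
            continue
        depth = np.asanyarray(depth_frame.get_data())
        color = np.asanyarray(color_frame.get_data())
        cv2.imshow("color", color)
        if cv2.waitKey(1) & 0xFF == ord("s"):  # press 'S' to capture the scene
            cv2.imwrite("color.png", color)    # placeholder output paths
            np.save("depth.npy", depth)
finally:
    pipeline.stop()
```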

Pre-Process Point Clouds

The single object may be only part of the scanned scene. To extract the points of the object of interest, pre-processing is required: plane removal, outlier removal, and DBSCAN clustering are executed to isolate the object. Open3D offers useful functions for filtering points.

preprocess_pcd.py: Converts RGB-D images to the object's point clouds. Plane removal, point outlier filtering, and DBSCAN clustering are applied.
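Open3D provides each of these filters as a library call (e.g. `segment_plane`, `remove_statistical_outlier`, `cluster_dbscan`). To illustrate just the outlier-removal step, here is a numpy sketch of the same idea as statistical outlier removal; the function and parameter names are illustrative, and the O(N²) distance matrix is only suitable for small clouds:

```python
import numpy as np

def statistical_outlier_mask(points, nb_neighbors=8, std_ratio=2.0):
    """Keep points whose mean distance to their nb_neighbors nearest neighbors
    is within std_ratio standard deviations of the global mean distance
    (the idea behind Open3D's remove_statistical_outlier)."""
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    d.sort(axis=1)
    mean_knn = d[:, 1:nb_neighbors + 1].mean(axis=1)  # skip self-distance 0
    thresh = mean_knn.mean() + std_ratio * mean_knn.std()
    return mean_knn <= thresh
```

A point far from the rest of the cloud has a large mean neighbor distance and is masked out, while dense object points are kept.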

Feature-based Registration (Initial Alignment)

Initial alignment can be achieved by finding the transformation matrix between feature points. The positions of the 3D points are estimated from back-projection rays and depth values from the depth image. The transformation matrix is then estimated from corresponding 3D feature points in the source and target point clouds using a RANSAC procedure. To find robust correspondences within the object area, the extracted object's 3D points are reprojected and the resulting bounding area is used to filter outliers.

SIFT.py: Finds the 3D transformation matrix with SIFT feature points
ORB.py: Finds the 3D transformation matrix with ORB feature points
LoFTR.py: Finds the 3D transformation matrix with LoFTR feature points
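The estimation step shared by these scripts can be sketched as a least-squares rigid fit (Kabsch) inside a RANSAC loop over 3D correspondences. This is a simplified numpy illustration of the technique, not the scripts' exact code; thresholds and iteration counts are placeholders:

```python
import numpy as np

def kabsch(src, dst):
    """Least-squares rigid transform (R, t) mapping src points onto dst."""
    cs, cd = src.mean(axis=0), dst.mean(axis=0)
    H = (src - cs).T @ (dst - cd)
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))    # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    return R, cd - R @ cs

def ransac_rigid(src, dst, iters=200, thresh=0.05, seed=0):
    """RANSAC over minimal 3-point samples; refit on the best inlier set."""
    rng = np.random.default_rng(seed)
    best, best_inliers = None, -1
    for _ in range(iters):
        idx = rng.choice(len(src), size=3, replace=False)
        R, t = kabsch(src[idx], dst[idx])
        err = np.linalg.norm((src @ R.T + t) - dst, axis=1)
        n = int((err < thresh).sum())
        if n > best_inliers:
            best, best_inliers = (R, t), n
    R, t = best
    inliers = np.linalg.norm((src @ R.T + t) - dst, axis=1) < thresh
    return kabsch(src[inliers], dst[inliers])
```

Because the minimal sample has only 3 points, a few hundred iterations suffice even with a substantial fraction of bad feature matches.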

Reprojection Constraints

  • Before (figure)
  • After (figure)

ICP-based Registration (Alignment Refinement)

The initial transformation matrix is refined with the ICP algorithm implemented in Open3D. This project uses the point-to-plane ICP method.
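Point-to-plane ICP minimizes the distance from each transformed source point to the tangent plane of its matched target point. Open3D does this internally; to show what one linearized update looks like, here is a numpy sketch of a single point-to-plane step (correspondences are assumed given; the small-angle rotation approximation is standard for this linearization):

```python
import numpy as np

def point_to_plane_step(src, dst, normals):
    """One linearized point-to-plane ICP update for matched point pairs.
    Minimizes sum(((R @ p + t - q) . n)^2) under a small-angle rotation
    approximation and returns a 4x4 homogeneous transform."""
    # Each row of A is [p x n, n]; b is the signed plane distance to close.
    A = np.hstack([np.cross(src, normals), normals])
    b = np.einsum("ij,ij->i", dst - src, normals)
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    ax, ay, az, tx, ty, tz = x
    T = np.eye(4)
    T[:3, :3] = np.array([[1.0, -az,  ay],
                          [ az, 1.0, -ax],
                          [-ay,  ax, 1.0]])   # I + skew(omega)
    T[:3, 3] = [tx, ty, tz]
    return T
```

In practice this step is iterated with re-matched correspondences; for a pure translation the linearization is exact and one step recovers it.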

Pose Graph Optimization

Pose graph optimization is a non-linear optimization over poses, frequently used in SLAM. Nodes represent camera poses; edges represent the relative pose between two cameras (nodes). At loop closure, an error appears between the predicted and measured relative poses of the loop nodes due to inaccurate camera pose estimates. The goal of pose graph optimization is to minimize the error between the loop nodes (X_i, X_j). The Levenberg-Marquardt (LM) method is used for the non-linear optimization.
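The idea can be shown on a toy 1D pose graph (a deliberately simplified illustration, not the project's Open3D pose graph): odometry edges say each pose is 1.0 apart, but a loop-closure edge claims pose 3 sits only 2.7 from pose 0. In 1D the problem is linear, so a single least-squares solve gives the answer LM would converge to, spreading the 0.3 of accumulated error over all edges:

```python
import numpy as np

# Edges are (i, j, measured x_j - x_i); the last edge is the loop closure.
edges = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0), (0, 3, 2.7)]
n = 4
A, b = [], []
for i, j, z in edges:
    row = np.zeros(n - 1)        # unknowns are x_1..x_3; x_0 = 0 anchors the graph
    if j > 0:
        row[j - 1] += 1.0
    if i > 0:
        row[i - 1] -= 1.0
    A.append(row)
    b.append(z)
x, *_ = np.linalg.lstsq(np.array(A), np.array(b), rcond=None)
poses = np.concatenate([[0.0], x])   # each edge ends up with residual 0.075
```

Before optimization the loop-closure residual is 3.0 − 2.7 = 0.3; afterwards every edge absorbs an equal 0.075 share, which is exactly the effect shown in the reconstruction comparison below.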



Effect of Pose Graph Optimization


You can see the difference in registration quality between the unoptimized (left) and optimized (right) reconstruction results: in the unoptimized result, errors remain due to accumulated pose error.

Run Registration

Pose_graph_ICP.py: Runs registration with the ICP method
pose_graph_Feature_based.py: Runs registration with the feature-based (SIFT) method

Results

The object was reconstructed from multiple RGB-D images taken from different views.

The reconstructed point clouds are shown below.

Object 3D Reconstruction (Feature based)

Object 3D Reconstruction (ICP based)

Human shape 3D Reconstruction

Pushing to GitHub

To push your changes to a new or existing branch on GitHub, follow these steps:

  1. Initialize Git (if not already initialized):

    git init
  2. Add the remote repository:

    git remote add origin https://github.com/gamidirohan/Depth-Aware-And-RGB-3D-Modeling.git

    If the remote already exists, update it:

    git remote set-url origin https://github.com/gamidirohan/Depth-Aware-And-RGB-3D-Modeling.git
  3. Add all files to the staging area:

    git add .
  4. Commit the changes with a message:

    git commit -m "Initial commit"
  5. Push the changes to the remote repository:

    git push -u origin main

    If you are pushing to a different branch, replace main with your branch name:

    git push -u origin your-branch-name


About

Transform RGBD data into 3D models by fusing RGB, depth, and segmentation masks. Generates volumetric 3D scenes for XR, 3D printing, and analytics. Features multi-modal fusion, mesh/point cloud generation, and Python-based implementation with Open3D/OpenCV. Explore code, experiment, and contribute to this RGBD-to-3D pipeline.
