SIMD-ified R-tree Query Processing and Optimization

A high-performance spatial indexing research codebase implementing SIMD-optimized R-tree structures using Intel AVX-512 instructions. This project explores various fanout sizes, vectorization strategies, and optimization techniques for spatial queries.

Overview

This repository contains implementations and benchmarks for two primary spatial query operations:

Range Search (RS): Find all objects intersecting a query rectangle
Spatial Join: Join two R-tree datasets based on spatial overlap

Multiple implementation strategies are provided for systematic performance evaluation across different hardware characteristics.

Features

Multiple Fanout Sizes: Support for 16, 32, 64, 128, 256, 512, 1024, 2048, and 4096-way fanouts
SIMD Vectorization: Intel AVX-512 optimized implementations
Vectorization Depths:
- D0 (Scalar): Non-vectorized baseline with logical and bitwise variants
- D1: Single-level SIMD vectorization with partial/full variants and prefetching
- D2: Two-level SIMD vectorization for deeper tree traversal
Performance Monitoring: Integration with Intel CPU performance counters
Cache-Aware Design: 64-byte aligned data structures for optimal cache utilization
Multiple Dataset Sizes: 1M, 5M, and 10M data points

Repository Structure

.
├── r_tree/              # R-tree node implementations
│   ├── rtreeN.h         # Standard fanout N implementations (16-4096)
│   ├── rtreeN_d1.h      # Depth-1 optimized variants
│   └── commonMethod.h   # Shared utilities
├── rs/                  # Range search implementations
│   ├── 1M/              # 1 million data points
│   ├── 5M/              # 5 million data points
│   └── 10M/             # 10 million data points
│       ├── scalar/      # Non-vectorized implementations
│       ├── vector-d1/   # Single-level SIMD vectorization
│       └── vector-d2/   # Two-level SIMD vectorization
├── join/                # Spatial join implementations
│   └── [same structure as rs/]
├── create_rtree/        # R-tree data generation scripts
│   ├── 1M/              # 1M point dataset generators
│   ├── 5M/              # 5M point dataset generators
│   ├── 10M/             # 10M point dataset generators
│   │   ├── main_{fanout}-1.py  # First index scripts (seed=100)
│   │   └── main_{fanout}-2.py  # Second index scripts (seed=200)
│   └── gen_qpool.py     # Query pool generator for range search
├── scripts/             # Build and execution scripts
│   ├── rs.sh            # Range search build script
│   └── join.sh          # Join build script
├── data/                # Dataset files
│   ├── join/            # R-tree structure files
│   └── search/          # Query datasets
├── out/                 # Output results
├── utils.h              # Core utilities and SIMD helpers
├── PerfEvent.hpp        # Performance counter interface
├── commonMethod.cpp     # Common methods implementation
└── CLAUDE.md            # AI assistant guidance

Prerequisites

CPU: Intel Ice Lake Server or newer (AVX-512 support required)
Compiler: g++ with AVX-512 support
OS: Linux (tested on systems with Intel Ice Lake architecture)

To verify your CPU supports AVX-512:

gcc -march=native -Q --help=target | grep march
# Should show: icelake-server or newer

Python Dependencies (for data generation)

pip install rtreelib

Data Generation

The create_rtree/ directory contains Python scripts to generate R-tree data files for spatial join experiments. Each script is pre-configured for specific dataset sizes, fanout values, and index variants.

Directory Structure

create_rtree/
├── 1M/          # 1 million point datasets
├── 5M/          # 5 million point datasets
└── 10M/         # 10 million point datasets
    ├── main_{fanout}-1.py    # First index (seed=100)
    └── main_{fanout}-2.py    # Second index (seed=200)

Pre-configured Scripts

Each dataset size folder contains scripts for fanouts: 16, 32, 64, 128, 256, 512, 1024, 2048

Naming Convention: main_{fanout}-{index}.py

{fanout}: R-tree maximum fanout (16, 32, 64, 128, 256, 512, 1024, 2048)
{index}: Index number (1 or 2) for generating paired join datasets

Usage

# Generate 1M dataset with fanout 64, first index
cd create_rtree/1M/
python main_64-1.py

# Generate 10M dataset with fanout 128, second index
cd create_rtree/10M/
python main_128-2.py

Data Generation Configuration

Each script generates point datasets (zero-width/height rectangles) with the following characteristics:

Dataset	Objects	Coordinate Space	Rectangle Size	Seed (Index 1)	Seed (Index 2)
1M	1,000,000	1000 × 1000	0 × 0 (points)	100	200
5M	5,000,000	1000 × 1000	0 × 0 (points)	100	200
10M	10,000,000	1000 × 1000	0 × 0 (points)	100	200

Key characteristics:

half_width = 0, half_height = 0: Objects are points, not rectangles
Different seeds: Index 1 uses seed(100), Index 2 uses seed(200) for diverse join pairs
Sorted output: Nodes are sorted by low_x coordinate before writing

Output

Generated files are written to the local data/join/ directory with the naming convention:

{index}-{size}-s1000_v0_fan{fanout}.txt

Examples:

data/join/1-1M-s1000_v0_fan64.txt - First index, 1M objects, fanout 64
data/join/2-10M-s1000_v0_fan128.txt - Second index, 10M objects, fanout 128
data/join/1-5M-s1000_v0_fan256.txt - First index, 5M objects, fanout 256

Important: These generated data files in data/join/ are the R-tree structure files that the join and range search implementations load and query during experiments.

Query Pool Generation

The create_rtree/gen_qpool.py script generates query rectangles for range search experiments.

Usage

cd create_rtree/
python gen_qpool.py

Configuration

The script generates a pool of query rectangles with the following default parameters:

Query pool size: 150,000 rectangles
Coordinate space: 1000 × 1000
Rectangle size: 10 × 10 (square_dim = 10, half_dim = 5)
Random seed: 100 (for reproducibility)

Output

Generated query pool file is written to data/search/ with the naming convention:

qs_{start}_{end}.txt

Default output: data/search/qs_0_75.txt

Where:

start: Lower bound of query rectangle size range (default: 0)
end: Upper bound of query rectangle size range (default: 75)

Each line in the output file contains one query rectangle:

<min_x> <min_y> <max_x> <max_y>

Important: These query pool files in data/search/ are used by the range search implementations to perform queries against the R-tree indices.

R-tree Data File Format

Each line in the R-tree structure files (data/join/) represents one R-tree node:

<count_child> <id> <lx[0]...lx[n]> <ly[0]...ly[n]> <hx[0]...hx[n]> <hy[0]...hy[n]> <child_id[0]...child_id[n]> <level> <parent_id>

Where:

count_child: Number of entries in this node
id: Node identifier
lx, ly: Low x and y coordinates of bounding rectangles
hx, hy: High x and y coordinates of bounding rectangles
child_id: Child node identifiers
level: Tree level (0 = root)
parent_id: Parent node identifier

Building

Quick Start

Build all range search variants for all dataset sizes:

cd scripts/
./rs.sh

Build all join variants:

cd scripts/
./join.sh

Building Individual Variants

Navigate to the specific implementation directory and compile:

# Example: Range search, 1M dataset, scalar, fanout 64
cd rs/1M/scalar/
g++ -g f64.cpp -march=native -funroll-loops -O3 -o rs_s_d0_f64
./rs_s_d0_f64

# Example: Join, 10M dataset, vector D1, fanout 128
cd join/10M/vector-d1/
g++ -g f128_o3.cpp -march=native -funroll-loops -mcmodel=medium -O3 -o join_v_d1_o3_f128
./join_v_d1_o3_f128

Compilation Flags

-march=native: Enable native CPU instruction set (AVX-512)
-funroll-loops: Loop unrolling optimization
-O3: Maximum optimization level
-mcmodel=medium: Required for large datasets (join/nn operations)
-g: Debug symbols

Executable Naming Convention

Format: {operation}_{implementation}_{depth}_{optimization}_f{fanout}

Examples:

rs_s_d0_f64 - Range Search, Scalar, Depth 0, fanout 64
join_v_d1_o3_f128 - Join, Vector, Depth 1, optimization 3, fanout 128

Suffixes:

_w_bw - with bitwise operations
_w_p - with partial vectorization
_w_pf - with prefetching

Implementation Details

SIMD Operations

Vector Width: 512-bit (__m512 for 16 floats)
Mask Types: __mmask16, __mmask32, __mmask64
Intrinsics: AVX-512 Foundation and AVX-512 DQ

Prefetching

Configurable prefetch distance and hints:

int lkahead_dis = 1;  // Lookahead distance
const int hint = _MM_HINT_T0;  // L1 cache

Debug Mode

Control debug output via the XDEBUG macro:

#define XDEBUG 0  // 0 = production, 1 = debug

Set to 1 for detailed SIMD operation debugging, 0 for performance measurements.

Performance Monitoring

The codebase uses PerfEvent.hpp for Intel CPU performance counter integration:

Cache misses (L1, L2, L3)
TLB misses
Instructions per cycle (IPC)
Branch mispredictions

Data Format

R-tree structure files (in data/join/) use the format:

<count_child> <level> <lx[0]...lx[n]> <ly[0]...ly[n]> <hx[0]...hx[n]> <hy[0]...hy[n]> <child_id[0]...child_id[n]> <level> <id>

Customizing Builds

Edit the arrays at the top of build scripts to customize:

# scripts/rs.sh or scripts/join.sh
DATASETS=(1M 5M 10M)
FANOUTS=(16 32 64 128 256 512 1024 2048)

Output

Results are written to the out/ directory with performance metrics including:

Query processing time
Node visits
Result counts
Hardware counter statistics

Publication

This codebase implements the research presented in the following paper:

SIMD-ified R-tree Query Processing and Optimization Yeasir Rayhan and Walid G. Aref In Proceedings of the 31st ACM International Conference on Advances in Geographic Information Systems (SIGSPATIAL '23) November 13–16, 2023, Hamburg, Germany

Citation

If you use this code in your research, please cite:

@inproceedings{DBLP:conf/gis/RayhanA23,
  author       = {Yeasir Rayhan and
                  Walid G. Aref},
  title        = {SIMD-ified R-tree Query Processing and Optimization},
  booktitle    = {Proceedings of the 31st {ACM} International Conference on Advances
                  in Geographic Information Systems, {SIGSPATIAL} 2023, Hamburg, Germany,
                  November 13-16, 2023},
  pages        = {37:1--37:10},
  publisher    = {{ACM}},
  year         = {2023},
  url          = {https://doi.org/10.1145/3589132.3625610},
  doi          = {10.1145/3589132.3625610},
}

License

This work is licensed under a Creative Commons Attribution International 4.0 License.

Contact

Yeasir Rayhan Purdue University, West Lafayette, IN, USA Email: yrayhan@purdue.edu

Acknowledgments

This work was supported by the National Science Foundation under Grant Number IIS-1910216.

Name		Name	Last commit message	Last commit date
Latest commit History 5 Commits
create_rtree		create_rtree
data/search		data/search
join		join
r_tree		r_tree
rs		rs
scripts		scripts
.gitignore		.gitignore
PerfEvent.hpp		PerfEvent.hpp
README.md		README.md
commonMethod.cpp		commonMethod.cpp
utils.h		utils.h

Folders and files

Latest commit

History

Repository files navigation

SIMD-ified R-tree Query Processing and Optimization

Overview

Features

Repository Structure

Prerequisites

Python Dependencies (for data generation)

Data Generation

Directory Structure

Pre-configured Scripts

Usage

Data Generation Configuration

Output

Query Pool Generation

Usage

Configuration

Output

R-tree Data File Format

Building

Quick Start

Building Individual Variants

Compilation Flags

Executable Naming Convention

Implementation Details

SIMD Operations

Prefetching

Debug Mode

Performance Monitoring

Data Format

Customizing Builds

Output

Publication

Citation

License

Contact

Acknowledgments

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages