CUDA Path Tracer

Unreal Engine animation showcase: 'SKM_Manny.obj'

Part 1

Basic BSDF

Suzanne.obj

Implemented a unified shading kernel supporting multiple material types.

BSDF Evaluation Shading Kernel

__global__ void shadeMaterial_with_BSDF()
...
__host__ __device__ void scatterRay()
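
The per-material branch inside a scatter function can be sketched on the CPU as below. This is a minimal illustration, not the project's scatterRay(): Vec3, MaterialType, cosineSampleHemisphere and scatterDirection are hypothetical names, and only a diffuse and a perfectly specular material are covered.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical minimal vector type standing in for glm::vec3.
struct Vec3 { float x, y, z; };
Vec3 operator+(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
Vec3 operator-(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
Vec3 operator*(float s, Vec3 v) { return {s * v.x, s * v.y, s * v.z}; }
float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
Vec3 cross(Vec3 a, Vec3 b) {
    return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}
Vec3 normalize(Vec3 v) { return (1.0f / std::sqrt(dot(v, v))) * v; }

enum class MaterialType { Diffuse, Specular };  // assumed material set

// Cosine-weighted direction about the unit normal n, driven by two uniform
// random numbers u1, u2 in [0, 1).
Vec3 cosineSampleHemisphere(Vec3 n, float u1, float u2) {
    float up = std::sqrt(u1);           // cos(theta)
    float over = std::sqrt(1.0f - u1);  // sin(theta)
    float around = 2.0f * 3.14159265f * u2;
    // Use the coordinate axis least aligned with n to build a tangent frame.
    const float k = 0.57735027f;  // 1 / sqrt(3)
    Vec3 axis = std::fabs(n.x) < k ? Vec3{1.0f, 0.0f, 0.0f}
              : std::fabs(n.y) < k ? Vec3{0.0f, 1.0f, 0.0f}
                                   : Vec3{0.0f, 0.0f, 1.0f};
    Vec3 t1 = normalize(cross(n, axis));
    Vec3 t2 = normalize(cross(n, t1));
    return up * n + (std::cos(around) * over) * t1 + (std::sin(around) * over) * t2;
}

// Chooses the outgoing direction for a ray `in` hitting a surface with unit
// normal n, depending on the material type.
Vec3 scatterDirection(MaterialType type, Vec3 in, Vec3 n, float u1, float u2) {
    switch (type) {
    case MaterialType::Specular:
        return in - (2.0f * dot(in, n)) * n;  // mirror reflection
    case MaterialType::Diffuse:
    default:
        return cosineSampleHemisphere(n, u1, u2);
    }
}
```

Keeping every material behind one entry point is what lets a single shading kernel serve all paths; the cost is that threads in a warp may take different branches.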

Material Sorting

Material-based Memory Contiguity

Purpose: Reduce GPU warp divergence by grouping rays that hit the same material type, so neighboring threads take the same shading branch.

Controlled by #define SORT_MATERIAL for easy performance comparison.

Implementation: thrust::stable_sort_by_key sorts paths by material ID before shading.
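
The effect of the key-value sort can be illustrated with a CPU analogue. The sketch below uses std::stable_sort in place of thrust::stable_sort_by_key; the two structs are simplified, hypothetical stand-ins for the tracer's per-path data, and sortByMaterial is an illustrative name.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical, simplified stand-ins for the tracer's per-path data.
struct PathSegment { int pixelIndex; };
struct ShadeableIntersection { int materialId; };

// Reorders paths and intersections together so that entries with the same
// material ID become contiguous. The sort is stable, so paths that share a
// material keep their relative order.
void sortByMaterial(std::vector<PathSegment>& paths,
                    std::vector<ShadeableIntersection>& isects) {
    assert(paths.size() == isects.size());

    // Sort a permutation by material ID (the "key").
    std::vector<std::size_t> order(paths.size());
    std::iota(order.begin(), order.end(), std::size_t{0});
    std::stable_sort(order.begin(), order.end(),
                     [&](std::size_t a, std::size_t b) {
                         return isects[a].materialId < isects[b].materialId;
                     });

    // Apply the permutation to both arrays (the "values").
    std::vector<PathSegment> sortedPaths;
    std::vector<ShadeableIntersection> sortedIsects;
    sortedPaths.reserve(paths.size());
    sortedIsects.reserve(isects.size());
    for (std::size_t i : order) {
        sortedPaths.push_back(paths[i]);
        sortedIsects.push_back(isects[i]);
    }
    paths.swap(sortedPaths);
    isects.swap(sortedIsects);
}
```

Whether the sort pays off depends on the scene: with only a few materials, the sorting cost can exceed the time saved by reduced divergence, which is why a compile-time toggle is useful for comparison.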

Stochastic Antialiasing

- Implemented sub-pixel sampling for edge smoothing

Control: Enabled via #define ANTI_ALIASING 1

Implementation: u01(rng) generates a random offset in [0, 1) on each pixel axis during ray generation, so every iteration samples a different sub-pixel position.
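
A minimal CPU sketch of the jittered sample position follows. It assumes that the non-jittered fallback is the pixel centre and uses std::mt19937 in place of the Thrust engine; Sample2D and pixelSample are hypothetical names.

```cpp
#include <cassert>
#include <random>

struct Sample2D { float x, y; };

// Returns the continuous image-plane position used to generate the camera
// ray for pixel (px, py). With jitter enabled the position is drawn uniformly
// from the pixel's footprint; otherwise the pixel centre is used (assumption).
Sample2D pixelSample(int px, int py, bool jitter, std::mt19937& rng) {
    assert(px >= 0 && py >= 0);
    std::uniform_real_distribution<float> u01(0.0f, 1.0f);
    float dx = jitter ? u01(rng) : 0.5f;
    float dy = jitter ? u01(rng) : 0.5f;
    return {static_cast<float>(px) + dx, static_cast<float>(py) + dy};
}
```

Averaging many iterations of such samples acts as a box filter over the pixel, which is what smooths geometric edges.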

Analysis

Part 2

Refraction

cornell_suzanne.json

Implemented physically based refraction for transparent materials such as glass.

- Entry/Exit Detection: bool entering = cosTheta > 0
- Schlick Fresnel Approximation: implemented in schlickFresnel()
- Total Internal Reflection: detected with if (glm::length(refracted) < 0.001f)
- Material Configuration: JSON "TYPE": "Refractive" support
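
The pieces listed above can be sketched together as follows. This is an illustrative CPU version under stated assumptions: Vec3 is a hypothetical stand-in for glm::vec3, the schlickFresnel signature shown here is assumed rather than taken from the project, and refractDir is a hypothetical helper that reports total internal reflection through its return value instead of a near-zero vector.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical minimal vector type standing in for glm::vec3.
struct Vec3 { float x, y, z; };
Vec3 operator+(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
Vec3 operator*(float s, Vec3 v) { return {s * v.x, s * v.y, s * v.z}; }
float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Schlick's approximation of the Fresnel reflectance.
// cosTheta: cosine of the angle between the incident ray and the normal (>= 0).
// etaI, etaT: indices of refraction on the incident and transmitted sides.
float schlickFresnel(float cosTheta, float etaI, float etaT) {
    float r0 = (etaI - etaT) / (etaI + etaT);
    r0 *= r0;
    float m = 1.0f - cosTheta;
    return r0 + (1.0f - r0) * m * m * m * m * m;
}

// Refracts the unit direction `in` at a surface with unit normal `n`, where n
// points against `in`. eta is etaI / etaT. Returns false on total internal
// reflection, in which case `out` is left untouched.
bool refractDir(Vec3 in, Vec3 n, float eta, Vec3& out) {
    float cosI = -dot(in, n);
    float k = 1.0f - eta * eta * (1.0f - cosI * cosI);
    if (k < 0.0f) {
        return false;  // no transmitted ray exists
    }
    out = eta * in + (eta * cosI - std::sqrt(k)) * n;
    return true;
}
```

A typical use is to compare a uniform random number against the Schlick reflectance to choose between the reflected and the refracted direction, and to always reflect when refractDir reports total internal reflection.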

Camera

Depth of field

Configured in cornell_suzanne.json through "LENS_RADIUS" and "FOCAL_DISTANCE".

Simulates a real camera lens with a configurable aperture size and focal distance. Objects at the focal distance appear sharp, while foreground and background objects blur in proportion to their distance from the focal plane.
Each ray starts from a random point on the circular lens aperture, sampled in polar coordinates with a square-root radius so that samples are uniform over the disk area. The bokeh effect emerges as the Monte Carlo estimate converges over many iterations.

#if DEPTH_OF_FIELD
    // Point that the unperturbed pinhole ray reaches at the focal distance
    glm::vec3 focalPoint = cam.position + cam.focalDistance * rayDir;

    // Sample the lens disk; sqrt keeps the sample density uniform in area
    float theta = u01(rng) * 2.0f * 3.14159265f;
    float r = cam.lensRadius * sqrt(u01(rng));
    glm::vec3 lensOffset = r * (cos(theta) * cam.right + sin(theta) * cam.up);

    // Shoot from the sampled lens position toward the focal point
    segment.ray.origin = cam.position + lensOffset;
    segment.ray.direction = glm::normalize(focalPoint - segment.ray.origin);
#endif

Load Mesh & Env

suzanne.json & Blue_stripe.hdr

Mesh Loading Workflow (Custom OBJ Loader - src/objLoader.cpp, Scene Integration - src/scene.cpp)

Custom OBJ Parser: Implemented lightweight OBJ loader supporting vertices, normals, and faces with v//vn format. Parses geometry data and applies world transformations including translation, rotation, and scaling. Each mesh is assigned a single material ID and integrated into the scene's triangle array.

Scene Integration: Meshes are loaded via JSON configuration and transformed to world coordinates using transformation matrices. Normal vectors are transformed with the inverse transpose matrix so that lighting calculations stay correct.

BVH Acceleration Structure (BVH Construction - src/bvh.cpp)

suzanne.obj - 16689 triangles
Manny_Skm.obj - 73184 triangles

Built using Surface Area Heuristic for optimal partitioning. Combines both primitive geometry (spheres, cubes) and triangle meshes into a unified acceleration structure. Uses 12-bucket SAH evaluation to minimize intersection cost.

GPU Traversal: Implements stack-based iterative traversal optimized for GPU execution. Uses linear memory layout for cache efficiency and supports both geometry primitives and triangle meshes in the same BVH tree.

Performance: Reduces the expected per-ray intersection cost from O(n) to roughly O(log n) in the number of primitives, enabling efficient rendering of meshes with tens of thousands of triangles. Build statistics report construction time and node count for performance analysis.
```
BVHAccel::BVHAccel(std::vector<std::shared_ptr<Primitive>>& prims, int maxPrimsInNode)
...
BVHBuildNode* BVHAccel::recursiveBuild(
std::vector<Primitive>& primitiveInfo,
int start, int end, int* totalNodes,
std::vector<std::shared_ptr<Primitive>>& orderedPrims)
...
__global__ void computeIntersectionsBVH(
int depth, int num_paths,
PathSegment* pathSegments,
Geom* geoms, int geoms_size,
Triangle* triangles, int triangles_size,
LinearBVHNode* bvhNodes,
ShadeableIntersection* intersections)
```
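
The stack-based traversal over a linear node array can be sketched on the CPU as below. The node layout is an assumption made for illustration (left child stored directly after its parent, right child referenced by index); LinearNode, hitAABB and traverse are hypothetical names, and the sketch only collects candidate primitive indices instead of running the actual primitive intersection tests.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct AABB { float lo[3], hi[3]; };

// Hypothetical flattened BVH node. For interior nodes the left child sits at
// (own index + 1) and secondChild is the index of the right child.
struct LinearNode {
    AABB bounds;
    int secondChild;
    int firstPrim;  // first primitive index (leaves only)
    int primCount;  // > 0 marks a leaf
};

// Slab test of a ray (origin o, reciprocal direction invD) against a box,
// restricted to the parametric range [0, tMax].
bool hitAABB(const AABB& b, const float o[3], const float invD[3], float tMax) {
    float t0 = 0.0f, t1 = tMax;
    for (int a = 0; a < 3; ++a) {
        float tNear = (b.lo[a] - o[a]) * invD[a];
        float tFar = (b.hi[a] - o[a]) * invD[a];
        if (tNear > tFar) std::swap(tNear, tFar);
        t0 = std::max(t0, tNear);
        t1 = std::min(t1, tFar);
        if (t0 > t1) return false;
    }
    return true;
}

// Iterative traversal with an explicit fixed-size stack, as a GPU kernel
// would do in place of recursion. Returns the primitive indices of every
// leaf whose bounds the ray hits.
std::vector<int> traverse(const std::vector<LinearNode>& nodes,
                          const float o[3], const float d[3], float tMax) {
    std::vector<int> candidates;
    if (nodes.empty()) return candidates;
    const float invD[3] = {1.0f / d[0], 1.0f / d[1], 1.0f / d[2]};
    int stack[64];
    int top = 0;
    stack[top++] = 0;  // start at the root
    while (top > 0) {
        int idx = stack[--top];
        const LinearNode& node = nodes[idx];
        if (!hitAABB(node.bounds, o, invD, tMax)) continue;
        if (node.primCount > 0) {
            for (int i = 0; i < node.primCount; ++i)
                candidates.push_back(node.firstPrim + i);
        } else {
            assert(top + 2 <= 64);
            stack[top++] = node.secondChild;
            stack[top++] = idx + 1;  // left child is visited first
        }
    }
    return candidates;
}
```

A fixed-size stack in local memory avoids recursion, which is costly on the GPU; the depth of 64 is an assumed upper bound for this sketch.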
HDR Environment Map Loading (src/texture.cpp)

HDR data is transferred to CUDA texture objects for hardware-accelerated sampling. Creates cudaTextureObject_t with linear filtering and wrap addressing modes. Supports both HDR (float4) and standard (uchar4) texture formats with automatic format detection.
```
// Loading Process
envMap.loadToCPU(fullenvpath);           // CPU loading
envmapHandle = scene->envMap.loadToCuda(); // GPU transfer

// JSON Configuration
"EnvMap": {
  "PATH": "../scenes/Blue_stripe.hdr"
}
```

Performance

Stream Compaction

suzanne.json & Blue_stripe.hdr

Path Termination Detection

Condition: remainingBounces > 0
Purpose: To identify paths that still need to be traced.

Memory Compaction

Uses thrust::stable_partition.
Moves the active paths to the beginning of the array.

#if COMPACTION
if (depth % 2 == 1 || depth == traceDepth - 1) {
    auto lastPath = dev_paths + num_paths;
    auto mid = thrust::stable_partition(thrust::device, 
    dev_paths, lastPath, IsAlive{});
    num_paths = mid - dev_paths; 
}
#endif
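
The partition step can be illustrated with a CPU analogue that uses std::stable_partition in place of thrust::stable_partition. PathSegment is reduced to two fields here, the body of IsAlive is inferred from the remainingBounces > 0 condition above, and compactPaths is a hypothetical helper name.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Simplified stand-in for the tracer's path state.
struct PathSegment {
    int pixelIndex;
    int remainingBounces;
};

// Predicate matching the termination condition: a path is alive while it
// still has bounces left.
struct IsAlive {
    bool operator()(const PathSegment& p) const { return p.remainingBounces > 0; }
};

// Moves live paths to the front of the first numPaths entries, preserving
// their order, and returns how many paths are still alive. Terminated paths
// stay in the tail of the array, so their data is not lost.
int compactPaths(std::vector<PathSegment>& paths, int numPaths) {
    assert(numPaths >= 0 && numPaths <= static_cast<int>(paths.size()));
    auto mid = std::stable_partition(paths.begin(), paths.begin() + numPaths,
                                     IsAlive{});
    return static_cast<int>(mid - paths.begin());
}
```

Subsequent kernels then launch over only the returned count, so threads are no longer spent on terminated paths.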

BETTER_RANDOM

Replaces the hash used to seed the per-thread random number generator with a faster 32-bit hash intended to improve distribution quality.

Control: #define BETTER_RANDOM 1 (currently enabled)

Original Method: utilhash() - Simple bit-manipulation hash function
int h = utilhash((1 << 31) | (depth << 22) | iter) ^ utilhash(index);

Improved Method: fastHash() - Optimized 32-bit hash algorithm
uint32_t seed = index + (iter << 16) + (depth << 8);
return thrust::default_random_engine(fastHash(seed));

Expected Benefits: Fewer hash collisions and a more uniform random distribution
Applications: Ray generation, material sampling, antialiasing, depth of field effects

Russian Roulette Ray Termination

Implemented probabilistic path termination to reduce the computation spent on low-contribution rays.

Termination Threshold: Applied when remainingBounces < 3

Survival Probability: 80% chance to continue (20% chance to terminate)

Importance Weighting: Surviving rays are scaled by 1.25f (1 / 0.8) so that the estimate remains unbiased

Control: Toggleable via #define RUSSIAN_ROULETTE 0 (currently disabled)
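
The weighting rule can be checked with a small CPU sketch. It treats throughput as a single float and uses std::mt19937 in place of the Thrust engine; russianRoulette is a hypothetical helper name, not the project's function.

```cpp
#include <cassert>
#include <random>

// Applies Russian roulette to one path. Returns true if the path survives.
// A surviving path has its throughput divided by the survival probability
// (for 0.8 this is a factor of 1.25), which keeps the expected contribution
// equal to the original value. A terminated path contributes nothing.
bool russianRoulette(float& throughput, float survivalProb, std::mt19937& rng) {
    assert(survivalProb > 0.0f && survivalProb <= 1.0f);
    std::uniform_real_distribution<float> u01(0.0f, 1.0f);
    if (u01(rng) >= survivalProb) {
        throughput = 0.0f;
        return false;
    }
    throughput /= survivalProb;
    return true;
}
```

Averaged over many trials, the throughput stays at its original value: 0.8 * 1.25 + 0.2 * 0 = 1.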

Bloopers

BSDF sampling error: incorrect cosine-weighted sampling in calculateRandomDirectionInHemisphere().

Glass material error
