97 changes: 92 additions & 5 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -3,10 +3,97 @@ Vulkan Grass Rendering

**University of Pennsylvania, CIS 565: GPU Programming and Architecture, Project 5**

* (TODO) YOUR NAME HERE
* Tested on: (TODO) Windows 22, i7-2222 @ 2.22GHz 22GB, GTX 222 222MB (Moore 2222 Lab)
* Harris Kokkinakos
* [LinkedIn](https://www.linkedin.com/in/haralambos-kokkinakos-5311a3210/), [personal website](https://harriskoko.github.io/Harris-Projects/)
* Tested on: Windows 24H2, i9-12900H @ 2.50GHz 16GB, RTX 3070TI Mobile

### Description
This project implements a Vulkan-based version of Responsive Real-Time Grass Rendering, adapted from Jahrmann & Wimmer (2017).
The goal of this project is to produce a satisfying and physically accurate representation of grass.
A compute pass performs physics evaluation and culling, while the tessellation pipeline generates detailed curved blades using control points derived from a quadratic Bézier model.

RESULTS
================
![gif](img/my_grass.gif)

IMPLEMENTATION
================

### Grass Representation
Each blade is defined by three control points v0, v1, v2 forming a quadratic Bézier curve. Additionally, each blade has attributes including height, width, stiffness, up vector, and orientation.
* v0 = root fixed to terrain
* v2 = tip affected by forces
* v1 = intermediate control derived from v0,v2

![bez](img/blade_model.jpg)
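
As a rough illustration of the blade model, the sketch below evaluates a point on the quadratic Bézier curve with De Casteljau's algorithm. The `Vec3` type and the `mix` and `bladeSpine` names are hypothetical stand-ins for this example; in the project this evaluation would live in the tessellation evaluation shader rather than in C++.

```cpp
// Minimal 3D vector type for this sketch (the shaders would use GLSL's vec3).
struct Vec3 { float x, y, z; };

// Linear interpolation between a and b, named after GLSL's mix().
Vec3 mix(const Vec3& a, const Vec3& b, float t) {
    return { a.x + (b.x - a.x) * t,
             a.y + (b.y - a.y) * t,
             a.z + (b.z - a.z) * t };
}

// Point on the blade's spine at parameter t in [0, 1].
// t = 0 gives the root v0 and t = 1 gives the tip v2.
Vec3 bladeSpine(const Vec3& v0, const Vec3& v1, const Vec3& v2, float t) {
    Vec3 a = mix(v0, v1, t);
    Vec3 b = mix(v1, v2, t);
    return mix(a, b, t);
}
```

The curve passes through v0 and v2 but only bends toward v1, which is why v1 can be derived from the other two points instead of being simulated directly.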

### Physics
For each frame, the compute shader updates all blades in parallel. It computes gravity, recovery, and wind forces and applies them to each grass blade.

We separate gravity into two terms, environmental (gE) and front (gF).

gE is the environmental gravity vector applied uniformly to all blades.
It represents the constant downward pull of gravity on the tip of each blade, modeled as:

![ge](img/ge.png)

gF is the front-facing gravity component, added to tilt the blade slightly in the direction it’s facing, producing a more natural lean instead of purely vertical bending.

![gf](img/gf.png)

These two forces are added together to get the total gravity force.
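
The sketch below shows one way the two gravity terms could be combined. It assumes the forms given in the paper (gE is the normalized gravity direction scaled by its magnitude, and gF is a quarter of the length of gE along the blade's front vector); the `Vec3` type and function names are hypothetical and not taken from the shader.

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };

float length(const Vec3& v) { return std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z); }
Vec3 scale(const Vec3& v, float s) { return { v.x * s, v.y * s, v.z * s }; }
Vec3 add(const Vec3& a, const Vec3& b) { return { a.x + b.x, a.y + b.y, a.z + b.z }; }

// gE: the gravity direction, normalized and scaled by the gravity magnitude.
Vec3 environmentalGravity(const Vec3& direction, float magnitude) {
    return scale(direction, magnitude / length(direction));
}

// gF: a quarter of |gE|, pointing along the blade's front vector.
Vec3 frontGravity(const Vec3& gE, const Vec3& front) {
    return scale(front, 0.25f * length(gE));
}

// Total gravity G = gE + gF.
Vec3 totalGravity(const Vec3& direction, float magnitude, const Vec3& front) {
    Vec3 gE = environmentalGravity(direction, magnitude);
    return add(gE, frontGravity(gE, front));
}
```

With a downward direction, a magnitude of 9.8, and a front vector along +x, this yields roughly (2.45, -9.8, 0).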

The recovery force pulls each blade tip back toward its rest position, counteracting the bending caused by gravity and wind. It acts like a spring on the tip and is modeled as:

![r](img/r.png)

In this implementation, I replace the final term with the constant 0.1, which simplifies the computation without a noticeable loss in quality.

The wind force introduces dynamic, time-dependent bending to simulate airflow across the grass field.
Instead of using precomputed flow fields like in the paper, this implementation defines wind procedurally using trigonometric variation over both time and position, giving a natural wave motion that travels across the scene.
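
To illustrate the idea, here is a hypothetical procedural wind function. The direction, amplitude, and frequencies are made-up values for this sketch and are not the ones used in the compute shader; the point is only that the strength depends on both the blade position and the elapsed time, so gusts appear to move across the field.

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };

// Hypothetical wind: blows along +x with a strength that oscillates
// between 0 and kAmplitude as a wave over position and time.
Vec3 windAt(const Vec3& v0, float time) {
    const float kAmplitude   = 2.0f;  // assumed peak strength
    const float kSpatialFreq = 0.5f;  // assumed wave frequency over the ground plane
    const float kSpeed       = 1.5f;  // assumed wave speed over time
    float phase = kSpatialFreq * (v0.x + v0.z) + kSpeed * time;
    float strength = kAmplitude * (0.5f + 0.5f * std::sin(phase));
    return { strength, 0.0f, 0.0f };
}
```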

All contributions sum into a total force F = G + R + W, updating v2 with time-step Δt.
The algorithm enforces length preservation and clamps vertical penetration, mirroring section 5.2 of the paper’s responsive model.
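
The correction step can be sketched as follows. This follows my reading of section 5.2 of the paper (ground clamp, v1 placement from the projected length, then a length rescale for a degree-2 curve); the `Vec3` helpers and the `validateBlade` name are hypothetical, and the shader itself may differ in detail.

```cpp
#include <algorithm>
#include <cmath>

struct Vec3 { float x, y, z; };
Vec3 operator+(Vec3 a, Vec3 b) { return { a.x + b.x, a.y + b.y, a.z + b.z }; }
Vec3 operator-(Vec3 a, Vec3 b) { return { a.x - b.x, a.y - b.y, a.z - b.z }; }
Vec3 operator*(Vec3 a, float s) { return { a.x * s, a.y * s, a.z * s }; }
float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
float length(Vec3 a) { return std::sqrt(dot(a, a)); }

struct BladeState { Vec3 v1, v2; };

// v2 is the tip after the forces have been applied for this frame.
BladeState validateBlade(Vec3 v0, Vec3 v2, Vec3 up, float height) {
    // 1. Clamp: push the tip back up if it dropped below the ground plane at v0.
    v2 = v2 - up * std::min(dot(up, v2 - v0), 0.0f);

    // 2. Place v1 above the root based on how far the tip leans sideways.
    Vec3 d = v2 - v0;
    float ratio = length(d - up * dot(d, up)) / height;
    Vec3 v1 = v0 + up * (height * std::max(1.0f - ratio, 0.05f * std::max(ratio, 1.0f)));

    // 3. Rescale so the estimated curve length matches the blade height.
    float L0 = length(v2 - v0);
    float L1 = length(v1 - v0) + length(v2 - v1);
    float L = (2.0f * L0 + L1) / 3.0f;  // estimate for a degree-2 Bezier curve
    float r = height / L;
    Vec3 v1c = v0 + (v1 - v0) * r;
    Vec3 v2c = v1c + (v2 - v1) * r;
    return { v1c, v2c };
}
```

An upright blade passes through unchanged, while a tip pushed below the ground ends up back on the ground plane.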

### Culling
To maintain real-time performance, grass blades are culled directly on the GPU before rendering.
Each compute shader invocation decides whether a blade should be drawn based on its orientation, visibility, and distance relative to the camera.
Blades that pass all culling tests are written into a culled buffer and counted atomically for indirect drawing.
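
The compaction step can be pictured with the CPU-side sketch below, where `std::atomic::fetch_add` plays the role of GLSL's `atomicAdd`. The `Blade` struct here is a placeholder with a single field, and `appendIfVisible` is a hypothetical name; in the shader the counter would be the vertex count of the indirect draw struct, which presumably has to be reset to zero at the start of each frame.

```cpp
#include <atomic>
#include <cstdint>
#include <vector>

// Placeholder blade; the real struct carries the control points and attributes.
struct Blade { int id; };

// Each invocation that keeps its blade reserves the next free slot in the
// output buffer by atomically incrementing the shared counter.
void appendIfVisible(const Blade& blade, bool culled,
                     std::vector<Blade>& culledBlades,
                     std::atomic<std::uint32_t>& count) {
    if (culled) {
        return;
    }
    std::uint32_t slot = count.fetch_add(1);  // returns the value before the add
    culledBlades[slot] = blade;
}
```

Because every surviving blade gets a unique slot, the output buffer ends up densely packed and the final counter value is exactly the number of blades to draw.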

Orientation culling removes blades that are almost parallel to the camera’s view direction (in this case, within 10%).
When a blade is seen nearly from the side, its thin geometry contributes little visually but adds unnecessary tessellation cost.

The algorithm computes the dot product between the camera forward vector and the blade’s front direction:

![ori](img/ori.png)
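
A minimal sketch of this test is shown below. It assumes both vectors are normalized and that "within 10%" corresponds to a threshold of 0.9 on the absolute dot product, which is the value the paper uses; `bladeDir` stands for whichever blade direction the shader compares against (this README names the front direction, while the paper uses the vector along the blade's width). All names are hypothetical.

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };

float dot(const Vec3& a, const Vec3& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Returns true when the blade should be culled because its direction is
// nearly parallel to the view direction. Both inputs are assumed normalized.
bool orientationCulled(const Vec3& viewDir, const Vec3& bladeDir, float threshold) {
    return std::fabs(dot(viewDir, bladeDir)) > threshold;
}
```

Taking the absolute value means the blade is treated the same whether the vector points toward or away from the camera.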

Frustum culling discards blades outside the camera’s visible volume.
Each blade’s base (v0), tip (v2), and midpoint (m) are transformed into clip space:

![view](img/frustum.png)

If all three points lie outside the frustum, the blade is culled.
This ensures that only blades potentially visible in the camera’s view are sent to tessellation and rasterization.
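
The test can be sketched on the CPU as follows. The matrix type, the row-major layout, the tolerance parameter, and the choice of m as the curve point at t = 0.5 are assumptions made for this example, following the paper's description rather than the shader source.

```cpp
#include <array>
#include <cmath>

struct Vec3 { float x, y, z; };
struct Vec4 { float x, y, z, w; };
using Mat4 = std::array<float, 16>;  // row-major for this sketch

// Transform a point (w = 1) into clip space.
Vec4 toClip(const Mat4& m, const Vec3& p) {
    return { m[0]  * p.x + m[1]  * p.y + m[2]  * p.z + m[3],
             m[4]  * p.x + m[5]  * p.y + m[6]  * p.z + m[7],
             m[8]  * p.x + m[9]  * p.y + m[10] * p.z + m[11],
             m[12] * p.x + m[13] * p.y + m[14] * p.z + m[15] };
}

// A clip-space point is inside when x, y and z lie within [-h, h], h = w + tolerance.
bool inFrustum(const Vec4& c, float tolerance) {
    float h = c.w + tolerance;
    return std::fabs(c.x) <= h && std::fabs(c.y) <= h && std::fabs(c.z) <= h;
}

// Cull only if the root, the midpoint and the tip are all outside.
bool frustumCulled(const Mat4& viewProj, const Vec3& v0, const Vec3& v1,
                   const Vec3& v2, float tolerance) {
    Vec3 m = { 0.25f * v0.x + 0.5f * v1.x + 0.25f * v2.x,
               0.25f * v0.y + 0.5f * v1.y + 0.25f * v2.y,
               0.25f * v0.z + 0.5f * v1.z + 0.25f * v2.z };
    return !inFrustum(toClip(viewProj, v0), tolerance) &&
           !inFrustum(toClip(viewProj, m), tolerance) &&
           !inFrustum(toClip(viewProj, v2), tolerance);
}
```

A small positive tolerance keeps blades whose control points are just outside the view but whose width may still reach into it.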

Distance culling removes blades too far from the camera (in this case, more than 25 units away).
It computes the projected horizontal distance from the blade’s root (v0) to the camera position (camPos):

![dist](img/dist.png)
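
As a sketch, the distance test could look like the code below. It assumes the projected distance is the length of (v0 - camPos) with its component along the up vector removed, and uses a plain threshold as described above rather than the paper's bucketed falloff. The function names are hypothetical.

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };

float dot(const Vec3& a, const Vec3& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Distance from the camera to the blade root, measured along the ground plane.
float projectedDistance(const Vec3& v0, const Vec3& camPos, const Vec3& up) {
    Vec3 d = { v0.x - camPos.x, v0.y - camPos.y, v0.z - camPos.z };
    float along = dot(d, up);  // vertical part of the offset
    Vec3 flat = { d.x - up.x * along, d.y - up.y * along, d.z - up.z * along };
    return std::sqrt(dot(flat, flat));
}

// Cull blades whose projected distance exceeds maxDistance (25 in this project).
bool distanceCulled(const Vec3& v0, const Vec3& camPos, const Vec3& up, float maxDistance) {
    return projectedDistance(v0, camPos, up) > maxDistance;
}
```

Removing the vertical component means that raising the camera does not by itself cause nearby grass to be culled.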

If a grass blade passes all three of these tests, it can be rendered.

### Performance

This Vulkan Grass Renderer was tested at varying numbers of grass blades as shown below.

![p1](img/performance.png)

As we can see from this chart, the renderer maintains high frame rates even with a very large number of grass blades to simulate. The steep falloff in performance appears because each test doubles the number of grass blades, so the workload grows exponentially along the axis.

Additionally, we can calculate the performance increase due to the culling optimizations.

![p2](img/culling.png)

As this chart shows, at 65536 grass blades, the three culling methods together give an improvement of almost 100 FPS. This equates to a speedup of over 50% for the renderer, which makes culling a substantial optimization. This test primarily measures frustum and orientation culling; distance culling had little effect because the camera was close to the grass. Distance culling adds further improvement in games and other scenes where grass far from the camera or player does not need to be rendered.

### (TODO: Your README)

*DO NOT* leave the README to the last minute! It is a crucial part of the
project, and we will not be able to grade you without a good README.
Binary file added img/culling.png
Binary file added img/dist.png
Binary file added img/dproj.png
Binary file added img/frustum.png
Binary file added img/ge.png
Binary file added img/gf.png
Binary file added img/my_grass.gif
Binary file added img/ori.png
Binary file added img/performance.png
Binary file added img/r.png
4 changes: 2 additions & 2 deletions src/Blades.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -45,7 +45,7 @@ Blades::Blades(Device* device, VkCommandPool commandPool, float planeDim) : Mode
indirectDraw.firstInstance = 0;

BufferUtils::CreateBufferFromData(device, commandPool, blades.data(), NUM_BLADES * sizeof(Blade), VK_BUFFER_USAGE_STORAGE_BUFFER_BIT, bladesBuffer, bladesBufferMemory);
BufferUtils::CreateBuffer(device, NUM_BLADES * sizeof(Blade), VK_BUFFER_USAGE_STORAGE_BUFFER_BIT, VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT, culledBladesBuffer, culledBladesBufferMemory);
BufferUtils::CreateBuffer(device, NUM_BLADES * sizeof(Blade), VK_BUFFER_USAGE_VERTEX_BUFFER_BIT | VK_BUFFER_USAGE_STORAGE_BUFFER_BIT, VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT, culledBladesBuffer, culledBladesBufferMemory);
BufferUtils::CreateBufferFromData(device, commandPool, &indirectDraw, sizeof(BladeDrawIndirect), VK_BUFFER_USAGE_STORAGE_BUFFER_BIT | VK_BUFFER_USAGE_INDIRECT_BUFFER_BIT, numBladesBuffer, numBladesBufferMemory);
}

Expand All @@ -68,4 +68,4 @@ Blades::~Blades() {
vkFreeMemory(device->GetVkDevice(), culledBladesBufferMemory, nullptr);
vkDestroyBuffer(device->GetVkDevice(), numBladesBuffer, nullptr);
vkFreeMemory(device->GetVkDevice(), numBladesBufferMemory, nullptr);
}
}
2 changes: 1 addition & 1 deletion src/Blades.h
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,7 @@
#include <array>
#include "Model.h"

constexpr static unsigned int NUM_BLADES = 1 << 13;
constexpr static unsigned int NUM_BLADES = 1 << 12;
constexpr static float MIN_HEIGHT = 1.3f;
constexpr static float MAX_HEIGHT = 2.5f;
constexpr static float MIN_WIDTH = 0.1f;
Expand Down
166 changes: 153 additions & 13 deletions src/Renderer.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -9,7 +9,7 @@
static constexpr unsigned int WORKGROUP_SIZE = 32;

Renderer::Renderer(Device* device, SwapChain* swapChain, Scene* scene, Camera* camera)
: device(device),
: device(device),
logicalDevice(device->GetVkDevice()),
swapChain(swapChain),
scene(scene),
Expand Down Expand Up @@ -198,6 +198,41 @@ void Renderer::CreateComputeDescriptorSetLayout() {
// TODO: Create the descriptor set layout for the compute pipeline
// Remember this is like a class definition stating what types of information
// will be stored at each binding
// Input Blades
VkDescriptorSetLayoutBinding inputBladesBinding = {};
inputBladesBinding.binding = 0;
inputBladesBinding.descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER;
inputBladesBinding.descriptorCount = 1;
inputBladesBinding.stageFlags = VK_SHADER_STAGE_COMPUTE_BIT;
inputBladesBinding.pImmutableSamplers = nullptr;

// Culled Blades
VkDescriptorSetLayoutBinding culledBladesBinding = {};
culledBladesBinding.binding = 1;
culledBladesBinding.descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER;
culledBladesBinding.descriptorCount = 1;
culledBladesBinding.stageFlags = VK_SHADER_STAGE_COMPUTE_BIT;
culledBladesBinding.pImmutableSamplers = nullptr;

// Num Blades
VkDescriptorSetLayoutBinding numBladesBinding = {};
numBladesBinding.binding = 2;
numBladesBinding.descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER;
numBladesBinding.descriptorCount = 1;
numBladesBinding.stageFlags = VK_SHADER_STAGE_COMPUTE_BIT;
numBladesBinding.pImmutableSamplers = nullptr;

std::vector<VkDescriptorSetLayoutBinding> bindings = { inputBladesBinding, culledBladesBinding, numBladesBinding };

// Descriptor set layout
VkDescriptorSetLayoutCreateInfo layoutInfo = {};
layoutInfo.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_CREATE_INFO;
layoutInfo.bindingCount = static_cast<uint32_t>(bindings.size());
layoutInfo.pBindings = bindings.data();

if (vkCreateDescriptorSetLayout(logicalDevice, &layoutInfo, nullptr, &computeDescriptorSetLayout) != VK_SUCCESS) {
throw std::runtime_error("Failed to create compute descriptor set layout");
}
}

void Renderer::CreateDescriptorPool() {
Expand All @@ -216,6 +251,8 @@ void Renderer::CreateDescriptorPool() {
{ VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER , 1 },

// TODO: Add any additional types and counts of descriptors you will need to allocate
// 3 storage buffers. input, culled, and num blades
{ VK_DESCRIPTOR_TYPE_STORAGE_BUFFER , static_cast<uint32_t>(3 * scene->GetBlades().size()) },
};

VkDescriptorPoolCreateInfo poolInfo = {};
Expand Down Expand Up @@ -318,8 +355,42 @@ void Renderer::CreateModelDescriptorSets() {
}

void Renderer::CreateGrassDescriptorSets() {
// TODO: Create Descriptor sets for the grass.
// This should involve creating descriptor sets which point to the model matrix of each group of grass blades
grassDescriptorSets.resize(scene->GetBlades().size());

// Describe the descriptor set
// pSetLayouts must hold one layout per descriptor set being allocated
std::vector<VkDescriptorSetLayout> layouts(grassDescriptorSets.size(), modelDescriptorSetLayout);
VkDescriptorSetAllocateInfo allocInfo = {};
allocInfo.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_ALLOCATE_INFO;
allocInfo.descriptorPool = descriptorPool;
allocInfo.descriptorSetCount = static_cast<uint32_t>(grassDescriptorSets.size());
allocInfo.pSetLayouts = layouts.data();

// Allocate descriptor sets
if (vkAllocateDescriptorSets(logicalDevice, &allocInfo, grassDescriptorSets.data()) != VK_SUCCESS) {
throw std::runtime_error("Failed to allocate descriptor set");
}

std::vector<VkWriteDescriptorSet> descriptorWrites(grassDescriptorSets.size());
// The buffer infos must stay alive until vkUpdateDescriptorSets reads them through pBufferInfo
std::vector<VkDescriptorBufferInfo> modelBufferInfos(grassDescriptorSets.size());

for (uint32_t i = 0; i < scene->GetBlades().size(); ++i) {
modelBufferInfos[i].buffer = scene->GetBlades()[i]->GetModelBuffer();
modelBufferInfos[i].offset = 0;
modelBufferInfos[i].range = sizeof(ModelBufferObject);

descriptorWrites[i].sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET;
descriptorWrites[i].dstSet = grassDescriptorSets[i];
descriptorWrites[i].dstBinding = 0;
descriptorWrites[i].dstArrayElement = 0;
descriptorWrites[i].descriptorType = VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER;
descriptorWrites[i].descriptorCount = 1;
descriptorWrites[i].pBufferInfo = &modelBufferInfos[i];
descriptorWrites[i].pImageInfo = nullptr;
descriptorWrites[i].pTexelBufferView = nullptr;
}

// Update descriptor sets
vkUpdateDescriptorSets(logicalDevice, static_cast<uint32_t>(descriptorWrites.size()), descriptorWrites.data(), 0, nullptr);
}

void Renderer::CreateTimeDescriptorSet() {
Expand Down Expand Up @@ -360,6 +431,70 @@ void Renderer::CreateTimeDescriptorSet() {
void Renderer::CreateComputeDescriptorSets() {
// TODO: Create Descriptor sets for the compute pipeline
// The descriptors should point to Storage buffers which will hold the grass blades, the culled grass blades, and the output number of grass blades
computeDescriptorSets.resize(scene->GetBlades().size());

// Describe the descriptor set
// pSetLayouts must hold one layout per descriptor set being allocated
std::vector<VkDescriptorSetLayout> layouts(computeDescriptorSets.size(), computeDescriptorSetLayout);
VkDescriptorSetAllocateInfo allocInfo = {};
allocInfo.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_ALLOCATE_INFO;
allocInfo.descriptorPool = descriptorPool;
allocInfo.descriptorSetCount = static_cast<uint32_t>(computeDescriptorSets.size());
allocInfo.pSetLayouts = layouts.data();

// Allocate descriptor sets
if (vkAllocateDescriptorSets(logicalDevice, &allocInfo, computeDescriptorSets.data()) != VK_SUCCESS) {
throw std::runtime_error("Failed to allocate compute descriptor set");
}

std::vector<VkWriteDescriptorSet> descriptorWrites(3 * computeDescriptorSets.size());
// The buffer infos must stay alive until vkUpdateDescriptorSets reads them through pBufferInfo
std::vector<VkDescriptorBufferInfo> bufferInfos(3 * computeDescriptorSets.size());

for (uint32_t i = 0; i < scene->GetBlades().size(); ++i) {
// Binding 0: Input blades buffer
bufferInfos[3 * i + 0].buffer = scene->GetBlades()[i]->GetBladesBuffer();
bufferInfos[3 * i + 0].offset = 0;
bufferInfos[3 * i + 0].range = NUM_BLADES * sizeof(Blade);

// Binding 1: Culled blades buffer
bufferInfos[3 * i + 1].buffer = scene->GetBlades()[i]->GetCulledBladesBuffer();
bufferInfos[3 * i + 1].offset = 0;
bufferInfos[3 * i + 1].range = NUM_BLADES * sizeof(Blade);

// Binding 2: Num blades buffer
bufferInfos[3 * i + 2].buffer = scene->GetBlades()[i]->GetNumBladesBuffer();
bufferInfos[3 * i + 2].offset = 0;
bufferInfos[3 * i + 2].range = sizeof(BladeDrawIndirect);

// All three bindings are storage buffers in the same descriptor set
for (uint32_t b = 0; b < 3; ++b) {
descriptorWrites[3 * i + b].sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET;
descriptorWrites[3 * i + b].dstSet = computeDescriptorSets[i];
descriptorWrites[3 * i + b].dstBinding = b;
descriptorWrites[3 * i + b].dstArrayElement = 0;
descriptorWrites[3 * i + b].descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER;
descriptorWrites[3 * i + b].descriptorCount = 1;
descriptorWrites[3 * i + b].pBufferInfo = &bufferInfos[3 * i + b];
}
}

// Update descriptor sets
vkUpdateDescriptorSets(logicalDevice, static_cast<uint32_t>(descriptorWrites.size()), descriptorWrites.data(), 0, nullptr);

}

void Renderer::CreateGraphicsPipeline() {
Expand Down Expand Up @@ -717,7 +852,7 @@ void Renderer::CreateComputePipeline() {
computeShaderStageInfo.pName = "main";

// TODO: Add the compute descriptor set layout you create to this list
std::vector<VkDescriptorSetLayout> descriptorSetLayouts = { cameraDescriptorSetLayout, timeDescriptorSetLayout };
std::vector<VkDescriptorSetLayout> descriptorSetLayouts = { cameraDescriptorSetLayout, timeDescriptorSetLayout, computeDescriptorSetLayout};

// Create pipeline layout
VkPipelineLayoutCreateInfo pipelineLayoutInfo = {};
Expand Down Expand Up @@ -795,11 +930,11 @@ void Renderer::CreateFrameResources() {
);

depthImageView = Image::CreateView(device, depthImage, depthFormat, VK_IMAGE_ASPECT_DEPTH_BIT);

// Transition the image for use as depth-stencil
Image::TransitionLayout(device, graphicsCommandPool, depthImage, depthFormat, VK_IMAGE_LAYOUT_UNDEFINED, VK_IMAGE_LAYOUT_DEPTH_STENCIL_ATTACHMENT_OPTIMAL);


// CREATE FRAMEBUFFERS
framebuffers.resize(swapChain->GetCount());
for (size_t i = 0; i < swapChain->GetCount(); i++) {
Expand Down Expand Up @@ -884,6 +1019,10 @@ void Renderer::RecordComputeCommandBuffer() {
vkCmdBindDescriptorSets(computeCommandBuffer, VK_PIPELINE_BIND_POINT_COMPUTE, computePipelineLayout, 1, 1, &timeDescriptorSet, 0, nullptr);

// TODO: For each group of blades bind its descriptor set and dispatch
for (size_t i = 0; i < computeDescriptorSets.size(); i++) {
vkCmdBindDescriptorSets(computeCommandBuffer, VK_PIPELINE_BIND_POINT_COMPUTE, computePipelineLayout, 2, 1, &computeDescriptorSets[i], 0, nullptr);
vkCmdDispatch(computeCommandBuffer, (NUM_BLADES + WORKGROUP_SIZE - 1) / WORKGROUP_SIZE, 1, 1);
}

// ~ End recording ~
if (vkEndCommandBuffer(computeCommandBuffer) != VK_SUCCESS) {
Expand Down Expand Up @@ -975,14 +1114,13 @@ void Renderer::RecordCommandBuffers() {
for (uint32_t j = 0; j < scene->GetBlades().size(); ++j) {
VkBuffer vertexBuffers[] = { scene->GetBlades()[j]->GetCulledBladesBuffer() };
VkDeviceSize offsets[] = { 0 };
// TODO: Uncomment this when the buffers are populated
// vkCmdBindVertexBuffers(commandBuffers[i], 0, 1, vertexBuffers, offsets);
vkCmdBindVertexBuffers(commandBuffers[i], 0, 1, vertexBuffers, offsets);

// TODO: Bind the descriptor set for each grass blades model
// Bind the descriptor set for each grass blades model
vkCmdBindDescriptorSets(commandBuffers[i], VK_PIPELINE_BIND_POINT_GRAPHICS, grassPipelineLayout, 1, 1, &grassDescriptorSets[j], 0, nullptr);

// Draw
// TODO: Uncomment this when the buffers are populated
// vkCmdDrawIndirect(commandBuffers[i], scene->GetBlades()[j]->GetNumBladesBuffer(), 0, 1, sizeof(BladeDrawIndirect));
vkCmdDrawIndirect(commandBuffers[i], scene->GetBlades()[j]->GetNumBladesBuffer(), 0, 1, sizeof(BladeDrawIndirect));
}

// End render pass
Expand Down Expand Up @@ -1045,7 +1183,7 @@ Renderer::~Renderer() {

vkFreeCommandBuffers(logicalDevice, graphicsCommandPool, static_cast<uint32_t>(commandBuffers.size()), commandBuffers.data());
vkFreeCommandBuffers(logicalDevice, computeCommandPool, 1, &computeCommandBuffer);

vkDestroyPipeline(logicalDevice, graphicsPipeline, nullptr);
vkDestroyPipeline(logicalDevice, grassPipeline, nullptr);
vkDestroyPipeline(logicalDevice, computePipeline, nullptr);
Expand All @@ -1054,14 +1192,16 @@ Renderer::~Renderer() {
vkDestroyPipelineLayout(logicalDevice, grassPipelineLayout, nullptr);
vkDestroyPipelineLayout(logicalDevice, computePipelineLayout, nullptr);


vkDestroyDescriptorSetLayout(logicalDevice, cameraDescriptorSetLayout, nullptr);
vkDestroyDescriptorSetLayout(logicalDevice, modelDescriptorSetLayout, nullptr);
vkDestroyDescriptorSetLayout(logicalDevice, timeDescriptorSetLayout, nullptr);
vkDestroyDescriptorSetLayout(logicalDevice, computeDescriptorSetLayout, nullptr);

vkDestroyDescriptorPool(logicalDevice, descriptorPool, nullptr);

vkDestroyRenderPass(logicalDevice, renderPass, nullptr);
DestroyFrameResources();
vkDestroyCommandPool(logicalDevice, computeCommandPool, nullptr);
vkDestroyCommandPool(logicalDevice, graphicsCommandPool, nullptr);
}
}
6 changes: 6 additions & 0 deletions src/Renderer.h
Original file line number Diff line number Diff line change
Expand Up @@ -79,4 +79,10 @@ class Renderer {

std::vector<VkCommandBuffer> commandBuffers;
VkCommandBuffer computeCommandBuffer;

// Added:
VkDescriptorSetLayout computeDescriptorSetLayout;

std::vector<VkDescriptorSet> grassDescriptorSets;
std::vector<VkDescriptorSet> computeDescriptorSets;
};