diff --git a/INSTRUCTION.md b/INSTRUCTION.md index e85a2ab..c501d54 100644 --- a/INSTRUCTION.md +++ b/INSTRUCTION.md @@ -92,11 +92,13 @@ We recommend starting with trying to display the grass blades without any forces In this project, grass blades will be represented as Bezier curves while performing physics calculations and culling operations. Each Bezier curve has three control points. + * `v0`: the position of the grass blade on the geomtry * `v1`: a Bezier curve guide that is always "above" `v0` with respect to the grass blade's up vector (explained soon) * `v2`: a physical guide for which we simulate forces on We also need to store per-blade characteristics that will help us simulate and tessellate our grass blades correctly. + * `up`: the blade's up vector, which corresponds to the normal of the geometry that the grass blade resides on at `v0` * Orientation: the orientation of the grass blade's face * Height: the height of the grass blade @@ -166,6 +168,7 @@ If all three points are outside of the view-frustum, we will cull the grass blad Similarly to orientation culling, we can end up with grass blades that at large distances are smaller than the size of a pixel. This could lead to additional artifacts in our renders. In this case, we can cull grass blades as a function of their distance from the camera. You are free to define two parameters here. + * A max distance afterwhich all grass blades will be culled. * A number of buckets to place grass blades between the camera and max distance into. @@ -181,20 +184,24 @@ The generated vertices will be passed to the tessellation evaluation shader, whe To build more intuition on how tessellation works, I highly recommend playing with the [HelloTessellation sample](https://github.com/CIS565-Fall-2017/Vulkan-Samples/tree/master/samples/5_helloTessellation) and reading this [tutorial on tessellation](https://ogldev.org/www/tutorial30/tutorial30.html). 
-## Extra Credit +## Extra Credit These extra credit are for reference only. It is encouraged to come up with your own idea! -### LOD +### LOD + Tessellate to varying levels of detail as a function of how far the grass blade is from the camera. For example, if the blade is very far, only generate four vertices in the tessellation control shader. You can experiment with different numbers of vertices and distance to see how does ### Occlusion culling + This type of culling only makes sense if our scene has additional objects aside from the plane and the grass blades. To receive this extra credit, you should first add more geometry in the scene (Cube, sphere, etc.). Then, cull grass blades that are occluded by other geometry. Hint: you can use a depth map to accomplish this! ### Interactive Grass + You can make the demo interactive by adding a GUI (e.g., using ImGui) to control parameters like wind force and direction. You could also add a controllable geometry (like a sphere) that physically interacts with the grass, pushing blades aside as it moves. -### Better rendering +### Better rendering + Enhance the final render by adding features like a skybox to create an immersive background, or by applying more sophisticated shading techniques to the grass for better lighting and color variation. You can check recent GDC on grass rendering for this. One good example is [Procedural Grass in 'Ghost of Tsushima'](https://www.youtube.com/watch?v=Ibe1JBF5i5Y). ## Resources @@ -210,7 +217,6 @@ The following resources may be useful for this project. * [RenderDoc blog on Vulkan](https://renderdoc.org/vulkan-in-30-minutes.html) * [Tessellation tutorial](https://ogldev.org/www/tutorial30/tutorial30.html) - ## Third-Party Code Policy * Use of any third-party code must be approved by asking on our Piazza. @@ -227,6 +233,7 @@ The following resources may be useful for this project. ### Performance Analysis The performance analysis is where you will investigate how... 
+ * Your renderer handles varying numbers of grass blades * The improvement you get by culling using each of the three culling tests @@ -235,6 +242,7 @@ The performance analysis is where you will investigate how... If you have modified any of the `CMakeLists.txt` files at all (aside from the list of `SOURCE_FILES`), mention it explicity. Beware of any build issues discussed on the Piazza. Open a GitHub pull request so that we can see that you have finished. + * The title should be "Project 5: YOUR NAME". * The template of the comment section of your pull request is attached below, you can do some copy and paste: * [Repo Link](https://link-to-your-repo) diff --git a/README.md b/README.md index 20ee451..3556d32 100644 --- a/README.md +++ b/README.md @@ -3,10 +3,150 @@ Vulkan Grass Rendering **University of Pennsylvania, CIS 565: GPU Programming and Architecture, Project 5** -* (TODO) YOUR NAME HERE -* Tested on: (TODO) Windows 22, i7-2222 @ 2.22GHz 22GB, GTX 222 222MB (Moore 2222 Lab) +* Muqiao Lei + + [LinkedIn](https://www.linkedin.com/in/muqiao-lei-633304242/) · [GitHub](https://github.com/rmurdock41) -### (TODO: Your README) +* Tested on: Windows 10, 11th Gen Intel(R) Core(TM) i7-11800H @ 2.30GHz 2.30 GHz, NVIDIA GeForce RTX 3060 Laptop GPU (Personal Computer) -*DO NOT* leave the README to the last minute! It is a crucial part of the -project, and we will not be able to grade you without a good README. +--- + +## Project Overview + +![](img/top.gif) + +This project implements a grass rendering system based on the paper [*Responsive Real-Time Grass Rendering for General 3D Scenes*](https://www.cg.tuwien.ac.at/research/publications/2017/JAHRMANN-2017-RRTG/JAHRMANN-2017-RRTG-draft.pdf). Each grass blade is represented as a Bezier curve and simulated on the GPU using compute shaders with physics forces (gravity, recovery, wind). The system generates blade geometry through tessellation and optimizes rendering with three culling techniques. 
A distance-based LOD system dynamically adjusts tessellation levels, a procedural skybox provides the background, and Lambert lighting with rim light enhances the grass shading. + +--- + +## Features + +### Bezier Curve Blade Representation + +![](img/rawGrass.gif) + +Each grass blade is represented as a quadratic Bezier curve with three control points. **v0** is the blade base position with its w component storing the orientation angle. **v1** is the middle Bezier control point with its w component storing the blade height. **v2** is the physics simulation target point with its w component storing the blade width. + + The **up** vector defines the blade's upward direction with its w component storing the stiffness coefficient. Grass blades are randomly generated in `Blades.cpp`, with position, height, width, and orientation randomly distributed across the plane. + +The initial position of v2 is at height above v0, with an added random horizontal offset (**bendOffset**) that gives blades different initial bending directions and angles for visual variety. + +The data is stored in three buffers: **bladesBuffer** stores input blade data, **culledBladesBuffer** stores blades after culling, and **numBladesBuffer** records indirect draw commands and the remaining blade count. + +--- + +![](img/grassWind.gif) + +#### GPU Physics Simulation - Gravity + +Physics simulation is performed in the compute shader, calculating forces on all grass blades each frame. Gravity consists of two components: environmental gravity `gE = vec3(0.0, -9.8, 0.0)` pointing toward the ground, and frontal gravity `gF = 0.25 * length(gE) * front`, where front is the blade's facing direction calculated by `cross(up, tangent)`. The total gravity is `gravity = gE + gF`, creating a natural drooping effect on the blades. 
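The gravity term described above can be sketched on the CPU as follows. This is a minimal illustration, not the project's compute shader: the `Vec3` helpers are hypothetical stand-ins for GLSL's `vec3`, the tangent is assumed to be built from the orientation angle in the same way the tessellation evaluation shader builds `t1`, and normalizing `front` is an assumption as well.

```cpp
#include <cassert>
#include <cmath>

// Minimal 3-component vector; a hypothetical stand-in for GLSL's vec3.
struct Vec3 {
    float x, y, z;
};

Vec3 operator+(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
Vec3 operator*(float s, Vec3 v) { return {s * v.x, s * v.y, s * v.z}; }

float length(Vec3 v) { return std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z); }

Vec3 cross(Vec3 a, Vec3 b) {
    return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}

Vec3 normalize(Vec3 v) {
    float len = length(v);
    assert(len > 0.0f);  // a zero-length vector has no direction
    return (1.0f / len) * v;
}

// Total gravity = environmental gravity (gE) + frontal gravity (gF).
// `up` is the blade's up vector; `orientation` is the angle stored in v0.w.
Vec3 bladeGravity(Vec3 up, float orientation) {
    const Vec3 gE = {0.0f, -9.8f, 0.0f};
    // Assumption: the tangent is derived from the orientation angle.
    Vec3 tangent = {std::sin(orientation), 0.0f, std::cos(orientation)};
    // Assumption: front is normalized so gF has magnitude 0.25 * |gE|.
    Vec3 front = normalize(cross(up, tangent));
    Vec3 gF = (0.25f * length(gE)) * front;
    return gE + gF;
}
```

For an upright blade (`up = (0, 1, 0)`) with orientation 0, the frontal term points along +x and has a quarter of the magnitude of the environmental term.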
+ +#### GPU Physics Simulation - Recovery Force + +Recovery force is calculated as `recovery = (iv2 - v2) * stiffness`, where `iv2 = v0 + up * height` is the initial position of v2 when the blade is upright. The force magnitude is controlled by the stiffness coefficient - higher stiffness produces stronger recovery force and stiffer blades. + +#### GPU Physics Simulation - Wind Force + +Wind force uses a sinusoidal wave pattern: `windWave = sin(totalTime * 2.0 + v0.x * 0.5 + v0.z * 0.5)`, which creates wave propagation across the grass field based on blade position. The base wind is calculated as `wind = windDirection * windStrength * windWave`. Directional alignment `directionalAlignment = 1.0 - abs(dot(normalize(wind), normalize(v2 - v0)))` and height ratio `heightRatio = dot(v2 - v0, up) / height` are calculated, with final wind force `wind *= directionalAlignment * heightRatio`. After summing all forces and updating v2 position, length correction `r = height / L` maintains blade length, where L is the current curve length calculated as `L = (2.0 * L0 + L1) / 3.0`. The corrected v1 and v2 are written back to the inputBlades buffer. + +--- + +#### Culling Techniques - Orientation Culling + +![](img/Oculling.gif) + +Orientation culling removes grass blades perpendicular to the view direction, as these blades occupy very few pixels on screen or are invisible. The camera position is calculated as `cameraPos = inverse(camera.view)[3].xyz` and view direction as `viewDir = normalize(v0 - cameraPos)`. The blade facing direction is computed as `bladeDir = normalize(cross(up, front))`. When `abs(dot(viewDir, bladeDir)) > 0.9`, the blade is nearly perpendicular to the view direction and is culled from rendering. + +#### Culling Techniques - View-Frustum Culling + +![](img/Fculling2.gif) + +View-frustum culling removes grass blades outside the camera's view. Three points are tested: v0 (base), v2 (tip), and m (midpoint, calculated as `m = 0.25 * v0 + 0.5 * v1 + 0.25 * v2`). 
These points are projected to clip space: `clipV0 = camera.proj * camera.view * vec4(v0, 1.0)`. For each clip space coordinate, `h = clipPos.w + tolerance` is calculated, where tolerance is 1.0 to provide some margin. A point is inside the frustum when `abs(clipPos.x) <= h && abs(clipPos.y) <= h && abs(clipPos.z) <= h`. A blade is only culled when all three points are outside the frustum. + +#### Culling Techniques - Distance Culling + +![](img/Dculling.gif) + +Distance culling performs probabilistic culling based on blade distance from the camera. The distance is calculated as `dist = length(v0 - cameraPos)`, and blades beyond maxDistance (50.0) are immediately culled. For blades within range, the distance is divided into 10 buckets, with `bucket = int(dist / bucketSize)`. A position-based hash function `hash = uint(v0.x * 12345.0 + v0.z * 67890.0 + index * 1000)` generates a pseudo-random value, and the cull probability is `cullProbability = float(bucket) / float(numBuckets)`. Farther buckets have higher cull probability, determined by `(hash % 100) / 100.0 < cullProbability`. Blades passing all culling tests are written to the culledBlades buffer using `atomicAdd`. + +--- + +### Tessellation and LOD System + +![](img/lod.gif) + +*Different LOD levels are visualized with colors* + +Grass blades passing all culling tests are sent to the graphics pipeline's tessellation stage. The **vertex shader** passes Bezier curve control points (**v0**, **v1**, **v2**, **up**) to the **tessellation control shader**. In the tessellation control shader, the distance between the blade and camera is calculated as `dist = length(cameraPos - bladePos)`, and tessellation level is dynamically set based on distance: **5 levels** for distances under 10 meters (high detail), **3 levels** for 10-25 meters (medium detail), and **2 levels** beyond 25 meters (low detail). 
This LOD system implements smooth transitions via `tessLevel = mix(5.0, 1.0, smoothstep(5.0, 50.0, dist))`, avoiding abrupt tessellation changes. Tessellation levels are set through **gl_TessLevelInner[0]** and **gl_TessLevelOuter**, controlling the number of vertices generated along the blade height. + +The **tessellation evaluation shader** receives subdivided parametric coordinates **(u, v)**, where v runs along blade height (0 to 1) and u along width (0 to 1). It then evaluates the Bezier curve by repeated linear interpolation (De Casteljau's algorithm): `a = v0 + v * (v1 - v0)`, `b = v1 + v * (v2 - v1)`, `c = a + v * (b - a)`, yielding the position at height v on the blade centerline. Blade width tapers along height, calculated as `currentWidth = width * (1.0 - v)`, widest at the base and narrowing at the tip. Using tangent direction `t1 = vec3(sin(orientation), 0.0, cos(orientation))`, the center point is offset laterally to generate a quad: `c0 = c - currentWidth * t1`, `c1 = c + currentWidth * t1`, with final vertex position `worldPos = mix(c0, c1, u)`. The normal is calculated via `cross(t1, tangentAlongBlade)` and flipped based on the value of u to ensure correct orientation. + +--- + +### Procedural Skybox + +![](img/skybox.gif) + +The skybox is rendered first each frame, before all other geometry. It uses a separate **graphics pipeline** with depth test set to **VK_COMPARE_OP_LESS_OR_EQUAL** and depth write disabled, ensuring the skybox always appears at the farthest distance without occluding other objects. The skybox geometry is a **unit cube** (36 vertices). In the vertex shader, the translation component is removed from the view matrix, preserving only rotation, so the skybox rotates with the camera but does not move with it. The fragment shader uses **procedural methods** to generate sky color gradients, sun, and cloud effects without relying on texture maps. When the window is resized, the skybox pipeline is rebuilt in **RecreateFrameResources()** to ensure the viewport updates correctly. 
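Removing the translation from the view matrix, as described for the skybox vertex shader, can be sketched as below. The README does not show the shader code, so this assumes the common approach of keeping the upper-left 3x3 block and resetting the rest to identity (what `mat4(mat3(view))` does in GLSL). `Mat4` and `stripTranslation` are hypothetical names used only for illustration.

```cpp
#include <array>

// Column-major 4x4 matrix, matching the GLSL/glm layout:
// the element at row r, column c is stored at index c * 4 + r.
using Mat4 = std::array<float, 16>;

// Returns a copy of `view` that keeps only the rotation (upper-left 3x3).
// The fourth row and fourth column are reset to those of the identity matrix.
Mat4 stripTranslation(const Mat4& view) {
    Mat4 result = view;
    // Fourth column holds the translation in a column-major view matrix.
    result[12] = 0.0f;
    result[13] = 0.0f;
    result[14] = 0.0f;
    // Fourth row of the three rotation columns.
    result[3] = 0.0f;
    result[7] = 0.0f;
    result[11] = 0.0f;
    result[15] = 1.0f;
    return result;
}
```

With the translation gone, the cube stays centered on the camera, so it rotates with the view but never appears to get closer or farther away.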
+ +### Grass Blade Shading + +![](img/top.gif) + +The fragment shader implements multiple lighting techniques to enhance grass visual quality. The base color transitions from dark green at the base to light green at the tip through height-based interpolation. A **Lambert diffuse lighting model** simulates sunlight. A **wrap-around lighting** term, `diffuse = (NdotL + 0.5) / 1.5`, produces softer shading transitions, avoiding harsh shadow boundaries. A **rim light** effect is calculated from the normal and view direction, using a power function to create sharp falloff, adding a pale yellow-green outline at blade edges to enhance depth perception. The final color combines **ambient light**, **diffuse light**, and **rim light**. + +--- + +## Performance Analysis + +#### Blade Count + +Tests were conducted at **640x480 resolution** to measure the impact of grass blade count on frame rate. + +| Blade Count | FPS | +| ---------------- | ---- | +| 2^13 (8,192) | 3677 | +| 2^15 (32,768) | 1540 | +| 2^17 (131,072) | 472 | +| 2^19 (524,288) | 114 | +| 2^21 (2,097,152) | 31 | +| 2^23 (8,388,608) | 7.7 | + +![Performance Graph](img/number.png) + +Performance bottlenecks come from three main stages. The **compute shader** physics simulation calculates gravity, recovery force, and wind force for each blade, with computation scaling linearly with blade count. The **tessellation stage** generates vertices for each blade based on LOD level, with higher tessellation producing more geometry. The **fragment shader** computes Lambert diffuse and rim light for all generated pixels, with pixel count depending on the screen area covered by grass blades. + +#### Culling + +Tests were conducted with **2^19 (524,288) blades** at **640x480 resolution** to evaluate the effectiveness of different culling techniques. 
+ +**Test Data:** + +| Culling Configuration | FPS | +| --------------------- | --- | +| No Culling | 64 | +| Orientation Only | 94 | +| Frustum Only | 71 | +| Distance Only | 83 | +| All Culling | 114 | + +![Culling Performance Graph](img/culling.png) + +**Orientation culling** shows the most significant impact, improving performance from 64 FPS to 94 FPS (46.9% increase), as it removes large numbers of blades perpendicular to the view direction that are nearly invisible on screen but still require processing. **Distance culling** provides a 29.7% performance gain (83 FPS) by probabilistically removing distant blades to reduce rendering load. **Frustum culling** shows the smallest improvement (71 FPS, 10.9% increase), as most blades remain within view in scenes with wide camera angles and flat terrain. + +When all three culling techniques are combined, FPS reaches 114, representing a 78.1% improvement over no culling. The effects of culling techniques do not simply add up, as different methods overlap (the same blade may satisfy multiple culling conditions simultaneously). In high blade count scenarios, the culling system is essential for maintaining real-time performance. + + + +#### LOD System + +Tests were conducted with **2^19 (524,288) blades** at **640x480 resolution** with the camera positioned far enough to trigger LOD. + +| LOD Configuration | FPS | +| ----------------- | --- | +| No LOD | 399 | +| With LOD | 573 | + +![LOD Performance Graph](img/lod.png) + +The LOD system provides a 43.6% performance improvement (from 399 FPS to 573 FPS). When the camera is distant, most blades are reduced to 2-3 tessellation levels, significantly decreasing the vertex count generated by the tessellation evaluation shader. Without LOD, all blades use a fixed 5-level tessellation, generating excessive geometry with details imperceptible at distance. 
The LOD system dynamically adjusts tessellation levels to substantially reduce rendering load while maintaining visual quality, demonstrating the effectiveness of distance-based tessellation optimization. diff --git a/external/GLFW/tests/gamma.c b/external/GLFW/tests/gamma.c index 500d41c..38c7012 100644 --- a/external/GLFW/tests/gamma.c +++ b/external/GLFW/tests/gamma.c @@ -122,7 +122,8 @@ int main(int argc, char** argv) area = nk_rect(0.f, 0.f, (float) width, (float) height); glClear(GL_COLOR_BUFFER_BIT); - nk_glfw3_new_frame(); + nk_glfw3_new_ + frame(); if (nk_begin(nk, "", area, 0)) { const GLFWgammaramp* ramp = glfwGetGammaRamp(monitor); diff --git a/img/Dculling.gif b/img/Dculling.gif new file mode 100644 index 0000000..52d6c88 Binary files /dev/null and b/img/Dculling.gif differ diff --git a/img/Fculling2.gif b/img/Fculling2.gif new file mode 100644 index 0000000..e580778 Binary files /dev/null and b/img/Fculling2.gif differ diff --git a/img/Oculling.gif b/img/Oculling.gif new file mode 100644 index 0000000..5879079 Binary files /dev/null and b/img/Oculling.gif differ diff --git a/img/culling.png b/img/culling.png new file mode 100644 index 0000000..0991b0f Binary files /dev/null and b/img/culling.png differ diff --git a/img/grassWind.gif b/img/grassWind.gif new file mode 100644 index 0000000..9c692b2 Binary files /dev/null and b/img/grassWind.gif differ diff --git a/img/lod.gif b/img/lod.gif new file mode 100644 index 0000000..cb8c10d Binary files /dev/null and b/img/lod.gif differ diff --git a/img/lod.png b/img/lod.png new file mode 100644 index 0000000..b828493 Binary files /dev/null and b/img/lod.png differ diff --git a/img/number.png b/img/number.png new file mode 100644 index 0000000..390d932 Binary files /dev/null and b/img/number.png differ diff --git a/img/rawGrass.gif b/img/rawGrass.gif new file mode 100644 index 0000000..9c9dbf9 Binary files /dev/null and b/img/rawGrass.gif differ diff --git a/img/skybox.gif b/img/skybox.gif new file mode 100644 
index 0000000..f0dbb15 Binary files /dev/null and b/img/skybox.gif differ diff --git a/img/top.gif b/img/top.gif new file mode 100644 index 0000000..ef2e56a Binary files /dev/null and b/img/top.gif differ diff --git a/src/Blades.cpp b/src/Blades.cpp index 80e3d76..8bc4781 100644 --- a/src/Blades.cpp +++ b/src/Blades.cpp @@ -25,11 +25,17 @@ Blades::Blades(Device* device, VkCommandPool commandPool, float planeDim) : Mode // Bezier point and height (v1) float height = MIN_HEIGHT + (generateRandomFloat() * (MAX_HEIGHT - MIN_HEIGHT)); - currentBlade.v1 = glm::vec4(bladePosition + bladeUp * height, height); + currentBlade.v1 = glm::vec4(bladePosition + bladeUp * height, height); // Physical model guide and width (v2) float width = MIN_WIDTH + (generateRandomFloat() * (MAX_WIDTH - MIN_WIDTH)); - currentBlade.v2 = glm::vec4(bladePosition + bladeUp * height, width); + float bendAmount = 0.3f; + glm::vec3 bendOffset = glm::vec3( + (generateRandomFloat() - 0.5f) * bendAmount, + 0.0f, + (generateRandomFloat() - 0.5f) * bendAmount + ); + currentBlade.v2 = glm::vec4(bladePosition + bladeUp * height + bendOffset, width); // Up vector and stiffness coefficient (up) float stiffness = MIN_BEND + (generateRandomFloat() * (MAX_BEND - MIN_BEND)); @@ -44,8 +50,15 @@ Blades::Blades(Device* device, VkCommandPool commandPool, float planeDim) : Mode indirectDraw.firstVertex = 0; indirectDraw.firstInstance = 0; - BufferUtils::CreateBufferFromData(device, commandPool, blades.data(), NUM_BLADES * sizeof(Blade), VK_BUFFER_USAGE_STORAGE_BUFFER_BIT, bladesBuffer, bladesBufferMemory); - BufferUtils::CreateBuffer(device, NUM_BLADES * sizeof(Blade), VK_BUFFER_USAGE_STORAGE_BUFFER_BIT, VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT, culledBladesBuffer, culledBladesBufferMemory); + BufferUtils::CreateBufferFromData(device, commandPool, blades.data(), NUM_BLADES * sizeof(Blade), + VK_BUFFER_USAGE_STORAGE_BUFFER_BIT | VK_BUFFER_USAGE_VERTEX_BUFFER_BIT, + bladesBuffer, bladesBufferMemory); + + + 
BufferUtils::CreateBuffer(device, NUM_BLADES * sizeof(Blade), + VK_BUFFER_USAGE_STORAGE_BUFFER_BIT | VK_BUFFER_USAGE_VERTEX_BUFFER_BIT, + VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT, culledBladesBuffer, culledBladesBufferMemory); + BufferUtils::CreateBufferFromData(device, commandPool, &indirectDraw, sizeof(BladeDrawIndirect), VK_BUFFER_USAGE_STORAGE_BUFFER_BIT | VK_BUFFER_USAGE_INDIRECT_BUFFER_BIT, numBladesBuffer, numBladesBufferMemory); } diff --git a/src/CMakeLists.txt b/src/CMakeLists.txt index aea02fe..d624ff3 100644 --- a/src/CMakeLists.txt +++ b/src/CMakeLists.txt @@ -8,7 +8,7 @@ file(GLOB IMAGES foreach(IMAGE ${IMAGES}) get_filename_component(fname ${IMAGE} NAME) - configure_file(${IMAGE} ${CMAKE_CURRENT_BINARY_DIR}/images/${fname} COPYONLY) + configure_file(${IMAGE} ${CMAKE_SOURCE_DIR}/bin/images/${fname} COPYONLY) endforeach() file(GLOB_RECURSE SHADER_SOURCES @@ -31,8 +31,7 @@ else(WIN32) endif(WIN32) foreach(SHADER_SOURCE ${SHADER_SOURCES}) - set(SHADER_DIR ${CMAKE_CURRENT_BINARY_DIR}/shaders) - + set(SHADER_DIR ${CMAKE_SOURCE_DIR}/bin/shaders) if(WIN32) get_filename_component(fname ${SHADER_SOURCE} NAME) add_custom_target(${fname}.spv @@ -54,4 +53,4 @@ target_include_directories(vulkan_grass_rendering PRIVATE ${STB_INCLUDE_DIR} ) -InternalTarget("" vulkan_grass_rendering) +InternalTarget("" vulkan_grass_rendering) \ No newline at end of file diff --git a/src/Renderer.cpp b/src/Renderer.cpp index b445d04..976f74a 100644 --- a/src/Renderer.cpp +++ b/src/Renderer.cpp @@ -5,6 +5,8 @@ #include "Blades.h" #include "Camera.h" #include "Image.h" +#include +#include "BufferUtils.h" static constexpr unsigned int WORKGROUP_SIZE = 32; @@ -31,6 +33,10 @@ Renderer::Renderer(Device* device, SwapChain* swapChain, Scene* scene, Camera* c CreateGraphicsPipeline(); CreateGrassPipeline(); CreateComputePipeline(); + + CreateSkyboxResources(); + CreateSkyboxPipeline(); + RecordCommandBuffers(); RecordComputeCommandBuffer(); } @@ -198,6 +204,43 @@ void 
Renderer::CreateComputeDescriptorSetLayout() { // TODO: Create the descriptor set layout for the compute pipeline // Remember this is like a class definition stating why types of information // will be stored at each binding + + + // Binding 0: Input blades (storage buffer) + VkDescriptorSetLayoutBinding inputBladesBinding = {}; + inputBladesBinding.binding = 0; + inputBladesBinding.descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER; + inputBladesBinding.descriptorCount = 1; + inputBladesBinding.stageFlags = VK_SHADER_STAGE_COMPUTE_BIT; + + // Binding 1: Culled blades (storage buffer) + VkDescriptorSetLayoutBinding culledBladesBinding = {}; + culledBladesBinding.binding = 1; + culledBladesBinding.descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER; + culledBladesBinding.descriptorCount = 1; + culledBladesBinding.stageFlags = VK_SHADER_STAGE_COMPUTE_BIT; + + // Binding 2: Num blades (storage buffer) + VkDescriptorSetLayoutBinding numBladesBinding = {}; + numBladesBinding.binding = 2; + numBladesBinding.descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER; + numBladesBinding.descriptorCount = 1; + numBladesBinding.stageFlags = VK_SHADER_STAGE_COMPUTE_BIT; + + std::vector<VkDescriptorSetLayoutBinding> bindings = { + inputBladesBinding, + culledBladesBinding, + numBladesBinding + }; + + VkDescriptorSetLayoutCreateInfo layoutInfo = {}; + layoutInfo.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_CREATE_INFO; + layoutInfo.bindingCount = static_cast<uint32_t>(bindings.size()); + layoutInfo.pBindings = bindings.data(); + + if (vkCreateDescriptorSetLayout(logicalDevice, &layoutInfo, nullptr, &computeDescriptorSetLayout) != VK_SUCCESS) { + throw std::runtime_error("Failed to create compute descriptor set layout"); + } } void Renderer::CreateDescriptorPool() { @@ -216,13 +259,15 @@ void Renderer::CreateDescriptorPool() { { VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER , 1 }, // TODO: Add any additional types and counts of descriptors you will need to allocate + // Compute shader storage buffers (3 per blade group) + { 
VK_DESCRIPTOR_TYPE_STORAGE_BUFFER, static_cast<uint32_t>(3 * scene->GetBlades().size()) } }; - + VkDescriptorPoolCreateInfo poolInfo = {}; poolInfo.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_POOL_CREATE_INFO; poolInfo.poolSizeCount = static_cast<uint32_t>(poolSizes.size()); poolInfo.pPoolSizes = poolSizes.data(); - poolInfo.maxSets = 5; + poolInfo.maxSets = static_cast<uint32_t>(5 + scene->GetBlades().size()); if (vkCreateDescriptorPool(logicalDevice, &poolInfo, nullptr, &descriptorPool) != VK_SUCCESS) { throw std::runtime_error("Failed to create descriptor pool"); @@ -320,6 +365,54 @@ void Renderer::CreateModelDescriptorSets() { void Renderer::CreateGrassDescriptorSets() { // TODO: Create Descriptor sets for the grass. // This should involve creating descriptor sets which point to the model matrix of each group of grass blades + + grassDescriptorSets.resize(scene->GetBlades().size()); + + // Describe the descriptor set + VkDescriptorSetLayout layouts[] = { modelDescriptorSetLayout }; + VkDescriptorSetAllocateInfo allocInfo = {}; + allocInfo.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_ALLOCATE_INFO; + allocInfo.descriptorPool = descriptorPool; + allocInfo.descriptorSetCount = static_cast<uint32_t>(grassDescriptorSets.size()); + allocInfo.pSetLayouts = layouts; + + // Allocate descriptor sets + if (vkAllocateDescriptorSets(logicalDevice, &allocInfo, grassDescriptorSets.data()) != VK_SUCCESS) { + throw std::runtime_error("Failed to allocate grass descriptor set"); + } + + std::vector<VkWriteDescriptorSet> descriptorWrites(2 * grassDescriptorSets.size()); + + for (uint32_t i = 0; i < scene->GetBlades().size(); ++i) { + VkDescriptorBufferInfo modelBufferInfo = {}; + modelBufferInfo.buffer = scene->GetBlades()[i]->GetModelBuffer(); + modelBufferInfo.offset = 0; + modelBufferInfo.range = sizeof(ModelBufferObject); + + + VkDescriptorImageInfo imageInfo = {}; + imageInfo.imageLayout = VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL; + imageInfo.imageView = scene->GetModels()[0]->GetTextureView(); + imageInfo.sampler = 
scene->GetModels()[0]->GetTextureSampler(); + + descriptorWrites[2 * i + 0].sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET; + descriptorWrites[2 * i + 0].dstSet = grassDescriptorSets[i]; + descriptorWrites[2 * i + 0].dstBinding = 0; + descriptorWrites[2 * i + 0].dstArrayElement = 0; + descriptorWrites[2 * i + 0].descriptorType = VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER; + descriptorWrites[2 * i + 0].descriptorCount = 1; + descriptorWrites[2 * i + 0].pBufferInfo = &modelBufferInfo; + + descriptorWrites[2 * i + 1].sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET; + descriptorWrites[2 * i + 1].dstSet = grassDescriptorSets[i]; + descriptorWrites[2 * i + 1].dstBinding = 1; + descriptorWrites[2 * i + 1].dstArrayElement = 0; + descriptorWrites[2 * i + 1].descriptorType = VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER; + descriptorWrites[2 * i + 1].descriptorCount = 1; + descriptorWrites[2 * i + 1].pImageInfo = &imageInfo; + } + + vkUpdateDescriptorSets(logicalDevice, static_cast<uint32_t>(descriptorWrites.size()), descriptorWrites.data(), 0, nullptr); } void Renderer::CreateTimeDescriptorSet() { @@ -360,6 +453,65 @@ void Renderer::CreateTimeDescriptorSet() { void Renderer::CreateComputeDescriptorSets() { // TODO: Create Descriptor sets for the compute pipeline // The descriptors should point to Storage buffers which will hold the grass blades, the culled grass blades, and the output number of grass blades + + computeDescriptorSets.resize(scene->GetBlades().size()); + + std::vector<VkDescriptorSetLayout> layouts(scene->GetBlades().size(), computeDescriptorSetLayout); + + VkDescriptorSetAllocateInfo allocInfo = {}; + allocInfo.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_ALLOCATE_INFO; + allocInfo.descriptorPool = descriptorPool; + allocInfo.descriptorSetCount = static_cast<uint32_t>(computeDescriptorSets.size()); + allocInfo.pSetLayouts = layouts.data(); + + if (vkAllocateDescriptorSets(logicalDevice, &allocInfo, computeDescriptorSets.data()) != VK_SUCCESS) { + throw std::runtime_error("Failed to allocate compute descriptor sets"); 
+ } + + for (size_t i = 0; i < scene->GetBlades().size(); ++i) { + VkDescriptorBufferInfo inputBladesInfo = {}; + inputBladesInfo.buffer = scene->GetBlades()[i]->GetBladesBuffer(); + inputBladesInfo.offset = 0; + inputBladesInfo.range = VK_WHOLE_SIZE; + + VkDescriptorBufferInfo culledBladesInfo = {}; + culledBladesInfo.buffer = scene->GetBlades()[i]->GetCulledBladesBuffer(); + culledBladesInfo.offset = 0; + culledBladesInfo.range = VK_WHOLE_SIZE; + + VkDescriptorBufferInfo numBladesInfo = {}; + numBladesInfo.buffer = scene->GetBlades()[i]->GetNumBladesBuffer(); + numBladesInfo.offset = 0; + numBladesInfo.range = VK_WHOLE_SIZE; + + std::array<VkWriteDescriptorSet, 3> descriptorWrites = {}; + + descriptorWrites[0].sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET; + descriptorWrites[0].dstSet = computeDescriptorSets[i]; + descriptorWrites[0].dstBinding = 0; + descriptorWrites[0].dstArrayElement = 0; + descriptorWrites[0].descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER; + descriptorWrites[0].descriptorCount = 1; + descriptorWrites[0].pBufferInfo = &inputBladesInfo; + + descriptorWrites[1].sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET; + descriptorWrites[1].dstSet = computeDescriptorSets[i]; + descriptorWrites[1].dstBinding = 1; + descriptorWrites[1].dstArrayElement = 0; + descriptorWrites[1].descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER; + descriptorWrites[1].descriptorCount = 1; + descriptorWrites[1].pBufferInfo = &culledBladesInfo; + + descriptorWrites[2].sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET; + descriptorWrites[2].dstSet = computeDescriptorSets[i]; + descriptorWrites[2].dstBinding = 2; + descriptorWrites[2].dstArrayElement = 0; + descriptorWrites[2].descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER; + descriptorWrites[2].descriptorCount = 1; + descriptorWrites[2].pBufferInfo = &numBladesInfo; + + vkUpdateDescriptorSets(logicalDevice, static_cast<uint32_t>(descriptorWrites.size()), descriptorWrites.data(), 0, nullptr); + } } void Renderer::CreateGraphicsPipeline() { @@ -717,7 +869,7 
@@ void Renderer::CreateComputePipeline() { computeShaderStageInfo.pName = "main"; // TODO: Add the compute dsecriptor set layout you create to this list - std::vector<VkDescriptorSetLayout> descriptorSetLayouts = { cameraDescriptorSetLayout, timeDescriptorSetLayout }; + std::vector<VkDescriptorSetLayout> descriptorSetLayouts = { cameraDescriptorSetLayout, timeDescriptorSetLayout, computeDescriptorSetLayout }; // Create pipeline layout VkPipelineLayoutCreateInfo pipelineLayoutInfo = {}; @@ -839,16 +991,23 @@ void Renderer::DestroyFrameResources() { } void Renderer::RecreateFrameResources() { + vkDeviceWaitIdle(logicalDevice); + vkDestroyPipeline(logicalDevice, graphicsPipeline, nullptr); vkDestroyPipeline(logicalDevice, grassPipeline, nullptr); + vkDestroyPipeline(logicalDevice, skyboxPipeline, nullptr); + vkDestroyPipelineLayout(logicalDevice, graphicsPipelineLayout, nullptr); vkDestroyPipelineLayout(logicalDevice, grassPipelineLayout, nullptr); + vkDestroyPipelineLayout(logicalDevice, skyboxPipelineLayout, nullptr); + vkFreeCommandBuffers(logicalDevice, graphicsCommandPool, static_cast<uint32_t>(commandBuffers.size()), commandBuffers.data()); DestroyFrameResources(); CreateFrameResources(); CreateGraphicsPipeline(); CreateGrassPipeline(); + CreateSkyboxPipeline(); RecordCommandBuffers(); } @@ -885,6 +1044,16 @@ void Renderer::RecordComputeCommandBuffer() { // TODO: For each group of blades bind its descriptor set and dispatch + for (uint32_t j = 0; j < scene->GetBlades().size(); ++j) { + // Bind compute descriptor set + vkCmdBindDescriptorSets(computeCommandBuffer, VK_PIPELINE_BIND_POINT_COMPUTE, computePipelineLayout, 2, 1, &computeDescriptorSets[j], 0, nullptr); + + // Dispatch compute shader + uint32_t numWorkGroups = (NUM_BLADES + WORKGROUP_SIZE - 1) / WORKGROUP_SIZE; + vkCmdDispatch(computeCommandBuffer, numWorkGroups, 1, 1); + } + + // ~ End recording ~ if (vkEndCommandBuffer(computeCommandBuffer) != VK_SUCCESS) { throw std::runtime_error("Failed to record compute command buffer"); @@ -950,6 +1119,14 @@ void 
Renderer::RecordCommandBuffers() { vkCmdBeginRenderPass(commandBuffers[i], &renderPassInfo, VK_SUBPASS_CONTENTS_INLINE); + // Bind the skybox pipeline + vkCmdBindPipeline(commandBuffers[i], VK_PIPELINE_BIND_POINT_GRAPHICS, skyboxPipeline); + vkCmdBindDescriptorSets(commandBuffers[i], VK_PIPELINE_BIND_POINT_GRAPHICS, skyboxPipelineLayout, 0, 1, &cameraDescriptorSet, 0, nullptr); + VkBuffer skyboxBuffers[] = { skyboxVertexBuffer }; + VkDeviceSize skyboxOffsets[] = { 0 }; + vkCmdBindVertexBuffers(commandBuffers[i], 0, 1, skyboxBuffers, skyboxOffsets); + vkCmdDraw(commandBuffers[i], 36, 1, 0, 0); + // Bind the graphics pipeline vkCmdBindPipeline(commandBuffers[i], VK_PIPELINE_BIND_POINT_GRAPHICS, graphicsPipeline); @@ -976,13 +1153,18 @@ void Renderer::RecordCommandBuffers() { VkBuffer vertexBuffers[] = { scene->GetBlades()[j]->GetCulledBladesBuffer() }; VkDeviceSize offsets[] = { 0 }; // TODO: Uncomment this when the buffers are populated - // vkCmdBindVertexBuffers(commandBuffers[i], 0, 1, vertexBuffers, offsets); + + vkCmdBindVertexBuffers(commandBuffers[i], 0, 1, vertexBuffers, offsets); // TODO: Bind the descriptor set for each grass blades model // Draw // TODO: Uncomment this when the buffers are populated // vkCmdDrawIndirect(commandBuffers[i], scene->GetBlades()[j]->GetNumBladesBuffer(), 0, 1, sizeof(BladeDrawIndirect)); + + vkCmdBindDescriptorSets(commandBuffers[i], VK_PIPELINE_BIND_POINT_GRAPHICS, grassPipelineLayout, 1, 1, &grassDescriptorSets[j], 0, nullptr); + + vkCmdDrawIndirect(commandBuffers[i], scene->GetBlades()[j]->GetNumBladesBuffer(), 0, 1, sizeof(BladeDrawIndirect)); } // End render pass @@ -996,10 +1178,10 @@ void Renderer::RecordCommandBuffers() { } void Renderer::Frame() { - + VkSubmitInfo computeSubmitInfo = {}; computeSubmitInfo.sType = VK_STRUCTURE_TYPE_SUBMIT_INFO; - + computeSubmitInfo.commandBufferCount = 1; computeSubmitInfo.pCommandBuffers = &computeCommandBuffer; @@ -1009,7 +1191,7 @@ void Renderer::Frame() { if 
(!swapChain->Acquire()) { RecreateFrameResources(); - return; + return; } // Submit the command buffer @@ -1057,6 +1239,7 @@ Renderer::~Renderer() { vkDestroyDescriptorSetLayout(logicalDevice, cameraDescriptorSetLayout, nullptr); vkDestroyDescriptorSetLayout(logicalDevice, modelDescriptorSetLayout, nullptr); vkDestroyDescriptorSetLayout(logicalDevice, timeDescriptorSetLayout, nullptr); + vkDestroyDescriptorSetLayout(logicalDevice, computeDescriptorSetLayout, nullptr); vkDestroyDescriptorPool(logicalDevice, descriptorPool, nullptr); @@ -1064,4 +1247,192 @@ Renderer::~Renderer() { DestroyFrameResources(); vkDestroyCommandPool(logicalDevice, computeCommandPool, nullptr); vkDestroyCommandPool(logicalDevice, graphicsCommandPool, nullptr); + + + vkDestroyPipeline(logicalDevice, skyboxPipeline, nullptr); + vkDestroyPipelineLayout(logicalDevice, skyboxPipelineLayout, nullptr); + vkDestroyBuffer(logicalDevice, skyboxVertexBuffer, nullptr); + vkFreeMemory(logicalDevice, skyboxVertexBufferMemory, nullptr); +} + + +void Renderer::CreateSkyboxResources() { + // Skybox cube vertices (36 vertices for 6 faces) + float skyboxVertices[] = { + // positions + -1.0f, 1.0f, -1.0f, + -1.0f, -1.0f, -1.0f, + 1.0f, -1.0f, -1.0f, + 1.0f, -1.0f, -1.0f, + 1.0f, 1.0f, -1.0f, + -1.0f, 1.0f, -1.0f, + + -1.0f, -1.0f, 1.0f, + -1.0f, -1.0f, -1.0f, + -1.0f, 1.0f, -1.0f, + -1.0f, 1.0f, -1.0f, + -1.0f, 1.0f, 1.0f, + -1.0f, -1.0f, 1.0f, + + 1.0f, -1.0f, -1.0f, + 1.0f, -1.0f, 1.0f, + 1.0f, 1.0f, 1.0f, + 1.0f, 1.0f, 1.0f, + 1.0f, 1.0f, -1.0f, + 1.0f, -1.0f, -1.0f, + + -1.0f, -1.0f, 1.0f, + -1.0f, 1.0f, 1.0f, + 1.0f, 1.0f, 1.0f, + 1.0f, 1.0f, 1.0f, + 1.0f, -1.0f, 1.0f, + -1.0f, -1.0f, 1.0f, + + -1.0f, 1.0f, -1.0f, + 1.0f, 1.0f, -1.0f, + 1.0f, 1.0f, 1.0f, + 1.0f, 1.0f, 1.0f, + -1.0f, 1.0f, 1.0f, + -1.0f, 1.0f, -1.0f, + + -1.0f, -1.0f, -1.0f, + -1.0f, -1.0f, 1.0f, + 1.0f, -1.0f, -1.0f, + 1.0f, -1.0f, -1.0f, + -1.0f, -1.0f, 1.0f, + 1.0f, -1.0f, 1.0f + }; + + BufferUtils::CreateBufferFromData( + device, 
graphicsCommandPool, + skyboxVertices, sizeof(skyboxVertices), + VK_BUFFER_USAGE_VERTEX_BUFFER_BIT, + skyboxVertexBuffer, skyboxVertexBufferMemory + ); +} + +void Renderer::CreateSkyboxPipeline() { + VkShaderModule vertShaderModule = ShaderModule::Create("shaders/skybox.vert.spv", logicalDevice); + VkShaderModule fragShaderModule = ShaderModule::Create("shaders/skybox.frag.spv", logicalDevice); + + VkPipelineShaderStageCreateInfo vertShaderStageInfo = {}; + vertShaderStageInfo.sType = VK_STRUCTURE_TYPE_PIPELINE_SHADER_STAGE_CREATE_INFO; + vertShaderStageInfo.stage = VK_SHADER_STAGE_VERTEX_BIT; + vertShaderStageInfo.module = vertShaderModule; + vertShaderStageInfo.pName = "main"; + + VkPipelineShaderStageCreateInfo fragShaderStageInfo = {}; + fragShaderStageInfo.sType = VK_STRUCTURE_TYPE_PIPELINE_SHADER_STAGE_CREATE_INFO; + fragShaderStageInfo.stage = VK_SHADER_STAGE_FRAGMENT_BIT; + fragShaderStageInfo.module = fragShaderModule; + fragShaderStageInfo.pName = "main"; + + VkPipelineShaderStageCreateInfo shaderStages[] = { vertShaderStageInfo, fragShaderStageInfo }; + + VkVertexInputBindingDescription bindingDescription = {}; + bindingDescription.binding = 0; + bindingDescription.stride = 3 * sizeof(float); + bindingDescription.inputRate = VK_VERTEX_INPUT_RATE_VERTEX; + + VkVertexInputAttributeDescription attributeDescription = {}; + attributeDescription.binding = 0; + attributeDescription.location = 0; + attributeDescription.format = VK_FORMAT_R32G32B32_SFLOAT; + attributeDescription.offset = 0; + + VkPipelineVertexInputStateCreateInfo vertexInputInfo = {}; + vertexInputInfo.sType = VK_STRUCTURE_TYPE_PIPELINE_VERTEX_INPUT_STATE_CREATE_INFO; + vertexInputInfo.vertexBindingDescriptionCount = 1; + vertexInputInfo.pVertexBindingDescriptions = &bindingDescription; + vertexInputInfo.vertexAttributeDescriptionCount = 1; + vertexInputInfo.pVertexAttributeDescriptions = &attributeDescription; + + VkPipelineInputAssemblyStateCreateInfo inputAssembly = {}; + inputAssembly.sType 
= VK_STRUCTURE_TYPE_PIPELINE_INPUT_ASSEMBLY_STATE_CREATE_INFO; + inputAssembly.topology = VK_PRIMITIVE_TOPOLOGY_TRIANGLE_LIST; + inputAssembly.primitiveRestartEnable = VK_FALSE; + + VkViewport viewport = {}; + viewport.x = 0.0f; + viewport.y = 0.0f; + viewport.width = (float)swapChain->GetVkExtent().width; + viewport.height = (float)swapChain->GetVkExtent().height; + viewport.minDepth = 0.0f; + viewport.maxDepth = 1.0f; + + VkRect2D scissor = {}; + scissor.offset = { 0, 0 }; + scissor.extent = swapChain->GetVkExtent(); + + VkPipelineViewportStateCreateInfo viewportState = {}; + viewportState.sType = VK_STRUCTURE_TYPE_PIPELINE_VIEWPORT_STATE_CREATE_INFO; + viewportState.viewportCount = 1; + viewportState.pViewports = &viewport; + viewportState.scissorCount = 1; + viewportState.pScissors = &scissor; + + VkPipelineRasterizationStateCreateInfo rasterizer = {}; + rasterizer.sType = VK_STRUCTURE_TYPE_PIPELINE_RASTERIZATION_STATE_CREATE_INFO; + rasterizer.depthClampEnable = VK_FALSE; + rasterizer.rasterizerDiscardEnable = VK_FALSE; + rasterizer.polygonMode = VK_POLYGON_MODE_FILL; + rasterizer.lineWidth = 1.0f; + rasterizer.cullMode = VK_CULL_MODE_NONE; + rasterizer.frontFace = VK_FRONT_FACE_COUNTER_CLOCKWISE; + rasterizer.depthBiasEnable = VK_FALSE; + + VkPipelineMultisampleStateCreateInfo multisampling = {}; + multisampling.sType = VK_STRUCTURE_TYPE_PIPELINE_MULTISAMPLE_STATE_CREATE_INFO; + multisampling.sampleShadingEnable = VK_FALSE; + multisampling.rasterizationSamples = VK_SAMPLE_COUNT_1_BIT; + + VkPipelineDepthStencilStateCreateInfo depthStencil = {}; + depthStencil.sType = VK_STRUCTURE_TYPE_PIPELINE_DEPTH_STENCIL_STATE_CREATE_INFO; + depthStencil.depthTestEnable = VK_TRUE; + depthStencil.depthWriteEnable = VK_FALSE; + depthStencil.depthCompareOp = VK_COMPARE_OP_LESS_OR_EQUAL; + depthStencil.depthBoundsTestEnable = VK_FALSE; + depthStencil.stencilTestEnable = VK_FALSE; + + VkPipelineColorBlendAttachmentState colorBlendAttachment = {}; + 
colorBlendAttachment.colorWriteMask = 0xF; + colorBlendAttachment.blendEnable = VK_FALSE; + + VkPipelineColorBlendStateCreateInfo colorBlending = {}; + colorBlending.sType = VK_STRUCTURE_TYPE_PIPELINE_COLOR_BLEND_STATE_CREATE_INFO; + colorBlending.logicOpEnable = VK_FALSE; + colorBlending.attachmentCount = 1; + colorBlending.pAttachments = &colorBlendAttachment; + + VkDescriptorSetLayout setLayouts[] = { cameraDescriptorSetLayout }; + VkPipelineLayoutCreateInfo pipelineLayoutInfo = {}; + pipelineLayoutInfo.sType = VK_STRUCTURE_TYPE_PIPELINE_LAYOUT_CREATE_INFO; + pipelineLayoutInfo.setLayoutCount = 1; + pipelineLayoutInfo.pSetLayouts = setLayouts; + + if (vkCreatePipelineLayout(logicalDevice, &pipelineLayoutInfo, nullptr, &skyboxPipelineLayout) != VK_SUCCESS) { + throw std::runtime_error("Failed to create skybox pipeline layout"); + } + + VkGraphicsPipelineCreateInfo pipelineInfo = {}; + pipelineInfo.sType = VK_STRUCTURE_TYPE_GRAPHICS_PIPELINE_CREATE_INFO; + pipelineInfo.stageCount = 2; + pipelineInfo.pStages = shaderStages; + pipelineInfo.pVertexInputState = &vertexInputInfo; + pipelineInfo.pInputAssemblyState = &inputAssembly; + pipelineInfo.pViewportState = &viewportState; + pipelineInfo.pRasterizationState = &rasterizer; + pipelineInfo.pMultisampleState = &multisampling; + pipelineInfo.pDepthStencilState = &depthStencil; + pipelineInfo.pColorBlendState = &colorBlending; + pipelineInfo.layout = skyboxPipelineLayout; + pipelineInfo.renderPass = renderPass; + pipelineInfo.subpass = 0; + + if (vkCreateGraphicsPipelines(logicalDevice, VK_NULL_HANDLE, 1, &pipelineInfo, nullptr, &skyboxPipeline) != VK_SUCCESS) { + throw std::runtime_error("Failed to create skybox pipeline"); + } + + vkDestroyShaderModule(logicalDevice, vertShaderModule, nullptr); + vkDestroyShaderModule(logicalDevice, fragShaderModule, nullptr); } diff --git a/src/Renderer.h b/src/Renderer.h index 95e025f..858e108 100644 --- a/src/Renderer.h +++ b/src/Renderer.h @@ -41,6 +41,9 @@ class Renderer { void 
Frame(); + void CreateSkyboxResources(); + void CreateSkyboxPipeline(); + private: Device* device; VkDevice logicalDevice; @@ -56,11 +59,17 @@ class Renderer { VkDescriptorSetLayout cameraDescriptorSetLayout; VkDescriptorSetLayout modelDescriptorSetLayout; VkDescriptorSetLayout timeDescriptorSetLayout; + + VkDescriptorSetLayout computeDescriptorSetLayout; VkDescriptorPool descriptorPool; VkDescriptorSet cameraDescriptorSet; std::vector<VkDescriptorSet> modelDescriptorSets; + std::vector<VkDescriptorSet> grassDescriptorSets; + + std::vector<VkDescriptorSet> computeDescriptorSets; + VkDescriptorSet timeDescriptorSet; VkPipelineLayout graphicsPipelineLayout; @@ -79,4 +88,10 @@ class Renderer { std::vector<VkCommandBuffer> commandBuffers; VkCommandBuffer computeCommandBuffer; + + + VkPipelineLayout skyboxPipelineLayout; + VkPipeline skyboxPipeline; + VkBuffer skyboxVertexBuffer; + VkDeviceMemory skyboxVertexBufferMemory; }; diff --git a/src/SwapChain.cpp b/src/SwapChain.cpp index 711fec0..f5a682f 100644 --- a/src/SwapChain.cpp +++ b/src/SwapChain.cpp @@ -77,7 +77,9 @@ SwapChain::SwapChain(Device* device, VkSurfaceKHR vkSurface, unsigned int numBuf void SwapChain::Create() { auto* instance = device->GetInstance(); - const auto& surfaceCapabilities = instance->GetSurfaceCapabilities(); + // Re-query surface capabilities to get updated window dimensions + VkSurfaceCapabilitiesKHR surfaceCapabilities; + vkGetPhysicalDeviceSurfaceCapabilitiesKHR(instance->GetPhysicalDevice(), vkSurface, &surfaceCapabilities); VkSurfaceFormatKHR surfaceFormat = chooseSwapSurfaceFormat(instance->GetSurfaceFormats()); VkPresentModeKHR presentMode = chooseSwapPresentMode(instance->GetPresentModes()); @@ -199,14 +201,14 @@ bool SwapChain::Acquire() { vkQueueWaitIdle(device->GetQueue(QueueFlags::Present)); } VkResult result = vkAcquireNextImageKHR(device->GetVkDevice(), vkSwapChain, std::numeric_limits<uint64_t>::max(), imageAvailableSemaphore, VK_NULL_HANDLE, &imageIndex); - if (result != VK_SUCCESS && result != VK_SUBOPTIMAL_KHR) { - throw std::runtime_error("Failed to 
acquire swap chain image"); - } - + if (result == VK_ERROR_OUT_OF_DATE_KHR) { - Recreate(); return false; } + + if (result != VK_SUCCESS && result != VK_SUBOPTIMAL_KHR) { + throw std::runtime_error("Failed to acquire swap chain image"); + } return true; } @@ -228,15 +230,17 @@ bool SwapChain::Present() { VkResult result = vkQueuePresentKHR(device->GetQueue(QueueFlags::Present), &presentInfo); - if (result != VK_SUCCESS) { - throw std::runtime_error("Failed to present swap chain image"); - } if (result == VK_ERROR_OUT_OF_DATE_KHR || result == VK_SUBOPTIMAL_KHR) { - Recreate(); return false; } + if (result != VK_SUCCESS) { + throw std::runtime_error("Failed to present swap chain image"); + } + + + return true; } diff --git a/src/main.cpp b/src/main.cpp index 8bf822b..8fa6456 100644 --- a/src/main.cpp +++ b/src/main.cpp @@ -5,6 +5,9 @@ #include "Camera.h" #include "Scene.h" #include "Image.h" +#include <chrono> +#include <sstream> +#include <iomanip> Device* device; SwapChain* swapChain; @@ -143,10 +146,32 @@ int main() { glfwSetMouseButtonCallback(GetGLFWWindow(), mouseDownCallback); glfwSetCursorPosCallback(GetGLFWWindow(), mouseMoveCallback); + auto lastTime = std::chrono::high_resolution_clock::now(); + int frameCount = 0; + while (!ShouldQuit()) { glfwPollEvents(); scene->UpdateTime(); renderer->Frame(); + + frameCount++; + auto currentTime = std::chrono::high_resolution_clock::now(); + float deltaTime = std::chrono::duration<float>(currentTime - lastTime).count(); + + if (deltaTime >= 0.5f) { + float fps = frameCount / deltaTime; + float frameTimeMs = deltaTime / frameCount * 1000.0f; + + std::ostringstream title; + title << "Vulkan Grass Rendering | FPS: " + << std::fixed << std::setprecision(1) << fps + << " | Frame Time: " << std::setprecision(2) << frameTimeMs << "ms"; + + glfwSetWindowTitle(GetGLFWWindow(), title.str().c_str()); + + frameCount = 0; + lastTime = currentTime; + } } vkDeviceWaitIdle(device->GetVkDevice()); @@ -161,6 +186,7 @@ int main() { delete renderer; delete swapChain; 
delete device; + vkDestroySurfaceKHR(instance->GetVkInstance(), surface, nullptr); delete instance; DestroyWindow(); return 0; diff --git a/src/shaders/compute.comp b/src/shaders/compute.comp index 0fd0224..7d45f44 100644 --- a/src/shaders/compute.comp +++ b/src/shaders/compute.comp @@ -1,4 +1,4 @@ -#version 450 +#version 450 #extension GL_ARB_separate_shader_objects : enable #define WORKGROUP_SIZE 32 @@ -36,21 +36,177 @@ struct Blade { // uint firstInstance; // = 0 // } numBlades; + + + +layout(set = 2, binding = 0) buffer InputBlades { + Blade blades[]; +} inputBlades; + +layout(set = 2, binding = 1) buffer CulledBlades { + Blade blades[]; +} culledBlades; + +layout(set = 2, binding = 2) buffer NumBlades { + uint vertexCount; + uint instanceCount; + uint firstVertex; + uint firstInstance; +} numBlades; + bool inBounds(float value, float bounds) { return (value >= -bounds) && (value <= bounds); } void main() { // Reset the number of blades to 0 - if (gl_GlobalInvocationID.x == 0) { - // numBlades.vertexCount = 0; - } - barrier(); // Wait till all threads reach this point + if (gl_GlobalInvocationID.x == 0) { + numBlades.vertexCount = 0; + } + barrier(); - // TODO: Apply forces on every blade and update the vertices in the buffer + uint index = gl_GlobalInvocationID.x; + + if (index >= inputBlades.blades.length()) { + return; + } + + Blade blade = inputBlades.blades[index]; + + vec3 v0 = blade.v0.xyz; + vec3 v1 = blade.v1.xyz; + vec3 v2 = blade.v2.xyz; + vec3 up = blade.up.xyz; + + float orientation = blade.v0.w; + float height = blade.v1.w; + float width = blade.v2.w; + float stiffness = blade.up.w; + + // PHYSICS SIMULATION + + float dt = min(deltaTime, 0.1); + vec3 iv2 = v0 + up * height; + + // Recovery + vec3 recovery = (iv2 - v2) * stiffness; + + // Gravity + vec3 gE = vec3(0.0, -9.8, 0.0); + vec3 front = normalize(cross(up, vec3(sin(orientation), 0.0, cos(orientation)))); + vec3 gF = 0.25 * length(gE) * front; + vec3 gravity = gE + gF; + + // Wind + float 
windStrength = 2.0; + vec3 windDirection = vec3(1.0, 0.0, 0.0); + float windWave = sin(totalTime * 2.0 + v0.x * 0.5 + v0.z * 0.5); + vec3 wind = windDirection * windStrength * windWave; + + float directionalAlignment = 1.0 - abs(dot(normalize(wind), normalize(v2 - v0))); + float heightRatio = dot(v2 - v0, up) / height; + float windAlignment = directionalAlignment * heightRatio; + wind *= windAlignment; + + // Total force + vec3 totalForce = recovery + gravity + wind; + v2 += totalForce * dt; + + // Damping on displacement + vec3 displacement = v2 - v0; + displacement *= 0.97; + v2 = v0 + displacement; + + // State validation + v2 = v2 - up * min(dot(up, v2 - v0), 0.0); - // TODO: Cull blades that are too far away or not in the camera frustum and write them - // to the culled blades buffer - // Note: to do this, you will need to use an atomic operation to read and update numBlades.vertexCount - // You want to write the visible blades to the buffer without write conflicts between threads -} + float currentHeight = dot(v2 - v0, up); + float minHeight = height * 0.7; + vec3 horizontalDisp = v2 - v0 - up * currentHeight; + v2 = v0 + up * max(currentHeight, minHeight) + horizontalDisp; + + float lproj = length(v2 - v0 - up * dot(v2 - v0, up)); + v1 = v0 + up * height * max(1.0 - lproj / height, 0.05 * max(lproj / height, 1.0)); + + float L0 = distance(v2, v0); + float L1 = distance(v0, v1) + distance(v1, v2); + float L = (2.0 * L0 + L1) / 3.0; + + float r = height / max(L, 0.01); + r = clamp(r, 0.5, 2.0); + + vec3 v1corr = v0 + r * (v1 - v0); + vec3 v2corr = v1corr + r * (v2 - v1); + + if (any(isnan(v1corr)) || any(isnan(v2corr)) || + length(v2corr - v0) > height * 3.0) { + v1corr = v0 + up * height * 0.5; + v2corr = v0 + up * height; + } + + blade.v1.xyz = v1corr; + blade.v2.xyz = v2corr; + + // Write back to preserve state + inputBlades.blades[index] = blade; + + // CULLING TESTS + + // Orientation culling + vec3 cameraPos = inverse(camera.view)[3].xyz; + vec3 viewDir 
= normalize(v0 - cameraPos); + vec3 bladeDir = normalize(cross(up, front)); + + float orientationThreshold = 0.9; + if (abs(dot(viewDir, bladeDir)) > orientationThreshold) { + return; + } + + // View-frustum culling + vec3 m = 0.25 * v0 + 0.5 * v1corr + 0.25 * v2corr; + + vec4 clipV0 = camera.proj * camera.view * vec4(v0, 1.0); + vec4 clipV2 = camera.proj * camera.view * vec4(v2corr, 1.0); + vec4 clipM = camera.proj * camera.view * vec4(m, 1.0); + + float tolerance = 1.0; + + float hV0 = clipV0.w + tolerance; + float hV2 = clipV2.w + tolerance; + float hM = clipM.w + tolerance; + + bool v0InFrustum = abs(clipV0.x) <= hV0 && abs(clipV0.y) <= hV0 && abs(clipV0.z) <= hV0; + bool v2InFrustum = abs(clipV2.x) <= hV2 && abs(clipV2.y) <= hV2 && abs(clipV2.z) <= hV2; + bool mInFrustum = abs(clipM.x) <= hM && abs(clipM.y) <= hM && abs(clipM.z) <= hM; + + if (!v0InFrustum && !v2InFrustum && !mInFrustum) { + return; + } + + // Distance culling + float dist = length(v0 - cameraPos); + float maxDistance = 50.0; + + if (dist > maxDistance) { + return; + } + + // Distance-based random culling + int numBuckets = 10; + float bucketSize = maxDistance / float(numBuckets); + int bucket = int(dist / bucketSize); + + // Use a hash to randomly cull based on distance + uint hash = uint(v0.x * 12345.0 + v0.z * 67890.0 + index * 1000); + float cullProbability = float(bucket) / float(numBuckets); + + if ((hash % 100) / 100.0 < cullProbability) { + return; + } + + + // WRITE TO CULLED BUFFER + + uint outputIndex = atomicAdd(numBlades.vertexCount, 1); + culledBlades.blades[outputIndex] = blade; +} \ No newline at end of file diff --git a/src/shaders/grass.frag b/src/shaders/grass.frag index c7df157..759d2dc 100644 --- a/src/shaders/grass.frag +++ b/src/shaders/grass.frag @@ -8,10 +8,42 @@ layout(set = 0, binding = 0) uniform CameraBufferObject { // TODO: Declare fragment shader inputs +layout(location = 0) in vec3 fragNormal; +layout(location = 1) in vec2 fragUV; + layout(location = 0) out vec4 
outColor; void main() { // TODO: Compute fragment color - outColor = vec4(1.0); + vec3 grassColorBottom = vec3(0.1, 0.4, 0.1); + vec3 grassColorTop = vec3(0.3, 0.8, 0.3); + + vec3 grassColor = mix(grassColorBottom, grassColorTop, fragUV.y); + + vec3 lightDir = normalize(vec3(0.5, 1.0, 0.3)); // Sun direction + vec3 normal = normalize(fragNormal); + + // Diffuse lighting + float NdotL = dot(normal, lightDir); + float diffuse = (NdotL + 0.5) / 1.5; // Wrap lighting + diffuse = clamp(diffuse, 0.0, 1.0); + + // Ambient light (base lighting when no direct sun) + vec3 ambient = grassColor * 0.3; + + + vec3 diffuseColor = grassColor * diffuse * 0.7; + + vec3 viewDir = normalize(vec3(0.0, 1.0, 0.0)); + + // Rim effect: edges perpendicular to view are brighter + float rim = 1.0 - abs(dot(normal, viewDir)); + rim = pow(rim, 3.0); // Sharp falloff + + vec3 rimColor = vec3(0.8, 1.0, 0.6) * rim * 0.3; // rimLight + + vec3 finalColor = ambient + diffuseColor + rimColor; + + outColor = vec4(finalColor, 1.0); } diff --git a/src/shaders/grass.tesc b/src/shaders/grass.tesc index f9ffd07..6bd6947 100644 --- a/src/shaders/grass.tesc +++ b/src/shaders/grass.tesc @@ -10,6 +10,16 @@ layout(set = 0, binding = 0) uniform CameraBufferObject { // TODO: Declare tessellation control shader inputs and outputs +layout(location = 0) in vec4 in_v0[]; +layout(location = 1) in vec4 in_v1[]; +layout(location = 2) in vec4 in_v2[]; +layout(location = 3) in vec4 in_up[]; + +layout(location = 0) out vec4 out_v0[]; +layout(location = 1) out vec4 out_v1[]; +layout(location = 2) out vec4 out_v2[]; +layout(location = 3) out vec4 out_up[]; + void main() { // Don't move the origin location of the patch gl_out[gl_InvocationID].gl_Position = gl_in[gl_InvocationID].gl_Position; @@ -23,4 +33,28 @@ void main() { // gl_TessLevelOuter[1] = ??? // gl_TessLevelOuter[2] = ??? // gl_TessLevelOuter[3] = ??? 
+ + + out_v0[gl_InvocationID] = in_v0[gl_InvocationID]; + out_v1[gl_InvocationID] = in_v1[gl_InvocationID]; + out_v2[gl_InvocationID] = in_v2[gl_InvocationID]; + out_up[gl_InvocationID] = in_up[gl_InvocationID]; + +// Calculate distance-based LOD with smooth transition +vec3 cameraPos = inverse(camera.view)[3].xyz; +vec3 bladePos = in_v0[gl_InvocationID].xyz; +float dist = length(cameraPos - bladePos); + +// Smooth interpolation from 5.0 to 1.0 based on distance +float tessLevel = mix(5.0, 1.0, smoothstep(5.0, 50.0, dist)); + +// Clamp to valid range +tessLevel = clamp(tessLevel, 1.0, 5.0); + +gl_TessLevelInner[0] = tessLevel; +gl_TessLevelInner[1] = 1.0; +gl_TessLevelOuter[0] = tessLevel; +gl_TessLevelOuter[1] = 1.0; +gl_TessLevelOuter[2] = tessLevel; +gl_TessLevelOuter[3] = 1.0; } diff --git a/src/shaders/grass.tese b/src/shaders/grass.tese index 751fff6..5692b5b 100644 --- a/src/shaders/grass.tese +++ b/src/shaders/grass.tese @@ -1,4 +1,4 @@ -#version 450 +#version 450 #extension GL_ARB_separate_shader_objects : enable layout(quads, equal_spacing, ccw) in; @@ -10,9 +10,58 @@ layout(set = 0, binding = 0) uniform CameraBufferObject { // TODO: Declare tessellation evaluation shader inputs and outputs +layout(location = 0) in vec4 in_v0[]; +layout(location = 1) in vec4 in_v1[]; +layout(location = 2) in vec4 in_v2[]; +layout(location = 3) in vec4 in_up[]; + + +layout(location = 0) out vec3 fragNormal; +layout(location = 1) out vec2 fragUV; + void main() { float u = gl_TessCoord.x; float v = gl_TessCoord.y; - + // TODO: Use u and v to parameterize along the grass blade and output positions for each vertex of the grass blade + + vec3 v0 = in_v0[0].xyz; + vec3 v1 = in_v1[0].xyz; + vec3 v2 = in_v2[0].xyz; + vec3 up = in_up[0].xyz; + + float orientation = in_v0[0].w; + float height = in_v1[0].w; + float width = in_v2[0].w; + + vec3 t1 = vec3(sin(orientation), 0.0, cos(orientation)); + + float t = v; + vec3 a = v0 + t * (v1 - v0); + vec3 b = v1 + t * (v2 - v1); + vec3 c = 
a + t * (b - a); + + + float currentWidth = width * (1.0 - v); + + vec3 c0 = c - currentWidth * t1; + vec3 c1 = c + currentWidth * t1; + + vec3 worldPos = mix(c0, c1, u); + + + // Calculate proper normal for grass blade face + vec3 tangentAlongBlade = normalize(v2 - v0); // Direction along blade height + vec3 normal = normalize(cross(t1, tangentAlongBlade)); + + // Make sure normal points outward based on which side of blade + if (u < 0.5) { + normal = -normal; // Flip for left side + } + +fragNormal = normal; + + fragUV = vec2(u, v); + + gl_Position = camera.proj * camera.view * vec4(worldPos, 1.0); } diff --git a/src/shaders/grass.vert b/src/shaders/grass.vert index db9dfe9..bfbf422 100644 --- a/src/shaders/grass.vert +++ b/src/shaders/grass.vert @@ -8,10 +8,31 @@ layout(set = 1, binding = 0) uniform ModelBufferObject { // TODO: Declare vertex shader inputs and outputs +layout(location = 0) in vec4 v0; +layout(location = 1) in vec4 v1; +layout(location = 2) in vec4 v2; +layout(location = 3) in vec4 up; + + +layout(location = 0) out vec4 out_v0; +layout(location = 1) out vec4 out_v1; +layout(location = 2) out vec4 out_v2; +layout(location = 3) out vec4 out_up; + out gl_PerVertex { vec4 gl_Position; + float gl_PointSize; + float gl_ClipDistance[]; + float gl_CullDistance[]; }; void main() { // TODO: Write gl_Position and any other shader outputs + out_v0 = v0; + out_v1 = v1; + out_v2 = v2; + out_up = up; + + gl_Position = vec4(v0.xyz, 1.0); + gl_PointSize = 1.0; } diff --git a/src/shaders/skybox.frag b/src/shaders/skybox.frag new file mode 100644 index 0000000..2bdf8ec --- /dev/null +++ b/src/shaders/skybox.frag @@ -0,0 +1,58 @@ +#version 450 +#extension GL_ARB_separate_shader_objects : enable + +layout(location = 0) in vec3 fragTexCoord; +layout(location = 0) out vec4 outColor; + +void main() { + vec3 dir = normalize(fragTexCoord); + + // Sky Gradient + float height = dir.y; // -1 (bottom) to 1 (top) + + // Colors + vec3 skyColorTop = vec3(0.1, 0.3, 0.8); // Dark 
blue at zenith + vec3 skyColorHorizon = vec3(0.6, 0.8, 1.0); // Light blue at horizon + vec3 groundColor = vec3(0.4, 0.35, 0.3); // Brown below horizon + + vec3 skyColor; + if (height > 0.0) { + // Above horizon - gradient from horizon to top + float t = pow(height, 0.7); // Non-linear for better look + skyColor = mix(skyColorHorizon, skyColorTop, t); + } else { + // Below horizon - fade to ground color + float t = pow(-height, 0.5); + skyColor = mix(skyColorHorizon, groundColor, t); + } + + // Sun + vec3 sunDir = normalize(vec3(0.5, 0.8, 0.3)); // Sun position + float sun = dot(dir, sunDir); + sun = smoothstep(0.995, 0.999, sun); // Sharp sun disk + + // Sun glow + float sunGlow = dot(dir, sunDir); + sunGlow = max(sunGlow, 0.0); + sunGlow = pow(sunGlow, 8.0) * 0.3; // Soft glow around sun + + vec3 sunColor = vec3(1.0, 0.9, 0.7); + skyColor += sunColor * (sun + sunGlow); + + // Clouds + // Simple procedural clouds using direction + float cloudPattern = sin(dir.x * 10.0 + dir.z * 8.0) * + cos(dir.x * 8.0 - dir.z * 10.0); + cloudPattern = cloudPattern * 0.5 + 0.5; // Remap to 0-1 + + // Only show clouds in upper sky + float cloudMask = smoothstep(0.0, 0.3, height) * + smoothstep(1.0, 0.5, height); + + cloudPattern = pow(cloudPattern, 3.0); // Make clouds puffier + vec3 cloudColor = vec3(1.0, 1.0, 1.0); + skyColor = mix(skyColor, cloudColor, cloudPattern * cloudMask * 0.5); + + + outColor = vec4(skyColor, 1.0); +} \ No newline at end of file diff --git a/src/shaders/skybox.vert b/src/shaders/skybox.vert new file mode 100644 index 0000000..96dde55 --- /dev/null +++ b/src/shaders/skybox.vert @@ -0,0 +1,27 @@ +#version 450 +#extension GL_ARB_separate_shader_objects : enable + +layout(set = 0, binding = 0) uniform CameraBufferObject { + mat4 view; + mat4 proj; +} camera; + +layout(location = 0) in vec3 inPosition; + +layout(location = 0) out vec3 fragTexCoord; + +out gl_PerVertex { + vec4 gl_Position; +}; + +void main() { + // Remove translation from view matrix (only 
rotation) + mat4 viewNoTranslation = mat4(mat3(camera.view)); + + vec4 pos = camera.proj * viewNoTranslation * vec4(inPosition, 1.0); + + // Set z = w so depth is always 1.0 (farthest) + gl_Position = pos.xyww; + + fragTexCoord = inPosition; +} \ No newline at end of file
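Reviewer sketch (not part of the patch): the skybox vertex shader above relies on two steps, stripping the translation from the view matrix with `mat4(mat3(camera.view))` and forcing far-plane depth with the `pos.xyww` swizzle. The C++ below mirrors both steps on the CPU so the arithmetic can be checked in isolation. The names `Vec4`, `Mat4`, `identity`, `stripTranslation`, `mul`, `forceFarDepth`, `ndcDepth`, and `skyboxDepthIsFar` are hypothetical helpers introduced for illustration and do not exist in the repository; the sample matrix and clip-space values are made up.

```cpp
#include <array>
#include <cassert>

using Vec4 = std::array<float, 4>;
// Column-major 4x4 matrix indexed as m[column][row], matching GLSL.
using Mat4 = std::array<std::array<float, 4>, 4>;

// Hypothetical helper: 4x4 identity matrix.
Mat4 identity() {
    Mat4 m = {};
    for (int i = 0; i < 4; ++i) m[i][i] = 1.0f;
    return m;
}

// Mirrors GLSL mat4(mat3(view)): keep the upper-left 3x3 block and fill
// the rest from the identity, which zeroes the translation column.
Mat4 stripTranslation(const Mat4& view) {
    Mat4 out = identity();
    for (int c = 0; c < 3; ++c)
        for (int r = 0; r < 3; ++r)
            out[c][r] = view[c][r];
    return out;
}

// Matrix * column vector.
Vec4 mul(const Mat4& m, const Vec4& v) {
    Vec4 out = {0.0f, 0.0f, 0.0f, 0.0f};
    for (int c = 0; c < 4; ++c)
        for (int r = 0; r < 4; ++r)
            out[r] += m[c][r] * v[c];
    return out;
}

// Mirrors the swizzle pos.xyww: z is replaced by w.
Vec4 forceFarDepth(const Vec4& clip) {
    return {clip[0], clip[1], clip[3], clip[3]};
}

// Depth after the perspective divide.
float ndcDepth(const Vec4& clip) {
    return clip[2] / clip[3];
}

// Self-check combining both steps on made-up sample values.
bool skyboxDepthIsFar() {
    Mat4 view = identity();
    view[3] = {5.0f, -2.0f, 7.0f, 1.0f};  // camera translation
    Vec4 corner = {1.0f, 1.0f, -1.0f, 1.0f};
    Vec4 rotated = mul(stripTranslation(view), corner);
    assert(rotated[0] == 1.0f);  // translation no longer moves the cube
    return ndcDepth(forceFarDepth(rotated)) == 1.0f;
}
```

Because `z / w` is exactly 1.0 after the swizzle, the skybox pipeline's `VK_COMPARE_OP_LESS_OR_EQUAL` depth test passes against a depth buffer cleared to 1.0, assuming that is the clear value the render pass uses.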