fix: 2M tendencies, params #694
Merged
This change is part of the following stack:
Change managed by git-spice.
haakon-e
commented
Mar 2, 2026
haakon-e
left a comment
Comments to assist review.
Leaving some code here that didn't make it into this PR. It tests the GPU performance of exact vs. approximated gamma functions.
Code, not related to this PR, just for myself:
#=
GPU Performance Benchmark: rain_evaporation vs rain_evaporation_CPU
Compares the GPU performance of:
- CM2.rain_evaporation (uses Γ_incl approximation — designed for GPU)
- rain_evaporation_CPU (uses SF.gamma — exact incomplete gamma functions)
Both functions are GPU-compatible; this script measures their relative throughput.
=#
using KernelAbstractions
using ClimaComms
ClimaComms.@import_required_backends
import SpecialFunctions as SF
import ClimaParams as CP
import CloudMicrophysics.Parameters as CMP
import CloudMicrophysics.ThermodynamicsInterface as TDI
import CloudMicrophysics.Common as CO
import CloudMicrophysics.Microphysics2M as CM2
import CloudMicrophysics.Utilities as UT
ClimaComms.device() isa ClimaComms.CUDADevice || error("No GPU found")
using CUDA
backend = CUDABackend()
CUDA.allowscalar(false)
ArrayType = CuArray
# ArrayType = Array
# backend = CPU()
# ---------------------------------------------------------------------------- #
# rain_evaporation_CPU — copy from RainEvapoartionSB2006.jl (uses SF.gamma) #
# ---------------------------------------------------------------------------- #
function rain_evaporation_CPU(SB2006, aps, tps, qₜ, qₗ, qᵢ, qᵣ, qₛ, ρ, Nᵣ, T)
FT = typeof(qᵣ)
ϵₘ = UT.ϵ_numerics_2M_M(FT)
ϵₙ = UT.ϵ_numerics_2M_N(FT)
∂ₜρn_rai = zero(qᵣ)
∂ₜq_rai = zero(qᵣ)
S = TDI.supersaturation_over_liquid(tps, qₜ, qₗ + qᵣ, qᵢ + qₛ, ρ, T)
(Nᵣ ≤ ϵₙ || S ≥ 0) && return (; ∂ₜρn_rai, ∂ₜq_rai)
(; ν_air, D_vapor) = aps
(; av, bv, α, β, ρ0) = SB2006.evap
ρw = SB2006.pdf_r.ρw
x_star = SB2006.pdf_r.xr_min
G = CO.G_func_liquid(aps, tps, T)
(; xr_mean) = CM2.pdf_rain_parameters(SB2006.pdf_r, qᵣ, ρ, Nᵣ)
Dr = cbrt(6 * xr_mean / (π * ρw))
t_star = cbrt(6 * x_star / xr_mean)
# gam = t_star^(-1) * SF.expint(1 - (-1), t_star)
# gam = SF.gamma(FT(-1), t_star)
# a_vent_0 = av * gam / FT(6)^(-2 // 3)
a_vent_0 = av * SF.gamma(FT(-1), t_star) / FT(6)^(-2 // 3)
gam = SF.gamma(FT(-1), t_star) / FT(6)^(-2 // 3)
b_vent_0 = bv * SF.gamma(FT((-1 // 2) + 3 // 2 * β), t_star) / FT(6)^(β / 2 - 1 // 2)
a_vent_1 = av * SF.gamma(FT(2)) / cbrt(FT(6))
b_vent_1 = bv * SF.gamma(5 // 2 + 3 // 2 * β) / FT(6)^(β / 2 + 1 // 2)
N_Re = α * xr_mean^β * sqrt(ρ0 / ρ) * Dr / ν_air
Fv0 = a_vent_0 + b_vent_0 * cbrt(ν_air / D_vapor) * sqrt(N_Re)
Fv1 = a_vent_1 + b_vent_1 * cbrt(ν_air / D_vapor) * sqrt(N_Re)
∂ₜρn_rai = min(0, 2 * FT(π) * G * S * Nᵣ * Dr * Fv0 / xr_mean)
∂ₜq_rai = min(0, 2 * FT(π) * G * S * Nᵣ * Dr * Fv1 / ρ)
∂ₜρn_rai = ifelse(xr_mean / x_star < eps(FT), FT(0), ∂ₜρn_rai)
∂ₜq_rai = ifelse(qᵣ < ϵₘ, FT(0), ∂ₜq_rai)
return (; ∂ₜρn_rai, ∂ₜq_rai)
end
# let # test type stability
# FT = Float32
# tps = TDI.TD.Parameters.ThermodynamicsParameters(FT)
# aps = CMP.AirProperties(FT)
# SB2006 = CMP.SB2006(FT)
# qₜ = FT(1e-2)
# qₗ = FT(1e-3)
# qᵢ = FT(1e-4)
# qᵣ = FT(1e-5)
# qₛ = FT(1e-6)
# ρ = FT(1.0)
# Nᵣ = FT(1e6)
# T = FT(273.15)
# @code_warntype CM2.rain_evaporation(SB2006, aps, tps, qₜ, qₗ, qᵢ, qᵣ, qₛ, ρ, Nᵣ, T)
# # @code_warntype rain_evaporation_CPU(SB2006, aps, tps, qₜ, qₗ, qᵢ, qᵣ, qₛ, ρ, Nᵣ, T)
# end
# ---------------------------------------------------------------------------- #
# GPU Kernels #
# ---------------------------------------------------------------------------- #
@kernel inbounds = true function kernel_rain_evap_approx!(
SB2006, aps, tps, output, qₜ, qₗ, qᵢ, qᵣ, qₛ, ρ, Nᵣ, T,
)
i = @index(Global, Linear)
output[i] = CM2.rain_evaporation(
SB2006, aps, tps,
qₜ[i], qₗ[i], qᵢ[i], qᵣ[i], qₛ[i], ρ[i], Nᵣ[i], T[i],
)
end
@kernel inbounds = true function kernel_rain_evap_exact!(
SB2006, aps, tps, output, qₜ, qₗ, qᵢ, qᵣ, qₛ, ρ, Nᵣ, T,
)
i = @index(Global, Linear)
output[i] = rain_evaporation_CPU(
SB2006, aps, tps,
qₜ[i], qₗ[i], qᵢ[i], qᵣ[i], qₛ[i], ρ[i], Nᵣ[i], T[i],
)
end
# let # simple test that gpu kernels compile and run
# N = 10
# FT = Float32
# tps = TDI.TD.Parameters.ThermodynamicsParameters(FT)
# aps = CMP.AirProperties(FT)
# SB2006 = CMP.SB2006(FT)
# rand_vec(lo, hi) = ArrayType(lo .+ (hi .- lo) .* rand(FT, N))
# qₜ = rand_vec(FT(5e-3), FT(2e-2)) # total water: 5–20 g/kg
# qₗ = FT(1e-3)
# qᵢ = FT(1e-4)
# qᵣ = FT(1e-5)
# qₛ = FT(1e-6)
# ρ = FT(1.0)
# Nᵣ = FT(1e6)
# T = FT(273.15)
# DT = @NamedTuple{∂ₜρn_rai::FT, ∂ₜq_rai::FT}
# output_approx = allocate(backend, DT, N)
# output_exact = allocate(backend, DT, N)
# work_groups = 2
# # --- Warm up (compile) ---
# k_approx! = kernel_rain_evap_approx!(backend, work_groups)
# k_exact! = kernel_rain_evap_exact!(backend, work_groups)
# # k_approx!(SB2006, aps, tps, output_approx, qₜ, qₗ, qᵢ, qᵣ, qₛ, ρ, Nᵣ, T; ndrange = N)
# # KernelAbstractions.synchronize(backend)
# k_exact!(SB2006, aps, tps, output_exact, qₜ, qₗ, qᵢ, qᵣ, qₛ, ρ, Nᵣ, T; ndrange = N)
# KernelAbstractions.synchronize(backend)
# end
# let # simple test to check that SF.gamma works on GPU
# FT = Float32
# N = 100
# rand_vec(lo, hi) = ArrayType(lo .+ (hi .- lo) .* rand(FT, N))
# x = rand_vec(FT(0), FT(10))
# y = SF.gamma.(x)
# end
import Random
function run_complex_compiler(; FT = Float64, N = 10_000)
tps = TDI.TD.Parameters.ThermodynamicsParameters(FT)
aps = CMP.AirProperties(FT)
SB2006 = CMP.SB2006(FT)
seed = Random.MersenneTwister(12345)
rand_vec(lo, hi) = ArrayType(lo .+ (hi .- lo) .* rand(seed, FT, N))
qₜ = rand_vec(FT(5e-3), FT(2e-2)) # total water: 5–20 g/kg
qₗ = rand_vec(FT(0), FT(3e-3)) # cloud liquid: 0–3 g/kg
qᵢ = rand_vec(FT(0), FT(1e-3)) # cloud ice: 0–1 g/kg
qᵣ = rand_vec(FT(1e-9), FT(5e-4)) # rain: ~0–0.5 g/kg (includes near-zero)
qₛ = rand_vec(FT(0), FT(1e-4)) # snow: 0–0.1 g/kg
ρ = rand_vec(FT(0.4), FT(1.3)) # air density: 0.4–1.3 kg/m³
Nᵣ = rand_vec(FT(1e4), FT(1e9)) # rain number: 1e4–1e9 /m³
T = rand_vec(FT(273.15), FT(310.0)) # temperature: 273–310 K
# Determine output type
DT = @NamedTuple{∂ₜρn_rai::FT, ∂ₜq_rai::FT}
output_exact = allocate(backend, DT, N)
work_groups = 1
k_exact! = kernel_rain_evap_exact!(backend, work_groups)
k_exact!(SB2006, aps, tps, output_exact, qₜ, qₗ, qᵢ, qᵣ, qₛ, ρ, Nᵣ, T; ndrange = N)
KernelAbstractions.synchronize(backend)
end
# Run for both Float64 and Float32
println("="^60)
println(" Float64")
println("="^60)
run_complex_compiler(; FT = Float64)
println("="^60)
println(" Float32")
println("="^60)
run_complex_compiler(; FT = Float32)
# ---------------------------------------------------------------------------- #
# Benchmark runner #
# ---------------------------------------------------------------------------- #
function run_benchmark(; FT = Float64, N = 10_000, nreps = 20)
@info "Setting up parameters" FT N nreps
tps = TDI.TD.Parameters.ThermodynamicsParameters(FT)
aps = CMP.AirProperties(FT)
SB2006 = CMP.SB2006(FT)
# Random inputs spanning realistic ranges to exercise all code paths:
# - some points supersaturated (no evap), some sub-saturated (full calc)
# - qᵣ from near-zero to moderate rain
# - Nᵣ spanning several orders of magnitude
# - T from near-freezing to warm
rand_vec(lo, hi) = ArrayType(lo .+ (hi .- lo) .* rand(FT, N))
qₜ = rand_vec(FT(5e-3), FT(2e-2)) # total water: 5–20 g/kg
qₗ = rand_vec(FT(0), FT(3e-3)) # cloud liquid: 0–3 g/kg
qᵢ = rand_vec(FT(0), FT(1e-3)) # cloud ice: 0–1 g/kg
qᵣ = rand_vec(FT(1e-9), FT(5e-4)) # rain: ~0–0.5 g/kg (includes near-zero)
qₛ = rand_vec(FT(0), FT(1e-4)) # snow: 0–0.1 g/kg
ρ = rand_vec(FT(0.4), FT(1.3)) # air density: 0.4–1.3 kg/m³
Nᵣ = rand_vec(FT(1e4), FT(1e9)) # rain number: 1e4–1e9 /m³
T = rand_vec(FT(273.15), FT(310.0)) # temperature: 273–310 K
# Determine output type
DT = @NamedTuple{∂ₜρn_rai::FT, ∂ₜq_rai::FT}
output_approx = allocate(backend, DT, N)
output_exact = allocate(backend, DT, N)
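# Note: the second argument to a KernelAbstractions kernel constructor is the
# workgroup size (threads per group), not the number of groups. A size of 1 runs,
# but it likely underutilizes the GPU, so absolute timings may be pessimistic.
work_groups = 1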
# --- Warm up (compile) ---
k_approx! = kernel_rain_evap_approx!(backend, work_groups)
k_exact! = kernel_rain_evap_exact!(backend, work_groups)
k_approx!(SB2006, aps, tps, output_approx, qₜ, qₗ, qᵢ, qᵣ, qₛ, ρ, Nᵣ, T; ndrange = N)
KernelAbstractions.synchronize(backend)
k_exact!(SB2006, aps, tps, output_exact, qₜ, qₗ, qᵢ, qᵣ, qₛ, ρ, Nᵣ, T; ndrange = N)
KernelAbstractions.synchronize(backend)
# --- Verify correctness ---
out_a = Array(output_approx)
out_e = Array(output_exact)
# Max relative error of one output field; eps(FT) guards against division by zero
max_rel_err(field) = maximum(
abs.(getproperty.(out_a, field) .- getproperty.(out_e, field)) ./
(abs.(getproperty.(out_e, field)) .+ eps(FT)),
)
max_rel_err_ρn = max_rel_err(:∂ₜρn_rai)
max_rel_err_q = max_rel_err(:∂ₜq_rai)
println("\n--- Correctness Check ---")
println("Max relative error ∂ₜρn_rai: $(max_rel_err_ρn)")
println("Max relative error ∂ₜq_rai: $(max_rel_err_q)")
# --- Benchmark: CM2.rain_evaporation (Γ_incl approximation) ---
KernelAbstractions.synchronize(backend) # backend-agnostic; drain pending work before timing
t_approx = @elapsed begin
for _ in 1:nreps
k_approx!(SB2006, aps, tps, output_approx, qₜ, qₗ, qᵢ, qᵣ, qₛ, ρ, Nᵣ, T; ndrange = N)
end
KernelAbstractions.synchronize(backend)
end
# --- Benchmark: rain_evaporation_CPU (SF.gamma exact) ---
KernelAbstractions.synchronize(backend) # backend-agnostic; drain pending work before timing
t_exact = @elapsed begin
for _ in 1:nreps
k_exact!(SB2006, aps, tps, output_exact, qₜ, qₗ, qᵢ, qᵣ, qₛ, ρ, Nᵣ, T; ndrange = N)
end
KernelAbstractions.synchronize(backend)
end
# --- Results ---
println("\n--- GPU Benchmark Results ($FT, N=$N, nreps=$nreps) ---")
println(
"CM2.rain_evaporation (Γ_incl approx): $(round(t_approx * 1000; digits=2)) ms total, $(round(t_approx / nreps * 1e6; digits=2)) μs/launch",
)
println(
"rain_evaporation_CPU (SF.gamma exact): $(round(t_exact * 1000; digits=2)) ms total, $(round(t_exact / nreps * 1e6; digits=2)) μs/launch",
)
println("Speedup of approx over exact (t_exact / t_approx): $(round(t_exact / t_approx; digits=2))x")
return (; t_approx, t_exact, max_rel_err_ρn, max_rel_err_q)
end
# Run for both Float64 and Float32
# println("="^60)
# println(" Float64")
# println("="^60)
# result_f64 = run_benchmark(; FT = Float64)
# println("\n" * "="^60)
# println(" Float32")
# println("="^60)
# result_f32 = run_benchmark(; FT = Float32)
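The warm-up-then-time pattern and the guarded relative-error check used in run_benchmark can be tried without a GPU. Below is a minimal CPU-only sketch using only Base Julia. The helpers `max_rel_err` and `time_per_call` are hypothetical names for illustration, and a truncated Taylor series stands in for the approximated function; none of this is part of the PR or of CloudMicrophysics.jl.

```julia
# Hypothetical helpers for illustration only; not part of CloudMicrophysics.jl.

# Max relative error between approx and exact, guarded against division by zero.
max_rel_err(approx, exact) =
    maximum(abs.(approx .- exact) ./ (abs.(exact) .+ eps(eltype(exact))))

# Average seconds per call of f(), excluding the first (compiling) call.
function time_per_call(f; nreps = 20)
    f()  # warm up: triggers JIT compilation so it is not included in the timing
    t = @elapsed for _ in 1:nreps
        f()
    end
    return t / nreps
end

# Stand-in "approximation": 6-term Taylor series of exp around 0.
taylor_exp(xi) = sum(xi^k / factorial(k) for k in 0:5)

x = collect(range(0.1, 2.0; length = 1_000))
exact = exp.(x)
approx = taylor_exp.(x)

err = max_rel_err(approx, exact)
t_exact = time_per_call(() -> exp.(x))
t_approx = time_per_call(() -> taylor_exp.(x))
println("max rel err = ", err, ", t_exact / t_approx = ", t_exact / t_approx)
```

On a GPU backend the timed region additionally needs a synchronize call before reading the clock, as the script above does, because kernel launches return before the work has finished.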
haakon-e
commented
Mar 11, 2026
haakon-e
commented
Mar 12, 2026
trontrytel
reviewed
Mar 16, 2026
trontrytel
approved these changes
Mar 16, 2026
trontrytel
left a comment
Left two comments. Looks good otherwise.
Is it a breaking change from the point of view of the very few 2M simulations we have in Atmos?
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #694 +/- ##
==========================================
- Coverage 92.26% 92.10% -0.16%
==========================================
Files 54 54
Lines 2263 2268 +5
==========================================
+ Hits 2088 2089 +1
- Misses 175 179 +4
- replace OrdinaryDiffEq by OrdinaryDiffEqLowOrderRK in test and parcel envs
- remove LogExpFunctions, RootSolvers, Thermodynamics from test env
- rename rain_evaporation return fields
- add condensation/evaporation (+sublimation/deposition) rates for 2M
- formatting
- docstrings
This pull request refactors and improves the implementation of rain evaporation and two-moment (2M) microphysics tendencies in the codebase.
- In src/Microphysics2M.jl, the rain_evaporation function now returns a NamedTuple (; ∂ₜρn_rai, ∂ₜq_rai) instead of (; evap_rate_0, evap_rate_1) for improved clarity. The docstring and formatting are also improved.
- In src/parameters/Microphysics2M.jl, a few convenience constructors are added:
  - RainParticlePDF_SB2006_limited and RainParticlePDF_SB2006_notlimited can now be constructed e.g. by RainParticlePDF_SB2006(param_dict; is_limited = true)
  - The SB2006 constructor is also simplified
- In src/parameters/Microphysics2MParams.jl [1], I make similar improvements and simplifications to construction.
- In src/BulkMicrophysicsTendencies.jl, I add the condensation/evaporation tendency to 2M, which was missing. Then I move the rain evaporation and number adjustment tendencies to warm_rain_tendencies_2m, changing its function signature, which makes it easier to reuse code between 2M with and without P3.
- Tests are updated accordingly.
Footnotes
[1] It is not entirely clear to me why this is a separate file, but I'll leave it as-is for now.
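For downstream callers, the field rename described above is a change in how the result is read. The sketch below uses a hypothetical stand-in, `fake_rain_evaporation`, that only mimics the new return shape with made-up numbers; it does not call CloudMicrophysics.jl, and the real function takes the arguments shown in the benchmark script.

```julia
# Hypothetical stand-in with the same return shape as the renamed rain_evaporation
# output. The values are made up and carry no physical meaning.
fake_rain_evaporation() = (; ∂ₜρn_rai = -1.0e3, ∂ₜq_rai = -2.0e-7)

# Property destructuring (Julia 1.7 or later) selects fields by name,
# so the order on the left-hand side does not matter.
(; ∂ₜq_rai, ∂ₜρn_rai) = fake_rain_evaporation()

# Code that still reads the old field names throws instead of silently
# returning a wrong value.
old_name_works = try
    fake_rain_evaporation().evap_rate_0
    true
catch
    false
end

println((; ∂ₜρn_rai, ∂ₜq_rai, old_name_works))
```

Because access by an old field name throws, any caller that was not updated should surface as an error at the first call rather than as a subtly wrong tendency.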