Molecular Communication GPU Simulator

Overview

GPU-accelerated Monte Carlo simulation of molecular communication in blood vessels. Simulates particle diffusion via Brownian motion with optional laminar drift, modeling how nanoscale messenger molecules move through the bloodstream.

Based on a master's thesis applying CUDA parallelism to molecular communication research.

Project Structure

src/
  common/
    params.h              # SimParams struct — all simulation constants
    cli.h                 # CLI parsing, usage, verbose output
  main.cu                 # GPU entry point — dispatches to CPU/GPU simulation
  main_cpu.cpp            # CPU-only entry point — no CUDA dependency
  simulation_cpu.cpp/.h   # CPU reference: Brownian motion, hit detection
  simulation_gpu.cu/.h    # GPU kernels: d_simulate_isolated (long), d_update (wide), d_reflection
  timing.c/.h             # Wall-clock timing utility
scripts/
  validate_1d_firsthit.py # Validates output against analytical solution (thesis eq 4.3)
  validate_before_commit.sh # Pre-commit build + validation
  colab_build_test.ipynb  # Google Colab notebook for GPU testing
  requirements.txt        # Python deps (numpy, matplotlib)
matlab/                   # Reference MATLAB Fokker-Planck implementations
docs/thesis.pdf           # Full thesis document

Physics Model

  • Brownian motion: x += sqrt(2 * Db * deltaT) * randn for each axis
  • Laminar drift: z += velocity * deltaT (z-direction only)
  • Wall reflection: parametric line-circle intersection in cylindrical blood vessel
  • Collision detection: spherical receiver or 1D planar limit
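
The first three rules above can be sketched in a few lines of Python. This is an illustrative reference only, not the code in simulation_cpu.cpp or the CUDA kernels. The function names brownian_step and reflect_in_cylinder are hypothetical, and the reflection handles a single bounce for a particle that starts inside the wall.

```python
import math
import random


def brownian_step(pos, db, delta_t, velocity, rng):
    """Advance one particle by one time step; pos is (x, y, z) in metres."""
    sigma = math.sqrt(2.0 * db * delta_t)  # per-axis displacement std dev
    x, y, z = pos
    x += sigma * rng.gauss(0.0, 1.0)
    y += sigma * rng.gauss(0.0, 1.0)
    z += sigma * rng.gauss(0.0, 1.0) + velocity * delta_t  # drift along z only
    return (x, y, z)


def reflect_in_cylinder(p0, p1, radius):
    """Reflect the move p0 -> p1 off a cylinder wall aligned with the z axis.

    Assumes p0 lies inside the wall and handles one bounce only (hypothetical
    simplification; the real kernels may treat repeated bounces differently).
    """
    x0, y0, _ = p0
    x1, y1, z1 = p1
    if x1 * x1 + y1 * y1 <= radius * radius:
        return p1  # still inside the vessel, nothing to do
    dx, dy = x1 - x0, y1 - y0
    # Solve |p0 + t * (p1 - p0)|^2 = radius^2 in the xy plane for t in (0, 1].
    a = dx * dx + dy * dy
    b = 2.0 * (x0 * dx + y0 * dy)
    c = x0 * x0 + y0 * y0 - radius * radius
    t = (-b + math.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)
    hx, hy = x0 + t * dx, y0 + t * dy  # hit point on the wall
    nx, ny = hx / radius, hy / radius  # outward unit normal at the hit point
    # Mirror the end point across the tangent plane at the hit point.
    dot = (x1 - hx) * nx + (y1 - hy) * ny
    return (x1 - 2.0 * dot * nx, y1 - 2.0 * dot * ny, z1)


if __name__ == "__main__":
    rng = random.Random(42)
    p = (0.0, 0.0, 0.0)
    for _ in range(1000):
        q = brownian_step(p, 1e-11, 1e-7, 1e-4, rng)
        p = reflect_in_cylinder(p, q, 8e-6)
    print(p)
```

Passing the random generator in explicitly keeps the sketch reproducible, which helps when comparing against a reference run.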

Key Parameters

  • Db = 1E-11 m^2/s (diffusion coefficient)
  • velocity = 1E-4 m/s (laminar flow)
  • deltaT = 1E-7 s (time step)
  • radius = 8E-6 m (blood vessel radius)
  • GPU RNG: curandStatePhilox4_32_10_t
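
To get a feel for the scales these defaults imply, the per-step quantities can be worked out directly. The sketch below uses only the values listed above; the variable names are illustrative and do not mirror the fields of SimParams.

```python
import math

DB = 1e-11       # diffusion coefficient, m^2/s
VELOCITY = 1e-4  # laminar flow speed, m/s
DELTA_T = 1e-7   # time step, s
RADIUS = 8e-6    # blood vessel radius, m

sigma = math.sqrt(2.0 * DB * DELTA_T)  # RMS diffusive step per axis, m
drift = VELOCITY * DELTA_T             # deterministic z advance per step, m
# Steps until the per-axis RMS displacement equals one vessel radius.
steps_per_radius = RADIUS ** 2 / (2.0 * DB * DELTA_T)

print(f"diffusive step per axis: {sigma:.3e} m")
print(f"drift per step:          {drift:.3e} m")
print(f"steps to diffuse one radius: {steps_per_radius:.3e}")
print(f"time to diffuse one radius:  {steps_per_radius * DELTA_T:.3e} s")
```

With these defaults the diffusive step is about 1.4 nm per axis against 0.01 nm of drift, so a single step is dominated by diffusion, and it takes roughly 3.2e7 steps (about 3.2 s of simulated time) for the RMS displacement to reach one vessel radius.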

Build

mkdir build && cd build
cmake ..                                    # auto-detects CUDA
make                                        # builds mc_sim_cpu, and mc_sim if CUDA found

To target a specific GPU architecture:

cmake .. -DCMAKE_CUDA_ARCHITECTURES=75      # Turing (T4, RTX 2080)
cmake .. -DCMAKE_CUDA_ARCHITECTURES=86      # Ampere (RTX 3090; use 80 for A100)

Usage

./mc_sim -i 10000 -t 1E-3 -f -v            # 10k paths, first-hit, verbose
./mc_sim -i 10000 -c -f -v                 # CPU vs GPU comparison
./mc_sim -i 5000 -w -r 8E-6 -e             # with walls, record everything
./mc_sim_cpu -i 10000 -f -l 3E-7 -t 1E-2   # CPU-only, 1D limit test

Validation

# Quick pre-commit check
./scripts/validate_before_commit.sh

# Manual validation against analytical solution
./mc_sim_cpu -i 10000 -f -l 3E-7 -t 1E-2
scripts/.venv/bin/python scripts/validate_1d_firsthit.py build/output_h.csv \
    --dist 3E-7 --vel 1E-4 --timestep 1E-7 --timestop 1E-2
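
This README does not reproduce thesis eq 4.3. Assuming it is the standard first-passage (inverse Gaussian) result for 1D drift-diffusion toward an absorbing plane, the reference curve for the parameters above could be computed as in the sketch below. The function names are hypothetical and are not taken from validate_1d_firsthit.py.

```python
import math


def _phi(x):
    """Standard normal CDF."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def first_hit_pdf(t, dist, vel, db):
    """First-hit time density (1/s) for a plane `dist` metres downstream."""
    if t <= 0.0:
        return 0.0
    norm = dist / math.sqrt(4.0 * math.pi * db * t ** 3)
    return norm * math.exp(-((dist - vel * t) ** 2) / (4.0 * db * t))


def first_hit_cdf(t, dist, vel, db):
    """Probability that the first hit happened at or before time t."""
    if t <= 0.0:
        return 0.0
    s = math.sqrt(2.0 * db * t)  # std dev of the diffusive displacement at t
    return (_phi((vel * t - dist) / s)
            + math.exp(vel * dist / db) * _phi(-(vel * t + dist) / s))


if __name__ == "__main__":
    # Same values as the validation command: --dist 3E-7 --vel 1E-4 --timestop 1E-2
    print(first_hit_cdf(1e-2, 3e-7, 1e-4, 1e-11))
```

A histogram of simulated first-hit times should track first_hit_pdf, and the fraction of paths that hit before the stop time should approach first_hit_cdf as the number of paths grows.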

Architecture

  • Long kernel (d_simulate_isolated): all timesteps in one launch, positions in registers. Used for first-hit/limit modes. ~1,400x faster than the wide kernel from the thesis.
  • Wide kernel (d_update): per-step launch, positions in global memory. Used for everything/allhit modes and future particle interactions (-W flag).
  • Both kernels include Brownian bridge boundary crossing correction.
  • RNG: cuRAND Philox, seeded per call from clock64(). This is an intentional performance trade-off (4-5x faster than keeping RNG state in global memory).
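
The Brownian bridge correction mentioned above catches paths that cross a boundary and return within a single step, which an endpoint-only check would miss. The README does not give the formula the kernels use. The sketch below assumes the standard result for a flat boundary, which is only an approximation for the curved vessel wall or a spherical receiver. The function names are hypothetical.

```python
import math
import random


def bridge_crossing_probability(a, b, db, delta_t):
    """Probability that a path touched a plane during one time step.

    a, b: signed distances to the plane before and after the step (metres),
    positive on the side where the particle started. Constant drift does not
    change the result once both endpoints are fixed.
    """
    if a <= 0.0 or b <= 0.0:
        return 1.0  # the endpoints straddle or touch the plane
    return math.exp(-a * b / (db * delta_t))


def crossed(a, b, db, delta_t, rng):
    """Sample whether a crossing occurred during the step."""
    return rng.random() < bridge_crossing_probability(a, b, db, delta_t)


if __name__ == "__main__":
    rng = random.Random(7)
    # Particle 1 nm from the plane before and after a step with the defaults.
    print(bridge_crossing_probability(1e-9, 1e-9, 1e-11, 1e-7))
    print(crossed(1e-9, 1e-9, 1e-11, 1e-7, rng))
```

The probability falls off quickly with distance, so in practice the extra test only matters for particles within a few step lengths of a boundary.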