GPredict is a Python framework for Gaussian Process (GP) regression, designed for predictive modeling and principled uncertainty quantification.
It implements non-parametric Bayesian regression using customizable mean functions and kernel methods, providing posterior predictions with calibrated confidence intervals and predictive sampling.
Author: Kevin Mota da Costa
Portfolio: https://costakevinn.github.io
LinkedIn: https://linkedin.com/in/costakevinnn
GPredict was built to explore regression from a fully probabilistic perspective.
Unlike parametric models such as neural networks, Gaussian Processes:
- Place a distribution directly over functions
- Provide closed-form posterior inference
- Naturally quantify predictive uncertainty
- Adapt complexity to the data
This project reflects a statistical-first engineering approach to regression modeling.
The model assumes:
f(x) ~ GaussianProcess( m(x), k(x, x') )
Where:
- m(x) → mean function (constant or linear)
- k(x, x') → covariance kernel (RBF or Matern)
- Observational noise is modeled via a diagonal noise matrix
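A minimal sketch of the two kernel families mentioned above (function names and signatures here are illustrative, not GPredict's actual API):

```python
import numpy as np

def rbf_kernel(x1, x2, lengthscale=1.0, variance=1.0):
    """Squared-exponential (RBF): k(x, x') = s^2 * exp(-(x - x')^2 / (2 l^2))."""
    d = x1[:, None] - x2[None, :]          # pairwise differences
    return variance * np.exp(-0.5 * (d / lengthscale) ** 2)

def matern32_kernel(x1, x2, lengthscale=1.0, variance=1.0):
    """Matern 3/2: k(r) = s^2 * (1 + sqrt(3) r / l) * exp(-sqrt(3) r / l)."""
    r = np.abs(x1[:, None] - x2[None, :])  # pairwise distances
    a = np.sqrt(3.0) * r / lengthscale
    return variance * (1.0 + a) * np.exp(-a)
```

Both produce symmetric positive semi-definite Gram matrices with `variance` on the diagonal; the Matérn 3/2 kernel yields rougher sample paths than the infinitely smooth RBF.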
Posterior prediction yields:
- Predictive mean
- Predictive variance
- Full covariance structure
- Posterior samples
This enables non-parametric regression with uncertainty-aware predictions.
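The closed-form posterior can be sketched as follows (a zero prior mean and an inline RBF kernel are assumed for brevity; this is not GPredict's exact implementation):

```python
import numpy as np

def rbf(x1, x2, lengthscale=1.0):
    d = x1[:, None] - x2[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

def gp_posterior(x_train, y_train, x_test, kernel=rbf, noise_var=1e-2):
    """Closed-form GP posterior under a zero prior mean.

    mean = K_*n (K_nn + sigma^2 I)^{-1} y
    cov  = K_** - K_*n (K_nn + sigma^2 I)^{-1} K_n*
    """
    K_nn = kernel(x_train, x_train) + noise_var * np.eye(len(x_train))
    K_sn = kernel(x_test, x_train)
    K_ss = kernel(x_test, x_test)
    # Cholesky factorization is cheaper and more stable than explicit inversion
    L = np.linalg.cholesky(K_nn)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_sn @ alpha
    v = np.linalg.solve(L, K_sn.T)
    cov = K_ss - v.T @ v
    return mean, cov
```

As the noise variance shrinks, the posterior mean interpolates the training targets and the posterior variance collapses at the observed inputs.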
The framework is modular:
- Mean functions (constant, linear)
- Kernel functions (RBF, Matern)
- Covariance matrix construction
- Noise modeling (heteroscedastic support)
- Posterior computation
- Predictive sampling
- Visualization & result export
All components are cleanly separated for extensibility and experimentation.
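To illustrate the mean-function module, here is one way the constant and linear means could look, plus how a non-zero prior mean folds into the posterior via residuals (names are hypothetical, chosen for the sketch):

```python
import numpy as np

def constant_mean(x, c=0.0):
    """m(x) = c."""
    return np.full(np.shape(x), c, dtype=float)

def linear_mean(x, a=1.0, b=0.0):
    """m(x) = a*x + b."""
    return a * np.asarray(x, dtype=float) + b

def rbf(x1, x2, lengthscale=1.0):
    d = x1[:, None] - x2[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

def posterior_mean(x_train, y_train, x_test, mean_fn, noise_var=1e-6):
    """A non-zero prior mean enters through the residuals:
    mu* = m(x*) + K_*n (K_nn + sigma^2 I)^{-1} (y - m(X))."""
    K = rbf(x_train, x_train) + noise_var * np.eye(len(x_train))
    resid = y_train - mean_fn(x_train)
    return mean_fn(x_test) + rbf(x_test, x_train) @ np.linalg.solve(K, resid)
```

Keeping means, kernels, and inference in separate functions is what makes swapping one component (e.g. RBF for Matérn) a one-line change.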
Example: test function sin(x), 20 observations with heteroscedastic noise.
Prior and posterior plots are generated in `plots/`.
- Prior: expresses assumptions before observing data
- Posterior: updates mean and uncertainty using Bayesian inference
The posterior captures:
- Global structure
- Local smoothness
- Increased uncertainty away from data points
Given training data, GPredict computes:
- Predictive mean via covariance-weighted interpolation
- Predictive covariance via posterior update
- Predictive trajectories via sampling from multivariate normal
Noise is incorporated explicitly in the covariance matrix, allowing realistic predictive intervals.
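A sketch of predictive sampling with per-observation noise (the function names are illustrative; heteroscedastic noise is modeled, as described above, by a diagonal noise matrix rather than a single shared variance):

```python
import numpy as np

rng = np.random.default_rng(42)  # fixed seed, mirroring the reproducibility goal

def rbf(x1, x2, lengthscale=0.5):
    d = x1[:, None] - x2[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

def sample_posterior(x_train, y_train, noise_var, x_test, n_samples=5):
    """Draw trajectories from the posterior N(mu*, Sigma*).

    `noise_var` holds one variance per observation, so the noise matrix
    is diag(noise_var) instead of sigma^2 I.
    """
    K = rbf(x_train, x_train) + np.diag(noise_var)
    K_sn = rbf(x_test, x_train)
    mean = K_sn @ np.linalg.solve(K, y_train)
    cov = rbf(x_test, x_test) - K_sn @ np.linalg.solve(K, K_sn.T)
    cov += 1e-8 * np.eye(len(x_test))  # jitter keeps sampling numerically stable
    return rng.multivariate_normal(mean, cov, size=n_samples)
```

Each row of the returned array is one plausible trajectory; their spread widens away from the training inputs, which is exactly the uncertainty band the plots visualize.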
- Non-parametric regression modeling
- Kernel engineering (RBF, Matern)
- Bayesian posterior computation
- Heteroscedastic noise handling
- Predictive sampling
- Visualization of uncertainty bands
- Reproducible probabilistic workflows
- Modular separation of kernels, means, and inference
- Explicit covariance construction for transparency
- Analytical posterior computation (no black-box frameworks)
- Fixed random seed for reproducibility
- Structured output generation (data, plots, results)
- Python
- NumPy
- Linear algebra (matrix inversion & covariance operations)
- Bayesian inference
- Kernel methods
- Scientific visualization
Run `python3 main.py`.
Outputs:
- `data/` → observational datasets
- `plots/` → GP prior & posterior visualizations
- `results/` → numerical predictive summaries
GPredict/
├── data/ # Observations
├── plots/ # Prior and posterior plots
├── results/ # Predictive summaries
├── gp.py # GP fitting and prediction
├── kernels.py # Kernel definitions
├── means.py # Mean functions
├── utils.py # Plotting utilities
├── examples.py # Example workflows
└── main.py # Entry point
This project is part of my Machine Learning portfolio: 👉 https://costakevinn.github.io
MIT License — see LICENSE for details.

