WakateM/LOKAN

LOKAN: Localized Oscillatory Kolmogorov-Arnold Network

LOKAN is a neural network architecture that unifies Radial Basis Function (RBF) localization with Fourier-series oscillation within the Kolmogorov-Arnold Network (KAN) framework. Unlike a standard MLP, which applies fixed activation functions to learned linear weights, LOKAN learns per-edge basis functions that combine spatial localization (via RBF kernels) with periodic pattern generation (via Fourier series), enabling it to capture both local and global structure in a single forward pass.

Architecture

Overview

LOKAN replaces traditional MLP layers with two specialized layer types:

```
Input → [RBFLayer × N] → WindowedFourierLayer → Output
         (hidden)           (output head)
```
  • RBFLayer (hidden layers): Each edge learns a weighted combination of RBF basis functions, replacing fixed activations with learnable localized kernels.
  • WindowedFourierLayer (output layer): Combines an RBF gating window with a Fourier series to produce the final output, coupling localization with periodicity.

Mathematical Formulation

RBF Hidden Layers

Each edge connecting input neuron $i$ to output neuron $j$ computes:

$$\text{edge}_{i,j}(x_i) = \sum_{k=1}^{K} w_{i,j,k} \cdot \phi\!\left(\frac{x_i - c_{i,j,k}}{\sigma_{i,j,k}}\right)$$

where $c_{i,j,k}$ are learnable centers (bounded via $\tanh$ to a configurable range), $\sigma_{i,j,k}$ are learnable widths (parameterized in log-space), $w_{i,j,k}$ are learnable weights, and $\phi$ is the RBF kernel. The output of each neuron $j$ sums contributions from all input edges:

$$h_j = \sum_i \text{edge}_{i,j}(x_i)$$

Supported RBF kernels:

| Kernel | $\phi(r^2)$ |
|---|---|
| Gaussian | $e^{-r^2}$ |
| Multiquadric | $\sqrt{r^2 + 1}$ |
| Inverse Multiquadric | $1 / \sqrt{r^2 + 1}$ |
| Cauchy | $1 / (1 + r^2)$ |
| Polyharmonic | $r^2 \ln(r)$ |
| Wendland | $(1 - r)^4 (4r + 1)$ |
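The edge computation above can be sketched in a few lines of NumPy. This is an illustrative forward pass only (not the repo's implementation): the Gaussian kernel stands in for `phi`, and the tensor shapes for centers, widths, and weights are assumptions chosen to match the indices $i, j, k$ in the formula.

```python
import numpy as np

def gaussian(r2):
    """Gaussian RBF kernel phi(r^2) = exp(-r^2)."""
    return np.exp(-r2)

def rbf_layer_forward(x, centers, widths, weights, phi=gaussian):
    """Sketch of one RBF hidden layer (shapes are illustrative assumptions).

    x:       (batch, n_in)     input activations x_i
    centers: (n_in, n_out, K)  learnable centers c_{i,j,k}
    widths:  (n_in, n_out, K)  learnable widths sigma_{i,j,k} (positive)
    weights: (n_in, n_out, K)  learnable weights w_{i,j,k}
    returns: (batch, n_out)    h_j = sum_i edge_{i,j}(x_i)
    """
    # Broadcast x over (j, k) to get r for every (sample, edge, basis)
    r = (x[:, :, None, None] - centers[None]) / widths[None]
    edge = (weights[None] * phi(r ** 2)).sum(axis=-1)  # (batch, n_in, n_out)
    return edge.sum(axis=1)                            # sum over inputs i

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))                 # batch of 4, 3 inputs
c = rng.uniform(-3, 3, size=(3, 5, 6))      # 5 outputs, K = 6 bases per edge
s = np.exp(rng.normal(size=(3, 5, 6)))      # widths kept positive (log-space param)
w = rng.normal(size=(3, 5, 6))
print(rbf_layer_forward(x, c, s, w).shape)  # (4, 5)
```

Parameterizing widths in log-space, as the text describes, keeps $\sigma$ strictly positive without constrained optimization.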

Windowed Fourier Output Layer

The output layer fuses RBF gating with Fourier series generation. For each hidden neuron $j$:

$$\text{RBF}_j(h_j) = \sum_{k=1}^{K_{\text{rbf}}} w_k \cdot \phi\!\left(\frac{h_j - c_k}{\sigma_k}\right)$$

$$\text{Fourier}_j(t) = \sum_{f=1}^{F} \left[ a_{j,f} \cos(f \cdot t) + b_{j,f} \sin(f \cdot t) \right]$$

The final output at each step $t$ is:

$$y(t) = \sum_j \text{RBF}_j(h_j) \cdot \text{Fourier}_j(t), \quad t = 1, \ldots, D$$

The frequencies $f = 1, 2, \ldots, F$ are fixed integers (not learned), while the Fourier coefficients $a_{j,f}$ and $b_{j,f}$ are learnable. The RBF component acts as a learnable gating window on the hidden features, modulating how much each Fourier pattern contributes to the output.
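The three equations above can be composed in a short NumPy sketch. Two details are assumptions on my part, since the formulas leave them open: the gate uses a Gaussian kernel as a concrete choice of $\phi$, and the gate parameters $c_k, \sigma_k, w_k$ are given a per-neuron shape `(H, K)` (the formula omits the $j$ subscript on them).

```python
import numpy as np

def windowed_fourier_output(h, c, sigma, w, a, b, D):
    """Sketch of the windowed Fourier output layer.

    h:           (batch, H) hidden features h_j
    c, sigma, w: (H, K)     RBF gate params (per-neuron shape is an assumption)
    a, b:        (H, F)     learnable Fourier coefficients; frequencies are fixed
    returns:     (batch, D) outputs y(t), t = 1..D
    """
    # RBF gate per hidden neuron: sum_k w_k * exp(-r^2)  (Gaussian phi assumed)
    r2 = ((h[:, :, None] - c[None]) / sigma[None]) ** 2
    gate = (w[None] * np.exp(-r2)).sum(axis=-1)          # (batch, H)
    # Fourier pattern per neuron over the D output steps
    t = np.arange(1, D + 1)                              # t = 1..D
    f = np.arange(1, a.shape[1] + 1)                     # fixed integer frequencies
    ft = f[:, None] * t[None, :]                         # (F, D)
    fourier = a @ np.cos(ft) + b @ np.sin(ft)            # (H, D)
    return gate @ fourier                                # sum over j -> (batch, D)

rng = np.random.default_rng(0)
h = rng.normal(size=(2, 4))              # batch of 2, H = 4 hidden neurons
c = rng.uniform(-3, 3, size=(4, 3))      # K_rbf = 3 gate bases per neuron
sigma = np.exp(rng.normal(size=(4, 3)))
w = rng.normal(size=(4, 3))
a = rng.normal(size=(4, 5))              # F = 5 fixed frequencies
b = rng.normal(size=(4, 5))
print(windowed_fourier_output(h, c, sigma, w, a, b, D=7).shape)  # (2, 7)
```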

Hidden Layer Structures

The width of hidden layers follows one of three patterns:

| Structure | Layer widths | Use case |
|---|---|---|
| `pyramid` | $d, d/2, d/4, \ldots$ | Compression / bottleneck |
| `inverse_pyramid` | $d, 2d, 4d, \ldots$ | Expansion / feature enrichment |
| `flat` | $d, d, d, \ldots$ | Uniform capacity |
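The three width schedules reduce to a small helper like the following (a sketch, not the repo's code; the clamp to a minimum width of 1 for deep pyramids is my assumption):

```python
def hidden_widths(d, n_layers, structure="flat"):
    """Layer widths for the three structures; width clamped to at least 1."""
    if structure == "pyramid":
        return [max(d // (2 ** i), 1) for i in range(n_layers)]
    if structure == "inverse_pyramid":
        return [d * (2 ** i) for i in range(n_layers)]
    if structure == "flat":
        return [d] * n_layers
    raise ValueError(f"unknown structure: {structure!r}")

print(hidden_widths(32, 3, "pyramid"))          # [32, 16, 8]
print(hidden_widths(32, 3, "inverse_pyramid"))  # [32, 64, 128]
```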

Relation to Prior KAN Work

The original KAN (Liu et al., 2024) proposed replacing fixed activations with learnable B-spline functions on graph edges, grounded in the Kolmogorov-Arnold representation theorem. Subsequent work explored alternatives to B-splines:

  • FourierKAN (Xu et al., 2024) replaced splines with Fourier series, leveraging periodicity for time-series and signal tasks, but lacked spatial localization.
  • RBF-KAN (Jovicic, 2024) used radial basis functions for smoother, more stable edge functions with better convergence, but without periodic structure.
  • FastKAN (Li, 2024) used Gaussian RBFs for computational efficiency.
  • Free-RBF-KAN (Chiu et al., 2025) made the previously fixed RBF parameters learnable: centers and widths are optimized during training alongside edge weights, so the basis functions adapt to the data rather than relying on a static grid. This improved representational flexibility and function-approximation accuracy over standard RBF-KAN, with demonstrated universal approximation capability. (arXiv:2601.07760)

LOKAN unifies both approaches: RBF layers provide localized, adaptive basis functions in the hidden layers, while the output layer couples RBF gating with Fourier series to simultaneously capture spatial localization and periodic patterns. This combination enables LOKAN to perform across diverse domains — from time-series forecasting to image classification to spatiotemporal regression — within a single architecture.

Limited Tests

Spatiotemporal Weather Prediction

Task: Predict relative humidity (RH2M, %) from 22 meteorological features over Houston, TX using HRRR 3km analysis data. (Results will be updated as the model improves)

Data: NOAA HRRR July 2023 — 553 grid points × 124 timestamps (every 6h), 68,572 total samples. Each row is a raw HRRR snapshot: weather at one (lat, lon) at one (day, hour). Features include temperature, dewpoint, pressure, wind, solar radiation, and cyclical time encodings (hour/day sin/cos). The split is sequential in time (70/15/15):

| Split | Days | Timestamps | Samples |
|---|---|---|---|
| Train | July 1–22 | 87 | 48,000 |
| Val | July 22–27 | 20 | 10,285 |
| Test | July 27–31 | 19 | 10,287 |

The same 553 locations appear across all splits — the model trains on earlier days and is tested on later days, making this a spatiotemporal generalization task.
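A sequential-in-time split like the one above can be sketched as follows. This is illustrative only: it splits on timestamp fractions rather than the day boundaries used for the actual 87/20/19 split, so the counts below differ slightly from the table.

```python
import numpy as np

def chronological_split(timestamps, fracs=(0.70, 0.15, 0.15)):
    """Boolean row masks for a split that is sequential in time.

    All rows sharing a timestamp (every grid point at that time) land in
    the same split, so the model trains on earlier days and is tested on
    later days.
    """
    ts = np.sort(np.unique(timestamps))
    n_train = int(len(ts) * fracs[0])
    n_val = int(len(ts) * fracs[1])
    train = np.isin(timestamps, ts[:n_train])
    val = np.isin(timestamps, ts[n_train:n_train + n_val])
    test = np.isin(timestamps, ts[n_train + n_val:])
    return train, val, test

# 124 six-hourly timestamps replicated over 553 grid points
stamps = np.repeat(np.arange(124), 553)
tr, va, te = chronological_split(stamps)
print(tr.sum(), va.sum(), te.sum())
```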

| Model | MSE | RMSE | MAE | R² | Params | Epochs |
|---|---|---|---|---|---|---|
| LOKAN | 0.5425 | 0.7365 | 0.5891 | 0.9922 | 2.5K | 25 |

Results

(Figures: predicted-vs-actual scatter, prediction histogram, gridded spatial comparison.)

Time-Series Forecasting

Task: Predict oil temperature (OT) on the ETTh1 benchmark. Lookback: 3, multiple horizons. (Results will be updated as the model improves)

| Horizon | MSE | RMSE | MAE | R² | Params |
|---|---|---|---|---|---|
| 1h | 0.5118 | 0.7154 | 0.5103 | 0.9569 | 560 |
| 3h | 1.6698 | 1.2922 | 1.0200 | 0.8594 | 560 |
| 6h | 2.0682 | 1.4381 | 1.0600 | 0.8259 | 560 |
| 12h | 3.4631 | 1.8609 | 1.3600 | 0.7086 | 592 |

Results

1-Hour Forecast

(Figures: forecast trace, predicted-vs-actual scatter, error histogram.)

3-Hour Forecast

(Figures: forecast trace, predicted-vs-actual scatter, error histogram.)

6-Hour Forecast

(Figures: forecast trace, predicted-vs-actual scatter, error histogram.)

12-Hour Forecast

(Figures: forecast trace, predicted-vs-actual scatter, error histogram.)

Image Classification

Task: Handwritten digit classification on MNIST (28×28 grayscale, 10 classes). (Results will be updated as the model improves)

| Model | Test Accuracy | Parameters | Source |
|---|---|---|---|
| LOKAN | 94.15% | 38.6K | |
| MLP (2-layer, 800 units) | ~98.4% | ~120K | LeCun et al. |
| Original KAN | ~96.9% | | Liu et al., 2024 |
| CNN (LeNet-5) | ~99.2% | ~60K | LeCun et al., 1998 |

Results

(Figure: confusion matrix.)

Training

Quick Start

```shell
# Install dependencies
pip install -r requirements.txt

# Train on ETTh1 (time-series)
python train.py --config config/train/etth1.yaml

# Train on MNIST (classification)
python train.py --config config/train/mnist.yaml

# Train on HRRR (spatiotemporal regression)
python train.py --config config/train/hrrr_houston.yaml
```

Configuration

All hyperparameters are specified in a YAML config file. See config/train/template.yaml for all available options.

```yaml
experiment:
  name: "My-Experiment"
  seed: 42

data:
  type: "timeseries"          # timeseries | regression | classification
  path: "data/etth1/ETTh1.csv"
  input_cols: ["HUFL", "HULL", "MUFL", "MULL", "LUFL", "LULL", "OT"]
  target_col: "OT"
  lookback: 3                 # timeseries only
  horizon: 1                  # timeseries only
  normalize_features: true
  scaler_type: "standard"     # standard | minmax
  split: [60, 20, 20]

model:
  structure: "flat"           # pyramid | inverse_pyramid | flat
  basis: "multiquadric"       # gaussian | multiquadric | cauchy | ...
  hidden_dim: 32
  hidden_layers: 2
  num_bases: 5                # RBF bases per edge
  num_freq: 5                 # Fourier frequencies in output layer
  basis_range: [-3.0, 3.0]
  output_dim: 1

training:
  max_epochs: 100
  lr: 0.001
  early_stopping: true
  enable_mlflow: true
```

Supported Data Types

| Type | Description | Input | Output |
|---|---|---|---|
| `timeseries` | Multi-step forecasting with sliding windows | lookback × features | horizon steps |
| `regression` | Tabular feature → target mapping | features | scalar |
| `classification` | Image/audio classification (MNIST, MedMNIST, ESC-50) | flattened pixels/spectrogram | num_classes logits |
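The sliding-window construction used by the `timeseries` type can be sketched as follows. This is an illustration of the lookback/horizon mechanics, not the repo's data loader; placing the target in the last column is an assumption for the example.

```python
import numpy as np

def make_windows(series, lookback, horizon):
    """Build sliding windows over a (T, n_features) array.

    X: (N, lookback, n_features)  input window of past steps
    y: (N, horizon)               future target values (last column assumed)
    """
    X, y = [], []
    for i in range(len(series) - lookback - horizon + 1):
        X.append(series[i:i + lookback])
        y.append(series[i + lookback:i + lookback + horizon, -1])
    return np.stack(X), np.stack(y)

data = np.random.default_rng(1).normal(size=(100, 7))  # e.g. 7 ETTh1 columns
X, y = make_windows(data, lookback=3, horizon=1)
print(X.shape, y.shape)  # (97, 3, 7) (97, 1)
```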

Experiment Tracking

Training logs metrics and artifacts to MLflow (SQLite backend):

```shell
# Launch MLflow UI to view experiments
mlflow ui --backend-store-uri sqlite:///mlflow.db
```

Artifacts logged per run: model weights, TorchScript export, feature scaler, loss curves, network graph, prediction plots, confusion matrices, and spatial heatmaps.

License

MIT License. See LICENSE for details.

About

LOKAN: Localized Oscillatory Kolmogorov-Arnold Network — unifying RBF localization with Fourier-series oscillation for various regression and classification tasks.
