STH-SepNet: Decoupling Spatio-Temporal Prediction: When Lightweight Large Models Meet Adaptive Hypergraphs
Welcome to STH-SepNet's GitHub repository! This repository hosts the code, data, and model weights of STH-SepNet (KDD'25 Research Track).
Abstract: Spatio-temporal prediction is a pivotal task with broad applications in traffic management, climate monitoring, and energy scheduling. However, existing methodologies often struggle to balance model expressiveness and computational efficiency, especially when scaling to large real-world datasets. To tackle these challenges, we propose STH-SepNet (Spatio-Temporal Hypergraph Separation Network), a novel framework that decouples temporal and spatial modeling to enhance both efficiency and precision. The temporal dimension is modeled using lightweight large language models, which effectively capture low-rank temporal dynamics. Concurrently, the spatial dimension is addressed through an adaptive hypergraph neural network, which dynamically constructs hyperedges to model intricate, higher-order interactions. A carefully designed gating mechanism is integrated to seamlessly fuse temporal and spatial representations. By leveraging the fundamental principles of low-rank temporal dynamics and spatial interactions, STH-SepNet offers a pragmatic and scalable solution for spatio-temporal prediction in real-world applications. Extensive experiments on large-scale real-world datasets across multiple benchmarks demonstrate the effectiveness of STH-SepNet in improving predictive performance while maintaining computational efficiency. This work may provide a promising lightweight framework for spatio-temporal prediction, aiming to reduce computational demands while enhancing predictive performance.
[Paper Page] [Chinese Explanation]
If you find this repository helpful for your research, please cite our paper.
@inproceedings{chen2025decoupling,
title={Decoupling Spatio-Temporal Prediction: When Lightweight Large Models Meet Adaptive Hypergraphs},
author={Chen, Jiawen and Shao, Qi and Chen, Duxin and Yu, Wenwu},
booktitle={Proceedings of the ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD)},
year={2025},
month={August},
address={Toronto, Canada},
publisher={ACM}
}
The lightweight training requires torch 2.0+. To install all dependencies:
pip install -r requirements.txt
The data can be downloaded from (Google Drive). Create a dataset/ directory and place the downloaded datasets in dataset/.
The pretrained models can be downloaded from the links in the table below. Create a huggingface/ directory and place the pretrained models there, e.g., huggingface/BERT.
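The directory layout described above can be prepared as follows (a minimal sketch; the Google Drive and model downloads themselves are manual steps):

```shell
# Create the directory layout this README expects:
#   dataset/          <- datasets downloaded from Google Drive
#   huggingface/BERT  <- pretrained LLM weights, one subfolder per model
mkdir -p dataset huggingface/BERT
ls -d dataset huggingface/BERT
```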
| Model 🤗 | Parameters | LLM Dimension | Model Description |
|---|---|---|---|
| BERT | 110M | 768 | A Transformer-based pre-trained model for NLP tasks, excelling in sentence classification and question answering. |
| GPT-2 | 124M | 768 | A Transformer-based generative model, specialized in text generation and language modeling. |
| GPT-3 | 7580M | 4096 | A large-scale Transformer-based generative model supporting various language tasks. |
| LLAMA-1B | 1230M | 2048 | A multilingual model developed by Meta, designed for dialogue and knowledge retrieval tasks. |
| LLAMA-7B | 6740M | 4096 | A multilingual model developed by Meta, suitable for various natural language generation tasks. |
| LLAMA-8B | 8000M | 4096 | A multilingual model developed by Meta, focused on dialogue and instruction-tuning tasks. |
| DeepSeek-Qwen1.5B | 1500M | 1536 | A reasoning-focused model enhanced through reinforcement learning for improved reasoning capabilities. |
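When launching a run, the LLM dimension passed on the command line must match the backbone chosen above. A small lookup helper makes this pairing explicit (hypothetical convenience code using the values from the table; not part of the STH-SepNet codebase):

```python
# Hypothetical helper: map --llm_model choices to their hidden dimension
# (values taken from the table above); not part of the repository code.
LLM_DIMS = {
    "BERT": 768,
    "GPT2": 768,
    "GPT3": 4096,
    "LLAMA1b": 2048,
    "LLAMA7b": 4096,
    "LLAMA8b": 4096,
    "deepseek2b": 1536,
}

def llm_dim_for(model_name: str) -> int:
    """Return the --llm_dim value matching a --llm_model choice."""
    try:
        return LLM_DIMS[model_name]
    except KeyError:
        raise ValueError(f"Unknown llm_model: {model_name!r}")
```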
| Parameter | Type | Description | Default Value |
|---|---|---|---|
| model | string | Name of the model: `pool` (STH-SepNet with the adaptive hypergraph module), `Autoformer` (Decomposition Transformers with Auto-Correlation for Long-Term Series Forecasting, NeurIPS 2021), `TIMELLM` (Time Series Forecasting by Reprogramming Large Language Models, ICLR 2024) | pool |
| dataset | string | Name of the dataset: `inflow` (bike traffic inflow), `outflow` (bike traffic outflow), `PEMS03` (California highway PeMS traffic flow), `BJ` (road-network traffic in some areas of Beijing), `METR` (traffic sensor data in the Los Angeles area). Any additional graph dataset in edgelist format can be used by editing data_loader.py | inflow |
| node_num | int | Number of nodes in the network: Inflow/Outflow: 295, PEMS03: 358, BJ: 500, METR: 207 | 295 |
| features | string | Forecasting task, options [M, S, MS]: `M` (multivariate predict multivariate), `S` (univariate predict univariate), `MS` (multivariate predict univariate) | M |
| llm_model | string | LLM backbone: BERT, GPT2, GPT3, LLAMA1b, LLAMA7b, LLAMA8b, deepseek2b | BERT |
| static | bool | Whether to use the static adjacency matrix module | False |
| gcn_true | bool | Whether to use the GCN module | True |
| adaptive_hyperhgnn | string | Hypergraph neural network: hgcn, hgat, hsage | 'hgcn' |
| hgcn_true | bool | Whether to use the HGCN module | True |
| temporal_true | bool | Whether to use the temporal convolutional network module | True |
| fusion_gate | string | Fusion style: `adaptive` (dynamically adjusts the weights of temporal and spatial features), `attentiongate` (models the internal relationship between the two features), `lstmgate` (captures the dependence of spatial on temporal features), `hyperstgnn` (fully integrated adaptive hypergraph spatio-temporal prediction, without LLMs) | adaptive |
| llm_dim | int | LLM hidden dimension: BERT/GPT2: 768, GPT3/LLAMA7b/LLAMA8b: 4096, LLAMA1b: 2048, deepseek2b: 1536 | 768 |
| seq_len | int | Input sequence length | 48 |
| label_len | int | Start token length | 48 |
| pred_len | int | Prediction sequence length | 48 |
| enc_in | int | Encoder input size (e.g., node num) | 295 |
| dec_in | int | Decoder input size (e.g., node num) | 295 |
| c_out | int | Output size (e.g., node num) | 295 |
| d_model | int | Model dimension | 32 |
| n_heads | int | Number of attention heads | 16 |
| e_layers | int | Number of encoder layers | 2 |
| d_layers | int | Number of decoder layers | 1 |
| d_ff | int | Dimension of the FCN | 32 |
| llm_layers | int | Number of LLM layers | 6 |
| train_epochs | int | Number of training epochs | 50 |
| align_epochs | int | Number of alignment epochs | 10 |
| alpha | float | Adjustable parameter to control hyperSTLLM or STLLM | 0.1 |
| beta | float | Adjustable parameter to control hyperSTLLM or STLLM | 0.2 |
| gamma | float | Adjustable parameter to control hyperSTLLM or STLLM | 0.5 |
| theta | float | Adjustable parameter to control hyperSTLLM or STLLM | 0.2 |
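To give a concrete picture of what the `hgcn` option computes, here is a minimal NumPy sketch of one hypergraph convolution step using the standard HGNN normalization with unit hyperedge weights (illustrative only; not the repository's implementation):

```python
import numpy as np

def hypergraph_conv(X, H, W):
    """One hypergraph convolution step (HGNN-style normalization sketch).
    X: (N, F) node features; H: (N, E) incidence matrix; W: (F, F') weights.
    Out = Dv^{-1/2} H De^{-1} H^T Dv^{-1/2} X W, with unit hyperedge weights."""
    Dv = H.sum(axis=1)  # node degrees
    De = H.sum(axis=0)  # hyperedge degrees
    Dv_inv_sqrt = np.diag(1.0 / np.sqrt(np.maximum(Dv, 1e-12)))
    De_inv = np.diag(1.0 / np.maximum(De, 1e-12))
    A = Dv_inv_sqrt @ H @ De_inv @ H.T @ Dv_inv_sqrt  # normalized propagation
    return A @ X @ W

# Toy example: 4 nodes, 2 hyperedges ({0,1,3} and {1,2}), 3 -> 2 features.
H = np.array([[1, 0], [1, 1], [0, 1], [1, 0]], dtype=float)
X = np.ones((4, 3))
W = np.eye(3)[:, :2]
out = hypergraph_conv(X, H, W)
```

The adaptive component of STH-SepNet additionally learns the incidence structure `H` instead of fixing it, which this sketch does not show.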
Run the scripts under ./scripts for demonstration. For example, to evaluate on the BIKE datasets:
sh ./scripts/BIKE/BERT_Bike_order.sh
sh ./scripts/BIKE/GPT2_Bike_order.sh
sh ./scripts/BIKE/GPT3_Bike_order.sh
sh ./scripts/BIKE/LLAMA1B_Bike_order.sh
sh ./scripts/BIKE/LLAMA7B_Bike_order.sh
sh ./scripts/BIKE/LLAMA8B_Bike_order.sh
sh ./scripts/BIKE/Deepseek_Bike_order.sh
Run the scripts under ./scripts for demonstration. For example, to evaluate on the BIKE datasets:
sh ./scripts/BIKE/BERT_Bike.sh
sh ./scripts/BIKE/GPT2_Bike.sh
sh ./scripts/BIKE/GPT3_Bike.sh
sh ./scripts/BIKE/LLAMA1B_Bike.sh
sh ./scripts/BIKE/LLAMA7B_Bike.sh
sh ./scripts/BIKE/LLAMA8B_Bike.sh
sh ./scripts/BIKE/Deepseek_Bike.sh
To evaluate on the BIKE datasets without LLMs, set --fusion_gate to hyperstgnn (fully integrated adaptive hypergraph spatio-temporal prediction).
Run the scripts under ./scripts for demonstration. For example, to evaluate on the BIKE datasets:
sh ./scripts/BIKE/BERT_Bike_mixorder3.sh
sh ./scripts/BIKE/GPT2_Bike_mixorder3.sh
sh ./scripts/BIKE/GPT3_Bike_mixorder3.sh
sh ./scripts/BIKE/LLAMA1B_Bike_mixorder3.sh
sh ./scripts/BIKE/LLAMA7B_Bike_mixorder3.sh
sh ./scripts/BIKE/LLAMA8B_Bike_mixorder3.sh
sh ./scripts/BIKE/Deepseek_Bike_mixorder3.sh
Run the scripts under ./scripts for demonstration. For example, to evaluate on the BIKE datasets:
sh ./scripts/BIKE/BERT_Bike_Outflow_flexible_order3.sh
sh ./scripts/PEMS/BERT_PEMS03_flexible_order.sh
The fusion mechanism can be specified with the --fusion_gate argument. The available options are:
- adaptive: dynamically adjusts the weights of temporal and spatial features.
- attentiongate: models the internal relationship between the two features.
- lstmgate: captures the dependence of spatial on temporal features.
- hyperstgnn: fully integrated adaptive hypergraph spatio-temporal prediction (without LLMs).
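To make the adaptive option concrete, here is a minimal NumPy sketch of such a gate (illustrative only, not the repository's implementation; `W` and `b` stand in for learned parameters):

```python
import numpy as np

def adaptive_fusion_gate(h_t, h_s, W, b):
    """Fuse temporal (h_t) and spatial (h_s) features of equal shape (..., F).
    g = sigmoid([h_t ; h_s] @ W + b) is an element-wise mixing weight, so the
    output is the convex combination g * h_t + (1 - g) * h_s."""
    z = np.concatenate([h_t, h_s], axis=-1) @ W + b  # (..., F)
    g = 1.0 / (1.0 + np.exp(-z))                     # gate values in (0, 1)
    return g * h_t + (1 - g) * h_s

# Toy example with random features and stand-in parameters.
rng = np.random.default_rng(0)
F = 4
h_t = rng.normal(size=(5, F))
h_s = rng.normal(size=(5, F))
W = rng.normal(size=(2 * F, F))
b = np.zeros(F)
fused = adaptive_fusion_gate(h_t, h_s, W, b)
```

Because the gate is a sigmoid, each fused element always lies between the corresponding temporal and spatial feature values.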
STG4Traffic: A Survey and Benchmark of Spatial-Temporal Graph Neural Networks for Traffic Prediction. [Paper][Code].
@article{DBLP:journals/corr/abs-2307-00495,
author = {Xunlian Luo and Chunjiang Zhu and Detian Zhang and Qing Li},
title = {STG4Traffic: {A} Survey and Benchmark of Spatial-Temporal Graph Neural
Networks for Traffic Prediction},
journal = {CoRR},
volume = {abs/2307.00495},
year = {2023}
}
Deep Time Series Models: A Comprehensive Survey and Benchmark. [Paper][Code].
@article{wang2024tssurvey,
title={Deep Time Series Models: A Comprehensive Survey and Benchmark},
author={Yuxuan Wang and Haixu Wu and Jiaxiang Dong and Yong Liu and Mingsheng Long and Jianmin Wang},
journal={arXiv preprint arXiv:2407.13278},
year={2024}
}