jiawenchen10/STHSepNet
STH-SepNet: Decoupling Spatio-Temporal Prediction: When Lightweight Large Models Meet Adaptive Hypergraphs

Welcome to STH-SepNet's GitHub repository! This repository hosts the code, data, and model weights of STH-SepNet (KDD'25 Research Track).

Abstract: Spatio-temporal prediction is a pivotal task with broad applications in traffic management, climate monitoring, and energy scheduling. However, existing methodologies often struggle to balance model expressiveness and computational efficiency, especially when scaling to large real-world datasets. To tackle these challenges, we propose STH-SepNet (Spatio-Temporal Hypergraph Separation Network), a novel framework that decouples temporal and spatial modeling to enhance both efficiency and precision. Therein, the temporal dimension is modeled using lightweight large language models, which effectively capture low-rank temporal dynamics. Concurrently, the spatial dimension is addressed through an adaptive hypergraph neural network, which dynamically constructs hyperedges to model intricate, higher-order interactions. A carefully designed gating mechanism is integrated to seamlessly fuse temporal and spatial representations. By leveraging the fundamental principles of low-rank temporal dynamics and spatial interactions, STH-SepNet offers a pragmatic and scalable solution for spatio-temporal prediction in real-world applications. Extensive experiments on large-scale real-world datasets across multiple benchmarks demonstrate the effectiveness of STH-SepNet in improving predictive performance while maintaining computational efficiency. This work may provide a promising lightweight framework for spatio-temporal prediction, reducing computational demands while enhancing predictive performance.

Citation

If you find this repository helpful for your research, please cite our paper.

@inproceedings{chen2025decoupling,
  title={Decoupling Spatio-Temporal Prediction: When Lightweight Large Models Meet Adaptive Hypergraphs},
  author={Chen, Jiawen and Shao, Qi and Chen, Duxin and Yu, Wenwu},
  booktitle={Proceedings of the ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD)},
  year={2025},
  month={August},
  address={Toronto, Canada},
  publisher={ACM}
}

1. Preparation

1.1 Environment

The lightweight training requires torch 2.0+. To install all dependencies, update the corresponding libraries:

pip install -r requirements.txt

1.2 Data

The data can be downloaded from Google Drive. Create a dataset/ directory and place the downloaded data in dataset/.

1.3 Large Language Models

The pretrained models can be downloaded from the links in the table below. Create a huggingface/ directory and place the pretrained models in it, e.g., huggingface/BERT.
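The expected directory layout can be prepared with a few lines of Python (the paths are the ones named above; huggingface/BERT is just the example subfolder given):

```python
from pathlib import Path

# Create the folders the training scripts expect:
# dataset/ for the downloaded data, and huggingface/<Model> for pretrained
# weights (huggingface/BERT is the example layout from this README).
for p in ["dataset", "huggingface/BERT"]:
    Path(p).mkdir(parents=True, exist_ok=True)
```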

| Model 🤗 | Parameters | LLM Dimension | Model Description |
|---|---|---|---|
| BERT | 110M | 768 | A Transformer-based pre-trained model for NLP tasks, excelling in sentence classification and question answering. |
| GPT-2 | 124M | 768 | A Transformer-based generative model, specialized in text generation and language modeling. |
| GPT-3 | 7580M | 4096 | A large-scale Transformer-based generative model supporting various language tasks. |
| LLAMA-1B | 1230M | 2048 | A multilingual model developed by Meta, designed for dialogue and knowledge retrieval tasks. |
| LLAMA-7B | 6740M | 4096 | A multilingual model developed by Meta, suitable for various natural language generation tasks. |
| LLAMA-8B | 8000M | 4096 | A multilingual model developed by Meta, focused on dialogue and instruction-tuning tasks. |
| DeepSeek-Qwen1.5B | 1500M | 1536 | A reasoning-focused model enhanced through reinforcement learning for improved reasoning capabilities. |
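The LLM dimension in the table must agree with the --llm_dim argument used later. A small lookup (keys follow the --llm_model spellings used by the training scripts; this helper is illustrative, not part of the repository) can catch a mismatch early:

```python
# Hidden dimension of each supported backbone, as listed in the table above.
# Keys follow the --llm_model spellings used by the training scripts.
LLM_DIMS = {
    "BERT": 768,
    "GPT2": 768,
    "GPT3": 4096,
    "LLAMA1b": 2048,
    "LLAMA7b": 4096,
    "LLAMA8b": 4096,
    "deepseek2b": 1536,
}

def check_llm_dim(llm_model: str, llm_dim: int) -> None:
    """Raise early if --llm_dim disagrees with the chosen backbone."""
    expected = LLM_DIMS[llm_model]
    if llm_dim != expected:
        raise ValueError(f"--llm_dim for {llm_model} should be {expected}, got {llm_dim}")

check_llm_dim("BERT", 768)  # OK; check_llm_dim("BERT", 4096) would raise
```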

2. Main Results

2.1 Training Preparation

2.1.1 Download datasets and place them under ./dataset.

2.1.2 Download pretrained models and place them under ./huggingface.

2.1.3 Complete list of parameters

| Parameter | Type | Description | Default |
|---|---|---|---|
| model | string | Name of the model, among:<br>- pool: STH-SepNet model with adaptive hypergraph module<br>- Autoformer: Decomposition Transformers with Auto-Correlation for Long-Term Series Forecasting (NeurIPS 2021)<br>- TIMELLM: Time Series Forecasting by Reprogramming Large Language Models (ICLR 2024) | pool |
| dataset | string | Name of the dataset, among:<br>- inflow: bike traffic inflow<br>- outflow: bike traffic outflow<br>- PEMS03: California highway PeMS traffic flow dataset<br>- BJ: traffic dataset of a road network in parts of Beijing<br>- METR: traffic sensor data from the Los Angeles area<br>You can also specify any additional graph dataset, in edgelist format, by editing data_loader.py | inflow |
| node_num | int | Number of nodes in the network:<br>- inflow, outflow: 295<br>- PEMS03: 358<br>- BJ: 500<br>- METR: 207 | 295 |
| features | string | Forecasting task, options: [M, S, MS]:<br>- M: multivariate predicts multivariate<br>- S: univariate predicts univariate<br>- MS: multivariate predicts univariate | M |
| llm_model | string | LLM backbone: BERT, GPT2, GPT3, LLAMA1b, LLAMA7b, LLAMA8b, deepseek2b | BERT |
| static | bool | Whether to use the static adjacency matrix module | False |
| gcn_true | bool | Whether to use the GCN module | True |
| adaptive_hyperhgnn | string | Hypergraph neural network: hgcn, hgat, hsage | hgcn |
| hgcn_true | bool | Whether to use the HGCN module | True |
| temporal_true | bool | Whether to use the temporal convolutional network module | True |
| fusion_gate | string | Style of module fusion:<br>- adaptive: dynamically adjusts the weights of temporal and spatial features<br>- attentiongate: models the internal relationship between the two features<br>- lstmgate: captures the dependence of spatial on temporal features<br>- hyperstgnn: fully integrated adaptive hypergraph spatio-temporal prediction (without LLMs) | adaptive |
| llm_dim | int | LLM hidden dimension:<br>- BERT, GPT2: 768<br>- LLAMA7b, LLAMA8b, GPT3: 4096<br>- LLAMA1b: 2048<br>- deepseek2b: 1536 | 768 |
| seq_len | int | Input sequence length | 48 |
| label_len | int | Start token length | 48 |
| pred_len | int | Prediction sequence length | 48 |
| enc_in | int | Encoder input size (e.g., node number) | 295 |
| dec_in | int | Decoder input size (e.g., node number) | 295 |
| c_out | int | Output size (e.g., node number) | 295 |
| d_model | int | Dimension of the model | 32 |
| n_heads | int | Number of attention heads | 16 |
| e_layers | int | Number of encoder layers | 2 |
| d_layers | int | Number of decoder layers | 1 |
| d_ff | int | Dimension of the FCN | 32 |
| llm_layers | int | Number of LLM layers | 6 |
| train_epochs | int | Number of training epochs | 50 |
| align_epochs | int | Number of alignment epochs | 10 |
| alpha | float | Adjustable parameter to control hyperSTLLM or STLLM | 0.1 |
| beta | float | Adjustable parameter to control hyperSTLLM or STLLM | 0.2 |
| gamma | float | Adjustable parameter to control hyperSTLLM or STLLM | 0.5 |
| theta | float | Adjustable parameter to control hyperSTLLM or STLLM | 0.2 |
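A minimal argparse sketch covering a few of these flags, with the names and defaults taken from the table above (this is an illustrative parser, not the repository's actual entry script, which defines more options):

```python
import argparse

# A handful of the flags from the parameter table, with their listed defaults.
parser = argparse.ArgumentParser(description="STH-SepNet (illustrative parser)")
parser.add_argument("--model", type=str, default="pool")
parser.add_argument("--dataset", type=str, default="inflow")
parser.add_argument("--node_num", type=int, default=295)
parser.add_argument("--llm_model", type=str, default="BERT")
parser.add_argument("--llm_dim", type=int, default=768)
parser.add_argument("--seq_len", type=int, default=48)
parser.add_argument("--pred_len", type=int, default=48)
parser.add_argument("--fusion_gate", type=str, default="adaptive",
                    choices=["adaptive", "attentiongate", "lstmgate", "hyperstgnn"])

args = parser.parse_args([])  # parse defaults only; pass sys.argv slices in practice
print(args.model, args.llm_dim)  # pool 768
```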

2.2 Training STH-SepNet

Demonstration scripts are provided under the folder ./scripts. For example, to evaluate on the BIKE datasets:

sh ./scripts/BIKE/BERT_Bike_order.sh
sh ./scripts/BIKE/GPT2_Bike_order.sh
sh ./scripts/BIKE/GPT3_Bike_order.sh
sh ./scripts/BIKE/LLAMA1B_Bike_order.sh
sh ./scripts/BIKE/LLAMA7B_Bike_order.sh
sh ./scripts/BIKE/LLAMA8B_Bike_order.sh
sh ./scripts/BIKE/Deepseek_Bike_order.sh

2.3 Training STH-SepNet-GNN

Demonstration scripts are provided under the folder ./scripts. For example, to evaluate on the BIKE datasets:

sh ./scripts/BIKE/BERT_Bike.sh
sh ./scripts/BIKE/GPT2_Bike.sh
sh ./scripts/BIKE/GPT3_Bike.sh
sh ./scripts/BIKE/LLAMA1B_Bike.sh
sh ./scripts/BIKE/LLAMA7B_Bike.sh
sh ./scripts/BIKE/LLAMA8B_Bike.sh
sh ./scripts/BIKE/Deepseek_Bike.sh

3. Ablation Study

3.1 STH-SepNet without LLMs (w/o LLMs, undecoupled version)

For example, to evaluate on the BIKE datasets, set --fusion_gate to hyperstgnn. This runs the fully integrated adaptive hypergraph spatio-temporal prediction (without LLMs).

3.2 STH-SepNet-Mixorder

Demonstration scripts are provided under the folder ./scripts. For example, to evaluate on the BIKE datasets:

sh ./scripts/BIKE/BERT_Bike_mixorder3.sh
sh ./scripts/BIKE/GPT2_Bike_mixorder3.sh
sh ./scripts/BIKE/GPT3_Bike_mixorder3.sh
sh ./scripts/BIKE/LLAMA1B_Bike_mixorder3.sh
sh ./scripts/BIKE/LLAMA7B_Bike_mixorder3.sh
sh ./scripts/BIKE/LLAMA8B_Bike_mixorder3.sh
sh ./scripts/BIKE/Deepseek_Bike_mixorder3.sh

3.3 STH-SepNet-Effective Order on Adaptive Hypergraph

Demonstration scripts are provided under the folder ./scripts. For example, to evaluate on the BIKE datasets:

sh ./scripts/BIKE/BERT_Bike_Outflow_flexible_order3.sh
sh ./scripts/PEMS/BERT_PEMS03_flexible_order.sh
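For intuition about what the hyperedge order varies: a hypergraph connects nodes to hyperedges through an incidence matrix H ∈ {0,1}^(N×E), and a standard HGCN-style layer (the classic HGNN propagation; the repository's adaptive variant learns H rather than fixing it, so this numpy sketch is only illustrative) smooths node features through shared hyperedges:

```python
import numpy as np

def hgcn_layer(X, H, Theta):
    """One standard hypergraph convolution (unit edge weights):
    X' = Dv^{-1/2} H De^{-1} H^T Dv^{-1/2} X Theta."""
    Dv = np.diag(1.0 / np.sqrt(H.sum(axis=1)))  # inverse-sqrt node degrees
    De = np.diag(1.0 / H.sum(axis=0))           # inverse hyperedge degrees
    return Dv @ H @ De @ H.T @ Dv @ X @ Theta

# 4 nodes, 2 hyperedges: edge 0 = {0, 1, 2} (order 3), edge 1 = {2, 3} (order 2)
H = np.array([[1, 0], [1, 0], [1, 1], [0, 1]], dtype=float)
X = np.eye(4)       # one-hot node features, for illustration
Theta = np.eye(4)   # identity layer weights, for illustration
out = hgcn_layer(X, H, Theta)
print(out.shape)  # (4, 4)
```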

3.4 STH-SepNet-Fusion Mechanism between Spatial and Temporal Features

The fusion mechanism can be specified using the --fusion_gate argument. The available options are:

  • adaptive: Dynamically adjusts the weights of temporal and spatial features.
  • attentiongate: Models the internal relationship between the two features.
  • lstmgate: Captures the dependence of spatial on temporal features.
  • hyperstgnn: Fully integrated adaptive hypergraph spatio-temporal prediction (without LLMs).
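As an illustration of the adaptive option, a gate can be computed from both feature sets and used as a convex combination. This is a minimal numpy sketch with an assumed linear gate; the repository's learned gating mechanism may differ:

```python
import numpy as np

def adaptive_fusion(h_temporal, h_spatial, w):
    """Fuse per-node temporal and spatial features with a sigmoid gate:
    g = sigmoid([h_t ; h_s] @ w),  fused = g * h_t + (1 - g) * h_s."""
    z = np.concatenate([h_temporal, h_spatial], axis=-1) @ w
    g = 1.0 / (1.0 + np.exp(-z))   # gate in (0, 1), one scalar per node
    g = g[..., None]               # broadcast over the feature dimension
    return g * h_temporal + (1.0 - g) * h_spatial

rng = np.random.default_rng(0)
h_t = rng.normal(size=(5, 8))   # 5 nodes, 8-dim temporal features
h_s = rng.normal(size=(5, 8))   # matching spatial features
w = rng.normal(size=(16,))      # assumed gate weights over the concatenation
fused = adaptive_fusion(h_t, h_s, w)
print(fused.shape)  # (5, 8)
```

Because the gate lies in (0, 1), each fused value stays between the corresponding temporal and spatial feature values.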

4. Performance and Visualization

Further Reading

Our baseline models refer to the following works and their repository code.

STG4Traffic: A Survey and Benchmark of Spatial-Temporal Graph Neural Networks for Traffic Prediction. [Paper][Code].

@article{DBLP:journals/corr/abs-2307-00495,
  author       = {Xunlian Luo and Chunjiang Zhu and Detian Zhang and Qing Li},
  title        = {STG4Traffic: {A} Survey and Benchmark of Spatial-Temporal Graph Neural
                  Networks for Traffic Prediction},
  journal      = {CoRR},
  volume       = {abs/2307.00495},
  year         = {2023}
}

Deep Time Series Models: A Comprehensive Survey and Benchmark. [Paper][Code].

@article{wang2024tssurvey,
  title={Deep Time Series Models: A Comprehensive Survey and Benchmark},
  author={Yuxuan Wang and Haixu Wu and Jiaxiang Dong and Yong Liu and Mingsheng Long and Jianmin Wang},
  journal={arXiv preprint arXiv:2407.13278},
  year={2024}
}
