ECS 289L Deep Learning Final Project
This repository contains code for two LLMs (GPT and BERT) trained on HPC log data to predict anomalous log sequences. The code includes log parsing techniques, dataset information, model implementations, and evaluation logic/results.
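As a rough illustration of the log parsing step mentioned above, the sketch below masks variable fields in raw log messages to obtain templates, maps each template to an integer event ID, and slices the ID stream into fixed-size windows that a sequence model could consume. This is a simplified stand-in and not the parser used in this repository; the masking rules and the names `to_template`, `encode`, and `windows` are hypothetical and chosen only for illustration.

```python
import re

# Hypothetical masking rules (assumption): real parsers such as IPLoM
# use more elaborate partitioning than simple regex substitution.
MASKS = [
    (re.compile(r"0x[0-9a-fA-F]+"), "<HEX>"),
    (re.compile(r"\b\d+(?:\.\d+){3}\b"), "<IP>"),
    (re.compile(r"\b\d+\b"), "<NUM>"),
]


def to_template(message):
    """Replace variable fields with placeholder tokens."""
    for pattern, token in MASKS:
        message = pattern.sub(token, message)
    return message


def encode(messages):
    """Map each message to the integer ID of its template."""
    vocab = {}
    ids = []
    for message in messages:
        template = to_template(message)
        if template not in vocab:
            vocab[template] = len(vocab)
        ids.append(vocab[template])
    return ids, vocab


def windows(ids, size, stride=1):
    """Slice an ID sequence into overlapping fixed-size windows."""
    return [ids[i:i + size] for i in range(0, len(ids) - size + 1, stride)]


if __name__ == "__main__":
    logs = [
        "node 12 failed at 0x1F",
        "node 7 failed at 0xAB",
        "link up on 10.0.0.1",
    ]
    ids, vocab = encode(logs)
    print(ids)
    print(windows(ids, size=2))
```

Messages that differ only in their variable fields collapse to the same event ID, which keeps the model vocabulary small regardless of how many distinct node numbers or addresses appear in the logs.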
The datasets used to train the minGPT and LogBERT models come from System 20 of the high-performance computing cluster at Los Alamos National Laboratory (LANL) and from the BlueGene/L supercomputer at Lawrence Livermore National Laboratory (LLNL). Both datasets are open-source and available for download at https://github.com/logpai/loghub, https://lanl.gov/projects//ultrascale-systems-research-center/data/failure-data.php, and https://www.kaggle.com/datasets/kingslayer99/bgl-dataset.
The BERT model implementation is based on LogBERT (Guo, 2021), and the GPT model implementation is based on minGPT (Karpathy, 2020). We simplified the original codebases, changed the evaluation tasks to fit our data, and added alternative attention mechanisms for experimentation.
Authors: Allison Austin, Halil Ozgur Demir
References:
- Adetokunbo Makanju, A. Nur Zincir-Heywood, Evangelos E. Milios. Clustering Event Logs Using Iterative Partitioning, in Proc. of ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), 2009.
- Pinjia He, Jieming Zhu, Shilin He, Jian Li, Michael R. Lyu. An Evaluation Study on Log Parsing and Its Use in Log Mining, in Proc. of IEEE/IFIP International Conference on Dependable Systems and Networks (DSN), 2016.
- Jieming Zhu, Shilin He, Pinjia He, Jinyang Liu, Michael R. Lyu. Loghub: A Large Collection of System Log Datasets for AI-driven Log Analytics, in Proc. of IEEE International Symposium on Software Reliability Engineering (ISSRE), 2023.
- Haixuan Guo. LogBERT, 2021. https://github.com/HelenGuohx/logbert.
- Andrej Karpathy. minGPT, 2020. https://github.com/karpathy/minGPT.