allisonaustin/log-transformers

Transformers for HPC Logs

LLMs for HPC Log Anomaly Detection Task

ECS 289L Deep Learning Final Project

About

This repository contains code for two LLMs (GPT and BERT) trained on HPC log data to detect anomalous log sequences. It includes log parsing techniques, dataset information, model implementations, and evaluation logic and results.

Data

The minGPT and LogBERT models are trained on datasets from System 20 of the high-performance computing cluster at Los Alamos National Laboratory (LANL) and from the BlueGene/L supercomputer at Lawrence Livermore National Laboratory (LLNL). The datasets are publicly available for download at https://github.com/logpai/loghub, https://lanl.gov/projects//ultrascale-systems-research-center/data/failure-data.php, and https://www.kaggle.com/datasets/kingslayer99/bgl-dataset.
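As a rough illustration of how raw log lines can become labeled sequences, the sketch below assumes the BGL convention documented by Loghub, where the first whitespace-separated field of a line is "-" for non-alert messages and an alert tag otherwise. The function names, window sizes, and sample lines are hypothetical and synthetic; they are not taken from this repository's parsing code.

```python
from typing import List, Tuple


def parse_bgl_line(line: str) -> Tuple[bool, str]:
    """Split a BGL-style line into (is_alert, rest_of_line).

    Assumption: the first field is "-" for non-alert lines.
    """
    label, _, rest = line.strip().partition(" ")
    return label != "-", rest


def sliding_windows(
    labels: List[bool], size: int, step: int
) -> List[Tuple[int, int, bool]]:
    """Return (start, end, is_anomalous) for each full window.

    A window is marked anomalous if it contains at least one alert line.
    Trailing lines that do not fill a whole window are dropped.
    """
    windows = []
    for start in range(0, len(labels) - size + 1, step):
        end = start + size
        windows.append((start, end, any(labels[start:end])))
    return windows


# Synthetic lines for illustration only (not real BGL records).
sample_lines = [
    "- 1 node-a INFO boot ok",
    "KERNDTLB 2 node-b FATAL data TLB error",
    "- 3 node-a INFO heartbeat",
    "- 4 node-c INFO heartbeat",
]

labels = [parse_bgl_line(line)[0] for line in sample_lines]
windows = sliding_windows(labels, size=2, step=2)
```

Labeling a window by "any alert inside it" is one common choice for sequence-level anomaly labels; a session-based or time-based grouping would work with the same parsing step.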

Models

The BERT model implementation is based on LogBERT (Guo, 2021) and the GPT model implementation is based on minGPT (Karpathy, 2020). We have simplified the original codebases, changed the evaluation tasks to fit our data, and added different attention mechanisms for experimentation.
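One common way to turn a log-key predictor into an anomaly detector is a top-k check: a position counts as a miss when the observed log key is not among the k highest-scoring candidates, and a sequence is flagged when the miss ratio exceeds a threshold. The sketch below is a minimal pure-Python version of that idea under assumed inputs (per-position score dictionaries); the function names, `k`, and `max_miss_ratio` are hypothetical and may differ from the evaluation logic in this repository.

```python
from typing import Dict, List, Sequence


def top_k_keys(scores: Dict[str, float], k: int) -> List[str]:
    """Return the k log keys with the highest predicted score."""
    return sorted(scores, key=lambda key: scores[key], reverse=True)[:k]


def count_misses(
    observed: Sequence[str], predicted: Sequence[Dict[str, float]], k: int
) -> int:
    """Count positions whose observed key is outside the top-k candidates."""
    return sum(
        1
        for key, scores in zip(observed, predicted)
        if key not in top_k_keys(scores, k)
    )


def is_anomalous(
    observed: Sequence[str],
    predicted: Sequence[Dict[str, float]],
    k: int,
    max_miss_ratio: float,
) -> bool:
    """Flag a sequence when too many observed keys were unexpected."""
    if not observed:
        return False
    return count_misses(observed, predicted, k) / len(observed) > max_miss_ratio


# Hypothetical model output: one score dictionary per sequence position.
observed = ["E1", "E2", "E9"]
predicted = [
    {"E1": 0.7, "E2": 0.2, "E9": 0.1},
    {"E1": 0.1, "E2": 0.6, "E9": 0.3},
    {"E1": 0.5, "E2": 0.4, "E9": 0.1},
]

misses = count_misses(observed, predicted, k=2)
flagged = is_anomalous(observed, predicted, k=2, max_miss_ratio=0.25)
```

The same check applies whether the scores come from masked-key prediction (BERT-style) or next-key prediction (GPT-style); only the way `predicted` is produced changes.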

Authors: Allison Austin, Halil Ozgur Demir

References

  • Adetokunbo Makanju, A. Nur Zincir-Heywood, Evangelos E. Milios. Clustering Event Logs Using Iterative Partitioning, in Proc. of ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), 2009.
  • Pinjia He, Jieming Zhu, Shilin He, Jian Li, Michael R. Lyu. An Evaluation Study on Log Parsing and Its Use in Log Mining, in Proc. of IEEE/IFIP International Conference on Dependable Systems and Networks (DSN), 2016.
  • Jieming Zhu, Shilin He, Pinjia He, Jinyang Liu, Michael R. Lyu. Loghub: A Large Collection of System Log Datasets for AI-driven Log Analytics, in Proc. of IEEE International Symposium on Software Reliability Engineering (ISSRE), 2023.
  • Haixuan Guo. LogBERT. https://github.com/HelenGuohx/logbert.
  • Andrej Karpathy. minGPT. https://github.com/karpathy/minGPT.
