LiteNet

This project provides a complete, end-to-end pipeline for training, optimizing, and deploying LiteNet, a lightweight neural network for Network Traffic Classification (NTC). The pipeline includes feature selection with SHAP, semi-structured sparse pruning, quantization (FP16/INT8), and conversion to a TensorRT engine for high-performance inference.

Read more on DeepWiki.

Architecture and Compression Techniques

LiteNet Architecture

The LiteNet architecture is designed for efficient network traffic classification with a lightweight structure optimized for deployment on resource-constrained devices.

Compression Techniques

The project implements various compression techniques including:

  • 2:4 Semi-structured Sparsity: Reduces model parameters while maintaining performance (illustrated in the sketch after this list)
  • Quantization: FP16 and INT8 precision for reduced memory footprint and faster inference
  • TensorRT Optimization: GPU-optimized inference engine for maximum performance
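
The following is a minimal, self-contained sketch of the 2:4 pattern itself (keep the 2 largest-magnitude weights in every group of 4 consecutive weights), not the project's actual pruning code; the layer sizes are illustrative:

import torch

def apply_2_4_sparsity(weight: torch.Tensor) -> torch.Tensor:
    """Zero the 2 smallest-magnitude values in every group of 4 consecutive weights."""
    rows, cols = weight.shape                      # cols must be divisible by 4
    groups = weight.reshape(-1, 4)
    keep = groups.abs().topk(2, dim=1).indices     # 2 largest magnitudes per group
    mask = torch.zeros_like(groups, dtype=torch.bool)
    mask.scatter_(1, keep, True)
    return (groups * mask).reshape(rows, cols)

linear = torch.nn.Linear(20, 64)                   # illustrative layer sizes
with torch.no_grad():
    linear.weight.copy_(apply_2_4_sparsity(linear.weight))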

Table of Contents

  • Prerequisites
  • Dataset Setup
  • Project Pipeline
  • How to Use
  • License

Prerequisites

Environment

Virtual environment setup for a high-resource environment

  • Python 3.9.21
  • PyTorch-GPU 2.6.0
  • CUDA 12.6
  • NVIDIA TensorRT 10.10.0.31
  • wandb 0.19.7 (optional but recommended, for experiment logging)

You can use the following command to install all prerequisites via uv:

uv sync

Note: If you set up the environment with uv, run scripts through uv run. The following is just an example; change the script name as needed:

uv run main.py

Alternatively, you can use the following command to install all prerequisites via Conda:

conda env create -f environment.yml

Dataset Setup

This project does not include the dataset files directly. You must download them and place them in the dataset/ directory.

The expected structure is:

LiteNet/
├── dataset/
│   ├── ISCXVPN2016/
│   │   ├── train.npy
│   │   ├── test.npy
│   │   └── val.npy
├── saved_dict/
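
As a quick sanity check after downloading, a snippet like the following (assuming the splits are plain NumPy arrays) confirms the files load and shows their shapes:

import numpy as np

for split in ("train", "val", "test"):
    arr = np.load(f"dataset/ISCXVPN2016/{split}.npy")
    print(split, arr.shape, arr.dtype)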

Project Pipeline

The project follows these main steps:

  1. Configuration (config.yaml): Central configuration file for all parameters.
  2. Feature Selection (feature_selection.py): Identifies the most important features from the dataset.
  3. Training (main.py): Trains the LiteNet model using the selected features.
  4. Optimization (prunesparse.py): Applies 2:4 semi-structured pruning, fine-tunes, and quantizes the model, exporting to ONNX.
  5. TensorRT Conversion: The optimized ONNX model is converted into a TensorRT engine.
  6. Inference (tensorrtinference.py): The final TensorRT engine is used for high-performance inference and benchmarking.
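
Run end to end on ISCXVPN2016, the pipeline corresponds roughly to the following sequence of commands (each one is described in detail below):

python feature_selection.py --data ISCXVPN2016 --mode tr
python main.py --dataset_name ISCXVPN2016
python prunesparse.py --dataset_name ISCXVPN2016 --quantization FP16
# trtexec conversion: see "5. Convert to TensorRT Engine" below
python tensorrtinference.py --data ISCXVPN2016 --quantization FP16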

How to Use

This section details the usage of each key script in the pipeline.

1. config.yaml

This file is the central hub for configuring the entire pipeline. It contains settings for:

  • Base Parameters: Learning rate, batch size, number of epochs.
  • Model Architecture: Input sequence length, number of features.
  • Dataset specifics: Class names, number of classes, and the path to the feature list file for each dataset.
  • Feature Selection: Configuration for the larger model used during the feature selection process.

Before running any scripts, you should review this file to ensure the parameters match your desired configuration and dataset.
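
For orientation, a hypothetical fragment of config.yaml might look like the following; the key names here are illustrative assumptions, not the project's actual schema:

# Hypothetical fragment -- key names are assumptions, not the actual schema
learning_rate: 0.001
batch_size: 64
num_epochs: 50
num_features: 20          # number of selected features / input length

fine_tune_epochs: 5       # used by prunesparse.py
fine_tune_lr: 0.0001

ISCXVPN2016:
  num_classes: 6
  class_names: [chat, email, file, p2p, streaming, voip]
  feature_file: top_features_ISCXVPN2016.npy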

2. feature_selection.py

This script identifies the most important features from your dataset. It trains a LiteNetLarge model with the full feature set and then uses SHAP (SHapley Additive exPlanations) to calculate the importance of each feature. The script generates a .npy file with the list of the top 20 most important feature indices.
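
In outline, the SHAP step looks roughly like the sketch below. This is not the project's script: the tiny model and random data are stand-ins, the output filename is hypothetical, and shap_values is treated as a list of per-class arrays (older shap versions; newer releases may return a single stacked array):

import numpy as np
import shap
import torch

# Stand-ins for the real pieces: a tiny classifier and random data
model = torch.nn.Sequential(
    torch.nn.Linear(40, 64), torch.nn.ReLU(), torch.nn.Linear(64, 6)
).eval()
X_bg = torch.randn(100, 40)     # background samples for the expectation
X_eval = torch.randn(256, 40)   # samples to explain

explainer = shap.DeepExplainer(model, X_bg)
shap_values = explainer.shap_values(X_eval)  # list of per-class (N, 40) arrays

# Global importance: mean |SHAP| per feature, summed over classes
importance = np.sum([np.abs(sv).mean(axis=0) for sv in shap_values], axis=0)
top20 = np.argsort(importance)[::-1][:20]
np.save("top_features_demo.npy", top20)      # hypothetical output name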

Modes:

  • --mode tr: Train & Select. Trains the model from scratch and then runs the SHAP analysis.
  • --mode fs: Feature Select only. Skips training and uses a pre-existing model to run the SHAP analysis.

Usage: To run training followed by feature selection on the ISCXVPN2016 dataset, use:

python feature_selection.py --data ISCXVPN2016 --mode tr

3. main.py

After identifying the most important features, this script trains the LiteNet model using only the selected features from top_features_<dataset>.npy (specified in config.yaml), preprocessing the data accordingly. It handles both training a new model from scratch and testing a pre-trained model.

Modes:

  • Training Mode (default): Trains, validates, and then tests the model. A trained model (.pth) will be saved.
  • --test True: Test-Only Mode. Skips training and directly evaluates a pre-trained model on the test dataset. The default file is LiteNet__embedding.pth; use the --path flag to override it.

Usage:

To train the model on the ISCXVPN2016 dataset with the selected features:

python main.py --dataset_name ISCXVPN2016

To test a pre-trained model:

python main.py --dataset_name ISCXVPN2016 --test True --path <name_of_your_model>.pth

4. prunesparse.py

This script is responsible for optimizing the trained LiteNet model. The process involves three main stages:

  1. Pruning: Applies 2:4 semi-structured sparsity to the model. By default, this is only applied to the Linear layers due to the lightweight design of the architecture.
  2. Fine-tuning: After pruning, the model is fine-tuned for a few epochs to recover any accuracy lost during pruning. The fine-tuning hyperparameters (fine_tune_epochs, fine_tune_lr) are specified in config.yaml.
  3. Quantization & Export: The fine-tuned model is then optionally quantized and finally exported to the ONNX format (see the sketch after this list).
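
As a rough sketch of stage 3, FP16 quantization here can be as simple as a half-precision cast followed by a standard ONNX export. The model below is a stand-in, not the project's pruned LiteNet, and a CUDA device is assumed:

import torch

# Stand-in model: the real pipeline loads the pruned, fine-tuned LiteNet
model = torch.nn.Sequential(
    torch.nn.Linear(20, 64), torch.nn.ReLU(), torch.nn.Linear(64, 6)
)
model = model.half().eval().cuda()               # simple FP16 cast

dummy = torch.randn(64, 20).half().cuda()        # batch 64, 20 selected features
torch.onnx.export(
    model, dummy,
    "saved_dict/LiteNet_ISCXVPN2016_pruned_finetuned_embedding_FP16.onnx",
    input_names=["input"], output_names=["output"],
)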

Flags:

  • --quantization [None|FP16|INT8]: Specifies the quantization to apply after fine-tuning. Defaults to None.
  • --quantize-only: Skips the pruning and fine-tuning steps, loading a pre-existing fine-tuned model to perform quantization and export.

Usage:

To run the full prune, fine-tune, and FP16 quantization pipeline:

python prunesparse.py --dataset_name ISCXVPN2016 --quantization FP16

To run quantization only on an already fine-tuned model:

python prunesparse.py --dataset_name ISCXVPN2016 --quantization FP16 --quantize-only --path LiteNet_ISCXVPN2016_pruned_finetuned_embedding.pth

Note on ONNX Export: The script exports the final model to an .onnx file. Ensure that the model's input shape used during export matches the shapes given in the subsequent TensorRT engine conversion.

5. Convert to TensorRT Engine

After prunesparse.py generates an optimized ONNX model, you can convert it to a TensorRT engine using the trtexec command-line tool. This step builds an engine that is optimized for your specific GPU architecture. For CPU deployment, running the ONNX model directly is recommended.

For Linux (Bash)

This example script builds a TensorRT inference engine with FP16 quantization for the ISCXVPN2016 LiteNet model.

Note: This TensorRT engine was built on an NVIDIA RTX 4080 Super with TensorRT version 10.10.0.31. The engine must be built on the deployment device itself; engines are not portable across GPUs. The --shapes flag and INPUT_NAME must match the .onnx model's input. This example assumes a fixed batch size.

DATASET="ISCXVPN2016" # Specify dataset
QUANT="FP16" # Specify quantization
ONNX_MODEL_DIR="saved_dict"
ONNX_MODEL_PATH="${ONNX_MODEL_DIR}/LiteNet_${DATASET}_pruned_finetuned_embedding_${QUANT}.onnx"
TRT_ENGINE_PATH="${ONNX_MODEL_DIR}/LiteNet_${DATASET}_pruned_finetuned_embedding_${QUANT}.trt"
INPUT_NAME="input" # onnx input name
BATCH_SIZE=64
FEATURES=20

trtexec --onnx=${ONNX_MODEL_PATH} \
        --saveEngine=${TRT_ENGINE_PATH} \
        --sparsity=enable \
        --useCudaGraph \
        --shapes=${INPUT_NAME}:${BATCH_SIZE}x${FEATURES} \
        --fp16

Inference with TensorRT Engine

6. tensorrtinference.py

This script is used to run and benchmark the final .trt inference engine. It loads the engine, performs 100 warmup runs to ensure stable GPU performance, and then evaluates the engine on the entire test dataset, reporting metrics like throughput (QPS), latency, and accuracy.
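
For reference, loading such an engine with the TensorRT Python API looks roughly like this; it is a sketch only, and buffer allocation plus the actual inference loop are omitted:

import tensorrt as trt

ENGINE_PATH = "saved_dict/LiteNet_ISCXVPN2016_pruned_finetuned_embedding_FP16.trt"

logger = trt.Logger(trt.Logger.WARNING)
with open(ENGINE_PATH, "rb") as f:
    engine = trt.Runtime(logger).deserialize_cuda_engine(f.read())
context = engine.create_execution_context()
# Device buffers for "input"/"output" must still be allocated (e.g. with torch
# or pycuda) and bound via context.set_tensor_address(...) before running
# context.execute_async_v3(...).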

Flags:

  • --data: Specifies the dataset used for inference. This determines which test data to load.
  • --quantization: Specifies the precision of the TensorRT engine (FP16 or INT8) to ensure the correct engine file is loaded.
  • --path: (Optional) Allows specifying a direct path to the .trt engine file, overriding the default name.

Usage:

To benchmark an FP16 engine for the ISCXVPN2016 dataset:

python tensorrtinference.py --data ISCXVPN2016 --quantization FP16

LICENSE

MIT License

Copyright (c) 2025 afifhaziq

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
