
# UNUM

Unum is a new framework for network control built around a unified network state embedder, which leverages the Transformer's self-attention mechanism and diverse training datasets to learn rich, latent state representations. A core design goal of Unum is to decouple state estimation into a standalone, first-class entity in network control and to improve state-estimation quality.

This repository provides code for Unum embedding training (`embedding/`) and sample integrations with two representative downstream controllers: congestion control (`controller-examples/cc/`) and adaptive bitrate selection (`controller-examples/abr/`). Additional details and evaluation results can be found in our NSDI'26 paper, *UNUM: A New Framework for Network Control*.

For step-by-step artifact evaluation instructions, please see ARTIFACT_EVALUATION.

## Unum Network State Embedding

### Step 1: Collect Network Traces

Collect network traces under a variety of network environments and conditions.
The KernMLOps project provides an easy-to-use toolchain for collecting kernel-level network telemetry on Linux.


### Step 2: Preprocess the Dataset (`embedding/preprocess/`)

Preprocessing consists of bucketization followed by tokenization.

#### Bucketization

```bash
scripts/run_all_bucketization.sh
```

This script generates bucket boundary files using the Quantile, KMeans, and Histogram methods.
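For intuition, here is a minimal, self-contained sketch of quantile bucketization: split each feature's value distribution into equal-probability buckets and save the interior boundaries. The helper function, file name, and boundary format below are illustrative assumptions, not the repository's actual code.

```python
# Hypothetical sketch of quantile bucketization: compute per-feature bucket
# boundaries from raw telemetry and write them to a pickle file, in the
# spirit of what scripts/run_all_bucketization.sh produces.
import pickle
import numpy as np

def quantile_boundaries(values: np.ndarray, num_buckets: int) -> np.ndarray:
    """Return num_buckets - 1 interior boundaries that split `values`
    into equal-probability buckets."""
    quantiles = np.linspace(0.0, 1.0, num_buckets + 1)[1:-1]
    return np.quantile(values, quantiles)

# Example: 50 buckets per feature over a (samples, features) trace matrix.
traces = np.random.lognormal(size=(10_000, 6))  # stand-in for real telemetry
boundaries = [quantile_boundaries(traces[:, i], 50) for i in range(traces.shape[1])]

with open("boundaries-quantile50.pkl", "wb") as f:  # assumed file name
    pickle.dump(boundaries, f)
```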

#### Tokenization

```bash
scripts/run_all_tokenization.sh
```

This step converts raw datasets into tokenized datasets according to the generated bucket boundaries.
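Conceptually, tokenization maps each raw feature value to the index of the bucket it falls into. The sketch below shows one straightforward way to do this with `np.digitize`, under the assumed boundary format from the bucketization sketch above; the repository's actual token layout (`single` combo tokens vs. per-feature `multi` tokens) may differ in detail.

```python
# Hypothetical sketch of tokenization: convert raw values to bucket indices
# using the per-feature boundaries produced during bucketization.
import numpy as np

def tokenize(traces: np.ndarray, boundaries: list) -> np.ndarray:
    """Convert a (samples, features) float matrix into integer bucket
    indices, one token per feature value."""
    tokens = np.empty(traces.shape, dtype=np.int64)
    for i, bounds in enumerate(boundaries):
        tokens[:, i] = np.digitize(traces[:, i], bounds)  # 0 .. len(bounds)
    return tokens
```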

### Step 3: Train Network State Prediction Models (`embedding/train/`)

#### Unum Embedder

```bash
python runvocab.py -GPU {} -DFF {} -NEL {} -NDL {} -NH {} -ES {} -W {} -LR {} -M {} -BF {} -TT {}
```

Arguments:

- `-GPU`, `--GPUNumber`: GPU index to use (0-based); falls back to CPU if CUDA is unavailable.
- `-DFF`, `--DimFeedForward`: Transformer feed-forward layer dimension (e.g., 256).
- `-NEL`, `--NumEncoderLayers`: Number of encoder layers.
- `-NDL`, `--NumDecoderLayers`: Number of decoder layers.
- `-NH`, `--NHead`: Number of attention heads.
- `-ES`, `--EmbSize`: Model embedding size (`d_model`).
- `-W`, `--Weighted`: Whether to use weighted loss (true/false).
- `-LR`, `--LearningRate`: Learning rate (e.g., 1e-4).
- `-M`, `--ModelName`: Name prefix for checkpoints and logs.
- `-BF`, `--boundaries-file`: Path to the bucket boundary file used for tokenization (e.g., `.../boundaries-quantile100.pkl`); determines which tokenized dataset is loaded.
- `-TT`, `--token-type`: Tokenization mode, either `single` (combo-token classification) or `multi` (multi-head tokens per feature).

Optional tuning arguments:

- `-AB1`, `--AdamBeta1` and `-AB2`, `--AdamBeta2`: Adam optimizer betas.
- `-SI`, `--SelectedIndices`: Comma-separated feature indices for feature selection.
- `-LW`, `--LossWeight`: Comma-separated loss weights per target.
- `-A`, `--Alpha`: Alpha for the reweighted loss.
- `-ME`, `--MoreEmbedding`: Toggle additional embedding layers.

Example command:

```bash
python runvocab.py -GPU 2 -DFF 256 -NEL 4 -NDL 4 -NH 2 -ES 16 -W False -LR 1e-4 -M Combined_10RTT_6col -BF /datastor1/janec/datasets/boundaries/boundaries-quantile50.pkl -TT multi
```
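For intuition, here is a minimal sketch of how these hyperparameters plausibly map onto a standard PyTorch encoder-decoder Transformer, using the values from the example command above. The actual model in `runvocab.py` may be structured differently; this only illustrates the roles of `-ES`, `-NH`, `-NEL`, `-NDL`, and `-DFF`.

```python
# Illustrative only: how the CLI hyperparameters might correspond to a
# vanilla PyTorch Transformer. runvocab.py's real architecture may differ.
import torch.nn as nn

model = nn.Transformer(
    d_model=16,             # -ES  / --EmbSize
    nhead=2,                # -NH  / --NHead (d_model must be divisible by nhead)
    num_encoder_layers=4,   # -NEL / --NumEncoderLayers
    num_decoder_layers=4,   # -NDL / --NumDecoderLayers
    dim_feedforward=256,    # -DFF / --DimFeedForward
    batch_first=True,
)
```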

#### MLP

```bash
python runvocabmlp.py --hidden_dim {}
```

Arguments:

- `--hidden_dim`: Hidden layer width for the MLP classifier (default: 102).
- `--checkpoint_path`: Optional path to save/load checkpoints.
- `--resume_from_epoch`: Optional epoch number to resume training from.

#### CNN

```bash
python runvocabcnn.py --num_channels {}
```

Arguments:

- `--num_channels`: Number of output channels for the convolution layers (default: 256).
- `--checkpoint_path`: Optional path to save/load checkpoints.
- `--resume_from_epoch`: Optional epoch number to resume training from.

#### LSTM

```bash
python runvocablstm.py --hidden_dim {}
```

Arguments:

- `--hidden_dim`: Hidden dimension size for the LSTM (default: 128).
- `--checkpoint_path`: Optional path to save/load checkpoints.
- `--resume_from_epoch`: Optional epoch number to resume training from (a sketch of the save/resume pattern these two flags suggest follows this list).
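All three baselines accept `--checkpoint_path` and `--resume_from_epoch`. Below is a minimal sketch of the usual PyTorch checkpointing pattern these flags suggest; the scripts' actual checkpoint format is an assumption here.

```python
# Hypothetical checkpoint helpers matching --checkpoint_path /
# --resume_from_epoch; the training scripts may store checkpoints differently.
import torch

def save_checkpoint(model, optimizer, epoch, path):
    torch.save({"epoch": epoch,
                "model_state": model.state_dict(),
                "optimizer_state": optimizer.state_dict()}, path)

def load_checkpoint(model, optimizer, path):
    ckpt = torch.load(path)
    model.load_state_dict(ckpt["model_state"])
    optimizer.load_state_dict(ckpt["optimizer_state"])
    return ckpt["epoch"] + 1  # epoch at which to resume training
```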

### Step 4: Testing (`embedding/test/`)

Evaluate trained models with `python test_transformer.py`, which computes the bucket-index prediction accuracy reported in the paper.
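As a rough illustration of the metric (not the script's actual implementation), bucket-index accuracy can be computed as the fraction of positions whose predicted bucket matches the ground-truth bucket:

```python
# Illustrative metric only; test_transformer.py's exact computation may differ.
import torch

def bucket_accuracy(logits: torch.Tensor, targets: torch.Tensor) -> float:
    """logits: (batch, seq, num_buckets); targets: (batch, seq) bucket ids."""
    preds = logits.argmax(dim=-1)
    return (preds == targets).float().mean().item()
```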
