OOM on 24GB GPU (RTX 4090) when loading DeepSeek model - suggestion to support BERT or add memory optimization #2

@2154643435cml-ctrl

Describe the issue

I'm trying to run the multivariate anomaly detection experiment on the Weather dataset, but I hit a CUDA out-of-memory (OOM) error on a 24GB GPU. The code appears to always load the full DeepSeek-Qwen2 architecture, even after DEEPSEEK_PATH is changed to a BERT checkpoint, and that exceeds the available memory.

Environment

GPU: RTX 4090D (24GB)

CUDA: 12.4

PyTorch: 2.4.1+cu121

Python: 3.10

OS: Ubuntu 22.04

Steps to reproduce

Run the multivariate anomaly detection experiment on the Weather dataset on the 24GB GPU listed above.

Error message (relevant part)

torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 4.38 GiB.
GPU 0 has a total capacity of 23.52 GiB of which 2.83 GiB is free.
... (loading Qwen2ForCausalLM)

What I tried

Changed DEEPSEEK_PATH from deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B to bert-base-uncased in ts_benchmark/baselines/utils.py

Set PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True

Cleared previous results
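
For reference, the allocator setting only takes effect if it is in the environment before PyTorch initializes CUDA. The following is a minimal sketch of setting it from Python before `import torch`; the helper name `configure_cuda_allocator` is hypothetical and not part of this repository. Note that `expandable_segments:True` mainly reduces fragmentation, so it may not help when the model simply needs more memory than the GPU has.

```python
import os


def configure_cuda_allocator(option: str = "expandable_segments:True") -> str:
    """Add `option` to PYTORCH_CUDA_ALLOC_CONF unless its key is already set.

    Hypothetical helper: call it before importing torch so the CUDA caching
    allocator sees the value when it initializes.
    """
    current = os.environ.get("PYTORCH_CUDA_ALLOC_CONF", "")
    entries = [entry for entry in current.split(",") if entry]
    key = option.split(":", 1)[0]
    # Keep any value the user already chose for this key.
    if not any(entry.split(":", 1)[0] == key for entry in entries):
        entries.append(option)
    os.environ["PYTORCH_CUDA_ALLOC_CONF"] = ",".join(entries)
    return os.environ["PYTORCH_CUDA_ALLOC_CONF"]


configure_cuda_allocator()
# import torch  # import torch only after the variable has been set
```

Exporting the variable in the shell before launching the experiment script has the same effect.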

Observation

The tokenizer downloads successfully from bert-base-uncased, but the model still loads the Qwen2ForCausalLM architecture (the DeepSeek stack) and consumes more than 20 GB of VRAM before the OOM error.
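
One way to narrow this down is to check which architecture the checkpoint on disk actually declares, independently of the benchmark code. The sketch below reads the `architectures` field that Hugging Face checkpoints conventionally store in `config.json`. The function names and the `model_dir` argument (a local checkpoint or cache snapshot directory) are assumptions for illustration, not code from this repository.

```python
import json
from pathlib import Path


def declared_architectures(model_dir):
    """Return the `architectures` list from a checkpoint's config.json."""
    config_path = Path(model_dir) / "config.json"
    with config_path.open(encoding="utf-8") as f:
        config = json.load(f)
    # Hugging Face checkpoints usually list their model class names here;
    # fall back to an empty list if the field is missing.
    return config.get("architectures") or []


def check_backbone(model_dir, expected_prefix):
    """Raise if no declared architecture starts with `expected_prefix`."""
    archs = declared_architectures(model_dir)
    if not any(name.startswith(expected_prefix) for name in archs):
        raise RuntimeError(
            f"{model_dir} declares {archs}, expected a '{expected_prefix}*' model"
        )
    return archs
```

If the directory that DEEPSEEK_PATH resolves to declares a BERT class but the traceback still shows Qwen2ForCausalLM, the model is probably being loaded from a different path or class than the tokenizer.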

Questions

Is there a way to run the model using the lightweight BERT architecture instead of DeepSeek?

If DeepSeek is required, could you provide guidance on memory optimization (e.g., reduced batch_size, seq_len, gradient checkpointing, or mixed precision) to fit into 24GB VRAM?

Are there any configuration flags or scripts specifically designed for consumer GPUs (24GB)?
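
To put the memory question in numbers, here is a rough back-of-the-envelope estimate. It assumes a nominal 1.5e9 parameters (taken from the model name, so the true count may differ), standard per-parameter costs for fp32 training with Adam, and it ignores activations, which grow with batch_size and seq_len. Whether this repository fine-tunes the full backbone or keeps it frozen is not something I have confirmed.

```python
GIB = 1024 ** 3  # bytes per GiB


def training_memory_gib(n_params, weight_bytes=4, grad_bytes=4, optimizer_bytes=8):
    """Weights + gradients + Adam moments, excluding activations."""
    return n_params * (weight_bytes + grad_bytes + optimizer_bytes) / GIB


def frozen_memory_gib(n_params, weight_bytes=2):
    """Weights only, e.g. a frozen backbone held in fp16/bf16."""
    return n_params * weight_bytes / GIB


N_PARAMS = 1.5e9  # assumption: nominal size from the model name

full_fp32 = training_memory_gib(N_PARAMS)   # about 22.35 GiB
frozen_fp32 = frozen_memory_gib(N_PARAMS, weight_bytes=4)  # about 5.59 GiB
frozen_fp16 = frozen_memory_gib(N_PARAMS)   # about 2.79 GiB

print(f"full fp32 training with Adam: {full_fp32:.2f} GiB")
print(f"frozen fp32 weights:          {frozen_fp32:.2f} GiB")
print(f"frozen fp16 weights:          {frozen_fp16:.2f} GiB")
```

Under these assumptions, full fp32 training alone would nearly fill a 24GB card, which is in the same range as the 20+ GB I observed, while a frozen half-precision backbone would leave most of the memory for activations.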

Additional context

Your paper's Table 3 shows BERT achieves competitive performance, so supporting BERT as a lightweight backbone would greatly benefit users without A100/H800 GPUs.

Thank you for your great work!
