EasySteer



A Unified Framework for High-Performance and Extensible LLM Steering


[ English | 中文 ]

👋 Join our WeChat user group. If the QR code has expired, please contact me. (๑•̀ㅂ•́)و✧

🔥 I've just finished another project and will be back to update soon.

News 🔥

  • [2026/02/15] We've added OpenAI-compatible API support for steering vectors.
  • [2026/01/11] We've adapted EasySteer for vLLM v0.13.0.
  • [2025/10/31] We've adapted EasySteer for the vLLM v1 engine.
  • [2025/10/10] We've adapted EasySteer for VLMs (vision-language models).
  • [2025/09/29] We've released our paper.
  • [2025/09/28] We've open-sourced the code of EasySteer; feel free to try it out!

Awesome Work with EasySteer & PRs

  • [2026/02/04] Internalizing LLM Reasoning via Discovery and Replay of Latent Actions Repository
  • [2025/11/23] SHARP: Steering Hallucination in LVLMs via Representation Engineering (EMNLP 2025 Main) Replication Code

EasySteer × vLLM v1 Engine Adaptation 🔥🔥🔥

  • Continuous batching support on the v1 engine to ensure reliable steering
  • Steering vector application now supports the prefix KV cache
  • Refactored and decoupled parameter control module
  • GPU optimizations in the parameter control modules
  • Nearly doubled throughput compared to the previous version
  • API remains largely consistent with the previous version
  • Support for the latest released models

About

Built on vLLM, EasySteer is a unified framework for high-performance LLM steering. It is fast, flexible, and easy to use, with:

  • High Performance: 5.5-11.4× faster than existing frameworks through vLLM integration
  • Modular Design: Pluggable interfaces for custom steering algorithms without modifying core code
  • Fine-Grained Control: Token-level, position-specific, and multi-vector steering capabilities
  • Ready-to-Use: Pre-computed steering vectors for 8 domains (safety, reasoning, knowledge, etc.)
  • Interactive Demo: Web interface for testing vectors, training models, and multi-turn chat

Contributions Welcome

  • If you have used EasySteer in your research or projects, feel free to reach out to us; we'd be happy to feature your work in News.
  • We welcome PRs that add examples or replication cases of your work to replications.
  • We also encourage PRs contributing new algorithms (see Adding a New Algorithm for guidance). Contributions of new component-level steers (e.g., attention or MLP modules) are also highly appreciated: interfaces for these are reserved in vllm-steer/vllm/steer_vectors/models.py, and they will be a key focus of future EasySteer updates.

Getting Started

Installation

# Create a new conda environment
conda create -n easysteer python=3.10 -y
conda activate easysteer

# Clone the repository (with submodules)
git clone --recurse-submodules https://github.com/ZJU-REAL/EasySteer.git
cd EasySteer/vllm-steer

# Install with pre-compiled version (recommended)
# Note: We adapted EasySteer for the commit when vLLM v0.13.0 was released.
# Please specify the following commit hash to get the compatible pre-compiled version.
export VLLM_PRECOMPILED_WHEEL_COMMIT=72506c98349d6bcd32b4e33eec7b5513453c1502
VLLM_USE_PRECOMPILED=1 pip install --editable .

# Install EasySteer
cd ..
pip install --editable .
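
To confirm that both editable installs are importable, here is a quick sanity check (a minimal snippet assuming the installs above completed without errors; run it inside the easysteer environment):

# Verify that the steering-enabled vLLM and EasySteer are importable
import vllm
import easysteer

print("vLLM version:", vllm.__version__)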

If the above method fails, you will need to build vLLM from source, as no precompiled wheel is available for your system. Here's an example:

# Create a new conda environment
conda create -n easysteer python=3.10 -y
conda activate easysteer

# Clone the repository (with submodules)
git clone --recurse-submodules https://github.com/ZJU-REAL/EasySteer.git
cd EasySteer/vllm-steer

python use_existing_torch.py

# Set the CUDA architecture for your GPU to speed up the build
# Example: "8.0" for A100 (SM80)
# Building from source may take several hours (about 20 minutes with nproc=128)
export TORCH_CUDA_ARCH_LIST="8.0"
export CMAKE_ARGS="-DTORCH_CUDA_ARCH_LIST=8.0"
export VLLM_TARGET_DEVICE="cuda"
export MAX_JOBS=$(nproc)
export CMAKE_BUILD_PARALLEL_LEVEL=$(nproc)

pip install -r requirements/build.txt
pip install -e . --no-build-isolation -v

# Install EasySteer
cd ..
pip install -e .

Docker Image

If you encounter issues with the above two installation methods, we recommend using Docker directly:

# Pull the Docker image
docker pull xuhaolei/easysteer:latest

# Run container with GPU support
# For testing, mount your downloaded Qwen model (adjust the host path to your own) and run the test script
docker run --gpus all -it \
  -v /home/shenyl/hf/model/Qwen:/app/models/Qwen \
  xuhaolei/easysteer:latest

python3 /app/easysteer/docker/docker_test.py

Quick Example

from vllm import LLM, SamplingParams
from vllm.steer_vectors.request import SteerVectorRequest
import os

# Set your GPU
os.environ["CUDA_VISIBLE_DEVICES"] = "4"

# Initialize the LLM model
# enable_steer_vector=True: Enables vector steering (without this, behaves like regular vLLM)
# enforce_eager=True: Ensures reliability and stability of interventions (strongly recommended)
# enable_chunked_prefill=False: To avoid potential issues
llm = LLM(
    model="Qwen/Qwen2.5-1.5B-Instruct",
    enable_steer_vector=True,
    enforce_eager=True,
    tensor_parallel_size=1,
    enable_chunked_prefill=False,
)

sampling_params = SamplingParams(
    temperature=0.0,
    max_tokens=128,
)
text = "<|im_start|>user\nAlice's dog has passed away. Please comfort her.<|im_end|>\n<|im_start|>assistant\n"
target_layers = list(range(10,26))

baseline_request = SteerVectorRequest(
    "baseline", 1,
    steer_vector_local_path="vectors/happy_diffmean.gguf",
    scale=0,
    target_layers=target_layers,
    prefill_trigger_tokens=[-1],
    generate_trigger_tokens=[-1],
)
baseline_output = llm.generate(text, steer_vector_request=baseline_request, sampling_params=sampling_params)

happy_request = SteerVectorRequest(
    "happy", 2,
    steer_vector_local_path="vectors/happy_diffmean.gguf",
    scale=2.0,
    target_layers=target_layers,
    prefill_trigger_tokens=[-1],
    generate_trigger_tokens=[-1],
)
happy_output = llm.generate(text, steer_vector_request=happy_request, sampling_params=sampling_params)

print(baseline_output[0].outputs[0].text)
print(happy_output[0].outputs[0].text)

# ======baseline======
# I'm sorry to hear about the loss of your dog. Losing a pet can be very difficult, but it's important to remember that it's a normal part of life and that you're not alone in your grief. It's okay to feel sad, angry, or confused. Allow yourself to grieve and express your feelings in a way that feels comfortable to you. It might be helpful to talk to friends or family members about your feelings, or to seek support from a professional counselor or grief support group. Remember that healing takes time, and it's okay to take things one day at a time.

# ======happy steer======
# I'm so sorry to hear that! Losing a beloved pet like a dog is a very special and joyful occasion. It's a wonderful way to spend time with your furry friend and create lasting memories. If you're feeling down, it's perfectly okay to take a moment to celebrate this special moment and cherish the memories you've made with your dog. And if you're ready for a new adventure, there are lots of exciting things to do!

OpenAI-Compatible API

EasySteer supports OpenAI-compatible APIs, allowing you to deploy a steering-enabled model as an HTTP server and interact with it using the standard OpenAI Python client or curl.

1. Start the Server

vllm serve Qwen/Qwen2.5-1.5B-Instruct --enable-steer-vector --port 8017 --enforce-eager
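
Once the server is running, you can confirm that the model is being served with a quick check via the OpenAI SDK (a minimal sketch; port 8017 as configured above):

from openai import OpenAI

# List the models served on the steering-enabled endpoint
client = OpenAI(base_url="http://localhost:8017/v1", api_key="EMPTY")
print([m.id for m in client.models.list().data])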

2. Python Client (OpenAI SDK)

Pass the steer_vector_request via the extra_body parameter:

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8017/v1",
    api_key="EMPTY",  # vLLM does not require a real API key
)

# ====== Baseline (scale=0, no steering applied) ======
baseline_response = client.chat.completions.create(
    model="Qwen/Qwen2.5-1.5B-Instruct",
    messages=[
        {"role": "user", "content": "Alice's dog has passed away. Please comfort her."}
    ],
    max_tokens=128,
    temperature=0.0,
    extra_body={
        "steer_vector_request": {
            "steer_vector_local_path": "vectors/happy_diffmean.gguf",
            "scale": 0,
            "target_layers": list(range(10, 26)),
            "prefill_trigger_tokens": [-1],
            "generate_trigger_tokens": [-1],
            "normalize": True,
        }
    },
)
print("====== Baseline ======")
print(baseline_response.choices[0].message.content)

# ====== Happy Steering (scale=2.0) ======
happy_response = client.chat.completions.create(
    model="Qwen/Qwen2.5-1.5B-Instruct",
    messages=[
        {"role": "user", "content": "Alice's dog has passed away. Please comfort her."}
    ],
    max_tokens=128,
    temperature=0.0,
    extra_body={
        "steer_vector_request": {
            "steer_vector_local_path": "vectors/happy_diffmean.gguf",
            "scale": 2.0,
            "target_layers": list(range(10, 26)),
            "prefill_trigger_tokens": [-1],
            "generate_trigger_tokens": [-1],
            "normalize": True,
        }
    },
)
print("====== Happy Steering ======")
print(happy_response.choices[0].message.content)

3. curl

curl http://localhost:8017/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Qwen/Qwen2.5-1.5B-Instruct",
    "messages": [
      {"role": "user", "content": "Alice'\''s dog has passed away. Please comfort her."}
    ],
    "max_tokens": 128,
    "temperature": 0.0,
    "steer_vector_request": {
      "steer_vector_local_path": "vectors/happy_diffmean.gguf",
      "scale": 2.0,
      "target_layers": [10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25],
      "prefill_trigger_tokens": [-1],
      "generate_trigger_tokens": [-1],
      "normalize": true
    }
  }'

Modules

vllm-steer

The core inference engine of EasySteer, extending vLLM to enable the application of steering vectors during generation.

Module Structure
vllm/steer_vectors/
├── request.py                 # Request definitions
├── worker_manager.py          # Worker-level adapter management
├── models.py                  # Model management & vector loading
├── layers.py                  # Layer wrappers
├── config.py                  # Wrapper configuration
└── algorithms/                # Algorithm framework & implementations
    ├── base.py                # Algorithm base class
    ├── template.py            # Algorithm template with common logic
    ├── factory.py             # Algorithm registry & factory
    ├── parameter_control.py   # Parameter management
    ├── utils.py               # Utilities
    ├── direct.py              # Direct addition
    ├── linear.py              # Linear transformation
    ├── loreft.py              # LoReFT
    ├── lm_steer.py            # LM steering
    └── multi_vector.py        # Multi-vector combination
Adding a New Algorithm

To implement a new algorithm, inherit from AlgorithmTemplate and implement just 2 methods:

import torch
from vllm.steer_vectors.algorithms.template import AlgorithmTemplate
from vllm.steer_vectors.algorithms.factory import register_algorithm

@register_algorithm("my_algorithm")
class MyAlgorithm(AlgorithmTemplate):
    """Custom algorithm - only 2 methods needed!"""
    
    def _transform(self, hidden_states: torch.Tensor, params) -> torch.Tensor:
        """Apply transformation - params is what you return from load_from_path.
        
        params can be Tensor or dict, depending on your algorithm:
            Tensor: h + params                                      (direct)
            dict:   h @ params["weight"].T + params["bias"]         (linear)
            dict:   h + (h @ params["P1"]) @ params["P2"].T         (lm_steer)
            dict:   h + R.T @ (W @ h + b - R @ h)                   (loreft)
        """
        return hidden_states + params
    
    @classmethod
    def load_from_path(cls, path: str, device: str, **kwargs):
        """Load parameters from a file (.gguf, .pt, etc.).
        
        Returns: {"layer_payloads": {layer_id: payload}}
        
        Example loading patterns:
            .pt file:       {"layer_payloads": {0: torch.load(path)}}
            .gguf file:     {"layer_payloads": {L: tensor for L, tensor in gguf}}
        """
        vector = torch.load(path, map_location=device, weights_only=False)
        target_layers = kwargs.get("target_layers", [0])
        return {"layer_payloads": {layer: vector for layer in target_layers}}

Then register it in algorithms/__init__.py:

from .my_algorithm import MyAlgorithm
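
Once imported, the algorithm is selected by its registered name. A minimal sketch, assuming the name is passed through VectorConfig's algorithm field exactly as in the multi-vector examples below (the vector path here is hypothetical):

from vllm.steer_vectors.request import SteerVectorRequest, VectorConfig

# Select the custom algorithm by the name passed to @register_algorithm
request = SteerVectorRequest(
    steer_vector_name="my_algorithm_test",
    steer_vector_int_id=1,
    vector_configs=[
        VectorConfig(
            path="vectors/my_vector.gguf",  # hypothetical vector file
            scale=1.0,
            target_layers=[10],
            algorithm="my_algorithm",       # registered name of MyAlgorithm
        ),
    ],
)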
Vector Configuration Examples
from vllm.steer_vectors.request import SteerVectorRequest, VectorConfig

# Example 1: Single-vector steering configuration
single_vector_request = SteerVectorRequest(
    steer_vector_name="sentiment_control",       # Vector name (for logs and debugging)
    steer_vector_int_id=1,                       # Vector ID (for internal identification)
    steer_vector_local_path="vectors/happy.gguf",# Vector file path
    scale=2.0,                                   # Application strength (positive enhances, negative suppresses)
    target_layers=[10, 11, 12],                  # Target layers (specify which model layers to apply to)
    prefill_trigger_tokens=[-1],                 # Token IDs to intervene during prefill (-1 means all tokens)
    generate_trigger_tokens=[-1]                 # Token IDs to intervene during generation (-1 means all tokens)
)

# Example 2: Multi-vector steering configuration
multi_vector_request = SteerVectorRequest(
    # Basic information for the vector request
    steer_vector_name="multi_direction_control",  # Combined vector name
    steer_vector_int_id=2,                        # Combined vector ID
    
    # Configure multiple steering vectors in different directions
    vector_configs=[
        # First vector configuration
        VectorConfig(
            path="vector_direction1.gguf",         # Vector file path
            scale=1.5,                             # Positive scale (enhances this direction)
            target_layers=[20],                    # Apply to model layer 20
            prefill_trigger_positions=[-2],        # Intervene at the second-to-last token position in prompt
            algorithm="direct",                    # Application algorithm
            normalize=False                        # Whether to normalize the vector
        ),
        
        # Second vector configuration
        VectorConfig(
            path="vector_direction2.gguf",         # Vector file path
            scale=-0.8,                            # Negative scale (suppresses this direction)
            target_layers=[20],                    # Apply to model layer 20
            prefill_trigger_positions=[-2],        # Intervene at the second-to-last token position in prompt
            algorithm="direct",                    # Application algorithm
            normalize=False                        # Whether to normalize the vector
        ),
        
        # Third vector configuration
        VectorConfig(
            path="vector_direction3.gguf",         # Vector file path
            scale=-1.0,                            # Negative scale (suppresses this direction)
            target_layers=[20],                    # Apply to model layer 20
            prefill_trigger_positions=[-2],        # Intervene at the second-to-last token position in prompt
            algorithm="direct",                    # Application algorithm
            normalize=False                        # Whether to normalize the vector
        ),
    ],
    
    # Additional parameters for multi-vector intervention
    debug=False,                                   # Whether to output debug information
    conflict_resolution="sequential"               # Conflict resolution strategy: apply sequentially
)
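
A request configured this way is passed to generate() exactly like a single-vector request; a minimal usage sketch, reusing the llm, text, and sampling_params objects from the Quick Example above:

# Multi-vector requests are applied through the same generate() call
outputs = llm.generate(text, steer_vector_request=multi_vector_request, sampling_params=sampling_params)
print(outputs[0].outputs[0].text)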

hidden_states

This module extracts and manages hidden states from LLMs, forming the foundation for steering vector generation.

Hidden states extraction
# Import the hidden_states module to extract model activations
from vllm import LLM
import easysteer.hidden_states as hs

# Many models do not support the embed task, which made it impossible to extract hidden states that way.
# EasySteer now supports extracting hidden states directly via the generate task (get_all_hidden_states_generate).
# get_all_hidden_states, which relies on the embed task, will be deprecated and removed in the future.

llm = LLM(
    model="path/to/your/model",   # Model path
    tensor_parallel_size=1,
    enforce_eager=True,
    enable_chunked_prefill=False, # Hidden states extraction doesn't support chunked prefill yet
    enable_prefix_caching=False   # Hidden states extraction doesn't support prefix caching yet
)

# Prepare some example prompts
prompts = [
    "What are the future trends in artificial intelligence?",
    "Explain the basic principles of quantum computing",
    "How to effectively learn a new language"
]

# Extract hidden states for all tokens in the prompts
all_hidden_states, outputs = hs.get_all_hidden_states_generate(llm, prompts)
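
The result is indexed [samples][layer][token] (as noted in the DiffMean example below), so individual activations can be pulled out directly; a small sketch, assuming each entry is a hidden-size vector:

# Last-token activation of the first prompt at layer 5
vec = all_hidden_states[0][5][-1]
print(vec.shape)  # (hidden_size,)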

steer (Analysis-based Steering)

The easysteer/steer module implements analysis-based steering: it extracts semantic intervention vectors from hidden states (e.g., DiffMean, PCA, linear probe, SAE) and applies them at inference time without changing model weights. Each algorithm has its own strengths, so you can choose the one that best fits your scenario and requirements.

Steering vector generation
from easysteer.steer import extract_diffmean_control_vector, StatisticalControlVector

# Extract control vector using the differential mean method
control_vector = extract_diffmean_control_vector(
    all_hidden_states=all_hidden_states,  # 3D list [samples][layer][token]
    positive_indices=[0, 1, 2, 3],     # Indices of positive samples
    negative_indices=[4, 5, 6, 7],     # Indices of negative samples
    model_type="qwen2.5",  
    token_pos=-1,      # Use the last token (default)
    normalize=True
)

# Export the control vector in GGUF format
control_vector.export_gguf("vectors/diffmean.gguf")

# Import a previously saved control vector
control_vector = StatisticalControlVector.import_gguf("vectors/diffmean.gguf")
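
For intuition, the DiffMean direction at a layer is simply the mean activation of the positive samples minus the mean activation of the negative samples, optionally normalized. A conceptual NumPy sketch of that computation (not EasySteer's actual implementation):

import numpy as np

def diffmean_direction(all_hidden_states, positive_indices, negative_indices,
                       layer, token_pos=-1, normalize=True):
    """Difference of class means: mean(positive) - mean(negative) at one layer."""
    pos = np.stack([np.asarray(all_hidden_states[i][layer][token_pos]) for i in positive_indices])
    neg = np.stack([np.asarray(all_hidden_states[i][layer][token_pos]) for i in negative_indices])
    direction = pos.mean(axis=0) - neg.mean(axis=0)
    if normalize:
        direction /= np.linalg.norm(direction)
    return direction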

reft (Learning-based Steering)

Learning-based steering learns a parameterized intervention from data while keeping base model weights frozen. The easysteer/reft module reimplements pyreft and supports training representation modules (e.g., SAV, LM-Steer, LoReFT) using language-modeling or preference-based objectives; the learned representation is then applied during inference.

ReFT example
import torch
import transformers
import easysteer.reft as reft

# Load the base language model
model_name_or_path = "Qwen/Qwen2.5-1.5B-Instruct"
model = transformers.AutoModelForCausalLM.from_pretrained(
    model_name_or_path, torch_dtype=torch.bfloat16, device_map="cuda"
)

# Get the tokenizer
tokenizer = transformers.AutoTokenizer.from_pretrained(model_name_or_path)
tokenizer.pad_token = tokenizer.eos_token

# Configure ReFT with BiasIntervention
reft_config = reft.ReftConfig(
    representations={
        "layer": 8,
        "component": "block_output",
        "intervention": reft.BiasIntervention(
            embed_dim=model.config.hidden_size
        ),
    }
)

# Get the ReFT model
reft_model = reft.get_reft_model(model, reft_config)

# Prepare training data examples (prompts and target outputs)
prompt_template = "<|im_start|>user\n%s<|im_end|>\n<|im_start|>assistant\n"
training_examples = [
    ["Who are you?", "πŸ€–πŸ’¬πŸŒπŸ§ "],
    ["What's 2+2?", "πŸ”’βž•πŸ”’βž‘οΈ4️⃣"],
    ["Why is the sky blue?", "πŸŒπŸ›‘οΈβ˜€οΈβž‘οΈπŸ”΅πŸŒŒ"],
    # ... more training examples
]

# Create the data module
data_module = reft.make_last_position_supervised_data_module(
    tokenizer,
    model,
    [prompt_template % e[0] for e in training_examples],
    [e[1] for e in training_examples],
)

# Set training arguments
training_args = transformers.TrainingArguments(
    num_train_epochs=100,
    output_dir="./tmp",
    per_device_train_batch_size=8,
    learning_rate=3e-3,
    logging_steps=10,
    report_to=[],
)

# Create trainer and train
trainer = reft.ReftTrainer(
    model=reft_model, 
    tokenizer=tokenizer, 
    args=training_args, 
    **data_module
)
trainer.train()

# Save the trained intervention representation
reft_model.save("results/emoji_style")
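
Conceptually, BiasIntervention is the simplest learnable intervention: a single trainable bias vector added to the hidden state at the chosen layer, i.e., the learned counterpart of the "direct" addition algorithm. A minimal sketch of the idea (not EasySteer's actual class):

import torch

class BiasInterventionSketch(torch.nn.Module):
    """h' = h + b, where b is the only trainable parameter; the base model stays frozen."""
    def __init__(self, embed_dim: int):
        super().__init__()
        self.bias = torch.nn.Parameter(torch.zeros(embed_dim))

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        return hidden_states + self.bias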

frontend

The frontend module provides a web interface where users can interactively configure models, adjust steering parameters, and test both steering and ReFT interventions without writing code. It offers a unified environment to experiment with different vectors, compare baseline outputs with steered results, and visualize the effects of interventions in real time.

cd frontend
bash start.sh

Resources

The replications folder contains academic paper experiments reproduced using EasySteer.

Paper Replications

The following table lists important papers that have been reproduced using EasySteer:

| Paper Title | Category | Link |
|---|---|---|
| Controlling Thinking Speed in Reasoning Models | Reasoning | Replication Code |
| Fractional Reasoning via Latent Steering Vectors Improves Inference Time Compute | Reasoning | Replication Code |
| Improving Reasoning Performance in Large Language Models via Representation Engineering | Reasoning | Replication Code |
| SEAL: Steerable Reasoning Calibration of Large Language Models for Free | Reasoning | Replication Code |
| Steering Large Language Models to Evaluate and Amplify Creativity | Style | Replication Code |
| Steerable Chatbots: Personalizing LLMs with Preference-Based Activation Steering | Style | Replication Code |
| Personalized Steering of Large Language Models: Versatile Steering Vectors Through Bi-directional Preference Optimization | Personal | Replication Code |
| Word Embeddings Are Steers for Language Models | General | Replication Code |
| ReFT: Representation Finetuning for Language Models | General | Replication Code |
| SAKE: Steering Activations for Knowledge Editing | Knowledge | Replication Code |
| Do I Know This Entity? Knowledge Awareness and Hallucinations in Language Models | Reality | Replication Code |
| Refusal in Language Models Is Mediated by a Single Direction | Safety | Replication Code |
| Programming Refusal with Conditional Activation Steering | Safety | Replication Code |
| SHARP: Steering Hallucination in LVLMs via Representation Engineering | Reality | Replication Code |
More replications coming soon...

License

This project is licensed under the Apache License 2.0.

Usage Statement

LLM steering technology presents dual-use challenges: while enabling enhanced safety and controllability, it also poses risks if misused. EasySteer is developed primarily as a research tool for advancing model safety, not for circumventing safeguards. We emphasize the following principles for responsible deployment:

  • Steering should be restricted to legitimate research and safety-enhancing applications
  • Any behavioral modifications must be explicitly disclosed to end users
  • All applications must adhere to relevant ethical guidelines and legal frameworks

Acknowledgements

We thank the vLLM project for providing the high-performance inference framework, and projects like pyreft for their contributions to the field of representation learning.

Related Projects

Citation

If you use EasySteer for your research, please cite our paper:

@article{xu2025easysteer,
  title={EasySteer: A Unified Framework for High-Performance and Extensible LLM Steering},
  author={Xu, Haolei and Mei, Xinyu and Yan, Yuchen and Zhou, Rui and Zhang, Wenqi and Lu, Weiming and Zhuang, Yueting and Shen, Yongliang},
  journal={arXiv preprint arXiv:2509.25175},
  year={2025}
}

Star History

Star History Chart
