Does QeRL use different LoRA mechanisms for BF16 and NVFP4 models? #17

@joeyu930

Description

Hi QeRL team, thanks for the amazing work!

I am currently running RL training using QeRL with two types of models:

  1. A BF16 model (e.g., Qwen2.5-3B-Instruct)
  2. An NVFP4 weight-only quantized model (Qwen2.5-3B-NVFP4)

While the BF16 model works fine with standard PEFT LoRA (via LoraConfig + prepare_model_for_kbit_training), the NVFP4 model crashes during PEFT adapter injection because CompressedLinear modules do not expose a .weight attribute:

AttributeError: 'CompressedLinear' object has no attribute 'weight'
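
For reference, the working BF16 path looks roughly like this (a minimal sketch; the rank, alpha, and target modules are illustrative values from my config, not necessarily what QeRL uses internally):

    # Minimal sketch of the BF16 + PEFT LoRA setup (hyperparameters illustrative).
    import torch
    from transformers import AutoModelForCausalLM
    from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

    model = AutoModelForCausalLM.from_pretrained(
        "Qwen/Qwen2.5-3B-Instruct", torch_dtype=torch.bfloat16
    )
    model = prepare_model_for_kbit_training(model)

    lora_config = LoraConfig(
        r=16,
        lora_alpha=32,
        target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
        task_type="CAUSAL_LM",
    )
    model = get_peft_model(model, lora_config)  # succeeds: nn.Linear exposes .weight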

I also noticed that NVFP4 models store weights as:

  • weight_packed
  • weight_scale
  • weight_global_scale

instead of the usual weight tensor.
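
A quick way to see this (a sketch; the checkpoint path and module path are from my local setup and may differ for other model structures):

    # Sketch: list the tensors of one compressed layer in the NVFP4 checkpoint.
    from transformers import AutoModelForCausalLM

    model = AutoModelForCausalLM.from_pretrained("Qwen2.5-3B-NVFP4")
    layer = model.model.layers[0].self_attn.q_proj  # a CompressedLinear module

    for name in layer.state_dict():
        print(name)
    # Prints weight_packed / weight_scale / weight_global_scale (plus bias, if any),
    # but no plain 'weight' -- hence the AttributeError in PEFT's
    # _get_in_out_features().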

This leads to the following question:


❓ Question

Does QeRL intentionally use different LoRA mechanisms for BF16 vs NVFP4 models?

My current understanding:

• BF16 / FP16 models

Use standard PEFT LoRA (adapter matrices injected into ordinary PyTorch Linear layers).

• NVFP4 models

Must rely on the vLLM LoRA adapter, because NVFP4’s CompressedLinear has no .weight and PEFT cannot attach LoRA matrices to the compressed FP4 format (see the sketch after this list).
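
If that reading is correct, the NVFP4 rollout side would look something like the following (a sketch based on vLLM's standard runtime LoRA API; I have not verified this is exactly what QeRL does internally, and the paths are placeholders):

    # Sketch: the NVFP4 base weights stay compressed; vLLM applies LoRA at runtime.
    from vllm import LLM, SamplingParams
    from vllm.lora.request import LoRARequest

    llm = LLM(
        model="Qwen2.5-3B-NVFP4",  # compressed-tensors NVFP4 checkpoint
        enable_lora=True,          # allow per-request LoRA adapters
        max_lora_rank=32,
    )

    outputs = llm.generate(
        ["What is 2 + 2?"],
        SamplingParams(temperature=0.7, max_tokens=64),
        lora_request=LoRARequest("grpo_adapter", 1, "/path/to/lora_adapter"),
    )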

Error message from bash dapo_qwen2.5-3b_nvfp4_single_gpu.sh:

[rank0]: Traceback (most recent call last):
[rank0]:   File "/home/at0839/zonghan.ai12/Joe/QeRL/qerl.py", line 105, in <module>
[rank0]:     main(data_args, training_args, model_args)
[rank0]:   File "/home/at0839/zonghan.ai12/Joe/QeRL/qerl.py", line 88, in main
[rank0]:     trainer = GRPOTrainer(
[rank0]:   File "/home/at0839/zonghan.ai12/Joe/QeRL/trl_trainer/grpo_trainer.py", line 572, in __init__
[rank0]:     model = get_peft_model(model, peft_config)
[rank0]:   File "/home/at0839/zonghan.ai12/.conda/envs/qerl/lib/python3.10/site-packages/peft/mapping_func.py", line 114, in get_peft_model
[rank0]:     return PeftModel(
[rank0]:   File "/home/at0839/zonghan.ai12/.conda/envs/qerl/lib/python3.10/site-packages/peft/peft_model.py", line 129, in __init__
[rank0]:     self.base_model = cls(model, {adapter_name: peft_config}, adapter_name)
[rank0]:   File "/home/at0839/zonghan.ai12/.conda/envs/qerl/lib/python3.10/site-packages/peft/tuners/tuners_utils.py", line 295, in __init__
[rank0]:     self.inject_adapter(self.model, adapter_name, low_cpu_mem_usage=low_cpu_mem_usage, state_dict=state_dict)
[rank0]:   File "/home/at0839/zonghan.ai12/.conda/envs/qerl/lib/python3.10/site-packages/peft/tuners/tuners_utils.py", line 801, in inject_adapter
[rank0]:     self._create_and_replace(
[rank0]:   File "/home/at0839/zonghan.ai12/.conda/envs/qerl/lib/python3.10/site-packages/peft/tuners/lora/model.py", line 249, in _create_and_replace
[rank0]:     new_module = self._create_new_module(lora_config, adapter_name, target, device_map=device_map, **kwargs)
[rank0]:   File "/home/at0839/zonghan.ai12/.conda/envs/qerl/lib/python3.10/site-packages/peft/tuners/lora/model.py", line 336, in _create_new_module
[rank0]:     new_module = dispatcher(target, adapter_name, lora_config=lora_config, **kwargs)
[rank0]:   File "/home/at0839/zonghan.ai12/.conda/envs/qerl/lib/python3.10/site-packages/peft/tuners/lora/layer.py", line 2282, in dispatch_default
[rank0]:     new_module = Linear(target, adapter_name, **kwargs)
[rank0]:   File "/home/at0839/zonghan.ai12/.conda/envs/qerl/lib/python3.10/site-packages/peft/tuners/lora/layer.py", line 619, in __init__
[rank0]:     LoraLayer.__init__(self, base_layer, **kwargs)
[rank0]:   File "/home/at0839/zonghan.ai12/.conda/envs/qerl/lib/python3.10/site-packages/peft/tuners/lora/layer.py", line 126, in __init__
[rank0]:     in_features, out_features = _get_in_out_features(base_layer)
[rank0]:   File "/home/at0839/zonghan.ai12/.conda/envs/qerl/lib/python3.10/site-packages/peft/tuners/tuners_utils.py", line 161, in _get_in_out_features
[rank0]:     if torch_supports_dtensor and isinstance(module.weight, torch.distributed.tensor.DTensor):
[rank0]:   File "/home/at0839/zonghan.ai12/.conda/envs/qerl/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1940, in __getattr__
[rank0]:     raise AttributeError(
[rank0]: AttributeError: 'CompressedLinear' object has no attribute 'weight'
