Hi QeRL team, thanks for the amazing work!
I am currently running RL training using QeRL with two types of models:
- A BF16 model (e.g., Qwen2.5-3B-Instruct)
- An NVFP4 weight-only quantized model (Qwen2.5-3B-NVFP4)
While BF16 models work perfectly with PEFT LoRA (via LoraConfig + prepare_model_for_kbit_training),
the NVFP4 model crashes during PEFT adapter injection because CompressedLinear modules do not expose a .weight attribute:
AttributeError: 'CompressedLinear' object has no attribute 'weight'
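For reference, this is roughly how the adapter gets attached in my run (simplified from the GRPOTrainer call in the traceback below; the model path and target_modules here are just placeholders):

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# Placeholder path; in my run the model is loaded and wrapped inside GRPOTrainer.
model = AutoModelForCausalLM.from_pretrained("Qwen2.5-3B-NVFP4")
model = prepare_model_for_kbit_training(model)

peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # illustrative
    task_type="CAUSAL_LM",
)

# Works for the BF16 model, but for the NVFP4 checkpoint raises:
# AttributeError: 'CompressedLinear' object has no attribute 'weight'
model = get_peft_model(model, peft_config)
```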
I also noticed that NVFP4 models store weights as:
- weight_packed
- weight_scale
- weight_global_scale
instead of the usual weight tensor.
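For example, listing the checkpoint keys for a single projection only shows the packed/scale tensors (shard filename and layer path are illustrative):

```python
from safetensors import safe_open

# Inspect one shard of the NVFP4 checkpoint; adjust the filename to your shard.
with safe_open("Qwen2.5-3B-NVFP4/model.safetensors", framework="pt") as f:
    keys = [k for k in f.keys() if "layers.0.self_attn.q_proj" in k]

print(keys)
# ['model.layers.0.self_attn.q_proj.weight_packed',
#  'model.layers.0.self_attn.q_proj.weight_scale',
#  'model.layers.0.self_attn.q_proj.weight_global_scale']
# ...but no 'model.layers.0.self_attn.q_proj.weight'
```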
This leads to the following question:
❓ Question
Does QeRL intentionally use different LoRA mechanisms for BF16 vs NVFP4 models?
My current understanding:
• BF16 / FP16 models
Use the standard PEFT LoRA (weight injection into PyTorch Linear layers).
• NVFP4 models
Must rely on the vLLM LoRA adapter path, because NVFP4's CompressedLinear has no .weight and PEFT cannot add LoRA matrices to the packed FP4 format (a rough sketch of what I mean is just below).
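To make the second point concrete, here is a hypothetical sketch (not QeRL's actual code, just what I imagine a side-path adapter would look like) of LoRA sitting beside the quantized layer without ever touching .weight:

```python
import torch.nn as nn

class SideLoRA(nn.Module):
    """Hypothetical wrapper: adds a LoRA path next to a quantized linear
    without accessing base_layer.weight (which CompressedLinear lacks)."""

    def __init__(self, base_layer, in_features, out_features, r=16, alpha=32):
        super().__init__()
        self.base_layer = base_layer          # e.g. a CompressedLinear
        self.lora_A = nn.Linear(in_features, r, bias=False)
        self.lora_B = nn.Linear(r, out_features, bias=False)
        nn.init.zeros_(self.lora_B.weight)    # adapter starts as a no-op
        self.scaling = alpha / r

    def forward(self, x):
        # Quantized matmul stays inside base_layer; LoRA runs in BF16 beside it.
        return self.base_layer(x) + self.lora_B(self.lora_A(x)) * self.scaling
```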
Error message from running bash dapo_qwen2.5-3b_nvfp4_single_gpu.sh:
[rank0]: Traceback (most recent call last):
[rank0]: File "/home/at0839/zonghan.ai12/Joe/QeRL/qerl.py", line 105, in <module>
[rank0]: main(data_args, training_args, model_args)
[rank0]: File "/home/at0839/zonghan.ai12/Joe/QeRL/qerl.py", line 88, in main
[rank0]: trainer = GRPOTrainer(
[rank0]: File "/home/at0839/zonghan.ai12/Joe/QeRL/trl_trainer/grpo_trainer.py", line 572, in __init__
[rank0]: model = get_peft_model(model, peft_config)
[rank0]: File "/home/at0839/zonghan.ai12/.conda/envs/qerl/lib/python3.10/site-packages/peft/mapping_func.py", line 114, in get_peft_model
[rank0]: return PeftModel(
[rank0]: File "/home/at0839/zonghan.ai12/.conda/envs/qerl/lib/python3.10/site-packages/peft/peft_model.py", line 129, in __init__
[rank0]: self.base_model = cls(model, {adapter_name: peft_config}, adapter_name)
[rank0]: File "/home/at0839/zonghan.ai12/.conda/envs/qerl/lib/python3.10/site-packages/peft/tuners/tuners_utils.py", line 295, in __init__
[rank0]: self.inject_adapter(self.model, adapter_name, low_cpu_mem_usage=low_cpu_mem_usage, state_dict=state_dict)
[rank0]: File "/home/at0839/zonghan.ai12/.conda/envs/qerl/lib/python3.10/site-packages/peft/tuners/tuners_utils.py", line 801, in inject_adapter
[rank0]: self._create_and_replace(
[rank0]: File "/home/at0839/zonghan.ai12/.conda/envs/qerl/lib/python3.10/site-packages/peft/tuners/lora/model.py", line 249, in _create_and_replace
[rank0]: new_module = self._create_new_module(lora_config, adapter_name, target, device_map=device_map, **kwargs)
[rank0]: File "/home/at0839/zonghan.ai12/.conda/envs/qerl/lib/python3.10/site-packages/peft/tuners/lora/model.py", line 336, in _create_new_module
[rank0]: new_module = dispatcher(target, adapter_name, lora_config=lora_config, **kwargs)
[rank0]: File "/home/at0839/zonghan.ai12/.conda/envs/qerl/lib/python3.10/site-packages/peft/tuners/lora/layer.py", line 2282, in dispatch_default
[rank0]: new_module = Linear(target, adapter_name, **kwargs)
[rank0]: File "/home/at0839/zonghan.ai12/.conda/envs/qerl/lib/python3.10/site-packages/peft/tuners/lora/layer.py", line 619, in __init__
[rank0]: LoraLayer.__init__(self, base_layer, **kwargs)
[rank0]: File "/home/at0839/zonghan.ai12/.conda/envs/qerl/lib/python3.10/site-packages/peft/tuners/lora/layer.py", line 126, in __init__
[rank0]: in_features, out_features = _get_in_out_features(base_layer)
[rank0]: File "/home/at0839/zonghan.ai12/.conda/envs/qerl/lib/python3.10/site-packages/peft/tuners/tuners_utils.py", line 161, in _get_in_out_features
[rank0]: if torch_supports_dtensor and isinstance(module.weight, torch.distributed.tensor.DTensor):
[rank0]: File "/home/at0839/zonghan.ai12/.conda/envs/qerl/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1940, in __getattr__
[rank0]: raise AttributeError(
[rank0]: AttributeError: 'CompressedLinear' object has no attribute 'weight'