Persona Attack: Incremental Memory Injection Jailbreak Attack against Large Language Models by CAU CPSS LAB
This script executes adversarial attacks against language models using predefined prompt combinations.
Make sure the attack.py module exists and is properly implemented in the same directory or accessible via Python path.
Run the script via command line with the required arguments:
python main.py --model <model_name> --type <attack_type>| Argument | Description |
|---|---|
--model |
Specifies the target model. Available options: gpt, langchain |
--type |
Specifies the attack type. Examples: sequential, once, chain, additional1, additional2, additional3, ... |
python main.py --model gpt --type sequentialThis runs a sequential-type attack on a GPT-based model.