Generate adversarial multimodal benchmark datasets from image classification and detection annotations.
This project loads a source dataset made of images, class labels, and bounding-box annotations, then produces multiple question-answer TSV datasets using a set of generation approaches:
- `baseline_classification`
- `baseline_detection`
- `sycophancy`
- `captions`
- `captions_text`
- `legends`
The generator also creates derived images such as painted bounding-box views and caption/legend variants, then merges the resulting samples into a single output TSV.
The generator expects each dataset to live under the path defined in the config with a structure similar to:
```
data/sources/Fruits/
    classification.csv
    images/
        <image_id>.jpg
    annotations/
        <image_id>.txt
```
The `classification.csv` file is read with `image_id` as the index. The annotation files are expected to contain bounding-box information in the format used by the utilities in `src/core/`.
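For illustration, loading the classification table with `image_id` as the index might look like the sketch below; the `label` column name is an assumption, since the real columns are defined by your dataset:

```python
import io

import pandas as pd

# Inline stand-in for data/sources/Fruits/classification.csv;
# the "label" column is hypothetical.
csv_text = "image_id,label\napple_001,apple\nbanana_002,banana\n"

# Mirrors the documented behavior: image_id becomes the DataFrame index.
classification = pd.read_csv(io.StringIO(csv_text), index_col="image_id")
print(classification.loc["apple_001", "label"])  # -> apple
```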
Generated data is written under data/output/<dataset_name>/ by default.
Typical artifacts include:
- One TSV per approach, for example `COCO_baseline_classification.tsv`
- A merged TSV named after `output_name`, for example `COCO.tsv`
- Generated images under `images/`, including: `base/`, `painted/`, `captions/classification/`, `captions/detection/`, `legends/`
The TSV rows include the question, answer options, correct answer, approach name, source image id, and either image paths or base64-encoded images depending on configuration.
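A row in the output TSV can be pictured as a flat record. The column names below are hypothetical, chosen only to mirror the fields listed above; the real header is defined by the generator:

```python
import csv
import io

# Hypothetical column names mirroring the documented row fields.
fieldnames = ["question", "options", "answer", "approach", "image_id", "image"]
row = {
    "question": "What fruit is shown?",
    "options": "A) apple; B) banana; C) cherry",
    "answer": "A",
    "approach": "baseline_classification",
    "image_id": "apple_001",
    "image": "images/base/apple_001.jpg",  # or a base64 string, per config
}

# Write one row in TSV form, as the generator's output files are tab-separated.
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=fieldnames, delimiter="\t")
writer.writeheader()
writer.writerow(row)
print(buf.getvalue())
```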
The project is a Python application. Create a virtual environment and install the runtime dependencies:
```
pip install hydra-core omegaconf pandas numpy opencv-python pillow
```

If you already have a project environment, activate it first and then install the packages above.
Configuration is driven by Hydra files in configs/.
- `configs/default.yaml` defines shared defaults such as input/output paths, template strings, and enabled approaches.
- `configs/main_config.yaml` provides the Fruits example dataset configuration.
- `configs/example_dataset.yaml` is another example dataset definition.
The same files are mirrored in examples/ for convenience.
Key fields you will usually customize:
`dataset_name`, `output_name`, `data_dir`, `dataset_path`, `classif_classes`, `detect_classes`, `to_create`, and `fake_bbox_arguments`.
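A dataset config combining these fields might look like the sketch below. Every value is illustrative, and the exact schema is defined by the files shipped in `configs/`:

```yaml
dataset_name: Fruits
output_name: Fruits
data_dir: data
dataset_path: data/sources/Fruits
classif_classes: [apple, banana, cherry]
detect_classes: [apple, banana]
to_create: [baseline_classification, baseline_detection]
fake_bbox_arguments:
  min_size: 0.1   # hypothetical knobs; check configs/default.yaml for the real ones
  max_size: 0.5
```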
Run the generator through Hydra and point it at the config you want to execute:
```
python -m src.generator --config-name main_config
```

For the example dataset shipped with the repository, the command above uses `configs/main_config.yaml`.
You can also run other configs in the same way, for example:
```
python -m src.generator --config-name example_dataset
```

Hydra batch mode is supported through the `batch_configs` field. When present, the generator iterates through each listed config and merges the generated outputs into a single TSV.
Example:
```yaml
batch_configs:
  - example_dataset
```

```
configs/   Hydra configuration files
scripts/   Example launch scripts
src/       Generator and approach implementations
```
The main orchestration lives in `src/generator.py`. Approach-specific behavior is implemented in `src/approaches/`, and reusable image / annotation helpers live in `src/core/`.
Random seeds can be fixed to make generation reproducible across runs.
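The repository's actual seeding hook isn't shown here, but a typical helper (assuming Python's `random` module and NumPy are the sources of randomness) looks like:

```python
import random

import numpy as np

def set_seed(seed: int) -> None:
    """Fix the seeds of the random sources the generator may rely on."""
    random.seed(seed)
    np.random.seed(seed)

# Same seed, same sequence of draws on every run.
set_seed(42)
first = random.random()
set_seed(42)
assert random.random() == first
```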
```bibtex
@inproceedings{suarezfernandez2026heart,
  author    = {Mart{\'\i}n Su{\'a}rez-Fern{\'a}ndez and Enrique Lopez-Cuena and Jaume Guasch-Mart{\'\i} and Dario Garcia-Gasulla and Anna Arias-Duart},
  title     = {HEART Attacks: Healthcare Evaluation of Adversarial RobusTness},
  booktitle = {Proceedings of the ACM Conference on Fairness, Accountability, and Transparency (FAccT '2026)},
  year      = {2026},
  publisher = {ACM},
}
```

