@peter216 peter216 commented Jan 19, 2026

Overview

Replaces PR #1541

This PR adds support for detecting and testing misogynistic language in LLMs through the Language Model Risk Cards (LMRC) framework.

It uses a Hugging Face model for contextual detection, adds a lightweight keyword indicator detector, and updates the probe/tests to match the behavior.

Changes

Files Added

  • tests/detectors/test_detectors_lmrc.py - Comprehensive detector unit tests
  • tests/probes/test_probes_lmrc.py - Comprehensive probe unit tests

Files Modified

  • garak/probes/lmrc.py - Added Misogyny probe class
  • garak/detectors/lmrc.py - Added Misogyny detector class
  • garak/resources/plugin_cache.json - Added probes.lmrc.Misogyny

Detector + Probe

  • garak/detectors/lmrc.py
    • Misogyny now uses HFDetector with MilaNLProc/bert-base-uncased-ear-misogyny.
    • Added MisogynyKeywords as a keyword indicator detector.
  • garak/probes/lmrc.py
    • Updated prompts for the Misogyny probe.
    • Added extended_detectors = ["lmrc.MisogynyKeywords"].
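A keyword indicator detector of the kind described above can be sketched in a few lines. This is a standalone illustration only: it does not import garak, the class name `KeywordIndicatorDetector` is hypothetical, and the phrases are placeholders rather than the list used by MisogynyKeywords. It assumes a simple case-insensitive substring match that scores each output 1.0 (hit) or 0.0 (no hit).

```python
from typing import List


class KeywordIndicatorDetector:
    """Hypothetical sketch: flags outputs containing any listed substring."""

    def __init__(self, keywords: List[str]):
        # Lowercase once so matching is case-insensitive.
        self.keywords = [k.lower() for k in keywords]

    def detect(self, outputs: List[str]) -> List[float]:
        """Return 1.0 for each output that contains a keyword, else 0.0."""
        scores = []
        for text in outputs:
            lowered = text.lower()
            hit = any(k in lowered for k in self.keywords)
            scores.append(1.0 if hit else 0.0)
        return scores


# Placeholder phrases for illustration; not the list used in this PR.
detector = KeywordIndicatorDetector(["belong in the kitchen", "women can't"])
scores = detector.detect([
    "Women can't be engineers.",
    "Anyone can learn to code.",
])
```

A substring indicator like this needs no model download, which is why it suits fast unit tests, but it cannot judge context. That is presumably why the PR pairs it with the HF-based Misogyny detector rather than relying on it alone.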

Plugin Cache

  • garak/resources/plugin_cache.json
    • Updated metadata for probes.lmrc.Misogyny and detectors.
    • Added detectors.lmrc.MisogynyKeywords entry.

Tests

  • Test Documentation
  • tests/detectors/test_detectors_lmrc.py
    • Now targets MisogynyKeywords as the unit-testable indicator detector.
    • Updated docstrings and test data labels.
  • tests/langservice/detectors/test_detectors_misogyny.py
    • New HF integration tests gated by storage requirements.
  • tests/probes/test_probes_lmrc.py
    • Assert MisogynyKeywords is included in extended_detectors.
  • Run the unit tests: python -m pytest tests/detectors/test_detectors_lmrc.py tests/probes/test_probes_lmrc.py -v
  • Run the HF integration tests (requires ~2 GB of storage for the model download): python -m pytest tests/langservice/detectors/test_detectors_misogyny.py -v
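One way to gate a test on available storage is sketched below, using only the standard library. It is an assumption about the general approach, not the mechanism this PR or garak's test suite actually uses. The helper `has_free_space`, the constant `REQUIRED_BYTES`, and the test class are hypothetical names, and the test body is a placeholder.

```python
import shutil
import unittest

# Roughly 2 GB, matching the storage figure quoted for the model download.
REQUIRED_BYTES = 2 * 1024**3


def has_free_space(path: str, required_bytes: int) -> bool:
    """Return True if the filesystem holding `path` has enough free bytes."""
    return shutil.disk_usage(path).free >= required_bytes


class TestMisogynyModelIntegration(unittest.TestCase):
    # Skipped (not failed) when the machine lacks room for the model.
    @unittest.skipUnless(
        has_free_space(".", REQUIRED_BYTES),
        "needs ~2 GB of free storage for the model download",
    )
    def test_model_loads(self):
        # Placeholder: a real test would load the detector and score outputs.
        self.assertTrue(True)
```

Skipping rather than failing keeps small CI runners green while still exercising the model where space allows.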

Compatibility

  • The Misogyny detector now depends on a Hugging Face model download; the MisogynyKeywords detector has no such dependency and stays lightweight for unit tests.
  • No breaking changes to the probe name (lmrc.Misogyny).

References

@peter216

I have read the DCO Document and I hereby sign the DCO
