Deception, Persuasion, and Trust: Evaluating Large Language Models in a Complex Hidden Role Game
This repository contains the implementation and evaluation framework for studying Large Language Model (LLM) behavior in Secret Hitler, a social deduction game involving deception, persuasion, and strategic reasoning. We present a comprehensive simulator that enables LLMs to play Secret Hitler autonomously, along with extensive analysis tools for evaluating their performance across multiple dimensions including strategic decision-making, persuasive communication, and deceptive behavior.
This framework supports:
- Multi-Agent LLM Gameplay: Autonomous LLM players with configurable reasoning strategies
- Persuasion Analysis: Automated annotation of persuasive techniques based on Cialdini's principles
- Deception Detection: Measurement and analysis of deceptive behavior patterns
- Human Game Crawling: Data collection from real human games for baseline comparison
- Comprehensive Evaluation: Statistical analysis, visualization, and comparison across models
- Features
- Installation
- Quick Start
- Project Structure
- Usage
- Player Types
- Evaluation Metrics
- Configuration
- Experimental Results
- Acknowledgments
- Full implementation of Secret Hitler game mechanics
- Support for multiple LLM backends (OpenAI, vLLM, local models)
- Configurable game parameters and player compositions
- Detailed logging of game states, actions, and communications
- Base LLM Player: Standard prompted gameplay
- Chain-of-Thought (CoT): Explicit reasoning traces
- Memory-Enhanced: Episodic memory of game events
- Role-Aware Messaging: Context-sensitive communication
- Strategy-Guided: Integration of expert strategy documents
- Full Configuration: Combined advanced features
- Win rate analysis by role and game phase
- Persuasion technique detection and classification
- Deception rate measurement over game progression
- Statistical significance testing (Chi-square, Mann-Whitney U)
- Belief state tracking and accuracy evaluation
- Voting pattern analysis
- Policy progression plots
- Persuasion technique heatmaps and spider plots
- Game state evaluation comparison charts
- ELO-based performance analysis
- Model comparison dashboards
- Web-based annotation interface for persuasion techniques
- Support for Cialdini's principles and game-specific tactics
- Automated LLM-based annotation pipeline
- Inter-annotator agreement evaluation
- Python 3.12 or higher
- Git
- (Optional) CUDA-enabled GPU for local LLM inference
-
Clone the repository
git clone https://github.com/ItsNiklas/secret-hitler-player.git cd secret-hitler-player -
Install dependencies
pip install -e . -
Configure API keys
Create a
.envfile in thesimulatordirectory:LLM_API_KEY="your-key" LLM_BASE_URL="http://localhost:8080/v1/"
-
Verify installation
cd simulator python HitlerGame.py --help
cd simulator
python HitlerGame.pyThis will simulate one game with the default configuration (LLM players vs rule-based players).
cd eval
python gamestats.py runsF1-Q3
python plot_comparison.py runsF1-Q3 runsF1-Llama33-70Bcd annotation
python persuasion_annotation.py ../eval/runsF1-Q3/game_001.jsoncd simulator
python HitlerGame.pycd eval
python gamestats.py runsF1-Q3Output includes:
- Overall win rates (Liberal vs Fascist)
- Win condition breakdown (policy, Hitler execution, etc.)
- Per-role performance statistics
- Statistical significance tests
python plot_comparison.py runsF1-Model1 runsF1-Model2 runsF1-Model3Generates comparative visualizations:
- Win rate comparison charts
- Policy progression plots
- Game state evaluation accuracy
python parse_annotation.py runsF1-Q3
python heatmap.py # Generate persuasion heatmap
python heatmap-spider.py # Generate spider plotpython reasoning.py runsF1-Q3Analyzes reasoning categories:
- A: Game state observation
- B: Probability-based reasoning
- C: Player statement analysis
- D: Intuition/random guessing
python vote_analyzer.py runsF1-Q3Examines:
- Voting patterns by game phase
- Role-based voting tendencies
- Confidence changes over time
cd annotation
python persuasion_annotation.py <game_file.json> -f <output_folder>Uses an LLM to automatically identify persuasion techniques based on:
- Cialdini's principles (reciprocity, commitment, social proof, etc.)
- Secret Hitler-specific tactics
- Jailbreak and manipulation techniques
cd annotation/web
python -m http.server 8000
# Navigate to http://localhost:8000Features:
- Context-aware message viewing
- Multi-label technique selection
- Progress tracking and saving
- Export to JSON
python evaluate_annotations.py Gemma312B llama-annComputes:
- Inter-annotator agreement (Cohen's Kappa)
- Confusion matrices
- Per-technique precision/recall
Collect real human gameplay data from secrethitler.io:
cd crawl
python scrape.py # Collect game IDs
python dump.py # Download game replays
python analyze.py # Analyze human behaviorCollected data is used for:
- Baseline comparison
- Validation of LLM behavior
- Training annotation models
Autonomous LLM agent with configurable features:
- Perception: Receives game state, role information, chat history
- Reasoning: Optional CoT, memory retrieval, strategy consultation
- Action: Voting, nominations, policy selection, execution decisions
- Communication: Role-specific message generation
Deterministic baseline using heuristics:
- Liberals: Trust confirmed liberals, vote cautiously
- Fascists: Coordinate to advance fascist policies
- Hitler: Blend in while supporting fascist agenda
Uniformly random action selection for control experiments.
Interactive CLI for human participation in mixed games.
- Win Rate: Overall and by role (Liberal/Fascist/Hitler)
- Policy Success Rate: Ability to enact desired policies
- Game Length: Average rounds to completion
- Execution Accuracy: Correct identification of Hitler
- Persuasion Technique Usage: Frequency and diversity of techniques
- Deception Rate: Consistency of role-aligned vs deceptive statements
- Strategic Reasoning: Quality of decision justifications
- Belief Accuracy: Correctness of role inferences
- Chi-square: Independence tests for categorical data
- Mann-Whitney U: Non-parametric comparison of distributions
- Cramér's V: Effect size for contingency tables
- Cohen's Kappa: Inter-annotator agreement
Modify in simulator/HitlerGameState.py:
NUM_PLAYERS = 5 # Total players (5-10 supported)
NUM_FASCIST_POLICIES = 6 # Policies needed for fascist win
NUM_LIBERAL_POLICIES = 5 # Policies needed for liberal win
ENABLE_VETO_POWER = True # After 5 fascist policiesAdjust prompts and parameters in simulator/players/llm_player.py:
ENABLE_COT = True # Chain-of-thought reasoning
ENABLE_MEMORY = True # Episodic memory
ENABLE_ROLE_MESSAGES = True # Role-specific messaging
ENABLE_STRATEGY_DOCS = True # Strategy guide integration
TEMPERATURE = 0.7 # Sampling temperature
MAX_TOKENS = 500 # Response length limitCustomize plots in eval/plot_config.py:
UNIBLAU = "#003A5D" # Primary color
ROLE_COLORS = {...} # Role-specific colors
DPI = 300 # Export resolution
USE_LATEX = True # LaTeX renderingThis work builds upon several excellent open-source projects:
- Secret Hitler Simulator Base: Mycleung/Secret-Hitler
- LLM Integration Foundation: edis0n-zhang/secret-hitler-player
- Strategy Documentation: Secret Hitler Strategy Guide
Special thanks to:
- The Secret Hitler game designers for creating this compelling social deduction game
- The open-source LLM community for making powerful models accessible