Anancy

The clever way to mask data.

Privacy layer for AI agent interactions. Mask sensitive data before sending to Claude, ChatGPT, or any AI service. Unmask responses locally. Your real data never leaves your machine.

Named after Anancy (Anansi), the West African/Caribbean trickster spider who uses transformation and cleverness to protect what matters.

Alpha Software Disclaimer

Anancy is in active development. Do not use for:

HIPAA-regulated healthcare data

PCI-DSS payment card data

Legal discovery or litigation holds

Any data with regulatory compliance requirements

Always verify masked output before sending to AI services. This tool reduces risk but does not eliminate it. Use at your own risk.

How It Works

flowchart LR
    subgraph LOCAL["Your Machine"]
        A[("Sensitive\nData")] --> B["MASK"]
        B --> C[("Safe\nData")]
        F["UNMASK"] --> G[("Final\nResults")]
    end

    subgraph CLOUD["AI Service"]
        D["Claude /\nChatGPT"]
    end

    C --> D
    D --> E[("AI\nResponse")]
    E --> F

    style A fill:#ff6b6b,color:#fff
    style C fill:#4ecdc4,color:#fff
    style G fill:#45b7d1,color:#fff
    style D fill:#96ceb4,color:#fff

Your sensitive data stays local. AI only sees masked versions.

The Problem

AI agents like Claude can now access your files directly. That's powerful, but those files often contain data that is risky to share:

| Risk | Example |
| --- | --- |
| PII Exposure | SSNs, emails, phone numbers |
| Financial Leakage | Salaries, revenue, pricing |
| Competitive Risk | Client names, deal values |
| Compliance Violations | GDPR, CCPA, internal policy |

Sending this to the cloud creates risk.

The Solution

Anancy masks sensitive data before it leaves your machine:

| Original | Masked |
| --- | --- |
| John Smith | `[ANANCY_PERSON_1]` |
| 123-45-6789 | `[ANANCY_SSN_1]` |
| john@company.com | `[ANANCY_EMAIL_1]` |
| $147,000 | $164,640 (scaled, trends preserved) |

AI analyzes the masked version. You unmask the response locally.
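
Currency is the one case above where the masked value is another number rather than a placeholder. This README does not say how the scaling is computed. One way it could work, shown here only as a sketch, is to multiply every amount by the same per-project factor so that ratios between values survive. The factor 1.12 is an assumption chosen because it reproduces the $147,000 → $164,640 row, and `scale_currency` is a hypothetical name.

```python
import re
from decimal import Decimal

# Hypothetical per-project factor; 1.12 reproduces the table row above.
SCALE = Decimal("1.12")

# Simplified US-style amounts: $147,000 or $1,234.56
CURRENCY = re.compile(r"\$(\d+(?:,\d{3})*(?:\.\d{2})?)")


def scale_currency(text: str, factor: Decimal = SCALE) -> str:
    """Multiply every $ amount in text by the same factor."""

    def repl(match: re.Match) -> str:
        raw = match.group(1)
        scaled = Decimal(raw.replace(",", "")) * factor
        # Keep cents only if the original amount had them.
        if "." in raw:
            return f"${scaled:,.2f}"
        return f"${scaled:,.0f}"

    return CURRENCY.sub(repl, text)


print(scale_currency("John earned $147,000"))
```

Because every amount is multiplied by the same factor, a salary that was twice another is still twice as large after masking, which is what lets the AI reason about trends without seeing real figures.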

Quick Start

from anancy import Anancy

spider = Anancy("my_project")

# Mask sensitive text
original = "John Smith (SSN: 123-45-6789) earned $147,000"
safe_text, mappings = spider.mask(original)
# Result: "[ANANCY_PERSON_1] (SSN: [ANANCY_SSN_1]) earned $164,640"

# Send safe_text to AI, get response back...

# Unmask AI response
ai_response = "Recommend promoting [ANANCY_PERSON_1] based on performance"
real_response = spider.unmask(ai_response)
# Result: "Recommend promoting John Smith based on performance"
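
Before sending `safe_text` anywhere, it can be worth checking that none of the original values survived masking. A minimal sketch, assuming `mappings` behaves like a dict from placeholder to original value (its actual shape is not specified here) and using a hypothetical helper name `find_leaks`:

```python
def find_leaks(safe_text: str, mappings: dict[str, str]) -> list[str]:
    """Return original values that still appear verbatim in safe_text."""
    lowered = safe_text.lower()
    return [
        original
        for original in mappings.values()
        if original and original.lower() in lowered
    ]


# Stand-in data shaped like the Quick Start example (assumed shape).
mappings = {
    "[ANANCY_PERSON_1]": "John Smith",
    "[ANANCY_SSN_1]": "123-45-6789",
}
clean = "[ANANCY_PERSON_1] (SSN: [ANANCY_SSN_1]) earned $164,640"
leaky = "[ANANCY_PERSON_1] (SSN: 123-45-6789) earned $164,640"

print(find_leaks(clean, mappings))
print(find_leaks(leaky, mappings))
```

This only catches values the tool already detected once. It does not find sensitive data that was never recognized, so a manual read of the masked text is still needed.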

File-Based Workflow

# Mask a file
spider.mask_file("sensitive/employees.txt", "safe/employees.txt")

# Point AI at the safe/ folder
# Get AI output...

# Unmask the results
spider.unmask_file("ai_output.txt", "final_report.txt")
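
To mask a whole folder rather than a single file, the same call can be applied in a loop. A sketch, assuming `mask_file(src, dst)` takes a source and destination path as in the example above. `mask_tree` is a hypothetical helper, and the mask function is passed in as an argument so the loop does not depend on Anancy internals.

```python
from pathlib import Path
from typing import Callable


def mask_tree(
    mask_file: Callable[[str, str], object],
    src_dir: str,
    dst_dir: str,
    pattern: str = "*.txt",
) -> list[Path]:
    """Mask every matching file under src_dir into the same relative path under dst_dir."""
    src_root, dst_root = Path(src_dir), Path(dst_dir)
    written = []
    for src in sorted(src_root.rglob(pattern)):
        if not src.is_file():
            continue
        dst = dst_root / src.relative_to(src_root)
        # Create the mirrored subfolder before writing into it.
        dst.parent.mkdir(parents=True, exist_ok=True)
        mask_file(str(src), str(dst))
        written.append(dst)
    return written
```

With a `spider` instance from the Quick Start, this would be called as `mask_tree(spider.mask_file, "sensitive", "safe")`.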

CLI Usage

# Mask a file
python cli.py mask sensitive_data.txt safe_data.txt

# Unmask AI response
python cli.py unmask ai_response.txt final_output.txt

# View vault stats
python cli.py stats

Custom Vocabularies

The killer feature: instead of obvious [ANANCY_PERSON_1] placeholders, use custom vocabularies that blend in.

Preset Vocabularies

# Nature mode - uses natural words
spider = Anancy("project", vocabulary="nature")
# "John Smith" → "Maple"
# "123-45-6789" → "Alpha"

# Healthcare mode - uses domain prefixes
spider = Anancy("project", vocabulary="healthcare")
# "John Smith" → "Patient-1"
# "123-45-6789" → "MRN-1"

# Military mode - uses coded patterns
spider = Anancy("project", vocabulary="military")
# "John Smith" → "TGTP-X9F2"
# "123-45-6789" → "TGTS-M7K1"

# Financial mode
spider = Anancy("project", vocabulary="financial")
# "John Smith" → "Account-1"

# Legal mode
spider = Anancy("project", vocabulary="legal")
# "John Smith" → "Party-1"

Why Custom Vocabularies?

| Standard Placeholders | Custom Vocabulary |
| --- | --- |
| `[ANANCY_PERSON_1]` | `Maple` or `Patient-1` |
| Obviously masked | Looks natural or domain-appropriate |
| Easy to grep/extract | No obvious pattern |
| Reveals data types | Types hidden in your codebook |

Custom Configuration

from anancy import Anancy, VocabularyConfig

# Create your own vocabulary
custom_vocab = VocabularyConfig(
    mode="word_list",
    words={
        "person": ["Apollo", "Zeus", "Athena", "Hermes"],
        "ssn": ["Red", "Blue", "Green", "Yellow"],
        "email": ["Alpha", "Beta", "Gamma", "Delta"],
    }
)

spider = Anancy("project", vocabulary=custom_vocab)
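
How a word list is turned into placeholders is not spelled out here. One plausible scheme, shown purely as an illustration, hands out words in order for each data type and reuses the same word when a value repeats. The class name `WordListAssigner` and the numbered-suffix fallback once a list runs out are assumptions, not Anancy's documented behavior.

```python
class WordListAssigner:
    """Assign a stable word to each distinct value, per data type."""

    def __init__(self, words: dict[str, list[str]]):
        self.words = words
        self.assigned: dict[tuple[str, str], str] = {}
        self.counts: dict[str, int] = {}

    def placeholder(self, kind: str, value: str) -> str:
        key = (kind, value)
        if key in self.assigned:
            return self.assigned[key]  # same value, same word
        index = self.counts.get(kind, 0)
        pool = self.words[kind]
        word = pool[index % len(pool)]
        # Once the list is exhausted, add a suffix so words stay unique.
        lap = index // len(pool)
        if lap:
            word = f"{word}-{lap + 1}"
        self.counts[kind] = index + 1
        self.assigned[key] = word
        return word


assigner = WordListAssigner({"person": ["Apollo", "Zeus"]})
print(assigner.placeholder("person", "John Smith"))
print(assigner.placeholder("person", "Jane Doe"))
```

Whatever the real scheme is, a short list limits how many distinct values can be masked before words must repeat or be modified, so lists should be sized to the data.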

Installation

# Clone the repo
git clone https://github.com/DevontiaW/anancy.git
cd anancy

# No external dependencies required!
python -m anancy.core  # Run the demo

What Gets Detected

mindmap
  root((Anancy))
    PII
      SSN
      Email
      Phone
      Names
    Financial
      Currency
      Percentages
      Account Numbers
    Location
      Addresses
      ZIP Codes
    Temporal
      Dates
      Timestamps

| Pattern | Example | Method |
| --- | --- | --- |
| SSN | 123-45-6789 | Regex |
| Email | user@domain.com | Regex |
| Phone | (555) 123-4567 | Regex |
| Currency | $1,234.56 | Regex + Semantic Scaling |
| Dates | January 15, 2025 | Regex |
| Addresses | 123 Main Street | Regex |
| Names | John Smith | Name Dictionary |
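
The regex-based rows can be approximated with a handful of patterns. The expressions below are simplified stand-ins written for this example (US formats only), not the patterns Anancy ships, and `detect` is a hypothetical function name.

```python
import re

# Simplified illustrative patterns; real-world data needs broader coverage.
PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "PHONE": re.compile(r"\(\d{3}\)\s?\d{3}-\d{4}"),
}


def detect(text: str) -> list[tuple[str, str]]:
    """Return (type, matched_text) pairs in order of appearance."""
    hits = []
    for kind, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((match.start(), kind, match.group()))
    # Sort by position so results follow the order of the input text.
    return [(kind, value) for _, kind, value in sorted(hits)]


sample = "Call (555) 123-4567 or mail user@domain.com, SSN 123-45-6789"
for kind, value in detect(sample):
    print(kind, value)
```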

Architecture

flowchart TB
    subgraph INPUT["Input"]
        I1["Text / File"]
    end

    subgraph ANANCY["Anancy Engine"]
        V1["Pattern\nScanner"] --> V2["Type\nClassifier"]
        V2 --> V3["Placeholder\nGenerator"]
        V3 --> V4["Mapping\nStorage"]
    end

    subgraph OUTPUT["Output"]
        O1["Masked\nContent"]
        O2["Local\nMapping Key"]
    end

    I1 --> V1
    V3 --> O1
    V4 --> O2

    style V1 fill:#667eea,color:#fff
    style V2 fill:#667eea,color:#fff
    style V3 fill:#667eea,color:#fff
    style V4 fill:#667eea,color:#fff
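
The four stages in the diagram can be sketched in a few lines. This is an illustrative toy, not the code in `core.py`: it scans with a single SSN regex, keeps the mapping in memory, and `TinyMasker` and its methods are hypothetical names.

```python
import re


class TinyMasker:
    """Scanner -> classifier -> placeholder generator -> mapping storage."""

    # Pattern scanner and type classifier: each regex is keyed by its type.
    SCANNERS = {"SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b")}

    def __init__(self):
        self.forward = {}   # original value -> placeholder
        self.reverse = {}   # placeholder -> original value (the mapping key)
        self.counters = {}  # type -> last number issued

    def _placeholder(self, kind, value):
        # Placeholder generator: a repeated value reuses its placeholder.
        if value not in self.forward:
            self.counters[kind] = self.counters.get(kind, 0) + 1
            token = f"[ANANCY_{kind}_{self.counters[kind]}]"
            self.forward[value] = token
            self.reverse[token] = value
        return self.forward[value]

    def mask(self, text):
        for kind, pattern in self.SCANNERS.items():
            text = pattern.sub(
                lambda m, k=kind: self._placeholder(k, m.group()), text
            )
        return text

    def unmask(self, text):
        for token, value in self.reverse.items():
            text = text.replace(token, value)
        return text


masker = TinyMasker()
print(masker.mask("A: 123-45-6789, B: 987-65-4321, A again: 123-45-6789"))
```

Reusing one placeholder per distinct value matters for analysis: the AI can tell that two mentions refer to the same person or record without learning who or what it is.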

Storage Structure

~/.anancy/
├── {project_name}/
│   ├── mapping.json    # Placeholder-to-original mappings, stored as plaintext (NEVER leaves machine)
│   └── audit.jsonl     # Activity log
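
Because `mapping.json` is what turns placeholders back into real values, it is worth restricting this directory to your own user account. A sketch for POSIX systems using only the standard library; on Windows `os.chmod` can only toggle the read-only flag, so it has little effect there. `lock_down` is a hypothetical helper, not part of Anancy.

```python
import os
import stat
from pathlib import Path


def lock_down(root: Path) -> None:
    """Make root and everything under it accessible to the owner only."""
    os.chmod(root, stat.S_IRWXU)  # 0o700
    for path in root.rglob("*"):
        if path.is_dir():
            os.chmod(path, stat.S_IRWXU)  # 0o700: owner may enter and list
        else:
            os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)  # 0o600: owner read/write


# Example call (not run here): lock_down(Path.home() / ".anancy")
```

This limits access by other local users. It does not protect against anything running under your own account, and it is not a substitute for encryption.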

Demo

python demo_workspace/run_demo.py

Limitations

An honest account of what this tool can and can't do.

| Catches Well | May Need Manual Handling |
| --- | --- |
| Standard PII (SSN, email, phone) | Uncommon names |
| Common Western names (~200 in dictionary) | Non-English names |
| US date/currency formats | International formats |
| Street addresses | Domain-specific identifiers |

Known Gaps

Context Leakage - Masking "John Smith" doesn't hide context like "The CEO of Acme Corp, [ANANCY_PERSON_1]..."
Structured Data Headers - CSV column headers (ssn,salary,name) reveal what masked data means
Name Detection is Basic - The dictionary covers only ~200 common Western first names, so non-Western names are often missed.
Mapping File is Plaintext - Anyone with access to ~/.anancy/ can read your mappings
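
For the structured-data gap, one workaround is to neutralize column headers before masking and restore them afterwards. A sketch using the standard `csv` module; `neutralize_headers` and the `col_N` naming are assumptions made for illustration, not an Anancy feature.

```python
import csv
import io


def neutralize_headers(csv_text: str) -> tuple[str, dict[str, str]]:
    """Replace the header row with col_1, col_2, ... and return the lookup."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return csv_text, {}
    # Neutral name -> original header, kept locally to restore later.
    lookup = {f"col_{i}": name for i, name in enumerate(rows[0], start=1)}
    rows[0] = list(lookup)
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(rows)
    return out.getvalue(), lookup


text, lookup = neutralize_headers("ssn,salary,name\n1,2,3\n")
print(text)
print(lookup)
```

Neutral headers hide what each column means, but they also remove context the AI may need, so this trades analysis quality for privacy.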

When NOT to Use

| Use Case | Why Not | What To Use Instead |
| --- | --- | --- |
| HIPAA healthcare data | Regulatory requirements | Certified BAA solutions |
| PCI payment data | Compliance standards | PCI-compliant tools |
| Legal discovery | Chain of custody | eDiscovery platforms |
| Production systems | Alpha software | Enterprise PII tools |

How Anancy Compares

| Tool | Strengths | When to Use Instead |
| --- | --- | --- |
| Microsoft Presidio | Enterprise-grade, NER-based, Azure integration | Production systems, large scale |
| AWS Comprehend | Cloud-native, managed service | Already in AWS ecosystem |
| Google DLP | Extensive detectors, cloud API | GCP users, API-based workflows |
| Anancy | Zero deps, local-first, instant setup, custom vocabularies | Quick tasks, learning, prototyping |

Anancy's niche: When you need something working in 30 seconds without cloud accounts, API keys, or pip install hell. Graduate to enterprise tools when you need scale.

Roadmap

Current (MVP)

Planned

spaCy NER integration for better name detection
Encrypted mapping files
VS Code extension
Claude native integration

Future

Advanced semantic analysis
Enhanced encryption options
Enterprise features

Project Structure

anancy/
├── src/anancy/
│   ├── __init__.py           # Package exports
│   ├── core.py               # Main Anancy class
│   └── vocabulary.py         # Custom vocabulary system
├── cli.py                    # Command-line interface
├── tests/
│   └── test_anancy.py        # Test suite
├── demo_workspace/
│   ├── sensitive_data/       # Sample sensitive files
│   ├── cowork_safe/          # Masked versions (AI-safe)
│   └── run_demo.py           # Full workflow demo
└── guides/
    ├── STUDENT_GUIDE.md      # Teaching curriculum
    └── BUSINESS_OWNER_QUICKSTART.md

Contributing

This is an early-stage project. Contributions welcome!

Fork the repo
Create a feature branch
Make your changes
Submit a PR

Areas where help is needed:

Additional pattern detection
Non-English name support
Testing across file types
Documentation improvements

The Name

Anancy (also spelled Anansi) is the trickster spider from West African and Caribbean folklore. He's known for:

Transformation - changing form to achieve goals
Protection through cleverness - outsmarting larger threats
Preserving what matters - in folklore, he protects stories and wisdom

This maps to what we do: transform your data to protect it, use clever masking to outsmart exposure risks, and preserve the real values locally where they belong.

"Anansi does not reveal his true form until he's ready."

License

MIT License - see LICENSE

Credits

Created by Devon Williams at Textstone Labs

Questions?

Open an issue
Email: devontaew@textstonelabs.com
