diff --git a/README.md b/README.md
index eca4b99..e5e2f3e 100644
--- a/README.md
+++ b/README.md
@@ -36,6 +36,8 @@ psql -d localdb < subset.sql
 - **Zero-config start** -- Introspects schema automatically, no data model file required
 - **Single command** -- Extract complete data subsets with one CLI invocation
 - **Safe by default** -- Auto-detects and anonymizes sensitive fields (emails, phones, SSNs, etc.)
+- **Compliance profiles** -- Built-in GDPR, HIPAA Safe Harbor, and PCI-DSS profiles with two-phase PII scanning
+- **Column mapping UI** -- Local browser UI to visually map columns, apply compliance profiles, and export config
 - **Multiple output formats** -- SQL, JSON, and CSV
 - **Streaming** -- Memory-efficient extraction for large datasets (100K+ rows)
 - **Virtual foreign keys** -- Support for Django GenericForeignKeys and implicit relationships via config
@@ -100,6 +102,44 @@ dbslice extract postgres://... --seed "users.id=1" --anonymize
 dbslice extract postgres://... --seed "users.id=1" --anonymize --redact "audit_logs.ip_address"
 ```
 
+### Column Mapping UI
+
+Map columns visually, apply compliance profiles, and generate a ready-to-use config — all from a local browser UI.
+
+```bash
+dbslice map postgres://localhost/myapp
+
+# Custom port
+dbslice map postgres://localhost/myapp --port 8888
+
+# Also works with uvx (no install needed)
+uvx dbslice map postgres://localhost/myapp
+```
+
+<table>
+<tr>
+<td width="50%"><strong>Map columns to anonymization rules</strong></td>
+<td width="50%"><strong>Generate and export config</strong></td>
+</tr>
+<tr>
+<td><img src="https://raw.githubusercontent.com/nabroleonx/dbslice/main/docs/assets/mapping.png" alt="Column mapping" width="100%"></td>
+<td><img src="https://raw.githubusercontent.com/nabroleonx/dbslice/main/docs/assets/mapping_instructions.png" alt="Generated config" width="100%"></td>
+</tr>
+</table>
+
+Runs on `127.0.0.1:9473` with a random per-session token, and no data leaves your machine. Apply GDPR, HIPAA, or PCI-DSS profiles with one click, review what gets masked, then download the YAML.
+
+### Compliance Profiles
+
+```bash
+# HIPAA Safe Harbor — auto-masks all 18 identifier types
+dbslice extract postgres://... --seed "patients.id=1" --compliance hipaa --compliance-strict
+
+# Multiple profiles + audit manifest
+dbslice extract postgres://... --seed "users.id=1" --compliance gdpr --compliance pci-dss -f subset.sql
+# Produces subset.sql + subset.manifest.json
+```
+
 ### Output Formats
 
 ```bash
@@ -170,6 +210,9 @@ dbslice extract --config dbslice.yaml --seed "orders.id=12345"
 | Configuration | Zero-config | Requires model file | Config required | Manual YAML |
 | Setup time | Seconds | Hours | Medium | Medium |
 | Anonymization | Built-in (Faker) | Plugin-based | Advanced transformers | Not available |
+| Compliance profiles | GDPR, HIPAA, PCI-DSS | None | None | None |
+| Column mapping UI | Built-in (local) | None | None | None |
+| PII value scanning | Two-phase (pre/post mask) | None | None | None |
 | Subsetting | FK traversal | FK traversal | Limited | FK traversal |
 | Output formats | SQL, JSON, CSV | SQL, XML, CSV | SQL | SQL only |
 | Cycle handling | Automatic | Manual config | N/A | Manual |
diff --git a/docs/assets/mapping.png b/docs/assets/mapping.png
new file mode 100644
index 0000000..268d415
Binary files /dev/null and b/docs/assets/mapping.png differ
diff --git a/docs/assets/mapping_instructions.png b/docs/assets/mapping_instructions.png
new file mode 100644
index 0000000..5ed636f
Binary files /dev/null and b/docs/assets/mapping_instructions.png differ
diff --git a/docs/help/best-practices.md b/docs/help/best-practices.md
index c1de711..affa47c 100644
--- a/docs/help/best-practices.md
+++ b/docs/help/best-practices.md
@@ -20,6 +20,19 @@ dbslice extract postgres://prod/db --seed "users.id=1" --anonymize
 
 Never extract production data without `--anonymize`. Foreign keys are preserved automatically.
 
+## 2b. Use Compliance Profiles for Regulated Data
+
+```bash
+dbslice extract postgres://prod/db --seed "users.id=1" \
+  --compliance hipaa --compliance-strict
+```
+
+Compliance profiles (GDPR, HIPAA, PCI-DSS) auto-configure anonymization, run value-based PII scanning, and generate audit manifests. Use `--compliance-strict` to fail if unmasked PII is detected.
+
+## 2c. Treat Output as Pseudonymized Data
+
+Deterministic mode is **pseudonymization**, not full anonymization. For higher privacy, set `anonymization.deterministic: false` and still keep operational controls (least privilege DB account, restricted output location, and manifest review).
+
 ## 3. Validate Extractions
 
 ```bash
@@ -96,6 +109,29 @@ dropdb test_import
 
 Always verify extracted data loads cleanly into an isolated database before relying on it.
 
+## 11. Use a Compliance Runbook in CI
+
+Suggested CI flow:
+1. `dbslice inspect --compliance-check ... --compliance-output json` on the target schema.
+2. `dbslice extract ... --out-file ...` with compliance profiles.
+3. `dbslice verify-manifest ...` to confirm output file hashes.
+4. Optionally sign manifest + output with an external tool (cosign, GPG) for non-repudiation.
+5. Archive artifacts to immutable storage (S3 Object Lock, GCS retention, etc.).
+
+## 12. Compliance Controls (Quick Reference)
+
+These are **runtime CLI checks**, not an IAM or governance system. They reduce accidental mistakes but are not a substitute for network-level controls, access policies, or encryption at rest.
+
+| Risk | Control | Limitation |
+|------|---------|------------|
+| Unmasked PII reaches dev/test | `--compliance ... --compliance-strict`, profile rules, residual scan | Pattern-based detection only; may miss PII in unusual column names or embedded in binary data |
+| Unsafe ad-hoc extraction | `compliance.policy_mode: standard`, breakglass override with reason + ticket | Gates are set in the config file, so running without that config bypasses them |
+| Unknown data source used | `compliance.allow_url_patterns` / `deny_url_patterns` | Regex on URL string; does not prevent DNS aliasing or network-level bypass |
+| Non-TLS DB connection | `compliance.required_sslmode` | Checks URL query param only; does not verify actual TLS handshake |
+| Non-CI execution | `compliance.require_ci: true` | Checks `CI=true` env var, which can be set manually |
+| Output tampering | Manifest `output_file_hashes` + `dbslice verify-manifest` | SHA-256 file hashes detect changes after the fact; they do not prevent them |
+| Manifest tampering | `compliance.sign_manifest: true` with HMAC-SHA256 | Symmetric key — tamper detection only, **not** non-repudiation. For provable origin, wrap with external signing (cosign, GPG) |
+
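+The config-side controls in the table can be combined in one `compliance` block. A minimal sketch, using only the fields documented in the configuration reference; the URL patterns and the `verify-full` value are illustrative assumptions, not recommended defaults:
+
+```yaml
+compliance:
+  profiles: [hipaa]
+  strict: true                     # fail on residual PII detections
+  policy_mode: standard            # enable runtime policy gates
+  allow_url_patterns:
+    - '^postgres(ql)?://replica\.' # hypothetical: only hosts named replica.*
+  deny_url_patterns:
+    - 'primary'                    # hypothetical: block URLs that mention the primary
+  required_sslmode: verify-full    # assumed value; only the URL query param is checked
+  require_ci: true                 # expects CI=true in the environment
+  sign_manifest: true              # HMAC-SHA256, key read from DBSLICE_MANIFEST_SIGNING_KEY
+```
+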
 ---
 
 ## See Also
diff --git a/docs/user-guide/advanced-usage.md b/docs/user-guide/advanced-usage.md
index 817b198..6934551 100644
--- a/docs/user-guide/advanced-usage.md
+++ b/docs/user-guide/advanced-usage.md
@@ -117,6 +117,153 @@ dbslice extract \
 
 Validation confirms all FK references remain intact after anonymization.
 
+### Non-Deterministic Mode
+
+For stronger privacy guarantees, use non-deterministic mode where each value gets a random Faker seed instead of a deterministic one:
+
+```bash
+dbslice extract \
+  postgres://prod:5432/app \
+  --seed "users.id=1" \
+  --anonymize \
+  --non-deterministic \
+  --out-file strong_privacy.sql
+```
+
+Or in config:
+
+```yaml
+anonymization:
+  enabled: true
+  deterministic: false
+```
+
+**Trade-off**: Same value in different tables may produce different fake values (e.g., "alice@example.com" might become "john@foo.com" in one table and "jane@bar.org" in another). Use deterministic mode when cross-table consistency matters.
+
+**Legal note**: Deterministic anonymization is technically **pseudonymization** under GDPR: the same seed and input always produce the same output, so anyone who knows the seed can re-derive the mapping. Non-deterministic mode is closer to true anonymization, but structural linkage may still allow re-identification.
+
+---
+
+## Compliance Profiles
+
+dbslice includes built-in compliance profiles for GDPR, HIPAA Safe Harbor, and PCI-DSS v4.0. Profiles auto-configure anonymization patterns, run value-based PII scanning, and generate audit manifests.
+
+### Using Compliance Profiles
+
+```bash
+# HIPAA-compliant extraction
+dbslice extract \
+  postgres://medical-db:5432/ehr \
+  --seed "patients.id=1" \
+  --compliance hipaa \
+  --out-file patient_subset.sql
+
+# Multiple profiles
+dbslice extract \
+  postgres://prod:5432/app \
+  --seed "users.id=1" \
+  --compliance gdpr \
+  --compliance pci-dss \
+  --out-file compliant_subset.sql
+```
+
+Or in config:
+
+```yaml
+compliance:
+  profiles: [hipaa, gdpr]
+  strict: true
+  generate_manifest: true
+```
+
+### Available Profiles
+
+| Profile | Description | Key Coverage |
+|---------|-------------|-------------|
+| `gdpr` | EU General Data Protection Regulation | Names, email, phone, address, IP, DOB, SSN, financial IDs, online identifiers |
+| `hipaa` | HIPAA Safe Harbor de-identification | All 18 Safe Harbor identifiers: names, dates, geographic data, phone, fax, email, SSN, medical record numbers, health plan IDs, account numbers, license numbers, vehicle/device IDs, URLs, IPs, biometrics, photos, unique IDs |
+| `pci-dss` | PCI-DSS v4.0 | PAN (credit card), cardholder name, expiration date, service code; CVV/PIN NULLed (never faked) |
+
+### What Compliance Profiles Do
+
+When a profile is active:
+
+1. **Auto-enable anonymization** -- no need for `--anonymize`
+2. **Merge column patterns** -- profile-defined patterns are added to your anonymization config
+3. **Apply security NULL rules** -- profile-specific fields are forced to NULL (e.g., CVV for PCI-DSS)
+4. **Run value-based PII scanning** -- regex patterns scan actual data values (not just column names) for email, SSN, phone numbers, IP addresses, and credit card numbers (with Luhn validation)
+5. **Flag free-text columns** -- columns like `notes`, `comments`, `description` are flagged as potential PII containers
+6. **Generate audit manifest** -- a JSON manifest documenting what was anonymized
+
+### Strict Mode
+
+In strict mode, extraction fails if the PII scanner detects unmasked PII in the output:
+
+```bash
+dbslice extract \
+  postgres://prod:5432/app \
+  --seed "users.id=1" \
+  --compliance hipaa \
+  --compliance-strict \
+  --out-file subset.sql
+```
+
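+This reduces the risk of PII slipping through to dev/test environments. Detection is pattern-based, so review the manifest rather than treating a passing run as proof that no PII remains.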
+
+### Audit Manifest
+
+When compliance profiles are active (or `--manifest` is passed), dbslice writes a `*.manifest.json` file alongside the output:
+
+```json
+{
+  "extraction_id": "550e8400-e29b-41d4-a716-446655440000",
+  "timestamp": "2026-03-06T10:30:00Z",
+  "dbslice_version": "0.5.0",
+  "masking_type": "deterministic_pseudonymization",
+  "compliance_profiles": ["hipaa"],
+  "seed_hash": "sha256:a1b2c3d4e5f6...",
+  "tables": {
+    "patients": {
+      "rows_extracted": 1,
+      "fields_masked": [
+        {"column": "email", "method": "email", "category": ""},
+        {"column": "ssn", "method": "ssn", "category": ""}
+      ],
+      "fields_nulled": [
+        {"column": "password_hash", "reason": "security_null_pattern"}
+      ],
+      "fields_preserved_fk": ["id", "doctor_id"],
+      "fields_unmasked": ["created_at", "status"]
+    }
+  },
+  "pii_scan_results": [],
+  "output_file_hashes": {
+    "subset.sql": "sha256:a1b2c3..."
+  },
+  "breakglass": {},
+  "signature_algorithm": "",
+  "signature": "",
+  "warnings": [
+    {"table": "visits", "column": "notes", "reason": "Free-text column may contain embedded PII", "severity": "warning"}
+  ]
+}
+```
+
+This manifest provides structured evidence for audit reviews. It documents what dbslice did but is not a substitute for infrastructure-level audit logging.
+
+You can verify output file integrity later:
+
+```bash
+# Verify output file hashes match
+dbslice verify-manifest subset.manifest.json --no-verify-signature
+
+# Verify hashes + HMAC signature (if signing was enabled)
+export DBSLICE_MANIFEST_SIGNING_KEY="your-key"
+dbslice verify-manifest subset.manifest.json
+```
+
+Note: HMAC signing uses a shared symmetric key. It provides tamper detection (was the manifest modified after creation?) but not non-repudiation (it cannot prove *who* created it). For provable origin, wrap with an external signing tool (e.g., cosign, GPG) in your CI pipeline.
+
 ### Compliance Use Cases
 
 **GDPR Right to Erasure** -- extract and anonymize before deletion:
@@ -125,22 +272,71 @@ Validation confirms all FK references remain intact after anonymization.
 dbslice extract \
   postgres://prod:5432/app \
   --seed "users.id=12345" \
-  --anonymize \
+  --compliance gdpr \
   --out-file gdpr_erasure_backup.sql
 ```
 
-**HIPAA De-identification** -- anonymize plus redact clinical fields:
+**HIPAA Safe Harbor De-identification**:
 
 ```bash
 dbslice extract \
   postgres://medical-db:5432/ehr \
   --seed "patients.mrn='12345'" \
-  --anonymize \
-  --redact "patients.social_security" \
-  --redact "visits.notes" \
+  --compliance hipaa \
+  --compliance-strict \
   --out-file patient_deidentified.sql
 ```
 
+**PCI-DSS: No Real PANs in Dev/Test** (Requirement 6.5.5):
+
+```bash
+dbslice extract \
+  postgres://billing:5432/payments \
+  --seed "transactions.id=999" \
+  --compliance pci-dss \
+  --out-file test_transactions.sql
+```
+
+---
+
+## Column Mapping UI
+
+Instead of manually writing anonymization config, use the built-in browser UI to visually map columns.
+
+### Launch
+
+```bash
+dbslice map postgresql://localhost/myapp
+
+# Custom port
+dbslice map postgresql://localhost/myapp --port 8888
+```
+
+This opens a local server on `127.0.0.1:9473` with a session token for security. No data leaves your machine — the browser connects to the local `dbslice` process, which connects to the database.
+
+### Workflow
+
+1. **Introspect** -- Enter your database URL, click Introspect Schema. Only metadata is read.
+2. **Apply profiles** -- Click GDPR, HIPAA, or PCI-DSS to auto-map columns matching the profile's rules.
+3. **Review** -- For each column, set action to Keep, Anonymize, or NULL. Pick a provider from the dropdown.
+4. **Export** -- Click Generate Config to produce a `dbslice.yaml`. Download it.
+5. **Use** -- `dbslice extract --config dbslice.yaml --seed "table.column=value"`
+
+### What the UI shows
+
+- **Table list** with progress bars showing how many columns are mapped per table
+- **Compliance profile chips** that overlay suggested mappings with one click
+- **Provider dropdown** with descriptions (not a raw text input)
+- **Summary panel** at the bottom: click "14 masked" to see all masked fields across all tables, grouped by table
+- **Live YAML preview** that updates as you change mappings
+- **Bulk actions** per table: Anonymize all, NULL all, Reset
+
+### Security
+
+- Server binds to `127.0.0.1` only (not `0.0.0.0`)
+- Random session token generated at startup, required on all API requests
+- No persistent state, no cookies, no external requests (except Tailwind CSS CDN for styling)
+
 ---
 
 ## Streaming Large Datasets
diff --git a/docs/user-guide/cli-reference.md b/docs/user-guide/cli-reference.md
index 2ed97ac..a0d998f 100644
--- a/docs/user-guide/cli-reference.md
+++ b/docs/user-guide/cli-reference.md
@@ -9,6 +9,8 @@ Complete reference for the dbslice command-line interface.
   - [extract](#extract)
   - [init](#init)
   - [inspect](#inspect)
+  - [map](#map)
+  - [verify-manifest](#verify-manifest)
 - [Global Options](#global-options)
 - [Environment Variables](#environment-variables)
 - [Exit Codes](#exit-codes)
@@ -91,6 +93,20 @@ dbslice extract [OPTIONS] [DATABASE_URL]
 |--------|------|---------|-------------|
 | `--anonymize` / `--no-anonymize`, `-a` | FLAG | Disabled | Enable/disable automatic anonymization of sensitive fields |
 | `--redact`, `-r` | TEXT | - | Additional fields to redact (repeatable, format: `table.column`) |
+| `--non-deterministic` / `--deterministic` | FLAG | Deterministic | Use non-deterministic anonymization (random output each run, stronger privacy but no cross-table consistency) |
+
+##### Compliance Options
+
+| Option | Type | Default | Description |
+|--------|------|---------|-------------|
+| `--compliance` | TEXT | - | Compliance profile(s) to apply (repeatable): `gdpr`, `hipaa`, `pci-dss` |
+| `--compliance-strict` / `--no-compliance-strict` | FLAG | Disabled | Fail extraction if value-based PII scanning detects unmasked PII |
+| `--manifest` / `--no-manifest` | FLAG | Auto | Generate audit manifest (auto-enabled with `--compliance`) |
+| `--allow-raw` | FLAG | Disabled | Breakglass override for compliance policy gates (requires reason + ticket) |
+| `--breakglass-reason` | TEXT | - | Required justification when `--allow-raw` is used |
+| `--ticket-id` | TEXT | - | Required tracking ticket/incident ID when `--allow-raw` is used |
+
+When compliance profiles are active, anonymization is auto-enabled and profile patterns are merged as fallback wildcard rules, with precedence `user exact fields > user patterns > profile patterns > built-ins`. Value-based scanning runs in two phases: a coverage scan (pre-mask) identifies where PII exists, then a residual scan (post-mask) checks only unprotected columns. Strict mode fails only on residual detections, so correctly anonymized fields do not trigger false positives.
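+
+A minimal config sketch of that precedence, assuming a hypothetical `users` table and a hypothetical `backup_email` column; the provider names are illustrative:
+
+```yaml
+compliance:
+  profiles: [gdpr]               # profile patterns act as fallbacks
+
+anonymization:
+  fields:
+    users.email: email           # exact field: wins over every pattern
+  patterns:
+    "*.backup_email": email      # user pattern: wins over profile patterns and built-ins
+```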
 
 ##### Validation Options
 
@@ -227,6 +243,36 @@ dbslice extract postgresql://localhost/myapp \
   --redact customers.tax_id
 ```
 
+##### Compliance
+
+```bash
+# Extract with HIPAA compliance profile
+dbslice extract postgresql://localhost/myapp \
+  -s "patients.id=1" \
+  --compliance hipaa
+
+# Multiple compliance profiles with strict mode
+dbslice extract postgresql://localhost/myapp \
+  -s "users.id=1" \
+  --compliance gdpr \
+  --compliance pci-dss \
+  --compliance-strict
+
+# Non-deterministic anonymization for stronger privacy
+dbslice extract postgresql://localhost/myapp \
+  -s "users.id=1" \
+  --compliance gdpr \
+  --non-deterministic
+
+# Generate audit manifest without compliance profile
+dbslice extract postgresql://localhost/myapp \
+  -s "users.id=1" \
+  --anonymize \
+  --manifest \
+  -f subset.sql
+# Writes subset.sql + subset.manifest.json
+```
+
 ##### JSON Output
 
 ```bash
@@ -439,6 +485,9 @@ dbslice inspect [OPTIONS] [DATABASE_URL]
 |--------|------|---------|-------------|
 | `--table`, `-t` | TEXT | - | Show details for a specific table |
 | `--schema` | TEXT | `public` | PostgreSQL schema name |
+| `--compliance-check` | TEXT | - | Run compliance coverage check for profile(s): `gdpr`, `hipaa`, `pci-dss` |
+| `--compliance-output` | TEXT | `human` | Compliance report output format: `human` or `json` |
+| `--sample-rows` | INTEGER | `100` | Rows sampled per table for value-based compliance scan |
 
 #### Examples
 
@@ -513,6 +562,101 @@ for table in users orders products; do
 done
 ```
 
+##### Compliance Coverage Check
+
+```bash
+# Human-readable compliance check
+dbslice inspect postgresql://localhost/myapp \
+  --compliance-check gdpr
+
+# JSON report for CI pipelines
+dbslice inspect postgresql://localhost/myapp \
+  --compliance-check hipaa \
+  --compliance-output json
+```
+
+---
+
+### verify-manifest
+
+Verify manifest file hashes and optional HMAC signature.
+
+#### Synopsis
+
+```bash
+dbslice verify-manifest [OPTIONS] MANIFEST_FILE
+```
+
+#### Options
+
+| Option | Type | Default | Description |
+|--------|------|---------|-------------|
+| `--verify-signature` / `--no-verify-signature` | FLAG | Enabled | Verify HMAC signature when present |
+| `--key-env` | TEXT | `DBSLICE_MANIFEST_SIGNING_KEY` | Environment variable containing the HMAC signing key |
+
+#### Examples
+
+```bash
+# Verify output hashes only
+dbslice verify-manifest subset.manifest.json --no-verify-signature
+
+# Verify hashes + HMAC signature
+export DBSLICE_MANIFEST_SIGNING_KEY="super-secret"
+dbslice verify-manifest subset.manifest.json
+```
+
+---
+
+### map
+
+Launch a local browser UI for visually mapping database columns to anonymization rules.
+
+#### Synopsis
+
+```bash
+dbslice map [OPTIONS] [DATABASE_URL]
+```
+
+#### Arguments
+
+| Argument | Description |
+|----------|-------------|
+| `DATABASE_URL` | Optional database connection URL. Can also be entered in the browser UI. |
+
+#### Options
+
+| Option | Type | Default | Description |
+|--------|------|---------|-------------|
+| `--schema` | TEXT | `public` | PostgreSQL schema name |
+| `--port`, `-p` | INTEGER | `9473` | Port for the local server |
+| `--open-browser` / `--no-open-browser` | FLAG | Enabled | Auto-open browser on launch |
+
+#### Security
+
+The server binds to `127.0.0.1` only — it is not accessible from the network. A random session token is generated at startup and required for all requests. The token is passed via the URL when the browser opens.
+
+#### Examples
+
+```bash
+# Launch mapping UI (enter URL in browser)
+dbslice map
+
+# Pre-fill database URL
+dbslice map postgresql://localhost/myapp
+
+# Custom port, no auto-open
+dbslice map postgresql://localhost/myapp --port 8888 --no-open-browser
+```
+
+#### Workflow
+
+1. Enter database URL and click **Introspect Schema**
+2. Optionally click **GDPR**, **HIPAA**, or **PCI-DSS** to apply compliance profile suggestions
+3. Review each column: set action to **Keep**, **Anonymize**, or **NULL**
+4. For anonymized columns, select a provider from the dropdown (e.g., `email`, `ssn`, `hipaa_zip3`)
+5. Click **Generate Config** to export a `dbslice.yaml`
+6. Use the config: `dbslice extract --config dbslice.yaml --seed "table.column=value"`
+
 ---
 
 ## Global Options
diff --git a/docs/user-guide/configuration.md b/docs/user-guide/configuration.md
index d2b8b44..d9d792e 100644
--- a/docs/user-guide/configuration.md
+++ b/docs/user-guide/configuration.md
@@ -12,6 +12,7 @@ Complete reference for dbslice YAML configuration files.
   - [database](#database)
   - [extraction](#extraction)
   - [anonymization](#anonymization)
+  - [compliance](#compliance)
   - [output](#output)
   - [tables](#tables)
   - [performance](#performance)
@@ -69,6 +70,7 @@ version: "1.0"           # Optional config version tag (informational)
 database:                # Database connection settings
 extraction:              # Extraction behavior settings
 anonymization:           # Anonymization configuration
+compliance:              # Compliance profiles and audit manifest (optional)
 output:                  # Output format settings
 tables:                  # Per-table configuration (optional)
 performance:             # Performance tuning (optional)
@@ -245,6 +247,7 @@ anonymization:
   fields: object                # Exact table.column -> provider
   patterns: object              # Wildcard table.column glob -> provider
   security_null_fields: list    # Wildcard table.column globs to force NULL
+  deterministic: boolean        # Use deterministic anonymization (default: true)
 ```
 
 #### Fields
@@ -256,6 +259,7 @@ anonymization:
 | `fields` | Object | No | `{}` | Exact map of `table.column` to Faker method |
 | `patterns` | Object | No | `{}` | Wildcard map of `table.column` glob to Faker method |
 | `security_null_fields` | List[String] | No | `[]` | Wildcard `table.column` globs to force `NULL` |
+| `deterministic` | Boolean | No | `true` | Deterministic mode (same input = same output). Set `false` for non-deterministic anonymization with stronger privacy guarantees |
 
 Notes:
 - `fields` keys must be exact `table.column` entries (no wildcards).
@@ -361,6 +365,118 @@ anonymization:
 
 ---
 
+### compliance
+
+Compliance profile and audit manifest configuration.
+
+#### Schema
+
+```yaml
+compliance:
+  profiles: list[string]          # Compliance profiles to apply
+  strict: boolean                 # Fail if unmasked PII is detected
+  generate_manifest: boolean      # Generate audit manifest
+  policy_mode: string             # Runtime policy gates: off|standard|strict
+  allow_url_patterns: list[string] # Regex allow-list for source DB URL
+  deny_url_patterns: list[string] # Regex deny-list for source DB URL
+  required_sslmode: string        # Required sslmode query value in DB URL
+  require_ci: boolean             # Require CI=true environment
+  sign_manifest: boolean          # HMAC-sign manifest when key is available
+  manifest_key_env: string        # Env var name containing signing key
+```
+
+#### Fields
+
+| Field | Type | Required | Default | Description |
+|-------|------|----------|---------|-------------|
+| `profiles` | List[String] | No | `[]` | Compliance profiles: `gdpr`, `hipaa`, `pci-dss` |
+| `strict` | Boolean | No | `false` | Fail extraction if value-based PII scanning detects unmasked PII |
+| `generate_manifest` | Boolean | No | `false` | Generate a JSON audit manifest alongside output (auto-enabled when profiles are active) |
+| `policy_mode` | String | No | `"off"` | Compliance policy gates: `off`, `standard`, `strict` |
+| `allow_url_patterns` | List[String] | No | `[]` | Source DB URL must match one of these regex patterns (if set) |
+| `deny_url_patterns` | List[String] | No | `[]` | Source DB URL must not match any of these regex patterns |
+| `required_sslmode` | String | No | - | Required PostgreSQL `sslmode` query parameter value |
+| `require_ci` | Boolean | No | `false` | Fail when running outside CI (`CI=true` expected) |
+| `sign_manifest` | Boolean | No | `false` | Sign manifest with HMAC-SHA256 (tamper detection, not non-repudiation) |
+| `manifest_key_env` | String | No | `"DBSLICE_MANIFEST_SIGNING_KEY"` | Env var containing HMAC signing key (shared secret) |
+
+#### Compliance Profiles
+
+| Profile | Description | Key Coverage |
+|---------|-------------|-------------|
+| `gdpr` | EU General Data Protection Regulation | Names, email, phone, address, IP, DOB, SSN, financial IDs |
+| `hipaa` | HIPAA Safe Harbor (18 identifiers) | All 18 Safe Harbor identifiers including medical record numbers, device IDs, dates |
+| `pci-dss` | PCI-DSS v4.0 | PAN, cardholder name, expiration, CVV/PIN (NULLed) |
+
+When a compliance profile is active:
+- Anonymization is auto-enabled (no need for `anonymization.enabled: true`)
+- Profile-defined column patterns are merged as **fallback wildcard rules** (`user exact fields > user patterns > profile patterns > built-ins`)
+- Value-based scanning runs in two phases:
+  - coverage scan (pre-mask) to detect PII presence
+  - residual scan (post-mask) on unprotected columns only (strict mode fails only here)
+- Free-text columns (notes, comments, descriptions) are flagged as warnings
+- Audit manifest is generated by default
+
+#### Policy Modes
+
+`policy_mode` adds runtime guardrails when compliance profiles are active. These are CLI-level checks that prevent accidental misconfiguration — they are not a security boundary.
+
+- `off`: No policy gates (default).
+- `standard` / `strict`: Block risky defaults — stdout output, `--allow-unsafe-where`, and non-masked extraction are rejected unless overridden with `--allow-raw`. Both modes currently apply the same gates; `strict` is reserved for future tightening.
+
+Breakglass override: `--allow-raw --breakglass-reason "..." --ticket-id "..."`. The reason and ticket ID are recorded in the manifest for audit purposes.
+
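+A minimal sketch of enabling the gates in config. The field names come from the schema above; the combination is an example, not a recommendation:
+
+```yaml
+compliance:
+  profiles: [gdpr]
+  policy_mode: standard      # rejects stdout output, --allow-unsafe-where, and non-masked extraction
+  generate_manifest: true    # breakglass reason + ticket ID are recorded in the manifest
+```
+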
+#### Important: Pseudonymization vs Anonymization
+
+dbslice's anonymization is technically **pseudonymization** under GDPR (deterministic mode: same input = same output, reversible with seed knowledge). For stronger privacy guarantees, use `anonymization.deterministic: false` (non-deterministic mode), which uses random seeds per value but loses cross-table consistency.
+
+True GDPR anonymization (where re-identification is "not reasonably possible") may require additional measures beyond what dbslice provides (k-anonymity, data generalization, etc.).
+
+#### Audit Manifest
+
+When `generate_manifest` is enabled, dbslice writes a `*.manifest.json` file alongside the output containing:
+
+- Extraction metadata (timestamp, version, seed hash)
+- Per-table breakdown of masked, NULLed, FK-preserved, and unmasked fields
+- Residual PII scan results from value-based scanning
+- Compliance warnings (e.g., free-text columns that may contain embedded PII)
+- Output file hash set (`sha256`) for produced artifacts
+- Optional breakglass metadata (reason + ticket) when override is used
+- Optional HMAC-SHA256 signature for tamper detection (symmetric key — integrity checking, not non-repudiation)
+
+This manifest provides structured evidence for audit reviews. For non-repudiation (provable origin), sign the manifest externally with cosign or GPG in your CI pipeline.
+
+#### Examples
+
+```yaml
+# HIPAA-compliant extraction
+compliance:
+  profiles: [hipaa]
+  strict: true
+  generate_manifest: true
+
+anonymization:
+  enabled: true
+  seed: "hipaa-compliant-seed-2024"
+
+---
+# Multiple compliance profiles
+compliance:
+  profiles: [gdpr, pci-dss]
+  strict: false
+  generate_manifest: true
+
+---
+# Non-deterministic mode for stronger privacy
+compliance:
+  profiles: [gdpr]
+  strict: true
+
+anonymization:
+  enabled: true
+  deterministic: false  # Random output each run
+```
+
+---
+
 ### output
 
 Output format and generation configuration.
@@ -779,6 +895,51 @@ dbslice extract \
 
 ---
 
+### HIPAA-Compliant Extraction
+
+**config/hipaa_compliant.yaml:**
+```yaml
+version: "1.0"
+
+database:
+  url: ${MEDICAL_DATABASE_URL}
+
+extraction:
+  default_depth: 3
+  direction: both
+  exclude_tables:
+    - audit_logs
+    - system_events
+  validate: true
+  fail_on_validation_error: true
+
+compliance:
+  profiles: [hipaa]
+  strict: true              # Fail if PII detected in output
+  generate_manifest: true   # Generate audit trail
+
+anonymization:
+  enabled: true
+  seed: "hipaa-compliant-extraction-2024"
+  deterministic: false      # Non-deterministic for stronger privacy
+```
+
+**Usage:**
+```bash
+export MEDICAL_DATABASE_URL="postgresql://medical-db.example.com/ehr"
+
+dbslice extract \
+  --config config/hipaa_compliant.yaml \
+  --seed "patients.id=12345" \
+  --out-file patient_subset.sql
+
+# Output:
+#   patient_subset.sql              (anonymized data)
+#   patient_subset.manifest.json    (audit manifest for compliance team)
+```
+
+---
+
 ### Test Fixture Generation
 
 **config/test_fixtures.yaml:**
diff --git a/src/dbslice/cli.py b/src/dbslice/cli.py
index f1e04df..ba6fa6e 100644
--- a/src/dbslice/cli.py
+++ b/src/dbslice/cli.py
@@ -1,6 +1,10 @@
+import itertools
+import json
 import os
+import re
 from pathlib import Path
 from typing import Annotated
+from urllib.parse import parse_qs, urlparse
 
 import typer
 from rich.console import Console
@@ -204,9 +208,7 @@ def _parse_and_validate_seeds(
     parsed_seeds = []
     for s in seeds:
         try:
-            parsed_seeds.append(
-                SeedSpec.parse(s, allow_unsafe_subqueries=allow_unsafe_subqueries)
-            )
+            parsed_seeds.append(SeedSpec.parse(s, allow_unsafe_subqueries=allow_unsafe_subqueries))
         except ValueError as e:
             raise InvalidSeedError(s, str(e))
 
@@ -348,11 +350,19 @@ def _show_extraction_settings(
         seed_desc = f"{s.table}.{s.column}={s.value}" if s.column else f"{s.table}:{s.where_clause}"
         console.print(f"    - {seed_desc}")
     if config.anonymize:
-        console.print("  [yellow]Anonymization: ENABLED[/yellow]")
+        mode = "deterministic" if config.deterministic else "non-deterministic"
+        console.print(f"  [yellow]Anonymization: ENABLED ({mode})[/yellow]")
         if config.redact_fields:
             console.print("  Additional redacted fields:")
             for field in config.redact_fields:
                 console.print(f"    - {field}")
+    if config.compliance_profiles:
+        profiles_str = ", ".join(p.upper() for p in config.compliance_profiles)
+        console.print(f"  [yellow]Compliance profiles: {profiles_str}[/yellow]")
+        if config.compliance_strict:
+            console.print("  [yellow]Strict mode: ENABLED (will fail on PII detection)[/yellow]")
+    if config.generate_manifest:
+        console.print("  [yellow]Audit manifest: ENABLED[/yellow]")
     console.print()
 
 
@@ -408,8 +418,12 @@ def _show_extraction_summary(
     if result.has_cycles:
         console.print()
         if result.used_deferred_cycle_strategy:
-            console.print("[yellow]⚠ Circular dependencies detected (deferred-constraint strategy)[/yellow]")
-            console.print("  Strategy: [cyan]Deterministic order + SET CONSTRAINTS ALL DEFERRED[/cyan]")
+            console.print(
+                "[yellow]⚠ Circular dependencies detected (deferred-constraint strategy)[/yellow]"
+            )
+            console.print(
+                "  Strategy: [cyan]Deterministic order + SET CONSTRAINTS ALL DEFERRED[/cyan]"
+            )
             console.print(f"  Cycles: [cyan]{len(result.cycle_infos)}[/cyan]")
         else:
             console.print("[yellow]⚠ Circular dependencies detected and resolved[/yellow]")
@@ -485,7 +499,7 @@ def _generate_and_output_sql(
     disable_fk_checks: bool,
     output_file_mode: int,
     db_schema: str | None = None,
-) -> None:
+) -> list[Path]:
     """
     Generate SQL output and write to file or stdout.
 
@@ -524,11 +538,13 @@ def _generate_and_output_sql(
             console.print(
                 f"[green]Wrote {result.total_rows()} rows to [bold]{out_file}[/bold][/green]"
             )
+        return [out_file.resolve()]
     else:
         if not no_progress:
             console.print()
             console.print("[dim]--- SQL Output ---[/dim]")
         stdout_console.print(sql_output)
+        return []
 
 
 def _generate_and_output_json(
@@ -541,7 +557,7 @@ def _generate_and_output_json(
     console: Console,
     stdout_console: Console,
     output_file_mode: int,
-) -> None:
+) -> list[Path]:
     """
     Generate JSON output and write to file(s) or stdout.
 
@@ -584,19 +600,23 @@ def _generate_and_output_json(
                 console.print(
                     f"[green]Wrote {result.total_rows()} rows to [bold]{out_file}[/bold][/green]"
                 )
+            return [out_file.resolve()]
         else:
             assert isinstance(json_output, dict)
             out_file.mkdir(parents=True, exist_ok=True)
+            written_files: list[Path] = []
             for table_name, table_json in json_output.items():
                 table_file = out_file / f"{table_name}.json"
                 write_text_file_secure(
                     table_file, table_json, file_mode=output_file_mode, encoding="utf-8"
                 )
+                written_files.append(table_file.resolve())
             if not no_progress:
                 console.print()
                 console.print(
                     f"[green]Wrote {result.table_count()} tables ({result.total_rows()} rows) to [bold]{out_file}[/bold][/green]"
                 )
+            return written_files
     else:
         # Output to stdout (only single mode makes sense)
         if mode == "per-table":
@@ -616,6 +636,7 @@ def _generate_and_output_json(
             console.print()
             console.print("[dim]--- JSON Output ---[/dim]")
         stdout_console.print(json_output)
+        return []
 
 
 def _generate_and_output_csv(
@@ -628,7 +649,7 @@ def _generate_and_output_csv(
     console: Console,
     stdout_console: Console,
     output_file_mode: int,
-) -> None:
+) -> list[Path]:
     """
     Generate CSV output and write to file(s) or stdout.
 
@@ -671,19 +692,23 @@ def _generate_and_output_csv(
                 console.print(
                     f"[green]Wrote {result.total_rows()} rows to [bold]{out_file}[/bold][/green]"
                 )
+            return [out_file.resolve()]
         else:
             assert isinstance(csv_output, dict)
             out_file.mkdir(parents=True, exist_ok=True)
+            written_files: list[Path] = []
             for table_name, table_csv in csv_output.items():
                 table_file = out_file / f"{table_name}.csv"
                 write_text_file_secure(
                     table_file, table_csv, file_mode=output_file_mode, encoding="utf-8"
                 )
+                written_files.append(table_file.resolve())
             if not no_progress:
                 console.print()
                 console.print(
                     f"[green]Wrote {result.table_count()} tables ({result.total_rows()} rows) to [bold]{out_file}[/bold][/green]"
                 )
+            return written_files
     else:
         # Output to stdout (only single mode makes sense)
         if mode == "per-table":
@@ -703,6 +728,7 @@ def _generate_and_output_csv(
             console.print()
             console.print("[dim]--- CSV Output ---[/dim]")
         stdout_console.print(csv_output)
+        return []
 
 
 def _handle_output_format(
@@ -720,7 +746,7 @@ def _handle_output_format(
     console: Console,
     stdout_console: Console,
     db_schema: str | None = None,
-) -> None:
+) -> list[Path]:
     """
     Handle output generation based on configured format.
 
@@ -743,7 +769,7 @@ def _handle_output_format(
         typer.Exit: If format is not yet implemented (exits with code 1)
     """
     if output_format == OutputFormat.SQL:
-        _generate_and_output_sql(
+        return _generate_and_output_sql(
             result,
             schema,
             database_url,
@@ -758,7 +784,7 @@ def _handle_output_format(
             db_schema=db_schema,
         )
     elif output_format == OutputFormat.JSON:
-        _generate_and_output_json(
+        return _generate_and_output_json(
             result,
             schema,
             out_file,
@@ -770,7 +796,7 @@ def _handle_output_format(
             output_file_mode=extract_config.output_file_mode,
         )
     elif output_format == OutputFormat.CSV:
-        _generate_and_output_csv(
+        return _generate_and_output_csv(
             result,
             schema,
             out_file,
@@ -782,6 +808,113 @@ def _handle_output_format(
             output_file_mode=extract_config.output_file_mode,
         )
 
+    return []
+
+
+def _is_truthy_env(value: str | None) -> bool:
+    """Interpret common truthy environment values."""
+    if value is None:
+        return False
+    return value.strip().lower() in {"1", "true", "yes", "on"}
+
+
+def _enforce_source_guardrails(config: ExtractConfig) -> None:
+    """Apply source guardrail checks for compliance-sensitive runs."""
+    if config.compliance_require_ci and not _is_truthy_env(os.environ.get("CI")):
+        raise ValueError("Compliance policy requires CI environment, but CI is not set")
+
+    url = config.database_url
+    for pattern in config.compliance_denied_url_patterns:
+        if re.search(pattern, url):
+            raise ValueError(f"Database URL rejected by compliance deny pattern: {pattern}")
+
+    if config.compliance_allowed_url_patterns and not any(
+        re.search(pattern, url) for pattern in config.compliance_allowed_url_patterns
+    ):
+        raise ValueError("Database URL does not match any compliance allow pattern")
+
+    if config.compliance_required_sslmode:
+        parsed = urlparse(url)
+        query = parse_qs(parsed.query)
+        sslmode = query.get("sslmode", [None])[0]
+        if sslmode != config.compliance_required_sslmode:
+            raise ValueError(
+                "Database URL sslmode does not satisfy compliance requirement "
+                f"(expected '{config.compliance_required_sslmode}', got '{sslmode}')"
+            )
+
+
+def _enforce_compliance_policy(
+    config: ExtractConfig,
+    out_file: Path | None,
+    allow_raw: bool,
+    breakglass_reason: str | None,
+    ticket_id: str | None,
+) -> tuple[bool, str | None, str | None]:
+    """
+    Apply policy-mode gates for compliance runs.
+
+    Returns:
+        Tuple of (breakglass_applied, reason, ticket_id)
+    """
+    mode = config.compliance_policy_mode.lower()
+    policy_active = mode in {"standard", "strict"} and bool(config.compliance_profiles)
+
+    risky_reasons: list[str] = []
+    if policy_active:
+        if out_file is None:
+            risky_reasons.append("stdout output is blocked when compliance profiles are active")
+        if config.allow_unsafe_where:
+            risky_reasons.append("unsafe WHERE subqueries are blocked under compliance policy")
+        if not config.anonymize:
+            risky_reasons.append("masking/anonymization is required under compliance policy")
+
+    if allow_raw:
+        if not risky_reasons:
+            raise ValueError("--allow-raw was provided but no policy gate requires breakglass")
+        if not breakglass_reason:
+            raise ValueError("--allow-raw requires --breakglass-reason")
+        if not ticket_id:
+            raise ValueError("--allow-raw requires --ticket-id")
+        return True, breakglass_reason, ticket_id
+
+    if risky_reasons:
+        details = "\n".join(f"  - {reason}" for reason in risky_reasons)
+        raise ValueError(
+            "Compliance policy blocked extraction:\n"
+            f"{details}\n"
+            "Use --allow-raw with --breakglass-reason and --ticket-id only for approved exceptions."
+        )
+
+    if breakglass_reason or ticket_id:
+        raise ValueError("--breakglass-reason/--ticket-id require --allow-raw")
+
+    return False, None, None
+
+
+def _write_compliance_manifest(
+    manifest: object,
+    out_file: Path | None,
+    console: Console,
+    no_progress: bool,
+) -> None:
+    """Write compliance manifest to a JSON file alongside the output."""
+    from dbslice.compliance.manifest import ComplianceManifest
+    from dbslice.utils.fileio import write_text_file_secure
+
+    assert isinstance(manifest, ComplianceManifest)
+
+    if out_file:
+        manifest_path = out_file.with_suffix(".manifest.json")
+    else:
+        manifest_path = Path("dbslice_manifest.json")
+
+    manifest_json = manifest.to_json(pretty=True)
+    write_text_file_secure(manifest_path, manifest_json, file_mode=0o600)
+
+    if not no_progress:
+        console.print(f"[green]Wrote compliance manifest to [bold]{manifest_path}[/bold][/green]")
+
 
 @app.command()
 def extract(
@@ -986,6 +1119,58 @@ def extract(
             ),
         ),
     ] = None,
+    compliance: Annotated[
+        list[str] | None,
+        typer.Option(
+            "--compliance",
+            help="Compliance profile(s) to apply: gdpr, hipaa, pci-dss",
+        ),
+    ] = None,
+    compliance_strict: Annotated[
+        bool | None,
+        typer.Option(
+            "--compliance-strict/--no-compliance-strict",
+            help="Fail extraction if uncovered PII is detected by value scanning",
+        ),
+    ] = None,
+    manifest: Annotated[
+        bool | None,
+        typer.Option(
+            "--manifest/--no-manifest",
+            help="Generate audit manifest alongside output (auto-enabled with --compliance)",
+        ),
+    ] = None,
+    non_deterministic: Annotated[
+        bool | None,
+        typer.Option(
+            "--non-deterministic/--deterministic",
+            help="Use non-deterministic anonymization (random output each run, stronger privacy)",
+        ),
+    ] = None,
+    allow_raw: Annotated[
+        bool,
+        typer.Option(
+            "--allow-raw",
+            help=(
+                "Breakglass override for compliance policy gates (requires "
+                "--breakglass-reason and --ticket-id)"
+            ),
+        ),
+    ] = False,
+    breakglass_reason: Annotated[
+        str | None,
+        typer.Option(
+            "--breakglass-reason",
+            help="Required breakglass justification when using --allow-raw",
+        ),
+    ] = None,
+    ticket_id: Annotated[
+        str | None,
+        typer.Option(
+            "--ticket-id",
+            help="Required tracking ticket/incident ID when using --allow-raw",
+        ),
+    ] = None,
 ):
     """
     Extract a database subset starting from seed record(s).
@@ -1037,15 +1222,11 @@ def extract(
 
             direction_override = direction
             if direction_override is None:
-                direction_override = _parse_env_choice(
-                    "DBSLICE_DIRECTION", {"up", "down", "both"}
-                )
+                direction_override = _parse_env_choice("DBSLICE_DIRECTION", {"up", "down", "both"})
 
             output_override = output
             if output_override is None:
-                output_override = _parse_env_choice(
-                    "DBSLICE_OUTPUT_FORMAT", {"sql", "json", "csv"}
-                )
+                output_override = _parse_env_choice("DBSLICE_OUTPUT_FORMAT", {"sql", "json", "csv"})
 
             anonymize_override = anonymize
             if anonymize_override is None:
@@ -1103,12 +1284,39 @@ def extract(
             allow_unsafe_where_override
             if allow_unsafe_where_override is not None
             else (
-                loaded_config.extraction.allow_unsafe_where
-                if loaded_config is not None
-                else False
+                loaded_config.extraction.allow_unsafe_where if loaded_config is not None else False
             )
         )
 
+        # Compliance settings
+        effective_compliance = compliance or []
+        effective_compliance_strict = compliance_strict if compliance_strict is not None else False
+        effective_manifest = manifest if manifest is not None else bool(effective_compliance)
+        effective_deterministic = not non_deterministic if non_deterministic is not None else True
+
+        if loaded_config:
+            if not compliance:
+                effective_compliance = loaded_config.compliance.profiles
+            if compliance_strict is None:
+                effective_compliance_strict = loaded_config.compliance.strict
+            if manifest is None:
+                effective_manifest = loaded_config.compliance.generate_manifest or bool(
+                    effective_compliance
+                )
+            if non_deterministic is None:
+                effective_deterministic = loaded_config.anonymization.deterministic
+
+        # Validate compliance profile names
+        if effective_compliance:
+            from dbslice.compliance.profiles import get_profile
+
+            for profile_name in effective_compliance:
+                try:
+                    get_profile(profile_name)
+                except ValueError as e:
+                    console.print(f"[red]Compliance Error:[/red] {e}")
+                    raise typer.Exit(1)
+
         resolved_database_url = database_url_override
         if not resolved_database_url and loaded_config:
             resolved_database_url = loaded_config.database.url
@@ -1143,12 +1351,17 @@ def extract(
                 validate_exclude_tables(passthrough)  # Same validation as exclude
             if redact_override:
                 validate_redact_fields(redact_override)
-            if (
-                direction_override is not None
-                and direction_override.lower() not in {"up", "down", "both"}
-            ):
+            if direction_override is not None and direction_override.lower() not in {
+                "up",
+                "down",
+                "both",
+            }:
                 raise ValueError("Invalid direction. Use: up, down, both")
-            if output_override is not None and output_override.lower() not in {"sql", "json", "csv"}:
+            if output_override is not None and output_override.lower() not in {
+                "sql",
+                "json",
+                "csv",
+            }:
                 raise ValueError("Invalid output format. Use: sql, json, csv")
             if effective_json_mode not in ("auto", "single", "per-table"):
                 raise ValueError(
@@ -1184,9 +1397,7 @@ def extract(
                 else None
             )
             output_format_enum = (
-                OutputFormat(effective_output.lower())
-                if output_override is not None
-                else None
+                OutputFormat(effective_output.lower()) if output_override is not None else None
             )
 
             extract_config = loaded_config.to_extract_config(
@@ -1245,18 +1456,45 @@ def extract(
                 allow_unsafe_where=effective_allow_unsafe_where,
             )
 
+        # Apply compliance settings to extract config
+        extract_config.compliance_profiles = effective_compliance
+        extract_config.compliance_strict = effective_compliance_strict
+        extract_config.generate_manifest = effective_manifest
+        extract_config.deterministic = effective_deterministic
+
+        # Compliance profiles auto-enable anonymization
+        if effective_compliance and not extract_config.anonymize and not allow_raw:
+            extract_config.anonymize = True
+
+        try:
+            _enforce_source_guardrails(extract_config)
+            (
+                breakglass_applied,
+                breakglass_applied_reason,
+                breakglass_applied_ticket,
+            ) = _enforce_compliance_policy(
+                extract_config,
+                out_file,
+                allow_raw=allow_raw,
+                breakglass_reason=breakglass_reason,
+                ticket_id=ticket_id,
+            )
+        except ValueError as e:
+            console.print(f"[red]Compliance Policy Error:[/red] {e}")
+            raise typer.Exit(1)
+
         if verbose and not no_progress:
             _show_extraction_settings(extract_config, console)
 
-        result, schema, engine = _execute_extraction(extract_config, console)
+        result, schema_graph, engine = _execute_extraction(extract_config, console)
 
         if not no_progress:
             _show_extraction_summary(result, extract_config, engine, console)
 
-        _handle_output_format(
+        output_files = _handle_output_format(
             output_format=output_format,
             result=result,
-            schema=schema,
+            schema=schema_graph,
             extract_config=extract_config,
             database_url=extract_config.database_url,
             out_file=out_file,
@@ -1270,6 +1508,30 @@ def extract(
             db_schema=extract_config.schema,
         )
 
+        # Write compliance manifest after output so file hashes can be recorded.
+        engine_manifest = getattr(engine, "manifest", None)
+        if engine_manifest and effective_manifest:
+            engine_manifest.add_output_file_hashes(output_files, base_dir=Path.cwd())
+
+            if breakglass_applied and breakglass_applied_reason and breakglass_applied_ticket:
+                engine_manifest.set_breakglass(
+                    reason=breakglass_applied_reason,
+                    ticket_id=breakglass_applied_ticket,
+                )
+
+            if extract_config.compliance_manifest_sign:
+                signing_key = os.environ.get(extract_config.compliance_manifest_key_env)
+                if not signing_key:
+                    console.print(
+                        "[red]Compliance Policy Error:[/red] "
+                        "Manifest signing is enabled but signing key environment variable "
+                        f"'{extract_config.compliance_manifest_key_env}' is not set"
+                    )
+                    raise typer.Exit(1)
+                engine_manifest.sign(signing_key)
+
+            _write_compliance_manifest(engine_manifest, out_file, console, no_progress)
+
     except ConnectionError as e:
         logger.error("Database connection failed", error=e.reason, exc_info=True)
         console.print(f"[red]Connection failed:[/red] {e.reason}")
@@ -1590,6 +1852,124 @@ def _detect_potential_implicit_fks(schema) -> list[tuple[str, str, str]]:
     return sorted(candidates, key=lambda item: (item[0], item[1], item[2]))
 
 
+def _run_compliance_check_report(
+    adapter,
+    db_schema,
+    profiles: list[str],
+    sample_rows: int,
+    output_mode: str,
+    target_table: str | None,
+    console: Console,
+) -> None:
+    """Run a profile-aware coverage scan and print a human-readable or JSON report."""
+    from dbslice.compliance.profiles import get_profile
+    from dbslice.compliance.scanner import PIIScanner
+    from dbslice.utils.anonymizer import DeterministicAnonymizer
+
+    scan_patterns: set[str] = set()
+    fallback_patterns: dict[str, str] = {}
+    security_null_fields: list[str] = []
+    profile_summaries: list[dict[str, object]] = []
+
+    for profile_name in profiles:
+        profile = get_profile(profile_name)
+        scan_patterns.update(profile.value_scan_patterns)
+        for pattern, provider in profile.required_column_patterns.items():
+            fallback_patterns.setdefault(f"*.{pattern}*", provider)
+        for pattern in profile.required_null_patterns:
+            glob = f"*.{pattern}*"
+            if glob not in security_null_fields:
+                security_null_fields.append(glob)
+        profile_summaries.append(
+            {
+                "profile": profile.name,
+                "display_name": profile.display_name,
+                "identifier_categories": len(profile.identifiers),
+                "required_column_patterns": len(profile.required_column_patterns),
+                "required_null_patterns": len(profile.required_null_patterns),
+            }
+        )
+
+    scanner = PIIScanner(
+        patterns=sorted(scan_patterns)
+        if scan_patterns
+        else ["email", "ssn", "phone", "credit_card"]
+    )
+    anonymizer = DeterministicAnonymizer(seed="compliance-check")
+    anonymizer.configure(
+        [],
+        patterns={},
+        fallback_patterns=fallback_patterns,
+        security_null_fields=security_null_fields,
+    )
+
+    if target_table:
+        tables_to_scan = [target_table]
+    else:
+        tables_to_scan = sorted(db_schema.tables.keys())
+
+    detections: list[dict[str, object]] = []
+    for table_name in tables_to_scan:
+        rows = list(itertools.islice(adapter.fetch_rows(table_name, "TRUE", ()), sample_rows))
+        if not rows:
+            continue
+        for detection in scanner.scan_rows(table_name, rows):
+            protected = anonymizer.should_anonymize(
+                detection.table, detection.column
+            ) or anonymizer.should_null(detection.table, detection.column)
+            detections.append(
+                {
+                    "table": detection.table,
+                    "column": detection.column,
+                    "pattern": detection.pattern_name,
+                    "match_count": detection.match_count,
+                    "sample_size": detection.sample_size,
+                    "match_rate": round(detection.match_rate, 4),
+                    "confidence": detection.confidence,
+                    "protected": protected,
+                }
+            )
+
+    uncovered = [item for item in detections if not item["protected"]]
+    report = {
+        "profiles": profiles,
+        "profile_summaries": profile_summaries,
+        "tables_scanned": len(tables_to_scan),
+        "sample_rows_per_table": sample_rows,
+        "detections_total": len(detections),
+        "detections_protected": len(detections) - len(uncovered),
+        "detections_uncovered": len(uncovered),
+        "status": "pass" if not uncovered else "gaps_found",
+        "uncovered_detections": uncovered,
+    }
+
+    if output_mode == "json":
+        console.print(json.dumps(report, indent=2), markup=False, highlight=False, soft_wrap=True)
+        return
+
+    console.print("\n[bold]Compliance Coverage Check[/bold]")
+    console.print(f"  Profiles: [cyan]{', '.join(profile.upper() for profile in profiles)}[/cyan]")
+    console.print(f"  Tables scanned: [cyan]{report['tables_scanned']}[/cyan]")
+    console.print(f"  Sample rows/table: [cyan]{sample_rows}[/cyan]")
+    console.print(f"  Detections: [cyan]{report['detections_total']}[/cyan]")
+    console.print(f"  Protected detections: [green]{report['detections_protected']}[/green]")
+    if uncovered:
+        console.print(f"  Uncovered detections: [red]{len(uncovered)}[/red]")
+        console.print("\n[red]Potential Compliance Gaps:[/red]")
+        for finding in uncovered[:50]:
+            console.print(
+                "  "
+                f"{finding['table']}.{finding['column']}: "
+                f"{finding['pattern']} ({finding['match_count']}/{finding['sample_size']}, "
+                f"{finding['confidence']})"
+            )
+        if len(uncovered) > 50:
+            console.print(f"  [dim]... and {len(uncovered) - 50} more[/dim]")
+    else:
+        console.print("  Uncovered detections: [green]0[/green]")
+        console.print("[green]Status: PASS[/green]")
+
+
 @app.command()
 def inspect(
     database_url: Annotated[
@@ -1611,6 +1991,28 @@ def inspect(
             help="PostgreSQL schema name (default: 'public')",
         ),
     ] = None,
+    compliance_check: Annotated[
+        list[str] | None,
+        typer.Option(
+            "--compliance-check",
+            help="Run compliance coverage check for profile(s): gdpr, hipaa, pci-dss",
+        ),
+    ] = None,
+    compliance_output: Annotated[
+        str,
+        typer.Option(
+            "--compliance-output",
+            help="Compliance report output format: human or json",
+        ),
+    ] = "human",
+    sample_rows: Annotated[
+        int,
+        typer.Option(
+            "--sample-rows",
+            help="Rows sampled per table for compliance value scanning",
+            min=1,
+        ),
+    ] = 100,
 ):
     """
     Inspect database schema without extracting data.
@@ -1635,6 +2037,8 @@ def inspect(
                 from dbslice.input_validators import validate_table_name
 
                 validate_table_name(table)
+            if compliance_output not in {"human", "json"}:
+                raise ValueError("--compliance-output must be one of: human, json")
         except (ValidationError, ValueError) as e:
             console.print(f"[red]Validation Error:[/red] {e}")
             raise typer.Exit(1)
@@ -1654,6 +2058,27 @@ def inspect(
             with console.status("[bold blue]Introspecting schema...[/bold blue]"):
                 db_schema = adapter.get_schema()
 
+            if compliance_check:
+                from dbslice.compliance.profiles import get_profile
+
+                for profile_name in compliance_check:
+                    try:
+                        get_profile(profile_name)
+                    except ValueError as e:
+                        console.print(f"[red]Compliance Error:[/red] {e}")
+                        raise typer.Exit(1)
+
+                _run_compliance_check_report(
+                    adapter=adapter,
+                    db_schema=db_schema,
+                    profiles=compliance_check,
+                    sample_rows=sample_rows,
+                    output_mode=compliance_output,
+                    target_table=table,
+                    console=console,
+                )
+                return
+
             if table:
                 table_info = db_schema.get_table(table)
                 if not table_info:
@@ -1729,9 +2154,7 @@ def inspect(
                     for src_table, src_col, target_table in implicit_candidates[:25]:
                         console.print(f"  {src_table}.{src_col} -> [cyan]{target_table}[/cyan].id")
                     if len(implicit_candidates) > 25:
-                        console.print(
-                            f"  [dim]... and {len(implicit_candidates) - 25} more[/dim]"
-                        )
+                        console.print(f"  [dim]... and {len(implicit_candidates) - 25} more[/dim]")
                     console.print(
                         "  [dim]Tip: define virtual_foreign_keys for confirmed implicit links.[/dim]"
                     )
@@ -1752,6 +2175,134 @@ def inspect(
         raise typer.Exit(1)
 
 
+@app.command("verify-manifest")
+def verify_manifest(
+    manifest_file: Annotated[
+        Path,
+        typer.Argument(help="Path to compliance manifest JSON file"),
+    ],
+    verify_signature: Annotated[
+        bool,
+        typer.Option(
+            "--verify-signature/--no-verify-signature",
+            help="Verify HMAC manifest signature when present",
+        ),
+    ] = True,
+    key_env: Annotated[
+        str,
+        typer.Option(
+            "--key-env",
+            help="Environment variable containing manifest signing key",
+        ),
+    ] = "DBSLICE_MANIFEST_SIGNING_KEY",
+):
+    """Verify compliance manifest output hashes and optional HMAC signature."""
+    try:
+        if not manifest_file.exists():
+            console.print(f"[red]Error:[/red] Manifest file not found: {manifest_file}")
+            raise typer.Exit(1)
+        if not manifest_file.is_file():
+            console.print(f"[red]Error:[/red] Not a file: {manifest_file}")
+            raise typer.Exit(1)
+
+        try:
+            payload = json.loads(manifest_file.read_text(encoding="utf-8"))
+        except json.JSONDecodeError as e:
+            console.print(f"[red]Error:[/red] Invalid manifest JSON: {e}")
+            raise typer.Exit(1)
+
+        from dbslice.compliance.manifest import verify_manifest_payload
+
+        signing_key: str | None = None
+        if verify_signature:
+            signing_key = os.environ.get(key_env)
+
+        valid, errors = verify_manifest_payload(
+            payload=payload,
+            manifest_path=manifest_file,
+            signing_key=signing_key,
+            verify_signature=verify_signature,
+        )
+
+        if valid:
+            console.print("[green]Manifest verification passed[/green]")
+            return
+
+        console.print("[red]Manifest verification failed:[/red]")
+        for error in errors:
+            console.print(f"  - {error}")
+        raise typer.Exit(1)
+    except typer.Exit:
+        raise
+    except Exception as e:
+        console.print(f"[red]Unexpected error:[/red] {e}")
+        raise typer.Exit(1)
+
+
+@app.command()
+def map(
+    database_url: Annotated[
+        str | None,
+        typer.Argument(help="Optional database URL (can also enter in the UI)"),
+    ] = None,
+    schema: Annotated[
+        str | None,
+        typer.Option("--schema", help="PostgreSQL schema name (default: 'public')"),
+    ] = None,
+    port: Annotated[
+        int,
+        typer.Option("--port", "-p", help="Port for local mapping UI server", min=1024, max=65535),
+    ] = 9473,
+    open_browser: Annotated[
+        bool,
+        typer.Option("--open-browser/--no-open-browser", help="Auto-open browser"),
+    ] = True,
+):
+    """
+    Launch local column-mapping UI.
+
+    Opens a browser-based interface for reviewing database columns and
+    configuring anonymization mappings. Generates a ready-to-use dbslice.yaml.
+
+    The server runs locally on 127.0.0.1 only and requires a session token.
+
+    Examples:
+
+        # Launch mapping UI (enter URL in browser)
+        dbslice map
+
+        # Launch with pre-filled database URL
+        dbslice map postgresql://localhost/myapp
+
+        # Custom port, no auto-open
+        dbslice map postgresql://localhost/myapp --port 8888 --no-open-browser
+    """
+    from dbslice.mapping.server import MappingServer
+
+    resolved_url = database_url
+    if resolved_url is None:
+        resolved_url = os.environ.get("DATABASE_URL")
+
+    server = MappingServer(
+        port=port,
+        database_url=resolved_url or "",
+        schema=schema,
+    )
+
+    console.print("\n[bold]dbslice Column Mapping[/bold]")
+    console.print(f"  URL: [cyan]{server.url}[/cyan]")
+    console.print("  Bound to 127.0.0.1 only (local access)")
+    console.print("\n  Press Ctrl+C to stop.\n")
+
+    try:
+        server.start(open_browser=open_browser)
+    except OSError as e:
+        console.print(f"[red]Error:[/red] Could not start server on port {port}: {e}")
+        raise typer.Exit(1)
+    except KeyboardInterrupt:
+        console.print("\n[dim]Mapping UI stopped.[/dim]")
+
+
 @app.command()
 def docs(
     port: Annotated[
diff --git a/src/dbslice/compliance/__init__.py b/src/dbslice/compliance/__init__.py
new file mode 100644
index 0000000..d04022b
--- /dev/null
+++ b/src/dbslice/compliance/__init__.py
@@ -0,0 +1,18 @@
+from dbslice.compliance.manifest import (
+    ComplianceManifest,
+    ManifestEntry,
+    verify_manifest_payload,
+)
+from dbslice.compliance.profiles import ComplianceProfile, get_profile, list_profiles
+from dbslice.compliance.scanner import PIIDetection, PIIScanner
+
+__all__ = [
+    "ComplianceManifest",
+    "ComplianceProfile",
+    "ManifestEntry",
+    "PIIDetection",
+    "PIIScanner",
+    "get_profile",
+    "list_profiles",
+    "verify_manifest_payload",
+]
diff --git a/src/dbslice/compliance/manifest.py b/src/dbslice/compliance/manifest.py
new file mode 100644
index 0000000..fb33e52
--- /dev/null
+++ b/src/dbslice/compliance/manifest.py
@@ -0,0 +1,319 @@
+import hashlib
+import hmac
+import json
+from dataclasses import asdict, dataclass, field
+from datetime import datetime, timezone
+from pathlib import Path
+from typing import Any
+
+from dbslice import __version__
+from dbslice.compliance.scanner import PIIDetection
+
+
+@dataclass
+class ManifestFieldEntry:
+    """Record of anonymization applied to a single field."""
+
+    table: str
+    column: str
+    method: str
+    category: str = ""  # e.g., "direct_identifier", "hipaa_identifier_7"
+
+
+@dataclass
+class ManifestNullEntry:
+    """Record of a field forced to NULL."""
+
+    table: str
+    column: str
+    reason: str  # e.g., "security_null_pattern"
+
+
+@dataclass
+class ManifestWarning:
+    """A compliance warning."""
+
+    table: str
+    column: str
+    reason: str
+    severity: str = "warning"  # "warning" or "error"
+
+
+@dataclass
+class ManifestTableEntry:
+    """Per-table manifest data."""
+
+    rows_extracted: int = 0
+    fields_masked: list[ManifestFieldEntry] = field(default_factory=list)
+    fields_nulled: list[ManifestNullEntry] = field(default_factory=list)
+    fields_preserved_fk: list[str] = field(default_factory=list)
+    fields_unmasked: list[str] = field(default_factory=list)
+
+
+@dataclass
+class ManifestEntry:
+    """A single entry in the compliance manifest (for external use)."""
+
+    table: str
+    column: str
+    action: str  # "masked", "nulled", "preserved_fk", "unmasked"
+    method: str = ""
+    reason: str = ""
+
+
+@dataclass
+class ComplianceManifest:
+    """
+    Full compliance audit manifest.
+
+    Generated alongside extraction output to document what anonymization
+    was applied and provide evidence for compliance audits.
+    """
+
+    extraction_id: str = ""
+    timestamp: str = ""
+    dbslice_version: str = ""
+    masking_type: str = "deterministic_pseudonymization"
+    compliance_profiles: list[str] = field(default_factory=list)
+    tables: dict[str, ManifestTableEntry] = field(default_factory=dict)
+    pii_scan_results: list[PIIDetection] = field(default_factory=list)
+    warnings: list[ManifestWarning] = field(default_factory=list)
+    seed_hash: str = ""
+    output_file_hashes: dict[str, str] = field(default_factory=dict)
+    breakglass: dict[str, str] = field(default_factory=dict)
+    signature_algorithm: str = ""
+    signature: str = ""
+
+    def initialize(
+        self,
+        extraction_id: str,
+        compliance_profiles: list[str] | None = None,
+        anonymization_seed: str | None = None,
+        deterministic: bool = True,
+    ) -> None:
+        """
+        Initialize manifest metadata.
+
+        Args:
+            extraction_id: Unique ID for this extraction
+            compliance_profiles: Names of active compliance profiles
+            anonymization_seed: The anonymization seed (hashed, not stored raw)
+            deterministic: Whether deterministic mode is used
+        """
+        self.extraction_id = extraction_id
+        self.timestamp = datetime.now(timezone.utc).isoformat()
+        self.dbslice_version = __version__
+        self.compliance_profiles = compliance_profiles or []
+        self.masking_type = (
+            "deterministic_pseudonymization"
+            if deterministic
+            else "non_deterministic_pseudonymization"
+        )
+        if anonymization_seed:
+            self.seed_hash = (
+                f"sha256:{hashlib.sha256(anonymization_seed.encode()).hexdigest()[:16]}"
+            )
+
+    def record_masked_field(
+        self,
+        table: str,
+        column: str,
+        method: str,
+        category: str = "",
+    ) -> None:
+        """Record that a field was masked/anonymized."""
+        entry = self.tables.setdefault(table, ManifestTableEntry())
+        entry.fields_masked.append(
+            ManifestFieldEntry(table=table, column=column, method=method, category=category)
+        )
+
+    def record_nulled_field(self, table: str, column: str, reason: str) -> None:
+        """Record that a field was set to NULL."""
+        entry = self.tables.setdefault(table, ManifestTableEntry())
+        entry.fields_nulled.append(ManifestNullEntry(table=table, column=column, reason=reason))
+
+    def record_fk_preserved(self, table: str, column: str) -> None:
+        """Record that a FK column was preserved (not anonymized)."""
+        entry = self.tables.setdefault(table, ManifestTableEntry())
+        if column not in entry.fields_preserved_fk:
+            entry.fields_preserved_fk.append(column)
+
+    def record_unmasked_field(self, table: str, column: str) -> None:
+        """Record that a field was not masked."""
+        entry = self.tables.setdefault(table, ManifestTableEntry())
+        if column not in entry.fields_unmasked:
+            entry.fields_unmasked.append(column)
+
+    def set_table_row_count(self, table: str, count: int) -> None:
+        """Set the extracted row count for a table."""
+        entry = self.tables.setdefault(table, ManifestTableEntry())
+        entry.rows_extracted = count
+
+    def add_warning(
+        self,
+        table: str,
+        column: str,
+        reason: str,
+        severity: str = "warning",
+    ) -> None:
+        """Add a compliance warning."""
+        self.warnings.append(
+            ManifestWarning(table=table, column=column, reason=reason, severity=severity)
+        )
+
+    def add_pii_detections(self, detections: list[PIIDetection]) -> None:
+        """Add PII scan results."""
+        self.pii_scan_results.extend(detections)
+
+    def set_breakglass(self, reason: str, ticket_id: str) -> None:
+        """Record breakglass metadata for raw/unsafe extraction exceptions."""
+        self.breakglass = {
+            "reason": reason,
+            "ticket_id": ticket_id,
+            "timestamp": datetime.now(timezone.utc).isoformat(),
+        }
+
+    def add_output_file_hashes(
+        self, output_files: list[Path], base_dir: Path | None = None
+    ) -> None:
+        """Record deterministic SHA256 hashes for generated output files."""
+        root = (base_dir or Path.cwd()).resolve()
+        hashes: dict[str, str] = {}
+
+        for file_path in sorted((Path(p).resolve() for p in output_files), key=lambda p: str(p)):
+            if not file_path.exists() or not file_path.is_file():
+                continue
+            digest = _sha256_file(file_path)
+            key: str
+            try:
+                key = str(file_path.relative_to(root))
+            except ValueError:
+                key = str(file_path)
+            hashes[key] = f"sha256:{digest}"
+
+        self.output_file_hashes = hashes
+
+    def sign(self, signing_key: str) -> None:
+        """Sign manifest payload using HMAC-SHA256."""
+        payload = self._signable_dict()
+        digest = _manifest_hmac(payload, signing_key)
+        self.signature_algorithm = "hmac-sha256"
+        self.signature = f"hmac-sha256:{digest}"
+
+    def to_dict(self) -> dict[str, Any]:
+        """Convert to a JSON-serializable dictionary."""
+        tables_dict: dict[str, Any] = {}
+        for table_name, table_entry in self.tables.items():
+            tables_dict[table_name] = {
+                "rows_extracted": table_entry.rows_extracted,
+                "fields_masked": [
+                    {"column": f.column, "method": f.method, "category": f.category}
+                    for f in table_entry.fields_masked
+                ],
+                "fields_nulled": [
+                    {"column": f.column, "reason": f.reason} for f in table_entry.fields_nulled
+                ],
+                "fields_preserved_fk": table_entry.fields_preserved_fk,
+                "fields_unmasked": table_entry.fields_unmasked,
+            }
+
+        pii_results = [
+            {
+                "table": d.table,
+                "column": d.column,
+                "pattern": d.pattern_name,
+                "match_count": d.match_count,
+                "sample_size": d.sample_size,
+                "confidence": d.confidence,
+            }
+            for d in self.pii_scan_results
+        ]
+
+        warnings = [asdict(w) for w in self.warnings]
+
+        return {
+            "extraction_id": self.extraction_id,
+            "timestamp": self.timestamp,
+            "dbslice_version": self.dbslice_version,
+            "masking_type": self.masking_type,
+            "compliance_profiles": self.compliance_profiles,
+            "seed_hash": self.seed_hash,
+            "tables": tables_dict,
+            "pii_scan_results": pii_results,
+            "warnings": warnings,
+            "output_file_hashes": self.output_file_hashes,
+            "breakglass": self.breakglass,
+            "signature_algorithm": self.signature_algorithm,
+            "signature": self.signature,
+        }
+
+    def to_json(self, pretty: bool = True) -> str:
+        """Serialize manifest to JSON string."""
+        return json.dumps(self.to_dict(), indent=2 if pretty else None, default=str)
+
+    def _signable_dict(self) -> dict[str, Any]:
+        payload = self.to_dict()
+        payload.pop("signature_algorithm", None)
+        payload.pop("signature", None)
+        return payload
+
+
+def _sha256_file(path: Path) -> str:
+    digest = hashlib.sha256()
+    with path.open("rb") as handle:
+        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
+            digest.update(chunk)
+    return digest.hexdigest()
+
+
+def _manifest_hmac(payload: dict[str, Any], signing_key: str) -> str:
+    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"), ensure_ascii=True)
+    return hmac.new(signing_key.encode("utf-8"), canonical.encode("utf-8"), hashlib.sha256).hexdigest()
+
+
+def verify_manifest_payload(
+    payload: dict[str, Any],
+    manifest_path: Path,
+    signing_key: str | None = None,
+    verify_signature: bool = True,
+) -> tuple[bool, list[str]]:
+    """Verify output file hashes and optional HMAC signature for a manifest payload."""
+    errors: list[str] = []
+    manifest_dir = manifest_path.parent.resolve()
+
+    file_hashes = payload.get("output_file_hashes", {})
+    if not isinstance(file_hashes, dict):
+        errors.append("'output_file_hashes' must be an object")
+        return False, errors
+
+    for rel_path, expected_hash in file_hashes.items():
+        if not isinstance(rel_path, str) or not isinstance(expected_hash, str):
+            errors.append("Invalid output_file_hashes entry")
+            continue
+        target = (manifest_dir / rel_path).resolve()
+        if not target.exists():
+            errors.append(f"Missing output file: {rel_path}")
+            continue
+        actual = f"sha256:{_sha256_file(target)}"
+        if actual != expected_hash:
+            errors.append(
+                f"Hash mismatch for {rel_path}: expected {expected_hash}, got {actual}"
+            )
+
+    if verify_signature:
+        signature = payload.get("signature")
+        signature_algorithm = payload.get("signature_algorithm")
+        if signature:
+            if signature_algorithm != "hmac-sha256":
+                errors.append("Unsupported signature_algorithm (expected hmac-sha256)")
+            elif signing_key is None:
+                errors.append("Manifest is signed but no signing key was provided")
+            else:
+                signable = dict(payload)
+                signable.pop("signature", None)
+                signable.pop("signature_algorithm", None)
+                expected = f"hmac-sha256:{_manifest_hmac(signable, signing_key)}"
+                if not hmac.compare_digest(str(signature).encode(), expected.encode()):
+                    errors.append("Manifest signature verification failed")
+
+    return len(errors) == 0, errors
diff --git a/src/dbslice/compliance/profiles.py b/src/dbslice/compliance/profiles.py
new file mode 100644
index 0000000..fe9b28d
--- /dev/null
+++ b/src/dbslice/compliance/profiles.py
@@ -0,0 +1,366 @@
+from dataclasses import dataclass, field
+
+
+@dataclass(frozen=True)
+class ComplianceProfile:
+    """A compliance profile defining anonymization requirements for a regulatory framework."""
+
+    name: str
+    """Profile identifier (e.g., 'gdpr', 'hipaa', 'pci-dss')."""
+
+    display_name: str
+    """Human-readable name (e.g., 'GDPR', 'HIPAA Safe Harbor')."""
+
+    description: str
+    """Brief description of what this profile covers."""
+
+    required_column_patterns: dict[str, str] = field(default_factory=dict)
+    """Column-name substring -> provider/transformer name; matching columns MUST be anonymized."""
+
+    required_null_patterns: list[str] = field(default_factory=list)
+    """Column name patterns that must be NULLed (security-sensitive data)."""
+
+    value_scan_patterns: list[str] = field(default_factory=list)
+    """Names of value-based PII scanner patterns to run (e.g., 'email', 'ssn', 'credit_card')."""
+
+    warn_freetext_columns: list[str] = field(default_factory=list)
+    """Column name patterns that may contain embedded PII in free text."""
+
+    identifiers: list[str] = field(default_factory=list)
+    """List of identifier categories this profile covers (for compliance reports)."""
+
+
+GDPR_PROFILE = ComplianceProfile(
+    name="gdpr",
+    display_name="GDPR",
+    description=(
+        "EU General Data Protection Regulation. Covers direct identifiers and "
+        "flags quasi-identifiers that could enable singling out or linkage attacks."
+    ),
+    required_column_patterns={
+        # Direct identifiers
+        "email": "email",
+        "first_name": "first_name",
+        "last_name": "last_name",
+        "firstname": "first_name",
+        "lastname": "last_name",
+        "full_name": "name",
+        "fullname": "name",
+        "name": "name",
+        "phone": "phone_number",
+        "mobile": "phone_number",
+        "fax": "phone_number",
+        # Address / location
+        "address": "address",
+        "street": "street_address",
+        "city": "city",
+        "zip": "zipcode",
+        "zipcode": "zipcode",
+        "postal": "zipcode",
+        # Identity documents
+        "ssn": "ssn",
+        "passport": "passport_number",
+        "driver_license": "license_plate",
+        # Financial
+        "credit_card": "credit_card_number",
+        "card_number": "credit_card_number",
+        "iban": "iban",
+        "bank_account": "bban",
+        "account_number": "bban",
+        # Network identifiers
+        "ip_address": "ipv4",
+        "ipaddress": "ipv4",
+        "ip": "ipv4",
+        "ipv6": "ipv6",
+        "mac_address": "mac_address",
+        # Online identifiers
+        "username": "user_name",
+        "user_name": "user_name",
+        # Biographic
+        "dob": "date_of_birth",
+        "date_of_birth": "date_of_birth",
+        "birthdate": "date_of_birth",
+        "birth_date": "date_of_birth",
+    },
+    required_null_patterns=[
+        "password",
+        "passwd",
+        "pwd",
+        "hash",
+        "salt",
+        "token",
+        "secret",
+        "api_key",
+        "apikey",
+        "private_key",
+        "public_key",
+        "certificate",
+        "session_id",
+    ],
+    value_scan_patterns=["email", "phone", "ipv4", "ipv6"],
+    warn_freetext_columns=[
+        "note",
+        "notes",
+        "comment",
+        "comments",
+        "description",
+        "message",
+        "body",
+        "content",
+        "text",
+        "bio",
+        "about",
+        "reason",
+        "feedback",
+        "review",
+    ],
+    identifiers=[
+        "Names",
+        "Email addresses",
+        "Phone numbers",
+        "Physical addresses",
+        "IP addresses",
+        "Date of birth",
+        "Identity documents (SSN, passport)",
+        "Financial identifiers (credit card, IBAN)",
+        "Online identifiers (username)",
+        "Biometric identifiers (flagged via value scan)",
+    ],
+)
+
+HIPAA_PROFILE = ComplianceProfile(
+    name="hipaa",
+    display_name="HIPAA Safe Harbor",
+    description=(
+        "HIPAA Safe Harbor de-identification method. Requires removal or masking "
+        "of all 18 specified identifier types per 45 CFR 164.514(b)(2)."
+    ),
+    required_column_patterns={
+        # 1. Names
+        "name": "name",
+        "first_name": "first_name",
+        "last_name": "last_name",
+        "firstname": "first_name",
+        "lastname": "last_name",
+        "full_name": "name",
+        "fullname": "name",
+        # 2. Geographic (smaller than state) — Safe Harbor requires ZIP3 with population check
+        "address": "address",
+        "street": "street_address",
+        "city": "city",
+        "zip": "hipaa_zip3",
+        "zipcode": "hipaa_zip3",
+        "postal": "hipaa_zip3",
+        "county": "city",
+        # 3. Dates (except year) — Safe Harbor requires year-only
+        "dob": "year_only",
+        "date_of_birth": "year_only",
+        "birthdate": "year_only",
+        "birth_date": "year_only",
+        "admission_date": "year_only",
+        "discharge_date": "year_only",
+        "death_date": "year_only",
+        "service_date": "year_only",
+        "visit_date": "year_only",
+        # 4. Phone numbers
+        "phone": "phone_number",
+        "mobile": "phone_number",
+        "telephone": "phone_number",
+        "cell": "phone_number",
+        # 5. Fax numbers
+        "fax": "phone_number",
+        # 6. Email addresses
+        "email": "email",
+        # 7. SSN
+        "ssn": "ssn",
+        "social_security": "ssn",
+        # 8. Medical record numbers
+        "medical_record": "pystr",
+        "mrn": "pystr",
+        "patient_id": "pystr",
+        # 9. Health plan beneficiary numbers
+        "beneficiary": "pystr",
+        "member_id": "pystr",
+        "subscriber_id": "pystr",
+        # 10. Account numbers
+        "account_number": "bban",
+        "bank_account": "bban",
+        # 11. Certificate/license numbers
+        "license_number": "license_plate",
+        "certificate_number": "pystr",
+        "driver_license": "license_plate",
+        "passport": "passport_number",
+        # 12. Vehicle identifiers
+        "vin": "pystr",
+        "vehicle_id": "pystr",
+        "license_plate": "license_plate",
+        # 13. Device identifiers
+        "device_id": "pystr",
+        "serial_number": "pystr",
+        "device_serial": "pystr",
+        # 14. Web URLs
+        "url": "url",
+        "website": "url",
+        # 15. IP addresses
+        "ip_address": "ipv4",
+        "ipaddress": "ipv4",
+        "ip": "ipv4",
+        "ipv6": "ipv6",
+        # 16. Biometric identifiers (column names are hints)
+        "fingerprint": "pystr",
+        "biometric": "pystr",
+        "retina": "pystr",
+        "voiceprint": "pystr",
+        # 17. Full-face photographs (binary columns - flag as warning)
+        # 18. Any other unique identifier
+        "unique_id": "pystr",
+    },
+    required_null_patterns=[
+        "password",
+        "passwd",
+        "pwd",
+        "hash",
+        "salt",
+        "token",
+        "secret",
+        "api_key",
+        "apikey",
+        "private_key",
+        "public_key",
+        "certificate",
+        "session_id",
+    ],
+    value_scan_patterns=["email", "ssn", "phone", "credit_card", "ipv4", "ipv6"],
+    warn_freetext_columns=[
+        "note",
+        "notes",
+        "comment",
+        "comments",
+        "description",
+        "message",
+        "body",
+        "content",
+        "text",
+        "diagnosis",
+        "treatment",
+        "history",
+        "narrative",
+        "clinical_notes",
+        "progress_notes",
+        "discharge_summary",
+    ],
+    identifiers=[
+        "1. Names",
+        "2. Geographic data (smaller than state)",
+        "3. Dates (except year)",
+        "4. Phone numbers",
+        "5. Fax numbers",
+        "6. Email addresses",
+        "7. Social Security numbers",
+        "8. Medical record numbers",
+        "9. Health plan beneficiary numbers",
+        "10. Account numbers",
+        "11. Certificate/license numbers",
+        "12. Vehicle identifiers",
+        "13. Device identifiers",
+        "14. Web URLs",
+        "15. IP addresses",
+        "16. Biometric identifiers",
+        "17. Full-face photographs (flag only)",
+        "18. Any other unique identifying number",
+    ],
+)
+
+PCI_DSS_PROFILE = ComplianceProfile(
+    name="pci-dss",
+    display_name="PCI-DSS v4.0",
+    description=(
+        "Payment Card Industry Data Security Standard v4.0. "
+        "Real PANs are PROHIBITED in dev/test environments (Req 6.5.5). "
+        "Cardholder data must be fully replaced with synthetic data."
+    ),
+    required_column_patterns={
+        # Primary Account Number (PAN)
+        "credit_card": "credit_card_number",
+        "card_number": "credit_card_number",
+        "card_num": "credit_card_number",
+        "pan": "credit_card_number",
+        "account_number": "bban",
+        # Cardholder name
+        "cardholder": "name",
+        "card_holder": "name",
+        "cardholder_name": "name",
+        # Expiration
+        "expiry": "credit_card_expire",
+        "expiration": "credit_card_expire",
+        "exp_date": "credit_card_expire",
+        "card_expiry": "credit_card_expire",
+        # Service code (3 digits) and card verification codes (3-4 digits)
+        "service_code": "pystr",
+        "cvv": "credit_card_security_code",
+        "cvc": "credit_card_security_code",
+        "cvv2": "credit_card_security_code",
+    },
+    required_null_patterns=[
+        # Sensitive authentication data - MUST be removed post-authorization
+        "pin",
+        "pin_block",
+        "pin_number",
+        "cvv",
+        "cvc",
+        "cvv2",
+        "cvc2",
+        "magnetic_stripe",
+        "track_data",
+        "track1",
+        "track2",
+    ],
+    value_scan_patterns=["credit_card"],
+    warn_freetext_columns=[
+        "note",
+        "notes",
+        "comment",
+        "description",
+        "transaction_detail",
+        "memo",
+    ],
+    identifiers=[
+        "Primary Account Number (PAN)",
+        "Cardholder name",
+        "Expiration date",
+        "Service code",
+        "Sensitive authentication data (CVV/PIN)",
+    ],
+)
+
+
+_PROFILES: dict[str, ComplianceProfile] = {
+    "gdpr": GDPR_PROFILE,
+    "hipaa": HIPAA_PROFILE,
+    "pci-dss": PCI_DSS_PROFILE,
+}
+
+
+def get_profile(name: str) -> ComplianceProfile:
+    """
+    Get a compliance profile by name.
+
+    Args:
+        name: Profile name (case-insensitive)
+
+    Returns:
+        ComplianceProfile
+
+    Raises:
+        ValueError: If profile not found
+    """
+    profile = _PROFILES.get(name.lower())
+    if profile is None:
+        available = ", ".join(sorted(_PROFILES.keys()))
+        raise ValueError(f"Unknown compliance profile '{name}'. Available: {available}")
+    return profile
+
+
+def list_profiles() -> list[ComplianceProfile]:
+    """Return all available compliance profiles."""
+    return list(_PROFILES.values())
diff --git a/src/dbslice/compliance/scanner.py b/src/dbslice/compliance/scanner.py
new file mode 100644
index 0000000..8ffaeee
--- /dev/null
+++ b/src/dbslice/compliance/scanner.py
@@ -0,0 +1,203 @@
+import re
+from dataclasses import dataclass, field
+from typing import Any
+
+
+@dataclass
+class PIIDetection:
+    """A single PII detection result."""
+
+    table: str
+    column: str
+    pattern_name: str
+    match_count: int
+    sample_size: int
+    confidence: str  # "high", "medium", "low"
+
+    @property
+    def match_rate(self) -> float:
+        """Fraction of sampled values that matched."""
+        if self.sample_size == 0:
+            return 0.0
+        return self.match_count / self.sample_size
+
+
+# Compiled regex patterns for PII detection
+_PII_PATTERNS: dict[str, tuple[re.Pattern[str], str]] = {
+    "email": (
+        re.compile(r"\b[A-Za-z0-9._%+\-]+@[A-Za-z0-9.\-]+\.[A-Za-z]{2,}\b"),
+        "high",
+    ),
+    "ssn": (
+        re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
+        "high",
+    ),
+    "phone": (
+        re.compile(r"\b(?:\+?1[-.\s]?)?\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}\b"),
+        "medium",
+    ),
+    "ipv4": (
+        re.compile(r"\b(?:(?:25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(?:25[0-5]|2[0-4]\d|[01]?\d\d?)\b"),
+        "medium",
+    ),
+    "ipv6": (
+        re.compile(r"\b(?:[0-9a-fA-F]{1,4}:){7}[0-9a-fA-F]{1,4}\b"),
+        "medium",
+    ),
+    "credit_card": (
+        re.compile(r"(?<!\d)(?:\d[\s-]?){13,19}(?!\d)"),
+        "high",  # Candidates count as matches only if they pass the Luhn check
+    ),
+}
+
+_PAN_CANDIDATE_RE = re.compile(r"(?<!\d)(?:\d[\s-]?){13,19}(?!\d)")
+
+
+def _luhn_check(number: str) -> bool:
+    """Validate a number string using the Luhn algorithm."""
+    digits = [int(d) for d in number if d.isdigit()]
+    if len(digits) < 13:
+        return False
+    checksum = 0
+    for i, d in enumerate(reversed(digits)):
+        if i % 2 == 1:
+            d *= 2
+            if d > 9:
+                d -= 9
+        checksum += d
+    return checksum % 10 == 0
+
+
+def _extract_pan_candidates(text: str) -> list[str]:
+    """Extract PAN-like candidates and normalize separators before Luhn checks."""
+    candidates: list[str] = []
+    for match in _PAN_CANDIDATE_RE.findall(text):
+        digits_only = "".join(ch for ch in match if ch.isdigit())
+        if 13 <= len(digits_only) <= 19:
+            candidates.append(digits_only)
+    return candidates
+
+
+@dataclass
+class PIIScanner:
+    """
+    Scans data values for PII using regex patterns.
+
+    Usage:
+        scanner = PIIScanner(patterns=["email", "ssn", "credit_card"])
+        detections = scanner.scan_column("users", "notes", sample_values)
+    """
+
+    patterns: list[str] = field(default_factory=lambda: list(_PII_PATTERNS.keys()))
+    """Which PII patterns to scan for."""
+
+    min_match_rate: float = 0.1
+    """Minimum fraction of values that must match to report a detection (default: 10%)."""
+
+    def scan_column(
+        self,
+        table: str,
+        column: str,
+        values: list[Any],
+    ) -> list[PIIDetection]:
+        """
+        Scan a list of column values for PII patterns.
+
+        Args:
+            table: Table name
+            column: Column name
+            values: Sample of values from the column
+
+        Returns:
+            List of PIIDetection results for patterns that matched
+        """
+        # Stringify non-null values and drop blank ones
+        str_values = [str(v) for v in values if v is not None and str(v).strip()]
+        if not str_values:
+            return []
+
+        detections: list[PIIDetection] = []
+        sample_size = len(str_values)
+
+        for pattern_name in self.patterns:
+            if pattern_name not in _PII_PATTERNS:
+                continue
+
+            regex, base_confidence = _PII_PATTERNS[pattern_name]
+            match_count = 0
+
+            for val in str_values:
+                if pattern_name == "credit_card":
+                    pan_candidates = _extract_pan_candidates(val)
+                    if any(_luhn_check(candidate) for candidate in pan_candidates):
+                        match_count += 1
+                else:
+                    matches = regex.findall(val)
+                    if matches:
+                        match_count += 1
+
+            if match_count == 0:
+                continue
+
+            match_rate = match_count / sample_size
+            if match_rate < self.min_match_rate:
+                continue
+
+            # Adjust confidence based on match rate
+            if match_rate >= 0.8:
+                confidence = "high"
+            elif match_rate >= 0.3:
+                confidence = base_confidence
+            else:
+                confidence = "low" if base_confidence == "medium" else "medium"
+
+            detections.append(
+                PIIDetection(
+                    table=table,
+                    column=column,
+                    pattern_name=pattern_name,
+                    match_count=match_count,
+                    sample_size=sample_size,
+                    confidence=confidence,
+                )
+            )
+
+        return detections
+
+    def scan_rows(
+        self,
+        table: str,
+        rows: list[dict[str, Any]],
+        skip_columns: set[str] | None = None,
+    ) -> list[PIIDetection]:
+        """
+        Scan all text columns in a set of rows for PII.
+
+        Args:
+            table: Table name
+            rows: List of row dictionaries
+            skip_columns: Columns to skip (e.g., already anonymized)
+
+        Returns:
+            List of PIIDetection results
+        """
+        if not rows:
+            return []
+
+        skip = skip_columns or set()
+        all_detections: list[PIIDetection] = []
+
+        # Collect values per column
+        columns: dict[str, list[Any]] = {}
+        for row in rows:
+            for col, val in row.items():
+                if col in skip:
+                    continue
+                if val is not None and isinstance(val, (str, int, float)):
+                    columns.setdefault(col, []).append(val)
+
+        for col, values in columns.items():
+            detections = self.scan_column(table, col, values)
+            all_detections.extend(detections)
+
+        return all_detections
diff --git a/src/dbslice/compliance/transformers.py b/src/dbslice/compliance/transformers.py
new file mode 100644
index 0000000..9df44ba
--- /dev/null
+++ b/src/dbslice/compliance/transformers.py
@@ -0,0 +1,183 @@
+from __future__ import annotations
+
+import datetime
+import re
+from typing import Any
+
+# Per 45 CFR 164.514(b)(2)(i)(B): Geographic data smaller than state must be
+# removed, EXCEPT the initial 3 digits of a ZIP code may be retained if the
+# geographic unit formed by combining all ZIP codes with the same 3 initial
+# digits contains more than 20,000 people.
+#
+# The following 3-digit ZIP prefixes cover 20,000 or fewer people per US Census
+# and must be changed to "000" under Safe Harbor.
+#
+# Source: US Census Bureau, derived from ZCTA population data.
+# These prefixes are stable across census cycles. Last verified: 2020 Census.
+
+_LOW_POPULATION_ZIP3: frozenset[str] = frozenset({
+    "036",  # NH
+    "059",  # VT
+    "063",  # CT
+    "102",  # NY (small area)
+    "203",  # DC (small overlap)
+    "556",  # MN
+    "692",  # NE
+    "790",  # TX (small area)
+    "821",  # WY
+    "823",  # WY
+    "830",  # WY
+    "831",  # WY
+    "878",  # NM
+    "879",  # NM
+    "884",  # NM
+    "890",  # NV
+    "893",  # NV
+})
+
+
+def hipaa_safe_harbor_zip3(value: Any) -> str:
+    """
+    HIPAA Safe Harbor ZIP code transformation.
+
+    Retains only the first 3 digits of a ZIP code. If the 3-digit prefix
+    has a population of 20,000 or fewer (per Census data), returns "000" instead.
+
+    Per 45 CFR 164.514(b)(2)(i)(B).
+
+    Args:
+        value: Original ZIP code (string or int)
+
+    Returns:
+        3-digit ZIP prefix, or "000" if low-population area
+    """
+    # Zero-pad integer ZIPs (02134 stored as 2134); strip non-digits ("12345-6789").
+    raw = f"{value:05d}" if isinstance(value, int) else str(value).strip()
+    digits = re.sub(r"[^0-9]", "", raw)
+    if len(digits) < 3:
+        return "000"
+
+    prefix = digits[:3]
+    if prefix in _LOW_POPULATION_ZIP3:
+        return "000"
+    return prefix
+
+
+def year_only(value: Any) -> str:
+    """
+    HIPAA Safe Harbor date transformation.
+
+    Extracts only the year from a date value. Per 45 CFR 164.514(b)(2)(i)(C),
+    all date elements (except year) must be removed for dates directly related
+    to an individual.
+
+    Args:
+        value: Original date value (date, datetime, string, or int)
+
+    Returns:
+        Year string (e.g., "1985")
+    """
+    if value is None:
+        return ""
+
+    # datetime/date objects
+    if isinstance(value, (datetime.datetime, datetime.date)):
+        return str(value.year)
+
+    raw = str(value).strip()
+
+    # ISO format: 2024-03-15 or 2024-03-15T10:30:00
+    iso_match = re.match(r"(\d{4})-\d{2}-\d{2}", raw)
+    if iso_match:
+        return iso_match.group(1)
+
+    # US format: 03/15/2024 or 03-15-2024
+    us_match = re.match(r"\d{1,2}[/-]\d{1,2}[/-](\d{4})", raw)
+    if us_match:
+        return us_match.group(1)
+
+    # Just a 4-digit year
+    year_match = re.match(r"^(\d{4})$", raw)
+    if year_match:
+        return year_match.group(1)
+
+    # Fallback: try to find any 4-digit year in the string
+    any_year = re.search(r"\b(19|20)\d{2}\b", raw)
+    if any_year:
+        return any_year.group(0)
+
+    return ""
+
+
+def age_bucket(value: Any) -> str:
+    """
+    HIPAA Safe Harbor age bucketing.
+
+    Per 45 CFR 164.514(b)(2)(i)(C), ages over 89 must be aggregated into
+    a single category of "90 or over."
+
+    Args:
+        value: Age as integer or string
+
+    Returns:
+        Original age as string if <= 89, or "90+" if > 89
+    """
+    try:
+        age = int(float(value))
+    except (ValueError, TypeError, OverflowError):
+        return str(value)
+
+    if age > 89:
+        return "90+"
+    return str(age)
+
+
+_FREETEXT_REDACTION_PATTERNS: list[tuple[re.Pattern[str], str]] = [
+    # Email
+    (re.compile(r"\b[A-Za-z0-9._%+\-]+@[A-Za-z0-9.\-]+\.[A-Za-z]{2,}\b"), "[REDACTED_EMAIL]"),
+    # SSN
+    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "[REDACTED_SSN]"),
+    # US Phone
+    (re.compile(r"\b(?:\+?1[-.\s]?)?\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}\b"), "[REDACTED_PHONE]"),
+    # Credit card (13-19 digits, optional space or hyphen between digits)
+    (re.compile(r"(?<!\d)\d(?:[\s-]?\d){12,18}(?!\d)"), "[REDACTED_PAN]"),
+    # IPv4
+    (re.compile(
+        r"\b(?:(?:25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(?:25[0-5]|2[0-4]\d|[01]?\d\d?)\b"
+    ), "[REDACTED_IP]"),
+]
+
+
+def redact_freetext(value: Any) -> str:
+    """
+    Inline PII redaction for free-text fields.
+
+    Replaces detected PII patterns with placeholder tokens while preserving
+    the surrounding text structure. This is for NOT NULL text columns where
+    NULLing is not possible.
+
+    Args:
+        value: Original text value
+
+    Returns:
+        Text with PII patterns replaced by [REDACTED_*] placeholders
+    """
+    if value is None:
+        return ""
+
+    text = str(value)
+    for pattern, replacement in _FREETEXT_REDACTION_PATTERNS:
+        text = pattern.sub(replacement, text)
+    return text
+
+
+BINARY_SENTINEL = b"\x00"
+"""Sentinel value for NOT NULL binary columns when compliance requires NULLing."""
+
+
+CUSTOM_TRANSFORMERS: dict[str, Any] = {
+    "hipaa_zip3": hipaa_safe_harbor_zip3,
+    "year_only": year_only,
+    "age_bucket": age_bucket,
+    "redact_freetext": redact_freetext,
+}
diff --git a/src/dbslice/config.py b/src/dbslice/config.py
index 3ae619b..98a2dea 100644
--- a/src/dbslice/config.py
+++ b/src/dbslice/config.py
@@ -323,3 +323,20 @@ class ExtractConfig:
     virtual_foreign_keys: list[VirtualForeignKey] = field(default_factory=list)
     schema: str | None = None  # PostgreSQL schema name (default: public)
     allow_unsafe_where: bool = False
+    compliance_profiles: list[str] = field(default_factory=list)
+    compliance_strict: bool = False  # Fail if uncovered PII detected
+    generate_manifest: bool = False  # Generate audit manifest
+    deterministic: bool = True  # False = non-deterministic anonymization
+    compliance_policy_mode: str = "off"  # off, standard, strict
+    compliance_allowed_url_patterns: list[str] = field(default_factory=list)
+    compliance_denied_url_patterns: list[str] = field(default_factory=list)
+    compliance_required_sslmode: str | None = None
+    compliance_require_ci: bool = False
+    compliance_manifest_sign: bool = False
+    compliance_manifest_key_env: str = "DBSLICE_MANIFEST_SIGNING_KEY"
+    freetext_action: str = "warn"  # warn, null, redact
+    binary_action: str = "warn"  # warn, null, sentinel
+    compliance_sample_rows: int = 100  # PII scan sample size during extract
+    k_anonymity_min_k: int | None = None  # None = disabled, 2+ = check
+    k_anonymity_quasi_identifiers: list[str] = field(default_factory=list)
+    k_anonymity_action: str = "warn"  # warn, fail
diff --git a/src/dbslice/config_file.py b/src/dbslice/config_file.py
index f1a45c6..78a8e3c 100644
--- a/src/dbslice/config_file.py
+++ b/src/dbslice/config_file.py
@@ -30,6 +30,7 @@
     "performance",
     "tables",
     "virtual_foreign_keys",
+    "compliance",
 }
 _DATABASE_KEYS = {"url", "schema", "options"}
 _EXTRACTION_KEYS = {
@@ -48,6 +49,19 @@
     "fields",
     "patterns",
     "security_null_fields",
+    "deterministic",
+}
+_COMPLIANCE_KEYS = {
+    "profiles",
+    "strict",
+    "generate_manifest",
+    "policy_mode",
+    "allow_url_patterns",
+    "deny_url_patterns",
+    "required_sslmode",
+    "require_ci",
+    "sign_manifest",
+    "manifest_key_env",
 }
 _OUTPUT_KEYS = {
     "format",
@@ -233,6 +247,7 @@ def _yaml_quote(value: str) -> str:
     "DatabaseConfig",
     "ExtractionConfig",
     "AnonymizationConfig",
+    "ComplianceConfig",
     "OutputConfig",
     "PerformanceConfig",
     "StreamingConfig",
@@ -326,6 +341,44 @@ class AnonymizationConfig:
     Example: ["users.password*", "*.api_key"]
     """
 
+    deterministic: bool = True
+    """Use deterministic anonymization (same input → same output). Set to false for stronger privacy."""
+
+
+@dataclass
+class ComplianceConfig:
+    """Compliance configuration."""
+
+    profiles: list[str] = field(default_factory=list)
+    """Compliance profiles to apply (e.g., ['gdpr', 'hipaa', 'pci-dss'])."""
+
+    strict: bool = False
+    """Fail extraction if uncovered PII is detected by value scanning."""
+
+    generate_manifest: bool = False
+    """Generate an audit manifest alongside extraction output."""
+
+    policy_mode: str = "off"
+    """Policy gate mode: off, standard, or strict."""
+
+    allow_url_patterns: list[str] = field(default_factory=list)
+    """Allow-list regex patterns for source database URLs."""
+
+    deny_url_patterns: list[str] = field(default_factory=list)
+    """Deny-list regex patterns for source database URLs."""
+
+    required_sslmode: str | None = None
+    """Required PostgreSQL sslmode query parameter value."""
+
+    require_ci: bool = False
+    """Require CI environment for compliance-active extraction."""
+
+    sign_manifest: bool = False
+    """Sign compliance manifests with HMAC."""
+
+    manifest_key_env: str = "DBSLICE_MANIFEST_SIGNING_KEY"
+    """Environment variable name containing HMAC signing key."""
+
 
 @dataclass
 class OutputConfig:
@@ -438,6 +491,7 @@ class DbsliceConfig:
     database: DatabaseConfig = field(default_factory=DatabaseConfig)
     extraction: ExtractionConfig = field(default_factory=ExtractionConfig)
     anonymization: AnonymizationConfig = field(default_factory=AnonymizationConfig)
+    compliance: ComplianceConfig = field(default_factory=ComplianceConfig)
     output: OutputConfig = field(default_factory=OutputConfig)
     performance: PerformanceConfig = field(default_factory=PerformanceConfig)
     tables: dict[str, TableOverride] = field(default_factory=dict)
@@ -583,12 +637,100 @@ def _from_dict(cls, data: dict[str, Any]) -> "DbsliceConfig":
         for pattern in security_null_fields:
             _validate_glob_field_pattern(pattern, "'anonymization.security_null_fields'")
 
+        deterministic_val = anon_data.get("deterministic", True)
+        if not isinstance(deterministic_val, bool):
+            raise ValueError("'anonymization.deterministic' must be true or false")
+
         anonymization = AnonymizationConfig(
             enabled=anon_data.get("enabled", False),
             seed=anon_data.get("seed"),
             fields=fields,
             patterns=patterns,
             security_null_fields=security_null_fields,
+            deterministic=deterministic_val,
+        )
+
+        compliance_data = data.get("compliance", {})
+        if not isinstance(compliance_data, dict):
+            raise ValueError("'compliance' section must be a mapping")
+        _validate_unknown_keys("compliance", compliance_data, _COMPLIANCE_KEYS)
+
+        compliance_profiles_raw = compliance_data.get("profiles", [])
+        if not isinstance(compliance_profiles_raw, list):
+            raise ValueError("'compliance.profiles' must be a list")
+
+        # Validate profile names
+        from dbslice.compliance.profiles import get_profile
+        for profile_name in compliance_profiles_raw:
+            if not isinstance(profile_name, str):
+                raise ValueError("'compliance.profiles' entries must be strings")
+            get_profile(profile_name)  # Raises ValueError if unknown
+
+        compliance_strict = compliance_data.get("strict", False)
+        if not isinstance(compliance_strict, bool):
+            raise ValueError("'compliance.strict' must be true or false")
+        compliance_manifest = compliance_data.get("generate_manifest", False)
+        if not isinstance(compliance_manifest, bool):
+            raise ValueError("'compliance.generate_manifest' must be true or false")
+        compliance_policy_mode = compliance_data.get("policy_mode", "off")
+        if compliance_policy_mode not in {"off", "standard", "strict"}:
+            raise ValueError("'compliance.policy_mode' must be one of: off, standard, strict")
+
+        allow_url_patterns = compliance_data.get("allow_url_patterns", [])
+        if not isinstance(allow_url_patterns, list) or not all(
+            isinstance(item, str) for item in allow_url_patterns
+        ):
+            raise ValueError("'compliance.allow_url_patterns' must be a list of strings")
+        for pattern in allow_url_patterns:
+            try:
+                re.compile(pattern)
+            except re.error as e:
+                raise ValueError(
+                    f"'compliance.allow_url_patterns' contains invalid regex '{pattern}': {e}"
+                ) from e
+
+        deny_url_patterns = compliance_data.get("deny_url_patterns", [])
+        if not isinstance(deny_url_patterns, list) or not all(
+            isinstance(item, str) for item in deny_url_patterns
+        ):
+            raise ValueError("'compliance.deny_url_patterns' must be a list of strings")
+        for pattern in deny_url_patterns:
+            try:
+                re.compile(pattern)
+            except re.error as e:
+                raise ValueError(
+                    f"'compliance.deny_url_patterns' contains invalid regex '{pattern}': {e}"
+                ) from e
+
+        required_sslmode = compliance_data.get("required_sslmode")
+        if required_sslmode is not None and (
+            not isinstance(required_sslmode, str) or not required_sslmode.strip()
+        ):
+            raise ValueError("'compliance.required_sslmode' must be a non-empty string when set")
+
+        require_ci = compliance_data.get("require_ci", False)
+        if not isinstance(require_ci, bool):
+            raise ValueError("'compliance.require_ci' must be true or false")
+
+        sign_manifest = compliance_data.get("sign_manifest", False)
+        if not isinstance(sign_manifest, bool):
+            raise ValueError("'compliance.sign_manifest' must be true or false")
+
+        manifest_key_env = compliance_data.get("manifest_key_env", "DBSLICE_MANIFEST_SIGNING_KEY")
+        if not isinstance(manifest_key_env, str) or not manifest_key_env:
+            raise ValueError("'compliance.manifest_key_env' must be a non-empty string")
+
+        compliance = ComplianceConfig(
+            profiles=compliance_profiles_raw,
+            strict=compliance_strict,
+            generate_manifest=compliance_manifest,
+            policy_mode=compliance_policy_mode,
+            allow_url_patterns=allow_url_patterns,
+            deny_url_patterns=deny_url_patterns,
+            required_sslmode=required_sslmode,
+            require_ci=require_ci,
+            sign_manifest=sign_manifest,
+            manifest_key_env=manifest_key_env,
         )
 
         output_data = data.get("output", {})
@@ -805,6 +947,7 @@ def _from_dict(cls, data: dict[str, Any]) -> "DbsliceConfig":
             database=database,
             extraction=extraction,
             anonymization=anonymization,
+            compliance=compliance,
             output=output,
             performance=performance,
             tables=tables,
@@ -1052,6 +1195,18 @@ def to_extract_config(
             virtual_foreign_keys=virtual_fks,
             schema=final_schema,
             allow_unsafe_where=final_allow_unsafe_where,
+            compliance_profiles=self.compliance.profiles,
+            compliance_strict=self.compliance.strict,
+            generate_manifest=self.compliance.generate_manifest
+            or bool(self.compliance.profiles),
+            deterministic=self.anonymization.deterministic,
+            compliance_policy_mode=self.compliance.policy_mode,
+            compliance_allowed_url_patterns=list(self.compliance.allow_url_patterns),
+            compliance_denied_url_patterns=list(self.compliance.deny_url_patterns),
+            compliance_required_sslmode=self.compliance.required_sslmode,
+            compliance_require_ci=self.compliance.require_ci,
+            compliance_manifest_sign=self.compliance.sign_manifest,
+            compliance_manifest_key_env=self.compliance.manifest_key_env,
         )
 
     def to_yaml(self, include_comments: bool = True) -> str:
@@ -1136,6 +1291,38 @@ def to_yaml(self, include_comments: bool = True) -> str:
             output.append("  security_null_fields:")
             for pattern in self.anonymization.security_null_fields:
                 output.append(f"    - {_yaml_quote(pattern)}")
+        output.append(f"  deterministic: {str(self.anonymization.deterministic).lower()}")
+        if include_comments:
+            output.append(
+                "  # deterministic=false increases privacy but may reduce repeatability"
+            )
+        output.append("")
+
+        if include_comments:
+            output.append("# Compliance settings")
+        output.append("compliance:")
+        if self.compliance.profiles:
+            output.append("  profiles:")
+            for profile in self.compliance.profiles:
+                output.append(f"    - {profile}")
+        else:
+            output.append("  profiles: []")
+        output.append(f"  strict: {str(self.compliance.strict).lower()}")
+        output.append(f"  generate_manifest: {str(self.compliance.generate_manifest).lower()}")
+        output.append(f"  policy_mode: {_yaml_quote(self.compliance.policy_mode)}")
+        if self.compliance.allow_url_patterns:
+            output.append("  allow_url_patterns:")
+            for pattern in self.compliance.allow_url_patterns:
+                output.append(f"    - {_yaml_quote(pattern)}")
+        if self.compliance.deny_url_patterns:
+            output.append("  deny_url_patterns:")
+            for pattern in self.compliance.deny_url_patterns:
+                output.append(f"    - {_yaml_quote(pattern)}")
+        if self.compliance.required_sslmode:
+            output.append(f"  required_sslmode: {_yaml_quote(self.compliance.required_sslmode)}")
+        output.append(f"  require_ci: {str(self.compliance.require_ci).lower()}")
+        output.append(f"  sign_manifest: {str(self.compliance.sign_manifest).lower()}")
+        output.append(f"  manifest_key_env: {_yaml_quote(self.compliance.manifest_key_env)}")
         output.append("")
 
         if include_comments:
diff --git a/src/dbslice/core/engine.py b/src/dbslice/core/engine.py
index b447851..e9f9266 100644
--- a/src/dbslice/core/engine.py
+++ b/src/dbslice/core/engine.py
@@ -117,18 +117,52 @@ def __init__(
         self.adapter: DatabaseAdapter | None = None
         self.schema: SchemaGraph | None = None
         self.progress_callback = progress_callback
+        self.manifest: Any = None  # ComplianceManifest or None
+
+        if config.generate_manifest or config.compliance_profiles:
+            import uuid
+
+            from dbslice.compliance.manifest import ComplianceManifest
+
+            self.manifest = ComplianceManifest()
+            self.manifest.initialize(
+                extraction_id=str(uuid.uuid4()),
+                compliance_profiles=config.compliance_profiles,
+                anonymization_seed=config.anonymization_seed,
+                deterministic=config.deterministic,
+            )
+
+        effective_field_providers = dict(config.anonymization_field_providers)
+        effective_patterns = dict(config.anonymization_patterns)
+        effective_profile_patterns: dict[str, str] = {}
+        effective_security_null = list(config.security_null_fields)
+        if config.compliance_profiles:
+            from dbslice.compliance.profiles import get_profile
+
+            for profile_name in config.compliance_profiles:
+                profile = get_profile(profile_name)
+                # Merge profile patterns as wildcard fallback rules (lower priority than user rules).
+                for pattern, method in profile.required_column_patterns.items():
+                    effective_profile_patterns.setdefault(f"*.{pattern}*", method)
+                for null_pattern in profile.required_null_patterns:
+                    glob = f"*.{null_pattern}*"
+                    if glob not in effective_security_null:
+                        effective_security_null.append(glob)
 
-        # Initialize anonymizer if needed (schema will be set after introspection)
         self.anonymizer: DeterministicAnonymizer | None = None
-        if config.anonymize or config.redact_fields:
+        needs_anonymize = config.anonymize or config.redact_fields or config.compliance_profiles
+        if needs_anonymize:
             self.anonymizer = DeterministicAnonymizer(
-                seed=config.anonymization_seed or DEFAULT_ANONYMIZATION_SEED
+                seed=config.anonymization_seed or DEFAULT_ANONYMIZATION_SEED,
+                deterministic=config.deterministic,
+                manifest=self.manifest,
             )
             self.anonymizer.configure(
                 config.redact_fields,
-                field_providers=config.anonymization_field_providers,
-                patterns=config.anonymization_patterns,
-                security_null_fields=config.security_null_fields,
+                field_providers=effective_field_providers,
+                patterns=effective_patterns,
+                fallback_patterns=effective_profile_patterns,
+                security_null_fields=effective_security_null,
             )
 
     def _log(self, stage: str, message: str, current: int = 0, total: int = 0) -> None:
@@ -371,7 +405,6 @@ def _do_extract(self, db_type: DatabaseType) -> ExtractionResult:
         logger.info("Starting data fetch phase", table_count=len(all_records))
 
         tables_data: dict[str, list[dict[str, Any]]] = {}
-        stats: dict[str, int] = {}
 
         total_tables = len(all_records)
         for i, (table, pk_values) in enumerate(all_records.items()):
@@ -392,29 +425,47 @@ def _do_extract(self, db_type: DatabaseType) -> ExtractionResult:
             ):
                 rows = list(self.adapter.fetch_by_pk(table, pk_columns, pk_values))
 
-            # Anonymize if enabled
-            if self.anonymizer:
-                with logger.timed_operation("anonymize_table_data", table=table):
-                    rows = self._anonymize_table_data(table, rows)
-                logger.debug("Table data anonymized", table=table, row_count=len(rows))
-
             tables_data[table] = rows
-            stats[table] = len(rows)
             logger.debug("Table data fetched", table=table, row_count=len(rows))
 
         if self._has_row_limits():
             self._log("limits", "Applying deterministic row limits with integrity closure...")
             with logger.timed_operation("apply_row_limits"):
                 tables_data = self._apply_row_limits(tables_data)
-            stats = {table: len(rows) for table, rows in tables_data.items()}
             logger.info(
                 "Row limits applied",
                 global_limit=self.config.row_limit_global,
                 per_table_limits=len(self.config.row_limit_per_table),
-                total_rows=sum(stats.values()),
+                total_rows=sum(len(rows) for rows in tables_data.values()),
             )
             self._log("limits", "Row limits applied")
 
+        scan_pre_mask_data: dict[str, list[dict[str, Any]]] | None = None
+        if self.config.compliance_profiles and self.anonymizer:
+            # Pre-mask snapshot used for coverage scan decisions.
+            scan_pre_mask_data = {
+                table: [dict(row) for row in rows] for table, rows in tables_data.items()
+            }
+
+        if self.anonymizer:
+            self._log("anonymize", "Applying anonymization rules...")
+            total_tables = len(tables_data)
+            for i, table in enumerate(sorted(tables_data.keys())):
+                rows = tables_data[table]
+                if not rows:
+                    continue
+                self._log(
+                    "anonymize",
+                    f"Anonymizing {len(rows)} rows in {table}",
+                    i + 1,
+                    total_tables,
+                )
+                with logger.timed_operation("anonymize_table_data", table=table):
+                    tables_data[table] = self._anonymize_table_data(table, rows)
+                logger.debug("Table data anonymized", table=table, row_count=len(rows))
+
+        stats: dict[str, int] = {table: len(rows) for table, rows in tables_data.items()}
+
         deferred_updates = []
         if broken_fks:
             from dbslice.core.cycles import build_deferred_updates
@@ -472,6 +523,19 @@ def _do_extract(self, db_type: DatabaseType) -> ExtractionResult:
                     )
                     raise ExtractionError(error_msg)
 
+        if self.config.compliance_profiles and self.schema:
+            self._apply_freetext_and_binary_handling(tables_data)
+
+        if self.config.compliance_profiles and self.anonymizer and scan_pre_mask_data is not None:
+            self._run_pii_scan(scan_pre_mask_data, tables_data)
+
+        if self.config.k_anonymity_min_k is not None:
+            self._check_k_anonymity(tables_data)
+
+        if self.manifest:
+            for table, rows in tables_data.items():
+                self.manifest.set_table_row_count(table, len(rows))
+
         return ExtractionResult(
             tables=tables_data,
             insert_order=insert_order,
@@ -504,6 +568,276 @@ def _anonymize_table_data(self, table: str, rows: list[dict[str, Any]]) -> list[
 
         return [self.anonymizer.anonymize_row(table, row) for row in rows]
 
+    def _run_pii_scan(
+        self,
+        pre_mask_data: dict[str, list[dict[str, Any]]],
+        post_mask_data: dict[str, list[dict[str, Any]]],
+    ) -> None:
+        """
+        Run two-phase compliance value scanning.
+
+        1) Coverage scan (pre-mask): identify where PII exists in extracted values.
+        2) Residual scan (post-mask): re-scan only columns not expected to be protected.
+        """
+        from dbslice.compliance.profiles import get_profile
+        from dbslice.compliance.scanner import PIIScanner
+
+        assert self.anonymizer is not None
+
+        # Collect all scan patterns from active profiles
+        scan_patterns: set[str] = set()
+        freetext_patterns: set[str] = set()
+        for profile_name in self.config.compliance_profiles:
+            profile = get_profile(profile_name)
+            scan_patterns.update(profile.value_scan_patterns)
+            freetext_patterns.update(profile.warn_freetext_columns)
+
+        if not scan_patterns:
+            return
+
+        scanner = PIIScanner(patterns=sorted(scan_patterns))
+        self._log("compliance", "Running compliance coverage scan...")
+
+        coverage_detections = []
+        for table, rows in pre_mask_data.items():
+            if not rows:
+                continue
+            sample = rows[: self.config.compliance_sample_rows]
+            detections = scanner.scan_rows(table, sample)
+            coverage_detections.extend(detections)
+
+            # Check for freetext columns that might contain embedded PII
+            if freetext_patterns:
+                for col in rows[0].keys():
+                    col_lower = col.lower()
+                    for pattern in freetext_patterns:
+                        if pattern in col_lower:
+                            if self.manifest:
+                                self.manifest.add_warning(
+                                    table, col,
+                                    f"Free-text column may contain embedded PII (matched pattern: {pattern})",
+                                )
+                            break
+
+        unprotected_columns: dict[str, set[str]] = {}
+        for detection in coverage_detections:
+            is_protected = self.anonymizer.should_anonymize(
+                detection.table, detection.column
+            ) or self.anonymizer.should_null(detection.table, detection.column)
+            if not is_protected:
+                unprotected_columns.setdefault(detection.table, set()).add(detection.column)
+                if self.manifest:
+                    self.manifest.add_warning(
+                        detection.table,
+                        detection.column,
+                        "PII detected in coverage scan but field is not configured for masking",
+                    )
+
+        residual_detections = []
+        if unprotected_columns:
+            self._log("compliance", "Running compliance residual scan...")
+            for table, rows in post_mask_data.items():
+                if not rows:
+                    continue
+                columns_to_scan = unprotected_columns.get(table)
+                if not columns_to_scan:
+                    continue
+                sample = rows[: self.config.compliance_sample_rows]
+                skip_columns = {col for col in rows[0].keys() if col not in columns_to_scan}
+                detections = scanner.scan_rows(table, sample, skip_columns=skip_columns)
+                residual_detections.extend(detections)
+
+        if self.manifest:
+            self.manifest.add_pii_detections(residual_detections)
+
+        if residual_detections:
+            logger.warning(
+                "Residual PII detected in post-mask scan",
+                detection_count=len(residual_detections),
+                tables_affected=len({d.table for d in residual_detections}),
+            )
+            self._log(
+                "compliance",
+                f"Residual scan found {len(residual_detections)} unprotected PII detection(s)",
+            )
+
+            if self.config.compliance_strict:
+                detection_details = [
+                    f"  {d.table}.{d.column}: {d.pattern_name} ({d.match_count}/{d.sample_size} matches, {d.confidence} confidence)"
+                    for d in residual_detections
+                ]
+                raise ExtractionError(
+                    "Compliance strict mode: residual unprotected PII detected after masking.\n"
+                    + "\n".join(detection_details)
+                )
+        else:
+            if coverage_detections:
+                self._log(
+                    "compliance",
+                    "Coverage scan detected PII in source values; residual scan is clean",
+                )
+            else:
+                self._log("compliance", "Coverage scan clean: no PII detected in sampled values")
+
+    def _apply_freetext_and_binary_handling(
+        self, tables_data: dict[str, list[dict[str, Any]]]
+    ) -> None:
+        """Apply free-text redaction and binary column handling based on compliance config."""
+        assert self.schema is not None
+
+        from dbslice.compliance.profiles import get_profile
+        from dbslice.compliance.transformers import BINARY_SENTINEL, redact_freetext
+
+        freetext_action = self.config.freetext_action
+        binary_action = self.config.binary_action
+
+        # Collect freetext column patterns from active profiles
+        freetext_patterns: set[str] = set()
+        for profile_name in self.config.compliance_profiles:
+            profile = get_profile(profile_name)
+            freetext_patterns.update(profile.warn_freetext_columns)
+
+        # Binary-like column types (PostgreSQL bytea/lo plus names used by other databases)
+        binary_types = {"bytea", "blob", "binary", "varbinary", "image", "lo"}
+
+        for table, rows in tables_data.items():
+            if not rows:
+                continue
+            table_info = self.schema.get_table(table)
+            if not table_info:
+                continue
+
+            for col_obj in table_info.columns:
+                col = col_obj.name
+                col_lower = col.lower()
+                col_type_lower = col_obj.data_type.lower()
+
+                # Binary column handling
+                if col_type_lower == "lo" or any(bt in col_type_lower for bt in binary_types - {"lo"}):
+                    if binary_action == "null":
+                        for row in rows:
+                            if col in row:
+                                row[col] = None
+                    elif binary_action == "sentinel":
+                        for row in rows:
+                            if col in row and row[col] is not None:
+                                if col_obj.nullable:
+                                    row[col] = None
+                                else:
+                                    row[col] = BINARY_SENTINEL
+                    if self.manifest:
+                        self.manifest.add_warning(
+                            table, col,
+                            f"Binary column ({col_obj.data_type}) handled with action={binary_action}",
+                        )
+                    continue
+
+                # Free-text column handling
+                is_freetext = any(pat in col_lower for pat in freetext_patterns)
+                if not is_freetext:
+                    continue
+
+                # Skip columns already handled by anonymizer
+                if self.anonymizer and (
+                    self.anonymizer.should_anonymize(table, col)
+                    or self.anonymizer.should_null(table, col)
+                ):
+                    continue
+
+                if freetext_action == "null":
+                    for row in rows:
+                        if col in row:
+                            if col_obj.nullable:
+                                row[col] = None
+                            else:
+                                # NOT NULL: fall back to redact
+                                row[col] = redact_freetext(row[col])
+                elif freetext_action == "redact":
+                    for row in rows:
+                        if col in row and row[col] is not None:
+                            row[col] = redact_freetext(row[col])
+
+                if self.manifest and freetext_action != "warn":
+                    effective = freetext_action
+                    if freetext_action == "null" and not col_obj.nullable:
+                        effective = "redact (NOT NULL fallback)"
+                    self.manifest.add_warning(
+                        table, col,
+                        f"Free-text column handled with action={effective}",
+                    )
+
+    def _check_k_anonymity(self, tables_data: dict[str, list[dict[str, Any]]]) -> None:
+        """
+        Post-extraction k-anonymity verification.
+
+        Checks that every combination of configured quasi-identifiers appears
+        at least k times in the output. Never modifies data; raises only when the action is "fail".
+        """
+        min_k = self.config.k_anonymity_min_k
+        qi_specs = self.config.k_anonymity_quasi_identifiers
+        action = self.config.k_anonymity_action
+
+        if not min_k or not qi_specs:
+            return
+
+        self._log("compliance", f"Running k-anonymity check (k={min_k})...")
+
+        # Parse quasi-identifier specs: "table.column" format
+        qi_by_table: dict[str, list[str]] = {}
+        for spec in qi_specs:
+            parts = spec.split(".", 1)
+            if len(parts) == 2:
+                qi_by_table.setdefault(parts[0].lower(), []).append(parts[1].lower())
+
+        violations: list[str] = []
+        for table, qi_columns in qi_by_table.items():
+            rows = tables_data.get(table, [])
+            if not rows:
+                continue
+
+            # Map lowercased spec names back to the actual row keys so lookups hit real columns
+            available = {c.lower(): c for c in rows[0].keys()}
+            active_qi = [available[c] for c in qi_columns if c in available]
+            if not active_qi:
+                continue
+
+            # Count combinations
+            from collections import Counter
+
+            combos = Counter(
+                tuple(str(row.get(c, "")) for c in active_qi)
+                for row in rows
+            )
+
+            for combo, count in combos.items():
+                if count < min_k:
+                    combo_str = ", ".join(f"{c}={v}" for c, v in zip(active_qi, combo))
+                    violations.append(f"{table}: [{combo_str}] appears {count} time(s)")
+
+        if not violations:
+            self._log("compliance", f"k-anonymity check passed (k={min_k})")
+            return
+
+        msg = f"k-anonymity violation: {len(violations)} combination(s) appear fewer than {min_k} times"
+        logger.warning(msg, violation_count=len(violations), min_k=min_k)
+
+        if self.manifest:
+            for v in violations[:50]:
+                self.manifest.add_warning("_k_anonymity", "quasi_identifiers", v)
+
+        detail_lines = [f"  {v}" for v in violations[:20]]
+        if len(violations) > 20:
+            detail_lines.append(f"  ... and {len(violations) - 20} more")
+
+        self._log("compliance", msg)
+
+        if action == "fail":
+            raise ExtractionError(
+                f"k-anonymity check failed (k={min_k}): "
+                f"{len(violations)} quasi-identifier combination(s) are unique or below threshold.\n"
+                + "\n".join(detail_lines)
+            )
+
     def _has_row_limits(self) -> bool:
         """Check whether any row-limit configuration is active."""
         return self.config.row_limit_global is not None or bool(self.config.row_limit_per_table)
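The k-anonymity check added in this hunk amounts to counting quasi-identifier tuples per table and flagging any tuple seen fewer than k times. A minimal standalone sketch of that counting step follows, assuming rows are plain dicts. The function name `find_k_anonymity_violations` and the sample rows are hypothetical and not part of dbslice.

```python
from collections import Counter
from typing import Any


def find_k_anonymity_violations(
    rows: list[dict[str, Any]], quasi_identifiers: list[str], min_k: int
) -> dict[tuple[str, ...], int]:
    """Return each quasi-identifier combination that occurs fewer than min_k times."""
    # Values are stringified so mixed or unhashable types can still be counted.
    combos = Counter(
        tuple(str(row.get(col, "")) for col in quasi_identifiers) for row in rows
    )
    return {combo: count for combo, count in combos.items() if count < min_k}


# Hypothetical sample data, for illustration only.
sample_rows = [
    {"zip": "12345", "birth_year": 1980},
    {"zip": "12345", "birth_year": 1980},
    {"zip": "99999", "birth_year": 1975},
]
violations = find_k_anonymity_violations(sample_rows, ["zip", "birth_year"], min_k=2)
print(violations)
```

Returning the offending combinations rather than raising keeps the sketch read-only, so the caller can decide whether to warn or fail.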
diff --git a/src/dbslice/mapping/__init__.py b/src/dbslice/mapping/__init__.py
new file mode 100644
index 0000000..8b13789
--- /dev/null
+++ b/src/dbslice/mapping/__init__.py
@@ -0,0 +1 @@
+
diff --git a/src/dbslice/mapping/server.py b/src/dbslice/mapping/server.py
new file mode 100644
index 0000000..dd904e0
--- /dev/null
+++ b/src/dbslice/mapping/server.py
@@ -0,0 +1,489 @@
+from __future__ import annotations
+
+import inspect
+import json
+import secrets
+import threading
+from http.server import BaseHTTPRequestHandler, HTTPServer
+from typing import Any
+from urllib.parse import parse_qs, urlparse
+
+from dbslice.logging import get_logger
+from dbslice.models import SchemaGraph
+
+logger = get_logger(__name__)
+
+
+class MappingServer:
+    """Local mapping UI HTTP server."""
+
+    def __init__(
+        self,
+        port: int = 9473,
+        database_url: str | None = None,
+        schema: str | None = None,
+    ):
+        self.port = port
+        self.database_url = database_url
+        self.schema_name = schema
+        self.token = secrets.token_urlsafe(32)
+        self._server: HTTPServer | None = None
+        self._cached_schema: SchemaGraph | None = None
+        self._cached_adapter: Any = None
+
+    @property
+    def url(self) -> str:
+        return f"http://127.0.0.1:{self.port}/?token={self.token}"
+
+    def start(self, open_browser: bool = True) -> None:
+        """Start the server and optionally open a browser."""
+        handler = _make_handler(self)
+        self._server = HTTPServer(("127.0.0.1", self.port), handler)
+
+        if open_browser:
+            import webbrowser
+
+            threading.Timer(0.5, webbrowser.open, args=[self.url]).start()
+
+        logger.info("Mapping UI server starting", url=self.url)
+        try:
+            self._server.serve_forever()
+        except KeyboardInterrupt:
+            pass
+        finally:
+            self._server.server_close()
+            if self._cached_adapter:
+                try:
+                    self._cached_adapter.close()
+                except Exception:
+                    pass
+
+    def _introspect(self, database_url: str, schema: str | None, detect_sensitive: bool) -> dict:
+        """Connect to database and introspect schema."""
+        from dbslice.adapters.postgresql import PostgreSQLAdapter
+        from dbslice.compliance.profiles import list_profiles
+        from dbslice.input_validators import validate_database_url
+        from dbslice.utils.anonymizer import (
+            _SECURITY_NULL_PATTERNS,
+        )
+        from dbslice.utils.connection import parse_database_url
+
+        validate_database_url(database_url)
+        db_config = parse_database_url(database_url)
+
+        if self._cached_adapter:
+            try:
+                self._cached_adapter.close()
+            except Exception:
+                pass
+
+        adapter = PostgreSQLAdapter(schema=schema)
+        adapter.connect(database_url)
+        self._cached_adapter = adapter
+
+        db_schema = adapter.get_schema()
+        self._cached_schema = db_schema
+
+        tables = []
+        sensitive_suggestions: dict[str, str] = {}
+
+        if detect_sensitive:
+            sensitive_patterns = {
+                "email": "email",
+                "e_mail": "email",
+                "email_address": "email",
+                "phone": "phone_number",
+                "telephone": "phone_number",
+                "mobile": "phone_number",
+                "cell": "phone_number",
+                "first_name": "first_name",
+                "firstname": "first_name",
+                "last_name": "last_name",
+                "lastname": "last_name",
+                "full_name": "name",
+                "fullname": "name",
+                "address": "address",
+                "street": "street_address",
+                "city": "city",
+                "postal_code": "postcode",
+                "zipcode": "postcode",
+                "ssn": "ssn",
+                "social_security": "ssn",
+                "passport": "passport_number",
+                "driver_license": "license_plate",
+                "credit_card": "credit_card_number",
+                "card_number": "credit_card_number",
+                "ip_address": "ipv4",
+                "ip": "ipv4",
+                "ipv4": "ipv4",
+                "ipv6": "ipv6",
+                "dob": "date_of_birth",
+                "date_of_birth": "date_of_birth",
+                "username": "user_name",
+            }
+            for table_name, table in db_schema.tables.items():
+                for column in table.columns:
+                    col_lower = column.name.lower()
+                    if col_lower in sensitive_patterns:
+                        sensitive_suggestions[f"{table_name}.{column.name}"] = sensitive_patterns[
+                            col_lower
+                        ]
+                    else:
+                        for pattern, provider in sensitive_patterns.items():
+                            if pattern in col_lower:
+                                sensitive_suggestions[f"{table_name}.{column.name}"] = provider
+                                break
+
+        fk_columns: set[tuple[str, str]] = set()
+        for fk in db_schema.edges:
+            for col in fk.source_columns:
+                fk_columns.add((fk.source_table, col))
+
+        null_columns: set[str] = set()
+        for tbl_name, tbl in db_schema.tables.items():
+            for col_obj in tbl.columns:
+                col_lower = col_obj.name.lower()
+                for pat in _SECURITY_NULL_PATTERNS:
+                    if pat in col_lower:
+                        null_columns.add(f"{tbl_name}.{col_obj.name}")
+                        break
+
+        from dbslice.models import Column as ColumnModel
+
+        for table_name in sorted(db_schema.tables.keys()):
+            table_info = db_schema.tables[table_name]
+            columns: list[dict[str, Any]] = []
+            col_obj2: ColumnModel
+            for col_obj2 in table_info.columns:
+                full_name = f"{table_name}.{col_obj2.name}"
+                is_fk = (table_name, col_obj2.name) in fk_columns
+                suggested = sensitive_suggestions.get(full_name)
+                is_null_target = full_name in null_columns
+
+                action = "keep"
+                provider = ""
+                if is_fk:
+                    action = "locked_fk"
+                elif col_obj2.is_primary_key:
+                    action = "locked_pk"
+                elif is_null_target:
+                    action = "null"
+                elif suggested:
+                    action = "anonymize"
+                    provider = suggested
+
+                columns.append(
+                    {
+                        "name": col_obj2.name,
+                        "data_type": col_obj2.data_type,
+                        "nullable": col_obj2.nullable,
+                        "is_pk": col_obj2.is_primary_key,
+                        "is_fk": is_fk,
+                        "suggested_action": action,
+                        "suggested_provider": provider,
+                    }
+                )
+
+            tables.append(
+                {
+                    "name": table_name,
+                    "primary_key": list(table_info.primary_key),
+                    "columns": columns,
+                }
+            )
+
+        profiles = [
+            {"name": p.name, "display_name": p.display_name, "description": p.description}
+            for p in list_profiles()
+        ]
+
+        common_providers = [
+            "email",
+            "phone_number",
+            "first_name",
+            "last_name",
+            "name",
+            "address",
+            "street_address",
+            "city",
+            "zipcode",
+            "ssn",
+            "credit_card_number",
+            "ipv4",
+            "ipv6",
+            "company",
+            "url",
+            "date_of_birth",
+            "user_name",
+            "passport_number",
+            "iban",
+            "pystr",
+            "random_int",
+            "year_only",
+            "hipaa_zip3",
+            "age_bucket",
+            "redact_freetext",
+        ]
+
+        return {
+            "database": db_config.database,
+            "table_count": len(tables),
+            "tables": tables,
+            "sensitive_suggestions": sensitive_suggestions,
+            "compliance_profiles": profiles,
+            "common_providers": common_providers,
+        }
+
+    def _apply_profile(self, profile_name: str, current_mappings: dict) -> dict:
+        """Apply a compliance profile's patterns to the current schema."""
+        from dbslice.compliance.profiles import get_profile
+
+        profile = get_profile(profile_name)
+        if not self._cached_schema:
+            return {"error": "No schema loaded. Run introspection first."}
+
+        additions: dict[str, str] = {}
+        null_additions: list[str] = []
+
+        for table_name, table in self._cached_schema.tables.items():
+            for column in table.columns:
+                full_name = f"{table_name}.{column.name}"
+                if full_name in current_mappings:
+                    continue
+
+                col_lower = column.name.lower()
+
+                for pat in profile.required_null_patterns:
+                    if pat in col_lower:
+                        null_additions.append(full_name)
+                        break
+                else:
+                    for pat, method in profile.required_column_patterns.items():
+                        if pat in col_lower:
+                            additions[full_name] = method
+                            break
+
+        return {
+            "profile": profile_name,
+            "display_name": profile.display_name,
+            "field_additions": additions,
+            "null_additions": null_additions,
+            "identifiers_covered": profile.identifiers,
+        }
+
+    @staticmethod
+    def _generate_config(mappings: dict) -> dict:
+        """Generate YAML config from column mappings."""
+        fields: dict[str, str] = {}
+        null_fields: list[str] = []
+
+        for full_name, action_data in mappings.items():
+            action = action_data.get("action", "keep")
+            if action == "anonymize":
+                provider = action_data.get("provider") or "pystr"  # empty string also falls back
+                fields[full_name] = provider
+            elif action == "null":
+                null_fields.append(full_name)
+
+        lines = [
+            "# Generated by dbslice map",
+            "",
+            "database:",
+            "  url: ${DATABASE_URL}",
+            "",
+            "anonymization:",
+            "  enabled: true",
+        ]
+
+        if fields:
+            lines.append("  fields:")
+            for field_name, provider in sorted(fields.items()):
+                lines.append(f"    {field_name}: {provider}")
+
+        if null_fields:
+            lines.append("  security_null_fields:")
+            for field_name in sorted(null_fields):
+                lines.append(f"    - {field_name}")
+
+        lines.extend(
+            [
+                "",
+                "extraction:",
+                "  default_depth: 3",
+                "  direction: both",
+                "  validate: true",
+                "",
+                "output:",
+                "  format: sql",
+                "  include_transaction: true",
+            ]
+        )
+
+        yaml_content = "\n".join(lines) + "\n"
+
+        cmd = 'dbslice extract --config dbslice.yaml --seed "<table.column=value>"'
+
+        return {
+            "yaml": yaml_content,
+            "command_template": cmd,
+            "field_count": len(fields),
+            "null_count": len(null_fields),
+        }
+
+    @staticmethod
+    def _validate_provider(provider: str) -> dict:
+        """Validate a provider name against custom transformers, then Faker."""
+        from dbslice.compliance.transformers import CUSTOM_TRANSFORMERS
+
+        if provider in CUSTOM_TRANSFORMERS:
+            return {"valid": True, "provider": provider, "source": "custom_transformer"}
+
+        try:
+            from faker import Faker
+        except ImportError:
+            return {"valid": False, "error": "Faker not installed"}
+
+        fake = Faker()
+        method = getattr(fake, provider, None)
+        if method is None or not callable(method):
+            return {"valid": False, "error": f"Unknown provider '{provider}'"}
+
+        try:
+            sig = inspect.signature(method)
+            for param in sig.parameters.values():
+                if param.kind in (inspect.Parameter.VAR_POSITIONAL, inspect.Parameter.VAR_KEYWORD):
+                    continue
+                if param.default is inspect.Parameter.empty:
+                    return {
+                        "valid": False,
+                        "error": f"Provider '{provider}' requires argument '{param.name}'",
+                    }
+        except (TypeError, ValueError):
+            pass
+
+        return {"valid": True, "provider": provider, "source": "faker"}
+
+
+def _make_handler(server: MappingServer):
+    """Create a request handler class bound to the server instance."""
+
+    class Handler(BaseHTTPRequestHandler):
+        def log_message(self, format, *args):
+            pass
+
+        def _check_token(self) -> bool:
+            token = self.headers.get("X-DBSLICE-Token")
+            if not secrets.compare_digest((token or "").encode(), server.token.encode()):
+                self._json_error(403, "Invalid or missing session token")
+                return False
+            return True
+
+        def _json_response(self, data: dict, status: int = 200) -> None:
+            body = json.dumps(data, default=str).encode("utf-8")
+            self.send_response(status)
+            self.send_header("Content-Type", "application/json")
+            self.send_header("Content-Length", str(len(body)))
+            self.end_headers()
+            self.wfile.write(body)
+
+        def _json_error(self, status: int, message: str) -> None:
+            self._json_response({"error": message}, status)
+
+        def _read_json_body(self) -> dict[str, Any] | None:
+            length = int(self.headers.get("Content-Length", 0))
+            if length == 0:
+                self._json_error(400, "Empty request body")
+                return None
+            try:
+                result: dict[str, Any] = json.loads(self.rfile.read(length))
+                return result
+            except ValueError:  # covers JSONDecodeError and UnicodeDecodeError (bad UTF-8)
+                self._json_error(400, "Invalid JSON")
+                return None
+
+        def do_GET(self) -> None:
+            parsed = urlparse(self.path)
+
+            if parsed.path == "/" or parsed.path == "":
+                query = parse_qs(parsed.query)
+                url_token = query.get("token", [None])[0]
+                if not secrets.compare_digest((url_token or "").encode(), server.token.encode()):
+                    self.send_response(403)
+                    self.send_header("Content-Type", "text/plain")
+                    self.end_headers()
+                    self.wfile.write(b"Invalid session token")
+                    return
+
+                from dbslice.mapping.ui import get_ui_html
+
+                html = get_ui_html(server.token, server.database_url or "")
+                body = html.encode("utf-8")
+                self.send_response(200)
+                self.send_header("Content-Type", "text/html; charset=utf-8")
+                self.send_header("Content-Length", str(len(body)))
+                self.end_headers()
+                self.wfile.write(body)
+            else:
+                self._json_error(404, "Not found")
+
+        def do_POST(self) -> None:
+            parsed = urlparse(self.path)
+
+            if not self._check_token():
+                return
+
+            if parsed.path == "/api/introspect":
+                body = self._read_json_body()
+                if body is None:
+                    return
+                try:
+                    result = server._introspect(
+                        database_url=body.get("database_url", ""),
+                        schema=body.get("schema"),
+                        detect_sensitive=body.get("detect_sensitive", True),
+                    )
+                    self._json_response(result)
+                except Exception as e:
+                    self._json_error(400, str(e))
+
+            elif parsed.path == "/api/apply-profile":
+                body = self._read_json_body()
+                if body is None:
+                    return
+                try:
+                    result = server._apply_profile(
+                        profile_name=body.get("profile", ""),
+                        current_mappings=body.get("current_mappings", {}),
+                    )
+                    self._json_response(result)
+                except Exception as e:
+                    self._json_error(400, str(e))
+
+            elif parsed.path == "/api/generate-config":
+                body = self._read_json_body()
+                if body is None:
+                    return
+                try:
+                    result = server._generate_config(
+                        mappings=body.get("mappings", {}),
+                    )
+                    self._json_response(result)
+                except Exception as e:
+                    self._json_error(400, str(e))
+
+            elif parsed.path == "/api/validate-provider":
+                body = self._read_json_body()
+                if body is None:
+                    return
+                try:
+                    result = server._validate_provider(
+                        provider=body.get("provider", ""),
+                    )
+                    self._json_response(result)
+                except Exception as e:
+                    self._json_error(400, str(e))
+
+            else:
+                self._json_error(404, "Not found")
+
+    return Handler
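`_validate_provider` in the file above rejects Faker methods that cannot be called without arguments by walking the method's signature. The sketch below isolates that signature check using only the standard library. The helper name `first_required_parameter` and the two sample functions are hypothetical, chosen for illustration rather than taken from dbslice or Faker.

```python
import inspect
from typing import Callable, Optional


def first_required_parameter(func: Callable[..., object]) -> Optional[str]:
    """Return the name of the first parameter that has no default, or None."""
    try:
        signature = inspect.signature(func)
    except (TypeError, ValueError):
        # Some callables expose no introspectable signature; assume no required args.
        return None
    for param in signature.parameters.values():
        # *args and **kwargs never force the caller to pass anything.
        if param.kind in (inspect.Parameter.VAR_POSITIONAL, inspect.Parameter.VAR_KEYWORD):
            continue
        if param.default is inspect.Parameter.empty:
            return param.name
    return None


def no_args() -> str:
    return "ok"


def needs_length(length: int, prefix: str = "") -> str:
    return prefix + "x" * length


print(first_required_parameter(no_args))
print(first_required_parameter(needs_length))
```

A provider would be accepted only when this helper returns `None`, which mirrors the "requires argument" error path in the handler.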
diff --git a/src/dbslice/mapping/static/index.html b/src/dbslice/mapping/static/index.html
new file mode 100644
index 0000000..d8b6b57
--- /dev/null
+++ b/src/dbslice/mapping/static/index.html
@@ -0,0 +1,591 @@
+<!DOCTYPE html>
+<html lang="en" class="h-full">
+<head>
+<meta charset="utf-8">
+<meta name="viewport" content="width=device-width, initial-scale=1">
+<title>dbslice — Column Mapping</title>
+<script src="https://cdn.tailwindcss.com"></script>
+<link rel="preconnect" href="https://fonts.googleapis.com">
+<link href="https://fonts.googleapis.com/css2?family=DM+Sans:ital,opsz,wght@0,9..40,300;0,9..40,400;0,9..40,500;0,9..40,600;1,9..40,400&family=JetBrains+Mono:wght@400;500&display=swap" rel="stylesheet">
+<script>
+tailwind.config = {
+  darkMode: 'class',
+  theme: {
+    extend: {
+      fontFamily: {
+        sans: ['DM Sans', 'system-ui', 'sans-serif'],
+        mono: ['JetBrains Mono', 'Consolas', 'monospace'],
+      },
+      colors: {
+        surface: { 950:'#0a0e17', 900:'#0f1420', 850:'#141a28', 800:'#1a2133', 700:'#243049', 600:'#2e3f5e' },
+        accent: { 400:'#2dd4bf', 500:'#14b8a6', 600:'#0d9488', 700:'#0f766e', 900:'#0c3b36' },
+      },
+      fontSize: { '2xs':'0.65rem' },
+    }
+  }
+}
+</script>
+<style>
+  .no-scrollbar::-webkit-scrollbar{display:none}
+  .no-scrollbar{-ms-overflow-style:none;scrollbar-width:none}
+  @keyframes enter{from{opacity:0;transform:translateY(6px)}to{opacity:1;transform:translateY(0)}}
+  @keyframes fadeIn{from{opacity:0}to{opacity:1}}
+  @keyframes pulse-dot{0%,100%{opacity:1}50%{opacity:.4}}
+  .animate-enter{animation:enter .25s ease-out both}
+  .stagger-1{animation-delay:30ms}.stagger-2{animation-delay:60ms}.stagger-3{animation-delay:90ms}
+  .stagger-4{animation-delay:120ms}.stagger-5{animation-delay:150ms}
+  /* YAML highlighting tokens */
+  .yk{color:#2dd4bf}.yv{color:#a5f3fc}.yc{color:#475569;font-style:italic}.yd{color:#fbbf24}
+  /* Subtle row hover glow */
+  .row-glow:hover{background:rgba(45,212,191,.03)}
+  /* Custom scrollbar for main content */
+  .custom-scroll::-webkit-scrollbar{width:6px}
+  .custom-scroll::-webkit-scrollbar-track{background:transparent}
+  .custom-scroll::-webkit-scrollbar-thumb{background:#243049;border-radius:3px}
+  .custom-scroll::-webkit-scrollbar-thumb:hover{background:#2e3f5e}
+  /* Focus ring */
+  *:focus-visible{outline:2px solid #2dd4bf;outline-offset:2px;border-radius:4px}
+  /* Select styling */
+  select{background-image:url("data:image/svg+xml,%3Csvg xmlns='http://www.w3.org/2000/svg' width='12' height='12' viewBox='0 0 12 12'%3E%3Cpath fill='%23475569' d='M3 5l3 3 3-3'/%3E%3C/svg%3E");background-repeat:no-repeat;background-position:right 8px center;-webkit-appearance:none;appearance:none;padding-right:24px}
+</style>
+</head>
+<body class="h-full bg-surface-950 text-gray-300 text-[13px] font-sans dark antialiased">
+
+<!-- ═══ HELP MODAL ═══ -->
+<div id="helpModal" class="hidden fixed inset-0 z-50 flex items-center justify-center animate-[fadeIn_.12s]" onclick="if(event.target===this)closeHelp()">
+  <div class="absolute inset-0 bg-black/70 backdrop-blur-sm"></div>
+  <div class="relative bg-surface-900 border border-surface-700 rounded-2xl shadow-2xl shadow-black/40 max-w-xl w-full mx-4 max-h-[85vh] overflow-y-auto custom-scroll animate-enter" role="dialog" aria-label="Help">
+    <div class="sticky top-0 z-10 flex items-center justify-between px-6 py-4 border-b border-surface-800 bg-surface-900/95 backdrop-blur-sm rounded-t-2xl">
+      <div>
+        <h2 class="text-[15px] font-semibold text-white">Getting Started</h2>
+        <p class="text-2xs text-gray-500 mt-0.5">5 steps to generate your anonymization config</p>
+      </div>
+      <button onclick="closeHelp()" class="h-7 w-7 flex items-center justify-center rounded-lg text-gray-500 hover:text-white hover:bg-surface-800 transition" aria-label="Close help">
+        <svg class="h-4 w-4" fill="none" stroke="currentColor" stroke-width="2" viewBox="0 0 24 24"><path stroke-linecap="round" d="M6 18L18 6M6 6l12 12"/></svg>
+      </button>
+    </div>
+    <div class="px-6 py-5 space-y-5">
+      <div class="flex gap-3 animate-enter stagger-1">
+        <span class="flex-shrink-0 h-6 w-6 rounded-full bg-accent-900 border border-accent-700 flex items-center justify-center text-2xs font-mono font-medium text-accent-400">1</span>
+        <div>
+          <h3 class="text-[13px] font-medium text-white mb-1">Connect to your database</h3>
+          <p class="text-xs text-gray-400 leading-relaxed">Enter your PostgreSQL URL and click <strong class="text-gray-300">Introspect Schema</strong>. Only metadata is read — no row data is accessed.</p>
+        </div>
+      </div>
+      <div class="flex gap-3 animate-enter stagger-2">
+        <span class="flex-shrink-0 h-6 w-6 rounded-full bg-accent-900 border border-accent-700 flex items-center justify-center text-2xs font-mono font-medium text-accent-400">2</span>
+        <div>
+          <h3 class="text-[13px] font-medium text-white mb-1">Apply a compliance profile</h3>
+          <p class="text-xs text-gray-400 leading-relaxed">Click <strong class="text-gray-300">GDPR</strong>, <strong class="text-gray-300">HIPAA</strong>, or <strong class="text-gray-300">PCI-DSS</strong> to auto-map columns matching the profile's rules. You can apply multiple and override any suggestion.</p>
+        </div>
+      </div>
+      <div class="flex gap-3 animate-enter stagger-3">
+        <span class="flex-shrink-0 h-6 w-6 rounded-full bg-accent-900 border border-accent-700 flex items-center justify-center text-2xs font-mono font-medium text-accent-400">3</span>
+        <div>
+          <h3 class="text-[13px] font-medium text-white mb-1">Review columns</h3>
+          <p class="text-xs text-gray-400 leading-relaxed">For each column, choose:</p>
+          <div class="mt-2 space-y-1.5 text-xs">
+            <div class="flex items-center gap-2"><span class="h-1.5 w-1.5 rounded-full bg-gray-500"></span><strong class="text-gray-300">Keep</strong> <span class="text-gray-500">— passes through unchanged</span></div>
+            <div class="flex items-center gap-2"><span class="h-1.5 w-1.5 rounded-full bg-accent-500"></span><strong class="text-gray-300">Anonymize</strong> <span class="text-gray-500">— replaced with fake data</span></div>
+            <div class="flex items-center gap-2"><span class="h-1.5 w-1.5 rounded-full bg-amber-500"></span><strong class="text-gray-300">NULL</strong> <span class="text-gray-500">— set to NULL (passwords, tokens)</span></div>
+          </div>
+        </div>
+      </div>
+      <div class="flex gap-3 animate-enter stagger-4">
+        <span class="flex-shrink-0 h-6 w-6 rounded-full bg-accent-900 border border-accent-700 flex items-center justify-center text-2xs font-mono font-medium text-accent-400">4</span>
+        <div>
+          <h3 class="text-[13px] font-medium text-white mb-1">Choose providers</h3>
+          <p class="text-xs text-gray-400 leading-relaxed">When you select Anonymize, pick a provider from the dropdown. Each provider generates a specific type of fake data:</p>
+          <div class="grid grid-cols-2 gap-x-6 gap-y-0.5 mt-2 text-xs">
+            <span><code class="text-accent-400 font-mono">email</code> <span class="text-gray-600">fake email</span></span>
+            <span><code class="text-accent-400 font-mono">name</code> <span class="text-gray-600">full name</span></span>
+            <span><code class="text-accent-400 font-mono">phone_number</code> <span class="text-gray-600">phone</span></span>
+            <span><code class="text-accent-400 font-mono">ssn</code> <span class="text-gray-600">SSN</span></span>
+            <span><code class="text-accent-400 font-mono">ipv4</code> <span class="text-gray-600">IP address</span></span>
+            <span><code class="text-accent-400 font-mono">year_only</code> <span class="text-gray-600">HIPAA date</span></span>
+            <span><code class="text-accent-400 font-mono">hipaa_zip3</code> <span class="text-gray-600">Safe Harbor ZIP</span></span>
+            <span><code class="text-accent-400 font-mono">credit_card_number</code> <span class="text-gray-600">PAN</span></span>
+          </div>
+        </div>
+      </div>
+      <div class="flex gap-3 animate-enter stagger-5">
+        <span class="flex-shrink-0 h-6 w-6 rounded-full bg-accent-900 border border-accent-700 flex items-center justify-center text-2xs font-mono font-medium text-accent-400">5</span>
+        <div>
+          <h3 class="text-[13px] font-medium text-white mb-1">Generate &amp; export</h3>
+          <p class="text-xs text-gray-400 leading-relaxed">Click <strong class="text-gray-300">Generate Config</strong> to create a <code class="bg-surface-800 text-accent-400 px-1 py-0.5 rounded font-mono text-2xs">dbslice.yaml</code>. Download and use:</p>
+          <div class="mt-2 bg-surface-950 rounded-lg px-3 py-2 font-mono text-2xs text-gray-400 border border-surface-800">dbslice extract --config dbslice.yaml --seed "table.column=value"</div>
+        </div>
+      </div>
+      <div class="pt-2 border-t border-surface-800">
+        <p class="text-2xs text-gray-600"><kbd class="bg-surface-800 border border-surface-700 px-1.5 py-0.5 rounded text-gray-400">Ctrl+Z</kbd> / <kbd class="bg-surface-800 border border-surface-700 px-1.5 py-0.5 rounded text-gray-400">Cmd+Z</kbd> to undo &middot; <kbd class="bg-surface-800 border border-surface-700 px-1.5 py-0.5 rounded text-gray-400">Esc</kbd> to close modals</p>
+      </div>
+    </div>
+  </div>
+</div>
+
+<!-- ═══ HEADER ═══ -->
+<header class="flex items-center gap-3 px-5 h-12 border-b border-surface-800/80 bg-surface-950/90 backdrop-blur-md sticky top-0 z-30">
+  <div class="flex items-center gap-2.5">
+    <svg class="h-5 w-5 text-accent-500" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="1.5"><path d="M3.75 6A2.25 2.25 0 016 3.75h2.25A2.25 2.25 0 0110.5 6v2.25a2.25 2.25 0 01-2.25 2.25H6A2.25 2.25 0 013.75 8.25V6z"/><path d="M3.75 15.75A2.25 2.25 0 016 13.5h2.25a2.25 2.25 0 012.25 2.25V18a2.25 2.25 0 01-2.25 2.25H6A2.25 2.25 0 013.75 18v-2.25z" opacity=".5"/><path d="M13.5 6a2.25 2.25 0 012.25-2.25H18A2.25 2.25 0 0120.25 6v2.25A2.25 2.25 0 0118 10.5h-2.25A2.25 2.25 0 0113.5 8.25V6z" opacity=".5"/><path d="M13.5 15.75a2.25 2.25 0 012.25-2.25H18a2.25 2.25 0 012.25 2.25V18A2.25 2.25 0 0118 20.25h-2.25a2.25 2.25 0 01-2.25-2.25v-2.25z"/></svg>
+    <h1 class="text-[15px] font-semibold tracking-tight text-white">dbslice</h1>
+    <span class="rounded-md bg-accent-900/60 border border-accent-700/40 px-2 py-0.5 text-2xs font-medium text-accent-400 tracking-wide uppercase">Map</span>
+  </div>
+  <span class="flex-1"></span>
+  <span id="connBadge" class="hidden items-center gap-1.5 text-xs text-gray-400 mr-2">
+    <span class="h-1.5 w-1.5 rounded-full bg-emerald-400" style="animation:pulse-dot 2s infinite"></span>
+    <span id="connLabel" class="font-mono text-2xs">Connected</span>
+  </span>
+  <button onclick="openHelp()" class="h-7 px-2.5 flex items-center gap-1.5 rounded-lg border border-surface-700 text-xs text-gray-500 hover:text-gray-300 hover:border-surface-600 hover:bg-surface-900 transition-all" aria-label="Open help guide">
+    <svg class="h-3.5 w-3.5" fill="none" stroke="currentColor" stroke-width="2" viewBox="0 0 24 24"><circle cx="12" cy="12" r="10"/><path d="M9.09 9a3 3 0 015.83 1c0 2-3 3-3 3m.08 4h.01"/></svg>
+    Guide
+  </button>
+  <span class="text-2xs text-gray-700 font-mono">127.0.0.1</span>
+</header>
+
+<!-- ═══ LAYOUT ═══ -->
+<div class="flex h-[calc(100vh-48px)]">
+
+  <!-- Mobile sidebar toggle -->
+  <button id="sidebarToggle" class="lg:hidden fixed bottom-5 left-5 z-40 h-11 w-11 rounded-xl bg-accent-600 shadow-lg shadow-accent-600/20 text-white flex items-center justify-center" aria-label="Toggle sidebar" onclick="document.getElementById('sidebar').classList.toggle('-translate-x-full')">
+    <svg class="h-5 w-5" fill="none" stroke="currentColor" stroke-width="2" viewBox="0 0 24 24"><path stroke-linecap="round" d="M3.75 6.75h16.5M3.75 12h16.5m-16.5 5.25h16.5"/></svg>
+  </button>
+
+  <!-- ═══ SIDEBAR ═══ -->
+  <aside id="sidebar" class="fixed lg:static inset-y-12 left-0 z-20 w-[280px] flex flex-col border-r border-surface-800/80 bg-surface-950 transition-transform duration-200 lg:translate-x-0 -translate-x-full lg:flex-shrink-0">
+
+    <!-- Connection -->
+    <div class="p-4 border-b border-surface-800/60 space-y-2.5">
+      <h3 class="text-2xs font-semibold uppercase tracking-[.08em] text-gray-600">Connection</h3>
+      <div>
+        <label for="dbUrl" class="block text-2xs text-gray-500 mb-1">Database URL</label>
+        <input id="dbUrl" type="text" class="w-full rounded-lg border border-surface-700 bg-surface-900 px-3 py-[7px] text-xs font-mono text-gray-200 placeholder:text-gray-700 focus:border-accent-600 focus:ring-1 focus:ring-accent-600/30 outline-none transition" placeholder="postgresql://user:pass@host:5432/db" autocomplete="off" spellcheck="false">
+      </div>
+      <div>
+        <label for="dbSchema" class="block text-2xs text-gray-500 mb-1">Schema</label>
+        <input id="dbSchema" type="text" class="w-full rounded-lg border border-surface-700 bg-surface-900 px-3 py-[7px] text-xs font-mono text-gray-200 placeholder:text-gray-700 focus:border-accent-600 focus:ring-1 focus:ring-accent-600/30 outline-none transition" placeholder="public" value="public">
+      </div>
+      <button id="btnIntrospect" class="w-full rounded-lg bg-accent-600 hover:bg-accent-500 px-3 py-[7px] text-xs font-medium text-white disabled:opacity-40 disabled:cursor-not-allowed transition-colors" onclick="introspect()">Introspect Schema</button>
+    </div>
+
+    <!-- Compliance profiles -->
+    <div class="p-4 border-b border-surface-800/60">
+      <h3 class="text-2xs font-semibold uppercase tracking-[.08em] text-gray-600 mb-2.5">Compliance Profiles</h3>
+      <div id="profileChips" class="flex flex-wrap gap-1.5 mb-1.5"></div>
+      <p class="text-2xs text-gray-600">Click to apply suggested mappings</p>
+    </div>
+
+    <!-- Search -->
+    <div class="px-4 py-2.5 border-b border-surface-800/60">
+      <div class="relative">
+        <svg class="absolute left-2.5 top-1/2 -translate-y-1/2 h-3.5 w-3.5 text-gray-600" fill="none" stroke="currentColor" stroke-width="2" viewBox="0 0 24 24"><circle cx="11" cy="11" r="8"/><path d="m21 21-4.3-4.3"/></svg>
+        <input id="tableSearch" type="text" class="w-full rounded-lg border border-surface-700 bg-surface-900 pl-8 pr-3 py-[6px] text-xs text-gray-300 placeholder:text-gray-700 focus:border-accent-600 outline-none transition" placeholder="Search tables or columns..." oninput="renderTableList()">
+      </div>
+    </div>
+
+    <!-- Table list -->
+    <div id="tableList" class="flex-1 overflow-y-auto no-scrollbar"></div>
+  </aside>
+
+  <!-- ═══ MAIN CONTENT ═══ -->
+  <main class="flex-1 flex flex-col min-w-0 overflow-hidden">
+
+    <div id="mainContent" class="flex-1 overflow-y-auto custom-scroll">
+      <!-- Empty state -->
+      <div class="flex flex-col items-center justify-center h-full text-center px-6">
+        <div class="relative mb-6">
+          <div class="absolute inset-0 bg-accent-500/5 rounded-full blur-2xl scale-150"></div>
+          <svg class="relative h-16 w-16 text-surface-700" fill="none" stroke="currentColor" stroke-width=".75" viewBox="0 0 24 24">
+            <path stroke-linecap="round" d="M20.25 6.375c0 2.278-3.694 4.125-8.25 4.125S3.75 8.653 3.75 6.375m16.5 0c0-2.278-3.694-4.125-8.25-4.125S3.75 4.097 3.75 6.375m16.5 0v11.25c0 2.278-3.694 4.125-8.25 4.125s-8.25-1.847-8.25-4.125V6.375m16.5 0v3.75m-16.5-3.75v3.75m16.5 0v3.75C20.25 16.153 16.556 18 12 18s-8.25-1.847-8.25-4.125v-3.75m16.5 0c0 2.278-3.694 4.125-8.25 4.125s-8.25-1.847-8.25-4.125"/>
+          </svg>
+        </div>
+        <h2 class="text-lg font-medium text-gray-300 mb-1.5">Connect to a database</h2>
+        <p class="text-xs text-gray-600 mb-4 max-w-xs">Enter your PostgreSQL connection URL in the sidebar and introspect the schema to begin mapping columns.</p>
+        <button onclick="openHelp()" class="text-xs text-accent-400 hover:text-accent-300 transition font-medium">Read the guide</button>
+      </div>
+    </div>
+
+    <!-- ═══ BOTTOM BAR ═══ -->
+    <div id="bottomBar" class="hidden border-t border-surface-800/80 bg-surface-950/95 backdrop-blur-md">
+      <div class="flex items-center h-10 px-4 gap-0.5">
+        <button class="summary-tab h-full px-3 text-xs flex items-center gap-1.5 border-b-2 border-transparent transition-colors" data-panel="mapped" onclick="toggleSummary('mapped')">
+          <span class="h-2 w-2 rounded-full bg-accent-500"></span><span id="statMasked" class="font-mono">0</span> masked
+        </button>
+        <button class="summary-tab h-full px-3 text-xs flex items-center gap-1.5 border-b-2 border-transparent transition-colors" data-panel="nulled" onclick="toggleSummary('nulled')">
+          <span class="h-2 w-2 rounded-full bg-amber-500"></span><span id="statNulled" class="font-mono">0</span> nulled
+        </button>
+        <button class="summary-tab h-full px-3 text-xs flex items-center gap-1.5 border-b-2 border-transparent transition-colors" data-panel="kept" onclick="toggleSummary('kept')">
+          <span class="h-2 w-2 rounded-full bg-rose-400/70"></span><span id="statKept" class="font-mono">0</span> kept
+        </button>
+        <span class="flex-1"></span>
+        <button class="summary-tab h-full px-3 text-xs flex items-center gap-1.5 border-b-2 border-transparent text-gray-500 hover:text-gray-300 transition-colors" data-panel="preview" onclick="toggleSummary('preview')">
+          <svg class="h-3 w-3" fill="none" stroke="currentColor" stroke-width="2" viewBox="0 0 24 24"><path d="M17.25 6.75L22.5 12l-5.25 5.25m-10.5 0L1.5 12l5.25-5.25m7.5-3l-4.5 16.5"/></svg>
+          YAML
+        </button>
+        <button id="btnGenerate" disabled class="ml-2 h-7 rounded-lg bg-accent-600 hover:bg-accent-500 px-4 text-xs font-medium text-white disabled:opacity-30 disabled:cursor-not-allowed transition-colors" onclick="generateConfig()">Generate Config</button>
+      </div>
+      <div id="summaryPanel" class="hidden border-t border-surface-800/60 bg-surface-900/40"></div>
+    </div>
+  </main>
+</div>
+
+<!-- Toast -->
+<div id="toastContainer" class="fixed bottom-5 right-5 z-50 space-y-2" aria-live="polite"></div>
+
+<script>
+const TOKEN="{{TOKEN}}";
+const HEADERS={"Content-Type":"application/json","X-DBSLICE-Token":TOKEN};
+const state={tables:[],selectedTable:null,mappings:{},profiles:[],commonProviders:[],undoStack:[],openPanel:null};
+
+const PROVIDER_INFO={
+  email:"Fake email address",phone_number:"Phone number",first_name:"First name",
+  last_name:"Last name",name:"Full name",address:"Street address",
+  street_address:"Street only",city:"City name",zipcode:"ZIP code",
+  ssn:"SSN (###-##-####)",credit_card_number:"Credit card (Luhn-valid)",
+  ipv4:"IPv4 address",ipv6:"IPv6 address",company:"Company name",
+  url:"URL",date_of_birth:"Date of birth",user_name:"Username",
+  passport_number:"Passport number",iban:"IBAN",
+  pystr:"Random string",random_int:"Random integer",
+  year_only:"Year only (HIPAA)",hipaa_zip3:"Safe Harbor ZIP3 (HIPAA)",
+  age_bucket:"Age 90+ bucket (HIPAA)",redact_freetext:"Inline PII redaction",
+  credit_card_expire:"Card expiry",credit_card_security_code:"CVV",
+};
+
+document.getElementById("dbUrl").value="{{INITIAL_URL}}";
+
+async function api(path,body){
+  const r=await fetch(path,{method:"POST",headers:HEADERS,body:JSON.stringify(body)});
+  const d=await r.json();if(!r.ok) throw new Error(d.error||"Request failed");return d;
+}
+
+function openHelp(){document.getElementById("helpModal").classList.remove("hidden")}
+function closeHelp(){document.getElementById("helpModal").classList.add("hidden")}
+
+async function introspect(){
+  const btn=document.getElementById("btnIntrospect");
+  btn.disabled=true;btn.innerHTML='<span class="inline-block h-3 w-3 border-2 border-white/30 border-t-white rounded-full animate-spin mr-1.5"></span>Connecting...';
+  try{
+    const r=await api("/api/introspect",{database_url:document.getElementById("dbUrl").value,schema:document.getElementById("dbSchema").value||null,detect_sensitive:true});
+    state.tables=r.tables;state.profiles=r.compliance_profiles;state.commonProviders=r.common_providers;state.mappings={};state.undoStack=[];
+    for(const t of r.tables) for(const c of t.columns){
+      const k=t.name+"."+c.name;
+      if(c.suggested_action==="anonymize") state.mappings[k]={action:"anonymize",provider:c.suggested_provider};
+      else if(c.suggested_action==="null") state.mappings[k]={action:"null",provider:""};
+    }
+    document.getElementById("connBadge").classList.replace("hidden","flex");
+    document.getElementById("connLabel").textContent=r.database+" · "+r.table_count+" tables";
+    document.getElementById("bottomBar").classList.remove("hidden");
+    document.getElementById("btnGenerate").disabled=false;
+    renderProfileChips();renderTableList();
+    if(state.tables.length>0){state.selectedTable=state.tables[0].name;renderColumns();}
+    updateStats();
+    toast(r.table_count+" tables introspected","success");
+  }catch(e){toast(e.message,"error")}
+  finally{btn.disabled=false;btn.textContent="Introspect Schema"}
+}
+
+function renderProfileChips(){
+  document.getElementById("profileChips").innerHTML=state.profiles.map(p=>
+    `<button class="group rounded-lg border border-surface-700 px-2.5 py-1 text-2xs font-medium text-gray-400 hover:border-accent-600 hover:text-accent-400 transition-all data-[active]:bg-accent-900/50 data-[active]:border-accent-600 data-[active]:text-accent-400" data-profile="${p.name}" onclick="applyProfile('${p.name}',this)" aria-label="Apply ${p.display_name} profile">${p.display_name}</button>`
+  ).join("");
+}
+
+async function applyProfile(name,el){
+  try{
+    const r=await api("/api/apply-profile",{profile:name,current_mappings:state.mappings});
+    pushUndo();
+    for(const[f,p] of Object.entries(r.field_additions)) if(!state.mappings[f]) state.mappings[f]={action:"anonymize",provider:p};
+    for(const f of r.null_additions) if(!state.mappings[f]) state.mappings[f]={action:"null",provider:""};
+    el.toggleAttribute("data-active");
+    renderColumns();renderTableList();updateStats();refreshOpenPanel();
+    toast(`${r.display_name}: +${Object.keys(r.field_additions).length} masked, +${r.null_additions.length} nulled`,"success");
+  }catch(e){toast(e.message,"error")}
+}
+
+function renderTableList(){
+  const q=document.getElementById("tableSearch").value.toLowerCase();
+  const el=document.getElementById("tableList");
+  const filtered=state.tables.filter(t=>t.name.toLowerCase().includes(q)||t.columns.some(c=>c.name.toLowerCase().includes(q)));
+  el.innerHTML=filtered.map(t=>{
+    const total=t.columns.filter(c=>!c.is_pk&&!c.is_fk).length;
+    const mapped=t.columns.filter(c=>{if(c.is_pk||c.is_fk) return false;const k=t.name+"."+c.name;const m=state.mappings[k];return m&&m.action!=="keep"}).length;
+    const pct=total>0?Math.round(mapped/total*100):100;
+    const active=state.selectedTable===t.name;
+    const barColor=pct===100?"bg-accent-500":pct>0?"bg-amber-500":"bg-surface-700";
+    return `<button class="group w-full flex items-center gap-2.5 px-4 py-2 text-left transition-colors ${active?"bg-surface-900":"hover:bg-surface-900/50"}" onclick="selectTable('${t.name}')" aria-label="${t.name}">
+      <div class="flex-1 min-w-0">
+        <span class="block text-xs font-medium ${active?"text-white":"text-gray-300 group-hover:text-white"} truncate transition-colors">${t.name}</span>
+        <div class="flex items-center gap-2 mt-1">
+          <div class="flex-1 h-[3px] rounded-full bg-surface-800 overflow-hidden"><div class="h-full ${barColor} rounded-full transition-all" style="width:${pct}%"></div></div>
+          <span class="text-2xs text-gray-600 font-mono tabular-nums">${mapped}/${total}</span>
+        </div>
+      </div>
+    </button>`;
+  }).join("");
+}
+
+function selectTable(name){
+  state.selectedTable=name;renderTableList();renderColumns();
+  document.getElementById("sidebar").classList.add("-translate-x-full");
+}
+
+function renderColumns(){
+  const table=state.tables.find(t=>t.name===state.selectedTable);
+  if(!table) return;
+  const main=document.getElementById("mainContent");
+
+  let h=`<div class="px-6 pt-5 pb-4">
+    <div class="flex items-center gap-3 mb-5 flex-wrap">
+      <h2 class="text-base font-semibold text-white">${table.name}</h2>
+      <span class="text-2xs text-gray-600 font-mono">${table.columns.length} columns</span>
+      <span class="flex-1"></span>
+      <div class="flex gap-1.5">
+        <button class="h-6 px-2 rounded-md text-2xs border border-surface-700 text-gray-500 hover:text-accent-400 hover:border-accent-700 transition-all" onclick="bulkAction('${table.name}','anonymize')">Anonymize all</button>
+        <button class="h-6 px-2 rounded-md text-2xs border border-surface-700 text-gray-500 hover:text-amber-400 hover:border-amber-800 transition-all" onclick="bulkAction('${table.name}','null')">NULL all</button>
+        <button class="h-6 px-2 rounded-md text-2xs border border-surface-700 text-gray-500 hover:text-gray-300 transition-all" onclick="bulkAction('${table.name}','keep')">Reset</button>
+      </div>
+    </div>`;
+
+  // Responsive: table on desktop, cards on mobile
+  h+=`<div class="hidden sm:grid grid-cols-[minmax(130px,1.2fr)_90px_130px_1fr] gap-x-3 px-3 pb-2 text-2xs font-semibold uppercase tracking-[.06em] text-gray-600">
+    <div>Column</div><div>Type</div><div>Action</div><div>Provider</div>
+  </div>`;
+
+  for(let idx=0;idx<table.columns.length;idx++){
+    const c=table.columns[idx];
+    const k=table.name+"."+c.name;
+    const m=state.mappings[k]||{action:"keep",provider:""};
+    const locked=c.is_pk||c.is_fk;
+    const lockLabel=c.is_pk?"PK":"FK";
+
+    const leftBorder=m.action==="anonymize"?"border-l-accent-500":m.action==="null"?"border-l-amber-500":"border-l-transparent";
+    const bg=m.action==="anonymize"?"bg-accent-500/[.03]":m.action==="null"?"bg-amber-500/[.03]":"";
+
+    // Desktop row
+    h+=`<div class="hidden sm:grid grid-cols-[minmax(130px,1.2fr)_90px_130px_1fr] gap-x-3 items-center px-3 py-2 border-b border-surface-800/30 border-l-2 ${leftBorder} ${bg} ${locked?"opacity-35":"row-glow"} animate-enter" style="animation-delay:${Math.min(idx*15,300)}ms" role="row">`;
+    h+=`<div class="font-mono text-xs ${locked?"text-gray-500":"text-gray-300"} truncate" title="${c.name}">${c.name}${locked?` <span class="text-2xs text-gray-600">${lockLabel}</span>`:""}</div>`;
+    h+=`<div class="text-2xs text-gray-600 truncate font-mono" title="${c.data_type}${c.nullable?"":" NOT NULL"}">${c.data_type}${c.nullable?"":'<span class="text-amber-600 ml-0.5">!</span>'}</div>`;
+
+    if(locked){
+      h+=`<div class="text-2xs text-gray-600">${c.is_pk?"Primary Key":"Foreign Key"}</div><div></div>`;
+    } else {
+      const sel=v=>m.action===v?"selected":"";
+      h+=`<div><select onchange="setAction('${k}',this.value)" class="w-full h-7 rounded-md border border-surface-700 bg-surface-900 pl-2 text-xs text-gray-300 focus:border-accent-600 outline-none cursor-pointer transition" aria-label="Action for ${c.name}">
+        <option value="keep" ${sel("keep")}>Keep</option>
+        <option value="anonymize" ${sel("anonymize")}>Anonymize</option>
+        <option value="null" ${sel("null")}>NULL</option>
+      </select></div>`;
+
+      if(m.action==="anonymize"){
+        h+=`<div><select onchange="setProvider('${k}',this.value)" class="w-full h-7 rounded-md border border-surface-700 bg-surface-900 pl-2 text-xs font-mono text-gray-300 focus:border-accent-600 outline-none cursor-pointer transition" aria-label="Provider for ${c.name}">`;
+        for(const p of state.commonProviders){
+          h+=`<option value="${p}" ${m.provider===p?"selected":""}>${p} — ${PROVIDER_INFO[p]||p}</option>`;
+        }
+        if(m.provider&&!state.commonProviders.includes(m.provider)){
+          h+=`<option value="${m.provider}" selected>${m.provider} (custom)</option>`;
+        }
+        h+=`</select></div>`;
+      } else {
+        h+=`<div></div>`;
+      }
+    }
+    h+=`</div>`;
+
+    // Mobile card
+    h+=`<div class="sm:hidden border-b border-surface-800/30 border-l-2 ${leftBorder} ${bg} px-3 py-3 ${locked?"opacity-35":""}">`;
+    h+=`<div class="flex items-center justify-between mb-2"><span class="font-mono text-xs text-gray-300">${c.name}</span><span class="text-2xs text-gray-600 font-mono">${c.data_type}${locked?" · "+lockLabel:""}</span></div>`;
+    if(!locked){
+      const sel=v=>m.action===v?"selected":"";
+      h+=`<div class="flex gap-2"><select onchange="setAction('${k}',this.value)" class="flex-1 h-8 rounded-md border border-surface-700 bg-surface-900 pl-2 text-xs text-gray-300 focus:border-accent-600 outline-none" aria-label="Action">
+        <option value="keep" ${sel("keep")}>Keep</option><option value="anonymize" ${sel("anonymize")}>Anonymize</option><option value="null" ${sel("null")}>NULL</option>
+      </select>`;
+      if(m.action==="anonymize"){
+        h+=`<select onchange="setProvider('${k}',this.value)" class="flex-1 h-8 rounded-md border border-surface-700 bg-surface-900 pl-2 text-xs font-mono text-gray-300 focus:border-accent-600 outline-none" aria-label="Provider">`;
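+        for(const p of state.commonProviders) h+=`<option value="${p}" ${m.provider===p?"selected":""}>${p}</option>`;if(m.provider&&!state.commonProviders.includes(m.provider)) h+=`<option value="${m.provider}" selected>${m.provider} (custom)</option>`;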
+        h+=`</select>`;
+      }
+      h+=`</div>`;
+    }
+    h+=`</div>`;
+  }
+
+  h+=`</div>`;
+  main.innerHTML=h;
+}
+
+function setAction(k,action){
+  pushUndo();
+  if(action==="keep") delete state.mappings[k];
+  else if(action==="anonymize") state.mappings[k]={action:"anonymize",provider:state.mappings[k]?.provider||"pystr"};
+  else state.mappings[k]={action:"null",provider:""};
+  renderColumns();updateStats();renderTableList();refreshOpenPanel();
+}
+
+function setProvider(k,provider){if(state.mappings[k]){pushUndo();state.mappings[k].provider=provider}refreshOpenPanel();}
+
+function bulkAction(tn,action){
+  pushUndo();
+  const table=state.tables.find(t=>t.name===tn);if(!table) return;
+  for(const c of table.columns){
+    if(c.is_pk||c.is_fk) continue;
+    const k=tn+"."+c.name;
+    if(action==="keep") delete state.mappings[k];
+    else if(action==="anonymize") state.mappings[k]={action:"anonymize",provider:state.mappings[k]?.provider||"pystr"};
+    else state.mappings[k]={action:"null",provider:""};
+  }
+  renderColumns();updateStats();renderTableList();refreshOpenPanel();
+}
+
+function pushUndo(){state.undoStack.push(JSON.stringify(state.mappings));if(state.undoStack.length>50)state.undoStack.shift()}
+document.addEventListener("keydown",e=>{
+  if(e.key==="Escape") closeHelp();
+  if((e.metaKey||e.ctrlKey)&&e.key==="z"&&!/^(INPUT|TEXTAREA)$/.test(e.target.tagName)){
+    e.preventDefault();
+    if(state.undoStack.length>0){state.mappings=JSON.parse(state.undoStack.pop());renderColumns();updateStats();renderTableList();refreshOpenPanel();toast("Undone","success")}
+  }
+});
+
+function updateStats(){
+  let masked=0,nulled=0,kept=0;
+  for(const t of state.tables) for(const c of t.columns){
+    if(c.is_pk||c.is_fk) continue;
+    const k=t.name+"."+c.name,m=state.mappings[k];
+    if(m&&m.action==="anonymize") masked++;else if(m&&m.action==="null") nulled++;else kept++;
+  }
+  document.getElementById("statMasked").textContent=masked;
+  document.getElementById("statNulled").textContent=nulled;
+  document.getElementById("statKept").textContent=kept;
+}
+
+function toggleSummary(panel){
+  const sp=document.getElementById("summaryPanel");
+  if(state.openPanel===panel){state.openPanel=null;sp.classList.add("hidden");highlightTabs(null);return}
+  state.openPanel=panel;showSummaryContent(panel);sp.classList.remove("hidden");highlightTabs(panel);
+}
+
+function refreshOpenPanel(){if(state.openPanel) showSummaryContent(state.openPanel)}
+
+function highlightTabs(active){
+  document.querySelectorAll(".summary-tab").forEach(t=>{
+    const a=t.dataset.panel===active;
+    t.classList.toggle("border-accent-500",a);t.classList.toggle("text-white",a);
+    t.classList.toggle("border-transparent",!a);
+  });
+}
+
+function showSummaryContent(panel){
+  const el=document.getElementById("summaryPanel");
+
+  if(panel==="preview"){
+    const yaml=buildYaml();
+    el.innerHTML=`<div class="p-4 max-h-64 overflow-y-auto custom-scroll"><pre class="text-xs font-mono leading-relaxed">${highlightYaml(yaml)}</pre></div>`;
+    return;
+  }
+
+  const items=[];
+  for(const t of state.tables) for(const c of t.columns){
+    if(c.is_pk||c.is_fk) continue;
+    const k=t.name+"."+c.name,m=state.mappings[k],act=m?.action||"keep";
+    if(panel==="mapped"&&act==="anonymize") items.push({table:t.name,col:c.name,detail:m.provider});
+    else if(panel==="nulled"&&act==="null") items.push({table:t.name,col:c.name,detail:"NULL"});
+    else if(panel==="kept"&&act==="keep") items.push({table:t.name,col:c.name,detail:""});
+  }
+
+  if(!items.length){el.innerHTML=`<p class="text-gray-600 text-xs p-4">No columns in this category</p>`;return}
+
+  const groups={};
+  for(const i of items)(groups[i.table]=groups[i.table]||[]).push(i);
+
+  const colorMap={mapped:"accent",nulled:"amber",kept:"gray"};
+  const c=colorMap[panel]||"gray";
+
+  let h=`<div class="max-h-64 overflow-y-auto custom-scroll"><table class="w-full text-xs">`;
+  for(const[tbl,cols] of Object.entries(groups).sort(([a],[b])=>a.localeCompare(b))){
+    h+=`<tr><td colspan="3" class="pt-3 pb-1 px-4 text-2xs font-semibold text-gray-500 uppercase tracking-wide">${tbl}</td></tr>`;
+    for(const col of cols){
+      h+=`<tr class="hover:bg-surface-800/30 cursor-pointer" onclick="selectTable('${tbl}')">
+        <td class="py-1 pl-6 pr-2 font-mono text-${c}-400">${col.col}</td>
+        <td class="py-1 pr-4 text-gray-600 text-right">${col.detail}</td>
+      </tr>`;
+    }
+  }
+  h+=`</table></div>`;
+  el.innerHTML=h;
+}
+
+function buildYaml(){
+  const fields={},nulls=[];
+  for(const[k,v] of Object.entries(state.mappings)){
+    if(v.action==="anonymize") fields[k]=v.provider;
+    else if(v.action==="null") nulls.push(k);
+  }
+  let y="anonymization:\n  enabled: true\n";
+  if(Object.keys(fields).length){y+="  fields:\n";for(const[k,v] of Object.entries(fields).sort()) y+=`    ${k}: ${v}\n`}
+  if(nulls.length){y+="  security_null_fields:\n";for(const n of nulls.sort()) y+=`    - ${n}\n`}
+  return y;
+}
+
+function highlightYaml(yaml){
+  return yaml
+    .replace(/^(\s*#.*)$/gm,'<span class="yc">$1</span>')
+    .replace(/: (.+)$/gm,': <span class="yv">$1</span>')
+    .replace(/^(\s*[\w._]+)(:)/gm,'<span class="yk">$1</span><span class="text-gray-600">$2</span>')
+    .replace(/^(\s*- )(.+)$/gm,'$1<span class="yd">$2</span>');
+}
+
+async function generateConfig(){
+  try{
+    const r=await api("/api/generate-config",{mappings:state.mappings});
+    const main=document.getElementById("mainContent");
+    main.innerHTML=`<div class="max-w-2xl mx-auto px-6 py-8 animate-enter">
+      <div class="flex items-center gap-3 mb-6">
+        <div class="h-9 w-9 rounded-xl bg-accent-900/50 border border-accent-700/30 flex items-center justify-center">
+          <svg class="h-4 w-4 text-accent-400" fill="none" stroke="currentColor" stroke-width="2" viewBox="0 0 24 24"><path d="M9 12.75L11.25 15 15 9.75M21 12a9 9 0 11-18 0 9 9 0 0118 0z"/></svg>
+        </div>
+        <div>
+          <h2 class="text-base font-semibold text-white">Configuration Ready</h2>
+          <p class="text-2xs text-gray-500">${r.field_count} fields masked &middot; ${r.null_count} fields nulled</p>
+        </div>
+      </div>
+
+      <div class="rounded-xl border border-surface-700 bg-surface-900 overflow-hidden">
+        <div class="flex items-center justify-between px-4 py-2 border-b border-surface-800 bg-surface-850">
+          <span class="text-2xs font-mono text-gray-500">dbslice.yaml</span>
+          <button class="text-2xs text-accent-400 hover:text-accent-300 transition font-medium" onclick="navigator.clipboard.writeText(document.getElementById('yamlRaw').textContent);toast('Copied to clipboard','success')">Copy</button>
+        </div>
+        <pre class="p-4 text-xs font-mono leading-[1.7] overflow-x-auto custom-scroll">${highlightYaml(r.yaml)}</pre>
+        <pre id="yamlRaw" class="hidden">${r.yaml}</pre>
+      </div>
+
+      <div class="flex gap-2 mt-4">
+        <button class="h-8 rounded-lg bg-accent-600 hover:bg-accent-500 px-4 text-xs font-medium text-white transition-colors" onclick="downloadYaml()">Download dbslice.yaml</button>
+        <button class="h-8 rounded-lg border border-surface-700 px-4 text-xs text-gray-400 hover:text-gray-200 hover:bg-surface-900 transition-all" onclick="renderColumns()">Back to mapping</button>
+      </div>
+
+      <div class="mt-8">
+        <h3 class="text-2xs font-semibold uppercase tracking-[.06em] text-gray-600 mb-2">Next step</h3>
+        <div class="rounded-xl bg-surface-900 border border-surface-700 px-4 py-3 font-mono text-xs text-accent-400 select-all leading-relaxed">${r.command_template}</div>
+      </div>
+    </div>`;
+    toast("Config generated","success");
+  }catch(e){toast(e.message,"error")}
+}
+
+function downloadYaml(){
+  const raw=document.getElementById("yamlRaw").textContent;
+  const a=document.createElement("a");
+  a.href=URL.createObjectURL(new Blob([raw],{type:"text/yaml"}));
+  a.download="dbslice.yaml";a.click();
+}
+
+function toast(msg,type){
+  const el=document.createElement("div");
+  const isOk=type==="success";
+  el.className=`flex items-center gap-2 rounded-xl px-4 py-2.5 text-xs font-medium shadow-xl shadow-black/20 animate-enter border ${isOk?"bg-accent-900/90 text-accent-300 border-accent-700/40":"bg-rose-950/90 text-rose-300 border-rose-800/40"}`;
+  el.innerHTML=`<svg class="h-3.5 w-3.5 flex-shrink-0" fill="none" stroke="currentColor" stroke-width="2" viewBox="0 0 24 24">${isOk?'<path d="M9 12.75L11.25 15 15 9.75M21 12a9 9 0 11-18 0 9 9 0 0118 0z"/>':'<path d="M12 9v3.75m9-.75a9 9 0 11-18 0 9 9 0 0118 0zm-9 3.75h.008v.008H12v-.008z"/>'}</svg>${msg}`;
+  el.setAttribute("role","status");
+  document.getElementById("toastContainer").appendChild(el);
+  setTimeout(()=>{el.style.opacity="0";el.style.transition="opacity .2s";setTimeout(()=>el.remove(),200)},3000);
+}
+</script>
+</body>
+</html>
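Review note: the YAML preview tab is assembled client-side by `buildYaml()`, while the downloadable file comes from `/api/generate-config`. As a reading aid, the sketch below mirrors the client-side logic in Python so the expected shape of the `anonymization` block is easy to see. The function name `build_yaml` and the sample mapping keys are hypothetical, and the server-side generator may emit additional keys that are not shown here.

```python
def build_yaml(mappings: dict[str, dict[str, str]]) -> str:
    """Mirror of the UI's buildYaml(): mappings are keyed by 'table.column'."""
    # Columns marked "anonymize" map to a provider name.
    fields = {k: v["provider"] for k, v in mappings.items() if v["action"] == "anonymize"}
    # Columns marked "null" are listed under security_null_fields.
    nulls = [k for k, v in mappings.items() if v["action"] == "null"]

    lines = ["anonymization:", "  enabled: true"]
    if fields:
        lines.append("  fields:")
        lines += [f"    {key}: {fields[key]}" for key in sorted(fields)]
    if nulls:
        lines.append("  security_null_fields:")
        lines += [f"    - {key}" for key in sorted(nulls)]
    return "\n".join(lines) + "\n"


# Hypothetical mapping state, shaped like state.mappings in the UI.
example = {
    "users.email": {"action": "anonymize", "provider": "email"},
    "users.notes": {"action": "null", "provider": ""},
}
print(build_yaml(example))
```

Columns left on "Keep" never appear in the mapping state, so they are absent from the output as well.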
diff --git a/src/dbslice/mapping/ui.py b/src/dbslice/mapping/ui.py
new file mode 100644
index 0000000..1d4b80b
--- /dev/null
+++ b/src/dbslice/mapping/ui.py
@@ -0,0 +1,18 @@
+from pathlib import Path
+
+_STATIC_DIR = Path(__file__).parent / "static"
+_TEMPLATE_CACHE: str | None = None
+
+
+def get_ui_html(token: str, initial_url: str) -> str:
+    """Return the complete HTML page with token and initial URL embedded."""
+    global _TEMPLATE_CACHE  # noqa: PLW0603
+    if _TEMPLATE_CACHE is None:
+        template_path = _STATIC_DIR / "index.html"
+        _TEMPLATE_CACHE = template_path.read_text(encoding="utf-8")
+
+    return (
+        _TEMPLATE_CACHE
+        .replace("{{TOKEN}}", token)
+        .replace("{{INITIAL_URL}}", initial_url.replace("\\", "\\\\").replace('"', '\\"'))
+    )
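Review note: `get_ui_html` substitutes its arguments into double-quoted JavaScript string literals inside an inline `<script>`, so a database URL containing a quote, a backslash, or `</script>` could change how the page parses. The sketch below shows one possible way to escape such values before substitution. The helper names `js_string_body` and `render` are hypothetical and are not part of the PR. It relies on `json.dumps` producing an escaped string that is also valid inside a JavaScript string literal.

```python
import json


def js_string_body(value: str) -> str:
    """Escape value for use inside a double-quoted JS string in an inline <script>."""
    # json.dumps escapes quotes, backslashes and control characters;
    # slicing drops the surrounding quotes it adds, since the template supplies its own.
    escaped = json.dumps(value)[1:-1]
    # Prevent a literal "</script>" in the value from closing the script element.
    return escaped.replace("</", "<\\/")


def render(template: str, token: str, initial_url: str) -> str:
    """Fill the two placeholders with escaped values (illustrative only)."""
    return (
        template
        .replace("{{TOKEN}}", js_string_body(token))
        .replace("{{INITIAL_URL}}", js_string_body(initial_url))
    )


page = render('const U="{{INITIAL_URL}}";', "tok", 'postgres://u:p@h/db";alert(1);//')
print(page)
```

With this approach the quote in the URL is emitted as `\"`, so the value stays inside the string literal instead of being interpreted as script.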
diff --git a/src/dbslice/utils/anonymizer.py b/src/dbslice/utils/anonymizer.py
index 7fe836e..afbfddd 100644
--- a/src/dbslice/utils/anonymizer.py
+++ b/src/dbslice/utils/anonymizer.py
@@ -1,4 +1,5 @@
 import hashlib
+import secrets
 from fnmatch import fnmatchcase
 from typing import TYPE_CHECKING, Any
 
@@ -16,6 +17,7 @@
     Faker = None  # type: ignore
 
 if TYPE_CHECKING:
+    from dbslice.compliance.manifest import ComplianceManifest
     from dbslice.models import SchemaGraph
 
 
@@ -126,13 +128,21 @@ class DeterministicAnonymizer:
     tables or rows. Uses Faker with deterministic seeding based on input values.
     """
 
-    def __init__(self, seed: str = DEFAULT_ANONYMIZATION_SEED, schema: "SchemaGraph | None" = None):
+    def __init__(
+        self,
+        seed: str = DEFAULT_ANONYMIZATION_SEED,
+        schema: "SchemaGraph | None" = None,
+        deterministic: bool = True,
+        manifest: "ComplianceManifest | None" = None,
+    ):
         """
         Initialize the anonymizer with a global seed.
 
         Args:
             seed: Global seed for deterministic anonymization
             schema: Optional schema graph for FK detection (prevents anonymizing FK columns)
+            deterministic: If False, use random seeds per value (stronger privacy, no cross-table consistency)
+            manifest: Optional compliance manifest to record anonymization actions
 
         Raises:
             ImportError: If Faker is not installed
@@ -142,16 +152,21 @@ def __init__(self, seed: str = DEFAULT_ANONYMIZATION_SEED, schema: "SchemaGraph
                 "Faker is required for anonymization. Install it with: pip install faker"
             )
 
-        logger.info("Initializing anonymizer", seed=seed[:20] + "...")  # Truncate seed in logs
+        mode = "deterministic" if deterministic else "non-deterministic"
+        logger.info("Initializing anonymizer", seed=seed[:20] + "...", mode=mode)
         self.global_seed = seed
+        self.deterministic = deterministic
         self.fake = Faker()
         self._cache: dict[tuple, Any] = {}
         self.redact_fields: set[str] = set()  # Set of normalized "table.column"
         self.field_providers: dict[str, str] = {}
         self.custom_patterns: list[tuple[str, str]] = []
+        self.fallback_patterns: list[tuple[str, str]] = []
         self.security_null_fields: list[str] = []
         self.schema = schema
         self._fk_columns_cache: dict[str, set[str]] = {}  # Cache of FK columns per table
+        self.manifest = manifest
+        self._manifest_recorded: set[tuple[str, str]] = set()  # Track which fields we've recorded
 
     def _normalize_field(self, table: str, column: str) -> str:
         """Return normalized table.column field name for matching."""
@@ -161,9 +176,11 @@ def _match_glob(self, pattern: str, field: str) -> bool:
         """Case-insensitive shell-style glob match for table.column patterns."""
         return fnmatchcase(field, pattern.lower())
 
-    def _resolve_custom_pattern_provider(self, table: str, column: str) -> str | None:
+    def _resolve_pattern_provider(
+        self, table: str, column: str, patterns: list[tuple[str, str]]
+    ) -> str | None:
         """
-        Resolve provider from custom wildcard patterns.
+        Resolve provider from wildcard patterns.
 
         Resolution policy:
         - Most specific pattern wins (longest non-wildcard literal).
@@ -173,7 +190,7 @@ def _resolve_custom_pattern_provider(self, table: str, column: str) -> str | Non
         best_provider: str | None = None
         best_specificity = -1
 
-        for pattern, provider in self.custom_patterns:
+        for pattern, provider in patterns:
             if not self._match_glob(pattern, field):
                 continue
 
@@ -184,6 +201,14 @@ def _resolve_custom_pattern_provider(self, table: str, column: str) -> str | Non
 
         return best_provider
 
+    def _resolve_custom_pattern_provider(self, table: str, column: str) -> str | None:
+        """Resolve provider from user-defined wildcard patterns."""
+        return self._resolve_pattern_provider(table, column, self.custom_patterns)
+
+    def _resolve_fallback_pattern_provider(self, table: str, column: str) -> str | None:
+        """Resolve provider from fallback wildcard patterns (e.g., compliance profiles)."""
+        return self._resolve_pattern_provider(table, column, self.fallback_patterns)
+
     def _resolve_exact_field_provider(self, table: str, column: str) -> str | None:
         """Resolve provider from exact field mappings."""
         return self.field_providers.get(self._normalize_field(table, column))
@@ -192,9 +217,10 @@ def _resolve_faker_method(self, table: str, column: str) -> str:
         """
         Resolve faker method with precedence:
         1. Exact field provider mapping
-        2. Custom wildcard pattern mapping
-        3. Built-in column substring mapping
-        4. pystr fallback
+        2. User wildcard pattern mapping
+        3. Fallback wildcard pattern mapping
+        4. Built-in column substring mapping
+        5. pystr fallback
         """
         exact_provider = self._resolve_exact_field_provider(table, column)
         if exact_provider:
@@ -204,6 +230,10 @@ def _resolve_faker_method(self, table: str, column: str) -> str:
         if pattern_provider:
             return pattern_provider
 
+        fallback_pattern_provider = self._resolve_fallback_pattern_provider(table, column)
+        if fallback_pattern_provider:
+            return fallback_pattern_provider
+
         return self.get_faker_method(column)
 
     def configure(
@@ -211,6 +241,7 @@ def configure(
         redact_fields: list[str],
         field_providers: dict[str, str] | None = None,
         patterns: dict[str, str] | None = None,
+        fallback_patterns: dict[str, str] | None = None,
         security_null_fields: list[str] | None = None,
     ):
         """
@@ -219,7 +250,8 @@ def configure(
         Args:
             redact_fields: List of exact fields in "table.column" format.
             field_providers: Exact field to faker-provider mappings.
-            patterns: Wildcard table.column glob to faker-provider mappings.
+            patterns: User wildcard table.column glob to faker-provider mappings.
+            fallback_patterns: Lower-priority wildcard mappings (e.g., compliance profiles).
             security_null_fields: Wildcard table.column globs to force NULL.
         """
         self.redact_fields = {field.lower() for field in redact_fields}
@@ -229,13 +261,17 @@ def configure(
         self.custom_patterns = [
             (pattern.lower(), provider) for pattern, provider in (patterns or {}).items()
         ]
+        self.fallback_patterns = [
+            (pattern.lower(), provider) for pattern, provider in (fallback_patterns or {}).items()
+        ]
         self.security_null_fields = [pattern.lower() for pattern in (security_null_fields or [])]
 
         logger.info(
             "Anonymizer configured",
             redact_field_count=len(self.redact_fields),
             exact_provider_count=len(self.field_providers),
-            pattern_count=len(self.custom_patterns),
+            user_pattern_count=len(self.custom_patterns),
+            fallback_pattern_count=len(self.fallback_patterns),
             security_null_pattern_count=len(self.security_null_fields),
         )
 
@@ -297,6 +333,10 @@ def should_anonymize(self, table: str, column: str) -> bool:
         if self._resolve_custom_pattern_provider(table, column):
             return True
 
+        # Fallback wildcard patterns (e.g., compliance profiles)
+        if self._resolve_fallback_pattern_provider(table, column):
+            return True
+
         # Pattern matching on column name
         col_lower = column.lower()
         for pattern in _DEFAULT_ANONYMIZATION_PATTERNS:
@@ -380,34 +420,53 @@ def anonymize_value(self, value: Any, table: str, column: str) -> Any:
 
         # FK integrity has highest priority over nulling/anonymization rules.
         if self._is_foreign_key_column(table, column):
+            self._record_manifest_fk(table, column)
             return value
 
         if self.should_null(table, column):
+            self._record_manifest_null(table, column)
             return None
 
         if not self.should_anonymize(table, column):
+            self._record_manifest_unmasked(table, column)
             return value
 
         faker_method = self._resolve_faker_method(table, column)
-        cache_key = (str(value), column, faker_method)
-        if cache_key in self._cache:
-            return self._cache[cache_key]
-
-        # Generate deterministic seed from global seed + column/provider + original value
-        # Including column name ensures same value in different column types gets different output
-        hash_input = f"{self.global_seed}:{column}:{faker_method}:{value}".encode()
-        seed_int = int.from_bytes(hashlib.sha256(hash_input).digest()[:8], "big")
-
-        self.fake.seed_instance(seed_int)
-
-        try:
-            anonymized = getattr(self.fake, faker_method)()
-        except (AttributeError, TypeError):
-            # Fallback if Faker method doesn't exist or fails
-            anonymized = self.fake.pystr()
-
-        self._cache[cache_key] = anonymized
-        return anonymized
+        self._record_manifest_masked(table, column, faker_method)
+
+        # Check for custom compliance transformers first (these take the value as input)
+        custom_fn = self._get_custom_transformer(faker_method)
+        if custom_fn is not None:
+            return custom_fn(value)
+
+        if self.deterministic:
+            cache_key = (str(value), column, faker_method)
+            if cache_key in self._cache:
+                return self._cache[cache_key]
+
+            # Generate deterministic seed from global seed + column/provider + original value
+            # Including the column name makes the same value map to different output per column
+            hash_input = f"{self.global_seed}:{column}:{faker_method}:{value}".encode()
+            seed_int = int.from_bytes(hashlib.sha256(hash_input).digest()[:8], "big")
+            self.fake.seed_instance(seed_int)
+
+            try:
+                anonymized = getattr(self.fake, faker_method)()
+            except (AttributeError, TypeError):
+                anonymized = self.fake.pystr()
+
+            self._cache[cache_key] = anonymized
+            return anonymized
+        else:
+            seed_int = int.from_bytes(secrets.token_bytes(8), "big")
+            self.fake.seed_instance(seed_int)
+
+            try:
+                anonymized = getattr(self.fake, faker_method)()
+            except (AttributeError, TypeError):
+                anonymized = self.fake.pystr()
+
+            return anonymized
 
     def anonymize_row(self, table: str, row: dict[str, Any]) -> dict[str, Any]:
         """
@@ -451,5 +510,49 @@ def get_statistics(self) -> dict[str, int]:
             "redact_fields_count": len(self.redact_fields),
             "exact_provider_count": len(self.field_providers),
             "pattern_count": len(self.custom_patterns),
+            "fallback_pattern_count": len(self.fallback_patterns),
             "security_null_pattern_count": len(self.security_null_fields),
         }
+
+    @staticmethod
+    def _get_custom_transformer(method_name: str) -> Any | None:
+        """Look up a custom compliance transformer function by name."""
+        from dbslice.compliance.transformers import CUSTOM_TRANSFORMERS
+
+        return CUSTOM_TRANSFORMERS.get(method_name)
+
+    def _record_manifest_masked(self, table: str, column: str, method: str) -> None:
+        """Record a masked field in the manifest (once per table.column)."""
+        if not self.manifest:
+            return
+        key = (table, column)
+        if key not in self._manifest_recorded:
+            self._manifest_recorded.add(key)
+            self.manifest.record_masked_field(table, column, method)
+
+    def _record_manifest_null(self, table: str, column: str) -> None:
+        """Record a NULLed field in the manifest (once per table.column)."""
+        if not self.manifest:
+            return
+        key = (table, column)
+        if key not in self._manifest_recorded:
+            self._manifest_recorded.add(key)
+            self.manifest.record_nulled_field(table, column, "security_null_pattern")
+
+    def _record_manifest_fk(self, table: str, column: str) -> None:
+        """Record a preserved FK field in the manifest (once per table.column)."""
+        if not self.manifest:
+            return
+        key = (table, column)
+        if key not in self._manifest_recorded:
+            self._manifest_recorded.add(key)
+            self.manifest.record_fk_preserved(table, column)
+
+    def _record_manifest_unmasked(self, table: str, column: str) -> None:
+        """Record an unmasked field in the manifest (once per table.column)."""
+        if not self.manifest:
+            return
+        key = (table, column)
+        if key not in self._manifest_recorded:
+            self._manifest_recorded.add(key)
+            self.manifest.record_unmasked_field(table, column)
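Review note: the two branches in `anonymize_value` differ only in where the Faker seed comes from. Below is a minimal sketch of that seed derivation, assuming the same hash input layout as the diff. `derive_seed` and `fake_token` are hypothetical names used for illustration, and `random.Random` stands in for the seeded Faker call so the snippet runs without Faker installed.

```python
import hashlib
import random
import secrets


def derive_seed(global_seed: str, column: str, method: str, value: object,
                deterministic: bool = True) -> int:
    """Return a 64-bit integer seed for one value (hypothetical helper)."""
    if deterministic:
        # The same (seed, column, method, value) always hashes to the same integer.
        hash_input = f"{global_seed}:{column}:{method}:{value}".encode()
        return int.from_bytes(hashlib.sha256(hash_input).digest()[:8], "big")
    # Non-deterministic mode: fresh OS randomness on every call.
    return int.from_bytes(secrets.token_bytes(8), "big")


def fake_token(seed_int: int) -> str:
    """Hypothetical stand-in for a seeded Faker provider call."""
    rng = random.Random(seed_int)
    return "".join(rng.choice("abcdefghijklmnopqrstuvwxyz") for _ in range(8))


if __name__ == "__main__":
    seed = derive_seed("my_seed", "email", "email", "john@example.com")
    print(fake_token(seed))  # stable across runs for the same inputs
    print(fake_token(derive_seed("my_seed", "email", "email", "john@example.com",
                                 deterministic=False)))  # varies per call
```

In deterministic mode the output can be cached and stays consistent across tables. In non-deterministic mode each call draws a new seed, which is why the diff skips the cache on that branch.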
diff --git a/tests/test_anonymizer.py b/tests/test_anonymizer.py
index 6654416..dc86580 100644
--- a/tests/test_anonymizer.py
+++ b/tests/test_anonymizer.py
@@ -240,6 +240,27 @@ def test_custom_pattern_tie_uses_first_defined(self):
 
         assert anon._resolve_faker_method("users", "user_id") == "name"
 
+    def test_fallback_patterns_apply_when_user_patterns_missing(self):
+        anon = DeterministicAnonymizer()
+        anon.configure(
+            [],
+            patterns={},
+            fallback_patterns={"*.admission_date*": "date"},
+        )
+
+        assert anon.should_anonymize("visits", "admission_date")
+        assert anon._resolve_faker_method("visits", "admission_date") == "date"
+
+    def test_user_patterns_override_fallback_patterns(self):
+        anon = DeterministicAnonymizer()
+        anon.configure(
+            [],
+            patterns={"*.*date*": "date_time"},
+            fallback_patterns={"*.admission_date*": "date"},
+        )
+
+        assert anon._resolve_faker_method("visits", "admission_date") == "date_time"
+
     def test_security_null_fields_applies(self):
         anon = DeterministicAnonymizer()
         anon.configure(
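Review note: the two new tests above exercise the provider precedence (exact mapping, then user patterns, then fallback patterns). The sketch below illustrates that order. It is not the project's implementation: `specificity`, `best_match`, and `resolve` are hypothetical names, the scoring rule (count of non-wildcard characters) is an assumption about what "most specific" means, and the built-in column substring step is collapsed into a `pystr` default.

```python
from __future__ import annotations

from fnmatch import fnmatchcase


def specificity(pattern: str) -> int:
    """Count literal (non-wildcard) characters; an assumed scoring rule."""
    return sum(1 for ch in pattern if ch not in "*?[]")


def best_match(field: str, patterns: dict[str, str]) -> str | None:
    """Return the provider of the most specific matching glob, if any."""
    best_provider, best_score = None, -1
    for pattern, provider in patterns.items():
        if not fnmatchcase(field, pattern.lower()):
            continue
        score = specificity(pattern)
        if score > best_score:  # strict '>' keeps the first-defined pattern on ties
            best_provider, best_score = provider, score
    return best_provider


def resolve(table: str, column: str, exact: dict[str, str],
            user: dict[str, str], fallback: dict[str, str],
            default: str = "pystr") -> str:
    """Exact mapping first, then user globs, then fallback globs, then the default."""
    field = f"{table}.{column}".lower()
    return (exact.get(field)
            or best_match(field, user)
            or best_match(field, fallback)
            or default)


if __name__ == "__main__":
    print(resolve("visits", "admission_date", {},
                  {"*.*date*": "date_time"}, {"*.admission_date*": "date"}))
```

Because the user list is consulted before the fallback list, a broad user glob still beats a narrower compliance-profile glob, which matches what `test_user_patterns_override_fallback_patterns` asserts.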
diff --git a/tests/test_compliance.py b/tests/test_compliance.py
new file mode 100644
index 0000000..c50b52b
--- /dev/null
+++ b/tests/test_compliance.py
@@ -0,0 +1,701 @@
+"""Tests for the compliance module: profiles, scanner, manifest, and integration."""
+
+import json
+from unittest.mock import MagicMock
+
+import pytest
+import yaml
+
+import dbslice.cli as cli
+from dbslice.compliance.manifest import ComplianceManifest
+from dbslice.compliance.profiles import (
+    get_profile,
+    list_profiles,
+)
+from dbslice.compliance.scanner import PIIDetection, PIIScanner, _luhn_check
+from dbslice.config import ExtractConfig
+from dbslice.core.engine import ExtractionEngine
+from dbslice.exceptions import ExtractionError
+from dbslice.models import Column, SchemaGraph, Table
+
+# ──────────────────────────────────────────────────────────
+# Profile tests
+# ──────────────────────────────────────────────────────────
+
+
+class TestProfiles:
+    def test_get_gdpr_profile(self):
+        profile = get_profile("gdpr")
+        assert profile.name == "gdpr"
+        assert profile.display_name == "GDPR"
+        assert "email" in profile.required_column_patterns
+
+    def test_get_hipaa_profile(self):
+        profile = get_profile("hipaa")
+        assert profile.name == "hipaa"
+        assert "ssn" in profile.required_column_patterns
+        assert len(profile.identifiers) == 18
+
+    def test_get_pci_dss_profile(self):
+        profile = get_profile("pci-dss")
+        assert profile.name == "pci-dss"
+        assert "credit_card" in profile.required_column_patterns
+        assert "cvv" in profile.required_null_patterns
+
+    def test_get_profile_case_insensitive(self):
+        assert get_profile("GDPR").name == "gdpr"
+        assert get_profile("Hipaa").name == "hipaa"
+        assert get_profile("PCI-DSS").name == "pci-dss"
+
+    def test_get_profile_unknown_raises(self):
+        with pytest.raises(ValueError, match="Unknown compliance profile"):
+            get_profile("unknown")
+
+    def test_list_profiles(self):
+        profiles = list_profiles()
+        assert len(profiles) >= 3
+        names = {p.name for p in profiles}
+        assert "gdpr" in names
+        assert "hipaa" in names
+        assert "pci-dss" in names
+
+    def test_gdpr_covers_direct_identifiers(self):
+        profile = get_profile("gdpr")
+        expected_patterns = ["email", "phone", "first_name", "last_name", "ssn", "ip_address"]
+        for pattern in expected_patterns:
+            assert pattern in profile.required_column_patterns, f"Missing: {pattern}"
+
+    def test_hipaa_has_18_identifiers(self):
+        profile = get_profile("hipaa")
+        assert len(profile.identifiers) == 18
+        assert profile.identifiers[0].startswith("1.")
+        assert profile.identifiers[17].startswith("18.")
+
+    def test_pci_dss_covers_pan_fields(self):
+        profile = get_profile("pci-dss")
+        for pattern in ["credit_card", "card_number", "pan"]:
+            assert pattern in profile.required_column_patterns
+
+    def test_profiles_have_value_scan_patterns(self):
+        for profile in list_profiles():
+            assert len(profile.value_scan_patterns) > 0
+
+    def test_profiles_have_freetext_warnings(self):
+        for profile in list_profiles():
+            assert len(profile.warn_freetext_columns) > 0
+
+    def test_profile_is_frozen(self):
+        profile = get_profile("gdpr")
+        with pytest.raises(AttributeError):
+            profile.name = "hacked"  # type: ignore[misc]
+
+
+# ──────────────────────────────────────────────────────────
+# Scanner tests
+# ──────────────────────────────────────────────────────────
+
+
+class TestPIIScanner:
+    def test_detect_emails(self):
+        scanner = PIIScanner(patterns=["email"])
+        values = ["john@example.com", "jane@test.org", "not-an-email", "bob@foo.co"]
+        detections = scanner.scan_column("users", "notes", values)
+        assert len(detections) == 1
+        assert detections[0].pattern_name == "email"
+        assert detections[0].match_count == 3
+
+    def test_detect_ssn(self):
+        scanner = PIIScanner(patterns=["ssn"])
+        values = ["123-45-6789", "987-65-4321", "not-ssn", "hello"]
+        detections = scanner.scan_column("users", "data", values)
+        assert len(detections) == 1
+        assert detections[0].pattern_name == "ssn"
+        assert detections[0].match_count == 2
+
+    def test_detect_credit_card_with_luhn(self):
+        scanner = PIIScanner(patterns=["credit_card"])
+        # 4111111111111111 is a valid Luhn number (Visa test card)
+        values = ["4111111111111111", "1234567890123456", "not-a-card"]
+        detections = scanner.scan_column("orders", "memo", values)
+        assert len(detections) == 1
+        assert detections[0].pattern_name == "credit_card"
+        # Only the Luhn-valid one should match
+        assert detections[0].match_count == 1
+
+    def test_detect_credit_card_with_grouped_format(self):
+        scanner = PIIScanner(patterns=["credit_card"])
+        values = ["4111-1111-1111-1111", "not-a-card"]
+        detections = scanner.scan_column("orders", "memo", values)
+        assert len(detections) == 1
+        assert detections[0].pattern_name == "credit_card"
+
+    def test_detect_credit_card_embedded_in_text(self):
+        scanner = PIIScanner(patterns=["credit_card"])
+        values = ["card=4111 1111 1111 1111 expires soon", "hello"]
+        detections = scanner.scan_column("orders", "memo", values)
+        assert len(detections) == 1
+        assert detections[0].match_count == 1
+
+    def test_detect_ipv4(self):
+        scanner = PIIScanner(patterns=["ipv4"])
+        values = ["192.168.1.1", "10.0.0.1", "not-ip", "256.1.1.1"]
+        detections = scanner.scan_column("logs", "source", values)
+        assert len(detections) >= 1
+        assert detections[0].pattern_name == "ipv4"
+
+    def test_no_detection_below_threshold(self):
+        scanner = PIIScanner(patterns=["email"], min_match_rate=0.5)
+        # Only 1 out of 10 is an email — below 50% threshold
+        values = ["john@example.com"] + ["no-email"] * 9
+        detections = scanner.scan_column("users", "notes", values)
+        assert len(detections) == 0
+
+    def test_scan_rows(self):
+        scanner = PIIScanner(patterns=["email"])
+        rows = [
+            {"id": 1, "notes": "contact john@example.com"},
+            {"id": 2, "notes": "call jane@test.org"},
+            {"id": 3, "notes": "nothing here"},
+        ]
+        detections = scanner.scan_rows("users", rows)
+        assert any(d.column == "notes" and d.pattern_name == "email" for d in detections)
+
+    def test_scan_rows_skip_columns(self):
+        scanner = PIIScanner(patterns=["email"])
+        rows = [
+            {"id": 1, "email": "john@example.com", "notes": "call john@example.com"},
+        ]
+        detections = scanner.scan_rows("users", rows, skip_columns={"email"})
+        # Should only detect in "notes", not in "email" (skipped)
+        assert all(d.column != "email" for d in detections)
+
+    def test_scan_empty_rows(self):
+        scanner = PIIScanner()
+        assert scanner.scan_rows("users", []) == []
+
+    def test_scan_none_values(self):
+        scanner = PIIScanner(patterns=["email"])
+        values = [None, None, None]
+        detections = scanner.scan_column("users", "email", values)
+        assert len(detections) == 0
+
+    def test_confidence_levels(self):
+        scanner = PIIScanner(patterns=["email"], min_match_rate=0.01)
+        # High match rate = high confidence
+        values = ["a@b.com"] * 10
+        detections = scanner.scan_column("t", "c", values)
+        assert detections[0].confidence == "high"
+
+    def test_match_rate_property(self):
+        detection = PIIDetection(
+            table="t",
+            column="c",
+            pattern_name="email",
+            match_count=3,
+            sample_size=10,
+            confidence="high",
+        )
+        assert detection.match_rate == pytest.approx(0.3)
+
+    def test_match_rate_zero_sample(self):
+        detection = PIIDetection(
+            table="t",
+            column="c",
+            pattern_name="email",
+            match_count=0,
+            sample_size=0,
+            confidence="low",
+        )
+        assert detection.match_rate == 0.0
+
+
+class TestLuhnCheck:
+    def test_valid_visa(self):
+        assert _luhn_check("4111111111111111") is True
+
+    def test_valid_mastercard(self):
+        assert _luhn_check("5500000000000004") is True
+
+    def test_invalid_number(self):
+        assert _luhn_check("1234567890123456") is False
+
+    def test_too_short(self):
+        assert _luhn_check("123") is False
+
+
+# ──────────────────────────────────────────────────────────
+# Manifest tests
+# ──────────────────────────────────────────────────────────
+
+
+class TestComplianceManifest:
+    def test_initialize(self):
+        manifest = ComplianceManifest()
+        manifest.initialize(
+            extraction_id="test-123",
+            compliance_profiles=["gdpr"],
+            anonymization_seed="my_seed",
+            deterministic=True,
+        )
+        assert manifest.extraction_id == "test-123"
+        assert manifest.compliance_profiles == ["gdpr"]
+        assert manifest.masking_type == "deterministic_pseudonymization"
+        assert manifest.seed_hash.startswith("sha256:")
+        assert manifest.dbslice_version
+
+    def test_initialize_non_deterministic(self):
+        manifest = ComplianceManifest()
+        manifest.initialize(
+            extraction_id="test-456",
+            deterministic=False,
+        )
+        assert manifest.masking_type == "non_deterministic_pseudonymization"
+
+    def test_record_masked_field(self):
+        manifest = ComplianceManifest()
+        manifest.record_masked_field("users", "email", "email", category="direct_identifier")
+        assert len(manifest.tables["users"].fields_masked) == 1
+        assert manifest.tables["users"].fields_masked[0].method == "email"
+
+    def test_record_nulled_field(self):
+        manifest = ComplianceManifest()
+        manifest.record_nulled_field("users", "password_hash", "security_null_pattern")
+        assert len(manifest.tables["users"].fields_nulled) == 1
+
+    def test_record_fk_preserved(self):
+        manifest = ComplianceManifest()
+        manifest.record_fk_preserved("orders", "user_id")
+        assert "user_id" in manifest.tables["orders"].fields_preserved_fk
+
+    def test_record_unmasked_field(self):
+        manifest = ComplianceManifest()
+        manifest.record_unmasked_field("orders", "status")
+        assert "status" in manifest.tables["orders"].fields_unmasked
+
+    def test_set_table_row_count(self):
+        manifest = ComplianceManifest()
+        manifest.set_table_row_count("users", 150)
+        assert manifest.tables["users"].rows_extracted == 150
+
+    def test_add_warning(self):
+        manifest = ComplianceManifest()
+        manifest.add_warning("notes", "body", "may contain PII")
+        assert len(manifest.warnings) == 1
+        assert manifest.warnings[0].table == "notes"
+
+    def test_add_pii_detections(self):
+        manifest = ComplianceManifest()
+        detection = PIIDetection(
+            table="logs",
+            column="message",
+            pattern_name="email",
+            match_count=5,
+            sample_size=100,
+            confidence="high",
+        )
+        manifest.add_pii_detections([detection])
+        assert len(manifest.pii_scan_results) == 1
+
+    def test_to_dict(self):
+        manifest = ComplianceManifest()
+        manifest.initialize(
+            extraction_id="test-789",
+            compliance_profiles=["hipaa"],
+            anonymization_seed="seed123",
+        )
+        manifest.record_masked_field("users", "email", "email")
+        manifest.record_nulled_field("users", "password", "security_null")
+        manifest.set_table_row_count("users", 50)
+        manifest.add_warning("notes", "body", "freetext PII risk")
+
+        d = manifest.to_dict()
+        assert d["extraction_id"] == "test-789"
+        assert d["compliance_profiles"] == ["hipaa"]
+        assert "users" in d["tables"]
+        assert d["tables"]["users"]["rows_extracted"] == 50
+        assert len(d["tables"]["users"]["fields_masked"]) == 1
+        assert len(d["warnings"]) == 1
+
+    def test_to_json(self):
+        manifest = ComplianceManifest()
+        manifest.initialize(extraction_id="test-json")
+        manifest.record_masked_field("t", "c", "email")
+        json_str = manifest.to_json()
+        parsed = json.loads(json_str)
+        assert parsed["extraction_id"] == "test-json"
+
+    def test_to_json_compact(self):
+        manifest = ComplianceManifest()
+        manifest.initialize(extraction_id="test-compact")
+        json_str = manifest.to_json(pretty=False)
+        assert "\n" not in json_str
+
+
+# ──────────────────────────────────────────────────────────
+# Integration: anonymizer + manifest
+# ──────────────────────────────────────────────────────────
+
+
+class TestAnonymizerManifestIntegration:
+    def test_anonymizer_records_to_manifest(self):
+        from dbslice.utils.anonymizer import DeterministicAnonymizer
+
+        manifest = ComplianceManifest()
+        anonymizer = DeterministicAnonymizer(
+            seed="test_seed",
+            deterministic=True,
+            manifest=manifest,
+        )
+        anonymizer.configure(redact_fields=["users.email"])
+
+        row = {"id": 1, "email": "john@example.com", "status": "active"}
+        anonymizer.anonymize_row("users", row)
+
+        # email should be recorded as masked
+        assert any(
+            f.column == "email" for f in manifest.tables.get("users", MagicMock()).fields_masked
+        )
+
+    def test_anonymizer_non_deterministic_mode(self):
+        from dbslice.utils.anonymizer import DeterministicAnonymizer
+
+        anonymizer = DeterministicAnonymizer(
+            seed="test_seed",
+            deterministic=False,
+        )
+        anonymizer.configure(redact_fields=["users.email"])
+
+        # Same input should produce different outputs in non-deterministic mode
+        results = set()
+        for _ in range(10):
+            row = {"email": "john@example.com"}
+            result = anonymizer.anonymize_row("users", row)
+            results.add(result["email"])
+
+        # With 10 random seeds, we should get multiple distinct values
+        assert len(results) > 1
+
+
+# ──────────────────────────────────────────────────────────
+# Config file integration
+# ──────────────────────────────────────────────────────────
+
+
+class TestComplianceConfig:
+    def test_config_file_compliance_section(self, tmp_path):
+        from dbslice.config_file import DbsliceConfig
+
+        config_file = tmp_path / "dbslice.yaml"
+        config_file.write_text("""
+database:
+  url: postgres://localhost/test
+
+compliance:
+  profiles: [gdpr]
+  strict: true
+  generate_manifest: true
+
+anonymization:
+  enabled: true
+  deterministic: false
+""")
+        config = DbsliceConfig.from_yaml(config_file)
+        assert config.compliance.profiles == ["gdpr"]
+        assert config.compliance.strict is True
+        assert config.compliance.generate_manifest is True
+        assert config.anonymization.deterministic is False
+
+    def test_config_file_invalid_profile(self, tmp_path):
+        from dbslice.config_file import ConfigFileError, DbsliceConfig
+
+        config_file = tmp_path / "dbslice.yaml"
+        config_file.write_text("""
+compliance:
+  profiles: [nonexistent]
+""")
+        with pytest.raises(ConfigFileError, match="Unknown compliance profile"):
+            DbsliceConfig.from_yaml(config_file)
+
+    def test_config_file_compliance_unknown_key(self, tmp_path):
+        from dbslice.config_file import ConfigFileError, DbsliceConfig
+
+        config_file = tmp_path / "dbslice.yaml"
+        config_file.write_text("""
+compliance:
+  profiles: [gdpr]
+  invalid_key: true
+""")
+        with pytest.raises(ConfigFileError, match="Unknown key"):
+            DbsliceConfig.from_yaml(config_file)
+
+    def test_to_extract_config_with_compliance(self, tmp_path):
+        from dbslice.config import SeedSpec
+        from dbslice.config_file import DbsliceConfig
+
+        config_file = tmp_path / "dbslice.yaml"
+        config_file.write_text("""
+database:
+  url: postgres://localhost/test
+
+compliance:
+  profiles: [hipaa]
+  strict: true
+  generate_manifest: true
+
+anonymization:
+  enabled: true
+  deterministic: false
+""")
+        config = DbsliceConfig.from_yaml(config_file)
+        seed = SeedSpec(table="users", column="id", value=1, where_clause=None)
+        extract_config = config.to_extract_config(seeds=[seed])
+
+        assert extract_config.compliance_profiles == ["hipaa"]
+        assert extract_config.compliance_strict is True
+        assert extract_config.generate_manifest is True
+        assert extract_config.deterministic is False
+
+    def test_to_extract_config_with_compliance_policy_fields(self, tmp_path):
+        from dbslice.config import SeedSpec
+        from dbslice.config_file import DbsliceConfig
+
+        config_file = tmp_path / "dbslice.yaml"
+        config_file.write_text(
+            """
+database:
+  url: postgres://localhost/test
+compliance:
+  profiles: [gdpr]
+  policy_mode: strict
+  allow_url_patterns:
+    - ".*localhost.*"
+  deny_url_patterns:
+    - ".*prod.*"
+  required_sslmode: require
+  require_ci: true
+  sign_manifest: true
+  manifest_key_env: DBSLICE_SIGN_KEY
+"""
+        )
+        config = DbsliceConfig.from_yaml(config_file)
+        seed = SeedSpec(table="users", column="id", value=1, where_clause=None)
+        extract_config = config.to_extract_config(seeds=[seed])
+
+        assert extract_config.compliance_policy_mode == "strict"
+        assert extract_config.compliance_allowed_url_patterns == [".*localhost.*"]
+        assert extract_config.compliance_denied_url_patterns == [".*prod.*"]
+        assert extract_config.compliance_required_sslmode == "require"
+        assert extract_config.compliance_require_ci is True
+        assert extract_config.compliance_manifest_sign is True
+        assert extract_config.compliance_manifest_key_env == "DBSLICE_SIGN_KEY"
+
+    def test_compliance_empty_section_ok(self, tmp_path):
+        from dbslice.config_file import DbsliceConfig
+
+        config_file = tmp_path / "dbslice.yaml"
+        config_file.write_text("""
+compliance: {}
+""")
+        config = DbsliceConfig.from_yaml(config_file)
+        assert config.compliance.profiles == []
+        assert config.compliance.strict is False
+
+    def test_to_yaml_includes_deterministic_and_compliance(self, tmp_path):
+        from dbslice.config_file import DbsliceConfig
+
+        config_file = tmp_path / "dbslice.yaml"
+        config_file.write_text(
+            """
+database:
+  url: postgres://localhost/test
+anonymization:
+  enabled: true
+  deterministic: false
+compliance:
+  profiles: [gdpr]
+  strict: true
+  generate_manifest: true
+  policy_mode: strict
+"""
+        )
+        config = DbsliceConfig.from_yaml(config_file)
+        exported = config.to_yaml(include_comments=False)
+        parsed = yaml.safe_load(exported)
+        assert parsed["anonymization"]["deterministic"] is False
+        assert parsed["compliance"]["profiles"] == ["gdpr"]
+        assert parsed["compliance"]["strict"] is True
+        assert parsed["compliance"]["policy_mode"] == "strict"
+
+
+class TestComplianceScanSemantics:
+    def _engine(self, strict: bool) -> ExtractionEngine:
+        config = ExtractConfig(
+            database_url="postgresql://localhost/test",
+            seeds=[],
+            anonymize=True,
+            compliance_profiles=["gdpr"],
+            compliance_strict=strict,
+        )
+        return ExtractionEngine(config)
+
+    def test_strict_mode_ignores_masked_synthetic_values(self):
+        engine = self._engine(strict=True)
+        pre_mask = {"users": [{"email": "alice@example.com"}]}
+        post_mask = {"users": [{"email": "xcooper@example.org"}]}
+        # No exception: email column is protected by profile rules.
+        engine._run_pii_scan(pre_mask, post_mask)
+
+    def test_strict_mode_fails_on_residual_unprotected_detections(self):
+        engine = self._engine(strict=True)
+        pre_mask = {"logs": [{"message": "contact alice@example.com"}]}
+        post_mask = {"logs": [{"message": "contact alice@example.com"}]}
+        with pytest.raises(ExtractionError, match="residual unprotected PII"):
+            engine._run_pii_scan(pre_mask, post_mask)
+
+
+class TestManifestVerification:
+    def test_manifest_file_hash_and_signature_verification(self, tmp_path):
+        from dbslice.compliance.manifest import verify_manifest_payload
+
+        data_file = tmp_path / "subset.sql"
+        data_file.write_text("select 1;\n", encoding="utf-8")
+
+        manifest = ComplianceManifest()
+        manifest.initialize(extraction_id="verify-1")
+        manifest.add_output_file_hashes([data_file], base_dir=tmp_path)
+        manifest.sign("secret-key")
+
+        manifest_path = tmp_path / "subset.manifest.json"
+        manifest_path.write_text(manifest.to_json(pretty=True), encoding="utf-8")
+        payload = json.loads(manifest_path.read_text(encoding="utf-8"))
+
+        ok, errors = verify_manifest_payload(
+            payload,
+            manifest_path,
+            signing_key="secret-key",
+            verify_signature=True,
+        )
+        assert ok is True
+        assert errors == []
+
+        data_file.write_text("tampered\n", encoding="utf-8")
+        ok, errors = verify_manifest_payload(
+            payload,
+            manifest_path,
+            signing_key="secret-key",
+            verify_signature=True,
+        )
+        assert ok is False
+        assert any("Hash mismatch" in err for err in errors)
+
+    def test_verify_manifest_cli_command(self, tmp_path, monkeypatch):
+        data_file = tmp_path / "subset.sql"
+        data_file.write_text("select 1;\n", encoding="utf-8")
+
+        manifest = ComplianceManifest()
+        manifest.initialize(extraction_id="verify-cli")
+        manifest.add_output_file_hashes([data_file], base_dir=tmp_path)
+        manifest.sign("secret-key")
+
+        manifest_path = tmp_path / "subset.manifest.json"
+        manifest_path.write_text(manifest.to_json(pretty=True), encoding="utf-8")
+
+        monkeypatch.setenv("DBSLICE_MANIFEST_SIGNING_KEY", "secret-key")
+        cli.verify_manifest(
+            manifest_file=manifest_path,
+            verify_signature=True,
+            key_env="DBSLICE_MANIFEST_SIGNING_KEY",
+        )
+
+
+class TestCompliancePolicyGates:
+    def test_policy_blocks_stdout_without_breakglass(self):
+        config = ExtractConfig(
+            database_url="postgresql://localhost/test",
+            seeds=[],
+            compliance_profiles=["gdpr"],
+            compliance_policy_mode="standard",
+            anonymize=True,
+        )
+        with pytest.raises(ValueError, match="stdout output is blocked"):
+            cli._enforce_compliance_policy(
+                config,
+                out_file=None,
+                allow_raw=False,
+                breakglass_reason=None,
+                ticket_id=None,
+            )
+
+    def test_policy_breakglass_requires_metadata(self):
+        config = ExtractConfig(
+            database_url="postgresql://localhost/test",
+            seeds=[],
+            compliance_profiles=["gdpr"],
+            compliance_policy_mode="strict",
+            anonymize=False,
+        )
+        with pytest.raises(ValueError, match="--breakglass-reason"):
+            cli._enforce_compliance_policy(
+                config,
+                out_file=None,
+                allow_raw=True,
+                breakglass_reason=None,
+                ticket_id="SEC-123",
+            )
+
+    def test_source_guardrails_validate_sslmode_and_ci(self, monkeypatch):
+        config = ExtractConfig(
+            database_url="postgresql://localhost/test?sslmode=require",
+            seeds=[],
+            compliance_profiles=["gdpr"],
+            compliance_required_sslmode="require",
+            compliance_require_ci=True,
+        )
+        monkeypatch.setenv("CI", "true")
+        cli._enforce_source_guardrails(config)
+
+        bad = ExtractConfig(
+            database_url="postgresql://localhost/test?sslmode=disable",
+            seeds=[],
+            compliance_profiles=["gdpr"],
+            compliance_required_sslmode="require",
+        )
+        with pytest.raises(ValueError, match="sslmode"):
+            cli._enforce_source_guardrails(bad)
+
+
+class TestComplianceInspectReport:
+    def test_report_detects_uncovered_columns(self):
+        class FakeAdapter:
+            def fetch_rows(self, table: str, where_clause: str, params: tuple[object, ...]):
+                assert where_clause == "TRUE"
+                assert params == ()
+                if table == "logs":
+                    yield {"id": 1, "message": "contact jane@example.com"}
+
+        schema = SchemaGraph(
+            tables={
+                "logs": Table(
+                    name="logs",
+                    schema="public",
+                    columns=[
+                        Column("id", "integer", False, True),
+                        Column("message", "text", True, False),
+                    ],
+                    primary_key=("id",),
+                    foreign_keys=[],
+                )
+            },
+            edges=[],
+        )
+
+        # Smoke test only: the helper should run and print a JSON report without raising.
+        cli._run_compliance_check_report(
+            adapter=FakeAdapter(),
+            db_schema=schema,
+            profiles=["gdpr"],
+            sample_rows=10,
+            output_mode="json",
+            target_table=None,
+            console=MagicMock(),
+        )
diff --git a/tests/test_compliance_gaps.py b/tests/test_compliance_gaps.py
new file mode 100644
index 0000000..2bd30f0
--- /dev/null
+++ b/tests/test_compliance_gaps.py
@@ -0,0 +1,328 @@
+"""
+Tests for compliance gap fixes: HIPAA transformers, free-text handling,
+binary columns, k-anonymity, and configurable scan sample size.
+
+These tests verify END-TO-END behavior — not just that functions exist,
+but that data is actually transformed correctly through the full pipeline.
+"""
+
+from dbslice.compliance.transformers import (
+    BINARY_SENTINEL,
+    age_bucket,
+    hipaa_safe_harbor_zip3,
+    redact_freetext,
+    year_only,
+)
+
+# ──────────────────────────────────────────────────────────
+# Phase A: HIPAA-specific transformers
+# ──────────────────────────────────────────────────────────
+
+
+class TestYearOnly:
+    def test_iso_date(self):
+        assert year_only("2024-03-15") == "2024"
+
+    def test_iso_datetime(self):
+        assert year_only("2024-03-15T10:30:00") == "2024"
+
+    def test_us_date_slash(self):
+        assert year_only("03/15/2024") == "2024"
+
+    def test_us_date_dash(self):
+        assert year_only("03-15-2024") == "2024"
+
+    def test_datetime_object(self):
+        import datetime
+
+        assert year_only(datetime.date(1985, 6, 15)) == "1985"
+
+    def test_datetime_datetime_object(self):
+        import datetime
+
+        assert year_only(datetime.datetime(1985, 6, 15, 10, 30)) == "1985"
+
+    def test_just_year(self):
+        assert year_only("1990") == "1990"
+
+    def test_none(self):
+        assert year_only(None) == ""
+
+    def test_garbage(self):
+        assert year_only("not a date") == ""
+
+    def test_embedded_year(self):
+        assert year_only("born in 1985 somewhere") == "1985"
+
+
+class TestHipaaZip3:
+    def test_normal_5digit_zip(self):
+        result = hipaa_safe_harbor_zip3("12345")
+        assert result == "123"
+
+    def test_zip_plus_4(self):
+        result = hipaa_safe_harbor_zip3("12345-6789")
+        assert result == "123"
+
+    def test_low_population_zip_suppressed(self):
+        # 036xx is in NH, low population
+        result = hipaa_safe_harbor_zip3("03601")
+        assert result == "000"
+
+    def test_another_low_pop(self):
+        # 821xx is in WY
+        result = hipaa_safe_harbor_zip3("82101")
+        assert result == "000"
+
+    def test_high_population_zip_retained(self):
+        # 100xx is NYC — high population
+        result = hipaa_safe_harbor_zip3("10001")
+        assert result == "100"
+
+    def test_short_zip(self):
+        result = hipaa_safe_harbor_zip3("12")
+        assert result == "000"
+
+    def test_integer_zip(self):
+        result = hipaa_safe_harbor_zip3(90210)
+        assert result == "902"
+
+
+class TestAgeBucket:
+    def test_normal_age(self):
+        assert age_bucket(45) == "45"
+
+    def test_age_89(self):
+        assert age_bucket(89) == "89"
+
+    def test_age_90_bucketed(self):
+        assert age_bucket(90) == "90+"
+
+    def test_age_105_bucketed(self):
+        assert age_bucket(105) == "90+"
+
+    def test_string_age(self):
+        assert age_bucket("75") == "75"
+
+    def test_string_age_over_89(self):
+        assert age_bucket("92") == "90+"
+
+    def test_non_numeric(self):
+        assert age_bucket("unknown") == "unknown"
+
+
+class TestCustomTransformersInAnonymizer:
+    """Verify custom transformers are actually called by the anonymizer."""
+
+    def test_year_only_through_anonymizer(self):
+        from dbslice.utils.anonymizer import DeterministicAnonymizer
+
+        anon = DeterministicAnonymizer(seed="test")
+        anon.configure(
+            redact_fields=[],
+            field_providers={"patients.admission_date": "year_only"},
+        )
+        result = anon.anonymize_value("2024-03-15", "patients", "admission_date")
+        assert result == "2024"
+
+    def test_hipaa_zip3_through_anonymizer(self):
+        from dbslice.utils.anonymizer import DeterministicAnonymizer
+
+        anon = DeterministicAnonymizer(seed="test")
+        anon.configure(
+            redact_fields=[],
+            field_providers={"patients.zipcode": "hipaa_zip3"},
+        )
+        result = anon.anonymize_value("03601", "patients", "zipcode")
+        assert result == "000"  # Low population area
+
+    def test_age_bucket_through_anonymizer(self):
+        from dbslice.utils.anonymizer import DeterministicAnonymizer
+
+        anon = DeterministicAnonymizer(seed="test")
+        anon.configure(
+            redact_fields=[],
+            field_providers={"patients.age": "age_bucket"},
+        )
+        assert anon.anonymize_value(92, "patients", "age") == "90+"
+        assert anon.anonymize_value(45, "patients", "age") == "45"
+
+    def test_hipaa_profile_uses_year_only_for_dates(self):
+        """End-to-end: HIPAA profile maps date columns to year_only transformer."""
+        from dbslice.compliance.profiles import get_profile
+        from dbslice.utils.anonymizer import DeterministicAnonymizer
+
+        profile = get_profile("hipaa")
+        # Build fallback patterns like the engine does
+        fallback_patterns = {
+            f"*.{pattern}*": method for pattern, method in profile.required_column_patterns.items()
+        }
+
+        anon = DeterministicAnonymizer(seed="test")
+        anon.configure(
+            redact_fields=[],
+            fallback_patterns=fallback_patterns,
+        )
+
+        # admission_date should use year_only
+        result = anon.anonymize_value("2024-03-15", "visits", "admission_date")
+        assert result == "2024"
+
+    def test_hipaa_profile_uses_zip3_for_zipcodes(self):
+        """End-to-end: HIPAA profile maps ZIP columns to hipaa_zip3 transformer."""
+        from dbslice.compliance.profiles import get_profile
+        from dbslice.utils.anonymizer import DeterministicAnonymizer
+
+        profile = get_profile("hipaa")
+        fallback_patterns = {
+            f"*.{pattern}*": method for pattern, method in profile.required_column_patterns.items()
+        }
+
+        anon = DeterministicAnonymizer(seed="test")
+        anon.configure(
+            redact_fields=[],
+            fallback_patterns=fallback_patterns,
+        )
+
+        result = anon.anonymize_value("82101", "addresses", "zipcode")
+        assert result == "000"  # Wyoming, low population
+
+
+# ──────────────────────────────────────────────────────────
+# Phase B: Free-text handling
+# ──────────────────────────────────────────────────────────
+
+
+class TestFreetextRedaction:
+    def test_redact_email(self):
+        text = "Contact john@example.com for details"
+        result = redact_freetext(text)
+        assert "[REDACTED_EMAIL]" in result
+        assert "john@example.com" not in result
+        assert "Contact" in result
+        assert "for details" in result
+
+    def test_redact_ssn(self):
+        text = "Patient SSN: 123-45-6789"
+        result = redact_freetext(text)
+        assert "[REDACTED_SSN]" in result
+        assert "123-45-6789" not in result
+
+    def test_redact_phone(self):
+        text = "Call 555-123-4567"
+        result = redact_freetext(text)
+        assert "[REDACTED_PHONE]" in result
+
+    def test_redact_multiple(self):
+        text = "Email: alice@test.com, SSN: 123-45-6789"
+        result = redact_freetext(text)
+        assert "[REDACTED_EMAIL]" in result
+        assert "[REDACTED_SSN]" in result
+        assert "alice@test.com" not in result
+        assert "123-45-6789" not in result
+
+    def test_no_pii_unchanged(self):
+        text = "This is a normal note with no PII"
+        assert redact_freetext(text) == text
+
+    def test_none_returns_empty(self):
+        assert redact_freetext(None) == ""
+
+    def test_redact_ip(self):
+        text = "Source IP: 192.168.1.100"
+        result = redact_freetext(text)
+        assert "[REDACTED_IP]" in result
+        assert "192.168.1.100" not in result
+
+
+# ──────────────────────────────────────────────────────────
+# Phase E: k-anonymity
+# ──────────────────────────────────────────────────────────
+
+
+class TestKAnonymityCheck:
+    """Test k-anonymity verification logic directly on data."""
+
+    def test_passes_when_k_satisfied(self):
+        """Each combination appears >= 2 times."""
+        rows = [
+            {"age": "30", "gender": "M", "zip": "100"},
+            {"age": "30", "gender": "M", "zip": "100"},
+            {"age": "40", "gender": "F", "zip": "200"},
+            {"age": "40", "gender": "F", "zip": "200"},
+        ]
+        violations = self._check(rows, ["age", "gender", "zip"], k=2)
+        assert len(violations) == 0
+
+    def test_fails_when_unique_combination(self):
+        """One person with a unique combination."""
+        rows = [
+            {"age": "30", "gender": "M", "zip": "100"},
+            {"age": "30", "gender": "M", "zip": "100"},
+            {"age": "99", "gender": "X", "zip": "999"},  # unique
+        ]
+        violations = self._check(rows, ["age", "gender", "zip"], k=2)
+        assert len(violations) == 1
+
+    def test_k_1_always_passes(self):
+        rows = [{"age": "unique_value", "gender": "unique"}]
+        violations = self._check(rows, ["age", "gender"], k=1)
+        assert len(violations) == 0
+
+    @staticmethod
+    def _check(rows, qi_columns, k):
+        from collections import Counter
+
+        combos = Counter(tuple(str(row.get(c, "")) for c in qi_columns) for row in rows)
+        return [(combo, count) for combo, count in combos.items() if count < k]
+
+
+# ──────────────────────────────────────────────────────────
+# Phase C: Binary column sentinel
+# ──────────────────────────────────────────────────────────
+
+
+class TestBinarySentinel:
+    def test_sentinel_value(self):
+        assert BINARY_SENTINEL == b"\x00"
+
+
+# ──────────────────────────────────────────────────────────
+# Integration: verify config fields exist
+# ──────────────────────────────────────────────────────────
+
+
+class TestConfigFields:
+    def test_extract_config_has_new_fields(self):
+        from dbslice.config import ExtractConfig, SeedSpec
+
+        seed = SeedSpec(table="t", column="c", value=1, where_clause=None)
+        config = ExtractConfig(
+            database_url="postgres://localhost/test",
+            seeds=[seed],
+            freetext_action="redact",
+            binary_action="sentinel",
+            compliance_sample_rows=500,
+            k_anonymity_min_k=3,
+            k_anonymity_quasi_identifiers=["users.age", "users.zip"],
+            k_anonymity_action="fail",
+        )
+        assert config.freetext_action == "redact"
+        assert config.binary_action == "sentinel"
+        assert config.compliance_sample_rows == 500
+        assert config.k_anonymity_min_k == 3
+        assert config.k_anonymity_action == "fail"
+
+    def test_defaults(self):
+        from dbslice.config import ExtractConfig, SeedSpec
+
+        seed = SeedSpec(table="t", column="c", value=1, where_clause=None)
+        config = ExtractConfig(
+            database_url="postgres://localhost/test",
+            seeds=[seed],
+        )
+        assert config.freetext_action == "warn"
+        assert config.binary_action == "warn"
+        assert config.compliance_sample_rows == 100
+        assert config.k_anonymity_min_k is None
+        assert config.k_anonymity_action == "warn"
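Review note: the `TestHipaaZip3` cases above pin down the observable behavior of `hipaa_safe_harbor_zip3` without showing the rule itself. Below is a minimal sketch of a function that would satisfy those cases. The name `zip3_sketch` is hypothetical, and the restricted-prefix set holds only the two prefixes the tests exercise. The real set in `dbslice.compliance.transformers` is not shown in this diff and is presumably longer.

```python
import re

# Assumption: only the two prefixes exercised by the tests are listed here;
# this is not the library's actual table.
RESTRICTED_ZIP3 = frozenset({"036", "821"})


def zip3_sketch(value: object) -> str:
    """Hypothetical stand-in: keep a ZIP's first three digits, or return "000"."""
    # Drop everything that is not a digit, so "12345-6789" becomes "123456789".
    digits = re.sub(r"\D", "", str(value))
    if len(digits) < 3:
        return "000"
    prefix = digits[:3]
    # Restricted (low-population) prefixes are suppressed to "000".
    return "000" if prefix in RESTRICTED_ZIP3 else prefix
```

An integer input such as `3601` has already lost its leading zero, so a real implementation may want to zero-pad before taking the prefix. The tests in this file do not cover that case.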
diff --git a/tests/test_mapping_ui.py b/tests/test_mapping_ui.py
new file mode 100644
index 0000000..403f560
--- /dev/null
+++ b/tests/test_mapping_ui.py
@@ -0,0 +1,204 @@
+"""Tests for the column mapping UI server and API."""
+
+import json
+import threading
+import time
+from http.client import HTTPConnection
+
+import pytest
+
+from dbslice.mapping.server import MappingServer
+
+
+@pytest.fixture()
+def server():
+    """Start a mapping UI server on a fixed test port (19473)."""
+    srv = MappingServer(port=19473, database_url="", schema=None)
+    thread = threading.Thread(target=srv.start, kwargs={"open_browser": False}, daemon=True)
+    thread.start()
+    time.sleep(0.3)  # Wait for server to bind
+    yield srv
+    if srv._server:
+        srv._server.shutdown()
+
+
+def _conn():
+    return HTTPConnection("127.0.0.1", 19473, timeout=5)
+
+
+def _post(conn, path, body, token):
+    conn.request(
+        "POST",
+        path,
+        json.dumps(body).encode(),
+        {"Content-Type": "application/json", "X-DBSLICE-Token": token},
+    )
+    return conn.getresponse()
+
+
+class TestTokenSecurity:
+    def test_get_without_token_fails(self, server):
+        conn = _conn()
+        conn.request("GET", "/")
+        resp = conn.getresponse()
+        assert resp.status == 403
+
+    def test_get_with_wrong_token_fails(self, server):
+        conn = _conn()
+        conn.request("GET", "/?token=wrong")
+        resp = conn.getresponse()
+        assert resp.status == 403
+
+    def test_get_with_valid_token_serves_html(self, server):
+        conn = _conn()
+        conn.request("GET", f"/?token={server.token}")
+        resp = conn.getresponse()
+        assert resp.status == 200
+        body = resp.read().decode()
+        assert "dbslice" in body
+        assert "Column Mapping" in body
+
+    def test_post_without_token_fails(self, server):
+        conn = _conn()
+        conn.request(
+            "POST",
+            "/api/validate-provider",
+            json.dumps({"provider": "email"}).encode(),
+            {"Content-Type": "application/json"},
+        )
+        resp = conn.getresponse()
+        assert resp.status == 403
+
+    def test_post_with_wrong_token_fails(self, server):
+        conn = _conn()
+        resp = _post(conn, "/api/validate-provider", {"provider": "email"}, "bad-token")
+        assert resp.status == 403
+
+
+class TestValidateProviderAPI:
+    def test_valid_faker_provider(self, server):
+        conn = _conn()
+        resp = _post(conn, "/api/validate-provider", {"provider": "email"}, server.token)
+        assert resp.status == 200
+        data = json.loads(resp.read())
+        assert data["valid"] is True
+        assert data["source"] == "faker"
+
+    def test_valid_custom_transformer(self, server):
+        conn = _conn()
+        resp = _post(conn, "/api/validate-provider", {"provider": "year_only"}, server.token)
+        assert resp.status == 200
+        data = json.loads(resp.read())
+        assert data["valid"] is True
+        assert data["source"] == "custom_transformer"
+
+    def test_invalid_provider(self, server):
+        conn = _conn()
+        resp = _post(conn, "/api/validate-provider", {"provider": "not_a_real_provider_xyz"}, server.token)
+        assert resp.status == 200
+        data = json.loads(resp.read())
+        assert data["valid"] is False
+
+    def test_hipaa_zip3_is_valid(self, server):
+        conn = _conn()
+        resp = _post(conn, "/api/validate-provider", {"provider": "hipaa_zip3"}, server.token)
+        data = json.loads(resp.read())
+        assert data["valid"] is True
+
+
+class TestGenerateConfigAPI:
+    def test_generate_basic_config(self, server):
+        conn = _conn()
+        mappings = {
+            "users.email": {"action": "anonymize", "provider": "email"},
+            "users.password_hash": {"action": "null", "provider": ""},
+        }
+        resp = _post(conn, "/api/generate-config", {"mappings": mappings}, server.token)
+        assert resp.status == 200
+        data = json.loads(resp.read())
+        assert "yaml" in data
+        assert "users.email: email" in data["yaml"]
+        assert "users.password_hash" in data["yaml"]
+        assert data["field_count"] == 1
+        assert data["null_count"] == 1
+        assert "command_template" in data
+        assert "dbslice extract" in data["command_template"]
+
+    def test_generate_empty_config(self, server):
+        conn = _conn()
+        resp = _post(conn, "/api/generate-config", {"mappings": {}}, server.token)
+        assert resp.status == 200
+        data = json.loads(resp.read())
+        assert data["field_count"] == 0
+        assert data["null_count"] == 0
+
+    def test_keep_action_excluded(self, server):
+        conn = _conn()
+        mappings = {
+            "users.email": {"action": "keep", "provider": ""},
+        }
+        resp = _post(conn, "/api/generate-config", {"mappings": mappings}, server.token)
+        data = json.loads(resp.read())
+        assert "users.email" not in data["yaml"]
+
+    def test_generated_yaml_is_valid(self, server):
+        """The generated YAML should parse without errors."""
+        import yaml
+
+        conn = _conn()
+        mappings = {
+            "users.email": {"action": "anonymize", "provider": "email"},
+            "users.ssn": {"action": "anonymize", "provider": "ssn"},
+            "users.token": {"action": "null", "provider": ""},
+        }
+        resp = _post(conn, "/api/generate-config", {"mappings": mappings}, server.token)
+        data = json.loads(resp.read())
+        parsed = yaml.safe_load(data["yaml"])
+        assert parsed["anonymization"]["enabled"] is True
+        assert "users.email" in parsed["anonymization"]["fields"]
+        assert "users.token" in parsed["anonymization"]["security_null_fields"]
+
+
+class TestNotFoundRoutes:
+    def test_unknown_get(self, server):
+        conn = _conn()
+        conn.request("GET", f"/unknown?token={server.token}")
+        resp = conn.getresponse()
+        assert resp.status == 404
+
+    def test_unknown_post(self, server):
+        conn = _conn()
+        resp = _post(conn, "/api/unknown", {}, server.token)
+        assert resp.status == 404
+
+
+class TestUIContent:
+    def test_html_contains_token(self, server):
+        conn = _conn()
+        conn.request("GET", f"/?token={server.token}")
+        resp = conn.getresponse()
+        body = resp.read().decode()
+        assert server.token in body
+
+    def test_html_loads_from_static_file(self, server):
+        """UI should load from the static HTML file, not an inline string."""
+        conn = _conn()
+        conn.request("GET", f"/?token={server.token}")
+        resp = conn.getresponse()
+        body = resp.read().decode()
+        # Should contain Tailwind CDN (intentional external resource)
+        assert "tailwindcss" in body
+        # Should contain key UI elements
+        assert "Column Mapping" in body
+        assert "Introspect Schema" in body
+        assert "Compliance Profiles" in body
+
+    def test_html_has_proper_structure(self, server):
+        """UI should have accessibility and structural elements."""
+        conn = _conn()
+        conn.request("GET", f"/?token={server.token}")
+        resp = conn.getresponse()
+        body = resp.read().decode()
+        assert 'lang="en"' in body
+        assert 'aria-label' in body
+        assert 'aria-live="polite"' in body
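Review note: the `server` fixture in `tests/test_mapping_ui.py` waits a fixed 0.3 seconds for the server to bind, which may be flaky on a slow CI runner. One possible alternative is to poll the port until it accepts a connection. This is a sketch using only the standard library. `wait_for_port` is a hypothetical helper, not part of dbslice, and its default timeout and interval are arbitrary.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 5.0, interval: float = 0.05) -> bool:
    """Return True once a TCP connection to (host, port) succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Not listening yet (or refused); back off briefly and retry.
            time.sleep(interval)
    return False
```

In the fixture this could replace the `time.sleep(0.3)` call with something like `assert wait_for_port("127.0.0.1", 19473)`, so the tests start as soon as the server is ready and fail with a clear assertion if it never binds.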
