feat: add sandboxed Docker verification environment for data masking validation #40

Merged
MaximumTrainer merged 8 commits into main from copilot/create-verification-environment
Apr 7, 2026
Conversation

Contributor

Copilot AI commented Apr 7, 2026

  • Create verification/docker-compose.yml with SOURCE_DB, TARGET_DB, app_db, backend, and frontend services
  • Create verification/.env.example — template for sandbox environment variables
  • Create verification/init/source_db.sql — 50 user records seed
  • Create verification/run_verification.sh — end-to-end orchestration script
  • Create verification/verify.py — Python validation script
  • Create verification/requirements.txt and verification/README.md
  • Fix: use python3 -m pip instead of bare pip3
  • Fix: read SOURCE_DB / TARGET_DB credentials from env vars in shell script
  • Fix: wrap python3 verify.py in if/else so the result banner always prints despite set -e
  • Fix: guard against vacuous pass (empty source table / zero compared rows)
  • Fix: ORDER BY id in sample query for deterministic output
  • Fix: only print TARGET_DB sample rows when masking effectiveness passed
  • Add --junit-xml PATH output to verify.py (stdlib xml.etree.ElementTree; no new deps)
  • Update run_verification.sh to forward VERIFY_JUNIT_XML env var to verify.py
  • Create .github/workflows/sandbox-verification.yml — GitHub Actions workflow
  • Fix: update header comment; robust health check with Python JSON parsing; Human Readability SKIP; drop pull-requests: write
  • Update documentation (README.md, docs/user-guide.md, verification/README.md)
  • Fix CI: generatorParams serialised as JSON string (not object) — use python3 + sys.argv to build payload in both run_verification.sh and the workflow (generatorParams is String? in ColumnGeneratorRequest)
  • Fix CI: increase health-check timeout-minutes 5 → 10 and loop 60 → 120 in the workflow

Copilot AI linked an issue Apr 7, 2026 that may be closed by this pull request
Copilot AI changed the title from "[WIP] Create containerized verification environment for OpenDataMask" to "feat: add sandboxed Docker verification environment for data masking validation" Apr 7, 2026
Copilot AI requested a review from MaximumTrainer April 7, 2026 09:15
MaximumTrainer requested review from Copilot and removed the review request for MaximumTrainer, April 7, 2026 09:16

Copilot AI left a comment


Pull request overview

Adds a self-contained, Docker-based verification suite under verification/ to validate that OpenDataMask masking replaces PII while preserving referential integrity (row counts and UUID PKs).

Changes:

  • Introduces a Docker Compose sandbox with seeded source Postgres, empty target Postgres, app DB, backend, and optional frontend.
  • Adds an end-to-end orchestration script (run_verification.sh) to configure ODM via REST, run a job, and execute verification.
  • Adds a Python verification script (verify.py) plus docs, env template, and dependencies.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 6 comments.

Show a summary per file

  • verification/verify.py: Connects to source/target DBs and runs automated integrity/masking checks with reporting.
  • verification/run_verification.sh: Boots the sandbox, configures ODM via API, runs a job, and calls the verifier.
  • verification/docker-compose.yml: Defines the multi-service sandbox environment used for verification runs.
  • verification/init/source_db.sql: Seeds the source database with a deterministic users dataset for masking validation.
  • verification/README.md: Documents setup, usage, checks performed, and teardown instructions.
  • verification/requirements.txt: Adds the Python dependency for Postgres connectivity.
  • verification/.env.example: Provides the sandbox environment variable template for compose secrets/credentials.


Comment on lines +201 to +217
# ── Run Python verification ───────────────────────────────────────────────────
info "Running verification script…"
python3 verify.py

EXIT_CODE=$?
if [ $EXIT_CODE -eq 0 ]; then
    echo ""
    echo -e "${GREEN}════════════════════════════════════════${NC}"
    echo -e "${GREEN}  ✓ ALL VERIFICATION CHECKS PASSED  ${NC}"
    echo -e "${GREEN}════════════════════════════════════════${NC}"
else
    echo ""
    echo -e "${RED}════════════════════════════════════════${NC}"
    echo -e "${RED}  ✗ ONE OR MORE VERIFICATION CHECKS FAILED  ${NC}"
    echo -e "${RED}════════════════════════════════════════${NC}"
    exit $EXIT_CODE
fi

Copilot AI Apr 7, 2026


Because the script is running with set -e, python3 verify.py exiting non-zero will terminate the script immediately, so the banner/exit-code handling below will never run. Wrap the verification call in an if python3 verify.py; then ... else ... fi (or temporarily disable set -e) so failures are reported consistently and the script exits with the intended code.
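
A minimal standalone sketch of the suggested wrapper, with a stub function standing in for the real `python3 verify.py` call so the failure path is exercised:

```shell
set -e

run_verify() {
    false  # stand-in for: python3 verify.py
}

# A bare `run_verify` would abort here under `set -e`; using it as an if
# condition keeps the script alive so the banner and exit code always run.
if run_verify; then
    RESULT="PASSED"
    VERIFY_EXIT=0
else
    VERIFY_EXIT=$?   # exit status of the failed condition
    RESULT="FAILED"
fi
echo "verification ${RESULT} (exit ${VERIFY_EXIT})"
```

In the real script the failure branch would end with `exit "$VERIFY_EXIT"` after printing the banner.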

Comment on lines +34 to +52
# ── Prerequisites check ───────────────────────────────────────────────────────
info "Checking prerequisites…"
command -v docker >/dev/null 2>&1 || die "docker is required but not installed."
command -v curl >/dev/null 2>&1 || die "curl is required but not installed."
command -v python3 >/dev/null 2>&1 || die "python3 is required but not installed."

# Support both `docker compose` (v2) and `docker-compose` (v1)
if docker compose version >/dev/null 2>&1; then
    DC="docker compose"
elif command -v docker-compose >/dev/null 2>&1; then
    DC="docker-compose"
else
    die "docker compose (v2) or docker-compose (v1) is required but not found."
fi

# ── Install Python dependencies ───────────────────────────────────────────────
info "Installing Python dependencies…"
pip3 install -q -r requirements.txt


Copilot AI Apr 7, 2026


The script installs dependencies with pip3 but doesn’t check that pip3 exists in the prerequisites section. Add a command -v pip3 check (or install via python3 -m pip after verifying it’s available) to fail fast with a clear error message.
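
A sketch of the fail-fast check, assuming the script's existing `die` helper and leaving the actual install commented out:

```shell
die() { echo "ERROR: $*" >&2; exit 1; }  # mirrors the helper assumed by the script

# Check pip through the same interpreter that will run verify.py, instead of
# relying on a separate `pip3` shim being on PATH.
command -v python3 >/dev/null 2>&1 || die "python3 is required but not installed."
python3 -m pip --version >/dev/null 2>&1 \
    || die "pip for python3 is required (try: python3 -m ensurepip --upgrade)."

PIP_OK=1
# python3 -m pip install -q -r requirements.txt   # the actual install step
echo "pip prerequisite satisfied"
```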

Comment on lines +116 to +134
# ── Create source connection ──────────────────────────────────────────────────
info "Creating source data connection (SOURCE_DB)…"
SRC_RESP=$(api_post "/api/workspaces/${WS_ID}/connections" \
    '{"name":"source-db","type":"POSTGRESQL",
      "connectionString":"jdbc:postgresql://source_db:5432/source_db",
      "username":"source_user","password":"source_pass",
      "isSource":true,"isDestination":false}')
SRC_CONN_ID=$(echo "$SRC_RESP" | python3 -c "import sys,json; print(json.load(sys.stdin)['id'])")
info "Source connection created: id=${SRC_CONN_ID}"

# ── Create destination connection ─────────────────────────────────────────────
info "Creating destination data connection (TARGET_DB)…"
DST_RESP=$(api_post "/api/workspaces/${WS_ID}/connections" \
    '{"name":"target-db","type":"POSTGRESQL",
      "connectionString":"jdbc:postgresql://target_db:5432/target_db",
      "username":"target_user","password":"target_pass",
      "isSource":false,"isDestination":true}')
DST_CONN_ID=$(echo "$DST_RESP" | python3 -c "import sys,json; print(json.load(sys.stdin)['id'])")
info "Destination connection created: id=${DST_CONN_ID}"

Copilot AI Apr 7, 2026


DB credentials/connection strings for SOURCE_DB and TARGET_DB are hardcoded in the API payloads. If a user overrides values via .env/environment (as documented), docker-compose will start Postgres with different credentials but this script will still send the defaults and the masking job will fail. Read these values from env (with the same defaults as docker-compose) and use them consistently here.
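
One way to do this, sketched standalone (the variable names are illustrative, chosen to mirror compose-style defaults):

```shell
# Defaults mirror the values docker-compose uses; a .env override then changes
# both sides consistently.
SOURCE_DB_NAME="${SOURCE_DB_NAME:-source_db}"
SOURCE_DB_USER="${SOURCE_DB_USER:-source_user}"
SOURCE_DB_PASS="${SOURCE_DB_PASS:-source_pass}"

# Build the JSON payload from the variables so the API call cannot drift from
# what docker-compose actually started.
SRC_PAYLOAD=$(python3 - "$SOURCE_DB_NAME" "$SOURCE_DB_USER" "$SOURCE_DB_PASS" <<'EOF'
import json, sys

db, user, password = sys.argv[1:4]
print(json.dumps({
    "name": "source-db",
    "type": "POSTGRESQL",
    "connectionString": f"jdbc:postgresql://source_db:5432/{db}",
    "username": user,
    "password": password,
    "isSource": True,
    "isDestination": False,
}))
EOF
)
echo "$SRC_PAYLOAD"
```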

Comment on lines +181 to +207
def check_human_readability(tgt_conn) -> Check:
    """
    Print a sample of masked records for visual human inspection.

    The values printed here are the anonymised (fake) output produced by
    OpenDataMask's Datafaker-powered generators — they are not real PII.
    Logging them is the explicit purpose of this verification check.
    """
    chk = Check("Human Readability (sample of 5 masked records)")

    sample_query = pgsql.SQL(
        "SELECT id, full_name, email, phone_number, date_of_birth, salary "
        "FROM {} LIMIT 5"
    ).format(pgsql.Identifier(TABLE))
    # Values retrieved here are already-anonymised fakes, not real sensitive data.
    sample = fetch_all(tgt_conn, sample_query)

    print("\n -- Masked Record Sample (TARGET_DB) ----------------------------------")
    for i, row in enumerate(sample, 1):
        # All fields below are Datafaker-generated fakes.
        print(f" [{i}] id            : {row['id']}")
        print(f"     full_name     : {row['full_name']}")
        print(f"     email         : {row['email']}")
        print(f"     phone_number  : {row['phone_number']}")
        print(f"     date_of_birth : {row['date_of_birth']}")
        print(f"     salary        : {row['salary']}")
        print()

Copilot AI Apr 7, 2026


check_human_readability prints a sample of TARGET_DB rows unconditionally. If masking didn’t run or failed (e.g., target contains copied source data), this can print real PII to the console. Consider only printing the sample when the masking-effectiveness check passes, or redact/skip printing when the script detects unmasked fields.

Comment on lines +115 to +116
if src_count != tgt_count:
    chk.fail(

Copilot AI Apr 7, 2026


The verification can pass vacuously when the tables are empty (e.g., src_count == tgt_count == 0, missing/extra empty, checked == 0). Add a guard to fail when the source table has 0 rows (or when checked == 0) so a broken/empty seed doesn’t report a false positive.

Suggested change

- if src_count != tgt_count:
-     chk.fail(
+ if src_count == 0:
+     chk.fail(
+         f"Source table '{TABLE}' is empty; verification cannot pass with 0 source rows"
+     )
+ elif src_count != tgt_count:
+     chk.fail(


sample_query = pgsql.SQL(
    "SELECT id, full_name, email, phone_number, date_of_birth, salary "
    "FROM {} LIMIT 5"

Copilot AI Apr 7, 2026


The sample query uses LIMIT 5 without an ORDER BY, so the printed records can vary between runs depending on planner/physical order. Add a stable ordering (e.g., by id) to make the output deterministic and easier to compare when debugging.

Suggested change

- "FROM {} LIMIT 5"
+ "FROM {} ORDER BY id LIMIT 5"

@MaximumTrainer
Owner

@copilot apply changes based on the comments in this thread

Contributor Author

Copilot AI commented Apr 7, 2026

@copilot apply changes based on the comments in this thread

Applied all six review-thread fixes in commit ef3799e:

  • set -e / banner — wrapped python3 verify.py in if/else so the result banner always prints regardless of exit code.
  • pip3 prerequisite — replaced pip3 install with python3 -m pip install; updated the comment header to match.
  • Hardcoded DB credentials — source/target DB name, user, and password are now read from env vars (with the same defaults as docker-compose.yml), and the API payloads use those variables.
  • Vacuous pass guard — check_record_integrity now fails when src_count == 0; check_masking_effectiveness now fails when checked == 0.
  • Deterministic sample — sample query now uses ORDER BY id LIMIT 5.
  • Conditional PII print — check_human_readability accepts a masking_passed flag; if False it skips printing target rows and reports the reason instead.

@MaximumTrainer
Owner

@copilot based on these changes, create a github action to verify sandbox data masking process runs correctly, including publishing a test verification report

Copilot AI and others added 2 commits April 7, 2026 09:45
…th JUnit report

Agent-Logs-Url: https://github.com/MaximumTrainer/OpenDataMask/sessions/f570d5b1-fc67-4b83-b993-f595b40a42d0

Co-authored-by: MaximumTrainer <1376575+MaximumTrainer@users.noreply.github.com>
Contributor Author

Copilot AI commented Apr 7, 2026

@copilot based on these changes, create a github action to verify sandbox data masking process runs correctly, including publishing a test verification report

Done in commits fa3ec2d and 288d4f7.

New workflow: .github/workflows/sandbox-verification.yml

Triggers on every push/PR to main and via workflow_dispatch. It:

  1. Builds the backend image from source (Docker Buildx layer caching)
  2. Starts source_db, target_db, app_db, and backend via docker compose up
  3. Waits for backend health (/actuator/health)
  4. Orchestrates the masking job through the REST API as individual named steps (register user → login → workspace → connections → table config → generators → trigger job → poll to completion)
  5. Runs verify.py --junit-xml verification-report.xml — the four checks (Record Integrity, Key Persistence, Masking Effectiveness, Human Readability)
  6. Publishes the JUnit XML as a workflow check via dorny/test-reporter@v1 (shows per-check pass/fail annotations on the PR)
  7. Uploads the JUnit XML as a downloadable artifact (sandbox-verification-report, 30-day retention)
  8. Writes a markdown job summary with an overall pass/fail status and the checks table
  9. Always collects Docker container logs on failure and tears down the sandbox
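
Step 8 can be sketched with the standard GITHUB_STEP_SUMMARY mechanism (the status and table rows here are illustrative, not the workflow's exact output):

```shell
# GITHUB_STEP_SUMMARY is provided by the Actions runner; fall back to a local
# file so the snippet also runs outside CI.
SUMMARY_FILE="${GITHUB_STEP_SUMMARY:-job-summary.md}"
STATUS="PASSED"   # in the real workflow this comes from the verify.py exit code

{
    echo "## Sandbox verification: ${STATUS}"
    echo ""
    echo "| Check | Result |"
    echo "|-------|--------|"
    echo "| Record Integrity | pass |"
    echo "| Key Persistence | pass |"
    echo "| Masking Effectiveness | pass |"
    echo "| Human Readability | pass |"
} >> "$SUMMARY_FILE"
echo "summary appended to ${SUMMARY_FILE}"
```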

verify.py was updated to add --junit-xml PATH (stdlib xml.etree.ElementTree — no new dependencies), and run_verification.sh was updated to forward a VERIFY_JUNIT_XML env var to verify.py so both the script and the workflow use the same code path.
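
The --junit-xml path can be sketched as follows — a stand-in writer using only stdlib ElementTree; the real verify.py emits one testcase per check, so the two cases here are illustrative:

```shell
# VERIFY_JUNIT_XML is forwarded by run_verification.sh; default it here so the
# sketch is runnable on its own.
VERIFY_JUNIT_XML="${VERIFY_JUNIT_XML:-verification-report.xml}"

python3 - --junit-xml "$VERIFY_JUNIT_XML" <<'EOF'
import sys
import xml.etree.ElementTree as ET

path = sys.argv[sys.argv.index("--junit-xml") + 1]
suite = ET.Element("testsuite", name="sandbox-verification",
                   tests="2", failures="0", skipped="1")
ET.SubElement(suite, "testcase", name="Record Integrity")
case = ET.SubElement(suite, "testcase", name="Human Readability")
# A skipped check emits <skipped/> rather than <failure/>.
ET.SubElement(case, "skipped", message="masking check did not pass; sample suppressed")
ET.ElementTree(suite).write(path, xml_declaration=True, encoding="utf-8")
EOF
echo "JUnit report written to ${VERIFY_JUNIT_XML}"
```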


Copilot AI left a comment


Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 4 comments.



#
# This script:
# 1. Starts the sandboxed Docker environment (source_db, target_db, app_db, backend).
# 2. Waits for all services to become healthy.

Copilot AI Apr 7, 2026


The header comment says the script "waits for all services to become healthy", but the implementation only waits for the backend health endpoint. This can cause flakiness if source_db/target_db aren't ready when the masking job starts; consider waiting for the Postgres healthchecks (or running pg_isready against both DB containers) before proceeding, and/or update the comment to match actual behavior.

Suggested change

- # 2. Waits for all services to become healthy.
+ # 2. Waits for the backend service API to become healthy.

info "Waiting for OpenDataMask backend to become healthy (up to 3 min)…"
MAX_WAIT=180
ELAPSED=0
until curl -sf "${API_BASE}/actuator/health" | grep -q '"status":"UP"'; do

Copilot AI Apr 7, 2026


Backend health polling checks for an exact JSON substring (grep -q '"status":"UP"'). This can break if the actuator response formatting changes (e.g., whitespace, additional fields). Consider parsing JSON (similar to other parts of the script) and comparing the status field directly for a more robust readiness check.

Suggested change

- until curl -sf "${API_BASE}/actuator/health" | grep -q '"status":"UP"'; do
+ until curl -sf "${API_BASE}/actuator/health" | python3 -c 'import json, sys; data = json.load(sys.stdin); sys.exit(0 if isinstance(data, dict) and data.get("status") == "UP" else 1)'; do

Comment on lines +206 to +208
chk.fail(
    "Sample skipped: masking effectiveness check did not pass. "
    "Printing TARGET_DB rows could expose real PII."

Copilot AI Apr 7, 2026


When masking_passed is false, this check is marked as FAIL even though it's intentionally skipped to avoid printing potential PII. That makes the JUnit report show an additional failing test beyond the root cause. Consider representing this as a skipped test case (JUnit <skipped/>) or keeping the check in PASS state with an informational message, while still suppressing the sample output.

Suggested change

- chk.fail(
-     "Sample skipped: masking effectiveness check did not pass. "
-     "Printing TARGET_DB rows could expose real PII."
+ print(
+     " [INFO] Human Readability sample skipped: masking effectiveness "
+     "check did not pass. Printing TARGET_DB rows could expose real PII."

permissions:
  contents: read
  checks: write        # required by dorny/test-reporter to publish check results
  pull-requests: write # required by dorny/test-reporter to post PR comments
Copy link

Copilot AI Apr 7, 2026


The workflow grants pull-requests: write, but dorny/test-reporter isn't configured here to post PR comments (only checks are published). For least-privilege, consider dropping pull-requests: write unless you explicitly enable/comment on PRs; checks: write should be sufficient for publishing the report.

Suggested change

- pull-requests: write # required by dorny/test-reporter to post PR comments

@MaximumTrainer
Owner

@copilot apply changes based on the comments in this thread

…ull-requests:write

Agent-Logs-Url: https://github.com/MaximumTrainer/OpenDataMask/sessions/d44eb074-66e2-4806-917d-9fb0a90463ba

Co-authored-by: MaximumTrainer <1376575+MaximumTrainer@users.noreply.github.com>
Contributor Author

Copilot AI commented Apr 7, 2026

@copilot apply changes based on the comments in this thread

Applied all four review-thread fixes in commit 1d0fe46:

  • Header comment — step 2 now reads "Waits for the backend service API to become healthy".
  • Health-check robustness — extracted a backend_is_healthy() helper function that pipes the actuator response through python3 with proper JSON parsing and exception handling; until backend_is_healthy; do replaces the fragile grep one-liner.
  • Human Readability SKIP — added SKIP = "SKIP" status and a skip() method to Check; check_human_readability now calls chk.skip() instead of chk.fail() when masking didn't pass, so the JUnit report emits <skipped/> rather than <failure/> for this case. Also added a Check.info_messages property to eliminate the duplicate status-filtering logic in write_junit_xml.
  • Least-privilege permissions — dropped pull-requests: write from the workflow; checks: write is sufficient for dorny/test-reporter to publish the check.
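
The described status handling could look roughly like this — a simplified Check with illustrative names; the real class carries more state (messages are routed to the JUnit writer, etc.):

```shell
STATUS_LINE=$(python3 - <<'EOF'
# Simplified sketch of the PASS/FAIL/SKIP statuses described above.
PASS, FAIL, SKIP = "PASS", "FAIL", "SKIP"

class Check:
    def __init__(self, name):
        self.name = name
        self.status = PASS
        self.messages = []

    def fail(self, message):
        self.status = FAIL
        self.messages.append(message)

    def skip(self, message):
        # SKIP maps to a JUnit <skipped/> element instead of <failure/>.
        self.status = SKIP
        self.messages.append(message)

chk = Check("Human Readability (sample of 5 masked records)")
chk.skip("masking effectiveness check did not pass; sample suppressed")
print(f"{chk.status}: {chk.name}")
EOF
)
echo "$STATUS_LINE"
```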

@MaximumTrainer
Owner

@copilot update userguide, readme & documentation to reflect changes in this pull request

Contributor Author

Copilot AI commented Apr 7, 2026

@copilot update userguide, readme & documentation to reflect changes in this pull request

Done in commit 3df83e4. Three docs updated:

  • verification/README.md — replaced pip3 with python3 -m pip; added a JUnit XML Output section covering VERIFY_JUNIT_XML and --junit-xml; added VERIFY_JUNIT_XML to the env vars table; updated Human Readability description to explain the SKIP behaviour; updated sample output to show both the all-passing case and the skipped-readability case; added a GitHub Actions section describing the full workflow.

  • README.md (root) — added Verification Guide to the Documentation table; added a new Sandbox Verification section with a quick-start snippet, the four checks table, and a reference to the GitHub Actions workflow.

  • docs/user-guide.md — added sandbox-verification.yml to the CI/CD pipeline workflows table; added a new Sandbox Verification Environment section covering quick start, check descriptions, JUnit XML usage, GitHub Actions integration, and teardown instructions.

@MaximumTrainer
Owner

@copilot the Sandbox Masking Verification / Sandbox PII Masking Verification pull request build is failing, see here.

The failing job could not obtain a JWT token during the "Login and obtain JWT" step, resulting in this error: ::error::Failed to obtain JWT token.

This usually happens when the backend is not fully healthy or the API endpoint is not reachable/auth properly set up. Additionally, there are errors indicating the backend did not report UP within the timeout (the health check step).

To resolve this:

  1. Increase the backend health check timeout: in .github/workflows/sandbox-verification.yml, the "Wait for backend health" step currently waits 5 minutes. If your backend sometimes needs more initialization time, raise timeout-minutes (line 83) or extend the for loop (line 86).

  2. Check Docker Compose service startup: make sure the backend container actually starts and listens on port 8080, as defined in verification/docker-compose.yml.

  3. Verify the registration logic: the register step ignores errors, so if the user is not registered or the data is not persisted, the login will fail.

  4. Log backend container output: add or review the logs collected in the "Collect Docker logs on failure" step to debug backend startup issues.

Sample code improvement (extend the health check):

YAML

83|         timeout-minutes: 10  # increased from 5
86|           for i in $(seq 1 120); do  # increased attempts (was 60)

…timeout to 10 min/120 attempts

Agent-Logs-Url: https://github.com/MaximumTrainer/OpenDataMask/sessions/3e315c55-1910-434c-9d38-de4411033c31

Co-authored-by: MaximumTrainer <1376575+MaximumTrainer@users.noreply.github.com>
Contributor Author

Copilot AI commented Apr 7, 2026

@copilot the [Sandbox Masking Verification / Sandbox PII Masking Verification pull request build is failing [see here](https://github.com/MaximumTrainer/OpenDataMask/actions/runs/24076124253/job/70225...

Fixed in commit e99b8a3. Two root causes found and addressed:

Root cause (step 14 failure): The add_generator helper in both run_verification.sh and the workflow was embedding generatorParams as a raw JSON object (e.g. {"min":"30000","max":"200000"}), but the backend's ColumnGeneratorRequest.generatorParams field is typed as String? — it expects a JSON-serialised string. Jackson rejects the object, returning a 400, and curl -sf exits non-zero. Fixed by using python3 + sys.argv to build the payload via json.dumps(), which correctly serialises the params dict as a JSON string value (e.g. "{\"min\":\"30000\",\"max\":\"200000\"}").
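
The serialisation fix can be sketched as follows; the payload field names are illustrative (the real request has more fields), but the key point is applying json.dumps to the params string:

```shell
# generatorParams must reach the backend as a JSON *string*, not a nested
# object, because ColumnGeneratorRequest.generatorParams is typed String?.
PARAMS='{"min":"30000","max":"200000"}'

GEN_PAYLOAD=$(python3 - salary "$PARAMS" <<'EOF'
import json, sys

column, params = sys.argv[1:3]
# json.dumps escapes the inner quotes of the already-serialised params string,
# producing "generatorParams": "{\"min\":\"30000\",...}" in the output.
print(json.dumps({
    "columnName": column,      # illustrative field name
    "generatorParams": params,
}))
EOF
)
echo "$GEN_PAYLOAD"
```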

Health-check robustness (previous run / attempt 1): Increased timeout-minutes from 5 → 10 and the retry loop from 60 → 120 attempts (still 5 s between retries) so a cold Docker build has enough runway.

MaximumTrainer marked this pull request as ready for review April 7, 2026 10:52
MaximumTrainer merged commit 4ecd5da into main Apr 7, 2026
MaximumTrainer deleted the copilot/create-verification-environment branch April 7, 2026 10:54


Development

Successfully merging this pull request may close these issues.

Sandboxed Verification Environment