Fix sandbox-verification workflow YAML parsing, Python argument, and generator type errors #45

Merged
MaximumTrainer merged 6 commits into main from copilot/fix-sandbox-verification-workflow
Apr 7, 2026
Contributor

Copilot AI commented Apr 7, 2026

  • Fix YAML parsing error in sandbox-verification workflow
  • Fix -- argument bug in Python invocations
  • Fix BIRTH_DATE generator to return java.sql.Date instead of String
  • Fix DATE generator to return java.sql.Date instead of String
  • Fix RANDOM_INT generator to return Long instead of String (salary column failing)
  • Fix SEQUENTIAL generator to return Long instead of String
  • Fix MONEY_AMOUNT generator to return BigDecimal instead of String
  • Add unit tests asserting correct types for numeric generators
  • Update GeneratorServiceCompositeTest to expect Long for SEQUENTIAL and RANDOM_INT

…parsing error

Lines 183-184 and 188-189 had Python code at column 1 (zero indentation)
inside a `run: |` block scalar, which broke YAML parsing. GitHub showed
the workflow run as failed with 0 jobs because it couldn't parse the file.

Fixed by inlining the Python code on single lines using semicolons.
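
A minimal sketch of the single-line form described above, assuming the workflow builds its JSON body from two positional arguments. The keys `columnName` and `generatorType` appear later in this thread; the helper name `build_body` and the exact one-liner are illustrative assumptions, not the workflow's actual contents.

```python
import json
import subprocess
import sys

# One-line program: statements are joined with semicolons so that, inside a
# YAML `run: |` block, no line of Python ever has to start at column 1.
ONE_LINER = (
    "import json, sys; "
    "print(json.dumps({'columnName': sys.argv[1], 'generatorType': sys.argv[2]}))"
)


def build_body(column: str, generator_type: str) -> dict:
    """Run the one-liner the way a shell step would and parse what it prints."""
    result = subprocess.run(
        [sys.executable, "-c", ONE_LINER, column, generator_type],
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)
```

Calling `build_body("full_name", "FULL_NAME")` should yield a dictionary with those two values under `columnName` and `generatorType`.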

Agent-Logs-Url: https://github.com/MaximumTrainer/OpenDataMask/sessions/4c9208a0-04e7-4727-b061-0407742be792

Co-authored-by: MaximumTrainer <1376575+MaximumTrainer@users.noreply.github.com>

Copilot AI left a comment


Pull request overview

This PR fixes a YAML parsing issue in the sandbox-verification GitHub Actions workflow so the workflow file is valid and jobs can be created/executed as expected.

Changes:

  • Rewrites the multi-line python3 -c invocations used to build JSON request bodies into single-line commands.
  • Removes zero-indentation lines inside a run: | block that were causing the workflow YAML to fail parsing (resulting in “0 jobs”).


@MaximumTrainer
Owner

@copilot this build is failing. The job failed because the backend service did not report healthy within the expected timeout, so all subsequent steps (including obtaining a JWT token and test execution) could not continue.

Here’s how to address this:

  1. Investigate Backend Service Startup:
    • Check the backend container logs to see why /actuator/health never returned "UP".
    • Common problems are:
      • Database containers (source_db, target_db, app_db) not started, initialized, or accessible by the backend.
      • Backend failing to bind to port 8080.
  2. Race Condition with Database:
    • The backend may fail health checks if a Postgres DB isn’t ready.
    • Solution: Add a health check or wait-until-ready script for databases before launching the backend. In verification/docker-compose.yml, ensure each service has a healthcheck and that the backend service uses depends_on with condition: service_healthy.
  3. Examine Docker Compose and Backend Code:
    • Ensure DATABASE_URL or related environment variables in backend’s config match what Docker Compose is providing.
    • If the backend expects migrations to run, ensure they execute before the backend starts (a common cause of “unhealthy”).
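
The wait-until-ready idea above could be sketched roughly as follows. Port 8080, /actuator/health, and the "UP" status come from the comment; the function names, the retry counts, and the localhost URL are assumptions made for illustration.

```python
import json
import time
import urllib.error
import urllib.request
from typing import Callable


def fetch_status(url: str) -> str:
    """Return the `status` field of a health endpoint, or "" if unavailable."""
    try:
        with urllib.request.urlopen(url, timeout=5) as response:
            payload = json.load(response)
    except (urllib.error.URLError, OSError, ValueError):
        return ""
    return payload.get("status", "") if isinstance(payload, dict) else ""


def wait_until_up(
    fetch: Callable[[], str],
    attempts: int = 30,
    delay_seconds: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll `fetch` until it returns "UP" or the attempts run out."""
    for attempt in range(attempts):
        if fetch() == "UP":
            return True
        if attempt < attempts - 1:
            sleep(delay_seconds)  # back off before the next probe
    return False


# Example call (hypothetical URL):
# wait_until_up(lambda: fetch_status("http://localhost:8080/actuator/health"))
```

Passing `fetch` and `sleep` as parameters keeps the polling loop testable without a running backend.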

Python does not consume -- as an option terminator; it passes it
through as sys.argv[1]. This caused the JSON body to contain
{"columnName":"--","generatorType":"full_name"} instead of
{"columnName":"full_name","generatorType":"FULL_NAME"}, resulting
in a 400 Bad Request from the API (curl exit code 22).

Agent-Logs-Url: https://github.com/MaximumTrainer/OpenDataMask/sessions/bd4340f7-88e2-4b27-b879-d21be6992abf

Co-authored-by: MaximumTrainer <1376575+MaximumTrainer@users.noreply.github.com>
Contributor Author

Copilot AI commented Apr 7, 2026

@copilot this build is failing. The job failed because the backend service did not report healthy withi...

The backend was actually healthy and running fine — the real failure was in the "Add column generators" step (curl exit code 22 = HTTP 400).

Root cause: the python3 -c invocations used -- as an option separator, but Python doesn't consume -- — it passes it through as sys.argv[1]. This produced malformed JSON like {"columnName":"--","generatorType":"full_name"} instead of {"columnName":"full_name","generatorType":"FULL_NAME"}, causing the API to reject the request with a 400 Bad Request.

Fixed in 02da15e by removing -- from both Python invocations.
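
The `--` behaviour described above can be checked with a small sketch like the one below. The helper name `argv_seen_by_command` is hypothetical; it only prints `sys.argv` from a `python -c` program and does not reproduce the workflow's real command.

```python
import json
import subprocess
import sys


def argv_seen_by_command(*args: str) -> list:
    """Return sys.argv as observed by a `python -c` program given extra args."""
    result = subprocess.run(
        [sys.executable, "-c", "import json, sys; print(json.dumps(sys.argv))", *args],
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)


# With the separator, "--" occupies sys.argv[1] and shifts the real arguments.
with_separator = argv_seen_by_command("--", "full_name", "FULL_NAME")

# Without it, the column name lands in sys.argv[1] as intended.
without_separator = argv_seen_by_command("full_name", "FULL_NAME")
```

Here `with_separator[1]` is `"--"` while `without_separator[1]` is `"full_name"`, which matches the malformed and corrected JSON bodies quoted in the comment.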

Copilot AI changed the title from "Fix sandbox-verification workflow YAML parsing error" to "Fix sandbox-verification workflow YAML parsing and Python argument errors" Apr 7, 2026
Copilot AI requested a review from MaximumTrainer April 7, 2026 12:43
@MaximumTrainer
Owner

@copilot the build is failing.
The job failed due to an SQL error:

ERROR: column "date_of_birth" is of type date but expression is of type character varying
Hint: You will need to rewrite or cast the expression.

This indicates that the code tried to insert a string (character varying) into the "date_of_birth" column (which expects a date type) in the users table.

Solution:

  • Ensure that the value for "date_of_birth" in your masking code or data generation logic is a valid date, not a string, or explicitly cast the value to a date.
  • In SQL INSERT statements, you should either use an ISO-formatted date string and cast it, for example:
        '1968-06-10'::date
    or pass a proper Date/Datetime object if you're binding parameters in application code.
  • If the generated value for "date_of_birth" is a string, update the code where the masked user rows are assembled and ensure the date_of_birth field is formatted and/or cast as a date.

For example, if you use Python for generating values:

    from datetime import datetime

    # If you currently have:
    row["date_of_birth"] = "1968-06-10 03:05:52.8941"  # String

    # Use this instead:
    row["date_of_birth"] = datetime.strptime("1968-06-10", "%Y-%m-%d").date()

Or, if you use SQL for inserts:

    INSERT INTO users (..., date_of_birth, ...)
    VALUES (..., '1968-06-10'::date, ...);

Summary Fix:

  • Locate where the masked data is generated or mapped for the 'users' table.
  • Make sure the "date_of_birth" field is of type date, not a plain string, before inserting it into the database.
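
To make the Python suggestion above a little more general, here is a hedged sketch of a coercion helper that accepts a date, a datetime, or an ISO-style string. The name `to_date` and the decision to keep only the first ten characters of a string are assumptions for illustration, not code from this repository.

```python
from datetime import date, datetime


def to_date(value) -> date:
    """Coerce a generated value to datetime.date before binding it to a DATE column."""
    if isinstance(value, datetime):
        # Checked first because datetime is a subclass of date.
        return value.date()
    if isinstance(value, date):
        return value
    if isinstance(value, str):
        # The first 10 characters of "YYYY-MM-DD[ hh:mm:ss...]" are the date part.
        return date.fromisoformat(value[:10])
    raise TypeError(f"cannot convert {type(value).__name__} to date")


row = {"date_of_birth": "1968-06-10 03:05:52.8941"}
row["date_of_birth"] = to_date(row["date_of_birth"])
```

After this runs, `row["date_of_birth"]` holds a `datetime.date` rather than a string, so a database driver can bind it as a date parameter.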

The BIRTH_DATE generator was returning faker.date().birthday().toString()
which produces a String. When PostgreSQLConnector.writeData() passes this
to stmt.setObject(), PostgreSQL rejects it with:
  ERROR: column "date_of_birth" is of type date but expression is of
  type character varying

Fix: return java.sql.Date(faker.date().birthday().time) so JDBC
correctly maps it to a SQL DATE type.

Agent-Logs-Url: https://github.com/MaximumTrainer/OpenDataMask/sessions/0451c428-2e84-42ae-b750-404b5345b2a9

Co-authored-by: MaximumTrainer <1376575+MaximumTrainer@users.noreply.github.com>
Contributor Author

Copilot AI commented Apr 7, 2026

@copilot the build is failing.
The job failed due to an SQL error:

...

The BIRTH_DATE generator in GeneratorService.kt was returning faker.date().birthday().toString() — a String. When the PostgreSQL connector's writeData() passes this to stmt.setObject(), PostgreSQL rejects it because a varchar can't be implicitly cast to date.

Fixed in 2f41ec4 by returning java.sql.Date(faker.date().birthday().time) instead, which JDBC correctly maps to a SQL DATE type.

Copilot AI changed the title from "Fix sandbox-verification workflow YAML parsing and Python argument errors" to "Fix sandbox-verification workflow YAML parsing, Python argument, and BIRTH_DATE type errors" Apr 7, 2026
@MaximumTrainer
Owner

@copilot review GeneratorService.kt and its unit test to ensure all data times return in the correct format, not just strings. see BIRTH_DATE as an example.

… for all date generators

The DATE generator had the same string-returning bug as BIRTH_DATE:
it called .toString() on the faker date, producing a String that
PostgreSQL rejected for date columns.

- Changed DATE generator to return java.sql.Date (matching BIRTH_DATE)
- Updated DATE test to assert java.sql.Date type instead of just non-null
- Added BIRTH_DATE test asserting java.sql.Date type

Agent-Logs-Url: https://github.com/MaximumTrainer/OpenDataMask/sessions/2f6ca3b8-126f-44ee-940c-2ee61c615b98

Co-authored-by: MaximumTrainer <1376575+MaximumTrainer@users.noreply.github.com>
Contributor Author

Copilot AI commented Apr 7, 2026

@copilot review GeneratorService.kt and its unit test to ensure all data times return in the correct format, not just strings. see BIRTH_DATE as an example.

Reviewed all generators in GeneratorService.kt — the DATE generator had the same string-returning bug as BIRTH_DATE: it called .toString() on the faker date. Fixed both to return java.sql.Date, and added assertInstanceOf(java.sql.Date) tests for both generators. Commit e1a1197.

Copilot AI changed the title from "Fix sandbox-verification workflow YAML parsing, Python argument, and BIRTH_DATE type errors" to "Fix sandbox-verification workflow YAML parsing, Python argument, and date type errors" Apr 7, 2026
…NT generators

RANDOM_INT returned .toString() producing a String, causing PostgreSQL
to reject inserts into numeric columns (e.g. "salary" of type numeric).
Same issue existed in SEQUENTIAL and MONEY_AMOUNT generators.

- RANDOM_INT: return Long instead of String
- SEQUENTIAL: return Long instead of String
- MONEY_AMOUNT: return BigDecimal instead of String
- Add assertInstanceOf tests for all three generators

Agent-Logs-Url: https://github.com/MaximumTrainer/OpenDataMask/sessions/888b36ec-98e1-4f51-bff9-4eee4fc169e9

Co-authored-by: MaximumTrainer <1376575+MaximumTrainer@users.noreply.github.com>
Contributor Author

Copilot AI commented Apr 7, 2026

@copilot fix this build
The job failed due to a database type mismatch: the masking process tried to i...

The RANDOM_INT generator was calling .toString() on the numeric result, same pattern as the date bugs. Also fixed SEQUENTIAL (returned Long.toString()) and MONEY_AMOUNT (returned String from faker.commerce().price()).

All three now return proper types:

  • RANDOM_INT → Long
  • SEQUENTIAL → Long
  • MONEY_AMOUNT → BigDecimal

Added assertInstanceOf tests for each. Commit 5fe7feb.
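
As a rough Python analogue of the change described above (the project itself is Kotlin, so this is not its code), the sketch below returns native numeric types instead of strings. The function names, the default bounds, the inclusive upper bound in `random_int`, and the cents-based `money_amount` are assumptions; the `start - step` seeding mirrors the SEQUENTIAL snippet quoted later in this thread.

```python
import random
from decimal import Decimal
from typing import Dict, Optional

# Per-column counters, keyed the same way as the quoted SEQUENTIAL generator.
_sequential_counters: Dict[str, int] = {}


def sequential(params: Optional[dict] = None, column_key: Optional[str] = None) -> int:
    """Return the next value in a per-column sequence as an int, not a str."""
    params = params or {}
    start = int(params.get("start", 1))
    step = int(params.get("step", 1))
    key = column_key or "default"
    # Seed at start - step so the first value handed out equals `start`.
    current = _sequential_counters.setdefault(key, start - step) + step
    _sequential_counters[key] = current
    return current


def random_int(params: Optional[dict] = None) -> int:
    """Return a random int between min and max (inclusive in this sketch)."""
    params = params or {}
    low = int(params.get("min", 1))
    high = int(params.get("max", 999999))
    return random.randint(low, high)


def money_amount() -> Decimal:
    """Return a Decimal with two decimal places, suitable for a numeric column."""
    cents = random.randint(0, 99_999)
    return (Decimal(cents) / Decimal(100)).quantize(Decimal("0.01"))
```

Type checks in the spirit of the `assertInstanceOf` tests would then be `isinstance(random_int(), int)` and `isinstance(money_amount(), Decimal)`.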

MaximumTrainer requested a review from Copilot April 7, 2026 15:37
MaximumTrainer marked this pull request as ready for review April 7, 2026 15:37
MaximumTrainer removed their request for review April 7, 2026 15:37
Copilot AI changed the title from "Fix sandbox-verification workflow YAML parsing, Python argument, and date type errors" to "Fix sandbox-verification workflow YAML parsing, Python argument, and generator type errors" Apr 7, 2026
Copilot AI requested a review from MaximumTrainer April 7, 2026 15:38
MaximumTrainer removed their request for review April 7, 2026 15:40

Copilot AI left a comment


Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.



Comment on lines 134 to 145
      GeneratorType.SEQUENTIAL -> {
          val start = params?.get("start")?.toLongOrNull() ?: 1L
          val step = params?.get("step")?.toLongOrNull() ?: 1L
          val key = columnKey ?: "default"
          val counter = sequentialCounters.computeIfAbsent(key) { AtomicLong(start - step) }
  -       counter.addAndGet(step).toString()
  +       counter.addAndGet(step)
      }
      GeneratorType.RANDOM_INT -> {
          val min = params?.get("min")?.toLongOrNull() ?: 1L
          val max = params?.get("max")?.toLongOrNull() ?: 999999L
  -       faker.number().numberBetween(min, max).toString()
  +       faker.number().numberBetween(min, max)
      }

Copilot AI Apr 7, 2026


GeneratorType.SEQUENTIAL and GeneratorType.RANDOM_INT now return Longs (and MONEY_AMOUNT returns BigDecimal). There are existing unit tests in backend/src/test/kotlin/com/opendatamask/application/service/GeneratorServiceCompositeTest.kt that still cast SEQUENTIAL/RANDOM_INT results to String and assert string behavior; those tests will fail and should be updated to match the new return types (or removed if redundant).

@MaximumTrainer
Owner

@copilot apply changes based on the comments in this thread.

@MaximumTrainer
Owner

@copilot this build is not completed due to failed tests

The job failed due to six test failures in GeneratorServiceCompositeTest, all related to either ClassCastException or assertion errors:

Key failures:

  • RANDOM_INT returns a string not an integer() – AssertionFailedError at line 186
  • SEQUENTIAL starts at configured start value(), uses separate counters per column key(), increments by step() – ClassCastException at lines 140, 152, 162
  • RANDOM_INT returns string within range(), uses defaults when no params() – ClassCastException at lines 172, 191

Solution and code suggestions:

  1. Review the tested functions to ensure RANDOM_INT is returning and being asserted as a string, or update the test to expect an int if the return type is intended to be integer.
  2. For SEQUENTIAL and RANDOM_INT, inspect any casting logic around the reported lines. If values are returned or read as type X but cast to Y, align types in both production and test code.
  3. Example correction for Kotlin casting:
    // If the tested function returns Any, cast as the correct type
    val result = generator.randomInt()
    val value: String = result as? String ?: result.toString()
    // Or update the test if it should be Int:
    val value: Int = result as? Int ?: (result as String).toInt()
  4. Check the implementation to avoid assuming the returned type, especially if generator functions are composable or configurable.

Update GeneratorServiceCompositeTest, especially lines 140, 152, 162, 172, 186, and 191, to ensure type consistency between the expected and actual values. This should resolve the runtime ClassCastException and assertion failures.

Contributor Author

Copilot AI commented Apr 7, 2026

@copilot apply changes based on the comments in this thread.

Updated GeneratorServiceCompositeTest.kt — all SEQUENTIAL and RANDOM_INT tests now expect Long instead of String, matching the new return types. All tests pass. Commit 0d187b5.

Contributor Author

Copilot AI commented Apr 7, 2026

@copilot this build is not completed due to failed tests

The job failed due to six test failures in Generat...

These test failures were already fixed in 0d187b5 — all SEQUENTIAL and RANDOM_INT tests in GeneratorServiceCompositeTest now cast to Long and assert numeric behavior instead of String. All 6 previously failing tests pass.

Copilot AI requested a review from MaximumTrainer April 7, 2026 15:53
MaximumTrainer requested review from MaximumTrainer and removed request for MaximumTrainer April 7, 2026 15:53
MaximumTrainer merged commit d792720 into main Apr 7, 2026
15 checks passed