Add DocumentDB functional testing framework #1
Conversation
- Implement complete test framework with pytest
  - 36 tests covering find, aggregate, and insert operations
  - Multi-engine support with custom connection strings
  - Automatic test isolation and cleanup
  - Tag-based test organization and filtering
  - Parallel execution support with pytest-xdist
- Add smart result analyzer
  - Automatic marker detection using heuristics
  - Filters test names, file names, and engine names
  - Categorizes failures: PASS/FAIL/UNSUPPORTED/INFRA_ERROR
  - CLI tool: docdb-analyze with text and JSON output
- Configure development tools
  - Black for code formatting
  - isort for import sorting
  - flake8 for linting
  - mypy for type checking
  - pytest-cov for coverage reporting
- Add comprehensive documentation
  - README with usage examples and best practices
  - CONTRIBUTING guide for writing tests
  - result_analyzer/README explaining analyzer behavior
  - All code formatted and linted
- Add Docker support
  - Dockerfile for containerized testing
  - .dockerignore for clean builds

Test Results: All 36 tests passed (100%) against DocumentDB
- Update test_find_empty_projection to use documents marker instead of manual insert
- Update test_match_empty_result to use documents marker instead of manual insert
- Ensures consistent test data setup and automatic cleanup
- All 36 tests still passing
- Remove json_report, json_report_indent, json_report_omit from config
- These are command-line options, not pytest.ini settings
- Add comment explaining proper usage
- All 36 tests still passing with no warnings
- Add GitHub Actions workflow for automated Docker builds
- Build for linux/amd64 and linux/arm64 platforms
- Push to GitHub Container Registry (ghcr.io)
- Auto-tag images: latest, sha-*, version tags
- Update README with pre-built image pull instructions
- Fix Dockerfile casing warning (FROM...AS)

Workflow Features:
- Runs on push to main and on pull requests
- Multi-platform support for Intel/AMD and ARM/Graviton
- Automatic versioning from git tags
- GitHub Actions cache for faster builds
- Uses dynamic repository variable (works on forks and upstream)
- Remove the 'Image digest' step that was causing exit code 127
- The metadata and tags are already captured by the build step
- The build step itself will show all relevant information in its logs
```python
metafunc.parametrize(
    "engine_name,engine_connection_string",
    [(name, conn) for name, conn in engines.items()],
    ids=list(engines.keys()),
)
```
Adding the value of --engine to the test ID means IDs will not be stable across runs, which makes it harder to compare results for each test across executions of the test system.
If you are looking to use the engine parameter to distinguish between two systems under test, I think it is a simpler approach to run pytest separately against each system under test. Doing multiple in one pytest session mixes results, requires both engines to be available simultaneously, and makes comparison harder. You need to parse the engine out of the test ID, and if the two engines are versions of the same engine (say DocumentDB 1.0 vs 1.1), it's not possible to just parse out "documentdb" to distinguish which test results belong to which engine.
I have updated the RFC since raising this implementation PR to get rid of direct comparisons while doing the test. The tests will run independently against each engine and then we can use those results to compare using a separate tool. I am going to update this implementation to reflect the current state in the RFC.
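For illustration, a minimal sketch of that separate-runs-then-compare flow, assuming each run produces a pytest-json-report file; the file names and the exact JSON shape used here are assumptions, not part of this PR:

```python
import json

def load_outcomes(path):
    # Map each test's nodeid to its outcome from a pytest-json-report file
    # (assumes the report has a top-level "tests" list with "nodeid"/"outcome").
    with open(path) as f:
        report = json.load(f)
    return {t["nodeid"]: t["outcome"] for t in report.get("tests", [])}

def diff_runs(path_a, path_b):
    # Return the tests whose outcome differs between the two runs.
    a, b = load_outcomes(path_a), load_outcomes(path_b)
    return {
        nodeid: (a.get(nodeid), b.get(nodeid))
        for nodeid in sorted(set(a) | set(b))
        if a.get(nodeid) != b.get(nodeid)
    }

# e.g. diff_runs("results-documentdb-1.0.json", "results-documentdb-1.1.json")
```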
```python
parser.addoption(
    "--engine",
    action="append",
    default=[],
    help="Engine to test against. Format: name=connection_string. "
    "Example: --engine documentdb=mongodb://localhost:27017 "
    "--engine mongodb=mongodb://mongo:27017",
)
```
The inclusion of a connection string seems necessary, but it doesn't seem necessary to specify the name of the engine implementation: documentdb=... or mongodb=.... It's not clear what value that is supposed to add.
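If the engine name were dropped, the option could be reduced to just a connection string. A hedged sketch of that shape (the option name and default are assumptions, not something from this PR):

```python
def pytest_addoption(parser):
    # Only the connection string of the system under test is required.
    parser.addoption(
        "--connection-string",
        action="store",
        default="mongodb://localhost:27017",
        help="Connection string of the engine under test",
    )
```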
```python
if outcome == "skipped":
    # Skipped tests typically indicate unsupported features
    return FailureType.UNSUPPORTED
```
I don't think this mapping is right. pytest.skip() is used for many reasons. Yes, for unsupported features, but also:
- if the test needs a specific environment (e.g., requires TLS)
- if the test is temporarily disabled while debugging
- if there is a conditional skip (using @pytest.mark.skipif(<some predicate>))

If we want to compare test executions against each other (say for comparing versions over time), we need to distinguish things that are unsupported (i.e., real compatibility gaps) vs things that are skipped for legitimate reasons.
We probably need an explicit @pytest.mark.unsupported("reasons") for behaviors that are actually unsupported.
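As an illustration of that direction, a minimal sketch assuming a marker named `unsupported` and reusing the PR's FailureType categories; none of this code is in the PR as written:

```python
import pytest

# conftest.py: register the marker so it also works with --strict-markers
def pytest_configure(config):
    config.addinivalue_line(
        "markers",
        "unsupported(reason): behavior is a known compatibility gap "
        "in the engine under test",
    )

# In a test module: annotate a real compatibility gap explicitly, so the
# analyzer can report UNSUPPORTED without guessing from skip outcomes.
@pytest.mark.unsupported("collation options are not implemented")  # hypothetical example
def test_find_with_collation(collection):
    ...
```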
Though I'm also confused as to when we would mark a behavior as "unsupported" vs just not having the test. Most cases I can think of for a behavior being unsupported are relative to some other version of the system under test.
```python
# Create unique database name based on test name
test_name = request.node.name.replace("[", "_").replace("]", "_")
db_name = f"test_{test_name}"[:63]  # MongoDB database name limit
```
This approach to naming can cause collisions when running tests in parallel. Tests that have names that are the same up to the 58th character can collide. E.g.,

```
test_aggregation_pipeline_with_multiple_stages_and_complex_grouping_operations[documentdb]
test_aggregation_pipeline_with_multiple_stages_and_complex_sorting_operations[documentdb]
```

Their respective DB names can collide on test_aggregation_pipeline_with_multiple_stages_and_complex_ if running in parallel (I probably have at least one off-by-one error somewhere because I didn't count carefully).
I wouldn't use a UUID to name the DB because that will make it difficult to debug across executions. A hash of the test name will be deterministic, which is better, but it will still not be straightforward to identify the test from the DB name.
You could also do something like a global counter combined with the worker ID, e.g. f"test_{worker_id}_{db_counter}", giving you values like test_gw0_42. That's better because the names are short, but incrementing the counter will require locking, and the names will still not be meaningful at a glance.
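One possible middle ground, sketched here only as an illustration (the helper name and the handling of the 63-character limit are assumptions, not code from this PR): keep a readable, truncated test-name prefix and append a short deterministic hash of the full node ID, so long names cannot collide after truncation.

```python
import hashlib

def db_name_for(request):
    # The node ID is unique per parametrized test and stable across runs.
    node_id = request.node.nodeid
    digest = hashlib.sha1(node_id.encode()).hexdigest()[:8]
    safe_name = request.node.name.replace("[", "_").replace("]", "_")
    # Leave room for "_" plus the 8-character digest within the 63-char limit.
    prefix = f"test_{safe_name}"[: 63 - 9]
    return f"{prefix}_{digest}"
```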
```python
# Check for infrastructure-related errors
infra_keywords = [
    "connection",
    "timeout",
    "network",
    "cannot connect",
    "refused",
    "unreachable",
    "host",
]

if any(keyword in longrepr.lower() for keyword in infra_keywords):
    return FailureType.INFRA_ERROR
```
This is pretty fragile. Example:

```python
# FALSE POSITIVE (classified as infra error, but it's actually a test failure)
def test_host_field_required(collection):
    """Test that host field must be present."""
    assert "host" in doc, "host field missing"
    # AssertionError: "host field missing" -> INFRA_ERROR (wrong!)
```

Can we check actual exception types instead of the error messages?
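A sketch of how that could look, recording the category at run time while the exception object is still available; the hook wiring and the "failure_type" property name are assumptions, only the pymongo exception type is real:

```python
import pytest
from pymongo.errors import ConnectionFailure  # covers AutoReconnect, NetworkTimeout, ServerSelectionTimeoutError

@pytest.hookimpl(hookwrapper=True)
def pytest_runtest_makereport(item, call):
    outcome = yield
    report = outcome.get_result()
    if report.when == "call" and call.excinfo is not None:
        if call.excinfo.errisinstance(ConnectionFailure):
            # The analyzer would read this from the report instead of
            # keyword-matching the longrepr text.
            report.user_properties.append(("failure_type", "INFRA_ERROR"))
```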
```python
        continue

    # If it passed all filters, it's likely a meaningful marker
    filtered_markers.append(marker)
```
This seems brittle because we don't know what pytest-json-report will consider keywords in the future. Also, what if we have a test tag in the future called "lib"?
Implicit rules are harder to understand. A user adding a new marker has no idea that it might get filtered by some heuristic. Explicit is better than implicit.
We are already defining a list of test categories in pytest.ini; we could use that as an allow list:

```python
def extract_markers(markers, registered_markers):
    return [m for m in markers if m in registered_markers]
```
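For illustration, one way the analyzer could derive that allow list from the same pytest.ini; the file path and the exact marker-line format are assumptions about this repo, not verified against it:

```python
import configparser

def registered_marker_names(ini_path="pytest.ini"):
    # Parse the [pytest] "markers" option; each line looks like
    # "find: tests for find operations" or "documents(docs): ...".
    cfg = configparser.ConfigParser()
    cfg.read(ini_path)
    raw = cfg.get("pytest", "markers", fallback="")
    names = set()
    for line in raw.splitlines():
        line = line.strip()
        if line:
            names.add(line.split(":", 1)[0].split("(", 1)[0].strip())
    return names
```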
```python
@pytest.mark.documents(
    [
        {"name": "Alice", "department": "Engineering", "salary": 100000},
        {"name": "Bob", "department": "Engineering", "salary": 90000},
        {"name": "Charlie", "department": "Sales", "salary": 80000},
        {"name": "David", "department": "Sales", "salary": 75000},
    ]
)
```
I'm not really sure why this is better than inserting directly:
```python
def test_group_with_count(collection):
    collection.insert_many([
        {"name": "Alice", "department": "Engineering", "salary": 100000},
        {"name": "Bob", "department": "Engineering", "salary": 90000},
        {"name": "Charlie", "department": "Sales", "salary": 80000},
        {"name": "David", "department": "Sales", "salary": 75000},
    ])
    # the rest of the test
```

```python
# Custom marker for test data setup
pytest.mark.documents = pytest.mark.documents
```
Does this actually do anything? It looks like you're just assigning an attribute to itself. You don't register the documents marker in pytest.ini, so I don't think this has any value?
Using --strict-markers could be a problem at some point. You only read the marker via request.node.get_closest_marker("documents"), not for test selection, so that's probably why this still works.
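For reference, a sketch of what an explicit registration in pytest.ini could look like (the marker description is assumed; only the option names are standard pytest):

```ini
[pytest]
addopts = --strict-markers
markers =
    documents(docs): documents inserted into the test collection before the test runs
```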
```python
def test_find_with_filter(collection):
    """Test find operation with a simple equality filter."""
```
This is a problem throughout the implementation, but I'll use this as a specific example...
Since we are trying to build spec/behavior-driven tests (without using BDD-like systems like Gherkin), we need a way to communicate the spec. The tests need to essentially be executable specifications. Using vanilla pytest, I think the most natural affordance available for that is the docstring.
The docstring should answer:
- What behavior is being tested? (Which you're doing in this example)
- What is the expected behavior per the API docs?
- Where is that behavior documented? (Maybe that's too much overhead though)
Example:
```python
def test_find_with_filter(collection):
    """
    Test find() with equality filter returns matching documents.

    API Behavior:
        find({field: value}) returns all documents where field equals value.
        Uses BSON type-aware equality comparison.

    Reference(s):
        - https://documentdb.io/docs/reference/commands/query-and-write/find
        - https://www.mongodb.com/docs/manual/reference/method/db.collection.find

    Expected:
        - Returns only documents matching the filter
        - Order is undefined unless sort specified
        - _id included by default
    """
    # body of test
```