Add DocumentDB functional testing framework #1
Conversation
- Implement complete test framework with pytest
  - 36 tests covering find, aggregate, and insert operations
  - Multi-engine support with custom connection strings
  - Automatic test isolation and cleanup
  - Tag-based test organization and filtering
  - Parallel execution support with pytest-xdist
- Add smart result analyzer
  - Automatic marker detection using heuristics
  - Filters test names, file names, and engine names
  - Categorizes failures: PASS/FAIL/UNSUPPORTED/INFRA_ERROR
  - CLI tool: docdb-analyze with text and JSON output
- Configure development tools
  - Black for code formatting
  - isort for import sorting
  - flake8 for linting
  - mypy for type checking
  - pytest-cov for coverage reporting
- Add comprehensive documentation
  - README with usage examples and best practices
  - CONTRIBUTING guide for writing tests
  - result_analyzer/README explaining analyzer behavior
  - All code formatted and linted
- Add Docker support
  - Dockerfile for containerized testing
  - .dockerignore for clean builds

Test Results: All 36 tests passed (100%) against DocumentDB
- Update test_find_empty_projection to use documents marker instead of manual insert
- Update test_match_empty_result to use documents marker instead of manual insert
- Ensures consistent test data setup and automatic cleanup
- All 36 tests still passing
- Remove json_report, json_report_indent, json_report_omit from config
- These are command-line options, not pytest.ini settings
- Add comment explaining proper usage
- All 36 tests still passing with no warnings
- Add GitHub Actions workflow for automated Docker builds
- Build for linux/amd64 and linux/arm64 platforms
- Push to GitHub Container Registry (ghcr.io)
- Auto-tag images: latest, sha-*, version tags
- Update README with pre-built image pull instructions
- Fix Dockerfile casing warning (FROM...AS)

Workflow Features:
- Runs on push to main and on pull requests
- Multi-platform support for Intel/AMD and ARM/Graviton
- Automatic versioning from git tags
- GitHub Actions cache for faster builds
- Uses dynamic repository variable (works on forks and upstream)
- Remove the 'Image digest' step that was causing exit code 127
- The metadata and tags are already captured by the build step
- The build step itself will show all relevant information in its logs
```python
metafunc.parametrize(
    "engine_name,engine_connection_string",
    [(name, conn) for name, conn in engines.items()],
    ids=list(engines.keys()),
)
```
Adding the value of --engine to the test ID means IDs will not be stable across runs, which makes it harder to compare results for each test across executions of the test system.
If you are looking to use the engine parameter to distinguish between two systems under test, I think it is a simpler approach to run pytest separately against each system under test. Doing multiple in one pytest session mixes results, requires both engines to be available simultaneously, and makes comparison harder. You need to parse the engine out of the test ID, and if the two engines are versions of the same engine (say DocumentDB 1.0 vs 1.1), it's not possible to just parse out "documentdb" to distinguish which test results belong to which engine.
I have updated the RFC since raising this implementation PR to get rid of direct comparisons while doing the test. The tests will run independently against each engine and then we can use those results to compare using a separate tool. I am going to update this implementation to reflect the current state in the RFC.
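For illustration, a minimal sketch of that separate-runs-then-compare flow, assuming each run produces a pytest-json-report file; the file names and the exact JSON shape used here are assumptions, not part of this PR:

```python
import json

def load_outcomes(path):
    # Map each test's nodeid to its outcome from a pytest-json-report file
    # (assumes the report has a top-level "tests" list with "nodeid"/"outcome").
    with open(path) as f:
        report = json.load(f)
    return {t["nodeid"]: t["outcome"] for t in report.get("tests", [])}

def diff_runs(path_a, path_b):
    # Return the tests whose outcome differs between the two runs.
    a, b = load_outcomes(path_a), load_outcomes(path_b)
    return {
        nodeid: (a.get(nodeid), b.get(nodeid))
        for nodeid in sorted(set(a) | set(b))
        if a.get(nodeid) != b.get(nodeid)
    }

# e.g. diff_runs("results-documentdb-1.0.json", "results-documentdb-1.1.json")
```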
```python
parser.addoption(
    "--engine",
    action="append",
    default=[],
    help="Engine to test against. Format: name=connection_string. "
    "Example: --engine documentdb=mongodb://localhost:27017 "
    "--engine mongodb=mongodb://mongo:27017",
)
```
The inclusion of a connection string seems necessary, but it doesn't seem necessary to specify the name of the engine implementation: documentdb=... or mongodb=.... It's not clear what value that is supposed to add.
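If the engine name were dropped, the option could be reduced to just a connection string. A hedged sketch of that shape (the option name and default are assumptions, not something from this PR):

```python
def pytest_addoption(parser):
    # Only the connection string of the system under test is required.
    parser.addoption(
        "--connection-string",
        action="store",
        default="mongodb://localhost:27017",
        help="Connection string of the engine under test",
    )
```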
```python
if outcome == "skipped":
    # Skipped tests typically indicate unsupported features
    return FailureType.UNSUPPORTED
```
I don't think this mapping is right. pytest.skip() is used for many reasons. Yes, for unsupported features, but also:
- if the test needs a specific environment (e.g., requires TLS)
- if the test is temporarily disabled while debugging
- if there is a conditional skip (using @pytest.mark.skipif(<some predicate>))

If we want to compare test executions against each other (say for comparing versions over time), we need to distinguish things that are unsupported (i.e., real compatibility gaps) vs things that are skipped for legitimate reasons.
We probably need an explicit @pytest.mark.unsupported("reasons") for behaviors that are actually unsupported.
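As an illustration of that direction, a minimal sketch assuming a marker named `unsupported` and reusing the PR's FailureType categories; none of this code is in the PR as written:

```python
import pytest

# conftest.py: register the marker so it also works with --strict-markers
def pytest_configure(config):
    config.addinivalue_line(
        "markers",
        "unsupported(reason): behavior is a known compatibility gap "
        "in the engine under test",
    )

# In a test module: annotate a real compatibility gap explicitly, so the
# analyzer can report UNSUPPORTED without guessing from skip outcomes.
@pytest.mark.unsupported("collation options are not implemented")  # hypothetical example
def test_find_with_collation(collection):
    ...
```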
Though I'm also confused as to when we would mark a behavior as "unsupported" vs just not having the test. Most cases I can think of for a behavior being unsupported are relative to some other version of the system under test.
```python
# Create unique database name based on test name
test_name = request.node.name.replace("[", "_").replace("]", "_")
db_name = f"test_{test_name}"[:63]  # MongoDB database name limit
```
This approach to naming can cause collisions when running tests in parallel. Tests that have names that are the same up to the 58th character can collide. E.g.,

```
test_aggregation_pipeline_with_multiple_stages_and_complex_grouping_operations[documentdb]
test_aggregation_pipeline_with_multiple_stages_and_complex_sorting_operations[documentdb]
```

Their respective DB names can collide on test_aggregation_pipeline_with_multiple_stages_and_complex_ if running in parallel (I probably have at least one off-by-one error somewhere because I didn't count carefully).
I wouldn't use a UUID to name the DB because that will make it difficult to debug across executions. A hash of the test name will be deterministic, which is better, but it will still not be straightforward to identify the test from the DB name.
You could also do something like a global counter combined with the worker ID, e.g. f"test_{worker_id}_{db_counter}", giving you values like test_gw0_42. That's better because the names are short, but incrementing the counter will require locking, and the names will still not be meaningful at a glance.
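One possible middle ground, sketched here only as an illustration (the helper name and the handling of the 63-character limit are assumptions, not code from this PR): keep a readable, truncated test-name prefix and append a short deterministic hash of the full node ID, so long names cannot collide after truncation.

```python
import hashlib

def db_name_for(request):
    # The node ID is unique per parametrized test and stable across runs.
    node_id = request.node.nodeid
    digest = hashlib.sha1(node_id.encode()).hexdigest()[:8]
    safe_name = request.node.name.replace("[", "_").replace("]", "_")
    # Leave room for "_" plus the 8-character digest within the 63-char limit.
    prefix = f"test_{safe_name}"[: 63 - 9]
    return f"{prefix}_{digest}"
```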
```python
# Check for infrastructure-related errors
infra_keywords = [
    "connection",
    "timeout",
    "network",
    "cannot connect",
    "refused",
    "unreachable",
    "host",
]

if any(keyword in longrepr.lower() for keyword in infra_keywords):
    return FailureType.INFRA_ERROR
```
This is pretty fragile. Example:

```python
# FALSE POSITIVE (classified as infra error, but it's actually a test failure)
def test_host_field_required(collection):
    """Test that host field must be present."""
    assert "host" in doc, "host field missing"
    # AssertionError: "host field missing" -> INFRA_ERROR (wrong!)
```

Can we check actual exception types instead of the error messages?
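A sketch of how that could look, recording the category at run time while the exception object is still available; the hook wiring and the "failure_type" property name are assumptions, only the pymongo exception type is real:

```python
import pytest
from pymongo.errors import ConnectionFailure  # covers AutoReconnect, NetworkTimeout, ServerSelectionTimeoutError

@pytest.hookimpl(hookwrapper=True)
def pytest_runtest_makereport(item, call):
    outcome = yield
    report = outcome.get_result()
    if report.when == "call" and call.excinfo is not None:
        if call.excinfo.errisinstance(ConnectionFailure):
            # The analyzer would read this from the report instead of
            # keyword-matching the longrepr text.
            report.user_properties.append(("failure_type", "INFRA_ERROR"))
```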
```python
        continue

    # If it passed all filters, it's likely a meaningful marker
    filtered_markers.append(marker)
```
This seems brittle because we don't know what pytest-json-report will consider keywords in the future. Also, what if we have a test tag in the future called "lib"?
Implicit rules are harder to understand. A user adding a new marker has no idea that it might get filtered by some heuristic. Explicit is better than implicit.
We are already defining a list of test categories in pytest.ini; we could use that as an allow list:

```python
def extract_markers(markers, registered_markers):
    return [m for m in markers if m in registered_markers]
```
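For illustration, one way the analyzer could derive that allow list from the same pytest.ini; the file path and the exact marker-line format are assumptions about this repo, not verified against it:

```python
import configparser

def registered_marker_names(ini_path="pytest.ini"):
    # Parse the [pytest] "markers" option; each line looks like
    # "find: tests for find operations" or "documents(docs): ...".
    cfg = configparser.ConfigParser()
    cfg.read(ini_path)
    raw = cfg.get("pytest", "markers", fallback="")
    names = set()
    for line in raw.splitlines():
        line = line.strip()
        if line:
            names.add(line.split(":", 1)[0].split("(", 1)[0].strip())
    return names
```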
```python
@pytest.mark.documents(
    [
        {"name": "Alice", "department": "Engineering", "salary": 100000},
        {"name": "Bob", "department": "Engineering", "salary": 90000},
        {"name": "Charlie", "department": "Sales", "salary": 80000},
        {"name": "David", "department": "Sales", "salary": 75000},
    ]
)
```
I'm not really sure why this is better than inserting directly:
```python
def test_group_with_count(collection):
    collection.insert_many([
        {"name": "Alice", "department": "Engineering", "salary": 100000},
        {"name": "Bob", "department": "Engineering", "salary": 90000},
        {"name": "Charlie", "department": "Sales", "salary": 80000},
        {"name": "David", "department": "Sales", "salary": 75000},
    ])
    # the rest of the test
```

```python
# Custom marker for test data setup
pytest.mark.documents = pytest.mark.documents
```
Does this actually do anything? It looks like you're just assigning an attribute to itself. You don't register the documents marker in pytest.ini, so I don't think this has any value?
Using --strict-markers could be a problem at some point. You only read the marker via request.node.get_closest_marker("documents"), not for test selection, so that's probably why this still works.
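For reference, a sketch of what an explicit registration in pytest.ini could look like (the marker description is assumed; only the option names are standard pytest):

```ini
[pytest]
addopts = --strict-markers
markers =
    documents(docs): documents inserted into the test collection before the test runs
```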
```python
def test_find_with_filter(collection):
    """Test find operation with a simple equality filter."""
```
This is a problem throughout the implementation, but I'll use this as a specific example...
Since we are trying to build spec/behavior-driven tests (without using BDD-like systems like Gherkin), we need a way to communicate the spec. The tests need to essentially be executable specifications. Using vanilla pytest, I think the most natural affordance available for that is the docstring.
The docstring should answer:
- What behavior is being tested? (Which you're doing in this example)
- What is the expected behavior per the API docs?
- Where is that behavior documented? (Maybe that's too much overhead though)
Example:
```python
def test_find_with_filter(collection):
    """
    Test find() with equality filter returns matching documents.

    API Behavior:
        find({field: value}) returns all documents where field equals value.
        Uses BSON type-aware equality comparison.

    Reference(s):
        - https://documentdb.io/docs/reference/commands/query-and-write/find
        - https://www.mongodb.com/docs/manual/reference/method/db.collection.find

    Expected:
        - Returns only documents matching the filter
        - Order is undefined unless sort specified
        - _id included by default
    """
    # body of test
```