Update to version 0.0.4#7
Conversation
There was a problem hiding this comment.
Pull request overview
This PR bumps OpenBatch to v0.0.4, fixes incorrect static-style invocation of BatchJobManager.add() inside the collector helpers, and introduces a new batch-file validation feature with accompanying documentation and a large new test suite.
Changes:
- Add
openbatch.validationmodule withvalidate_batch_file()/quick_validate()andValidationResult. - Fix collector helpers to use an instance of
BatchJobManagerinstead of calling.add()statically. - Add extensive unit/integration tests, plus pytest/coverage configuration and CI workflow updates.
Reviewed changes
Copilot reviewed 15 out of 16 changed files in this pull request and generated 7 comments.
Show a summary per file
| File | Description |
|---|---|
src/openbatch/validation.py |
Adds batch JSONL validation logic and result formatting. |
src/openbatch/collector.py |
Fixes request writing to use a BatchJobManager instance. |
src/openbatch/__init__.py |
Exposes validate_batch_file from the top-level package. |
tests/test_validation.py |
Comprehensive validation behavior tests. |
tests/test_manager.py |
Tests BatchJobManager request writing and templating workflows. |
tests/test_collector.py |
Tests collector convenience API behavior. |
tests/test_integration.py |
End-to-end tests for batch generation workflows and structured output. |
tests/test_utils.py |
Tests for schema/utility helpers used by structured output features. |
tests/test_model.py |
Tests for request/config models and API strategies. |
pyproject.toml |
Version bump + pytest/coverage config + optional test dependencies. |
.github/workflows/test.yml |
Adds CI test matrix and coverage upload. |
docs/validation.md / mkdocs.yml / README.md |
Documents new validation functionality and improves README metadata/badges. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| # Update statistics | ||
| result.stats["total_requests"] = line_number | ||
| result.stats["unique_custom_ids"] = len(custom_ids) | ||
| result.stats["endpoints_used"] = list(endpoints) | ||
|
|
There was a problem hiding this comment.
total_requests (and the max-request limit check) currently uses line_number, which counts empty lines and invalid JSON lines even though empty lines are explicitly "ignored". Track a separate request counter (e.g., non-empty lines and/or successfully parsed request objects) and use that for total_requests and the request-count limit enforcement.
| result.stats["unique_custom_ids"] = len(custom_ids) | ||
| result.stats["endpoints_used"] = list(endpoints) | ||
|
|
There was a problem hiding this comment.
endpoints_used is built from a set, so converting it directly to a list makes the ordering nondeterministic. Sorting the endpoints before storing them would make stats output stable and easier to test/compare.
There was a problem hiding this comment.
@copilot open a new pull request to apply changes based on this feedback
|
@daniel-gomm I've opened a new pull request, #8, to work on those changes. Once the pull request is ready, I'll request review from you. |
* Initial plan * Sort endpoints to make stats output deterministic Co-authored-by: daniel-gomm <63717948+daniel-gomm@users.noreply.github.com> --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: daniel-gomm <63717948+daniel-gomm@users.noreply.github.com>
Update to version 0.0.4