fix: atomic cache writes + asyncio lock on CSRF refresh#256
Conversation
|
Important Review skippedAuto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@py_gasbuddy/__init__.py`:
- Around line 578-587: The double-check fails because _cf_last is not flipped to
True after a successful refresh, so waiting coroutines still trigger a redundant
fetch; after a successful token extraction mark the refresh as completed by
setting self._cf_last = True (either at the end of _refresh_token immediately
after the token/tag is set, or by having _refresh_token return success and
setting self._cf_last = True right after awaiting it in the critical section
where you hold self._token_lock), referencing _token_lock, _cf_last, _tag,
_refresh_token and process_request to locate the relevant logic; only set
_cf_last = True on success (not on exception) so subsequent waiters skip
re-fetching.
In `@py_gasbuddy/cache.py`:
- Around line 40-41: The directory creation has a TOCTOU race: remove the
separate exists() check and make the mkdir idempotent by calling
aiofiles.os.makedirs on self._cache_file.parent with exist_ok=True (or wrap
makedirs in a try/except FileExistsError) so concurrent processes won't raise on
a racing create; update the call that currently uses await
aiofiles.os.makedirs(self._cache_file.parent) to include exist_ok=True.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: bfc7fee3-3e74-4f3d-ba1a-1f209e73d6ee
📒 Files selected for processing (2)
py_gasbuddy/__init__.pypy_gasbuddy/cache.py
Review Summary
|
ff0c09e to
c2ecc41
Compare
|
Addressed both CodeRabbit findings in c2ecc41:
48 tests passing. |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #256 +/- ##
==========================================
- Coverage 98.79% 98.55% -0.24%
==========================================
Files 6 6
Lines 581 485 -96
==========================================
- Hits 574 478 -96
Misses 7 7 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
secondof9
left a comment
There was a problem hiding this comment.
Code Review Summary
Verdict: Approved ✅
This PR successfully addresses two critical concurrency hazards in the CSRF token management and caching logic. The implementation of in-process locking and atomic file writes significantly improves the robustness of the library in multi-coroutine environments.
🔴 Critical
- No critical issues found.
⚠️ Warnings
- No warnings found.
💡 Suggestions
py_gasbuddy/cache.py: The use ofuuid.uuid4().hexin the tempfile name is excellent for preventing collisions. Theos.replaceapproach is the industry standard for atomic file updates.
✅ Looks Good
py_gasbuddy/__init__.py: The introduction ofasyncio.LockwithinGasBuddyto serialize token refreshes correctly prevents redundant network requests when multiple coroutines encounter a cold cache.py_gasbuddy/cache.py: The transition from a directaiofiles.opento a "write-to-temp-then-replace" pattern effectively eliminates the risk of torn/corrupted cache files during concurrent writes.- Error Handling: The
try...exceptblock inwrite_cachefor cleaning up the temporary file on failure is a great touch for maintaining filesystem hygiene. - Documentation: The added comments clearly explain the why behind the changes, which is invaluable for future maintainers.
Reviewed by QA-D
secondof9
left a comment
There was a problem hiding this comment.
Code Review Summary
Verdict: Approved (The changes correctly address the identified concurrency and atomicity risks.)
✅ Looks Good
- Atomic Writes Implementation: The use of a uniquely-named sibling tempfile and
os.replaceinpy_gasbuddy/cache.pyeffectively prevents file corruption from interleaved writes. - In-process Serialization: Adding
asyncio.Lockto bothGasBuddyandGasBuddyCachesuccessfully mitigates race conditions within a single event loop. - Double-Check Pattern: The logic in
_get_headersto re-verifyself._cf_lastafter acquiring the lock is a textbook implementation of the double-checked locking pattern in an async context, preventing redundant network calls. - Robust Cleanup: The
try...exceptblock inwrite_cacheensures that failed writes do not leave orphaned.tmpfiles cluttering the cache directory.
💡 Suggestions
_cf_laststate management: Inpy_gasbuddy/__init__.py, you setself._cf_last = Trueafter a successful write. While this works for the current instance, it's worth noting that this doesn't communicate the state to other processes (only other coroutines in the same instance). Since you already addressed cross-process races viaos.replaceandself._lockincache.py, this is acceptable for in-process optimization, but ensure the distinction is understood in the documentation.
Reviewed by QA-Dev
Two related concurrency hazards in the token-cache machinery: 1. `GasBuddyCache.write_cache` opened the target path directly. With a shared cache (HA coordinator + a parallel service call), two coroutines could both call `_get_headers` after expiry, both fetch a new token, and interleave bytes into the cache file — a corrupt payload caught next call by JSONDecodeError, falling back to an empty dict, and wasting a token-refresh round trip on the next call. Write to a uniquely-named sibling tempfile and `aiofiles.os.replace` onto the final path. `os.replace` is atomic on POSIX and Windows ≥Vista. Also clean up the tempfile on write failure. 2. `_get_headers` had no in-process mutual exclusion. Two coroutines racing on a cold cache both hit `gasbuddy.com/home` and both wrote. With firstof9#1 the writes are now safe, but the duplicate HTTP is still wasted. Add an `asyncio.Lock` (`self._token_lock`) and a per-cache `self._lock` in `GasBuddyCache`. The token-refresh block re-checks `_cf_last` after acquiring the lock to absorb the case where the previous holder just populated the token. Extracted the actual refresh body into `_refresh_token` for readability. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- cache.py: drop the separate path.exists() check before makedirs and use exist_ok=True instead. Closes the TOCTOU race where a racing process creates the directory between our check and call. - __init__.py: set self._cf_last = True after a successful CSRF refresh so coroutines still queued on self._token_lock skip their own refresh attempt instead of re-fetching. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
c2ecc41 to
8bbf950
Compare
d181ef4 to
64e6029
Compare
secondof9
left a comment
There was a problem hiding this comment.
Code Review Summary
Verdict: Approved — robust defensive parsing implementation with excellent test coverage.
🔴 Critical
None found.
⚠️ Warnings
None found.
💡 Suggestions
- py-gasbuddy/parsers.py:28 — Consider adding a docstring to
_stations_block()explaining the GraphQL partial response risk.
✅ Looks Good
- Comprehensive
isinstance()guards throughoutbuild_discount_map()andformat_price_node() - New
_stations_block()helper centralizes defensive data extraction - Guard added for
pricesiteration inparse_results()(line 193) - New test file
tests/test_parsers.pyprovides thorough edge-case coverage - Tests cover null data, missing keys, wrong types, and malformed entries
Reviewed by Hermes Agent
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (1)
tests/test_init.py (1)
1223-1257: 💤 Low valueConsider adding an explicit
_cf_laststate assertion.The test correctly verifies that only one refresh occurs under concurrency, implying the double-check lock works. Per coding guidelines, tests changing
process_requestbehavior should confirm_cf_lastends in the correct state. Addingassert manager._cf_last is Trueafter the gather would make the state transition explicit.Suggested addition
# Only one token refresh should have occurred assert refresh_calls == 1 + # After successful refresh, _cf_last should be True + assert manager._cf_last is True await manager.clear_cache()As per coding guidelines: "Confirm
_cf_lastends in the correct state for each code path".🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/test_init.py` around lines 1223 - 1257, Add an explicit assertion that the manager's Cloudflare "last" flag ends in the expected state: after the concurrent await asyncio.gather(...) in test_token_refresh_concurrency, assert manager._cf_last is True (placed before await manager.clear_cache()) to make the post-condition explicit and follow the guideline to confirm `_cf_last` ends in the correct state; reference the test function name test_token_refresh_concurrency and the manager instance `_cf_last` field when locating where to add the assertion.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@tests/test_init.py`:
- Around line 1223-1257: Add an explicit assertion that the manager's Cloudflare
"last" flag ends in the expected state: after the concurrent await
asyncio.gather(...) in test_token_refresh_concurrency, assert manager._cf_last
is True (placed before await manager.clear_cache()) to make the post-condition
explicit and follow the guideline to confirm `_cf_last` ends in the correct
state; reference the test function name test_token_refresh_concurrency and the
manager instance `_cf_last` field when locating where to add the assertion.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: f5db02c9-9d24-426e-a8ed-48d69b63ac54
📒 Files selected for processing (3)
py_gasbuddy/__init__.pypy_gasbuddy/cache.pytests/test_init.py
🚧 Files skipped from review as they are similar to previous changes (1)
- py_gasbuddy/init.py
Summary
Two related concurrency hazards on the shared CSRF cache file:
write_cacheopened the target path directly. Two coroutines racing (HA coordinator + a parallel service call) could interleave bytes into a corrupt token file. Write to a uniquely-named sibling tempfile andos.replaceonto the final path (atomic on POSIX + Windows ≥Vista). Tempfile cleanup on write failure.asyncio.LocktoGasBuddy(per-instance) and toGasBuddyCache(per-cache). The token-refresh block re-checks_cf_lastafter acquiring the lock so the second waiter doesn't re-fetch.Body extracted into
_refresh_tokenfor readability — same logic, no behavior change to the refresh itself.Test plan
pytest -q→ 48 passed.Out of scope
This doesn't help cross-process races (e.g. two HA workers sharing the same cache path). The atomic write keeps the file uncorrupted but they may still both fetch. Out-of-process locking would need a file lock, which adds dependencies — happy to follow up if needed.
Summary by CodeRabbit