feat: implement atomic writes, locking, and PII redaction for AI corrections (closes #368) #853
# Phone number pattern (7 to 15 digit formats)
phone_pattern = r'\b(?:\+?\d{1,3}[-.\s]?)?\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}\b'

redacted = re.sub(email_pattern, "[EMAIL_REDACTED]", text)

os.remove(tmp_path)
except Exception:
pass
return {"status": "error", "message": f"Failed to log correction: {str(e)}"}
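The redaction fragments quoted above could be assembled roughly as in the sketch below. Only phone_pattern and the "[EMAIL_REDACTED]" placeholder come from the quoted diff; the email pattern, the "[PHONE_REDACTED]" placeholder, and the function name redact_pii are assumptions made for illustration. Note that the quoted phone pattern matches a 3-3-4 digit number with an optional 1 to 3 digit country code, so it does not cover every 7 to 15 digit format.

```python
import re

# Assumed email pattern; the PR's actual email_pattern is not shown.
email_pattern = r'[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}'
# Phone pattern copied verbatim from the quoted diff.
phone_pattern = r'\b(?:\+?\d{1,3}[-.\s]?)?\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}\b'


def redact_pii(text):
    """Replace email addresses and phone numbers with fixed placeholders."""
    redacted = re.sub(email_pattern, "[EMAIL_REDACTED]", text)
    # "[PHONE_REDACTED]" is a hypothetical placeholder name.
    redacted = re.sub(phone_pattern, "[PHONE_REDACTED]", redacted)
    return redacted
```

Regex redaction of this kind only catches the formats the patterns describe, so other identifiers (names, addresses, unusual phone layouts) would pass through unchanged.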
Hi @singhanurag0317-bit! 🙌 Thank you so much for your excellent contribution: "feat: implement atomic writes, locking, and PII redaction for AI corrections (closes #368)"! We really appreciate the high-quality code and effort you have put into the platform. Just a quick, friendly heads-up as we prepare our manual merging and verification queues—please make sure to complete all the mandatory community steps listed below. Once those manual steps are verified, we'll get your PR officially merged into the Let's build something amazing together! 🚀🔥 🌟 Community Support & Network Steps (Take 10 Seconds!)As we prepare our manual verification and merging queues, please make sure you have taken a moment to complete these required steps to finalize your points:
Note: Having these steps completed manually is required before your PR points are officially cleared. |
Description
This pull request addresses Issue #368 ("Corrections log writes are non-atomic (race leads to corrupted training/audit data)"). It adds a cross-process file lock, atomic log writes, PII redaction, and structured rotating logging to the corrections logging flow.
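One way a directory-based lock and an atomic write could fit together is sketched below. This is not the PR's code: the names dir_lock and append_correction, the timeout and polling values, and the use of os.replace for the final swap are assumptions made for illustration.

```python
import json
import os
import time
from contextlib import contextmanager


@contextmanager
def dir_lock(lock_dir, timeout=5.0, poll=0.01):
    """Hold a lock by creating a directory; os.mkdir fails if it already exists."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            os.mkdir(lock_dir)  # only one process or thread can succeed
            break
        except FileExistsError:
            if time.monotonic() >= deadline:
                raise TimeoutError(f"could not acquire {lock_dir}")
            time.sleep(poll)
    try:
        yield
    finally:
        os.rmdir(lock_dir)  # release the lock


def append_correction(log_path, entry):
    """Read, extend, and rewrite the JSON log under the lock."""
    with dir_lock(log_path + ".lock"):
        entries = []
        if os.path.exists(log_path):
            with open(log_path, "r", encoding="utf-8") as f:
                entries = json.load(f)
        entries.append(entry)
        tmp_path = log_path + ".tmp"
        with open(tmp_path, "w", encoding="utf-8") as f:
            json.dump(entries, f)
            f.flush()
            os.fsync(f.fileno())  # push the data to disk before the swap
        # Readers see either the old file or the new one, never a partial write.
        os.replace(tmp_path, log_path)
```

A lock of this kind stays behind if the holder crashes before os.rmdir runs, so a real implementation would likely need some stale-lock handling; this sketch simply times out.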
Key Enhancements
- Uses os.mkdir (directory creation is atomic at the OS level) to implement a cross-process, thread-safe lock context manager with no third-party dependencies.
- Writes to a temporary file (.json.tmp) and atomically replaces the target log file (corrections_log.json).
- Adds a RotatingFileHandler writing JSON Lines (JSONL) to data/corrections_structured.log (1MB limit, 5 backups) for auditing.
- Redacts email addresses and phone numbers in the logged text fields (original_text and ocr_text) so they are not written to the logs.
- Imports Response from fastapi.
Verification
- backend/main.py.
- scratch/test_concurrency_corrections.py, which spawns 20 parallel threads that write logs concurrently, verifying:
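A check of this kind might look like the sketch below. It is not the contents of scratch/test_concurrency_corrections.py: locked_append is a hypothetical stand-in for the PR's writer, and the properties checked (the file parses as JSON and holds exactly one entry per thread) are assumptions, since the original list of checks is not shown here.

```python
import json
import os
import tempfile
import threading
import time


def locked_append(log_path, entry):
    """Hypothetical stand-in for the PR's writer: mkdir lock plus atomic replace."""
    lock_dir = log_path + ".lock"
    while True:
        try:
            os.mkdir(lock_dir)  # succeeds for exactly one caller at a time
            break
        except FileExistsError:
            time.sleep(0.005)
    try:
        entries = []
        if os.path.exists(log_path):
            with open(log_path, encoding="utf-8") as f:
                entries = json.load(f)
        entries.append(entry)
        tmp_path = log_path + ".tmp"
        with open(tmp_path, "w", encoding="utf-8") as f:
            json.dump(entries, f)
        os.replace(tmp_path, log_path)
    finally:
        os.rmdir(lock_dir)


def run_concurrency_check(n_threads=20):
    """Write one entry per thread and return the entries read back from disk."""
    with tempfile.TemporaryDirectory() as d:
        log_path = os.path.join(d, "corrections_log.json")
        threads = [
            threading.Thread(target=locked_append, args=(log_path, {"worker": i}))
            for i in range(n_threads)
        ]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
        with open(log_path, encoding="utf-8") as f:
            return json.load(f)
```

If the lock or the atomic replace were removed, a run like this would be expected to lose entries or fail to parse the file intermittently, which is the failure mode Issue #368 describes.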