feat: implement cascading delete for knowledge graph #4009
Conversation
Code Review
This pull request introduces a cascading delete mechanism for the knowledge graph when a memory is deleted. The implementation in `cleanup_for_memory` is a good start, but it has a critical flaw related to Firestore query limitations and potential race conditions that could lead to data inconsistency. My review includes a critical fix for handling Firestore's `in` query limit, a suggestion to improve error handling to prevent silent failures, and a warning about a race condition in the non-atomic update logic that should be addressed for better data integrity. The other changes, which add mock clients for local development and hook the cleanup logic into memory deletion, are well-implemented.
Thanks for the incredibly detailed and critical review. You are absolutely correct: my previous implementation had significant flaws. I have re-architected the function (commit 7ac25b1) to directly address all concerns:

1. Atomicity & race conditions: The function now runs all reads and writes inside a Firestore transaction, preventing data inconsistencies.
2. Firestore query limits: Explicitly handles the 10-item limit for `in` queries by chunking when processing orphaned edges.
3. Comprehensive orphaned-edge cleanup: The transaction now correctly identifies nodes that will be fully deleted and then systematically deletes all edges connected to those nodes, ensuring no orphaned references remain.
4. Error handling: The error-handling block is more granular, catching and re-raising exceptions as appropriate for critical data-integrity operations.

This new implementation should satisfy all acceptance criteria for robust and efficient cascading cleanup. Please re-verify.
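For reference, the chunking mentioned in point 2 is needed because Firestore's `in` filter accepts only a limited number of comparison values per query (historically 10). A minimal pure-Python sketch of such a helper is shown below; the helper name and limit constant are illustrative, not taken from the PR's actual code:

```python
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")

# Firestore's `in` / `array-contains-any` filters accept a limited number of
# comparison values per query (historically 10), so long ID lists must be
# split into chunks, one query per chunk.
FIRESTORE_IN_LIMIT = 10

def chunked(items: Iterable[T], size: int = FIRESTORE_IN_LIMIT) -> Iterator[List[T]]:
    """Yield successive lists of at most `size` items."""
    batch: List[T] = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch

# In a real client, each chunk would back one `where("id", "in", chunk)`
# query; here we only demonstrate the chunk boundaries.
node_ids = [f"node-{i}" for i in range(23)]
chunks = list(chunked(node_ids))  # three chunks: 10 + 10 + 3 items
```

The same helper works for batched deletes, where Firestore also imposes per-batch operation limits.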
re-review
This PR also adds code for mocking the database; it's not production-ready.
@neooriginal Cleaned up! I've removed the MockFirestore and local dev scaffolding from this PR. It now strictly contains the cascading delete logic for the Knowledge Graph. Thanks for the catch. |
You should discard the changes to those files, as they don't fit into this PR.
@neooriginal Apologies for the noise. I have now strictly reverted all unrelated files.
Cool, code looks good. Would you be able to send us a video of it working? I'll test it soon and get someone to merge it.
@neooriginal 🙋‍♂️ Here is the demo of the cascading delete logic in action: https://asciinema.org/a/PyBH9Cq0nZBqFvhy8NSoZhAst The video shows the state of the local persistent mock database before and after a memory is deleted. You can see that nodes linked exclusively to the deleted memory are correctly removed, proving the cascading cleanup works as intended.
cool, lgtm |
@neooriginal @mdmohsin7 - Is there anything blocking this from being merged, or does it need formal approval from another reviewer? |
Objective
Resolves #3946. Implements cascading cleanup for the Knowledge Graph when a memory is deleted, preventing orphaned nodes and edges.
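To make the intended cascade concrete, here is a minimal in-memory sketch of the cleanup (the data layout and function name are hypothetical illustrations, not the PR's actual Firestore schema): nodes track which memories reference them, a node is deleted once its last referencing memory is removed, and edges touching a deleted node are removed with it.

```python
def cleanup_for_memory_sketch(nodes, edges, memory_id):
    """Illustrative sketch of cascading cleanup for one deleted memory.

    nodes: dict mapping node_id -> set of memory_ids referencing that node
    edges: set of (source_node_id, target_node_id) tuples
    Returns the updated (nodes, edges).
    """
    # 1. Detach the deleted memory from every node that references it.
    for mem_ids in nodes.values():
        mem_ids.discard(memory_id)

    # 2. Nodes left with no referencing memories are orphaned: delete them.
    dead = {nid for nid, mem_ids in nodes.items() if not mem_ids}
    nodes = {nid: mem_ids for nid, mem_ids in nodes.items() if nid not in dead}

    # 3. Delete every edge whose source or target node was deleted,
    #    so no orphaned edge references remain.
    edges = {(s, t) for (s, t) in edges if s not in dead and t not in dead}
    return nodes, edges

# Example: node "b" is referenced only by memory "m1", so deleting "m1"
# removes "b" and the edge pointing at it; node "a" survives via "m2".
nodes = {"a": {"m1", "m2"}, "b": {"m1"}}
edges = {("a", "b"), ("a", "a")}
nodes, edges = cleanup_for_memory_sketch(nodes, edges, "m1")
```

In the actual PR these three steps run against Firestore collections rather than Python dicts, with reads and writes wrapped so they commit together.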
Changes
- `database/knowledge_graph.py`: Added a new `cleanup_for_memory` function that robustly removes all traces of a `memory_id` from associated nodes and edges. It correctly handles orphaned edges whose source/target nodes are also being deleted.
- `database/memories.py`: `delete_memory` now calls `cleanup_for_memory` to trigger the cascade. `delete_all_memories` now calls `delete_knowledge_graph` for efficient bulk cleanup, avoiding an N+1 query issue.

Verification
Verified logic against all acceptance criteria for single-memory and bulk-delete scenarios. The code is structured for efficient batch operations in Firestore.
Note: Built with AI-augmented coding (Gemini CLI).