-
Notifications
You must be signed in to change notification settings - Fork 372
fix: prevent duplicate features in bulk creation #148
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
When feature_create_bulk is called multiple times (e.g., from expand sessions), it now checks for existing (category, name) pairs and skips duplicates instead of creating them again. Changes: - Query existing features before creation - Filter duplicates from input batch - Report skipped_duplicates count in response
📝 WalkthroughWalkthroughThe Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
mcp_server/feature_mcp.py (1)
549-591:⚠️ Potential issue | 🔴 CriticalCritical bug: Dependency indices break when features are skipped.
The
depends_on_indicesvalues refer to positions in the originalfeatureslist, but after filtering intounique_features, the index mapping is invalidated. This causes:
- IndexError if
idx >= len(unique_features)after filtering- Wrong dependencies if the dependency target was skipped (exists in DB) — the code will reference the wrong feature from
created_featuresinstead of looking up the existing feature's ID- Self-dependency in edge cases where a feature incorrectly depends on itself
Example: If features
[A, B]where B hasdepends_on_indices: [0], and A is skipped (already exists withid=5), thenunique_features = [B],created_features = [B], and B's dependency resolves tocreated_features[0].id(B's own ID) instead of 5.🔧 Proposed fix: maintain index-to-ID mapping for both new and existing features
# Second pass: check for existing features to avoid duplicates - existing = set() - for f in session.query(Feature.category, Feature.name).all(): - existing.add((f.category, f.name)) + existing_map = {} # (category, name) -> feature_id + for f in session.query(Feature.id, Feature.category, Feature.name).all(): + existing_map[(f.category, f.name)] = f.id # Filter out duplicates from input batch unique_features = [] + original_idx_to_id = {} # Maps original index -> feature ID (existing or to-be-created) skipped = 0 - for feature_data in features: + for orig_idx, feature_data in enumerate(features): key = (feature_data["category"], feature_data["name"]) - if key in existing: + if key in existing_map: + # Track existing feature's ID for dependency resolution + original_idx_to_id[orig_idx] = existing_map[key] skipped += 1 else: - unique_features.append(feature_data) - existing.add(key) # Mark as will-exist for batch dedup + unique_features.append((orig_idx, feature_data)) + existing_map[key] = None # Placeholder; ID assigned after flush # Third pass: create unique features only created_features: list[Feature] = [] - for i, feature_data in enumerate(unique_features): + for orig_idx, feature_data in unique_features: db_feature = Feature( - priority=start_priority + i, + priority=start_priority + len(created_features), category=feature_data["category"], name=feature_data["name"], description=feature_data["description"], steps=feature_data["steps"], passes=False, in_progress=False, ) session.add(db_feature) created_features.append(db_feature) # Flush to get IDs assigned session.flush() + # Build complete index-to-ID mapping after flush + for i, (orig_idx, _) in enumerate(unique_features): + original_idx_to_id[orig_idx] = created_features[i].id + # Fourth pass: resolve index-based dependencies to actual IDs deps_count = 0 - for i, feature_data in enumerate(unique_features): + for i, (orig_idx, feature_data) in enumerate(unique_features): indices = feature_data.get("depends_on_indices", []) if indices: - # Convert indices to actual feature IDs - dep_ids = [created_features[idx].id for idx in indices] + # Convert original indices to actual feature IDs (new or existing) + dep_ids = [original_idx_to_id[idx] for idx in indices] created_features[i].dependencies = sorted(dep_ids) deps_count += 1
🧹 Nitpick comments (1)
mcp_server/feature_mcp.py (1)
549-552: Consider selecting only necessary columns for the existence check.The query now retrieves all
(category, name)pairs into memory. This is fine for moderate datasets, but if the feature table grows large, consider adding a database index on(category, name)or using anEXISTSsubquery per feature. For now, this approach is acceptable given the described scale (~600 features).
…AutoForgeAI#148) - Updated display_derivation.py icon values to match rest of codebase (coding->code, testing->test-tube, refactoring->wrench) - feature_compiler.py now imports TASK_TYPE_ICONS and DEFAULT_ICON from display_derivation instead of defining its own duplicate constants - spec_builder.py already delegates to display_derivation (prior session) - Updated all test assertions to match consolidated icon values - Added 34 tests in test_feature_148_display_consolidation.py verifying: - Single source of truth (no duplicate definitions) - Consistent icon values across all code paths - Import chain correctness (feature_compiler -> display_derivation) - AgentSpecs created via any path get correct display values Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Created comprehensive E2E test suite to verify ChatTab handles 404 errors and missing conversations gracefully, as implemented in Features AutoForgeAI#146, AutoForgeAI#147, and AutoForgeAI#148. Test File: ui/e2e/chattab-errors.spec.ts (599 lines, 10 test scenarios) Test Scenarios: 1. ChatTab loads without crashes when project is selected 2. No 404 errors when conversation list is empty 3. No TypeError when conversation is missing or deleted 4. Console shows warnings (not errors) for 404s 5. ChatTab recovers gracefully from API errors 6. Dev Mode toggle resets ChatTab state correctly 7. No console errors when opening browser DevTools 8. No errors on rapid project switching 9. Empty state message displayed when no conversations 10. Validate network requests for conversation endpoints Key Features: - Helper functions for project selection, ChatTab switching, console tracking - Comprehensive error detection (404, TypeError, crashes) - Screenshot assertions for visual verification (13 screenshots) - Network request validation - Console error tracking - Graceful degradation testing Related Features: - Feature AutoForgeAI#146: ChatTab 404 Error Handling - Feature AutoForgeAI#147: Defensive Programming for ChatTab - Feature AutoForgeAI#148: ChatTab State Reset on Mode Switch Verification: - All 10 required test scenarios implemented - Follows existing E2E test patterns (conversation-history.spec.ts) - Uses @playwright/test framework - Can run headless or with UI - Added to E2E test suite How to Run: cd /home/stu/projects/autocoder/ui npm run test:e2e -- chattab-errors.spec.ts npm run test:e2e:ui (interactive mode) Co-Authored-By: Claude <noreply@anthropic.com>
Problem
When
feature_create_bulkis called multiple times (e.g., from expand sessions or if the initializer runs again), it creates duplicate features with the same(category, name)pairs. This can lead to hundreds of duplicate features in the database.Solution
Added deduplication logic to
feature_create_bulk:(category, name)pairs already in the databaseskipped_duplicatesin the response so callers know how many were filteredChanges
mcp_server/feature_mcp.py: Added deduplication before feature creationTesting
Tested on a project that had 607 features (should have been ~98). After identifying the duplicates came from repeated bulk creation calls, this fix prevents the issue from recurring.
Summary by CodeRabbit
Release Notes
✏️ Tip: You can customize this high-level summary in your review settings.