-
Notifications
You must be signed in to change notification settings - Fork 5
Fix/add connection pools to fix db hangup #731
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
The subscriber's afterUpdate was using event.entity which only contains changed fields (partial entity), not the full entity with charter. When groups were updated via webhooks, the charter field was often absent in the partial entity, causing Cerberus to lose track of group charters over time. After restart, charters would be loaded fresh from DB. Root cause: TypeORM's afterUpdate event provides partial entities when using repository.save(entity). The code was not reloading the full entity with all fields (including charter) from the database after the transaction committed. Changes: - Refactored afterUpdate to pass metadata (entityId, relations) instead of using the partial entity from event.entity - Created handleChangeWithReload that schedules entity reload for after transaction commit (inside setTimeout with 50ms delay) - Created executeReloadAndSend that does the actual findOne with all relations AFTER transaction commits, ensuring charter and all fields are loaded - Groups and messages sync with 50ms delay (fast, ensures commit) This is the same transaction timing issue fixed in file-manager-api and dreamsync-api. Now Cerberus will maintain full group data (including charters) indefinitely without requiring restarts.
The group webhook processing could hang indefinitely when loading participant users, causing Cerberus to appear stuck after running for some time. The last log was "Extracted userId" with no progress. Root causes: 1. Promise.all blocks if any getUserById call hangs (DB lock, timeout, etc.) 2. No timeout protection - hangs wait forever 3. No error handling - failures block entire webhook 4. Loading unnecessary relations (followers/following) added complexity Changes: - Use Promise.allSettled instead of Promise.all to handle failures gracefully - Add 5-second timeout per user lookup using Promise.race - Wrap each participant load in try-catch with detailed error logging - Load users without heavy relations in webhook context (don't need followers/following) - Add indexed logging to identify which participant causes issues - Log success/failure counts for transparency Benefits: - Webhook completes even if some participants fail to load - 5s timeout prevents indefinite hangs - Better diagnostics via indexed logging - Reduced DB load by skipping unnecessary relations The webhook will now respond within ~5 seconds even if all participant lookups fail, preventing Cerberus from getting stuck.
📝 WalkthroughWalkthroughEnhances webhook participant loading with per-participant timeouts to avoid hangs, adds connection-pool tuning to multiple Postgres DataSource configs, and refactors the subscriber afterUpdate flow to reload, enrich, and debounce post‑commit entity notifications. Changes
Sequence Diagram(s)sequenceDiagram
participant Event as Change Event
participant Sub as Subscriber
participant DB as Database
participant Adapt as Adapter
Event->>Sub: afterUpdate(event)
Sub->>Sub: derive entityId (event.entity.id / databaseEntity?.id / common id fields)
alt id missing
Sub-->>Event: log warning & exit
else id present
Sub->>Sub: schedule handleChangeWithReload (debounced per table)
Note right of Sub: debounced timer per table (skip junctions)
Sub->>DB: reload entity by id after commit
Note over Sub,DB: reload uses repository.findOne and enrichment
DB-->>Sub: enriched entity/plain data
Sub->>Sub: validate (skip locked/non-system where applicable)
Sub->>Adapt: adapter.handleChange(envelope)
Adapt-->>Sub: acknowledgement
Sub->>Sub: log envelope
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 3 | ❌ 2❌ Failed checks (2 inconclusive)
✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
83dce76 to
c05d7d0
Compare
Description of change
Issue Number
closes #638
Type of change
How the change has been tested
Change checklist
Summary by CodeRabbit
Release Notes
Bug Fixes
Performance
✏️ Tip: You can customize this high-level summary in your review settings.