Feat/queue monitoring #337
Merged
… management features
@Power70 Great news! 🎉 Based on an automated assessment of this PR, the linked Wave issue(s) no longer count against your application limits. You can now already apply to more issues while waiting for a review of this PR. Keep up the great work! 🚀
Contributor
Please resolve the conflicts and fix the CI failures.
…e, and notifications controller
Contributor
Author
I have run all the failing tests locally and they all pass. I'm still trying to understand why they fail here.
…achLink_backend into feat/queue-monitoring
…e files Co-authored-by: Copilot <copilot@github.com>
Contributor
Author
All checks are passing now.
Linked Issue
Closes #291
What does this PR do?
This PR adds queue monitoring and operational controls for Bull in the teachLink backend, covering metrics, health, failed-job handling, retry analytics, and scheduled-job management. The implementation adds strongly typed DTO validation at controller boundaries, secures endpoints with authentication/authorization guards, and prevents static routes from colliding with parameterized routes. Queue observability is expanded with periodic health checks, trend-aware statistics, retry analysis, and stuck-job recovery hooks, which are intended to speed up production diagnosis and response. The changes are designed as non-breaking enhancements that preserve existing queue behavior.
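To illustrate the route-collision point, here is a minimal sketch of a first-match router in plain TypeScript. It is not the controller code from this PR: `matchRoute`, the `Route` type, and the handler names `getFailedJobs` and `getStuckJobs` are hypothetical, and the sketch only assumes that routes are tried in registration order, as Express-style routers do.

```typescript
// Illustrative first-match router. It stands in for registration-order
// matching; it is NOT the NestJS or Express implementation.
type Route = { pattern: string; handler: string };
type Match = { handler: string; params: Record<string, string> };

function matchRoute(routes: Route[], path: string): Match | null {
  const pathParts = path.split('/').filter(Boolean);
  for (const route of routes) {
    const patternParts = route.pattern.split('/').filter(Boolean);
    if (patternParts.length !== pathParts.length) continue;

    const params: Record<string, string> = {};
    const matched = patternParts.every((part, i) => {
      const value = pathParts[i] ?? '';
      if (part.startsWith(':')) {
        // A parameter segment accepts any value, including "failed".
        params[part.slice(1)] = value;
        return true;
      }
      return part === value;
    });
    if (matched) return { handler: route.handler, params };
  }
  return null;
}

// Parameterized route registered first: it shadows the static route.
const paramFirst: Route[] = [
  { pattern: '/queues/jobs/:id', handler: 'getJob' },
  { pattern: '/queues/jobs/failed', handler: 'getFailedJobs' }, // hypothetical name
];

// Static routes registered first: each path reaches its intended handler.
const staticFirst: Route[] = [
  { pattern: '/queues/jobs/failed', handler: 'getFailedJobs' }, // hypothetical name
  { pattern: '/queues/jobs/stuck', handler: 'getStuckJobs' }, // hypothetical name
  { pattern: '/queues/jobs/:id', handler: 'getJob' },
];

console.log(matchRoute(paramFirst, '/queues/jobs/failed')?.handler); // getJob
console.log(matchRoute(staticFirst, '/queues/jobs/failed')?.handler); // getFailedJobs
console.log(matchRoute(staticFirst, '/queues/jobs/42')?.handler); // getJob
```

With the parameterized route first, a request for `/queues/jobs/failed` is treated as a lookup of a job whose id is "failed". Declaring the static routes first avoids that without changing any paths.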
Type of change
Pre-merge checklist (required)
Branch & metadata
- … feature/issue-<N>-<slug> / fix/issue-<N>-<slug> convention
- … develop or main)
Code quality & tests
- npm run lint:ci: zero ESLint warnings
- npm run format:check: Prettier reports no changes needed
- npm run typecheck: zero TypeScript errors
- npm run test:ci: all tests pass, coverage ≥ 70%
- … .spec.ts unit tests
Error handling & NestJS best practices
- … class-validator / class-transformer decorators and are wired through NestJS pipes (e.g. global ValidationPipe or explicit)
- … any / unknown reaching the domain)
- … BadRequestException, UnauthorizedException, ForbiddenException, NotFoundException) instead of generic Error
- … Logger or central logger service) with meaningful, structured messages
- … AuthGuard, role/permissions guards, custom guards) are applied to all new/modified endpoints where appropriate
API documentation / Swagger
- … /api (or Swagger UI) reflects new/changed endpoints correctly
Breaking changes
Breaking change description (if applicable)
Not applicable.
Changes Overview
New file
src/queues/dto/queue.dto.ts
- AddJobDto, AddBulkJobsDto
- ScheduleJobDto, ScheduleDelayedJobDto
- FailedJobsQueryDto, StuckJobsQueryDto, AnalyticsQueryDto
- CleanQueueDto
Updated file
src/queues/queue.controller.ts
- … /jobs/failed and /jobs/stuck no longer collide with /jobs/:id.
- … ValidationPipe validation for body/query boundaries.
- … JwtAuthGuard + RolesGuard with admin restriction on mutation endpoints.
- … NotFoundException handling for missing jobs in getJob, retryJob, and removeJob.
- POST /queues/jobs/failed/retry-all
- GET /queues/metrics/history
- GET /queues/metrics/retries
- GET /queues/counts
- GET /queues/jobs/scheduled
- GET /queues/cron/jobs
- POST /queues/jobs/delay
- DELETE /queues/jobs/scheduled/:id
- DELETE /queues/empty
Updated file
src/queues/monitoring/queue-monitoring.service.ts
- … capturedAt to metrics history for time-series analysis.
- … job.timestamp when processedOn is null.
- … try/catch and structured logger output.
- … getRetryAnalytics(windowMinutes) with rates and per-job-type aggregation.
- … retryAllFailedJobs() with summary { requeued, skipped, errors }.
- … @Cron(EVERY_MINUTE)) with alert stubs and stuck-job recovery.
Acceptance Criteria Checklist
- … GET /queues/metrics, GET /queues/metrics/history, GET /queues/counts, GET /queues/statistics)
- … GET /queues/jobs/failed, POST /queues/jobs/failed/retry-all, POST /queues/jobs/:id/retry)
- … GET /queues/metrics/retries with per-job-type breakdown, window filtering, rates)
- … GET /queues/health with healthy/warning/critical, stuck-job recovery, periodic alerting)
Labels
backend, queue, monitoring, priority-medium
Test evidence (required)
Commands run locally
Observed results
Manual / API verification
Screenshots / recordings (if applicable)
Not applicable for backend-only changes.