test: add unit tests for DuplicateService (Fixes #442) #895

Open
zeroknowledge0x wants to merge 2 commits into ritesh-1918:main from zeroknowledge0x:test/duplicate-service-unit-tests

Conversation

@zeroknowledge0x zeroknowledge0x commented May 31, 2026

Fixes #442

Summary

Adds comprehensive unit tests for DuplicateService in backend/services/duplicate_service.py. All 25 tests pass with ML dependencies mocked.

Changes

  • Created backend/tests/test_duplicate_service.py with 25 test cases
  • Covers all public methods: is_available(), load(), add_ticket(), check_duplicate(), save_to_disk()
  • Tests initialization state, availability flags, memory persistence, disk persistence
  • Tests duplicate detection with configurable thresholds
  • Tests graceful degradation when model is unavailable
  • Tests corrupt JSON handling in storage file
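
The corrupt-JSON case in the list above can be illustrated with a small stand-alone sketch. The helper name `append_ticket`, the file name `cache.json`, and the recovery policy (discard unreadable or non-list content and start from an empty list) are assumptions made for illustration; the actual `save_to_disk()` in `DuplicateService` may behave differently.

```python
import json
import tempfile
from pathlib import Path


def append_ticket(storage_file: Path, ticket: dict) -> list:
    """Hypothetical helper: append a ticket to a JSON list stored on disk.

    Assumption: unreadable or non-list content is discarded, not repaired.
    """
    tickets = []
    if storage_file.exists():
        try:
            loaded = json.loads(storage_file.read_text(encoding="utf-8"))
            if isinstance(loaded, list):
                tickets = loaded  # only a JSON list is accepted as prior state
        except json.JSONDecodeError:
            pass  # corrupt file: fall back to an empty list
    tickets.append(ticket)
    storage_file.write_text(json.dumps(tickets), encoding="utf-8")
    return tickets


def demo_corrupt_json_recovery() -> list:
    """Write garbage to the cache, append a ticket, and read the file back."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "cache.json"
        path.write_text("{not valid json", encoding="utf-8")
        append_ticket(path, {"id": 1, "text": "printer offline"})
        return json.loads(path.read_text(encoding="utf-8"))
```

A test along these lines would assert that the file is valid JSON again after the call and contains only the newly added ticket.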

Testing

python -m pytest backend/tests/test_duplicate_service.py -v

Result: 25 passed in 8.35s

Test Coverage

| Category | Tests | Description |
| --- | --- | --- |
| Init | 2 | Default state, storage dir creation |
| is_available | 4 | Not loaded, loaded, failed, loaded+failed |
| add_ticket | 4 | Memory, disk, multiple, degraded skip |
| check_duplicate | 6 | No tickets, degraded, above threshold, below threshold, custom threshold, best match |
| save_to_disk | 4 | New file, append, corrupt JSON, non-list JSON |
| load | 5 | Flag set, idempotent, failure flag, degraded startup, load saved tickets |

All ML dependencies (sentence_transformers) are mocked — no model download required.
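
One common way to mock a heavy import like this is to place a stub module in `sys.modules` for the duration of a test. The sketch below shows that general technique only; the function names and the fixed vector returned by `encode()` are illustrative assumptions, and the PR's own `_no_model_imports` fixture may be written differently.

```python
import sys
from unittest.mock import MagicMock, patch


def make_fake_sentence_transformers() -> MagicMock:
    """Build a stand-in module exposing a SentenceTransformer attribute."""
    fake_module = MagicMock(name="sentence_transformers")
    fake_model = MagicMock(name="model")
    # Assumed output: a fixed vector, so tests are deterministic.
    fake_model.encode.return_value = [0.1, 0.2, 0.3]
    # Calling SentenceTransformer(...) hands back the mock model.
    fake_module.SentenceTransformer.return_value = fake_model
    return fake_module


def encode_with_stub(text: str):
    """Import and use the library while the stub is installed."""
    stub = {"sentence_transformers": make_fake_sentence_transformers()}
    with patch.dict(sys.modules, stub):
        # The import resolves to the stub, so no model is downloaded.
        from sentence_transformers import SentenceTransformer

        model = SentenceTransformer("any-model-name")
        return model.encode(text)
```

`patch.dict` restores `sys.modules` when the block exits, so the stub does not leak into other tests.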

Summary by CodeRabbit

  • Tests
    • Added comprehensive test suite for duplicate detection service, covering initialization, availability checks, ticket management, disk persistence, and model loading to ensure system reliability and robustness.

- 25 tests covering initialization, availability, load behavior,
  add_ticket, check_duplicate, save_to_disk, and degraded mode
- All ML dependencies mocked (sentence-transformers)
- Tests verify: memory persistence, disk persistence, threshold
  override, best-match selection, corrupt JSON handling, and
  graceful degradation when model is unavailable

vercel Bot commented May 31, 2026

@zeroknowledge0x is attempting to deploy a commit to the ritesh Team on Vercel.

A member of the Team first needs to authorize it.

coderabbitai Bot commented May 31, 2026

Warning

Review limit reached

@zeroknowledge0x, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 5 minutes and 48 seconds. Learn how PR review limits work.

Your organization has run out of usage credits. Purchase more in the billing tab.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: b07e0e89-5d92-49ce-9961-5645010e80f9

📥 Commits

Reviewing files that changed from the base of the PR and between 1c5fccf and 77a77c3.

📒 Files selected for processing (1)
  • backend/tests/test_duplicate_service.py
📝 Walkthrough

This PR adds a complete pytest test suite for DuplicateService. It covers initialization state, availability checks, ticket addition with in-memory and disk persistence, duplicate detection with similarity thresholding and best-match selection, cache recovery from corrupt JSON, model loading with idempotency and failure flagging, and degraded startup mode via an environment variable.

Changes

DuplicateService Unit Tests

| Layer / File(s) | Summary |
| --- | --- |
| Test fixtures and infrastructure (backend/tests/test_duplicate_service.py) | Module imports for json/os/tempfile/pytest/unittest.mock and three pytest fixtures: _no_model_imports stubs heavy sentence_transformers imports; service creates a fresh DuplicateService wired to a temp cache file; loaded_service provides a preloaded service with mocked model and encoded tensor-like output. |
| Service initialization and availability checks (backend/tests/test_duplicate_service.py) | TestInit verifies default _loaded/_load_failed flags, empty _tickets list, and storage directory creation. TestIsAvailable asserts is_available() returns correct boolean based on combinations of loaded/failed state. |
| Duplicate detection core logic (backend/tests/test_duplicate_service.py) | TestCheckDuplicate covers no stored tickets, degraded mode returns, similarity above/below/custom threshold, and best-match selection among multiple tickets using mocked cosine similarity values. |
| Ticket addition, persistence, and recovery (backend/tests/test_duplicate_service.py) | TestAddTicket verifies in-memory append and disk persistence in JSON format, and degraded mode skips embedding. TestSaveToDisk handles cache creation, append, and recovery from invalid/non-list JSON. TestLoad verifies _loaded flag setting, idempotent initialization, _load_failed marking on failure, degraded startup via ALLOW_DEGRADED_STARTUP=1 environment variable, and ticket restoration from on-disk cache. |
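
The thresholding and best-match behaviour described for TestCheckDuplicate can be pictured with a minimal pure-Python sketch. Everything except the `is_duplicate` result key is an assumption here: the function names, the `match_id` and `score` keys, the `(ticket_id, vector)` storage layout, and the default threshold of 0.85 are hypothetical and not taken from `DuplicateService`.

```python
import math


def cosine(a, b) -> float:
    """Cosine similarity of two equal-length vectors (0.0 for a zero vector)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def best_match(query, stored, threshold: float = 0.85) -> dict:
    """Pick the most similar stored vector and compare it to the threshold.

    `stored` is a list of (ticket_id, vector) pairs (hypothetical layout).
    """
    if not stored:
        # Nothing to compare against, so nothing can be a duplicate.
        return {"is_duplicate": False, "match_id": None, "score": 0.0}
    scored = [(cosine(query, vec), ticket_id) for ticket_id, vec in stored]
    score, ticket_id = max(scored, key=lambda pair: pair[0])
    return {"is_duplicate": score >= threshold, "match_id": ticket_id, "score": score}
```

With mocked similarity values, as in the PR, a test only needs to check which ticket wins and on which side of the threshold its score falls.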

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Possibly related issues

  • #297: Tests cover check_duplicate behavior including threshold override, empty store, and degraded mode scenarios.

Possibly related PRs

  • ritesh-1918/HELPDESK.AI#509: Both PRs add/extend pytest coverage for DuplicateService targeting backend/tests/test_duplicate_service.py with overlapping test scenarios.
  • ritesh-1918/HELPDESK.AI#779: Expands backend/tests/test_duplicate_service.py coverage with continued focus on check_duplicate thresholding, best-match selection, and degraded mode handling.
  • ritesh-1918/HELPDESK.AI#551: Extends the same test file coverage for DuplicateService initialization, is_available, degraded mode, and disk persistence logic.

Suggested labels

gssoc, gssoc:approved, level:intermediate, quality:exceptional, type:testing

Poem

🐰 A test suite hops into the warren so fine,
DuplicateService checks—each case does align,
From init to thresholds, persistence so deep,
The tests ensure tickets don't duplicate sheep! 🐑

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 35.71% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'test: add unit tests for DuplicateService' clearly and concisely summarizes the main change: adding comprehensive test coverage for the DuplicateService class. |
| Linked Issues check | ✅ Passed | All coding objectives from issue #442 are met: the test file is created, similarity threshold handling is tested, duplicate detection accuracy is covered, and edge cases including degraded mode and corrupt JSON are properly tested. |
| Out of Scope Changes check | ✅ Passed | All changes are directly scoped to adding unit tests for DuplicateService as required by issue #442; no unrelated modifications are present. |

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@backend/tests/test_duplicate_service.py`:
- Around line 61-62: The test_storage_dir_created test is tautological because
the service fixture already sets service.storage_file under tmp_path (which
exists); change the test to configure DuplicateService to use a nested,
non-existent path before constructing it so __init__ must create the parent dir,
then assert that target.parent (the directory) exists; locate the
test_storage_dir_created and the service fixture setup (and
DuplicateService.__init__ / storage_file usage) and modify the test to provide a
nested path (e.g. tmp_path / "nested" / "case_history_cache.json") and
instantiate the service so the assertion verifies directory creation rather than
relying on tmp_path.
- Around line 128-132: Remove the leftover variable assignment (drop
loaded_service_result) and make the test actually exercise the degraded path by
adding a stored ticket before calling service.check_duplicate; specifically, in
test_degraded_returns_no_duplicate set service._loaded = False and
service._load_failed = True, store a ticket via the same API used in other tests
(e.g., the service method that adds tickets or the fixture helper) so
check_duplicate would normally detect a duplicate, then call
service.check_duplicate("text") and assert result["is_duplicate"] is False; keep
references to test_degraded_returns_no_duplicate, service._loaded,
service._load_failed, and service.check_duplicate to locate the change.
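
The first finding above (the tautological directory check) could be addressed along the following lines. This is a sketch under stated assumptions: `NestedPathService` is a hypothetical stand-in whose constructor takes the storage path directly, whereas the real `DuplicateService.__init__` may obtain its path some other way.

```python
from pathlib import Path


class NestedPathService:
    """Hypothetical stand-in; the real DuplicateService constructor may differ."""

    def __init__(self, storage_file: Path):
        self.storage_file = Path(storage_file)
        # Behaviour under test: the parent directory is created on construction.
        self.storage_file.parent.mkdir(parents=True, exist_ok=True)


def test_storage_dir_created(tmp_path: Path) -> None:
    """pytest-style test; `tmp_path` is the built-in temporary directory fixture."""
    target = tmp_path / "nested" / "case_history_cache.json"
    assert not target.parent.exists()  # precondition: nothing created yet
    NestedPathService(target)
    assert target.parent.is_dir()  # created by __init__, not by the fixture
```

The precondition assertion is what removes the tautology: the test fails if the directory already exists before the service is constructed.
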

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: a49fdabd-d843-4408-b886-e0acf4aa9faa

📥 Commits

Reviewing files that changed from the base of the PR and between da8faf2 and 1c5fccf.

📒 Files selected for processing (1)
  • backend/tests/test_duplicate_service.py

Comment thread backend/tests/test_duplicate_service.py Outdated
Comment thread backend/tests/test_duplicate_service.py

@ritesh-1918 (Owner)

Hi @zeroknowledge0x! 🙌

Thank you so much for your excellent contribution: "test: add unit tests for DuplicateService (Fixes #442)"! We really appreciate the high-quality code and effort you have put into the platform.

Just a quick, friendly heads-up as we prepare our manual merging and verification queues—please make sure to complete all the mandatory community steps listed below.

Once those manual steps are verified, we'll get your PR officially merged into the gssoc branch (or keep it neatly cataloged if closed as integrated) and assign it the highest possible GSSoC S-Tier labels to maximize your leaderboard points!

Let's build something amazing together! 🚀🔥


🌟 Community Support & Network Steps (Take 10 Seconds!)

As we prepare our manual verification and merging queues, please make sure you have taken a moment to complete these required steps to finalize your points:

  1. Star this repository: https://github.com/ritesh-1918/HELPDESK.AI (Mandatory)
  2. 🍴 Fork this repository: https://github.com/ritesh-1918/HELPDESK.AI/fork (Mandatory)
  3. 👤 Follow @ritesh-1918 on GitHub: https://github.com/ritesh-1918 (Mandatory - manual step)
  4. 💼 Connect on LinkedIn: https://www.linkedin.com/in/ritesh1908/ (Mandatory)

Note: Having these steps completed manually is required before your PR points are officially cleared.

@ritesh-1918 added labels on May 31, 2026: gssoc (GirlScript Summer of Code), gssoc:approved (GSSoC Approved PR), quality:exceptional (Exceptional code quality), level:intermediate (Intermediate level difficulty), type:testing (Testing suites, mock coverages, CI/CD integrations)

- test_storage_dir_created: test actual directory creation by DuplicateService
  instead of checking tmp_path (which pytest always creates)
- test_degraded_returns_no_duplicate: remove dead loaded_service_result variable,
  add stored ticket to properly exercise degraded code path
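
The second fix in the commit message above can be sketched as follows. `DegradedService` is a hypothetical stand-in, and two of its details are assumptions: that `is_available()` means "loaded and not failed", and that duplicates are found by exact text match (the real service compares embeddings). Only the attribute names `_loaded`, `_load_failed`, `_tickets` and the `is_duplicate` key come from the review.

```python
class DegradedService:
    """Hypothetical stand-in for DuplicateService, reduced to the degraded path."""

    def __init__(self):
        self._loaded = False
        self._load_failed = False
        self._tickets = []

    def is_available(self) -> bool:
        # Assumed semantics: usable only when loaded and not failed.
        return self._loaded and not self._load_failed

    def check_duplicate(self, text: str) -> dict:
        if not self.is_available():
            return {"is_duplicate": False}  # degraded: never report a duplicate
        # Exact-match comparison stands in for the embedding similarity check.
        return {"is_duplicate": any(t["text"] == text for t in self._tickets)}


def test_degraded_returns_no_duplicate() -> None:
    service = DegradedService()
    service._tickets.append({"text": "text"})  # a ticket that would match
    service._loaded = True
    assert service.check_duplicate("text")["is_duplicate"] is True  # sanity check
    service._loaded = False
    service._load_failed = True
    assert service.check_duplicate("text")["is_duplicate"] is False
```

Storing a matching ticket first is the point of the fix: without it, the assertion would pass even if the degraded branch were never taken.
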

Labels

  • gssoc:approved: GSSoC Approved PR
  • gssoc: GirlScript Summer of Code
  • level:intermediate: Intermediate level difficulty
  • quality:exceptional: Exceptional code quality
  • type:testing: Testing suites, mock coverages, CI/CD integrations

Projects

None yet

Development

Successfully merging this pull request may close these issues.

test : add unit tests for duplicate_service

2 participants