feat(indexers): added partial matching to reverse search #131

rileyok-ons · 2026-02-11T14:08:06Z

✨ Summary

Resolves #130
Adds partial "starts_with" matching to the reverse searches to check for hierarchical queries

📜 Changes Introduced

Rewrites the join call to be join_with, to see if the doc label starts with query label

Feature implementation (feat:) / bug fix (fix:) / refactoring (chore:) / documentation (docs:) / testing (test:)
Updates to tests and/or documentation
Terraform changes (if applicable)

✅ Checklist

Please confirm you've completed these checks before requesting a review.

Code passes linting with Ruff
Security checks pass using Bandit
API and Unit tests are written and pass using pytest
Terraform files (if applicable) follow best practices and have been validated (terraform fmt & terraform validate)
DocStrings follow Google-style and are added as per Pylint recommendations
Documentation has been updated if needed

🔍 How to Test

Attempt using in both api and reverse search demos, should preserve order and do hierarchical matching on demo codes

lukeroantreeONS

Behaves as described, and implemented well. Happy to approve.

Tested based on DEMO notebook;

Couple of small points:

could you update the documentation (README, Demo general_workflow notebook, reverse_search docstring) to mention this functionality so it's visible to users?
currently the dataframe we return is sorted by doc_id - I think it would be more intuitive if it was returned sorted first by id, then by doc_id. Do you agree? If not, happy to approve this now and talk later, and if so up to you whether you think it would be good to add to this PR or list as a separate task for a later date.

rileyok-ons · 2026-02-12T10:55:17Z

Behaves as described, and implemented well. Happy to approve.

Tested based on DEMO notebook;
One point though - currently the dataframe we return is sorted by `doc_id` - I think it would be more intuitive if it was returned sorted first by `id`, then by `doc_id`. Do you agree? If not, happy to approve this now and talk later, and if so up to you whether you think it would be good to add to this PR or list as a separate task for a later date.

This might be because of the join being performed right to left, will check if it makes a difference, something to note is there's a lot less explicit control for the matching and order than in a standard join, will look into if the paradigm is different

lukeroantreeONS · 2026-02-12T11:04:16Z

This might be because of the join being performed right to left, will check if it makes a difference, something to note is there's a lot less explicit control for the matching and order than in a standard join, will look into if the paradigm is different

I think your join is great, I was thinking more of a possible final sorting step on the final_table variable, e.g.

final_table = final_table.sort(
    by=["id", "doc_id"],
    descending=[False, False]
)

rileyok-ons · 2026-02-12T11:06:54Z

This might be because of the join being performed right to left, will check if it makes a difference, something to note is there's a lot less explicit control for the matching and order than in a standard join, will look into if the paradigm is different

I think your join is great, I was thinking more of a possible final sorting step on the final_table variable, e.g.
final_table = final_table.sort(
    by=["id", "doc_id"],
    descending=[False, False]
)

I see what you mean, will put it in as default and can can set as optional later ideas change

frayle-ons

Looks good to me, like the code and agree that it should be sorted by 'id' first and the sub-sorted by 'doc_id'

only suggestion is that we add a boolean parameter to the method to switch between exact matching and partial matching

frayle-ons · 2026-02-12T17:12:56Z

src/classifai/indexers/main.py


        return results_df

    def reverse_search(self, query: VectorStoreReverseSearchInput, n_results=100) -> VectorStoreReverseSearchOutput:


I think it would be appropriate to add a new method parameter 'do_partial' (or other name) and have it default to false?

so by default only full id matching results are returned, but if true then this partial matching code activates?

feat(indexers): added partial matching to reverse search

b0007d4

rileyok-ons linked an issue Feb 11, 2026 that may be closed by this pull request

Add Partial matching to reverse search #130

Open

lukeroantreeONS self-requested a review February 12, 2026 10:45

lukeroantreeONS previously approved these changes Feb 12, 2026

View reviewed changes

chore: sorted results

f8b40af

rileyok-ons dismissed lukeroantreeONS’s stale review via f8b40af February 12, 2026 11:41

Merge branch main into 130-add-partial-matching-to-reverse-search

117bbd2

rileyok-ons marked this pull request as ready for review February 12, 2026 16:14

frayle-ons reviewed Feb 12, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(indexers): added partial matching to reverse search #131

feat(indexers): added partial matching to reverse search #131

Uh oh!

rileyok-ons commented Feb 11, 2026

Uh oh!

lukeroantreeONS left a comment •

edited

Loading

Uh oh!

rileyok-ons commented Feb 12, 2026

Uh oh!

lukeroantreeONS commented Feb 12, 2026

Uh oh!

rileyok-ons commented Feb 12, 2026

Uh oh!

frayle-ons left a comment

Uh oh!

frayle-ons Feb 12, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants


		return results_df

		def reverse_search(self, query: VectorStoreReverseSearchInput, n_results=100) -> VectorStoreReverseSearchOutput:

feat(indexers): added partial matching to reverse search #131

Are you sure you want to change the base?

feat(indexers): added partial matching to reverse search #131

Uh oh!

Conversation

rileyok-ons commented Feb 11, 2026

✨ Summary

📜 Changes Introduced

✅ Checklist

🔍 How to Test

Uh oh!

lukeroantreeONS left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

rileyok-ons commented Feb 12, 2026

Uh oh!

lukeroantreeONS commented Feb 12, 2026

Uh oh!

rileyok-ons commented Feb 12, 2026

Uh oh!

frayle-ons left a comment

Choose a reason for hiding this comment

Uh oh!

frayle-ons Feb 12, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

lukeroantreeONS left a comment •

edited

Loading