fix: escape angle brackets in plain-text fields for search_vector trigger #281
Merged
hoiekim merged 1 commit into hoiekim:main on Mar 25, 2026
Self-Review
Discussion thread status:
Checked:
E2E Testing:
Issues found:
Confidence: High
…gger
to_tsvector('english', ...) treats angle brackets as HTML tags and strips
their contents. This is correct for the mail body (text field) but not for
plain-text fields like subject, from_text, and to_text.
Fix:
- Replace '<' and '>' with spaces in subject, from_text, and to_text before
passing to to_tsvector, so words inside angle brackets are indexed.
- Add a retroactive reindex UPDATE on startup so existing rows are fixed;
the WHERE clause makes it a no-op once all rows are up-to-date.
Closes hoiekim#280
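The replacement step in the first bullet can be illustrated outside the database. What follows is a minimal Python sketch, not the trigger's actual SQL: the helper name neutralize_angle_brackets is hypothetical, and PostgreSQL's to_tsvector parsing is not modelled. It only shows that swapping each bracket for a space leaves the enclosed words as ordinary whitespace-delimited tokens.

```python
from typing import Optional


def neutralize_angle_brackets(value: Optional[str]) -> str:
    """Replace '<' and '>' with spaces; treat a missing (NULL) field as empty.

    Hypothetical helper that mirrors the idea of the fix, not its SQL.
    """
    return (value or "").replace("<", " ").replace(">", " ")


subject = "XSS test <script>alert(1)</script>"
cleaned = neutralize_angle_brackets(subject)

# With the brackets gone, the words that sat inside them are plain tokens.
tokens = cleaned.split()
print(tokens)
```

Using a space instead of deleting the bracket keeps neighbouring words apart: an input such as a<b>c becomes "a b c" and not "abc".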
49d9235 to 2193fe7
hoiekim
approved these changes
Mar 25, 2026
Problem
to_tsvector('english', ...) treats angle brackets as HTML tag delimiters and strips their contents. This is intentional for the mail body (text field), but subject, from_text, and to_text are plain-text fields, so words inside angle brackets were being silently dropped from the search index.
Example: for an email with the subject XSS test <script>alert(1)</script>, the words script and alert never appear in search_vector.
Verified in the database:
Fix
In mails_search_vector_trigger(), replace < and > with spaces in subject, from_text, and to_text before passing them to to_tsvector. The text (HTML body) field is left unchanged; HTML stripping there is correct behaviour.
Also adds an idempotent startup reindex so existing rows are fixed retroactively. The WHERE … IS DISTINCT FROM clause makes it a no-op once all rows are up to date.
Testing
<alert>)
Closes #280
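The idempotent reindex pattern described under Fix can be sketched with Python's built-in sqlite3 module. This is an illustration under stated assumptions, not the PR's code: the table layout and the index_text function are hypothetical stand-ins for the real schema and search_vector expression, and SQLite's null-safe IS NOT operator is used here in place of PostgreSQL's IS DISTINCT FROM.

```python
import sqlite3


def index_text(subject):
    # Hypothetical stand-in for the real search_vector expression.
    return (subject or "").replace("<", " ").replace(">", " ").lower()


conn = sqlite3.connect(":memory:")
conn.create_function("index_text", 1, index_text)
conn.execute(
    "CREATE TABLE mails (id INTEGER PRIMARY KEY, subject TEXT, search_vector TEXT)"
)
conn.executemany(
    "INSERT INTO mails (subject, search_vector) VALUES (?, ?)",
    [
        ("XSS test <script>alert(1)</script>", "xss test"),  # stale value
        ("Hello", "hello"),                                   # already current
        (None, None),                                         # never indexed
    ],
)

# Only rows whose stored value differs from the recomputed one are touched.
# IS NOT is null-safe in SQLite, so a NULL search_vector also counts as stale.
REINDEX = """
UPDATE mails
SET search_vector = index_text(subject)
WHERE search_vector IS NOT index_text(subject)
"""

first_pass = conn.execute(REINDEX).rowcount
second_pass = conn.execute(REINDEX).rowcount
print(first_pass, second_pass)
```

The null-safe comparison matters for the third row: a plain <> against a NULL search_vector evaluates to NULL, so that row would be skipped on every startup.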