Add sanitize.js with correct script load order to fix broken autocomplete #26

Merged
NickHamby merged 3 commits into main from copilot/add-string-sanitization-script
Apr 6, 2026
Contributor

Copilot AI commented Apr 6, 2026

Previous sanitization attempt broke autocomplete because sanitize.js was loaded after Leaflet, making sanitizeInput undefined when app.js first ran. This PR redoes that work with sanitize.js loaded first.

Changes

  • web/js/sanitize.js (new): sanitizeInput() — strips parens, unsafe punctuation, collapses whitespace; allowlist keeps letters, digits, spaces, commas, hyphens, slashes
  • web/index.html: sanitize.js is now the first <script> tag, before the Leaflet CDN
  • web/js/app.js: Replace .value.trim() with sanitizeInput(inputEl.value) in both attachAutocomplete() and run()
  • web/js/hazards.js: Replace flat ABBR_MAP + normalizeStreet() with positional directional expansion and ZIP stripping
<!-- Critical order — sanitize.js must precede everything -->
<script src="js/sanitize.js"></script>
<script src="https://unpkg.com/leaflet@1.9.4/dist/leaflet.js"></script>
<script src="js/geocode.js"></script>
...

Key normalizeStreet improvements: strips trailing ZIP codes before expansion, expands directionals only at string start/end (not globally), preserving correctness for street names like "East Ave".
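
To illustrate why positional expansion matters, here is a minimal sketch comparing it with a global replace. The helper names expandGlobally and expandPositionally and the sample street names are hypothetical and not part of this PR; only the two positional regular expressions are taken from the normalizeStreet code quoted in the prompt.

```javascript
// Hypothetical comparison helpers, not part of the PR.
const DIRECTIONALS = { W: 'West', E: 'East', N: 'North', S: 'South' };

// Global expansion: rewrites every standalone W/E/N/S token in the string.
function expandGlobally(street) {
  return street.replace(/\b(W|E|N|S)\b/g, (_, d) => DIRECTIONALS[d]);
}

// Positional expansion: only a leading or trailing token is rewritten.
function expandPositionally(street) {
  return street
    .replace(/^(W|E|N|S)\b\s*/, (_, d) => DIRECTIONALS[d] + ' ')
    .replace(/\s+(W|E|N|S)$/, (_, d) => ' ' + DIRECTIONALS[d]);
}

for (const s of ['W Broad St', 'Main St N', 'John S Mosby Hwy']) {
  console.log(`${s} | global: ${expandGlobally(s)} | positional: ${expandPositionally(s)}`);
}
// W Broad St | global: West Broad St | positional: West Broad St
// Main St N | global: Main St North | positional: Main St North
// John S Mosby Hwy | global: John South Mosby Hwy | positional: John S Mosby Hwy
```

The two approaches agree on prefix and suffix directionals and differ only on a middle initial, which the positional version leaves alone.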

Original prompt

Implement the string sanitization plan as described in docs/string-sanitization-plan.md. A previous attempt at this broke autocomplete because sanitize.js was loaded in the wrong position — the script tag was placed after Leaflet instead of before all app scripts, so sanitizeInput was undefined when app.js tried to call it. This attempt must get the script load order exactly right.

Changes required

1. Create web/js/sanitize.js (new file)

// sanitize.js — sanitizes raw user input before geocoding or internal matching

/**
 * Sanitizes a raw user-supplied address string for use as a Nominatim query.
 *
 * - Trims leading/trailing whitespace
 * - Collapses multiple spaces into one
 * - Strips parenthetical context (e.g. "(near the park)")
 * - Removes characters that break queries: # " ' . ; : ! ? @ ^ * [ ] { } | \ ~ ` = + < > % & _
 * - Preserves commas (Nominatim field separators), hyphens (address ranges), slashes (fractions)
 * - Does NOT expand abbreviations (leave that to normalizeStreet in hazards.js)
 * - Does NOT strip ZIP codes (Nominatim handles them; strip only inside normalizeStreet)
 * - Must never be called on coordinate (lat/lng) values
 *
 * @param {string} str  Raw input string
 * @returns {string}    Sanitized string safe for Nominatim queries
 */
function sanitizeInput(str) {
  if (typeof str !== 'string') return '';
  let s = str;
  // Remove parenthetical content
  s = s.replace(/\([^)]*\)/g, '');
  // Keep: letters, digits, spaces, commas, hyphens, forward slashes
  s = s.replace(/[^a-zA-Z0-9\s,\-\/]/g, '');
  // Collapse multiple spaces
  s = s.replace(/\s{2,}/g, ' ');
  return s.trim();
}
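
As a quick check of the behavior described in the doc comment, the sketch below repeats the function so it can run standalone (for example under Node) and feeds it a few sample inputs. The sample addresses are made up for illustration.

```javascript
// Copy of sanitizeInput from above so this snippet runs on its own.
function sanitizeInput(str) {
  if (typeof str !== 'string') return '';
  let s = str;
  s = s.replace(/\([^)]*\)/g, '');            // drop parenthetical content
  s = s.replace(/[^a-zA-Z0-9\s,\-\/]/g, '');  // allowlist: letters, digits, whitespace, , - /
  s = s.replace(/\s{2,}/g, ' ');              // collapse runs of whitespace
  return s.trim();
}

console.log(sanitizeInput('  8 1/2   W. Broad St #4 (rear)  ')); // "8 1/2 W Broad St 4"
console.log(sanitizeInput('100-102 E Cary St, Richmond'));       // unchanged
console.log(sanitizeInput(undefined));                           // ""
console.log(sanitizeInput('12 Caf\u00e9 Rd'));                   // "12 Caf Rd"
```

The last call shows a side effect of the ASCII-only allowlist: accented letters are removed along with punctuation. Whether that matters depends on the addresses the app is expected to handle.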

2. Update web/index.html

The script load order is critical. sanitize.js MUST be the very first <script> tag — before Leaflet, before everything else. The final script block must look exactly like this:

  <script src="js/sanitize.js"></script>
  <script src="https://unpkg.com/leaflet@1.9.4/dist/leaflet.js"></script>
  <script src="js/geocode.js"></script>
  <script src="js/routing.js"></script>
  <script src="js/hazards.js"></script>
  <script src="js/map.js"></script>
  <script src="js/app.js"></script>

The current script block in index.html looks like this (read the file to confirm):

  <script src="https://unpkg.com/leaflet@1.9.4/dist/leaflet.js"></script>
  <script src="js/geocode.js"></script>
  <script src="js/routing.js"></script>
  <script src="js/hazards.js"></script>
  <script src="js/map.js"></script>
  <script src="js/app.js"></script>

Add <script src="js/sanitize.js"></script> as the FIRST line, before the Leaflet CDN script tag. Do NOT place it anywhere else.
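
One way to confirm the order after editing index.html is a small check over the list of script sources. This is only a sketch: sanitizeLoadsFirst is a hypothetical helper, not something the PR adds, and it looks only at scripts that have a src attribute.

```javascript
// Hypothetical helper: true when the first script source is sanitize.js.
function sanitizeLoadsFirst(scriptSrcs) {
  return scriptSrcs.length > 0 && /(^|\/)sanitize\.js$/.test(scriptSrcs[0]);
}

// In a browser console the list could be collected with:
//   Array.from(document.querySelectorAll('script[src]'), (el) => el.getAttribute('src'))
const requiredOrder = [
  'js/sanitize.js',
  'https://unpkg.com/leaflet@1.9.4/dist/leaflet.js',
  'js/geocode.js',
  'js/routing.js',
  'js/hazards.js',
  'js/map.js',
  'js/app.js',
];

// The misplacement described above: sanitize.js placed after Leaflet.
const misplacedOrder = [
  'https://unpkg.com/leaflet@1.9.4/dist/leaflet.js',
  'js/sanitize.js',
  'js/geocode.js',
];

console.log(sanitizeLoadsFirst(requiredOrder));  // true
console.log(sanitizeLoadsFirst(misplacedOrder)); // false
```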

3. Update web/js/app.js

In attachAutocomplete(): Replace inputEl.value.trim() with sanitizeInput(inputEl.value):

// Before:
const query = inputEl.value.trim();

// After:
const query = sanitizeInput(inputEl.value);

In run(): Replace the two .value.trim() calls with sanitizeInput():

// Before:
const origin = document.getElementById('origin').value.trim();
const destination = document.getElementById('destination').value.trim();

// After:
const origin = sanitizeInput(document.getElementById('origin').value);
const destination = sanitizeInput(document.getElementById('destination').value);
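
One behavioral difference from .trim() is worth keeping in mind: input made up entirely of stripped characters now sanitizes to an empty string. The sketch below shows a guard a caller might use. The shouldQuery helper and its 3-character minimum are assumptions for illustration and do not come from app.js.

```javascript
// Copy of sanitizeInput from sanitize.js so this snippet runs on its own.
function sanitizeInput(str) {
  if (typeof str !== 'string') return '';
  let s = str;
  s = s.replace(/\([^)]*\)/g, '');
  s = s.replace(/[^a-zA-Z0-9\s,\-\/]/g, '');
  s = s.replace(/\s{2,}/g, ' ');
  return s.trim();
}

// Hypothetical guard: returns the sanitized query, or null when it is too
// short to be worth sending to the geocoder. minLength = 3 is an assumption.
function shouldQuery(rawValue, minLength = 3) {
  const query = sanitizeInput(rawValue);
  return query.length >= minLength ? query : null;
}

console.log(shouldQuery(' 12 Main '));  // "12 Main"
console.log(shouldQuery('(???)'));      // null, although '(???)'.trim() has 5 characters
console.log(shouldQuery('@#!'));        // null
```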

4. Update web/js/hazards.js

Replace the existing ABBR_MAP constant and normalizeStreet() function with this improved version. Do NOT touch anything else in hazards.js: getAllHazards(), getHazardsOnRoute(), and the coordinate bounding box logic must remain exactly as they are.

const STREET_TYPE_ABBRS = [
  [/\bSt\b/g, 'Street'],
  [/\bAve\b/g, 'Avenue'],
  [/\bBlvd\b/g, 'Boulevard'],
  [/\bDr\b/g, 'Drive'],
  [/\bRd\b/g, 'Road'],
  [/\bPkwy\b/g, 'Parkway'],
  [/\bLn\b/g, 'Lane'],
  [/\bCt\b/g, 'Court'],
];

const DIRECTIONAL_EXPAND = {
  W: 'West',
  E: 'East',
  N: 'North',
  S: 'South',
};

function normalizeStreet(str) {
  let s = str;

  // 1. Strip trailing ZIP code (5 digits preceded by a comma and/or whitespace)
  s = s.replace(/[,\s]+\d{5}\s*$/, '');

  // 2. Strip leading house numbers (including fractional like "8 1/2").
  //    This runs before directional expansion so that the "W" in
  //    "123 W Main St" sits at the start of the string when step 4 runs.
  s = s.replace(/^\d+(\s+\d+\/\d+)?\s+/, '');

  // 3. Expand street type abbreviations
  for (const [pattern, replacement] of STREET_TYPE_ABBRS) {
    s = s.replace(pattern, replacement);
  }

  // 4. Expand directionals at start of string (prefix directional)
  s = s.replace(/^(W|E|N|S)\b\s*/, (_, d) => DIRECTIONAL_EXPAND[d] + ' ');

  // 5. Expand directionals at end of string (suffix directional)
  s = s.replace(/\s+(W|E|N|S)$/, (_, d) => ' ' + DIRECTIONAL_EXPAND[d]);

  // 6. Strip remaining punctuation
  s = s.replace(/[^a-zA-Z0-9\s]/g, '');

  // 7. Collapse whitespace and trim, lowercase
  s = s.replace(/\s{2,}/g, ' ');
  return s.trim().toLowerCase();
}
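
The ZIP-stripping pattern in step 1 has a few edge cases that are easier to see in isolation. In the sketch below, stripTrailingZip is a hypothetical wrapper around that one regular expression, and the sample strings are invented.

```javascript
// Hypothetical wrapper around the step 1 pattern, isolated for experimentation.
function stripTrailingZip(s) {
  return s.replace(/[,\s]+\d{5}\s*$/, '');
}

console.log(stripTrailingZip('W Broad St, Richmond, VA 23220')); // "W Broad St, Richmond, VA"
console.log(stripTrailingZip('Main St,23220 '));                 // "Main St"
console.log(stripTrailingZip('23220'));                          // "23220" (no separator before the digits)
console.log(stripTrailingZip('Main St 23220-1234'));             // unchanged: ZIP+4 is not matched
console.log(stripTrailingZip('County Rd 12345'));                // "County Rd"
```

The last case suggests the pattern cannot tell a ZIP code from any other trailing five-digit number, which may or may not matter for the street names in the hazard data.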

Verification checklist — the agent MUST confirm all of these before opening the PR

  • `web/js/sanit...

This pull request was created from Copilot chat.

…improve normalizeStreet in hazards.js

Agent-Logs-Url: https://github.com/NickHamby/PotholeDodgerV2/sessions/698ed176-2f2b-4e8e-921c-67ec953885b8

Co-authored-by: NickHamby <271342652+NickHamby@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Add string sanitization script to improve input handling" to "Add sanitize.js with correct script load order to fix broken autocomplete" Apr 6, 2026
Copilot AI requested a review from NickHamby April 6, 2026 18:16
NickHamby marked this pull request as ready for review April 6, 2026 19:48
NickHamby merged commit d35fbdf into main Apr 6, 2026
3 checks passed
2 participants