This document captures known bugs, quick wins, and feature ideas for the winget-search project. It is intended as a starting point for new contributors or an LLM asked to find work to do.
Each item notes which file(s) are affected, the effort involved, and whether the change is backend-only (Python / CI), frontend-only (index.html), or both.
This section tracks what is already implemented on branch codex/frontend-search-improvements
so future contributors/LLMs can continue from the remaining items.
- #1 Regex crash fix in `highlightMatch()` (escaped regex input)
- #5 Minified JSON output in `extract_packages.py` (`separators=(",", ":")`)
- #6 Ranked/weighted search scoring in frontend
- #7 `shortDescription` fallback in result rendering
- #8 Incremental result rendering with `Load more` paging
- #9 URL search state support (`?q=`) + back/forward handling
- #10 Homepage link + license badge shown in cards
- #11 Click-to-filter for publisher and tags
- #26 Removed unused `re` import and unused `is_english_manifest()` helper
| # | Status on this branch | Notes / next step |
|---|---|---|
| 1 | Done | highlightMatch() now escapes regex tokens before RegExp creation. |
| 2 | Open | CI cache key still uses ${{ github.run_id }} and misses cache reuse. |
| 3 | Open | Footer still has YOUR_USERNAME/YOUR_REPO_NAME placeholder URL. |
| 4 | Open | README still has duplicated summary/malformed opening heading block. |
| 5 | Done | packages.json output no longer pretty-printed. |
| 6 | Done | Weighted ranking added; exact/prefix/id/name matches now rank higher. |
| 7 | Done | Description fallback now uses `pkg.description`, then `pkg.shortDescription`. |
| 8 | Done | Results render in pages of 25 with a Load more button. |
| 9 | Done | Search query is read/written from URL query string. |
| 10 | Done | Homepage and license are rendered in result cards. |
| 11 | Done | Publisher/tags are interactive filters feeding back into search. |
| 12 | Open | No compact/expandable details modal yet. |
| 13 | Open | No copy-command variants UI yet. |
| 14 | Open | No fuzzy/typo-tolerant search dependency integrated yet. |
| 15 | Open | No pre-built backend search index artifact yet. |
| 16 | Open | No inferred package categories generated/displayed yet. |
| 17 | Open | Template leftover config files still present. |
| 18 | Open | license.txt still has placeholder copyright holder. |
| 19 | Open | 404 redirect path is still hardcoded in workflow. |
| 20 | Open | Cached winget update still resets to origin/master. |
| 21 | Open | force_pages_update.sh still assumes return branch master. |
| 22 | Open | README license link/file mismatch still unresolved. |
| 23 | Open | README still describes stricter locale behavior than extractor implements. |
| 24 | Open | Version fallback logic still collapses non-PEP440 versions to 0.0.0. |
| 25 | Open | packages.json tracking policy remains ambiguous. |
| 26 | Done | Dead code/import cleanup completed in extractor. |
- #3 Replace placeholder repository URL in footer (`index.html`).
- #4 Clean up duplicate README summary + malformed heading.
- #22 Fix README license link mismatch (`LICENSE` vs `license.txt`).
- #2 Improve CI cache key strategy (date-based key + restore keys).
- #20 Make cached winget reset use detected default branch.
- #19 Remove hardcoded 404 redirect repo path.
- #21 Make `force_pages_update.sh` restore original branch.
github_workflows_build.yml runs on push to main/master, on schedule, and via manual dispatch.
Pushing codex/frontend-search-improvements should not trigger that workflow automatically.
GitHub Pages must remain enabled for this repository and must serve from the gh-pages
branch root. A successful Build and Deploy run only updates the deployment branch; it does
not re-enable Pages if the repository-level Pages setting has been turned off.
Practical consequence: if gh-pages contains the expected index.html/packages.json
artifacts but https://solrevdev.com/winget-search/ returns 404, check Settings >
Pages before assuming a frontend regression.
File: index.html — highlightMatch() function
Priority: High (crashes the page on certain inputs)
The search query is passed directly into new RegExp() without escaping regex special characters.
A query like `(test` throws a SyntaxError and breaks highlighting for that keystroke. A query like `a.b` does not throw, but `.` is treated as a wildcard, so unintended text is highlighted.
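The effect of escaping can be checked outside the page. The sketch below is a minimal, self-contained illustration: `escapeRegExp`, `throwsOnRawQuery`, and `highlight` are hypothetical names (not functions from the repository), and `highlight` is only a simplified stand-in for `highlightMatch()`. It shows that `(test` is rejected by the `RegExp` constructor, while `a.b` compiles but treats `.` as a wildcard unless it is escaped.

```javascript
// Hypothetical helper: escapes characters that have special meaning in a RegExp.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Returns true when the raw, unescaped query cannot be compiled as a pattern.
function throwsOnRawQuery(query) {
  try {
    new RegExp(`(${query})`, 'gi');
    return false;
  } catch (err) {
    return err instanceof SyntaxError;
  }
}

// Simplified stand-in for highlightMatch(): wraps literal matches in <mark>.
function highlight(text, query) {
  const regex = new RegExp(`(${escapeRegExp(query)})`, 'gi');
  return text.replace(regex, '<mark>$1</mark>');
}
```

With escaping in place, `highlight('axb a.b', 'a.b')` marks only the literal `a.b` and leaves `axb` alone.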
Fix:
```javascript
// Before
const regex = new RegExp(`(${query})`, 'gi');

// After
const escaped = query.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
const regex = new RegExp(`(${escaped})`, 'gi');
```

File: .github/workflows/github_workflows_build.yml
Priority: Medium (wastes CI time, not a correctness bug)
The cache key for the winget-pkgs clone is winget-pkgs-${{ github.run_id }}, which is unique
per run and will never produce a cache hit on scheduled runs. The winget repo is always cloned
fresh, negating the cache entirely.
Fix: Use a date-based key so same-day retries reuse the clone:
```yaml
- name: Get date for cache key
  id: date
  run: echo "date=$(date +'%Y-%m-%d')" >> "$GITHUB_OUTPUT"

- name: Cache winget-pkgs
  uses: actions/cache@v4
  with:
    path: winget-pkgs
    key: winget-pkgs-${{ steps.date.outputs.date }}
    restore-keys: winget-pkgs-
```

File: index.html — footer section (around line 317)
Priority: Medium (broken link on live site)
The footer contains https://github.com/YOUR_USERNAME/YOUR_REPO_NAME which should have been
replaced with the actual repository URL during initial setup. The README documents this step
but it was not done.
Fix: Replace the placeholder with the real repository URL in the footer link.
File: README.md — lines 1–8 and lines 246–254
Priority: Low (cosmetic)
The summary paragraph appears twice: once before the # heading (likely an authoring mistake)
and once near the end under a second ## Summary heading. Remove one of them.
File: extract_packages.py — json.dump() call near the end of main()
Priority: High (easy 30–40% reduction in payload size)
The JSON is written with indent=2, adding whitespace for human readability. With 30,000+
packages this is unnecessary overhead for a file that is only consumed by a browser.
Fix:
```python
# Before
json.dump(output, f, cls=EnhancedJSONEncoder, ensure_ascii=False, indent=2)

# After
json.dump(output, f, cls=EnhancedJSONEncoder, ensure_ascii=False, separators=(',', ':'))
```

File: index.html — showResults() function
Priority: High (meaningfully improves result quality)
All substring matches are treated equally. A package whose id exactly matches the query ranks
the same as one where the query appears in the middle of a long description.
Fix: Score each package and sort descending before slicing to 100:
```javascript
// q is assumed to be the trimmed, lowercased query string.
function scorePackage(pkg, q) {
  let score = 0;
  if (pkg.id?.toLowerCase() === q) score += 100;
  else if (pkg.id?.toLowerCase().startsWith(q)) score += 50;
  else if (pkg.id?.toLowerCase().includes(q)) score += 20;
  if (pkg.name?.toLowerCase().includes(q)) score += 15;
  if (pkg.tags?.some(t => t?.toLowerCase().includes(q))) score += 8;
  if (pkg.publisher?.toLowerCase().includes(q)) score += 5;
  if (pkg.description?.toLowerCase().includes(q)) score += 3;
  return score;
}

// In showResults(), replace the filter+slice with:
results = packages
  .map(pkg => ({ pkg, score: scorePackage(pkg, q) }))
  .filter(({ score }) => score > 0)
  .sort((a, b) => b.score - a.score)
  .slice(0, 100)
  .map(({ pkg }) => pkg);
```

File: index.html — result card template inside renderPackage() (or equivalent)
Priority: Medium (data completeness, backend already extracts the field)
The Python extractor outputs both description (from the locale file) and shortDescription
(from the version manifest), but the frontend only uses description. Many packages have an
empty description but a populated shortDescription.
Fix: In the result card template, change:
```javascript
// Before
${pkg.description ? `<p class="description">${escapeHtml(pkg.description)}</p>` : ''}

// After
const desc = pkg.description || pkg.shortDescription || '';
// then use desc in the template
```

File: index.html — showResults() and the results container
Priority: Medium (performance on low-end devices)
Results are capped at 100, but rendering 100 DOM nodes at once can still cause layout jank on lower-end hardware. The simplest fix is a "Load more" button that appends the next batch rather than a full virtual-scroll implementation:
```javascript
const PAGE_SIZE = 25;
let visibleCount = PAGE_SIZE; // reset to PAGE_SIZE whenever a new search replaces currentResults

function renderPage() {
  const visible = currentResults.slice(0, visibleCount);
  resultsDiv.innerHTML = visible.map(renderCard).join('');
  if (visibleCount < currentResults.length) {
    resultsDiv.insertAdjacentHTML('beforeend',
      `<button onclick="loadMore()">Load more (${currentResults.length - visibleCount} remaining)</button>`);
  }
}

function loadMore() {
  visibleCount += PAGE_SIZE;
  renderPage();
}
```

File: index.html — search input event listener and page initialisation
Priority: High (enables sharing search results, browser back/forward)
Currently there is no way to share a search URL or navigate back to a previous search.
Fix: Read from and write to the URL query string:
```javascript
// On page load, after packages are ready:
const params = new URLSearchParams(location.search);
if (params.has('q')) {
  searchInput.value = params.get('q');
  showResults(params.get('q'));
}

// In the debounced input handler:
history.replaceState(null, '', query ? `?q=${encodeURIComponent(query)}` : location.pathname);
```

File: index.html — result card template
Priority: Medium (zero backend cost, fields already in packages.json)
The Python extractor outputs homepage and license for every package but neither is shown
in the UI. Adding them to the card requires only a frontend template change.
Fix:
```javascript
${pkg.homepage ? `<a href="${escapeHtml(pkg.homepage)}" target="_blank" rel="noopener noreferrer">Homepage</a>` : ''}
${pkg.license ? `<span class="license-badge">${escapeHtml(pkg.license)}</span>` : ''}
```

- `homepage`: link with `rel="noopener noreferrer"` and `target="_blank"`
- `license`: small badge rendered alongside the version badge
File: index.html — publisher span and tag pill click handlers
Priority: Medium (discoverability, frontend-only)
Publisher names and tag pills are currently rendered as plain text or non-interactive spans. Clicking them should populate the search box with that value and trigger a new search.
Fix: Add an onclick to each:
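```javascript
// Publisher: the value travels in a data attribute rather than an inline JS string,
// so an apostrophe in a name cannot break the handler.
// Assumes escapeHtml() also escapes double quotes.
`<span class="publisher" data-filter="${escapeHtml(pkg.publisher)}" onclick="filterBy(this.dataset.filter)">${escapeHtml(pkg.publisher)}</span>`

// Tag pill
`<span class="tag" data-filter="${escapeHtml(tag)}" onclick="filterBy(this.dataset.filter)">${escapeHtml(tag)}</span>`

// Helper
function filterBy(value) {
  searchInput.value = value;
  showResults(value);
  searchInput.focus();
}
```

File: index.html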
Priority: Medium (reduces visual clutter in the default list view)
Cards currently show all fields inline. A compact default view with an expandable section (or modal) for full description, all tags, homepage, and license would make the list easier to scan.
No backend changes needed.
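One low-effort way to get an expandable card is the native `<details>` element, which needs no extra state handling. The sketch below is an assumption-laden illustration, not the project's renderer: `renderCompactCard` is a hypothetical name, the CSS class names are placeholders, and `escapeHtml` is re-declared locally only so the snippet runs on its own. The package fields are the ones mentioned elsewhere in this document.

```javascript
// Local stand-in for the page's escapeHtml() so the sketch is self-contained.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// Hypothetical renderer: the summary line is always visible, the rest is collapsed.
function renderCompactCard(pkg) {
  const desc = pkg.description || pkg.shortDescription || '';
  const tags = (pkg.tags || [])
    .map(tag => `<span class="tag">${escapeHtml(tag)}</span>`)
    .join('');
  const details = [
    desc ? `<p class="description">${escapeHtml(desc)}</p>` : '',
    tags ? `<div class="tags">${tags}</div>` : '',
    pkg.homepage
      ? `<a href="${escapeHtml(pkg.homepage)}" target="_blank" rel="noopener noreferrer">Homepage</a>`
      : '',
    pkg.license ? `<span class="license-badge">${escapeHtml(pkg.license)}</span>` : '',
  ].join('');

  const summary = `<strong>${escapeHtml(pkg.name || pkg.id)}</strong> <code>${escapeHtml(pkg.id)}</code>`;
  // With nothing to expand, render a plain card instead of an empty <details>.
  if (!details) return `<div class="card">${summary}</div>`;
  return `<details class="card"><summary>${summary}</summary>${details}</details>`;
}
```

A modal would need open/close handling and focus management on top of this, so `<details>` is probably the smaller first step.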
File: index.html — copy button area
Priority: Low (power user feature)
The README lists "Add copy as PowerShell option" as a future idea. A toggle or secondary button could offer:
| Variant | Command |
|---|---|
| Standard (current) | winget install -e --id Publisher.Name |
| Silent / scripted | winget install --exact --id Publisher.Name --silent --accept-package-agreements |
A small segmented toggle above the results (or per-card) is sufficient. No backend changes needed.
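The two variants in the table above could be driven by a small lookup keyed by variant name. This is a minimal sketch: the names `COMMAND_VARIANTS` and `buildInstallCommand` and the variant keys `standard` and `silent` are assumptions for illustration, while the command strings are copied from the table.

```javascript
// Variant keys are hypothetical; the flags are the ones listed in the table above.
const COMMAND_VARIANTS = new Map([
  ['standard', id => `winget install -e --id ${id}`],
  ['silent', id => `winget install --exact --id ${id} --silent --accept-package-agreements`],
]);

// Builds the command text that the copy button would place on the clipboard.
function buildInstallCommand(packageId, variant = 'standard') {
  const build = COMMAND_VARIANTS.get(variant);
  if (!build) throw new Error(`Unknown command variant: ${variant}`);
  return build(packageId);
}
```

A toggle would then only need to store the selected variant key and pass it to `buildInstallCommand` when a copy button is clicked.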
File: index.html
Priority: Low (significant UX improvement but requires a new dependency)
Simple substring matching misses common typos (e.g., googel chrome). A lightweight client-side
full-text search library such as FlexSearch or
Fuse.js would add typo tolerance and tokenised indexing.
Both can be included as a single self-hosted minified file — no npm or build step required. FlexSearch is generally faster for large datasets; Fuse.js is simpler to configure.
The index would be built once after packages.json is fetched, then reused for all queries.
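To make "typo tolerance" concrete before choosing a library, the sketch below uses plain Levenshtein edit distance per query token. This is a stand-in technique, not the API or the internal algorithm of FlexSearch or Fuse.js. The function names and the default threshold of 2 edits are assumptions.

```javascript
// Classic dynamic-programming Levenshtein distance (insert, delete, substitute).
function editDistance(a, b) {
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

// A query matches when every query token is within maxDistance edits
// of at least one whitespace-separated word in the text.
function fuzzyMatches(text, query, maxDistance = 2) {
  const words = text.toLowerCase().split(/\s+/).filter(Boolean);
  const tokens = query.toLowerCase().split(/\s+/).filter(Boolean);
  return tokens.length > 0 && tokens.every(token =>
    words.some(word => editDistance(token, word) <= maxDistance));
}
```

Here `fuzzyMatches('Google Chrome', 'googel chrome')` is true because `googel` is two edits from `google`. A fixed threshold of 2 is too loose for short tokens (for example, `vim` is within two edits of `git`), so a real implementation would likely scale the threshold with token length, which is one reason to prefer a tested library.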
File: extract_packages.py, index.html
Priority: Low (more complex, high payoff at scale)
Instead of building the search index in the browser on every page load, generate a serialised
inverted index at extraction time in Python and ship it as a separate index.json file.
The browser loads the index lazily and searches it directly, avoiding the O(n) full-array scan.
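The shape of such an index and the browser-side lookup can be sketched as follows. Everything here is an assumption for illustration: the index layout (token mapped to a list of package positions), the ASCII-only tokenizer, and the function names. `buildIndex` stands in for the extraction step, which the proposal places in Python; it is written in JavaScript here only so the lookup can be exercised end to end.

```javascript
// Splits text into lowercase ASCII alphanumeric tokens (non-ASCII text is ignored).
function tokenize(text) {
  return (text || '').toLowerCase().match(/[a-z0-9]+/g) || [];
}

// Stand-in for the extraction step: maps each token to the positions of the
// packages whose id, name, or tags contain it. Positions are in ascending order.
function buildIndex(packages) {
  const index = new Map();
  packages.forEach((pkg, position) => {
    const tokens = new Set([
      ...tokenize(pkg.id),
      ...tokenize(pkg.name),
      ...(pkg.tags || []).flatMap(tag => tokenize(tag)),
    ]);
    for (const token of tokens) {
      if (!index.has(token)) index.set(token, []);
      index.get(token).push(position);
    }
  });
  return index;
}

// Browser-side lookup: intersect the posting lists of all query tokens.
function lookup(index, query) {
  const tokens = tokenize(query);
  if (tokens.length === 0) return [];
  return tokens
    .map(token => index.get(token) || [])
    .reduce((acc, list) => acc.filter(position => list.includes(position)));
}
```

This sketch matches whole tokens only, so `fire` would not find `firefox`. Prefix or substring matching would need an additional structure, which is part of why this item is rated as more complex.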
File: extract_packages.py, index.html
Priority: Low (enables filter-by-category in the UI)
Winget manifests have no category field, but one can be inferred from the Tags list using a
mapping table in Python (e.g., tags containing browser → Web Browsers). The category would
be emitted as a new field in packages.json and surfaced as a filter panel in the UI.
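The mapping-table idea can be sketched in a few lines. The rules, category names other than Web Browsers, and the function name `inferCategory` are illustrative assumptions. The sketch is in JavaScript to match the other frontend examples in this document, although the proposal places this logic in the Python extractor. It matches whole words inside tags rather than raw substrings, so a keyword like `ide` does not accidentally match `video`.

```javascript
// Illustrative mapping only; the real table and category names are up to the project.
const CATEGORY_RULES = [
  { category: 'Web Browsers', keywords: ['browser'] },
  { category: 'Developer Tools', keywords: ['ide', 'compiler', 'git'] },
  { category: 'Media', keywords: ['video', 'audio', 'music'] },
];

// Returns the first category with a keyword equal to a word in any tag, else null.
function inferCategory(tags) {
  const words = new Set(
    (tags || []).flatMap(tag => String(tag).toLowerCase().split(/[^a-z0-9]+/))
  );
  const rule = CATEGORY_RULES.find(({ keywords }) =>
    keywords.some(keyword => words.has(keyword)));
  return rule ? rule.category : null;
}
```

Rule order decides ties: a package tagged both `git` and `video` lands in Developer Tools because that rule comes first. Packages with no matching tag get `null`, which the UI could show as an "Other" bucket.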
Files: .editorconfig, omnisharp.json, .dockerignore
Priority: Low (cosmetic, no functional impact)
- `.editorconfig` is configured for C#/.NET (indent style, charset settings for `.cs` files)
- `omnisharp.json` is an OmniSharp C# language server config, not used by this project
- `.dockerignore` exists but there is no `Dockerfile`
These are likely leftovers from a project template. They can be removed or replaced with configs appropriate for a Python + HTML project.
File: license.txt
Priority: Low (cosmetic)
The copyright holder is listed as [Your Name]. Replace with the actual author name.
| # | Description | File(s) | Type | Priority |
|---|---|---|---|---|
| 1 | Fix regex crash in highlightMatch | index.html | Bug | High |
| 2 | Fix CI cache key (never hits) | build.yml | Bug | Medium |
| 3 | Replace placeholder repo URL in footer | index.html | Bug | Medium |
| 4 | Remove duplicate Summary in README | README.md | Bug | Low |
| 5 | Minify packages.json output | extract_packages.py | Performance | High |
| 6 | Ranked/weighted search results | index.html | Performance | High |
| 7 | Use shortDescription as fallback | index.html | Performance | Medium |
| 8 | Virtual scrolling / "Load more" button | index.html | Performance | Medium |
| 9 | URL-based search state (?q=) | index.html | Feature | High |
| 10 | Show homepage + license fields | index.html | Feature | Medium |
| 11 | Click-to-filter by publisher or tag | index.html | Feature | Medium |
| 12 | Package detail expansion / modal | index.html | Feature | Medium |
| 13 | Copy command variants | index.html | Feature | Low |
| 14 | Fuzzy / typo-tolerant search | index.html | Feature | Low |
| 15 | Pre-built search index (backend) | extract_packages.py + index.html | Feature | Low |
| 16 | Inferred package categories | extract_packages.py + index.html | Feature | Low |
| 17 | Remove irrelevant config files | .editorconfig, omnisharp.json, .dockerignore | Housekeeping | Low |
| 18 | Personalise license.txt | license.txt | Housekeeping | Low |
The list above was validated against the current repository state.
| # | Status | Validation notes |
|---|---|---|
| 1 | Confirmed | highlightMatch() still creates RegExp from unescaped input (index.html:389). |
| 2 | Confirmed | Cache key still uses ${{ github.run_id }} (.github/workflows/github_workflows_build.yml:36). |
| 3 | Confirmed | Footer still contains YOUR_USERNAME/YOUR_REPO_NAME placeholder (index.html:316). |
| 4 | Confirmed | Duplicate summary still present (README.md:1 and README.md:246); top heading is also malformed (README.md:9). |
| 5 | Confirmed | JSON output is still pretty-printed with indent=2 (extract_packages.py:167). |
| 6 | Confirmed | Search still does unranked substring filter + slice(0, 100) (index.html:400-410). |
| 7 | Partially confirmed | UI only uses pkg.description (index.html:439), but extractor already sets description fallback from locale ShortDescription (extract_packages.py:78). Fallback to package-level shortDescription is still useful. |
| 8 | Confirmed | Rendering still injects all returned cards at once (index.html:423-455). |
| 9 | Confirmed | No URL query-state persistence exists. |
| 10 | Confirmed | homepage and license are extracted (extract_packages.py:79-80) but not shown in UI. |
| 11 | Confirmed | Publisher/tags are non-interactive spans (index.html:427, index.html:440-441). |
| 12 | Confirmed | No expandable details/modal exists. |
| 13 | Confirmed | Only one command variant is rendered (index.html:424). |
| 14 | Confirmed | Search is exact substring only (no fuzzy matching). |
| 15 | Confirmed | No pre-built backend search index artifact exists. |
| 16 | Confirmed | No inferred category field is generated. |
| 17 | Partially confirmed | Files are mostly template leftovers, but removal is optional rather than mandatory. |
| 18 | Confirmed | license.txt still contains [Your Name] placeholder (license.txt:3). |
File: .github/workflows/github_workflows_build.yml (line ~98)
Priority: High
The generated 404 page always redirects to /winget-search/, which fails for repos with different names.
Fix: build from repository name:
```yaml
- name: Copy site assets
  run: |
    repo_name="${{ github.event.repository.name }}"
    echo "<!DOCTYPE html><html><head><meta http-equiv=\"refresh\" content=\"0; url=/${repo_name}/\"></head></html>" > deploy/404.html
```

File: .github/workflows/github_workflows_build.yml (line ~53)
Priority: Medium
If upstream default branch ever differs from master, cached update can break.
Fix: resolve default branch dynamically:
```bash
default_branch=$(git symbolic-ref refs/remotes/origin/HEAD | sed 's@^refs/remotes/origin/@@')
git reset --hard "origin/$default_branch"
```

File: force_pages_update.sh (line ~25)
Priority: Low
Script always checks out master at the end, which is wrong on main-based repos.
Fix: capture and restore the original branch:
```bash
start_branch=$(git rev-parse --abbrev-ref HEAD)
# ... run script steps ...
git checkout "$start_branch"
```

Files: README.md (line ~258), license.txt
Priority: Low
README links to [MIT License](LICENSE) but file is named license.txt.
Fix: rename file to LICENSE or update README link to license.txt.
Files: README.md, extract_packages.py
Priority: Medium
README claims English-only extraction, but extractor falls back to non-en-US locale files when needed (extract_packages.py:68-70).
Fix options:
- enforce strict `en-US` only in code, or
- update README to describe current fallback behavior.
File: extract_packages.py (parse_version)
Priority: Medium
Invalid versions currently collapse to 0.0.0; multiple non-standard versions compare equal, making latest-selection unreliable.
Fix: compare with a deterministic fallback key:
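```python
from packaging import version  # assumed import, matching the version.parse call

def version_key(ver_str: str):
    """Sort key: parseable versions first, then unparseable ones ordered by raw string."""
    text = str(ver_str)  # YAML may deliver numeric versions such as 1.0 as floats
    try:
        return (0, version.parse(text), "")
    except version.InvalidVersion:
        # Rank 1 sorts after every valid version, so max() prefers it; swap ranks if valid versions should win.
        return (1, None, text)
```

Files: packages.json, .gitignore (line ~428)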
Priority: Low
packages.json is tracked in git, but .gitignore also lists it. This can confuse contributor expectations.
Fix: decide one policy explicitly:
- generated artifact only (stop tracking file), or
- tracked sample file (remove ignore entry).
File: extract_packages.py
Priority: Low
is_english_manifest() is unused and re import is unused.
Fix: remove dead code/imports, or wire the helper into real filtering.
| # | Description | File(s) | Type | Priority |
|---|---|---|---|---|
| 19 | Remove hardcoded 404 path | .github/workflows/github_workflows_build.yml | Bug | High |
| 20 | Remove hardcoded origin/master dependency | .github/workflows/github_workflows_build.yml | Bug | Medium |
| 21 | Make force_pages_update.sh branch-safe | force_pages_update.sh | Bug | Low |
| 22 | Fix README license filename mismatch | README.md, license.txt | Bug | Low |
| 23 | Align docs with locale fallback behavior | README.md, extract_packages.py | Bug | Medium |
| 24 | Improve non-standard version comparison | extract_packages.py | Bug | Medium |
| 25 | Clarify packages.json tracking policy | packages.json, .gitignore | Housekeeping | Low |
| 26 | Remove dead code/imports | extract_packages.py | Housekeeping | Low |