Releases: alo-exp/multai

v0.3.0 — 100% Test Coverage + Security Hardening

08 Apr 11:42

Features

  • `feat(phase-01): achieve 100% unit test coverage across entire codebase` (7a9837f) — 816 tests, all platforms, all modules
  • `feat(01-01): add pytest/coverage config and pragma: no cover annotations` (77972ce)
  • `feat(01-01): add conftest.py with mock_page fixture and shared helpers` (9861769)
  • `feat: add /multai:update skill` (fb39825)
  • `feat: add Help Center to MultAI website` (6655310)

Security

  • `security: apply SENTINEL Stage-4 audit patches (P1 + P2)` (f18e5d6) — delimiter injection fix, pip-audit now blocking on HIGH/CRITICAL CVEs, chrome-profile sanitisation, resolved output paths, default profile changed to MultAI
  • `security: fix task-name traversal guard and /tmp resolve (SENTINEL pass 2)` (0786dbb) — task-name path guarded against traversal; /tmp check uses Path.resolve() for macOS compatibility
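
A minimal sketch of the kind of guard the SENTINEL pass 2 patch describes, not the project's actual code. `safe_task_dir` and `is_under_tmp` are hypothetical names. Both sides of each comparison go through `Path.resolve()`, which matters on macOS where `/tmp` is a symlink to `/private/tmp`. `Path.is_relative_to` needs Python 3.9 or later.

```python
from pathlib import Path


def safe_task_dir(reports_root: Path, task_name: str) -> Path:
    """Return reports_root/task_name, rejecting names that escape the root."""
    root = reports_root.resolve()
    candidate = (root / task_name).resolve()
    # "../x" and absolute task names resolve outside root; "" and "." resolve to root itself.
    if candidate == root or not candidate.is_relative_to(root):
        raise ValueError(f"unsafe task name: {task_name!r}")
    return candidate


def is_under_tmp(path: Path) -> bool:
    """Compare resolved paths so a symlinked /tmp still matches."""
    return path.resolve().is_relative_to(Path("/tmp").resolve())
```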

Bug Fixes

  • `fix(quality-gate-stage-2): resolve consistency audit findings` (5abf5e3)
  • `fix(quality-gate-stage-1): address code review findings` (d3bca68)
  • `fix(tests): patch _ensure_playwright_data_dir in orchestrate tests` (add056d)
  • `fix(ci): install pytest-asyncio for async test support` (7af20ce)
  • `fix(tests): stub inject_prompt in TestRunLifecycle to avoid clipboard deps on CI` (da37d7e)
  • `fix(tests): resolve sys.modules platforms pollution across test suite` (022dc32)
  • `fix(conftest): make install_stubs return config mock with all required attrs` (1284369)
  • `fix(01-04): move test files from engine/tests/ to tests/` (178f860)
  • `fix(ci): exclude helper files from check_rate_limit platform count` (7afa1bc)
  • `fix: update test_orchestrator_args to import from cli.py` (db9310c)
  • `fix: remove User Guide nav link; add Help Center link to USER-GUIDE.md` (67df2ed)
  • `fix: apply main site theme/nav/footer to help center; replace emoji icons` (0ce0d97)
  • `fix: replace theme toggle emojis with inline SVG icons` (98576c3)

Tests

  • `test(01-06): achieve 100% statement coverage for gemini, grok, perplexity, chatgpt_extractor` (68ce628)
  • `test(01-05): add 100% unit test coverage for 4 platform drivers` (f82efde)
  • `test(01-04): achieve 100% coverage for Playwright core modules` (6af862b)
  • `test(01-03): achieve 100% coverage for non-Playwright modules` (39c89a0)
  • `test(01-02): achieve 100% coverage for matrix_builder.py` (03eae39)
  • `test(01-02): achieve 100% coverage for matrix_ops.py` (abe5404)

Chores

  • `chore(release): bump version to v0.3.0; update README` (830e589)

v0.2.26040636 Alpha

07 Apr 22:05

Security

  • SENTINEL security audit + 5 hardening patches (b101e02): prompt delimiter in agent fallback task, clipboard clear post-injection, engine requirements pinned to exact versions, agent API transmission disclosure log, CDP scope isolation comment

Bug Fixes

  • fix: Gemini DR — bring_to_front before plan search, wait_for_selector 45s (6858075): resolves race condition where Angular re-rendered the Start button between tab focus and click
  • fix: Gemini DR race condition — bring tab to front, extend Stop confirmation to 60s (b94113b)
  • fix: Gemini DR completion — remove false Thinking indicator, add Share & Export signal (04c012f)
  • fix: Claude.ai DEEP — disable Research mode, use web search only (d94b178): resolves connector failure when Research mode is enabled
  • fix: DeepSeek stop vs send button disambiguation via SVG rect + text-growth tracking (5f53a29): resolves false stop detection
  • fix: DeepSeek stop-button detection — JS DOM walk instead of :has-text() (cd2a6a4)
  • fix: detect Claude.ai Research failure, prevent sidebar-junk extraction (3dc02d9)
  • fix: Perplexity wait for input element before inject (ab17f73)
  • fix: Claude.ai artifact click once via JS dispatchEvent (c59cead)
  • fix: Claude.ai bring_to_front + Perplexity join all prose divs (faee4a9)
  • fix: Gemini quick-response fallback when DR cap exhausted (8e3658e)
  • fix: Claude.ai sidebar false-positive, conversation-turn extraction, 60 min timeout (2fe1f3a)
  • fix: ChatGPT DR quota fast-fail before 6-min retry loop (308d59a)
  • fix: Gemini DR stable_threshold 30→90 polls (7e5b6b0)
  • fix: ChatGPT DEEP body-threshold stability guard (1504fe8)
  • fix: Gemini DR Thinking detection, ChatGPT conversation ID filter (185e4f5)
  • fix: ChatGPT DEEP DR timeout 200s→600s, more cancel selectors (85522f0)
  • fix: Gemini DR stable-state threshold, ChatGPT DEEP echo suppression (7f7e764)
  • fix: ChatGPT stale DR panel, Gemini nav retry (81140a6)
  • fix: DEEP mode premature completion, Perplexity old-content, Chrome focus steal (d35e862)
  • fix: ChatGPT echo false-positive, Gemini cancel detection, Perplexity premature completion (2a53ce2)
  • fix: Gemini scoped completion check, DeepSeek leaf-block extraction (1c9687e)
  • fix: skill auto-invoke, non-interactive mode, rate limiter cap (0147cdf)
  • fix: ChatGPT DR frame isolation, extended retry (f424973)
  • fix: ChatGPT DR iframe URL pattern expansion (423b5e5)
  • fix: ChatGPT DEEP completion gated on DR iframe content (ec4602d)

Refactoring

  • refactor: split orchestrator.py into 6 focused modules (a3f1460): orchestrator, cli, engine_setup, tab_manager, status_writer, prompt_loader, retry_handler
  • refactor: split platforms/base.py into inject_utils and browser_utils mixins (aff8192)
  • refactor: split platforms/chatgpt.py — extract ChatGPTExtractorMixin (1013380)

Tests

  • test: add 13 unit tests for Gemini, DeepSeek completion_check and ChatGPT rate limit (bf138c6)

Chores

  • fix: update .silver-bullet.json skill name release-notes → create-release (c9d99a5)
  • fix: Stage 2 consistency audit — Makefile test path, version stamp (a4a5ef1)
  • enforce: add pre-release quality gate (4-stage) to MultAI project (2d776b8)
  • fix: correct clipboard docstring, move imports to top-level (f10d48a)
  • fix: apply code review fixes — portable test paths, atomic pref write, gather timeout (05aed78)

Full Changelog: v0.2.26040304-alpha...v0.2.26040636-alpha

0.2.26040304 Alpha — Report Viewer Polish, CI Fix, Gitignore Cleanup

02 Apr 12:33

What's New

Features

  • Report Viewer — generic Landscape Report viewer: preview.html decoupled from hardcoded Platform Engineering data via chart-data.json sidecar pattern. Each report carries its own data; viewer falls back to built-in defaults when absent.
  • Report Viewer — empty state: polished hero card with icon, heading, subtitle, and styled launch command.
  • Report Viewer — export buttons: Copy, PDF, Compare buttons now have visible borders.
  • Report Viewer — orphan lines removed: bare horizontal lines in empty state eliminated (#stats:empty, #sidebarFooter border, #solutionNav border).
  • chart-data.json skeleton generation: launch_report.py auto-creates a sidecar with domain-aware placeholder titles alongside each new report.
  • Silver Bullet enforcement: workflow and quality gates initialised for the project.
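
The sidecar pattern above can be sketched roughly as follows. The JSON keys, the default values, and the function names are assumptions for illustration; the real `chart-data.json` schema is not shown in these notes, and the viewer itself runs in the browser rather than in Python.

```python
import copy
import json
from pathlib import Path

# Hypothetical built-in defaults used when a report has no sidecar.
DEFAULT_CHART_DATA = {"title": "Landscape Report", "charts": []}


def write_chart_skeleton(report_path: Path, domain: str) -> Path:
    """Create a placeholder chart-data.json next to a new report."""
    sidecar = report_path.parent / "chart-data.json"
    # Create missing parent directories first so a fresh checkout (for example on CI) does not fail.
    sidecar.parent.mkdir(parents=True, exist_ok=True)
    skeleton = {"title": f"{domain} Landscape", "charts": []}
    sidecar.write_text(json.dumps(skeleton, indent=2), encoding="utf-8")
    return sidecar


def load_chart_data(report_path: Path) -> dict:
    """Use the report's own sidecar when present, else the defaults."""
    sidecar = report_path.parent / "chart-data.json"
    if not sidecar.is_file():
        return copy.deepcopy(DEFAULT_CHART_DATA)
    return json.loads(sidecar.read_text(encoding="utf-8"))
```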

Fixes

  • 8 issues from MULTAI-ISSUE-REPORT-2026-04-02: chart data reset per loadFile(), per-report title/anchor overrides, vendor pill table-row fallback, collectVendorNames tier-heading guard.
  • Preview server: switched to Homebrew Python 3.13 + PORT-aware serve.py (system Python 3.9 failed with PermissionError in sandboxed environment).
  • launch_report.py CI: mkdir(parents=True, exist_ok=True) before writing chart-data.json.
  • setup.sh: added git pull --rebase so reinstall always fetches latest source.
  • Docs sync: restored 6 remote doc files replaced by incorrect Silver Bullet stubs.

Chores

  • Gitignore: reports/*/ excludes all generated MultAI output (raw AI responses, CIRs, landscape reports, matrices). SENTINEL audit files remain tracked.

Also included (not previously tagged)

  • 0.2.26040303 — SENTINEL security audit: XSS fix via DOMPurify, CDN SRI hashes, temp file cleanup
  • 0.2.26040302 — Hardened release: security fixes, 3-iteration code review pass
  • 0.2.26040301 — Popup dismissal, readiness check, real-time sign-in, verified install
  • 0.2.26040203 — Report viewer redesign with Ālo Design System

0.2.26040202 Alpha — Login Retry, Perplexity Fix, Platform-Level Fallback

01 Apr 18:51

What's in this release

All 7 AIs now guaranteed to be prompted (login retry)

When a platform shows a sign-in page, the engine no longer skips it permanently. After all parallel platforms complete, the engine:

  1. Prints a clear sign-in prompt for each needs_login platform (with URL)
  2. Waits 90 seconds so you can sign in in Chrome
  3. Automatically retries those platforms

Other platforms are unaffected — their results are already collected.
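
The retry flow above might look something like this sketch. Only the `needs_login` status and the 90-second wait come from these notes; `retry_needs_login`, the `"complete"` status, and the injected `run_platform` and `sleep` callables are hypothetical.

```python
import time
from typing import Callable, Dict


def retry_needs_login(
    results: Dict[str, str],
    urls: Dict[str, str],
    run_platform: Callable[[str], str],
    wait_seconds: float = 90,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, str]:
    """Prompt for sign-in, wait once, then retry only the needs_login platforms."""
    pending = [name for name, status in results.items() if status == "needs_login"]
    if not pending:
        return results
    for name in pending:
        print(f"Sign in to {name} in Chrome: {urls[name]}")
    sleep(wait_seconds)  # one shared wait, not one per platform
    updated = dict(results)  # results from other platforms are kept as collected
    for name in pending:
        updated[name] = run_platform(name)
    return updated
```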

Perplexity — "Computer" feature no longer triggered

The model picker and Research toggle now explicitly skip any option containing "computer" (case-insensitive), preventing accidental activation of the "Perplexity Computer" paid/credit feature. Input injection updated to prefer textarea (new Perplexity UI).
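
As an illustration of the case-insensitive skip, assuming the picker's option labels are available as plain strings (`selectable_options` is a hypothetical helper, not the engine's function):

```python
from typing import List


def selectable_options(labels: List[str]) -> List[str]:
    """Drop any option whose label mentions "computer", regardless of case."""
    return [label for label in labels if "computer" not in label.casefold()]
```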

Platform-level browser-use fallback (new)

If a platform returns failed after all Playwright steps fail, and ANTHROPIC_API_KEY or GOOGLE_API_KEY is set, a full browser-use agent session retries the complete interaction: navigate → type → send → wait → extract. Uses 25 agent steps (DEEP) or 15 (REGULAR).
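
A rough sketch of the gating logic described above. The step budgets and environment variable names come from these notes; `agent_fallback_steps` and its return convention are assumptions, and the browser-use session itself is not shown.

```python
import os
from typing import Mapping, Optional

AGENT_STEPS = {"DEEP": 25, "REGULAR": 15}


def agent_fallback_steps(
    status: str, mode: str, env: Mapping[str, str] = os.environ
) -> Optional[int]:
    """Return the agent step budget, or None when the fallback should not run."""
    if status != "failed":
        return None
    # The fallback needs at least one configured API key.
    if not (env.get("ANTHROPIC_API_KEY") or env.get("GOOGLE_API_KEY")):
        return None
    return AGENT_STEPS[mode]
```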

No breaking changes

All existing behavior preserved. Fallbacks are additive.

0.2.26040201 Alpha — /consolidator Redesigned as Standalone Skill

01 Apr 18:33

/consolidator redesigned as a standalone, generic skill

/consolidator is now a first-class skill for synthesizing content from any set of input sources into a unified structured report.

What's new

  • Renamed: multi-ai-consolidator → consolidator (fixes display name in Claude Desktop)
  • Generic mode: consolidates documents, transcripts, meeting notes, URLs, pasted text — auto-detects content type and derives an appropriate report structure
  • AI-Responses mode (preserved): full CIR generation from orchestrator/specialist-skill archives — no behavioral change for existing MultAI workflows
  • Phase 0 mode detection: announces mode before proceeding
  • Source attribution: all claims attributed to specific sources; conflicts surfaced explicitly

No breaking changes

Existing workflows that invoke /consolidator from /multai, /solution-researcher, or /landscape-researcher are fully preserved.

0.2.26040105 Alpha

01 Apr 17:47

Redesign `/comparator` as a standalone skill and expose it publicly.

What's new

`/comparator` is now a first-class user-facing skill for comparing any two (or more) solutions — with or without a prior MultAI research run.

Gap fixes

| Gap | Fix |
|---|---|
| No capability discovery | Phase 2 derives a capability framework from available evidence and confirms with the user before scoring |
| Manual build.json | Auto-derived from evidence in Phase 5 |
| No priority assignment phase | Phase 3: interactive priority review, optional ('auto' to skip) |
| Coupled to CIR format | Phase 4 generalised to CIR Variant A/B, non-CIR docs, and LLM knowledge with confidence labels |
| No "compare two solutions" flow | compare X vs Y is now a first-class operation |
| No readable output | Phase 7 always produces a Markdown summary with ranked scores, category breakdown, and key differentiators |
| Domain knowledge required | Domain file is fully optional, bootstrapped from scratch on first run |

Also

  • README updated to surface /comparator alongside /multai
  • Skill description updated so Claude activates it on comparison intent

0.2.26040104 Alpha

01 Apr 00:11

Add Cowork runtime support via the Claude-in-Chrome path.

What's new

Cowork tab support: MultAI now works in both the Claude Code tab and the Claude Cowork tab.

The Playwright engine cannot run inside the Cowork Ubuntu sandbox (no Mac Chrome, no CDP, no Keychain auth). This release adds a full Cowork execution path using the Claude-in-Chrome MCP, which operates the user's real signed-in Mac Chrome directly.

Changes

  • Phase 0a (runtime detection): Auto-detects Code tab vs Cowork on startup via sys.platform, shutil.which(chrome), and CDP port check. No user action needed.
  • Phase 2-Cowork: New sequential Claude-in-Chrome path — connection check, per-platform tab navigation, JS prompt injection, response polling, and login signal detection.
  • User messaging: Clear guidance when Claude-in-Chrome is not connected (with Code tab as recommended fallback).
  • chrome_selectors.py: Canonical CSS selectors for all 7 platforms (input, submit, login signals) for the Claude-in-Chrome path.
  • Playwright engine untouched — remains the primary, full-featured Code tab path with parallel execution.
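
Phase 0a's three signals could be combined along these lines. This is a guess at the decision rule, not the shipped logic: the port number 9222, the `"chrome"` binary name, and the way the signals are combined are all assumptions.

```python
import shutil
import socket
import sys
from typing import Optional


def cdp_port_open(port: int = 9222, host: str = "127.0.0.1", timeout: float = 0.5) -> bool:
    """Return True if something accepts TCP connections on the CDP port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def detect_runtime(
    platform: Optional[str] = None,
    chrome_found: Optional[bool] = None,
    cdp_open: Optional[bool] = None,
) -> str:
    """Return "code" when a local Mac Chrome is reachable, else "cowork"."""
    platform = sys.platform if platform is None else platform
    if chrome_found is None:
        chrome_found = shutil.which("chrome") is not None
    if cdp_open is None:
        cdp_open = cdp_port_open()
    # The Cowork sandbox is Ubuntu with no Mac Chrome and no CDP endpoint.
    if platform == "darwin" and (chrome_found or cdp_open):
        return "code"
    return "cowork"
```

The optional arguments exist only so the rule can be exercised without a real browser; called with no arguments, the function probes the local machine.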

Runtime comparison

| | Code tab | Cowork tab |
|---|---|---|
| Engine | Playwright + CDP | Claude-in-Chrome MCP |
| Execution | Parallel (all 7 at once) | Sequential (one at a time) |
| Auth | Mac Chrome profile | User's real Chrome (already signed in) |
| Setup | bash setup.sh | Zero (extension already installed) |

0.2.26040102 Alpha

31 Mar 21:36

Security hardening (SENTINEL v2.3 audit) + versioning scheme update.

Security fixes

  • F-1 (Indirect prompt injection): Wrap all platform responses in <untrusted_platform_response> tags; add trust-boundary preamble to consolidator skill
  • F-2 (CDP binding): Add --remote-debugging-host=127.0.0.1 to bind Chrome DevTools Protocol to loopback only
  • F-3 (Credential file harvesting): Remove Login Data / Login Data-journal from files copied to ~/.chrome-playwright/; add chmod 0o700 on the directory
  • F-4 (Path traversal): Guard --output-dir against paths outside the project root
  • F-5 (Prompt file size): Enforce 500 KB ceiling on prompt file input
  • F-6 (Supply chain): Pin all dependencies to exact versions (playwright==1.58.0, openpyxl==3.1.5, anthropic==0.76.0, fastmcp==2.0.0, browser-use==0.12.2); add requirements.txt
  • F-7 (Consent gate): Require explicit user confirmation before dispatching prompts to external AI platforms
  • F-8 (Broad shell permission): Remove Bash(python3:*) wildcard from settings.json
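
To make F-1 and F-5 concrete, here is a small sketch under stated assumptions. The tag name and the 500 KB ceiling come from the list above; the function names, the reading of "500 KB" as 500 × 1024 bytes, and the escaping of embedded closing tags are our own choices rather than the audited implementation.

```python
from pathlib import Path

TAG = "untrusted_platform_response"
MAX_PROMPT_BYTES = 500 * 1024  # assumes "500 KB" means 500 * 1024 bytes


def wrap_untrusted(response: str) -> str:
    """Wrap a platform response so downstream prompts can treat it as data."""
    # Escape any embedded closing tag so the response cannot end the block early.
    body = response.replace(f"</{TAG}>", f"&lt;/{TAG}&gt;")
    return f"<{TAG}>\n{body}\n</{TAG}>"


def read_prompt_file(path: Path) -> str:
    """Read a prompt file, refusing anything above the size ceiling."""
    size = path.stat().st_size
    if size > MAX_PROMPT_BYTES:
        raise ValueError(f"prompt file is {size} bytes; limit is {MAX_PROMPT_BYTES}")
    return path.read_text(encoding="utf-8")
```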

Versioning

  • Switch from YYMMDDX (letter suffix) to YYMMDDNN (two-digit counter) — fully valid semver, up to 99 patches/day
  • Update plugin.json, pyproject.toml, CHANGELOG.md, and docs/CICD-Strategy-and-Plan.md
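
For illustration, a version string in the new scheme can be assembled like this. `build_version` is a hypothetical helper, and the 1 to 99 counter range is inferred from "up to 99 patches/day".

```python
from datetime import date


def build_version(day: date, counter: int, prefix: str = "0.2") -> str:
    """Build MAJOR.MINOR.YYMMDDNN, where NN is a two-digit per-day counter."""
    if not 1 <= counter <= 99:
        raise ValueError("counter must be between 1 and 99")
    return f"{prefix}.{day:%y%m%d}{counter:02d}"
```

Under this reading, the second release cut on 1 April 2026 would be `0.2.26040102`, which matches the tag on this release.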

0.2.26040101 Alpha

31 Mar 21:36

Rename orchestrator skill to /multai.

MultAI v0.2.260331A Alpha — Orchestration Reliability & Tab Reuse

31 Mar 10:52

What's New in v0.2.260331A Alpha

This release addresses 7 engine reliability issues and introduces tab reuse across research runs.

Engine: 7 Reliability Fixes

1 — Playwright-Only Enforcement
Added a CRITICAL banner to the orchestrator SKILL.md explicitly preventing the host AI from using Claude-in-Chrome or computer-use tools instead of the Playwright engine.

2 — Sign-In Page Detection 🔑
The engine now detects login/sign-in pages (URL patterns + password-field detection) and returns a clear needs_login status instead of silently failing. Agent fallback attempts to navigate past the login page first.
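
A sketch of the two-signal check, assuming the caller has already established whether the page contains a password field. The URL hints and the name `looks_like_login` are illustrative; the engine's actual patterns are not listed in these notes.

```python
from urllib.parse import urlparse

# Hypothetical URL fragments that suggest a sign-in page.
LOGIN_URL_HINTS = ("login", "signin", "sign-in", "sign_in")


def looks_like_login(url: str, has_password_field: bool) -> bool:
    """Flag a page as needs_login if the URL or the DOM suggests a sign-in form."""
    parsed = urlparse(url.lower())
    # Only host and path are inspected, so a query such as ?redirect=login is ignored.
    haystack = parsed.netloc + parsed.path
    return has_password_field or any(hint in haystack for hint in LOGIN_URL_HINTS)
```

In a Playwright driver, the boolean might come from counting `input[type="password"]` elements on the page before calling this helper.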

3 — Broader Agent Fallback Coverage
Agent fallback now triggers on navigation failures, click_send errors, and configure_mode errors — paths that previously fell through without attempting recovery.

4 — Pre-Flight: Warn-Only, Never Skip
Rate-limit pre-flight changed from a hard gate to warnings. All 7 platforms always attempt to run. A platform is excluded only if it shows a sign-in page, is network-unreachable, or reports on-page quota exhaustion.

5 — Dynamic Global Timeout
Global timeout is now max(per-platform timeout) + stagger_total + 60s, ensuring the last staggered platform always gets its full wait time.
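
The formula can be written out as a small function. The name `global_timeout` and the per-platform mapping are assumptions; how `stagger_total` is derived is not stated here, so it is taken as an input.

```python
from typing import Mapping


def global_timeout(
    per_platform: Mapping[str, float], stagger_total: float, margin: float = 60.0
) -> float:
    """max(per-platform timeout) + stagger_total + 60s, all in seconds."""
    return max(per_platform.values()) + stagger_total + margin
```

For example, with per-platform timeouts of 600 s and 200 s and a total stagger of 30 s, the global timeout comes to 600 + 30 + 60 = 690 s.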

6 — Follow-Up Mode (--followup)
New --followup CLI flag injects the new prompt directly into existing open conversations — no navigation, no new tabs, no mode reconfiguration.

7 — Tab Reuse for New Topics
Default behaviour reuses existing open browser tabs for new conversations, navigating within the found tab. Tab URLs persisted to ~/.chrome-playwright/tab-state.json.
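
Persisting tab URLs could be as simple as the sketch below. The file layout (platform name mapped to URL) and both function names are assumptions; only the file location comes from these notes.

```python
import json
import os
from pathlib import Path
from typing import Dict

TAB_STATE_PATH = Path.home() / ".chrome-playwright" / "tab-state.json"


def save_tab_state(path: Path, tabs: Dict[str, str]) -> None:
    """Write the platform-to-URL map via a temp file, then swap it into place."""
    path.parent.mkdir(parents=True, exist_ok=True)
    tmp = path.with_suffix(".tmp")
    tmp.write_text(json.dumps(tabs, indent=2), encoding="utf-8")
    os.replace(tmp, path)  # readers never observe a half-written file


def load_tab_state(path: Path) -> Dict[str, str]:
    """Return the saved map, or an empty one if the file is missing or corrupt."""
    try:
        return json.loads(path.read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        return {}
```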

Other Changes

  • Dark mode is now the default on the website
  • Comparison table headings center-aligned
  • 3 new tests (UT-OR-12, UT-CF-09, UT-CF-10) → 98 total tests

Engine CLI

```bash
# Follow-up on an existing research thread:
python3 skills/orchestrator/engine/orchestrator.py \
  --prompt "Now focus specifically on pricing models" \
  --mode REGULAR \
  --task-name my-research \
  --followup
```

Full Changelog

See CHANGELOG.md for complete details.