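diff --git a/docs/2026-05-13-windows-external-edit-productization.md b/docs/2026-05-13-windows-external-edit-productization.md
new file mode 100644
index 00000000..6e621c55
--- /dev/null
+++ b/docs/2026-05-13-windows-external-edit-productization.md
@@ -0,0 +1,360 @@
+# Windows external-edit productization notes
+
+## Goal
+
+This convergence pass wires `external-edit-lab` back into the formal OpenLess dictation pipeline, producing a minimal formal capability that ships in the first batch on Windows only:
+
+- After OpenLess formally inserts text in a supported scenario, it automatically arms a short-window observer.
+- If the user then manually corrects the just-inserted term in the same target control, the system tries to extract a deterministic `old -> new` replacement pair.
+- The learned replacement is written to the local formal terminology memory.
+- Later formal dictation hits that memory first.
+- Any failure must fall back silently and must not affect the original insertion result.
+
+## Formal integration points
+
+### 1. How the observer is armed after insertion completes
+
+- Integration file: `openless-all/app/src-tauri/src/coordinator/dictation.rs`
+- Trigger point: after the formal dictation pipeline has obtained the transcript, applied correction rules, and inserted the text.
+- The observer is currently armed only when all of the following hold:
+  - Windows
+  - `windows_external_edit_learning = true`
+  - `InsertStatus::Inserted`
+  - `focus_ready_for_paste = true`
+- The arm input is kept minimal:
+  - `inserted_text`
+  - `window_title`
+
+### 2. Observer lifecycle
+
+- Implementation file: `openless-all/app/src-tauri/src/windows_external_edit.rs`
+- Behavior:
+  - After arming, wait for a very short baseline delay.
+  - Read the text of the currently focused element as the baseline.
+  - Poll only within the short observation window.
+  - When a new session starts, cancel the previous observer to avoid cross-session contamination.
+- Read path:
+  - Windows UI Automation `ValuePattern`
+  - fallback to `TextPattern`
+- The first batch adds no resident listener, requests no new permissions, and does not implement the observer on macOS.
+
+### 3. How a successful observation is written to the formal terminology memory
+
+- v1 does not add a `terminology-memory.json`.
+- The formal terminology memory directly reuses:
+  - `%APPDATA%\OpenLess\correction-rules.json`
+  - Storage implementation: `CorrectionRuleStore`
+- Rules for `CorrectionRuleStore::remember(pattern, replacement)`:
+  - The same `pattern + replacement` already exists: return immediately; if the rule was disabled, re-enable it.
+  - The same `pattern` exists with a different `replacement`: reject the write to avoid a conflicting overwrite.
+  - Otherwise insert a new rule.
+
+### 4. 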
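How later formal output hits deterministically
+
+- No second set of rewrite logic is added.
+- The deterministic correction rules in the formal pipeline continue to be reused:
+  - `coordinator/dictation.rs`
+  - `apply_correction_rules(...)`
+- This means rules learned by external-edit and correction rules maintained by hand share the same formal hit point.
+
+## Failure fallback contract
+
+None of the following failures may affect the original insertion:
+
+- baseline capture fails
+- the focused element cannot be read
+- the inserted text is not unique in the baseline
+- the diff cannot be attributed with certainty to the just-inserted span
+- deterministic inference fails
+- writing `correction-rules.json` fails
+
+The corresponding logs all go through `[extedit] ...`, with no popup, no blocking, and no rewrite of the current insertion result.
+
+## Support matrix
+
+### First-batch formal support boundary
+
+| Target | Status | Evidence status | Notes |
+| --- | --- | --- | --- |
+| Windows Notepad | Known supported | Re-runnable formal insertion smoke + observer evidence in the repository | First-batch formal support |
+| Microsoft Edge textarea | Known supported | Formal insertion smoke + observer evidence completed in the current worktree | First-batch formal support |
+| WeChat input box | Known supported | Confirmed by earlier real external-edit validation; no new formal evidence package in the current worktree yet | First-batch formal support |
+| Zed | Not supported | Unsupported boundary retained | Not in the first-batch support matrix |
+| Windows Terminal | Not supported | Unsupported boundary retained | Not in the first-batch support matrix |
+| VS Code | Not verified | Not verified on the current machine | Support is not claimed |
+
+### Notes
+
+- The first-batch formal support matrix contains only:
+  - Windows Notepad
+  - Microsoft Edge textarea
+  - WeChat input box
+- However, the formal observer evidence assets that are already complete in the repository and can be re-run directly cover:
+  - Notepad
+  - Edge textarea
+- WeChat still needs its stdout / JSON / log / rule-file evidence re-captured for the current worktree following the runbook before its evidence counts as complete.
+
+## Minimal user-visible entry points
+
+- Settings -> Advanced
+  - Adds a `Windows external-edit auto learning` master switch
+- Vocab -> Correction rules
+  - Continues to handle viewing, disabling, deleting, and verifying learned rules
+
+v1 adds no standalone settings page, no import/export, and no cloud sync.
+
+## Log fields
+
+Key log prefix: `[extedit]`
+
+Expected key lines:
+
+- `armed observer`
+- `learned rule ...`
+- `persisted rule id=...`
+- `observation skipped: ...`
+- `observation failed: ...`
+- `observation cancelled`
+
+Log path:
+
+- `%LOCALAPPDATA%\OpenLess\Logs\openless.log`
+
+## Verification layers
+
+### 1. 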
pure
+
+- Code location: `openless-all/app/src-tauri/src/windows_external_edit.rs`
+- Coverage:
+  - literal replacement
+  - numeric generalization
+  - reject diff outside inserted span
+  - reject ambiguous inserted span
+
+Suggested command:
+
+```powershell
+cargo test --manifest-path openless-all/app/src-tauri/Cargo.toml --lib --no-run
+```
+
+Notes:
+
+- Running the Rust test harness directly on the current machine fails with `STATUS_ENTRYPOINT_NOT_FOUND`.
+- `--no-run` nevertheless shows that the test binary compiles and is produced.
+
+### 2. formal insertion smoke
+
+Script:
+
+- `openless-all/app/scripts/windows-real-asr-insertion-smoke.ps1`
+
+Purpose:
+
+- Prove that the formal OpenLess pipeline really obtains the transcript, completes the formal insertion, writes history, and can read the text back from the target control.
+- ASR / LLM is not configured on the current machine, so the formal pipeline is verified through a debug-only transcript bypass:
+  - `OPENLESS_DEBUG_TRANSCRIPT_FILE`
+  - The run still goes through the formal hotkey, session lifecycle, formal insertion, and formal observer arm
+
+Notepad green baseline:
+
+```powershell
+powershell -ExecutionPolicy Bypass -File .\openless-all\app\scripts\windows-real-asr-insertion-smoke.ps1 `
+  -ExePath D:\cargo-targets\x86_64-pc-windows-gnu\debug\openless.exe `
+  -Target notepad `
+  -AsrProvider foundry-local-whisper `
+  -InjectedTranscriptText "今天记录了几粒样本" `
+  -AllowClipboardFallback
+```
+
+### 3. observer artifact verifier
+
+Script:
+
+- `openless-all/app/scripts/windows-external-edit-observer-smoke.ps1`
+
+Purpose:
+
+- Bind together the following evidence:
+  - `correction-rules.json`
+  - `openless.log`
+  - summary JSON
+- The current version includes two verification-layer fixes for Windows:
+  - It no longer polls `correction-rules.json` at high frequency, which avoids interfering with atomic-rename persistence
+  - It handles `openless.log` being deleted by the smoke run and then recreated
+
+Example:
+
+```powershell
+powershell -ExecutionPolicy Bypass -File .\openless-all\app\scripts\windows-external-edit-observer-smoke.ps1 `
+  -ExpectedPattern "粒" `
+  -ExpectedReplacement "例" `
+  -TimeoutSeconds 20 `
+  -SummaryJsonPath .\.tmp\external-edit-evidence\notepad-summary-pass.json
+```
+
+### 4. real observer runbook
+
+"Formal insertion verification" and "observer learning verification" must be understood separately:
+
+- The final assertion of the insertion smoke requires the target text to still equal the current `finalText`.
+- If the target text is deliberately changed from `old` to `new` inside the observer window, that assertion fails by design.
+- During observer learning verification, a non-zero exit from `windows-real-asr-insertion-smoke.ps1` therefore cannot be read directly as a product failure.
+
+Correct procedure:
+
+1. 
First run the formal insertion smoke on its own to prove that the formal pipeline can land text.
+2. Then run the observer verifier on its own while editing the target term inside the short observer window.
+3. Treat `correction-rules.json` and `[extedit] learned / persisted` as the evidence of observer success.
+
+## Evidence currently pinned
+
+### A. Notepad formal insertion succeeded
+
+- Command output:
+  - `.tmp/external-edit-evidence/notepad-rewrite-hit.log`
+  - `.tmp/notepad-insertion-smoke-readback-pidfixed.log`
+- Key results:
+  - history gained a new session
+  - `insertStatus=inserted`
+  - Notepad UIA readback succeeded
+
+### B. Notepad observer learning succeeded
+
+- summary JSON:
+  - `.tmp/external-edit-evidence/notepad-summary-pass.json`
+- verifier stdout:
+  - `.tmp/external-edit-evidence/notepad-observer-verifier-pass.log`
+- Auto-correction script log:
+  - `.tmp/external-edit-evidence/notepad-auto-correct-pass.log`
+- Key lines in the formal pipeline log:
+  - `%LOCALAPPDATA%\OpenLess\Logs\openless.log`
+  - `armed observer`
+  - `learned rule 粒 -> 例`
+  - `persisted rule id=...`
+- Rule file:
+  - `%APPDATA%\OpenLess\correction-rules.json`
+  - Write verified so far: `粒 -> 例`
+
+### C. Later formal output hit succeeded
+
+- Evidence file:
+  - `.tmp/external-edit-evidence/notepad-rewrite-hit.log`
+- Key results:
+  - In a new formal smoke run, `rawTranscript` / `finalText` became `今天记录了几例样本`
+  - `openless.log` shows:
+    - `[coord] correction rules adjusted raw transcript (9 → 9 chars)`
+
+### D. Edge textarea formal insertion and observer learning succeeded
+
+- formal insertion smoke:
+  - `.tmp/edge-insertion-smoke-disabled-rule-instrumented.log`
+  - `.tmp/external-edit-evidence/edge-formal-chain-pass.log`
+- observer verifier:
+  - `.tmp/external-edit-evidence/edge-observer-verifier-pass.log`
+  - `.tmp/external-edit-evidence/edge-summary-pass.json`
+- Auto-correction log:
+  - `.tmp/external-edit-evidence/edge-auto-correct-pass.log`
+- Key lines in the formal pipeline log:
+  - `%LOCALAPPDATA%\OpenLess\Logs\openless.log`
+  - `armed observer`
+  - `learned rule 粒 -> 例`
+  - `persisted rule id=...`
+- Key results:
+  - Readback from the real textarea in the Edge guest window succeeded
+  - Within the short window the observer recognized the manual correction and re-enabled `粒 -> 例`
+  - `correction-rules.json` was restored to the standard JSON array shape
+
+### E. 
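Edge later formal output hit succeeded
+
+- Evidence file:
+  - `.tmp/external-edit-evidence/edge-rewrite-hit.log`
+- Key results:
+  - In a new formal smoke run, `rawTranscript` / `finalText` became `今天记录了几例样本`
+  - Edge textarea readback matches history
+  - `openless.log` shows:
+    - `[coord] correction rules adjusted raw transcript (9 → 9 chars)`
+
+## Key expected vs actual findings from this verification
+
+### 1. Notepad readback failed
+
+- expected:
+  - The smoke script should read the `RichEditD2DPT / Document` text from the real Notepad window process.
+- actual:
+  - The old script treated the launcher pid returned by `Start-Process notepad.exe` as the window pid, so UIA readback attached to the wrong process and returned empty text.
+- Fix:
+  - `windows-real-asr-insertion-smoke.ps1` now locks onto the real window process via `Wait-ProcessWindow "Notepad"`.
+
+### 2. Notepad readback accidentally triggered the hotkey
+
+- expected:
+  - Smoke readback should not trigger OpenLess's own hotkey again.
+- actual:
+  - The old fallback used `Ctrl+A / Ctrl+C`; with the hotkey currently configured as `LeftControl`, this started a new session.
+- Fix:
+  - Notepad readback drops the `Ctrl+A / Ctrl+C` fallback entirely and polls through UIA `Document/ValuePattern` instead.
+
+### 3. The verifier interfered with rule persistence
+
+- expected:
+  - The observer verifier only observes evidence and does not interfere with persistence of `correction-rules.json`.
+- actual:
+  - The old verifier polled `correction-rules.json` at high frequency, which interfered with the atomic rename on Windows and produced `[extedit] persist learned rule failed: rename failed ...`
+- Fix:
+  - The verifier only records the pre-state, polls the log, and reads the rule file once more at the end.
+
+### 4. A fresh Edge profile lets the sync-confirmation page steal focus
+
+- expected:
+  - The Edge smoke should land both the formal insertion and the observer on the fixture textarea.
+- actual:
+  - On the first launch of a fresh profile, the Edge sync-confirmation page steals the focused element; the observer baseline reads `edge://sync-confirmation-dialog/`, and browser readback no longer points at the textarea.
+- Fix:
+  - `windows-real-asr-insertion-smoke.ps1` now starts the Edge smoke window with `--guest`, avoiding the first-run sync onboarding page.
+
+### 5. The Edge guest window cannot be located through `Get-Process.MainWindowHandle`
+
+- expected:
+  - The browser smoke should reliably obtain the real Edge top-level window and go on to focus the textarea through UIA.
+- actual:
+  - `Get-Process.MainWindowHandle` is unreliable for guest Edge, and the old script wrongly reported "Browser window process was not found".
+- Fix:
+  - Browser window discovery now enumerates UIA top-level windows directly and feeds the real `pid/title/handle` back into the later focus/readback paths.
+
+### 6. 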
The Windows formal pipeline stored an unstable child HWND
+
+- expected:
+  - The focus target captured at session start should still be restorable when insertion completes.
+- actual:
+  - The old logic stored the raw HWND returned by `GetForegroundWindow()`; on Edge this can be a child window that is later destroyed, which caused a downgrade to `CopiedFallback`.
+- Fix:
+  - `capture_focus_target()` now stores the `GA_ROOT` root-window HWND and uses that stable top-level window when restoring the foreground.
+
+### 7. `correction-rules.json` could not be persisted after being polluted by a PowerShell wrapper object
+
+- expected:
+  - Even if an earlier verification script wrote the rule file in the `{ value: [...], Count: N }` shape, formal learning should not lose the newly learned rule because decoding fails.
+- actual:
+  - The observer had already logged `learned rule 粒 -> 例`, but `CorrectionRuleStore::remember(...)` failed to decode the wrapper-shaped file, so persist failed.
+- Fix:
+  - `CorrectionRuleStore` gains wrapped-array compatible reading; the next formal write automatically normalizes the file back to a standard JSON array.
+
+## Current implementation boundaries
+
+- In v1 the physical terminology memory file is `correction-rules.json`
+- v1 currently learns the "minimal deterministic replacement span"
+  - The Notepad automated evidence actually wrote `粒 -> 例`
+  - It did not write the whole word `几粒 -> 几例`
+- No cloud sync, import/export, or multi-device merge
+- No macOS observer
+- WeChat / Zed / Windows Terminal are not presented as having "automated evidence of the same level in the current worktree"
+
+## Change scope
+
+This change covers:
+
+- Windows-only external-edit observer
+- formal dictation pipeline arm / cancel / persist integration
+- Minimal master switch in Settings
+- Reuse of Vocab correction rules as the formal terminology memory
+- Windows formal smoke / observer verifier / runbook / support matrix documentation
diff --git a/openless-all/app/scripts/windows-external-edit-observer-smoke.ps1 b/openless-all/app/scripts/windows-external-edit-observer-smoke.ps1
new file mode 100644
index 00000000..a08e4e7d
--- /dev/null
+++ b/openless-all/app/scripts/windows-external-edit-observer-smoke.ps1
@@ -0,0 +1,163 @@
+param(
+    [Parameter(Mandatory = $true)]
+    [string]$ExpectedPattern,
+
+    [Parameter(Mandatory = $true)]
+    [string]$ExpectedReplacement,
+
+    [int]$TimeoutSeconds = 20,
+
+    [string]$SummaryJsonPath = ""
+)
+
+$ErrorActionPreference = "Stop"
+
+function Read-JsonFile($Path) {
+    if (-not (Test-Path $Path)) {
+        return $null
+    }
+    $raw = Get-Content -Raw -Encoding UTF8 -Path $Path
+    if ([string]::IsNullOrWhiteSpace($raw)) {
+        return $null
+    }
+    return $raw | ConvertFrom-Json
+}
+
+function Read-CorrectionRules($Path) { + 
$json = Read-JsonFile $Path + if ($null -eq $json) { + return @() + } + if ($json -is [System.Array]) { + return @($json) + } + return @($json) +} + +function Get-MatchingRules($Rules, $Pattern, $Replacement) { + return @( + $Rules | Where-Object { + $_.pattern -eq $Pattern -and $_.replacement -eq $Replacement + } + ) +} + +function Get-ExtEditLines($Path, $StartLine) { + if (-not (Test-Path $Path)) { + return @() + } + $lines = Get-Content -Encoding UTF8 -Path $Path + if ($lines.Count -eq 0) { + return @() + } + $effectiveStartLine = if ($StartLine -ge $lines.Count) { 0 } else { $StartLine } + return @( + $lines[$effectiveStartLine..($lines.Count - 1)] | + Where-Object { $_ -match "\[extedit\]" } | + ForEach-Object { [string]$_ } + ) +} + +$dataDir = Join-Path $env:APPDATA "OpenLess" +$rulesPath = Join-Path $dataDir "correction-rules.json" +$logPath = Join-Path $env:LOCALAPPDATA "OpenLess\Logs\openless.log" + +$preRules = Read-CorrectionRules $rulesPath +$preMatches = Get-MatchingRules $preRules $ExpectedPattern $ExpectedReplacement +$preMatchById = @{} +foreach ($rule in $preMatches) { + $preMatchById[$rule.id] = [bool]$rule.enabled +} + +$logStartLine = 0 +if (Test-Path $logPath) { + $logStartLine = (Get-Content -Encoding UTF8 -Path $logPath).Count +} + +Write-Host "== Windows external-edit observer smoke ==" +Write-Host "Expected pattern : $ExpectedPattern" +Write-Host "Expected replacement : $ExpectedReplacement" +Write-Host "Rules path : $rulesPath" +Write-Host "Log path : $logPath" +Write-Host "" +Write-Host "Expected:" +Write-Host " 1. Complete one formal insertion in a supported Windows external-edit target." +Write-Host " 2. Manually correct the term in the same control within the short window." +Write-Host " 3. A new enabled rule appears in correction-rules.json, or an existing identical rule is re-enabled." +Write-Host " 4. openless.log shows [extedit] armed / learned / persisted log lines." +Write-Host "" +Write-Host "Actual:" +Write-Host " Polling local data files and logs, waiting for results..." 
+ +$deadline = (Get-Date).AddSeconds($TimeoutSeconds) +$actualMatches = @() +$persistObserved = $false + +while ((Get-Date) -lt $deadline) { + Start-Sleep -Milliseconds 500 + $extEditLines = Get-ExtEditLines $logPath $logStartLine + if ($extEditLines | Where-Object { $_ -match "\[extedit\] persisted rule id=" }) { + $persistObserved = $true + break + } +} + +$postRules = Read-CorrectionRules $rulesPath +$actualMatches = Get-MatchingRules $postRules $ExpectedPattern $ExpectedReplacement +$success = $false +foreach ($rule in $actualMatches) { + $wasEnabled = $false + $hadRule = $preMatchById.ContainsKey($rule.id) + if ($hadRule) { + $wasEnabled = [bool]$preMatchById[$rule.id] + } + if ((-not $hadRule) -or ((-not $wasEnabled) -and [bool]$rule.enabled)) { + $success = $true + break + } +} + +$extEditLines = Get-ExtEditLines $logPath $logStartLine +$summary = [ordered]@{ + expectedPattern = $ExpectedPattern + expectedReplacement = $ExpectedReplacement + timeoutSeconds = $TimeoutSeconds + success = $success + persistObserved = $persistObserved + rulesPath = $rulesPath + logPath = $logPath + matchedRuleCount = @($actualMatches).Count + matchedRules = @( + $actualMatches | ForEach-Object { + [ordered]@{ + id = $_.id + enabled = [bool]$_.enabled + createdAt = if ($null -ne $_.PSObject.Properties["createdAt"]) { + $_.createdAt + } else { + $_.created_at + } + } + } + ) + exteditLogLines = $extEditLines +} + +$summaryJson = $summary | ConvertTo-Json -Depth 6 +if (-not [string]::IsNullOrWhiteSpace($SummaryJsonPath)) { + $summaryDir = Split-Path -Parent $SummaryJsonPath + if (-not [string]::IsNullOrWhiteSpace($summaryDir) -and -not (Test-Path $summaryDir)) { + New-Item -ItemType Directory -Path $summaryDir | Out-Null + } + [System.IO.File]::WriteAllText($SummaryJsonPath, $summaryJson, [System.Text.UTF8Encoding]::new($false)) +} + +Write-Host "" +Write-Host "Summary JSON:" +Write-Output $summaryJson + +if (-not $success) { + throw "No newly learned or re-enabled rule matched 
'$ExpectedPattern' -> '$ExpectedReplacement' within ${TimeoutSeconds}s." +} + +Write-Host "Windows external-edit observer smoke passed." diff --git a/openless-all/app/scripts/windows-real-asr-insertion-smoke.ps1 b/openless-all/app/scripts/windows-real-asr-insertion-smoke.ps1 index 5f76f934..5fa9a806 100644 --- a/openless-all/app/scripts/windows-real-asr-insertion-smoke.ps1 +++ b/openless-all/app/scripts/windows-real-asr-insertion-smoke.ps1 @@ -6,9 +6,10 @@ param( [string]$AsrProvider = "volcengine", [string]$Phrase = "OpenLess Windows real regression", [int]$TimeoutSeconds = 120, - [int]$VirtualKey = 0xA3, + [int]$VirtualKey = 0xA2, [string]$InjectedTranscriptText = "", [int]$ManualSpeechSeconds = 8, + [int]$PostSuccessDelaySeconds = 0, [switch]$ManualSpeech, [switch]$AllowClipboardFallback, [switch]$RequireJsonCredentials, @@ -89,7 +90,7 @@ function Write-TextUtf8($Path, $Text) { } function Restore-ClipboardValue($Value) { - if ($null -eq $Value) { + if ($null -eq $Value -or ($Value -is [string] -and $Value.Length -eq 0)) { cmd /c "echo off | clip" | Out-Null return } @@ -116,6 +117,15 @@ function Set-HoldHotkeyPreference($Path) { } else { $prefs.hotkey.mode = "hold" } + $dictationBinding = [pscustomobject]@{ + primary = "LeftControl" + modifiers = @() + } + if ($null -eq $prefs.PSObject.Properties["dictationHotkey"]) { + $prefs | Add-Member -NotePropertyName dictationHotkey -NotePropertyValue $dictationBinding + } else { + $prefs.dictationHotkey = $dictationBinding + } if ($null -eq $prefs.defaultMode) { $prefs | Add-Member -NotePropertyName defaultMode -NotePropertyValue "light" } if ($null -eq $prefs.enabledModes) { $prefs | Add-Member -NotePropertyName enabledModes -NotePropertyValue @("light", "structured", "formal", "raw") } if ($null -eq $prefs.launchAtLogin) { $prefs | Add-Member -NotePropertyName launchAtLogin -NotePropertyValue $false } @@ -292,6 +302,96 @@ function Get-OpenLessVaultCredentials { return $json } +function Get-OpenLessVaultSnapshot { + 
$manifestJson = Get-OpenLessKeyringPassword "credentials.v1" + if ([string]::IsNullOrWhiteSpace($manifestJson)) { + return [pscustomobject]@{ + HadVault = $false + ManifestJson = $null + ChunkAccounts = @() + ChunkValues = @() + VaultJson = $null + JsonValid = $false + Warning = $null + } + } + + try { + $manifest = $manifestJson | ConvertFrom-Json + } catch { + return [pscustomobject]@{ + HadVault = $true + ManifestJson = $manifestJson + ChunkAccounts = @() + ChunkValues = @() + VaultJson = $null + JsonValid = $false + Warning = "credential vault manifest is invalid JSON: $($_.Exception.Message)" + } + } + + if ($manifest.openless_credentials_storage -ne "chunked" -or $manifest.version -ne 1) { + return [pscustomobject]@{ + HadVault = $true + ManifestJson = $manifestJson + ChunkAccounts = @() + ChunkValues = @() + VaultJson = $null + JsonValid = $false + Warning = "unsupported credential vault manifest" + } + } + + $chunkAccounts = @() + $chunkValues = @() + $chunksMissing = $false + for ($i = 0; $i -lt [int]$manifest.chunks; $i++) { + $account = if ($null -ne $manifest.PSObject.Properties["generation"] -and -not [string]::IsNullOrWhiteSpace($manifest.generation)) { + "credentials.v1.chunk.$($manifest.generation).$i" + } else { + "credentials.v1.chunk.$i" + } + $chunkValue = Get-OpenLessKeyringPassword $account + $chunkAccounts += $account + $chunkValues += $chunkValue + if ($null -eq $chunkValue) { + $chunksMissing = $true + } + } + + if ($chunksMissing) { + return [pscustomobject]@{ + HadVault = $true + ManifestJson = $manifestJson + ChunkAccounts = $chunkAccounts + ChunkValues = $chunkValues + VaultJson = $null + JsonValid = $false + Warning = "credential vault chunk is missing" + } + } + + $vaultJson = ($chunkValues -join "") + try { + $null = $vaultJson | ConvertFrom-Json + $jsonValid = $true + $warning = $null + } catch { + $jsonValid = $false + $warning = "credential vault JSON is invalid: $($_.Exception.Message)" + } + + return [pscustomobject]@{ + 
HadVault = $true + ManifestJson = $manifestJson + ChunkAccounts = $chunkAccounts + ChunkValues = $chunkValues + VaultJson = $vaultJson + JsonValid = $jsonValid + Warning = $warning + } +} + function Set-OpenLessVaultCredentials($Json, $PreviousManifestJson) { $previousManifest = $null if (-not [string]::IsNullOrWhiteSpace($PreviousManifestJson)) { @@ -328,25 +428,14 @@ function Restore-ActiveAsrCredential($Snapshot, $Path) { return } if ($Snapshot.HadVault) { - $manifest = $Snapshot.VaultManifestJson | ConvertFrom-Json - $chunks = Split-OpenLessCredentialJson $Snapshot.VaultJson - $usesGeneratedChunks = $null -ne $manifest.PSObject.Properties["generation"] -and -not [string]::IsNullOrWhiteSpace($manifest.generation) - for ($i = 0; $i -lt $chunks.Count; $i++) { - $account = if ($usesGeneratedChunks) { - "credentials.v1.chunk.$($manifest.generation).$i" - } else { - "credentials.v1.chunk.$i" - } - Set-OpenLessKeyringPassword $account $chunks[$i] + for ($i = 0; $i -lt $Snapshot.VaultChunkAccounts.Count; $i++) { + Set-OpenLessKeyringPassword $Snapshot.VaultChunkAccounts[$i] $Snapshot.VaultChunkValues[$i] } Set-OpenLessKeyringPassword "credentials.v1" $Snapshot.VaultManifestJson - if ($usesGeneratedChunks) { - for ($i = 0; $i -lt $Snapshot.WrittenVaultChunks; $i++) { - Remove-OpenLessKeyringPassword "credentials.v1.chunk.$i" - } - } else { - for ($i = $chunks.Count; $i -lt $Snapshot.WrittenVaultChunks; $i++) { - Remove-OpenLessKeyringPassword "credentials.v1.chunk.$i" + for ($i = 0; $i -lt $Snapshot.WrittenVaultChunks; $i++) { + $generatedAccount = "credentials.v1.chunk.$i" + if ($Snapshot.VaultChunkAccounts -notcontains $generatedAccount) { + Remove-OpenLessKeyringPassword $generatedAccount } } } else { @@ -365,9 +454,15 @@ function Restore-ActiveAsrCredential($Snapshot, $Path) { function Set-ActiveAsrCredential($Path) { $previousLegacy = Read-TextUtf8 $Path - $previousManifest = Get-OpenLessKeyringPassword "credentials.v1" - $previousVault = 
Get-OpenLessVaultCredentials - $source = if (-not [string]::IsNullOrWhiteSpace($previousVault)) { $previousVault } else { $previousLegacy } + $vaultSnapshot = Get-OpenLessVaultSnapshot + if (-not [string]::IsNullOrWhiteSpace($vaultSnapshot.Warning)) { + Write-Warning "OpenLess credential vault is unreadable for smoke bootstrap; falling back to legacy/blank credentials. $($vaultSnapshot.Warning)" + } + $source = if ($vaultSnapshot.JsonValid -and -not [string]::IsNullOrWhiteSpace($vaultSnapshot.VaultJson)) { + $vaultSnapshot.VaultJson + } else { + $previousLegacy + } if ([string]::IsNullOrWhiteSpace($source)) { $credentials = [pscustomobject]@{ version = 1 @@ -413,15 +508,16 @@ function Set-ActiveAsrCredential($Path) { } $json = $credentials | ConvertTo-Json -Depth 12 -Compress $chunks = Split-OpenLessCredentialJson $json - Set-OpenLessVaultCredentials $json $previousManifest - if ([string]::IsNullOrWhiteSpace($previousVault)) { + Set-OpenLessVaultCredentials $json $vaultSnapshot.ManifestJson + if (-not $vaultSnapshot.JsonValid) { Write-TextUtf8 $Path ($credentials | ConvertTo-Json -Depth 12) } return [pscustomobject]@{ LegacyJson = $previousLegacy - VaultJson = $previousVault - VaultManifestJson = $previousManifest - HadVault = -not [string]::IsNullOrWhiteSpace($previousVault) + VaultManifestJson = $vaultSnapshot.ManifestJson + VaultChunkAccounts = @($vaultSnapshot.ChunkAccounts) + VaultChunkValues = @($vaultSnapshot.ChunkValues) + HadVault = [bool]$vaultSnapshot.HadVault WrittenVaultChunks = $chunks.Count } } @@ -459,11 +555,15 @@ function Get-LatestHistory($Path) { return @($json) | Select-Object -First 1 } -function Wait-HistoryCountGreaterThan($Path, $Baseline, $TimeoutSeconds) { +function Wait-HistoryAdvance($Path, $BaselineCount, $BaselineLatestId, $TimeoutSeconds) { $deadline = (Get-Date).AddSeconds($TimeoutSeconds) while ((Get-Date) -lt $deadline) { $count = Get-HistoryCount $Path - if ($count -gt $Baseline) { + if ($count -gt $BaselineCount) { + return 
$true + } + $latest = Get-LatestHistory $Path + if ($null -ne $latest -and -not [string]::IsNullOrWhiteSpace($latest.id) -and $latest.id -ne $BaselineLatestId) { return $true } Start-Sleep -Milliseconds 500 @@ -483,24 +583,35 @@ function Send-KeyEdge($Vk, $KeyUp, $Extended = $true) { [OpenLessRegressionWin32]::keybd_event([byte]$Vk, [byte]$scanCode, $flags, [UIntPtr]::Zero) } +function Test-ExtendedVirtualKey($Vk) { + return $Vk -in @(0xA3, 0xA5, 0x5C) +} + function Tap-Hotkey { - Send-KeyEdge $VirtualKey $false $true + $extended = Test-ExtendedVirtualKey $VirtualKey + Send-KeyEdge $VirtualKey $false $extended Start-Sleep -Milliseconds 180 - Send-KeyEdge $VirtualKey $true $true + Send-KeyEdge $VirtualKey $true $extended } function Press-Hotkey { - Send-KeyEdge $VirtualKey $false $true + Send-KeyEdge $VirtualKey $false (Test-ExtendedVirtualKey $VirtualKey) } function Release-Hotkey { - Send-KeyEdge $VirtualKey $true $true + Send-KeyEdge $VirtualKey $true (Test-ExtendedVirtualKey $VirtualKey) } function Ensure-TargetFocused($TargetInfo) { if ($null -eq $TargetInfo) { return $false } + if ($TargetInfo.TargetKind -eq "browser" -and $null -ne $TargetInfo.Process) { + if (-not (Focus-Window $TargetInfo.Process)) { + return $false + } + return (Focus-BrowserTextarea $TargetInfo.Process) + } if ($TargetInfo.TargetTitle) { $wshell = New-Object -ComObject WScript.Shell if ($wshell.AppActivate($TargetInfo.TargetTitle)) { @@ -515,15 +626,79 @@ function Ensure-TargetFocused($TargetInfo) { } function Focus-Window($Process) { - if ($null -eq $Process -or $Process.MainWindowHandle -eq 0) { + if ($null -eq $Process) { + return $false + } + $handle = 0 + if ($null -ne $Process.PSObject.Properties["MainWindowHandleOverride"]) { + $handle = [int64]$Process.MainWindowHandleOverride + } else { + $handle = [int64]$Process.MainWindowHandle + } + if ($handle -eq 0) { return $false } - [OpenLessRegressionWin32]::ShowWindow($Process.MainWindowHandle, 9) | Out-Null - 
[OpenLessRegressionWin32]::SetForegroundWindow($Process.MainWindowHandle) | Out-Null + [OpenLessRegressionWin32]::ShowWindow([IntPtr]$handle, 9) | Out-Null + [OpenLessRegressionWin32]::SetForegroundWindow([IntPtr]$handle) | Out-Null Start-Sleep -Milliseconds 500 return $true } +function Focus-BrowserTextarea($Process) { + if ($null -eq $Process) { + return $false + } + $focusScript = @" +import sys +import time +from pywinauto import Application + +pid = int(sys.argv[1]) +app = Application(backend='uia').connect(process=pid) +win = app.top_window() +win.set_focus() +time.sleep(0.2) + +candidate = None +for descendant in win.descendants(): + try: + if descendant.element_info.control_type != 'Edit': + continue + if descendant.class_name() == 'OmniboxViewViews': + continue + rect = descendant.rectangle() + if rect.width() <= 0 or rect.height() <= 0: + continue + candidate = descendant + if descendant.class_name() == '': + break + except Exception: + continue + +if candidate is None: + raise SystemExit(1) + +try: + candidate.set_focus() +except Exception: + pass +time.sleep(0.1) +candidate.click_input() +time.sleep(0.2) +raise SystemExit(0) +"@ + $focusPath = Join-Path $env:TEMP "openless-browser-focus.py" + Write-TextUtf8 $focusPath $focusScript + try { + python -X utf8 $focusPath $Process.Id | Out-Null + return $true + } catch { + return $false + } finally { + Remove-Item -LiteralPath $focusPath -Force -ErrorAction SilentlyContinue + } +} + function Wait-ProcessWindow($ProcessName, $After, $TimeoutSeconds) { $deadline = (Get-Date).AddSeconds($TimeoutSeconds) while ((Get-Date) -lt $deadline) { @@ -539,6 +714,49 @@ function Wait-ProcessWindow($ProcessName, $After, $TimeoutSeconds) { return $null } +function Wait-BrowserWindow($TitleFragment, $TimeoutSeconds) { + $probeScript = @" +import json +import sys +from pywinauto import Desktop + +title_fragment = sys.argv[1] +for win in Desktop(backend='uia').windows(): + try: + title = win.window_text() + if not title or 
title_fragment not in title: + continue + payload = { + "title": title, + "pid": win.process_id(), + "handle": int(win.handle), + } + print(json.dumps(payload, ensure_ascii=False)) + raise SystemExit(0) + except Exception: + continue +raise SystemExit(1) +"@ + $probePath = Join-Path $env:TEMP "openless-browser-window-probe.py" + Write-TextUtf8 $probePath $probeScript + try { + $deadline = (Get-Date).AddSeconds($TimeoutSeconds) + while ((Get-Date) -lt $deadline) { + try { + $json = python -X utf8 $probePath $TitleFragment + if (-not [string]::IsNullOrWhiteSpace($json)) { + return ($json | ConvertFrom-Json) + } + } catch { + } + Start-Sleep -Milliseconds 300 + } + return $null + } finally { + Remove-Item -LiteralPath $probePath -Force -ErrorAction SilentlyContinue + } +} + function Resolve-BrowserPath { $programFiles = if ($env:ProgramFiles) { $env:ProgramFiles } else { Join-Path $env:SystemDrive "Program Files" } $programFilesX86 = if (${env:ProgramFiles(x86)}) { ${env:ProgramFiles(x86)} } else { Join-Path $env:SystemDrive "Program Files (x86)" } @@ -654,9 +872,15 @@ function Start-InputTarget($TargetName) { if ($TargetName -eq "notepad") { $fixture = Join-Path $env:TEMP "openless-notepad-input-fixture.txt" Write-TextUtf8 $fixture "" - $process = Start-Process notepad.exe -ArgumentList $fixture -PassThru - Start-Sleep -Seconds 2 - $title = "openless-notepad-input-fixture.txt - Notepad" + $launcher = Start-Process notepad.exe -ArgumentList $fixture -PassThru + $process = Wait-ProcessWindow "Notepad" $startedAt 15 + if ($null -eq $process) { + throw "Notepad window process was not found." 
+ } + $title = $process.MainWindowTitle + if ([string]::IsNullOrWhiteSpace($title)) { + $title = "openless-notepad-input-fixture.txt - Notepad" + } $activateScript = @" import sys, time, win32com.client title = sys.argv[1] @@ -675,6 +899,9 @@ raise SystemExit(1) python $activatePath $title | Out-Null } finally { Remove-Item -LiteralPath $activatePath -Force -ErrorAction SilentlyContinue + if ($null -ne $launcher) { + $launcher | Out-Null + } } Start-Sleep -Milliseconds 800 return [pscustomobject]@{ @@ -757,75 +984,272 @@ raise SystemExit(1) $fixture = New-BrowserInputFixture $url = ([System.Uri]$fixture).AbsoluteUri $processName = [System.IO.Path]::GetFileNameWithoutExtension($browserPath) - $profilePath = Join-Path $env:TEMP "openless-browser-smoke-profile" - Stop-BrowserProfileProcesses $profilePath - Remove-Item -LiteralPath $profilePath -Recurse -Force -ErrorAction SilentlyContinue - Start-Process -FilePath $browserPath -ArgumentList @( + $launcher = Start-Process -FilePath $browserPath -ArgumentList @( + "--guest", "--new-window", - "--user-data-dir=$profilePath", "--no-first-run", + "--no-default-browser-check", "--disable-extensions", $url - ) | Out-Null - $process = Wait-ProcessWindow $processName $startedAt 20 + ) -PassThru + $window = Wait-BrowserWindow "OpenLess Browser Input Fixture" 20 + if ($null -eq $window) { + throw "Browser window process was not found." + } + $process = Get-Process -Id $window.pid -ErrorAction Stop + $process | Add-Member -NotePropertyName MainWindowHandleOverride -NotePropertyValue ([int64]$window.handle) -Force if (-not (Focus-Window $process)) { throw "Browser window could not be focused." } + if (-not (Focus-BrowserTextarea $process)) { + throw "Browser textarea could not be focused." 
+ } Start-Sleep -Seconds 1 - return [pscustomobject]@{ Process = $process; FixturePath = $fixture; ProfilePath = $profilePath; TargetKind = "browser" } + return [pscustomobject]@{ + Process = $process + FixturePath = $fixture + ProfilePath = $null + TargetKind = "browser" + TargetPid = $process.Id + TargetHandle = [int64]$window.handle + TargetTitle = $window.title + } } function Read-TargetContent($TargetInfo, $TargetName) { if ($TargetName -eq "notepad") { $readbackScript = @" +import json import sys +import time from pywinauto import Desktop pid = int(sys.argv[1]) -title = sys.argv[2] -out = sys.argv[3] -windows = [w for w in Desktop(backend='uia').windows() if getattr(w, 'process_id', lambda: None)() == pid] -win = None -for candidate in windows: - if candidate.window_text() == title: - win = candidate - break -if win is None and windows: - win = windows[0] -if win is None: - raise SystemExit(2) -for descendant in win.descendants(): - if descendant.class_name() == 'RichEditD2DPT': - value = descendant.window_text() - open(out, 'w', encoding='utf-8').write(value) +out = sys.argv[2] +debug_out = sys.argv[3] + +def collect_debug(): + payload = {"pid": pid, "windows": []} + for win in [w for w in Desktop(backend='uia').windows() if getattr(w, 'process_id', lambda: None)() == pid]: + win_info = { + "title": "", + "class_name": "", + "descendants": [], + } + try: + win_info["title"] = win.window_text() + except Exception as exc: + win_info["title"] = f"<title error: {exc}>" + try: + win_info["class_name"] = win.class_name() + except Exception as exc: + win_info["class_name"] = f"<class error: {exc}>" + for descendant in win.descendants(): + try: + cls = descendant.class_name() + except Exception as exc: + cls = f"<class error: {exc}>" + try: + control_type = descendant.element_info.control_type + except Exception as exc: + control_type = f"<type error: {exc}>" + try: + name = descendant.window_text() + except Exception as exc: + name = f"<name error: {exc}>" + try: + value = 
descendant.iface_value.CurrentValue + except Exception: + value = None + if cls == 'RichEditD2DPT' or control_type == 'Document': + win_info["descendants"].append({ + "class_name": cls, + "control_type": control_type, + "name": name, + "value": value, + }) + payload["windows"].append(win_info) + with open(debug_out, 'w', encoding='utf-8') as fh: + json.dump(payload, fh, ensure_ascii=False, indent=2) + +def read_notepad_text(): + windows = [w for w in Desktop(backend='uia').windows() if getattr(w, 'process_id', lambda: None)() == pid] + if not windows: + return None + + win = None + for candidate in windows: + try: + if candidate.class_name() == 'Notepad': + win = candidate + break + except Exception: + continue + if win is None: + win = windows[0] + + for descendant in win.descendants(): + try: + cls = descendant.class_name() + control_type = descendant.element_info.control_type + except Exception: + continue + if cls != 'RichEditD2DPT' and control_type != 'Document': + continue + + for getter in ( + lambda: descendant.iface_value.CurrentValue, + lambda: descendant.window_text(), + lambda: getattr(descendant.element_info, 'name', ''), + ): + try: + value = getter() + except Exception: + continue + if value is not None and str(value).strip(): + return str(value) + return '' + +deadline = time.time() + 5 +last_text = None +while time.time() < deadline: + last_text = read_notepad_text() + if last_text is None: + time.sleep(0.2) + continue + if last_text.strip(): + open(out, 'w', encoding='utf-8').write(last_text) raise SystemExit(0) + time.sleep(0.2) + +if last_text is None: + collect_debug() + raise SystemExit(2) +collect_debug() +open(out, 'w', encoding='utf-8').write(last_text) raise SystemExit(1) "@ $readbackPath = Join-Path $env:TEMP "openless-notepad-readback.py" $outputPath = Join-Path $env:TEMP "openless-notepad-readback.txt" + $debugPath = Join-Path $env:TEMP "openless-notepad-readback-debug.json" Write-TextUtf8 $readbackPath $readbackScript try { Remove-Item 
-LiteralPath $outputPath -Force -ErrorAction SilentlyContinue - python -X utf8 $readbackPath $TargetInfo.TargetPid $TargetInfo.TargetTitle $outputPath | Out-Null - Start-Sleep -Milliseconds 400 + Remove-Item -LiteralPath $debugPath -Force -ErrorAction SilentlyContinue + python -X utf8 $readbackPath $TargetInfo.TargetPid $outputPath $debugPath | Out-Null if (Test-Path $outputPath) { return Get-Content -Raw -Encoding UTF8 $outputPath } + if (Test-Path $debugPath) { + Write-Warning "notepad readback debug: $(Get-Content -Raw -Encoding UTF8 $debugPath)" + } return $null } finally { Remove-Item -LiteralPath $readbackPath -Force -ErrorAction SilentlyContinue Remove-Item -LiteralPath $outputPath -Force -ErrorAction SilentlyContinue + Remove-Item -LiteralPath $debugPath -Force -ErrorAction SilentlyContinue } } if ($TargetName -eq "browser") { - Focus-Window $TargetInfo.Process | Out-Null - Start-Sleep -Milliseconds 400 - Send-CtrlChord 0x41 - Start-Sleep -Milliseconds 200 - Send-CtrlChord 0x43 - Start-Sleep -Milliseconds 400 - return Get-Clipboard -Raw -ErrorAction SilentlyContinue + $readbackScript = @" +import json +import sys +from pywinauto import Application + +pid = int(sys.argv[1]) +out = sys.argv[2] +debug_out = sys.argv[3] + +app = Application(backend='uia').connect(process=pid) +win = app.top_window() +payload = {"pid": pid, "window": win.window_text(), "candidates": []} + +def rank(descendant): + cls = descendant.class_name() + if cls == 'OmniboxViewViews': + return -1 + rect = descendant.rectangle() + area = max(rect.width(), 0) * max(rect.height(), 0) + score = area + if cls == '': + score += 1000000 + return score + +candidates = [] +for descendant in win.descendants(): + try: + if descendant.element_info.control_type != 'Edit': + continue + rect = descendant.rectangle() + value = '' + try: + value = descendant.iface_value.CurrentValue or '' + except Exception: + value = '' + info = { + "class_name": descendant.class_name(), + "name": 
descendant.window_text(), + "value": value, + "rect": [rect.left, rect.top, rect.right, rect.bottom], + "score": rank(descendant), + } + payload["candidates"].append(info) + if info["score"] >= 0: + candidates.append((info["score"], info, descendant)) + except Exception: + continue + +with open(debug_out, 'w', encoding='utf-8') as fh: + json.dump(payload, fh, ensure_ascii=False, indent=2) + +if not candidates: + raise SystemExit(1) + +candidates.sort(key=lambda item: item[0], reverse=True) +best_info = candidates[0][1] +best = candidates[0][2] +payload["selected"] = best_info + +for getter in ( + lambda: best.iface_value.CurrentValue, + lambda: best.window_text(), + lambda: getattr(best.element_info, 'name', ''), +): + try: + value = getter() + except Exception: + continue + if value is not None: + open(out, 'w', encoding='utf-8').write(str(value)) + raise SystemExit(0) + +raise SystemExit(1) +"@ + $readbackPath = Join-Path $env:TEMP "openless-browser-readback.py" + $outputPath = Join-Path $env:TEMP "openless-browser-readback.txt" + $debugPath = Join-Path $env:TEMP "openless-browser-readback-debug.json" + Write-TextUtf8 $readbackPath $readbackScript + try { + Remove-Item -LiteralPath $outputPath -Force -ErrorAction SilentlyContinue + Remove-Item -LiteralPath $debugPath -Force -ErrorAction SilentlyContinue + python -X utf8 $readbackPath $TargetInfo.TargetPid $outputPath $debugPath | Out-Null + if (Test-Path $outputPath) { + $value = Get-Content -Raw -Encoding UTF8 $outputPath + if ([string]::IsNullOrWhiteSpace($value) -and (Test-Path $debugPath)) { + Write-Warning "browser readback debug: $(Get-Content -Raw -Encoding UTF8 $debugPath)" + } + return $value + } + if (Test-Path $debugPath) { + Write-Warning "browser readback debug: $(Get-Content -Raw -Encoding UTF8 $debugPath)" + } + return $null + } finally { + Remove-Item -LiteralPath $readbackPath -Force -ErrorAction SilentlyContinue + Remove-Item -LiteralPath $outputPath -Force -ErrorAction SilentlyContinue + 
Remove-Item -LiteralPath $debugPath -Force -ErrorAction SilentlyContinue + } } if ($TargetName -in @("wt-cmd", "wt-powershell")) { @@ -926,6 +1350,8 @@ $clipboardCaptured = $false try { $baselineCount = Get-HistoryCount $historyPath + $baselineLatest = Get-LatestHistory $historyPath + $baselineLatestId = if ($null -ne $baselineLatest) { $baselineLatest.id } else { $null } $previousClipboard = Get-Clipboard -Raw -ErrorAction SilentlyContinue $clipboardCaptured = $true $previousPreferences = Set-HoldHotkeyPreference $preferencesPath @@ -992,7 +1418,7 @@ try { Start-Sleep -Milliseconds 800 Release-Hotkey - if (-not (Wait-HistoryCountGreaterThan $historyPath $baselineCount $TimeoutSeconds)) { + if (-not (Wait-HistoryAdvance $historyPath $baselineCount $baselineLatestId $TimeoutSeconds)) { throw "History did not receive a new dictation session within $TimeoutSeconds seconds." } @@ -1028,6 +1454,10 @@ try { Write-Host "[ok] History updated. raw='$($latest.rawTranscript)'" Write-Host "[ok] Final text length=$($latest.finalText.Length), insertStatus=$($latest.insertStatus)" Write-Host "[ok] $Target readback length=$($targetText.Length)" + if ($PostSuccessDelaySeconds -gt 0) { + Write-Host "[hold] Keeping OpenLess and target alive for $PostSuccessDelaySeconds seconds after insertion verification." 
+ Start-Sleep -Seconds $PostSuccessDelaySeconds + } if (Test-Path $logPath) { $logText = Get-Content -Raw -Encoding UTF8 $logPath diff --git a/openless-all/app/src-tauri/Cargo.toml b/openless-all/app/src-tauri/Cargo.toml index ddde8db4..36340f6a 100644 --- a/openless-all/app/src-tauri/Cargo.toml +++ b/openless-all/app/src-tauri/Cargo.toml @@ -91,6 +91,7 @@ windows = { version = "0.58", features = [ "Win32_System_Ole", "Win32_System_Registry", "Win32_System_Threading", + "Win32_UI_Accessibility", "Win32_UI_Input_KeyboardAndMouse", "Win32_UI_Shell", "Win32_UI_TextServices", diff --git a/openless-all/app/src-tauri/src/coordinator.rs b/openless-all/app/src-tauri/src/coordinator.rs index 1fc90a6c..5169f7d3 100644 --- a/openless-all/app/src-tauri/src/coordinator.rs +++ b/openless-all/app/src-tauri/src/coordinator.rs @@ -51,6 +51,10 @@ use crate::types::{ HotkeyStatus, HotkeyStatusState, InsertStatus, OutputLanguagePreference, PolishMode, }; #[cfg(target_os = "windows")] +use crate::windows_external_edit::{ + ExternalEditObservationOutcome, ExternalEditObservationRequest, WindowsExternalEditObserver, +}; +#[cfg(target_os = "windows")] use crate::windows_ime_ipc::ImeSubmitTarget; #[cfg(target_os = "windows")] use crate::windows_ime_session::{PreparedWindowsImeSession, WindowsImeSessionController}; @@ -107,6 +111,8 @@ struct Inner { windows_ime: WindowsImeSessionController, #[cfg(target_os = "windows")] prepared_windows_ime_session: Arc<Mutex<Vec<PreparedWindowsImeSessionSlot>>>, + #[cfg(target_os = "windows")] + windows_external_edit: WindowsExternalEditObserver, state: Mutex<SessionState>, asr: Mutex<Option<SessionResource<ActiveAsr>>>, /// 本地 Qwen3-ASR 引擎缓存。跨会话复用,避免每次重加载 1.2GB+ 模型。 @@ -244,6 +250,7 @@ impl Coordinator { inserter: TextInserter::new(), windows_ime: WindowsImeSessionController::new(), prepared_windows_ime_session: Arc::new(Mutex::new(Vec::new())), + windows_external_edit: WindowsExternalEditObserver::new(), state: Mutex::new(SessionState::default()), asr: 
Mutex::new(None), recorder: Mutex::new(None), @@ -3615,14 +3622,16 @@ fn schedule_capsule_idle(inner: &Arc<Inner>, delay_ms: u64) { #[cfg(target_os = "windows")] fn capture_focus_target() -> Option<usize> { - use windows::Win32::UI::WindowsAndMessaging::GetForegroundWindow; + use windows::Win32::UI::WindowsAndMessaging::{GetAncestor, GetForegroundWindow, GA_ROOT}; let foreground = unsafe { GetForegroundWindow() }; if foreground.0.is_null() { - None - } else { - Some(foreground.0 as usize) + return None; } + + let root = unsafe { GetAncestor(foreground, GA_ROOT) }; + let target = if root.0.is_null() { foreground } else { root }; + Some(target.0 as usize) } #[cfg(not(target_os = "windows"))] diff --git a/openless-all/app/src-tauri/src/coordinator/dictation.rs b/openless-all/app/src-tauri/src/coordinator/dictation.rs index 58f6f44f..43873ca1 100644 --- a/openless-all/app/src-tauri/src/coordinator/dictation.rs +++ b/openless-all/app/src-tauri/src/coordinator/dictation.rs @@ -379,6 +379,9 @@ pub(super) fn request_stop_during_starting(inner: &Arc<Inner>, reason: &str) { } pub(super) async fn begin_session(inner: &Arc<Inner>) -> Result<(), String> { + #[cfg(target_os = "windows")] + inner.windows_external_edit.cancel(); + let current_session_id = { let mut state = inner.state.lock(); let Some(session_id) = @@ -903,7 +906,36 @@ pub(super) async fn end_session(inner: &Arc<Inner>) -> Result<(), String> { }; let uses_global_timeout = asr_transcribe_uses_global_timeout(&asr); - let raw = match asr { + #[cfg(any(debug_assertions, test))] + let debug_transcript = debug_transcript_override_text(); + #[cfg(not(any(debug_assertions, test)))] + let debug_transcript: Option<String> = None; + + let raw = if let Some(debug_text) = debug_transcript { + log::info!( + "[coord] bypassing ASR with debug transcript override (chars={})", + debug_text.chars().count() + ); + match asr { + ActiveAsr::Volcengine(asr) => asr.cancel(), + ActiveAsr::Bailian(asr) => asr.cancel(), + #[cfg(target_os = 
"windows")] + ActiveAsr::FoundryLocalWhisper(_) => { + schedule_foundry_local_asr_release(inner, current_session_id); + } + #[cfg(target_os = "macos")] + ActiveAsr::Local(_) => { + inner.local_asr_cache.touch(); + schedule_local_asr_release(inner); + } + ActiveAsr::Whisper(_) => {} + } + RawTranscript { + text: debug_text, + duration_ms: elapsed, + } + } else { + match asr { ActiveAsr::Volcengine(asr) => { debug_assert!(uses_global_timeout); if let Err(e) = asr.send_last_frame().await { @@ -1121,6 +1153,7 @@ pub(super) async fn end_session(inner: &Arc<Inner>) -> Result<(), String> { } } } + } }; // ASR 完成后 cancel 检查:用户在 transcribe 进行中按 Esc 时,这里就会命中。 @@ -1406,6 +1439,15 @@ pub(super) async fn end_session(inner: &Arc<Inner>) -> Result<(), String> { restore_prepared_windows_ime_session(inner, current_session_id); let inserted_chars = polished.chars().count() as u32; + #[cfg(target_os = "windows")] + maybe_arm_windows_external_edit_observer( + inner, + status, + focus_ready_for_paste, + &polished, + front_app.as_deref(), + ); + // 累计每条 enabled 词条在最终文本中的命中次数。 // 用 polished(最终插入的文本)扫描,与用户实际看到的输出一致。 let total_hits: u64 = match inner.vocab.record_hits(&polished) { @@ -1514,6 +1556,95 @@ pub(super) fn dictation_error_code( } } +#[cfg(target_os = "windows")] +fn maybe_arm_windows_external_edit_observer( + inner: &Arc<Inner>, + status: InsertStatus, + focus_ready_for_paste: bool, + final_text: &str, + front_app: Option<&str>, +) { + let prefs = inner.prefs.get(); + if !prefs.windows_external_edit_learning { + log::info!("[extedit] skip arm: preference disabled"); + return; + } + if status != InsertStatus::Inserted { + log::info!("[extedit] skip arm: insert status is not Inserted ({status:?})"); + return; + } + if !focus_ready_for_paste { + log::info!("[extedit] skip arm: focus target was not restored for paste"); + return; + } + if final_text.trim().is_empty() { + log::info!("[extedit] skip arm: final text empty"); + return; + } + + let request = 
ExternalEditObservationRequest { + inserted_text: final_text.to_string(), + window_title: front_app.map(ToOwned::to_owned), + }; + log::info!( + "[extedit] armed observer (chars={}, window={:?})", + final_text.chars().count(), + request.window_title + ); + let inner_for_callback = Arc::clone(inner); + inner.windows_external_edit.arm(request, move |outcome| { + handle_windows_external_edit_outcome(&inner_for_callback, outcome); + }); +} + +#[cfg(target_os = "windows")] +fn handle_windows_external_edit_outcome( + inner: &Arc<Inner>, + outcome: ExternalEditObservationOutcome, +) { + match outcome { + ExternalEditObservationOutcome::Learned(learned) => { + log::info!( + "[extedit] learned rule {} -> {} (baseline_chars={}, observed_chars={})", + learned.pattern, + learned.replacement, + learned.baseline_text.chars().count(), + learned.observed_text.chars().count() + ); + match inner + .correction_rules + .remember(learned.pattern.clone(), learned.replacement.clone()) + { + Ok(rule) => { + log::info!("[extedit] persisted rule id={}", rule.id); + if let Some(app) = inner.app.lock().clone() { + let _ = app.emit("vocab:updated", 0u64); + } + } + Err(err) => { + log::warn!( + "[extedit] persist learned rule failed: {err}; baseline_chars={} observed_chars={}", + learned.baseline_text.chars().count(), + learned.observed_text.chars().count() + ); + } + } + } + ExternalEditObservationOutcome::NoChange => { + log::info!("[extedit] observation window elapsed without edits"); + } + ExternalEditObservationOutcome::Skipped(reason) => { + log::info!("[extedit] observation skipped: {reason}"); + } + ExternalEditObservationOutcome::Failed(err) => { + log::warn!("[extedit] observation failed: {err}"); + } + ExternalEditObservationOutcome::Cancelled => { + log::info!("[extedit] observation cancelled"); + } + } +} + pub(super) fn cancel_session(inner: &Arc<Inner>) { let Some(decision) = ({ let mut state = inner.state.lock(); diff --git a/openless-all/app/src-tauri/src/lib.rs 
b/openless-all/app/src-tauri/src/lib.rs index 7ace76be..4fb5cf16 100644 --- a/openless-all/app/src-tauri/src/lib.rs +++ b/openless-all/app/src-tauri/src/lib.rs @@ -30,6 +30,7 @@ mod selection; mod shortcut_binding; mod types; mod unicode_keystroke; +mod windows_external_edit; mod windows_ime_ipc; mod windows_ime_profile; mod windows_ime_protocol; diff --git a/openless-all/app/src-tauri/src/persistence.rs b/openless-all/app/src-tauri/src/persistence.rs index 43f45e7d..b60ac9b3 100644 --- a/openless-all/app/src-tauri/src/persistence.rs +++ b/openless-all/app/src-tauri/src/persistence.rs @@ -117,6 +117,7 @@ fn data_dir() -> Result<PathBuf> { .join("share") .join("OpenLess")) } + } fn ensure_dir(dir: &Path) -> Result<()> { @@ -1051,6 +1052,15 @@ pub struct CorrectionRuleStore { lock: Mutex<()>, } +#[derive(Debug, Deserialize)] +#[serde(untagged)] +enum CorrectionRuleFile { + Array(Vec<CorrectionRule>), + Wrapped { + value: Vec<CorrectionRule>, + }, +} + impl CorrectionRuleStore { pub fn new() -> Result<Self> { let dir = data_dir()?; @@ -1084,6 +1094,43 @@ impl CorrectionRuleStore { Ok(rule) } + pub fn remember(&self, pattern: String, replacement: String) -> Result<CorrectionRule> { + let pattern = pattern.trim().to_string(); + let replacement = replacement.trim().to_string(); + validate_correction_rule_syntax(&pattern, &replacement)?; + let _guard = self.lock.lock(); + let mut rules = self.read_locked()?; + + if let Some(index) = rules + .iter() + .position(|rule| rule.pattern == pattern && rule.replacement == replacement) + { + if !rules[index].enabled { + rules[index].enabled = true; + self.write_locked(&rules)?; + } + return Ok(rules[index].clone()); + } + + if rules.iter().any(|rule| rule.pattern == pattern && rule.replacement != replacement) { + return Err(anyhow!( + "conflicting correction rule already exists for pattern {}", + pattern + )); + } + + let rule = CorrectionRule { + id: Uuid::new_v4().to_string(), + pattern, + replacement, + enabled: true, + 
created_at: Utc::now().to_rfc3339(), + }; + rules.insert(0, rule.clone()); + self.write_locked(&rules)?; + Ok(rule) + } + pub fn remove(&self, id: &str) -> Result<()> { let _guard = self.lock.lock(); let mut rules = self.read_locked()?; @@ -1113,7 +1160,25 @@ impl CorrectionRuleStore { } fn read_locked(&self) -> Result<Vec<CorrectionRule>> { - read_or_default::<Vec<CorrectionRule>>(&self.path) + if !self.path.exists() { + return Ok(Vec::new()); + } + let bytes = + fs::read(&self.path).with_context(|| format!("read failed: {}", self.path.display()))?; + if bytes.is_empty() { + return Ok(Vec::new()); + } + match serde_json::from_slice::<CorrectionRuleFile>(&bytes) + .with_context(|| format!("decode failed: {}", self.path.display()))? + { + CorrectionRuleFile::Array(rules) => Ok(rules), + CorrectionRuleFile::Wrapped { value } => { + log::warn!( + "[extedit] correction-rules.json used wrapped array format; normalizing on next write" + ); + Ok(value) + } + } } fn write_locked(&self, rules: &[CorrectionRule]) -> Result<()> { @@ -1288,7 +1353,7 @@ impl CredentialsVault { #[cfg(test)] mod tests { use super::{ - chunk_json_payload, list_vocab_presets, save_vocab_presets, + chunk_json_payload, list_vocab_presets, save_vocab_presets, CorrectionRuleFile, validate_correction_rule_syntax, KEYRING_CHUNK_MAX_UTF16_UNITS, }; use crate::types::{VocabPreset, VocabPresetStore}; @@ -1350,4 +1415,33 @@ mod tests { assert!(validate_correction_rule_syntax("{num}到{num}粒", "{num}例").is_err()); assert!(validate_correction_rule_syntax("几粒", "{num}例").is_err()); } + + #[test] + fn correction_rule_file_accepts_powershell_wrapped_array_shape() { + let payload = r#"{ + "value": [ + { + "id": "rule-1", + "pattern": "粒", + "replacement": "例", + "enabled": false, + "createdAt": "2026-05-13T07:30:50.897734+00:00" + } + ], + "Count": 1 +}"#; + let parsed: CorrectionRuleFile = + serde_json::from_str(payload).expect("wrapped correction rule payload should parse"); + match parsed { + 
CorrectionRuleFile::Wrapped { value } => { + assert_eq!(value.len(), 1); + assert_eq!(value[0].id, "rule-1"); + assert_eq!(value[0].pattern, "粒"); + assert_eq!(value[0].replacement, "例"); + assert!(!value[0].enabled); + assert_eq!(value[0].created_at, "2026-05-13T07:30:50.897734+00:00"); + } + CorrectionRuleFile::Array(_) => panic!("wrapped payload parsed as plain array"), + } + } } diff --git a/openless-all/app/src-tauri/src/types.rs b/openless-all/app/src-tauri/src/types.rs index c0bc5e37..8bd38745 100644 --- a/openless-all/app/src-tauri/src/types.rs +++ b/openless-all/app/src-tauri/src/types.rs @@ -290,6 +290,8 @@ pub struct UserPreferences { /// 默认 true(更接近用户习惯)。 #[serde(default = "default_true")] pub streaming_insert_save_clipboard: bool, + #[serde(default = "default_windows_external_edit_learning_enabled")] + pub windows_external_edit_learning: bool, } fn default_local_asr_model() -> String { @@ -320,6 +322,10 @@ fn default_foundry_local_runtime_source() -> String { "auto".into() } +fn default_windows_external_edit_learning_enabled() -> bool { + cfg!(target_os = "windows") +} + fn default_active_asr_provider() -> String { #[cfg(target_os = "windows")] { @@ -389,6 +395,8 @@ struct UserPreferencesWire { streaming_insert: bool, #[serde(default = "default_true")] streaming_insert_save_clipboard: bool, + #[serde(default = "default_windows_external_edit_learning_enabled")] + windows_external_edit_learning: bool, } impl Default for UserPreferencesWire { @@ -432,6 +440,7 @@ impl Default for UserPreferencesWire { start_minimized: prefs.start_minimized, streaming_insert: prefs.streaming_insert, streaming_insert_save_clipboard: prefs.streaming_insert_save_clipboard, + windows_external_edit_learning: prefs.windows_external_edit_learning, } } } @@ -492,6 +501,7 @@ impl<'de> Deserialize<'de> for UserPreferences { start_minimized: wire.start_minimized, streaming_insert: wire.streaming_insert, streaming_insert_save_clipboard: wire.streaming_insert_save_clipboard, + 
windows_external_edit_learning: wire.windows_external_edit_learning, }) } } @@ -605,6 +615,7 @@ impl Default for UserPreferences { start_minimized: false, streaming_insert: false, streaming_insert_save_clipboard: true, + windows_external_edit_learning: default_windows_external_edit_learning_enabled(), } } } diff --git a/openless-all/app/src-tauri/src/windows_external_edit.rs b/openless-all/app/src-tauri/src/windows_external_edit.rs new file mode 100644 index 00000000..600d9921 --- /dev/null +++ b/openless-all/app/src-tauri/src/windows_external_edit.rs @@ -0,0 +1,501 @@ +use std::sync::atomic::{AtomicU64, Ordering}; +use std::sync::Arc; +use std::thread; +use std::time::{Duration, Instant}; + +#[cfg(target_os = "windows")] +use windows::Win32::Foundation::RPC_E_CHANGED_MODE; +#[cfg(target_os = "windows")] +use windows::Win32::System::Com::{ + CoCreateInstance, CoInitializeEx, CoUninitialize, CLSCTX_INPROC_SERVER, + COINIT_APARTMENTTHREADED, +}; +#[cfg(target_os = "windows")] +use windows::Win32::UI::Accessibility::{ + CUIAutomation8, IUIAutomation, IUIAutomationElement, IUIAutomationTextPattern, + IUIAutomationValuePattern, UIA_TextPatternId, UIA_ValuePatternId, +}; + +const BASELINE_DELAY: Duration = Duration::from_millis(180); +const POLL_INTERVAL: Duration = Duration::from_millis(250); +const OBSERVATION_WINDOW: Duration = Duration::from_secs(8); +const MAX_LEARNED_SPAN_CHARS: usize = 48; +const MAX_LOG_PREVIEW_CHARS: usize = 120; + +#[derive(Debug, Clone)] +pub struct ExternalEditObservationRequest { + pub inserted_text: String, + pub window_title: Option<String>, +} + +#[derive(Debug, Clone, PartialEq, Eq)] +pub struct LearnedCorrection { + pub pattern: String, + pub replacement: String, + pub baseline_text: String, + pub observed_text: String, +} + +#[derive(Debug, Clone, PartialEq, Eq)] +pub enum ExternalEditObservationOutcome { + Learned(LearnedCorrection), + NoChange, + Skipped(&'static str), + Failed(String), + Cancelled, +} + +#[derive(Debug, Clone)] 
+struct TextSnapshot { + text: String, + class_name: String, + name: String, +} + +pub struct WindowsExternalEditObserver { + generation: Arc<AtomicU64>, +} + +impl Default for WindowsExternalEditObserver { + fn default() -> Self { + Self { + generation: Arc::new(AtomicU64::new(0)), + } + } +} + +impl WindowsExternalEditObserver { + pub fn new() -> Self { + Self::default() + } + + pub fn cancel(&self) { + self.generation.fetch_add(1, Ordering::SeqCst); + } + + pub fn arm<F>(&self, request: ExternalEditObservationRequest, on_outcome: F) + where + F: FnOnce(ExternalEditObservationOutcome) + Send + 'static, + { + let generation_state = Arc::clone(&self.generation); + let generation = generation_state.fetch_add(1, Ordering::SeqCst) + 1; + thread::spawn(move || { + let mut callback = Some(on_outcome); + + thread::sleep(BASELINE_DELAY); + if Self::is_cancelled(&generation_state, generation) { + if let Some(cb) = callback.take() { + cb(ExternalEditObservationOutcome::Cancelled); + } + return; + } + + let baseline = match snapshot_external_edit_text() { + Ok(Some(snapshot)) => snapshot, + Ok(None) => { + if let Some(cb) = callback.take() { + cb(ExternalEditObservationOutcome::Skipped("baselineUnavailable")); + } + return; + } + Err(err) => { + if let Some(cb) = callback.take() { + cb(ExternalEditObservationOutcome::Failed(format!( + "baseline capture failed: {err}" + ))); + } + return; + } + }; + + if !request.inserted_text.is_empty() + && !has_unique_occurrence(&baseline.text, &request.inserted_text) + { + let occurrences = occurrence_count(&baseline.text, &request.inserted_text); + log::info!( + "[extedit] baseline rejected: reason=insertedTextNotUniqueInBaseline occurrences={} baseline_chars={} inserted_chars={} class={:?} name={:?} expected_window={:?} preview={:?}", + occurrences, + baseline.text.chars().count(), + request.inserted_text.chars().count(), + baseline.class_name, + baseline.name, + request.window_title, + preview_text(&baseline.text), + ); + if let 
Some(cb) = callback.take() { + cb(ExternalEditObservationOutcome::Skipped( + "insertedTextNotUniqueInBaseline", + )); + } + return; + } + + let deadline = Instant::now() + OBSERVATION_WINDOW; + while Instant::now() < deadline { + if Self::is_cancelled(&generation_state, generation) { + if let Some(cb) = callback.take() { + cb(ExternalEditObservationOutcome::Cancelled); + } + return; + } + thread::sleep(POLL_INTERVAL); + let observed = match snapshot_external_edit_text() { + Ok(Some(snapshot)) => snapshot, + Ok(None) => continue, + Err(err) => { + if let Some(cb) = callback.take() { + cb(ExternalEditObservationOutcome::Failed(format!( + "poll capture failed: {err}" + ))); + } + return; + } + }; + if observed.text == baseline.text { + continue; + } + match infer_learned_correction( + &baseline.text, + &observed.text, + &request.inserted_text, + ) { + Some(rule) => { + if let Some(cb) = callback.take() { + cb(ExternalEditObservationOutcome::Learned(rule)); + } + return; + } + None => { + if let Some(cb) = callback.take() { + cb(ExternalEditObservationOutcome::Skipped( + "deterministicInferenceFailed", + )); + } + return; + } + } + } + + if let Some(cb) = callback.take() { + cb(ExternalEditObservationOutcome::NoChange); + } + }); + } + + fn is_cancelled(generation_state: &AtomicU64, generation: u64) -> bool { + generation != generation_state.load(Ordering::SeqCst) + } +} + +fn has_unique_occurrence(haystack: &str, needle: &str) -> bool { + if needle.is_empty() { + return true; + } + let mut matches = haystack.match_indices(needle); + let first = matches.next(); + first.is_some() && matches.next().is_none() +} + +fn occurrence_count(haystack: &str, needle: &str) -> usize { + if needle.is_empty() { + return 0; + } + haystack.match_indices(needle).count() +} + +fn preview_text(text: &str) -> String { + let preview: String = text.chars().take(MAX_LOG_PREVIEW_CHARS).collect(); + preview.replace('\r', "\\r").replace('\n', "\\n") +} + +fn snapshot_external_edit_text() -> 
Result<Option<TextSnapshot>, String> { + #[cfg(not(target_os = "windows"))] + { + Ok(None) + } + + #[cfg(target_os = "windows")] + { + let com_initialized = unsafe { CoInitializeEx(None, COINIT_APARTMENTTHREADED) }; + let should_uninitialize = if com_initialized.is_ok() { + true + } else if com_initialized == RPC_E_CHANGED_MODE { + false + } else { + return Err(format!("CoInitializeEx: {com_initialized}")); + }; + let result = snapshot_external_edit_text_windows(); + if should_uninitialize { + unsafe { + CoUninitialize(); + } + } + result + } +} + +#[cfg(target_os = "windows")] +fn snapshot_external_edit_text_windows() -> Result<Option<TextSnapshot>, String> { + let automation: IUIAutomation = unsafe { + CoCreateInstance(&CUIAutomation8, None, CLSCTX_INPROC_SERVER) + .map_err(|err| format!("CoCreateInstance(CUIAutomation8): {err}"))? + }; + let element = unsafe { + automation + .GetFocusedElement() + .map_err(|err| format!("GetFocusedElement: {err}"))? + }; + read_text_from_element(&element) +} + +#[cfg(target_os = "windows")] +fn read_text_from_element(element: &IUIAutomationElement) -> Result<Option<TextSnapshot>, String> { + let class_name = unsafe { + element + .CurrentClassName() + .map(|value| value.to_string()) + .unwrap_or_default() + }; + let name = unsafe { + element + .CurrentName() + .map(|value| value.to_string()) + .unwrap_or_default() + }; + + if let Ok(value_pattern) = + unsafe { element.GetCurrentPatternAs::<IUIAutomationValuePattern>(UIA_ValuePatternId) } + { + let value = unsafe { + value_pattern + .CurrentValue() + .map_err(|err| format!("CurrentValue: {err}"))? + }; + let text = value.to_string(); + if !text.is_empty() { + return Ok(Some(TextSnapshot { + text, + class_name, + name, + })); + } + } + + if let Ok(text_pattern) = + unsafe { element.GetCurrentPatternAs::<IUIAutomationTextPattern>(UIA_TextPatternId) } + { + let document_range = unsafe { + text_pattern + .DocumentRange() + .map_err(|err| format!("DocumentRange: {err}"))? 
+ }; + let text = unsafe { + document_range + .GetText(-1) + .map_err(|err| format!("GetText: {err}"))? + } + .to_string(); + if !text.is_empty() { + return Ok(Some(TextSnapshot { + text, + class_name, + name, + })); + } + } + + Ok(None) +} + +pub fn infer_learned_correction( + baseline: &str, + observed: &str, + inserted_text: &str, +) -> Option<LearnedCorrection> { + if baseline == observed { + return None; + } + if !inserted_text.is_empty() && !has_unique_occurrence(baseline, inserted_text) { + return None; + } + + let baseline_chars: Vec<char> = baseline.chars().collect(); + let observed_chars: Vec<char> = observed.chars().collect(); + let mut prefix = 0usize; + while prefix < baseline_chars.len() + && prefix < observed_chars.len() + && baseline_chars[prefix] == observed_chars[prefix] + { + prefix += 1; + } + + let mut baseline_suffix = baseline_chars.len(); + let mut observed_suffix = observed_chars.len(); + while baseline_suffix > prefix + && observed_suffix > prefix + && baseline_chars[baseline_suffix - 1] == observed_chars[observed_suffix - 1] + { + baseline_suffix -= 1; + observed_suffix -= 1; + } + + let old_span: String = baseline_chars[prefix..baseline_suffix].iter().collect(); + let new_span: String = observed_chars[prefix..observed_suffix].iter().collect(); + if old_span.trim().is_empty() || new_span.trim().is_empty() || old_span == new_span { + return None; + } + if old_span.contains('\n') || new_span.contains('\n') { + return None; + } + if old_span.chars().count() > MAX_LEARNED_SPAN_CHARS + || new_span.chars().count() > MAX_LEARNED_SPAN_CHARS + { + return None; + } + + if !inserted_text.is_empty() { + let (insert_start, insert_end) = unique_match_range(baseline, inserted_text)?; + let diff_start = char_to_byte_index(baseline, prefix)?; + let diff_end = char_to_byte_index(baseline, baseline_suffix)?; + if diff_start >= insert_end || diff_end <= insert_start { + return None; + } + } + + let (pattern, replacement) = generalize_numeric_pair(&old_span, 
&new_span)
+        .unwrap_or_else(|| (old_span.clone(), new_span.clone()));
+
+    Some(LearnedCorrection {
+        pattern,
+        replacement,
+        baseline_text: baseline.to_string(),
+        observed_text: observed.to_string(),
+    })
+}
+
+fn unique_match_range(haystack: &str, needle: &str) -> Option<(usize, usize)> {
+    let mut matches = haystack.match_indices(needle);
+    let (start, matched) = matches.next()?;
+    if matches.next().is_some() {
+        return None;
+    }
+    Some((start, start + matched.len()))
+}
+
+fn char_to_byte_index(text: &str, char_index: usize) -> Option<usize> {
+    if char_index == text.chars().count() {
+        return Some(text.len());
+    }
+    text.char_indices().nth(char_index).map(|(index, _)| index)
+}
+
+fn generalize_numeric_pair(old_span: &str, new_span: &str) -> Option<(String, String)> {
+    let old_token = single_ascii_digit_run(old_span)?;
+    let new_token = single_ascii_digit_run(new_span)?;
+    if old_token.text != new_token.text {
+        return None;
+    }
+    let old_pattern = replace_range_with_num(old_span, old_token.start, old_token.end)?;
+    let new_pattern = replace_range_with_num(new_span, new_token.start, new_token.end)?;
+    if old_pattern == "{num}" || new_pattern == "{num}" {
+        return None;
+    }
+    Some((old_pattern, new_pattern))
+}
+
+struct DigitRun<'a> {
+    start: usize,
+    end: usize,
+    text: &'a str,
+}
+
+fn single_ascii_digit_run(text: &str) -> Option<DigitRun<'_>> {
+    let bytes = text.as_bytes();
+    let mut run_start: Option<usize> = None;
+    let mut run_end = 0usize;
+    let mut run_closed = false;
+    for (index, byte) in bytes.iter().copied().enumerate() {
+        if byte.is_ascii_digit() {
+            if run_closed {
+                return None;
+            }
+            if run_start.is_none() {
+                run_start = Some(index);
+            }
+            run_end = index + 1;
+        } else if run_start.is_some() {
+            run_closed = true;
+        }
+    }
+    let start = run_start?;
+    Some(DigitRun {
+        start,
+        end: run_end,
+        text: &text[start..run_end],
+    })
+}
+
+fn replace_range_with_num(text: &str, start: usize, end: usize) -> Option<String> {
+    let prefix = text.get(..start)?;
+    let suffix = text.get(end..)?;
+    Some(format!("{prefix}{{num}}{suffix}"))
+}
+
+#[cfg(test)]
+mod tests {
+    use super::{infer_learned_correction, LearnedCorrection};
+
+    fn learned(
+        pattern: &str,
+        replacement: &str,
+        baseline: &str,
+        observed: &str,
+    ) -> LearnedCorrection {
+        LearnedCorrection {
+            pattern: pattern.into(),
+            replacement: replacement.into(),
+            baseline_text: baseline.into(),
+            observed_text: observed.into(),
+        }
+    }
+
+    #[test]
+    fn infers_literal_replacement_inside_inserted_span() {
+        let baseline = "今天记录了一个几粒样本。";
+        let observed = "今天记录了一个几例样本。";
+        let inserted = "今天记录了一个几粒样本。";
+
+        assert_eq!(
+            infer_learned_correction(baseline, observed, inserted),
+            Some(learned("几粒", "几例", baseline, observed))
+        );
+    }
+
+    #[test]
+    fn infers_numeric_generalization_when_number_matches() {
+        let baseline = "本轮共统计 2粒 阳性。";
+        let observed = "本轮共统计 2例 阳性。";
+        let inserted = baseline;
+
+        assert_eq!(
+            infer_learned_correction(baseline, observed, inserted),
+            Some(learned("{num}粒", "{num}例", baseline, observed))
+        );
+    }
+
+    #[test]
+    fn rejects_diff_outside_inserted_span() {
+        let baseline = "前缀A 几粒 后缀";
+        let observed = "前缀B 几粒 后缀";
+
+        assert_eq!(infer_learned_correction(baseline, observed, "几粒"), None);
+    }
+
+    #[test]
+    fn rejects_ambiguous_inserted_occurrence() {
+        let baseline = "几粒 和 几粒";
+        let observed = "几例 和 几粒";
+
+        assert_eq!(infer_learned_correction(baseline, observed, "几粒"), None);
+    }
+}
diff --git a/openless-all/app/src/i18n/en.ts b/openless-all/app/src/i18n/en.ts
index 8cdb2016..a62c077e 100644
--- a/openless-all/app/src/i18n/en.ts
+++ b/openless-all/app/src/i18n/en.ts
@@ -497,6 +497,12 @@ export const en: typeof zhCN = {
         'Experimental: uses enigo + XTest to synthesize keystrokes. Stable on X11; on Wayland it depends on whether the compositor grants libei access — failures fall back automatically to one-shot insertion.',
       streamingInsertSaveClipboardLabel: 'Also copy to clipboard',
       streamingInsertSaveClipboardHint: 'After a successful streaming insert, write the final text to the system clipboard so Cmd+V can paste it again. Off = streaming never touches the clipboard.',
+      windowsExternalEditLearningTitle: 'Windows external-edit auto learning',
+      windowsExternalEditLearningDesc:
+        'After a successful insert, OpenLess opens only a short local observation window. If you manually correct a term in the same text control right away, OpenLess tries to infer a deterministic old -> new replacement and stores it as a local correction rule for future dictation.',
+      windowsExternalEditLearningLabel: 'Enable external-edit auto learning',
+      windowsExternalEditLearningHint:
+        'Windows-only for the first productized release. Observation is best-effort and limited to verified supported scenarios; if observation, attribution, inference, or persistence fails, the run falls back silently and never affects the original insert. Learned rules remain visible in Vocabulary -> Correction rules, where you can disable or delete them.',
       localAsrTitle: 'Local ASR models (experimental)',
       localAsrDesc: 'Move transcription from cloud ASR to on-device inference. Offline / privacy-sensitive use only.',
       localAsrWarningShort: 'Local inference is slower; under-spec hardware may drop words.',
diff --git a/openless-all/app/src/i18n/ja.ts b/openless-all/app/src/i18n/ja.ts
index ff51143d..1d1b9588 100644
--- a/openless-all/app/src/i18n/ja.ts
+++ b/openless-all/app/src/i18n/ja.ts
@@ -499,6 +499,12 @@ export const ja: typeof zhCN = {
         '実験的:enigo + XTest によるキーボードイベント合成。X11 では安定動作、Wayland では compositor が libei を許可するかどうかに依存し、失敗時は自動的に一括挿入にフォールバックします。',
       streamingInsertSaveClipboardLabel: 'クリップボードにも保存',
       streamingInsertSaveClipboardHint: 'ストリーミング入力成功後、最終テキストをシステムクリップボードに書き込んで Cmd+V で再貼り付けできるようにします。OFF にすると、ストリーミング処理はクリップボードに一切触れません。',
+      windowsExternalEditLearningTitle: 'Windows 外部編集の自動学習',
+      windowsExternalEditLearningDesc:
+        '挿入成功後、ごく短いローカル観察ウィンドウだけが開きます。その直後に同じテキストコントロール内で用語を手動修正すると、OpenLess は deterministic な old -> new 置換を推定してローカル correction rule として保存し、次回以降の音声入力で優先適用します。',
+      windowsExternalEditLearningLabel: '外部編集の自動学習を有効にする',
+      windowsExternalEditLearningHint:
+        '初回 productized release では Windows のみ対応です。観察は検証済みサポートシナリオに限定した best-effort で、観察・帰因・推定・保存のいずれかが失敗しても元の挿入結果には影響しません。学習済みルールは「Vocabulary -> Correction rules」で確認・無効化・削除できます。',
       localAsrTitle: 'ローカル ASR モデル(実験的)',
       localAsrDesc: '転写をクラウドから本機推論に切り替えます。オフライン/プライバシー重視向け。',
       localAsrWarningShort: 'ローカル推論は遅く、スペック不足では欠字の可能性があります。',
diff --git a/openless-all/app/src/i18n/ko.ts b/openless-all/app/src/i18n/ko.ts
index 35dfb31c..de059d79 100644
--- a/openless-all/app/src/i18n/ko.ts
+++ b/openless-all/app/src/i18n/ko.ts
@@ -499,6 +499,12 @@ export const ko: typeof zhCN = {
         '실험적: enigo + XTest 로 키보드 이벤트를 합성합니다. X11 에서는 안정적이며, Wayland 에서는 compositor 가 libei 를 허용하는지에 따라 다르고, 실패 시 자동으로 일괄 삽입으로 폴백됩니다.',
       streamingInsertSaveClipboardLabel: '클립보드에도 저장',
       streamingInsertSaveClipboardHint: '스트리밍 입력 성공 후 최종 텍스트를 시스템 클립보드에 기록하여 Cmd+V 로 재붙여넣기를 할 수 있도록 합니다. 끄면 스트리밍 동안 클립보드를 일절 건드리지 않습니다.',
+      windowsExternalEditLearningTitle: 'Windows 외부 편집 자동 학습',
+      windowsExternalEditLearningDesc:
+        '삽입이 성공하면 아주 짧은 로컬 관찰 창만 열립니다. 그 직후 같은 텍스트 컨트롤에서 용어를 수동으로 고치면 OpenLess 가 deterministic 한 old -> new 치환을 추론해 로컬 correction rule 로 저장하고, 이후 받아쓰기에서 우선 적용합니다.',
+      windowsExternalEditLearningLabel: '외부 편집 자동 학습 사용',
+      windowsExternalEditLearningHint:
+        '첫 productized release 에서는 Windows 만 지원합니다. 관찰은 검증된 지원 시나리오에 한정된 best-effort 이며, 관찰·귀속·추론·저장 중 어느 단계가 실패해도 원래 삽입 결과에는 영향을 주지 않습니다. 학습된 규칙은 "Vocabulary -> Correction rules" 에서 확인, 비활성화, 삭제할 수 있습니다.',
       localAsrTitle: '로컬 ASR 모델 (실험적)',
       localAsrDesc: '전사를 클라우드에서 로컬 추론으로 전환합니다. 오프라인 / 프라이버시용에만 권장됩니다.',
       localAsrWarningShort: '로컬 추론은 느리며, 사양 부족 시 글자 누락이 발생할 수 있습니다.',
diff --git a/openless-all/app/src/i18n/zh-CN.ts b/openless-all/app/src/i18n/zh-CN.ts
index 987e0a31..539d48e7 100644
--- a/openless-all/app/src/i18n/zh-CN.ts
+++ b/openless-all/app/src/i18n/zh-CN.ts
@@ -495,6 +495,12 @@ export const zhCN = {
         '实验性:enigo + XTest 模拟键盘事件,X11 稳定;Wayland 看 compositor 是否允许 libei,失败自动回落到一次性插入。',
       streamingInsertSaveClipboardLabel: '同步写入剪贴板',
       streamingInsertSaveClipboardHint: '流式输入成功后把这次的最终文本写到系统剪贴板,方便 Cmd+V 再次粘贴。关闭后流式过程完全不动剪贴板。',
+      windowsExternalEditLearningTitle: 'Windows 外部编辑自动学习',
+      windowsExternalEditLearningDesc:
+        '插入成功后只在一个很短的本地观察窗口内工作:如果你随后在同一文本控件里手工改正了刚插入的术语,OpenLess 会尝试提取确定性的 old -> new 替换,并写入本地纠正规则,供后续听写优先命中。',
+      windowsExternalEditLearningLabel: '启用外部编辑自动学习',
+      windowsExternalEditLearningHint:
+        '仅 Windows 首批支持。当前只在已验证支持的外部编辑场景里尽力观察;观察、归因、抽取或写入失败都会静默回退,不影响这次原始插入。已学到的规则可在“词汇表 -> 纠正规则”里查看、禁用或删除。',
       localAsrTitle: '本地 ASR 模型(实验性)',
       localAsrDesc: '把转写从云端切到本机推理。仅推荐离线 / 隐私敏感场景。',
       localAsrWarningShort: '本地推理较慢,配置不足时可能吞字。',
diff --git a/openless-all/app/src/i18n/zh-TW.ts b/openless-all/app/src/i18n/zh-TW.ts
index 35e82844..8f53f39e 100644
--- a/openless-all/app/src/i18n/zh-TW.ts
+++ b/openless-all/app/src/i18n/zh-TW.ts
@@ -497,6 +497,12 @@ export const zhTW: typeof zhCN = {
         '實驗性:enigo + XTest 模擬鍵盤事件,X11 穩定;Wayland 看 compositor 是否允許 libei,失敗自動回落到一次性插入。',
       streamingInsertSaveClipboardLabel: '同步寫入剪貼簿',
       streamingInsertSaveClipboardHint: '流式輸入成功後把這次的最終文字寫到系統剪貼簿,方便 Cmd+V 再次貼上。關閉後流式過程完全不動剪貼簿。',
+      windowsExternalEditLearningTitle: 'Windows 外部編輯自動學習',
+      windowsExternalEditLearningDesc:
+        '插入成功後只在很短的本地觀察窗口內工作:如果你隨後在同一文字控制項裡手工改正了剛插入的術語,OpenLess 會嘗試提取確定性的 old -> new 替換,並寫入本地糾正規則,供後續聽寫優先命中。',
+      windowsExternalEditLearningLabel: '啟用外部編輯自動學習',
+      windowsExternalEditLearningHint:
+        '僅 Windows 首批支援。當前只在已驗證支援的外部編輯場景裡盡力觀察;觀察、歸因、抽取或寫入失敗都會靜默回退,不影響這次原始插入。已學到的規則可在「詞彙表 -> 糾正規則」裡查看、停用或刪除。',
       localAsrTitle: '本地 ASR 模型(實驗性)',
       localAsrDesc: '把轉寫從雲端切到本機推理。僅推薦離線 / 隱私敏感場景。',
       localAsrWarningShort: '本地推理較慢,配置不足時可能吞字。',
diff --git a/openless-all/app/src/lib/ipc.ts b/openless-all/app/src/lib/ipc.ts
index b8e14f7a..2c5c3417 100644
--- a/openless-all/app/src/lib/ipc.ts
+++ b/openless-all/app/src/lib/ipc.ts
@@ -84,6 +84,7 @@ const mockSettings: UserPreferences = {
   updateChannel: 'stable',
   streamingInsert: false,
   streamingInsertSaveClipboard: true,
+  windowsExternalEditLearning: true,
 };
 
 const mockHotkeyCapability: HotkeyCapability = {
diff --git a/openless-all/app/src/lib/stylePrefs.test.ts b/openless-all/app/src/lib/stylePrefs.test.ts
index 74a044cb..23d42723 100644
--- a/openless-all/app/src/lib/stylePrefs.test.ts
+++ b/openless-all/app/src/lib/stylePrefs.test.ts
@@ -53,6 +53,7 @@ const previousPrefs: UserPreferences = {
   updateChannel: 'stable',
   streamingInsert: false,
   streamingInsertSaveClipboard: true,
+  windowsExternalEditLearning: false,
 };
 
 const nextPrefs: UserPreferences = {
diff --git a/openless-all/app/src/lib/types.ts b/openless-all/app/src/lib/types.ts
index b45242fa..7c6053e3 100644
--- a/openless-all/app/src/lib/types.ts
+++ b/openless-all/app/src/lib/types.ts
@@ -209,6 +209,7 @@ export interface UserPreferences {
   /** 流式输入成功后是否把最终润色文本写回剪贴板。开启后 Cmd+V 还能重复粘贴该次输出,
    *  与一次性路径行为对齐。默认 true。 */
   streamingInsertSaveClipboard: boolean;
+  windowsExternalEditLearning: boolean;
 }
 
 export interface MicrophoneDevice {
diff --git a/openless-all/app/src/pages/Settings.tsx b/openless-all/app/src/pages/Settings.tsx
index d990a040..14f51a76 100644
--- a/openless-all/app/src/pages/Settings.tsx
+++ b/openless-all/app/src/pages/Settings.tsx
@@ -1771,6 +1771,27 @@ function AdvancedSection() {
       <Card>
         {/* 标题 + 右上角 inline 警告小字(替换原琥珀大警告条)。 */}
+        {isWin && (
+          <div style={{ marginBottom: 14, paddingBottom: 14, borderBottom: '0.5px solid var(--ol-line-soft)' }}>
+            <div style={{ fontSize: 13, fontWeight: 600, marginBottom: 4 }}>
+              {t('settings.advanced.windowsExternalEditLearningTitle')}
+            </div>
+            <div style={{ fontSize: 11.5, color: 'var(--ol-ink-4)', marginBottom: 10 }}>
+              {t('settings.advanced.windowsExternalEditLearningDesc')}
+            </div>
+            <SettingRow
+              label={t('settings.advanced.windowsExternalEditLearningLabel')}
+              desc={t('settings.advanced.windowsExternalEditLearningHint')}>
+              <Toggle
+                on={!!prefs?.windowsExternalEditLearning}
+                onToggle={(next) => {
+                  if (prefs) void updatePrefs({ ...prefs, windowsExternalEditLearning: next });
+                }}
+              />
+            </SettingRow>
+          </div>
+        )}
+
         <div style={{ display: 'flex', alignItems: 'flex-start', justifyContent: 'space-between', gap: 12, marginBottom: 14 }}>
           <div style={{ minWidth: 0 }}>
             <div style={{ fontSize: 13, fontWeight: 600 }}>{t('settings.advanced.localAsrTitle')}</div>