Pure WinRT screen capture + OCR research tool.
Zero GDI. Zero Tesseract. Zero external dependencies.
SilentLens captures a target window using Windows.Graphics.Capture,
runs OCR on the frame using Windows.Media.Ocr, and prints the extracted
text to stdout. The capture and OCR pipeline stays on the WinRT
activation surface (RoActivateInstance / RoGetActivationFactory): no
BitBlt, no GetDC, no WinHttpOpen, no import-table strings a YARA rule
would flag. Window enumeration (EnumWindows) and the optional profile
editor (GDI+) are ordinary Win32 and sit outside that pipeline.
Most screenshot-stealing research tools use GDI (BitBlt / GetDC /
PrintWindow) for capture and Tesseract for OCR. EDR vendors have
mature user-mode hooks on that surface. SilentLens asks a different
question: what happens when the entire pipeline lives on the WinRT
activation surface instead?
The answer is: most user-mode hooks never fire, because they sit on
win32u.dll / wininet.dll / kernel32.dll — not on WinRT activation
factories. ETW providers (Microsoft-Windows-Graphics-Capture,
Microsoft-Windows-DwmCore) still see it if the vendor subscribes.
This is a research POC for detection engineers and red teamers who want to understand the telemetry gap. It is not a production tool.
- WinRT-only capture: Windows.Graphics.Capture with Direct3D11CaptureFramePool::CreateFreeThreaded. GPU-backed, no GDI.
- Built-in OCR: Windows.Media.Ocr.OcrEngine. Ships with every Windows 10/11 install. No Tesseract, no training data, no external engine.
- Smart chrome/content separation: uses DPI-aware window metrics to automatically strip title bars, menus, and toolbars from OCR output. Only actual content is extracted by default.
- Stealth minimized/hidden window capture: automatically restores minimized or hidden windows off-screen (-8000,-8000), captures the frame, then puts them back. No focus change, no visible animation. The user sees nothing.
- Per-app region profiles: define custom named regions (JSON) for any GUI app. Skip sidebars, extract only chat messages, ignore ads. Profiles use percentage-based coordinates and adapt to any window size.
- Visual profile editor: --build-profile opens a dim overlay on the live target window: drag rectangles, resize handles, name them, pick content/chrome/ignore, then press Enter to save JSON.
- Multi-language OCR: supports any language Windows has a pack for (English, Arabic, Chinese, Russian, etc.). RTL and CJK handled natively.
- Interactive TUI picker: ANSI-colored, two-tab (Visible/Hidden), arrow-key navigated, type-to-filter window picker.
- Zero external dependencies: everything comes from the Windows SDK. Single silentlens.exe, no DLLs, no vcpkg, no NuGet.
- Silent capture: disables the yellow capture border and cursor on supported builds (Win10 21H1+ / Win11).
- Loop mode: --loop N captures every N seconds.
Two tabs: Visible (real apps) and Hidden (background windows). Navigate with arrow keys, type to filter, Tab to switch tabs.
The target (Notepad) is minimized. SilentLens stealth-restores it off-screen, captures the frame, runs OCR, and re-minimizes it — all without interrupting the user.
--scan shows every OCR line with bounding box coordinates (percentages). Use these to build a profile JSON.
--build-profile opens a transparent overlay on the target window. Draw regions, name them, choose their type, and save as JSON.
- Windows 10 2004+ (build 19041) or Windows 11, x64
- A real GPU (WGC doesn't work on VMs without GPU passthrough)
- Visual Studio 2022 17.8+ with "Desktop development with C++"
- CMake 3.21+ (bundled with VS 2022)
- An OCR language pack installed (en-US ships by default)
cd SilentLens
# Option 1: one-click VS open
.\Open-in-VS.cmd
# Option 2: CLI
cmake --preset vs2022-x64
cmake --build build/vs2022-x64 --config Release
# Interactive picker (no arguments)
.\silentlens.exe
# Capture by title substring
.\silentlens.exe --title "Slack"
# Capture by HWND
.\silentlens.exe --hwnd 0x1A2B3C
# Loop mode (capture every 5 seconds)
.\silentlens.exe --title "Discord" --loop 5
# Specify OCR language
.\silentlens.exe --title "Teams" --lang ar-SA
# List available OCR languages
.\silentlens.exe --list-langs
# Stealth is on by default; opt out with:
.\silentlens.exe --title "Slack" --no-stealth
# Build a profile visually
.\silentlens.exe --title "Discord" --build-profile profiles\discord.json
# Capture using a saved profile
.\silentlens.exe --title "Discord" --profile profiles\discord.json
Target:
  --title <substring>      Capture window whose title contains <substring>
  --hwnd <hex>             Capture window by HWND (e.g. 0x1A2B3C)
Output:
  --output <path>          Save OCR text to file (UTF-8)
  --raw                    Raw output (no chrome/content separation)
  --chrome                 Include chrome (title bar, menus) in output
  --lang <tag>             OCR language (e.g. en-US, ar-SA, zh-CN, ru-RU)
  --list-langs             Show available OCR languages and exit
Profiles:
  --profile <path.json>    Load a region profile for structured extraction
  --scan                   Show all OCR lines with coordinates
  --save-profile <path>    Save auto-detected layout as a profile JSON
  --build-profile <path>   Visual overlay editor (draw regions, save JSON)
Advanced:
  --stealth                [default] Capture hidden/minimized windows silently
  --no-stealth             Refuse to touch hidden or minimized windows
  --loop <seconds>         Repeat capture every N seconds
  --help                   Show usage
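As an illustration of the --hwnd option's hex format, here is a minimal sketch of how such an argument could be validated. The helper name parse_hwnd_arg is hypothetical and is not taken from the SilentLens source. The sketch only shows one cautious way to turn a string like 0x1A2B3C into an integer handle value, rejecting empty, signed, zero, and partially numeric input.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <stdexcept>
#include <string>

// Hypothetical helper (not from the SilentLens source): parse a --hwnd
// argument such as "0x1A2B3C" or "1a2b3c" into an integer handle value.
// Returns std::nullopt for empty, signed, zero, or malformed input.
std::optional<std::uintptr_t> parse_hwnd_arg(const std::string& text) {
    // Reject empty strings, leading whitespace, and sign characters,
    // all of which std::stoull would otherwise accept.
    if (text.empty() || !std::isxdigit(static_cast<unsigned char>(text[0]))) {
        return std::nullopt;
    }
    try {
        std::size_t consumed = 0;
        // Base 16 accepts an optional "0x" / "0X" prefix.
        unsigned long long value = std::stoull(text, &consumed, 16);
        // Require the whole string to be consumed and a non-null handle.
        if (consumed != text.size() || value == 0) {
            return std::nullopt;
        }
        return static_cast<std::uintptr_t>(value);
    } catch (const std::exception&) {
        // std::invalid_argument or std::out_of_range from std::stoull.
        return std::nullopt;
    }
}
```

A caller would typically check the returned optional and fall back to the usage text when it is empty.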
By default SilentLens uses DPI-aware window metrics to separate UI
chrome (title bar, menu bar) from actual application content. Only
[Content] is shown. Pass --chrome to include chrome, or --raw
to dump everything flat.
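The chrome/content split described above can be sketched as a simple band test. This is an assumption about the general approach, not the actual logic in ocr.cpp: the OcrLine struct, the function names, and the idea of passing in a precomputed chrome height in pixels (for example, title bar plus menu bar at the window's DPI) are all illustrative.

```cpp
#include <string>
#include <vector>

// Illustrative types; the real SilentLens structures may differ.
struct OcrLine {
    std::string text;
    double top_px;     // top edge of the line's bounding box, in pixels
    double height_px;  // height of the bounding box, in pixels
};

enum class LineKind { Chrome, Content };

// Classify a line by whether its vertical center falls inside the chrome
// band at the top of the window. chrome_height_px is assumed to be
// supplied by the caller from DPI-aware window metrics.
LineKind classify_by_chrome_band(const OcrLine& line, double chrome_height_px) {
    const double center_y = line.top_px + line.height_px / 2.0;
    return center_y < chrome_height_px ? LineKind::Chrome : LineKind::Content;
}

// Default behaviour sketch: keep only the text of content lines.
std::vector<std::string> content_only(const std::vector<OcrLine>& lines,
                                      double chrome_height_px) {
    std::vector<std::string> out;
    for (const OcrLine& line : lines) {
        if (classify_by_chrome_band(line, chrome_height_px) == LineKind::Content) {
            out.push_back(line.text);
        }
    }
    return out;
}
```

Under this sketch, --chrome would correspond to keeping both kinds with a label, and --raw to skipping the classification entirely.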
You can author profiles manually or use --scan + --save-profile:
{
  "name": "Slack",
  "match_title": "Slack",
  "regions": [
    { "name": "sidebar",  "type": "ignore",  "x": 0.0,  "y": 0.0,  "w": 0.18, "h": 1.0 },
    { "name": "toolbar",  "type": "chrome",  "x": 0.18, "y": 0.0,  "w": 0.82, "h": 0.06 },
    { "name": "messages", "type": "content", "x": 0.18, "y": 0.06, "w": 0.82, "h": 0.88 },
    { "name": "input",    "type": "content", "x": 0.18, "y": 0.94, "w": 0.82, "h": 0.06 }
  ]
}
Region types: content (always shown), chrome (shown with --chrome),
ignore (skipped entirely). Coordinates are fractions of the window's
width and height (0.0-1.0), so profiles adapt to any window size.
Sample profiles for Slack, Discord, and Teams are included in profiles/.
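To make the fractional coordinates concrete, the following is a minimal sketch of how a region could be mapped to pixels and how an OCR line could be assigned to a region by its center point. The types and function names (Region, PixelRect, to_pixels, find_region) are illustrative assumptions and are not taken from profile.cpp. The first-match rule for overlapping regions is also an assumption.

```cpp
#include <cmath>
#include <string>
#include <vector>

// Illustrative types mirroring the JSON fields shown in the profile.
enum class RegionType { Content, Chrome, Ignore };

struct Region {
    std::string name;
    RegionType type;
    double x, y, w, h;  // fractions of window width/height, 0.0 to 1.0
};

struct PixelRect {
    int left, top, width, height;
};

// Scale a fractional region to pixel coordinates for a given window size.
PixelRect to_pixels(const Region& r, int window_w, int window_h) {
    return PixelRect{
        static_cast<int>(std::lround(r.x * window_w)),
        static_cast<int>(std::lround(r.y * window_h)),
        static_cast<int>(std::lround(r.w * window_w)),
        static_cast<int>(std::lround(r.h * window_h)),
    };
}

// Return the first region containing the normalized point (nx, ny), or
// nullptr if none does. Intervals are half-open so that adjacent regions
// do not both claim a point on their shared edge.
const Region* find_region(const std::vector<Region>& regions, double nx, double ny) {
    for (const Region& r : regions) {
        const bool inside_x = nx >= r.x && nx < r.x + r.w;
        const bool inside_y = ny >= r.y && ny < r.y + r.h;
        if (inside_x && inside_y) {
            return &r;
        }
    }
    return nullptr;
}
```

With the Slack profile above and a 1000x800 window, the sidebar region covers the leftmost 180 pixels, and a line centered in the middle of the window falls in the messages region.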
| Windows version | Capture | Silent (no border) | OCR |
|---|---|---|---|
| Win10 1803-1903 | Yes | No (yellow border visible) | Yes |
| Win10 2004-20H2 | Yes | Mouse pointer hidden; border may show | Yes |
| Win10 21H1+ | Yes | Yes | Yes |
| Win11 21H2-24H2 | Yes | Yes | Yes |
Note: WGC requires a real GPU with DXGI 1.2+ support. VMs without
GPU passthrough (e.g. basic Windows Server VPS) will get
E_NOINTERFACE. Use a physical machine or a VM with GPU passthrough.
| Application | Framework | Result |
|---|---|---|
| Slack | Electron | Full text extracted |
| Discord | Electron | Full text extracted |
| VS Code | Electron | Full text extracted |
| Windows Terminal | Win32 + XAML Islands | Full text extracted |
| Notepad | Win32 | Full text extracted |
| Signal Desktop | Electron + WDA_EXCLUDEFROMCAPTURE | Black frame (detected + warned) |
silentlens.exe
  main.cpp      arg parse, window resolve, capture, OCR, stdout
  window.cpp    EnumWindows, filtering, TUI picker, stealth show/hide
  overlay.cpp   GDI+ visual region editor (--build-profile)
  capture.cpp   WinRT GraphicsCapture session, frame acquisition
  ocr.cpp       WinRT OcrEngine, spatial classification, formatting
  profile.cpp   JSON region profiles, load/save, point-in-region
  util.cpp      D3D11 device creation, WinRT interop, texture to bitmap
All WinRT calls go through RoActivateInstance / RoGetActivationFactory
into runtime class DLLs. No classic Win32 capture or network surface is
touched.
- Network exfil via Windows.Web.Http.HttpClient (pure WinRT, no WinHTTP)
- UIAutomation fallback for display-affinity-protected windows
- Differential capture (only OCR frames that changed)
- Detection playbook (ETW providers, YARA rule, Sigma rule)
SilentLens is a security research tool for understanding EDR telemetry gaps around the WinRT activation surface. Use only on systems you own or are authorized to test.
Do not use SilentLens for unauthorized surveillance, credential theft, or any activity that violates applicable law or organizational policy.
The author publishes this tool to advance defensive understanding of undermonitored API surfaces — not to enable offense without accountability.
- MorphKatz — x64 polymorphic PE rewriter (C++20)
- MalEmu — Windows malware analysis toolkit with Unicorn-powered emulation (C++)
Apache-2.0. See LICENSE.
