ScreenGrab AI

AI-powered screenshot capture and analysis tool that runs locally on your machine.

Capture screenshots, extract text with OCR, and get AI-powered insights—without sending your data to the cloud.

✨ Features

Multiple Capture Modes
- 📸 Visible Tab — Capture what's currently on screen
- 📄 Full Page — Auto-scrolls and stitches the entire page
- ✂️ Area Selection — Draw a rectangle to capture a specific region
AI-Powered Analysis
- OCR text extraction from screenshots
- Contextual understanding and analysis
- Follow-up questions for deeper exploration
Privacy-First Architecture
- Run AI models locally via Ollama; works completely offline
- No data leaves your device when you use local Ollama models
- Optional cloud providers: OpenAI, Grok, Google Gemini, Google Cloud Vision
Redirect Mode
- Open screenshots directly in ChatGPT or Grok's web interface
- Image automatically copied to clipboard
- Auto-paste functionality for seamless workflow
User-Friendly
- Floating capture icon on every page for quick access
- Floating progress indicator during analysis
- Markdown-formatted results with tables, code blocks, and structured output
- In-page result display with follow-up question support

📸 Screenshots

[Coming soon - Demo of capture modes, UI, and AI analysis results]

🚀 Installation

Option 1: Install from Chrome Web Store (Coming Soon)

Will be available once published.

Option 2: Manual Installation (Recommended for Development)

1. Download or clone this repository

   git clone https://github.com/nuelcyoung/screengrab.git
   cd screengrab

2. Open Chrome extension management
   - In Google Chrome, navigate to chrome://extensions/
   - Or: Chrome Menu (⋮) → Extensions → Manage Extensions (More Tools → Extensions in older Chrome versions)
3. Enable Developer Mode
   - Toggle the "Developer mode" switch in the top-right corner
4. Load the extension
   - Click the "Load unpacked" button
   - Select the screengrab folder (the folder containing manifest.json)
5. Verify installation
   - You should see "ScreenGrab AI" in your extensions list
   - The extension icon will appear in your Chrome toolbar

📖 Usage

Basic Workflow

1. Open any webpage you want to capture
2. Click the extension icon or use the floating icon on the page
3. Choose a capture mode:
   - Visible: capture what you see
   - Full Page: capture the entire page (auto-scrolls)
   - Select Area: draw a rectangle around what you want
4. Wait for AI analysis: a floating progress indicator shows real-time status
5. View results: extracted text and AI insights are displayed in-page, with follow-up question support
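
As a rough illustration of the Full Page mode, the sketch below computes the scroll positions a capture loop could visit before stitching. `computeScrollOffsets` is a hypothetical helper, not a function from this repository, and it assumes the page height does not change while scrolling.

```javascript
// Hypothetical helper: scrollY positions (in CSS pixels) needed to cover a
// page of `pageHeight` with a viewport of `viewportHeight`.
function computeScrollOffsets(pageHeight, viewportHeight) {
  if (!(pageHeight > 0) || !(viewportHeight > 0)) {
    throw new RangeError("pageHeight and viewportHeight must be positive");
  }
  const maxScroll = Math.max(0, pageHeight - viewportHeight);
  const offsets = [];
  for (let y = 0; y < maxScroll; y += viewportHeight) {
    offsets.push(y);
  }
  // The final capture is clamped to the bottom of the page, so it may overlap
  // the previous tile; a stitcher would draw each tile at its own offset.
  offsets.push(maxScroll);
  return offsets;
}

console.log(computeScrollOffsets(2500, 1000)); // [ 0, 1000, 1500 ]
```

Each offset would be followed by a visible-tab capture (for example with chrome.tabs.captureVisibleTab), and each tile drawn onto a canvas at its offset multiplied by window.devicePixelRatio.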

Follow-Up Questions

After capturing, you can ask follow-up questions about the content:

- "Summarize the key points"
- "What design patterns are mentioned?"
- "Extract all code examples"
- "Explain this section in simpler terms"

Redirect Mode

Enable Redirect Mode in Settings to hand screenshots off to a provider's web interface:

- Taking a screenshot opens the provider's website (ChatGPT or Grok) in a new tab
- The image is copied to your clipboard, and the extension attempts to paste it into the chat interface
- No API keys are required; your existing logged-in session is used

Note: When Redirect Mode is enabled, no analysis result panel is shown in-page since the provider handles analysis in their web interface.

⚙️ Configuration

Local AI (Ollama) — Recommended for Privacy

1. Install Ollama
   - Download from ollama.com
   - Works on macOS, Linux, and Windows (native support for Apple Silicon)

2. Pull Vision Model (for OCR)

   ollama pull qwen3-vl:4b
   # or any other vision model, such as llava, minicpm-v, or moondream

3. Pull Text Model (for analysis)

   ollama pull qwen3-coder:480b-cloud
   # Tags ending in -cloud run on Ollama's cloud service, not on your machine.
   # For fully local analysis, pull a local model instead, such as llama3, mistral, codellama, or deepseek-coder.

4. Configure Extension
   - Click the extension icon → Settings (⚙️)
   - Select Vision Provider: Ollama (Local)
   - Select Text Provider: Ollama (Local)
   - Choose your models from the dropdown

Tip: Ollama model names are case-sensitive. Use exact names as shown in ollama list.
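
To check the exact model names programmatically, Ollama's local HTTP API lists installed models at /api/tags. The following is a minimal sketch assuming the default port 11434; `modelNamesFromTags` and `listOllamaModels` are illustrative names, not functions from this repository.

```javascript
const OLLAMA_BASE_URL = "http://localhost:11434"; // Ollama's default address

// Extract model names from an /api/tags response body.
function modelNamesFromTags(tagsResponse) {
  const models = Array.isArray(tagsResponse?.models) ? tagsResponse.models : [];
  return models.map((m) => m.name).filter((name) => typeof name === "string");
}

// Fetch the installed model names. `fetchImpl` can be swapped for a stub.
async function listOllamaModels(fetchImpl = fetch) {
  const res = await fetchImpl(`${OLLAMA_BASE_URL}/api/tags`);
  if (!res.ok) throw new Error(`Ollama responded with HTTP ${res.status}`);
  return modelNamesFromTags(await res.json());
}
```

The returned names can be used verbatim as the `model` value in later requests, which avoids case or tag mismatches.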

Cloud AI Providers (Optional)

If you prefer cloud-based models, configure these in Settings:

| Provider | Use Case | Get API Key |
| --- | --- | --- |
| Ollama Cloud | Vision + Text | ollama.com |
| OpenAI | Vision + Text | platform.openai.com |
| Grok (xAI) | Vision + Text | console.x.ai |
| Google Gemini | Vision + Text | ai.google.dev |
| Google Cloud Vision | OCR (vision only) | Google Cloud Console |

API Key Security:

- All keys are stored locally in your browser via chrome.storage.local
- Keys are only sent to the respective provider's API endpoints
- No data is routed through third-party intermediaries or developer servers
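
A minimal sketch of storing and reading a key through chrome.storage.local follows. The `apiKey:<provider>` naming scheme and both function names are hypothetical; the storage area is passed in as a parameter so the same code can run against a stub outside Chrome.

```javascript
// Hypothetical naming scheme: one storage entry per provider.
const keyName = (provider) => `apiKey:${provider}`;

// `storage` is expected to behave like chrome.storage.local, whose get/set
// return promises in Manifest V3.
async function saveApiKey(storage, provider, apiKey) {
  await storage.set({ [keyName(provider)]: apiKey.trim() });
}

// Resolves to the stored key, or null if none has been saved.
async function loadApiKey(storage, provider) {
  const name = keyName(provider);
  const items = await storage.get(name);
  return items[name] ?? null;
}
```

Inside the extension this would be called as `await saveApiKey(chrome.storage.local, "openai", key)`.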

Redirect Mode Setup

1. Enable "Redirect Mode" in Settings
2. Select OpenAI or Grok as your Vision Provider (web providers only)
3. Take a screenshot; the provider's website opens automatically with the image in your clipboard
4. The extension attempts to paste the image into the chat interface; if that does not work, paste manually (Ctrl/Cmd+V)

Note: Redirect Mode works with your existing browser session — no API keys needed. Just make sure you're already logged in to the provider's website.
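
The hand-off described above can be sketched as follows. This is an illustration under stated assumptions, not the extension's actual code: the provider URLs and the `redirectWithScreenshot` name are hypothetical, and the browser APIs are passed in through `env` so the function can be exercised outside Chrome.

```javascript
// Hypothetical provider URLs; the extension may open different pages.
const REDIRECT_URLS = {
  openai: "https://chatgpt.com/",
  grok: "https://grok.com/",
};

// Copies a PNG blob to the clipboard, then opens the provider in a new tab.
// `env` supplies ClipboardItem, a clipboard object, and a tabs object.
async function redirectWithScreenshot(pngBlob, provider, env) {
  const url = REDIRECT_URLS[provider];
  if (!url) throw new Error(`Redirect Mode does not support "${provider}"`);
  const item = new env.ClipboardItem({ "image/png": pngBlob });
  await env.clipboard.write([item]);
  await env.tabs.create({ url });
  return url;
}
```

In an extension page such as the popup, `env` could be `{ ClipboardItem, clipboard: navigator.clipboard, tabs: chrome.tabs }`. The Clipboard API is not available in a service worker and chrome.tabs is not available in content scripts, so elsewhere the two calls would need to be split across contexts.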

🔒 Privacy & Security

What Gets Stored

- Screenshots: held temporarily in browser memory during processing, then discarded
- API keys: stored locally in chrome.storage.local (this storage area is not encrypted by Chrome, so anyone with access to your browser profile can read them)
- User settings: stored locally on your device
- Conversation history: held temporarily in in-page memory for follow-up questions
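
An in-memory conversation history of the kind mentioned above could be kept along these lines. This is a sketch only: `createConversation`, the system prompt, and the role/content message shape (common to chat-style APIs) are assumptions, not details taken from this repository.

```javascript
// Hypothetical in-memory conversation; nothing is written to storage, so the
// history disappears when the page is closed or reloaded.
function createConversation(extractedText) {
  const messages = [
    { role: "system", content: "Answer questions about the captured text." },
    { role: "user", content: `Captured text:\n${extractedText}` },
  ];
  return {
    // Record a follow-up question and return a copy of the history to send.
    ask(question) {
      messages.push({ role: "user", content: question });
      return messages.slice();
    },
    // Record the model's reply so later questions keep their context.
    addReply(text) {
      messages.push({ role: "assistant", content: text });
    },
  };
}
```

For example, `createConversation(ocrText).ask("Summarize the key points")` returns the three messages a provider would need to answer the first follow-up.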

What Gets Shared

- With Ollama (Local): Nothing. All processing happens on your machine via localhost:11434.
- With Ollama Cloud: Screenshot image sent directly to ollama.com API endpoints.
- With Cloud Providers: Only the screenshot image data for analysis. No metadata, browsing history, or user identifiers.
- With Redirect Mode: Screenshot is copied to your clipboard and the provider's website opens in a new tab. The extension doesn't send any data directly.
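
To make the local case concrete, here is a sketch of what an OCR request to Ollama's /api/generate endpoint on localhost could look like. The function name and the prompt text are assumptions; the endpoint and the `model`, `prompt`, `images`, and `stream` fields are part of Ollama's HTTP API.

```javascript
// Build a request for Ollama's /api/generate endpoint from a PNG data URL.
function buildOllamaOcrRequest(model, imageDataUrl) {
  // Ollama expects raw base64, so strip the "data:image/png;base64," prefix.
  const base64 = imageDataUrl.split(",")[1];
  if (!base64) throw new Error("Expected a base64 data URL");
  return {
    url: "http://localhost:11434/api/generate",
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({
        model,
        prompt: "Extract all text from this image.", // hypothetical prompt
        images: [base64],
        stream: false, // ask for a single JSON reply instead of a stream
      }),
    },
  };
}
```

Usage would be `const { url, init } = buildOllamaOcrRequest("qwen3-vl:4b", dataUrl); const res = await fetch(url, init);`, after which the generated text is in the `response` field of the JSON reply. The only host contacted is localhost.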

What Does NOT Get Shared

❌ No analytics or tracking
❌ No user identification or telemetry
❌ No data sent to developer servers
❌ No browsing history or tab contents (only captured screenshot)
❌ No data shared with third parties beyond your chosen AI provider

Compliance Notes

- SSN patterns are automatically redacted from extracted text
- Suitable for confidential documents, legal materials, and sensitive content (when using local Ollama)
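
The exact redaction rule is not documented here. As an illustration only, a pattern-based redaction of US Social Security numbers in the common 3-2-4 grouping might look like this; the pattern, the function name, and the replacement label are all assumptions.

```javascript
// Matches 3-2-4 digit groups separated by hyphens or spaces, e.g. 123-45-6789.
// Word boundaries keep longer digit runs and phone numbers from matching.
const SSN_PATTERN = /\b\d{3}[- ]\d{2}[- ]\d{4}\b/g;

// Replace every SSN-shaped match in OCR output with a fixed label.
function redactSsns(text) {
  return text.replace(SSN_PATTERN, "[REDACTED SSN]");
}

console.log(redactSsns("SSN: 123-45-6789, phone: 555-123-4567"));
// SSN: [REDACTED SSN], phone: 555-123-4567
```

A pattern like this catches only one formatting of the number; unformatted nine-digit runs are left alone because they cannot be told apart from other identifiers.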

🛠️ Tech Stack

Frontend: Vanilla JavaScript, HTML5, CSS3
Extension: Chrome Manifest V3
AI Integration:
- Ollama (local inference)
- OpenAI API
- Grok API (xAI)
- Google Gemini API
- Google Cloud Vision API

🏗️ Development

Project Structure

screengrab/
├── manifest.json          # Extension configuration
├── background.js          # Service worker
├── popup.js/html          # Extension popup UI
├── options.js/html        # Settings page
├── ai-service.js          # AI API client
├── ai-service-multimodal.js  # Multimodal + redirect mode
├── capture-queue.js       # Capture state management
├── selector.js            # Area selection UI
├── result-display.js      # Results display component
├── floating-icon.js       # Floating capture button
├── progress-indicator.js  # Progress UI
└── utils.js               # Helper functions

Building for Production

1. Update the version in manifest.json
2. Test all capture modes and AI providers
3. Package the extension folder
4. Submit to the Chrome Web Store

🔄 Version History

Version 2.1.0 (March 2026)

- Fixed: Area selection now works on first click (no double-click required)
- Fixed: Ollama models now properly load and display in the model dropdown
- Fixed: Follow-up questions now render markdown correctly
- Improved: Area selection overlay now has a 400 ms grace period to prevent accidental clicks
- Improved: Storage-based queue architecture for more reliable capture processing
- Improved: Floating progress indicator with real-time status updates
- Changed: Redirect Mode description clarified: "Opens the AI provider's website... Uses your own logged-in session"

Version 2.0.0 (March 2026)

- Added: Grok (xAI) provider support
- Removed: Anthropic (Claude) provider
- Fixed: Redirect Mode no longer shows the Analysis Result panel
- Improved: Auto-paste functionality for Grok
- Changed: Default redirect provider now follows the selected API provider

Version 1.1.0

Initial public release

🤝 Contributing

Contributions are welcome! Here's how to help:

1. Fork the repository
2. Create a feature branch (git checkout -b feature/amazing-feature)
3. Commit your changes (git commit -m 'Add amazing feature')
4. Push to the branch (git push origin feature/amazing-feature)
5. Open a Pull Request

Development Ideas

📝 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

- Ollama: local AI inference
- Chrome Extension Documentation
- The open-source AI community

📧 Contact

Issues: GitHub Issues
Discussions: GitHub Discussions

Built with ❤️ for privacy-first AI

If you find this useful, consider ⭐ starring the repository!
