AI-powered screenshot capture and analysis tool that runs locally on your machine.
Capture screenshots, extract text with OCR, and get AI-powered insights without sending your data to the cloud.
- Multiple Capture Modes
  - Visible Tab: Capture what's currently on screen
  - Full Page: Auto-scrolls and stitches the entire page
  - Area Selection: Draw a rectangle to capture a specific region
- AI-Powered Analysis
  - OCR text extraction from screenshots
  - Contextual understanding and analysis
  - Follow-up questions for deeper exploration
- Privacy-First Architecture
  - Run AI models locally via Ollama; works completely offline
  - No data leaves your device when you use local models
  - Optional cloud providers: OpenAI, Grok, Google Gemini, Google Cloud Vision
- Redirect Mode
  - Open screenshots directly in ChatGPT or Grok's web interface
  - Image automatically copied to clipboard
  - Auto-paste functionality for seamless workflow
- User-Friendly
  - Floating capture icon on every page for quick access
  - Floating progress indicator during analysis
  - Markdown-formatted results with tables, code blocks, and structured output
  - In-page result display with follow-up question support
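For the Area Selection mode listed above, a drag can start from any corner, so the two pointer positions have to be normalized into a rectangle before cropping. The following is only a sketch: `toCropRect` and its parameters are hypothetical names (not taken from `selector.js`), and it assumes the captured image is `devicePixelRatio` times larger than the CSS viewport.

```javascript
// Hypothetical helper (not from the extension source): convert two drag
// points in CSS pixels into a crop rectangle in image pixels.
function toCropRect(start, end, devicePixelRatio = 1) {
  // The drag may go in any direction, so take min/abs instead of
  // assuming `start` is the top-left corner.
  const left = Math.min(start.x, end.x);
  const top = Math.min(start.y, end.y);
  const width = Math.abs(end.x - start.x);
  const height = Math.abs(end.y - start.y);

  // Assumption: the screenshot bitmap is scaled by devicePixelRatio.
  return {
    x: Math.round(left * devicePixelRatio),
    y: Math.round(top * devicePixelRatio),
    width: Math.round(width * devicePixelRatio),
    height: Math.round(height * devicePixelRatio),
  };
}
```

The resulting rectangle could then be passed to a canvas `drawImage` call to cut the region out of the visible-tab screenshot.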
[Coming soon - Demo of capture modes, UI, and AI analysis results]
A Chrome Web Store listing will be available once the extension is published. Until then, install it manually:
- Download or Clone this Repository
      git clone https://github.com/nuelcyoung/screengrab.git
      cd screengrab
- Open Chrome Extension Management
  - In Google Chrome, navigate to `chrome://extensions/`
  - Or: Chrome Menu (⋮) → More Tools → Extensions
- Enable Developer Mode
  - Toggle the "Developer mode" switch in the top-right corner
- Load the Extension
  - Click the "Load unpacked" button
  - Select the `screengrab` folder (the folder containing `manifest.json`)
- Verify Installation
  - You should see "ScreenGrab AI" in your extensions list
  - The extension icon will appear in your Chrome toolbar
- Open any webpage you want to capture
- Click the extension icon or use the floating icon on the page
- Choose a capture mode:
  - Visible: Capture what you see
  - Full Page: Capture the entire page (auto-scrolls)
  - Select Area: Draw a rectangle around what you want
- Wait for AI analysis: a floating progress indicator shows real-time status
- View results: text extraction and AI insights are displayed in-page, with follow-up question support
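To make the Full Page mode more concrete: auto-scrolling amounts to visiting a series of scroll offsets, capturing the viewport at each one, and drawing each tile at its offset on a tall canvas. The sketch below only computes the offsets. `computeScrollStops` is a hypothetical name, and the extension's actual scrolling logic may differ.

```javascript
// Hypothetical helper: list the scrollY offsets needed to cover a page
// of `pageHeight` pixels with a viewport of `viewportHeight` pixels.
function computeScrollStops(pageHeight, viewportHeight) {
  if (viewportHeight <= 0) {
    throw new RangeError("viewportHeight must be positive");
  }
  // The page cannot scroll further than this offset.
  const maxScroll = Math.max(0, pageHeight - viewportHeight);

  const stops = [];
  for (let y = 0; y < maxScroll; y += viewportHeight) {
    stops.push(y);
  }
  // The last tile is aligned to the bottom of the page; it may overlap
  // the previous tile, which is harmless when tiles are drawn at their
  // own scroll offset.
  stops.push(maxScroll);
  return stops;
}
```

For a 2500 px page and a 1000 px viewport this yields the offsets 0, 1000, and 1500, so the final capture overlaps the second one by 500 px.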
After capturing, you can ask follow-up questions about the content:
- "Summarize the key points"
- "What design patterns are mentioned?"
- "Extract all code examples"
- "Explain this section in simpler terms"
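Follow-up questions like these only work if earlier turns are sent along with the new one. As a rough sketch, assuming a local Ollama text model and its `/api/chat` endpoint, the request body could be assembled as below. `buildFollowUpRequest` and the system prompt wording are illustrative assumptions, not the extension's actual code.

```javascript
// Hypothetical helper: build a request body for Ollama's POST /api/chat
// that includes the extracted text and the conversation so far.
function buildFollowUpRequest(model, extractedText, history, question) {
  return {
    model,
    stream: false, // ask for a single JSON response instead of a stream
    messages: [
      {
        role: "system",
        content: "Answer questions about this screenshot text:\n" + extractedText,
      },
      // Earlier { role: "user" | "assistant", content } turns, oldest first.
      ...history,
      { role: "user", content: question },
    ],
  };
}
```

The body would be sent with `fetch("http://localhost:11434/api/chat", { method: "POST", body: JSON.stringify(body) })`, and the assistant's reply appended to `history` before the next question.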
Enable Redirect Mode in Settings to:
- Take a screenshot and automatically open the provider's website (OpenAI ChatGPT or Grok) with the image in your clipboard
- No API keys required; it uses your existing logged-in session
- The image is copied to the clipboard automatically, and the extension attempts to paste it into the chat interface
Note: When Redirect Mode is enabled, no analysis result panel is shown in-page since the provider handles analysis in their web interface.
- Install Ollama
  - Download from ollama.com
  - Works on macOS, Linux, and Windows (native support for Apple Silicon)
- Pull Vision Model (for OCR)
      ollama pull qwen3-vl:4b   # or any other vision model like llava, minicpm-v, moondream
- Pull Text Model (for analysis)
      ollama pull qwen3-coder:480b-cloud   # or llama3, mistral, codellama, deepseek-coder, etc.
- Configure Extension
  - Click extension icon → Settings (⚙️)
  - Select Vision Provider: Ollama (Local)
  - Select Text Provider: Ollama (Local)
  - Choose your models from the dropdown
Tip: Ollama model names are case-sensitive. Use the exact names shown by `ollama list`.
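One way to catch a mistyped model name early is to compare it against what Ollama reports as installed. Ollama's `GET /api/tags` endpoint returns a JSON object with a `models` array whose entries carry a `name` field such as `llava:latest`. The helper below is a hypothetical sketch of an exact-match check on that response, not code from the extension.

```javascript
// Hypothetical helper: check a model name against the parsed JSON
// returned by Ollama's GET /api/tags.
function findInstalledModel(tagsResponse, wanted) {
  const names = (tagsResponse.models || []).map((m) => m.name);

  // Exact comparison, matching the tip above.
  if (names.includes(wanted)) {
    return { found: true, name: wanted };
  }

  // Offer a hint when the only difference is letter case.
  const nearMiss = names.find(
    (n) => n.toLowerCase() === wanted.toLowerCase()
  );
  return { found: false, suggestion: nearMiss ?? null };
}
```

In practice the input would come from `await (await fetch("http://localhost:11434/api/tags")).json()`. Note that reported names include the tag, so `llava` alone does not match `llava:latest` in this sketch.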
If you prefer cloud-based models, configure these in Settings:
| Provider | Use Case | Get API Key |
|---|---|---|
| Ollama Cloud | Vision + Text | ollama.com |
| OpenAI | Vision + Text | platform.openai.com |
| Grok (xAI) | Vision + Text | console.x.ai |
| Google Gemini | Vision + Text | ai.google.dev |
| Google Cloud Vision | OCR (vision only) | Google Cloud Console |
API Key Security:
- All keys are stored locally in your browser via `chrome.storage.local`
- Keys are only sent to the respective provider's API endpoints
- No data is routed through third-party intermediaries or developer servers
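The "keys are only sent to the respective provider" guarantee can be expressed as a host allowlist that is checked before a key is attached to a request. The sketch below is an assumption about how such a guard might look. The provider identifiers and `isProviderEndpoint` are hypothetical, and the hostnames are the providers' public API hosts rather than values read from the extension.

```javascript
// Assumed mapping from a provider id (hypothetical ids) to its API host.
const PROVIDER_HOSTS = new Map([
  ["openai", "api.openai.com"],
  ["grok", "api.x.ai"],
  ["gemini", "generativelanguage.googleapis.com"],
  ["cloudVision", "vision.googleapis.com"],
  ["ollamaCloud", "ollama.com"],
]);

// Return true only if `url` is an HTTPS URL on the provider's own host.
function isProviderEndpoint(provider, url) {
  const expectedHost = PROVIDER_HOSTS.get(provider);
  if (!expectedHost) return false;

  let parsed;
  try {
    parsed = new URL(url);
  } catch {
    return false; // not a valid absolute URL
  }
  // Compare the full hostname so look-alike domains such as
  // "api.openai.com.evil.example" are rejected.
  return parsed.protocol === "https:" && parsed.hostname === expectedHost;
}
```

A caller would refuse to add the `Authorization` header (or API key parameter) whenever this check returns `false`.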
- Enable "Redirect Mode" in Settings
- Select OpenAI or Grok as your Vision Provider (web providers only)
- Take a screenshot; the provider's website opens automatically with the image in your clipboard
- The extension attempts to auto-paste the image into the chat interface; if that fails, paste manually (Ctrl/Cmd+V)
Note: Redirect Mode works with your existing browser session, so no API keys are needed. Just make sure you're already logged in to the provider's website.
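Putting an image on the clipboard generally means turning the screenshot's data URL into raw bytes first. The sketch below shows that decoding step under the assumption that the screenshot arrives as a base64 data URL, which is what `chrome.tabs.captureVisibleTab` produces. `decodeDataUrl` is a hypothetical helper, and the clipboard call is left as a comment because it only runs in a browser page.

```javascript
// Hypothetical helper: split a base64 data URL into its MIME type and bytes.
function decodeDataUrl(dataUrl) {
  const match = /^data:([^;,]+);base64,(.*)$/.exec(dataUrl);
  if (!match) {
    throw new Error("Expected a base64 data URL");
  }
  const [, mimeType, base64] = match;

  // atob yields a binary string; copy each char code into a byte array.
  const binary = atob(base64);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return { mimeType, bytes };
}

// Browser-only follow-up step (assumption, not run here):
//   const { mimeType, bytes } = decodeDataUrl(dataUrl);
//   const blob = new Blob([bytes], { type: mimeType });
//   await navigator.clipboard.write([new ClipboardItem({ [mimeType]: blob })]);
```

If the automatic paste does not succeed, the image is still on the clipboard, which is why a manual Ctrl/Cmd+V works as a fallback.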
- Screenshots: Held temporarily in browser memory during processing, then discarded
- API Keys: Stored locally in `chrome.storage.local` (kept on your device; note that Chrome does not encrypt this storage area)
- User Settings: Stored locally on your device
- Conversation History: Held temporarily in in-page memory for follow-up questions
- With Ollama (Local): Nothing leaves your device. All processing happens on your machine via `localhost:11434`.
- With Ollama Cloud: The screenshot image is sent directly to `ollama.com` API endpoints.
- With Cloud Providers: Only the screenshot image data is sent for analysis. No metadata, browsing history, or user identifiers.
- With Redirect Mode: The screenshot is copied to your clipboard and the provider's website opens in a new tab. The extension doesn't send any data directly.
- No analytics or tracking
- No user identification or telemetry
- No data sent to developer servers
- No browsing history or tab contents (only the captured screenshot)
- No data shared with third parties beyond your chosen AI provider
- SSN patterns are automatically redacted from extracted text
- Suitable for confidential documents, legal materials, and sensitive content (when using local Ollama)
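As an illustration of the SSN redaction mentioned above, a pattern-based pass over the extracted text might look like the sketch below. The exact pattern and placeholder the extension uses are not documented here, so the `NNN-NN-NNNN` regular expression, the `[REDACTED-SSN]` marker, and the name `redactSsn` are all assumptions.

```javascript
// Hypothetical helper: mask US Social Security number patterns
// (NNN-NN-NNNN) in OCR output.
function redactSsn(text) {
  // Word boundaries keep the pattern from matching inside longer
  // digit runs; phone numbers (NNN-NNN-NNNN) do not fit the 3-2-4 shape.
  return text.replace(/\b\d{3}-\d{2}-\d{4}\b/g, "[REDACTED-SSN]");
}
```

A pattern like this only catches the hyphenated form. Numbers written with spaces or without separators would need additional patterns.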
- Frontend: Vanilla JavaScript, HTML5, CSS3
- Extension: Chrome Manifest V3
- AI Integration:
  - Ollama (local inference)
  - OpenAI API
  - Grok API (xAI)
  - Google Gemini API
  - Google Cloud Vision API
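To give a feel for the local Ollama integration listed above, here is a minimal sketch of an OCR request against Ollama's `POST /api/generate` endpoint, which accepts base64-encoded images in an `images` array. The function name and prompt text are assumptions for illustration; `ai-service.js` may build its requests differently.

```javascript
// Hypothetical helper: build a fetch() request for a vision model
// running in a local Ollama instance.
function buildOllamaOcrRequest(model, screenshotDataUrl) {
  // Ollama expects raw base64, so strip any "data:image/...;base64," prefix.
  const base64 = screenshotDataUrl.replace(/^data:[^;,]+;base64,/, "");

  return {
    url: "http://localhost:11434/api/generate",
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({
        model,
        prompt: "Extract all text visible in this image.", // assumed prompt
        images: [base64],
        stream: false, // return one JSON object instead of a stream
      }),
    },
  };
}
```

A caller would run `const res = await fetch(url, init)` and read the generated text from the `response` field of `await res.json()`.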
screengrab/
├── manifest.json              # Extension configuration
├── background.js              # Service worker
├── popup.js/html              # Extension popup UI
├── options.js/html            # Settings page
├── ai-service.js              # AI API client
├── ai-service-multimodal.js   # Multimodal + redirect mode
├── capture-queue.js           # Capture state management
├── selector.js                # Area selection UI
├── result-display.js          # Results display component
├── floating-icon.js           # Floating capture button
├── progress-indicator.js      # Progress UI
└── utils.js                   # Helper functions
- Update version in `manifest.json`
- Test all capture modes and AI providers
- Package the extension folder
- Submit to Chrome Web Store
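The version bump in the first step can be scripted. The sketch below is a hypothetical helper, not part of the repository. It assumes the manifest's `version` is a dot-separated string of integers, as Chrome requires, and treats the first three parts as major, minor, and patch.

```javascript
// Hypothetical helper: bump the "version" field of a manifest.json string.
function bumpManifestVersion(manifestJson, part = "patch") {
  const manifest = JSON.parse(manifestJson);
  const nums = String(manifest.version).split(".").map(Number);
  while (nums.length < 3) nums.push(0); // pad "1.2" to "1.2.0"

  const index = { major: 0, minor: 1, patch: 2 }[part];
  if (index === undefined) {
    throw new Error(`Unknown version part: ${part}`);
  }

  nums[index] += 1;
  // Reset the less significant parts, e.g. 1.2.3 -> 1.3.0 for "minor".
  for (let i = index + 1; i < nums.length; i++) nums[i] = 0;

  manifest.version = nums.join(".");
  return JSON.stringify(manifest, null, 2);
}
```

From Node, the result could be written back with `fs.writeFileSync("manifest.json", bumpManifestVersion(fs.readFileSync("manifest.json", "utf8")))`.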
- Fixed: Area selection now works on first click (no double-click required)
- Fixed: Ollama models now properly load and display in the model dropdown
- Fixed: Follow-up questions now render markdown correctly
- Improved: Area selection overlay now has 400ms grace period to prevent accidental clicks
- Improved: Storage-based queue architecture for more reliable capture processing
- Improved: Floating progress indicator with real-time status updates
- Changed: Redirect Mode description clarified to "Opens the AI provider's website... Uses your own logged-in session"
- Added: Grok (xAI) provider support
- Removed: Anthropic (Claude) provider
- Fixed: Redirect mode no longer shows Analysis Result panel
- Improved: Auto-paste functionality for Grok
- Changed: Default redirect provider to use selected API provider
- Initial public release
Contributions are welcome! Here's how to help:
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
- Support for more AI providers
- Export results to PDF/Markdown
- Batch capture multiple areas
- Custom prompt templates
- Dark mode for results display
- Keyboard shortcuts
This project is licensed under the MIT License - see the LICENSE file for details.
- Ollama: Local AI inference
- Chrome Extension Documentation
- The open-source AI community
- Issues: GitHub Issues
- Discussions: GitHub Discussions
Built with ❤️ for privacy-first AI
If you find this useful, consider ⭐ starring the repository!