Ask questions about anything on your screen.
Gemini Lens is a desktop tool that lets you draw on your screen, capture it, and get AI-powered explanations about the content.
A demonstration of Gemini Lens analyzing a code snippet.
- Screen Overlay Canvas – Draw and annotate anywhere on your screen with a simple, transparent overlay.
- Multimodal AI Queries – Uses Google's Gemini Pro Vision to understand both your text prompts and screen captures.
- Context-Aware Analysis – Get explanations for code, text, images, or any visual element on your screen.
- Real-time Responses – AI-generated answers are streamed back with a typewriter effect for an interactive experience.
- Simple & Intuitive UI – A clean, minimal interface accessed via a right-click menu.
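The typewriter effect mentioned above can be approximated by revealing each streamed chunk a few characters at a time. The sketch below is toolkit-independent and is not taken from the project's source; `typewriter_frames` and its `step` parameter are hypothetical names used for illustration.

```python
from typing import Iterable, Iterator


def typewriter_frames(chunks: Iterable[str], step: int = 1) -> Iterator[str]:
    """Yield the text that should be visible after each 'keystroke'.

    `chunks` is any iterable of text pieces (for example, streamed
    response chunks). `step` is how many characters to reveal at once.
    """
    if step < 1:
        raise ValueError("step must be >= 1")
    shown = ""
    for chunk in chunks:
        # Reveal the chunk piece by piece instead of all at once.
        for start in range(0, len(chunk), step):
            shown += chunk[start:start + step]
            yield shown
```

In a GUI, each yielded frame would typically be written to a label or text widget on a timer tick, so the delay between frames controls the typing speed.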
- Activate & Draw – Run the application to enable the transparent overlay. Use your mouse to draw a circle, arrow, or any annotation over the content you want to ask about.
- Ask with Screen – Right-click to open the context menu and select "ask with screen".
- Enter Your Prompt – A dialog box will appear. Type your question (e.g., "What does this function do?" or "Summarize this article").
- Get an Instant Answer – The tool captures your screen (with your drawings), sends it to the Gemini API along with your prompt, and displays the AI's response in a new window.
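The capture-and-ask flow described above could look roughly like the following. This is a minimal sketch, not the project's actual code: the function names are hypothetical, capturing with Pillow's `ImageGrab` is an assumption (the app may capture through PyQt6 instead), and the model name `gemini-pro-vision` is inferred from the feature list and may not be available to every API key.

```python
from typing import Any, Iterator


def capture_screen() -> Any:
    """Grab the full screen as a PIL image (assumes Pillow's ImageGrab
    is supported on your platform)."""
    from PIL import ImageGrab  # lazy import: only needed when capturing
    return ImageGrab.grab()


def build_model(api_key: str, model_name: str = "gemini-pro-vision") -> Any:
    """Configure the google-generativeai SDK and return a model handle.

    The default model name is an assumption; check which models your
    key can access.
    """
    import google.generativeai as genai  # lazy import: only needed for real calls
    genai.configure(api_key=api_key)
    return genai.GenerativeModel(model_name)


def ask_with_screen(model: Any, prompt: str, image: Any) -> Iterator[str]:
    """Send the prompt and screenshot together and yield text as it streams in."""
    prompt = prompt.strip()
    if not prompt:
        raise ValueError("prompt must not be empty")
    # Text and image are passed as one multimodal request.
    response = model.generate_content([prompt, image], stream=True)
    for chunk in response:
        text = getattr(chunk, "text", "")
        if text:
            yield text
```

A typical call would be `for piece in ask_with_screen(build_model(key), "What does this function do?", capture_screen()): ...`. Because the model is passed in as a parameter, the streaming logic can be exercised with a stand-in object and no network access.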
Follow these steps to get Gemini Lens running on your local machine.
- Python 3.9 or newer
- A Google AI API Key – You can get one from Google AI Studio
- Clone the Repository:
git clone https://github.com/lubaid-01/Gemini-Lens.git
cd Gemini-Lens
- Create and Activate a Virtual Environment:
Windows:
python -m venv venv
.\venv\Scripts\activate
macOS/Linux:
python3 -m venv venv
source venv/bin/activate
- Install Dependencies From requirements.txt:
pip install -r requirements.txt
If you don't have a requirements.txt file, you can create one with:
PyQt6
Pillow
google-generativeai
python-dotenv
Or install manually:
pip install PyQt6 Pillow google-generativeai python-dotenv
- Configure Your API Key
Create a file named .env in the root directory of the project and add:
GOOGLE_API_KEY="YOUR_API_KEY_HERE"
Ensure your gemini.py script loads this key.
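One way to load the key is with python-dotenv, which is already in the dependency list. The snippet below is a sketch under assumptions: `load_api_key` is a hypothetical helper, and the actual contents of gemini.py may differ.

```python
import os


def load_api_key(var_name: str = "GOOGLE_API_KEY") -> str:
    """Return the API key from the environment, loading .env first if possible."""
    try:
        from dotenv import load_dotenv
        # Looks for a .env file and loads it; variables that are
        # already set in the environment are kept as they are.
        load_dotenv()
    except ImportError:
        # python-dotenv is not installed; fall back to the plain environment.
        pass

    key = os.getenv(var_name, "").strip()
    if not key or key == "YOUR_API_KEY_HERE":
        raise RuntimeError(
            f"{var_name} is missing or still set to the placeholder value"
        )
    return key
```

Failing early with a clear message is usually easier to debug than an authentication error from the API later on.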
Run the main application script:
python main.py
- Left-click & drag – Draw on the screen
- Right-click – Open the menu:
  - Clear – Erase all drawings
  - Minimize – Hide the application window
  - ask – Send a text-only prompt to the AI
  - ask with screen – Capture the screen and send it with a prompt to the AI
  - Quit – Close the application
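The controls above boil down to a small amount of state: a list of strokes that grows while the left button is held, and a menu that maps labels to actions. The following is a toolkit-independent sketch of that state, not the project's implementation; `Canvas` and `build_menu` are hypothetical names. In a PyQt6 app, handlers such as `mousePressEvent`, `mouseMoveEvent`, and `mouseReleaseEvent`, together with a `QMenu`, would be the natural places to call into it.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

Point = Tuple[int, int]


@dataclass
class Canvas:
    """Holds the strokes drawn on the overlay."""
    strokes: List[List[Point]] = field(default_factory=list)
    drawing: bool = False

    def press(self, point: Point) -> None:
        # Left button down: start a new stroke at this point.
        self.strokes.append([point])
        self.drawing = True

    def drag(self, point: Point) -> None:
        # Mouse moved: extend the current stroke only while the button is held.
        if self.drawing:
            self.strokes[-1].append(point)

    def release(self) -> None:
        # Left button up: finish the current stroke.
        self.drawing = False

    def clear(self) -> None:
        # "Clear" menu entry: erase all drawings.
        self.strokes.clear()
        self.drawing = False


def build_menu(
    canvas: Canvas,
    on_minimize: Callable[[], None],
    on_ask: Callable[[], None],
    on_ask_with_screen: Callable[[], None],
    on_quit: Callable[[], None],
) -> Dict[str, Callable[[], None]]:
    """Map each right-click menu label to its action, in display order."""
    return {
        "Clear": canvas.clear,
        "Minimize": on_minimize,
        "ask": on_ask,
        "ask with screen": on_ask_with_screen,
        "Quit": on_quit,
    }
```

Keeping the drawing state separate from the widget code makes it possible to test the controls without opening a window.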
- Built with the powerful PyQt6 framework.
- Image processing handled by Pillow.
- AI capabilities powered by Google's Gemini API.