English | 繁體中文
A lightweight, standalone bridge connecting the Google Gemini API to Instant Messaging platforms (like Telegram) with real-time streaming and interactive tool-use approvals.
Inspired by Claude-to-IM, this project brings the robust CLI/terminal AI experience directly to your mobile phone.
Ever wanted to turn Google Gemini into a personal AI assistant that can actually control your computer?
I've open-sourced gemini-to-im, a tool that connects Gemini to your Telegram and gives it superpowers. You can ask it to find available parking spots in Taipei, check the surf conditions in Kenting, or even run custom scripts on your machine, all from your phone.
It's free, open-source, and all your conversation data stays on your own hardware for maximum privacy. Build your own super-powered AI assistant today!
GitHub Link:
https://github.com/Harperbot/gemini-to-im
#Gemini #AIassistant #OpenSource #TelegramBot #DIY
Think of the official Gemini App on your phone as a premium taxi service: powerful, convenient, but limited to what the driver (Google) can do. Your gemini-to-im is like your personal, customizable car parked in your garage.
| Feature | 📱 Official Gemini App | 💻 Your gemini-to-im |
|---|---|---|
| Core AI | Runs on Google's cloud servers. | Runs on your computer, connects to Google's cloud AI. |
| Superpowers | Integrates with Google services (Gmail, Docs). | Controls your computer! Can: • Find parking (runs Python scripts on your machine) • Check surf forecasts (runs Python scripts on your machine) • Execute any program on your system (with your explicit approval). |
| Privacy & Data | Your conversations are stored on Google's cloud servers. | Your conversations are stored locally on your computer (sessions.json). |
| Customization | Limited to app settings. | Infinitely customizable. You (or Gemini itself!) can add new tools, modify its behavior, and tailor it to your exact needs. |
| Access | Available anywhere your phone has internet. | Accessible via Telegram wherever your host computer is running and connected to the internet. |
| Ideal for | General users seeking quick AI info. | AI enthusiasts & developers who want a personalized AI assistant that can interact with and automate their local system. |
- Typing Effect (Streaming): Streams Gemini's responses in real time by continuously editing the IM message, providing a native "typewriter" experience without hitting API limits.
- Interactive Tool Approvals: When Gemini decides to use a registered "dangerous" tool (e.g., executing a local shell command), it pauses execution and sends an inline keyboard with `[✅ Allow]` and `[❌ Reject]` buttons to your chat.
- Standalone & Lightweight: No bulky agent frameworks required. It relies purely on Node.js, the official `@google/generative-ai` SDK, and the Telegram Bot API.
- Persistent Memory: Automatically saves your conversation history to `sessions.json`. Even if the server restarts, Gemini remembers who you are and what you talked about.
- Security Whitelisting: Restrict access to specific users via `ALLOWED_USER_IDS` to protect your API quota and system.
- Auto-Chunking & Fallback: Automatically splits long responses (>4000 chars) into multiple messages and falls back to plain text if Markdown parsing fails, ensuring reliable delivery.
- Built-in Localized Tools: Comes pre-packaged with powerful tools (currently tailored for Taiwan, set via `LOCALIZATION=TW` in `.env`):
  - 🅿️ Real-time Parking Query: Instantly finds available parking spots near a location or Google Maps link.
  - 🏄 Surf Spot Weather: Gets real-time tides, wind conditions, and typhoon updates for Taiwan surf spots.
- Rate Limit Handling: Built-in throttler (debounce/throttle) prevents hitting Telegram's strict message editing rate limits during fast streaming.
- Modular Adapter Architecture: The core Gemini logic is decoupled from the IM platform. You can easily plug in your own adapters for Discord, LINE, or Slack.
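Two of the features above, auto-chunking and edit throttling, can be sketched in a few lines. This is a minimal illustration and not the project's actual code: the names `splitIntoChunks` and `createThrottledEditor`, the newline-preferring split, and the one-second default interval are all assumptions. The 4000-character limit comes from the feature list and sits just under Telegram's 4096-character message cap.

```javascript
// Hypothetical helpers; names and defaults are assumptions for illustration.
const CHUNK_LIMIT = 4000; // threshold from the feature list

// Split long text into pieces no longer than `limit`,
// preferring to cut at a newline when one is available.
function splitIntoChunks(text, limit = CHUNK_LIMIT) {
  const chunks = [];
  let rest = text;
  while (rest.length > limit) {
    const cut = rest.lastIndexOf("\n", limit);
    const end = cut > 0 ? cut : limit; // no usable newline: hard cut
    chunks.push(rest.slice(0, end));
    rest = rest.slice(end).replace(/^\n/, ""); // drop the newline we cut at
  }
  if (rest.length > 0) chunks.push(rest);
  return chunks;
}

// Wrap `editFn(text)` so that rapid updates are coalesced: at most one edit
// is sent per `intervalMs`, and it always carries the most recent text.
function createThrottledEditor(editFn, intervalMs = 1000) {
  let latest = null;
  let timer = null;
  let lastSent = 0;

  const flush = () => {
    timer = null;
    if (latest === null) return;
    const text = latest;
    latest = null;
    lastSent = Date.now();
    // Ignore rejected edits so one failure does not break the stream.
    Promise.resolve(editFn(text)).catch(() => {});
  };

  return {
    // Call for every streamed chunk with the accumulated text so far.
    update(text) {
      latest = text;
      if (timer) return; // a flush is already scheduled
      const wait = Math.max(0, intervalMs - (Date.now() - lastSent));
      timer = setTimeout(flush, wait);
    },
    // Call once when the stream ends to push the final text immediately.
    finish(text) {
      if (timer) clearTimeout(timer);
      latest = text;
      flush();
    },
  };
}
```

In a Telegram adapter, `editFn` would typically wrap the bot library's message-edit call, and `splitIntoChunks` would run on the final text before it is sent.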
Existing bots are often bloated with databases and complex setups. Gemini-to-IM focuses on simplicity:
- Purely Standalone: No databases, no complicated frameworks. Just one `index.js` and you're good to go.
- Deep Gemini Integration: Specifically tuned for Google's Generative AI SDK, ensuring smooth streaming and reliable Function Calling.
- Developer-Friendly: A perfect boilerplate for developers looking to understand how to bridge LLM tools with mobile apps via Telegram.
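As a rough illustration of how conversation history can persist without a database, the sketch below reads and writes a JSON file such as the `sessions.json` mentioned in the feature list. The function names and the file layout (chat ID mapped to an array of `{ role, parts }` turns) are assumptions; the project's real format may differ.

```javascript
const fs = require("node:fs");

// Assumed layout: { "<chatId>": [ { role, parts: [{ text }] }, ... ] }
function loadSessions(file) {
  try {
    return JSON.parse(fs.readFileSync(file, "utf8"));
  } catch (err) {
    if (err.code === "ENOENT") return {}; // first run: no file yet
    throw err; // corrupt JSON or permission problems should be visible
  }
}

// Write to a temp file and rename it into place, so a crash mid-write
// cannot leave a truncated sessions file behind.
function saveSessions(file, sessions) {
  const tmp = `${file}.tmp`;
  fs.writeFileSync(tmp, JSON.stringify(sessions, null, 2));
  fs.renameSync(tmp, file);
}

// Append one turn to a chat's history, creating the history if needed.
function appendTurn(sessions, chatId, role, text) {
  const history = sessions[chatId] ?? (sessions[chatId] = []);
  history.push({ role, parts: [{ text }] });
  return history;
}
```

Saving after each completed turn keeps the file at most one exchange behind what is in memory.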
This bridge is designed to be extremely lightweight and can run on almost any modern system:
- OS: Linux (Ubuntu, Debian, etc.), macOS, or Windows (WSL2 recommended).
- Environment:
  - Node.js: v18.0.0 or higher.
  - Python: v3.9 or higher (required for local tools like parking/surf query).
- Hardware: Can run on low-resource devices like a Raspberry Pi, a home server, or even a free-tier VPS (Oracle Cloud, AWS, etc.).
For most personal use cases, this entire system can be run for free.
- Gemini API: Google provides a generous free tier (e.g., millions of tokens per month) which is more than enough for personal use.
- Telegram API: Free.
- Localized Tools (TDX/CWA): Free public data APIs.
- Hosting: The bridge is extremely lightweight. You can host it on:
  - Existing Hardware: A Raspberry Pi, an old laptop, or your home server.
  - Cloud Free Tiers: Services like Oracle Cloud's "Always Free" Ampere instances or AWS/GCP free tiers are perfect.
- Encrypted Communication: All traffic between the bridge and the Google/Telegram APIs is secured via standard HTTPS.
- Public Wi-Fi: If you run this bridge on a laptop on a public network (e.g., in a coffee shop), it is strongly recommended to use a VPN to encrypt all your system traffic.
- Remote Access (Advanced): If you host the bridge on a cloud VPS but want it to trigger scripts on your home computer, you need a secure tunnel. Instead of exposing your home network, we recommend using a zero-trust network solution like Tailscale.
Quick Tailscale Setup:

    # On macOS
    brew install tailscale
    sudo tailscale up

    # On Linux
    curl -fsSL https://tailscale.com/install.sh | sh
    sudo tailscale up

This creates a secure virtual private network, allowing your cloud instance to safely communicate with your home devices as if they were in the same room.
To use this bridge, you will need the following accounts and API keys:
- Google AI Studio: Obtain your Gemini API Key. (Free tier available)
- Telegram: Message @BotFather to create a new bot and get your Telegram Bot Token.
- TDX (Transport Data eXchange): Register at tdx.transportdata.tw to get your `Client ID` and `Secret` for Parking Queries.
- CWA Open Data: Register at opendata.cwa.gov.tw to get your `API Key` for Surf Spot Weather.
If you have never used a CLI or programmed before, follow these steps:
- Install Node.js: Download and install the "LTS" version from nodejs.org.
- Install Python: Download and install from python.org. (Required only if you use Taiwan tools).
- Gemini Key: Go to Google AI Studio and click "Create API key".
- Telegram Token: Search for `@BotFather` in Telegram, send `/newbot`, and follow the instructions to get your HTTP API Token.
- Your Chat ID: Search for `@userinfobot` in Telegram and send it a message to get your personal ID (a numeric string).
- Download this project as a ZIP and unzip it (or use `git clone`).
- Open your terminal (Command Prompt on Windows, Terminal on Mac).
- Type `cd` followed by a space, drag your project folder into the terminal window, then hit Enter.
- Run: `npm install`
- Create a file named `.env` in the folder (see `.env.example` for format) and paste your keys.

    git clone https://github.com/yourusername/gemini-to-im.git
    cd gemini-to-im
    npm install

Copy the environment template:

    cp .env.example .env

Edit `.env` with your API keys:

    TELEGRAM_BOT_TOKEN=your_telegram_bot_token_here
    GEMINI_API_KEY=your_google_gemini_api_key_here
    # Enable localized tools (Currently supports: TW)
    LOCALIZATION=TW

Run manually for testing:

    node index.js

Or run in the background using PM2:

    npm install -g pm2
    pm2 start ecosystem.config.js
    pm2 save

Once running, simply open your Telegram Bot and start chatting.
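Before exposing the bot, it is worth restricting who can talk to it through the `ALLOWED_USER_IDS` whitelist. The README names that variable but not its format, so the sketch below assumes a comma-separated list of numeric Telegram user IDs (for example `ALLOWED_USER_IDS=123456789,987654321`). The helper names are hypothetical, and treating an empty list as "deny everyone" is a design choice of this sketch, not necessarily the project's behavior.

```javascript
// Assumption: ALLOWED_USER_IDS is a comma-separated list of numeric IDs.
function parseAllowedUserIds(raw) {
  return new Set(
    (raw ?? "")
      .split(",")
      .map((id) => id.trim())
      .filter((id) => /^\d+$/.test(id)) // silently skip blanks and non-numeric entries
  );
}

// Fails closed: with an empty whitelist, nobody is allowed.
function isAllowed(allowed, userId) {
  return allowed.has(String(userId));
}

const allowed = parseAllowedUserIds(process.env.ALLOWED_USER_IDS);

// In a message handler this might be used as (msg.from.id is the sender's ID):
//   if (!isAllowed(allowed, msg.from.id)) return;
```

Checking the sender before any Gemini call also protects your API quota, since rejected messages never reach the model.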
To test the Interactive Approvals feature, ask Gemini:
"Please run a test shell command for me, like `ls -la`."
Gemini will attempt to call the `run_shell_command` tool, triggering the bot to send you an approval request with inline buttons. Once you click "Allow", the mock result is sent back to Gemini to continue the conversation.
Currently, the `run_shell_command` tool in `index.js` returns a mocked output for safety. If you wish to execute real system commands, you can integrate Node's `child_process.exec` inside the `callback_query` handler. Do this at your own risk and ensure your bot is strictly private!
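If you do replace the mock, one possible shape for the approval flow is sketched below. It is not the project's implementation: `createApprovalGate`, `runShell`, the 10-second timeout, and the output cap are all assumptions. The idea is that a tool request is parked under a short ID (short enough to fit in a Telegram button's `callback_data`, which is limited to 64 bytes) and only runs once the matching button press arrives.

```javascript
const { exec } = require("node:child_process");
const { randomUUID } = require("node:crypto");

// Default runner: executes the command in a shell with a timeout and an output cap.
function runShell(command) {
  return new Promise((resolve) => {
    exec(command, { timeout: 10_000, maxBuffer: 64 * 1024 }, (err, stdout, stderr) => {
      resolve({ ok: !err, output: (stdout || stderr || String(err ?? "")).trim() });
    });
  });
}

// `run` is injectable so the gate can be tested without touching the system.
function createApprovalGate(run = runShell) {
  const pending = new Map(); // approvalId -> { command, resolve }

  return {
    // Called when Gemini requests the tool. `id` goes into the buttons'
    // callback data; `result` settles after the user decides.
    request(command) {
      const id = randomUUID().slice(0, 8);
      const result = new Promise((resolve) => pending.set(id, { command, resolve }));
      return { id, result };
    },
    // Called from the callback_query handler when a button is pressed.
    async decide(id, approved) {
      const entry = pending.get(id);
      if (!entry) return false; // unknown ID or already handled
      pending.delete(id); // each approval can be used only once
      entry.resolve(
        approved ? await run(entry.command) : { ok: false, output: "Rejected by user" }
      );
      return true;
    },
  };
}
```

The value that `result` resolves to is what would be sent back to Gemini as the function response. Deleting the entry before running the command means a double tap on "Allow" cannot execute it twice.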
The system is designed with a modular mindset. The core logic handles Gemini streaming, memory, and tools, while the "Adapter" handles receiving and sending messages.
To support LINE or Discord, you don't need to rewrite the AI logic. Simply:
- Look at the Telegram implementation (`index.js`).
- Swap out `node-telegram-bot-api` for `discord.js` or `@line/bot-sdk`.
- Map your platform's incoming message event to the Gemini session handler.
- (For LINE) Since LINE does not support real-time message editing (streaming), you can accumulate the chunks and send them as a single `replyMessage` once the stream ends.
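The steps above could be expressed as a small adapter contract. The sketch below is an assumption about how such a contract might look, not the interface used in `index.js`: an adapter exposes `send(chatId, text)` and, if the platform can edit messages, an optional `edit(chatId, messageId, text)`. The hypothetical `deliverStream` function streams through edits when they are available and otherwise accumulates the text and sends it once.

```javascript
// Hypothetical adapter contract (for illustration only):
//   send(chatId, text)            -> Promise<messageId>
//   edit(chatId, messageId, text) -> Promise<void>   (optional)

async function deliverStream(adapter, chatId, chunks) {
  const canEdit = typeof adapter.edit === "function";
  let full = "";
  let messageId = null;

  for await (const chunk of chunks) {
    full += chunk;
    if (!canEdit) continue; // e.g. LINE: just accumulate
    if (messageId === null) {
      messageId = await adapter.send(chatId, full); // first chunk creates the message
    } else {
      await adapter.edit(chatId, messageId, full); // later chunks update it in place
    }
  }

  // Platforms without editing get the whole reply once the stream ends.
  if (messageId === null && full) await adapter.send(chatId, full);
  return full;
}
```

`chunks` can be any sync or async iterable of strings, such as text pulled from a streaming Gemini response. A real Telegram adapter would also throttle the `edit` calls to stay within rate limits.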
Pull requests are welcome! Feel free to add adapters for Discord, Slack, or LINE.
MIT