Harperbot/gemini-to-im

Gemini-to-IM 🤖💬

English | 繁體中文

A lightweight, standalone bridge connecting the Google Gemini API to Instant Messaging platforms (like Telegram) with real-time streaming and interactive tool-use approvals.

Inspired by Claude-to-IM, this project brings the robust CLI/terminal AI experience directly to your mobile phone.


📣 Quick Share (For Social Media)

Ever wanted to turn Google Gemini into a personal AI assistant that can actually control your computer?

I've open-sourced gemini-to-im, a tool that connects Gemini to your Telegram and gives it superpowers. You can ask it to find available parking spots in Taipei, check the surf conditions in Kenting, or even run custom scripts on your machine, all from your phone.

It's free and open-source, and your conversation history is stored on your own hardware rather than in a hosted chat app. Build your own super-powered AI assistant today!

GitHub Link: https://github.com/Harperbot/gemini-to-im

#Gemini #AIassistant #OpenSource #TelegramBot #DIY


🆚 Gemini App vs. Gemini-to-IM: What's the Difference?

Think of the official Gemini App on your phone as a premium taxi service: powerful and convenient, but limited to what the driver (Google) can do. gemini-to-im is more like your own customizable car parked in your garage.

| Feature | 📱 Official Gemini App | 💻 Your gemini-to-im |
| --- | --- | --- |
| Core AI | Runs on Google's cloud servers. | Runs on your computer and connects to Google's cloud AI. |
| Superpowers | Integrates with Google services (Gmail, Docs). | Controls your computer! It can find parking (runs Python scripts on your machine), check surf forecasts (runs Python scripts on your machine), and execute any program on your system (with your explicit approval). |
| Privacy & Data | Your conversations are stored on Google's cloud servers. | Your conversation history is stored locally on your computer (sessions.json); messages still travel through Telegram and the Gemini API. |
| Customization | Limited to app settings. | Highly customizable. You (or Gemini itself!) can add new tools, modify its behavior, and tailor it to your exact needs. |
| Access | Available anywhere your phone has internet. | Accessible via Telegram as long as your host computer is running and connected to the internet. |
| Ideal for | General users seeking quick AI info. | AI enthusiasts and developers who want a personalized AI assistant that can interact with and automate their local system. |

✨ Features

  • Typing Effect (Streaming): Streams Gemini's responses in real-time by continuously editing the IM message, providing a native "typewriter" experience without hitting API limits.
  • Interactive Tool Approvals: When Gemini decides to use a registered "dangerous" tool (e.g., executing a local shell command), it pauses execution and sends an inline keyboard with [✅ Allow] and [❌ Reject] buttons to your chat.
  • Standalone & Lightweight: No bulky agent frameworks required. It relies purely on Node.js, the official @google/generative-ai SDK, and the Telegram Bot API.
  • Persistent Memory: Automatically saves your conversation history to sessions.json. Even if the server restarts, Gemini remembers who you are and what you talked about.
  • Security Whitelisting: Restrict access to specific users via ALLOWED_USER_IDS to protect your API quota and system.
  • Auto-Chunking & Fallback: Automatically splits long responses (>4000 chars) into multiple messages and falls back to plain text if Markdown parsing fails, ensuring reliable delivery.
  • Built-in Localized Tools: Comes pre-packaged with powerful tools (currently tailored for Taiwan, set via LOCALIZATION=TW in .env):
    • ๐Ÿ…ฟ๏ธ Real-time Parking Query: Instantly finds available parking spots near a location or Google Maps link.
    • ๐Ÿ„ Surf Spot Weather: Gets real-time tides, wind conditions, and typhoon updates for Taiwan surf spots.
  • Rate Limit Handling: Built-in throttler (debounce/throttle) prevents hitting Telegram's strict message editing rate limits during fast streaming.
  • Modular Adapter Architecture: The core Gemini logic is decoupled from the IM platform. You can easily plug in your own adapters for Discord, LINE, or Slack.
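
The streaming, throttling, and auto-chunking features above can be pictured with a small sketch. This is not the project's actual code: `splitIntoChunks`, `createThrottledEditor`, and the 1-second interval are hypothetical, and only the 4000-character limit comes from the feature list.

```javascript
// Hypothetical helpers illustrating the features above; names and the
// default interval are assumptions, not taken from index.js.
const MAX_CHUNK = 4000;

// Split text into pieces no longer than `limit`, preferring newline breaks.
function splitIntoChunks(text, limit = MAX_CHUNK) {
  const chunks = [];
  let rest = text;
  while (rest.length > limit) {
    // Break at the last newline inside the window, if there is one.
    let cut = rest.lastIndexOf('\n', limit);
    if (cut <= 0) cut = limit;
    chunks.push(rest.slice(0, cut));
    rest = rest.slice(cut).replace(/^\n/, '');
  }
  if (rest.length > 0) chunks.push(rest);
  return chunks;
}

// Wrap an "edit message" function so it runs at most once per interval.
// Intermediate texts are dropped; only the newest pending text is sent.
function createThrottledEditor(editFn, intervalMs = 1000) {
  let latest = null; // most recent text not yet sent
  let timer = null;  // pending trailing edit, if any
  let lastSent = 0;  // timestamp of the last edit

  const send = () => {
    timer = null;
    if (latest === null) return;
    const text = latest;
    latest = null;
    lastSent = Date.now();
    editFn(text);
  };

  return {
    update(text) {
      latest = text;
      if (timer) return; // an edit is already scheduled
      const wait = Math.max(0, lastSent + intervalMs - Date.now());
      if (wait === 0) send();
      else timer = setTimeout(send, wait);
    },
    // Send whatever is pending right away (call this when the stream ends).
    flush() {
      if (timer) clearTimeout(timer);
      send();
    },
  };
}
```

In a bridge like this one, `update` would be called with the accumulated text on every streamed chunk and `flush` once the stream ends, so the final edit always carries the complete response.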

💡 Why this project?

Existing bots are often bloated with databases and complex setups. Gemini-to-IM focuses on simplicity:

  • Purely Standalone: No databases, no complicated frameworks. Just one index.js and you're good to go.
  • Deep Gemini Integration: Specifically tuned for Google's Generative AI SDK, ensuring smooth streaming and reliable Function Calling.
  • Developer-Friendly: A perfect boilerplate for developers looking to understand how to bridge LLM tools with mobile apps via Telegram.
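
To give a feel for what "no databases" can look like in practice, here is a minimal sketch of file-based session storage. The function names and the assumed shape of sessions.json (an object keyed by chat ID) are illustrative guesses; the real file format may differ.

```javascript
const fs = require('node:fs');

// Assumed shape: { "<chatId>": [ { role, text }, ... ] }.
// The project's real sessions.json may be structured differently.
function loadSessions(file) {
  try {
    return JSON.parse(fs.readFileSync(file, 'utf8'));
  } catch (err) {
    if (err.code === 'ENOENT') return {}; // first run: no history yet
    throw err; // corrupt JSON or permission problems should surface
  }
}

function saveSessions(file, sessions) {
  // Write to a temporary file first so a crash mid-write cannot leave
  // a truncated sessions file behind, then rename it into place.
  const tmp = `${file}.tmp`;
  fs.writeFileSync(tmp, JSON.stringify(sessions, null, 2));
  fs.renameSync(tmp, file);
}
```

Loading once at startup and saving after each completed turn is usually enough for a single-user bot.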

๐Ÿ–ฅ๏ธ System Requirements

This bridge is designed to be extremely lightweight and can run on almost any modern system:

  • OS: Linux (Ubuntu, Debian, etc.), macOS, or Windows (WSL2 recommended).
  • Environment:
    • Node.js: v18.0.0 or higher.
    • Python: v3.9 or higher (required for local tools like parking/surf query).
  • Hardware: Can run on low-resource devices like a Raspberry Pi, a home server, or even a free-tier VPS (Oracle Cloud, AWS, etc.).

💰 Cost & Hosting

For most personal use cases, this entire system can be run for free.

  • Gemini API: Google provides a generous free tier (e.g., millions of tokens per month) which is more than enough for personal use.
  • Telegram API: Free.
  • Localized Tools (TDX/CWA): Free public data APIs.
  • Hosting: The bridge is extremely lightweight. You can host it on:
    • Existing Hardware: A Raspberry Pi, an old laptop, or your home server.
    • Cloud Free Tiers: Services like Oracle Cloud's "Always Free" Ampere instances or AWS/GCP free tiers are perfect.

๐Ÿ” Security Considerations

  • Encrypted Communication: All traffic between the bridge and the Google/Telegram APIs is secured via standard HTTPS.

  • Public Wi-Fi: If you run this bridge on a laptop on a public network (e.g., in a coffee shop), it is strongly recommended to use a VPN to encrypt all your system traffic.

  • Remote Access (Advanced): If you host the bridge on a cloud VPS but want it to trigger scripts on your home computer, you need a secure tunnel. Instead of exposing your home network, we recommend using a zero-trust network solution like Tailscale.

    Quick Tailscale Setup:

    # On macOS
    brew install tailscale
    sudo tailscale up
    
    # On Linux
    curl -fsSL https://tailscale.com/install.sh | sh
    sudo tailscale up

    This creates a secure virtual private network, allowing your cloud instance to safely communicate with your home devices as if they were in the same room.

🔑 Account Requirements

To use this bridge, you will need the following accounts and API keys:

Mandatory (Core AI Features)

  1. Google AI Studio: Obtain your Gemini API Key. (Free tier available)
  2. Telegram: Message @BotFather to create a new bot and get your Telegram Bot Token.

Optional (For Taiwan Localized Tools)

  • TDX (Transport Data eXchange): Register at tdx.transportdata.tw to get your Client ID and Secret for Parking Queries.
  • CWA Open Data: Register at opendata.cwa.gov.tw to get your API Key for Surf Spot Weather.

🚀 Getting Started (Step-by-Step for Beginners)

If you have never used a CLI or programmed before, follow these steps:

1. Prepare your environment

  • Install Node.js: Download and install the "LTS" version from nodejs.org.
  • Install Python: Download and install from python.org. (Required only if you use Taiwan tools).

2. Get your Keys (It's Free!)

  • Gemini Key: Go to Google AI Studio and click "Create API key".
  • Telegram Token: Search for @BotFather in Telegram, send /newbot, and follow instructions to get your HTTP API Token.
  • Your Chat ID: Search for @userinfobot in Telegram and send it a message to get your personal ID (a string of digits).

3. Setup the Project

  1. Download this project as a ZIP and unzip it (or use git clone).
  2. Open your terminal (Command Prompt on Windows, Terminal on Mac).
  3. Type cd followed by a space, drag your project folder into the terminal window, then press Enter.
  4. Run: npm install
  5. Create a file named .env in the folder (see .env.example for format) and paste your keys.

🚀 Installation (Advanced Users)

1. Clone the repository

git clone https://github.com/Harperbot/gemini-to-im.git
cd gemini-to-im

2. Install dependencies

npm install

3. Configuration

Copy the environment template:

cp .env.example .env

Edit .env with your API keys:

TELEGRAM_BOT_TOKEN=your_telegram_bot_token_here
GEMINI_API_KEY=your_google_gemini_api_key_here

# Enable localized tools (Currently supports: TW)
LOCALIZATION=TW
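
The feature list also mentions an ALLOWED_USER_IDS whitelist, which the sample above does not show. As a rough sketch of how such a check could work, assuming the variable holds a comma-separated list of numeric Telegram user IDs (the format and the helper names are assumptions, so check .env.example for the real one):

```javascript
// Assumption: ALLOWED_USER_IDS looks like "123456789,987654321".
// Both helpers are hypothetical and not taken from the project source.
function parseAllowedUserIds(raw) {
  return new Set(
    (raw || '')
      .split(',')
      .map((id) => id.trim())
      .filter((id) => /^\d+$/.test(id)) // ignore blanks and malformed entries
  );
}

function isAllowed(userId, allowed) {
  // An empty whitelist denies everyone in this sketch (fail closed).
  return allowed.has(String(userId));
}

const allowedUsers = parseAllowedUserIds(process.env.ALLOWED_USER_IDS);
```

A message handler would then call `isAllowed(msg.from.id, allowedUsers)` before doing anything else and silently ignore messages from unknown senders.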

4. Run the Bridge

Run manually for testing:

node index.js

Or run in the background using PM2:

npm install -g pm2
pm2 start ecosystem.config.js
pm2 save

🎮 Usage

Once running, simply open your Telegram Bot and start chatting.

To test the Interactive Approvals feature, ask Gemini:

"Please run a test shell command for me, like ls -la."

Gemini will attempt to call the run_shell_command tool, triggering the bot to send you an approval request with inline buttons. Once you click "Allow", the mock result is sent back to Gemini to continue the conversation.
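
For readers curious about what the approval request looks like on the wire, the sketch below builds a Telegram inline keyboard and decodes the button press. The `inline_keyboard` / `callback_data` structure is part of the Telegram Bot API; the helper names and the "approve:<id>" data format are assumptions for illustration, not the project's own scheme.

```javascript
// Hypothetical helpers; the "approve:<id>" / "reject:<id>" encoding is an
// assumption. Telegram limits callback_data to 64 bytes, so keep IDs short.
function buildApprovalKeyboard(requestId) {
  return {
    inline_keyboard: [[
      { text: '✅ Allow', callback_data: `approve:${requestId}` },
      { text: '❌ Reject', callback_data: `reject:${requestId}` },
    ]],
  };
}

// Decode the callback_data string received in a callback_query update.
function parseApprovalCallback(data) {
  const match = /^(approve|reject):(.+)$/.exec(data || '');
  if (!match) return null; // not one of our buttons
  return { approved: match[1] === 'approve', requestId: match[2] };
}
```

The object returned by `buildApprovalKeyboard` would be passed as `reply_markup` when sending the approval message, and `parseApprovalCallback` would run inside the callback_query handler to decide whether to return the tool result or a refusal to Gemini.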

๐Ÿ—๏ธ How to Add Real Commands

Currently, the run_shell_command tool in index.js returns a mocked output for safety. If you wish to execute real system commands, you can integrate Node's child_process.exec inside the callback_query handler. Do this at your own risk and ensure your bot is strictly private!
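
As a starting point, a wrapper around child_process.exec might look like the sketch below. The function name, the 10-second timeout, and the 64 KiB output cap are illustrative choices rather than project defaults, and the function should only ever be called after the user has pressed "Allow".

```javascript
const { exec } = require('node:child_process');

// Run a command the user has already approved. The timeout and output cap
// are illustrative values; tune them for your own tools.
function runApprovedCommand(command) {
  return new Promise((resolve) => {
    exec(command, { timeout: 10_000, maxBuffer: 64 * 1024 }, (error, stdout, stderr) => {
      // Always resolve, so the result (success or failure) can be handed
      // back to Gemini as the tool's response instead of crashing the bot.
      resolve({
        ok: !error,
        stdout: stdout.trim(),
        stderr: error ? (stderr.trim() || error.message) : stderr.trim(),
      });
    });
  });
}
```

Because exec runs the string through a shell, anything Gemini proposes is executed verbatim; showing the full command in the approval message before running it is a sensible safeguard.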

🔌 Building Your Own Adapter (Discord / LINE / Slack)

The system is designed with a modular mindset. The core logic handles Gemini streaming, memory, and tools, while the "Adapter" handles receiving and sending messages.

To support LINE or Discord, you don't need to rewrite the AI logic. Simply:

  1. Look at the Telegram implementation (index.js).
  2. Swap out node-telegram-bot-api for discord.js or @line/bot-sdk.
  3. Map your platform's incoming message event to the Gemini session handler.
  4. (For LINE) Since LINE does not support real-time message editing (streaming), you can accumulate the chunks and send them as a single replyMessage once the stream ends.
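
Steps 3 and 4 can be sketched as one small function that either edits a message as chunks arrive or accumulates them and sends once. The adapter shape used here (`supportsEditing`, `editMessage`, `sendMessage`) is hypothetical and only meant to show the idea; it is not an interface defined by this project.

```javascript
// Hypothetical adapter-side helper. `chunks` is any (async) iterable of
// text pieces, such as the pieces of a streamed Gemini response.
async function deliverStream(chunks, adapter) {
  let text = '';
  for await (const chunk of chunks) {
    text += chunk;
    // Platforms with message editing (e.g. Telegram) get live updates.
    if (adapter.supportsEditing) await adapter.editMessage(text);
  }
  // Platforms without editing (e.g. LINE) get one message at the end.
  if (!adapter.supportsEditing) await adapter.sendMessage(text);
  return text;
}
```

A LINE adapter would set `supportsEditing` to false and implement `sendMessage` with a single reply, while a Telegram or Discord adapter would implement `editMessage`, ideally behind a throttle.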

๐Ÿค Contributing

Pull requests are welcome! Feel free to add adapters for Discord, Slack, or Line.

📜 License

MIT
