An extensible AI assistant powered by OpenAI's GPT-4o that integrates with Google Workspace and other services. It features a modular agent architecture for handling specialized tasks like calendar management, email, contacts, and more.
- 🤖 GPT-4o powered personal assistant
- 💬 Rich Telegram interface with:
  - Text messaging
  - Voice transcription
  - Image analysis and OCR
  - Document processing
- 📅 Google Calendar integration
- 📧 Gmail integration
- 👥 Google Contacts management
- 📝 Google Docs integration
- 📁 File Management with scratchpad directory
- 🌤️ Weather information
- 🔍 Internet search capabilities
- ➗ Basic calculations
- 🌐 Language translation
- CLI for testing
The project uses a modular agent-based architecture where each capability is encapsulated in a specialized agent class. Key architectural components:
- `BaseOpenAIAgent`: Abstract base class providing common OpenAI integration functionality
- `ZodUtils`: Utility for converting Zod schemas to OpenAI function calls, dramatically reducing boilerplate
- Each agent extends `BaseOpenAIAgent` and defines its capabilities using Zod schemas
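The schema-to-function-call idea can be sketched without any dependencies. This is an illustration under stated assumptions, not the project's code: a hand-rolled `FieldSpec` map stands in for a Zod schema, and `toOpenAITool` is a hypothetical helper that produces the `{ type: "function", function: { name, description, parameters } }` shape that OpenAI's Chat Completions API accepts for tools.

```typescript
// Hypothetical stand-in for a Zod schema: a flat map of field specs.
type FieldSpec = {
  type: "string" | "number" | "boolean";
  description: string;
  optional?: boolean;
};

type JsonProperty = { type: string; description: string };

type ToolDefinition = {
  type: "function";
  function: {
    name: string;
    description: string;
    parameters: {
      type: "object";
      properties: Record<string, JsonProperty>;
      required: string[];
    };
  };
};

// Convert a field map into an OpenAI tool definition whose parameters
// are expressed as a JSON Schema object.
function toOpenAITool(
  name: string,
  description: string,
  fields: Record<string, FieldSpec>,
): ToolDefinition {
  const properties: Record<string, JsonProperty> = {};
  const required: string[] = [];
  for (const key of Object.keys(fields)) {
    const spec = fields[key];
    properties[key] = { type: spec.type, description: spec.description };
    if (!spec.optional) required.push(key);
  }
  return {
    type: "function",
    function: {
      name,
      description,
      parameters: { type: "object", properties, required },
    },
  };
}

// Illustrative tool; the name and fields are made up for this example.
const getWeatherTool = toOpenAITool("get_weather", "Get current weather for a city", {
  city: { type: "string", description: "City name" },
  units: { type: "string", description: "metric or imperial", optional: true },
});
```

Keeping the schema as the single source of truth is what removes the boilerplate: the same definition can drive both argument validation and the tool description sent to the model.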
- Node.js 18+
- pnpm (`npm install -g pnpm`)
- Google Cloud Platform account
- OpenAI API key
- Telegram Bot Token (optional, for bot interface)
- Perplexity API key (for internet search capabilities)
- OpenWeather API key
Create a .env file in the project root:
# OpenAI core bot and tool calling framework
OPENAI_API_KEY=
# Weather information
OPENWEATHER_API_KEY=
# IP address lookup for location awareness; this locates the bot's host, not the client (fine for the initial local-only use, a future improvement opportunity)
IPINFO_TOKEN=
# Internet search via Perplexity
PERPLEXITY_API_KEY=
# Telegram bot interface for easy voice and text input on the go
TELEGRAM_BOT_TOKEN=
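A startup check can catch a missing key before the first API call fails. The sketch below is hypothetical and not taken from the project: the function names are invented, and treating `OPENAI_API_KEY`, `OPENWEATHER_API_KEY`, and `PERPLEXITY_API_KEY` as mandatory (with `TELEGRAM_BOT_TOKEN` and `IPINFO_TOKEN` optional) is an assumption inferred from the prerequisites list.

```typescript
// Assumption: these three keys are mandatory; TELEGRAM_BOT_TOKEN and
// IPINFO_TOKEN are treated as optional and therefore not checked.
const REQUIRED_KEYS = ["OPENAI_API_KEY", "OPENWEATHER_API_KEY", "PERPLEXITY_API_KEY"];

type Env = Record<string, string | undefined>;

// Return the names of required variables that are unset or blank.
function missingEnvKeys(env: Env): string[] {
  return REQUIRED_KEYS.filter((key) => (env[key] ?? "").trim() === "");
}

// Throw a single error listing everything that is missing.
function assertEnv(env: Env): void {
  const missing = missingEnvKeys(env);
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(", ")}`);
  }
}
```

Once the `.env` file has been loaded, this could be called as `assertEnv(process.env)`.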
- Create a new GCP project
- Enable the following APIs:
  - Google Calendar API
  - Gmail API
  - Google People API (Contacts)
  - Google Drive API
  - Google Docs API
- Configure OAuth Consent Screen:
  - Set application type to "Desktop app"
  - Add test users (required for development)
  - Add required scopes:
    - `https://www.googleapis.com/auth/contacts`
    - `https://www.googleapis.com/auth/calendar`
    - `https://www.googleapis.com/auth/gmail.modify`
    - `https://www.googleapis.com/auth/documents`
    - `https://www.googleapis.com/auth/drive`
    - `https://www.googleapis.com/auth/drive.file`
    - `profile`
- Create OAuth Client ID:
  - Application type: Desktop
  - Download client secret JSON
  - Create `secrets` directory in project root
  - Save client secret JSON in `secrets` directory
Install dependencies
pnpm install
Set up Google authentication
pnpm setup-google-auth
The setup script will open your browser for OAuth authentication. After authorizing, the token will be saved in secrets/google-token.json.
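If an agent later fails with a permissions error, one possible cause is a token that was granted fewer scopes than the consent screen lists. The sketch below shows one way to check for that. It assumes the saved token JSON carries a space-delimited `scope` string, as Google's OAuth token responses do; whether `secrets/google-token.json` keeps that field is not confirmed by this README, and `missingScopes` is a hypothetical helper.

```typescript
// Scopes from the OAuth consent step. "profile" is left out because Google
// may report it under a longer URL form in the granted-scope string.
const REQUIRED_SCOPES = [
  "https://www.googleapis.com/auth/contacts",
  "https://www.googleapis.com/auth/calendar",
  "https://www.googleapis.com/auth/gmail.modify",
  "https://www.googleapis.com/auth/documents",
  "https://www.googleapis.com/auth/drive",
  "https://www.googleapis.com/auth/drive.file",
];

// Assumed token shape: only the `scope` field matters for this check.
interface SavedToken {
  scope?: string;
}

// Return the required scopes that do not appear in the token's scope string.
function missingScopes(token: SavedToken, required: string[] = REQUIRED_SCOPES): string[] {
  const granted = new Set((token.scope ?? "").split(/\s+/).filter((s) => s.length > 0));
  return required.filter((scope) => !granted.has(scope));
}
```

An empty result means every listed scope was granted; anything else suggests re-running `pnpm setup-google-auth` after fixing the consent screen.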
pnpm run cli
pnpm run start
- Schedule, update, and delete meetings
- Search for meetings in date ranges
- Handle Google Meet integration
- Manage attendees and meeting details
- Search emails
- Read email content
- Create email drafts
- Support for HTML formatting
- Find contacts by name or email
- Update contact details
- Manage contact information (phone, email, organization)
- Create new documents
- Read document content
- Update existing documents
- Search across documents
- Get current weather conditions
- Retrieve weather forecasts
- Support for location-based weather
- Temperature, humidity, wind data
- Real-time internet searches
- Detailed or concise results
- Powered by Perplexity API
- Basic arithmetic operations
- Precise numerical calculations
- Error handling for invalid operations
- Safe file operations in a dedicated scratchpad directory
- Basic operations: create, read, update, delete
- Directory management and file organization
- File search and metadata retrieval
- Automatic source language detection and translation
- Format-preserving multilingual support
- Optimized for accuracy and consistency
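As one concrete illustration of the capability lists above, the calculation feature (basic arithmetic with error handling for invalid operations) could look roughly like this. The operation names and the result shape are assumptions made for the example, not the project's actual interface.

```typescript
type CalcResult = { ok: true; value: number } | { ok: false; error: string };

// Evaluate one binary operation, returning an error value instead of
// throwing so the caller can relay the problem back to the user.
function calculate(op: string, a: number, b: number): CalcResult {
  if (!Number.isFinite(a) || !Number.isFinite(b)) {
    return { ok: false, error: "Operands must be finite numbers" };
  }
  switch (op) {
    case "add":
      return { ok: true, value: a + b };
    case "subtract":
      return { ok: true, value: a - b };
    case "multiply":
      return { ok: true, value: a * b };
    case "divide":
      if (b === 0) return { ok: false, error: "Division by zero" };
      return { ok: true, value: a / b };
    default:
      return { ok: false, error: `Unsupported operation: ${op}` };
  }
}
```

Returning a tagged result rather than throwing keeps invalid requests, such as division by zero or an unknown operation, from crashing the agent loop.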
Contributions are welcome! The modular agent architecture makes it easy to add new capabilities:
- Create a new agent class extending `BaseOpenAIAgent`
- Define capabilities using Zod schemas
- Implement the required functionality
- Add the agent to `PersonalAssistant`
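The steps above can be sketched as follows. Everything in this block is hypothetical: `BaseAgentSketch` and `AssistantSketch` are minimal stand-ins for `BaseOpenAIAgent` and `PersonalAssistant`, whose real interfaces are not shown in this README and almost certainly differ. The Zod schema step is omitted to keep the example dependency-free.

```typescript
// Minimal stand-in for the project's base class (hypothetical interface).
abstract class BaseAgentSketch {
  abstract readonly name: string;
  abstract readonly description: string;
  abstract handle(args: Record<string, unknown>): Promise<string>;
}

// A new agent with a single capability.
class EchoAgent extends BaseAgentSketch {
  readonly name = "echo";
  readonly description = "Repeats the given text back";

  async handle(args: Record<string, unknown>): Promise<string> {
    const raw = args["text"];
    const text = typeof raw === "string" ? raw : "";
    return `echo: ${text}`;
  }
}

// Stand-in for PersonalAssistant: a registry that routes calls by agent name.
class AssistantSketch {
  private agents = new Map<string, BaseAgentSketch>();

  register(agent: BaseAgentSketch): void {
    this.agents.set(agent.name, agent);
  }

  async dispatch(name: string, args: Record<string, unknown>): Promise<string> {
    const agent = this.agents.get(name);
    if (!agent) throw new Error(`Unknown agent: ${name}`);
    return agent.handle(args);
  }
}

// Registering the agent makes it reachable by name.
const assistant = new AssistantSketch();
assistant.register(new EchoAgent());
```

With this shape, adding a capability touches only the new agent class and the one registration line.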
This project is licensed under the MIT License.