AI chat with Gemini streaming, function calling, RAG, image generation, and MCP integration.
- Streaming Responses: Real-time text generation via Server-Sent Events (SSE)
- Function Calling: Gemini calls Drive tools, MCP tools, RAG/File Search, and Google Search
- Drive Tool Integration: Read, search, list, create, update, and rename Drive files from chat
- MCP Tools: Dynamically discovered tools from MCP servers (prefixed `mcp_{serverId}_{tool}`)
- RAG / Web Search: Retrieval-Augmented Generation via Gemini File Search, or Google Search mode
- Extended Thinking: Collapsible thinking/reasoning display for supported models
- Image Generation: Generate images with Imagen-capable models
- Chat History: Auto-saved to Google Drive with optional encryption
- Slash Commands: `/command` with template variables and per-command overrides
- File References: `@filename` to reference Drive files in messages
- Attachments: Image, PDF, and text file attachments via drag-and-drop or file picker
Chat uses SSE-compatible chunk types. The execution path depends on the user's API plan:
- Free plan (Chat API): Executes locally in the browser via `executeLocalChat`, calling the Gemini Chat API (`ai.chats.create`) directly with a cached API key. Tool calls are executed locally in the same process.
- Paid plan (Interactions API): Uses the Gemini Interactions API (`ai.interactions.create`) via a server-side proxy (`/api/chat/interactions`). The server streams events to the client. When tool calls are needed, the server sends a `requires_action` chunk; the client executes the tools locally (preserving local-first), then POSTs the results back to continue the interaction. Conversation state is chained via `previous_interaction_id` (stored as `interactionId` on `Message`).
The legacy server-side /api/chat SSE endpoint exists as a fallback.
| Type | Description |
|---|---|
| `text` | Incremental text content |
| `thinking` | Extended thinking / reasoning content |
| `tool_call` | Function call (name + args) |
| `tool_result` | Function call result |
| `rag_used` | RAG sources used in response |
| `web_search_used` | Web search sources used |
| `image_generated` | Base64-encoded generated image |
| `mcp_app` | MCP tool UI metadata |
| `drive_file_created` | Drive file was created (triggers file tree refresh) |
| `drive_file_updated` | Drive file was updated locally (triggers editor refresh) |
| `requires_action` | Interactions API only: server needs tool results from client |
| `error` | Error message |
| `done` | Stream complete (includes `interactionId` for Interactions API) |
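As a rough illustration of how a client might consume these chunks, the sketch below models a few of them as a discriminated union and parses an SSE buffer. Only the `type` values come from the table above. The payload field names (`content`, `name`, `args`, `result`, `message`) and the one-JSON-object-per-`data:`-line wire format are assumptions; the actual definitions live in `app/types/chat.ts`.

```typescript
// Hypothetical chunk shapes: the `type` values are from the table above,
// every payload field name is an assumption made for illustration.
type StreamChunk =
  | { type: "text"; content: string }
  | { type: "thinking"; content: string }
  | { type: "tool_call"; name: string; args: Record<string, unknown> }
  | { type: "tool_result"; name: string; result: unknown }
  | { type: "error"; message: string }
  | { type: "done"; interactionId?: string };

// Parse a raw SSE buffer into chunks, assuming each event carries
// a single JSON object on one `data:` line.
function parseSseChunks(raw: string): StreamChunk[] {
  const chunks: StreamChunk[] = [];
  for (const line of raw.split("\n")) {
    if (!line.startsWith("data:")) continue; // skip blank separators and comments
    const payload = line.slice("data:".length).trim();
    if (payload === "") continue;
    chunks.push(JSON.parse(payload) as StreamChunk);
  }
  return chunks;
}
```

A real client would also need to handle events split across network reads; this sketch assumes the buffer contains only complete events.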
Free plan (local execution):
- Call `executeLocalChat`, which streams from the Gemini API directly in the browser
- Parse chunks, accumulate text/thinking/toolCalls
- On `drive_file_created` → update local sync meta, dispatch `tree-meta-updated` (refreshes file tree)
- On `drive_file_updated` → save to local cache + edit history, dispatch `file-modified` / `file-restored` (refreshes editor)
- On `done` → build final `Message` object and save to history
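The accumulation step above can be pictured as a pure reducer over incoming chunks. This is a minimal sketch, not the code in `ChatPanel`: the `Draft` shape, the `accumulate` and `emptyDraft` names, and the chunk payload fields are all hypothetical, and the Drive-related side effects (sync meta, event dispatch) are left out.

```typescript
// Hypothetical chunk and draft shapes, for illustration only.
type Chunk =
  | { type: "text"; content: string }
  | { type: "thinking"; content: string }
  | { type: "tool_call"; name: string; args: Record<string, unknown> }
  | { type: "done" };

interface Draft {
  text: string;
  thinking: string;
  toolCalls: { name: string; args: Record<string, unknown> }[];
  finished: boolean;
}

function emptyDraft(): Draft {
  return { text: "", thinking: "", toolCalls: [], finished: false };
}

// Fold one chunk into the partial message without mutating the previous draft.
function accumulate(draft: Draft, chunk: Chunk): Draft {
  switch (chunk.type) {
    case "text":
      return { ...draft, text: draft.text + chunk.content };
    case "thinking":
      return { ...draft, thinking: draft.thinking + chunk.content };
    case "tool_call":
      return {
        ...draft,
        toolCalls: [...draft.toolCalls, { name: chunk.name, args: chunk.args }],
      };
    case "done":
      return { ...draft, finished: true };
  }
}
```

Keeping the reducer free of side effects makes the streaming partial message easy to render after each chunk; the final `Message` would be built once `finished` is true.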
Paid plan (Interactions API multi-round):
- POST to `/api/chat/interactions` with messages, tools, and optional `previousInteractionId`
- Parse SSE chunks, accumulate text/thinking
- On `requires_action` → execute pending tool calls locally (same dispatchers as local execution: Drive tools via IndexedDB, MCP via `/api/workflow/mcp-proxy`, JS sandbox, skill workflows)
- POST tool results back to `/api/chat/interactions` with `currentInteractionId`
- Repeat until `done` → build final `Message` (with `interactionId`) and save to history
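The multi-round protocol above can be sketched as a loop that keeps posting until a `done` chunk arrives. Everything below is an assumption for illustration: the request and chunk field names, the `PostRound` and `RunTool` callbacks (stand-ins for the HTTP POST plus SSE parsing and for the local tool dispatchers), and the `maxRounds` safety guard.

```typescript
// Hypothetical shapes; only the chunk type names and the
// previousInteractionId / currentInteractionId fields come from the text above.
interface ToolCall { name: string; args: Record<string, unknown> }
interface ToolResult { name: string; result: unknown }

type Chunk =
  | { type: "text"; content: string }
  | { type: "requires_action"; interactionId: string; toolCalls: ToolCall[] }
  | { type: "done"; interactionId: string };

interface RoundRequest {
  messages?: string[];
  previousInteractionId?: string;
  currentInteractionId?: string;
  toolResults?: ToolResult[];
}

type PostRound = (req: RoundRequest) => Promise<Chunk[]>; // one POST, fully parsed
type RunTool = (call: ToolCall) => Promise<unknown>;      // local tool execution

async function runInteraction(
  post: PostRound,
  runTool: RunTool,
  messages: string[],
  previousInteractionId?: string,
  maxRounds = 20, // hypothetical guard against endless tool loops
): Promise<{ text: string; interactionId: string }> {
  let request: RoundRequest = { messages, previousInteractionId };
  let text = "";
  for (let round = 0; round < maxRounds; round++) {
    let next: RoundRequest | undefined;
    for (const chunk of await post(request)) {
      if (chunk.type === "text") {
        text += chunk.content;
      } else if (chunk.type === "done") {
        return { text, interactionId: chunk.interactionId };
      } else {
        // requires_action: run the tools locally, then continue the interaction.
        const toolResults: ToolResult[] = [];
        for (const call of chunk.toolCalls) {
          toolResults.push({ name: call.name, result: await runTool(call) });
        }
        next = { currentInteractionId: chunk.interactionId, toolResults };
      }
    }
    if (!next) throw new Error("Stream ended without done or requires_action");
    request = next;
  }
  throw new Error("Exceeded maxRounds");
}
```

Injecting `post` and `runTool` keeps the loop independent of the transport, so the same control flow can be exercised with stubs.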
When enabled, Gemini can call tools during chat. Tool execution happens within the local chat executor (or server-side for the SSE fallback).
| Tool | Description |
|---|---|
| `read_drive_file` | Read file content by ID |
| `search_drive_files` | Search by name or content, with optional folder filter |
| `list_drive_files` | List files and virtual folders |
| `create_drive_file` | Create a new file (path separators for virtual folders) |
| `update_drive_file` | Update existing file content |
| `rename_drive_file` | Rename a file by ID |
| `bulk_rename_drive_files` | Rename multiple files at once |
After `create_drive_file`, the file is created on Drive right away, because a Drive file ID is needed, and a `drive_file_created` chunk is emitted. The client seeds the local cache and sync meta so the file tree refreshes.
After `update_drive_file`, the file is not written to Drive. Instead, a `drive_file_updated` chunk returns the new content to the client, which saves it to the local cache and edit history. The change is pushed to Drive on the next manual push.
| Mode | Tools Available |
|---|---|
| `all` | All 7 drive tools |
| `noSearch` | Read, create, update, rename, bulk rename only (no search/list) |
| `none` | No drive tools |
Mode is auto-constrained by model and RAG settings:
- Gemma 4 + Web Search: forced to `none` (Gemma 4 cannot combine function calling with Web Search)
- Web Search mode: forced to `none` (incompatible with other tools; free plan only)
- RAG enabled: function calling tools disabled (free plan only; the Chat API does not support `fileSearch` + `functionDeclarations` simultaneously)
Paid plan advantage: The Interactions API allows function tools + RAG + Web Search simultaneously. The above RAG/Web Search tool restrictions do not apply to paid plan users (except Gemma 4 + Web Search, which is a model-level limitation).
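The constraints above can be read as a small decision function. This is a sketch under assumptions: the `ModeInput` shape and the `effectiveDriveToolMode` name are hypothetical, and "function calling tools disabled" for RAG on the free plan is modeled here as mode `none`.

```typescript
type DriveToolMode = "all" | "noSearch" | "none";

// Hypothetical input shape for illustration.
interface ModeInput {
  requested: DriveToolMode; // what the user or slash command asked for
  plan: "free" | "paid";
  isGemma4: boolean;
  webSearch: boolean;
  ragEnabled: boolean;
}

function effectiveDriveToolMode(input: ModeInput): DriveToolMode {
  // Model-level limitation: applies on both plans.
  if (input.isGemma4 && input.webSearch) return "none";
  // Chat API limitation: free plan cannot mix tools with Web Search or RAG.
  if (input.plan === "free" && (input.webSearch || input.ragEnabled)) return "none";
  return input.requested;
}
```

For example, a paid-plan request with RAG and Web Search both active keeps its requested mode, while the same request on the free plan is reduced to `none`.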
MCP tools are dynamically discovered from configured MCP servers. Tool names use the format mcp_{serverId}_{toolName}. MCP server selection is persisted to localStorage as server IDs.
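To make the naming scheme concrete, here is a minimal sketch of building and splitting such names. The helper names are hypothetical. Because both server IDs and tool names may contain underscores, the split below relies on the list of configured server IDs rather than guessing at the separator; that lookup strategy is an assumption, not something the source describes.

```typescript
// Build a namespaced tool name in the documented format.
function mcpToolName(serverId: string, toolName: string): string {
  return `mcp_${serverId}_${toolName}`;
}

// Recover the server ID and tool name, given the configured server IDs.
function splitMcpToolName(
  name: string,
  serverIds: string[],
): { serverId: string; toolName: string } | undefined {
  // Try longer IDs first so "a_b" wins over "a" when both are configured.
  const candidates = [...serverIds].sort((a, b) => b.length - a.length);
  for (const serverId of candidates) {
    const prefix = `mcp_${serverId}_`;
    if (name.startsWith(prefix) && name.length > prefix.length) {
      return { serverId, toolName: name.slice(prefix.length) };
    }
  }
  return undefined; // not an MCP tool, or its server is not configured
}
```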
| Setting | Default | Description |
|---|---|---|
| `maxFunctionCalls` | 20 | Maximum tool calls per response |
| `functionCallWarningThreshold` | 5 | Warn when remaining calls drop to this count |
When the limit is reached, Gemini receives a system message requesting a summary of gathered information.
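A minimal sketch of how the two settings might interact is shown below. The `budgetState` function and its return values are hypothetical; only the setting names and defaults come from the table above, and "drop to this count" is read as "remaining calls less than or equal to the threshold".

```typescript
interface CallBudget {
  maxFunctionCalls: number;
  functionCallWarningThreshold: number;
}

type BudgetState = "ok" | "warn" | "exhausted";

// Classify the budget after `used` tool calls in the current response.
function budgetState(
  used: number,
  budget: CallBudget = { maxFunctionCalls: 20, functionCallWarningThreshold: 5 },
): BudgetState {
  const remaining = budget.maxFunctionCalls - used;
  if (remaining <= 0) return "exhausted"; // time to request a summary
  if (remaining <= budget.functionCallWarningThreshold) return "warn";
  return "ok";
}
```

With the defaults, 14 calls used leaves 6 remaining (`ok`), 15 used leaves 5 (`warn`), and 20 used is `exhausted`.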
Models are determined by the user's API plan (Free or Paid). Each model has different capabilities:
- Standard models: Streaming text + function calling + thinking
- Image models: Image generation (no function calling)
- Gemma 4 models: Function calling + built-in thinking (always on, thinking config parameters not supported). Cannot combine function calling with Web Search
- Flash Lite: When thinking is enabled, uses `thinkingBudget: -1` (no explicit limit)
- gemini-3-pro / gemini-3.1-pro models: Thinking is required and cannot be disabled (`thinkingBudget` cannot be set to 0)
Model selection is per-chat via the dropdown. Slash commands can override the model.
Select a RAG store from the dropdown. Gemini uses Gemini File Search with configured store IDs. Results include source attribution displayed as badges.
Select "Web Search" from the dropdown. This mode uses the `googleSearch` tool. On the free plan it is incompatible with function calling and MCP tools, which are auto-disabled.
The number of search results considered is configurable in settings (1-20, default 5).
Type / to open command autocomplete. Commands provide:
| Feature | Description |
|---|---|
| `promptTemplate` | Text template sent as message |
| Template variables | {content} (active file), {selection} (editor selection) |
| Model override | Use a specific model for this command |
| Search setting override | Use specific RAG store or Web Search |
| Drive tool mode override | Control tool access per command |
| MCP server override | Enable specific MCP servers per command |
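As an illustration of the template variables, the sketch below expands `{content}` and `{selection}` in a single pass. The function name, the context shape, and the choice to substitute an empty string for a missing value are assumptions.

```typescript
// Hypothetical context: text of the active file and the editor selection.
interface TemplateContext {
  content?: string;
  selection?: string;
}

// Expand both variables in one pass, so text inserted for {content}
// is never re-scanned for a {selection} placeholder.
function renderPromptTemplate(template: string, ctx: TemplateContext): string {
  return template.replace(
    /\{(content|selection)\}/g,
    (_match, key: "content" | "selection") => ctx[key] ?? "",
  );
}
```

For instance, the template `Translate to English:\n{selection}` with a selection of `Bonjour` produces `Translate to English:\nBonjour`.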
Type @ to open file mention autocomplete. @filename references are resolved before sending:
- Drive tools enabled: replaced with `[file: name, fileId: id]` (Gemini can read via tools)
- Drive tools disabled: file content is fetched and inlined
When no explicit context ({content}, {selection}, @file) is provided, the currently open file's name and ID are appended automatically so Gemini can use read_drive_file if needed.
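The two resolution paths for `@filename` can be sketched as follows. This is a simplified illustration under assumptions: the `DriveFileRef` shape is hypothetical, file content is taken as already fetched, file names are assumed to contain no whitespace, and the content is inlined verbatim since the source does not say how it is wrapped.

```typescript
// Hypothetical record for a Drive file whose content is already loaded.
interface DriveFileRef {
  id: string;
  name: string;
  content: string;
}

function resolveMentions(
  message: string,
  files: DriveFileRef[],
  driveToolsEnabled: boolean,
): string {
  return message.replace(/@([^\s@]+)/g, (match, name: string) => {
    const file = files.find((f) => f.name === name);
    if (!file) return match; // unknown mention: leave the text untouched
    return driveToolsEnabled
      ? `[file: ${file.name}, fileId: ${file.id}]` // Gemini reads it via tools
      : file.content;                               // inline the content instead
  });
}
```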
Drag-and-drop or click the paperclip button to attach files.
| Type | Formats |
|---|---|
| Image | `image/*`, sent as inline Base64 data |
| PDF | `application/pdf`, sent as inline Base64 data |
| Text | Other file types, sent as inline text data (fallback) |
Attachments are included in the Gemini API request as inlineData parts.
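A minimal sketch of the classification in the table, plus the shape of an `inlineData` part, might look like this. The helper names are hypothetical, and the file bytes are assumed to be Base64-encoded already; how text attachments are encoded is not shown here because the source does not specify it.

```typescript
type AttachmentKind = "image" | "pdf" | "text";

// Classify by MIME type, following the table above; anything else falls back to text.
function attachmentKind(mimeType: string): AttachmentKind {
  if (mimeType.startsWith("image/")) return "image";
  if (mimeType === "application/pdf") return "pdf";
  return "text";
}

// Request part carrying Base64 data alongside its MIME type.
interface InlineDataPart {
  inlineData: { mimeType: string; data: string };
}

function toInlineDataPart(mimeType: string, base64: string): InlineDataPart {
  return { inlineData: { mimeType, data: base64 } };
}
```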
When an image-capable model is selected (e.g., gemini-3.1-flash-image-preview), the chat switches to image generation mode:
- Uses `generateContent` (not streaming chat)
- Response can contain both text and images
- Images displayed inline with download and save-to-Drive buttons
- Save-to-Drive dispatches `sync-complete` to refresh file tree
Chat histories are stored as JSON files in history/chats/ on Google Drive, named chat_{id}.json. Each chat has:
- `id`: Unique chat identifier
- `title`: First message content (truncated to 50 chars)
- `messages`: Array of `Message` objects
- `createdAt` / `updatedAt`: Timestamps
A _meta.json file in the chat history folder indexes all chats for fast listing.
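The storage layout above can be summarized in a short sketch. The field names and the `history/chats/chat_{id}.json` path come from the text; the `Message` placeholder, the numeric timestamp type, and the plain 50-character cut (no ellipsis) are assumptions, and the helper names are hypothetical.

```typescript
// Placeholder for the real Message type in app/types/chat.ts.
interface Message {
  role: string;
  content: string;
}

interface ChatHistory {
  id: string;
  title: string;
  messages: Message[];
  createdAt: number; // assumed epoch milliseconds
  updatedAt: number;
}

// Drive path for a chat file, following the documented naming scheme.
function chatFilePath(id: string): string {
  return `history/chats/chat_${id}.json`;
}

// Derive a title from the first message, truncated to 50 characters.
function chatTitle(firstMessage: string): string {
  return firstMessage.slice(0, 50);
}
```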
When encryptChatHistory is enabled in settings, new chats are encrypted before saving to Drive. Encrypted chats are decrypted client-side using cached credentials or a password prompt.
| Action | Description |
|---|---|
| New Chat | Clear messages and start fresh |
| Select Chat | Load messages from Drive (decrypt if needed) |
| Delete Chat | Remove from Drive and history list |
| Auto-save | Saves after each assistant response (done chunk) |
Free plan (Chat API — browser-side):
Browser (ChatPanel) Gemini API
┌──────────────────┐ ┌──────────────┐
│ messages state │ executeLocalChat │ generateContent│
│ streaming state │◄────────────────────────►│ Stream │
│ tool call display │ (direct API call │ Function calls│
│ autocomplete │ with cached API key) └──────────────┘
│ chat history │
│ │──► Drive tools (IndexedDB local-first)
│ │──► MCP tools (/api/workflow/mcp-proxy)
└──────────────────┘
│
┌─────▼──────┐
│ IndexedDB │
│ cache │
│ editHistory│
└─────┬──────┘
│ Push
┌─────▼──────┐
│ Google Drive│
│ _sync-meta │
│ history/ │
└────────────┘
Paid plan (Interactions API — server proxy + local tool execution):
Browser (ChatPanel) Server Gemini API
┌──────────────────┐ ┌────────────────┐ ┌──────────────────┐
│ messages state │ POST │ /api/chat/ │ stream │ interactions. │
│ streaming state │──────────►│ interactions │◄────────►│ create() │
│ tool call display │◄── SSE ──│ (proxy only) │ │ (server-stored │
│ chat history │ └────────────────┘ │ conversation) │
│ │ └──────────────────┘
│ requires_action: │
│ execute locally │──► Drive tools (IndexedDB local-first)
│ POST results back│──► MCP tools (/api/workflow/mcp-proxy)
│ │──► JS sandbox, skill workflows
└──────────────────┘
The Interactions API endpoint does not support CORS, so browser-side calls are not possible. The server acts as a pure proxy — tool execution remains client-side (local-first). Conversation state is chained via previous_interaction_id, reducing token usage on long conversations.
| File | Role |
|---|---|
| `app/routes/api.chat.tsx` | Chat SSE API (server-side, legacy fallback): streaming, tool dispatch |
| `app/routes/api.chat.interactions.tsx` | Interactions API SSE proxy (paid plan): multi-round tool call protocol |
| `app/routes/api.chat.history.tsx` | Chat history CRUD (list, save, delete) |
| `app/hooks/useLocalChat.ts` | Browser-side chat execution (free plan): calls Gemini Chat API directly |
| `app/hooks/useInteractionsChat.ts` | Interactions API client (paid plan): multi-round SSE with local tool execution |
| `app/services/gemini-chat-core.ts` | Browser-compatible Gemini Chat API client (streaming, function calling, RAG, thinking, image generation) |
| `app/services/gemini-interactions.server.ts` | Server-only Interactions API wrapper (tool conversion, input building, stream translation) |
| `app/services/gemini-chat.server.ts` | Server-only re-export of `gemini-chat-core.ts` |
| `app/services/drive-tools.server.ts` | Drive tool definitions and execution |
| `app/services/drive-tool-definitions.ts` | Drive tool schema definitions (7 tools) |
| `app/services/chat-history.server.ts` | Chat history persistence (Drive + `_meta.json`) |
| `app/services/mcp-tools.server.ts` | MCP tool discovery and execution |
| `app/components/ide/ChatPanel.tsx` | Chat panel: state management, plan-based routing (paid→Interactions, free→local) |
| `app/components/chat/ChatInput.tsx` | Input area: model/RAG/tool selectors, autocomplete, attachments |
| `app/components/chat/MessageList.tsx` | Message list with streaming partial message |
| `app/components/chat/MessageBubble.tsx` | Message display: thinking, tool badges, images, markdown |
| `app/components/chat/AutocompletePopup.tsx` | Autocomplete popup UI |
| `app/hooks/useAutocomplete.ts` | Autocomplete logic (slash commands, file mentions, variables) |
| `app/types/chat.ts` | Chat type definitions (`Message`, `StreamChunk`, `ToolCall`, etc.) |
| Route | Method | Description |
|---|---|---|
| `/api/chat` | POST | Chat SSE stream with function calling (legacy fallback) |
| `/api/chat/interactions` | POST | Interactions API SSE proxy (paid plan, multi-round) |
| `/api/chat/history` | GET | List chat histories |
| `/api/chat/history` | POST | Save chat history |
| `/api/chat/history` | DELETE | Delete chat history |