Gemini #81
Open
apmiller108 wants to merge 11 commits into main from gemini
a7ab6dd Adds basic support for Gemini models (apmiller108)
8f4fb9d Fixes and updates (apmiller108)
cd29099 Adds todos (apmiller108)
168c628 Updates todos (apmiller108)
1c422b3 Adds vendor (apmiller108)
d42d32c Updates max prompt length (apmiller108)
6eeaf88 Updates todos (apmiller108)
1753fee Updates front and and controllers to support contexts by vendor (apmiller108)
8a84fdc Fixes arguments (apmiller108)
79bbeac Updates system message (apmiller108)
e7f0716 Adds files before last user prompt (apmiller108)
@@ -0,0 +1,107 @@
# Gemini API Adapter Specifications

This document outlines the plan to build an adapter to consume the Google Gemini API, achieving feature parity with the existing Anthropic API integration.

## Goals

1. **Generate Text:** Support text generation with Gemini models. [DONE]
2. **Multimodal:** Support file uploads (images, PDFs) and usage in requests. [DONE]
3. **Tool Calling:** Support function calling (tools). [DONE]
4. **Streaming:** Support streaming responses. [DONE]

## Architecture

The integration will follow the existing pattern used for Anthropic:
- **Namespace:** `Gemini` module in `lib/gemini.rb` and `lib/gemini/`. [DONE]
- **Client:** `Gemini::Client` to handle HTTP requests. [DONE]
- **Request Object:** `Gemini::InvokeModelRequest` to format the payload. [DONE]
- **Response Object:** `Gemini::InvokeModelResponse` and `Gemini::StreamResponse` to normalize outputs. [DONE]
- **Turns:** `Gemini::Turn` to format conversation history. [DONE]
- **Files:** `Gemini::FilesClient` for the Files API. [DONE]

## Development Phases

### Phase 1: Foundation & Text Generation

**Goal:** Successfully generate text from a single prompt using a Gemini model.

- [x] Create `lib/gemini.rb` and `lib/gemini/client.rb`.
- [x] Implement `Gemini::Client#initialize` using `GEMINI_API_KEY`.
- [x] Define Gemini models in `lib/gemini.rb` (e.g., `gemini-3-flash-preview`, `gemini-2.5-pro`).
- [x] Update `GenerativeText::MODELS` to include Gemini models (vendor: `:google`).
- [x] Update `GenerativeText.client_for` to handle `:google` vendor.
- [x] Create `lib/gemini/invoke_model_request.rb` to format basic text prompts.
- [x] Create `lib/gemini/invoke_model_response.rb` to wrap the response.
- [x] Implement `Gemini::Client#invoke_model`.
- [x] Verify basic text generation in Rails console.

### Phase 2: Multi-turn Conversations (Chat)

**Goal:** Support conversation history.

- [x] Create `lib/gemini/turn.rb`.
- [x] Implement `Gemini::Turn.for(request, turns:)` to format `GenerateTextRequest` and history into Gemini's `contents` format (`role`, `parts`).
- [x] Update `GenerateTextRequest#to_turn` to handle `:google` vendor.
- [x] Update `Gemini::InvokeModelRequest` to accept and format the full conversation history.
- [x] Verify multi-turn chat in Rails console.

### Phase 3: Streaming

**Goal:** Support real-time response streaming.

- [x] Create `lib/gemini/stream_event.rb` to parse SSE chunks.
- [x] Create `lib/gemini/stream_response.rb` to aggregate chunks.
- [x] Implement `Gemini::Client#invoke_model_stream`.
- [x] Verify streaming works in the UI.

### Phase 4: Multimodal & Files

**Goal:** Support attaching images and PDFs to prompts.

- [x] Create `lib/gemini/files_client.rb` to wrap Gemini Files API (`upload`, `get`, `delete`).
- [x] Add `upload_file` and `delete_file` methods to `lib/gemini.rb`.
- [x] Update `Gemini::Turn` to include file parts in the content.
- [x] Verify image/PDF analysis.

### Phase 5: Tool Calling

**Goal:** Support defining and invoking tools.

- [x] Update `Gemini::InvokeModelRequest` to map `LlmTool` definitions to Gemini's `tools` -> `function_declarations` format.
- [x] Handle tool use responses in `Gemini::InvokeModelResponse`.
- [x] Verify the model can call tools (e.g., getting the weather, or whatever tools are defined).

### Phase 6: Polish & Testing

**Goal:** Ensure code quality and stability.

- [x] Add RSpec tests for `Gemini::Client`.
- [x] Add RSpec tests for `Gemini::Turn` and `Gemini::InvokeModelRequest`.
- [x] Add VCR cassettes for API interactions (using WebMock stubs in this implementation).
- [x] Ensure error handling (map Gemini errors to `Gemini::ClientError` equivalents).

## Reference: Data Structures

**Gemini Content Format:**
```json
{
  "role": "user",
  "parts": [
    { "text": "Hello" },
    { "file_data": { "mime_type": "...", "file_uri": "..." } }
  ]
}
```

**Gemini Tools Format:**
```json
{
  "function_declarations": [
    {
      "name": "get_weather",
      "description": "...",
      "parameters": { ... }
    }
  ]
}
```
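The two reference structures above can be combined into a single request payload. The following is a minimal sketch of that assembly in plain Ruby, not the adapter's actual code: `SketchTurn`, `SketchFile`, `SketchTool`, `content_for`, `tools_for`, and `payload_for` are hypothetical names introduced for illustration. Placing file parts before the text part and wrapping the declarations in a `tools` array are assumptions (the former loosely based on the commit titled "Adds files before last user prompt").

```ruby
require "json"

# Hypothetical stand-ins for the app's history, file, and tool objects.
SketchTurn = Struct.new(:role, :text, :files, keyword_init: true)
SketchFile = Struct.new(:mime_type, :uri, keyword_init: true)
SketchTool = Struct.new(:name, :description, :parameters, keyword_init: true)

# Build one entry of the `contents` array: a role plus a list of parts.
# Assumption: file parts are emitted before the text part.
def content_for(turn)
  file_parts = (turn.files || []).map do |file|
    { file_data: { mime_type: file.mime_type, file_uri: file.uri } }
  end
  { role: turn.role, parts: file_parts + [{ text: turn.text }] }
end

# Wrap tool definitions in the `function_declarations` shape.
# Assumption: the payload carries them as a one-element `tools` array.
def tools_for(tools)
  declarations = tools.map do |tool|
    { name: tool.name, description: tool.description, parameters: tool.parameters }
  end
  [{ function_declarations: declarations }]
end

# Assemble the payload; the `tools` key is omitted when no tools are given.
def payload_for(turns, tools: [])
  payload = { contents: turns.map { |turn| content_for(turn) } }
  payload[:tools] = tools_for(tools) unless tools.empty?
  payload
end
```

A call such as `JSON.generate(payload_for(turns, tools: tools))` would then yield the JSON body; keeping the builders as small pure functions makes each shape easy to cover with a spec that needs no HTTP stubbing.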
@@ -0,0 +1,91 @@
# Gemini API Integration Specifications

This document outlines the plan to integrate the Gemini API adapter into the existing Rails application, ensuring seamless user interaction for model selection, text generation, and file uploads.

## Analysis

### Current Architecture
1. **Model Selection:**
   - **UI:** `PromptFormComponent` uses `GenerativeText.active_models` to populate the model dropdown. Since Gemini models are now registered in `GenerativeText::MODELS`, they automatically appear in the UI.
   - **Frontend:** `prompt_form_controller.js` handles file input toggling based on model capabilities (already supported via `model_data` serialization).
   - **Settings:** User settings for default models (`User#setting.text_model`) are string-based and agnostic to the vendor.

2. **Request Handling:**
   - **Controller:** `ConversationsController` uses `ConversationForm` to process requests.
   - **Form:** `ConversationForm` creates a `GenerateTextRequest` with the selected model.
   - **Job:** `GenerateTextJob` executes the request in the background. It calls `GenerativeText.new.invoke_model(request)`.
   - **Service:** `GenerativeText` delegates to the appropriate client (Anthropic or Gemini) based on the model's vendor.

3. **File Uploads:**
   - **Controller:** `ConversationContextsConversationsController` handles file uploads. Logic has been updated to use `Gemini.upload_file` when the user's preferred model is a Gemini model.
   - **Context:** `ConversationContext` stores the file reference (URI for Gemini, ID for Anthropic) and mime type.

4. **Response Handling:**
   - **Job:** `GenerateTextJob` expects the response object to respond to `.data` (for storage) and `.content` (for broadcasting).
   - **View:** `GenerateTextRequestComponent` renders the response content.

### Identified Gaps
1. **Missing `data` Method:** `Gemini::InvokeModelResponse` does not expose the raw response data via a `data` method, which is required by `GenerateTextJob` to save the raw response to the database.

## Development Phases

### Phase 1: Fix Response Interface

**Goal:** Ensure `Gemini::InvokeModelResponse` adheres to the interface expected by `GenerateTextJob`.

- [x] Update `Gemini::InvokeModelResponse` to expose `@response_json` via a `data` method/attribute.
- [x] Add a spec to verify `data` returns the raw hash.

### Phase 2: User Interface Polish (Optional)

**Goal:** Improve visual distinction between models.

- [x] (Optional) Update `ConversationTurnComponent` or CSS to display vendor-specific icons (e.g., Google logo for Gemini) if desired. Currently, it uses a generic robot icon.

### Phase 3: End-to-End Verification

**Goal:** Verify the full flow from UI to Database.

- [x] Verify `GenerateTextJob` runs successfully with a Gemini model.
- [x] Verify `GenerateTextRequest` saves the raw Gemini JSON response in the `response` column.
- [x] Verify streaming works in the browser (simulated via system tests).

## TODOs

- [x] Fix `Gemini::InvokeModelResponse#data`.
- [x] Verify `GenerateTextJob` with Gemini model via console/test.

### Phase 4: Provider-Aware Conversation Contexts

**Goal:** Make `ConversationContext` explicitly aware of its provider to ensure only compatible contexts are used.

- [x] Add `vendor` column to `ConversationContext` table.
- [x] Update `ConversationContextsConversationsController#create` to determine and save the `vendor` when uploading a new file.
- [x] Update `InvokeModelRequest` for both `Anthropic` and `Gemini` to filter contexts based on the active model's vendor.

### Phase 5: Dynamic Context UI

**Goal:** Update the "Attach File" UI to dynamically show contexts that are compatible with the selected model.

- [x] Add a new route/action to fetch available `ConversationContext` records filtered by `vendor`.
- [x] Update `prompt_form_controller.js` to fetch and render the filtered context list when the model selection changes.
- [x] Create a Turbo Stream view to render the updated context list.
- [x] Add vendor badges to the context selection UI.
- [x] Allow uploading `.md` (markdown) files.

### Phase 6: Update Gemini Models

**Goal:** Update the Gemini model list to the latest models.

- [x] Update `lib/gemini.rb` with the latest model names and ensure they are marked as active.
- [x] Set max tokens to 65,536 for all models.

### Phase 7: Conversation Contexts

- [x] In `app/views/conversation_contexts_conversations/index.html.haml`, `@available_contexts` should be scoped to the model selected in the prompt form component. When an Anthropic model is selected, the available contexts should only be Anthropic file uploads. When a Google model is selected, the available contexts should only be Google file uploads. This needs to happen dynamically.
- [x] In `app/views/conversation_contexts_conversations/index.html.haml`, show a vendor badge (Anthropic or Google). It needs to change dynamically when the user selects a model in the prompt form component. Also show a vendor badge next to each of the selected conversation contexts, in a disabled state when the currently selected model's vendor differs from the context's vendor.
- [x] In `app/views/conversation_contexts/_conversation_context.html.haml`, show a vendor badge next to each conversation context.
- [x] Create a scheduled Sidekiq job that deletes Google file uploads / conversation contexts 48 hours after they are uploaded / created.
- [x] When creating a conversation_conversation_context record, the front end should pass the vendor. Currently the vendor is inferred from the user's settings, which may not match the model selected in the prompt form component.
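The vendor filtering from Phase 4 and the 48-hour cleanup from Phase 7 both reduce to simple selections over context records. The sketch below illustrates that logic in plain Ruby under stated assumptions: `SketchContext`, `contexts_for_vendor`, and `expired_google_contexts` are hypothetical names, and a real implementation would presumably express the same conditions as database queries inside the controller and the scheduled job.

```ruby
# Hypothetical stand-in for a ConversationContext row with a vendor column.
SketchContext = Struct.new(:id, :vendor, :created_at, keyword_init: true)

# 48 hours, expressed in seconds.
GOOGLE_FILE_TTL_SECONDS = 48 * 60 * 60

# Keep only the contexts uploaded for the given vendor
# (accepts a symbol or a string such as "google").
def contexts_for_vendor(contexts, vendor)
  contexts.select { |context| context.vendor == vendor.to_sym }
end

# Select Google contexts created 48 or more hours before `now`.
# `now` is injectable so the cutoff can be tested deterministically.
def expired_google_contexts(contexts, now: Time.now)
  cutoff = now - GOOGLE_FILE_TTL_SECONDS
  contexts.select do |context|
    context.vendor == :google && context.created_at <= cutoff
  end
end
```

Passing `now:` explicitly keeps the expiry rule independent of the scheduler, so the same predicate could be exercised in a spec with fixed timestamps.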
Check warning: Code scanning / CodeQL
Reflected server-side cross-site scripting (Medium)
Copilot Autofix (AI, 4 months ago)
In general, to fix this, any user-controlled value interpolated into HTML must be escaped or strongly validated/whitelisted before being marked `html_safe`. You should avoid calling `html_safe` on strings that contain raw user input, or ensure the user input is sanitized (e.g., `ERB::Util.html_escape`) or restricted to a known safe set of values.

The minimal fix here without changing existing behavior is: ensure `@vendor` is converted into a safe, normalized value before being interpolated, and avoid passing raw `params[:vendor]` through. We already have a `current_vendor` method that derives a vendor symbol from `params[:vendor]` or other internal sources, so we can reuse that. In the `available` action, instead of setting `@vendor = params[:vendor]`, set `@vendor = current_vendor`. When interpolated in the label `"Conversation Context … #{@vendor.to_s.titleize}"`, this will now be built from the whitelisted symbol returned by `current_vendor`, not arbitrary user input. We can keep the `html_safe` call because only the known vendor name will be inserted between tags, and Rails will escape the symbol’s string representation when it was originally constructed or ensure it is not tainted with arbitrary HTML; alternatively, if you want to be stricter, you could wrap `@vendor.to_s.titleize` with `ERB::Util.html_escape`, but that requires adding an import. The most straightforward fix within the shown code is to change line 70 to use `current_vendor`.

Concretely:
- `app/controllers/conversation_contexts_conversations_controller.rb`: in the `available` action, replace `@vendor = params[:vendor]` with `@vendor = current_vendor`.
- `current_vendor` is already defined in this controller.
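The whitelist-then-escape idea described above can be sketched in plain Ruby. This is an illustration under assumptions, not the controller's code: `ALLOWED_VENDORS`, `normalize_vendor`, `vendor_label`, the `:anthropic` fallback, and the label text are all hypothetical, and the real `current_vendor` method may derive its value differently. Only `ERB::Util.html_escape` is an existing standard-library call.

```ruby
require "erb"

# Hypothetical whitelist of vendor values the UI is allowed to display.
ALLOWED_VENDORS = %i[anthropic google].freeze

# Map arbitrary request input onto a known vendor symbol.
# Anything outside the whitelist falls back to the default (an assumption).
def normalize_vendor(raw, default: :anthropic)
  candidate = raw.to_s.strip.downcase.to_sym
  ALLOWED_VENDORS.include?(candidate) ? candidate : default
end

# Build a label from the normalized vendor only, escaping it as a second
# line of defense before it is placed into HTML.
def vendor_label(raw_vendor)
  vendor = normalize_vendor(raw_vendor)
  "Vendor: #{ERB::Util.html_escape(vendor.to_s.capitalize)}"
end
```

With this shape, raw request input never reaches the markup: a value such as `"<script>"` is replaced by the fallback vendor before any string interpolation takes place.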