
Agentic_Rad_Workflow_MedGemma_1.5

  • Agentic workflow for radiology using MedGemma 1.5.
  • An introduction for radiologists learning to use and write code for local AI.
  • For devs/programmers, a peek into the world of a radiologist hoping to build local AI he'd actually want to use.

(If you already know Ollama and VS Code and just want the agentic workflow with Google ADK, read the Disclaimer and then go straight to Section V: Advanced Mode: Agentic Workflow for Radiology.)

I. Introduction/Inspiration

I'm a radiologist by profession, but I also enjoy programming/coding my own AI web apps (I started before the VibeCoding era, so I literally had to learn JavaScript and Python by lurking on Stack Overflow). I like the intersection of clinical and diagnostic radiology with software development, especially with the advent of AI. Most AI projects I see today focus on exponential/scalable use cases for enterprise-level products catered to hospitals, clinics, or even public health projects. The goal is always AI at scale: produce a scalable solution first, then trickle it down for acceptability to the radiologist, the actual end user. This approach, however, often hits a common roadblock at the end: trust in the product from the actual end user, the radiologist. No matter how often a model is validated internally, it will always need external and local validation to ensure that it remains performant for actual deployment at the local hospital, clinic, or project site. The alternative is that radiologists/doctors just "trust" the AI with their licenses, their practice, and their patients' lives. This is less of a problem if the hospital or clinic's own radiologists/doctors developed the AI themselves, but the burden of proof increases when the AI was developed elsewhere.

I have deployed my own AI web apps featured at the Google TensorFlow Dev Summit in 2020 (with an associated blog post), co-authored a published study on the use of AI for PTB screening/detection at multicenter sites, helped write guidelines for AI in radiology for our Philippine College of Radiology, and written an editorial on AI and the value of trust. It is this "trust" that I always see as a major challenge for AI implementation, especially in my country, where we do not yet have the infrastructure to routinely validate, train, and deploy our own medical AI at scale. Privacy is always important for patients (hence HIPAA), but what about the end user's privacy, the doctor's privacy? It's not just patients who can be data-mined, but doctors and their expertise too. Sometimes you just want to talk to an AI without the constant fear that everything you type might be mined so that the AI you are talking to can replace you. You want that privacy but still want access to the more powerful online models. So how do you bridge the gap? This is where I think MedGemma 1.5 and agentic workflows can truly shine: bridging the gap between "Privacy" and "Ability," from MedGemma to Google Gemini.

This project is not meant to compete with enterprise-level, large-scale solutions; I believe those still have a place. Instead, I hope this project will serve as a starting point: a bridge between the doctor in me and the programmer in me, a bridge between other programmers (who want a peek at what a local, solitary radiologist might actually want from AI versus enterprise-level solutions) and other radiologists (residents, fellows, or even consultants) who want to feel empowered by AI and work with AI, not just end up in fear of it.

II. Disclaimer

This project, including the implementations of:

  1. Local Chat Bot: utilizing Ollama and MedGemma for interactive local querying.

  2. Radiology Integrated Reporting Environment (IRE): utilizing VS Code and MedGemma 1.5 for integrated reporting support.

  3. Agentic Workflow: utilizing MedGemma and Google ADK for image classification (Emergency, Abnormal, Normal), report drafting, and email notification,

is strictly a proof of concept, developed solely for educational and demonstrative purposes.

Not for Clinical or Professional Use

NONE of the components within this project are intended or validated for use in a professional clinical, diagnostic, or patient-care setting. This solution is not a substitute for professional medical judgment, validated clinical software, or established diagnostic procedures. The outputs generated by the models (MedGemma, Gemma, Gemini via ADK) are computational suggestions and must not be used to guide patient treatment or diagnosis without review by a licensed medical professional.

No Warranty and Limitation of Liability

THIS PROJECT IS PROVIDED "AS IS," WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, AND NONINFRINGEMENT.

The creator and developer of this project will not be held liable for any damages, losses, or other liabilities arising from the use or inability to use the software. Users assume all responsibility and risk for the use of this project in any manner they choose.

III. Easy Mode: Local AI Chat Bot (MedGemma 1.5 and Ollama)

This guide details the process for setting up and running MedGemma 1.5 locally using Ollama, transforming your personal workstation into a private, medical-focused chat environment. Ollama is just one of the options. I use Ollama for the demo since I feel it will be easiest to get up and running for my fellow radiologists and doctors (without messing with too much code). For more technical readers, other options include, but are not limited to, LM Studio and llama.cpp.

Prerequisites

  • A computer with sufficient hardware (CPU, RAM, and potentially a GPU for faster inference).
  • Access to a command-line interface (Terminal on macOS/Linux, Command Prompt/PowerShell on Windows).

1. Install Ollama

Ollama is a simplified framework for running large language models locally.

  1. Download Ollama: Navigate to the official Ollama website and download the application appropriate for your operating system (macOS, Linux, or Windows).
  2. Install: Run the installer and follow the on-screen instructions.
  3. Optional: Configure Model Directory: If you wish to store the downloaded models in a specific location other than the default, open the Ollama application settings and adjust the models download folder path. This is helpful when the models are big and you want to keep them on a separate disk on your computer.

2. Download and Run MedGemma 1.5 GGUF Model

You will use the command line to download the quantized GGUF version of MedGemma 1.5 from Hugging Face, optimized by Unsloth, directly through Ollama.

  1. Open your terminal/command line interface.
  2. Execute the following command: This command tells Ollama to download and run the specified model. Since it's the first time running it, Ollama will first download the model file.
    ollama run hf.co/unsloth/medgemma-1.5-4b-it-GGUF:Q4_K_M
    Wait for the download to complete. This model is around 2-3.3 GB in size.
  3. Start Chatting: Once the download is finished, the model will start running, and you will be in the interactive chat environment with MedGemma 1.5. You can now type your queries.
  4. Exit: To exit the chat session, type /bye or press Ctrl + D.

3. Optional: Rename the Model for Easier Access

The initial model name is quite long. You can rename it within Ollama for simpler future use.

  1. Open your terminal/command line interface.
  2. Execute the copy command: This command copies the existing model with the long name and assigns it a simpler alias, medgemma-1.5.
    ollama cp hf.co/unsloth/medgemma-1.5-4b-it-GGUF:Q4_K_M medgemma-1.5
  3. Run with the new name: You can now start the chat bot using the simpler name:
    ollama run medgemma-1.5

4. Using a Graphical User Interface (GUI)

  1. Ollama GUI: Click the Ollama icon or run the Ollama app (you can press the Windows key and search for "Ollama" on your computer) to launch the Ollama GUI.
  2. Select the downloaded model (either the original hf.co/unsloth/medgemma-1.5-4b-it-GGUF:Q4_K_M or the renamed medgemma-1.5) from the model selection dropdown within the Ollama GUI.

You now have a local, private MedGemma 1.5 chat bot running on your machine, where all resources and data stay local and secure.
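Beyond the interactive CLI and GUI, the same local model can also be queried programmatically: Ollama exposes a REST API on http://localhost:11434 by default. The sketch below is a minimal illustration (it is not part of this repo), assuming the model was renamed to medgemma-1.5 as in step 3; the helper names build_payload and ask_medgemma are my own.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default endpoint

def build_payload(model: str, prompt: str) -> dict:
    """Build the JSON body for Ollama's /api/generate endpoint."""
    # stream=False asks Ollama for a single JSON reply instead of chunks.
    return {"model": model, "prompt": prompt, "stream": False}

def ask_medgemma(prompt: str, model: str = "medgemma-1.5") -> str:
    """Send one prompt to the local Ollama server and return the reply text."""
    body = json.dumps(build_payload(model, prompt)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# Example (requires a running Ollama instance with the model already pulled):
#   print(ask_medgemma("List three causes of a widened mediastinum."))
```

Nothing here leaves your machine; the request goes only to the local Ollama server.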

IV. Intermediate Mode: Radiology Integrated Reporting Environment (MedGemma 1.5 and Visual Studio Code)

This guide details how to integrate the local MedGemma 1.5 model into your Visual Studio Code (VS Code) environment using the Continue extension, turning it into a powerful, context-aware Integrated Reporting Environment (IRE) for drafting and checking radiology reports or looking up radiology notes. Again, you do not need to use VS Code; another option is Google Antigravity.

Prerequisites

  • Ollama is installed.
  • The MedGemma 1.5 GGUF model is downloaded and accessible via Ollama (e.g., under the alias medgemma-1.5).
  • VS Code is installed.

1. Set Up VS Code and the Continue Extension

  1. Download and Install VS Code: If you haven't already, download and install Visual Studio Code.
  2. Install the Continue Extension:
    • Open VS Code.
    • Go to the Extensions view by clicking the square icon on the left sidebar or pressing Ctrl+Shift+X.
    • Search for Continue and click Install.

2. Run Ollama to Serve MedGemma 1.5

Ensure your local MedGemma 1.5 model is running and accessible to external applications like VS Code.

  1. Run Ollama: Open your terminal or command line and execute the command to start the model server:
    ollama run medgemma-1.5
    (Replace medgemma-1.5 with the full model name if you did not rename it.) Alternatively, you can start Ollama by opening the Ollama application (clicking the application icon).

3. Connect Continue to Ollama

Now, you need to configure the Continue extension within VS Code to use your locally running MedGemma model.

  1. Open Continue: Click on the Continue icon (a newly added icon in the left sidebar of VS Code).
  2. Add Chat Model:
    • In the Continue interface, find the model selection area (usually labeled "Select model").
    • Click on Add Chat Model.
  3. Configure Ollama Provider:
    • Click on the Provider dropdown and select Ollama.
  4. Autodetect Model:
    • Click on the Model dropdown and select Autodetect. Continue will scan your running Ollama instance and find MedGemma 1.5.
  5. Connect: Click Connect. The Continue extension should now be successfully connected to your local Ollama server hosting MedGemma 1.5.
  6. Select MedGemma 1.5: Go back to the main Continue chat interface. You should now be able to select medgemma-1.5 (or the detected model name) from the model selection dropdown menu.

4. Use MedGemma 1.5 for Context-Aware Reporting

With the model connected, you can now use it within your reporting workflow, allowing it to reference the file you are currently working on.

  1. Open Your Report File: Open your radiology report file (e.g., report.txt, report.md, or the active reporting window content).
  2. Add Context: In the Continue chat box, type the @ symbol. A list of options will appear, allowing you to select a file to provide as context to the model. Select the file containing your report.
  3. Ask for Assistance: You can now ask MedGemma 1.5 to perform tasks relevant to the context of your report. For example:
    • "Double check this report for inconsistencies in laterality (right vs. left confusion)."
    • "Suggest a more professional phrasing for the finding described in the third paragraph."
    • "Review the key findings and draft a more concise summary for the referring physician."

MedGemma 1.5 will now use the content of the selected file to provide context-aware, private suggestions, significantly enhancing your reporting workflow.

V. Advanced Mode: Agentic Workflow for Radiology (MedGemma 1.5, Google ADK, and Gemini)

Agentic Rad Workflow connecting MedGemma 1.5, Gemma, and Google Gemini: App Setup & Usage Guide

Prerequisites

  • Git is installed. Go to the Git website to download and install Git. (To my fellow radiologists or first-time users: this is so you can use git commands in the terminal to clone my repository.)

  • Python is installed (see the Python website).

  • VS Code is installed.

  • Ollama is installed.

    • The MedGemma 1.5 GGUF model is already downloaded for Ollama and renamed to "medgemma-1.5" (see above for instructions).
    • Gemma is downloaded for use with Ollama.
      • This agentic workflow uses a second local AI or large language model (LLM) called Gemma (the more general model that MedGemma is based on). Download it for use with Ollama with the following command in your terminal:
        ollama run gemma3:4b
        
  • API keys and an .env file.

    • Since we will be using an agentic workflow so that local models can use the power of an online model like Google Gemini and an email MCP server tool, we will need API keys to access these external online features.

    • 1. Gemini API Key (for Google Gemini 3 Flash)

      This key grants your application access to Google's powerful online language models, such as Gemini 3 Flash.

      1. Go to Google AI Studio: Navigate to the official Google AI Studio website.
      2. Generate API Key: In the user interface, typically located at the lower-left corner, click on "API key" (or a similar option like "Create API key").
      3. Follow Instructions: Complete the on-screen instructions to generate your new API key.
      4. Copy and Secure: Immediately copy the generated API key. This key is sensitive. Do not expose it in your code.
      5. Storage: Keep this key safe for storage in your local .env file in step 3.
      6. Free vs. Paid (Google AI Studio and Google Cloud): I am currently using only the free tier, which has a limited number of runs per day. If you are new to using API keys, I suggest you start with this as well before moving on to paid versions. I chose to explain how to get an API key from Google AI Studio since this might be easier for my fellow doctors and radiologists to follow. Google Cloud and Vertex AI can always be used for more functionality and control by more seasoned programmers and developers.
    • 2. Zapier MCP Server Setup (for the Create and Send Gmail Tool): This setup uses Zapier's MCP server to securely send emails via a dedicated server, which the Gemini agent will leverage. You do not have to use Zapier; you can use other MCP server tools for sending emails or even explore Google Cloud or Vertex AI. I just chose Zapier for this quick demo (so my fellow radiologists/doctors don't need to write more code).

      Reference: For more details, see the Zapier MCP Guide.

      1. Go to Zapier MCP: Navigate to the Zapier MCP website. Create a Zapier account if you don't already have one, and then click Start Building.
      2. Create New MCP Server: Once in the Zapier MCP interface, click New MCP Server.
      3. Set Client: Set the client to "other" (since we are using the Google ADK).
      4. Add Tool: Click "Add tool".
      5. Select and Connect Gmail:
        • Select Gmail. Zapier will ask for permission to connect with your Google/Gmail account.
        • Security Note: It is highly recommended to use a separate Gmail account specifically for programming projects or AI demos. (not your personal, private, or professional email).
      6. Select Send Email Action: Now, select the Send Email action.
      7. Configure Send Email Form: A form will appear.
        • Set the Gmail account to your connected or nominated email account.
        • Set the To: and From: fields to your nominated email account.
        • Fill in as many of the fields as possible to reduce the agent's work.
        • Crucially, keep the "subject" and "body" set to "Have AI generate a value for this field". This allows the AI Agentic workflow to fill these in later based on the findings or drafted email content.
      8. Save and Connect:
        • Click Save.
        • Click Connect at the top middle of the screen. This will give you the option to get your ZAPIER_URL and ZAPIER_TOKEN.
      9. Copy and Secure: Make sure to immediately copy and safely store your ZAPIER_URL and ZAPIER_TOKEN for step 3.
    • 3. Storing API Keys Securely in a .env File

      Once you have both the Gemini API Key and the Zapier MCP URL and Token, you must store them securely in a .env file within your project's root directory. This prevents the keys from being accidentally committed to version control (like GitHub).

      1. Create .env File: In the root directory of your project (e.g., Agentic_Rad_Workflow_MedGemma), create a new file named .env.
      2. Add Keys: Open the .env file and add the keys using the following format:
          # Google Gemini API Key
          GOOGLE_API_KEY="PASTE_YOUR_GEMINI_API_KEY_HERE"
      
          # Zapier mcp server email tool access
          ZAPIER_URL="PASTE_YOUR_ZAPIER_URL_HERE"
          ZAPIER_TOKEN="PASTE_YOUR_ZAPIER_TOKEN_HERE"
      

      Crucial Security Note: Ensure your project's .gitignore file includes the line .env to prevent the accidental upload of your API keys to public repositories. Always keep your Private API keys safe and secure. These are technically passwords.

      1. LiteLlm variables: In addition to the Google Gemini and Zapier keys, we need two more variables. To access the local MedGemma 1.5 and Gemma models served from Ollama, we use Google ADK's built-in LiteLlm wrapper (through Ollama's OpenAI-compatible endpoint), which needs the following:
      OPENAI_API_BASE=http://localhost:11434/v1 
      OPENAI_API_KEY=anything
      

      Please add these to the .env file.
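For reference, here is how those variables are typically read back at runtime. Real projects usually just call python-dotenv's load_dotenv(); the parser below is a dependency-free sketch of the same idea so you can see the expected file format (the function name load_env_file is my own, not code from this repo).

```python
import os

def load_env_file(path: str = ".env") -> dict:
    """Minimal .env reader: KEY=VALUE lines, '#' comments, optional quotes."""
    values = {}
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue  # skip blanks, comments, and malformed lines
            key, _, value = line.partition("=")
            values[key.strip()] = value.strip().strip('"').strip("'")
    return values

# The agentic workflow expects at least these four variables to be set:
EXPECTED_KEYS = ("GOOGLE_API_KEY", "ZAPIER_URL", "ZAPIER_TOKEN", "OPENAI_API_BASE")

if os.path.exists(".env"):
    os.environ.update(load_env_file(".env"))
```

If any of the four expected keys is missing, the corresponding agent (Gemini or the Zapier email tool) will fail to authenticate.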

Step-by-Step Guide for Running the Agentic Workflow

1. Set Up Your Development Environment

  • Ensure Python is installed and accessible from the terminal.
  • Open VS Code, click on your terminal (at the bottom of the screen) and navigate to your designated project folder.

2. Clone the Project Repository

  • Open the terminal in VS Code (or Git Bash).
  • Navigate to your project folder
    cd /path/to/your/project/folder
  • Clone the repository:
    git clone https://github.com/RadEdje/Agentic_Rad_Workflow_MedGemma.git

This downloads the project files to your current directory.

  • Enter the root folder of the cloned project by typing the following into your terminal in VS Code:
    cd Agentic_Rad_Workflow_MedGemma
    

3. Set Up the Virtual Environment

  • Create a virtual environment:
    python -m venv .venv
  • Activate the virtual environment:
    • Linux/macOS:
      source .venv/bin/activate
    • Windows: Command Prompt (cmd)
      .venv\Scripts\activate
    • Windows: Git Bash
      source .venv/Scripts/activate
      I prefer to use Git Bash on Windows. If you are using the terminal in VS Code, click the option for Git Bash (instead of Command Prompt).
  • Upgrade pip:
    python -m pip install --upgrade pip
  • Install dependencies:
    pip install -r requirements.txt
    This installs all required Python packages.

4. Run the Application

  • Start the web interface:
    adk web
    This launches the Google ADK web interface running the sequential agentic workflow.
    • Select agent_1 from the dropdown menu on the upper-left corner.
  • Upload an Image for Analysis:
    • Click the "Attach" icon in the chat interface.
    • Upload an image for MedGemma1.5 to analyze.
    • Optional: Add a PID
      • Enter a PID (Personal ID) in the chat box in the format PID: [NUMBER] to simulate a unique identifier.
      • This simulates metadata that can be passed on to the other agents.
  • Send the Request:
    • Click "Send" or press Enter to submit the image and PID.
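The PID: [NUMBER] convention is just free text that the agents pick out of the chat message. As a small illustration, an identifier in that format can be extracted with a regex like the one below (this helper is hypothetical, not code from the repo; the actual handling lives in the agents' instructions):

```python
import re
from typing import Optional

# Matches the "PID: <digits>" convention typed into the chat box.
PID_PATTERN = re.compile(r"PID:\s*(\d+)")

def extract_pid(message: str) -> Optional[str]:
    """Return the PID digits from a chat message, or None if absent."""
    match = PID_PATTERN.search(message)
    return match.group(1) if match else None

# extract_pid("Chest X-ray attached. PID: 12345") returns "12345"
```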

5. Agentic Workflow Execution

  • MedGemma 1.5 Analysis (uses the local MedGemma 1.5)
    • The image is sent to the MedGemma_Analyzer_Agent, which uses MedGemma 1.5 to analyze the image.
    • Output is generated and passed to the next agent.
  • Gemma_Draft_Agent (uses the local Gemma model)
    • The Gemma_Draft_Agent filters the analysis results to extract key information.
    • It drafts an email with a subject and body based on the filtered data.
  • Async_Email_Agent (uses the online Gemini model)
    • The Async_Email_Agent (running Gemini) handles the drafted email.
    • Only filtered data (no private image details) is sent to the MCP server.
    • Gemini 3 coordinates with the MCP server to send the email to the designated address.
  • Modifying instructions for each AI agent
    • You can modify the specific instructions for each of the agents by editing the agent.py found within the agents folder (folder bearing the agent's name).
    • For example: agent.py in MedGemma_Analyzer_Agent contains the instructions for the MedGemma Agent (analyzes the image).
    • For example: agent.py in Gemma_Draft_Agent contains the instructions for the Gemma Draft Agent (drafts the email).
    • For example: agent.py in Async_Email_Agent contains the instructions for the Email Agent running Gemini (sends the email).
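To make the data flow concrete, here is a toy sketch of the three-stage hand-off. These functions are stand-ins with canned outputs, not the actual ADK agents (the real workflow chains LLM-backed agents sequentially); the point is the privacy boundary: the raw image is dropped at the local drafting stage, so the online email stage only ever sees filtered text.

```python
def medgemma_analyze(image_bytes, pid):
    """Stage 1 (local MedGemma 1.5): triage the image and describe findings.
    Canned result shown; the real agent calls the local model via Ollama."""
    return {"pid": pid, "triage": "Emergency",
            "findings": "Large right pneumothorax.", "raw_image": image_bytes}

def gemma_draft(analysis):
    """Stage 2 (local Gemma): keep only what the email needs.
    Note that the raw image is deliberately dropped here."""
    return {"subject": f"[{analysis['triage']}] PID {analysis['pid']}",
            "body": analysis["findings"]}

def email_agent(draft):
    """Stage 3 (online Gemini + MCP server): receives only the filtered draft."""
    return {"sent": True, "payload": draft}

# Chain the stages in the same order as the sequential agentic workflow.
result = email_agent(gemma_draft(medgemma_analyze(b"fake-image-bytes", "12345")))
assert "raw_image" not in result["payload"]  # no image data leaves the machine
```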

6. Bonus Round: Web App

  1. If you are still in VS Code, go to the frontEnd folder and open the index.html file.
  2. Click the VS Code Extensions button on the left side panel and install the Live Server extension. Once installed, a "Go Live" option/button should appear at the lower right of VS Code. You can use any other extension or any other way to serve index.html; Live Server is just the one I used for this demo (so my fellow radiologists/doctors don't need to write more code).
  3. Open the terminal (I'm using Git Bash) and type the following:
    python adk_api.py
    
    This will launch a FastAPI server for the web app.
  4. With the index.html file open, click the Go Live button to launch the web app.
  5. You can drag and drop an image onto the chat and add a PID as well. Google ADK web is not meant for production; this is my sample attempt at a minimum viable product (MVP).

Notes, and Tips for Developers and Radiologists

  • Model Requirements: Ensure all locally hosted models (MedGemma 1.5 and Gemma) are properly installed and accessible. Make sure Gemini 3 is accessible via a valid Google API key.
  • MCP Server: The Async_Email_Agent requires a working MCP server connection to send emails. If you don't want to use Zapier MCP, feel free to use other MCP servers for sending email; just make sure you have an alternative MCP server tool in place.
  • I started this github repo to form a bridge between developers (who might want to work on health related AI agents) and radiologists (who might want to start a study or do research related to agentic AI workflows). This is why I tried to minimize or balance out writing code with more GUI related options (like Zapier MCP or Ollama).
  • Our country (the Philippines) has a lot of talent on both the developer side and the radiology side. Let's bridge the gap with local AI. Let's bridge the gap with MedGemma 1.5.

Next Steps

  • Explore extending the workflow with additional agents or features. I'm thinking of building a "loop" agent with Google ADK for "deep research" with a Google Search API.
  • Customize Agents: Modify agent logic in the agents/ directory for different workflows. MedGemma 1.5 still looks at images without considering that in radiologic imaging the patient is facing "us," so what is on the agent's left is actually the patient's right. I'm still working on a system prompt for the MedGemma Analyzer Agent that fully addresses this.
  • Experiment with Models: Try using different models for the agents or different versions of MedGemma, Gemma, and Gemini
  • Explore Structured Input/Output: See if I can further standardize the outputs of the initial agents (using Google ADK) for the subsequent agents, as well as the final output in the email. Structured JSON might be better for connecting the agentic workflow to other systems or if I plan to create my own MCP server.
  • Potential Real-World Validation: Reach out to fellow doctors who may want to evaluate potential use in public health projects needing pulmonary tuberculosis screening and triage in underserved areas of our country (the Philippines), or check intracranial hemorrhage detection for neuro/brain attack team workflows.

References, Citations, Pertinent links:

Fereshteh Mahvar, Yun Liu, Daniel Golden, Fayaz Jamil, Sunny Jansen, Can Kirmizi, Rory Pilgrim, David F. Steiner, Andrew Sellergren, Richa Tiwari, Sunny Virmani, Liron Yatziv, Rebecca Hemenway, Yossi Matias, Ronit Levavi Morad, Avinatan Hassidim, Shravya Shetty, and María Cruz. The MedGemma Impact Challenge. https://kaggle.com/competitions/med-gemma-impact-challenge, 2026. Kaggle.

Sellergren et al. "MedGemma Technical Report." arXiv preprint arXiv:2507.05201 (2025).

Links to Models

https://huggingface.co/google/medgemma-1.5-4b-it

https://huggingface.co/unsloth/medgemma-1.5-4b-it-GGUF

Links to Videos

Sink or Sync, AI and I with MedGemma 1.5
