Avalon-S/LLM-Health-Assistant

LLM-Health-Assistant


Logo


Disclaimer

The LLM Health Assistant provides general information only and does not constitute medical advice. It does not establish a doctor-patient relationship. Always consult a qualified healthcare professional for medical concerns. We are not responsible for any decisions made based on the platform’s information.


Table of Contents

  1. 📖Introduction
  2. 🏗️Service Architecture
  3. 🛠️Technology Stack and Development Tools
  4. 🚀Usage
  5. 🎥Project Display
  6. 🔍Reflection
  7. 📜License

Introduction

The LLM Health Assistant is a health consultation platform based on a large language model (LLM), leveraging generative AI and retrieval-augmented generation (RAG) technologies to provide users with personalized and intelligent health Q&A services. The system integrates multiple functional modules, including text interaction, voice interaction, PubMed paper retrieval, user information management, and conversation storage.

Back to Table of Contents


Service Architecture

The system follows a four-layer architecture (plus a presentation layer) to ensure efficiency, scalability, and security:

  1. Presentation Layer

    • Provides the user interface for interactions, supporting both text and voice input.
    • Sends user requests to the Process Centric Layer for processing.
    • Key Components:
      • Web Frontend (HTML/CSS/JS): Login, health consultation, voice chat, user profile management.
  2. Process Centric Layer

    • Coordinates the overall business logic and invokes various APIs for task execution.
    • Core functionalities:
      • User Authentication (OAuth2 + JWT)
      • Text Chat Processing (GLM-4-Plus)
      • Voice Processing (GLM-4-Voice)
      • History Retrieval (Pinecone)
      • Medical Paper Search (PubMed API)
  3. Business Logic Layer

    • Handles AI interaction, context retrieval, and query parsing.
    • Core functionalities:
      • LLM Processing (GLM-4-Plus for intelligent responses)
      • Context Retrieval (Pinecone for historical conversation storage)
      • Query Handling (PubMed API for medical paper retrieval)
  4. Adapter Services Layer

    • Manages interactions with external APIs and ensures system extensibility.
    • Key components:
      • GLM-4-Plus API (Processes text-based queries)
      • GLM-4-Voice API (Handles voice interactions)
      • Pinecone Adapter (Stores and retrieves user conversations)
      • SQLite Adapter (Manages user authentication and data)
      • PubMed API (Fetches the latest medical research)
  5. Data Services Layer

  • Provides foundational AI and database services that power the application.
  • Key components:
    • GLM-4-Voice (Processes speech input and generates voice responses)
    • GLM-4-Plus (Handles text-based health queries and generates intelligent responses)
    • Pinecone (Stores and retrieves user conversation history for context-aware interactions)
    • PubMed (Provides medical research data for evidence-based health consultations)
    • SQLite (Manages user authentication and stores basic user information)
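To make the layer responsibilities above concrete, here is a minimal Python sketch of how a single chat request might flow through the stack. All class and method names are illustrative stand-ins, not the project's actual code.

```python
# Hypothetical sketch: one request flowing Process Centric -> Business Logic -> Adapter.

class PineconeAdapter:                      # Adapter Services Layer
    def retrieve_context(self, user_id, query):
        # The real adapter would query the Pinecone index for similar past turns.
        return ["(previous conversation snippets would be fetched here)"]

class LLMService:                           # Business Logic Layer
    def __init__(self, adapter):
        self.adapter = adapter

    def answer(self, user_id, query):
        # Enrich the prompt with retrieved context before calling the model.
        context = self.adapter.retrieve_context(user_id, query)
        prompt = "\n".join(context) + "\nUser: " + query
        return f"(GLM-4-Plus response to: {prompt!r})"

class ChatOrchestrator:                     # Process Centric Layer
    def __init__(self, service):
        self.service = service

    def handle_request(self, user_id, query):
        # In the real system, OAuth2/JWT authentication would happen here first.
        return self.service.answer(user_id, query)

orchestrator = ChatOrchestrator(LLMService(PineconeAdapter()))
reply = orchestrator.handle_request("user-1", "How much sleep do adults need?")
print(reply)
```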
System Architecture

Back to Table of Contents


Technology Stack and Development Tools

Backend

  • FastAPI (Lightweight web framework supporting high concurrency)
  • SQLite (Lightweight database for user data storage)
  • Pinecone (Vector database for user conversation history)
  • OAuth2.0 + JWT (User authentication for API security)
  • PubMed API (Medical literature retrieval)
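As an illustration of the OAuth2 + JWT scheme used for API security, the sketch below builds and verifies an HS256-signed token with only the standard library. A real deployment would rely on a maintained library such as PyJWT; the function names and payload here are hypothetical.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

def _b64url(data: bytes) -> str:
    # JWTs use URL-safe base64 with the padding stripped.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def create_token(payload: dict, secret: str) -> str:
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = _b64url(json.dumps(header).encode()) + "." + _b64url(json.dumps(payload).encode())
    sig = hmac.new(secret.encode(), signing_input.encode(), hashlib.sha256).digest()
    return signing_input + "." + _b64url(sig)

def verify_token(token: str, secret: str) -> Optional[dict]:
    """Return the payload if the signature and expiry check out, else None."""
    try:
        signing_input, sig_b64 = token.rsplit(".", 1)
        expected = hmac.new(secret.encode(), signing_input.encode(), hashlib.sha256).digest()
        if not hmac.compare_digest(_b64url(expected), sig_b64):
            return None  # signature mismatch
        payload_b64 = signing_input.split(".")[1]
        payload = json.loads(base64.urlsafe_b64decode(payload_b64 + "=" * (-len(payload_b64) % 4)))
        if payload.get("exp", float("inf")) < time.time():
            return None  # token expired
        return payload
    except ValueError:
        return None      # malformed token

token = create_token({"sub": "alice", "exp": time.time() + 3600}, "demo-secret")
print(verify_token(token, "demo-secret"))
```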

Frontend

  • HTML, CSS, JavaScript (For the user interface)
  • Fetch API (For frontend-backend communication)

AI Models

  • GLM-4-Plus (Text-based health consultation)
  • GLM-4-Voice (Voice input processing)
  • Sentence Transformer (all-MiniLM-L6-v2) (Text embedding for context retrieval and semantic search)
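The embedding-based context retrieval boils down to nearest-neighbour search by cosine similarity. The toy sketch below uses hand-made 3-dimensional vectors to show the principle; the actual system embeds text with all-MiniLM-L6-v2 (384 dimensions) and delegates the search to Pinecone.

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Toy 3-dimensional "embeddings"; the real system stores 384-dim
# all-MiniLM-L6-v2 vectors in a Pinecone index.
history = {
    "I am 34 years old": [0.9, 0.1, 0.0],
    "My knee hurts after running": [0.1, 0.8, 0.3],
    "What vitamins should I take?": [0.2, 0.3, 0.9],
}
query_vec = [0.85, 0.2, 0.05]  # hypothetical embedding of "how old am I?"

# Pick the stored turn whose vector is closest to the query vector.
best = max(history, key=lambda text: cosine_similarity(history[text], query_vec))
print(best)  # → I am 34 years old
```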

Development Tools

  1. Hardware
  • Operating System: Windows 11 Home
  • CPU: Intel(R) Core(TM) i7-14700HX @ 2.1 GHz
  • GPU: NVIDIA GeForce RTX 4070 Laptop GPU (8 GB)
  • Memory: 32 GB
  2. Software
  Tool          Purpose
  Anaconda      Development environment management
  VS Code       Code development
  JupyterLab    Early-stage experiment exploration
  Edge Browser  Frontend interface testing
  Postman       API testing
  3. External API Key Sources

Back to Table of Contents


Usage

  • During development, torch 2.6+cu124 was used for acceleration, but CUDA is not mandatory. Since CPU computation speed is within an acceptable range, the Docker image is built with the CPU version of torch for convenience. If you wish to use GPU acceleration inside the image, install the NVIDIA Container Toolkit; the Dockerfile and docker-compose.yml will also need to be reconfigured.

  • For user data management, this project also includes a database management system that allows querying and removing accounts from the two databases.

  1. Running Code

Use conda to manage the environment:

conda create -n sde python=3.9
conda activate sde 
git clone https://github.com/Avalon-S/LLM-Health-Assistant
cd LLM-Health-Assistant

Create a .env file in the project root and add the following API keys:

# SECRET_KEY can be generated randomly by you.
SECRET_KEY=your_secret_key
ZHIPU_API_KEY=your_glm_api_key
PINECONE_API_KEY=your_pinecone_key

For the Pinecone configuration, create an index named healthassistant with region us-east-1 on the AWS cloud.
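For reference, a .env file of this shape is usually loaded with a library such as python-dotenv. The stdlib sketch below shows the parsing that such a loader performs; load_env is a hypothetical helper, not part of this project.

```python
import os
import tempfile

def load_env(path):
    """Parse a simple KEY=VALUE .env file, ignoring blank lines and # comments."""
    env = {}
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            env[key.strip()] = value.strip()
    return env

# Demonstrate on a throwaway file shaped like the .env above.
with tempfile.NamedTemporaryFile("w", suffix=".env", delete=False) as fh:
    fh.write("# SECRET_KEY can be generated randomly by you.\n")
    fh.write("SECRET_KEY=your_secret_key\n")
    fh.write("ZHIPU_API_KEY=your_glm_api_key\n")
    fh.write("PINECONE_API_KEY=your_pinecone_key\n")
    path = fh.name

keys = load_env(path)
os.unlink(path)
print(sorted(keys))  # → ['PINECONE_API_KEY', 'SECRET_KEY', 'ZHIPU_API_KEY']
```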

  • Install Dependencies
pip install -r requirements.txt
  • Run the FastAPI Server
uvicorn main:app --host 0.0.0.0 --port 8000 --reload

Once started, API documentation can be accessed via:

http://localhost:8000/redoc
  • Run the Frontend: open http://localhost:8000/ in your browser to access the LLM Health Assistant web interface.

  • Enter the administrator system

python CLI_DB_Manager.py
Code Running

  1. Build & Run the Docker Image

Before doing this, make sure the Docker CLI is enabled. It is recommended to install Docker Desktop.

git clone https://github.com/Avalon-S/LLM-Health-Assistant
cd LLM-Health-Assistant

Create a .env file in the project root and add the following API keys:

# SECRET_KEY can be generated randomly by you.
SECRET_KEY=your_secret_key
ZHIPU_API_KEY=your_glm_api_key
PINECONE_API_KEY=your_pinecone_key

For the Pinecone configuration, create an index named healthassistant with region us-east-1 on the AWS cloud.

  • Build the image (without using cache); this takes about 10-20 minutes, depending on your internet speed.
docker-compose build --no-cache
  • Start the container (run in the background)
docker-compose up -d
  • Stop all containers started by docker-compose.
docker-compose down
  • Enter the administrator system (the container must be running)
docker ps # Get CONTAINER ID
docker exec -it <CONTAINER ID> /bin/bash
python CLI_DB_Manager.py
Image Running

Back to Table of Contents


Project Display

Login Page

Dashboard Page

Health Chat Page

Voice Chat Page

Back to Table of Contents


Reflection

The initial plan was to deploy LLaMA 3.2 1B and 3B locally. However, during later development there were numerous dependency conflicts, and the models performed extremely poorly in multi-turn dialogues, with severe hallucinations. Moreover, locally deploying an LLM would result in an excessively large Docker image, making deployment time-consuming. We therefore switched to GLM-4-Plus, which delivers performance comparable to GPT-4o, and the results have been satisfactory.

It should be noted that the strategy for deciding whether to call specific APIs to enhance the prompt follows an expert-system approach. Specifically, if certain keywords are detected, such as "my age" or "paper", the system automatically calls the Pinecone or PubMed API, respectively, for retrieval. This is a simple, fast, and effective strategy. LangChain was not used because experiments showed that the task was not complex (no deep reasoning required), and using an agent to decide which API to call took significantly longer than letting the LLM respond directly. Additionally, there was no difference in answer quality: GLM-4-Plus was already powerful enough.
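The keyword-triggered routing described above can be sketched in a few lines. The trigger lists and return labels below are illustrative, not the project's exact configuration:

```python
# Hypothetical trigger lists; the real system's keyword sets may differ.
PINECONE_TRIGGERS = ("my age", "my name", "last time", "earlier")
PUBMED_TRIGGERS = ("paper", "study", "research", "pubmed")

def route(query):
    """Decide which retrieval API (if any) should enrich the prompt."""
    q = query.lower()
    if any(t in q for t in PINECONE_TRIGGERS):
        return "pinecone"   # fetch the user's conversation history
    if any(t in q for t in PUBMED_TRIGGERS):
        return "pubmed"     # fetch recent medical literature
    return "llm_only"       # let GLM-4-Plus answer directly

print(route("What is my age?"))                    # → pinecone
print(route("Any recent papers on sleep apnea?"))  # → pubmed
print(route("How much water should I drink?"))     # → llm_only
```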

Overall, despite the tight timeline, I am fairly satisfied with the implementation of this project.

Back to Table of Contents


License

This project is licensed under the MIT License. See the LICENSE file for details.

Back to Table of Contents

