Local AI Chatbot

Introduction

The aim of this project is to develop an intelligent chatbot for the Université du Québec à Chicoutimi (UQAC), capable of answering employees' questions about the university's management manual. Using the Retrieval Augmented Generation (RAG) technique, the chatbot extracts and synthesizes relevant information from a large set of documents, including HTML pages and PDF files.

🐋Docker images of the backend are available here.

⚙️Technologies and tools

The chatbot is built using the following technologies and tools :

⚒️Installation

⚠️Prerequisites: You need to have Docker installed on your machine to deploy the backend in a container. If you don't have it, you can download it here. You also need to have Node.js installed to run the frontend. If you don't have it, you can download it here.

Here's a step-by-step guide to installing the chatbot on your local machine :

Clone the repository to your local machine using the following command:

git clone https://github.com/AubinSeptier/local-ai-chatbot.git

Go to huggingface.co, create or log in to your account and generate an access token (keep it for later).
ℹ️Note: To access to Llama models, you'll need to accept the terms and conditions on the specific model page on HuggingFace.
Go to openai.com, create or log in to your account and generate an API Key (keep it for later).
ℹ️Note: An OpenAI API Key is required to use the RAG functionality in the chatbot. OpenAI API key is not free and you may need to pay for it.
Open the backend folder in a terminal and launch the bash script deploy-backend.sh:

bash deploy-backend.sh

Paste the HuggingFace access token and the OpenAI API Key you generated earlier when asked. Then choose if you want to deploy the chatbot in a Docker container or not (if you choose not to, you'll have to install the requirements.txt manually before).
Then, the script will build the container and launch the app via Docker (port: 7860) or directly on your machine.
ℹ️Note: To configure Docker Dekstop on Windows (WSL2) to use GPU acceleration, you can follow the instructions here.
Open the frontend/my-chatbot-frontend folder in another terminal and execute the following commands:

npm upgrade 
npm run dev

Open your web browser and go to http://localhost:5173 to access the chatbot.
You can restart the server at any time by running the following command in the backend folder:

bash deploy-backend.sh

ℹ️Note: To restart it in the previously built Docker container, please run the script from the container.

🛠️Configuration

Here are some configuration options you can change in the backend/src/app.py file :

    generation_config = {
        "max_new_tokens": 1024,
        "temperature": 0.7,
        "top_k": 50,
        "top_p": 0.95
        "do_sample": True
        # ... other generation parameters
    }

    chat_api = ChatAPI(
        model_name="meta-llama/Llama-3.2-1B-Instruct",
        generation_config=generation_config,
        max_history=100,
        system_prompt="You are a helpful assistant.",
        db=db
    )

You can change generation config (temperature, max_new_tokens, etc.) to adjust the chatbot's responses.
You can change the model_name variable to use another model from HuggingFace. Just copy the model name from the model's page URL on HuggingFace (e.g. meta-llama/Llama-3.2-3B-Instruct).
You can change the system_prompt to define the chatbot's role.
In the backend/models folder, you can see and manage all downloaded models available for the chatbot.

ℹ️Note: If you want to change the configuration or the model used by the chatbot, you'll need to restart the backend server to apply the changes. Don't need to rebuild the Docker container, just restart the server from the container or your machine depending if you used Docker or not.

📅What's next ?

This project is still in development and many improvements can be made. Here are some ideas for future updates :

Download and load the desired LLM from the web interface.
Modify LLM configuration directly from the web interface.
Improving RAG performance.
To be able to choose whether or not to use RAG.
Rephrase chatbot responses.
To be able to synthesize the conversation, both to push back the LLM context size limit and to give the user feedback on the conversation.
Retain the user's theme preference (Light/Dark).
Integrate query_data.py directly into the Conversation class.
Improve and optimize parts of the code.

Name		Name	Last commit message	Last commit date
Latest commit History 59 Commits
.github/workflows		.github/workflows
backend		backend
frontend/my-chatbot-frontend		frontend/my-chatbot-frontend
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Local AI Chatbot

Introduction

⚙️Technologies and tools

⚒️Installation

🛠️Configuration

📅What's next ?

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Local AI Chatbot

Introduction

⚙️Technologies and tools

⚒️Installation

🛠️Configuration

📅What's next ?

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages