This project benchmarks and compares the outputs of two large language models: Google Gemini (via the official API) and an Ollama-hosted model (e.g., Gemma3:12b).
It evaluates their responses to a set of prompts using ROUGE and BERTScore metrics, as well as latency and token usage, and saves all results to a CSV file.
## Features

- Batch evaluation of multiple prompts
- Automatic querying of both Gemini and Ollama models
- ROUGE and BERTScore metrics for output similarity
- Latency and token usage tracking
- Results saved in a structured CSV file for further analysis
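Latency tracking amounts to timing each model call. The sketch below shows one minimal way to do it; `query_model` here is a hypothetical stand-in for the real Gemini/Ollama calls, not the project's actual function:

```python
import time

def timed_call(fn, *args, **kwargs):
    """Run fn and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed

# Hypothetical stand-in for a real model call (Gemini or Ollama).
def query_model(prompt):
    time.sleep(0.01)  # simulate network latency
    return "response to: " + prompt

response, latency = timed_call(query_model, "Explain ROUGE in one sentence.")
```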
## Requirements

- Python 3.8+
- Ollama running locally with your chosen model (default: `gemma3:12b`)
- Google Gemini API access and an API key
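Before running a batch, it can help to verify the local Ollama server is reachable. A minimal stdlib check, assuming Ollama's default port (11434) and its `/api/tags` endpoint, which lists locally available models:

```python
import json
import urllib.request

def ollama_available(base_url="http://localhost:11434"):
    """Return True if an Ollama server responds at base_url.

    11434 is Ollama's default port; adjust if yours differs.
    """
    try:
        with urllib.request.urlopen(base_url + "/api/tags", timeout=2) as resp:
            json.load(resp)  # response lists the locally pulled models
            return True
    except (OSError, ValueError):
        return False
```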
## Installation

This project uses `uv` as the package manager. Install all required packages with `uv sync`.
## Setup

1. Clone this repository.
2. Set up your `.env` file: create a file named `.env` in the project root with the following content: `GEMINI_API_KEY=your_actual_gemini_api_key_here`
3. Start your Ollama server and ensure your chosen model is available (default: `gemma3:12b`).
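Loading the key from `.env` is typically done with a library such as python-dotenv; as a dependency-free illustration, here is a minimal stdlib parser (it ignores comments and blank lines and does not handle quoting):

```python
import os

def load_env(path=".env"):
    """Minimal .env loader: puts KEY=VALUE lines into os.environ.

    A stdlib stand-in for python-dotenv's load_dotenv(); existing
    environment variables are not overwritten.
    """
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip())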
## Usage

Run the script with `python compare_models.py`. The script will process a set of prompts, query both models, compute metrics, and save the results to `model_comparison_results.csv`.
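For further analysis, the results CSV can be read back with the stdlib `csv` module (or pandas). A small sketch; the column name in the commented example is an assumption, not the script's confirmed header:

```python
import csv

def load_results(path="model_comparison_results.csv"):
    """Read the results CSV into a list of row dictionaries."""
    with open(path, newline="", encoding="utf-8") as fh:
        return list(csv.DictReader(fh))

# Example: average Gemini latency (column name is hypothetical).
# rows = load_results()
# avg = sum(float(r["gemini_latency"]) for r in rows) / len(rows)
```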
## Output

The CSV file contains, for each prompt:
- The prompt text
- Gemini response
- Ollama response
- Latency (seconds) for each model
- Token count for each model (estimated)
- ROUGE-1, ROUGE-2, ROUGE-L (precision, recall, F1)
- BERTScore (precision, recall, F1)
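For intuition about the ROUGE columns: ROUGE-1 measures unigram overlap between two texts. The sketch below computes precision, recall, and F1 by hand; the actual script presumably uses a library such as `rouge-score`, so treat this as illustrative only:

```python
from collections import Counter

def rouge1(reference, candidate):
    """Unigram-overlap ROUGE-1: returns (precision, recall, f1)."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    # Clipped overlap: each unigram counts at most as often as in both texts.
    overlap = sum((ref_counts & cand_counts).values())
    if not overlap:
        return 0.0, 0.0, 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```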
## Authors

Developed by Martin Saxa, Lukáš Švihura, and Dominik Keil.