martinsaxa/NN

LLM Model Comparison: Gemini vs Ollama

This project benchmarks and compares the outputs of two large language models: Google Gemini (via the official API) and an Ollama-hosted model (e.g., gemma3:12b).
It evaluates their responses to a set of prompts using ROUGE and BERTScore metrics, as well as latency and token usage, and saves all results to a CSV file.


Features

  • Batch evaluation of multiple prompts
  • Automatic querying of both Gemini and Ollama models
  • ROUGE and BERTScore metrics for output similarity
  • Latency and token usage tracking
  • Results saved in a structured CSV file for further analysis

Requirements

  • Python 3.8+
  • Ollama running locally with your chosen model (default: gemma3:12b)
  • Google Gemini API access and API key

Python dependencies

This project uses uv as the package manager.
Install all required packages with:

uv sync

Setup

  1. Clone this repository.

  2. Set up your .env file:

    Create a file named .env in the project root with the following content:

    GEMINI_API_KEY=your_actual_gemini_api_key_here
    
  3. Start your Ollama server and ensure your chosen model is available (default: gemma3:12b).
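Before running the comparison, it can help to confirm that both prerequisites from the steps above are in place. The following is a minimal sketch, not part of the project: the function names are hypothetical, the .env parsing is a simplified stand-in for whatever loader the script actually uses, and the address assumes Ollama is listening on its default `http://localhost:11434` and exposing its `/api/tags` model listing.

```python
import json
import urllib.request


def parse_env_text(text):
    """Parse KEY=value lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def model_listed(tags_payload, model_name):
    """Return True if model_name appears in an Ollama /api/tags response."""
    models = tags_payload.get("models", [])
    return any(m.get("name") == model_name for m in models)


def check_setup(env_path=".env", model_name="gemma3:12b",
                ollama_url="http://localhost:11434"):
    """Return (has_api_key, model_is_available) for the local setup."""
    with open(env_path, encoding="utf-8") as fh:
        env = parse_env_text(fh.read())
    has_key = bool(env.get("GEMINI_API_KEY"))

    # Assumes a local Ollama server on its default port.
    with urllib.request.urlopen(f"{ollama_url}/api/tags", timeout=5) as resp:
        payload = json.load(resp)
    return has_key, model_listed(payload, model_name)
```

Calling `check_setup()` from the project root returns two booleans; if the second is `False`, the model has likely not been pulled yet.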


Usage

Run the script:

python compare_models.py
  • The script processes a set of prompts, queries both models, computes the metrics, and saves the results to model_comparison_results.csv.
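The query-measure-save loop described above can be sketched roughly as follows. This is an illustration under assumptions, not the code in compare_models.py: `gemini_fn` and `ollama_fn` are hypothetical callables that take a prompt and return text, the column names are invented for the example, and the token estimate is a plain whitespace word count, which may differ from how the script estimates tokens.

```python
import csv
import time


def timed_call(model_fn, prompt):
    """Call model_fn(prompt) and return (response_text, latency_seconds)."""
    start = time.perf_counter()
    response = model_fn(prompt)
    return response, time.perf_counter() - start


def estimate_tokens(text):
    """Rough token estimate: number of whitespace-separated words."""
    return len(text.split())


def run_comparison(prompts, gemini_fn, ollama_fn, out_path):
    """Query both models for each prompt and write one CSV row per prompt."""
    fieldnames = [
        "prompt", "gemini_response", "ollama_response",
        "gemini_latency_s", "ollama_latency_s",
        "gemini_tokens", "ollama_tokens",
    ]
    with open(out_path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=fieldnames)
        writer.writeheader()
        for prompt in prompts:
            g_text, g_latency = timed_call(gemini_fn, prompt)
            o_text, o_latency = timed_call(ollama_fn, prompt)
            writer.writerow({
                "prompt": prompt,
                "gemini_response": g_text,
                "ollama_response": o_text,
                "gemini_latency_s": round(g_latency, 3),
                "ollama_latency_s": round(o_latency, 3),
                "gemini_tokens": estimate_tokens(g_text),
                "ollama_tokens": estimate_tokens(o_text),
            })
```

Passing the model clients in as plain callables keeps the loop easy to exercise with stub functions before spending API quota.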

Output

The CSV file contains, for each prompt:

  • The prompt text
  • Gemini response
  • Ollama response
  • Latency (seconds) for each model
  • Token count for each model (estimated)
  • ROUGE-1, ROUGE-2, ROUGE-L (precision, recall, F1)
  • BERTScore (precision, recall, F1)
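To make the precision, recall, and F1 columns concrete, here is a simplified ROUGE-1 calculation based on unigram overlap. It is only an illustration: the project presumably relies on a dedicated ROUGE library, whose tokenization and stemming can give different numbers, and treating one model's response as the reference is an assumption made for the example.

```python
from collections import Counter


def rouge1(reference, candidate):
    """Return (precision, recall, f1) from clipped unigram overlap."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())

    # Counter intersection keeps the minimum count of each shared word.
    overlap = sum((ref_counts & cand_counts).values())

    cand_total = sum(cand_counts.values())
    ref_total = sum(ref_counts.values())
    precision = overlap / cand_total if cand_total else 0.0
    recall = overlap / ref_total if ref_total else 0.0
    if precision + recall == 0:
        return 0.0, 0.0, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

Precision is the share of candidate words that also occur in the reference, recall is the share of reference words recovered by the candidate, and F1 is their harmonic mean. ROUGE-2 applies the same idea to word pairs, and ROUGE-L uses the longest common subsequence instead of n-gram counts.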

Acknowledgements

Developed by Martin Saxa, Lukáš Švihura, and Dominik Keil
