AutoGEO

AutoGEO is a framework for Automatic Generative Engine Optimization (GEO) that helps web content gain higher visibility in LLM-generated answers. Our paper has been accepted by ICLR 2026.

📄 Paper: "What Generative Search Engines Like and How to Optimize Web Content Cooperatively"
👥 Authors: Yujiang Wu*, Shanshan Zhong*, Yubin Kim, Chenyan Xiong (*Equal contribution)

🔍 Overview

AutoGEO automatically extracts content preference rules from generative engines and rewrites documents to maximize visibility while preserving accuracy.

How GEO models work:

Input: Target document
Output: Rewritten document with higher visibility in generative engine (GE) responses
Goal: Maximize visibility without harming GE utility

Three core components:

Rule Extraction — Automatically mines content preferences from GEs.
AutoGEO_API — Prompt-based GEO model using extracted rules
AutoGEO_Mini — Cost-effective GEO model trained with reinforcement learning

Evaluation metrics: GEO score (visibility) and GEU score (utility)

News

🔥 [2026-01-28]: Cheers! Our paper has been accepted by ICLR 2026!
🔥 [2026-01-17]: We have released our AutoGEO_Mini Demo. Feel free to try it out!
🔥 [2026-01-17]: We have released our checkpoints (E-commerce, GEO-Bench, Researchy-GEO).
🔥 [2025-12-08]: We have released our code and datasets (E-commerce, GEO-Bench, Researchy-GEO).
🔥 [2025-10-11]: Our paper is now available on arXiv. Check it out!

🚀 Installation

To use AutoGEO_API and rule extraction:

# Clone the repository
git clone --recursive https://github.com/cxcscmu/AutoGEO
cd AutoGEO

# Run installation script
bash install.sh

# Activate environment
conda activate autogeo

# Configure API keys (required)
nano keys.env  # Add your API keys

Optional: For training AutoGEO_Mini models:

# First complete the base installation above, then:
conda activate autogeo
bash install_mini.sh

⚠️ Note: AutoGEO_Mini requires:

2 CUDA-compatible GPUs (A100 40GB+ recommended)
~4h for SFT and ~48h for GRPO on Researchy-GEO

⚡ Quick Start

Rewrite a document using AutoGEO_API:

from autogeo.rewriters import rewrite_document

rewritten_text = rewrite_document(
    document="AutoGEO automatically extracts content preference rules from generative engines and rewrites documents to maximize visibility while preserving accuracy.",
    dataset="Researchy-GEO",   # Options: E-commerce, GEO-Bench, Researchy-GEO
    engine_llm="gemini"        # Options: gemini, gpt, claude
)

print(rewritten_text)

🧩 Rule Extraction

Extract content preference rules from a generative engine (example: Gemini on E-commerce):

python -m autogeo.extract_rules \
    --dataset E-commerce \
    --engine_llm gemini-2.5-flash-lite

Rules are saved to data/E-commerce/rule_sets/gemini-2.5-flash-lite/.

Tips:

Reduce concurrency if hitting API rate limits: --max_workers 4
Test on a small subset: --num_examples 10

Use extracted or custom rules for rewriting:

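from autogeo.rewriters import rewrite_document

# Point at the rule set produced by the extraction step above
dataset = "E-commerce"
engine_llm = "gemini-2.5-flash-lite"

rewritten_text = rewrite_document(
    document="Your document text here",
    rule_path=f"data/{dataset}/rule_sets/{engine_llm}/merged_rules.json"
)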

Custom rules format: JSON file with root key "filtered_rules"
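As a minimal sketch of preparing a custom rule file, the snippet below writes a JSON document with the required root key "filtered_rules". The value shape (a list of plain-text rule strings), the example rules, the helper name write_rules, and the output file name are all assumptions for illustration; check an extracted merged_rules.json for the exact structure before relying on it.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical rules. Only the root key "filtered_rules" comes from the README;
# the list-of-strings value shape is an assumption.
custom_rules = {
    "filtered_rules": [
        "State the main conclusion in the first paragraph.",
        "Support claims with specific numbers and sources.",
    ]
}


def write_rules(path: Path, rules: dict) -> Path:
    """Write a rule set to disk after checking the documented root key."""
    if "filtered_rules" not in rules:
        raise ValueError('rule set must have root key "filtered_rules"')
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(rules, indent=2), encoding="utf-8")
    return path


rule_path = write_rules(
    Path(tempfile.gettempdir()) / "autogeo_custom_rules.json", custom_rules
)
print(rule_path)
```

The resulting path can then be passed as rule_path to rewrite_document.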

🧩 AutoGEO_API

AutoGEO provides a unified evaluation framework for all models.

Model types:

vanilla — Original documents (baseline)
autogeo_api — Rewritten documents generated by prompt-based GEO model
autogeo_mini — Rewritten documents generated by cost-effective GEO model

Evaluate baseline:

python -m autogeo.evaluate \
    --model vanilla \
    --dataset E-commerce \
    --engine_llm gemini-2.5-flash-lite

Evaluate AutoGEO_API:

python -m autogeo.evaluate \
    --model autogeo_api \
    --dataset E-commerce \
    --engine_llm gemini-2.5-flash-lite

Tips:

Include GEU score: --need_geu_score
Test subset: --num_examples 10

🧩 AutoGEO_Mini

Train a cost-effective GEO model using reinforcement learning.

Step 1: Cold Start (Supervised Fine-Tuning)

bash run_cold_start.sh E-commerce

Uses the training data (data/E-commerce/RL/finetune.json) to start LLaMA-Factory training. The checkpoint is saved to outputs/E-commerce/cold_start.

Step 2: GRPO Training

bash run_grpo.sh E-commerce

Trains the model using Group Relative Policy Optimization. Checkpoint saved to outputs/E-commerce/grpo.
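For readers new to GRPO, the core idea is that each sampled rewrite is scored relative to the other rewrites sampled for the same input, rather than against a learned value function. The sketch below shows only that group-relative advantage step, using made-up reward values; it is not taken from run_grpo.sh, and the use of the population standard deviation and the eps constant are choices made here for illustration.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize each reward against its own sampling group.

    advantage_i = (reward_i - mean(group)) / (std(group) + eps)
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; eps avoids division by zero
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four rewrites sampled from the same document.
rewards = [0.2, 0.5, 0.9, 0.4]
advantages = group_relative_advantages(rewards)

for r, a in zip(rewards, advantages):
    print(f"reward={r:.2f} advantage={a:+.3f}")
```

Rewrites that score above the group mean receive a positive advantage and are reinforced; those below the mean are discouraged.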

If you encounter GRPO-related dependency errors, they are usually caused by version conflicts between LLaMA-Factory and open-r1. To resolve them, reinstall open-r1:

cd open-r1
GIT_LFS_SKIP_SMUDGE=1 pip install -e ".[dev]"

Step 3: Evaluation

python -m autogeo.evaluate \
    --model autogeo_mini \
    --model_path outputs/E-commerce/grpo \
    --dataset E-commerce \
    --engine_llm gemini-2.5-flash-lite
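To compare the three model types under identical settings, the evaluation commands can be generated from one place. The sketch below only builds and prints the command lines; it uses the flags shown in this README, while the helper name build_eval_command and the choice of a 10-example subset are assumptions for illustration.

```python
import shlex


def build_eval_command(model, dataset, engine_llm, model_path=None,
                       need_geu_score=False, num_examples=None):
    """Build an argv list for `python -m autogeo.evaluate`."""
    cmd = [
        "python", "-m", "autogeo.evaluate",
        "--model", model,
        "--dataset", dataset,
        "--engine_llm", engine_llm,
    ]
    if model_path is not None:
        cmd += ["--model_path", model_path]
    if need_geu_score:
        cmd.append("--need_geu_score")
    if num_examples is not None:
        cmd += ["--num_examples", str(num_examples)]
    return cmd


dataset, engine = "E-commerce", "gemini-2.5-flash-lite"
common = dict(need_geu_score=True, num_examples=10)

runs = [
    build_eval_command("vanilla", dataset, engine, **common),
    build_eval_command("autogeo_api", dataset, engine, **common),
    build_eval_command("autogeo_mini", dataset, engine,
                       model_path="outputs/E-commerce/grpo", **common),
]

for cmd in runs:
    print(shlex.join(cmd))
```

Each list can be passed to subprocess.run(cmd, check=True) to execute the evaluation instead of printing it.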

📚 Supported Datasets & Engines & Metrics

Datasets:

Researchy-GEO — Academic dataset
E-commerce — Commercial dataset
GEO-Bench — Benchmark from GEO

Generative Engines:

Gemini (e.g., gemini-2.5-flash-lite)
GPT (e.g., gpt-4o-mini)
Claude (e.g., claude-3-5-sonnet-20241022)

Metrics:

GEO Score — Visibility (position, token count, citation frequency)
GEU Score — Utility (citation quality, keypoint coverage, response quality)
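To make the three visibility signals concrete, here is a toy calculation over a hand-written response. It is not the scoring code in autogeo.evaluate: the function name visibility_scores, the linear position decay, the equal split of a sentence between co-cited documents, and the sample sentences are all assumptions chosen for illustration.

```python
from collections import defaultdict


def visibility_scores(sentences):
    """Toy per-document visibility signals.

    sentences: list of (text, [cited document ids]) in response order.
    """
    n = len(sentences)
    words = defaultdict(float)
    weighted = defaultdict(float)
    citations = defaultdict(int)

    for idx, (text, cited) in enumerate(sentences):
        if not cited:
            continue
        count = len(text.split()) / len(cited)  # split credit between co-cited docs
        decay = 1.0 - idx / n                   # earlier sentences weigh more (assumed)
        for doc in cited:
            words[doc] += count
            weighted[doc] += decay * count
            citations[doc] += 1

    total_words = sum(words.values()) or 1.0
    total_weighted = sum(weighted.values()) or 1.0
    return {
        doc: {
            "word_share": words[doc] / total_words,
            "position_weighted_share": weighted[doc] / total_weighted,
            "citations": citations[doc],
        }
        for doc in words
    }


# Hypothetical generative-engine response with inline citations.
response = [
    ("Rule-based rewriting raises visibility in generated answers.", ["doc_a"]),
    ("Cheaper models can be trained with reinforcement learning.", ["doc_b"]),
    ("Both approaches aim to preserve utility.", ["doc_a", "doc_b"]),
]

scores = visibility_scores(response)
for doc, s in sorted(scores.items()):
    print(doc, s)
```

In this example doc_b is credited with slightly more words, but doc_a ranks higher on the position-weighted share because it is cited in the opening sentence.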

🙏 Acknowledgements

We thank the authors of GEO, AutoRule, LLaMA-Factory, open-r1, and DeepResearchGym for their inspiring work. We also thank the Qwen3 and DeepSeek-R1 teams for their excellent models.

📖 Citation

If you find AutoGEO useful, please cite:

@article{wu2025generative,
  title={What Generative Search Engines Like and How to Optimize Web Content Cooperatively},
  author={Wu, Yujiang and Zhong, Shanshan and Kim, Yubin and Xiong, Chenyan},
  journal={arXiv preprint arXiv:2510.11438},
  year={2025}
}
