VillageSQL AI Extension

A powerful AI extension for VillageSQL Server that adds AI prompt capabilities and text embeddings directly in SQL queries. Interact with AI models from Anthropic, OpenAI, Google, and local Ollama using familiar SQL syntax.

Features

AI Prompting: Send prompts to AI models directly from SQL queries
Multiple Providers: Support for Anthropic Claude, Google Gemini, OpenAI GPT, and local Ollama
Embedding Generation: Create text embeddings for vector search and similarity analysis using Google Gemini, OpenAI, or local Ollama
High Performance: Efficient C++ implementation with minimal overhead
Secure: HTTPS communication with SSL certificate verification

Installation

Build from Source

Prerequisites

VillageSQL build directory (specified via VillageSQL_BUILD_DIR)
CMake 3.16 or higher
C++17 compatible compiler
OpenSSL development libraries (for HTTPS connections)

📚 Full Documentation: Visit villagesql.com/docs for comprehensive guides on building extensions, architecture details, and more.

Build Instructions

Clone the repository:

git clone https://github.com/villagesql/vsql-ai.git
cd vsql-ai

Configure CMake with required paths:

Linux:

mkdir -p build
cd build
cmake .. -DVillageSQL_BUILD_DIR=$HOME/build/villagesql

macOS:

mkdir -p build
cd build
cmake .. -DVillageSQL_BUILD_DIR=~/build/villagesql

Note: VillageSQL_BUILD_DIR should point to your VillageSQL build directory.

Build the extension:
```
make -j $(($(getconf _NPROCESSORS_ONLN) - 2))
```
This creates the veb package in the build directory.
Install the VEB (optional):
```
make install
```
This copies the VEB to the directory specified by VEB_INSTALL_DIR. If not using make install, you can manually copy the VEB file to your desired location.

Usage

After installation, load the extension in VillageSQL:

INSTALL EXTENSION vsql_ai;

AI Prompting Examples

Anthropic Claude

-- Simple prompt with Claude
SELECT ai_prompt(
    'anthropic',
    'claude-sonnet-4-5-20250929',
    'your-api-key-here',
    'Explain quantum computing in one sentence'
) AS response;

Google Gemini

-- Simple prompt with Gemini
SELECT ai_prompt(
    'google',
    'gemini-2.5-flash',
    'your-api-key-here',
    'Explain quantum computing in one sentence'
) AS response;

OpenAI GPT

-- Simple prompt with GPT
SELECT ai_prompt(
    'openai',
    'gpt-4o-mini',
    'your-api-key-here',
    'Explain quantum computing in one sentence'
) AS response;

Local Ollama

-- Simple prompt with local Ollama (no API key needed)
SELECT ai_prompt(
    'local',
    'llama3.2',
    '',
    'Explain quantum computing in one sentence'
) AS response;

Using with Table Data

-- Use with table data
CREATE TABLE questions (id INT, question TEXT);
INSERT INTO questions VALUES
    (1, 'What is machine learning?'),
    (2, 'Explain neural networks'),
    (3, 'What is deep learning?');

-- Get AI responses for multiple questions using Claude
SET @api_key = 'your-anthropic-api-key';
SELECT id, question,
       ai_prompt('anthropic', 'claude-sonnet-4-5-20250929', @api_key, question) AS answer
FROM questions;

-- Or use Gemini
SET @api_key = 'your-google-api-key';
SELECT id, question,
       ai_prompt('google', 'gemini-2.5-flash', @api_key, question) AS answer
FROM questions;

Text Embeddings Examples

Generate Embeddings

-- Generate embedding for text
SET @api_key = 'your-google-api-key';
SELECT ai_embedding(
    'google',
    'gemini-embedding-001',
    @api_key,
    'Machine learning is fascinating'
) AS embedding;

-- Returns: [0.02646778, 0.019067757, -0.05332306, ...]

Using Embeddings with Table Data

-- Create a table to store documents and their embeddings
CREATE TABLE documents (
    id INT PRIMARY KEY,
    content TEXT,
    embedding JSON
);

-- Generate and store embeddings
SET @api_key = 'your-google-api-key';
INSERT INTO documents (id, content, embedding)
VALUES (1, 'Machine learning is a subset of artificial intelligence',
        ai_embedding('google', 'gemini-embedding-001', @api_key,
                            'Machine learning is a subset of artificial intelligence'));

-- Query to generate embeddings for multiple documents
SELECT id, content,
       ai_embedding('google', 'gemini-embedding-001', @api_key, content) AS embedding
FROM documents;

Supported Providers

Currently supported:

Anthropic (provider: `anthropic`)

Claude 4.5 models:

Claude Sonnet 4.5: claude-sonnet-4-5-20250929 (recommended - best for complex agents and coding)
Claude Haiku 4.5: claude-haiku-4-5-20251001 (fastest with near-frontier intelligence)
Claude Opus 4.5: claude-opus-4-5-20251101 (maximum capability and intelligence)

Google (provider: `google`)

Generative:

Gemini 3 Flash: gemini-3-flash-preview (balanced model for speed, scale, and frontier intelligence)
Gemini 3 Pro: gemini-3-pro-preview (best model for multimodal understanding)
Gemini 2.5 Flash: gemini-2.5-flash (stable - best price-performance ratio)
Gemini 2.5 Pro: gemini-2.5-pro (stable - state-of-the-art reasoning over complex problems)

Embeddings:

Gemini Embedding 001: gemini-embedding-001 (3072 dimensions by default)
Gemini Embedding 2: gemini-embedding-2-preview (improved embedding model)

OpenAI (provider: `openai`)

Chat:

GPT-4o: gpt-4o (flagship multimodal model)
GPT-4o mini: gpt-4o-mini (fast and cost-effective)

Embeddings:

text-embedding-3-small: text-embedding-3-small (1536 dimensions, best value)
text-embedding-3-large: text-embedding-3-large (3072 dimensions, highest performance)

Local Ollama (provider: `local`)

Connects to Ollama running on 127.0.0.1:11434. No API key required. Use any model you have pulled in Ollama.

Chat: Any Ollama chat model (e.g., llama3.2, mistral, gemma2)

Embeddings: Any Ollama embedding model (e.g., nomic-embed-text, mxbai-embed-large)

Function Reference

`ai_prompt(provider, model, api_key, prompt)`

Send a prompt to an AI provider and get a response.

Parameters:

provider (STRING): AI provider name ("anthropic", "google", "openai", "local")
model (STRING): Model identifier (e.g., "claude-sonnet-4-5-20250929", "gemini-2.5-flash", "gpt-4o-mini", "llama3.2")
api_key (STRING): API key for authentication (use empty string '' for local provider)
prompt (STRING): The prompt text to send to the AI

Returns: STRING - The AI model's response

Examples:

-- Anthropic Claude
SELECT ai_prompt('anthropic', 'claude-sonnet-4-5-20250929', @api_key, 'Hello!');

-- Google Gemini
SELECT ai_prompt('google', 'gemini-2.5-flash', @api_key, 'Hello!');

-- OpenAI GPT
SELECT ai_prompt('openai', 'gpt-4o-mini', @api_key, 'Hello!');

-- Local Ollama (no API key needed)
SELECT ai_prompt('local', 'llama3.2', '', 'Hello!');

`ai_embedding(provider, model, api_key, text)`

Generate text embeddings for vector search and similarity analysis.

Parameters:

provider (STRING): Embedding provider ("google", "openai", "local")
model (STRING): Model identifier (e.g., "gemini-embedding-001", "text-embedding-3-small", "nomic-embed-text")
api_key (STRING): API key for authentication (use empty string '' for local provider)
text (STRING): Text to create embedding from

Returns: STRING - JSON array of embedding vector (dimensions vary by model)

Examples:

-- Google Gemini text embeddings
SELECT ai_embedding('google', 'gemini-embedding-001', @api_key, 'Machine learning is fascinating');

-- Result: [0.02646778, 0.019067757, -0.05332306, ...]

-- OpenAI text embeddings
SELECT ai_embedding('openai', 'text-embedding-3-small', @api_key, 'Machine learning is fascinating');

-- Local Ollama text embeddings (no API key needed)
SELECT ai_embedding('local', 'nomic-embed-text', '', 'Machine learning is fascinating');

Security Considerations

API Key Safety

Important: API keys passed as function parameters may be visible in query logs, slow query logs, and process lists.

Best Practices:

Use Session Variables (Recommended):

-- Store API key in session variable
SET @api_key = 'sk-ant-your-api-key';

-- Use variable in queries
SELECT ai_prompt('anthropic', 'claude-sonnet-4-5-20250929', @api_key, 'prompt');

Session variables keep API keys out of query text and reduce exposure in logs.

Avoid Hardcoded Keys:

-- ❌ BAD: Key visible in logs
SELECT ai_prompt('anthropic', 'model', 'sk-ant-12345...', 'prompt');

-- ✅ GOOD: Use session variable
SELECT ai_prompt('anthropic', 'model', @api_key, 'prompt');

Shell Environment Variables (Future Enhancement): Future versions may support reading API keys directly from shell environment variables (e.g., $ANTHROPIC_API_KEY) for additional security.

Network Security

All API requests use HTTPS with SSL certificate verification
Connections timeout after 30 seconds by default
Failed connections return clear error messages

Performance Considerations

Timeouts

AI API calls can take 5-30 seconds depending on prompt complexity and model speed. Consider:

MySQL Query Timeout: You may need to adjust max_execution_time:
```
SET SESSION max_execution_time = 60000; -- 60 seconds
```
Batch Processing: For multiple prompts, process in batches to avoid long-running queries

Rate Limiting

AI providers impose rate limits on API requests:

Anthropic: Varies by plan (typically 50+ requests/minute)
Error messages will indicate rate limit issues
Consider spacing out bulk operations

Testing

The extension includes comprehensive tests using the MySQL Test Runner (MTR) framework.

Running Tests

Option 1 (Default): Using installed VEB

This method assumes you have successfully run make install to install the VEB to your veb_dir.

Linux:

cd $HOME/build/villagesql/mysql-test
perl mysql-test-run.pl --suite=/path/to/vsql-ai/mysql-test

# Run individual test
perl mysql-test-run.pl --suite=/path/to/vsql-ai/mysql-test error_handling

macOS:

cd ~/build/villagesql/mysql-test
perl mysql-test-run.pl --suite=/path/to/vsql-ai/mysql-test

# Run individual test
perl mysql-test-run.pl --suite=/path/to/vsql-ai/mysql-test error_handling

Option 2: Using a specific VEB file

Use this to test a specific VEB build without installing it first:

Linux:

cd $HOME/build/villagesql/mysql-test
VSQL_AI_VEB=/path/to/vsql-ai/build/vsql_ai.veb \
  perl mysql-test-run.pl --suite=/path/to/vsql-ai/mysql-test

macOS:

cd ~/build/mysql-test
VSQL_AI_VEB=/path/to/vsql-ai/build/veb \
  perl mysql-test-run.pl --suite=/path/to/vsql-ai/mysql-test

Testing with Live API Calls

The extension includes live API tests for each provider. Each test will skip live API calls if the corresponding environment variable is not set.

Testing Anthropic Claude

Export your API key:

export ANTHROPIC_API_KEY='your-api-key-here'

Run the test:

Linux:

cd $HOME/build/villagesql/mysql-test
perl mysql-test-run.pl --suite=/path/to/vsql-ai/mysql-test ai_prompt_anthropic

macOS:

cd ~/build/villagesql/mysql-test
perl mysql-test-run.pl --suite=/path/to/vsql-ai/mysql-test ai_prompt_anthropic

Testing Google Gemini

Export your API key:

export GEMINI_API_KEY='your-api-key-here'

Run the prompt test:

Linux:

cd $HOME/build/villagesql/mysql-test
perl mysql-test-run.pl --suite=/path/to/vsql-ai/mysql-test ai_prompt_google

macOS:

cd ~/build/villagesql/mysql-test
perl mysql-test-run.pl --suite=/path/to/vsql-ai/mysql-test ai_prompt_google

Run the embeddings test:

Linux:

cd $HOME/build/villagesql/mysql-test
perl mysql-test-run.pl --suite=/path/to/vsql-ai/mysql-test ai_embedding_google

macOS:

cd ~/build/villagesql/mysql-test
perl mysql-test-run.pl --suite=/path/to/vsql-ai/mysql-test ai_embedding_google

Testing OpenAI GPT

Export your API key:

export OPENAI_API_KEY='your-api-key-here'

Run the prompt test:

Linux:

cd $HOME/build/villagesql/mysql-test
perl mysql-test-run.pl --suite=/path/to/vsql-ai/mysql-test ai_prompt_openai

macOS:

cd ~/build/villagesql/mysql-test
perl mysql-test-run.pl --suite=/path/to/vsql-ai/mysql-test ai_prompt_openai

Run the embeddings test:

Linux:

cd $HOME/build/villagesql/mysql-test
perl mysql-test-run.pl --suite=/path/to/vsql-ai/mysql-test ai_embedding_openai

macOS:

cd ~/build/villagesql/mysql-test
perl mysql-test-run.pl --suite=/path/to/vsql-ai/mysql-test ai_embedding_openai

Testing Local Ollama

Ensure Ollama is running on 127.0.0.1:11434 with a model pulled.

Export the model name:

export OLLAMA_MODEL='llama3.2'
export OLLAMA_EMBED_MODEL='nomic-embed-text'

Run the prompt test:

Linux:

cd $HOME/build/villagesql/mysql-test
perl mysql-test-run.pl --suite=/path/to/vsql-ai/mysql-test ai_prompt_local

macOS:

cd ~/build/villagesql/mysql-test
perl mysql-test-run.pl --suite=/path/to/vsql-ai/mysql-test ai_prompt_local

Run the embeddings test:

Linux:

cd $HOME/build/villagesql/mysql-test
perl mysql-test-run.pl --suite=/path/to/vsql-ai/mysql-test ai_embedding_local

macOS:

cd ~/build/villagesql/mysql-test
perl mysql-test-run.pl --suite=/path/to/vsql-ai/mysql-test ai_embedding_local

Security Note: All tests automatically:

Skip live API calls if the environment variable is not set
Hide API keys from test output using --disable_query_log
Validate that responses contain expected content

Note: The error_handling test does not require an API key and only validates input validation and error handling.

Development

Project Structure

vsql-ai/
├── src/
│   ├── ai_functions.cc      # VEF function implementations and registration
│   ├── ai_providers.h/.cc   # AI provider implementations (Anthropic, OpenAI, Google, Local)
│   └── http_client.h/.cc    # HTTP client wrapper for API calls
├── include/
│   ├── httplib.h            # cpp-httplib single header
│   └── nlohmann/json.hpp    # nlohmann/json single header
├── cmake/
│   └── FindVillageSQL.cmake # CMake module to locate VillageSQL SDK
├── mysql-test/
│   ├── t/                   # MTR test files
│   └── r/                   # MTR expected results
├── manifest.json            # VEB package manifest
├── CMakeLists.txt           # Build configuration
└── AGENTS.md                # AI coding assistant instructions

Architecture

The extension uses:

VillageSQL Extension Framework (VEF): Native extension API
cpp-httplib: Header-only HTTP/HTTPS client library
nlohmann/json: Header-only JSON parsing library
OpenSSL: SSL/TLS for secure HTTPS connections

Build Targets

make - Build the extension and create the veb package

Roadmap

✅ Anthropic Claude integration
✅ Google Gemini integration
✅ Embedding generation (Google Gemini)
✅ Session variable support for API keys
✅ OpenAI GPT integration
✅ OpenAI embeddings
✅ Local Ollama provider (no API key required)
⏳ Shell environment variable support for API keys
⏳ Configurable timeouts
⏳ Response streaming for long outputs
⏳ Token counting utilities

Reporting Bugs and Requesting Features

If you encounter a bug or have a feature request, please open an issue using GitHub Issues. Please provide as much detail as possible, including:

A clear and descriptive title.
A detailed description of the issue or feature request.
Steps to reproduce the bug (if applicable).
Your environment details (OS, VillageSQL version, etc.).

License

License information can be found in the LICENSE file.

Contributing

VillageSQL welcomes contributions from the community. For more information, please see the VillageSQL Contributing Guide.

Contact

We are excited you want to be part of the Village that makes VillageSQL happen. You can interact with us and the community in several ways:

File a bug or issue and we will review
Start a discussion in the project discussions
Join the Discord channel

Name		Name	Last commit message	Last commit date
Latest commit History 18 Commits
.github		.github
cmake		cmake
include		include
mysql-test		mysql-test
src		src
.gitignore		.gitignore
AGENTS.md		AGENTS.md
CLAUDE.md		CLAUDE.md
CMakeLists.txt		CMakeLists.txt
GEMINI.md		GEMINI.md
LICENSE		LICENSE
README.md		README.md
THIRD_PARTY_LICENSES.md		THIRD_PARTY_LICENSES.md
manifest.json		manifest.json

Folders and files

Latest commit

History

Repository files navigation

VillageSQL AI Extension

Features

Installation

Build from Source

Prerequisites

Build Instructions

Usage

AI Prompting Examples

Anthropic Claude

Google Gemini

OpenAI GPT

Local Ollama

Using with Table Data

Text Embeddings Examples

Generate Embeddings

Using Embeddings with Table Data

Supported Providers

Anthropic (provider: anthropic)

Google (provider: google)

OpenAI (provider: openai)

Local Ollama (provider: local)

Function Reference

ai_prompt(provider, model, api_key, prompt)

ai_embedding(provider, model, api_key, text)

Security Considerations

API Key Safety

Network Security

Performance Considerations

Timeouts

Rate Limiting

Testing

Running Tests

Testing with Live API Calls

Testing Anthropic Claude

Testing Google Gemini

Testing OpenAI GPT

Testing Local Ollama

Development

Project Structure

Architecture

Build Targets

Roadmap

Reporting Bugs and Requesting Features

License

Contributing

Contact

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases 2

Packages 0

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Anthropic (provider: `anthropic`)

Google (provider: `google`)

OpenAI (provider: `openai`)

Local Ollama (provider: `local`)

`ai_prompt(provider, model, api_key, prompt)`

`ai_embedding(provider, model, api_key, text)`

Packages