retryLLM is a reliable LLM interface with retries, validation, and RL-based routing. Features:
- Smart model selection using reinforcement learning
- Automatic retries and fallbacks
- Response validation (JSON and LLM judge)
- Performance monitoring and optimization
- Efficient caching
- Support for Google (Gemini) and Groq models
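The retry-and-fallback behavior can be sketched roughly as follows. This is a minimal illustration only, not the package's actual implementation; `call_model` and the backoff scheme are hypothetical stand-ins:

```python
import time

def call_with_fallback(call_model, prompt, models, max_retries=2, delay=0.0):
    """Try each model in order, retrying transient failures before falling back.

    call_model(model, prompt) is a user-supplied callable; models lists the
    primary model first, followed by fallbacks.
    """
    last_error = None
    for model in models:
        for attempt in range(max_retries + 1):
            try:
                return call_model(model, prompt)
            except Exception as exc:  # a real client would catch narrower error types
                last_error = exc
                if delay:
                    time.sleep(delay * (2 ** attempt))  # simple exponential backoff
    raise RuntimeError(f"All models failed: {last_error}")
```

The same idea underlies the `--fallback` and `--max-retries` CLI flags: exhaust retries on the primary model before moving down the fallback list.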
- Clone the repository:

  ```bash
  git clone https://github.com/yourusername/retryLLM.git
  cd retryLLM
  ```

- Create and activate a virtual environment:

  ```bash
  python -m venv venv
  source venv/bin/activate  # On Windows: venv\Scripts\activate
  ```

- Install the package:

  ```bash
  pip install -e .
  ```

- Create a `.env` file with your API keys:

  ```
  GOOGLE_API_KEY=your_google_api_key
  GROQ_API_KEY=your_groq_api_key
  ```
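For reference, loading a `.env` file into the process environment looks roughly like this stand-alone sketch (projects typically use the `python-dotenv` library for this instead; the parsing here is deliberately minimal):

```python
import os

def load_env(path=".env"):
    """Parse simple KEY=value lines from a .env file into os.environ.

    Blank lines and comments are skipped; existing environment variables
    are not overwritten.
    """
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip())
```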
- Basic usage (auto-selects the best model):

  ```bash
  retryLLM "Your prompt here"
  ```

- Specify a model:

  ```bash
  retryLLM --model gemini-pro "Your prompt here"
  ```

- Use JSON validation:

  ```bash
  retryLLM --validate json "Generate a JSON object with name and age fields"
  ```

- Use LLM judge validation:

  ```bash
  retryLLM --validate llm_judge --judge-model gemini-pro "Explain quantum computing"
  ```

- Set a fallback model and retry limit:

  ```bash
  retryLLM --model llama3-70b-8192 --fallback gemini-pro --max-retries 2 "Your prompt here"
  ```

- Get full JSON output:

  ```bash
  retryLLM --json "Your prompt here"
  ```

Python usage:

```python
from llm_guardrail import safe_call

# Basic call
result = safe_call("Your prompt here")

# With validation and a specific model
result = safe_call(
    prompt="Generate a JSON object with name and age fields",
    model="gemini-pro",
    validate="json",
    max_retries=3,
)

# With LLM judge validation
result = safe_call(
    prompt="Explain quantum computing",
    validate="llm_judge",
    judge_model="gemini-pro",
)
```

Supported models:

- `gemini-pro`: Google's powerful general-purpose model
- `llama3-70b-8192`: Groq's high-performance Llama model
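Conceptually, JSON validation parses each response and retries until parsing succeeds. A rough stand-alone sketch of that loop (the helper names here are illustrative, not the package's API):

```python
import json

def validate_json(text):
    """Return the parsed object if text is valid JSON, else None."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None

def call_until_valid(generate, prompt, max_retries=3):
    """Call generate(prompt) until the response parses as JSON.

    generate is a user-supplied callable returning a string response.
    """
    for _ in range(max_retries):
        parsed = validate_json(generate(prompt))
        if parsed is not None:
            return parsed
    raise ValueError("No valid JSON response after retries")
```

The LLM-judge mode follows the same retry shape, but replaces `validate_json` with a second model call that scores the response.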
Contributions are welcome! Please feel free to submit a Pull Request.