Thank you for considering contributing to Instructor! This document provides guidelines and instructions to help you contribute effectively.
- Contributing to Instructor
By participating in this project, you agree to abide by our code of conduct: treat everyone with respect, be constructive in your communication, and focus on the technical aspects of the contributions.
-
Fork the Repository: Click the "Fork" button at the top right of the repository page.
-
Clone Your Fork:
git clone https://github.com/YOUR-USERNAME/instructor.git cd instructor -
Set up Remote:
git remote add upstream https://github.com/instructor-ai/instructor.git
-
Install UV (recommended):
# macOS/Linux curl -LsSf https://astral.sh/uv/install.sh | sh # Windows PowerShell powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
-
Install Dependencies:
# Using uv (recommended) uv pip install -e ".[dev,docs,test-docs]" # Using poetry poetry install --with dev,docs,test-docs # For specific providers, add the provider name as an extra # Example: uv pip install -e ".[dev,docs,test-docs,anthropic]"
-
Set up Pre-commit:
pip install pre-commit pre-commit install
-
Create a Branch:
git checkout -b feature/your-feature-name
-
Make Your Changes and Commit:
git add . git commit -m "Your descriptive commit message"
-
Keep Your Branch Updated:
git fetch upstream git rebase upstream/main
-
Push Changes:
git push origin feature/your-feature-name
We support both UV and Poetry for dependency management. Choose the tool that works best for you:
UV is a fast Python package installer and resolver. It's recommended for day-to-day development in Instructor.
# Install uv
curl -LsSf https://astral.sh/uv/install.sh | sh
# Install project and development dependencies
uv pip install -e ".[dev,docs]"
# Adding a new dependency (example)
uv pip install new-packageKey UV commands:
uv pip install -e .- Install the project in editable modeuv pip install -e ".[dev]"- Install with development extrasuv pip freeze > requirements.txt- Generate requirements fileuv self update- Update UV to the latest version
Poetry provides more comprehensive dependency management and packaging.
# Install Poetry
curl -sSL https://install.python-poetry.org | python3 -
# Install dependencies including development deps
poetry install --with dev,docs
# Add a new dependency
poetry add package-name
# Add a new development dependency
poetry add --group dev package-nameKey Poetry commands:
poetry shell- Activate the virtual environmentpoetry run python -m pytest- Run commands within the virtual environmentpoetry update- Update dependencies to their latest versions
Instructor uses optional dependencies to support different LLM providers. Provider-specific utilities live under instructor/utils. When adding integration for a new provider:
-
Update pyproject.toml: Add your provider's dependencies to both
[project.optional-dependencies]and[dependency-groups]:[project.optional-dependencies] # Add your provider here my-provider = ["my-provider-sdk>=1.0.0,<2.0.0"] [dependency-groups] # Also add to dependency groups my-provider = ["my-provider-sdk>=1.0.0,<2.0.0"]
-
Create Provider Client: Implement your provider client in
instructor/clients/client_myprovider.py -
Add Tests: Create tests in
tests/llm/test_myprovider/ -
Document Installation: Update the documentation to include installation instructions:
# Install with your provider support uv pip install "instructor[my-provider]" # or poetry install --with my-provider -
Create Provider Utilities and Handlers:
- Add a new module at
instructor/utils/myprovider.py - Implement
reaskfunctions for validation errors andhandle_*functions for formatting requests - Define
MYPROVIDER_HANDLERSmappingModevalues to these functions
- Add a new module at
-
Register the Provider:
- Add a value in
instructor/utils/providers.pyto theProviderenum - Extend
get_providerwith detection logic for your base URL
- Add a value in
-
Update
process_response.py:- Import your handler functions and include them in the
mode_handlersdictionary so the library can route requests to your provider process_response.pyrelies on these handlers to format arguments and parse results for eachMode
- Import your handler functions and include them in the
If you find a bug, please create an issue on our issue tracker with:
- A clear, descriptive title
- A detailed description including:
- The
response_modelyou are using - The
messagesyou are using - The
modelyou are using - Steps to reproduce the bug
- The expected behavior and what went wrong
- Your environment (Python version, OS, package versions)
- The
For feature requests, please create an issue describing:
- The problem your feature would solve
- How your solution would work
- Alternatives you've considered
- Examples of how the feature would be used
- Create a Pull Request from your fork to the main repository.
- Fill out the PR template with details about your changes.
- Address review feedback and make requested changes.
- Wait for CI checks to pass.
- Once approved, a maintainer will merge your PR.
Documentation improvements are always welcome! Follow these guidelines:
- Documentation is written in Markdown format in the
docs/directory - When creating new markdown files, add them to
mkdocs.ymlunder the appropriate section - Follow the existing hierarchy and structure
- Use a grade 10 reading level (simple, clear language)
- Include working code examples
- Add links to related documentation
We encourage contributions to our evaluation tests:
- Explore existing evals in the evals directory
- Contribute new evals as pytest tests
- Evals should test specific capabilities or edge cases of the library or models
- Follow the existing patterns for structuring eval tests
We use automated tools to maintain consistent code style:
- Ruff: For linting and formatting
- PyRight: For type checking
- Black: For code formatting (enforced by Ruff)
General guidelines:
- Typing: Use strict typing with annotations for all functions and variables
- Imports: Standard lib → third-party → local imports
- Models: Define structured outputs as Pydantic BaseModel subclasses
- Naming: snake_case for functions/variables, PascalCase for classes
- Error Handling: Use custom exceptions from exceptions.py, validate with Pydantic
- Comments: Docstrings for public functions, inline comments for complex logic
We use conventional comments in code reviews and commit messages. This helps make feedback clearer and more actionable:
<label>: <subject>
<description>
Labels include:
- praise: highlights something positive
- suggestion: proposes a change or improvement
- question: asks for clarification
- nitpick: minor, trivial feedback that can be ignored
- issue: points out a specific problem that needs to be fixed
- todo: notes something to be addressed later
- fix: resolves an issue
- refactor: suggests reorganizing code without changing behavior
- test: suggests adding or improving tests
Examples:
suggestion: consider using Pydantic's validator for this check
This would ensure validation happens automatically when the model is created.
question: why is this approach used instead of async processing?
I'm wondering if there would be performance benefits.
fix: correct the type hint for the client parameter
The client should accept OpenAI instances, not strings.
For more details, see the Conventional Comments specification.
We follow the Conventional Commits specification for commit messages. This helps us generate changelogs and understand the changes at a glance.
The commit message should be structured as follows:
<type>[optional scope]: <description>
[optional body]
[optional footer(s)]
- feat: A new feature
- fix: A bug fix
- docs: Documentation only changes
- style: Changes that do not affect the meaning of the code (white-space, formatting, etc)
- refactor: A code change that neither fixes a bug nor adds a feature
- perf: A code change that improves performance
- test: Adding missing tests or correcting existing tests
- build: Changes that affect the build system or external dependencies
- ci: Changes to our CI configuration files and scripts
feat(openai): add support for response_format parameter
fix(anthropic): correct tool calling format in Claude client
docs: improve installation instructions for various providers
test(evals): add evaluation for recursive schema handling
Breaking changes should be indicated by adding ! after the type/scope:
feat(api)!: change parameter order in from_openai factory function
Including a scope is recommended when changes affect a specific part of the codebase (e.g., a specific provider, feature, or component).
Run tests using pytest:
# Run all tests
pytest tests/
# Run specific test
pytest tests/path_to_test.py::test_name
# Skip LLM tests (faster for local development)
pytest tests/ -k 'not llm and not openai'
# Generate coverage report
coverage run -m pytest tests/ -k "not docs"
coverage reportmainbranch is the development branch- Releases are tagged with version numbers
- We follow Semantic Versioning
Cursor (https://cursor.sh) is a code editor powered by AI that can help you create PRs efficiently. We encourage using Cursor for Instructor development:
-
Install Cursor: Download from cursor.sh
-
Create a Branch: Start a new branch for your feature using Cursor's Git integration
-
Use Cursor Rules: We have Cursor rules that help with standards:
new-features-planning: Use when implementing new featuressimple-language: Follow when writing documentationdocumentation-sync: Reference when making code changes to keep docs in sync
-
Generate Code with AI: Use Cursor's AI assistance to generate code that follows our style
-
Auto-Create PRs: Use Cursor's PR creation feature with our template:
# Create PR using gh CLI gh pr create -t "Your PR Title" -b "Description of changes" -r jxnl,ivanleomk -
Include Attribution: Add
This PR was written by [Cursor](https://cursor.sh)to your PR description
For more details, see our Cursor rules in .cursor/rules/.
By contributing to Instructor, you agree that your contributions will be licensed under the project's MIT License.